Media Types presentation

Title: Media Types

1
Media Types

Text
Image
Graphics
Audio
Video

2
Text
Representation
ASCII
ISO Character Sets
Marked-up Text
Structured Text
Hypertext
Operations
Character Operations
String Operations
Editing
Formatting
Pattern-matching searching
Sorting
Compression
Encryption
Language-specific operations
3
Text - Representation

ASCII
7-bit code
128 values in ASCII character set
use of 8th bit in text editors/word processors
creates incompatibility
ISO character sets
extended ASCII to support non-English text
ISO Latin provides support for accented
characters
à, ö, ø, etc.
ISO sets include Chinese, Japanese, Korean and
Arabic
Unicode
originally a 16-bit format
65,536 different code values (2^16)

4
Text - Representation

Marked-up text
nroff, troff
LaTeX
SGML
HTML
HyTime
XML, XSL, XLL
Structured Text
structure of text represented in data structure,
usually tree-based
ODA, structure embedded in byte-stream with
content
Hypertext
non-linear
graph or web structure nodes and links
currently subject of intensive ISO standards
activity

5
Text - Operations

Character operations
basic data type with assigned value
permits direct character comparison (a < b)
String operations
comparison
concatenation
substring extraction and manipulation
Editing
perhaps the most familiar set of operations on
text
cut/copy/paste
strings v. blocks, dependent on document structure

6
Text - Operations

Formatting
interactive or non-interactive (WYSIWYG v. LaTeX)
formatted output
bitmap
page description language (PostScript, PDF)
font management
typeface
point size (1 point = 1/72 of an inch)
TrueType fonts - geometric description, kerning
Pattern-matching and Searching
search and replace
wildcards
regular expressions
for large bodies of text, or text databases, use
of inverted indices, hashing techniques and
clustering.
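As a rough illustration of the inverted index idea mentioned above, the following sketch maps each word to the set of documents that contain it, so a query only touches the entries for its own words. The function names and the sample documents are made up for this example, and the tokenisation (lowercase letters and digits only) is a simplifying assumption.

```python
import re
from collections import defaultdict


def build_inverted_index(documents):
    """Map each lowercase word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every word of the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical sample data, for illustration only.
docs = {
    1: "Text can be compressed.",
    2: "Compressed text can be encrypted.",
    3: "Images use colour models.",
}
index = build_inverted_index(docs)
print(sorted(search(index, "compressed text")))
```

The index is built once; each search is then a few set intersections rather than a scan of every document.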

7
Text - Operations

Sorting
numerous varieties of sort, all of them
extensively studied in basic programming
sort complexity is a major factor in data
handling performance
Compression
ASCII uses 7 bits per character, though most
word-processors also use the 8th bit, so each
character takes up a full byte
Information theory estimates 1-2 bits per
character to be sufficient for natural language
text
This redundancy can be removed by encoding
Huffman coding varies the number of bits used to
represent characters, with the shortest codes for
the highest frequency characters
Lempel-Ziv identifies repeating strings and
replaces them by pointers to a table
Both techniques compress English text at a ratio
of between 2:1 and 3:1
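The Huffman idea (shorter codes for more frequent characters) can be sketched as follows. This is a minimal illustration that only computes code lengths rather than the bit strings themselves; the function name and sample sentence are assumptions for the example, and a practical coder would also need to store the code table.

```python
import heapq
from collections import Counter


def huffman_code_lengths(text):
    """Return {character: code length in bits} for a Huffman code built from text."""
    freq = Counter(text)
    if len(freq) == 1:
        return {next(iter(freq)): 1}
    # Heap entries are (weight, tie_breaker, {char: depth}); the unique
    # tie_breaker ensures the dictionaries themselves are never compared.
    heap = [(w, i, {ch: 0}) for i, (ch, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        # Merging two subtrees pushes every character in them one level deeper.
        merged = {ch: depth + 1 for ch, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, counter, merged))
        counter += 1
    return heap[0][2]


sample = "this is an example of a huffman tree"
lengths = huffman_code_lengths(sample)
freq = Counter(sample)
total_bits = sum(freq[ch] * lengths[ch] for ch in freq)
print(total_bits, "bits versus", 8 * len(sample), "bits at one byte per character")
```

On such a short sample the saving is not representative of long English text, but the most frequent character (the space) never receives a longer code than a rare one.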

8
Text - Operations

Encryption
text encryption is widely used in electronic mail
and networked information systems
most widely-used techniques
DES
RSA public-key
PGP
subject of major controversy
key escrow systems
Clipper chip
strong encryption now being legally outlawed in
a number of countries
Language-specific operations
spell-checking
parsing and grammar checking
style analysis

9
Image
Representation
Colour Model
Alpha Channels
Number of Channels
Channel Depth
Interlacing
Indexing
Pixel Aspect Ratio
Compression
Operations
Editing
Point operations
Filtering
Compositing
Geometric transformations
Conversion
10
Image - Representation

Colour Model
2 main types
colour production on output device
theory of human colour perception
CIE colour space
international standard used to calibrate other
colour models
developed in 1931, as CIE XYZ, based on
tristimulus theory of colour specification

11
Image - Representation

RGB
numeric triple specifying red, green and blue
intensities
convenient for video display drivers since
numbers can be easily mapped to voltages for RGB
guns in colour CRTs
HSB
Hue - dominant colour of sample, angular value
varying from red to green to blue at 120°
intervals
Saturation - the intensity of the colour
Brightness - the amount of gray in the colour
CMYK
displays emit light, so produce colours by adding
red, green and blue intensities
paper reflects light, so to produce a colour on
paper one uses inks that subtract all colours
other than the one desired
printers use inks corresponding to the
subtractive primaries, cyan, magenta and yellow
(complements of RGB)

12
Image - Representation

additionally, since inks are not pure, a special
black ink is used to give better blacks and grays
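The relationship between RGB and the subtractive primaries can be sketched with a naive conversion: each ink amount is the complement of the corresponding RGB intensity, and the grey component common to all three inks is moved to black. This is only an idealised illustration with a made-up function name; real printing workflows rely on device-specific colour profiles instead.

```python
def rgb_to_cmyk(r, g, b):
    """Naive conversion of RGB values in [0, 1] to CMYK values in [0, 1]."""
    c, m, y = 1.0 - r, 1.0 - g, 1.0 - b   # subtractive complements of RGB
    k = min(c, m, y)                       # shared grey component goes to black ink
    if k == 1.0:                           # pure black: avoid dividing by zero
        return 0.0, 0.0, 0.0, 1.0
    scale = 1.0 - k
    return (c - k) / scale, (m - k) / scale, (y - k) / scale, k


print(rgb_to_cmyk(1.0, 0.0, 0.0))  # red needs magenta and yellow ink, no cyan
```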
YUV
colour model used in the television industry
also YIQ, YCbCr, and YPbPr
Y represents luminance, effectively the
black-and-white portion of a video signal
UV are colour difference signals, form the colour
portion of a video signal, and are called
chrominance or chroma
YUV makes efficient use of bandwidth as the human
eye has greater sensitivity to changes in
luminance than chrominance, so bandwidth can be
better utilised by allocating more to luminance
and less to chrominance
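A small sketch of how luminance and colour difference signals can be derived from RGB. The weights used here are the commonly quoted ITU-R BT.601 luma coefficients and analog U/V scale factors; they are not given in these notes and are an assumption of this example, as is the function name.

```python
def rgb_to_yuv(r, g, b):
    """Convert RGB values in [0, 1] to (Y, U, V)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b   # luminance: weighted sum of R, G, B
    u = 0.492 * (b - y)                      # scaled blue colour difference
    v = 0.877 * (r - y)                      # scaled red colour difference
    return y, u, v


y, u, v = rgb_to_yuv(0.5, 0.5, 0.5)
print(y, u, v)
```

For any grey input the two colour difference signals are (numerically) zero, which is why a black-and-white receiver can simply use Y on its own.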
Alpha Channels
images may have one or more alpha channels
defining regions of full or partial transparency

13
Image - Representation

can be used to store selections and to create
masks and blends
Number of channels
the number of pieces of information associated
with each pixel
usually the dimensionality of the colour model
plus the number of alpha channels
Channel depth
number of bits-per-pixel used to encode the
channel values
commonly 1, 2, 4 or 8 bits, less commonly 5, 6, 12 or
16 bits
in a multiple channel image, different channels
can have different depths
Interlacing
storage layout of a multiple channel image could
separate channel values (all R values, followed
by all G, followed by all B) or could use
interlacing (all RGB for pixel 1, all RGB for
pixel 2, ...)
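The two storage layouts can be illustrated with a pair of helper functions that convert between them. The names are invented for this sketch, and pixel data is held in plain Python lists for clarity rather than efficiency.

```python
def interleaved_to_planar(data, channels=3):
    """Split [R, G, B, R, G, B, ...] into [[R, ...], [G, ...], [B, ...]]."""
    return [data[c::channels] for c in range(channels)]


def planar_to_interleaved(planes):
    """Merge per-channel lists back into a single per-pixel sequence."""
    return [value for pixel in zip(*planes) for value in pixel]


pixels = [255, 0, 0, 0, 255, 0]          # two RGB pixels: red, then green
planes = interleaved_to_planar(pixels)
print(planes)                             # one list per channel
print(planar_to_interleaved(planes))      # back to the per-pixel layout
```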

14
Image - Representation

Indexing
pixel colours can be represented by an index in a
colour map or a colour lookup table (CLUT)
Pixel aspect ratio
ratio of pixel width to height
square pixels are simple to process, but some
displays and scanners work with rectangular
pixels
if the pixel aspect ratios of an image and a
display differ the image will appear stretched or
squeezed
Compression
a page-sized 24-bit colour image produced by a
scanner at 300dpi takes up about 20 Mbytes
many image formats compress pixel data, using
run-length coding, LZW, predictive coding and
transform coding
many image formats exist; JPEG, GIF, TIFF and BMP
are the most widely used
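The size of an uncompressed scanned page can be estimated with a short calculation. The page dimensions below (8.5 x 11 inches) are an assumption of this sketch, since the notes only say "page-sized"; a smaller scanned area gives a correspondingly smaller figure.

```python
def uncompressed_image_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Bytes needed for an uncompressed raster of the given physical size."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bits_per_pixel // 8


size = uncompressed_image_bytes(8.5, 11, 300, 24)   # assumed page size in inches
print(f"{size / 1e6:.1f} million bytes")
```

With these assumptions the result is about 25 million bytes, the same order of magnitude as the figure quoted above.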

15
Image - Operations

These operations can operate directly on pixel
data or on higher-level features such as edges,
surfaces and volumes
Operations on higher-level features fall into the
domain of image analysis and understanding and
will not be considered here
Editing
changing individual pixels for image touch-up,
forms the basis of airbrushing and texturing
cutting, copying and pasting are supported for
groups of pixels, from simple shape manipulation
through to more complex foreground and background
masking and blending
Point operations
consists of applying a function to every pixel in
an image

16
Image - Operations

only uses the pixel's current value, neighbouring
pixels cannot be used
Thresholding
a pixel is set to 1 or 0 depending on whether it
is above or below a threshold value - creates
binary images which are often used as masks when
compositing
Colour Correction
modifying the image to increase or reduce
contrast, brightness, gamma effects, or to
strengthen or weaken particular colours
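A point operation can be sketched as a function applied independently to each pixel value. The helper names and the tiny greyscale image are illustrative assumptions; pixels exactly equal to the threshold are treated as "below" here, which the notes leave unspecified.

```python
def point_operation(image, func):
    """Apply func to every pixel independently (image is a list of rows)."""
    return [[func(p) for p in row] for row in image]


def threshold(level):
    """Pixels above the level become 1, all others 0."""
    return lambda p: 1 if p > level else 0


def adjust_brightness(offset):
    """Add an offset to each pixel, clamped to the 8-bit range 0..255."""
    return lambda p: max(0, min(255, p + offset))


grey = [[10, 200], [128, 90]]
mask = point_operation(grey, threshold(127))
brighter = point_operation(grey, adjust_brightness(60))
print(mask)
print(brighter)
```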
Filtering
like point operations, operate on every pixel in
an image, but use values of neighbouring pixels
as well
used to blur, sharpen or distort images,
producing a variety of special effects
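In contrast to a point operation, a filter reads the neighbours of each pixel. The sketch below is a simple 3x3 mean (box) blur; the function name and the choice to average only over the neighbours that exist at the image border are assumptions of this example, one of several possible edge policies.

```python
def box_blur(image):
    """3x3 mean filter; border pixels average over the neighbours that exist."""
    height, width = len(image), len(image[0])
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            neighbours = [
                image[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
            ]
            row.append(sum(neighbours) / len(neighbours))
        result.append(row)
    return result


spot = [[0, 0, 0], [0, 9, 0], [0, 0, 0]]   # a single bright pixel
blurred = box_blur(spot)
print(blurred)
```

The single bright pixel is spread over its neighbourhood, which is the blurring effect described above.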

17
Image - Operations

Compositing
the combining of two or more images to produce a
new image
generally done by specifying mathematical
relationships between the images
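One common mathematical relationship for compositing is a per-pixel weighted blend controlled by a mask or alpha channel. The following is a minimal single-channel sketch with invented names; it is one example of such a relationship, not the only one.

```python
def composite_over(foreground, background, alpha):
    """Per-pixel blend alpha*fg + (1 - alpha)*bg, with alpha in [0, 1]."""
    return [
        [a * f + (1.0 - a) * b for f, b, a in zip(f_row, b_row, a_row)]
        for f_row, b_row, a_row in zip(foreground, background, alpha)
    ]


fg = [[200.0, 200.0]]
bg = [[100.0, 100.0]]
mask = [[1.0, 0.5]]    # fully opaque, then half transparent
print(composite_over(fg, bg, mask))
```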
Geometric Transformations
basic transformations involve displacing,
rotating, mirroring or scaling an image
more advanced transformations involve skewing and
warping images
Conversions
conversions between image formats are commonplace
and a number of public-domain, shareware and
commercial tools exist to support these
other forms of conversion include compression and
decompression, changing colour models, and
changing image depth and resolution

18
Graphics

Representation
Geometric Models
Solid Models
Physically-based Models
Empirical Models
Drawing Models
External formats for Models
Operations
Primitive Editing
Structural Editing
Shading
Mapping
Lighting
Viewing
Rendering
19
Graphics - Representation

The central notion of graphics, as opposed to
image data, is in the rendering of graphical data
to produce an image. A graphics type or model is
therefore the combination of a data type plus a
rendering operation
Graphics Representation
Please note - object in graphics modelling
usually refers to an element of the scene being
modelled, unless you are using object-oriented
graphics programming
Geometric Models
consist of 2D and/or 3D geometric primitives
2D primitives include lines, rectangles, ellipses
plus more general polygons and curves
3D primitives include the above plus surfaces of
various forms. Curves and curved surfaces
described by parameterised polynomials

20
Graphics - Representation

primitives are first described in local or object
co-ordinates, then arranged in groups in a common
world co-ordinate system by applying modelling
transformations
transformations include rotation, translation and
scaling
primitives can be used to build structural
hierarchies, allowing each structure thus created
to be broken down into lower-level structures and
primitives (i.e. blueprinting)
Several standard device-independent graphics
libraries are based on geometric modelling
GKS (Graphical Kernel System (ISO))
PHIGS (Programmer's Hierarchical Interactive
Graphics System (ISO)) - see also PHIGS and PEX
OpenGL - portable version of Silicon Graphics
library
Solid Models
Constructive Solid Geometry (CSG) - solid objects
are combined using the set operators union,
intersection and difference.

21
Graphics - Representation

Surfaces of revolution - a solid is formed by
rotating a 2D curve about an axis in 3D space
(lathing)
Extrusion - a 2D outline is extended in 3D space
along an arbitrary path
Using the above techniques will produce models
much faster than building them up from geometric
primitives, but rendering them will be expensive
Physically-based Models
realistic images can be produced by modelling the
forces, stresses and strains on objects
when one deformable object hits another, the
resulting shape change can be numerically
determined from their physical properties
Empirical Models
complex natural phenomena (clouds, waves, fire,
etc.) are difficult to describe realistically
using geometric or solid modelling

22
Graphics - Representation

while physically based models are possible, they
may be computationally expensive or intractable
the alternative is to develop models based on
observation rather than physical laws, such
models do not embody the underlying physical
processes that cause these phenomena but they do
produce realistic images
fractals, probabilistic graph grammars (used for
branching plant structures) and particle
systems (used for fires and explosions) are
examples of empirical models
Drawing Models
describing an object in terms of drawing or
painting actions
the description can be seen as a sequence of
commands to an imaginary drawing device -
PostScript, LOGO turtle graphics
External formats for Models
need for export/import formats between graphics
packages
CGM and CAD formats are OK; PostScript and RIB are
render-only

23
Graphics - Operations

Primitive editing
specifying and modifying the parameters
associated with the model primitives
e.g. specify the type of a primitive and the
vertex coordinates and surface normals
Structural editing
creating and modifying collections of primitives
establish spatial relationships between members
of collections
Shading
the modelling techniques described so far have
provided the means to specify the shape of
objects, but shading provides further information
for the image in describing the interaction of
light with the object. This interaction is
described in terms of the colour of an object,
how it reflects light and if it transmits light

24
Graphics - Operations

several general-purpose methods exist to describe
shading, most initially describe the surface of
the object using meshes of small, polygonal
surface patches
flat shading - each patch is given a constant
colour
Gouraud shading - colour information is
interpolated across a patch
Phong shading - surface normal information is
interpolated across a patch
Ray tracing, Radiosity - physical models of
light behaviour are used to calculate colour
information for each patch, giving highly
realistic results
for photorealistic images extremely flexible
shading is required, tools such as RenderMan
actually provide programmable shaders which can
be attached to objects, simulating different
light effects and surface normals.
Mapping
techniques for enhancing the visual appearance of
objects

25
Graphics - Operations

Texture mapping
an image, the texture map, is applied to a
surface
requires a mapping from 3D surface coordinates to
2D image coordinates, so given a point on the
surface the image is sampled and the resulting
value used to colour the surface at that point
shaders can also provide solid textures, where
the texture is obtained from 3D rather than 2D
space, and procedural textures, where the texture
is calculated rather than sampled
Bump mapping
as texture mapping, but used to change the normal
vector of the surface rather than the colour
used to describe minor surface changes such as
scratches or scrapes
Displacement mapping
local modifications to the position of a surface
produces ridges or grooves

26
Graphics - Operations

Environment mapping
also known as reflection mapping, used to handle
limited forms of reflection
more primitive technique than ray-tracing
Shadow mapping
similar to environment mapping in that it
provides a primitive lighting effect without the
expense of ray-tracing
produces shadows
Lighting
within a model, in addition to the graphics
objects, there are lights to illuminate the
scene. There are various forms of light source,
each of which can be parametrically specified
ambient light - background lighting, comes from
all directions with equal intensity
point lights - come from specific points in
space, intensity governed by inverse square law

27
Graphics - Operations

directional lights - located at infinity in some
direction, intensity is constant
spot lights - illuminating a cone-shaped volume
Viewing
to produce an image of a 3D model we require a
transformation which projects 3D world
coordinates onto 2D image coordinates
transformation applied to viewing volume, that
part of the model that appears in the image
view specification consists of selecting the
projection transformation, usually from parallel
or perspective projections although camera
attributes can be specified in some renderers,
and the view volume
Rendering
rendering converts a model, including shading,
lighting and viewing information, into an image
software allows selection and fine-tuning of
control parameters

28
Graphics - Operations

output resolution - the width and height of the
output image in pixels, and the pixel depth
rendering time - quick and low-quality v. slow
and high resolution

29
Digital Video
Representation
Analog formats sampled
Sampling rate
Sample size and quantisation
Data rate
Frame rate
Compression
Support for interactivity
Scalability
Operations
Storage
Retrieval
Synchronisation
Editing
Mixing
Conversion
30
Digital Video - Representation

Analog formats sampled
Digital video frames can be obtained in two ways
Synthesis - usually by a computer program
Sampling - of an analog video signal. Since
analog video comes in various different flavours,
according to frame rate, scan rate, composite v
component, sampling rate and size vary.

31
Digital Video - Representation

Sampling rate
the value of the sampling rate determines the
storage requirement and data transfer rate
the lower limit for the frequency at which to
sample in order to faithfully reproduce the
signal, the Nyquist rate, is twice the highest
frequency within the signal
video processing is simplified if each frame and
each scan line give rise to the same number of
samples, requiring the sampling frequency to be
an integer multiple of the scan rate
Sample size and quantisation
sample size is the number of bits used to
represent sample values
quantisation refers to the mapping from the
continuous range of the analog signal to discrete
sample values
choice of sample size is based on
signal to noise ratio of sampled signal
sensitivity of medium used to display frames

32
Digital Video - Representation

sensitivity of the human eye
digital video commonly uses linear quantisation,
where quantisation levels are evenly distributed
over the analog range (as opposed to logarithmic
quantisation)
Data rate
high data rate formats can be reduced to lower
data rates by a combination of
compression
reducing horizontal and vertical resolution
reducing the frame rate
for example
start with broadcast quality digital video at
10Mbytes/s
divide the horizontal and vertical resolutions by
2, giving VHS quality resolution
divide the frame rate by 2
compress at a ratio of 10:1
data rate becomes 1Mbit/s, suitable for use on
LANs and on optical storage devices (e.g. CD-ROM)
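The arithmetic in the example above can be checked with a few lines, reading the compression figure as a 10:1 ratio. The function name and parameter names are assumptions for this sketch.

```python
def reduced_rate_mbit_s(source_mbyte_s, resolution_divisor,
                        frame_rate_divisor, compression_ratio):
    """Data rate in Mbit/s after reducing resolution, frame rate and compressing."""
    source_mbit_s = source_mbyte_s * 8
    # The resolution divisor applies both horizontally and vertically.
    pixel_reduction = resolution_divisor ** 2
    return source_mbit_s / (pixel_reduction * frame_rate_divisor * compression_ratio)


print(reduced_rate_mbit_s(10, 2, 2, 10))   # 10 Mbytes/s source -> 1.0 Mbit/s
```

Halving both resolutions removes three quarters of the pixels, so together with the halved frame rate and 10:1 compression the 80 Mbit/s source is reduced by a factor of 80.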

33
Digital Video - Representation

Frame rate
25 or 30 fps equates to analog frame rate, or
full-motion video
at 10-15 fps motion is less accurately depicted
and the image flickers, but the data rate is much
reduced
Compression
we have already considered compression
techniques, in digital video we can compare
methods by three factors
Lossy v. lossless
Real-time compression - trade-off between
symmetric models and asymmetric models with
real-time decompression
Interframe (relative) v. Intraframe (absolute)
compression (e.g. MPEG-1 v. Motion JPEG)
Support for interactivity
random access to frames
differential rate and reverse playback
cut and paste capability

34
Digital Video - Representation

Scalability
scalable video allows control over video quality,
we can identify 2 forms
Transmit scalability - encoded data rate is
chosen at compression time from a range of rates,
governed by transmission and processing
constraints and/or storage capacity. Currently in
use for low rate digital video
Receive scalability - decoded data rate is chosen
at decompression time to match playback
requirements. Attractive concept but not yet
available in current video coding standards
current approaches to low rate digital video
include
DVI (Digital Video Interactive) - two forms,
Production Level Video (PLV) and Real-Time Video
(RTV). PLV is only really intended for playback; RTV
produces poorer quality but is intended for
real-time compression. Both use interframe compression to
achieve rates of 1Mbit/s, but require costly
hardware.
MPEG-1 - 1Mbit/s

35
Digital Video - Representation

MPEG-2 - broadcast quality video at rates between
2-15Mbit/s
MPEG-4 - low data rate video
MPEG-7 - metadata standard for video
representation
Motion JPEG
px64 (CCITT H.261) - intended for video
applications using ISDN (Integrated Services
Digital Network). Known as px64 since it produces
rates that are multiples of ISDN's 64 Kbit/s B
channel rate. Uses similar techniques to MPEG
but, since compression and decompression must be
real-time, quality tends to be poorer.
H.263 - based on H.261, but offers 2.5 times
greater compression, uses MPEG-1 and MPEG-2
techniques.

36
Digital Video - Operations

Storage
to record or playback digital video in real-time,
the storage system must be capable of sustaining
data transfer at the video data rate
4 main forms of storage for digital video are
Magnetic tape - at present only magnetic tape can
provide the very high capacity storage required
for digital video at practical costs (1 hour of
CCIR 601 4:2:2 uses 72 Gbytes, while 1 hour of
digital HDTV requires nearly 1 Tbyte)
Special purpose magnetic storage systems - useful
for short durations of high data rate digital
video, can be connected direct to external
equipment and are thus useful for capture and
editing (see diagram)
Video memory boards - specialist boards with
large amounts of semiconductor memory (several
hundred Mbytes or more), capable of storing short
durations of uncompressed digital video, useful
for capture and editing.

37
Digital Video - Operations

General purpose magnetic and optical storage
systems - most low data rate video
representations (MPEG, etc.) were designed to
support the use of conventional storage media for
real-time video playback. The problem is the size of
storage: even using MPEG-1, 13 minutes of video
will fill a 100 Mbyte disk.
Retrieval
uses frame addressing, as in analog video, but
there are some problems
low data rate formats result in variable sized
frames, so an index giving frame offsets needs to
be maintained to support random access
interframe compression techniques, e.g. MPEG,
only code key frames independently; other frames
are derived from these key frames. So random
access requires first finding the nearest key
frame and then using it to decode the desired
frame, again using the index but enhancing it
with key frame locations
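The retrieval scheme described above can be sketched as an index holding a byte offset per frame plus a sorted list of key frame numbers. The class and its fields are hypothetical, and the model is simplified: it assumes each frame depends only on the frames between it and the preceding key frame, which ignores refinements such as frames that also reference later frames.

```python
from bisect import bisect_right


class FrameIndex:
    """Frame offsets plus key frame locations, to support random access."""

    def __init__(self, offsets, key_frames):
        self.offsets = offsets                 # byte offset of each (variable-sized) frame
        self.key_frames = sorted(key_frames)   # frame numbers coded independently

    def frames_to_decode(self, target):
        """Frames that must be decoded, in order, to display the target frame."""
        pos = bisect_right(self.key_frames, target) - 1
        if pos < 0:
            raise ValueError("no key frame at or before the target frame")
        start = self.key_frames[pos]
        return list(range(start, target + 1))

    def seek_offset(self, target):
        """Byte offset at which decoding has to start for the target frame."""
        return self.offsets[self.frames_to_decode(target)[0]]


# Hypothetical index: six frames, key frames at 0 and 4.
index = FrameIndex(offsets=[0, 900, 1100, 1250, 2200, 2350], key_frames=[0, 4])
print(index.frames_to_decode(3), index.seek_offset(3))
print(index.frames_to_decode(5), index.seek_offset(5))
```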

38
Digital Video - Operations

Synchronisation
suffers same problems as analog video, so uses
same techniques
digital video also has some additional techniques
not available in analog video, such as changing
resolution to maintain frame rate
Editing
2 types
tape-based - same procedures as with analog
video, except no generation loss and the players
are on the same machine
nonlinear - basically a clips-library, using cut
and paste techniques to build a video sequence
Mixing
real-time effects, such as tumbles, wipes and
fades, are calculated in the same way as for
analog video, in fact for the majority of such
effects whether the original source is analog or
digital, the effects are digitised

39
Digital Video - Operations

non-real-time effects are only possible using
digital video, and obviate the need for
specialist equipment, being only dependent on the
speed of the processor and the patience of the
user, storage considerations can be overcome with
the use of pointers and single frame editing
Conversion
the variety of formats demands conversion between formats
real-time conversion requires specialist hardware
compression/decompression within a single format
also requires specialist software/hardware

40
Digital Audio
Representation
Sampling frequency
Sample size and quantisation
Number of channels (tracks)
Interleaving
Negative samples
Encoding
Operations
Storage
Retrieval
Editing
Effects and filtering
Conversion
41
Digital Audio - Representation

Digital Audio Representation
2 main areas
telecommunications
entertainment (audio CD)
Produced by sampling a continuous signal
generated by a sound source. An analog-to-digital
converter (ADC) takes as input an electrical
signal corresponding to the sound and converts it
into a digital data stream. The reverse process,
to generate the sound through an amplifier and
speakers, involves a digital-to-analog converter
(DAC)
Sampling frequency (rate)
sampling theory shows that a signal can be
reproduced without error from a set of samples,
providing the sampling frequency is at least
twice the highest frequency present in the
original signal

42
Digital Audio - Representation

telephone networks allocate a 3.4kHz bandwidth to
voice-grade lines, thus a sampling rate of 8kHz
is used for digital telecommunications
the human ear is sensitive to frequencies of up
to about 20kHz, so to digitise any perceivable
sound a sampling rate of over 40kHz is required
Sample size and quantisation
during sampling, the continuously varying
amplitude of the analog signal is approximated by
digital values, this introduces a quantisation
error, being the difference between the actual
amplitude and the digital approximation
quantisation error is apparent when the signal is
reconverted to analog form as distortion, a loss
in audio quality
quantisation error can be reduced by increasing
the sample size, as allowing more bits per sample
will improve the accuracy of the approximation

43
Digital Audio - Representation

quantisation refers to breaking the continuous
range of the analog signal into a number of
unique digital intervals, based on one of a
number of schemes
linear quantisation - uses equally spaced
intervals, so if the sample size is 3 bits and
the maximum signal variation is 5.0 then the
quantisation interval would be 0.625 units of
signal amplitude
nonlinear quantisation (especially logarithmic
quantisation) - uses non-equally spaced
intervals, lower amplitude intervals are more
closely spaced than higher amplitude, results in
greater sensitivity to lower amplitude sound
where the human ear is most sensitive
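The linear quantisation example above (3-bit samples over a range of 5.0) can be reproduced as follows. The function names are illustrative, and the sketch assumes the signal range starts at zero and that each sample is reconstructed at the midpoint of its interval.

```python
def quantisation_interval(signal_range, bits):
    """Width of one interval when the range is split into 2**bits equal steps."""
    return signal_range / (2 ** bits)


def quantise_linear(value, signal_range, bits):
    """Map a value in [0, signal_range] to (level index, reconstructed value)."""
    step = quantisation_interval(signal_range, bits)
    level = min(int(value / step), 2 ** bits - 1)
    reconstructed = (level + 0.5) * step     # midpoint of the chosen interval
    return level, reconstructed


step = quantisation_interval(5.0, 3)
level, approx = quantise_linear(3.3, 5.0, 3)
error = 3.3 - approx                         # the quantisation error for this sample
print(step, level, approx, error)
```

With midpoint reconstruction the quantisation error never exceeds half an interval, and each extra bit per sample halves the interval.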
Number of channels (tracks)
speech quality audio is mono (1 track)
stereo audio requires 2 tracks
some consumer audio equipment uses 4 tracks
(quadraphonic)
professional audio equipment uses 16, 32 or more

44
Digital Audio - Representation

Interleaving
a multi-channel audio value can be encoded by
interleaving channel samples or by providing
separate streams for each channel
the advantage of interleaving is in
synchronisation, and it also offers some benefits
in storage and transmission
the disadvantages of interleaving are that it can
be wasteful of space or bandwidth if not all
channels are needed, it freezes the
synchronisation between channels thus preventing
temporal shifts, and it may not allow variation
in the number of channels
Negative samples
the voltages found in analog audio signals
alternate between positive and negative values
negative values can be encoded successfully for
processing in two's complement, ones' complement or
sign-magnitude representation

45
Digital Audio - Representation

Encoding
encoding audio data reduces storage and
transmission costs, and compressed audio also
provides better quality when compared to
uncompressed audio at the same data rate
2 commonly-used methods
PCM (Pulse Code Modulation) - uses the fact that
a digital signal can be formed from a series of
pulses. PCM values are simply sequences of
uncompressed samples, so they provide a reference
format for comparison with more complex coding
methods
ADPCM (Adaptive Differential Pulse Code Modulation) -
reduces PCM data rate by encoding the differences
between samples. ADPCM is widely used and is
associated with some encoding standards, such as
CCITT G.721.
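The idea of encoding differences between samples can be shown with plain (non-adaptive, lossless) delta coding. This sketch is not ADPCM as standardised in G.721, which additionally predicts and adaptively quantises the differences; the function names and sample values are assumptions for illustration.

```python
def delta_encode(samples):
    """Replace each sample by its difference from the previous one."""
    previous = 0
    deltas = []
    for s in samples:
        deltas.append(s - previous)
        previous = s
    return deltas


def delta_decode(deltas):
    """Rebuild the original samples by accumulating the differences."""
    samples, current = [], 0
    for d in deltas:
        current += d
        samples.append(current)
    return samples


pcm = [1000, 1004, 1009, 1011, 1008]   # hypothetical PCM sample values
deltas = delta_encode(pcm)
print(deltas)
print(delta_decode(deltas) == pcm)
```

After the first value the differences are small numbers, which need fewer bits than the full sample values; this is where the data rate reduction comes from.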

46
Digital Audio - Operations

Storage
it is possible to record digital audio, even at
the data rates of the high quality formats, on
general purpose magnetic storage
theoretically, a magnetic disk with a sustainable
transfer rate of 5 Mbytes per second could
playback 50 channels of CD-quality digital audio.
In practice this would not be possible without a
highly optimised layout, but one or two channels
are easily within the reach of small computer
systems
since an hour of stereo digital audio, at the CD
data rate, requires over half a Gigabyte of
storage, tertiary storage in the form of DAT
tapes, CD discs or optical disks is normally
adopted, with the information being mounted onto
the system manually or through a jukebox
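The storage figures above can be checked with a short calculation. The CD audio parameters used here (44.1 kHz sampling, 16-bit samples) are not stated in these notes and are an assumption of the sketch, as is reading "5 Mbytes" as 5,000,000 bytes.

```python
SAMPLE_RATE_HZ = 44_100   # assumption: CD audio sampling rate
BYTES_PER_SAMPLE = 2      # assumption: 16-bit samples

bytes_per_channel_s = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE

# Channels a 5 Mbytes/s disk could sustain, ignoring seek and layout overheads.
channels_at_5_mbyte_s = 5_000_000 // bytes_per_channel_s

# One hour of stereo (2 channels) at the CD data rate.
stereo_hour_bytes = bytes_per_channel_s * 2 * 3600

print(channels_at_5_mbyte_s, "channels")
print(stereo_hour_bytes / 1e9, "Gbytes per stereo hour")
```

Under these assumptions the results (56 channels, about 0.64 Gbytes per hour) are consistent with the "50 channels" and "over half a Gigabyte" quoted above.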
Retrieval
need to support random access and ensure
continuous flow of data to DAC

47
Digital Audio - Operations

portions of audio sequences, segments, are
identified by their starting time and duration;
these can be located by mapping the starting
time to a segment address, which the file system
then maps to a physical address on disk
where there is no direct mapping to enable
segment location by time code, an index of
segments must be separately maintained
continuous flow of data is easy to maintain with
a dedicated storage system, but requires careful
control where storage is scheduled for a number
of such tasks
Editing
as with digital video, 2 types
tape-based
disk-based
to avoid audible clicks when inserting one sample
into another, cross-fades are used, where the
amplitudes of the original segment and the
inserted segment are added and scaled about the
insertion point
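A cross-fade can be sketched as a weighted sum of the two overlapping segments, with the weight shifting from the original to the inserted material. The linear ramp and the function name are assumptions of this example; other fade curves are also used in practice.

```python
def cross_fade(outgoing, incoming):
    """Linear cross-fade between two equal-length lists of samples."""
    n = len(outgoing)
    if n != len(incoming) or n < 2:
        raise ValueError("need two equal-length segments of at least 2 samples")
    mixed = []
    for i, (a, b) in enumerate(zip(outgoing, incoming)):
        w = i / (n - 1)                     # weight of incoming rises from 0 to 1
        mixed.append((1.0 - w) * a + w * b)
    return mixed


print(cross_fade([1.0, 1.0, 1.0], [0.0, 0.0, 0.0]))
```

Because the amplitude changes gradually rather than jumping at the insertion point, the abrupt discontinuity that would be heard as a click is avoided.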

48
Digital Audio - Operations

digital audio also supports non-destructive
editing, where the segments of data are accessed
through a data structure known as a play-list,
which essentially contains a set of pointers to
the data and details on ordering and other forms
of edit to be performed on the data when it is
joined
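A play-list can be sketched as a list of entries that point into the stored recordings, which are themselves never modified. The structure below (source name, start, length and a gain as the only "edit") is a hypothetical layout chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class PlayListEntry:
    source: str        # name of the recording holding the samples
    start: int         # index of the first sample of the segment
    length: int        # number of samples in the segment
    gain: float = 1.0  # simple edit applied when the segment is joined


def render(play_list, recordings):
    """Join the referenced segments in play-list order, applying each gain."""
    out = []
    for entry in play_list:
        segment = recordings[entry.source][entry.start:entry.start + entry.length]
        out.extend(sample * entry.gain for sample in segment)
    return out


# Hypothetical recordings and edit decisions.
recordings = {"take1": [1, 2, 3, 4, 5], "take2": [10, 20, 30]}
play_list = [PlayListEntry("take2", 0, 2), PlayListEntry("take1", 3, 2, gain=0.5)]
print(render(play_list, recordings))
```

Reordering or removing entries changes the result without touching the underlying audio data, which is what makes the editing non-destructive.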
Effects and filtering
digital filtering techniques permit a number of
effects on audio
Delay
Equalisation
Normalisation
Noise reduction
Time compression and expansion
Pitch shifting
Stereoisation
Acoustic environments
Conversion
one format to another (uncompressing ADPCM -> PCM)
altering encoding parameters (e.g. resampling at
lower frequency)

49
Music
Representation
Operational v. Symbolic
MIDI
SMDL
Operations
Playback and Synthesis
Timing
Editing and Composition
50
Music - Representation

The existence of powerful, low-cost, digital
signal processors means that many computers can
now record, generate and process music.
Music is also widely used in multimedia
applications, so we require a media type for
music to focus on the computer's musical
capabilities.
Representation of Music
Operational v. Symbolic
operational representations specify exact timings
for music and physical descriptions of the sounds
to be produced
symbolic representations use descriptive
symbolism to describe the form of the music and
allow great freedom in the interpretation
both types are described as structural
representations, since instead of representing
music by audio samples there is information about
the internal structure of the music

53
Music - Representation

To illustrate the structural representations, we
can consider two
MIDI - a widely used protocol allowing the
connection of computers and musical equipment, an
operational representation
SMDL - a proposal for a standard structure for
documents containing musical information, having
both operational and symbolic aspects
MIDI
the Musical Instrument Digital Interface was
developed in the early 80s by musical equipment
makers
Devices
electronic keyboards and synthesisers
drum machines
sequencers (to record and play back MIDI
messages)
music<->film and music<->video synchronisation
equipment

54
Music - Representation

Connection ports
MIDI OUT - allows a device to send MIDI messages
it has produced to other MIDI devices
MIDI IN - receives MIDI messages from other MIDI
devices
MIDI THRU - repeats received messages, permitting
daisy-chaining of MIDI devices
MIDI devices process MIDI messages differently,
according to their function or to the sound
palette used by the device, hence different
synthesisers can produce different sounds when
supplied with the same MIDI messages
MIDI Concepts
Channel - a MIDI connection has 16 message
channels, devices can be set to respond to all
channels or only to specific channels
Key number - notes are identified by key number,
with 128 key numbers available compared with the
88 keys of a standard piano keyboard
Controller - 128 different controllers are
available under the MIDI protocol, though not all
are currently defined, changing the value of a
controller typically alters sound production
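
The key number concept can be sketched in code. This is a minimal sketch, not part of the MIDI protocol itself: it assumes the common conventions that key 60 is written C4 (middle C), that key 69 (A4) is tuned to 440 Hz, and that an 88-key piano spans keys 21 to 108. The function names are hypothetical.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def key_name(key: int) -> str:
    """Name a MIDI key number, assuming key 60 is written as C4."""
    if not 0 <= key <= 127:
        raise ValueError("MIDI key numbers run from 0 to 127")
    octave = key // 12 - 1  # assumption: octave numbering with C4 = 60
    return f"{NOTE_NAMES[key % 12]}{octave}"


def key_frequency(key: int, a4_hz: float = 440.0) -> float:
    """Equal-tempered frequency, assuming key 69 (A4) is tuned to a4_hz."""
    return a4_hz * 2 ** ((key - 69) / 12)


def on_88_key_piano(key: int) -> bool:
    """True if the key lies in the usual 88-key range, A0 (21) to C8 (108)."""
    return 21 <= key <= 108
```

Other octave-naming conventions exist (some equipment labels key 60 as C3), so the names printed by a given device may differ by one octave.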

55
Music - Representation

Patch/program - an audio palette is called a
program or patch, a synthesiser capable of having
a number of patches active at the same time is
called multi-timbral
Polyphony - the ability of a synthesiser to play
many notes at a time
Song - a recorded or preprogrammed MIDI sequence
Timing clock - a MIDI sequencer timestamps
messages using a timebase measured in parts per
quarter note (PPQ). Typical timebase values are
24, 96 and 480 PPQ. To convert the timebase into
actual time you use the tempo, measured in beats
per minute (BPM), where we assume that one beat is
equal to a quarter note. Thus with a tempo of
180 BPM and a timebase of 96 PPQ, one beat lasts
60/180 = 1/3 s and one tick lasts
1/3 x 1/96 s = 3.47 ms
MIDI synchronisation - MIDI devices can be set to
internal synch or external synch, when set to
internal synch a device is known as a master and
produces a timing clock message on its MIDI OUT
at 24PPQ which slave devices use for external
synch
MTC - MIDI Time Code is used to synchronise MIDI
with film or video, used to trigger sound effects
or musical sequences
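
The timing clock conversion above can be written as a short calculation. This is a sketch that assumes, as the slide does, that one beat equals a quarter note and that the tempo is constant. The function names are hypothetical.

```python
def tick_seconds(tempo_bpm: float, ppq: int) -> float:
    """Duration of one timebase tick in seconds."""
    seconds_per_beat = 60.0 / tempo_bpm  # one beat = one quarter note
    return seconds_per_beat / ppq


def ticks_to_seconds(ticks: int, tempo_bpm: float, ppq: int) -> float:
    """Convert a timestamp in ticks to seconds at a constant tempo."""
    return ticks * tick_seconds(tempo_bpm, ppq)


# The slide's example: 180 BPM at 96 PPQ gives about 3.47 ms per tick.
print(f"{tick_seconds(180, 96) * 1000:.2f} ms")
```

Halving the tempo doubles the tick duration, which is why changing tempo rescales every event time without touching the stored timestamps.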

56
Music - Representation

MIDI Protocol
based on 8-bit code for messages, each message
consists of a single command byte and possibly
one or more data bytes (see table)
Channel voice messages (8c-Ec) - determine the
actual notes played, speed of hit and release and
the values of controllers
Channel mode messages (Bc, with controllers
121-127) - select the mode of a synthesiser:
responding to one channel or all channels, with
each channel separately voiced or all voices used
for one channel
System messages (F0-FF) - general system
functions, timing clock, MIDI time code messages,
system reset, start device, stop device, etc.
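
The three message classes above can be told apart from the command byte alone. The following is a minimal sketch: the controller range for channel mode messages is taken from the slide (121-127) and should be checked against the MIDI specification. The function name `classify` and the returned dictionary layout are hypothetical.

```python
# High nibble of the command byte -> channel voice message name.
VOICE_NAMES = {
    0x8: "note off",
    0x9: "note on",
    0xA: "polyphonic key pressure",
    0xB: "control change",
    0xC: "program change",
    0xD: "channel pressure",
    0xE: "pitch bend",
}

# Assumption: range as stated on the slide, not verified against the spec.
MODE_CONTROLLERS = range(121, 128)


def classify(message: bytes) -> dict:
    """Classify one complete MIDI message (command byte plus data bytes)."""
    status = message[0]
    if status < 0x80:
        raise ValueError("first byte must be a command byte (high bit set)")
    if status >= 0xF0:
        return {"kind": "system", "status": status}
    command, channel = status >> 4, status & 0x0F
    data = list(message[1:])
    if command == 0xB and data and data[0] in MODE_CONTROLLERS:
        kind = "channel mode"
    else:
        kind = "channel voice"
    # Channels are reported as 1-16; on the wire they are encoded as 0-15.
    return {"kind": kind, "name": VOICE_NAMES[command],
            "channel": channel + 1, "data": data}
```

For example, `classify(bytes([0x90, 60, 100]))` describes a note-on for key 60 with velocity 100 on channel 1.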
Limitations of MIDI
operates at 31250 bps, allowing only about 500
notes per second, which may not be enough for
complex pieces
limited number of channels, lack of device
addressing and other flaws make configuring large
MIDI networks difficult
device dependence of MIDI data
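
The figure of roughly 500 notes per second can be reproduced with a short calculation. This sketch assumes each byte occupies 10 bits on the wire (8 data bits plus a start and a stop bit) and that a note costs one 3-byte note-on and one 3-byte note-off. Savings from running status are ignored.

```python
BIT_RATE = 31250             # bits per second on a MIDI cable
BITS_PER_BYTE_ON_WIRE = 10   # assumption: 8 data bits + start + stop bit
BYTES_PER_NOTE = 6           # assumption: 3-byte note-on + 3-byte note-off


def notes_per_second(bit_rate: int = BIT_RATE,
                     bits_per_byte: int = BITS_PER_BYTE_ON_WIRE,
                     bytes_per_note: int = BYTES_PER_NOTE) -> float:
    """Upper bound on complete notes per second over one MIDI link."""
    bytes_per_second = bit_rate / bits_per_byte
    return bytes_per_second / bytes_per_note


print(round(notes_per_second()))  # about 520
```

Controller and timing clock messages share the same link, so the rate achievable in practice is lower.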

57
Music - Representation

SMDL
the Standard Music Description Language was
developed by the MIPS committee of ANSI
SMDL encompasses representation of music for
electronic dissemination and production by
software, the representation of scores and
musical examples in printed documents and the
representation of musical annotation and
attributes used for musical analysis or by music
databases
SMDL is a DTD of SGML, based on a document type
called musical works or works. Each work has 4
hierarchically structured sections
core section - musical events, such as note
sequences, which form the work
gestural section - performances of the core,
which may differ in interpretation
visual section - displays the core in printed
form, including formatting and lyrics
analytical section - allows a number of
theoretical analyses on the core, its score and
performances to be included in the work

58
Music - Operations

In considering music representation, we can
recognise several advantages over audio
music representation will be more compact than
audio
it is portable and can be synthesised with the
fidelity and complexity appropriate to the output
devices used
while digital audio suffers from inherent noise,
musical representations are noise free
many operations can be performed on music that
would be infeasible or require extensive
processing on audio
Playback and Synthesis
during audio playback, the listener has limited
influence over the musical aspects of the
performance, beyond changing the volume or
processing the audio in some way. If music is
produced by synthesis from a structural
representation the listener can

59
Music - Operations

independently change pitch and tempo, increase
or decrease individual instruments' volumes or
change the sounds they produce
musical representations offer greater potential
for interactivity than audio
Timing
structural representation makes timing of musical
events explicit
the ability to modify tempo makes it possible to
alter the timing of groups of musical events and
adjust the synchronisation of those events with
other events (film, video, etc.)
Editing and Composition
basic editing allows the user to modify primitive
events and notes
more complex editing operations operate on
musical aggregates (chords, bars, etc.) to permit
phrase-repetition, melody replacement and other
such functions
composition software simplifies the task of
generating and combining or rearranging tracks,
and prints the score
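
The editing operations above are simple to express once music is held as a structural representation. The sketch below assumes a hypothetical `Note` record with a start time and duration in ticks and a MIDI key number. It is an illustration, not the data model of any particular sequencer.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Note:
    start: int     # start time in ticks
    duration: int  # length in ticks
    key: int       # MIDI key number, 0-127


def transpose(notes: list[Note], semitones: int) -> list[Note]:
    """Shift every note's pitch; timing is left untouched."""
    shifted = [replace(n, key=n.key + semitones) for n in notes]
    if any(not 0 <= n.key <= 127 for n in shifted):
        raise ValueError("transposition leaves the 0-127 key range")
    return shifted


def repeat_phrase(notes: list[Note], times: int) -> list[Note]:
    """Repeat a phrase back to back, offsetting each copy by its length."""
    length = max(n.start + n.duration for n in notes)
    return [replace(n, start=n.start + i * length)
            for i in range(times) for n in notes]


phrase = [Note(0, 96, 60), Note(96, 96, 64)]
print(transpose(phrase, 2))
print(repeat_phrase(phrase, 2))
```

The same operations on sampled audio would need pitch-shifting and splicing of the waveform, which is the contrast the slide draws.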

60
Animation
Representation
Cel models
Scene-based models
Event-based models
Key frames
Articulated objects and hierarchical models
Scripting and procedural models
Physically-based and empirical models
Operations
Graphics operations
Motion parameter control
Rendering
Playback
61
Animation - Representation

Separating animation and video follows the same
track we took in separating image and graphic,
based on modelling.
Animation types provide models which are rendered
to produce video.
Animation is distinct from graphic in that it is
time-dependent, but as in the image<->video
relationship, sampling an animation model at a
particular time will result in a graphics model,
which can be rendered to produce an image
Animation Representation
Cel models
early animators drew on transparent celluloid
sheets or cels, different sheets contained
different parts of the scene, which was assembled
by overlaying the sheets
in animation, cels are digital images with a
transparency channel

62
Animation - Representation

scenes are rendered by drawing the cels back to
front, with movement being added by changing the
position of cels from one frame to the next
a cel model is therefore a set of images, their
back to front order, and their relative position
and orientation in each frame
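
A cel model as described above can be sketched as follows. This is a toy illustration under stated assumptions: cels are small grids of single-character pixels, `None` marks a transparent pixel, and orientation is ignored so that only position changes between frames. The names `Placement` and `render_frame` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Placement:
    cel: str  # name of the cel image to draw
    x: int    # column of the cel's top-left corner in the frame
    y: int    # row of the cel's top-left corner in the frame


def render_frame(cels: dict, placements: list, width: int, height: int,
                 background: str = ".") -> list:
    """Composite cels onto a frame; placements are listed back to front."""
    canvas = [[background] * width for _ in range(height)]
    for p in placements:
        for row, line in enumerate(cels[p.cel]):
            for col, pixel in enumerate(line):
                cx, cy = p.x + col, p.y + row
                # Transparent pixels leave whatever is behind them visible.
                if pixel is not None and 0 <= cx < width and 0 <= cy < height:
                    canvas[cy][cx] = pixel
    return ["".join(r) for r in canvas]


cels = {"sky": [["s", "s", "s"]], "ball": [["o"]]}
frame1 = [Placement("sky", 0, 0), Placement("ball", 0, 0)]
frame2 = [Placement("sky", 0, 0), Placement("ball", 1, 0)]
print(render_frame(cels, frame1, 3, 1), render_frame(cels, frame2, 3, 1))
```

Movement comes only from the changed position of the `ball` cel between the two frames, and the images themselves are reused.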
Scene-based models
simply a sequence of graphics models, each
representing a complete scene
