Media Types - PowerPoint PPT Presentation



Slides: 67
Provided by: iana167
Tags: media | types


Transcript and Presenter's Notes

Title: Media Types

Media Types
  • Text
  • Image
  • Graphics
  • Audio
  • Video

ISO Character Sets
Marked-up Text
Structured Text
Character Operations
String Operations
Pattern-matching searching
Language-specific operations
Text - Representation
  • 7-bit code
  • 128 values in ASCII character set
  • use of 8th bit in text editors/word processors
    creates incompatibility
  • ISO character sets
  • extended ASCII to support non-English text
  • ISO Latin provides support for accented characters
  • à, ö, ø, etc.
  • ISO sets include Chinese, Japanese, Korean
  • 16-bit format
  • 65,536 different symbols

Text - Representation
  • Marked-up text
  • nroff, troff
  • LaTeX
  • SGML
  • HTML
  • HyTime
  • Structured Text
  • structure of text represented in data structure,
    usually tree-based
  • ODA, structure embedded in byte-stream with
  • Hypertext
  • non-linear
  • graph or web structure nodes and links
  • currently subject of intensive ISO standards

Text - Operations
  • Character operations
  • basic data type with assigned value
  • permits direct character comparison (a < b)
  • String operations
  • comparison
  • concatenation
  • substring extraction and manipulation
  • Editing
  • perhaps the most familiar set of operations on text
  • cut/copy/paste
  • strings v. blocks, dependent on document structure

Text - Operations
  • Formatting
  • interactive or non-interactive (WYSIWYG v. LaTeX)
  • formatted output
  • bitmap
  • page description language (PostScript, PDF)
  • font management
  • typeface
  • point size (1 point = 1/72 of an inch)
  • TrueType fonts: geometric description, kerning
  • Pattern-matching and Searching
  • search and replace
  • wildcards
  • regular expressions
  • for large bodies of text, or text databases, use
    of inverted indices, hashing techniques and

Text - Operations
  • Sorting
  • numerous varieties of sort, all of them
    extensively studied in basic programming
  • sort complexity is a major factor in data
    handling performance
  • Compression
  • ASCII uses 7 bits per character, though most
    word-processors actually use the 8th bit to use
    up a byte per character
  • Information theory estimates 1-2 bits per
    character to be sufficient for natural language
  • This redundancy can be removed by encoding
  • Huffman varies the numbers of bits used to
    represent characters, shortest codes for highest
    frequency characters
  • Lempel-Ziv identifies repeating strings and
    replaces them by pointers to a table
  • Both techniques compress English text at a ratio
    of between 2:1 and 3:1
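
The Huffman idea described above (shortest codes for the most frequent characters) can be sketched in a few lines. This is a minimal illustration rather than a production encoder: the function names and the sample string are assumptions made for the example, and the compression ratio is measured against 8 bits per character.

```python
import heapq
from collections import Counter


def huffman_codes(text):
    """Return a dict mapping each character of `text` to a Huffman bit string."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:
        # Degenerate case: a single distinct symbol still needs one bit.
        return {next(iter(freq)): "0"}
    # Heap entries are (weight, tie_breaker, {char: code_so_far}).
    # The tie breaker keeps Python from ever comparing the dicts.
    heap = [(w, i, {ch: ""}) for i, (ch, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        # Merge the two lightest subtrees; one branch gets 0, the other 1.
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {ch: "0" + code for ch, code in left.items()}
        merged.update({ch: "1" + code for ch, code in right.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


def encoded_bits(text, codes):
    """Total number of bits needed to encode `text` with `codes`."""
    return sum(len(codes[ch]) for ch in text)


sample = "this is an example of a huffman tree"
codes = huffman_codes(sample)
bits = encoded_bits(sample, codes)
ratio = (8 * len(sample)) / bits
print(f"{bits} bits instead of {8 * len(sample)}, ratio {ratio:.2f}:1")
```

On short samples the ratio depends heavily on the text chosen, and a real encoder would also have to store or transmit the code table.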

Text - Operations
  • Encryption
  • text encryption is widely used in electronic mail
    and networked information systems
  • most widely-used techniques
  • DES
  • RSA public-key
  • PGP
  • subject of major controversy
  • key escrow systems
  • Clipper chip
  • strong encryption now being legally outlawed in
    a number of countries
  • Language-specific operations
  • spell-checking
  • parsing and grammar checking
  • style analysis

Colour Model
Alpha Channels
Number of Channels
Channel Depth
Pixel Aspect Ratio
Point operations
Geometric transformations
Image - Representation
  • Colour Model
  • 2 main types
  • colour production on output device
  • theory of human colour perception
  • CIE colour space
  • international standard used to calibrate other
    colour models
  • developed in 1931, as CIE XYZ, based on
    tristimulus theory of colour specification

Image - Representation
  • RGB
  • numeric triple specifying red, green and blue
  • convenient for video display drivers since
    numbers can be easily mapped to voltages for RGB
    guns in colour CRTs
  • HSB
  • Hue - dominant colour of sample, angular value
    varying from red to green to blue at 120° intervals
  • Saturation - the intensity of the colour
  • Brightness - the amount of gray in the colour
  • CMYK
  • displays emit light, so produce colours by adding
    red, green and blue intensities
  • paper reflects light, so to produce a colour on
    paper one uses inks that subtract all colours
    other than the one desired
  • printers use inks corresponding to the
    subtractive primaries, cyan, magenta and yellow
    (complements of RGB)

Image - Representation
  • additionally, since inks are not pure, a special
    black ink is used to give better blacks and grays
  • YUV
  • colour model used in the television industry
  • also YIQ, YCbCr, and YPbPr
  • Y represents luminance, effectively the
    black-and-white portion of a video signal
  • UV are colour difference signals, form the colour
    portion of a video signal, and are called
    chrominance or chroma
  • YUV makes efficient use of bandwidth as the human
    eye has greater sensitivity to changes in
    luminance than chrominance, so bandwidth can be
    better utilised by allocating more to luminance
    and less to chrominance
  • Alpha Channels
  • images may have one or more alpha channels
    defining regions of full or partial transparency
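
The relationships between RGB and the CMYK and YUV models described above can be illustrated with two small conversion functions. This is a sketch under stated assumptions: channel values are floats in the range 0 to 1, the CMYK conversion is the simple device-independent complement formula (real printers use calibrated profiles), and the YUV weights are the commonly quoted analog television coefficients, which the slides do not give.

```python
def rgb_to_cmyk(r, g, b):
    """Naive RGB -> CMYK: take complements, then pull the common grey into K."""
    k = 1.0 - max(r, g, b)
    if k == 1.0:
        # Pure black: only the black ink is used.
        return (0.0, 0.0, 0.0, 1.0)
    c = (1.0 - r - k) / (1.0 - k)
    m = (1.0 - g - k) / (1.0 - k)
    y = (1.0 - b - k) / (1.0 - k)
    return (c, m, y, k)


def rgb_to_yuv(r, g, b):
    """RGB -> YUV: Y is luminance, U and V are colour difference signals."""
    y = 0.299 * r + 0.587 * g + 0.114 * b  # assumed luminance weights
    u = 0.492 * (b - y)                    # scaled blue difference
    v = 0.877 * (r - y)                    # scaled red difference
    return (y, u, v)


print(rgb_to_cmyk(1.0, 0.0, 0.0))  # red needs magenta and yellow ink
print(rgb_to_yuv(0.5, 0.5, 0.5))   # a grey has (almost) no chrominance
```

Note how a neutral grey produces a luminance value with near-zero colour difference signals, which is why chrominance can be given less bandwidth.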

Image - Representation
  • can be used to store selections and to create
    masks and blends
  • Number of channels
  • the number of pieces of information associated
    with each pixel
  • usually the dimensionality of the colour model
    plus the number of alpha channels
  • Channel depth
  • number of bits-per-pixel used to encode the
    channel values
  • commonly 1,2,4 or 8 bits, less commonly 5,6,12 or
  • in a multiple channel image, different channels
    can have different depths
  • Interlacing
  • storage layout of a multiple channel image could
    separate channel values (all R values, followed
    by all G, followed by all B) or could use
    interlacing (all RGB for pixel 1, all RGB for
    pixel 2, and so on)

Image - Representation
  • Indexing
  • pixel colours can be represented by an index in a
    colour map or a colour lookup table (CLUT)
  • Pixel aspect ratio
  • ratio of pixel width to height
  • square pixels are simple to process, but some
    displays and scanners work with rectangular pixels
  • if the pixel aspect ratios of an image and a
    display differ the image will appear stretched or
  • Compression
  • a page-sized 24-bit colour image produced by a
    scanner at 300dpi takes up about 20 Mbytes
  • many image formats compress pixel data, using
    run-length coding, LZW, predictive coding and
    transform coding
  • many image formats exist; JPEG, GIF, TIFF and BMP
    are the most widely used
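
The storage figure quoted above for an uncompressed scanned page can be checked with simple arithmetic. The slides do not say which page size is meant, so the A4 dimensions below (8.27 by 11.69 inches) are an assumption, as is the function name.

```python
def raw_image_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Size in bytes of an uncompressed image scanned at the given resolution."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bits_per_pixel // 8


# Assumed page size: A4, roughly 8.27 x 11.69 inches.
size = raw_image_bytes(8.27, 11.69, 300, 24)
print(f"{size / 1e6:.1f} Mbytes uncompressed")
```

With these assumptions the result is in the low tens of megabytes, the same order of magnitude as the figure on the slide; a smaller scanned area would bring it closer to 20 Mbytes.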

Image - Operations
  • These operations can operate directly on pixel
    data or on higher-level features such as edges,
    surfaces and volumes
  • Operations on higher-level features fall into the
    domain of image analysis and understanding and
    will not be considered here
  • Editing
  • changing individual pixels for image touch-up,
    forms the basis of airbrushing and texturing
  • cutting, copying and pasting are supported for
    groups of pixels, from simple shape manipulation
    through to more complex foreground and background
    masking and blending
  • Point operations
  • consists of applying a function to every pixel in
    an image

Image - Operations
  • only uses the pixel's current value; neighbouring
    pixels cannot be used
  • Thresholding
  • a pixel is set to 1 or 0 depending on whether it
    is above or below a threshold value - creates
    binary images which are often used as masks when
  • Colour Correction
  • modifying the image to increase or reduce
    contrast, brightness, gamma effects, or to
    strengthen or weaken particular colours
  • Filtering
  • like point operations, operate on every pixel in
    an image, but use values of neighbouring pixels
    as well
  • used to blur, sharpen or distort images,
    producing a variety of special effects
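
The difference between a point operation and a filter can be shown on a tiny greyscale image held as a list of rows. This is a sketch only: the function names are made up for the example, a pixel exactly equal to the threshold is treated as "below", and the filter is a plain 3x3 mean (box blur) that shrinks its window at the image edges.

```python
def threshold(image, level):
    """Point operation: each output pixel depends only on the same input pixel."""
    return [[1 if p > level else 0 for p in row] for row in image]


def box_blur(image):
    """Filter: each output pixel is the mean of its 3x3 neighbourhood."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            neighbours = [
                image[j][i]
                for j in range(max(0, y - 1), min(h, y + 2))
                for i in range(max(0, x - 1), min(w, x + 2))
            ]
            out[y][x] = sum(neighbours) / len(neighbours)
    return out


image = [
    [0, 0, 0],
    [0, 9, 0],
    [0, 0, 0],
]
mask = threshold(image, 5)   # binary image, usable as a mask
blurred = box_blur(image)    # the bright centre is spread over its neighbours
print(mask)
print(blurred)
```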

Image - Operations
  • Compositing
  • the combining of two or more images to produce a
    new image
  • generally done by specifying mathematical
    relationships between the images
  • Geometric Transformations
  • basic transformations involve displacing,
    rotating, mirroring or scaling an image
  • more advanced transformations involve skewing and
    warping images
  • Conversions
  • conversions between image formats are commonplace
    and a number of public-domain, shareware and
    commercial tools exist to support these
  • other forms of conversion include compression and
    decompression, changing colour models, and
    changing image depth and resolution
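
One common example of the "mathematical relationship" used in compositing is the "over" blend, where an alpha value controls how much of the foreground covers the background. The sketch below applies it to a single pixel; the function name, the use of one alpha value per pixel, and colour channels in the range 0 to 1 are assumptions for illustration.

```python
def over(foreground, alpha, background):
    """Blend one foreground pixel over a background pixel.

    alpha = 1.0 means the foreground is fully opaque,
    alpha = 0.0 means it is fully transparent.
    """
    return tuple(
        alpha * f + (1.0 - alpha) * b
        for f, b in zip(foreground, background)
    )


red = (1.0, 0.0, 0.0)
blue = (0.0, 0.0, 1.0)
print(over(red, 1.0, blue))  # opaque foreground hides the background
print(over(red, 0.5, blue))  # half-transparent foreground mixes the two
```

Applying the same function to every pixel pair of two equally sized images composites the whole image.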

  • Graphics

Geometric Models
Solid Models
Physically-based Models
Empirical Models
Drawing Models
External formats for Models
Primitive Editing
Structural Editing
Graphics - Representation
  • The central notion of graphics, as opposed to
    image data, is in the rendering of graphical data
    to produce an image. A graphics type or model is
    therefore the combination of a data type plus a
    rendering operation
  • Graphics Representation
  • Please note - object in graphics modelling
    usually refers to an element of the scene being
    modelled, unless you are using object-oriented
    graphics programming
  • Geometric Models
  • consist of 2D and/or 3D geometric primitives
  • 2D primitives include lines, rectangles, ellipses
    plus more general polygons and curves
  • 3D primitives include the above plus surfaces of
    various forms. Curves and curved surfaces
    described by parameterised polynomials

Graphics - Representation
  • primitives are first described in local or object
    co-ordinates, then arranged in groups in a common
    world co-ordinate system by applying modelling transformations
  • transformations include rotation, translation and scaling
  • primitives can be used to build structural
    hierarchies, allowing each structure thus created
    to be broken down into lower-level structures and
    primitives (i.e. blueprinting)
  • Several standard device-independent graphics
    libraries are based on geometric modelling
  • GKS (Graphical Kernel System (ISO))
  • PHIGS (Programmer's Hierarchical Interactive
    Graphics System (ISO)) - see also PHIGS and PEX
  • OpenGL - portable version of Silicon Graphics
  • Solid Models
  • Constructive Solid Geometry (CSG) - solid objects
    are combined using the set operators union,
    intersection and difference.
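
Placing a primitive defined in local co-ordinates into the world co-ordinate system, as described above, is usually done with matrices. The sketch below shows the 2D case with 3x3 homogeneous matrices; the helper names are hypothetical and only rotation and translation are shown.

```python
import math


def translate(tx, ty):
    """Matrix that moves a point by (tx, ty)."""
    return [[1.0, 0.0, tx], [0.0, 1.0, ty], [0.0, 0.0, 1.0]]


def rotate(theta):
    """Matrix that rotates a point anticlockwise by theta radians about the origin."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def compose(a, b):
    """Matrix product a.b: the transformation b is applied first, then a."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
        for i in range(3)
    ]


def apply(m, point):
    """Apply a homogeneous transformation matrix to a 2D point."""
    x, y = point
    return (
        m[0][0] * x + m[0][1] * y + m[0][2],
        m[1][0] * x + m[1][1] * y + m[1][2],
    )


# Rotate a local point by 90 degrees, then move it to (2, 0) in world space.
local_to_world = compose(translate(2.0, 0.0), rotate(math.pi / 2))
world_point = apply(local_to_world, (1.0, 0.0))
print(world_point)
```

Composing the matrices once and reusing the result for every vertex of a primitive is what makes structural hierarchies cheap to transform as a group.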

Graphics - Representation
  • Surfaces of revolution - a solid is formed by
    rotating a 2D curve about an axis in 3D space -
  • Extrusion - a 2D outline is extended in 3D space
    along an arbitrary path
  • Using the above techniques will produce models
    much faster than building them up from geometric
    primitives, but rendering them will be expensive
  • Physically-based Models
  • realistic images can be produced by modelling the
    forces, stresses and strains on objects
  • when one deformable object hits another, the
    resulting shape change can be numerically
    determined from their physical properties
  • Empirical Models
  • complex natural phenomena (clouds, waves, fire,
    etc.) are difficult to describe realistically
    using geometric or solid modelling

Graphics - Representation
  • while physically based models are possible, they
    may be computationally expensive or intractable
  • the alternative is to develop models based on
    observation rather than physical laws, such
    models do not embody the underlying physical
    processes that cause these phenomena but they do
    produce realistic images
  • fractals, probabilistic graph grammars (used for
    branching plant structures) and particle
    systems (used for fires and explosions) are
    examples of empirical models
  • Drawing Models
  • describing an object in terms of drawing or
    painting actions
  • the description can be seen as a sequence of
    commands to an imaginary drawing device -
    PostScript, LOGO turtle graphics
  • External formats for Models
  • need for export/import formats between graphics
  • CGM CAD are OK. Postscript and RIB are

Graphics - Operations
  • Primitive editing
  • specifying and modifying the parameters
    associated with the model primitives
  • e.g. specify the type of a primitive and the
    vertex coordinates and surface normals
  • Structural editing
  • creating and modifying collections of primitives
  • establish spatial relationships between members
    of collections
  • Shading
  • the modelling techniques described so far have
    provided the means to specify the shape of
    objects, but shading provides further information
    for the image in describing the interaction of
    light with the object. This interaction is
    described in terms of the colour of an object,
    how it reflects light and if it transmits light

Graphics - Operations
  • several general-purpose methods exist to describe
    shading, most initially describe the surface of
    the object using meshes of small, polygonal
    surface patches
  • flat shading - each patch is given a constant colour
  • Gouraud shading - colour information is
    interpolated across a patch
  • Phong shading - surface normal information is
    interpolated across a patch
  • Ray tracing and radiosity - physical models of
    light behaviour are used to calculate colour
    information for each patch, giving highly
    realistic results
  • for photorealistic images extremely flexible
    shading is required, tools such as RenderMan
    actually provide programmable shaders which can
    be attached to objects, simulating different
    light effects and surface normals.
  • Mapping
  • techniques for enhancing the visual appearance of

Graphics - Operations
  • Texture mapping
  • an image, the texture map, is applied to a surface
  • requires a mapping from 3D surface coordinates to
    2D image coordinates, so given a point on the
    surface the image is sampled and the resulting
    value used to colour the surface at that point
  • shaders can also provide solid textures, where
    the texture is obtained from 3D rather than 2D
    space, and procedural textures, where the texture
    is calculated rather than sampled
  • Bump mapping
  • as texture mapping, but used to perturb the normal
    vector of the surface rather than the colour
  • used to describe minor surface changes such as
    scratches or scrapes
  • Displacement mapping
  • local modifications to the position of a surface
  • produces ridges or grooves

Graphics - Operations
  • Environment mapping
  • also known as reflection mapping, used to handle
    limited forms of reflection
  • more primitive technique than ray-tracing
  • Shadow mapping
  • similar to environment mapping in that it
    provides a primitive lighting effect without the
    expense of ray-tracing
  • produces shadows
  • Lighting
  • within a model, in addition to the graphics
    objects, there are lights to illuminate the
    scene. There are various forms of light source,
    each of which can be parametrically specified
  • ambient light - background lighting, comes from
    all directions with equal intensity
  • point lights - come from specific points in
    space, intensity governed by inverse square law

Graphics - Operations
  • directional lights - located at infinity in some
    direction, intensity is constant
  • spot lights - illuminating a cone-shaped volume
  • Viewing
  • to produce an image of a 3D model we require a
    transformation which projects 3D world
    coordinates onto 2D image coordinates
  • transformation applied to viewing volume, that
    part of the model that appears in the image
  • view specification consists of selecting the
    projection transformation, usually from parallel
    or perspective projections although camera
    attributes can be specified in some renderers,
    and the view volume
  • Rendering
  • rendering converts a model, including shading,
    lighting and viewing information, into an image
  • software allows selection and fine-tuning of
    control parameters

Graphics - Operations
  • output resolution - the width and height of the
    output image in pixels, and the pixel depth
  • rendering time - quick and low-quality v. slow
    and high resolution
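
The two projection types mentioned under Viewing can be contrasted with a minimal sketch. It assumes a camera at the origin looking along the positive z axis and an image plane at distance `focal_length`; these conventions and the function names are assumptions, since the slides do not fix a particular camera model.

```python
def project_parallel(point):
    """Parallel (orthographic) projection: simply discard the depth."""
    x, y, _z = point
    return (x, y)


def project_perspective(point, focal_length=1.0):
    """Perspective projection: scale x and y by focal_length / depth."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the camera (z > 0)")
    return (focal_length * x / z, focal_length * y / z)


near = (2.0, 4.0, 2.0)
far = (2.0, 4.0, 8.0)
print(project_parallel(near), project_parallel(far))        # same image position
print(project_perspective(near), project_perspective(far))  # far point is closer to the centre
```

Under parallel projection the two points land on the same image position, whereas perspective projection makes the more distant point appear nearer the image centre.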

Digital Video
Analog formats sampled
Sampling rate
Sample size and quantisation
Data rate
Frame rate
Support for interactivity
Digital Video - Representation
  • Analog formats sampled
  • Digital video frames can be obtained in two ways
  • Synthesis - usually by a computer program
  • Sampling - of an analog video signal. Since
    analog video comes in various different flavours,
    according to frame rate, scan rate, composite v
    component, sampling rate and size vary.

Digital Video - Representation
  • Sampling rate
  • the value of the sampling rate determines the
    storage requirement and data transfer rate
  • the lower limit for the frequency at which to
    sample in order to faithfully reproduce the
    signal, the Nyquist rate, is twice the highest
    frequency within the signal
  • video processing is simplified if each frame and
    each scan line give rise to the same number of
    samples, requiring the sampling frequency to be
    an integer multiple of the scan rate
  • Sample size and quantisation
  • sample size is the number of bits used to
    represent sample values
  • quantisation refers to the mapping from the
    continuous range of the analog signal to discrete
    sample values
  • choice of sample size is based on
  • signal to noise ratio of sampled signal
  • sensitivity of medium used to display frames

Digital Video - Representation
  • sensitivity of the human eye
  • digital video commonly uses linear quantisation,
    where quantisation levels are evenly distributed
    over the analog range (as opposed to logarithmic quantisation)
  • Data rate
  • high data rate formats can be reduced to lower
    data rates by a combination of
  • compression
  • reducing horizontal and vertical resolution
  • reducing the frame rate
  • for example
  • start with broadcast quality digital video at
  • divide the horizontal and vertical resolutions by
    2, giving VHS quality resolution
  • divide the frame rate by 2
  • compress at a rate of 10:1
  • data rate becomes 1 Mbit/s, suitable for use on
    LANs and on optical storage devices (e.g. CD-ROM)

Digital Video - Representation
  • Frame rate
  • 25 or 30 fps equates to analog frame rate, or
    full-motion video
  • at 10-15 fps motion is less accurately depicted
    and the image flickers, but the data rate is much lower
  • Compression
  • we have already considered compression
    techniques, in digital video we can compare
    methods by three factors
  • Lossy v. lossless
  • Real-time compression - trade-off between
    symmetric models and asymmetric models with
    real-time decompression
  • Interframe (relative) v. Intraframe (absolute)
    compression (e.g. MPEG-1 v. Motion JPEG)
  • Support for interactivity
  • random access to frames
  • differential rate and reverse playback
  • cut and paste capability

Digital Video - Representation
  • Scalability
  • scalable video allows control over video quality,
    we can identify 2 forms
  • Transmit scalability - encoded data rate is
    chosen at compression time from a range of rates,
    governed by transmission and processing
    constraints and/or storage capacity. Currently in
    use for low rate digital video
  • Receive scalability - decoded data rate is chosen
    at decompression time to match playback
    requirements. Attractive concept but not yet
    available in current video coding standards
  • current approaches to low rate digital video
  • DVI (Digital Video Interactive) - two forms,
    Production Level Video (PLV) and Real-Time Video
    (RTV). PLV only really intended for playback, RTV
    produces poorer quality but is intended for
    compression. Both use interframe compression to
    achieve rates of 1Mbit/s, but require costly
  • MPEG-1 - 1Mbit/s

Digital Video - Representation
  • MPEG-2 - broadcast quality video at rates between
  • MPEG-4 - low data rate video
  • MPEG-7 - metadata standard for video
  • Motion JPEG
  • px64 (CCITT H.261) - intended for video
    applications using ISDN (Integrated Services
    Digital Network). Known as px64 since it produces
    rates that are multiples of ISDN's 64 Kbit/s B
    channel rate. Uses similar techniques to MPEG
    but, since compression and decompression must be
    real-time, quality tends to be poorer.
  • H.263 - based on H.261, but offers 2.5 times
    greater compression, uses MPEG-1 and MPEG-2

Digital Video - Operations
  • Storage
  • to record or playback digital video in real-time,
    the storage system must be capable of sustaining
    data transfer at the video data rate
  • 4 main forms of storage for digital video are
  • Magnetic tape - at present only magnetic tape can
    provide the very high capacity storage required
    for digital video at practical costs (1 hour of
    CCIR 601 4:2:2 uses 72 Gbytes, while 1 hour of
    digital HDTV requires nearly 1 Tbyte)
  • Special purpose magnetic storage systems - useful
    for short durations of high data rate digital
    video, can be connected direct to external
    equipment and are thus useful for capture and
    editing (see diagram)
  • Video memory boards - specialist boards with
    large amounts of semiconductor memory (several
    hundred Mbytes or more), capable of storing short
    durations of uncompressed digital video, useful
    for capture and editing.
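
The storage figures for digital video translate directly into sustained data rates, and vice versa. The sketch below converts the 72 Gbytes per hour quoted for CCIR 601 into a bit rate and works out how long a small disk lasts at the 1 Mbit/s MPEG-1 rate. It assumes decimal units (1 Gbyte = 10^9 bytes, 1 Mbyte = 10^6 bytes); the function names are made up for the example.

```python
def data_rate_mbit_s(gbytes_per_hour):
    """Sustained data rate in Mbit/s for a given storage use per hour."""
    bits_per_hour = gbytes_per_hour * 1e9 * 8
    return bits_per_hour / 3600 / 1e6


def minutes_to_fill(capacity_mbytes, rate_mbit_s):
    """Minutes of video that fit in the given capacity at the given rate."""
    capacity_bits = capacity_mbytes * 1e6 * 8
    seconds = capacity_bits / (rate_mbit_s * 1e6)
    return seconds / 60


ccir_rate = data_rate_mbit_s(72)        # uncompressed CCIR 601 4:2:2
mpeg1_minutes = minutes_to_fill(100, 1.0)  # 100 Mbyte disk at 1 Mbit/s
print(f"CCIR 601: about {ccir_rate:.0f} Mbit/s sustained")
print(f"100 Mbyte disk at 1 Mbit/s: about {mpeg1_minutes:.1f} minutes")
```

The first figure shows why only specialised storage can sustain uncompressed video in real time, while the second shows that capacity, not transfer rate, is the limit for low data rate formats.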

Digital Video - Operations
  • General purpose magnetic and optical storage
    systems - most low data rate video
    representations (MPEG, etc.) were designed to
    support the use of conventional storage media for
    real-time video playback. The problem is the size
    of storage: even using MPEG-1, 13 minutes of video
    will fill a 100 Mbyte disk.
  • Retrieval
  • uses frame addressing, as in analog video, but
    there are some problems
  • low data rate formats result in variable sized
    frames, so an index giving frame offsets needs to
    be maintained to support random access
  • interframe compression techniques, e.g. MPEG,
    only code key frames independently; other frames
    are derived from these key frames. So random
    access requires first finding the nearest key
    frame and then using it to decode the desired
    frame, again using the index but enhancing it
    with key frame locations
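
The index described above, frame offsets enhanced with key frame locations, can be sketched as a small class. This is a simplified model: the class and method names are hypothetical, and it assumes every frame depends only on frames back to the preceding key frame, which ignores the bidirectionally predicted frames that MPEG also allows.

```python
import bisect


class FrameIndex:
    """Random-access index for variable-sized, interframe-coded video."""

    def __init__(self, offsets, key_frames):
        self.offsets = offsets                # byte offset of each frame
        self.key_frames = sorted(key_frames)  # frame numbers coded independently

    def nearest_key_frame(self, frame):
        """Return the last key frame at or before `frame`."""
        i = bisect.bisect_right(self.key_frames, frame) - 1
        if i < 0:
            raise ValueError("no key frame precedes the requested frame")
        return self.key_frames[i]

    def frames_to_decode(self, frame):
        """List (frame number, byte offset) pairs needed to show `frame`."""
        start = self.nearest_key_frame(frame)
        return [(n, self.offsets[n]) for n in range(start, frame + 1)]


index = FrameIndex(
    offsets=[0, 900, 1000, 1150, 2100, 2180],  # frames vary in size
    key_frames=[0, 4],
)
print(index.frames_to_decode(3))  # must start decoding at key frame 0
print(index.frames_to_decode(5))  # can start at key frame 4
```

The spacing of key frames is therefore a trade-off: fewer key frames improve compression but increase the work needed for each random access.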

Digital Video - Operations
  • Synchronisation
  • suffers same problems as analog video, so uses
    same techniques
  • digital video also has some additional techniques
    not available in analog video, such as changing
    resolution to maintain frame rate
  • Editing
  • 2 types
  • tape-based - same procedures as with analog
    video, except no generation loss and the players
    are on the same machine
  • nonlinear - basically a clips-library, using cut
    and paste techniques to build a video sequence
  • Mixing
  • real-time effects, such as tumbles, wipes and
    fades, are calculated in the same way as for
    analog video, in fact for the majority of such
    effects whether the original source is analog or
    digital, the effects are digitised

Digital Video - Operations
  • non-real-time effects are only possible using
    digital video, and obviate the need for
    specialist equipment, being only dependent on the
    speed of the processor and the patience of the
    user, storage considerations can be overcome with
    the use of pointers and single frame editing
  • Conversion
  • variety of formats demands conversion formats
  • real-time conversion requires specialist hardware
  • compression/decompression within a single format
    also requires specialist software/hardware

Digital Audio
Sampling frequency
Sample size and quantisation
Number of channels (tracks)
Negative samples
Effects and filtering
Digital Audio - Representation
  • Digital Audio Representation
  • 2 main areas
  • telecommunications
  • entertainment (audio CD)
  • Produced by sampling a continuous signal
    generated by a sound source. An analog-to-digital
    converter (ADC) takes as input an electrical
    signal corresponding to the sound and converts it
    into a digital data stream. The reverse process,
    to generate the sound through an amplifier and
    speakers, involves a digital-to-analog converter (DAC)
  • Sampling frequency (rate)
  • sampling theory shows that a signal can be
    reproduced without error from a set of samples,
    providing the sampling frequency is at least
    twice the highest frequency present in the
    original signal

Digital Audio - Representation
  • telephone networks allocate a 3.4kHz bandwidth to
    voice-grade lines, thus a sampling rate of 8kHz
    is used for digital telecommunications
  • the human ear is sensitive to frequencies of up
    to about 20kHz, so to digitise any perceivable
    sound a sampling rate of over 40kHz is required
  • Sample size and quantisation
  • during sampling, the continuously varying
    amplitude of the analog signal is approximated by
    digital values, this introduces a quantisation
    error, being the difference between the actual
    amplitude and the digital approximation
  • quantisation error is apparent when the signal is
    reconverted to analog form as distortion, a loss
    in audio quality
  • quantisation error can be reduced by increasing
    the sample size, as allowing more bits per sample
    will improve the accuracy of the approximation

Digital Audio - Representation
  • quantisation refers to breaking the continuous
    range of the analog signal into a number of
    unique digital intervals, based on one of a
    number of schemes
  • linear quantisation - uses equally spaced
    intervals, so if the sample size is 3 bits and
    the maximum signal variation is 5.0 then the
    quantisation interval would be 0.625 units of
    signal amplitude
  • nonlinear quantisation (especially logarithmic
    quantisation) - uses non-equally spaced
    intervals, lower amplitude intervals are more
    closely spaced than higher amplitude, results in
    greater sensitivity to lower amplitude sound
    where the human ear is most sensitive
  • Number of channels (tracks)
  • speech quality audio is mono (1 track)
  • stereo audio requires 2 tracks
  • some consumer audio equipment uses 4 tracks
  • professional audio equipment uses 16, 32 or more

Digital Audio - Representation
  • Interleaving
  • a multi-channel audio value can be encoded by
    interleaving channel samples or by providing
    separate streams for each channel
  • the advantage of interleaving is in
    synchronisation, and it also offers some benefits
    in storage and transmission
  • the disadvantages of interleaving are that it can
    be wasteful of space or bandwidth if not all
    channels are needed, it freezes the
    synchronisation between channels thus preventing
    temporal shifts, and it may not allow variation
    in the number of channels
  • Negative samples
  • the voltages found in analog audio signals
    alternate between positive and negative values
  • negative values can be encoded successfully for
    processing in two's complement, ones' complement or
    sign-magnitude representation

Digital Audio - Representation
  • Encoding
  • encoding audio data reduces storage and
    transmission costs, and compressed audio also
    provides better quality when compared to
    uncompressed audio at the same data rate
  • 2 commonly-used methods
  • PCM (Pulse Code Modulation) - uses the fact that
    a digital signal can be formed from a series of
    pulses. PCM values are simply sequences of
    uncompressed samples, so they provide a reference
    format for comparison with more complex coding
  • ADPCM (Adaptive Differential Pulse Code Modulation) -
    reduces PCM data rate by encoding the differences
    between samples. ADPCM is widely used and is
    associated with some encoding standards, such as
    CCITT G.721.
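
The core idea of encoding the differences between samples can be shown with plain delta coding. This is deliberately not ADPCM itself: real ADPCM also predicts each sample and adapts the quantiser step size, and nothing below follows the G.721 specification. The function names and sample values are made up for the example.

```python
def delta_encode(samples):
    """Replace each PCM sample by its difference from the previous one."""
    previous = 0
    deltas = []
    for s in samples:
        deltas.append(s - previous)
        previous = s
    return deltas


def delta_decode(deltas):
    """Rebuild the PCM samples by accumulating the differences."""
    total = 0
    samples = []
    for d in deltas:
        total += d
        samples.append(total)
    return samples


pcm = [100, 102, 105, 104, 101]
deltas = delta_encode(pcm)
print(deltas)
print(delta_decode(deltas) == pcm)
```

Because neighbouring audio samples are usually close in value, the differences after the first are small numbers that need fewer bits than the original samples.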

Digital Audio - Operations
  • Storage
  • it is possible to record digital audio, even at
    the data rates of the high quality formats, on
    general purpose magnetic storage
  • theoretically, a magnetic disk with a sustainable
    transfer rate of 5 Mbytes per second could
    playback 50 channels of CD-quality digital audio.
    In practice this would not be possible without a
    highly optimised layout, but one or two channels
    are easily within the reach of small computer
  • since an hour of stereo digital audio, at the CD
    data rate, requires over half a Gigabyte of
    storage, tertiary storage in the form of DAT
    tapes, CD discs or optical disks is normally
    adopted, with the information being mounted onto
    the system manually or through a jukebox
  • Retrieval
  • need to support random access and ensure
    continuous flow of data to DAC
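
The storage figures quoted for CD-quality audio can be checked with a little arithmetic. The slides do not state the CD parameters, so the standard values of 44.1 kHz sampling and 16-bit samples are assumed here, along with decimal units (1 Mbyte = 10^6 bytes).

```python
SAMPLE_RATE_HZ = 44_100  # assumed: standard audio CD sampling rate
BYTES_PER_SAMPLE = 2     # assumed: 16-bit samples


def bytes_per_second(channels):
    """Data rate of uncompressed PCM audio for the given number of channels."""
    return SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * channels


def channels_sustained(disk_bytes_per_second):
    """How many mono channels a disk transfer rate could feed in theory."""
    return disk_bytes_per_second // bytes_per_second(1)


stereo_hour = bytes_per_second(2) * 3600
max_channels = channels_sustained(5_000_000)
print(f"one hour of stereo: {stereo_hour / 1e9:.2f} Gbytes")
print(f"channels at 5 Mbytes/s: {max_channels}")
```

Both results agree with the slide: an hour of stereo needs a little over half a gigabyte, and a 5 Mbytes per second disk could in theory feed slightly more than 50 channels.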

Digital Audio - Operations
  • portions of audio sequences, segments, are
    identified by their starting time and duration;
    these can be located by mapping the starting
    time to a segment address, which the file system
    then maps to a physical address on disk
  • where there is no direct mapping to enable
    segment location by time code, an index of
    segments must be separately maintained
  • continuous flow of data is easy to maintain with
    a dedicated storage system, but requires careful
    control where storage is scheduled for a number
    of such tasks
  • Editing
  • as with digital video, 2 types
  • tape-based
  • disk-based
  • to avoid audible clicks when inserting one sample
    into another, cross-fades are used, where the
    amplitudes of the original segment and the
    inserted segment are added and scaled about the
    insertion point
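
A cross-fade of the kind described above can be sketched as a weighted sum of the two overlapping segments. The linear fade curve is an assumption (other curves, such as equal-power fades, are also used), as is the function name; both segments must cover the same number of samples.

```python
def cross_fade(outgoing, incoming):
    """Blend two equal-length sample lists with a linear fade.

    The outgoing segment fades from full level to silence while
    the incoming segment fades from silence to full level.
    """
    n = len(outgoing)
    if n == 0 or n != len(incoming):
        raise ValueError("segments must be non-empty and of equal length")
    mixed = []
    for i in range(n):
        w = i / (n - 1) if n > 1 else 1.0  # weight of the incoming segment
        mixed.append((1.0 - w) * outgoing[i] + w * incoming[i])
    return mixed


old = [1.0, 1.0, 1.0, 1.0, 1.0]
new = [0.0, 0.0, 0.0, 0.0, 0.0]
print(cross_fade(old, new))
```

Because the first output sample equals the outgoing signal and the last equals the incoming one, there is no sudden jump in amplitude at either end of the overlap, which is what removes the click.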

Digital Audio - Operations
  • digital audio also supports non-destructive
    editing, where the segments of data are accessed
    through a data structure known as a play-list,
    which essentially contains a set of pointers to
    the data and details on ordering and other forms
    of edit to be performed on the data when it is
  • Effects and filtering
  • digital filtering techniques permit a number of
    effects on audio
  • Delay
  • Equalisation
  • Normalisation
  • Noise reduction
  • Time compression and expansion
  • Pitch shifting
  • Stereoisation
  • Acoustic environments
  • Conversion
  • one format to another (uncompressing ADPCM -> PCM)
  • altering encoding parameters (e.g. resampling at
    lower frequency)
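
The play-list used for non-destructive editing, mentioned earlier in this section, is essentially an ordered list of pointers into unchanged sample data. A minimal sketch follows; the representation of each entry as a (start, length) pair and the function name are assumptions, and a real play-list would also carry details such as fades or gain changes.

```python
def render_play_list(source, play_list):
    """Assemble output samples from (start, length) references into `source`.

    The source data is only read, never modified.
    """
    output = []
    for start, length in play_list:
        output.extend(source[start:start + length])
    return output


recording = list(range(10))       # stand-in for recorded samples
play_list = [(6, 2), (0, 3)]      # play samples 6-7, then samples 0-2
edited = render_play_list(recording, play_list)
print(edited)
print(recording)                  # the original recording is untouched
```

Reordering or trimming the edit then only means changing the small play-list, not copying audio data.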

Operational v. Symbolic
Playback Synthesis
Editing Composition
Music - Representation
  • The existence of powerful, low-cost digital
    signal processors means that many computers can
    now record, generate and process music.
  • Music is also widely used in multimedia
    applications, so we require a media type for
    music to focus on the computer's musical
  • Representation of Music
  • Operational v. Symbolic
  • operational representations specify exact timings
    for music and physical descriptions of the sounds
    to be produced
  • symbolic representations use descriptive
    symbolism to describe the form of the music and
    allow great freedom in the interpretation
  • both types are described as structural
    representations, since instead of representing
    music by audio samples there is information about
    the internal structure of the music

Music - Representation
  • To illustrate the structural representations, we
    can consider two examples
  • MIDI - a widely used protocol allowing the
    connection of computers and musical equipment, an
    operational representation
  • SMDL - a proposal for a standard structure for
    documents containing musical information, having
    both operational and symbolic aspects
  • MIDI
  • the Musical Instrument Digital Interface was
    developed in the early 80s by musical equipment
    manufacturers
  • Devices
  • electronic keyboards and synthesisers
  • drum machines
  • sequencers (to record and play back MIDI)
  • music<->film and music<->video synchronisation

Music - Representation
  • Connection ports
  • MIDI OUT - allows a device to send MIDI messages
    it has produced to other MIDI devices
  • MIDI IN - receives MIDI messages from other MIDI
    devices
  • MIDI THRU - repeats received messages, permitting
    daisy-chaining of MIDI devices
  • MIDI devices process MIDI messages differently,
    according to their function or to the sound
    palette used by the device, hence different
    synthesisers can produce different sounds when
    supplied with the same MIDI messages
  • MIDI Concepts
  • Channel - a MIDI connection has 16 message
    channels, devices can be set to respond to all
    channels or only to specific channels
  • Key number - notes are identified by key number;
    there are 128, compared with the 88 keys of a
    standard keyboard
  • Controller - 128 different controllers are
    available under the MIDI protocol, though not all
    are currently defined, changing the value of a
    controller typically alters sound production

Music - Representation
  • Patch/program - an audio palette is called a
    program or patch, a synthesiser capable of having
    a number of patches active at the same time is
    called multi-timbral
  • Polyphony - the ability of a synthesiser to play
    many notes at a time
  • Song - a recorded or preprogrammed MIDI sequence
  • Timing clock - a MIDI sequencer timestamps
    messages using a timebase measured in parts per
    quarter note (PPQ). Typical timebase values are
    24, 96 and 480 PPQ. To convert the timebase into
    actual time you use the tempo, measured in beats
    per minute (BPM) where we assume that one beat is
    equal to a quarter note. Thus with a tempo of
    180 BPM (3 beats per second) and a timebase of
    96 PPQ, one tick lasts 1/3 x 1/96 = 1/288 s,
    about 3.5 ms
  • MIDI synchronisation - MIDI devices can be set to
    internal synch or external synch, when set to
    internal synch a device is known as a master and
    produces a timing clock message on its MIDI OUT
    at 24 PPQ which slave devices use for external
    synch
  • MTC - MIDI Time Code is used to synchronise MIDI
    with film or video, used to trigger sound effects
    or musical sequences
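
The timebase-to-time conversion described under "Timing clock" can be written out as a short calculation. This is a sketch that assumes, as the slide does, that one beat equals a quarter note; the function names are illustrative.

```python
def seconds_per_tick(tempo_bpm, timebase_ppq):
    """Duration of one timebase tick, taking one beat as a quarter note."""
    seconds_per_beat = 60.0 / tempo_bpm
    return seconds_per_beat / timebase_ppq


def ticks_to_seconds(ticks, tempo_bpm, timebase_ppq):
    """Convert a sequencer timestamp in ticks into seconds."""
    return ticks * seconds_per_tick(tempo_bpm, timebase_ppq)
```

For the example in the text, seconds_per_tick(180, 96) evaluates 1/3 x 1/96, i.e. 1/288 of a second per tick.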

Music - Representation
  • MIDI Protocol
  • based on 8-bit code for messages, each message
    consists of a single command byte and possibly
    one or more data bytes (see table)
  • Channel voice messages (8c-Ec) - determine the
    actual notes played, speed of hit and release and
    the values of controllers
  • Channel mode messages (Bc, with controllers
    121-127) - select the mode of a synthesiser,
    responding to one channel or all channels, each
    channel separately voiced or all voices used for
    one channel
  • System messages (F0-FF) - general system
    functions, timing clock, MIDI time code messages,
    system reset, start device, stop device, etc.
  • Limitations of MIDI
  • operates at 31250 bps, which allows about 500
    notes per second; this may not be enough for
    complex pieces
  • limited number of channels, lack of device
    addressing and other flaws make configuring large
    MIDI networks difficult
  • device dependence of MIDI data
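
To make the message layout and the bandwidth limit concrete, the sketch below builds Note On and Note Off messages (command bytes 9c and 8c, where c is the channel) and estimates the note rate. It assumes each byte costs 10 bits on the wire (8 data bits plus a start and a stop bit), that every note needs one Note On and one Note Off, and that channels are numbered 0-15; the function names are illustrative.

```python
def note_on(channel, key, velocity):
    """Build a 3-byte Note On message: command byte 9c, key, velocity."""
    if not 0 <= channel <= 15:
        raise ValueError("channel must be 0-15")
    if not (0 <= key <= 127 and 0 <= velocity <= 127):
        raise ValueError("data bytes must be 0-127")
    return bytes([0x90 | channel, key, velocity])


def note_off(channel, key, velocity=0):
    """Build a 3-byte Note Off message: command byte 8c, key, velocity."""
    if not 0 <= channel <= 15:
        raise ValueError("channel must be 0-15")
    if not (0 <= key <= 127 and 0 <= velocity <= 127):
        raise ValueError("data bytes must be 0-127")
    return bytes([0x80 | channel, key, velocity])


def notes_per_second(bit_rate=31250, bits_per_byte=10):
    """Estimate how many complete notes fit on one MIDI link per second."""
    bytes_per_second = bit_rate / bits_per_byte
    bytes_per_note = len(note_on(0, 60, 64)) + len(note_off(0, 60))
    return bytes_per_second / bytes_per_note
```

Under these assumptions the estimate comes out a little above 500 notes per second, in line with the figure quoted above. Techniques that shorten messages are ignored here.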

Music - Representation
  • SMDL
  • the Standard Music Description Language was
    developed by the MIPS committee of ANSI
  • SMDL encompasses representation of music for
    electronic dissemination and production by
    software, the representation of scores and
    musical examples in printed documents and the
    representation of musical annotation and
    attributes used for musical analysis or by music
  • SMDL is a DTD of SGML, based on a document type
    called musical works or works. Each work has 4
    hierarchically structured sections
  • core section - musical events, such as note
    sequences, which form the work
  • gestural section - performances of the core,
    which may differ in interpretation
  • visual section - displays the core in printed
    form, including formatting and lyrics
  • analytical section - allows a number of
    theoretical analyses on the core, its score and
    performances to be included in the work

Music - Operations
  • In considering music representation, we can
    recognise several advantages over audio
  • music representation will be more compact than
    audio
  • it is portable and can be synthesised with the
    fidelity and complexity appropriate to the output
    devices used
  • while digital audio suffers from inherent noise,
    musical representations are noise free
  • many operations can be performed on music that
    would be infeasible or require extensive
    processing on audio
  • Playback & Synthesis
  • during audio playback, the listener has limited
    influence over the musical aspects of the
    performance, beyond changing the volume or
    processing the audio in some way. If music is
    produced by synthesis from a structural
    representation the listener can

Music - Operations
  • independently change pitch and tempo, increase
    or decrease individual instruments' volumes or
    change the sounds they produce
  • musical representations offer greater potential
    for interactivity than audio
  • Timing
  • structural representation makes timing of musical
    events explicit
  • the ability to modify tempo makes it possible to
    alter the timing of groups of musical events and
    adjust the synchronisation of those events with
    other events (film, video, etc.)
  • Editing & Composition
  • basic editing allows the user to modify primitive
    events and notes
  • more complex editing operations operate on
    musical aggregates (chords, bars, etc.) to permit
    phrase-repetition, melody replacement and other
    such functions
  • composition software simplifies the task of
    generating and combining or rearranging tracks,
    and prints the score
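
The claim that pitch and tempo can be changed independently on a structural representation can be illustrated with a small sketch. The Note class and the transpose and schedule functions are hypothetical; notes are assumed to carry a start and length in beats and a MIDI-style key number, and key range checking is left out.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Note:
    start_beat: float
    length_beats: float
    key: int  # MIDI-style key number


def transpose(notes, semitones):
    """Shift pitch only; the timing of every note is left untouched."""
    return [replace(n, key=n.key + semitones) for n in notes]


def schedule(notes, tempo_bpm):
    """Return (start_seconds, duration_seconds, key) for each note."""
    seconds_per_beat = 60.0 / tempo_bpm
    return [
        (n.start_beat * seconds_per_beat, n.length_beats * seconds_per_beat, n.key)
        for n in notes
    ]
```

Changing tempo_bpm rescales every event time without touching pitch, which is also what makes resynchronising with film or video straightforward; doing the same on sampled audio needs time-stretching or pitch-shifting.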

Cel models
Scene-based models
Event-based models
Key frames
Articulated objects & hierarchical models
Scripting & procedural models
Physically-based & empirical models
Graphics operations
Motion parameter control
Animation - Representation
  • Separating animation and video follows the same
    track we took in separating image and graphic,
    based on modelling.
  • Animation types provide models which are rendered
    to produce video.
  • Animation is distinct from graphic in that it is
    time-dependent, but as in the image<->video
    relationship, sampling an animation model at a
    particular time will result in a graphics model,
    which can be rendered to produce an image
  • Animation Representation
  • Cel models
  • early animators drew on transparent celluloid
    sheets or cels, different sheets contained
    different parts of the scene, which was assembled
    by overlaying the sheets
  • in animation, cels are digital images with a
    transparency channel
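
A cel model with a transparency channel can be rendered by compositing the cels from back to front. The sketch below is a simplified illustration: pixels are grey values paired with an alpha in the range 0-1, each cel is a dict with hypothetical keys "pixels" and "pos", and cel orientation is ignored.

```python
def composite(cels, width, height, background=0.0):
    """Render one frame from cels listed back to front.

    Each cel is a dict with 'pixels' (rows of (value, alpha) pairs)
    and 'pos' (x, y offset of the cel's top-left corner in the frame).
    """
    frame = [[background] * width for _ in range(height)]
    for cel in cels:  # back to front
        ox, oy = cel["pos"]
        for r, row in enumerate(cel["pixels"]):
            for c, (value, alpha) in enumerate(row):
                x, y = ox + c, oy + r
                if 0 <= x < width and 0 <= y < height:
                    # Alpha 0 leaves whatever is behind fully visible.
                    frame[y][x] = alpha * value + (1.0 - alpha) * frame[y][x]
    return frame
```

Movement is then a matter of changing each cel's 'pos' from one frame to the next and compositing again.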

Animation - Representation
  • scenes are rendered by drawing the cels back to
    front, with movement being added by changing the
    position of cels from one frame to the next
  • a cel model is therefore a set of images, their
    back to front order, and their relative position
    and orientation in each frame
  • Scene-based models
  • simply a sequence of graphics models, each
    representing a complete scene