Title: Video Coding
1Video Coding
- TSBK01 Image Coding and Data Compression
- Lecture 10
- Jörgen Ahlberg
2Outline
- Colour coding
- Moving images: from 2D to 3D?
- Hybrid coding
- Video coding standards
3Part I: Colour Coding
- The base colours of colour television are
- Red 700 nm
- Green 546 nm
- Blue 435 nm
Three base colours are enough to synthesize any visible colour!
4The Colour Vector
5The PAL colours
[Figure: R, G and B enter a matrix, which outputs Y, R-Y and B-Y.]
- Y = 0.30R + 0.59G + 0.11B
- Cr = R - Y = 0.70R - 0.59G - 0.11B
- Cb = B - Y = -0.30R - 0.59G + 0.89B
- Y: luminance. Cr, Cb: chrominance.
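A minimal sketch of this change of basis, assuming the weights Y = 0.30R + 0.59G + 0.11B with Cr = R - Y and Cb = B - Y. The function names are illustrative and not part of any standard.

```python
def rgb_to_ycrcb(r, g, b):
    """Convert one RGB triple to (Y, Cr, Cb) using the weights from the slide."""
    y = 0.30 * r + 0.59 * g + 0.11 * b
    cr = r - y  # equals 0.70R - 0.59G - 0.11B
    cb = b - y  # equals -0.30R - 0.59G + 0.89B
    return y, cr, cb


def ycrcb_to_rgb(y, cr, cb):
    """Invert the transform; G follows from the luminance equation."""
    r = cr + y
    b = cb + y
    g = (y - 0.30 * r - 0.11 * b) / 0.59
    return r, g, b


# A grey pixel (R = G = B) carries no chrominance: Cr and Cb are (numerically) zero.
print(rgb_to_ycrcb(0.5, 0.5, 0.5))
```

Because the three weights sum to 1, any grey value maps to Y equal to that value with zero chrominance, which is why a black-and-white receiver can use Y alone.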
6Digital Colour Coding
- Change basis to YUV (almost the same as YCrCb).
- For more info on colour spaces, see the colour FAQ at www.poynton.com/Poynton-color.html
- The Human Visual System perceives the luminance in higher resolution than the chrominance!
- Subsample the colour components.
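One possible way to subsample a chrominance plane, sketched under the assumption that each 2×2 group of samples is replaced by its average (the slide does not say which filter or ratio is used). The helper name `subsample_2x2` is hypothetical.

```python
def subsample_2x2(plane):
    """Halve a chrominance plane in both directions by averaging 2x2 groups.

    `plane` is a list of rows; width and height are assumed to be even.
    """
    out = []
    for y in range(0, len(plane), 2):
        row = []
        for x in range(0, len(plane[0]), 2):
            total = (plane[y][x] + plane[y][x + 1]
                     + plane[y + 1][x] + plane[y + 1][x + 1])
            row.append(total / 4)
        out.append(row)
    return out


cb = [[10, 12, 20, 22],
      [14, 16, 24, 26],
      [30, 30, 40, 40],
      [30, 30, 40, 40]]
print(subsample_2x2(cb))  # [[13.0, 23.0], [30.0, 40.0]]
```

With both chrominance planes reduced this way, a picture needs 1.5 samples per pixel instead of 3, before any actual compression is applied.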
7Part II: Coding of Moving Images
- Principle I: Extend known methods to 3D
Coding method     Performance (bpp)   Complexity (encoding)   Decoding complexity
PCM               6-8                 Low                     Low
VQ                0.5-2               Very high               Low
Predictive        2-5                 Low                     Low
Transform         0.5-1.5             High                    High
Subband/Wavelet   0.1-1.0             High                    High
Fractal           0.1-0.5             Very high               Low
8Extending 2D Methods
- Predictive coding
- 3D predictors
- Motion compensated predictors
- Transform coding
- 3D transforms
- Subband coding
- 3D subband filters
- BUT! The properties of the image signal are
different in the temporal and the spatial domain!
9Thus
Principle II: Hybrid methods
Hybrid predictive/transform coding is popular
10Part III: Hybrid Coding
- Combine predictive coding and transform coding.
- Use predictive coding to predict the next frame in the sequence.
- Use transform coding to code the prediction error.
11Transform Coding
[Figure: transform coder block diagram with the blocks T, Q and VLC.]
T: Transform. Q: Quantizer. VLC: Variable Length Coder.
12Predictive Coding
[Figure: predictive coder block diagram with the blocks Q, VLC, Q-1 and P.]
Q: Quantizer. Q-1: Inverse quantizer (reconstructor). P: Predictor.
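A minimal sketch of a predictive coder built from a quantizer, an inverse quantizer and a predictor, assuming the simplest possible predictor (the previous reconstructed sample) and a uniform quantizer with a hypothetical parameter `step`. The variable length coder is left out; the function returns the quantizer indices that it would code.

```python
def dpcm_encode(samples, step):
    """Closed-loop predictive coding: predict from the previous reconstruction."""
    indices = []
    prediction = 0  # P: the previous reconstructed value (0 before the first sample)
    for s in samples:
        error = s - prediction
        q = round(error / step)             # Q: uniform quantizer
        indices.append(q)
        prediction = prediction + q * step  # Q-1, then update the predictor
    return indices


def dpcm_decode(indices, step):
    """The decoder repeats the reconstruction loop of the encoder."""
    out = []
    prediction = 0
    for q in indices:
        prediction = prediction + q * step
        out.append(prediction)
    return out


samples = [10, 12, 15, 15, 11]
indices = dpcm_encode(samples, step=2)
print(indices, dpcm_decode(indices, step=2))
```

Since the encoder predicts from its own reconstruction rather than from the original samples, encoder and decoder stay in step and the error per sample never exceeds half a quantizer step.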
13Hybrid Coding
[Figure: hybrid coder block diagram with the blocks T, Q, VLC, Q-1, T-1 and P.]
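The hybrid loop can be sketched for a single 8×8 block as follows. The assumptions here are an orthonormal DCT as the transform T, a uniform quantizer with one hypothetical `step` for all coefficients, and a prediction that is simply handed in (for example the co-located block of the previous reconstructed frame). All function names are illustrative.

```python
import math

N = 8


def dct_matrix(n=N):
    """Orthonormal DCT-II basis as an n x n list of rows."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1 if k == 0 else 2) / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


C = dct_matrix()
CT = transpose(C)


def hybrid_encode_block(block, prediction, step):
    """T and Q applied to the prediction error; returns quantizer indices."""
    error = [[block[i][j] - prediction[i][j] for j in range(N)] for i in range(N)]
    coeffs = matmul(matmul(C, error), CT)                      # T
    return [[round(c / step) for c in row] for row in coeffs]  # Q


def hybrid_reconstruct_block(indices, prediction, step):
    """Q-1 and T-1, then add the prediction (same code in encoder and decoder)."""
    coeffs = [[q * step for q in row] for row in indices]      # Q-1
    error = matmul(matmul(CT, coeffs), C)                      # T-1
    return [[prediction[i][j] + error[i][j] for j in range(N)] for i in range(N)]


previous = [[100] * N for _ in range(N)]   # stands in for the predictor output P
current = [[100 + i + j for j in range(N)] for i in range(N)]
indices = hybrid_encode_block(current, previous, step=4)
recon = hybrid_reconstruct_block(indices, previous, step=4)
```

The reconstructed block, not the original, would serve as the prediction source for the next frame, so that the decoder can form exactly the same prediction.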
14Frame Prediction
Better prediction if it can compensate for
motion!
15Motion Compensation
16Motion Compensated Hybrid Coding
17Motion Compensation
- Typically one motion vector per macroblock (4 transform blocks)
- Motion estimation is a time consuming process
- Hierarchical motion estimation
- Maximum length of motion vectors
- Clever search strategies
- Motion vector accuracy
- Integer, half or quarter pixel
- Bilinear interpolation
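A minimal sketch of motion estimation by exhaustive block matching with integer-pixel accuracy. The sum of absolute differences (SAD) as matching criterion and the names `full_search` and `max_disp` are assumptions; the slides do not prescribe a criterion or a search strategy.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between a block and a displaced reference block."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def full_search(cur, ref, bx, by, size, max_disp):
    """Return ((dx, dy), cost) for the integer motion vector with the lowest SAD.

    (dx, dy) points from the block position in the current frame to the
    matching position in the reference frame.
    """
    height, width = len(ref), len(ref[0])
    best = (0, 0)
    best_cost = sad(cur, ref, bx, by, 0, 0, size)
    for dy in range(-max_disp, max_disp + 1):
        for dx in range(-max_disp, max_disp + 1):
            # Skip candidates that fall outside the reference frame.
            if not (0 <= bx + dx and bx + dx + size <= width
                    and 0 <= by + dy and by + dy + size <= height):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if cost < best_cost:
                best, best_cost = (dx, dy), cost
    return best, best_cost
```

The cost grows with the square of `max_disp`, which is why the maximum vector length is limited and why hierarchical or otherwise clever search strategies are used in practice.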
18Part IV: Video Coding Standards
[Figure: the applications mobile videophone, videophone over PSTN, ISDN videophone, Video CD, digital TV and HDTV, placed against the standards H.261, H.263, MPEG-1, MPEG-2 and MPEG-4.]
19Standards
- H.26x
- Standards for real time communication like video telephony and video conferencing.
- Standardized by ITU.
- MPEG
- Standards for stored video data like movies on CDs, DVDs, etc.
- Standardized by ISO.
20H.261
- Standard for ISDN picture phones in 1990.
- Motion compensation
- One motion vector per macroblock.
- One macroblock = four 8×8 luminance blocks + two chrominance blocks (one U and one V).
- Motion vectors max 15 pixels long in each direction.
- Format
- CIF (352×288) or QCIF (176×144)
- 7.5-30 frames/s.
- Bitrate: Multiple of 64 kbit/s (ISDN) including audio.
- Quality: Acceptable for small motion at 128 kbit/s.
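A small arithmetic sketch of what these formats imply. Four 8×8 luminance blocks per macroblock give a 16×16 luminance area, and one U plus one V block per macroblock means half a chrominance sample per pixel in total. The 8 bits per sample is an assumption that the slide does not state.

```python
def macroblocks(width, height, mb_size=16):
    """Number of 16x16 macroblocks in one picture."""
    return (width // mb_size) * (height // mb_size)


def raw_bitrate(width, height, fps, bits_per_sample=8):
    """Uncompressed bit rate in bit/s: Y plus U and V at a quarter of the samples each."""
    samples_per_frame = width * height * 3 // 2
    return samples_per_frame * bits_per_sample * fps


print(macroblocks(352, 288))      # CIF:  396
print(macroblocks(176, 144))      # QCIF: 99
print(raw_bitrate(352, 288, 30))  # 36495360 bit/s for CIF at 30 frames/s

# Compression factor needed to fit QCIF at 7.5 frames/s into 128 kbit/s
print(raw_bitrate(176, 144, 7.5) / 128_000)
```

Under these assumptions even the smallest format at the lowest frame rate needs to be compressed by more than a factor of ten, and that is before room is made for audio.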
21H.263
- Standard for picture telephones over analog subscriber lines in 1995.
- Format
- CIF, QCIF or Sub-QCIF.
- Usually less than 10 frames/s.
- Bitrate: Typically 20-30 kbit/s.
- Quality: With new options as good as H.261 (at half the bitrate).
22MPEG
- Moving Picture Experts Group: a committee under ISO and IEC.
- Original plan
- MPEG-1 for 1.5 Mbit/s (Video CD)
- MPEG-2 for 10 Mbit/s (Digital TV)
- MPEG-3 for 40 Mbit/s (HDTV)
- What happened
- MPEG-1 for 1.5 Mbit/s (Video CD)
- MPEG-2 for 2-60 Mbit/s (TV and HDTV)
- MPEG-4, -7 and -21 for other things.
23MPEG-1
- ISO/IEC standard in 1991.
- Target bitrate around 1.5 Mbit/s (Video CD).
- Properties
- Bi-directionally predictively coded frames (B-frames, see next slide).
- More flexible than H.261.
- Almost JPEG for intra frames.
- Format
- CIF
- No interlace.
- 24-30 frames/s.
24MPEG Frame Types
25MPEG-coding of I-frames
- Intracoded
- 8×8 DCT
- Arbitrary weighting matrix for coefficients
- Predictive coding of DC-coefficients
- Uniform quantization
- Zig-zag, run-level, entropy coding
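A simplified sketch of the steps after the DCT: weighted uniform quantization, zig-zag scanning and run-level pairing. The exact quantizer formula, the default weighting matrix and the code tables of MPEG-1 are not reproduced here; `weights` and `scale` are hypothetical parameters and the entropy coder is omitted.

```python
def zigzag_order(n=8):
    """Coordinates (row, col) of an n x n block in zig-zag scan order."""
    coords = [(r, c) for r in range(n) for c in range(n)]
    # Walk the anti-diagonals; the direction alternates between odd and even ones.
    return sorted(coords, key=lambda rc: (rc[0] + rc[1],
                                          rc[0] if (rc[0] + rc[1]) % 2 else rc[1]))


def quantize(coeffs, weights, scale):
    """Uniform quantization with an individual step weights[r][c] * scale."""
    n = len(coeffs)
    return [[round(coeffs[r][c] / (weights[r][c] * scale)) for c in range(n)]
            for r in range(n)]


def run_level(block):
    """Return the DC value and the (run of zeros, nonzero level) pairs of the AC part."""
    scanned = [block[r][c] for r, c in zigzag_order(len(block))]
    pairs, run = [], 0
    for level in scanned[1:]:   # the DC coefficient is coded predictively, separately
        if level == 0:
            run += 1
        else:
            pairs.append((run, level))
            run = 0
    return scanned[0], pairs    # trailing zeros would be signalled by an end-of-block code


block = [[0] * 8 for _ in range(8)]
block[0][0], block[0][1], block[1][0], block[1][1] = 12, 5, -3, 2
print(run_level(block))  # (12, [(0, 5), (0, -3), (1, 2)])
```

The zig-zag scan orders the coefficients roughly from low to high frequency, so that the many zeros produced by quantization collect into long runs at the end.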
26MPEG-coding of P-frames
- Motion compensated prediction from I- or P-frame.
- Half-pixel accuracy of motion vectors, bilinear interpolation.
- Predictive coding of motion vectors.
- Prediction error coded as I-frame.
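Half-pixel accuracy means that the reference frame has to be read at positions between the pixels. A minimal sketch of bilinear interpolation for that case, assuming coordinates given in half-pixel units and a plain floating-point average (real coders use integer arithmetic with a defined rounding rule). The function name is hypothetical.

```python
def sample_half_pel(ref, x2, y2):
    """Read the reference frame at (x2 / 2, y2 / 2), non-negative half-pixel units."""
    x0, y0 = x2 // 2, y2 // 2
    x1 = x0 + (x2 % 2)  # same column when the horizontal position is an integer
    y1 = y0 + (y2 % 2)  # same row when the vertical position is an integer
    return (ref[y0][x0] + ref[y0][x1] + ref[y1][x0] + ref[y1][x1]) / 4


ref = [[10, 20],
       [30, 40]]
print(sample_half_pel(ref, 0, 0))  # 10.0, an integer position
print(sample_half_pel(ref, 1, 0))  # 15.0, halfway between 10 and 20
print(sample_half_pel(ref, 1, 1))  # 25.0, the centre of all four pixels
```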
27MPEG-coding of B-frames
- Motion compensated prediction from two consecutive I- or P-frames.
- Forward prediction only (1 vector/macroblock).
- Backward prediction only (1 vector/macroblock).
- Average of fwd and bwd (2 vectors/macroblock).
- Otherwise as P-frames.
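The three prediction modes can be sketched as a per-macroblock choice. This assumes that `fwd` and `bwd` are already motion compensated blocks from the earlier and the later reference frame, and that the encoder simply picks the mode with the lowest sum of absolute differences; how an encoder decides is not fixed by the slides.

```python
def sad(a, b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(p - q) for row_a, row_b in zip(a, b) for p, q in zip(row_a, row_b))


def choose_b_prediction(block, fwd, bwd):
    """Pick forward, backward or averaged prediction for one macroblock."""
    avg = [[(f + b) / 2 for f, b in zip(row_f, row_b)]
           for row_f, row_b in zip(fwd, bwd)]
    candidates = {"forward": fwd, "backward": bwd, "average": avg}
    mode = min(candidates, key=lambda m: sad(block, candidates[m]))
    return mode, candidates[mode]


block = [[10, 10], [10, 10]]
fwd = [[8, 8], [8, 8]]
bwd = [[12, 12], [12, 12]]
print(choose_b_prediction(block, fwd, bwd)[0])  # average
```

The averaged mode costs two motion vectors but tends to help when the content changes gradually between the two reference frames, as in this toy example.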
28MPEG-2
- ISO/IEC standard in 1994.
- Properties
- Handles interlace (optimized for TV)
- Even more flexible than MPEG-1
- Format
- 352×288
- 704×576 (25 frames/s) or 720×480 (30 frames/s)
- 1440×1152 or 1920×1080 (HDTV)
- Bitrate
- 2-60 Mbit/s
- 4 Mbit/s: Image quality similar to PAL / NTSC / SECAM.
- 18-20 Mbit/s: HDTV.
29MPEG-2 (cont.)
- Profiles
- Simple profile without B-frames.
- Scalable profiles.
- Experience tells that
- At 1.5-2 Mbit/s, MPEG-2 is not better than MPEG-1.
- With manual interaction at the coding, good quality can be achieved at 3-4 Mbit/s.
- Problems with implementing the full standard have caused compatibility problems.
- Buffering and rate control are hard problems.
30MPEG-4
- ISO/IEC standard in 1998, version 2 in 1999
- Instead of frames as coding units, MPEG-4 uses audio-visual objects
- Focus is not primarily on compression, but on content-based functionality
- Contains definitions of
- Media object types (video, audio, text, graphics, ...)
- Parameters for describing the objects
- Bitstream syntax for the (compressed) parameters
- Scene description, file format, streaming, synchronization, ...
- Allows mixing of media objects.
31Parts of the MPEG-4 standard
- Part 1, Systems, contains
- The bitstream syntax and the binary language for scene description
- Computer graphics object descriptions
- Multiplexing, transport, ...
- Part 2, Visual, contains
- Video coding
- Still image coding
- Texture coding, ...
- Part 3, Audio, contains a toolbox of audio coders for different applications
- ...
32Structure of an MPEG-4 Decoder
[Figure: decoder block diagram with the blocks Bitstream, MUX, three A/V object decoders, Compositor and Audio/Video scene.]
33MPEG-4 (Natural) Video
- Instead of frames: Video Object Planes
- Coded with Shape Adaptive DCT
34MPEG-4 Video Coding
35Synthetic/Natural Hybrid Coding
- Mix traditional video with 2D/3D graphics
- Compose virtual environments
- Easy to add text, graphs, images, etc.
- High compression
- Receive objects from separate sources
- Use predefined or locally defined objects
- Scalability
- Progressive decoding
- Better terminal gives better quality.
36Synthetic Objects
- 2D/3D graphics
- Lines, polygons
- Still images
- Image/video mapping on polygon meshes
- VRML scenes and objects
- Animated people
- More on animation and virtual characters in Lecture 12!
- Synthetic audio
- More on natural and synthetic audio in Lecture 11!
37All mixed in the decoder!
38Virtual Environments
- Downloaded virtual environment
- Different environments for different users
- Simple change between environments
- Synthetic environments are cheaper than real ones
39Tools for Synthetic Objects
- Wavelet-based still image compression
- Scalable quality and resolution
- Progressive decoding
- Can be mapped on 2D or 3D meshes
- Compression of 2D and 3D meshes
- Mesh geometry and animation
- Transmit vertex coordinates and let the receiving terminal calculate the polygons
- A moving or still image can be mapped on the mesh (texture mapping).
40More Tools for Synthetic Objects
- Face and Body Animation
- Text-to-speech (TTS) interface
- View-dependent scalable texture
- Information about the user's view position in a 3D scene is transmitted on a back-channel
- Only the necessary texture information is transmitted to the user
41View-dependent Scalable Texture
42Other formats
- Microsoft, RealVideo, QuickTime, ...
- All are variations of the hybrid coder used in
MPEG-coders, with some extra features.
43New Stuff
- ITU and ISO in cooperation
- H.264 / MPEG-4 part 10
- Finished in 2003.
44H.264 / MPEG-4 part 10
- 4×4 integer transform (approximating DCT).
- Prediction of blocks of sizes up to 16×16.
- Motion vectors for blocks of sizes 4×4 up to 16×16.
- Up to 5 reference images for prediction.
- Non-uniform quantization.
- Arithmetic coding of run-level pairs.
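A sketch of a 4×4 integer transform that approximates the DCT. The matrix `CF` below is the commonly cited forward core transform of H.264; it is not given on the slide, and the scaling that H.264 folds into the quantizer is left out here.

```python
CF = [[1, 1, 1, 1],
      [2, 1, -1, -2],
      [1, -1, -1, 1],
      [1, -2, 2, -1]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def forward_4x4(block):
    """Core transform W = CF * X * CF^T, using integer arithmetic only."""
    return matmul(matmul(CF, block), transpose(CF))


# The rows of CF are orthogonal but not of unit length:
print(matmul(CF, transpose(CF)))  # diagonal matrix with 4, 10, 4, 10

# A flat block ends up entirely in the DC coefficient:
print(forward_4x4([[3] * 4 for _ in range(4)]))
```

Because every entry is a small integer, the transform needs only additions and shifts, and encoder and decoder get bit-identical results without the rounding drift of a floating-point DCT.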
45What about the sound?
- MPEG-1
- Audio layer I, II and III (mp3).
- MPEG-2
- Four channels, same codec as in MPEG-1.
- AAC (Advanced Audio Coding) added later.
- MPEG-4
- AAC
- Two speech coders
- Structured audio
- And more...
More on audio coding in Lecture 11.
46Conclusion
- Colour coding
- Change basis from RGB to YUV
- Colour components are compressed harder than the luminance
- Moving image coding
- Hybrid coding: Motion compensated predictive coding and transform coding of the prediction error
- I-, P-, and B-frames
- Object-based coding (MPEG-4): mixing synthetic and natural audio and video
47Conclusion (cont)
- Standards
- MPEG-1: Video CD
- MPEG-2: Digital TV
- MPEG-4: Multimedia
- H.261: ISDN videophone
- H.263: PSTN videophone
- H.264 / MPEG-4 part 10: Universal video
48That was the last slide!