Video coding - PowerPoint PPT Presentation

Slides: 44
Provided by: Gon76
Tags: coding | video

Transcript and Presenter's Notes

1
Video coding
2
Video coding
  • Types of redundancies
  • - Spatial: correlation between neighboring pixel
    values
  • - Spectral: correlation between different color
    planes or spectral bands
  • - Temporal: correlation between different frames
    in a video sequence
  • In video coding, temporal correlation is also
    exploited, typically using motion compensation
    (predictive coding based on motion estimation)
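As a rough illustration of motion estimation, the sketch below runs an exhaustive integer-pel block-matching search with a sum-of-absolute-differences (SAD) cost. The function names, the 4x4 block size, the search range of 2 and the toy frames are all assumptions made for this example; none of them come from the slides or from a particular standard.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (bx, by) and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def best_motion_vector(cur, ref, bx, by, n=4, search=2):
    """Exhaustive integer-pel search over displacements in [-search, search]."""
    h, w = len(ref), len(ref[0])
    best = (0, 0)
    best_cost = sad(cur, ref, bx, by, 0, 0, n)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates that fall outside the reference frame.
            inside_x = 0 <= bx + dx and bx + dx + n <= w
            inside_y = 0 <= by + dy and by + dy + n <= h
            if not (inside_x and inside_y):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if cost < best_cost:
                best, best_cost = (dx, dy), cost
    return best, best_cost


# Toy frames (hypothetical): the current frame is the reference frame
# shifted right by one pixel, so the best match lies one pixel to the left.
ref = [[(3 * x + 7 * y) % 32 for x in range(8)] for y in range(8)]
cur = [[ref[y][max(x - 1, 0)] for x in range(8)] for y in range(8)]
mv, cost = best_motion_vector(cur, ref, bx=2, by=2)
print(mv, cost)
```

The encoder would then transmit the motion vector plus the (here zero) prediction error instead of the block itself.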
3
Video standards review
4
H.261
  • For video-conferencing/video phone
  • - Low delay (real-time, interactive)
  • - Slow motion in general
  • For transmission over ISDN
  • - Fixed bandwidth: p x 64 Kbps, p = 1, 2, ..., 30
5
H.261
  • Video format
  • - CIF (352x288, above 128 Kbps)
  • - QCIF (176x144, 64-128 Kbps)
  • - 4:2:0 color format, progressive scan
  • Published in 1990
  • Each macroblock can be coded in intra- or
    inter-mode
  • Periodic insertion of intra-mode to eliminate
    error propagation due to network impairments
6
DCT coefficient quantization
  • DC coefficient in intra-mode: uniform
  • Others: uniform with deadzone (to avoid coding
    too many small coefficients, which are typically
    due to noise)
  • MVs are coded differentially (DMV)
  • DCT coefficients are converted into run-length
    representations and then coded using VLC
    (Huffman coding for each pair of symbols)
  • - Symbol: (zero run-length, non-zero value range)
  • Other information is also coded using VLC
    (Huffman coding)
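The sketch below illustrates the two ideas on this slide: a uniform quantizer with and without a dead zone, and the conversion of quantized coefficients into (zero run-length, level) symbols. It is a simplified illustration, not the exact H.261 quantization rule; the step size of 8, the function names and the sample coefficients are assumptions chosen for the example.

```python
def quantize_uniform(coeff, step):
    """Plain uniform quantizer: index of the nearest multiple of `step`."""
    sign = -1 if coeff < 0 else 1
    return sign * int((abs(coeff) + step / 2) // step)


def quantize_deadzone(coeff, step):
    """Uniform quantizer with a dead zone: magnitudes are truncated toward
    zero, so every coefficient smaller than `step` becomes 0."""
    sign = -1 if coeff < 0 else 1
    return sign * int(abs(coeff) // step)


def run_length_pairs(levels):
    """Turn a scanned list of quantized levels into (zero_run, level) pairs.
    Trailing zeros are dropped; a real coder signals them with an
    end-of-block code instead."""
    pairs, run = [], 0
    for level in levels:
        if level == 0:
            run += 1
        else:
            pairs.append((run, level))
            run = 0
    return pairs


# Hypothetical scanned coefficients of one block (values made up).
scanned = [45, -7, 3, 0, 12, -2, 0, 0, 9, 1]
levels = [quantize_deadzone(c, 8) for c in scanned]
pairs = run_length_pairs(levels)
print(levels)  # small, noise-like coefficients collapse to zero
print(pairs)   # each pair would then be mapped to a VLC codeword
```

Note how -7 maps to 0 with the dead zone but to -1 with the plain uniform quantizer, which is the reason the dead zone produces longer zero runs.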
7
MPEG-1
  • Finalized in 1991
  • Audio/video on CD-ROM (1.5 Mbps, SIF 352x240,
    30 fps)
  • - Maximum 1.856 Mbps, 768x576 pels
  • - Progressive frames only
  • Prompted explosion of digital video
    applications: MPEG-1 video CD and downloadable
    video over the Internet
  • Software-only decoding, made possible by the
    introduction of Pentium chips, was key to its
    success in the commercial market
  • MPEG-1 Audio
  • - Offers 3 coding options (3 layers); higher
    layers have higher coding efficiency with more
    computations
  • - MP3 = MPEG-1 layer 3 audio
8
MPEG-1 vs H.261
  • Developed at about the same time
  • Must enable random access (fast forward/rewind)
  • - Using GOP structure with periodic I-picture
    and P-picture
  • Not for interactive applications
  • - Does not have as stringent a delay requirement
  • Fixed rate (1.5 Mbps), good quality (VHS
    equivalent)
  • SIF video format (similar to CIF)
  • - CIF: 352x288, SIF: 352x240
  • Using more advanced motion compensation
  • - Half-pel accuracy motion estimation, range up
    to +/-64
  • Using bi-directional temporal prediction
  • - Important for handling uncovered regions
9
MPEG-1 GOP
Encoding order: 1 4 2 3 8 5 6 7
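The encoding order follows from the fact that a B-picture needs both its previous and its next anchor picture (I or P) before it can be coded. The sketch below reorders display-order frames accordingly. The picture-type pattern "IBBPBBBP" is an assumption inferred from the encoding order on the slide (the original figure is not available), and the function name is made up for this example.

```python
def coding_order(frame_types):
    """Return 1-based display indices in the order they would be coded.

    Each B-picture is held back until the next anchor picture (I or P)
    has been coded, because it is predicted from both neighbors.
    """
    order, pending_b = [], []
    for index, ftype in enumerate(frame_types, start=1):
        if ftype == "B":
            pending_b.append(index)
        else:
            order.append(index)      # code the anchor first
            order.extend(pending_b)  # then the B-pictures that precede it
            pending_b = []
    # Trailing B-pictures would wait for the next GOP's anchor.
    order.extend(pending_b)
    return order


print(coding_order("IBBPBBBP"))
```

The decoder applies the inverse reordering before display, which is one source of the extra delay that makes this structure unsuitable for interactive use.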
10
MPEG-1 coder
11
H.263
  • Targeted for visual telephone over PSTN or
    Internet
  • - Enable video phone over regular phone lines
    (28.8 Kbps) or wireless modem
  • Developed later than H.261, can accommodate
    computationally more intensive options
  • - Initial version (H.263 baseline): 1995
  • - H.263+: 1997
  • - H.263++: 2000
  • Result: significantly better quality at lower
    rates
  • - Better video at 18-24 Kbps than H.261 at
    64 Kbps
12
H.263
13
(Some of the) H.263 improvements over H.261
  • Better motion estimation
  • - half-pel accuracy motion estimation with
    bilinear interpolation filter
  • - larger motion search range [-31.5, 31], and
    unrestricted MV at boundary blocks
  • - more efficient predictive coding for MVs
    (median prediction using three neighbors)
  • - overlapping block motion compensation (option)
  • - variable block size: 16x16 -> 8x8, 4 MVs per
    MB (option)
  • - use bidirectional temporal prediction (PB
    picture) (option)
  • 3-D VLC for DCT coefficients (runlength, value,
    EOB)
  • Syntax-based arithmetic coding (option, at 50%
    more computations)
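Median prediction of motion vectors can be sketched as follows: each component of the predictor is the median of the same component in three neighboring MVs, and only the difference from the predictor is coded. The choice of left, above and above-right neighbors, the function names and the sample vectors are assumptions for this illustration; border handling is omitted.

```python
def median3(a, b, c):
    """Median of three numbers."""
    return sorted((a, b, c))[1]


def predict_mv(left, above, above_right):
    """Component-wise median of three neighboring motion vectors."""
    return (median3(left[0], above[0], above_right[0]),
            median3(left[1], above[1], above_right[1]))


def differential_mv(mv, left, above, above_right):
    """Difference between the actual MV and its median predictor;
    this difference is what gets entropy coded."""
    px, py = predict_mv(left, above, above_right)
    return (mv[0] - px, mv[1] - py)


# Hypothetical half-pel motion vectors (x, y) of three neighbors.
left, above, above_right = (2.0, -1.0), (3.5, 0.0), (2.5, 4.0)
pred = predict_mv(left, above, above_right)
dmv = differential_mv((3.0, 0.5), left, above, above_right)
print(pred, dmv)
```

Compared with predicting from a single neighbor, the median is less sensitive to one outlier vector, so the differences tend to be smaller and cheaper to code.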
14
H.263 and beyond
  • Aimed particularly at video coding for low bit
    rates (typically 20-30 Kbps and above).
  • The coding algorithm is similar to that used by
    H.261, with some changes to improve performance
    and error recovery.
  • Main differences
  • - Half-pixel precision is used for motion
    compensation
  • - Four negotiable options
  • - Unrestricted motion vectors
  • - Syntax-based arithmetic coding
  • - Advanced prediction
  • - Forward and backward frame prediction
    (similar to MPEG; called PB-frames)
  • - Five resolutions instead of two
  • Further improvements in H.263 and H.264

15
H.263
Example: Miss America

Description       Compr. ratio   Average PSNR (dB)   Bitrate (Kbit/s)
Original, 30fps   1:1            n/a                 9124
10fps, 20Kbps     139:1          29.79               21.83
10fps, 100Kbps    29:1           36.0                105.47
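The figures in this example can be checked with a little arithmetic. The sketch below assumes a QCIF sequence (176x144), 4:2:0 sampling with 8-bit samples (12 bits per pixel on average), and compression ratios measured against the uncompressed rate at the coded frame rate of 10 fps. These assumptions are not stated on the slide; they are the ones under which the numbers come out consistent.

```python
def raw_bitrate_kbps(width, height, fps, bits_per_pixel=12):
    """Uncompressed bitrate in Kbit/s.

    4:2:0 video with 8-bit samples carries 8 bits of luma per pixel plus
    two chroma planes at quarter resolution, i.e. 12 bits per pixel.
    """
    return width * height * bits_per_pixel * fps / 1000


raw_30 = raw_bitrate_kbps(176, 144, 30)  # original sequence, 30 fps
raw_10 = raw_bitrate_kbps(176, 144, 10)  # same sequence subsampled to 10 fps

ratio_low = raw_10 / 21.83    # coded at about 20 Kbps
ratio_high = raw_10 / 105.47  # coded at about 100 Kbps

print(round(raw_30), round(ratio_low), round(ratio_high))
```

Under these assumptions the raw rate is about 9124 Kbit/s and the ratios are roughly 139:1 and 29:1.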
16
MPEG-2
  • MPEG-2 finalized in 1994
  • Field-interlaced video
  • Levels and profiles
  • - Profiles: define bit stream scalability and
    color space resolutions
  • - Levels: define image resolutions and maximum
    bit-rate per profile
17
MPEG-2
  • A/V broadcast (TV, HDTV, terrestrial, cable,
    satellite, high-speed Inter/Intranet) as well as
    DVD video
  • 4-8 Mbps for TV quality, 10-15 Mbps for better
    quality at SDTV resolutions (BT.601)
  • 18-45 Mbps for HDTV applications
  • - MPEG-2 video high profile at high level is the
    video coding standard used in HDTV
  • Test in 11/91, Committee Draft 11/93
  • Consists of various profiles and levels
  • Backward compatible with MPEG-1
  • MPEG-2 Audio
  • - Supports 5.1 channels
  • - MPEG-2 AAC requires 30% fewer bits than MPEG-1
    layer 3
18
MPEG-2 vs MPEG-1
  • MPEG-1 only handles progressive sequences (SIF).
  • MPEG-2 is targeted primarily at interlaced
    sequences and at higher resolution (BT.601 =
    4CIF).
  • More sophisticated motion estimation methods
    (frame/field prediction mode) are developed to
    improve estimation accuracy for interlaced
    sequences.
  • - Frame motion vectors: one motion vector is
    generated per MB in each direction, which
    corresponds to a 16x16 pels luminance area.
  • - Field motion vectors: two motion vectors
    per MB are generated for each direction, one for
    each of the fields. Each vector corresponds to a
    16x8 pels luminance area.
  • Different DCT modes and scanning methods are
    developed for interlaced sequences.
  • MPEG-2 has various scalability modes.
  • MPEG-2 has various profiles and levels, each
    combination targeted at a different application.
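To see why a field motion vector covers a 16x8 luminance area, the sketch below splits a 16x16 macroblock of an interlaced frame into its two fields (even lines and odd lines). The function name and the toy macroblock, in which every sample simply stores its row number, are assumptions for this illustration.

```python
def split_fields(frame):
    """Separate an interlaced frame (list of rows) into its two fields."""
    top = frame[0::2]     # even-numbered lines
    bottom = frame[1::2]  # odd-numbered lines
    return top, bottom


# Hypothetical 16x16 luminance macroblock: each sample holds its row index.
macroblock = [[row] * 16 for row in range(16)]
top, bottom = split_fields(macroblock)

# Each field contributes 8 lines of 16 samples, i.e. a 16x8 area.
print(len(top[0]), "x", len(top))
print([line[0] for line in bottom])
```

In field prediction mode each of these 16x8 areas gets its own motion vector, which helps when the two fields, captured at different time instants, have moved differently.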

19
MPEG-2 scalability
  • Data partition
  • - All headers, MVs, first few DCT coefficients
    in the base layer
  • - Can be implemented at the bit stream level
  • - Simple
  • SNR scalability
  • - Base layer includes coarsely quantized DCT
    coefficients
  • - Enhancement layer further quantizes the base
    layer quantization error
  • - Relatively simple
  • Spatial scalability
  • - Complex
  • Temporal scalability
  • - Simple
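SNR scalability can be sketched as two quantization passes: the base layer quantizes the DCT coefficients coarsely, and the enhancement layer quantizes the remaining error with a finer step. The step sizes (16 and 4), the function names and the sample coefficients below are assumptions for illustration, not values from the standard.

```python
def quantize(value, step):
    """Uniform quantizer: index of the nearest multiple of `step`."""
    return round(value / step)


def encode_two_layers(coeffs, base_step, enh_step):
    """Return (base_indices, enhancement_indices) for a list of coefficients."""
    base = [quantize(c, base_step) for c in coeffs]
    # Quantization error left by the base layer, requantized more finely.
    residual = [c - b * base_step for c, b in zip(coeffs, base)]
    enh = [quantize(r, enh_step) for r in residual]
    return base, enh


def decode(base, enh, base_step, enh_step, use_enhancement=True):
    """Reconstruct from the base layer alone or from both layers."""
    out = [b * base_step for b in base]
    if use_enhancement:
        out = [o + e * enh_step for o, e in zip(out, enh)]
    return out


coeffs = [100, -37, 19, 5]  # hypothetical DCT coefficients
base, enh = encode_two_layers(coeffs, base_step=16, enh_step=4)
coarse = decode(base, enh, 16, 4, use_enhancement=False)
fine = decode(base, enh, 16, 4)
print(coarse, fine)
```

A receiver with limited bandwidth decodes only the base layer and gets the coarse reconstruction; one that also receives the enhancement layer gets the finer one.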
20
SNR scalability
21
Spatial scalability
22
Temporal scalability
23
MPEG-2 profiles and levels
  • Profiles: tools
  • Levels: parameter range for a given profile
  • Main profile at main level (MP@ML) is the most
    popular, used for digital TV
  • Main profile at high level (MP@HL): HDTV
  • 4:2:2 at main level (4:2:2@ML) is used for studio
    production

24
MPEG-4
  • New features
  • Provides technologies to view, access, and
    manipulate objects rather than pixels
  • Entire scene is decomposed into multiple
    objects
  • Object segmentation is the most difficult
    task!
  • But this does not need to be standardized
  • Each object is specified by its shape, motion,
    and texture (color)
  • - Shape and texture both change over time
    (specified by motion)
  • - Texture encoding is done with DCT (8x8
    pixel blocks) or wavelets
  • MPEG-4 assumes the encoder has a segmentation
    map available, specifies how to code (actually
    decode!) shape, motion and texture

25
MPEG-4
26
Example of Scene Composition
27
Object-Based Coding
28
MPEG-4
  • MPEG-4 block diagram

29
MPEG-4
  • Coding tools
  • - Shape coding: binary or gray scale
  • - Motion compensation: similar to H.263;
    overlapped mode is supported
  • - Texture coding: block-based DCT, and wavelets
    for static texture
  • Types of Video Object Planes (VOPs)
  • - I-VOP: VOP is encoded independently of any
    other VOPs
  • - P-VOP: predicted VOP using a previous VOP and
    motion compensation
  • - B-VOP: bidirectionally interpolated VOP using
    other I-VOPs or P-VOPs
  • - Similar concept to MPEG-2

30
Mesh Animation
  • An object can be described by an initial mesh
    and MVs of the nodes in the following frames
  • MPEG-4 defines coding of mesh geometry, but not
    mesh generation

31
Body and Face Animation
  • MPEG-4 defines a default 3-D body model
    (including its geometry and possible motion)
    through body definition parameters (BDP)
  • The body can be animated using the body
    animation parameters (BAP)
  • Similarly, face definition parameters (FDP) and
    face animation parameters (FAP) are specified
    for a face model and its animation
  • E.g. eye blink (FAP19)

32
Text-to-Speech Synthesis with Face Animation
33
Others
  • Sprite
  • - Code a large background at the beginning of
    the sequence, plus affine mappings, which map
    parts of the background to the displayed scene at
    different time instances
  • - Decoder can vary the mapping to zoom in/out,
    pan left/right
  • Global motion compensation
  • - Using 8-parameter projective mapping
  • - Effective for sequences with large global
    motion
  • Quarter-pixel motion estimation
  • DivX
  • - based on MPEG-4
  • - can reduce an MPEG-2 video (the same format
    used for DVD and pay-per-view) to 10 percent of
    its original size (so that a DVD can be recorded
    on a CD)
  • - audio is normally coded using MP3
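An 8-parameter projective mapping, as used for global motion compensation, can be written as x' = (a0 x + a1 y + a2) / (a6 x + a7 y + 1) and y' = (a3 x + a4 y + a5) / (a6 x + a7 y + 1). The sketch below applies this textbook form to a single point. The parameter ordering a0..a7 and the function name are assumptions for the example; the standard itself signals the warp differently. The affine mapping used for sprites is the special case a6 = a7 = 0.

```python
def projective_map(x, y, p):
    """Map (x, y) with the 8-parameter projective model p = (a0, ..., a7)."""
    a0, a1, a2, a3, a4, a5, a6, a7 = p
    denom = a6 * x + a7 * y + 1.0
    return ((a0 * x + a1 * y + a2) / denom,
            (a3 * x + a4 * y + a5) / denom)


# Pure translation (an affine special case): shift by (+5, -3).
translation = (1, 0, 5, 0, 1, -3, 0, 0)
print(projective_map(10, 20, translation))

# A perspective term (a6 != 0) makes the scaling depend on position.
perspective = (1, 0, 0, 0, 1, 0, 0.25, 0)
print(projective_map(4, 6, perspective))
```

With one such parameter set per frame, camera pans and zooms can be described for the whole picture instead of sending a separate motion vector for every block.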

34
MPEG-7
  • MPEG-1/2/4 make content available, whereas MPEG-7
    allows you to find the content you need!
  • A content description standard
  • - Video/images: shape, size, texture, color,
    movements and positions, etc.
  • - Audio: key, mood, tempo, changes, position
    in sound space, etc.
  • Applications
  • - Digital libraries
  • - Multimedia directory services
  • - Broadcast media selection
  • - Editing, etc.
  • Example
  • - Draw an object and be able to find objects
    with similar characteristics.
  • - Play a note of music and be able to find
    similar types of music.

35
MPEG-21
  • Aims at standardizing interfaces and tools to
    facilitate the exchange of multimedia resources
    across heterogeneous devices, networks and users.
  • More specifically, it standardizes requisite
    elements for packaging, identifying, adapting and
    processing these resources as well as managing
    their usage rights.
  • This framework will benefit the entire
    consumption chain from creators and rights
    holders to service providers and consumers.
  • Basic unit of transaction in the MPEG-21
    Multimedia Framework: the Digital Item, which
    packages resources along with identifiers,
    metadata, licenses and methods that enable
    interaction with the Digital Item.
  • Another key concept: the User, i.e. any entity
    that interacts in the MPEG-21 environment or
    makes use of Digital Items.

36
MPEG-21
  • MPEG-21 can be seen as providing a framework in
    which one User interacts with another User and
    the object of that interaction is a Digital Item.
  • Some example interactions include content
    creation, management, protection, archiving,
    adaptation, delivery and consumption.

37
MPEG-A
  • MPEG's Multimedia Application Formats (MAF)
    provide the framework for integration of elements
    from several MPEG standards into a single
    specification that is suitable for specific, but
    widely usable applications.
  • Typically, MAFs specify how to combine metadata
    with timed media information for a presentation
    in a well-defined format that facilitates
    interchange, management, editing, and
    presentation of the media. The presentation may
    be local to the system or may be via a network
    or other stream delivery mechanism. 

38
MPEG-A
  • MAF specifications shall integrate elements from
    different MPEG standards into a single
    specification that is useful for specific but
    very widely used applications. Examples are
    delivering music, pictures or home videos. MAF
    specifications may use elements from MPEG-1,
    MPEG-2, MPEG-4, MPEG-7 and MPEG-21. Typically,
    MAF specifications include
  • - The ISO File Format family for storage
  • - A simple MPEG-7 tool set for Metadata
  • - One or more coding Profiles for
    representing the Media
  • - Tools for encoding metadata in either
    binary or XML form

39
MPEG-A
  • MAFs may specify use of
  • - MPEG-21 Digital Item Declaration Language
    for representing the Structure of the Media and
    the Metadata
  • - Other MPEG-21 tools
  • - non-MPEG coding tools (e.g., JPEG) for
    representation of "non-MPEG" media
  • - Elements from non-MPEG standards that are
    required to achieve full interoperability

40
MPEG-A: 2 examples
  • 3on4
  • - MP3 is one of the most widely used MPEG
    standards. Currently, ID3 simply appends
    simple metadata tags such as Artist, Album, Song
    Title, etc.
  • - MPEG-4 specifies what MPEG expects to be
    another very successful specification, the MPEG-4
    File Format, while MPEG-7 specifies not only
    signal-derived meta-data, but also archival
    meta-data such as Artist, Album and Song Title.
  • - As such, MPEG-4 and MPEG-7 represent an
    ideal environment to support the current MP3
    music library user experience, and, moreover, to
    extend that experience in new directions.

41
MPEG-A: 2 examples
  • Jon4
  • - Digital cameras -> library with thousands
    of digital photos
  • - Search for photographs of interest can be
    difficult ->
  • - Need for provision of suitable metadata: photo
    content (e.g. the subject being photographed),
    author, shoot location, imaging parameters, etc.,
    stored in a standardized format
  • - The EXIF standard (commonly adopted by
    camera manufacturers) does not support advanced
    metadata.
  • MPEG-7 defines rich metadata descriptions for
    still images, audio and also provides associated
    systems tools (file formats, etc)
  • As such, MPEG-7 and MPEG-4 file format represent
    an ideal environment to support the current
    Digital Photos Library user experience

42
Summary (1/2)
  • H.261
  • First video coding standard, targeted for video
    conf. over ISDN
  • Uses block-based hybrid coding framework with
    integer-pel MC
  • H.263, H.264
  • Improved quality at lower bit rate, to enable
    video conferencing/telephony below 54 Kbps
    (modems or internet access, desktop
    conferencing); half-pixel MC
  • MPEG-1 video
  • Video on CD and video on the Internet (good
    quality at 1.5 Mbps)
  • Half-pixel MC and bidirectional MC
  • MPEG-2 video
  • TV/HDTV/DVD (4-15 Mbps)
  • Extended from MPEG-1, considering interlaced
    video

43
Summary (2/2)
  • MPEG-4
  • To enable object manipulation and scene
    composition at the decoder -> interactive
    TV/virtual reality
  • Object-based video coding shape coding
  • Coding of synthetic video and audio animation
  • MPEG-7
  • To enable search and browsing of multimedia
    documents
  • Defines the syntax for describing the
    structural and conceptual content
  • MPEG-21: beyond MPEG-7, considering
    intellectual property protection, etc.
  • MPEG-A: integration of elements from different
    MPEG standards into a single specification that
    is useful for specific but very widely used
    applications