Provided by: vcCsNt
Learn more at: http://vc.cs.nthu.edu.tw

1
Overview of the Scalable Video Coding Extension
of the H.264/AVC Standard
  • Heiko Schwarz, Detlev Marpe, and Thomas Wiegand
  • CSVT, Sept. 2007

2
Outline
  • Introduction
  • Problems
  • Definition
  • Functionality
  • Goal
  • Competition
  • Applications
  • Targets
  • History of SVC
  • Structure of SVC
  • Temporal Scalability
  • Spatial Scalability
  • Quality Scalability
  • Combined Scalability
  • Profiles of SVC
  • Conclusions

3
Introduction - problem
  • Non-Scalable Video Streaming
  • Multiple video streams are needed for
    heterogeneous clients

[Figure: separate streams at 512 kb/s, 1 Mb/s, 4 Mb/s, 6 Mb/s, and 8 Mb/s, one per client type]
4
Introduction - definition
  • Scalable video stream
  • Scalability
  • Removal of parts of the video bit-stream to adapt
    to the various needs of end users and to varying
    terminal capabilities or network conditions

[Figure: a scalable stream split into sub-streams 1 to n; subsets k1, k2, ..., ki are reconstructed at qualities ranging from low to high]
5
Introduction - functionality
  • Functionality of SVC
  • Graceful degradation when the 'right' parts of the
    bit-stream are lost
  • Bit-rate adaptation to match the channel
    throughput
  • Format adaptation for backwards compatible
    extension
  • Power adaptation for trade-off between runtime
    and quality

6
Introduction - goal
  • Goal of SVC
  • Scalability mode
  • Fidelity reduction (SNR scalability)
  • Picture size reduction (spatial scalability)
  • Frame rate reduction (temporal scalability)
  • Sharpness reduction (frequency scalability)
  • Selection of content (ROI or object-based
    scalability)

[Figure: sub-streams k1, k2, ..., ki ordered by quality, with an H.264/AVC bit-stream among them]
7
Introduction - competition
  • SVC is an old research topic (> 20 years) and has
    been included in H.262/MPEG-2, H.263, and MPEG-4
    Visual.
  • Rarely used because
  • The characteristics of traditional video
    transmission systems
  • Significant loss of coding efficiency and large
    increase in decoder complexity
  • Competition
  • Simulcast
  • Transcoding

8
Introduction - applications
  • Applications
  • Heterogeneous clients
  • Unequal protection
  • Surveillance
  • Problems
  • Increased decoder complexity
  • Decreased coding efficiency
  • Temporal scalability is more often supported
    than spatial and quality scalability.

9
Introduction - targets
  • Targets
  • Little decrease in coding efficiency
  • Little increase in decoding complexity
  • Support of temporal, spatial, and quality
    scalability
  • A backward compatible base layer
  • Simple bit-stream adaptations after encoding

10
History of SVC
  • October 2003: MPEG issues a call for proposals for
    Scalable Video Coding
  • 12 wavelet-based proposals
  • 2 proposals extending H.264/AVC
  • October 2004: MSRA vs. HHI proposal
    (wavelet-based vs. H.264/AVC extension)
  • October 2004: HHI proposal adopted as starting
    point (due to reduced encoder and decoder
    complexity and improvements in coding efficiency)
  • January 2005: MPEG and VCEG agree to jointly
    finalize the SVC project as an Amendment of
    H.264/AVC
  • Spring 2007: Finalization

11
Structure of SVC
[Figure: SVC encoder block diagram with blocks for spatial decimation, temporal scalable coding, prediction, base layer coding, SNR scalable coding, and multiplex]
12
Outline
  • Introduction
  • History of SVC
  • Structure of SVC
  • Temporal Scalability
  • Hierarchical prediction structure
  • Spatial Scalability
  • Quality Scalability
  • Combined Scalability
  • Profiles of SVC
  • Conclusions

13
Temporal Scalability
  • Hierarchical prediction structures

[Figure: three prediction structures: hierarchical B pictures (pictures 0 to 16, with the GOP marked), non-dyadic hierarchical prediction (pictures 0 to 18), and hierarchical prediction with zero delay (pictures 0 to 16)]
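The dyadic hierarchical B structure can be illustrated with a small sketch. It assumes a power-of-two GOP size and treats every picture whose display index is a multiple of the GOP size as temporal level 0; the function names are hypothetical and not taken from the slides or from JSVM.

```python
def temporal_level(poc: int, gop_size: int) -> int:
    """Temporal level of a picture in a dyadic hierarchical GOP.

    poc is the display-order index; gop_size must be a power of two.
    Pictures at multiples of gop_size get level 0.
    """
    if gop_size < 1 or gop_size & (gop_size - 1):
        raise ValueError("gop_size must be a power of two")
    max_level = gop_size.bit_length() - 1  # log2(gop_size)
    pos = poc % gop_size
    if pos == 0:
        return 0
    # More trailing zero bits in the position means a lower temporal level.
    trailing_zeros = (pos & -pos).bit_length() - 1
    return max_level - trailing_zeros


def extract_sub_sequence(num_pictures: int, gop_size: int, max_level: int) -> list[int]:
    """Display indices that remain after dropping all levels above max_level."""
    return [p for p in range(num_pictures)
            if temporal_level(p, gop_size) <= max_level]


if __name__ == "__main__":
    print([temporal_level(p, 8) for p in range(9)])
    print(extract_sub_sequence(17, 8, 1))
```

Dropping the highest level halves the frame rate, which is how temporal scalability falls out of the prediction structure.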
14
Temporal Scalability
  • Combination with multiple reference pictures
  • Arbitrary modification of the prediction
    structure
  • Issue of quantization
  • Lower layers with higher fidelity → smaller QPs
    are used in lower layers
  • Propagation of quantization error → smaller QPs
    are used in higher layers

15
Temporal Scalability
  • Quantization error flows from the top to the bottom
    of the pyramid, which explains why the quality
    must decrease toward the lower levels
  • Quantization step size should be increased in
    the next lower hierarchy level by a factor of
    $(1.5)^{1/2}$
[Figure: prediction pyramid with weights 0.25, 0.5, and 1; the weighted sums for successive levels equal $1.5^0$, $1.5^1$, and $1.5^2$]
This slide is copied from http://iphome.hhi.de/wiegand/assets/pdfs/H264AVC_SVC.pdf
16
Temporal Scalability
Video coding experiment with H.264/MPEG4-AVC: Foreman, CIF 30 Hz @ 1320 kbps
Performance as a function of N
Cascaded QP assignment: QP(P) = QP(B0) - 3 = QP(B1) - 4 = QP(B2) - 5
[Figure: coding structures for N = 1 (I and P pictures only), N = 2 (adds B0 pictures), N = 4 (adds B0 and B1 pictures), and N = 8 (adds B0, B1, and B2 pictures); the structures with B pictures are labelled "Temporal scalability"]
This slide is copied from JVT-W132-Talk
17
Temporal Scalability
  • When different prediction references are
    available at encoder and decoder, an additional
    penalty occurs which is relatively small in case
    of hierarchical B pictures with optimum
    quantization
  • Can only be avoided by using closed-loop encoding
    with same references

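[Figure: prediction pyramid with weights 0.25, 0.5, and 1; the factors 1, $1/(1.5)^{1/2}$, and $1/1.5$ are marked on successive levels]
$1^2 + 0.5^2/1.5 + (0.5^2 + 0.25^2)/2.25 + 0.25^2/2.25 = 1 + 0.167 + 0.1388 + 0.0277$
This slide is copied from http://iphome.hhi.de/wiegand/assets/pdfs/H264AVC_SVC.pdf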
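The sum shown on this slide can be checked numerically. The sketch below only reproduces the arithmetic; reading the divisors 1.5 and 2.25 as squared step-size ratios of the hierarchy levels is an assumption, not something the transcript states.

```python
# Terms of the sum as they appear on the slide.
terms = [
    1.0 ** 2,
    0.5 ** 2 / 1.5,
    (0.5 ** 2 + 0.25 ** 2) / 2.25,
    0.25 ** 2 / 2.25,
]
total = sum(terms)

if __name__ == "__main__":
    print([f"{t:.4f}" for t in terms], f"{total:.4f}")
```

The individual terms agree with the slide's values up to rounding or truncation in the last digit.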
18
Temporal Scalability
  • Coding efficiency of hierarchical prediction
  • JSVM11, High profile with CABAC
  • Only one reference frame

[Figure: coding efficiency results for CIF sequences]
19
Temporal Scalability
  • Compared with IPPP (With and without delay
    constraint)
  • Providing temporal scalability usually doesn't
    have any negative impact on coding efficiency

20
Outline
  • Introduction
  • History of SVC
  • Structure of SVC
  • Temporal Scalability
  • Spatial Scalability
  • Inter layer prediction
  • Inter layer motion prediction
  • Inter layer residual prediction
  • Inter layer intra prediction
  • Quality Scalability
  • Combined Scalability
  • Profiles of SVC
  • Conclusions

21
Spatial Scalability
[Figure: SVC encoder with several spatial layers. Each layer applies hierarchical MCP and intra-prediction followed by base layer coding of texture and motion; lower layers are obtained by spatial decimation and feed inter-layer prediction (intra, motion, residual) to the layer above; the lowest layer is an H.264/AVC compatible coder producing an H.264/AVC compatible base layer bit-stream; all layers are multiplexed into the scalable bit-stream]
22
Spatial Scalability
  • Similar to MPEG-2, H.263, and MPEG-4
  • Arbitrary resolution ratio
  • The same coding order in all spatial layers
  • Combination with temporal scalability
  • Inter-layer prediction

[Figure: spatial layers 0 and 1 combined with temporal levels 0, 1, and 2]
23
Spatial Scalability
  • The prediction signals are formed by
  • MCP inside the enhancement layer (Temporal)
    (small motion and high spatial detail)
  • Up-sampling from the lower layer (Spatial)
  • Average of the above two predictions (Temporal +
    Spatial)
  • Inter-layer prediction
  • Three kinds of inter-layer prediction
  • Inter-layer motion prediction
  • Inter-layer residual prediction
  • Inter-layer intra prediction
  • Base mode MB
  • Only the residual is transmitted, with no
    additional side information

24
Spatial Scalability
  • Inter-layer motion prediction
  • base_mode_flag = 1
  • The reference layer is inter-coded
  • Data are derived from the reference layer
  • MB partitioning
  • Reference indices
  • MVs
  • motion_pred_flag
  • 1: MV predictors are obtained from the reference
    layer
  • 0: MV predictors are obtained by conventional
    spatial prediction
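For the dyadic case, deriving enhancement-layer motion data from the reference layer can be sketched as a scaling by two. The `Partition` type and its field names are hypothetical, and the general (non-dyadic, cropped) case is not covered.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Partition:
    """Hypothetical container for one motion partition."""
    x: int        # top-left corner, luma samples
    y: int
    width: int
    height: int
    mv_x: int     # motion vector components
    mv_y: int
    ref_idx: int  # reference picture index


def upscale_partition_dyadic(p: Partition) -> Partition:
    """Map a reference-layer partition to the enhancement layer (ratio 2).

    Position, size, and motion vector are doubled; the reference index
    is copied unchanged.
    """
    return Partition(
        x=2 * p.x, y=2 * p.y,
        width=2 * p.width, height=2 * p.height,
        mv_x=2 * p.mv_x, mv_y=2 * p.mv_y,
        ref_idx=p.ref_idx,
    )


if __name__ == "__main__":
    base = Partition(x=4, y=8, width=8, height=8, mv_x=-3, mv_y=5, ref_idx=0)
    print(upscale_partition_dyadic(base))
```

With this mapping an 8x8 block in the reference layer covers a 16x16 macroblock in the enhancement layer.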

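[Figure: an 8x8 block with corners (x1,y1) and (x2,y2) in the reference layer maps to a 16x16 block with corners (2x1,2y1) and (2x2,2y2) in the enhancement layer]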
25
Spatial Scalability
  • Inter-layer residual prediction
  • residual_pred_flag = 1
  • Predictor
  • Block-wise up-sampling by a bi-linear filter from
    the corresponding 8×8 sub-MB in the reference
    layer
  • Transform block basis
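Block-wise bi-linear up-sampling can be illustrated with a simplified sketch. The sample alignment (centre-aligned with border clamping) and the floating-point arithmetic are assumptions chosen for readability; they are not the normative SVC filter.

```python
def upsample_bilinear_1d(row: list[float], factor: int = 2) -> list[float]:
    """Linearly interpolate a row of samples to `factor` times its length."""
    n = len(row)
    out = []
    for i in range(n * factor):
        # Centre-aligned source position of output sample i, clamped to the block.
        src = (i + 0.5) / factor - 0.5
        src = min(max(src, 0.0), n - 1.0)
        left = int(src)
        right = min(left + 1, n - 1)
        frac = src - left
        out.append((1.0 - frac) * row[left] + frac * row[right])
    return out


def upsample_bilinear_2d(block: list[list[float]], factor: int = 2) -> list[list[float]]:
    """Separable up-sampling: rows first, then columns.

    Clamping at the block border means no samples outside the block are used.
    """
    rows = [upsample_bilinear_1d(r, factor) for r in block]
    cols = [upsample_bilinear_1d(list(c), factor) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]


if __name__ == "__main__":
    print(upsample_bilinear_1d([0.0, 4.0]))
    print(upsample_bilinear_2d([[0.0, 4.0], [8.0, 12.0]]))
```

Keeping the interpolation inside each block is what "transform block basis" suggests: residual samples of one transform block do not leak into the next.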

26
Spatial Scalability
  • Inter-layer intra prediction
  • base_mode_flag = 1
  • The reference layer is intra-coded
  • Up-sampling from the reference layer
  • Luma: one-dimensional 4-tap FIR filter
  • Chroma: bi-linear filter

27
Spatial Scalability
  • Past spatial scalable video
  • Inter-layer intra prediction requires complete
    decoding of the base layer.
  • Multiple motion compensation and deblocking
    filter operations are needed.
  • Full decoding + inter-layer prediction:
    complexity > simulcast.
  • Single-loop decoding
  • Inter-layer intra prediction is restricted to MBs
    for which the co-located base layer is
    intra-coded

28
Spatial Scalability
  • Single-loop vs. multi-loop decoding

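[Figure: single-loop vs. multi-loop decoding illustrated on I, B, and P pictures containing inter-coded blocks]
This slide is copied from http://iphome.hhi.de/wiegand/assets/pdfs/H264AVC_SVC.pdf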
29
Spatial Scalability
  • Generalized spatial scalability in SVC
  • Arbitrary ratio
  • Only restriction: neither the horizontal nor the
    vertical resolution can decrease from one layer
    to the next.
  • Cropping
  • Containing new regions
  • Higher quality of interesting regions
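The single restriction on layer resolutions is easy to express as a check. This is a minimal sketch with a hypothetical function name; layers are given as (width, height) pairs ordered from the base layer upward, and cropping is ignored.

```python
def valid_layer_resolutions(layers: list[tuple[int, int]]) -> bool:
    """True if neither width nor height decreases from one layer to the next."""
    return all(
        w_next >= w_cur and h_next >= h_cur
        for (w_cur, h_cur), (w_next, h_next) in zip(layers, layers[1:])
    )


if __name__ == "__main__":
    print(valid_layer_resolutions([(176, 144), (352, 288), (704, 576)]))
    print(valid_layer_resolutions([(352, 288), (320, 480)]))
```

Non-dyadic ratios such as 1.5 pass the check, as does a layer that keeps the same size as the one below it.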

30
Spatial Scalability
  • Coding efficiency
  • Multi-loop > single-loop

32
Spatial Scalability
  • Coding efficiency (IPPP)
  • Multi-loop > single-loop

33
Spatial Scalability
  • Encoder control (JSVM)
  • Base layer
  • p0 is optimized for base layer
  • Enhancement layer
  • p1 is optimized for enhancement layer
  • Decisions of p1 depend on p0
  • Efficient base layer coding but inefficient
    enhancement layer coding

34
Spatial Scalability
  • Encoder control (optimization)
  • Base layer
  • Considering enhancement layer coding
  • Eliminating choices of p0 that disadvantage
    enhancement layer coding
  • Enhancement layer
  • No change
  • Weighting factor w
  • w = 0: JSVM encoder control
  • w = 1: Single-loop encoder control (base layer is
    not controlled)
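One way to picture the role of w is a weighted sum of a base-layer and an enhancement-layer Lagrangian cost. The exact cost function is not given in this transcript, so the form below is an assumption chosen to match the two limiting cases (w = 0 considers only the base layer, w = 1 only the enhancement layer); all names and numbers are hypothetical.

```python
def joint_cost(w: float, d0: float, r0: float, lam0: float,
               d1: float, r1: float, lam1: float) -> float:
    """Assumed joint cost for one candidate base-layer decision p0.

    d0, r0: base-layer distortion and rate for p0.
    d1, r1: enhancement-layer distortion and rate given p0.
    The enhancement-layer rate term includes the base-layer rate.
    """
    base = d0 + lam0 * r0
    enhancement = d1 + lam1 * (r0 + r1)
    return (1.0 - w) * base + w * enhancement


def pick_base_choice(candidates: dict[str, dict[str, float]],
                     w: float, lam0: float, lam1: float) -> str:
    """Name of the candidate with the smallest joint cost."""
    return min(
        candidates,
        key=lambda name: joint_cost(w, candidates[name]["d0"], candidates[name]["r0"], lam0,
                                    candidates[name]["d1"], candidates[name]["r1"], lam1),
    )


if __name__ == "__main__":
    options = {
        "A": {"d0": 10.0, "r0": 5.0, "d1": 30.0, "r1": 20.0},  # best for base layer
        "B": {"d0": 12.0, "r0": 6.0, "d1": 20.0, "r1": 10.0},  # best for enhancement
    }
    for w in (0.0, 0.5, 1.0):
        print(w, pick_base_choice(options, w, lam0=1.0, lam1=1.0))
```

Intermediate values of w trade a small loss in base-layer efficiency for a gain in the enhancement layer.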

35
Spatial Scalability
  • Coding efficiency of optimal encoder control
  • Optimized encoder vs. JSVM encoder (QPE QPB 4)

36
Outline
  • Introduction
  • History of SVC
  • Structure of SVC
  • Temporal Scalability
  • Spatial Scalability
  • Quality Scalability
  • CGS
  • MGS
  • Drift control
  • Combined Scalability
  • Profiles of SVC
  • Conclusions

37
Quality Scalability
  • Coarse-grain quality scalability (CGS)
  • A special case of spatial scalability
  • Identical sizes (resolution) for base and
    enhancement layers
  • Smaller quantization step sizes for higher
    enhancement residual layers
  • Designed for only several selected bit-rate
    points
  • Supported bit-rate points = number of layers
  • Switching can only occur at IDR access units

38
Quality Scalability
  • Medium-grain quality scalability (MGS)
  • More enhancement layers are supported
  • Refinement quality layers of residual
  • Key pictures
  • Drift control
  • Switching can occur at any access unit
  • MGS = CGS + key pictures + refinement quality layers

39
Quality Scalability
  • Drift control
  • Drift: the effect caused by unsynchronized MCP at
    the encoder and decoder side
  • Trade-off of MCP in quality SVC
  • Coding efficiency ↔ drift

40
Quality Scalability
  • MPEG-4 quality scalability with FGS
  • Base layer is stored and used for MCP of
    following pictures
  • Drift: drift-free
  • Complexity: low
  • Efficiency: efficient base layer but inefficient
    enhancement layer
  • Refinement data are not used for MCP

[Figure: base layer pictures with refinement data (possibly lost or truncated)]
41
Quality Scalability
  • MPEG-2 quality scalability (without FGS)
  • Only 1 reference picture is stored and used for
    MCP of following pictures
  • Drift: both base layer and enhancement layer
  • Frequent intra updates are necessary
  • Complexity: low
  • Efficiency: efficient enhancement layer but
    inefficient base layer

[Figure: base layer pictures with refinement data (possibly lost or truncated)]
42
Quality Scalability
  • 2-loop prediction
  • Several closed encoder loops run at different
    bit-rate points in a layered structure
  • Drift: enhancement layer
  • Complexity: high
  • Efficiency: efficient base layer and moderately
    efficient enhancement layer

[Figure: base layer pictures with refinement data (possibly lost or truncated)]
43
Quality Scalability
  • SVC concepts
  • Key picture
  • Trade-off between coding efficiency and drift
  • MPEG-4 FGS: all pictures treated as key pictures
  • MPEG-2 quality scalability: all pictures treated
    as non-key pictures

[Figure: base layer pictures with refinement data (possibly lost or truncated)]
44
Quality Scalability
  • Drift control with hierarchical prediction
  • Key pictures
  • Base layer is stored and used for the MCP of
    following pictures
  • Other pictures
  • Enhancement layer is stored and used for the MCP
    of following pictures
  • GOP size adjusts the trade-off between
    enhancement layer coding efficiency and drift
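The key-picture rule can be sketched as a choice of which reconstruction is stored for later motion-compensated prediction. Treating every picture at a multiple of the GOP size as a key picture is an assumption, and the function names are hypothetical.

```python
def is_key_picture(poc: int, gop_size: int) -> bool:
    """Assumed rule: pictures at multiples of the GOP size are key pictures."""
    return poc % gop_size == 0


def stored_reconstruction(poc: int, gop_size: int) -> str:
    """Which reconstruction of this picture is kept as an MCP reference.

    Key pictures keep the base-layer reconstruction, so drift cannot
    propagate past them; other pictures keep the enhancement-layer
    reconstruction for better coding efficiency.
    """
    return "base" if is_key_picture(poc, gop_size) else "enhancement"


def max_drift_span(gop_size: int) -> int:
    """Number of pictures between key pictures over which drift may accumulate."""
    return gop_size - 1


if __name__ == "__main__":
    print([stored_reconstruction(p, 4) for p in range(9)])
    print(max_drift_span(4), max_drift_span(16))
```

A larger GOP leaves more pictures predicted from enhancement-layer references, which raises efficiency but lengthens the span over which drift can build up.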

[Figure: hierarchical GOP with P key pictures and B1, B2 pictures; base layer and refinement (possibly lost or truncated) shown for each picture]
45
Quality Scalability
  • Comparisons of drift control

[Figure: drift-control concepts arranged between low and high efficiency and between drift and drift-free operation]
46
Quality Scalability
  • Comparisons of coding efficiency

$Q_{step} = 2^{(QP-4)/6}$
[Figure: coding efficiency curves for high dQP and low dQP]
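The relation between QP and quantization step size, $Q_{step} = 2^{(QP-4)/6}$, can be evaluated directly. The sketch below uses hypothetical helper names, and dQP is read here as the QP difference between two quality layers, which is an assumption about the figure labels.

```python
def qstep(qp: float) -> float:
    """Quantization step size for a given QP: 2 ** ((QP - 4) / 6)."""
    return 2.0 ** ((qp - 4) / 6.0)


def step_ratio(delta_qp: float) -> float:
    """Ratio between the step sizes of two layers whose QPs differ by delta_qp."""
    return 2.0 ** (delta_qp / 6.0)


if __name__ == "__main__":
    for qp in (4, 10, 28, 34):
        print(qp, qstep(qp))
    print(step_ratio(6), step_ratio(2))
```

A dQP of 6 halves or doubles the step size, so a high dQP gives widely spaced rate points and a low dQP gives closely spaced ones.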
47
Quality Scalability
  • MGS with key pictures using optimized encoder
    control

Only base layer
48
Outline
  • Introduction
  • History of SVC
  • Structure of SVC
  • Temporal Scalability
  • Spatial Scalability
  • Quality Scalability
  • Combined Scalability
  • SVC encoder structure
  • Dependency and Quality refinement layers
  • Bit-stream format
  • Bit-stream switching
  • Profiles of SVC
  • Conclusions

49
Combined Scalability
  • SVC encoder structure

[Figure: SVC encoder structure; each dependency layer applies temporal decomposition, and the representations within a dependency layer use the same motion/prediction information]
50
Combined Scalability
  • Dependency and Quality refinement layers

[Figure: scalable bit-stream organised as dependency layers D = 0, 1, 2, each containing quality refinement layers Q = 0, 1, 2]
51
Combined Scalability
[Figure: example with dependency layers D0 and D1, quality layers Q0 and Q1, and temporal levels T0, T1, and T2]
52
Combined Scalability
  • Bit-stream format

[Figure: NAL unit layout: NAL unit header, NAL unit header extension, NAL unit payload; field widths shown in bits: 1, 1, 1, 1, 1, 3, 2, 3, 3, 6, 2; the fields P, T, D, and Q are marked]
  • P (priority_id) indicates the importance of a
    NAL unit
  • T (temporal_id) indicates the temporal level
  • D (dependency_id) indicates the spatial/CGS
    layer
  • Q (quality_id) indicates the MGS/FGS layer
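Because every NAL unit carries T, D, and Q, a sub-stream can be extracted by filtering on those identifiers without touching the payload. The sketch below is a simplified illustration: the `NalUnit` class is hypothetical, no real header parsing is done, and lower dependency layers are kept in full because the target layer may predict from them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NalUnit:
    """Hypothetical, already-parsed NAL unit."""
    priority_id: int
    temporal_id: int
    dependency_id: int
    quality_id: int
    payload: bytes = b""


def extract_sub_stream(nal_units: list[NalUnit],
                       target_t: int, target_d: int, target_q: int) -> list[NalUnit]:
    """Keep the NAL units needed for the operating point (T, D, Q).

    A unit is kept if its temporal level does not exceed target_t and it
    belongs either to a lower dependency layer or to the target dependency
    layer with quality_id <= target_q.
    """
    return [
        n for n in nal_units
        if n.temporal_id <= target_t
        and (n.dependency_id < target_d
             or (n.dependency_id == target_d and n.quality_id <= target_q))
    ]


if __name__ == "__main__":
    stream = [
        NalUnit(0, 0, 0, 0), NalUnit(1, 0, 0, 1), NalUnit(2, 0, 1, 0),
        NalUnit(3, 1, 0, 0), NalUnit(4, 1, 1, 0), NalUnit(5, 2, 1, 1),
    ]
    for unit in extract_sub_stream(stream, target_t=1, target_d=1, target_q=0):
        print(unit)
```

This is the "simple bit-stream adaptation after encoding" listed among the targets: a network node only needs to read a few header bits per NAL unit.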
53
Combined Scalability
  • Bit-stream switching
  • Inside a dependency layer
  • Switching everywhere
  • Outside a dependency layer
  • Switching up only at IDR access units
  • Switching down everywhere if using multiple-loop
    decoding
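The switching rules can be written as a small decision function. The slide does not say when single-loop decoders may switch down between dependency layers; the sketch assumes they have to wait for an IDR access unit, and the function name is hypothetical.

```python
def can_switch(current_d: int, target_d: int,
               at_idr: bool, multi_loop: bool) -> bool:
    """Whether a decoder may switch layers at the current access unit.

    current_d, target_d: dependency_id before and after the switch.
    Quality-layer changes inside one dependency layer leave the
    dependency_id unchanged and are always allowed.
    """
    if target_d == current_d:
        return True                # inside a dependency layer: everywhere
    if target_d > current_d:
        return at_idr              # switching up: only at IDR access units
    # Switching down: everywhere with multi-loop decoding.
    # Assumption: otherwise only at IDR access units.
    return multi_loop or at_idr


if __name__ == "__main__":
    print(can_switch(1, 1, at_idr=False, multi_loop=False))
    print(can_switch(0, 1, at_idr=False, multi_loop=True))
    print(can_switch(1, 0, at_idr=False, multi_loop=True))
```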

54
Outline
  • Introduction
  • History of SVC
  • Structure of SVC
  • Temporal Scalability
  • Spatial Scalability
  • Quality Scalability
  • Combined Scalability
  • Profiles of SVC
  • Scalable Baseline
  • Scalable High
  • Scalable High Intra
  • Conclusions

55
Profiles of SVC
  • Scalable Baseline
  • For conversational and surveillance applications
    requiring low decoding complexity
  • Spatial scalability: fixed ratios (1, 1.5, or 2)
    and MB-aligned cropping
  • Temporal and quality scalability: arbitrary
  • No interlaced coding tools
  • B-slices, weighted prediction, CABAC, and 8x8
    luma transform
  • The base layer conforms to the Baseline profile of
    H.264/AVC

56
Profiles of SVC
  • Scalable High
  • For broadcast, streaming, and storage
  • Spatial, temporal, and quality scalability:
    arbitrary
  • The base layer conforms to the High profile of
    H.264/AVC
  • Scalable High Intra
  • Scalable High with only IDR pictures

57
Conclusions
  • Temporal scalability
  • Hierarchical prediction structure
  • Spatial and quality scalability
  • Inter-layer prediction of Intra, motion, and
    residual information
  • Single-loop MC decoding
  • Identical size for each spatial layer = CGS
  • CGS + key pictures + quality refinement layers =
    MGS
  • Applications
  • Power adaptation: decoding only the needed part of
    the video stream
  • Graceful degradation when the 'right' parts are
    lost
  • Format adaptation: backwards-compatible extension
    in mobile TV
  • What's next in SVC?
  • Bit-depth scalability (8-bit 4:2:0 → 10-bit
    4:2:0)
  • Color format scalability (4:2:0 → 4:4:4)

58
References
  • H. Schwarz, D. Marpe, and T. Wiegand, "Overview
    of the Scalable Video Coding Extension of the
    H.264/AVC Standard," CSVT, 2007.
  • T. Wiegand, "Scalable Video Coding," Joint Video
    Team, doc. JVT-W132, San Jose, USA, April 2007.
  • T. Wiegand, "Scalable Video Coding," Digital
    Image Communication, Course at Technical
    University of Berlin, 2006. (Available at
    http://iphome.hhi.de/wiegand/dic.htm)
  • H. Schwarz, D. Marpe, and T. Wiegand,
    "Constrained Inter-Layer Prediction for
    Single-Loop Decoding in Spatial Scalability,"
    Proc. of ICIP'05.