Title: Overview of H.264/AVC
Outline
- Abstract
- Applications
- Network Abstraction Layer (NAL)
- Conclusion (I)
- Design feature highlights
- Conclusion (II)
- Video Coding Layer (VCL)
- Profiles and potential applications
- Conclusion (III)
Abstract
- H.264/AVC is the newest video coding standard
- Its main goals have been enhanced compression and the provision of a network-friendly representation, addressing conversational (video telephony) and nonconversational (storage, broadcast, or streaming) applications
- H.264/AVC achieves a significant improvement in rate-distortion efficiency
- The scope of standardization is illustrated below
Applications
- Broadcast over cable, cable modem
- Interactive or serial storage on optical media and DVD
- Conversational services over LAN, modem
- Video-on-demand or streaming services over ISDN, wireless networks
- Multimedia messaging services (MMS) over DSL, mobile networks
- How to handle this variety of applications and networks?
Applications
- To address this need for flexibility and customizability, the H.264/AVC design consists of a VCL and a NAL; the structure of the H.264/AVC encoder is shown below
Applications
- VCL (video coding layer): designed to efficiently represent the video content
- NAL (network abstraction layer): formats the VCL representation of the video and provides header information in a manner appropriate for conveyance by a variety of transport layers or storage media
Network Abstraction Layer
- Provides network friendliness, enabling simple and effective customization of the use of the VCL
- Facilitates mapping H.264/AVC data to transport layers such as
  - RTP/IP for all kinds of real-time Internet services
  - File formats, e.g., ISO MP4, for storage
  - H.32X for conversational services
  - MPEG-2 systems for broadcasting services
- The design of the NAL anticipates a variety of such mappings
Network Abstraction Layer
- Some key concepts of the NAL are NAL units, byte-stream and packet-format uses of NAL units, parameter sets, and access units
- NAL units
  - A packet that contains an integer number of bytes
  - The first byte is a header byte indicating the type of data
  - The remaining bytes contain the payload data
  - Payload data is interleaved as necessary with emulation prevention bytes, preventing a start-code prefix from being generated inside the payload
  - A format is specified for use in both packet- and bitstream-oriented transport systems
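The header-byte split and the emulation prevention mechanism can be sketched as follows (a simplified illustration, not the normative encoder; the helper names are my own):

```python
def parse_nal_header(first_byte: int) -> dict:
    """Split the one-byte NAL unit header into its three fields."""
    return {
        "forbidden_zero_bit": first_byte >> 7,   # must be 0
        "nal_ref_idc": (first_byte >> 5) & 0x3,  # importance for referencing
        "nal_unit_type": first_byte & 0x1F,      # kind of payload
    }

def add_emulation_prevention(payload: bytes) -> bytes:
    """Insert an emulation prevention byte (0x03) after any two zero bytes
    that are followed by a byte <= 0x03, so the start-code prefix 0x000001
    can never appear inside the payload."""
    out, zeros = bytearray(), 0
    for b in payload:
        if zeros >= 2 and b <= 0x03:
            out.append(0x03)  # emulation prevention byte
            zeros = 0
        out.append(b)
        zeros = zeros + 1 if b == 0x00 else 0
    return bytes(out)
```

For example, `parse_nal_header(0x67)` reports `nal_unit_type` 7, a sequence parameter set.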
Network Abstraction Layer
- NAL units in byte-stream format use
  - Each NAL unit is prefixed by a unique start code to identify its boundary
  - Some systems require delivery of the NAL unit stream as an ordered stream of bytes (e.g., H.320 and MPEG-2/H.222.0)
- NAL units in packet-transport system use
  - Coded data is carried in packets framed by the system transport protocol
  - NAL units can be carried in data packets without start-code prefixes
  - In such systems, including start-code prefixes in the data would be a waste
Network Abstraction Layer
- VCL and non-VCL NAL units
  - VCL NAL units contain data that represents the values of the samples in the video pictures
  - Non-VCL NAL units contain additional data such as parameter sets and supplemental enhancement information (SEI)
  - Parameter sets: important header data that applies to a large number of VCL NAL units
  - SEI: timing information and other supplemental data that enhance the usability of the decoded video signal but are not necessary for decoding the sample values in the pictures
Network Abstraction Layer
- Parameter sets
  - Contain information that is expected to change rarely and that applies to the decoding of a large number of VCL NAL units
  - Divided into two types
    - Sequence parameter sets, which apply to a series of consecutive coded video pictures
    - Picture parameter sets, which apply to the decoding of one or more individual pictures within a coded video sequence
  - These two mechanisms decouple the transmission of infrequently changing information from the coded sample data
  - Parameter sets can be sent well ahead of the VCL NAL units and repeated to provide robustness against data loss
Network Abstraction Layer
- Parameter sets
  - A small amount of data (an identifier) can be used to refer to a larger amount of information (the parameter set)
  - In some applications, parameter sets may be sent within the channel carrying the VCL NAL units (termed in-band transmission)
Network Abstraction Layer
- Parameter sets
  - In other applications, it can be advantageous to convey the parameter sets out-of-band using a reliable transport mechanism
Network Abstraction Layer
- Access units
  - The format of an access unit is shown below
Network Abstraction Layer
- Access units
  - Contain a set of VCL NAL units that together compose a primary coded picture
  - May be prefixed with an access unit delimiter to aid in locating the start of the access unit
  - SEI contains data such as picture timing information
  - The primary coded picture consists of VCL NAL units containing slices that represent the samples of the video picture
  - Redundant coded pictures are available for use by a decoder in recovering from loss of data
Network Abstraction Layer
- Access units
  - For the last coded picture of a coded video sequence, an end-of-sequence NAL unit is present to indicate the end of the sequence
  - For the last coded picture in the entire NAL unit stream, an end-of-stream NAL unit is present to indicate that the stream is ending
  - Decoders are not required to decode redundant coded pictures if they are present
  - Decoding each access unit results in one decoded picture
Network Abstraction Layer
- Coded video sequences
  - Consist of a series of access units and use only one sequence parameter set
  - Can be decoded independently of other coded video sequences, given the necessary parameter sets
  - An instantaneous decoding refresh (IDR) access unit appears at the beginning and contains an intra picture
  - The presence of an IDR access unit indicates that no subsequent picture will reference pictures prior to its intra picture
Conclusion (I)
- H.264/AVC represents a number of advances in standard video coding technology in terms of flexibility for effective use over a broad variety of network types and application domains
Design feature highlights
- Variable block-size motion compensation with small block sizes
  - Minimum luma block size as small as 4x4
  - The matching chroma block is half the length and width
Design feature highlights
- Quarter-sample-accurate motion compensation
  - Half-sample positions are generated using a 6-tap FIR filter
  - First found in the advanced profile of MPEG-4, but H.264/AVC further reduces the complexity
- Multiple reference picture motion compensation
  - Extends the enhanced reference picture selection technique found in H.263
  - The encoder selects among a large number of pictures decoded and stored in the decoder for prediction
  - The same applies to bi-prediction, which is restricted in MPEG-2
Design feature highlights
- Decoupling of referencing order from display order
  - Prior standards imposed a strict dependency between the ordering of pictures for referencing and for display
  - H.264/AVC allows the encoder to choose the ordering of pictures for referencing and display purposes with a high degree of flexibility
  - This flexibility is constrained only by the total memory capacity
  - Removing the restriction enables removing the extra delay associated with bi-predictive coding
Design feature highlights
- Motion vectors over picture boundaries
  - Motion vectors are allowed to point outside the picture
  - Especially useful for small pictures and camera movement
- Decoupling of picture representation methods from picture referencing capability
  - In prior standards, bi-predictively encoded pictures could not be used as references
  - H.264/AVC gives the encoder more flexibility to use for referencing a picture that is closer to the picture being coded
Design feature highlights
- Weighted prediction
  - Allows the motion-compensated prediction signal to be weighted and offset by specified amounts
  - Improves coding efficiency for scenes containing fades
(Figure: one grid square represents one pixel)
Design feature highlights
- Improved skipped and direct motion inference
  - In prior standards, a skipped area of a predictively coded picture could not represent motion in the scene content, which is detrimental for video with global motion
  - H.264/AVC instead infers motion in skipped areas
  - For bi-predictively coded areas, it further improves on the direct prediction found in prior designs such as H.263 and MPEG-4
Design feature highlights
- Directional spatial prediction for intra coding
  - Extrapolation of the edges of previously decoded parts of the current picture is applied in intra-coded regions of the picture
  - Improves the quality of the prediction signal
  - Allows prediction from neighboring areas that were not intra-coded
Design feature highlights
- In-the-loop deblocking filtering
  - Block-based video coding produces artifacts known as blocking artifacts, originating from both the prediction and residual difference coding stages of the decoding process
  - The resulting improvement in quality is used in inter-picture prediction, improving the ability to predict other pictures
Design feature highlights
- In addition to improved prediction methods, coding efficiency is also enhanced by the following
  - Small block-size transform
    - All major prior video coding standards used a transform block size of 8x8, while the new design is based primarily on 4x4
    - Allows the encoder to represent the signal in a more locally adaptive fashion and reduces artifacts
  - Short word-length transform
    - Arithmetic processing requires only 16-bit rather than 32-bit precision
Design feature highlights
- Hierarchical block transform
  - Extends the effective block size for low-frequency information to an 8x8 array for chroma and a 16x16 array for luma
Design feature highlights
- Exact-match inverse transform
  - Previously, the transform was specified only within an error tolerance bound, due to the impracticality of obtaining an exact match to the ideal inverse transform
  - Each decoder design would produce slightly different decoded video, causing drift between encoder and decoder
- Arithmetic entropy coding
  - Previously found as an optional feature of H.263
  - H.264/AVC uses a powerful context-adaptive binary arithmetic coding (CABAC) method
Design feature highlights
- Context-adaptive entropy coding
  - Both CAVLC (context-adaptive variable-length coding) and CABAC use context-based adaptivity to improve performance
Design feature highlights
- Robustness to data errors/losses and flexibility for operation over a variety of network environments are enabled by the following
  - Parameter set structure
    - Key information is separated for handling in a more flexible and specialized manner
    - Provides for robust and efficient conveyance of header information
  - Flexible slice size
    - The rigid slice structure of MPEG-2 reduces coding efficiency by increasing the quantity of header data and decreasing the effectiveness of prediction
Design feature highlights
- NAL unit syntax structure
  - Each syntax structure in H.264/AVC is placed into a logical data packet called a NAL unit
  - Allows greater customization of the method of carrying the video content in a manner appropriate for each specific network
- Redundant pictures
  - Enhance robustness to data loss
  - Enable a representation of regions of pictures for which the primary representation has been lost
Design feature highlights
- Flexible macroblock ordering (FMO)
  - Partitions a picture into regions called slice groups, with each slice becoming an independently decodable subset of a slice group
  - Can significantly enhance robustness by managing the spatial relationship between the regions coded in each slice
- Arbitrary slice ordering (ASO)
  - Enables sending and receiving the slices of a picture in any order relative to each other, as found in H.263
  - Can improve end-to-end delay in real-time applications, particularly on networks with out-of-order delivery behavior
Design feature highlights
- Data partitioning
  - Allows the syntax of each slice to be separated into up to three different partitions (header data, intra residual data, and inter residual data), depending on a categorization of syntax elements
- SP/SI synchronization/switching pictures
  - Allow exact synchronization of the decoding process of some decoders with an ongoing video stream
  - Enable a decoder to switch between video streams that use different data rates
  - Enable switching between different kinds of video streams and recovery from data loss or errors
Design feature highlights
- SP/SI synchronization/switching pictures (illustrated in the figures)
Conclusion (II)
- H.264/AVC represents a number of advances in standard video coding technology, in terms of both coding efficiency enhancement and flexibility for effective use over a broad variety of network types and application domains
Video Coding Layer
- Pictures, frames, and fields
  - A picture can represent either an entire frame or a single field
  - If the two fields of a frame were captured at different time instants, the frame is referred to as an interlaced frame; otherwise it is referred to as a progressive frame
Video Coding Layer
- YCbCr color space and 4:2:0 sampling
  - Y represents brightness
  - Cb and Cr represent the extent to which the color deviates from gray toward blue and red, respectively
- Division of the picture into macroblocks
- Slices and slice groups
  - Slices are a sequence of macroblocks processed in raster-scan order when not using FMO
  - Some information from other slices may be needed to apply the deblocking filter across slice boundaries
Video Coding Layer
- A picture may be split into one or more slices without FMO, as shown below
- FMO modifies the way pictures are partitioned into slices and MBs by using slice groups
- A slice group is a set of MBs defined by a macroblock-to-slice-group map, which is specified by the picture parameter set and some information from the slice headers
Video Coding Layer
- A slice group can be partitioned into one or more slices, such that a slice is a sequence of MBs within the same slice group, processed in raster-scan order
- By using FMO, a picture can be split into many macroblock scanning patterns, such as those below
Video Coding Layer
- Each slice can be coded using a different type
  - I slice: a slice in which all MBs are coded using intra prediction
  - P slice: in addition to the intra coding types, MBs can be coded using inter prediction with at most one motion-compensated prediction signal per prediction block
  - B slice: in addition to the coding types of a P slice, MBs can be coded using inter prediction with two motion-compensated prediction signals per prediction block
  - SP (switching P) slice: enables efficient switching between different pre-coded pictures
  - SI (switching I) slice: allows an exact match of a macroblock in an SP slice, for random access and error recovery purposes
Video Coding Layer
- If all slices in stream B are P slices, the decoder will not have the correct reference frames when switching; one solution is to code a frame as an I slice, as below
- I slices result in a peak in the coded bit rate at each switching point
Video Coding Layer
- SP slices are designed to support switching without the increased bit-rate penalty of I slices
- Unlike a normal P slice, the subtraction occurs in the transform domain
Video Coding Layer
- A simplified diagram of the encoding and decoding processes for SP slices A2, B2, and AB2 is shown (A' denotes a reconstructed frame)
Video Coding Layer
- If streams A and B are versions of the same original sequence coded at different bit rates, the SP slice AB2 should be coded efficiently
Video Coding Layer
- Another use of SP slices is to provide random access and VCR-like functionalities (e.g., a decoder can fast-forward from frame A0 directly to frame A10 by first decoding A0 and then decoding SP slice A0-10)
- A second type of switching slice, the SI slice, may be used to switch from one sequence to a completely different sequence
Video Coding Layer
- Encoding and decoding process for macroblocks
  - All luma and chroma samples of a MB are either spatially or temporally predicted
  - Each color component of the prediction residual is subdivided into 4x4 blocks, transformed using an integer transform, and then quantized and encoded by entropy coding methods
  - The input video signal is split into MBs, and the association of MBs to slice groups and slices is selected
  - Efficient parallel processing of MBs is possible when there are multiple slices in the picture
Video Coding Layer
- Encoding and decoding process for macroblocks
  - A block diagram of the VCL for a MB is shown in the following
Video Coding Layer
- Adaptive frame/field coding operation
  - In regions of moving objects or camera motion, two adjacent rows show a reduced degree of dependency in interlaced frames compared with progressive frames
  - To provide high coding efficiency, H.264/AVC allows the following decisions when coding a frame
    - Combine the two fields and code them as one single frame (frame mode)
    - Not combine the two fields and code them as separate coded fields (field mode)
    - Combine the two fields and compress them as a single frame, but before coding, split each pair of vertically adjacent MBs into either a pair of field MBs or a pair of frame MBs
Video Coding Layer
- The choice among the three options can be made adaptively; the first two are referred to as picture-adaptive frame/field (PAFF) coding
- When a frame is coded as two fields, each field is coded in a way similar to a frame, except for the following
  - Motion compensation utilizes reference fields rather than reference frames
  - The zig-zag scan of transform coefficients is different
  - Strong deblocking is not used for filtering horizontal edges of MBs in fields
- When a frame consists of mixed regions, it is efficient to code the nonmoving regions in frame mode and the moving regions in field mode
Video Coding Layer
- A frame/field encoding decision can also be made independently for each vertical pair of MBs. This coding option is referred to as macroblock-adaptive frame/field (MBAFF) coding. The figure below shows the MBAFF MB-pair concept.
Video Coding Layer
- An important distinction between PAFF and MBAFF is that in MBAFF, one field cannot use the MBs in the other field of the same frame
- Sometimes PAFF coding can be more efficient than MBAFF coding, particularly in the case of rapid global motion, scene changes, or intra picture refreshes
Video Coding Layer
- Intra-frame prediction
  - In all slice coding types, Intra_4x4 and Intra_16x16 are supported, together with chroma prediction and the I_PCM prediction mode
  - The Intra_4x4 mode is based on 4x4 luma blocks and is well suited for picture regions with significant detail
  - When using this mode, each 4x4 block is predicted from the neighboring samples, as shown below
Video Coding Layer
- Intra-frame prediction
  - 4x4 block prediction modes
    - Except for the DC mode, each is suited to predict textures with structure in the specified direction
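As an illustrative sketch (my own helper names, not the normative decoder), the vertical and DC modes of Intra_4x4 prediction can be written as:

```python
def intra4x4_vertical(top):
    """Mode 0 (vertical): each column is predicted by the sample above it."""
    return [list(top) for _ in range(4)]

def intra4x4_dc(top, left):
    """Mode 2 (DC): every sample is the rounded mean of the 8 neighboring
    samples above and to the left."""
    dc = (sum(top) + sum(left) + 4) >> 3  # +4 for rounding, /8
    return [[dc] * 4 for _ in range(4)]
```

The directional modes (diagonal, horizontal, etc.) follow the same pattern, extrapolating the neighbors along the mode's direction.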
Video Coding Layer
- Intra-frame prediction
  - In an earlier draft, the four samples below L were also used for some prediction modes; they were dropped due to the need to reduce memory accesses
  - Intra modes for neighboring 4x4 blocks are highly correlated. For example, if previously encoded 4x4 blocks A and B were predicted using mode 2, it is likely that the best mode for block C is also mode 2.
Video Coding Layer
- Intra-frame prediction
  - The Intra_16x16 mode is well suited for smooth areas of a picture
  - This mode supports vertical, horizontal, DC, and plane prediction
  - Plane prediction works well in areas of smoothly varying luminance
Video Coding Layer
- Intra-frame prediction
  - The chroma of a MB is predicted using a technique similar to Intra_16x16 (the same four modes)
  - The I_PCM mode allows the encoder to bypass the prediction and transform coding processes and instead directly send the values of the encoded samples
  - The I_PCM mode serves the following purposes
    - Allows the encoder to precisely represent the values of the samples
    - Provides a way to accurately represent the values of anomalous picture content
    - Enables placing a hard limit on the number of bits a decoder must handle for a MB, without harm to coding efficiency
Video Coding Layer
- Intra-frame prediction
  - A constrained intra coding mode allows prediction only from intra-coded neighboring MBs
  - Intra prediction across slice boundaries is not used
  - Referring to neighboring samples of previously coded blocks may incur error propagation in environments with transmission errors
Video Coding Layer
- Inter-frame prediction
  - In P slices
    - Each P MB type is partitioned into partitions as shown below
    - This method of partitioning a MB is known as tree-structured motion compensation
Video Coding Layer
- Inter-frame prediction
  - Choosing a larger partition size means
    - A small number of bits is required to signal the choice of motion vector(s) and the type of partition
    - The motion-compensated residual may contain a significant amount of energy in frame areas with high detail
  - Choosing a smaller partition size means
    - A lower-energy residual after motion compensation
    - A larger number of bits is required to signal the motion vectors and the type of partition
  - The accuracy of motion compensation is in units of one quarter of the distance between two luma samples
Video Coding Layer
- Inter-frame prediction
  - Half-sample values are obtained by applying a one-dimensional 6-tap FIR filter vertically and horizontally
  - The 6-tap interpolation filter is relatively complex but produces a more accurate fit to the integer-sample data, and hence better motion compensation performance
  - Quarter-sample values are generated by averaging samples at integer- and half-sample positions
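The interpolation arithmetic can be sketched as follows (the (1, -5, 20, 20, -5, 1)/32 filter of the standard; the function names are my own):

```python
def half_sample(E, F, G, H, I, J):
    """Half-sample position between integer samples G and H, via the 6-tap
    FIR filter (1, -5, 20, 20, -5, 1)/32 with rounding and clipping."""
    b = (E - 5 * F + 20 * G + 20 * H - 5 * I + J + 16) >> 5
    return max(0, min(255, b))

def quarter_sample(x, y):
    """Quarter-sample value: rounded average of the two nearest
    integer- or half-sample values."""
    return (x + y + 1) >> 1
```

On a flat region the filter is transparent: `half_sample(10, 10, 10, 10, 10, 10)` returns 10.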
Video Coding Layer
- The figure above illustrates the half-sample interpolation
Video Coding Layer
- Inter-frame prediction
  - The following illustrates the luma quarter-sample positions
    - a = round((G + b) / 2)
    - d = round((G + h) / 2)
    - e = round((h + b) / 2)
Video Coding Layer
- The predictions for the chroma components are obtained by bilinear interpolation
- The displacements used for chroma have one-eighth-sample position accuracy
- a = round(((8 - dx)(8 - dy)A + dx(8 - dy)B + (8 - dx)dyC + dx·dy·D) / 64)
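The formula above transcribes directly into code (integer rounding done by adding 32 before the shift):

```python
def chroma_bilinear(A, B, C, D, dx, dy):
    """Bilinear chroma interpolation at eighth-sample offsets dx, dy (0..7)
    between the four surrounding integer samples A (top-left), B (top-right),
    C (bottom-left), and D (bottom-right)."""
    acc = ((8 - dx) * (8 - dy) * A + dx * (8 - dy) * B
           + (8 - dx) * dy * C + dx * dy * D)
    return (acc + 32) >> 6  # round(acc / 64)
```

With dx = dy = 0 the result is exactly A, and equal neighbors are reproduced unchanged for any offset.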
Video Coding Layer
- Inter-frame prediction
  - Motion prediction using full-, half-, and quarter-sample accuracy offers improvements over previous standards for two reasons
    - More accurate motion representation
    - More flexibility in prediction filtering
  - Motion vectors over picture boundaries are allowed
  - No motion vector prediction takes place across slice boundaries
  - Motion compensation for regions smaller than 8x8 uses the same reference index for prediction of all blocks within the 8x8 region
Video Coding Layer
- Inter-frame prediction
  - Choices of neighboring partitions of the same and different sizes are shown below
  - For transmitted partitions, excluding the 16x8 and 8x16 partition sizes, MVp is the median of the motion vectors of partitions A, B, and C
Video Coding Layer
- For 16x8 partitions, MVp for the upper 16x8 partition is predicted from B, and MVp for the lower 16x8 partition is predicted from A
- For 8x16 partitions, MVp for the left 8x16 partition is predicted from A, and MVp for the right 8x16 partition is predicted from C
- For skipped macroblocks, a 16x16 vector MVp is generated as in case (1) above
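The median rule for the common case can be sketched as a component-wise median (helper names are mine):

```python
def median_mv(mv_a, mv_b, mv_c):
    """Predicted motion vector MVp as the component-wise median of the
    motion vectors of neighbors A (left), B (above), and C (above-right)."""
    med = lambda x, y, z: sorted((x, y, z))[1]
    return (med(mv_a[0], mv_b[0], mv_c[0]),
            med(mv_a[1], mv_b[1], mv_c[1]))
```

The median makes the predictor robust to a single outlier neighbor: one deviating vector does not drag MVp away from the other two.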
Video Coding Layer
- A P MB can be coded in the P_Skip type; this is useful because large areas with no change or constant motion, such as slow panning, can be represented with very few bits
- Multi-picture motion compensation is supported, as shown below
Video Coding Layer
- In B slices
  - Intra coding types are also supported
  - Four other types are supported: list 0, list 1, bi-predictive, and direct prediction
  - For the bi-predictive mode, the prediction signal is formed by a weighted average of the motion-compensated list 0 and list 1 prediction signals
  - The direct mode can be list 0 or list 1 prediction, or bi-predictive
  - Multi-frame motion compensation is supported
Video Coding Layer
- Transform, scaling, and quantization
  - The transform is applied to 4x4 blocks
  - Instead of the DCT, a separable integer transform with similar properties to the DCT is used
  - Inverse-transform mismatches are avoided
  - At the encoder: transform, scanning, scaling, and rounding as part of quantization, followed by entropy coding
  - At the decoder: the inverse of the encoding process is performed, except for the rounding
  - The inverse transform is implemented using only additions and bit-shifting operations on 16-bit values
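The forward core transform can be sketched as the matrix product Y = Cf · X · CfT with the well-known integer matrix (a plain-Python illustration, not the normative code; the scaling the matrix implies is folded into quantization):

```python
# H.264/AVC 4x4 forward core transform matrix, an integer
# approximation of the 4x4 DCT.
Cf = [[1,  1,  1,  1],
      [2,  1, -1, -2],
      [1, -1, -1,  1],
      [1, -2,  2, -1]]

def matmul(A, B):
    """4x4 integer matrix product."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def forward_core_transform(X):
    """Y = Cf * X * Cf^T; realizable with additions and shifts only."""
    Ct = [[Cf[j][i] for j in range(4)] for i in range(4)]
    return matmul(matmul(Cf, X), Ct)
```

A constant 4x4 block maps to a single DC coefficient, as expected of a DCT-like transform.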
Video Coding Layer
- Several reasons for using the smaller-size transform
  - It removes statistical correlation efficiently
  - It has visual benefits, resulting in less noise around edges
  - It requires fewer computations and a smaller processing word length
- The quantization parameter (QP) can take 52 values
  - Qstep doubles in size for every increment of 6 in QP
  - Each increase of 1 in QP increases Qstep by approximately 12.5%
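The QP-to-Qstep rule can be sketched as follows (the six base values for QP 0-5 are the commonly cited ones; treat them as illustrative of the doubling rule):

```python
def qstep(qp: int) -> float:
    """Quantizer step size: a table of 6 base values that repeats with a
    doubling every 6 QP, so qstep(qp + 6) == 2 * qstep(qp)."""
    base = [0.625, 0.6875, 0.8125, 0.875, 1.0, 1.125]  # QP 0..5
    return base[qp % 6] * (1 << (qp // 6))
```

With 52 QP values this yields a step-size range spanning roughly three orders of magnitude, which is what gives the encoder its fine rate control.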
Video Coding Layer
- The wide range of quantizer step sizes makes it possible for the encoder to control the trade-off between bit rate and quality accurately and flexibly
- The values of QP may be different for luma and chroma; QPchroma is derived from QPY via a user-defined offset
- 4x4 luma DC coefficient transform and quantization (16x16 intra mode only)
  - The DC coefficient of each 4x4 block is transformed again using a 4x4 Hadamard transform
  - In an intra-coded MB, much of the energy is concentrated in the DC coefficients, and this extra transform helps to de-correlate the 4x4 luma DC coefficients
Video Coding Layer
- 2x2 chroma DC coefficient transform and quantization: as with the intra luma DC coefficients, the extra transform helps to de-correlate the 2x2 chroma DC coefficients and improves compression performance
- The complete process
  - Encoding
    - Input: 4x4 residual samples
    - Forward core transform
    - (followed by a forward transform for chroma DC or Intra-16 luma DC coefficients)
    - Post-scaling and quantization
    - (modified for chroma DC or Intra-16 luma DC)
Video Coding Layer
- Decoding
  - (inverse transform for chroma DC or Intra-16 luma DC coefficients)
  - Re-scaling (incorporating inverse transform pre-scaling)
  - (modified for chroma DC or Intra-16 luma DC coefficients)
  - Inverse core transform
  - Post-scaling
  - Output: 4x4 residual samples
Video Coding Layer
- Flow chart
- An additional 2x2 transform is also applied to the DC coefficients of the four 4x4 blocks of chroma
Video Coding Layer
- Entropy coding
  - The simpler method uses a single infinite-extent codeword table for all syntax elements except the transform residuals
  - The mapping to the codeword table is customized according to the data statistics
  - The chosen codeword table is an Exp-Golomb code with simple and regular decoding properties
  - In CAVLC, VLC tables for various syntax elements are switched depending on already-transmitted syntax elements
  - In CAVLC, the number of non-zero quantized coefficients and the actual sizes and positions of the coefficients are coded separately
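The Exp-Golomb construction mentioned above is simple enough to sketch: the codeword for an unsigned value v is the binary form of v + 1, preceded by one leading zero per bit after the first.

```python
def exp_golomb_encode(v: int) -> str:
    """Unsigned Exp-Golomb: 0 -> '1', 1 -> '010', 2 -> '011', 3 -> '00100'."""
    info = bin(v + 1)[2:]                # binary representation of v + 1
    return "0" * (len(info) - 1) + info  # zero prefix marks the length

def exp_golomb_decode(code: str) -> int:
    """Inverse mapping: count leading zeros, read that many more bits."""
    zeros = code.index("1")
    return int(code[zeros:2 * zeros + 1], 2) - 1
```

The "infinite extent" of the table comes from this rule: every non-negative integer gets a decodable codeword without any stored table.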
Video Coding Layer
- Entropy coding
  - The VLC tables are designed to match the corresponding conditioned statistics
  - CAVLC encoding of a block of transform coefficients proceeds as follows
    - Encode the number of non-zero coefficients and trailing ones
      - Encode the total number of non-zero coefficients (TotalCoeffs, range 0-16) and the number of trailing ±1 values (T1, range 0-3), jointly as coeff_token
      - There are 4 look-up tables for coeff_token (3 VLC and 1 FLC)
    - Encode the sign of each T1
      - Coded in reverse order, starting with the highest frequency
Video Coding Layer
- Entropy coding
  - Encode the levels of the remaining non-zero coefficients
    - Coded in reverse order
    - There are 7 VLC tables to choose from
    - The choice of table adapts depending on the magnitude of each coded level
  - Encode the total number of zeros before the last coefficient
    - TotalZeros is the sum of all zeros preceding the highest non-zero coefficient in the reordered array
    - Coded with a VLC
  - Encode each run of zeros
    - Encoded in reverse order
    - The VLC table is chosen depending on ZerosLeft and run_before
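The first counting step above can be sketched as follows (a hypothetical helper of my own; the actual coeff_token table lookup is omitted):

```python
def coeff_token_stats(coeffs):
    """Scan a reordered (zig-zag) coefficient list and return
    (TotalCoeffs, T1): the number of non-zero coefficients and the
    number of trailing +/-1 values, capped at 3."""
    total = sum(1 for c in coeffs if c != 0)
    t1 = 0
    for c in reversed(coeffs):  # from the highest frequency backwards
        if c == 0:
            continue            # trailing zeros are skipped
        if abs(c) == 1 and t1 < 3:
            t1 += 1
        else:
            break               # a larger level ends the trailing-ones run
    return total, t1
```

These two counts select the coeff_token codeword; the signs of the T1 values and the remaining levels are then coded separately, as listed above.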
Video Coding Layer
- CABAC allows the assignment of a non-integer number of bits to each symbol of an alphabet
- The use of adaptive codes permits adaptation to non-stationary symbol statistics
- Statistics of already-coded syntax elements are used to estimate the conditional probabilities used for switching between several estimated probability models
- The arithmetic coding core engine and its associated probability estimation are specified as multiplication-free, low-complexity methods using only shifts and table look-ups
Video Coding Layer
- Coding a data symbol involves the following stages (taking MVDx as an example)
  - Binarization
    - For |MVDx| < 9, binarization is carried out using the following table; larger values are binarized with an Exp-Golomb codeword
    - The first bit is bin 1, the second bit is bin 2, and so on
Video Coding Layer
- Coding a data symbol involves the following stages (taking MVDx as an example)
  - Context model selection
    - Performed according to the following table
  - Arithmetic encoding
    - The selected context model supplies two probability estimates (for "1" and "0") that determine the sub-range used by the arithmetic coder
Video Coding Layer
- Coding a data symbol involves the following stages (taking MVDx as an example)
  - Probability update
    - If the value of bin 1 is 0, the frequency count of "0" is incremented
Video Coding Layer
- In-loop deblocking filter
  - Applied between the inverse transform and the reconstruction of the MB, within the prediction loop
  - A particular characteristic of block-based coding is the accidental production of visible block structures
  - Block edges are reconstructed with less accuracy than interior pixels, and blocking is among the most visible artifacts
  - The filter has two benefits
    - Block edges are smoothed
    - The filtered pictures yield smaller residuals after prediction
  - The filter is adaptive: the strength of filtering is controlled by the values of several syntax elements
Video Coding Layer
- In-loop deblocking filter
  - The basic idea is that if a relatively large absolute difference between samples near a block edge is measured, it is quite likely a blocking artifact and should be reduced
  - If the magnitude of the difference is so large that it cannot be explained by coarse quantization, it more likely reflects the actual behavior of the picture and should not be smoothed
  - Filtering is applied to the edges of 4x4 blocks
Video Coding Layer
- In-loop deblocking filter
  - Filtering is applied to the edges of 4x4 blocks
  - The choice of filtering outcome depends on the boundary strength and on the gradient of the image samples across the boundary
Video Coding Layer
- In-loop deblocking filter
  - The boundary strength Bs is chosen according to the following table
  - Filter implementation
    - Bs ∈ {1, 2, 3}: a 4-tap linear filter is applied
    - Bs = 4: 3-, 4-, or 5-tap linear filters may be used
Video Coding Layer
- The figure below shows the principle using a one-dimensional edge
- Whether samples p0 and q0, as well as p1 and q1, are filtered is determined using quantization-parameter-dependent thresholds α(QP) and β(QP), where β(QP) is smaller than α(QP)
Video Coding Layer
- Filtering of p0 and q0 takes place if each of the following is satisfied
  - 1. |p0 - q0| < α(QP)
  - 2. |p1 - p0| < β(QP)
  - 3. |q1 - q0| < β(QP)
- Filtering of p1 and q1 takes place if the corresponding condition below is satisfied
  - 1. |p2 - p0| < β(QP)
  - or 2. |q2 - q0| < β(QP)
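The sample-filtering decision above translates directly into code (a sketch of the conditions only; here α and β are passed in rather than looked up from QP):

```python
def filter_p0_q0(p1, p0, q0, q1, alpha, beta):
    """True when edge samples p0/q0 should be filtered: the step across
    the edge is small enough to be a quantization artifact, not detail."""
    return (abs(p0 - q0) < alpha and
            abs(p1 - p0) < beta and
            abs(q1 - q0) < beta)

def filter_p1(p2, p0, beta):
    """Additional condition for filtering the second sample p1
    (symmetrically, q2/q0 for q1)."""
    return abs(p2 - p0) < beta
```

A large step across the edge (e.g. a real object boundary) fails the α test and is left untouched, which is exactly the adaptivity described above.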
Video Coding Layer
- (Figures: Foreman.cif at 30 Hz and Foreman.qcif at 10 Hz)
Video Coding Layer
- Hypothetical reference decoder (HRD)
  - For a standard, it is not sufficient to provide only a coding algorithm
  - It is important in real-time systems to specify how bits are fed to a decoder and how the decoded pictures are removed from the decoder
  - Specifying input and output buffer models yields an implementation-independent model of a receiver, called the HRD
  - The HRD specifies the operation of two buffers
    - The coded picture buffer (CPB)
    - The decoded picture buffer (DPB)
Video Coding Layer
- The CPB models the arrival and removal times of the coded bits
- The HRD is more flexible in supporting the sending of video at a variety of bit rates without excessive delay
- The HRD also specifies DPB management to ensure that excessive memory capacity is not needed
Profiles and potential applications
- Profiles
  - Three profiles are defined: Baseline, Main, and Extended
  - The Baseline profile supports all features except the following two sets
    - Set 1: B slices, weighted prediction, CABAC, field coding, and picture- or MB-adaptive switching between frame and field coding
    - Set 2: SP/SI slices and slice data partitioning
  - The Main profile supports the first set above, but not FMO, ASO, or redundant pictures
  - The Extended profile supports all features of the Baseline profile and both sets above, except for CABAC
Profiles and potential applications
- Areas where the profiles of the new standard may be used
  - A list of possible application areas is given below
  - Conversational services
    - H.320 conversational video services utilizing circuit-switched ISDN-based video conferencing
    - H.323 conversational services over the Internet with best-effort IP/RTP protocols
  - Entertainment video applications
    - Broadcast via satellite, cable, or DSL
    - DVD for standard definition
    - VOD (video on demand) via various channels
Profiles and potential applications
- Streaming services
  - 3GPP streaming using IP/RTP for transport and RTSP for session setup
  - Streaming over the wired Internet using the IP/RTP protocol and RTSP for session setup
- Other services
  - 3GPP multimedia messaging services
  - Video mail
Conclusion (III)
- The VCL design of H.264/AVC is based on conventional block-based hybrid video coding concepts, but with some important differences relative to prior standards, as summarized below
  - Enhanced motion-prediction capability
  - Use of a small block-size, exact-match transform
  - Adaptive in-loop deblocking filter
  - Enhanced entropy coding methods