Context-based adaptive binary arithmetic coding in the H.264/AVC video compression

Description:

UEG0 binarization for encoding of absolution values of transform coefficient levels. ... Finally, the absolute value of the level as well as the sign is ... – PowerPoint PPT presentation

Slides: 56
Provided by: NTU
Transcript and Presenter's Notes


1
Context-based adaptive binary arithmetic coding
in the H.264/AVC video compression
CABAC
  • IEEE CSVT July 2003
  • Detlev Marpe, Heiko Schwarz, and Thomas Wiegand
  • 2003/11/04
  • Presented by Chen-hsiu Huang

2
Outline
  • Introduction
  • The CABAC framework
  • Detailed description of CABAC
  • Experimental result
  • Conclusion

3
Past deficiencies
  • Entropy coding in MPEG-2, H.263, and MPEG-4 (SP)
    is based on fixed tables of VLCs.
  • With VLCs, coding events with a probability
    greater than 0.5 cannot be efficiently represented.
  • The usage of fixed VLC tables does not allow
    adaptation to the actual symbol statistics.
  • Since there is a fixed assignment of VLC tables
    to syntax elements, existing inter-symbol
    redundancies cannot be exploited.

4
Solutions
  • The first hybrid block-based video coding scheme
    that incorporates an adaptive binary arithmetic
    coder was presented in [6].
  • The first standard that uses an arithmetic entropy
    coder is given by Annex E of H.263 [4].
  • However, its major drawbacks are the following:
  • Annex E is applied to the same syntax elements as
    the VLC method of H.263.
  • All the probability models are non-adaptive, in
    that their underlying probability distributions
    are assumed to be static.
  • The generic m-ary arithmetic coder used involves
    a considerable amount of computational complexity.

5
The CABAC Framework
  • binarization → context modeling → binary
    arithmetic coding

Figure 1.
6
Binarization
  • Consider the value 3 of mb_type, which signals
    the macroblock type P_8x8; its bin string is 001.
  • The symbol probability p(3) is equal to the
    product $p^{(C0)}(0) \cdot p^{(C1)}(0) \cdot p^{(C2)}(1)$,
    where C0, C1, and C2 denote the binary
    probability models of the internal nodes.
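
The product rule above can be sketched in a few lines of Python. This is only an illustration: the function name `symbol_probability` and the node probabilities used in the example are assumptions, not values from the slides.

```python
def symbol_probability(bins, p_one):
    """Probability of a bin string as the product of per-node bin probabilities.

    bins  -- list of 0/1 bin values along the path through the binary tree
    p_one -- p_one[i] is the (hypothetical) probability that bin i equals 1
             under the model attached to internal node C_i
    """
    prob = 1.0
    for b, p1 in zip(bins, p_one):
        prob *= p1 if b == 1 else 1.0 - p1
    return prob


# Bin string 001 for mb_type = 3, with made-up node probabilities of 0.5 each.
p3 = symbol_probability([0, 0, 1], [0.5, 0.5, 0.5])
```

With adaptive models, each entry of `p_one` would change over time, which is exactly what binarization makes cheap to track.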

Figure 2.
7
  • Adaptive m-ary arithmetic coding (m > 2)
    generally requires at least two multiplications
    for each symbol to encode, as well as a number of
    fairly complex operations to perform the
    probability update [36].
  • In contrast, there are fast, multiplication-free
    variants of binary arithmetic coding, one of
    which was specifically developed for the CABAC
    framework, as described below.
  • Since the probability of symbols with larger bin
    strings is typically very low, the computational
    overhead is fairly small and can easily be
    compensated for by using a fast binary coding
    engine.
  • Finally, binarization enables context modeling at
    the sub-symbol level. For the most frequently
    observed bins, conditional probabilities can be
    used, while less frequently observed bins can be
    treated using a joint, typically zero-order
    probability model.

8
Binarization Schemes
  • A binary representation for a given non-binary
    valued syntax element should be close to a
    minimum-redundancy code.
  • Instead of Huffman trees, the design of CABAC
    (mostly) relies on a few basic code trees, whose
    structure enables a simple on-line computation of
    all code words without the need for storing any
    tables:
  • Unary code (U) and truncated unary code (TU)
  • The kth order Exp-Golomb code (EGk)
  • The fixed-length code (FL)
  • In all these binarization schemes, a longer
    codeword corresponds to a less probable value.
  • In addition, there are binarization schemes based
    on a concatenation of these elementary types.
  • As an exception, there are five specific binary
    trees selected manually for the coding of
    macroblock and sub-macroblock types. Two of them
    are shown in Figure 2.

9
Unary and Truncated Unary Binarization
  • For each unsigned integer valued symbol $x \ge 0$,
    the unary code word in CABAC consists of x "1"
    bits plus a terminating "0" bit.
  • The truncated unary (TU) code is only defined for
    x with $0 \le x \le S$, where for $x < S$ the code is
    given by the unary code, whereas for $x = S$ the
    terminating "0" bit is omitted.
  • For example:
  • U: 5 → 111110
  • TU with S = 9:
  • 6 → 1111110
  • 9 → 111111111
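
A minimal Python sketch of the two codes follows, reproducing the examples above. The function names are illustrative, and bin strings are represented as text for readability rather than as packed bits.

```python
def unary(x):
    """Unary bin string: x '1' bits followed by a terminating '0' bit."""
    if x < 0:
        raise ValueError("x must be non-negative")
    return "1" * x + "0"


def truncated_unary(x, S):
    """Truncated unary bin string with cutoff S.

    Identical to the unary code for x < S; for x == S the terminating
    '0' bit is dropped.
    """
    if not 0 <= x <= S:
        raise ValueError("x must satisfy 0 <= x <= S")
    return "1" * x if x == S else unary(x)


print(unary(5))               # 111110
print(truncated_unary(6, 9))  # 1111110
print(truncated_unary(9, 9))  # 111111111
```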

10
kth order Exp-Golomb Binarization
  • The prefix part of the EGk codeword consists of a
    unary code corresponding to the value
    $l(x) = \lfloor \log_2(x/2^k + 1) \rfloor$.
  • The EGk suffix part is computed as the binary
    representation of $x + 2^k (1 - 2^{l(x)})$ using
    $k + l(x)$ significant bits.
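
The prefix and suffix rules can be sketched as below. The sketch assumes the unary prefix uses the CABAC convention from the previous slide ("1" bits with a terminating "0"); the function name `exp_golomb` is illustrative.

```python
def exp_golomb(x, k):
    """kth order Exp-Golomb (EGk) bin string of a non-negative integer x."""
    if x < 0 or k < 0:
        raise ValueError("x and k must be non-negative")
    # l(x) = floor(log2(x / 2^k + 1)), computed with integers only:
    # floor(log2(x + 2^k)) - k
    l = (x + (1 << k)).bit_length() - 1 - k
    prefix = "1" * l + "0"                      # unary code of l(x)
    n_bits = k + l                              # number of suffix bits
    suffix_value = x + (1 << k) - (1 << n_bits) # x + 2^k * (1 - 2^l)
    suffix = format(suffix_value, "b").zfill(n_bits) if n_bits else ""
    return prefix + suffix


print(exp_golomb(0, 0))  # 0
print(exp_golomb(1, 0))  # 100
print(exp_golomb(3, 0))  # 11000
print(exp_golomb(0, 3))  # 0000
```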

11
Fixed-Length Binarization
  • Let x denote a given value of such a syntax
    element, where $0 \le x < S$. Then the FL codeword
    of x is simply given by the binary
    representation of x with a fixed (minimum) number
    $l_{FL} = \lceil \log_2 S \rceil$ of bits.
  • Typically, FL binarization is applied to syntax
    elements with a nearly uniform distribution or to
    syntax elements where each bit in the FL binary
    representation represents a specific coding
    decision.
  • An example is the part of the coded block pattern
    symbol related to the luminance residual data.
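
As a small sketch of the FL rule, the following Python function pads the binary representation of x to $\lceil \log_2 S \rceil$ bits. The function name is illustrative, and writing the most significant bit first is an assumption made here for readability; the slides do not state the bin order.

```python
def fixed_length(x, S):
    """FL bin string of x, where 0 <= x < S, using ceil(log2(S)) bits.

    Assumption: bits are written most significant bit first.
    """
    if S < 1 or not 0 <= x < S:
        raise ValueError("require S >= 1 and 0 <= x < S")
    n_bits = (S - 1).bit_length()  # equals ceil(log2(S)) for S >= 1
    return format(x, "b").zfill(n_bits) if n_bits else ""


print(fixed_length(5, 8))   # 101
print(fixed_length(3, 16))  # 0011
```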

12
Concatenation schemes
  • Three binarization schemes are derived by
    concatenation.
  • The first is a concatenation of a 4-bit FL prefix,
    as a representation of the luminance-related part
    of the coded block pattern, and a TU suffix with
    S = 2 representing the chrominance-related part of
    coded_block_pattern.
  • Both the second and third are derived from the TU
    and the EGk binarization. These schemes, referred
    to as Unary/kth order Exp-Golomb (UEGk)
    binarizations, are applied to motion vector
    differences and absolute values of transform
    coefficient levels.

13
  • The design of these concatenated binarization
    schemes is motivated by the following
    observations.
  • First, the unary code is the simplest prefix-free
    code in terms of implementation cost.
  • Second, it permits a fast adaptation of the
    individual symbol probabilities in the
    subsequent context modeling stage, since the
    arrangement of the nodes in the corresponding
    tree is typically such that, with increasing
    distance of the internal nodes from the root node,
    the corresponding binary probabilities are less
    skewed.
  • These observations are only accurate for small
    values of the absolute motion vector differences
    and transform coefficient levels. For larger
    values, there is not much use for adaptive
    modeling, which leads to the idea of concatenating
    an adaptive truncated unary prefix with a static
    Exp-Golomb suffix.

14
E.g. mvd, motion vector difference
  • For the prefix part of the UEGk bin string, a TU
    binarization with a cutoff S = 9 is invoked for
    $\min(|mvd|, 9)$.
  • If mvd is equal to zero, the bin string consists
    only of the prefix codeword "0".
  • If the condition $|mvd| \ge 9$ holds, the suffix is
    constructed as an EG3 codeword for the value of
    $|mvd| - 9$, to which the sign of mvd is appended
    using the sign bit "1" for a negative mvd and "0"
    otherwise. For mvd values with $0 < |mvd| < 9$, the
    suffix consists only of the sign bit.
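
The three rules above can be combined into one short sketch. The helper `exp_golomb` and the function name `ueg3_mvd` are illustrative; the cutoff of 9 and the EG3 suffix follow the slide, and bin strings are again shown as text.

```python
def exp_golomb(x, k):
    """kth order Exp-Golomb bin string (unary prefix with '1' bits, '0' stop)."""
    l = (x + (1 << k)).bit_length() - 1 - k
    n_bits = k + l
    suffix_value = x + (1 << k) - (1 << n_bits)
    suffix = format(suffix_value, "b").zfill(n_bits) if n_bits else ""
    return "1" * l + "0" + suffix


def ueg3_mvd(mvd, cutoff=9, k=3):
    """UEG3 bin string of a signed motion vector difference component."""
    mag = abs(mvd)
    prefix_value = min(mag, cutoff)
    # Truncated unary prefix: the stop bit is dropped at the cutoff.
    bins = "1" * prefix_value + ("" if prefix_value == cutoff else "0")
    if mag >= cutoff:
        bins += exp_golomb(mag - cutoff, k)  # EG3 suffix for |mvd| - 9
    if mag > 0:
        bins += "1" if mvd < 0 else "0"      # sign bit, only for nonzero mvd
    return bins


print(ueg3_mvd(0))   # 0
print(ueg3_mvd(1))   # 100
print(ueg3_mvd(-1))  # 101
```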

15
  • With the choice of the Exp-Golomb parameter k = 3,
    the suffix code words are given such that a
    geometrical increase of the prediction error in
    units of two samples is captured by a linear
    increase in the corresponding suffix code word
    length.

Figure 3. UEG0 binarization for encoding of
absolute values of transform coefficient levels.
16
Context modeling
  • Suppose a pre-defined set T of past symbols, a
    so-called context template, and a related set
    $C = \{0, \ldots, C-1\}$ of contexts is given, where the
    contexts are specified by a modeling function
    $F: T \to C$ operating on the template T.
  • For each symbol x to be coded, a conditional
    probability $p(x \mid F(z))$ is estimated by switching
    between different probability models according to
    the already coded neighboring symbols $z \in T$.
    Thus, $p(x \mid F(z))$ is estimated on the fly by
    tracking the actual source statistics.
  • The number $\tau$ of different conditional
    probabilities to be estimated for an alphabet
    size of m is equal to $\tau = C \cdot (m-1)$.
  • This implies that by increasing the number C of
    contexts, there is a point where overfitting of
    the model may occur.

17
  • First, in CABAC only very limited context
    templates T consisting of a few neighbors of the
    current symbol to encode are employed, such that
    only a small number of different context models C
    is used.
  • Second, context modeling is restricted to selected
    bins of the binarized symbols. As a result, the
    model cost is drastically reduced.
  • Four basic design types of context models can be
    distinguished in CABAC. The first type involves a
    context template with up to two neighboring
    syntax elements in the past of the current syntax
    element to encode.

Figure 4. Illustration of a context template
consisting of two neighboring syntax elements A
and B to the left and on top of the current
syntax element C.
18
Types of context modeling
  • The second type of context model is only defined
    for the syntax elements mb_type and sub_mb_type.
  • For this kind of context model, the values of
    prior coded bins $(b_0, b_1, \ldots, b_{i-1})$ are used for
    the choice of a model for a given bin with index
    i.

Note that in CABAC these context models are only
used to select different models for different
internal nodes of the corresponding binary trees
(see Figure 2).
19
  • Both the third and fourth types of context models
    are applied to residual data only. In contrast to
    all other types of context models, both types
    depend on the context categories of different
    block types.
  • The third type does not rely on past coded data,
    but on the position in the scanning path. It is
    used for the significance map.
  • For the fourth type, modeling functions are
    specified that involve the evaluation of the
    accumulated number of encoded/decoded levels with
    a specific value prior to the current level bin to
    encode/decode. It is used for the level
    information.

20
Context index $\gamma$
  • The entirety of probability models used in CABAC
    can be arranged in a linear fashion, such that
    each model is identified by a so-called context
    index $\gamma$.
  • Each probability model related to a given context
    index $\gamma$ is determined by two values: a 6-bit
    probability state index $\sigma_\gamma$ and the (binary)
    value $\beta_\gamma$ of the most probable symbol (MPS).
  • Each pair $(\sigma_\gamma, \beta_\gamma)$ for $0 \le \gamma \le 398$ is
    represented as a 7-bit unsigned integer.

Figure 5. Syntax elements and associated range of
context indices.
21
  • The context indices in the range from 0 to 72 are
    related to syntax elements of macroblock type,
    sub-macroblock type, and spatial and temporal
    prediction modes, as well as slice-based and
    macroblock-based control information.
  • For this type, the corresponding context index $\gamma$
    can be calculated as $\gamma = \Gamma_S + \chi_S$. Here
    $\Gamma_S$ denotes the context index offset, which is
    defined as the lower value of the range given in
    Figure 5, and $\chi_S$ denotes the context index
    increment of a given syntax element S.
  • Context indices from 73 to 398 are related to
    the coding of residual data.
  • The range values in the lower row of the
    corresponding syntax elements in Figure 5 specify
    the context indices for field-based coding mode.
    In pure frame coding, only 277 out of the total
    399 probability models are actually used.

22
  • For other syntax elements of residual data, a
    context index $\gamma$ is given by
    $\gamma = \Gamma_S + \Delta_S(\mathrm{ctx\_cat}) + \chi_S$.
    Here the context category (ctx_cat) dependent
    offset $\Delta_S$ is employed (Figure 6).
  • Note that only past coded values of syntax
    elements that belong to the same slice as the
    current coding process are evaluated.

Figure 6. Basic types with number of coefficients
and associated context categories.
23
Binary arithmetic coding
  • Binary arithmetic coding is based on the principle
    of recursive interval subdivision.
  • Suppose that an estimate of the probability
    $p_{LPS} \in (0, 0.5]$ of the least probable symbol (LPS)
    is given and that the current interval is
    represented by its lower bound L and its width R.
    Based on this, the given interval is subdivided
    into two subintervals: one of width
    $R_{LPS} = R \cdot p_{LPS}$ (3), associated with the LPS,
    and the dual interval of width
    $R_{MPS} = R - R_{LPS}$, assigned to the most probable
    symbol (MPS).
  • In a practical implementation, the main
    bottleneck in terms of throughput is the
    multiplication operation required.
  • A significant amount of work has been published
    aimed at speeding up the required calculation by
    introducing some approximations of either the
    range R or of the probability $p_{LPS}$ such that
    multiplication can be avoided [32]-[34].
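
One subdivision step can be sketched with floating-point numbers as below. This is an idealized illustration, not the integer CABAC engine: the function name is hypothetical, and placing the MPS in the lower subinterval is an assumption about the ordering.

```python
def subdivide(low, rng, p_lps, is_lps):
    """One step of recursive interval subdivision.

    low, rng -- lower bound L and width R of the current interval
    p_lps    -- estimated probability of the least probable symbol
    is_lps   -- True if the symbol being coded is the LPS
    Assumption: the MPS takes the lower subinterval, the LPS the upper one.
    """
    r_lps = rng * p_lps   # equation (3): the multiplication to be avoided
    r_mps = rng - r_lps
    if is_lps:
        return low + r_mps, r_lps
    return low, r_mps


# Coding an MPS and then an LPS with p_LPS = 0.25, starting from [0, 1).
low, rng = subdivide(0.0, 1.0, 0.25, is_lps=False)
low, rng = subdivide(low, rng, 0.25, is_lps=True)
```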

24
  • The Q coder [32] and the QM and MQ coders [35]
    each have their inefficiencies. CABAC instead
    uses an alternative multiplication-free coder,
    called the modulo coder (M coder), which was shown
    to provide a higher throughput than the MQ
    coder [36].
  • The basic idea of the M coder is to project both
    the legal range $[R_{min}, R_{max})$ of the interval
    width R and the probability range associated with
    the LPS onto small sets of representative values
    $Q = \{Q_0, \ldots, Q_{K-1}\}$ and $P = \{p_0, \ldots, p_{N-1}\}$.
    Thus the multiplication on the right-hand side of
    (3) can be approximated by using a table of
    $K \times N$ pre-computed values.
  • A reasonable size of the corresponding table and
    a sufficiently good approximation were found by
    using a set Q of K = 4 quantized range values
    together with a set P of N = 64 LPS related
    probability values.
  • Another distinct feature in H.264/AVC, as already
    mentioned above, is its simplified bypass coding
    mode for symbols assumed to be uniformly
    distributed.

25
Details of CABAC
  • The syntax elements are divided into two
    categories.
  • The first contains elements related to macroblock
    type, sub-macroblock type, and information of
    prediction modes both of spatial and of temporal
    type as well as slice and macroblock-based
    control information.
  • In the second, all residual data elements, i.e.,
    all syntax elements related to the coding of
    transform coefficients are combined.
  • In addition, a more detailed explanation of the
    probability estimation process and the
    table-based binary arithmetic coding engine of
    CABAC is given.

26
Coding of macroblock type, prediction mode, and
control information
  • At the top level of the macroblock layer syntax
    the signaling of mb_skip_flag and mb_type is
    performed. The binary-valued mb_skip_flag
    indicates whether the current macroblock in a
    P/SP or B slice is skipped.
  • For a given macroblock C, the related context
    models involve the mb_skip_flag values of the
    neighboring macroblocks A to the left and B on
    top, given by
  • $\chi_{MbSkip}(C) = ((\mathrm{mb\_skip\_flag}(A) \ne 0)\ ?\ 0 : 1) + ((\mathrm{mb\_skip\_flag}(B) \ne 0)\ ?\ 0 : 1)$
  • If one or both of the neighbors A or B are not
    available, the corresponding mb_skip_flag value
    is set to 0.
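
The context index increment for mb_skip_flag can be written directly from the expression above. The function name is illustrative, and the sketch takes the two neighbor flags as plain integers without modeling neighbor availability.

```python
def ctx_inc_mb_skip(skip_flag_a, skip_flag_b):
    """Context index increment for mb_skip_flag of the current macroblock C.

    skip_flag_a -- mb_skip_flag of the left neighbor A (0 or 1)
    skip_flag_b -- mb_skip_flag of the top neighbor B (0 or 1)
    Each non-skipped neighbor contributes 1, so the result is 0, 1, or 2.
    """
    return (0 if skip_flag_a != 0 else 1) + (0 if skip_flag_b != 0 else 1)


print(ctx_inc_mb_skip(1, 1))  # 0
print(ctx_inc_mb_skip(0, 1))  # 1
print(ctx_inc_mb_skip(0, 0))  # 2
```

The result selects one of three probability models, so a macroblock surrounded by skipped neighbors is coded with a different model than one surrounded by coded neighbors.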

27
Macroblock type
  • As already stated above, Figure 2 shows the
    binarization trees for mb_type and sub_mb_type
    that are used in P/SP slices.
  • Note that the mb_type value of 4 for P slices is
    not used in CABAC entropy coding mode. The bin
    strings for the values 5-30 of mb_type are
    further specified in [1].
  • For coding a bin value corresponding to the
    binary decision at an internal node shown in
    Figure 2, separate context models denoted by
    C0,...,C3 for mb_type and C0,...,C3 for
    sub_mb_type are employed.

See Figure 2.
28
Coding of prediction modes
  • Intra prediction modes for luma 4x4: the
    luminance intra prediction modes for 4x4 blocks
    are themselves predicted, resulting in the
    binary-valued syntax element
    prev_intra4x4_pred_mode_flag and the syntax
    element rem_intra4x4_pred_mode, where the
    latter is only present if the former takes a
    value of 0.
  • For coding these syntax elements, two separate
    probability models are utilized: one for coding
    the flag and another for coding each bin value
    of the 3-bit FL binarized value of
    rem_intra4x4_pred_mode.

29
  • Intra prediction modes for chroma:
  • $\chi_{ChPred}(C) = ((\mathrm{ChPredInDcMode}(A) \ne 0)\ ?\ 0 : 1) + ((\mathrm{ChPredInDcMode}(B) \ne 0)\ ?\ 0 : 1)$
  • Reference picture index:
  • $\chi_{RefIdx}(C) = ((\mathrm{RefIdxZeroFlag}(A) \ne 0)\ ?\ 0 : 1) + 2((\mathrm{RefIdxZeroFlag}(B) \ne 0)\ ?\ 0 : 1)$
  • Components of motion vector differences:
  • mvd(X, cmp) denotes the value of a motion vector
    difference component of direction
    $cmp \in \{hori, vert\}$ related to a macroblock or
    sub-macroblock partition X.

30
  • Macroblock-based quantization parameter change:
  • For updating the quantization parameter on a
    macroblock level, mb_qp_delta is present for each
    non-skipped macroblock. For coding the signed
    value $\delta(C)$ of this syntax element, $\delta(C)$ is
    first mapped onto a non-negative value $\delta^{+}(C)$ by
  • $\delta^{+}(C) = 2|\delta(C)| - ((\delta(C) > 0)\ ?\ 1 : 0)$
  • Then $\delta^{+}(C)$ is binarized using the unary
    binarization scheme.
  • End of slice flag:
  • For signaling the last macroblock (macroblock
    pair) in a slice, the end_of_slice_flag is
    present for each macroblock (pair).
  • The event of a non-terminating macroblock is
    assigned the highest possible MPS probability.
  • Macroblock pair field flag:
  • $\chi_{MbField}(C) = \mathrm{mb\_field\_decoding\_flag}(A) + \mathrm{mb\_field\_decoding\_flag}(B)$
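
The signed-to-unsigned mapping used for mb_qp_delta, followed by unary binarization, can be sketched as follows. The function names are illustrative.

```python
def map_qp_delta(delta):
    """Map a signed mb_qp_delta value onto a non-negative integer.

    Positive values map to odd numbers, negative values to even numbers:
    0 -> 0, 1 -> 1, -1 -> 2, 2 -> 3, -2 -> 4, ...
    """
    return 2 * abs(delta) - (1 if delta > 0 else 0)


def binarize_qp_delta(delta):
    """Unary bin string of the mapped value ('1' bits plus a stop '0')."""
    return "1" * map_qp_delta(delta) + "0"


print([map_qp_delta(d) for d in (0, 1, -1, 2, -2)])  # [0, 1, 2, 3, 4]
print(binarize_qp_delta(-1))                          # 110
```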

31
Coding of residual data
  • A one-bit symbol coded_block_flag and a
    binary-valued significance map are used to
    indicate the occurrence and the location of
    nonzero transform coefficients in a given block.
  • Nonzero levels are encoded in reverse scanning
    order.
  • Context models for coding of nonzero transform
    coefficients are chosen based on the number of
    previously transmitted nonzero levels within the
    reverse scanning path.

32
  • First, the coded block flag is transmitted for the
    given block of transform coefficients unless the
    coded block pattern or the macroblock mode
    indicates that the regarded block has no nonzero
    coefficients.
  • If the coded block flag is nonzero, a significance
    map specifying the positions of significant
    coefficients is encoded.
  • Finally, the absolute value of the level as well
    as the sign is encoded for each significant
    transform coefficient.

Figure 7.
33
Encoding process of residual data
  • Coded block pattern: for each non-skipped
    macroblock with prediction mode not equal to
    intra_16x16, the coded_block_pattern symbol
    indicates which of the six 8x8 blocks (four
    luminance and two chrominance) contain nonzero
    transform coefficients.
  • A given value of the syntax element
    coded_block_pattern is binarized using the
    concatenation of a 4-bit FL and a TU binarization
    with cutoff value S = 2.
  • Coded block flag: a one-bit symbol that indicates
    whether there are significant, i.e., nonzero
    coefficients inside a single block of transform
    coefficients.
  • Scanning of transform coefficients: the 2-D arrays
    of transform coefficient levels of those
    sub-blocks for which the coded_block_flag
    indicates nonzero entries are first mapped onto a
    1-D list using a given scanning pattern.

34
  • Significance map: if the significant_coeff_flag
    symbol is one, a further one-bit symbol
    last_significant_coeff_flag is sent. This symbol
    indicates whether the current significant
    coefficient is the last one inside the block or
    whether further significant coefficients follow.
  • Level information: the values of significant
    coefficients (levels) are encoded by using two
    coding symbols, coeff_abs_level_minus1 and
    coeff_sign_flag. The UEG0 binarization scheme is
    used for encoding of coeff_abs_level_minus1.
  • The levels are transmitted in reverse scanning
    order, allowing the usage of reasonably adjusted
    context models.

35
Context models for residual data
  • Context models for residual data: in H.264/AVC,
    there are 12 types of transform coefficient
    blocks, which typically have different kinds of
    statistics. To keep the number of different
    context models small, they are classified into
    five categories as in Figure 6.
  • For each of these categories, a special set of
    context models is used for all syntax elements
    related to residual data.
  • coded_block_pattern: for bin indices from 0 to 3,
    corresponding to the four 8x8 luminance blocks,
  • $\chi_{CBP}(C, \mathrm{bin\_idx}) = ((\mathrm{CBP\_Bit}(A) \ne 0)\ ?\ 0 : 1) + 2((\mathrm{CBP\_Bit}(B) \ne 0)\ ?\ 0 : 1)$
  • The context index increments for bin indices 4
    and 5 are specified in [1].

See Figure 6.
36
  • Coded block flag: coding of the coded_block_flag
    utilizes four different probability models for
    each of the five categories as specified in
    Figure 6.
  • $\chi_{CBFlag}(C) = \mathrm{coded\_block\_flag}(A) + 2 \cdot \mathrm{coded\_block\_flag}(B)$
  • Significance map: for encoding the significance
    map, up to 15 different probability models are
    used for both significant_coeff_flag and
    last_significant_coeff_flag.
  • The choice of the models and the context index
    increments depend on the scanning position:
  • $\chi_{SIG}(\mathrm{coeff}[i]) = \chi_{LAST}(\mathrm{coeff}[i]) = i$
  • Level information: reverse scanning of the level
    information allows a more reliable estimation of
    the statistics, because at the end of the
    scanning path it is very likely to observe the
    occurrence of successive so-called trailing ones.

37
Probability estimation
  • For CABAC, 64 representative probability values
    $p_\sigma \in [0.01875, 0.5]$ were derived for the LPS by
  • $p_\sigma = \alpha \cdot p_{\sigma-1}$ for all $\sigma = 1, \ldots, 63$
  • with $\alpha = (0.01875 / 0.5)^{1/63}$ and $p_0 = 0.5$
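
The recursion can be evaluated directly to see the resulting scaling factor and state probabilities. This sketch uses floating-point arithmetic for illustration only; the constant and function names are hypothetical.

```python
N_STATES = 64
P_MAX = 0.5       # p_0, LPS probability of state 0
P_MIN = 0.01875   # LPS probability of state 63

# alpha = (0.01875 / 0.5) ** (1 / 63)
ALPHA = (P_MIN / P_MAX) ** (1.0 / (N_STATES - 1))


def lps_probabilities():
    """Return [p_0, ..., p_63] with p_s = ALPHA * p_(s-1)."""
    probs = [P_MAX]
    for _ in range(1, N_STATES):
        probs.append(ALPHA * probs[-1])
    return probs


probs = lps_probabilities()
print(round(ALPHA, 4))  # scaling factor, close to 0.95
print(probs[0], probs[-1])
```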

Figure 8. LPS probability values and transition
rules for updating the probability estimation of
each state after observing a LPS (dashed lines in
left direction) and a MPS (solid lines in right
direction).
38
  • Both the chosen scaling factor $\alpha \approx 0.95$ and
    the cardinality N = 64 of the set of probabilities
    represent a good compromise between the desire
    for fast adaptation ($\alpha \to 0$, small N) and a
    sufficiently stable and accurate estimate
    ($\alpha \to 1$, large N).
  • As a result of this design, each context model in
    CABAC can be completely determined by two
    parameters: its current estimate of the LPS
    probability and its value of the MPS $\beta$, being
    either 0 or 1.
  • For a given probability state, the update depends
    on the state index and the value of the encoded
    symbol, identified either as an LPS or an MPS.
  • The derivation of the transition rules for the
    LPS probability is based on a relation between a
    given LPS probability $p_{old}$ and its updated
    counterpart $p_{new}$.
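
As a sketch of that relation, assuming the exponential-aging form used in the CABAC paper (with $\alpha$ the scaling factor and $p_{62}$ the smallest LPS probability reachable by MPS transitions), the update can be written as:

```latex
p_{\text{new}} =
\begin{cases}
\max(\alpha \cdot p_{\text{old}},\; p_{62}) & \text{if an MPS occurs,} \\
\alpha \cdot p_{\text{old}} + (1 - \alpha) & \text{if an LPS occurs.}
\end{cases}
```

Under this form, observing an MPS moves the state one step toward smaller LPS probabilities, while observing an LPS raises the LPS probability estimate.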

39
Table-based binary arithmetic coding
  • The CABAC coding engine consists of two
    sub-engines, one for the regular coding mode and
    the other for the bypass coding mode.
  • Interval subdivision in regular coding mode: the
    internal state of the arithmetic encoding engine
    is, as usual, characterized by two quantities: the
    current interval range R and the base L of the
    current code interval.

Figure 9.
40
  • First, the current interval range R is
    approximated by a quantized value Q(R), using an
    equi-partition of the whole range
    $2^8 \le R < 2^9$ into four cells. Instead of
    using the corresponding representative quantized
    values $Q_0$, $Q_1$, $Q_2$, and $Q_3$, Q(R) is only
    addressed by its quantizer index $\rho$, e.g.,
    $\rho = (R \gg 6)\ \&\ 3$.
  • This index and the probability state index are
    then used as entries in a 2-D table TabRangeLPS
    to determine the (approximate) LPS related
    subinterval range $R_{LPS}$. Here the table
    TabRangeLPS contains all 64x4 pre-computed
    product values $p_\sigma \cdot Q_\rho$ for $0 \le \sigma \le 63$
    and $0 \le \rho \le 3$ in 8-bit precision.
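
The index computation and table lookup can be sketched as below. The function names are illustrative, and the table passed to `range_lps` is a caller-supplied placeholder; the normative TabRangeLPS values are defined in the standard and are not reproduced here.

```python
def quantizer_index(R):
    """Quantizer index rho for an interval range R with 2^8 <= R < 2^9.

    Shifting right by 6 leaves the top three bits of the 9-bit value;
    masking with 3 drops the leading 1 and keeps the two bits that
    select one of four equally sized cells.
    """
    if not (1 << 8) <= R < (1 << 9):
        raise ValueError("R must lie in [256, 512)")
    return (R >> 6) & 3


def range_lps(tab_range_lps, sigma, R):
    """Look up the approximate LPS subinterval range without multiplying.

    tab_range_lps -- hypothetical 64x4 table indexed as [sigma][rho]
    """
    return tab_range_lps[sigma][quantizer_index(R)]


print([quantizer_index(R) for R in (256, 320, 384, 511)])  # [0, 1, 2, 3]
```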

41
  • Bypass coding mode: this mode speeds up the
    encoding/decoding of symbols for which
    $R - R_{LPS} \approx R_{LPS} \approx R/2$ is assumed to hold.
  • The variable L is doubled before choosing the
    lower or upper subinterval depending on the
    value of the symbol to encode (0 or 1).
  • In this way, the doubling of L and R in the
    subsequent renormalization is no longer required,
    provided that the renormalization in the bypass
    mode is operated with doubled decision thresholds.

Figure 10.
42
  • Renormalization and carry-over control: a
    renormalization operation after interval
    subdivision is required whenever the new
    interval range R no longer stays within its legal
    range of $[2^8, 2^9)$.
  • For the CABAC engine, the renormalization process
    and carry-over control of [37] was adopted.
  • This implies, in particular, that the encoder has
    to resolve any carry propagation by monitoring
    the bits that are outstanding for being emitted.
  • More details can be found in [1].

43
Experimental result
  • In our experiments, we compare the coding
    efficiency of CABAC to the coding efficiency of
    the baseline entropy coding method of H.264/AVC.
    The baseline entropy coding method uses the
    zero-order Exp-Golomb code for all syntax
    elements with the exception of the residual data,
    which are coded using the coding method of CAVLC
    [1], [2].
  • For the range of acceptable video quality for
    broadcast applications of about 30 to 38 dB, and
    averaged over all tested sequences, bit-rate
    savings of 9% to 14% are achieved, where higher
    gains are obtained at lower rates.

45
References
  • [1] "Draft ITU-T Recommendation H.264 and Draft
    ISO/IEC 14496-10 AVC," in Joint Video Team of
    ISO/IEC JTC1/SC29/WG11 & ITU-T SG16/Q.6 Doc.
    JVT-G050, T. Wiegand, Ed., Pattaya, Thailand, Mar.
    2003.
  • [2] T. Wiegand, G. J. Sullivan, G. Bjontegaard,
    and A. Luthra, "Overview of the H.264/AVC Video
    Coding Standard," IEEE Trans. Circuits Syst.
    Video Technol., vol. 13, pp. 560-576, July 2003.
  • [4] "Video Coding for Low Bitrate Communications,
    Version 1," ITU-T, ITU-T Recommendation H.263,
    1995.
  • [6] C. A. Gonzales, "DCT Coding of Motion
    Sequences Including Arithmetic Coder," ISO/IEC
    JTC1/SC2/WP8, MPEG 89/187, 1989.
  • [32] W. B. Pennebaker, J. L. Mitchell, G. G.
    Langdon, and R. B. Arps, "An overview of the
    basic principles of the Q-coder adaptive binary
    arithmetic coder," IBM J. Res. Dev., vol. 32, pp.
    717-726, 1988.
  • [33] J. Rissanen and K. M. Mohiuddin, "A
    multiplication-free multialphabet arithmetic
    code," IEEE Trans. Commun., vol. 37, pp. 93-98,
    Feb. 1989.
  • [34] P. G. Howard and J. S. Vitter, "Practical
    implementations of arithmetic coding," in Image
    and Text Compression, J. A. Storer, Ed. Boston,
    MA: Kluwer, 1992, pp. 85-112.
  • [36] D. Marpe and T. Wiegand, "A highly efficient
    multiplication-free binary arithmetic coder and
    its application in video coding," presented at
    the IEEE Int. Conf. Image Proc. (ICIP),
    Barcelona, Spain, Sept. 2003.

46
Q1.
  • The problem with this scheme lies in the fact
    that Huffman codes have to be an integral number
    of bits long.
  • The optimal number of bits to be used for each
    symbol is $\log_2(1/p) = -\log_2 p$, where p is the
    probability of a given character.
  • Thus, if the probability of a character is 1/256,
    such as would be found in a random byte stream,
    the optimal number of bits per character is log
    base 2 of 256, or 8.
  • If the probability goes up to 1/2, the optimum
    number of bits needed to code the character would
    go down to 1.
  • If a statistical method can be developed that can
    assign a 90% (> 0.5) probability to a given
    character, the optimal code size would be 0.15
    bits. The Huffman coding system would probably
    assign a 1-bit code to the symbol, which is 6
    times longer than is necessary.
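
The three figures quoted above can be checked with a few lines of Python; the function name is illustrative.

```python
import math


def optimal_bits(p):
    """Ideal code length in bits for a symbol of probability p: -log2(p)."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    return -math.log2(p)


print(optimal_bits(1 / 256))          # 8.0
print(optimal_bits(0.5))              # 1.0
print(round(optimal_bits(0.9), 3))    # 0.152
print(round(1 / optimal_bits(0.9), 1))  # ratio of a 1-bit code to the ideal length
```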

47
Q2.
  • For each symbol to encode, the upper bound $u^{(n)}$
    and lower bound $l^{(n)}$ of the interval containing
    the tag for the sequence must be computed.

48
H.264 / MPEG-4 Part 10 Introduction to CABAC
  • When entropy_coding_mode is set to 1, an
    arithmetic coding system is used to encode and
    decode H.264 syntax elements.
  • The arithmetic coding scheme selected for H.264,
    Context-based Adaptive Binary Arithmetic Coding
    or CABAC, achieves good compression performance
    through:
  • Selecting probability models for each syntax
    element according to the element's context,
  • Adapting probability estimates based on local
    statistics, and
  • Using arithmetic coding.

49
Coding stages
  • Binarization
  • CABAC uses Binary Arithmetic Coding which means
    that only binary decisions (1 or 0) are encoded.
  • A non-binary-valued symbol (e.g. a transform
    coefficient or motion vector) is "binarized" or
    converted into a binary code prior to arithmetic
    coding.
  • This process is similar to the process of
    converting a data symbol into a variable length
    code but the binary code is further encoded (by
    the arithmetic coder) prior to transmission.

50
  • Context model selection
  • A "context model" is a probability model for one
    or more bins of the binarized symbol.
  • This model may be chosen from a selection of
    available models depending on the statistics of
    recently-coded data symbols.
  • The context model stores the probability of each
    bin being "1" or "0".
  • Arithmetic encoding
  • An arithmetic coder encodes each bin according to
    the selected probability model.
  • Note that there are just two sub-ranges for each
    bin (corresponding to "0" and "1").
  • Probability update
  • The selected context model is updated based on
    the actual coded value (e.g. if the bin value was
    "1", the frequency count of "1"s is increased).
  • Above stages are repeated for each bit (or bin)
    of the binarized symbol.

51
The coding process
  • Binarization. We will illustrate the coding
    process for one example, MVDx (motion vector
    difference in the x-direction).
  • Binarization is carried out according to the
    following table for $|MVDx| < 9$ (larger values of
    MVDx are binarized using an Exp-Golomb codeword).
  • (Note that each of these binarized codewords is
    uniquely decodable.)

MVDx   Binarization
0      0
1      10
2      110
3      1110
4      11110
5      111110
6      1111110
7      11111110
8      111111110
52
  • Choose a context model for each bin. One of 3
    models is selected for bin 1, based on previously
    coded MVD values. The L1 norm of two
    previously coded values, $e_k$, is calculated:
  • $e_k = |MVD_A| + |MVD_B|$
  • A: block to the left; B: block above
  • If $e_k$ is small, then there is a high probability
    that the current MVD will have a small magnitude;
    if $e_k$ is large, then it is more likely that the
    current MVD will have a large magnitude. We
    select a probability table (context model)
    accordingly.

e_k              Context model for bin 1
0 <= e_k < 3     Model 0
3 <= e_k <= 32   Model 1
32 < e_k         Model 2
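
The model selection for bin 1 can be sketched as below. The function name is illustrative, and the exact placement of the thresholds (below 3, 3 to 32 inclusive, above 32) is an assumption about how the table boundaries are meant to be read.

```python
def mvd_bin1_context(mvd_a, mvd_b):
    """Select the context model (0, 1, or 2) for bin 1 of an MVD component.

    mvd_a -- MVD component of the block to the left (A)
    mvd_b -- MVD component of the block above (B)
    Assumed thresholds: e_k < 3 -> 0, 3 <= e_k <= 32 -> 1, e_k > 32 -> 2.
    """
    e_k = abs(mvd_a) + abs(mvd_b)  # L1 norm of the two neighboring values
    if e_k < 3:
        return 0
    if e_k <= 32:
        return 1
    return 2


print(mvd_bin1_context(1, -1))   # 0
print(mvd_bin1_context(-4, 10))  # 1
print(mvd_bin1_context(20, 20))  # 2
```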
53
  • The remaining bins are coded using one of 4
    further context models.
  • Encode each bin. The selected context model
    supplies two probability estimates: the
    probability that the bin contains "1" and the
    probability that the bin contains "0".
  • These estimates determine the two sub-ranges that
    the arithmetic coder uses to encode the bin.

Bin            Context model
1              0, 1, or 2 (depending on e_k)
2              3
3              4
4              5
5 and higher   6
54
  • Update the context models. For example, if
    context model 2 was selected for bin 1 and the
    value of bin 1 was "0", the frequency count of
    "0"s is incremented. This means that the next
    time this model is selected, the probability of
    a "0" will be slightly higher.
  • When the total number of occurrences of a model
    exceeds a threshold value, the frequency counts
    for "0" and "1" will be scaled down, which in
    effect gives higher priority to recent
    observations.
  • At the beginning of each coded slice, the context
    models are initialized depending on the initial
    value of the Quantization Parameter QP (since
    this has a significant effect on the probability
    of occurrence of the various data symbols).

55
The arithmetic coding engine
  • The arithmetic decoder has three distinct
    properties:
  • Probability estimation is performed by a
    transition process between 64 separate
    probability states for the Least Probable Symbol
    (LPS, the least probable of the two binary
    decisions "0" or "1").
  • The range R representing the current state of the
    arithmetic coder is quantized to a small range of
    pre-set values before calculating the new range
    at each step, making it possible to calculate the
    new range using a look-up table (i.e.
    multiplication-free).
  • A simplified encoding and decoding process is
    defined for data symbols with a near-uniform
    probability distribution. (bypass)
  • The definition of the decoding process is
    designed to facilitate low-complexity
    implementations of arithmetic encoding and
    decoding.
