ECEC453 Image Processing Architecture - PowerPoint PPT Presentation

1
ECE-C453 Image Processing Architecture
  • Lecture 6, 2/3/04
  • Lossy Video Coding Ideas
  • Technology of DCT and Motion Estimation
  • Oleh Tretiak
  • Drexel University

2
Decorrelation Ideas
  • Orthogonal Transforms (KLT, DCT)
  • Main method for intra-frame coding
  • Wavelet
  • New stuff (JPEG 2000)
  • Predictive coding
  • Simple
  • Used for inter-frame coding (video)

Review
3
Lossy Predictive Coding
  • How to decorrelate?
  • Predict values
  • Block coding (DFT)
  • wavelet
  • Predictive (sample based, feedback)
    encoder, Differential Pulse Code Modulation (DPCM)

Review
4
Review Image Decorrelation
  • $x = (x_1, x_2, \dots, x_n)$, a sequence of image gray
    values
  • Preprocess: convert to $y = (y_1, y_2, \dots, y_n)$,
    $y = Ax$, A an orthogonal matrix ($A^{-1} = A^T$)
  • Theoretical best (for Gaussian process) A is the
    Karhunen-Loeve transformation matrix
  • Images are not Gaussian processes
  • Karhunen-Loeve matrix is image-dependent,
    computationally expensive to find
  • Evaluating $y = Ax$ with K-L transformation is
    computationally expensive
  • In practice, we use DCT (discrete cosine
    transform) for decorrelation
  • Computationally efficient
  • Almost as good as the K-L transformation

Review
5
Review Block-Based Coding
  • Full image DCT - one set of decorrelated
    coefficients for whole image
  • Block-based coding
  • Image divided into small blocks
  • Each block is decorrelated separately
  • Block decorrelation performs almost as well as
    (better than?) full image decorrelation
  • Current standards (JPEG, MPEG) use 8x8 DCT blocks

Review
6
Rate-Distortion: 1D vs. 2D coding
  • Theory on tradeoff between distortion and least
    number of bits
  • Interesting tradeoff only if samples are
    correlated
  • Water-filling construction to compute R(d)

Review
7
Wavelet Transform
  • Filterbank and wavelets
  • 2-D wavelets
  • Wavelet Pyramid

Review
8
Filterbank Pyramid
[Figure: filterbank pyramid; bands labeled 125, 125, 250, 500, 1000]
Review
9
Lena: top level, next level
Review
10
This Lecture
  • Idea
  • Video Coding by Pixel Prediction
  • Motion Estimation
  • Technology: DCT, and how much it costs
  • Technology: Motion Estimation Algorithms

11
Video Coding
  • Video: a sequence of images
  • Reasons for changes between successive images
  • Edits
  • Camera pan, zoom
  • Intra-frame motion
  • Intra-frame texture
  • Noise
  • Model: successive images are similar
  • Video coding uses inter-frame redundancy to
    achieve lossy compression

12
Predicting sequential images
[Figure: frames $f(t-1)$ and $f(t)$, and their difference $f(t) - f(t-1)$]
13
Motion Compensation
  • Macroblock size
  • MxN
  • Matching criterion
  • MAE (mean absolute error)
  • Search window
  • ±p pixel locations
  • Search algorithm
  • Full search
  • Logarithmic search
  • Parallel Hierarchical One-Dimensional Search
  • Pixel subsampling and projection
  • Hierarchical downsampling

14
Motion Estimation Methods
[Figure panels: no compensation, full search, logarithmic search, 3-level hierarchical]
15
DCT Technology
  • DCT Formula
  • How it works
  • DCT plus quantization
  • DCT implementations and cost
  • Direct
  • Separable
  • Fast
  • Refinements

16
What is the DCT?
Note: in these equations, p stands for π.
  • One-dimensional 8-point DCT
  • Input $x_0, \dots, x_7$, output $y_0, \dots, y_7$
  • One-dimensional inverse DCT
  • Input $y_0, \dots, y_7$, output $x_0, \dots, x_7$
  • Matrix form of equations: x, y are one-column
    matrices
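
The equations on this slide did not survive in the transcript. For reference, a common orthonormal form of the 8-point DCT is given below. The normalization ($c_k/2$) and the matrix name $T$ are assumptions; the original slide may have used a different scaling.

```latex
\[
\begin{aligned}
y_k &= \frac{c_k}{2} \sum_{i=0}^{7} x_i \cos\frac{(2i+1)k\pi}{16},
      \qquad k = 0, \dots, 7 \\
x_i &= \sum_{k=0}^{7} \frac{c_k}{2}\, y_k \cos\frac{(2i+1)k\pi}{16},
      \qquad i = 0, \dots, 7 \\
c_0 &= \frac{1}{\sqrt{2}}, \qquad c_k = 1 \ \text{for } k > 0 \\
y &= T x, \qquad x = T^{T} y, \qquad
T_{ki} = \frac{c_k}{2} \cos\frac{(2i+1)k\pi}{16}
\end{aligned}
\]
```

With this scaling $T$ is orthogonal, so the inverse transform is just the transpose.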

17
Two-Dimensional DCT
  • Forward 2D DCT. Input $x_{ij}$, $i = 0, \dots, 7$, $j = 0, \dots, 7$.
    Output $y_{kl}$, $k = 0, \dots, 7$, $l = 0, \dots, 7$
  • Matrix form: X, Y are 8x8 matrices with
    coefficients $x_{ij}$, $y_{kl}$
  • The 2D DCT is separable!
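
Separability can be checked numerically. The sketch below is plain Python and assumes the orthonormal DCT-II scaling, which may differ from the slide's. It computes the 2D DCT once as two matrix products and once by the direct double sum. The function names are illustrative only.

```python
import math

def dct_matrix(n=8):
    """Orthonormal DCT-II matrix T (assumed scaling): y = T x, x = T^T y."""
    t = []
    for k in range(n):
        c = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        t.append([c * math.cos((2 * i + 1) * k * math.pi / (2 * n))
                  for i in range(n)])
    return t

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][m] * b[m][j] for m in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def dct2_separable(x):
    """Y = T X T^T: 1-D DCTs along one axis, then along the other."""
    t = dct_matrix(len(x))
    z = matmul(t, x)                 # Z = T X
    return matmul(z, transpose(t))   # Y = Z T^T

def dct2_direct(x):
    """Direct double sum: every output uses every input (N^4 products)."""
    n = len(x)
    t = dct_matrix(n)
    return [[sum(t[k][i] * t[l][j] * x[i][j]
                 for i in range(n) for j in range(n))
             for l in range(n)] for k in range(n)]
```

Both functions return the same coefficients up to rounding; only the operation count differs.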

Note: in these equations, p stands for π.
18
General DCT
  • One dimension
  • Two dimensions

19
Example 4x4 DCT
  • See 06IPA.xls

20
Computational Complexity
  • 1D DCT
  • N input and output samples: $N^2 = 64$ operations
    (additions and multiplications)
  • 2D DCT - direct implementation
  • $M = N^2$ input values, M output values -> $M^2 = N^4$
    operations
  • 2D DCT - separable implementation, $Y = TXT^T = ZT^T$,
    where $Z = TX$, all matrices are NxN -> $2N^3$
    operations
  • For N = 8:
  • 2D DCT direct: 4096 operations, 64 operations
    per pixel
  • 2D DCT separable: 1024 operations, 16 ops/pixel
  • Big savings due to separable transform
  • Inverse DCT: same story.

21
DCT Encoding in JPEG, MPEG
  • Take 8x8 blocks of pixels
  • Subtract the mid-range value (128 for 8-bit samples)
  • Compute 8x8 DCT
  • Quantize the DCT coefficients
  • Typically, many of the samples are equal to zero
  • Lossless entropy coding of the quantized samples
  • Different quantization step is used for different
    DCT coefficients
  • $y_{kl}$: DCT coefficients, $q_{kl}$: quantizer steps
  • $z_{kl}$: quantized values
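
The quantization formula itself is missing from the transcript. A minimal sketch, assuming the usual rule $z_{kl} = \mathrm{round}(y_{kl} / q_{kl})$ with reconstruction $\hat{y}_{kl} = z_{kl} q_{kl}$; the function names and the tie-breaking rule are illustrative choices, not taken from the slide:

```python
import math

def quantize(y, q):
    """z[k][l] = round(y[k][l] / q[k][l]), rounding to nearest (ties upward)."""
    return [[int(math.floor(yv / qv + 0.5)) for yv, qv in zip(y_row, q_row)]
            for y_row, q_row in zip(y, q)]

def dequantize(z, q):
    """Receiver side: reconstruct coefficient estimates y_hat = z * q."""
    return [[zv * qv for zv, qv in zip(z_row, q_row)]
            for z_row, q_row in zip(z, q)]
```

Coefficients smaller than half their step size quantize to zero, and the reconstruction error per coefficient is at most half a step.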

22
DCT Example
  • Data from lena, smooth area. RMS error 3.5

[Figure panels: Original; DCT; DCT, quantized; Reconstructed]
23
DCT example
  • Data from lena, busy area. RMS error 7.3

[Figure panels: Original; DCT; DCT, quantized; Reconstructed]
24
Overview DCT coding
  • Transformation decorrelates samples
  • Transformed samples are quantized, quantization
    step depends on the coefficient. Degree of
    compression and loss can be changed by scaling
    the quantization steps
  • Many quantized samples are zero -> run-length
    coding
  • At receiver, perform inverse DCT
  • Many calculations!

JPEG standard quantization steps
25
Speeding up the DCT
  • Separable transform - basic speedup
  • Fast DCT transform - like FFT
  • Further speedup through Scaled DCT

26
Optimized (fast) DCT
  • 1-D Chen DCT diagram. Dashed lines indicate
    subtraction; other symbols mark multiplication by a
    constant and multiplication by 0.5 (shift).

Characteristics of optimized DCT algorithms
27
DCT Complexity
  • Direct DCT computation
  • 64 DCT values, each requires 64 multiplications and
    additions -> 4096 multiply-accumulate (MA)
    operations per block
  • Separable algorithm (operate on rows, then on
    columns) -> 16 one-dimensional 8-point DCT
    operations -> 1024 MA operations
  • Fast implementation: $N \log_2 N$ operations per
    1-D DCT -> 16 x 24 = 384 MA ops
  • Special methods: many operations involve
    multiplication by 1 or -1, take advantage of this!
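
The three counts above follow from simple arithmetic, reproduced in the sketch below. The function name is illustrative, and N is assumed to be a power of two so that $\log_2 N$ is an integer.

```python
import math

def dct_block_costs(n=8):
    """Multiply-accumulate counts per n x n block for three implementations."""
    direct = n ** 4                    # n^2 outputs, each a sum over n^2 inputs
    separable = 2 * n * (n * n)        # 2n one-dimensional DCTs at n^2 ops each
    log_n = int(math.log2(n))          # exact when n is a power of two
    fast = 2 * n * (n * log_n)         # 2n one-dimensional DCTs at n*log2(n) ops
    return {"direct": direct, "separable": separable, "fast": fast}
```

For n = 8 this gives 4096, 1024, and 384 operations, or 64, 16, and 6 per pixel.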

28
Fast Scaled DCT
  • Picture of a butterfly at last stage of DCT
    following quantizer

29
DCT refinements
Complexity of scaled DCT algorithms, excluding
quantization
  • Multiply-accumulate architectures
  • Basic operation is $a = bc + d$, well suited for
    DCT
  • Super-scalar architectures
  • Multi-register, multi-ALU processors
  • Perform several operations in parallel

30
Motion Estimation
  • Architecture of Motion Estimation
  • Algorithms and Costs
  • Full Search
  • Logarithmic Search
  • PHODS
  • Downsample, projection
  • Hierarchical motion estimation
  • Other criteria
  • Multi-image estimation

31
Baseline Models
  • Previous frame predicts current frame
  • $I(x, y, t) = I(x, y, t-1) + e(x, y, t)$
  • Not effective in presence of motion: zoom, pan,
    etc.
  • Prediction to account for motion
  • $I(x, y, t) = I(x+u, y+v, t-1) + e(x, y, t)$
  • (u, v): motion (displacement) vector
  • Model works (somewhat) for pan, not for other
    motion
  • Compromise: compute independent motion estimates
    for rectangular image regions (macroblocks).
  • Macroblocks are, in general, bigger than DCT
    blocks

32
Generic Encoder - simplified
33
Generic Decoder
34
Motion Compensation
  • Macroblock size
  • MxN
  • Matching criterion
  • MAE (mean absolute error)
  • Search window
  • ±p pixel locations
  • Search algorithm
  • Full search
  • Logarithmic search
  • Parallel Hierarchical One-Dimensional Search
  • Pixel subsampling and projection
  • Hierarchical downsampling

35
Motion Estimation Terminology
  • Issues
  • Size of macroblock
  • Size of search region
  • In video coding standards, M = N = 16

36
Matching Criterion
  • Matching criterion: whatever produces the fewest
    coded bits for the error image
  • Coding for each value of motion vector (u, v) is
    too time consuming (expensive)
  • In practice, mean absolute error (MAE) is most
    popular
  • C - current image, R - reference image, (x, y) -
    macroblock origin
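
The MAE formula is missing from the transcript. A minimal sketch under the usual definition, $\mathrm{MAE}(i, j) = \frac{1}{MN} \sum_{k=0}^{M-1} \sum_{l=0}^{N-1} |C(x+k, y+l) - R(x+i+k, y+j+l)|$, is shown below. It assumes images are stored as nested lists indexed `img[x][y]` and that the displaced block stays inside the reference image.

```python
def mae(cur, ref, x, y, i, j, m=16, n=16):
    """Mean absolute error between the m x n macroblock of `cur` at origin
    (x, y) and the block of `ref` displaced by (i, j)."""
    total = 0
    for k in range(m):
        for l in range(n):
            total += abs(cur[x + k][y + l] - ref[x + i + k][y + j + l])
    return total / (m * n)
```

Each pixel costs roughly three operations (subtract, absolute value, accumulate), which is where a per-location cost of 3MN comes from.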

37
Full-Search Method
  • Compute MAE for $(2p+1)^2$ values of (i, j).
  • Each location requires 3MN operations
  • Picture dimensions IxJ, F pictures per second
  • $3IJF(2p+1)^2$ operations per second
  • I = 720, J = 480, F = 30, p = 15 -> about 30 GOPS
  • Guaranteed to find best (MAE) displacement
  • How to do it?
  • Special computers
  • Smaller p
  • Faster (suboptimal) algorithm
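
A minimal full-search sketch, together with the operation-rate estimate from this slide. It assumes images are nested lists indexed `img[x][y]`, skips displacements that would fall outside the reference image, and uses illustrative function names.

```python
def block_sad(cur, ref, x, y, i, j, m, n):
    """Sum of absolute differences for displacement (i, j)."""
    return sum(abs(cur[x + k][y + l] - ref[x + i + k][y + j + l])
               for k in range(m) for l in range(n))

def full_search(cur, ref, x, y, p, m=16, n=16):
    """Try all (2p+1)^2 displacements; return (i, j, mae) of the best one."""
    best = None
    for i in range(-p, p + 1):
        for j in range(-p, p + 1):
            # Skip displacements whose block leaves the reference image.
            if x + i < 0 or y + j < 0:
                continue
            if x + i + m > len(ref) or y + j + n > len(ref[0]):
                continue
            cost = block_sad(cur, ref, x, y, i, j, m, n)
            if best is None or cost < best[0]:
                best = (cost, i, j)
    return best[1], best[2], best[0] / (m * n)

def full_search_ops_per_second(I, J, F, p):
    """3MN ops per location, IJ/(MN) macroblocks per picture, F pictures/s."""
    return 3 * I * J * F * (2 * p + 1) ** 2
```

For I = 720, J = 480, F = 30, p = 15 the estimate is 29,890,944,000 operations per second, matching the figure of about 30 GOPS.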

38
Logarithmic Search (1D)
  • Goal: find minimum over u in [-p, p]
  • First step: evaluate at -p/2, 0, p/2 (interval
    p)
  • Next step: choose interval of length p/2 around
    minimum (2 more evaluations)
  • Continue until interval length is equal to 2.
    This takes $k = \lceil \log_2 p \rceil$ iterations
  • Example: p = 7
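
The 1-D procedure can be sketched as follows. This is one plausible reading of the slide, not necessarily the exact variant shown in the lecture. The initial step is taken as $2^{k-1}$ (p/2 rounded up to a power of two), and `cost` is any caller-supplied function of the displacement u.

```python
def log_search_1d(cost, p):
    """Logarithmic search for the minimum of cost(u), u in [-p, p].
    Returns (best_u, number_of_cost_evaluations)."""
    k = max(1, (p - 1).bit_length())   # equals ceil(log2 p) for p >= 2
    step = 2 ** (k - 1)                # roughly p/2
    cache = {}

    def f(u):
        # Remember earlier evaluations so the centre is never recomputed.
        if u not in cache:
            cache[u] = cost(u)
        return cache[u]

    center = 0
    for _ in range(k):
        candidates = [u for u in (center - step, center, center + step)
                      if -p <= u <= p]
        center = min(candidates, key=f)
        step //= 2                     # halve the interval each iteration
    return center, len(cache)
```

For p = 7 the steps are 4, 2, 1, giving 3 + 2 + 2 = 7 evaluations instead of the 15 needed to test every u. The result is only guaranteed to be the true minimum when the cost has a single valley.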

39
Logarithmic Search - 2D
  • First stage requires 3 x 3 = 9 evaluations
  • Subsequent stages require 8 evaluations
  • $k = \lceil \log_2 p \rceil$ stages (iterations)
  • Rate: $3IJF(8k+1)$ operations per second
  • p = 15, I = 720, J = 480, F = 30 -> about 1 GOPS
  • Can fail to find minimum
  • Bottom line: faster method, more error than full
    search
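
A 2-D version of the same idea, sketched under the same assumptions as before: the initial step is $2^{k-1}$, `cost` is supplied by the caller and takes a displacement pair, and the function names are illustrative.

```python
def log_search_2d(cost, p):
    """2-D logarithmic search over (u, v) in [-p, p]^2.
    Returns (best_displacement, number_of_cost_evaluations)."""
    k = max(1, (p - 1).bit_length())   # equals ceil(log2 p) for p >= 2
    step = 2 ** (k - 1)
    cache = {}

    def f(uv):
        if uv not in cache:
            cache[uv] = cost(uv)
        return cache[uv]

    center = (0, 0)
    for _ in range(k):
        # 3 x 3 grid around the current centre, clipped to the search window.
        candidates = [(center[0] + du, center[1] + dv)
                      for du in (-step, 0, step)
                      for dv in (-step, 0, step)]
        candidates = [c for c in candidates
                      if abs(c[0]) <= p and abs(c[1]) <= p]
        center = min(candidates, key=f)
        step //= 2
    return center, len(cache)

def log_search_ops_per_second(I, J, F, p):
    """Operation rate 3*I*J*F*(8k+1) with k = ceil(log2 p)."""
    k = max(1, (p - 1).bit_length())
    return 3 * I * J * F * (8 * k + 1)
```

With p = 15 (k = 4) each macroblock needs 33 evaluations instead of 961, which is the source of the roughly 30-fold saving over full search.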

40
PHODS
  • Parallel Hierarchical One-Dimensional Search
  • 1st: blue, 2nd: green, 3rd: red

Twice as fast as logarithmic; less reliable
41
Other Fast Methods
  • Subsample (do not use all points in macroblock)
  • Projection: row and column projection of pixels,
    follow with 1-D search
  • Hierarchical motion estimation
  • Downsample reference image and current image
  • Perform low resolution search
  • Refine

42
Hierarchical Search
  • Prepare downsampled versions of current and
    reference images
  • Full: macroblock 16x16
  • Down 2: macroblock 8x8
  • Down 4: macroblock 4x4
  • Full search in Down 4 reference image
  • 16x speedup from smaller macroblock
  • 16x speedup from fewer displacement vectors
  • p = 16 at full resolution becomes p = 4
  • Around point of best match, do local search in
    Down 2 reference image (3x3 search zone)
  • Repeat for Full reference image (3x3 search zone)
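
The three-level procedure above can be sketched as follows. Several details are assumptions: downsampling by 2x2 averaging, square macroblocks, images stored as nested lists indexed `img[x][y]`, and macroblock origins that are multiples of 4. The slide does not specify the downsampling filter.

```python
def downsample2(img):
    """Halve the resolution by averaging each 2x2 neighbourhood."""
    return [[(img[2 * a][2 * b] + img[2 * a + 1][2 * b]
              + img[2 * a][2 * b + 1] + img[2 * a + 1][2 * b + 1]) / 4.0
             for b in range(len(img[0]) // 2)]
            for a in range(len(img) // 2)]

def block_sad(cur, ref, x, y, i, j, m):
    """Sum of absolute differences for an m x m block displaced by (i, j)."""
    return sum(abs(cur[x + k][y + l] - ref[x + i + k][y + j + l])
               for k in range(m) for l in range(m))

def local_search(cur, ref, x, y, m, center, radius):
    """Exhaustive search over displacements within `radius` of `center`."""
    best = None
    for i in range(center[0] - radius, center[0] + radius + 1):
        for j in range(center[1] - radius, center[1] + radius + 1):
            if x + i < 0 or y + j < 0:
                continue
            if x + i + m > len(ref) or y + j + m > len(ref[0]):
                continue
            cost = block_sad(cur, ref, x, y, i, j, m)
            if best is None or cost < best[0]:
                best = (cost, (i, j))
    return best[1]

def hierarchical_search(cur, ref, x, y, m=16, p=16):
    """Full search at 1/4 resolution, then 3x3 refinements at 1/2 and full."""
    cur2, ref2 = downsample2(cur), downsample2(ref)
    cur4, ref4 = downsample2(cur2), downsample2(ref2)
    v = local_search(cur4, ref4, x // 4, y // 4, m // 4, (0, 0), p // 4)
    v = local_search(cur2, ref2, x // 2, y // 2, m // 2,
                     (2 * v[0], 2 * v[1]), 1)
    return local_search(cur, ref, x, y, m, (2 * v[0], 2 * v[1]), 1)
```

The coarse search handles 4x4 blocks over 81 displacements, and each refinement adds only 9 candidates, so most of the cost moves to the cheapest level.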

[Figure panels: Full, Down 2, Down 4]
43
Motion Estimation Methods
[Figure panels: no compensation, full search, logarithmic search, 3-level hierarchical]
44
Comparison
45
More Speedup
  • Simpler comparison criteria
  • Binarize difference, count pixels that do not
    match
  • PDC (Pixel Difference Classification)
  • Binarize current and reference
  • BPROP (count matching pixels)
  • DPC (count different pixels)
  • BMP (operations done on bitplanes)
  • Produce 3-25 fold speedup
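
One way a binarized criterion can be evaluated cheaply is sketched below: each row of a block is packed into an integer, and mismatching pixels are counted with XOR and a bit count. The fixed threshold and the function names are assumptions for illustration; the slide does not say how the images are binarized.

```python
def binarize_rows(block, threshold=128):
    """Pack each row into an int: bit is 1 where the pixel >= threshold."""
    rows = []
    for row in block:
        bits = 0
        for value in row:
            bits = (bits << 1) | (1 if value >= threshold else 0)
        rows.append(bits)
    return rows

def count_different(cur_bits, ref_bits):
    """Number of pixel positions where the two binarized blocks disagree."""
    return sum(bin(c ^ r).count("1") for c, r in zip(cur_bits, ref_bits))
```

A 16-pixel row is compared with a single XOR rather than 16 subtractions, which is the kind of saving behind the speedups quoted above.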

46
Big Picture on Speedup
  • Speedup methods are less accurate
  • Same Bit Rate, lower SNR
  • Same SNR, higher bit rate
  • Binary criteria lose about 0.5 dB
  • Suppose we have adequate computing power? Can we
    do better?
  • Sub-pixel motion estimation
  • First find best match with pixel accuracy in
    displacement vectors
  • Interpolate images for half-pixel shifts

47
Multipicture Motion Estimation
  • Estimate on basis of past and future
  • Non-sequential image transmission
  • More chances to find good match
  • More calculations

48
Video Compression - Summary
  • Video: a sequence of images
  • Can use intraframe compression
  • Motion JPEG
  • Interframe compression offers great potential for
    savings
  • No motion compensation: lower compression
  • Motion compensation: greater compression
  • All video standards provide for motion
    compensation
  • Compensation done on macroblocks, multiple motion
    vectors per image
  • Tradeoff between computing requirement and image
    quality