Title: ECE-C453 Image Processing Architecture
1ECE-C453 Image Processing Architecture
- Lecture 6, 2/3/04
- Lossy Video Coding Ideas
- Technology of DCT and Motion Estimation
- Oleh Tretiak
- Drexel University
2Decorrelation Ideas
- Orthogonal Transforms (KLT, DCT)
- Main method for intra-frame coding
- Wavelet
- New stuff (JPEG 2000)
- Predictive coding
- Simple
- Used for inter-frame coding (video)
Review
3Lossy Predictive Coding
- How to decorrelate?
- Predict values
- Block coding (DFT)
- wavelet
- Predictive (sample based, feedback)
- Encoder: Differential Pulse Code Modulation (DPCM)
Review
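The feedback structure of DPCM can be sketched as follows. This is a minimal illustration, assuming a previous-sample predictor and a uniform quantizer with step size `step`; the slide specifies neither, and the function names are hypothetical.

```python
def dpcm_encode(samples, step):
    """Quantize prediction errors; the predictor is the previous reconstructed sample."""
    codes = []
    prediction = 0  # assumed initial predictor state, shared with the decoder
    for x in samples:
        error = x - prediction
        q = round(error / step)      # quantized prediction error (the lossy step)
        codes.append(q)
        prediction += q * step       # feedback: track what the decoder will reconstruct
    return codes


def dpcm_decode(codes, step):
    """Rebuild the samples by accumulating the dequantized prediction errors."""
    reconstructed = []
    prediction = 0
    for q in codes:
        prediction += q * step
        reconstructed.append(prediction)
    return reconstructed
```

Because the encoder predicts from the reconstructed value rather than the original one, quantization error does not accumulate: each reconstructed sample stays within half a step of the original.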
4Review Image Decorrelation
- $x = (x_1, x_2, \ldots, x_n)$, a sequence of image gray values
- Preprocess: convert to $y = (y_1, y_2, \ldots, y_n)$, $y = Ax$, $A$ an orthogonal matrix ($A^{-1} = A^T$)
- Theoretical best (for Gaussian process): $A$ is the Karhunen-Loeve transformation matrix
- Images are not Gaussian processes
- Karhunen-Loeve matrix is image-dependent, computationally expensive to find
- Evaluating $y = Ax$ with K-L transformation is computationally expensive
- In practice, we use DCT (discrete cosine transform) for decorrelation
- Computationally efficient
- Almost as good as the K-L transformation
Review
5Review Block-Based Coding
- Full image DCT: one set of decorrelated coefficients for whole image
- Block-based coding
- Image divided into small blocks
- Each block is decorrelated separately
- Block decorrelation performs almost as well as (better than?) full image decorrelation
- Current standards (JPEG, MPEG) use 8x8 DCT blocks
Review
6Rate-Distortion 1D vs. 2D coding
- Theory on tradeoff between distortion and least number of bits
- Interesting tradeoff only if samples are correlated
- Water-filling construction to compute R(d)
Review
7Wavelet Transform
- Filterbank and wavelets
- 2-D wavelets
- Wavelet Pyramid
Review
8Filterbank Pyramid
[Figure: filterbank pyramid, band labels 125, 125, 250, 500, 1000]
Review
9Lena Top Level, next level
Review
10This Lecture
- Idea
- Video Coding by Pixel Prediction
- Motion Estimation
- Technology: DCT, and how much it costs
- Technology: Motion Estimation Algorithms
11Video Coding
- Video: sequence of images
- Reasons for changes between successive images
- Edits
- Camera pan, zoom
- Intra-frame motion
- Intra-frame texture
- Noise
- Model: successive images are similar
- Video coding uses inter-frame redundancy to achieve lossy compression
12Predicting sequential images
[Figure panels: previous frame f(t-1), current frame f(t), difference f(t) - f(t-1)]
13Motion Compensation
- Macroblock size
- MxN
- Matching criterion
- MAE (mean absolute error)
- Search window
- ±p pixel locations
- Search algorithm
- Full search
- Logarithmic search
- Parallel Hierarchical One-Dimensional Search
- Pixel subsampling and projection
- Hierarchical downsampling
14Motion Estimation Methods
No compensation
Full search
logarithmic search
3 level hierarchical
15DCT Technology
- DCT Formula
- How it works
- DCT plus quantization
- DCT implementations and cost
- Direct
- Separable
- Fast
- Refinements
16What is the DCT?
Note: in these equations, p stands for π.
- One-dimensional 8 point DCT
- Input $x_0, \ldots, x_7$, output $y_0, \ldots, y_7$
- One-dimensional inverse DCT
- Input $y_0, \ldots, y_7$, output $x_0, \ldots, x_7$
- Matrix form of equations: x, y are one-column matrices
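The equations for this slide did not survive extraction. As a hedged stand-in, the sketch below uses the orthonormal DCT-II, a common convention for the 8 point DCT; the normalization on the original slide may differ, and the function names are hypothetical.

```python
import math


def dct_matrix(n=8):
    """Orthonormal DCT-II matrix T (assumed convention): y = T x, x = T^T y."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos((2 * i + 1) * k * math.pi / (2 * n))
                     for i in range(n)])
    return rows


def dct_1d(x):
    """Forward transform: y_k = sum_i T[k][i] * x_i."""
    n = len(x)
    t = dct_matrix(n)
    return [sum(t[k][i] * x[i] for i in range(n)) for k in range(n)]


def idct_1d(y):
    """Inverse transform: x_i = sum_k T[k][i] * y_k (T is orthogonal)."""
    n = len(y)
    t = dct_matrix(n)
    return [sum(t[k][i] * y[k] for k in range(n)) for i in range(n)]
```

With this normalization the inverse is simply the transpose, and a constant input produces a single nonzero coefficient, y0.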
17Two-Dimensional DCT
- Forward 2-D DCT. Input $x_{ij}$, $i = 0, \ldots, 7$, $j = 0, \ldots, 7$. Output $y_{kl}$, $k = 0, \ldots, 7$, $l = 0, \ldots, 7$
- Matrix form: $X$, $Y$ are 8x8 matrices with coefficients $x_{ij}$, $y_{kl}$
- The 2-D DCT is separable!
Note: in these equations, p stands for π.
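Separability can be checked numerically. The sketch below assumes the orthonormal DCT-II matrix T (the slide's own normalization is not recoverable) and compares the matrix form Y = T X T^T against the direct double sum; all function names are hypothetical.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II matrix (assumed convention)."""
    return [[(math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n))
             * math.cos((2 * i + 1) * k * math.pi / (2 * n))
             for i in range(n)] for k in range(n)]


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[r][k] * b[k][c] for k in range(len(b)))
             for c in range(len(b[0]))] for r in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def dct_2d_separable(x):
    """Y = T X T^T: 1-D DCT down every column, then along every row."""
    t = dct_matrix(len(x))
    z = matmul(t, x)                 # Z = T X
    return matmul(z, transpose(t))   # Y = Z T^T


def dct_2d_direct(x):
    """Direct form: every output coefficient sums over all n*n inputs."""
    n = len(x)
    t = dct_matrix(n)
    return [[sum(t[k][i] * t[l][j] * x[i][j]
                 for i in range(n) for j in range(n))
             for l in range(n)] for k in range(n)]
```

Both functions return the same coefficients up to rounding; the separable form simply reorganizes the sums so that far fewer multiplications are needed.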
18General DCT
- One dimension
- Two dimensions
19Example 4x4 DCT
20Computational Complexity
- 1D DCT
- $N$ input and output samples: $N^2 = 64$ operations (additions + multiplications)
- 2D DCT - direct implementation
- $M = N^2$ input values, $M$ output values -> $M^2 = N^4$
- 2D DCT - separable implementation, $Y = TXT^T = ZT^T$, where $Z = TX$, all matrices are NxN -> $2N^3$ operations
- For $N = 8$
- 2D DCT direct: 4096 operations, 64 operations per pixel
- 2D DCT separable: 1024 operations, 16 ops/pixel
- Big savings due to separable transform
- Inverse DCT: same story.
21DCT Encoding in JPEG, MPEG
- Take 8x8 blocks of pixels
- Subtract mid-range value (128 for 8-bit pixels)
- Compute 8x8 DCT
- Quantize the DCT coefficients
- Typically, many of the samples are equal to zero
- Lossless entropy coding of the quantized samples
- Different quantization step is used for different DCT coefficients
- $y_{kl}$: DCT coefficients, $q_{kl}$: quantizer steps
- $z_{kl}$: quantized values
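The quantization formula itself is missing from the extracted slide. A common form, assumed here, is z_kl = round(y_kl / q_kl), with reconstruction z_kl * q_kl at the decoder. The function names and the step sizes in the usage note are illustrative only.

```python
def quantize(y, q):
    """z_kl = round(y_kl / q_kl): one quantizer step per coefficient position (assumed form)."""
    return [[round(y[k][l] / q[k][l]) for l in range(len(y[0]))]
            for k in range(len(y))]


def dequantize(z, q):
    """Decoder side: approximate y_kl by z_kl * q_kl."""
    return [[z[k][l] * q[k][l] for l in range(len(z[0]))]
            for k in range(len(z))]
```

For example, coefficients [[-415.4, -30.2], [4.5, 1.9]] with steps [[16, 11], [12, 12]] quantize to [[-26, -3], [0, 0]]: the small coefficients become zero, which is where the compression comes from.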
22DCT Example
- Data from lena, smooth area. RMS error 3.5
DCT
Original
DCT, quantized
Reconstructed
23DCT example
- Data from lena, busy area. RMS error 7.3
Original
DCT
DCT, quantized
Reconstructed
24Overview DCT coding
- Transformation decorrelates samples
- Transformed samples are quantized; quantization step depends on the coefficient. Degree of compression and loss can be changed by scaling the quantization steps
- Many quantized samples are zero => run length coding
- At receiver, perform inverse DCT
- Many calculations!
JPEG standard quantization steps
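Run length coding of the quantized coefficients can be illustrated with a generic zero-run scheme. This is a sketch of the idea only, not the exact JPEG or MPEG symbol format; the function name is hypothetical.

```python
def zero_run_lengths(values):
    """Return (zero_run, value) pairs for the nonzero values, plus the trailing zero count."""
    pairs = []
    run = 0
    for v in values:
        if v == 0:
            run += 1                 # extend the current run of zeros
        else:
            pairs.append((run, v))   # zeros skipped before this nonzero value
            run = 0
    return pairs, run
```

A sequence such as [-26, -3, 0, 0, 2, 0, 0, 0] becomes three pairs and a trailing run of three zeros, which an entropy coder can then represent compactly.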
25Speeding up the DCT
- Separable transform - basic speedup
- Fast DCT transform - like FFT
- Further speedup through Scaled DCT
26Optimized (fast) DCT
- 1-D Chen DCT diagram. Legend: dashed lines indicate subtraction; other symbols mark multiplication by a constant and multiplication by 0.5 (shift).
Characteristics of optimized DCT algorithms
27DCT Complexity
- Direct DCT computation
- 64 DCT values, each requires 64 multiplications + additions => 4096 multiply-accumulate (MA) operations per block
- Separable algorithm (operate on rows, then on columns) => 16 one-dimensional 8 point DCT operations => 1024 MA operations
- Fast implementation: $N \log_2 N$ operations => 16 x 24 = 384 MA ops
- Special methods: many operations involve multiplication by 1 or -1, take advantage of this!
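The three counts above follow from simple arithmetic, reproduced in the sketch below. The function name is hypothetical, and the fast-algorithm figure uses the slide's N log2 N estimate per one-dimensional transform.

```python
import math


def dct_ma_counts(n=8):
    """Multiply-accumulate counts for one n x n block under the three schemes."""
    direct = (n * n) * (n * n)                # n^2 outputs, n^2 MAs each
    separable = 2 * n * (n * n)               # 2n one-dimensional DCTs, n^2 MAs each
    fast = 2 * n * (n * int(math.log2(n)))    # 2n one-dimensional DCTs, n*log2(n) each
    return {"direct": direct, "separable": separable, "fast": fast}
```

For n = 8 this gives 4096, 1024 and 384 operations per block, or 64, 16 and 6 per pixel.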
28Fast Scaled DCT
- Picture of a butterfly at last stage of DCT
following quantizer
29DCT refinements
Complexity of scaled DCT algorithms, excluding
quantization
- Multiply-accumulate architectures
- Basic operation is $a = bc + d$, well suited for DCT
- Super-scalar architectures
- Multi-register, multi-ALU processors
- Perform several operations in parallel
30Motion Estimation
- Architecture of Motion Estimation
- Algorithms and Costs
- Full Search
- Logarithmic Search
- PHODS
- Downsample, projection
- Hierarchical motion estimation
- Other criteria
- Multi-image estimation
31Baseline Models
- Previous frame predicts current frame
- $I(x, y, t) = I(x, y, t-1) + e(x, y, t)$
- Not effective in presence of motion: zoom, pan, etc.
- Prediction to account for motion
- $I(x, y, t) = I(x+u, y+v, t-1) + e(x, y, t)$
- $(u, v)$: motion (displacement) vector
- Model works (somewhat) for pan, not for other motion
- Compromise: compute independent motion estimates for rectangular image regions (macroblocks).
- Macroblocks are, in general, bigger than DCT blocks
32Generic Encoder - simplified
33Generic Decoder
34Motion Compensation
- Macroblock size
- MxN
- Matching criterion
- MAE (mean absolute error)
- Search window
- ±p pixel locations
- Search algorithm
- Full search
- Logarithmic search
- Parallel Hierarchical One-Dimensional Search
- Pixel subsampling and projection
- Hierarchical downsampling
35Motion Estimation Terminology
- Issues
- Size of macroblock
- Size of search region
- In video coding standards, $M = N = 16$
36Matching Criterion
- Matching criterion: what produces the fewest coded bits for the error image
- Coding for each value of motion vector (u, v) is too time consuming (expensive)
- In practice, mean absolute error (MAE) is most popular
- C: current image, R: reference image, (x, y): macroblock origin
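The MAE formula is missing from the extracted slide. A standard definition, assumed here, for an M x N macroblock with origin (x, y) and candidate displacement (i, j) in a search window of ±p is:

```latex
\mathrm{MAE}(i, j) = \frac{1}{MN} \sum_{k=0}^{M-1} \sum_{l=0}^{N-1}
  \left| C(x + k,\, y + l) - R(x + i + k,\, y + j + l) \right|,
  \qquad -p \le i, j \le p
```

The motion vector is the displacement (i, j) that minimizes this quantity.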
37Full-Search Method
- Compute for $(2p+1)^2$ values of $(i, j)$.
- Each location requires $3MN$ operations
- Picture dimensions $I \times J$, $F$ pictures per second
- $3IJF(2p+1)^2$ operations per second
- $I = 720$, $J = 480$, $F = 30$, $p = 15$ => 30 GOPS
- Guaranteed to find best (MAE) displacement
- How to do it?
- Special computers
- Smaller p
- Faster (suboptimal) algorithm
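A full search can be sketched as below. This is a minimal illustration, assuming images stored as lists of rows (`img[row][col]`), x as the column index and y as the row index, and candidates that fall outside the reference image being skipped; the function names are hypothetical.

```python
def mae(cur, ref, x, y, i, j, m, n):
    """Mean absolute error between the m x n block of cur at (x, y) and ref displaced by (i, j)."""
    total = 0
    for l in range(n):
        for k in range(m):
            total += abs(cur[y + l][x + k] - ref[y + j + l][x + i + k])
    return total / (m * n)


def full_search(cur, ref, x, y, m, n, p):
    """Try all (2p+1)^2 displacements and keep the one with the smallest MAE."""
    height, width = len(ref), len(ref[0])
    best, best_cost = None, float("inf")
    for j in range(-p, p + 1):
        for i in range(-p, p + 1):
            inside = (0 <= x + i and x + i + m <= width
                      and 0 <= y + j and y + j + n <= height)
            if inside:
                cost = mae(cur, ref, x, y, i, j, m, n)
                if cost < best_cost:
                    best, best_cost = (i, j), cost
    return best, best_cost


def full_search_ops_per_second(width, height, fps, p):
    """Slide's estimate: 3 operations per pixel per candidate displacement."""
    return 3 * width * height * fps * (2 * p + 1) ** 2
```

With 720 x 480 pictures at 30 pictures per second and p = 15, the estimate evaluates to 29,890,944,000 operations per second, which is the roughly 30 GOPS quoted above.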
38Logarithmic Search (1D)
- Goal: find minimum over $u$ in $[-p, p]$
- First step: evaluate at $-p/2$, $0$, $p/2$ (interval $p$)
- Next step: choose interval of length $p/2$ around minimum (2 more evaluations)
- Continue until interval length is equal to 2. This takes $k = \lceil \log_2 p \rceil$ iterations
- Example: $p = 7$
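The one-dimensional search can be sketched as follows. The sketch assumes power-of-two step sizes (4, 2, 1 for p = 7) rather than exact halves of p, and clamps candidates to [-p, p]; `cost` stands for any matching criterion, and the function name is hypothetical.

```python
import math


def log_search_1d(cost, p):
    """Logarithmic search for the minimum of cost(u) over integer u in [-p, p]."""
    k = max(1, math.ceil(math.log2(p)))   # number of iterations
    step = 2 ** (k - 1)                   # assumed first step size
    best_u, best_cost = 0, cost(0)
    evaluations = 1
    while step >= 1:
        center = best_u
        for u in (center - step, center + step):   # 2 new evaluations per iteration
            if -p <= u <= p:
                c = cost(u)
                evaluations += 1
                if c < best_cost:
                    best_u, best_cost = u, c
        step //= 2                                 # halve the interval
    return best_u, evaluations
```

For p = 7 this needs at most 7 evaluations instead of the 15 an exhaustive search would use. The result is only guaranteed when the cost has a single minimum along u.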
39Logarithmic Search - 2D
- First stage requires 3x3 = 9 evaluations
- Subsequent stages require 8 evaluations
- $k = \lceil \log_2 p \rceil$ stages (iterations)
- Rate: $3IJF(8k+1)$
- $p = 15$, $I = 720$, $J = 480$, $F = 30$ => 1 GOPS
- Can fail to find minimum
- Bottom line: faster method, more error than full search
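A two-dimensional version, with the evaluation count and rate estimate from this slide, might look like the sketch below. It again assumes power-of-two step sizes and a generic `cost(i, j)` criterion; the function names are hypothetical.

```python
import math


def log_search_2d(cost, p):
    """2-D logarithmic search over displacements (i, j) with |i|, |j| <= p."""
    k = max(1, math.ceil(math.log2(p)))
    step = 2 ** (k - 1)                  # assumed first step size
    best, best_cost = (0, 0), cost(0, 0)
    evaluations = 1
    while step >= 1:
        ci, cj = best
        for dj in (-step, 0, step):
            for di in (-step, 0, step):
                i, j = ci + di, cj + dj
                # the center was already evaluated: at most 8 new points per stage
                if (di, dj) != (0, 0) and abs(i) <= p and abs(j) <= p:
                    c = cost(i, j)
                    evaluations += 1
                    if c < best_cost:
                        best, best_cost = (i, j), c
        step //= 2
    return best, evaluations


def log_search_ops_per_second(width, height, fps, p):
    """Slide's estimate: 3 operations per pixel for each of 8k+1 evaluations."""
    k = math.ceil(math.log2(p))
    return 3 * width * height * fps * (8 * k + 1)
```

For p = 15 and 720 x 480 pictures at 30 pictures per second the estimate is 1,026,432,000 operations per second, about 1 GOPS, compared with roughly 30 GOPS for full search.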
40PHODS
- Parallel Hierarchical One-Dimensional Search
- 1st: Blue, 2nd: Green, 3rd: Red
Twice as fast as logarithmic. Less reliable
41Other Fast Methods
- Subsample (do not use all points in macroblock)
- Projection: row and column projection of pixels, follow with 1-D search
- Hierarchical motion estimation
- Downsample reference image and current image
- Perform low resolution search
- Refine
42Hierarchical Search
- Prepare downsampled versions of current and reference images
- Full: macroblock 16x16
- Down 2: macroblock 8x8
- Down 4: macroblock 4x4
- Full search in Down 4 reference image
- 16 x speedup, smaller macroblock
- 16 x speedup, fewer displacement vectors
- $p = 16$ -> $p = 4$
- Around point of best match, do local search in Down 2 reference image (3x3 search zone)
- Repeat for Full reference image (3x3 search zone)
Full
Down 2
Down 4
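The three-level procedure can be sketched as below. This is an illustration under stated assumptions: downsampling by 2x2 averaging, MAE as the criterion, images as lists of rows with x as column and y as row, and candidates outside the reference image skipped. All function names are hypothetical.

```python
def downsample2(img):
    """Halve resolution by averaging each 2x2 group of pixels (assumed filter)."""
    return [[(img[r][c] + img[r][c + 1] + img[r + 1][c] + img[r + 1][c + 1]) / 4.0
             for c in range(0, len(img[0]) - 1, 2)]
            for r in range(0, len(img) - 1, 2)]


def mae(cur, ref, x, y, i, j, size):
    """Mean absolute error for a size x size block at (x, y), displacement (i, j)."""
    total = 0.0
    for l in range(size):
        for k in range(size):
            total += abs(cur[y + l][x + k] - ref[y + j + l][x + i + k])
    return total / (size * size)


def search(cur, ref, x, y, size, center, radius):
    """Exhaustive search within +-radius of center; returns None if no candidate fits."""
    height, width = len(ref), len(ref[0])
    best, best_cost = None, float("inf")
    for j in range(center[1] - radius, center[1] + radius + 1):
        for i in range(center[0] - radius, center[0] + radius + 1):
            if (0 <= x + i and x + i + size <= width
                    and 0 <= y + j and y + j + size <= height):
                cost = mae(cur, ref, x, y, i, j, size)
                if cost < best_cost:
                    best, best_cost = (i, j), cost
    return best


def hierarchical_search(cur, ref, x, y, size=16, p=16):
    """Full search at Down 4, then 3x3 refinements at Down 2 and Full."""
    cur2, ref2 = downsample2(cur), downsample2(ref)
    cur4, ref4 = downsample2(cur2), downsample2(ref2)
    u4 = search(cur4, ref4, x // 4, y // 4, size // 4, (0, 0), p // 4)
    u2 = search(cur2, ref2, x // 2, y // 2, size // 2, (2 * u4[0], 2 * u4[1]), 1)
    return search(cur, ref, x, y, size, (2 * u2[0], 2 * u2[1]), 1)
```

The coarse search examines 81 candidates on a 4x4 block, and each refinement adds only 9 more, compared with 1089 candidates on a 16x16 block for a full search with p = 16.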
43Motion Estimation Methods
No compensation
Full search
logarithmic search
3 level hierarchical
44Comparison
45More Speedup
- Simpler comparison criteria
- Binarize difference, count pixels that do not match
- PDC (Pixel Difference Classification)
- Binarize current and reference
- BPROP (count matching pixels)
- DPC (count different pixels)
- BMP (operations done on bitplanes)
- Produce 3-25 fold speedup
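The idea of binarizing the difference can be sketched as a mismatch count. This is a simplified illustration in the spirit of PDC, assuming a pixel "matches" when its absolute difference is at most `threshold`; the function name and threshold parameter are hypothetical.

```python
def count_mismatches(cur_block, ref_block, threshold):
    """Count pixels whose absolute difference exceeds the threshold."""
    return sum(1
               for cur_row, ref_row in zip(cur_block, ref_block)
               for c, r in zip(cur_row, ref_row)
               if abs(c - r) > threshold)
```

The comparison and count replace the accumulation of absolute differences used by MAE, which is what makes such criteria cheaper to implement.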
46Big Picture on Speedup
- Speedup methods are less accurate
- Same Bit Rate, lower SNR
- Same SNR, higher bit rate
- Binary criteria lose about 0.5 dB
- Suppose we have adequate computing power? Can we do better?
- Sub-pixel motion estimation
- First find best match with pixel accuracy in displacement vectors
- Interpolate images for half-pixel shifts
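Half-pixel interpolation can be sketched with bilinear averaging. The interpolation filter is an assumption here, since the slide does not name one; the function name is hypothetical.

```python
def half_pixel_grid(img):
    """Bilinear interpolation onto a half-pixel grid of size (2H-1) x (2W-1)."""
    h, w = len(img), len(img[0])
    out = [[0.0] * (2 * w - 1) for _ in range(2 * h - 1)]
    for r in range(2 * h - 1):
        for c in range(2 * w - 1):
            r0, c0 = r // 2, c // 2
            r1 = min(r0 + r % 2, h - 1)   # next row only at half-pixel positions
            c1 = min(c0 + c % 2, w - 1)   # next column only at half-pixel positions
            out[r][c] = (img[r0][c0] + img[r0][c1]
                         + img[r1][c0] + img[r1][c1]) / 4.0
    return out
```

Even-indexed positions reproduce the original pixels, and odd-indexed positions hold the averages of their two or four neighbors. A half-pixel search then evaluates the matching criterion on this finer grid around the best integer displacement.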
47Multipicture Motion Estimation
- Estimate on basis of past and future
- Non-sequential image transmission
- More chances to find good match
- More calculations
48Video Compression - Summary
- Video: sequence of images
- Can use intraframe compression
- Motion JPEG
- Interframe compression offers great potential for savings
- No motion compensation: lower compression
- Motion compensation: greater compression
- All video standards provide for motion compensation
- Compensation done on macroblocks, multiple motion vectors per image
- Tradeoff between computing requirement and image quality