ICASSP 2004 Poster Slides - PowerPoint PPT Presentation


Transcript and Presenter's Notes


1
ICASSP 2004 Poster Slides
  • Adam Zelinski

2
Abstract
  • This poster presents an automatic approach for
    minimizing the number of additions required for a
    multiplierless implementation of a signal
    transform, under an arbitrary quality constraint.
  • Multiplierless implementation means
    multiplications by constants are realized as
    networks of shifts and additions (e.g., y = 5x →
    y = (x << 2) + x).
  • Cost of implementation = number of additions.
  • Higher precision gives higher output quality but also
    higher cost: there is a tradeoff.

3
Design Flow Challenges
  • Choosing a robust algorithm requires
    understanding DSP concepts and literature.
  • An exponentially large number of different
    precision configurations exists.
  • Reducing the precision of a constant
    unpredictably impacts output.
  • Given a transform
  • We automatically select a numerically robust
    algorithm
  • We automatically tune the constant precisions

4
DSP Transform Algorithms
  • We consider the following transforms

5
SPIRAL
  • Automatically implements and optimizes DSP
    algorithms.
  • Searches across many formulas, finding one with
    minimal runtime.
  • We use it to generate robust algorithms.

[Figure: SPIRAL block diagram. Labels: transform, algorithm generation, algorithm, algorithm compilation, C code, runtime measurement, runtime, search engine (controls), platform-adapted implementation.]
6
Robust Algorithm Example: DCT-II, Size 8
  • Formula generated by SPIRAL
  • Data flow diagram
  • Rotation-based algorithms are selected for
    robustness.

7
Increasing Algorithm Robustness
  • Automatically convert Rotations to Lifting Steps
    (LS)

Targets for approximation
  • Rounding error in 1st LS (3rd LS analogous):
    not magnified.
  • Rounding error in 2nd LS: e is magnified unless
    the rotation angle is in [0, π/2] or [3π/2, 2π].

Solution: angle manipulation.
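
To make the rotation-to-lifting conversion concrete, here is a minimal Python sketch of the standard three-step lifting factorization of a plane rotation. The function names, the counterclockwise rotation convention, and the coefficient names p and u are assumptions for illustration; the slides do not give the formulas. Under this factorization the outer coefficient is p = -tan(angle/2), whose magnitude stays at or below 1 only for angles in [0, π/2] or [3π/2, 2π], which is one plausible reading of the magnification remark above.

```python
import math


def lifting_coefficients(angle):
    """Return (p, u) for a three-step lifting factorization of a rotation.

    Assumed convention: rotation by `angle` counterclockwise equals
    [[1, p], [0, 1]] @ [[1, 0], [u, 1]] @ [[1, p], [0, 1]].
    """
    s = math.sin(angle)
    if abs(s) < 1e-12:
        raise ValueError("factorization is undefined when sin(angle) is 0")
    p = (math.cos(angle) - 1.0) / s  # equals -tan(angle / 2)
    u = s
    return p, u


def rotate_by_lifting(x, y, angle):
    """Rotate (x, y) using three lifting steps instead of a 2x2 matrix."""
    p, u = lifting_coefficients(angle)
    x = x + p * y  # 1st lifting step
    y = y + u * x  # 2nd lifting step
    x = x + p * y  # 3rd lifting step
    return x, y
```

In a multiplierless design, p and u are the constants that would be approximated with shifts and additions.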
8
Multiplierless Implementation
Constant multiplies are converted to shifts and
additions: A_robust → A_multiplierless.
  • An algorithm constant c is approximated by a
    fixed-point value (n denotes the number of
    fractional bits).
  • Example:

Direct: c = 0.10011100100101, 6 adds (6 shifts)
Canonical Signed Digit (CSD): c = 0.101001̄00100101 (1̄ denotes the digit -1), 5 adds (5 shifts)
Addition Chains (our method): 4 adds (5 shifts)
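
The add counts for the direct and CSD forms in the example can be reproduced with a short Python sketch. It treats the 14-bit constant as the integer 10021 (binary 10011100100101) and uses the textbook non-adjacent-form recoding for CSD. All function names are hypothetical, and the addition-chain method from the poster is not implemented here.

```python
def csd_digits(k):
    """Return the CSD (non-adjacent form) digits of k >= 0, least significant first."""
    digits = []
    while k > 0:
        if k & 1:
            d = 2 - (k & 3)  # +1 if k % 4 == 1, -1 if k % 4 == 3
            k -= d
        else:
            d = 0
        digits.append(d)
        k >>= 1
    return digits


def adds_direct(k):
    """Additions needed when every 1 bit of k contributes one shifted term."""
    return bin(k).count("1") - 1


def adds_csd(k):
    """Additions/subtractions needed with the CSD recoding of k."""
    return sum(1 for d in csd_digits(k) if d != 0) - 1


def multiply_shift_add(x, k):
    """Compute x * k using only shifts, additions, and subtractions."""
    acc = 0
    for shift, d in enumerate(csd_digits(k)):
        if d == 1:
            acc += x << shift
        elif d == -1:
            acc -= x << shift
    return acc


C = 0b10011100100101  # the example constant scaled by 2**14
print(adds_direct(C), adds_csd(C))
```

For a fractional constant with n fractional bits, the integer result would be shifted right by n at the end.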
9
Converting Constant Multiplies to Shifts and Adds
Addition Chains outperform CSD
10
Quality Measures
  • Transform quality must be maintained when
    minimizing cost.
  • Measures we consider:
  • Coding Gain
  • MP3 decoder compliance rating (Non-Compliant,
    Limited Accuracy, or Fully Compliant)
  • Peak Signal to Noise Ratio (PSNR) of JPEG decompressed
    image, D, to original image, O
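
The transcript does not preserve the PSNR formula, so the following Python sketch uses the common definition for 8-bit images, PSNR = 10 log10(255² / MSE). The function name, the flat pixel-list representation, and the peak value of 255 are assumptions.

```python
import math


def psnr(original, decompressed, peak=255.0):
    """PSNR in dB between two equally sized sequences of pixel values."""
    if len(original) != len(decompressed) or not original:
        raise ValueError("images must be non-empty and of equal size")
    mse = sum((o - d) ** 2 for o, d in zip(original, decompressed)) / len(original)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak * peak / mse)
```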

11
Search Space
  • A precision list, with one entry per constant, is
    associated with an algorithm having n constants.
  • If the max. bitwidth, B, is 19, at most 5 adds are
    needed per constant.
  • Goal: find a precision list such that the number of
    additions is minimized and the quality threshold is met.
  • Size of search space: 6^n. A 32-point DCT-II has 80
    constants, so the size is 6^80; exhaustive search is
    infeasible.

12
Global and Greedy Search
  • Global:
  • Same bitwidth assumed for all constants
  • Exhaustive search over all B+1 possibilities
  • Greedy:
  • Set each constant to max. precision
  • Each constant is reduced in turn to require one
    fewer addition; quality is evaluated (n
    evaluations)
  • Choose the config. whose quality is highest
  • Continue until Q_thresh is not satisfied by any
    config.
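
The two searches above can be sketched in Python as follows. This is an illustration under stated assumptions, not the poster's implementation: a configuration is modeled as a list of per-constant addition counts, `quality` and `cost` are caller-supplied callables, the maximum-precision configuration is assumed to meet the threshold, and `toy_quality` with its weights is invented purely for the usage example.

```python
from typing import Callable, List, Optional

Config = List[int]


def global_search(max_bitwidth: int,
                  cost: Callable[[int], float],
                  quality: Callable[[int], float],
                  q_thresh: float) -> Optional[int]:
    """Try every shared bitwidth 0..B and return the cheapest feasible one."""
    feasible = [b for b in range(max_bitwidth + 1) if quality(b) >= q_thresh]
    return min(feasible, key=cost) if feasible else None


def greedy_search(n: int, max_adds: int,
                  quality: Callable[[Config], float],
                  q_thresh: float) -> Config:
    """Start at max precision; repeatedly drop one addition from one constant."""
    config = [max_adds] * n
    while True:
        candidates = []
        for i in range(n):  # up to n quality evaluations per round
            if config[i] > 0:
                trial = config.copy()
                trial[i] -= 1
                candidates.append((quality(trial), trial))
        if not candidates:
            return config
        best_q, best = max(candidates, key=lambda qc: qc[0])
        if best_q < q_thresh:  # no candidate satisfies the threshold
            return config
        config = best


# Hypothetical quality model: each constant's error halves per extra addition.
TOY_WEIGHTS = [8.0, 1.0]


def toy_quality(config: Config) -> float:
    return -sum(w * 2.0 ** -a for w, a in zip(TOY_WEIGHTS, config))


result = greedy_search(n=2, max_adds=3, quality=toy_quality, q_thresh=-2.6)
```

In this toy run the greedy search removes all additions from the low-weight constant and keeps full precision on the high-weight one.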

13
Evolutionary Search
  • Mimics natural process of evolution
  • Random configs. are chosen
  • For a set of generations, random members are
    introduced, mutated, and crossbred. Only the
    fittest proceed to the next generation.

Mutation: randomly change precision.
Crossbreeding: swap precisions.
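
A minimal Python sketch of such an evolutionary search is shown below. The population size, generation count, one-point crossover, and the choice to seed the population with the maximum-precision configuration are all assumptions made for illustration; the slides only name the mutation and crossbreeding operators. The sketch assumes n is at least 2 and that the maximum-precision configuration meets the threshold.

```python
import random


def evolutionary_search(n, max_adds, quality, q_thresh,
                        pop_size=20, generations=50, seed=0):
    """Evolve lists of per-constant addition counts toward low total cost."""
    rng = random.Random(seed)

    def random_config():
        return [rng.randint(0, max_adds) for _ in range(n)]

    def fitness_key(config):
        # Feasible configs sort first; among them, fewer additions is better.
        return (quality(config) < q_thresh, sum(config))

    population = [[max_adds] * n] + [random_config() for _ in range(pop_size - 1)]
    for _ in range(generations):
        offspring = [random_config() for _ in range(pop_size // 4)]  # newcomers
        for parent in population:  # mutation: randomly change one precision
            child = parent.copy()
            child[rng.randrange(n)] = rng.randint(0, max_adds)
            offspring.append(child)
        for _ in range(pop_size // 2):  # crossbreeding: swap precisions
            a, b = rng.sample(population, 2)
            cut = rng.randrange(1, n)
            offspring.append(a[:cut] + b[cut:])
        # Only the fittest proceed to the next generation.
        population = sorted(population + offspring, key=fitness_key)[:pop_size]
    return population[0]


# Hypothetical quality model, invented for the usage example.
EVO_WEIGHTS = [8.0, 1.0, 4.0, 2.0]


def evo_quality(config):
    return -sum(w * 2.0 ** -a for w, a in zip(EVO_WEIGHTS, config))


best = evolutionary_search(n=4, max_adds=5, quality=evo_quality, q_thresh=-2.0)
```

Because the fittest members always survive, the returned configuration is never worse than the best one seen in any generation.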
14
Experiments
15
Experimental Results
Number of additions is significantly reduced while
transform quality is maintained.
16
Experimental Results
Progress of the evolutionary algorithm during E1.
17
Applying Global Search to the DCT-II within MP3.
Experimental Results
Quality measure: MP3 compliance rating
  • Limited Accuracy (RMS < 1.4e-4, MaxDiff < 8):
    achieved when global bitwidth = 9
  • Fully Compliant (RMS < 8.8e-6, MaxDiff < 6.1e-5):
    achieved when global bitwidth = 13
Reference input is decoded and compared to
reference output. RMS (Root Mean Squared) error
and MaxDiff error are computed.
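
The compliance check described here can be sketched in Python using the thresholds quoted on this slide. The function name and the assumption that both signals are equal-length sequences of normalized samples are illustrative, not taken from the poster.

```python
import math


def compliance_rating(decoded, reference):
    """Classify decoder output by RMS and maximum absolute difference."""
    if len(decoded) != len(reference) or not decoded:
        raise ValueError("signals must be non-empty and of equal length")
    diffs = [d - r for d, r in zip(decoded, reference)]
    rms = math.sqrt(sum(e * e for e in diffs) / len(diffs))
    max_diff = max(abs(e) for e in diffs)
    # Thresholds as listed on the slide.
    if rms < 8.8e-6 and max_diff < 6.1e-5:
        return "Fully Compliant"
    if rms < 1.4e-4 and max_diff < 8:
        return "Limited Accuracy"
    return "Non-Compliant"
```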
18
Searching to find a low-cost DCT-II for JPEG.
Experimental Results
  • Quality measure: PSNR (dB)
  • Thresholds: 32, 32.5, …, 34.5 dB
  • Global is best for the lowest thresholds; Greedy and
    Evol. are best for larger thresholds.
  • Greedy and Evol. often produce the same result for
    a given threshold.
19
Acknowledgments
  • This work was supported by NSF through awards
    0234293, 0310941, and 0325687.

Primary References
  • J. Liang and T.D. Tran, "Fast Multiplierless
    Approximations of the DCT with the Lifting
    Scheme," IEEE Trans. on Sig. Proc., vol. 49, no.
    12, pp. 3032-3044, 2001.
  • M. Pueschel et al., "SPIRAL: A Generator for
    Platform-Adapted Libraries of Signal Processing
    Algorithms," Journal of High Performance
    Computing and Applications, special issue on
    Automatic Performance Tuning, 18(1), pp. 21-45,
    2004. http://www.spiral.net