1
Automating Transformations from Floating Point to
Fixed Point for Implementing Digital Signal
Processing Algorithms
  • Prof. Brian L. Evans
  • Embedded Signal Processing Laboratory
  • Dept. of Electrical and Computer Engineering
  • The University of Texas at Austin

Based on work by PhD student Kyungtae Han (now at
Intel Research Labs)
July 4, 2006
2
Outline
  • Introduction
  • Background
  • Optimize fixed-point wordlengths
  • Reduce power consumption in arithmetic
  • Automate transformations of systems
  • Conclusion

3
Implementing Digital Signal Processing Algorithms
Introduction
[Figure: digital signal processing algorithms are first written as a floating-point program for a floating-point processor. Code conversion gives a fixed-point program with uniform wordlength for a fixed-point processor, and wordlength optimization gives a fixed-point program with optimized wordlengths for a fixed-point ASIC. Scales marked L (low) and H (high) compare hardware price and power consumption.]
ASIC: Application-Specific Integrated Circuit
4
Transformations to Fixed Point
Introduction
  • Advantages
  • Lower hardware complexity
  • Lower power consumption
  • Faster speed in processing
  • Disadvantages
  • Introduces distortion due to quantization error
  • Search for optimum wordlengths by trial and error
    is time-consuming
  • Research goals
  • Automate transformations to fixed point
  • Control distortion vs. complexity tradeoffs

[Figure: a floating-point program is transformed, through code conversion followed by wordlength optimization, into a fixed-point program with optimized wordlengths]
5
Outline
  • Introduction
  • Background
  • Optimize fixed-point wordlengths
  • Reduce power consumption in arithmetic
  • Automate transformations of systems
  • Conclusion

6
Fixed-Point Data Format
Background
  • Integer wordlength (IWL)
  • Number of bits assigned to integer representation
  • Includes sign bit
  • Fractional wordlength (FWL)
  • Number of bits assigned to fraction
  • Wordlength: WL = IWL + FWL

SystemC format (www.systemc.org)
π = 3.14159...(10) in floating point
3.140625(10) = 011.001001(2), with WL = 9, IWL = 3, FWL = 6
3.141479492(10) = 011.0010010000111(2), with WL = 16, IWL = 3, FWL = 13
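The two fixed-point values of π above can be reproduced with a short sketch. It assumes truncation (rounding toward minus infinity) and saturation on overflow; the function names `quantize` and `to_bits` are illustrative and are not part of SystemC or of the released software.

```python
import math

def quantize(value: float, iwl: int, fwl: int) -> float:
    """Truncate `value` to a signed fixed-point grid with IWL integer bits
    (sign bit included) and FWL fractional bits, saturating on overflow."""
    scale = 1 << fwl
    lowest = -(1 << (iwl + fwl - 1))        # most negative integer code
    highest = (1 << (iwl + fwl - 1)) - 1    # most positive integer code
    code = math.floor(value * scale)        # truncation (assumed rounding mode)
    code = max(lowest, min(highest, code))
    return code / scale

def to_bits(value: float, iwl: int, fwl: int) -> str:
    """Two's-complement bit string with a binary point after the IWL bits."""
    code = int(quantize(value, iwl, fwl) * (1 << fwl)) & ((1 << (iwl + fwl)) - 1)
    bits = format(code, f"0{iwl + fwl}b")
    return bits[:iwl] + "." + bits[iwl:]

print(quantize(math.pi, 3, 6), to_bits(math.pi, 3, 6))
print(quantize(math.pi, 3, 13), to_bits(math.pi, 3, 13))
```

With IWL = 3 the representable range is [-4, 4), so π fits without saturation in both cases.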
7
Distortion vs. Complexity Tradeoffs
Background
  • Different wordlengths have different application
    distortion and implementation complexity tradeoffs

[Figure: application distortion d(w) plotted against implementation complexity c(w), showing the wordlength lower and upper bounds, the feasible region, and the optimal tradeoff curve]
w: vector of wordlengths
c(w): implementation cost function
Cmax: constant for maximum implementation cost
d(w): application distortion function
Dmax: constant for maximum application distortion
  • Minimize implementation cost
  • Minimize application distortion
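The formulas for these two objectives are not reproduced in this transcript. One plausible formalization, assuming Dmax and Cmax act as constraints and writing the wordlength lower and upper bounds as \underline{w} and \overline{w} (notation introduced here, not taken from the slides):

```latex
\begin{aligned}
\text{Minimize implementation cost:}\quad
  & \min_{\mathbf{w}} \; c(\mathbf{w})
  \quad \text{subject to} \quad d(\mathbf{w}) \le D_{\max},\;
  \underline{\mathbf{w}} \le \mathbf{w} \le \overline{\mathbf{w}} \\
\text{Minimize application distortion:}\quad
  & \min_{\mathbf{w}} \; d(\mathbf{w})
  \quad \text{subject to} \quad c(\mathbf{w}) \le C_{\max},\;
  \underline{\mathbf{w}} \le \mathbf{w} \le \overline{\mathbf{w}}
\end{aligned}
```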

8
Wordlength Optimization
Background
  • Multiple objective optimization
  • Single objective optimization

Proposed work fixes integer wordlengths and
searches for fractional wordlengths
9
Genetic Algorithm
Background
  • Evolutionary algorithm
  • Inspired by Holland (1975)
  • Mimic processes of plant and animal evolution
  • Find optimum of a complex function

Greg Rohling, Ph.D Defense, Georgia Tech, 2004
10
Pareto Optimality
Background
  • Pareto optimality: best that could be achieved
    without disadvantaging at least one group
    (Schick, 1970)
  • Pareto optimal set is set of nondominated
    solutions
  • E is dominated by C as all objectives for C are
    less than corresponding objectives for E
  • Solutions A, B, C, D are nondominated (not
    dominated by any solution)
  • Pareto front is boundary (tradeoff curve) that
    connects Pareto optimal set solutions
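The dominance test behind the Pareto optimal set can be sketched as follows. The coordinates are hypothetical values chosen only to mimic the figure's layout (they are not read from the slide), both objectives are minimized, and the sketch uses the common weak form of dominance (no worse in every objective, strictly better in at least one), which is slightly broader than the strict wording in the bullet above.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]

def dominates(p: Point, q: Point) -> bool:
    """True if p is no worse than q in every objective and better in one."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))

def pareto_set(points: Dict[str, Point]) -> List[str]:
    """Names of the solutions that no other solution dominates."""
    return sorted(
        name
        for name, p in points.items()
        if not any(dominates(q, p) for other, q in points.items() if other != name)
    )

# Hypothetical (objective 1, objective 2) values for solutions A..I.
solutions = {
    "A": (1, 9), "B": (2, 6), "C": (4, 4), "D": (7, 2), "E": (5, 6),
    "F": (6, 5), "G": (3, 8), "H": (4, 7), "I": (2, 10),
}
print(pareto_set(solutions))
```

With these values E is dominated by C, and only A, B, C, and D remain in the nondominated set.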

[Figure: solutions A through I plotted on Objective 1 and Objective 2 axes; the Pareto front connects the nondominated solutions A, B, C, and D]
11
Outline
  • Introduction
  • Background
  • Optimize fixed-point wordlengths
  • Reduce power consumption in arithmetic
  • Automate transformations of systems
  • Conclusion

12
Search for Optimum Wordlength
Optimize Fixed-Point Wordlengths
  • Exhaustive search impractical for many variables
  • Gradient-based search (single objective)
  • Utilizes gradient information to determine next
    candidates
  • Complexity measure (CM) (Sung and Kum, 1995)
  • Distortion measure (DM) (Han et al., 2001)
  • Complexity-and-distortion measure (CDM) (Han and
    Evans, 2004)
  • Guided random search
  • Genetic algorithm for single objective (Leban and
    Tasic, 2000)
  • Multiple objective genetic algorithm (Han, Olson,
    and Evans, 2006)

13
Complexity-and-Distortion Measure
Optimize Fixed-Point Wordlengths
  • Weighted combination of measures
  • Single objective function
  • Gradient-based search
  • Initialization
  • Iterative greedy search based on complexity and
    distortion gradient information

c(w): complexity function
d(w): distortion function
Dmax: constant for maximum distortion
Cmax: constant for maximum complexity
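A minimal sketch of such a greedy search is shown below. It assumes the single objective is a weighted sum of normalized complexity and normalized distortion, with a weight `alpha_c` between 0 and 1; the slide's exact formula is not reproduced in this transcript, and the toy cost and distortion models are hypothetical stand-ins for simulation results. Because every candidate is compared against the same current point, picking the smallest objective value after a one-bit increment is equivalent to following the most favorable finite-difference gradient.

```python
from typing import Callable, List, Sequence

Objective = Callable[[Sequence[int]], float]

def cdm(w: Sequence[int], cost: Objective, distortion: Objective,
        c_max: float, d_max: float, alpha_c: float) -> float:
    """Weighted sum of normalized complexity and normalized distortion."""
    return alpha_c * cost(w) / c_max + (1.0 - alpha_c) * distortion(w) / d_max

def greedy_cdm_search(w_min: Sequence[int], cost: Objective, distortion: Objective,
                      c_max: float, d_max: float,
                      alpha_c: float = 0.5, w_upper: int = 16) -> List[int]:
    """Start at the minimum wordlengths and add one bit at a time to the
    variable with the best objective until distortion meets d_max."""
    w = list(w_min)
    while distortion(w) > d_max:
        trials = []
        for n in range(len(w)):
            if w[n] < w_upper:
                candidate = w.copy()
                candidate[n] += 1
                score = cdm(candidate, cost, distortion, c_max, d_max, alpha_c)
                trials.append((score, n))
        if not trials:
            raise ValueError("no wordlength within the upper bound meets d_max")
        _, best_n = min(trials)
        w[best_n] += 1
    return w

# Hypothetical toy models: cost grows linearly with wordlength, and each
# extra bit halves that variable's contribution to the distortion.
WEIGHTS = [4.0, 1.0, 2.0]

def toy_cost(w: Sequence[int]) -> float:
    return sum(k * bits for k, bits in zip(WEIGHTS, w))

def toy_distortion(w: Sequence[int]) -> float:
    return sum(2.0 ** -bits for bits in w)

D_MAX = 0.05
C_MAX = toy_cost([16, 16, 16])
result = greedy_cdm_search([2, 2, 2], toy_cost, toy_distortion, C_MAX, D_MAX)
print(result, toy_cost(result), toy_distortion(result))
```

Under this assumed weighting, `alpha_c = 1` uses only complexity information (as in CM) and `alpha_c = 0` uses only distortion information (as in DM).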
14
Case Study I Filter Design
Optimize Fixed-Point Wordlengths
  • Infinite impulse response (IIR) filter
  • Complexity measure: area model of
    field-programmable gate array (FPGA)
    (Constantinides, Cheung, and Luk, 2003)
  • Distortion measure: root mean square (RMS) error
  • Seven fixed-point variables (indicated by slashes)

15
Case Study I Gradient-Based Search
Optimize Fixed-Point Wordlengths
  • CDM could lead to lower complexity and lower
    number of simulations compared to DM and CM

Search Method  Gradient Measure  Number of System Simulations  Complexity Estimate (LUT)  Distortion (RMS)
Gradient       DM                316                           51.05                      0.0981
Gradient       CDM               145                           49.85                      0.0992
Gradient       CM                417                           51.95                      0.0986
Complete       -                 16^7                          -                          -

Maximum distortion, measured by root mean square (RMS) error, is 0.1
16^7 = 268,435,456 simulations (8.5 years at 1 second per simulation)
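The size of the complete search can be checked with a few lines. This sketch assumes 16 candidate wordlengths per variable, which is inferred from the quoted total of 268,435,456 simulations for seven variables; the function name is illustrative.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600

def exhaustive_search_cost(num_variables: int,
                           candidates_per_variable: int = 16,
                           seconds_per_simulation: float = 1.0):
    """Number of simulations for a complete search and its duration in years."""
    simulations = candidates_per_variable ** num_variables
    years = simulations * seconds_per_simulation / SECONDS_PER_YEAR
    return simulations, years

simulations, years = exhaustive_search_cost(7)
print(simulations, round(years, 1))
```

The count grows exponentially with the number of variables, which is why the gradient-based searches need only a few hundred simulations by comparison.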
16
Case Study I Genetic Algorithm
Optimize Fixed-Point Wordlengths
  • Search Pareto optimal set (nondominated)
  • Handles multiple objectives: error and area

[Figure: Pareto fronts at the 100th generation (9,000 simulations), 250th generation (22,500 simulations), and 500th generation (45,000 simulations)]
Population for one generation: 90
LUT: lookup table
17
Case Study I Comparison
Optimize Fixed-Point Wordlengths
  • Gradient-based search (GS) results vs. GA results

[Figure: GA results at the 50th generation (4,500 simulations) and the 500th generation (45,000 simulations), compared with gradient-based search results]
Required RMS limits for gradient-based search: Dmax = 0.12, 0.1, 0.08
  • GS methods can get stuck in a local minimum
  • GS methods reduce running time (CDM: 145
    simulations)

18
Case Study II Communication System
Optimize Fixed-Point Wordlengths
  • Simple binary phase shift keying (BPSK) system
  • Complexity measure: area model of
    field-programmable gate array (FPGA)
    (Constantinides, Cheung, and Luk, 2003)
  • Distortion measure: bit error rate (BER)
  • Four fixed-point variables (indicated by slashes)

[Figure: BPSK system block diagram with source data (1 or -1), carrier, AWGN, integration and dump, decision, and BER measurement]
19
Case Study II Gradient-Based Search
Optimize Fixed-Point Wordlengths
  • CDM could lead to lower complexity and lower
    number of simulations compared to DM and CM

Search Method  Gradient Measure  Number of System Simulations  Complexity Estimate (LUT)  Distortion (BER)
Gradient       DM                66                            40.65                      0.083
Gradient       CDM               65                            43.65                      0.085
Gradient       CM                193                           41.95                      0.081
Complete       -                 65536                         -                          -

Maximum distortion, measured by bit error rate (BER), is 0.1
20
Case Study II Genetic Algorithm
Optimize Fixed-Point Wordlengths
  • Search Pareto optimal set
  • Handles multiple objectives

For comparison (gradient-based search results):
Measure  BER    LUT
DM       0.083  40.65
CDM      0.085  43.95
CM       0.081  41.95

[Figure: Pareto fronts of error (bit error rate) at the 50th generation (4,500 simulations), 100th generation (9,000 simulations), and 200th generation (18,000 simulations)]
Preliminary results
Population for one generation: 90
LUT: lookup table
21
Comparison of Proposed Methods
Optimize Fixed-Point Wordlengths
                       Gradient-based search  Genetic algorithm
Type of Solution       One point              Family of points
Tradeoff Curve Found   No                     Yes
Execution Time         Short                  Long
Amount of Computation  Low                    High
Parallelism            Low                    High
22
Outline
  • Introduction
  • Background
  • Optimize fixed-point wordlengths
  • Reduce power consumption in arithmetic
  • Automate transformations of systems
  • Conclusion

23
Lower Power Consumption in DSP
Reduce Power Consumption in Arithmetic
  • Minimize power dissipation due to limited battery
    power and cooling system
  • Multipliers often a major source of dynamic power
    consumption in typical DSP applications
  • Multi-precision multiplier: select smaller
    multipliers (8, 16 or 24 bits) to reduce power
    consumption
  • Wordlength reduction to select any word size
    (Han, Evans, and Swartzlander, 2004)
  • In general, what reductions in power are possible
    in software when hardware has fixed wordlengths?

24
Wordlength Reduction in Multiplication
Reduce Power Consumption in Arithmetic
  • Input data wordlength reduction
  • Fewer bits can be enough to represent the data,
    e.g. π × π ≈ 9
  • Truncation
  • Signed right shift
  • Move toward the least significant bit (LSB)
  • Signed bit extended for arithmetic right shift

[Figure: signed right shift with the sign bit extended]
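The two reduction methods can be illustrated on 16-bit two's-complement words. This is a sketch with illustrative function names; it relies on Python's `>>` being an arithmetic (sign-extending) shift for negative integers.

```python
WORD = 16
MASK = (1 << WORD) - 1

def truncate(x: int, n: int) -> int:
    """Clear the n least significant bits, keeping the upper WORD - n bits."""
    return x & ~((1 << n) - 1)

def signed_right_shift(x: int, n: int) -> int:
    """Arithmetic right shift: bits move toward the LSB, sign bit is replicated."""
    return x >> n

def bits(x: int) -> str:
    """16-bit two's-complement bit pattern of x."""
    return format(x & MASK, f"0{WORD}b")

for value in (4660, -4660):
    print(bits(value), bits(truncate(value, 8)), bits(signed_right_shift(value, 8)))
```

Truncation leaves zeros in the low-order bits, while the signed right shift fills the high-order bits with copies of the sign bit.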
25
Power Reduction via Wordlength Reduction
Reduce Power Consumption in Arithmetic
  • Power consumption
  • Switching power consumption
  • Static power consumption
  • Switching power consumption
  • Switching activity parameter, α
  • Reduce α by wordlength reduction

CL: load capacitance
Vdd: operating voltage
fclk: operating frequency
Relationship between reduced wordlength and
switching parameter α in power consumption?
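The slide's formula is not reproduced in this transcript. The standard first-order expression for CMOS switching power, which uses exactly the symbols defined above, is given here as a likely candidate rather than as a quotation from the slide:

```latex
P_{\text{switching}} = \alpha \, C_L \, V_{dd}^{2} \, f_{clk}
```

Under this expression, lowering the switching activity α lowers switching power proportionally when the other three factors are fixed by the hardware.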
26
Analytical Method
Reduce Power Consumption in Arithmetic
Input                       Switching expectation
Full length (no reduction)  L/2
Truncate N bits             M/2
N-bit signed right shift    L/2

Wordlength L = 16
27
Dynamic Power Consumption for Wallace Multiplier
(1 MHz)
Reduce Power Consumption in Arithmetic
[Figure: dynamic power consumption with truncation of the first argument and of the second argument (recode, nonrecode); reduction of 56%]
16-bit x 16-bit multiplier (simulated on Xilinx
XC3S200-5FT256 FPGA)
Wallace multiplier used in TI 320C64 DSP
28
Dynamic Power Consumption for Radix-4 Modified
Booth Multiplier (1 MHz)
Reduce Power Consumption in Arithmetic
[Figure: dynamic power consumption with truncation of the first argument and of the second argument (recode, nonrecode); reduction of 31%, sensitive by 13%]
16-bit x 16-bit multiplier (simulated on Xilinx
XC3S200-5FT256 FPGA)
Swapping could have benefit
Radix-4 modified Booth multiplier used in TI
320C62 DSP
29
Comparison of Proposed Methods
Reduce Power Consumption in Arithmetic
  • Truncation to 8 bits reduces estimated power
    consumption by 56% in Wallace and 31% in Booth
    16-bit multipliers
  • Signed right shift gives no estimated power
    reduction in the Wallace multiplier (for any shift)
    and a 25% reduction in the Booth multiplier (for an
    8-bit shift)
  • Operand swapping reduces power consumption for
    Booth but has negligible savings for Wallace
    multiplier
  • Power consumption in tree-based multiplier
  • Highly dependent on input data
  • Simulation matches analysis

30
Outline
  • Introduction
  • Background
  • Optimize fixed-point wordlengths
  • Reduce power consumption in arithmetic
  • Automate transformations of systems
  • Conclusion

31
Automating Transformations from Floating Point
to Fixed Point
Automatic Transformations of Systems
  • Existing fixed-point tools
  • Support fixed-point simulation
  • Convert floating-point code to raw fixed-point
    code
  • Manually find optimum wordlength by trial and
    error
  • Automating transformations
  • Fully automate conversion and wordlength
    optimization

[Figure: code conversion followed by wordlength optimization turns a floating-point program into a wordlength-optimized fixed-point program]
32
Automatic Transformation Flow
Automatic Transformations of Systems
  • Code generation
  • Parse floating-point program
  • Generate raw fixed-point program and auxiliary
    programs
  • Range estimation
  • Estimate range to avoid overflow
    (Analytical/Simulation)
  • Determine integer wordlength (IWL)
  • Wordlength optimization
  • Optimize wordlength according to given input and
    error specification (Analytical/Simulation)
  • Determine fractional wordlength (FWL)
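The range-estimation step can be sketched for the simulation-based case. The function below is a hypothetical helper, not the released tool's code: it takes observed signal values and returns the smallest integer wordlength (sign bit included) whose two's-complement range covers the observed peak magnitude.

```python
import math
from typing import Iterable

def estimate_iwl(samples: Iterable[float]) -> int:
    """Smallest IWL (sign bit included) such that every sample lies in
    [-2**(IWL-1), 2**(IWL-1)), based on the peak magnitude observed."""
    peak = max(abs(x) for x in samples)
    if peak == 0:
        return 1                      # only the sign bit is needed
    return max(1, math.floor(math.log2(peak)) + 2)

print(estimate_iwl([3.14159, -2.0]))   # peak 3.14159 fits in [-4, 4)
print(estimate_iwl([4.0]))             # +4.0 does not fit in [-4, 4)
print(estimate_iwl([0.3, -0.1]))       # purely fractional signal
```

The estimate is conservative for a negative peak that is an exact power of two, and it is only as good as the input data used in simulation, since the range depends on input statistics.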

[Figure: flow from code generation to range estimation to wordlength optimization]
33
Automating Transformation Environment for
Wordlength Optimization
Automatic Transformations of Systems
[Figure: block diagram with input data, top program, floating-point program, fixed-point program, search engine (gradient-based or genetic algorithm), evaluation program (objectives) with range estimation, error estimation, and complexity estimation, and the resulting optimum wordlength]
  • Given a floating-point program and options,
    auxiliary programs are automatically generated
  • Given input data, the optimum wordlength is searched

34
Demo of Released Software
Automatic Transformations of Systems
35
Conclusion
Conclusion
  • Search for optimum wordlength
  • Gradient-based search reduces execution time
    while solutions could be trapped in local optimum
  • Genetic algorithm can find distortion vs.
    complexity tradeoff curve, but it requires longer
    execution time
  • Reduce power consumption by wordlength reduction
    of multiplicands
  • Automate transformations from floating-point
    programs to fixed-point programs
  • Freely distributable software release available at

http://www.ece.utexas.edu/~bevans/projects/wordlength/converter/
36
Future Work
Conclusion
  • Advanced wordlength search algorithms
  • Hybrid wordlength optimization
  • Prune redundant wordlength variables (e.g. delay,
    adder)
  • Adaptive step size for gradient-based search
    methods
  • Further analysis on search algorithms
  • Analysis of genetic algorithms with different
    settings
  • Comparison with simulated annealing
  • Low power consumption
  • System level including memory (Powell and Chau,
    1991)
  • Wordlength reduction for floating-point
    multipliers

37
Future Work (continued)
Conclusion
  • Electronic design automation software
  • Enhanced code generator (e.g. rounding
    preferences)
  • Hybrid analytical/simulation range estimation
  • Optimum DSP algorithms
  • Rearranging subsystems at block diagram
  • Rearranging mathematical expressions in algorithm
  • Developing more sophisticated hardware area
    models
  • Avoids having to route each design through
    synthesis tools
  • Transcendental functions

38
End
Thank you!
39
Backup Slides
Backup Slides
40
Publications-I
Publications
  • Conference Papers
  • K. Han, A. G. Olson, and B. L. Evans, "Automatic
    floating-point to fixed-point transformations,"
    Proc. IEEE Asilomar Conf. on Signals, Systems,
    and Computers, Nov. 2006, Pacific Grove, CA USA,
    invited paper.
  • K. Han, B. L. Evans, and E. E. Swartzlander, Jr.,
    "Low-Power Multipliers with Data Wordlength
    Reduction," Proc. IEEE Asilomar Conf. on
    Signals, Systems, and Computers, Oct. 30-Nov. 2,
    2005, pp. 1615-1619, Pacific Grove, CA USA.
  • K. Han, B. L. Evans, and E. E. Swartzlander, Jr.,
    "Data Wordlength Reduction for Low-Power Signal
    Processing Software," Proc. IEEE Work. on Signal
    Processing Systems, Oct. 13-15, 2004, pp.
    343-348, Austin, TX USA.
  • K. Han and B. L. Evans, "Wordlength Optimization
    with Complexity-And-Distortion Measure and Its
    Applications to Broadband Wireless Demodulator
    Design," Proc. IEEE Int. Conf. on Acoustics,
    Speech, and Signal Proc., May 17-21, 2004, vol.
    5, pp. 37-40, Montreal, Canada.
  • K. Han, I. Eo, K. Kim, and H. Cho, "Numerical
    Word-Length Optimization for CDMA Demodulator,"
    Proc. IEEE Int. Sym. on Circuits and Systems,
    May 2001, vol. 4, pp. 290-293, Sydney,
    Australia.
  • K. Han, I. Eo, K. Kim, and H. Cho, "Bit
    Constraint Parameter Decision Method for CDMA
    Digital Demodulator," Proc. CDMA Int. Conf.
    Exhibition, Nov. 2000, vol. 2, pp. 583-586,
    Seoul, Korea.
  • S. Nahm, K. Han, and W. Sung, "A CORDIC-based
    Digital Quadrature Mixer: Comparison with
    ROM-based Architecture," Proc. IEEE Int. Sym. on
    Circuits and Systems, Jun. 1998, vol. 4, pp.
    385-388, Monterey, CA USA.

41
Publications-II
Publications
  • Journal Articles
  • K. Han and B. L. Evans, "Optimum Wordlength
    Search Using A Complexity-And-Distortion
    Measure," EURASIP Journal on Applied Signal
    Processing, special issue on Design Methods for
    DSP Systems, vol. 2006, no. 5, pp. 103-116, 2006.
  • Other Publications
  • K. Han, E. Soo, H. Jung, and K. Kim, "Apparatus
    and Method for Short-Delay Multipath Searcher in
    Spread Spectrum Systems," U.S. Patent pending, Nov.
    2001.
  • K. Han, I. Lim, E. Soo, H. Seo, K. Kim, H. Jung,
    and H. Cho, "Apparatus and Method for Separating
    Carrier of Multicarrier Wireless Communication
    Receiver System," U.S. Patent pending, Sep. 2001.
  • K. Han, "Carrier Synchronization Scheme Using
    Input Signal Interpolation for Digital
    Receivers," Master's Thesis, Seoul National
    University, Seoul, Korea, Feb. 1998.

42
Research on Transformation
Backup
43
Simulation Flow
Backup
[Figure: simulation flow. Gradient-based search algorithm: set up desired specification, search wordlength set, generate optimized fixed-point program. Genetic search algorithm: search wordlength sets, generate Pareto front, pick one of the sets, generate optimized fixed-point program.]
44
Algorithm Design and Implementation
Backup
[Figure: path from algorithm design to algorithm implementation. Floating-point programs run on a floating-point processor; code conversion gives uniform-wordlength fixed-point programs for a fixed-point processor; wordlength optimization gives optimized fixed-point programs for a fixed-point IC. Scales marked low and high compare hardware complexity, power consumption, and design time.]
45
Wordlength Optimization Constraints
Backup
  • Distortion constraint
  • Complexity constraint

[Figure: two plots of application-specific distortion d(w) versus implementation complexity c(w), one marking the distortion constraint Dmax and one marking the complexity constraint Cmax]
46
Gradient-Based Search
Backup
  • Gradient information can be used for update
    direction
  • Gradient information is measured in design
    parameters such as implementation complexity,
    precision distortion, or power consumption
  • Complexity measurement (CM) (Sung and Kum, 1995)
  • Distortion measurement (DM) (Han et al., 2001)
  • Complexity-and-distortion measurement (CDM) (Han
    and Evans, 2004) (proposed)

47
Gradient Information
Backup
[Figure: objective values on a grid of wordlengths w1 and w2, with the gradient and the resulting search direction marked]
N: number of variables
h: iteration index
n: variable index
w: wordlength vector
f(w): objective function
48
Gradient-Based Search Direction
Backup
  • Wordlength update (s: step size)
  • Direction
  • where

Finite Difference
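The slide's equations are not reproduced in this transcript. One plausible reading, using the symbols h, n, w, f(w), and s defined in this backup section and introducing the unit vector e_n as an assumption, is a coordinate-wise update driven by a finite-difference gradient of an objective that is being minimized:

```latex
\begin{aligned}
\mathbf{w}^{(h+1)} &= \mathbf{w}^{(h)} + s\,\mathbf{d}^{(h)}, \\
\mathbf{d}^{(h)} &= \mathbf{e}_{n^{*}}, \qquad
n^{*} = \arg\min_{1 \le n \le N} \nabla f_{n}\!\left(\mathbf{w}^{(h)}\right), \\
\nabla f_{n}(\mathbf{w}) &\approx f(\mathbf{w} + \mathbf{e}_{n}) - f(\mathbf{w})
\end{aligned}
```

Here e_n is the vector with 1 in position n and 0 elsewhere, so each iteration lengthens exactly one wordlength by s bits.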
49
Complexity and Distortion Function
Backup
  • Complexity function, c(w)
  • Number of multiplications is counted
  • Hardware complexity is estimated by assuming that
    complexity linearly increases as wordlength
    increases
  • Given hardware model results in accurate
    complexity
  • Distortion function, d(w)
  • Difficult to derive closed-form mathematical
    expression
  • Estimated by computer simulation measuring output
    SNR or bit error rate in digital communication
    systems

50
Complexity Measure (Sung and Kum, 1995)
Backup
  • Uses complexity sensitivity information as
    direction to search for optimum wordlength
  • Advantage: minimizes complexity
  • Disadvantage: demands a large number of iterations

Objective function
Optimization problem
Update direction
51
Distortion Measure (Han et al., 2001)
Backup
  • Applies the application performance information
    to search for the optimum wordlengths
  • Advantage: fewer iterations
  • Disadvantage: not guaranteed to yield optimum
    wordlength for complexity

Objective function
Optimization problem
Update direction
52
Feasible Solution Search (Sung and Kum, 1995)
Backup
  • Exhaustive search of all possible wordlengths
  • Advantages
  • Does not miss optimum points
  • Simple algorithm
  • Disadvantage
  • Many trials (experiments)
  • Distance
  • Expected number of iterations

[Figure: direction of full search]
Minimum wordlengths: (2,2)
Optimum wordlengths: (5,5)
d = 6
Trials = 24
53
Sequential Search (K. Han et al., 2001)
Backup
  • Greedy search based on sensitivity information
    (gradient)
  • Example
  • Minimum wordlengths: (2,2)
  • Direction of sequential search
  • Optimum wordlengths: (5,5)
  • 12 iterations
  • Advantage: fewer trials
  • Disadvantage: could miss global optimum point

54
Case Study Receiver Design
Backup
[Figure: block diagram. Transmitter: data, encoder, multicarrier modulator. Wireless channel. Receiver: multicarrier demodulator, channel estimator, channel equalizer, bit error rate tester.]
w0: input wordlength of a multicarrier demodulator which performs a fast Fourier transform (FFT)
w1: input wordlength of equalizer
w2: input wordlength of channel estimator
w3: output wordlength of channel estimator
55
Simulation Results
Backup
  • CDM leads to lower complexity compared to DM
  • CDM reduces the number of trials compared to CM,
    feasible solution search (Sung and Kum, 1995), and
    exhaustive search
  • Fast searching

Search Method  Gradient Measure  αc   Number of Trials  Simulations  Wordlength for Variables  Complexity Estimate  Distortion (BER)
Gradient       DM                0    16                64           10,9,4,10                 10781                0.0009
Gradient       CDM               0.5  15                60           7,10,4,6                  7702                 0.0012
Gradient       CM                1    69                69           7,7,4,6                   7699                 0.0015
Feasible       -                 -    210               210          7,7,4,6                   7699                 0.0015
Exhaustive     -                 -    26364             26364        -                         -                    -

Required BER: 1.5 x 10^-3
56
Simulation Environments
Backup
  • Assumptions
  • Internal wordlengths of blocks have been decided
  • Complexity increases linearly as wordlength
    increases
  • Required application performance
  • Bit error rate of 1.5 x 10^-3 (without error
    correcting codes)
  • Simulation tool
  • LabVIEW 7.0

Complexity vector
Input              Weight
FFT                1024
Equalizer (right)  1
Estimator          128
Equalizer (upper)  2

Complexity: C(w) = c^T w
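The linear complexity model can be checked against the complexity estimates reported in the Simulation Results table. The ordering of the weights to match (w0, w1, w2, w3) is an assumption; it is the ordering that reproduces the reported values.

```python
from typing import Sequence

# Weights in the order (w0, w1, w2, w3):
# FFT, Equalizer (right), Estimator, Equalizer (upper). Ordering is assumed.
COMPLEXITY_WEIGHTS = (1024, 1, 128, 2)

def complexity(w: Sequence[int]) -> int:
    """Linear complexity model C(w) = c^T w."""
    return sum(c * bits for c, bits in zip(COMPLEXITY_WEIGHTS, w))

print(complexity((10, 9, 4, 10)))  # DM result
print(complexity((7, 10, 4, 6)))   # CDM result
print(complexity((7, 7, 4, 6)))    # CM and feasible-search result
```

Because the FFT input carries a weight of 1024, reducing w0 from 10 to 7 bits accounts for almost all of the difference between the DM and CDM results.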
57
FFT Cost
Backup
  • N-tap FFT cost
  • 256-tap FFT cost

58
Minimum Wordlengths
Backup
  • Change one wordlength variable while keeping
    other variables at high precision
  • (1,16,16,16), (2,16,16,16), ...
  • (16,1,16,16), (16,2,16,16), ...
  • (16,16,16,15), (16,16,16,16)
  • Minimum wordlength vector is (5,4,4,4)

59
Number of Trials
Backup
  • Start at wordlength vector (5,4,4,4)
  • Next wordlength vectors for complexity measure
    (α = 1.0)
  • (5,4,4,4)
  • (5,5,4,4)
  • Increase wordlength one-by-one until satisfying
    required application performance

60
Power Consumption
Low-Power Signal Processing
  • Power consumption in CMOS circuits
  • Significant power in CMOS circuits is
    dissipated when they are switching
  • Power reduction in hardware part (Chandrakasan
    and Brodersen, 1995)
  • Scaling down, minimizing area
  • Adjusting voltage and frequency during operation
  • Power reduction in software part (Tiwari, Malik,
    and Wolfe, 1994; Lee et al., 1997)
  • Instruction ordering and packing
  • Energy reductions varying from 26% to 73%

61
Wordlength for Low-Power Consumption
Low-Power Signal Processing
  • Power model of wordlength (Choi and Burleson,
    1994)
  • Wordlength is considered as capacitance
  • Power consumption is proportional to wordlength
  • Switching activity is not considered
  • Data wordlength reduction technique (Han, Evans,
    and Swartzlander, 2004) (proposed)
  • Count node transitions for switching activity
  • Reduce input data wordlength to decrease power
    consumption

62
Dynamic and Static Power
Backup
[Figure: trends in dynamic and static power dissipation,
showing increasing contribution of static power]
Source: S. Thompson, P. Packan, and M. Bohr, "MOS
Scaling: Transistor Challenges for the 21st
Century," Intel Technology Journal, Q3 1998
63
Power Dissipation of Multiplier Unit
Backup
  • Multiply unit is usually a major source of power
    consumption in typical DSP applications
  • Multiply unit required for digital
    communication and digital signal processing
    algorithms
  • Digital filters, equalizers, FFT/IFFT, digital
    down/upconverter, etc.

TMS320C5x Power Dissipation Characteristics from
www.ti.com
64
Wallace vs. Booth Multipliers
Backup
[Figure: tree dot diagram of a 4-bit Wallace multiplier (symmetric) and a radix-4 multiplier based on Booth's recoding (asymmetric, one operand recoded)]
65
Radix-4 Modified Booth Multiplier
Backup
  • One multiplicand is recoded
  • The a and x are multiplicands
  • P is product of multiplication
  • Three bits in x are recoded to z

66
Switching Activity in Multipliers
Backup
  • Logic delay and propagation cause glitches
  • Proposed analytical method
  • Hard to estimate glitches in closed form
  • Analyze switching activity w/r to input data
    wordlength
  • Does not consider multiplier architecture
  • Simulation method
  • Count all switching activities (transition counts
    in logic)
  • Power estimation (Xilinx XPower)
  • Considers multiplier architecture

67
Analytical Method
Reduce Power Consumption in Arithmetic
  • Stream of data for one multiplicand
  • Compare two adjacent numbers in stream after
    reduction
  • Expectation of bit switching, x, with probability
    Px
  • L-bit input data
  • Truncate input data to M bits (remove N bits)
  • N-bit signed right shift in L-bit input (Y is
    sign bit)

68
Analytical Method
Backup
X has binomial distribution
Always L/2 (independent of M and N)
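The expectations of the analytical method (L/2 for full-length data, M/2 after truncation to M bits, L/2 after an N-bit signed right shift) can be checked with a small Monte Carlo sketch. It assumes uniformly random, independent 16-bit inputs and counts bit flips between adjacent words in the stream; the names and the sample size are illustrative.

```python
import random
from typing import Callable

L = 16                      # input wordlength
N = 8                       # bits removed or shifted
M = L - N                   # bits kept by truncation
MASK = (1 << L) - 1

def hamming(a: int, b: int) -> int:
    """Number of bit positions that differ between two L-bit words."""
    return bin((a ^ b) & MASK).count("1")

def mean_switching(reduce: Callable[[int], int],
                   samples: int = 20000, seed: int = 0) -> float:
    """Average number of bit flips between adjacent words of a random stream."""
    rng = random.Random(seed)
    stream = [reduce(rng.randrange(-(1 << (L - 1)), 1 << (L - 1)))
              for _ in range(samples)]
    flips = [hamming(a, b) for a, b in zip(stream, stream[1:])]
    return sum(flips) / len(flips)

full = mean_switching(lambda x: x)                       # expect about L/2
truncated = mean_switching(lambda x: x & ~((1 << N) - 1))  # expect about M/2
shifted = mean_switching(lambda x: x >> N)               # expect about L/2
print(full, truncated, shifted)
```

The shifted stream still averages about L/2 because the replicated sign bits all toggle together whenever the sign changes, which is consistent with the observation that signed right shift gives no estimated power reduction at the inputs.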
69
Power Reduction in TI DSP
Backup
  • TI TMS320VC5416 DSP STARTER KIT
  • Radix-4 modified Booth multiplier
  • Measure average current for wordlength reduction
    of multiplicands

loop:   STM   #data_a, AR2      ; AR2 points to data_a
        STM   #data_b, AR3      ; AR3 points to data_b
        MPY   *AR2, *AR3, A     ; A = data_a x data_b
        MPY   *AR2, *AR3, A
        ...
        MPY   *AR2, *AR3, A
        B     loop              ; repeat
Assembly program (data_a and data_b have random
data with wordlength w)
70
Code Generation for Fixed-Point Program
Backup
  • Adder function in MATLAB

(a) Floating-point program for adder

function c = adder(a, b)
c = 0;
c = a + b;

(b) Raw fixed-point program (wordlengths determined by designers with trial and error)

function c = adder_fx(a, b)
c = 0;
a = fi(a, 1, 32, 16);
b = fi(b, 1, 32, 16);
c = fi(c, 1, 32, 16);
c(:) = a + b;   % assign into c so that it keeps its fixed-point type

(c) Converted fixed-point program for automating optimization

function c = adder_fx(a, b, numtype)
c = 0;
a = fi(a, numtype.a);
b = fi(b, numtype.b);
c = fi(c, numtype.c);
c(:) = a + b;

fi(a, S, WL, FWL) is a constructor function for a
fixed-point object in the Fixed-Point Toolbox
(S: signed, WL: wordlength, FWL: fraction length)
71
Code Generation
Backup
<Run Code Generation>
<Floating-point Program>
72
Running Transformation
Backup
  • Just call top function with input data
  • Range and optimum wordlengths depend on input
    statistic

>> in = rand(1,1000);
>> mac_top(in)
73
Advantages/disadvantages of wordlength search
algorithms
Backup