Closing the Power Gap between ASIC and Custom

2
Closing the Power Gap between ASIC and Custom
  • David Chinnery, Kurt Keutzer

3
Outline
  • Motivation for focusing on reducing ASIC power
  • The power gap between ASIC and custom
  • Where does the power go?
  • What can we do about it?
  • Conclusions on automating low power techniques

4
Why power?
  • Battery life is limited by power (e.g. laptop,
    mobile phone)
  • Costs for packaging and cooling increase rapidly
    with power dissipation (e.g. plastic vs. ceramic
    package, heatsink, fan)
  • Higher temperatures degrade performance and
    reliability
  • Circuits are slower, with more leakage, at higher
    temperature
  • Less reliable due to increased rate of
    electromigration
  • Increasing integration increases power demand in
    portable applications (e.g. mp3 player/PDA/mobile
    phone combined)
  • Performance is limited by power now even for high
    end microprocessors

5
Power of high performance chips has increased
  • As device dimensions (W, L, Tox) are scaled down by
    a factor k, for high performance:
  • If supply voltage Vdd and threshold voltage Vth are
    fixed, then power/unit area ∝ k^3
  • If Vdd and Vth are scaled down linearly and [...],
    then power/unit area ∝ k^0.7
  • Further voltage scaling may be limited
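
As a rough illustration of the two scaling regimes on this slide, the sketch below evaluates how power per unit area grows for a given scaling factor k. The function name and the sample value k = 1.4 are illustrative assumptions, not taken from the slides.

```python
def power_density_growth(k: float, exponent: float) -> float:
    """Relative growth in power per unit area when dimensions shrink by a factor k."""
    return k ** exponent

k = 1.4  # hypothetical scaling factor, roughly one process generation
fixed_voltage = power_density_growth(k, 3.0)   # Vdd and Vth held fixed: k^3
scaled_voltage = power_density_growth(k, 0.7)  # Vdd and Vth scaled down: k^0.7
print(f"fixed voltages:  x{fixed_voltage:.2f}")
print(f"scaled voltages: x{scaled_voltage:.2f}")
```

Even with voltage scaling the power density still grows, which is the motivation for the remaining slides.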

6
Impact of voltage scaling on power
  • Major components of power: Ptotal = Pdynamic +
    Pleakage
  • Dynamic power due to switching of capacitances
  • Reducing Vdd gives quadratic reduction in
    Pdynamic
  • But transistor drive current depends on Vdd
  • Must reduce Vth to maintain drive current
  • But reducing Vth increases subthreshold leakage
    current, which is the major contributor to
    Pleakage
  • Must look for other ways to reduce power
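
The quadratic dependence of dynamic power on Vdd can be sketched as follows. This assumes the common switching-power model P = a × C × Vdd^2 × f (some texts include a factor of 1/2, depending on how activity is defined). The numeric values are hypothetical.

```python
def dynamic_power(activity: float, capacitance_f: float, vdd: float, freq_hz: float) -> float:
    """Switching power in watts, assuming P = a * C * Vdd^2 * f."""
    return activity * capacitance_f * vdd ** 2 * freq_hz

def vdd_scaling_ratio(vdd_new: float, vdd_old: float) -> float:
    """Ratio of dynamic power after/before a Vdd change, all else equal."""
    return (vdd_new / vdd_old) ** 2

# Hypothetical block: activity 0.1, 1 nF switched capacitance, 500 MHz.
p_before = dynamic_power(0.1, 1e-9, 1.2, 500e6)
p_after = dynamic_power(0.1, 1e-9, 1.0, 500e6)
print(f"{p_before * 1e3:.1f} mW -> {p_after * 1e3:.1f} mW")
print(f"ratio: {vdd_scaling_ratio(1.0, 1.2):.3f}")
```

The model ignores the accompanying loss of drive current, which is why Vth must also drop and leakage rises.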

Chen in Trans. On Electron Devices 1997
7
Automate low power techniques
  • Custom designers can try to optimize the design
    at all levels
  • Electronic design automation (EDA) tools for
    ASICs
  • Most of the design optimization is high level
  • Fast time-to-market and lower design cost
  • Increasingly important to reduce design cost for
    larger chips
  • What is the power gap between (automated) ASIC
    design and custom design?
  • We need to characterize the contributing factors
  • Can we close the power gap?
  • Identify custom techniques that can be used in an
    EDA flow

9
What is our metric for power?
  • Power
  • Fixed performance constraint (clock frequency or
    throughput e.g. 30 frames/s for MPEG2)
  • Reduce the power and meet the performance
    constraint
  • Energy efficiency
  • No performance constraint
  • Throughput/unit power, 1/(P × T × CPI), e.g. MIPS/mW
  • Cycles per instruction (CPI) accounts for impact
    of architectural choices (e.g. stalled pipeline
    stages)
  • Energy/operation is the inverse of
    throughput/unit power
  • Maximize throughput/unit power or minimize
    energy/operation
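
A minimal sketch of the two metrics, reading P as power, T as the clock period, and CPI as cycles per instruction (this reading of the symbols is an assumption). The operating point used is hypothetical.

```python
def mips_per_mw(power_mw: float, clock_period_ns: float, cpi: float) -> float:
    """Throughput per unit power: 1 / (P * T * CPI), expressed in MIPS/mW."""
    mips = 1e3 / (clock_period_ns * cpi)  # instructions/s divided by 1e6
    return mips / power_mw

def energy_per_instruction_nj(efficiency_mips_per_mw: float) -> float:
    """Energy/operation is the inverse: 1 MIPS/mW equals 1e9 instructions per joule."""
    return 1.0 / efficiency_mips_per_mw

# Hypothetical core: 200 mW, 2 ns clock period (500 MHz), CPI of 1.25.
eff = mips_per_mw(power_mw=200.0, clock_period_ns=2.0, cpi=1.25)
print(f"{eff:.2f} MIPS/mW, {energy_per_instruction_nj(eff):.2f} nJ/instruction")
```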

10
What is the power gap? ARM cores
  • ×2 to ×3 gap between custom and hard macro ARMs
  • ×1.3 to ×1.4 gap between ARM7TDMI-S and ARM7TDMI
  • ×3 to ×4 overall from synthesizable to custom ARMs

11
What is the power gap? DCT/IDCT blocks
  • ×4 to ×7 gap among discrete cosine transform (DCT)
    and inverse discrete cosine transform (IDCT)
    blocks, after scaling linearly for technology
    [Fanucci ICECS 2002]
  • We assumed power reduces linearly with technology
  • To get 30 frames/s MPEG2 with a general purpose
    processor would require two ARM9 cores and would
    consume ×15 the power [Fanucci ICECS 2002]
  • Application-specific hardware substantially
    reduces power

13
Breakdown of power by functionality
  • Typical breakdown of on-chip power consumption
    for an embedded microprocessor
  • Clock: 20% to 40%
  • Memory: 20% to 40%
  • Control and datapath: 40% to 60%
  • Input/output to off-chip: 5%
  • Most of power is in datapath, control, clock tree
    and memory
  • Techniques focus on reducing this power
  • Several companies provide custom memory for ASIC
    processes, so we won't discuss memory here

14
Summary of factors' effect on active power
  • Automated designs are higher power than custom
    because of:

                                                 ASIC design quality
  Factor                                         typical   excellent
  Microarchitecture (pipelining, parallelism)    ×2.6      ×1.3
  Clock gating and power gating                  ×1.6      ×1.0
  Logic design                                   ×1.2      ×1.0
  High speed logic styles (DCVSL, PTL, domino)   ×1.3      ×1.3
  Technology mapping                             ×1.4      ×1.0
  Cell sizing and wire sizing                    ×1.6      ×1.1
  Voltage scaling, multi-Vth, multi-Vdd          ×4.0      ×1.0
  Floorplanning and placement                    ×1.5      ×1.1
  Process variation and process technology       ×2.6      ×1.2

17
Microarchitecture
leverage for voltage scaling and sizing
  • Increase throughput/cycle to allow Vdd reduction
  • Pipelining inserts registers, increasing
    throughput
  • Limited by
  • Reduction in instructions/cycle (1/CPI) due to
    branch misprediction, waiting to read or write
    memory, etc.
  • Power and delay for registers, data forwarding
    logic, and branch prediction
  • Parallelism increases throughput in exchange for
    increased area
  • Limited by
  • Routing, multiplexing, control overheads

Figure label: insert registers
18
Microarchitecture pipelining model
leverage for voltage scaling and sizing
  • Pipeline power model [Hartstein 2003]
  • n stages, latch growth factor 1.1 vs. n, 0.05 for
    register power
  • Minimum stage delay
  • ASIC: t_pipelining overhead of 10 FO4 (register
    delay) + 10 FO4 (imbalance)
  • Custom: t_pipelining overhead of 2.6 FO4 total,
    same t_combinational of 175 FO4
  • CPI penalty: 0.025/stage for custom, and
    0.05/stage for ASICs
  • Add fits for dynamic and leakage power with
    voltage scaling and sizing
  • At a 40 FO4 delay constraint (500MHz for
    Leff = 0.1um), ASIC is ×2.6 worse
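
To see why the overhead figures matter, here is a minimal sketch that assumes the usual stage-delay model, cycle time = t_combinational / n + t_pipelining overhead (the slide's own formula did not survive in the transcript). It also treats the ASIC overhead as 10 + 10 = 20 FO4 and assumes a base CPI of 1.0; both are assumptions. It does not attempt to reproduce the ×2.6 power figure, which also depends on the voltage scaling and sizing fits.

```python
import math

def min_stages(t_comb_fo4: float, t_overhead_fo4: float, t_cycle_fo4: float) -> int:
    """Smallest stage count n with t_comb / n + t_overhead <= t_cycle."""
    budget = t_cycle_fo4 - t_overhead_fo4
    if budget <= 0:
        raise ValueError("cycle time must exceed the per-stage overhead")
    return math.ceil(t_comb_fo4 / budget)

def cpi(n_stages: int, penalty_per_stage: float, base_cpi: float = 1.0) -> float:
    """CPI with a linear per-stage penalty; base_cpi = 1.0 is an assumption."""
    return base_cpi + penalty_per_stage * n_stages

T_COMB = 175.0   # FO4, same combinational delay for ASIC and custom
T_CYCLE = 40.0   # FO4 delay constraint from the slide

asic_stages = min_stages(T_COMB, 10.0 + 10.0, T_CYCLE)
custom_stages = min_stages(T_COMB, 2.6, T_CYCLE)
print("ASIC:  ", asic_stages, "stages, CPI", round(cpi(asic_stages, 0.05), 3))
print("Custom:", custom_stages, "stages, CPI", round(cpi(custom_stages, 0.025), 3))
```

Under these assumptions the ASIC needs nearly twice as many stages to meet the same cycle time, and pays a larger CPI penalty for each one.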

19
Microarchitecture
leverage for voltage scaling and sizing
  • Custom IDCT pipelining to reduce Vdd
    [Xanthopoulos JSSC99]
  • With pipeline: Vdd = 1.32V, 20% power overhead
  • Without pipeline: Vdd = 2.2V to meet throughput
  • Parallel datapaths [Bhavnagarwala IEEE Trans.
    VLSI00]
  • ×2 to ×4 reduction in power by reducing Vdd,
    enabled by increasing throughput with parallel
    datapaths
  • Microarchitecture speed gap is ×1.8 (typical) to
    ×1.3 (excellent)
  • At a tight delay constraint, this corresponds to
    about ×2.6 to ×1.3 worse power due to higher Vdd,
    lower Vth, and wider gates to compensate
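
The IDCT numbers on this slide can be checked with a back-of-the-envelope calculation. The sketch assumes dynamic power dominates and scales with Vdd^2 at fixed throughput, and that the 20% overhead applies to the pipelined design's power; the function name is illustrative.

```python
def net_power_reduction(vdd_before: float, vdd_after: float, overhead_fraction: float) -> float:
    """Net power reduction factor from lowering Vdd, after paying a power overhead."""
    dynamic_gain = (vdd_before / vdd_after) ** 2
    return dynamic_gain / (1.0 + overhead_fraction)

# Values from the slide: 2.2 V without pipelining, 1.32 V with, 20% overhead.
reduction = net_power_reduction(2.2, 1.32, 0.20)
print(f"net reduction: x{reduction:.2f}")
```

Under these assumptions the pipelined design comes out a little over ×2 lower in power.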

21
Clock gating
×1.6 to ×1.0
  • Clock signal has high activity (2); logic has lower
    activity (0.1)
  • Turn off clocks to inactive modules
  • Some DCT/IDCT registers are active < 3% of the time;
    clock gating and avoiding computation reduces
    power by ×10 [August SOC01]
  • Typical savings are up to ×1.6 power reduction
  • Power minimization tools automatically insert
    gated clocks
  • Designer can make microarchitectural/algorithm
    decisions
  • E.g. reduce precision for DCT/IDCT coefficients
  • Precomputation of control signals reduces power by
    ×1.4 to ×3.3 [Hsu ISLPED02]
  • ASICs can do this
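
The effect of gating a rarely used register bank can be sketched with relative switching power, taken as activity × capacitance. The activities (2 for the clock, 0.1 for logic) and the 3% enable duty come from the slide; the capacitance values are hypothetical, and the model ignores the gating cell's own overhead.

```python
def relative_switching_power(activity: float, capacitance: float) -> float:
    """Switching power in arbitrary units, proportional to activity * capacitance."""
    return activity * capacitance

def gated(clock_power: float, enable_duty: float) -> float:
    """Clock power after ideal gating: the clock only toggles while enabled."""
    return clock_power * enable_duty

CLOCK_CAP = 1.0    # hypothetical clock load of the module (arbitrary units)
LOGIC_CAP = 10.0   # hypothetical logic capacitance (arbitrary units)

clock_power = relative_switching_power(2.0, CLOCK_CAP)
logic_power = relative_switching_power(0.1, LOGIC_CAP)
before = clock_power + logic_power
after = gated(clock_power, 0.03) + logic_power
print(f"clock share before gating: {clock_power / before:.0%}")
print(f"power reduction from gating: x{before / after:.2f}")
```

Because the clock toggles far more often than the logic it drives, a small clock load can still account for most of a module's switching power.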

Figure label: insert clock gating
22
Power gating
reduces leakage in standby
  • Turn off leakage path in inactive modules
  • May need to preserve the state registers
  • Can reduce standby leakage by 3 orders of
    magnitude [Mutoh JSSC95]
  • Other approaches
  • reverse biasing the substrate
  • setting input vectors to low leakage states,
    gives ×1.4 leakage reduction [Lee DAC03]
  • Just now getting ASIC methodology support
  • Need large sleep transistors to turn off power
  • Sleep transistors reduce available supply voltage

24
High speed logic styles
leverage for voltage scaling and sizing
  • Low power designs use mostly static CMOS logic
  • Static CMOS logic is low leakage, robust
  • PMOS pullup series transistors are slow
  • Faster custom logic styles speed up critical paths
  • Custom can use slack from higher speed (×1.4) to
    reduce power by lowering Vdd
  • ASIC power ×1.3 worse than custom at a tight
    delay constraint due to logic style

26
Technology mapping
×1.4 to ×1.0
  • Technology mapping tools don't target low power
  • We found that targeting minimum area for
    multipliers can result in ×1.3 power; delay is a
    poor choice
  • Technology mapping techniques to reduce active
    power
  • ×1.0: ASICs can do as well as custom, if tools
    improve

Figure label: equivalent logic, lower activity
28
Cell sizing and wire sizing
×1.6 to ×1.1
  • ×1.35 power reduction on Xtensa processor at
    325MHz by (mostly sizing) power minimization with
    Design Compiler and a 0.13um library [internship
    at Tensilica]
  • Can do better than Design Compiler (DC) with
    cell sizing via a linear program (LP) (global
    optimization vs. greedy pin-hole
    optimization), about ×1.1 to ×1.2 power
    reduction

29
Cell sizing and wire sizing
×1.6 to ×1.1
  • Cell libraries lack fine-grained sizes and skewed
    P/N drives
  • [Hurat SNUG01] Generating new cells gave ×1.2 power
    reduction and ×1.15 higher speed for a bus
    controller, ×1.4 MHz/mW
  • Simultaneous buffer and wire sizing reduced
    clock tree power by ×2.7 [Gong ISLPED96]
  • ×1.1 to ×1.2 reduction in total power
  • Not available for ASIC interconnect yet
  • Up to ×1.6 gap due to cell sizing and wire
    sizing; can reduce to ×1.1 using a library with
    finely-grained sizes, a good sizing tool, and
    design-specific cells
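
The MHz/mW figure for the bus controller follows from the other two numbers on this slide: efficiency improves by the speedup times the power reduction. A small check, with an illustrative function name:

```python
def efficiency_gain(speedup: float, power_reduction: float) -> float:
    """Improvement in MHz/mW when frequency rises by `speedup` and power falls by `power_reduction`."""
    return speedup * power_reduction

gain = efficiency_gain(speedup=1.15, power_reduction=1.2)
print(f"MHz/mW improvement: x{gain:.2f}")
```

The product is about 1.38, which rounds to the ×1.4 quoted.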

Figure label: optimize transistor sizes
31
Dynamic supply and substrate biasing
×4.0 to ×1.0
  • Change Vdd based on processor load
  • ×10 more energy efficient at low performance
    [Burd ISSCC00]
  • Adaptive voltage scaling with the ARM11 gives
    ×1.7 power reduction for voice, SMS, web
    applications [National Semiconductor, ARM 02]
  • Reduce Vdd and bias substrate to lower Vth
  • ×1.7 reduction in power, same speed [Hamada
    CICC98]
  • Increase Vth in standby to reduce leakage
  • These are complicated to automate for ASICs
  • Dynamic voltage requires accurate knowledge of
    path delays
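
A minimal sketch of load-based supply selection: pick the lowest Vdd whose maximum frequency still meets the current demand. The operating-point table is entirely hypothetical; in a real design it would come from characterized path delays, which is the knowledge the last bullet refers to. Energy per cycle is assumed to scale with Vdd^2.

```python
# Hypothetical (Vdd in volts, maximum clock in MHz) operating points.
OPERATING_POINTS = [(0.8, 200.0), (1.0, 350.0), (1.2, 500.0)]

def pick_operating_point(required_mhz: float, points=OPERATING_POINTS):
    """Return the lowest-Vdd point that still meets the required frequency."""
    feasible = [p for p in points if p[1] >= required_mhz]
    if not feasible:
        raise ValueError("no operating point meets the required frequency")
    return min(feasible, key=lambda p: p[0])

def relative_energy_per_cycle(vdd: float, vdd_max: float) -> float:
    """Dynamic energy per cycle relative to running at vdd_max."""
    return (vdd / vdd_max) ** 2

vdd, fmax = pick_operating_point(150.0)
print(f"run at {vdd} V (up to {fmax} MHz), "
      f"energy/cycle x{relative_energy_per_cycle(vdd, 1.2):.2f} of full voltage")
```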

32
Multiple supply and threshold voltages
×4.0 to ×1.0
  • Basic idea: high speed where critical, low power
    elsewhere
  • Dual Vdd reduces power by ×1.7 after substrate
    biasing/lower Vdd [Usami JSSC98]
  • ×2 reduction in clock tree power by using low Vdd
  • Separate voltage islands: different speeds and
    Vdd [Lackey ICCAD02]
  • Turn off Vdd to modules not in use, reduces
    leakage by ×500
  • ×1.25 to ×3 average power reduction, depending on
    activities
  • Dual Vth can give ×3 to ×6 reduction in
    leakage [Sirichotiyakul DAC99]
  • ASICs are limited to Vdd and Vth offered by
    library and foundry
  • Can't change Vth to design-specific optimal point
  • Standard cell libraries characterized at only two
    or three Vdd
  • Dual Vdd requires level converters and dual Vdd
    layout
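
The "high speed where critical, low power elsewhere" idea can be sketched for dual Vdd as a weighted sum of Vdd^2 terms. The split of switched capacitance, the two voltages, and the level-converter overhead parameter are all hypothetical inputs, and only dynamic power is modelled.

```python
def dual_vdd_power_ratio(low_fraction: float, vdd_low: float, vdd_high: float,
                         converter_overhead: float = 0.0) -> float:
    """Dynamic power relative to a single-Vdd design running entirely at vdd_high.

    low_fraction is the share of switched capacitance moved to vdd_low;
    converter_overhead is an assumed fractional cost for level converters.
    """
    ratio = (1.0 - low_fraction) + low_fraction * (vdd_low / vdd_high) ** 2
    return ratio * (1.0 + converter_overhead)

# Hypothetical design: 60% of switched capacitance is off the critical paths.
ratio = dual_vdd_power_ratio(low_fraction=0.6, vdd_low=0.8, vdd_high=1.2)
print(f"power ratio: {ratio:.3f} (x{1.0 / ratio:.2f} reduction)")
```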

34
Floorplanning and placement
×1.5 to ×1.1
  • Poor floorplanning and cell placement,
    inaccurate wire loads
  • ×1.5 worse power than custom
  • We compared partitioning a design into 50K vs.
    200K gate modules from 0.25um to 0.13um
  • 42% longer wires for 200K partitions
  • Interconnect is 20% to 40% of total power
    [Sylvester ICCAD98]
  • ×1.1 to ×1.2 increase in total power due to
    wiring, and gates will be upsized to drive the
    longer wires
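
The wiring estimate on this slide can be reproduced with simple arithmetic, assuming interconnect power grows in proportion to wire length and ignoring the extra cost of upsized gates. The function name is illustrative.

```python
def total_power_increase(interconnect_fraction: float, wire_length_increase: float) -> float:
    """Total power factor when only the interconnect share grows with wire length."""
    return 1.0 + interconnect_fraction * wire_length_increase

low = total_power_increase(0.20, 0.42)   # interconnect is 20% of total power
high = total_power_increase(0.40, 0.42)  # interconnect is 40% of total power
print(f"total power increase: x{low:.2f} to x{high:.2f}")
```

This gives roughly ×1.08 to ×1.17, in line with the ×1.1 to ×1.2 quoted before gate upsizing is counted.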

Figure labels: automatic place and route; block partitioned
[Hauck Micro. Report 01]
35
Floorplanning and placement
×1.5 to ×1.1
  • Bit slices can reduce wire length by 70% or
    more vs. automated place-and-route
  • up to ×1.4 energy reduction, as faster and lower
    wiring capacitance [Chang SM Thesis MIT98]
  • ×1.5 energy reduction from bit slicing and some
    logic optimization [Stok, Puri, Bhattacharya,
    Cohn]
  • Manual place-and-route achieves 10% shorter wires
    and ×1.1 faster, about ×1.1 energy reduction
    [Chang SM Thesis MIT98]
  • ASICs still ×1.1 higher power than custom due to
    layout

Figure labels: automatic place-and-route; tiled bit-slices; custom
37
Process variation impact on power
×2.6 to ×1.2
  • ASICs are designed to work at the worst case
    delay and worst case power corners for the
    process; typical delay and power are less
  • Simulated power was ×1.7 actual power for custom
    DCT/IDCT
  • Up to a factor of ×1.75 between worst and best
    (average power of 80 chip samples in 0.3um)

38
Process variation impact on power
×2.6 to ×1.2
  • Binning would leave a gap of ×1.4 between low and
    high bins
  • We found a gap of ×1.2 between low speed (high
    power) and high speed (low power, after derating
    for Vdd and frequency) bins of 0.18um and 0.13um
    Intel and AMD PC chips
  • ASICs don't speed bin (they scan test, no speed
    test)

Figure labels: ×1.4; low power bin; higher power bin
39
Process technology
×2.6 to ×1.2
  • Low power libraries are more expensive
  • 5% to 10% transistor width shrinks to reduce
    capacitances
  • Copper is 40% lower resistivity than aluminum
  • Low-k dielectric reduces wire capacitances; we
    estimate about a ×1.1 reduction in total power
    with a low-k dielectric
  • Silicon-on-insulator is ×1.1 to ×1.3 faster, ×1.4
    power reduction [Narendra Symp. VLSI 2001]
  • We compared cell libraries in UMC 0.13um vs. IBM
    0.13um process
  • IBM cells about ×1.05 faster, ×1.6 higher active
    power; UMC had 17 [...] leakage
  • Overall impact of process variation and
    technology
  • ×2.6 ASIC power relative to custom for worst case
    conditions and a cheap process
  • ×1.2 in a low power process, typical conditions,
    no speed binning

41
Low power design conclusions
  • Typical ASIC is ×3 to ×7 less energy efficient
    than custom
  • We assumed ASIC and custom designs can use the
    same microarchitectural and logic design
    techniques. These are the biggest levers for
    reducing power.
  • Can get 10× or more going from general purpose
    hardware to application-specific hardware.
  • E.g. Fast Fourier transform implementations as
    discussed in Andrew Chang's paper.
  • The largest factor for the power gap is voltage
    scaling, responsible for up to ×4
  • Process and microarchitecture can be large
    factors, about ×2.6 each

42
Low power design conclusions
  • By incorporating custom techniques, ASICs can get
    within
  • ×3 at a high performance target
  • Can't use custom logic styles
  • ASIC speed penalty drags down efficiency, as
    higher Vdd, lower Vth, and upsized gates are
    needed to meet performance target
  • ×1.5 at a lower performance target (2× slower)
  • Make full use of scaling down Vdd and Vth

43
Low power ASIC design example
  • 0.13um DSP example [Stok, Puri, Bhattacharya,
    Cohn]
  • 240,000 gates implementing Hilbert transform, FIR
    filter, and fast Fourier transform, with 42KB
    register array
  • Technology mapping, logic design (carry save
    adders), bit-slicing, physical synthesis gave
    ×1.86 increase in efficiency
  • A fine grained standard cell library gave another
    ×1.16
  • Voltage scaling gave another factor of ×1.46
  • ×3.1 increase in MHz/mW overall
  • The third speaker, Ruchir Puri, will discuss some
    of their recent low power work at IBM.
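
The overall figure for the DSP example is the product of the three successive gains listed on this slide, as this short check shows (variable names are illustrative):

```python
import math

# Successive efficiency gains quoted on the slide.
gains = {
    "mapping, logic design, bit-slicing, physical synthesis": 1.86,
    "fine grained standard cell library": 1.16,
    "voltage scaling": 1.46,
}

overall = math.prod(gains.values())
print(f"overall MHz/mW increase: x{overall:.2f}")
```

The product is about 3.15, consistent with the ×3.1 overall increase reported.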

44
Extra slides
45
Impact of voltage scaling on power
  • Ptotal = Pdynamic + Pshort circuit + Pstatic
  • Short circuit power when switching is 10% or less
    of Ptotal
  • Dynamic power due to switching of capacitances
  • Reducing Vdd gives quadratic reduction in
    Pdynamic
  • But transistor drive current depends on Vdd
  • Must reduce Vth to maintain drive current
  • But reducing Vth increases subthreshold leakage
    current, which is the major contributor to
    Pstatic
  • (Clock frequency f, gate switching activity a,
    capacitance C, transistor length L, transistor
    gate oxide thickness Tox, temperature T, and
    constants b, t, Io, and m.)

Figure label: dynamic power
Chen in Trans. On Electron Devices 1997
46
ITRS leakage power trends
  • Can't scale down Vth much further due to large
    subthreshold leakage currents
  • Gate tunneling leakage through thin gate oxide
    Tox is also becoming a significant cause of
    leakage
  • Further Vdd voltage scaling will be limited
  • Must also look to other low power techniques

Figure labels: fast, low Vth; slow, high Vth; leakage increasing
From International Technology Roadmap for
Semiconductors data for 2001-2016 (assuming
activity of 0.1, ignoring interconnect).
47
Summary of factors affecting (active) power
  • Automated designs are higher power than custom
    because of:

                                                 ASIC design quality
  Factor                                         typical   excellent
  Microarchitecture (pipelining, parallelism)    ×2.6      ×1.3
  Memory                                         ×1.4      ×1.0
  Clock gating and power gating                  ×1.6      ×1.0
  Logic design                                   ×1.2      ×1.0
  High speed logic styles (DCVSL, PTL, domino)   ×1.3      ×1.3
  Technology mapping                             ×1.4      ×1.0
  Cell sizing and wire sizing                    ×1.6      ×1.1
  Voltage scaling, multi-Vth, multi-Vdd          ×4.0      ×1.0
  Floorplanning and placement                    ×1.5      ×1.1
  Process variation and process technology       ×2.6      ×1.2

48
Memory: reduce cache misses
×1.4 to ×1.0
  • Larger caches consume more power, but reduce
    cache misses
  • Pipeline stalls, waits many cycles for read/write
    to off-chip memory
  • Caches with higher associativity (e.g. 8-way vs.
    direct mapped) consume more power, also affects
    likelihood of a cache miss
  • [Duarte ASIC/SOC 2001]
  • Sub-banking: only precharge the needed section of
    the cache bank, ×1.32 energy savings
  • Software optimizations to reduce cache misses
    gave on average a ×1.6 reduction in power
  • 90% of the StrongARM area was caches; increasing
    the transistor length in the caches by 12%
    reduced leakage by ×20 [Montanaro JSSC96]
  • ASICs can do this, custom memory is available for
    ASICs

50
Logic design
×1.2 to ×1.0
  • Logic design refers to the topology and logic
    structure to implement functional units
  • Logic switching activity of a carry select adder
    was ×1.8 worse than a 32-bit carry lookahead
    [Callaway VLSI Signal Proc.92]
  • 0.13um 64-bit radix-2 compound domino adder was
    slower and about ×1.3 energy compared to radix-4
    [Zlatanovici ESSC03]
  • We implemented an algorithm to reduce switching
    activity in multipliers, reduced energy by ×1.1
    for 64-bit [Ito ICCD03]
  • Given similar design constraints, ASIC designers
    can choose the same logic design as custom, ×1.0