Power Optimal Dual-Vdd Buffered Tree Considering Buffer Stations and Blockages

Provided by: edaEe
Learn more at: http://eda.ee.ucla.edu
1
Power Optimal Dual-Vdd Buffered Tree Considering
Buffer Stations and Blockages
  • King Ho Tam and Lei He
  • Electrical Engineering Department
  • University of California, Los Angeles
  • Sponsors: NSF CAREER, UC MICRO (Fujitsu, Intel
    and Mindspeed), and IBM Faculty Partner Award

2
Motivation
  • Increasing interconnect power
  • 35% of cells are buffers at 65nm technology [Saxena,
    TCAD 04]
  • Previous work
  • Power-optimal single-Vdd buffer
    insertion [Lillis, JSSC 96]
  • Delay-optimal buffered tree generation [Cong, DAC
    00], [Alpert, TCAD 02]
  • No existing algorithms consider dual-Vdd for
    buffer insertion or buffered tree generation

3
Major Contributions
  • First in-depth study of dual Vdd buffer insertion
    and buffered tree generation
  • Large power saving over single Vdd buffering
  • Efficient algorithms for power optimality
  • 17x faster than [Lillis, JSSC 96] when single Vdd
    is considered

4
Outline
  • Dual Vdd buffer insertion and sizing (DVB)
  • Problem formulation
  • Sampling for speedup
  • Experimental results
  • Dual Vdd buffered tree generation (D-Tree)
  • Problem formulation
  • Improved augmented orthogonal search tree
  • Experimental results

5
Delay, Slew and Power Modeling
  • Elmore delay
  • Wire, buffer
  • Bakoglu's slew metric (ln 9 × Elmore delay)
  • Power: energy per switch
  • Wire
  • Lumped buffer dynamic/short-circuit power
  • Can be easily extended to leakage power
  • Low Vdd (VL) reduces leakage
  • Need to assume a clock rate and switching
    activity
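
The modeling bullets above can be illustrated with a minimal sketch. The formulas are the textbook forms of the Elmore wire delay, a linear buffer delay model, and Bakoglu's ln 9 slew rule; the slide's own expressions did not survive in this transcript, so the function names, the 0.5·C·Vdd² energy convention, and all numeric values below are assumptions chosen only for illustration.

```python
import math

def wire_elmore_delay(r, c, length, c_load):
    """Elmore delay of a uniform RC wire of the given length driving c_load.
    r and c are resistance and capacitance per unit length."""
    return r * length * (c * length / 2.0 + c_load)

def buffer_delay(d_intrinsic, r_out, c_load):
    """Linear buffer model: intrinsic delay plus output resistance times load."""
    return d_intrinsic + r_out * c_load

def slew_from_elmore(elmore_delay):
    """Bakoglu's metric: slew is approximated as ln(9) times the Elmore delay."""
    return math.log(9.0) * elmore_delay

def wire_switch_energy(c, length, vdd):
    """Energy per switch for the wire capacitance (assumed 0.5 * C * Vdd^2)."""
    return 0.5 * c * length * vdd ** 2

# Hypothetical values: 0.2 ohm/um, 0.2 fF/um, a 1000 um wire, a 10 fF load.
r = 0.2
c = 0.2e-15
length = 1000.0
c_load = 10e-15

d = wire_elmore_delay(r, c, length, c_load)
print("wire delay (s):", d)
print("slew estimate (s):", slew_from_elmore(d))
print("energy per switch at 1.0 V (J):", wire_switch_energy(c, length, 1.0))
```

Both the delay and the energy terms are additive along a path, which is what allows them to be accumulated bottom-up in a dynamic program.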

6
Introducing Dual Vdd Buffering
  • Achieves power saving since power ∝ Vdd^2
  • Suffers no loss of delay optimality
  • VL -> VH (a VL buffer driving a VH buffer) requires a
    level converter (LC)
  • Restore voltage level and reduce leakage
  • Ext-CVS for logic [Srivastava, ISLPED 04]
  • LC delay and power overhead amortized
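
As a rough illustration of the quadratic dependence of power on Vdd, the sketch below compares the dynamic energy of the same switched capacitance at two supply levels. The voltages (1.2 V and 0.9 V) are hypothetical and are not taken from the presentation. The result is a per-buffer upper bound under this simple model, not the net-level saving reported later in the deck.

```python
def dynamic_energy(c_switched, vdd):
    """Dynamic switching energy, proportional to C * Vdd^2."""
    return c_switched * vdd ** 2

VH = 1.2  # hypothetical high supply voltage (V)
VL = 0.9  # hypothetical low supply voltage (V)

# Fraction of dynamic energy saved when a capacitance is switched at VL instead of VH.
saving = 1.0 - dynamic_energy(1.0, VL) / dynamic_energy(1.0, VH)
print(f"dynamic energy saving at VL: {saving:.4f}")
```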

7
Key Observation in Dual Vdd Buffering
  • Disallowing VL -> VH will not affect optimality
  • Optimality empirically illustrated (@ 65nm)
  • (a) has an LC and VH drives Cl: power (a) > (b)
  • Delay (b) > (a) only if Cl > 0.5pF (9mm wire)

8
DVB Formulation
  • Dual Vdd Buffer Insertion (DVB)
  • Given interconnect tree
  • Find buffer placement, Vdd assignment for
    buffers, sizes of buffers
  • VH buffers driving VL buffers within the tree
  • Level converters at VH sinks driven by VL buffers
  • Minimize power subject to
  • Required arrival time (RAT) constraint at the source
  • Slew rate constraint at buffer inputs and sinks

9
DVB Algorithm
  • Based on [Lillis, JSSC 96]
  • Dynamic programming with partial solution
    (option) pruning
  • Options must now record downstream Vdd levels for
    buffering
  • This prevents VL -> VH, which removes unnecessary
    search of the solution space
  • Still quite slow for large nets
  • Challenge
  • Considering power causes super-linear growth in
    the number of options (w.r.t. tree size)
  • Dual Vdd buffers: > 2x options at each node
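
The option bookkeeping described above can be sketched as follows. This is a simplified illustration, not the authors' implementation: the `Option` and `Buffer` fields, the constant per-buffer power, and the rule that a less restrictive Vdd flag may dominate a more restrictive one are all assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Option:
    cap: float       # downstream capacitance seen at this node
    power: float     # accumulated downstream power
    rat: float       # required arrival time at this node
    needs_vh: bool   # True if a directly driven downstream buffer is VH

@dataclass(frozen=True)
class Buffer:
    c_in: float
    r_out: float
    d_int: float
    power: float     # simplified: constant power cost per buffer
    is_vh: bool

def add_buffer(opt, buf):
    """Insert buf upstream of opt; return None if a VL buffer would drive a VH buffer."""
    if opt.needs_vh and not buf.is_vh:
        return None
    rat = opt.rat - buf.d_int - buf.r_out * opt.cap
    return Option(buf.c_in, opt.power + buf.power, rat, buf.is_vh)

def dominates(a, b):
    """a is at least as good as b in cap, power, RAT, and Vdd restriction."""
    return (a.cap <= b.cap and a.power <= b.power
            and a.rat >= b.rat and a.needs_vh <= b.needs_vh)

def prune(options):
    """Drop duplicates, then drop every option dominated by another one."""
    unique = list(dict.fromkeys(options))
    return [o for o in unique
            if not any(dominates(k, o) for k in unique if k != o)]

# Hypothetical buffers and a hypothetical unbuffered option.
vh = Buffer(c_in=1.0, r_out=1.0, d_int=1.0, power=4.0, is_vh=True)
vl = Buffer(c_in=1.0, r_out=2.0, d_int=1.5, power=2.0, is_vh=False)
base = Option(cap=2.0, power=0.0, rat=10.0, needs_vh=False)

o_vh = add_buffer(base, vh)
o_vl = add_buffer(base, vl)
print(add_buffer(o_vh, vl))  # VL driving VH is rejected
print(prune([o_vh, o_vl, Option(1.0, 5.0, 6.0, True)]))
```

Because the extra flag splits options that would otherwise be comparable, the option count per node grows, which is the motivation for the sampling step on the next slide.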

10
Speed-up Technique
  • Approximate by power-delay sampling
  • Sampling under each distinct cap value
  • Uniformly pick options from the entire RAT-power
    trade-off curve
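
One plausible reading of the sampling step is sketched below: options are grouped by capacitance, the power range of each group is split into `k` equal bins, and the highest-RAT option in each bin is kept. The binning rule, the tuple layout `(cap, power, rat)`, and the function name are assumptions; the presentation only states that options are picked uniformly from the RAT-power curve for each distinct capacitance value.

```python
from collections import defaultdict

def sample_options(options, k):
    """Keep at most k options per distinct capacitance value.

    options: iterable of (cap, power, rat) tuples.
    """
    by_cap = defaultdict(list)
    for cap, power, rat in options:
        by_cap[cap].append((power, rat))

    sampled = []
    for cap, curve in by_cap.items():
        lo = min(p for p, _ in curve)
        hi = max(p for p, _ in curve)
        width = (hi - lo) / k or 1.0  # avoid a zero bin width for single-point curves
        best = {}
        for power, rat in curve:
            b = min(int((power - lo) / width), k - 1)
            if b not in best or rat > best[b][1]:
                best[b] = (power, rat)
        sampled.extend((cap, p, q) for _, (p, q) in sorted(best.items()))
    return sampled

# Hypothetical trade-off curve: 100 options at cap 1.0 and one option at cap 2.0.
options = [(1.0, float(p), 2.0 * p) for p in range(100)] + [(2.0, 5.0, 1.0)]
out = sample_options(options, 10)
print(len(out))
```

Keeping the highest-RAT option in every bin means the best-delay option of each capacitance group always survives in this sketch, which is consistent with the later claim that sampling still finds the optimal delay.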

11
Experimental Settings for DVB
  • Testcases: randomly generated Steiner trees
  • 20 to 800 terminals in 1cm x 1cm routing area
  • Buffer sizes: 16x, 32x, 64x
  • Sampling grid set to 20x20
  • Comparison
  • Exact power-optimal algorithm (PB) [Lillis, JSSC
    96]
  • Our algorithm with single (SVB) and dual (DVB)
    Vdd buffers

12
Sampling Preserves Optimality
  • Sampling has little impact on optimality
  • SVB follows PB closely
  • Still optimal delay, 1.7% larger power than PB

13
Dual Vdd Reduces Power
  • Dual Vdd shifts power-delay curve to the left

14
Experimental Results for DVB
  • DVB saves 23% power over SVB
  • More power saving in larger nets
  • Power saving becomes larger w/delay slack
  • e.g. relax delay by 5%, saving becomes 26%

15
Runtime
  • SVB scales a lot better for larger testcases
  • Achieved 17x speedup over PB [Lillis, JSSC 96]
  • DVB takes 2.5x more runtime than SVB

16
Outline
  • Dual-Vdd Buffer insertion and sizing (DVB)
  • Problem formulation
  • Sampling speed-up technique
  • Experimental results
  • Dual-Vdd buffered tree generation (D-Tree)
  • Problem formulation
  • Improved augmented orthogonal search tree
  • Experimental results

17
D-Tree Formulation
  • Dual Vdd Buffered Tree (D-Tree)
  • Given locations of terminals, buffer stations and
    blockages
  • Find a rectilinear Steiner tree (RST), buffer
    placement/size/Vdd assignment
  • VH buffers driving VL buffers only
  • Level converters at VH sinks driven by VL buffers
  • Minimize power subject to
  • Required arrival time (RAT) constraint at the source
  • Slew rate constraint at buffer inputs and sinks
  • D-Tree is NP-Hard
  • Finding minimum RST alone is NP-Complete

18
Buffered Tree Construction
  • Delay optimization only [Cong, DAC 00], by:
  • Build Hanan graph w/buffer insertion nodes
    according to locations of buffer stations
  • Path search on the grid by option propagation
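
A minimal sketch of the grid construction is shown below, assuming the Hanan graph is formed from the x and y coordinates of both terminals and buffer stations. Blockage handling and the option-propagating path search are omitted, and the function name and return layout are illustrative choices, not details from [Cong, DAC 00].

```python
def hanan_grid(terminals, stations):
    """Build a Hanan grid over terminal and buffer-station coordinates.

    Returns (nodes, edges, buffer_nodes); edges join horizontally and
    vertically adjacent grid points.
    """
    points = list(terminals) + list(stations)
    xs = sorted({x for x, _ in points})
    ys = sorted({y for _, y in points})

    nodes = [(x, y) for x in xs for y in ys]
    edges = []
    for i, x in enumerate(xs):
        for j, y in enumerate(ys):
            if i + 1 < len(xs):
                edges.append(((x, y), (xs[i + 1], y)))
            if j + 1 < len(ys):
                edges.append(((x, y), (x, ys[j + 1])))

    # Buffers may only be inserted at grid nodes that coincide with a station.
    buffer_nodes = set(stations)
    return nodes, edges, buffer_nodes

# Hypothetical layout: two terminals and one buffer station.
nodes, edges, buffer_nodes = hanan_grid([(0, 0), (2, 1)], [(1, 3)])
print(len(nodes), len(edges), buffer_nodes)
```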

19
D-Tree Algorithm Overview
  • Challenges
  • Growth of options is exponential
  • An artifact of D-Tree's NP-hardness
  • Considering power worsens option growth
  • Solution: sampling + efficient prune tree

20
Prune Tree in [Lillis, JSSC 96]
  • Options inserted in sorted capacitance order
  • Never need to clear options out of the tree
    if each new option is checked against the tree
  • Automatically avoids redundant options in tree
  • e.g. new option (c = 20, p = 100, q = 600)
  • Not applicable to D-Tree problem
  • Order of new options is not known a priori

21
Our Improvement on Prune Tree
  • Indexing w/capacitance results in fewer trees
  • Fewer distinct capacitance values than power values
  • Efficient tree cleaning
  • Enables out-of-order option insertion
  • Guarantee no redundancy in tree

22
Tree Cleaning
  • To add a new option in O(c log(T)) time
  • Check whether the new option is dominated by any
    option in the data structure
  • If not, remove options in the tree dominated by
    the new option in two downward tree traversals
  • e.g. new option (c = 10, p = 70, q = 410, …)
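
The check-then-clean procedure above can be sketched with a simpler structure than the augmented orthogonal search tree named in the outline. Here each capacitance value indexes a Pareto front stored as two sorted Python lists, and c, p, q are assumed to mean capacitance, power, and RAT. The dominance check uses one binary search per capacitance index, but list deletion is linear, so this sketch does not reproduce the stated time bound. The class name and the test values are hypothetical.

```python
import bisect

class OptionStore:
    """Per-capacitance Pareto fronts, each sorted by power with increasing q."""

    def __init__(self):
        self.fronts = {}  # cap -> (powers, qs), two lists kept in lockstep

    def _dominated(self, c, p, q):
        # An option with cap <= c, power <= p and q >= new q dominates the new one.
        for cap, (powers, qs) in self.fronts.items():
            if cap > c:
                continue
            i = bisect.bisect_right(powers, p) - 1  # largest power <= p has the best q
            if i >= 0 and qs[i] >= q:
                return True
        return False

    def insert(self, c, p, q):
        """Insert (c, p, q) in any order; return False if it is redundant."""
        if self._dominated(c, p, q):
            return False
        # Cleaning: remove stored options with cap >= c, power >= p and q <= new q.
        for cap, (powers, qs) in self.fronts.items():
            if cap < c:
                continue
            i = bisect.bisect_left(powers, p)
            j = i
            while j < len(qs) and qs[j] <= q:
                j += 1
            del powers[i:j]
            del qs[i:j]
        powers, qs = self.fronts.setdefault(c, ([], []))
        i = bisect.bisect_left(powers, p)
        powers.insert(i, p)
        qs.insert(i, q)
        return True

    def options(self):
        return sorted((c, p, q) for c, (ps, qs) in self.fronts.items()
                      for p, q in zip(ps, qs))

store = OptionStore()
store.insert(20, 100, 600)
store.insert(20, 120, 650)
rejected = not store.insert(30, 110, 590)   # dominated by (20, 100, 600)
store.insert(10, 70, 610)                   # arrives out of order, evicts (20, 100, 600)
print(rejected, store.options())
```

Indexing by capacitance keeps the number of fronts small when there are fewer distinct capacitance values than power values, which matches the motivation given on the previous slide.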

23
Experimental Settings for D-Tree
  • Random testcases
  • All based on a random floorplan of 1cm x 1cm
  • Blockages: 30, buffer stations: 1mm apart
  • Comparison
  • Delay-optimal tree (RMP) [Cong, DAC 00]
  • Ours with single (S-Tree) and dual (D-Tree) Vdd
    buffers

24
Experimental Results for D-Tree
  • Significant power saving over RMP
  • S-Tree: 7%, D-Tree: 18%
  • Larger saving for large testcases (e.g. T4)
  • Handles up to 6-sink nets (T5 takes 23 mins)
  • Similar capability compared with delay-optimal
    approaches [Cong, DAC 00], [Chen, ASP-DAC 02]

25
Conclusion
  • Formulated dual Vdd buffer insertion/tree
    generation without level converters
  • Proposed 2 speedup techniques
  • Sampling w/negligible loss of optimality
  • Improved prune tree for solution pruning
  • Applied to single-Vdd buffer insertion, 17x
    faster than existing work
  • Large power saving over single Vdd buffering
  • 23% in buffer insertion: dual Vdd vs. single Vdd
  • 18% in buffered tree: dual Vdd vs. delay optimal

26
Future Work
  • Speed up tree construction
  • Slack allocation for more power reduction
  • Path-based buffer insertion [Sze, DAC 05]
  • Allocates slack along one interconnect path
  • Considers single Vdd buffers only
  • Chip-level FPGA dual Vdd assignment [Lin, DAC 05]
  • Fixed buffer locations, assign Vdd levels
  • Considers multiple critical paths
  • Solved as a linear programming problem