Statistical Leakage Power Minimization Using Fast Equi-slack Shell Based Optimization

Transcript and Presenter's Notes

1
Statistical Leakage Power Minimization Using Fast
Equi-slack Shell Based Optimization
  • Xiaoji Ye, Peng Li
  • Department of ECE
  • Texas A&M University
  • Yaping Zhan
  • AMD, Austin, TX, US

2
Power Density and Leakage
  • Power is critical for IC design
  • Leakage power becomes increasingly important

Patrick P. Gelsinger, ISSCC 2001
3
Leakage Optimization
  • Gate sizing problem

Objective: minimize leakage power
Subject to:
  circuit delay ≤ timing constraint
  gate size ∈ {S_min, S_2, S_3, ..., S_max}
  threshold voltage ∈ {Vth_low, Vth_high}
4
Deterministic Leakage Optimization
  • Process variations significantly impact the
    leakage power and timing
  • Deterministic optimization
  • Cannot capture timing and leakage variability
  • May miss the true performance-limiting corner
  • Cannot capture spatial correlations
  • Incapable of guaranteeing a specific yield
  • Perform statistical leakage optimization!

5
Statistical Optimization
  • Sensitivity-based incremental gate sizing
    technique
  • M. R. Guthaus, N. Venkateswaran, C.
    Visweswariah, V. Zolotov, ICCAD 2005
  • Use criticality as guidance to select candidate
    gates
  • Size one gate in each iteration, runtime may be
    high
  • Second-order cone programming (SOCP)
  • M. Mani, A. Devgan, M. Orshansky, DAC 2005
  • Gate size rounding is needed after each
    optimization
  • Bottleneck is the SOCP computation

6
Our Work
  • Use SSTA as guidance for circuit optimization
  • Capable of guaranteeing specific timing yield
  • Timing, leakage, leakage/delay sensitivity fully
    parameterized in process variations
  • Introduce a novel equi-slack shell based
    technique
  • A number of gates are sized simultaneously
    at each iteration
  • Statistical timing information (slack) is updated
    efficiently at each iteration without re-running
    SSTA
  • Timing yield is strictly maintained
  • Smooth incremental optimization step

7
Motivation
  • Perform sizing on the basis of groups (equi-slack
    shells)

8
Our Idea: Equi-Slack Shell Concept
  • Group nodes by equal slacks
  • Do optimization within a group
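
The grouping step can be illustrated with a minimal sketch. It assumes each node has a single scalar slack and that two slacks count as "equal" when they fall in the same bucket of width `tol`; the function name, the input format, and the tolerance are illustrative and do not come from the slides. It also ignores connectivity, whereas the shells described later are grown from seeds through neighboring gates.

```python
from collections import defaultdict

def group_by_slack(slacks, tol=1e-3):
    """Group node names whose slacks fall in the same bucket of width tol.

    slacks: dict mapping node name -> scalar slack (hypothetical input format).
    Returns a dict mapping each bucket's representative slack -> sorted node list.
    """
    groups = defaultdict(list)
    for node, slack in slacks.items():
        bucket = round(slack / tol)  # nodes sharing a bucket are treated as equal-slack
        groups[bucket].append(node)
    return {bucket * tol: sorted(nodes) for bucket, nodes in groups.items()}

slacks = {"G1": 0.50, "G2": 0.50, "G3": 0.20, "G4": 0.20, "G5": 0.05}
shells = group_by_slack(slacks, tol=0.01)
```

Here `shells` holds three groups: G1 and G2, G3 and G4, and G5 alone.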

9
Our Approach: Sizing Within a Shell
  • Nodes within a shell are divided into levels
  • Level sensitivity is the sum of the leakage/delay
    sensitivities of all the nodes in this level
  • Pick the level with the maximum leakage/delay
    sensitivity
  • Size down nodes in a selected level
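
The level selection rule above can be sketched as follows. The sensitivities are plain numbers here for simplicity (the slides treat them as statistical quantities), and the names `pick_level`, `levels`, and `sensitivity` are hypothetical.

```python
def pick_level(levels, sensitivity):
    """Return the level with the largest summed leakage/delay sensitivity.

    levels: dict mapping level id -> list of nodes in that level.
    sensitivity: dict mapping node -> leakage/delay sensitivity (scalar here).
    """
    # Level sensitivity is the sum over all nodes in the level.
    level_sens = {lvl: sum(sensitivity[n] for n in nodes)
                  for lvl, nodes in levels.items()}
    best = max(level_sens, key=level_sens.get)
    return best, level_sens

levels = {"L1": ["G1", "G2"], "L2": ["G3", "G4", "G5"], "L3": ["G6"]}
sensitivity = {"G1": 0.4, "G2": 0.3, "G3": 0.1, "G4": 0.2, "G5": 0.1, "G6": 0.5}
best, level_sens = pick_level(levels, sensitivity)
```

In this toy input G6 has the largest individual sensitivity, but level L1 wins because the sum over its two nodes is larger.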

10
Our Approach: Shell Update
  • Efficiently update the slacks after each sizing
    step
  • Merge the groups of the same slack

11
Flowchart
  • Initial run of SSTA: obtain statistical slacks for all gates
  • Shell initialization
  • Find shell with the largest slack
  • Optimize within shell
  • Slack update within shell
  • Shell update
  • Timing violated?
  • Post tuning
12
Atomic operations on the shell
  • Initialization
  • Expansion
  • Merge
  • Levelization
  • Sizing within the shell
  • Slack update
  • Shell update

13
Shell Initialization
  • Conduct initial forward and backward SSTAs to
    compute statistical slacks for all gates
  • Gates with the largest mean-3sigma slack are
    selected as seeds
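
A minimal sketch of the seed selection, assuming each gate's statistical slack is summarized by a (mean, sigma) pair. The function name and the tie tolerance `tol` are illustrative assumptions.

```python
def select_seeds(slack_stats, tol=1e-9):
    """Return gates whose mean - 3*sigma slack is the largest (ties within tol).

    slack_stats: dict mapping gate -> (mean, sigma) of its statistical slack.
    """
    score = {g: mu - 3.0 * sd for g, (mu, sd) in slack_stats.items()}
    best = max(score.values())
    return sorted(g for g, s in score.items() if best - s <= tol)

slack_stats = {
    "G1": (1.0, 0.10),
    "G2": (1.2, 0.30),  # largest mean, but also the largest spread
    "G3": (0.9, 0.05),
    "G4": (1.0, 0.10),
}
seeds = select_seeds(slack_stats)
```

G2 has the largest mean slack but is not chosen, because its mean - 3sigma value is the smallest of the four; G3 becomes the seed.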

14
Shell Expansion
  • For each seed, check its neighbor gates
  • If (condition not captured in the transcript)
    → absorb the neighbor
  • Guarantee the (mean +/- 3sigma) points of the
    slack for gates inside and outside the shell are
    not close to each other
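
The expansion can be sketched as a breadth-first traversal from a seed. The actual absorb condition is not captured in the transcript, so the test used below (the neighbor's mean - 3sigma slack lies within `eps` of the seed's) is purely an assumption for illustration, as are all names.

```python
from collections import deque

def expand_shell(seed, neighbors, slack_stats, eps=0.05):
    """Grow a shell from `seed` by absorbing neighbors with similar slack.

    neighbors: dict mapping gate -> list of adjacent gates.
    slack_stats: dict mapping gate -> (mean, sigma) of its statistical slack.
    """
    def worst(g):
        mu, sd = slack_stats[g]
        return mu - 3.0 * sd

    shell = {seed}
    queue = deque([seed])
    while queue:
        g = queue.popleft()
        for n in neighbors.get(g, ()):
            # ASSUMED absorb test; the slide's real condition is unknown.
            if n not in shell and abs(worst(n) - worst(seed)) <= eps:
                shell.add(n)
                queue.append(n)
    return shell

neighbors = {"G1": ["G2", "G3"], "G2": ["G1", "G4"], "G3": ["G1"], "G4": ["G2"]}
slack_stats = {"G1": (1.0, 0.10), "G2": (1.0, 0.11),
               "G3": (0.5, 0.10), "G4": (0.98, 0.10)}
shell = expand_shell("G1", neighbors, slack_stats)
```

G4 is absorbed even though it is not adjacent to the seed, because it is reached through G2; G3 stays outside since its slack differs too much.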

(Figure: Shell A, Shell B, and Shell C)
15
Shell Merge
  • Merge when two shells meet
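
One possible way to implement the merge, assuming shells are stored as sets of gates plus a gate-to-shell ownership map. The representation and the names are hypothetical; the slides do not describe the bookkeeping.

```python
def merge_shells(shells, owner, a, b):
    """Merge shells a and b and return the id of the surviving shell.

    shells: dict mapping shell id -> set of gates.
    owner: dict mapping gate -> id of the shell that contains it.
    """
    if a == b:
        return a
    # Keep the larger shell so fewer gates need to be relabeled.
    if len(shells[a]) < len(shells[b]):
        a, b = b, a
    for g in shells[b]:
        owner[g] = a
    shells[a] |= shells.pop(b)
    return a

shells = {"A": {"G1", "G2", "G3"}, "B": {"G4", "G5"}}
owner = {g: sid for sid, gates in shells.items() for g in gates}
kept = merge_shells(shells, owner, "A", "B")
```

Two shells "meet" when expansion from one reaches a gate that `owner` already assigns to another; at that point `merge_shells` can be called with the two ids.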

(Figure: Shell A and Shell B, labeled "Expand" and "Merge")
16
Shell Data Structure
  • Shell boundary stored separately for fast
    operations
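
A sketch of a shell that keeps its boundary in a separate set. It assumes a boundary gate is a member with at least one neighbor outside the shell, which is an interpretation rather than a definition given in the slides; the class and field names are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Shell:
    members: set = field(default_factory=set)
    boundary: set = field(default_factory=set)  # members with an outside neighbor

    def add(self, gate, neighbors):
        """Add `gate`; `neighbors` maps gate -> list of adjacent gates (undirected)."""
        self.members.add(gate)
        # Only the new gate and its neighbors can change boundary status.
        for g in [gate, *neighbors.get(gate, ())]:
            if g not in self.members:
                continue
            if any(n not in self.members for n in neighbors.get(g, ())):
                self.boundary.add(g)
            else:
                self.boundary.discard(g)

# A chain G1 - G2 - G3 - G4, absorbed from the left.
neighbors = {"G1": ["G2"], "G2": ["G1", "G3"], "G3": ["G2", "G4"], "G4": ["G3"]}
shell = Shell()
for gate in ["G1", "G2", "G3"]:
    shell.add(gate, neighbors)
```

Because the boundary is maintained incrementally, expansion only has to inspect `shell.boundary` instead of rescanning every member.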

17
Levelization
  • Levelization within a shell
  • Level sensitivity
  • Pick the level with the largest mean-3sigma
    sensitivity
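
Levelization within a shell can be sketched as a longest-path labeling over the fanin edges that stay inside the shell. This assumes a combinational (acyclic) circuit; the function name and input format are illustrative.

```python
def levelize(shell, fanin):
    """Assign a level to every gate in `shell`.

    fanin: dict mapping gate -> list of driver gates (may include outside gates).
    A gate with no in-shell driver gets level 1; otherwise its level is
    1 + the largest level among its in-shell drivers.
    """
    level = {}

    def visit(g):
        if g not in level:
            preds = [p for p in fanin.get(g, ()) if p in shell]
            level[g] = 1 + max((visit(p) for p in preds), default=0)
        return level[g]

    for g in shell:
        visit(g)
    return level

shell = {"G1", "G2", "G3", "G4"}
fanin = {"G3": ["G1", "G2"], "G4": ["G3", "G9"]}  # G9 lies outside the shell
levels = levelize(shell, fanin)
```

The recursive form is fine for small shells; for very deep circuits an iterative topological sort would avoid Python's recursion limit.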

(Figure: gates G1-G9 and G12 with levels L1-L4)
18
Optimization Within A Shell
  • Size down gates in the selected level to absorb
    the slack
  • Gradually reduce the slack
  • Safe bound
  • Change gate size or Vth, guarantee
  • Slack reduction for other gates within shell

ΔSlack' = max{ΔSlack_1, ΔSlack_2, ..., ΔSlack_k}
ΔSlack_i is the slack reduction for gate_i
gate_i ∈ target level, i = 1, ..., k
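
The bound above is the largest per-gate slack reduction in the target level. A minimal sketch of computing it, together with one possible safety check: the check (the bound must not exceed the shell's available slack minus a margin) is an assumption, since the slide's guarantee condition is not captured in the transcript.

```python
def max_slack_reduction(delta_slack):
    """Largest slack reduction over the gates in the target level.

    delta_slack: dict mapping gate -> slack reduction caused by sizing it down.
    """
    return max(delta_slack.values())

def is_safe(delta_slack, shell_slack, margin=0.0):
    # ASSUMED acceptance rule: the worst-case reduction must fit in the shell's slack.
    return max_slack_reduction(delta_slack) <= shell_slack - margin

delta_slack = {"G3": 0.02, "G6": 0.05, "G9": 0.03}
bound = max_slack_reduction(delta_slack)
ok_large = is_safe(delta_slack, shell_slack=0.20)
ok_small = is_safe(delta_slack, shell_slack=0.04)
```

Using the maximum rather than the sum reflects that gates in the same level are sized in parallel, so a path through the shell crosses at most one of them.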
19
Slack Update Within A Shell
  • Internal gates slack reduced by
  • Boundary gates
  • Fanin cone of level
  • Fanout cone of level
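
The fanin and fanout cones of a level are transitive closures over the driver and load edges. The sketch below computes both with one helper; how the slides use the cones to update boundary slacks is not captured in the transcript, so only the cone computation is shown. All names are hypothetical.

```python
def cone(start, edges):
    """Gates reachable from `start` by following `edges`, excluding `start` itself.

    start: iterable of gates (for example, the gates of the sized level).
    edges: dict mapping gate -> list of adjacent gates in one direction.
    """
    seen = set()
    stack = list(start)
    while stack:
        g = stack.pop()
        for n in edges.get(g, ()):
            if n not in seen:
                seen.add(n)
                stack.append(n)
    return seen - set(start)

fanout = {"G1": ["G3"], "G2": ["G3"], "G3": ["G4", "G5"], "G4": ["G6"]}
# Invert the fanout map to obtain the fanin map.
fanin = {}
for src, dsts in fanout.items():
    for dst in dsts:
        fanin.setdefault(dst, []).append(src)

level = {"G3"}
fanout_cone = cone(level, fanout)
fanin_cone = cone(level, fanin)
```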

(Figure: gates G1-G14 with levels L1-L4)
20
Shell Update
  • Boundary gate if
  • Remove it from the shell
  • Form a new shell with a larger slack
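
A sketch of the split, under a loudly stated assumption: the removal condition is missing from the transcript, so the code below removes a boundary gate when its slack exceeds the shell's slack by more than `eps`. The function name and arguments are hypothetical.

```python
def split_boundary(members, boundary, slack, shell_slack, eps=1e-6):
    """Split off boundary gates whose slack is now larger than the shell's.

    members, boundary: sets of gates (boundary is a subset of members).
    slack: dict mapping gate -> current scalar slack.
    Returns (remaining members, gates that form a new larger-slack shell).
    """
    # ASSUMED condition; the slide's actual test is not in the transcript.
    leaving = {g for g in boundary if slack[g] - shell_slack > eps}
    return members - leaving, leaving

members = {"G1", "G2", "G3", "G4"}
boundary = {"G3", "G4"}
slack = {"G1": 0.3, "G2": 0.3, "G3": 0.3, "G4": 0.5}
kept, new_shell = split_boundary(members, boundary, slack, shell_slack=0.3)
```

After a split, the boundary of the remaining shell would need to be recomputed, since gates adjacent to the removed ones may have become boundary gates.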

(Figure: gates G1-G14 with levels L1-L4)
21
Flowchart
  • Initial run of SSTA: obtain statistical slacks for all gates
  • Shell initialization
  • Find shell with the largest slack
  • Optimize within shell
  • Slack update within shell
  • Shell update
  • Timing violated?
  • Post tuning
22
Experimental Setup
  • Six gate sizes (1x, 2x, 4x, 6x, 8x, 10x) and dual
    Vth
  • Process variation range
  • Model characterization using 45nm PTM
  • Algorithm implemented in C, applied to ISCAS 85
    benchmark circuits
  • Results compared with deterministic optimization
    under the same timing constraint

23
Results
  • Leakage vs delay for c499 for different timing
    constraints

24
Results
  • Delay-power product for c880 for different timing
    constraints

25
Results
  • Leakage distribution for c432

26
(No Transcript)
27
Conclusion
  • A novel equi-slack shell based statistical
    leakage optimization technique is proposed
  • Level-based statistical sensitivities are used as
    guidance to simultaneously optimize multiple
    gates in a shell
  • Fast slack update after each sizing step without
    re-running SSTA can be achieved
  • Superior runtime and optimization results have
    been observed in the experiments

28
Thank you!