Title: Statistical Leakage Power Minimization Using Fast Equislack Shell Based Optimization
1Statistical Leakage Power Minimization Using Fast
Equi-slack Shell Based Optimization
- Xiaoji Ye, Peng Li
- Department of ECE
- Texas A&M University
- Yaping Zhan
- AMD, Austin, TX, US
2Power Density and Leakage
- Power is critical for IC design
- Leakage power becomes increasingly important
Patrick P. Gelsinger, ISSCC 2001
3Leakage Optimization
Objective: minimize leakage power
Subject to:
- Circuit delay $\le$ timing constraint
- Gate size $\in \{S_{min}, S_2, S_3, \ldots, S_{max}\}$
- Threshold voltage $\in \{V_{th,low}, V_{th,high}\}$
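To make the formulation concrete, here is a minimal brute-force sketch on a chain of gates. The size set, the delay model, and the leakage model are toy assumptions chosen for illustration only; they are not taken from the slides, and exhaustive search is only practical for a handful of gates.

```python
from itertools import product

SIZES = [1, 2, 4]          # hypothetical discrete gate sizes
VTHS = ["low", "high"]     # dual threshold voltages


def gate_delay(size, vth):
    # Toy model (assumption): larger gates are faster, high-Vth gates are slower.
    return (1.0 / size) * (1.3 if vth == "high" else 1.0)


def gate_leakage(size, vth):
    # Toy model (assumption): leakage grows with size and drops at high Vth.
    return size * (0.1 if vth == "high" else 1.0)


def minimize_leakage(n_gates, t_max):
    """Search every (size, Vth) assignment of a chain of n_gates gates.

    Returns (leakage, assignment) for the lowest-leakage assignment whose
    total delay is at most t_max, or None if no assignment meets timing.
    """
    best = None
    for assign in product(product(SIZES, VTHS), repeat=n_gates):
        delay = sum(gate_delay(s, v) for s, v in assign)
        if delay > t_max:
            continue  # violates the timing constraint
        leak = sum(gate_leakage(s, v) for s, v in assign)
        if best is None or leak < best[0]:
            best = (leak, assign)
    return best
```

With a loose constraint the search settles on the smallest high-Vth gates; tightening `t_max` forces larger gates and raises leakage.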
4Deterministic Leakage Optimization
- Process variations significantly impact the leakage power and timing
- Deterministic optimization
- Cannot capture timing and leakage variability
- May miss the true performance-limiting corner
- Cannot capture spatial correlations
- Incapable of guaranteeing a specific yield
- Perform statistical leakage optimization!
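A small sketch of what "guaranteeing a specific yield" means numerically. It assumes the circuit delay is Gaussian with a known mean and sigma, which is a simplifying assumption for illustration; the function names are hypothetical.

```python
from statistics import NormalDist


def timing_yield(mean_delay, sigma_delay, t_max):
    """Probability that a normally distributed circuit delay meets t_max."""
    return NormalDist(mean_delay, sigma_delay).cdf(t_max)


def constraint_for_yield(mean_delay, sigma_delay, target_yield):
    """Smallest t_max that the delay meets with probability target_yield."""
    return NormalDist(mean_delay, sigma_delay).inv_cdf(target_yield)
```

Under this assumption, a design whose nominal delay exactly equals the constraint has only 50% timing yield, while a constraint placed at mean + 3 sigma gives roughly 99.87%.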
5Statistical Optimization
- Sensitivity-based incremental gate sizing technique
- M. R. Guthaus, N. Venkateswaran, C. Visweswariah, V. Zolotov, ICCAD 2005
- Use criticality as guidance to select candidate gates
- Size one gate in each iteration, runtime may be high
- Second-order conic programming (SOCP)
- M. Mani, A. Devgan, M. Orshansky, DAC 2005
- Gate size rounding is needed after each optimization
- Bottleneck is the SOCP computation
6Our Work
- Use SSTA as guidance for circuit optimization
- Capable of guaranteeing a specific timing yield
- Timing, leakage, leakage/delay sensitivity fully parameterized in process variations
- Introduce a novel equi-slack shell based technique
- A number of gates are sized simultaneously at each iteration
- Statistical timing information (slack) is updated efficiently at each iteration without re-running SSTA
- Timing yield is strictly maintained
- Smooth incremental optimization step
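One common way to keep a quantity "parameterized in process variations" is a first-order canonical form. The sketch below assumes that representation; the slides do not state which form is used, and the class and field names are hypothetical.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class Canonical:
    """First-order form: mean + sum(sens[i] * dP_i) + rand * dR.

    Assumption: dP_i (shared sources) and dR (independent term) are
    uncorrelated, zero-mean, unit-variance variables.
    """
    mean: float
    sens: tuple         # sensitivities to shared variation sources
    rand: float = 0.0   # coefficient of the independent random term

    def sigma(self):
        return sqrt(sum(a * a for a in self.sens) + self.rand ** 2)

    def __add__(self, other):
        # Shared sources add linearly; independent terms add in quadrature.
        sens = tuple(a + b for a, b in zip(self.sens, other.sens))
        rand = sqrt(self.rand ** 2 + other.rand ** 2)
        return Canonical(self.mean + other.mean, sens, rand)

    def mean_minus_3sigma(self):
        return self.mean - 3.0 * self.sigma()
```

Keeping the sensitivities explicit is what lets correlated quantities be combined without losing the correlation, which a plain (mean, sigma) pair cannot do.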
7Motivation
- Perform sizing on the basis of groups (equi-slack
shells)
8Our Idea: Equi-Slack Shell Concept
- Group nodes by equal slacks
- Do optimization within a group
9Our Approach: Sizing Within a Shell
- Nodes within a shell are divided into levels
- Level sensitivity is the sum of the leakage/delay sensitivities of all the nodes in this level
- Pick the level with the maximum leakage/delay sensitivity
- Size down nodes in a selected level
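The level selection described above can be sketched as follows. This version treats sensitivities as plain numbers for simplicity; the function name and the dictionary layout are assumptions made for illustration.

```python
def pick_level(levels, sensitivity):
    """Pick the level with the largest summed leakage/delay sensitivity.

    levels: dict level_id -> list of gate names in that level.
    sensitivity: dict gate name -> leakage/delay sensitivity.
    Returns (best_level_id, level_sensitivity).
    """
    totals = {lvl: sum(sensitivity[g] for g in gates)
              for lvl, gates in levels.items()}
    best = max(totals, key=totals.get)
    return best, totals[best]
```

For example, with levels `{"L1": ["G1", "G2"], "L2": ["G3"]}` and sensitivities 1.0, 2.5, and 3.0, level L1 wins with a total of 3.5 even though G3 has the largest individual sensitivity.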
10Our Approach: Shell Update
- Efficiently update the slacks after each sizing step
- Merge the groups of the same slack
11Flowchart
- Obtain statistical slacks for all gates
- Initial run of SSTA
- Shell initialization
- Find shell with the largest slack
- Optimize within shell
- Slack update within shell
- Shell update
- Timing violated?
- Post tuning
12Atomic operations on the shell
- Initialization
- Expansion
- Merge
- Levelization
- Sizing within the shell
- Slack update
- Shell update
13Shell Initialization
- Conduct initial forward and backward SSTAs to compute statistical slacks for all gates
- Gates with the largest mean-3sigma slack are selected as seeds
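Seed selection can be sketched as below, assuming each gate's statistical slack is summarized by a (mean, sigma) pair. The tie tolerance `tol` and the function name are hypothetical additions for illustration.

```python
def select_seeds(slacks, tol=1e-9):
    """Return the gates with the largest mean - 3*sigma slack.

    slacks: dict gate -> (mean, sigma) of its statistical slack.
    Gates whose score is within tol of the maximum are all returned.
    """
    score = {g: m - 3.0 * s for g, (m, s) in slacks.items()}
    top = max(score.values())
    return sorted(g for g, v in score.items() if top - v <= tol)
```

Note that a gate with a larger mean slack but a wider spread can lose to a gate with a smaller, tighter slack, which is the point of scoring on mean-3sigma.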
14Shell Expansion
- For each seed, check its neighbor gates
- If [...], absorb the neighbor
- Guarantee the (mean +/- 3sigma) points of the slack for gates inside and outside the shell are not close to each other
(Figure: Shell A, Shell B, Shell C)
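A minimal sketch of growing a shell from a seed by breadth-first search. The absorb condition is not available in this transcript, so the test used here (a neighbor's scalar slack score within `eps` of the seed's) is purely an assumption, as are the function and parameter names.

```python
from collections import deque


def expand_shell(seed, neighbors, score, eps):
    """Grow a shell from seed by breadth-first search.

    neighbors: dict gate -> iterable of adjacent gates (fanin and fanout).
    score: dict gate -> scalar slack figure (e.g. mean - 3*sigma).
    ASSUMPTION: a neighbor is absorbed when its score is within eps of the
    seed's score; the original condition is not known.
    """
    shell = {seed}
    queue = deque([seed])
    while queue:
        g = queue.popleft()
        for n in neighbors.get(g, ()):
            if n not in shell and abs(score[n] - score[seed]) <= eps:
                shell.add(n)
                queue.append(n)
    return shell
```

Because expansion only proceeds through absorbed gates, a gate with a matching slack that is reachable only through a non-matching gate stays outside this shell.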
15Shell Merge
- Merge when two shells meet
(Figure labels: Expand, Merge; Shell A, Shell B)
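The slides do not say how merging is implemented. One standard option is a disjoint-set (union-find) structure, sketched here as a swapped-in technique rather than the authors' method; the class name is hypothetical.

```python
class ShellSets:
    """Disjoint-set (union-find) over gates; each set is one shell."""

    def __init__(self, gates):
        self.parent = {g: g for g in gates}

    def find(self, g):
        # Path halving keeps later lookups short.
        while self.parent[g] != g:
            self.parent[g] = self.parent[self.parent[g]]
            g = self.parent[g]
        return g

    def merge(self, a, b):
        """Merge the shells containing a and b; return the representative."""
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[rb] = ra
        return ra
```

Two gates belong to the same shell exactly when `find` returns the same representative for both, so "two shells meet" reduces to a single `merge` call on any pair of touching gates.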
16Shell Data Structure
- Shell boundary stored separately for fast
operations
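A possible layout for such a structure, keeping the boundary as its own set so that boundary queries do not scan all members. The field names and the `add` method are assumptions for illustration, and `neighbors` is assumed to be a symmetric adjacency map.

```python
from dataclasses import dataclass, field


@dataclass
class Shell:
    slack: float                                  # representative slack
    members: set = field(default_factory=set)     # all gates in the shell
    boundary: set = field(default_factory=set)    # members adjacent to outside gates

    def add(self, gate, neighbors):
        """Add gate, then refresh boundary status of gate and its neighbors."""
        self.members.add(gate)
        for g in [gate, *neighbors[gate]]:
            if g not in self.members:
                continue
            if any(n not in self.members for n in neighbors[g]):
                self.boundary.add(g)
            else:
                self.boundary.discard(g)
```

Adding a gate can only change the boundary status of that gate and its direct neighbors, so each update touches a small neighborhood rather than the whole shell.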
17Levelization
- Levelization within a shell
- Level sensitivity
- Pick the level with the largest mean-3sigma
sensitivity
(Figure: gates G1-G9 and G12 in a shell, grouped into levels L1-L4)
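Levelization within a shell can be sketched as below. The level definition used here (one more than the deepest fanin that is also inside the shell) is an assumption; the slides do not define it, and the function name is hypothetical.

```python
def levelize(shell, fanin):
    """Group the gates of a shell into levels.

    shell: set of gate names.
    fanin: dict gate -> list of driving gates (may include outside gates).
    ASSUMPTION: level = 1 + deepest fanin inside the shell; gates with no
    fanin inside the shell are level 1. The circuit must be acyclic.
    Returns dict level -> sorted list of gates.
    """
    level = {}

    def visit(g):
        if g not in level:
            inside = [f for f in fanin.get(g, ()) if f in shell]
            level[g] = 1 + max((visit(f) for f in inside), default=0)
        return level[g]

    for g in shell:
        visit(g)

    levels = {}
    for g, lvl in level.items():
        levels.setdefault(lvl, []).append(g)
    return {lvl: sorted(gs) for lvl, gs in levels.items()}
```

With this definition no two gates in the same level lie on a common path inside the shell, so sizing one level changes each in-shell path through at most one gate.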
18Optimization Within A Shell
- Size down gates in the selected level to absorb the slack
- Gradually reduce the slack
- Safe bound
- Change gate size or Vth, guarantee
- Slack reduction for other gates within shell
$\Delta Slack' = \max\{\Delta Slack_1, \Delta Slack_2, \ldots, \Delta Slack_k\}$
$\Delta Slack_i$ is the slack reduction for $gate_i$
$gate_i \in$ target level, $i = 1, \ldots, k$
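A short sketch of the max-based bound, using plain numbers for the slack reductions. The function name and the optional `budget` check are hypothetical; the slide's own guarantee condition is not available here.

```python
def shell_slack_reduction(delta_slack, budget=None):
    """Slack reduction charged to the other gates in the shell.

    delta_slack: dict gate -> slack reduction caused by resizing that gate
                 (gates of the selected level only).
    budget: optional upper limit (ASSUMPTION, not from the slides); a
            ValueError is raised if the worst reduction exceeds it.
    """
    worst = max(delta_slack.values())
    if budget is not None and worst > budget:
        raise ValueError("sizing step exceeds the available slack")
    return worst
```

Charging every other gate the largest per-gate reduction is conservative: if each path through the shell crosses the selected level at most once, no path loses more slack than this value.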
19Slack Update Within A Shell
- Internal gates slack reduced by
- Boundary gates
- Fanin cone of level
- Fanout cone of level
(Figure: gates G1-G14, with levels L1-L4)
20Shell Update
- Boundary gate if
- Remove it from the shell
- Form a new shell with a larger slack
(Figure: gates G1-G14, with levels L1-L4)
21Flowchart
- Obtain statistical slacks for all gates
- Initial run of SSTA
- Shell initialization
- Find shell with the largest slack
- Optimize within shell
- Slack update within shell
- Shell update
- Timing violated?
- Post tuning
22Experimental Setup
- Six gate sizes (1x, 2x, 4x, 6x, 8x, 10x) and dual Vth
- Process variation range
- Model characterization using 45nm PTM
- Algorithm implemented in C, applied to ISCAS 85 benchmark circuits
- Results compared with deterministic optimization under the same timing constraint
23Results
- Leakage vs delay for c499 for different timing
constraints
24Results
- Delay-power product for c880 for different timing
constraints
25Results
- Leakage distribution for c432
27Conclusion
- A novel equi-slack shell based statistical leakage optimization technique is proposed
- Level-based statistical sensitivities are used as guidance to simultaneously optimize multiple gates in a shell
- Fast slack update after each sizing step without re-running SSTA can be achieved
- Superior runtime and optimization results have been observed in the experiments
28Thank you!