Title: Fine-Grained Sleep Transistor Sizing Algorithm for Leakage Power Minimization
1Fine-Grained Sleep Transistor Sizing Algorithm
for Leakage Power Minimization
De-Shiuan Chiou, Da-Cheng Juan, Yu-Ting Chen, Shih-Chieh Chang
Department of CS, National Tsing Hua University, Taiwan
2Outline
- Sleep Transistor Sizing Problem
- MIC Estimation Mechanism
- Partitioned Time-Frame for MIC Estimation
- Experimental Results and Conclusions
3Power Gating
- Leakage increases exponentially
- reaches 50% of total power in 65 nm technology
- Power Gating
- One of the most effective ways to reduce leakage
[Figure: low-Vth logic device between VDD and GND]
4Implementation of Power Gating
- Distributed Sleep Transistor Network (DSTN)
[Figure: low-Vth logic device between VDD and VGND]
5Leakage Saving
- In standby mode
- Leakage is proportional to the ST size
- Small ST to reduce leakage
[Figure: VDD and VGND rails with leakage current Ileakage at each ST]
6Voltage Drop across the ST
- In active mode
- Voltage drop across an ST degrades the speed
- Voltage drop is inversely proportional to the ST size
- Large ST to bound the voltage drop
[Figure: VDD and VGND rails with voltage drop VST across each ST]
7Sleep Transistor (ST) Sizing
- Dilemma scenario
- Large ST to bound the voltage drop. (active mode)
- Small ST to reduce leakage. (standby mode)
- => Objective: minimize ST size (leakage) under a specified voltage-drop constraint, VST
[Figure: VDD and VGND rails with voltage drop VST across each ST]
8Estimate Voltage Drop by MIC
- Maximum Instantaneous Current (MIC) through the ST
- determines the worst-case voltage drop
- Estimating the upper bound of MIC(ST)
- for sizing the ST appropriately to meet the voltage-drop constraint
[Figure: VDD and VGND rails with MIC(ST1), MIC(ST2), and MIC(ST3); MIC(ST) denotes the MIC through an ST]
9Estimate Voltage Drop by MIC
- MIC(C) (MIC of a cluster) is easy to measure
- MIC(ST) (MIC through the ST) is hard to predict
- Due to current balancing effect
[Figure: VDD and VGND rails with MIC(C1), MIC(ST1), MIC(ST2), and MIC(ST3); finding the MIC of a cluster is fast, finding the MIC through an ST is time-consuming]
10Temporal Perspective of Clusters' MIC
- Traditional ways
- use the entire clock period's MIC to determine the ST size
[Figure: MIC(Ci) waveform, current vs. time unit; MIC(C1) occurs at T6 and MIC(C2) occurs at T9]
11Temporal Perspective of Clusters' MIC
- Smaller time frames lead to
- a more accurate MIC estimation
- higher computational complexity
[Figure: MIC(Ci) waveform]
12Difficulties
- Current balancing effect complicates the sizing problem
- Time-frame partitioning leads to high computational complexity
13Contributions
- A more accurate MIC prediction from a temporal perspective
- A variable-length partitioning to reduce computational complexity
- Heuristics to minimize the size of sleep transistors
- Achieving 21% reduction in sleep transistor area
14Outline
- Sleep Transistor Sizing Problem
- MIC Estimation Mechanism
- Partitioned Time-Frame for MIC Estimation
- Experimental Results and Conclusions
15Resistance Network
16Discharging Ratio
- The discharging ratio can be calculated by
- Kirchhoff's Current Law
- Ohm's Law
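The two laws named above can be combined in a small nodal-analysis sketch. It assumes a simple chain topology that the slides do not specify: each virtual-ground node i reaches ground through one ST of resistance `r_st[i]`, and neighbouring nodes are joined by a rail segment of resistance `r_rail[i]`. The function and parameter names are hypothetical, chosen for illustration only.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the caller's data is untouched.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def discharging_ratios(r_st, r_rail, source):
    """Fraction of a unit current injected at node `source` that leaves
    through each ST (assumed chain of virtual-ground nodes)."""
    n = len(r_st)
    g = [[0.0] * n for _ in range(n)]      # nodal conductance matrix
    for i in range(n):
        g[i][i] += 1.0 / r_st[i]           # ST from node i to ground
    for i in range(n - 1):
        cond = 1.0 / r_rail[i]             # rail segment between node i and i+1
        g[i][i] += cond
        g[i + 1][i + 1] += cond
        g[i][i + 1] -= cond
        g[i + 1][i] -= cond
    inject = [0.0] * n
    inject[source] = 1.0                   # unit current from the cluster
    v = solve_linear(g, inject)            # Kirchhoff's Current Law at every node
    return [v[i] / r_st[i] for i in range(n)]   # Ohm's Law for each ST


# Hypothetical values: three equal STs, cluster attached at node 0.
ratios = discharging_ratios([10.0, 10.0, 10.0], [1.0, 1.0], source=0)
```

Under this model the ratios for one cluster sum to 1, and the ST nearest the cluster carries the largest share. Stacking the ratio vectors of all clusters as columns would give one possible form of the discharging matrix.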
17Discharging Matrix ?
18MIC(ST) Estimation Mechanism
19Outline
- Sleep Transistor Sizing Problem
- MIC Estimation Mechanism
- Partitioned Time-Frame for MIC Estimation
- Experimental Results and Conclusions
20Temporal Perspective of Clusters' MIC
- Different MIC(Ci) occur at different time points
[Figure: MIC(Ci) waveform, current vs. time]
21Temporal Perspective of Clusters' MIC
- Different MIC(Ci) occur at different time points within a clock period
- Traditional way to estimate MIC(STi) is overly pessimistic
22Time-Frame Partitioning for MIC(ST) Estimation
- Expand MIC(Ci) into MIC(Ci,Tj)
[Figure: MIC(Ci,Tj) waveform, current vs. time frame over one clock cycle]
23Time-Frame Partitioning for MIC(ST) Estimation
- For each time frame Tj, use MIC(Ci,Tj) to obtain
MIC(STi,Tj)
24Time-Frame Partitioning for MIC(ST) Estimation
- For ST1, the maximum MIC(ST1,Tj) among all Tj is
the upper bound of MIC(ST1) after partitioning
[Figure: MIC(STi,Tj) waveform for ST 1 and ST 2, current vs. time frame over one clock cycle]
25Time-Frame Partitioning for MIC(ST) Estimation
Time-Frame Partitioning leads to a better MIC(ST) estimation!
[Figure: MIC(STi,Tj) waveform for ST 1 and ST 2, current vs. time frame over one clock cycle]
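A minimal sketch of why partitioning tightens the bound, assuming a fixed matrix `psi` of discharging ratios (rows are STs, columns are clusters). The matrix entries and the currents are made-up illustration values, and the function names are hypothetical.

```python
def st_mic(psi, cluster_mic):
    """MIC through each ST for one vector of cluster MICs."""
    return [sum(w * c for w, c in zip(row, cluster_mic)) for row in psi]


def whole_period_bound(psi, mic_c_t):
    """Traditional bound: combine every cluster's peak over the whole
    clock period, even if the peaks occur at different times."""
    n_clusters = len(mic_c_t[0])
    peak = [max(frame[i] for frame in mic_c_t) for i in range(n_clusters)]
    return st_mic(psi, peak)


def partitioned_bound(psi, mic_c_t):
    """Partitioned bound: evaluate each time frame separately, then take
    the maximum MIC(STi,Tj) over all frames Tj for every ST."""
    per_frame = [st_mic(psi, frame) for frame in mic_c_t]
    return [max(frame[i] for frame in per_frame) for i in range(len(psi))]


# Hypothetical data: mic_c_t[j][i] is MIC(Ci,Tj) in mA for 2 clusters, 3 frames.
psi = [[0.6, 0.4],
       [0.4, 0.6]]
mic_c_t = [[8.0, 1.0],
           [2.0, 7.0],
           [1.0, 2.0]]

traditional = whole_period_bound(psi, mic_c_t)   # roughly [7.6, 7.4]
partitioned = partitioned_bound(psi, mic_c_t)    # roughly [5.2, 5.0]
```

Because the two clusters peak in different frames, the per-frame bound never adds both peaks together, so it is never larger than the whole-period bound.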
26Reduce the Computation Complexity
- Increasing the number of time frames leads to
- more accurate voltage drop estimation
- higher computational complexity
- Reduce the computation complexity
- dominated time-frame removal
- variable length time-frame partitioning
27Dominated Time-Frame Removal
- T3 is dominated by T6
- MIC(C1,T6) > MIC(C1,T3)
- MIC(C2,T6) > MIC(C2,T3)
- Neglect T3 and all dominated time frames
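The removal rule above can be sketched as a simple filter. The strict comparison follows the slide's example; the frame names and current values are hypothetical. Dropping a dominated frame should not change the final bound as long as the discharging ratios are non-negative, since the dominating frame then yields at least as large an MIC(ST).

```python
def dominates(a, b):
    """True if frame `a` has a strictly larger MIC than frame `b`
    for every cluster."""
    return all(x > y for x, y in zip(a, b))


def remove_dominated(frames):
    """Keep only the time frames that no other frame dominates.

    `frames` maps a frame name to its list of MIC(Ci,Tj) values."""
    return {
        name: mic
        for name, mic in frames.items()
        if not any(
            dominates(other, mic)
            for other_name, other in frames.items()
            if other_name != name
        )
    }


# Hypothetical cluster MICs (mA) for clusters C1 and C2.
frames = {"T3": [3.0, 2.0], "T6": [9.0, 4.0], "T9": [5.0, 8.0]}
kept = remove_dominated(frames)   # T3 is dropped; T6 and T9 remain
```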
28Variable Length Time-Frame Partitioning
- (Tb dominates Tc) and (Tb dominates Td)
- => the estimated upper bound will be smaller
- If all the MIC(Ci) are separated, the MIC(STi) can be better estimated!
[Figure: (1) uniform two-way partition; (2) variable-length two-way partition]
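A small sketch comparing a uniform two-way partition with a variable-length one on a made-up waveform. It only illustrates the effect of where the cut is placed; it is not the partitioning heuristic itself, which the slides do not spell out. All names and numbers are assumptions.

```python
def frame_mic(waveforms, start, end):
    """MIC(Ci,T) of every cluster for the frame covering time units
    [start, end)."""
    return [max(w[start:end]) for w in waveforms]


def st_bound(psi_row, waveforms, cuts):
    """Upper bound of MIC(ST) for one ST when the clock period is split
    at the time-unit indices listed in `cuts`."""
    bounds = []
    for start, end in zip(cuts[:-1], cuts[1:]):
        mic = frame_mic(waveforms, start, end)
        bounds.append(sum(w * c for w, c in zip(psi_row, mic)))
    return max(bounds)


# Hypothetical currents (mA) per time unit; C1 peaks at unit 0, C2 at unit 1.
waveforms = [[8.0, 2.0, 1.0, 1.0],
             [1.0, 7.0, 2.0, 1.0]]
psi_row = [0.6, 0.4]   # assumed discharging ratios into one ST

uniform = st_bound(psi_row, waveforms, [0, 2, 4])    # both peaks share a frame
variable = st_bound(psi_row, waveforms, [0, 1, 4])   # the cut separates the peaks
```

With the same number of frames, the cut that separates the two cluster peaks gives the smaller bound (about 5.2 mA versus about 7.6 mA here).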
29Problem Formulation of ST Sizing
- Inputs
- Voltage-drop constraint
- MIC(Ci,Tj): the clusters' MIC information
- Objective: minimize the total ST width
- Voltage drops must meet the constraint
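One way to write this formulation compactly is shown below. The symbols W(ST_i) for the width of the i-th ST and V_ST for the voltage-drop constraint are notational assumptions made here for illustration; R(ST_i) is the ST resistance and MIC(ST_i, T_j) the estimated current through ST_i in time frame T_j.

```latex
\begin{aligned}
\text{minimize}\quad   & \sum_{i} W(ST_i) \\
\text{subject to}\quad & MIC(ST_i, T_j) \cdot R(ST_i) \le V_{ST}
                         \qquad \forall\, i,\ \forall\, j
\end{aligned}
```

Since a wider ST has a lower resistance, every constraint pushes W(ST_i) up while the objective pushes it down.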
30ST Sizing Algorithm
1. Initialize ST size with a large value.
2. Update the discharging matrix.
3. Update MIC(STi,Tj) and voltage drops.
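   MIC(STi,Tj) = (discharging matrix) · MIC(Ci,Tj)
   V(STi,Tj) = MIC(STi,Tj) · R(STi)
   Matrix values shown on the slide:
   0.38 0.30 0.21 0.18
   0.27 0.30 0.21 0.18
   0.21 0.24 0.35 0.28
   0.14 0.16 0.23 0.36
4. Resize ST with the worst drop.
[Flowchart: steps 1 to 4 with a decision branch labeled "No"; the figure also shows the value 99 four times]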
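Step 3 can be sketched with the four rows of numbers that appear on this slide, read here as a discharging matrix with one row per ST and one column per cluster. That reading is an assumption, supported only by the fact that every column sums to 1. The cluster currents and ST resistances below are hypothetical, as are the function names.

```python
# Rows as they appear on the slide; assumed to be a discharging matrix.
PSI = [
    [0.38, 0.30, 0.21, 0.18],
    [0.27, 0.30, 0.21, 0.18],
    [0.21, 0.24, 0.35, 0.28],
    [0.14, 0.16, 0.23, 0.36],
]


def update_drops(psi, mic_c_t, r_st):
    """Return drops[j][i] = V(STi,Tj) = MIC(STi,Tj) * R(STi)."""
    drops = []
    for frame in mic_c_t:
        # MIC(STi,Tj): matrix row times the cluster MIC vector of frame Tj.
        mic_st = [sum(w * c for w, c in zip(row, frame)) for row in psi]
        drops.append([i_st * r for i_st, r in zip(mic_st, r_st)])
    return drops


def worst_drop(drops):
    """Return (ST index, frame index, voltage) of the largest drop."""
    return max(
        ((i, j, v) for j, frame in enumerate(drops) for i, v in enumerate(frame)),
        key=lambda t: t[2],
    )


# Hypothetical inputs: MIC(Ci,Tj) in amperes for two frames, R(STi) in ohms.
mic_c_t = [[0.010, 0.002, 0.001, 0.001],
           [0.001, 0.001, 0.002, 0.012]]
r_st = [5.0, 5.0, 5.0, 5.0]

drops = update_drops(PSI, mic_c_t, r_st)
st_index, frame_index, volts = worst_drop(drops)   # fourth ST, second frame
```

The ST returned by `worst_drop` is the candidate for resizing in step 4. After a resize the resistances change, so the matrix would have to be recomputed before the next pass.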
31Outline
- Sleep Transistor Sizing Problem
- MIC Estimation Mechanism
- Partitioned Time-Frame for MIC Estimation
- Experimental Results and Conclusions
32Environment Setup
- TSMC 130nm CMOS technology
- VDD: 1.3 V
- Specified tolerable IR drop: 5% of the ideal supply voltage
- MIC(Ci,Tj) is obtained via 10,000-random-pattern PrimePower simulations
33Implementation Flow
[Flow diagram: starts from the RTL netlist; steps are marked as either commercial tools or our tools]
34Experimental Results
Circuit: C432, C499, C880, C1355, C3540, C5315, C7552, dalu, frg2, i8, t481, des, AES, Avg.
Previous works: [2] Chiou et al., DAC'06; [8] Long et al., DAC'03
35Conclusions
- Propose an efficient sleep transistor sizing method for DSTN power gating designs
- Present theorems based on temporal perspective for estimating a tight upper bound of voltage drop
- Achieving 21% size (leakage) reduction
36 37Sleep Transistor (ST) Sizing
- Relation between WST and VST.
- Sleep transistors operate in the linear region in active mode.
38Sleep Transistor (ST) Sizing
- Determine the minimum required size (WST) based on
- MIC(ST)
- VST (IR-drop constraint)
Smaller MIC(ST) leads to a better ST size!
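A rough sketch of this sizing step, using the standard first-order model of a transistor in the linear region with a small drain-source voltage, I ≈ μnCox · (W/L) · (VDD − Vth) · VST. The process parameters (`mu_cox`, `v_th`, `length`) and the example current are hypothetical placeholders, not values from the slides. The supply and the 5% constraint follow the experimental setup, giving VST = 0.05 × 1.3 V = 0.065 V.

```python
def min_st_width(mic_st, v_st, mu_cox, v_dd, v_th, length):
    """Smallest ST width (same unit as `length`) whose drop stays within
    `v_st` when carrying `mic_st`, under the first-order linear-region
    model R = 1 / (mu_cox * (W/L) * (v_dd - v_th))."""
    r_max = v_st / mic_st                            # Ohm's law: largest allowed R(ST)
    w_over_l = 1.0 / (mu_cox * (v_dd - v_th) * r_max)
    return w_over_l * length


V_DD = 1.3            # volts, from the experimental setup
V_ST = 0.05 * V_DD    # 5% tolerable IR drop = 0.065 V

# Hypothetical device numbers, for illustration only.
width = min_st_width(
    mic_st=1e-3,      # 1 mA through the ST
    v_st=V_ST,
    mu_cox=300e-6,    # A/V^2
    v_dd=V_DD,
    v_th=0.3,         # volts
    length=0.13e-6,   # metres
)
# width is about 6.67e-6 m; halving mic_st halves the required width
```

In this model the required width scales linearly with MIC(ST), which is why a tighter MIC(ST) estimate translates directly into a smaller ST.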