Title: Dynamic and Leakage Power Reduction in MTCMOS Circuits Using an Automated Efficient Gate Clustering
1 Dynamic and Leakage Power Reduction in MTCMOS Circuits Using an Automated Efficient Gate Clustering Technique
- Mohab Anis, Shawki Areibi, Mohamed Mahmoud and Mohamed Elmasry
- VLSI Research Group, University of Waterloo, Canada
- School of Engineering, University of Guelph, Canada
2 Presentation Outline
- Low Power Design in DSM
- Concept of sleep transistors
- Previous work
- Sizing the sleep transistor
- Bin-Packing technique
- Set-Partitioning technique
- Conclusion and extended work done
3 Why Low Power Design?
- Growing market of mobile and handheld electronic systems.
- Difficulty in providing adequate cooling. Fans create noise and add to cost.
- Heat dissipation impacts packaging technology and cost.
- Increasing standby time of portable devices.
- In DSM regimes, leakage power has become as big a problem as dynamic power.
4 Concept of sleep transistors
- MTCMOS technology is an increasingly popular technique to reduce leakage power.
- Proper sleep transistor (ST) sizing is a key issue:
  - ST size ↑ ⇒ Area ↑, Pdynamic ↑, Pleakage ↑
  - ST size ↓ ⇒ Delay ↑
[Figure: Modeling of a sleep transistor as a resistor. Labels: LVT Logic Block, VX, R, I, SLEEP, HVT.]
5 First Approach [1]
- Single ST to support whole circuit
- Increase in interconnect resistance for distant blocks
- ST size ↑ to compensate added resistance ⇒ Area ↑, Pdynamic ↑, Pleakage ↑
- More significant in the DSM regime
- [1] S. Mutoh et al., "1-V Power Supply High-Speed Digital Circuit Technology with Multi-Threshold Voltage CMOS," IEEE J. of Solid-State Circuits, pp. 847-853, 1995.
[Figure: single HVT sleep transistor controlled by SLEEP.]
6 Second Approach [2]
- Single ST is sized according to a mutually exclusive discharge pattern algorithm.
- ST assignments are wasteful.
- Increase in interconnect resistance for distant blocks.
- ST size ↑ to compensate added resistance ⇒ Pdynamic ↑, Pleakage ↑
- More significant in the DSM regime.
- [2] J. Kao et al., "MTCMOS Hierarchical Sizing Based on Mutual Exclusive Discharge Patterns," in Proc. of 35th DAC, pp. 495-500, Las Vegas, 1998.
7 Sizing the sleep transistor
- Objective: constant ST size, causing 5% degradation in circuit speed.
- $(W/L)_{sleep} = \dfrac{I_{sleep}}{0.05\,\mu_n C_{ox}\,(V_{dd}-V_{tL})(V_{dd}-V_{tH})}$
- $I_{sleep}$ is chosen to be 250 μA.
- $(W/L)_{sleep} \approx 6$ for 0.18 μm CMOS technology
- $V_{tL}$ = 350 mV, $V_{tH}$ = 500 mV
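The sizing formula above can be evaluated numerically. The following is a minimal sketch, not the authors' tool: the slides give Isleep, VtL and VtH, but not Vdd or μnCox, so `vdd=1.8` and `mu_n_cox=440e-6` are hypothetical values, the latter picked only so that the result lands near the slide's figure of about 6. The function name `sleep_wl_ratio` is also an assumption.

```python
def sleep_wl_ratio(i_sleep, mu_n_cox, vdd, vt_low, vt_high, degradation=0.05):
    """(W/L) of the sleep transistor for a given speed degradation.

    i_sleep   : current through the sleep transistor, in A
    mu_n_cox  : process transconductance parameter, in A/V^2 (assumed value)
    vdd       : supply voltage, in V (assumed value)
    vt_low    : threshold of the LVT logic, in V
    vt_high   : threshold of the HVT sleep transistor, in V
    """
    return i_sleep / (degradation * mu_n_cox * (vdd - vt_low) * (vdd - vt_high))


# Values from the slide: Isleep = 250 uA, VtL = 350 mV, VtH = 500 mV.
# Vdd and mu_n*Cox are NOT in the slides; they are illustrative assumptions.
ratio = sleep_wl_ratio(i_sleep=250e-6, mu_n_cox=440e-6, vdd=1.8,
                       vt_low=0.35, vt_high=0.50)
print(f"(W/L)sleep = {ratio:.2f}")
```

With a different process parameter or supply voltage the ratio changes accordingly, so the number printed here should not be read as a result from the talk.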
8 4-bit CLA Adder
9 Preprocessing of Gate Currents
- Random inputs are applied to the CLA adder; the highest current discharge of each gate is monitored and multiplied by the corresponding switching activity.
- The peak current value is monitored, along with its time of occurrence and duration.
- Currents are combined into a single equivalent current: $I_{eq} = \max\{I_i\}$ when $\sum_i I_i$ in time $\le \max\{I_i\}$.
10 Timing Diagram
11 Preprocessing Heuristic
1. Initialize current vectors
2. Set all gates free to move to a sub-cluster
3. For all gates in circuit
     If gate i is not clustered yet
       assign gate i to new cluster k
       update cluster current vector
       calculate max current, start, end time
       For all other gates in circuit
         If (gate j is not clustered yet)
           add current of gate j to cluster k
           If (combination ≤ max current)
             append gate to cluster
             update cluster info
             set gate j locked in cluster k
       End For
   End For
4. Return all clusters formed.
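The heuristic above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: each gate's discharge is modeled as a rectangular pulse `(peak, start, end)`, and "combination ≤ max current" is read as "the peak of the summed waveform does not exceed the largest individual peak in the cluster". The names `peak_of_sum` and `precluster` and the sample pulses are hypothetical.

```python
from typing import List, Tuple

# (peak current in uA, start time, end time); rectangular pulses are an assumption
Pulse = Tuple[float, int, int]


def peak_of_sum(pulses: List[Pulse]) -> float:
    """Peak of the summed waveform of rectangular pulses (end time exclusive)."""
    events = []
    for peak, start, end in pulses:
        events.append((start, peak))
        events.append((end, -peak))
    events.sort()  # at equal times, pulse ends (negative) come before starts
    level = best = 0.0
    for _, delta in events:
        level += delta
        best = max(best, level)
    return best


def precluster(gates: List[Pulse]) -> List[List[int]]:
    """Greedily group gates whose discharges do not overlap in time."""
    clustered = [False] * len(gates)
    clusters: List[List[int]] = []
    for i in range(len(gates)):
        if clustered[i]:
            continue
        members = [i]            # gate i opens a new cluster
        clustered[i] = True
        for j in range(len(gates)):
            if clustered[j]:
                continue
            trial = [gates[m] for m in members] + [gates[j]]
            # accept gate j only if the combined peak stays at the largest single peak
            if peak_of_sum(trial) <= max(p[0] for p in trial):
                members.append(j)
                clustered[j] = True   # gate j is locked in this cluster
        clusters.append(members)
    return clusters


gates = [(100, 0, 2), (80, 2, 4), (90, 1, 3), (60, 4, 6)]
clusters = precluster(gates)
```

In this sketch the equivalent current of a cluster would be `peak_of_sum` over its members, which matches the Ieq = max{Ii} rule whenever the acceptance test passes.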
12 Bin-Packing Technique
- Objective: Minimize the number of used STs.
- Subject to:
  1. $\sum I_{eq} \le I_{max}$ for any ST.
  2. Each $I_{eq}$ is assigned only once.
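To make the bin-packing formulation concrete, here is a small sketch that uses the classic first-fit-decreasing heuristic. The slides do not say which solver was used, so this is a swapped-in technique that yields a feasible (not necessarily minimal) assignment; the function name and the sample currents are hypothetical, and only the 250 μA capacity comes from the sizing slide.

```python
from typing import List


def first_fit_decreasing(currents_ua: List[float], i_max_ua: float) -> List[List[int]]:
    """Assign equivalent currents to sleep transistors (bins).

    Returns a list of bins; each bin holds indices into currents_ua.
    Every current is assigned exactly once and no bin exceeds i_max_ua.
    """
    if any(c > i_max_ua for c in currents_ua):
        raise ValueError("an equivalent current exceeds the ST capacity")
    # Largest currents first, then place each in the first ST that still has room.
    order = sorted(range(len(currents_ua)), key=lambda i: currents_ua[i], reverse=True)
    bins: List[List[int]] = []
    loads: List[float] = []
    for i in order:
        for b, load in enumerate(loads):
            if load + currents_ua[i] <= i_max_ua:
                bins[b].append(i)
                loads[b] += currents_ua[i]
                break
        else:
            # No existing ST can take this current: open a new one.
            bins.append([i])
            loads.append(currents_ua[i])
    return bins


# Hypothetical equivalent currents in uA; Imax = 250 uA as on the sizing slide.
ieq = [120, 90, 80, 60, 50, 40, 30]
assignment = first_fit_decreasing(ieq, i_max_ua=250)
```

An exact minimum would need an integer-programming or exhaustive search; the heuristic is shown only because it fits in a few lines.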
13 Currents Assignment
[Figure: currents assignment. Entries listed in the order extracted. Sleep Transistors: 2, 1. Equivalent Currents: IEQ3 IEQ4 IEQ7; IEQ1 IEQ2 IEQ5 IEQ6. Assigned Gates: G1 G2 G3 G4 G9 G10 G11 G12 G13 G15 G17 G19 G20 G21 G22 G24 G25 G26 G27 G28; G5 G6 G7 G8 G14 G16 G18 G23. Σ Currents (μA): 240, 250.]
14 Clustering of CLA adder
15 Set-Partitioning Technique
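16 Cost Function
- $C_j = (w_1 \cdot C_{j1}) + (w_2 \cdot C_{j2})$
- $C_{j1} = \text{Sleep\_Transistor max\_current} - \sum_i \text{current}_i, \ \forall i$
- $C_{j2} = \sum d_{uv}$ in a group $S_j$
[Figure: group $S_j$ containing gates $G_u$, $G_v$, $G_w$ with pairwise distances $d_{uv}$, $d_{vw}$, $d_{wu}$.]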
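The cost of one candidate group can be computed as in the sketch below. It assumes Manhattan distance between gate positions, which the slides do not specify, and the gate coordinates, currents and the name `group_cost` are hypothetical.

```python
from itertools import combinations
from typing import Dict, Sequence, Tuple


def group_cost(group: Sequence[int],
               currents: Dict[int, float],
               positions: Dict[int, Tuple[float, float]],
               st_max_current: float,
               w1: float = 1.0,
               w2: float = 1.0) -> float:
    """C_j = w1 * C_j1 + w2 * C_j2 for one group S_j."""
    # C_j1: unused sleep-transistor capacity in this group.
    c_j1 = st_max_current - sum(currents[g] for g in group)
    # C_j2: sum of pairwise distances d_uv inside the group
    # (Manhattan metric is an assumption).
    c_j2 = sum(abs(positions[u][0] - positions[v][0]) +
               abs(positions[u][1] - positions[v][1])
               for u, v in combinations(group, 2))
    return w1 * c_j1 + w2 * c_j2


currents = {0: 100, 1: 80, 2: 50}                 # uA, illustrative
positions = {0: (0, 0), 1: (1, 0), 2: (1, 2)}     # layout units, illustrative
cost = group_cost([0, 1, 2], currents, positions, st_max_current=250)
```

Raising `w2` relative to `w1` penalizes spread-out groups more heavily, which is how routing complexity enters the objective.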
17 Clustering Heuristic
Create_Clusters ( )
  Calculate distances between all gates
  Initialize maxgates_per_cluster = n
  Create clusters with single gates
  For cl = 2; cl ≤ maxgates_per_cluster
    Create_n_Gate_Clusters (cl)
  For all clusters created calculate_cost ( )

Create_n_Gate_Clusters (cl)
  For cluster of type cl
    create_new_cluster ( )
    While not done
      Choose gate with minimum distances
      If sum of currents ≤ capacity
        append gate to newly created cluster
      End If
      If total gates within cluster ≥ limit
        break
    End While
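One possible reading of the clustering heuristic is sketched below. The slides leave several details open, so the following are assumptions: every gate is tried as a seed for each cluster size, "minimum distances" means the smallest total Manhattan distance to the gates already in the cluster, and a cluster is kept only if it reaches the requested size. The gate data and function names are hypothetical.

```python
from typing import FrozenSet, List, Set, Tuple

Gate = Tuple[float, float, float]  # (x, y, current in uA); values are illustrative


def distance(a: Gate, b: Gate) -> float:
    # Manhattan distance is an assumption; the slides do not name a metric.
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def create_clusters(gates: List[Gate], capacity: float,
                    max_gates_per_cluster: int) -> Set[FrozenSet[int]]:
    """Generate candidate groups of 1..max_gates_per_cluster nearby gates."""
    n = len(gates)
    dist = [[distance(gates[a], gates[b]) for b in range(n)] for a in range(n)]
    # Start with single-gate clusters (each gate is assumed to fit one ST).
    clusters: Set[FrozenSet[int]] = {frozenset([g]) for g in range(n)}
    for size in range(2, max_gates_per_cluster + 1):
        for seed in range(n):
            members = [seed]
            load = gates[seed][2]
            remaining = set(range(n)) - {seed}
            while remaining and len(members) < size:
                # Closest remaining gate to the current members (ties: lower index).
                nearest = min(remaining,
                              key=lambda g: (sum(dist[g][m] for m in members), g))
                remaining.discard(nearest)
                if load + gates[nearest][2] <= capacity:
                    members.append(nearest)
                    load += gates[nearest][2]
            if len(members) == size:
                clusters.add(frozenset(members))
    return clusters


gates = [(0, 0, 80), (1, 0, 80), (2, 0, 80), (10, 0, 80)]
candidates = create_clusters(gates, capacity=250, max_gates_per_cluster=3)
```

Each candidate produced this way would then be given a cost and passed to the set-partitioning step.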
18 Set-Partitioning Technique
- Objective: Minimize $\sum_j C_j S_j$
- Subject to:
  1. $\sum$ of currents for $S_j \le I_{max}$
  2. Groups must cover all gates with no repetition.
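For small instances the set-partitioning problem can be solved by exhaustive search, as in the sketch below. This is not the solver used in the talk (the slides do not name one); it is a plain backtracking search that assumes non-negative group costs and assumes the current constraint was already enforced when the candidate groups were generated. The group costs shown are hypothetical.

```python
from math import inf
from typing import Dict, FrozenSet, List, Tuple


def solve_set_partition(n_gates: int,
                        groups: Dict[FrozenSet[int], float]
                        ) -> Tuple[float, List[FrozenSet[int]]]:
    """Pick groups covering gates 0..n_gates-1 exactly once at minimum total cost.

    Returns (inf, []) if no exact cover exists.
    """
    candidates = sorted(groups.items(), key=lambda kv: sorted(kv[0]))
    best_cost = inf
    best_choice: List[FrozenSet[int]] = []

    def search(uncovered: FrozenSet[int], cost: float,
               chosen: List[FrozenSet[int]]) -> None:
        nonlocal best_cost, best_choice
        if cost >= best_cost:      # prune: valid because costs are non-negative
            return
        if not uncovered:
            best_cost, best_choice = cost, list(chosen)
            return
        target = min(uncovered)    # always cover the lowest uncovered gate next
        for group, c in candidates:
            # The group must contain the target and must not reuse a covered gate.
            if target in group and group <= uncovered:
                chosen.append(group)
                search(uncovered - group, cost + c, chosen)
                chosen.pop()

    search(frozenset(range(n_gates)), 0.0, [])
    return best_cost, best_choice


groups = {
    frozenset({0}): 10, frozenset({1}): 10, frozenset({2}): 10,
    frozenset({0, 1}): 12, frozenset({1, 2}): 15, frozenset({0, 1, 2}): 40,
}
total, chosen = solve_set_partition(3, groups)
```

The search grows exponentially with the number of candidate groups, so it is suitable only for illustrating the formulation on a handful of gates.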
19 Grouping of gates
[Figure: grouping of gates in the layout. Annotations: Cell, Lmin, Sleep Device cavity, Ground rail, Vdd, Cell Height. Gates G1 to G28 are placed in rows between the Vdd and gnd rails.]
20 Computational Time
[Figure: BP/SP CPU time. Series: SP CPU Time, BP CPU Time. Y-axis: Time (secs), ticks from -200 to 2000. X-axis: Number of Gates (28, 30, 31, 61, 160, 204).]
21 Results (% Savings)
22 % Power Savings (Bin-Packing)
23 % Power Savings (Set-Partitioning)
24 % ST Area Saving (Bin-Packing)
25 % ST Area Saving (Set-Partitioning)
26 Conclusion
- The BP technique clusters gates in MTCMOS circuits. Pdynamic and Pleakage are reduced by 15% and 90% compared to [1] and [2], respectively.
- SP takes routing complexity into consideration. Pdynamic and Pleakage are reduced by 11% and 77% compared to [1] and [2], respectively.
27 Extended Work Done
- A hybrid clustering technique that combines the BP and SP techniques is devised to produce a more efficient and faster solution.
- Noise associated with ground bounce is taken as a design criterion (< 50 mV).
- Investigating the effect of different ST sizes on circuit parameters.
- Investigating the effect of the cost function weights w1 and w2 on circuit parameters.