ECE260B CSE241A Winter 2005 Power Distribution - PowerPoint PPT Presentation

Provided by: andrew139
1
ECE260B CSE241A Winter 2005 Power Distribution
Website: http://vlsicad.ucsd.edu/courses/ece260b-w05
2
Motivation
  • Power supply noise is a serious issue in DSM
    design
  • Noise is getting worse as technology scales
  • Noise margin decreases as supply voltage scales
  • Power supply noise may slow down circuit
    performance
  • Power supply noise may cause logic failures

3
Power
  • Routing resources
  • 20-40% of all metal tracks used by Vcc, Vss
  • Increased power → denser power grid
  • Pins
  • Vcc or Vss pin carries 0.5-1W of power
  • Pentium 4 uses 423 pins, 223 of them Vcc or Vss
  • More pins → package more expensive
    (package development, motherboard redesign, ...)
  • Battery cost
  • 1kg NiCad battery powers a Pentium 4 alone for
    less than 1 hour
  • Performance
  • High chip temperatures degrade circuit
    performance
  • Large across-chip temperature variations induce
    clock skew
  • High chip power limits use of high-performance
    circuits
  • Power transients determine minimum power supply
    voltage

4
Power Package
Pentium 4 die is about 1.5 g and less than 1 cm³
Pentium 4 in package with interposer, heat sink,
and fan can be 500 g and 150 cm³
Modern processor packaging is complex and adds
significantly to product cost.
http://www.intel.com/support/processors/procid/ptype.htm
Courtesy M. McDermott UT-Austin
5
Planning for Power
  • Early simulation of major power dissipation
    components
  • Early quantification of chip power
  • Total chip power
  • Maximum power density
  • Total chip power fluctuations
  • Inherent and added fluctuations due to clock gating
  • Early power distribution analysis (dc, ac,
    multi-cycle)
  • I.e., average, maximum, multi-cycle fluctuations
  • Early allocation and coordination of chip resources
  • Wiring tracks for power grid
  • Low Vt devices
  • Dynamic circuits
  • Clock gating
  • Placement and quantity of added decoupling
    capacitors

6
Power and Ground Routing
  • Floorplanning includes planning how the power,
    ground and clock should route
  • Power supply distribution
  • Tree trunk must supply current to all branches
  • Resistance must be very small since when a gate
    switches, its current flows through the supply
    lines
  • If the resistance of supply lines is too large,
    voltage supplied to gates will drop, which can
    cause the gate to malfunction
  • Usually, want at most 5-10% IR drop due to supply
    resistance
  • → Usually on the top layers of metal, then
    distributed to lower wiring layers

7
Planar Power Distribution
  • Topology of VDD/VSS networks.
  • Inter-digitated
  • Design each macrocell such that all VDD and VSS
    terminals are on opposite sides.
  • If floorplan places all macrocells with VDD on
    same side, then no crossing between VDD and VSS.

[Figure: macrocells A, B, and C between inter-digitated VDD and VSS rails]
Courtesy K. Yang, UCLA
8
Gridded Power Distribution
  • With more metal layers, power is striped
  • Connection between the stripes allows a power
    grid
  • Minimizes series resistance
  • Connection of lower layer layout/cells to the
    grid is through vias
  • Note that planar supply routing is often still
    needed for a strong lower layer connection.
  • There may not be sufficient area to make a strong
    connection in the middle of a design (connect
    better at periphery of die)

Courtesy K. Yang, UCLA
9
Power Supply Drop/Noise
  • Supply noise: variations in power supply voltage
    that act as a noise source for logic gates
  • Power supply wiring resistance → voltage
    variations with current surges
  • Current surges depend on dynamic behavior of
    circuit
  • Solution approach
  • Measure maximum current required by each block
  • Redesign power/ground network to reduce
    resistance
  • Worst case: move activity to another clock cycle
    to reduce peak current → scheduling problem
  • Example: Drive 32-bit bus, total bus wire load
    2 pF, with delay 0.5 ns
  • R for each transistor needs to be < 0.25 kΩ to
    meet RC = 0.5 ns
  • Effective R of bits together is 250/32 ≈ 7.8 Ω
  • For < 10% drop, power distribution R must be < 1 Ω
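
The arithmetic in the bus example can be checked with a short script. This is only a sketch under stated assumptions: the 2 pF load is taken as the capacitance each driver sees (so that RC = 0.5 ns gives 250 Ω per driver), all 32 drivers are assumed to switch together and act as parallel resistors, and the 10% limit is modeled as a plain resistive divider between the distribution resistance and the drivers. All variable names are illustrative.

```python
# Worked check of the 32-bit bus example (numbers taken from the slide).
C_LOAD = 2e-12      # farads; assumed to be the load seen by each driver
T_DELAY = 0.5e-9    # seconds; target RC delay
N_BITS = 32
MAX_DROP = 0.10     # allowed fraction of the supply lost in the power wiring

# RC = T_DELAY gives the per-driver resistance bound.
r_driver = T_DELAY / C_LOAD
# All drivers switching at once look like N_BITS resistors in parallel.
r_bus = r_driver / N_BITS
# Divider: r_supply / (r_supply + r_bus) <= MAX_DROP
r_supply_max = MAX_DROP * r_bus / (1.0 - MAX_DROP)

print(f"per-driver R < {r_driver:.0f} ohm")
print(f"effective R  = {r_bus:.2f} ohm")
print(f"supply R     < {r_supply_max:.2f} ohm")
```

Under these assumptions the supply resistance bound comes out slightly below 1 Ω, in line with the rule quoted on the slide.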

Courtesy K. Yang, UCLA
10
Electromigration
  • Physical migration of metal atoms due to
    electron wind can eventually create a break in
    a wire
  • MTTF (mean time to failure) $\propto 1/J^2$, where $J$ =
    current density
  • Current density must not exceed specification: for
    each wire, $I_i/w_i < J_{spec}$
  • Specified as mA per µm wire width (e.g., 1 mA/µm)
    or mA per via cut
  • EM occurs both in signal (AC: bidirectional) and
    power wires (DC: unidirectional)
  • Much worse for DC than AC; DC occurs inside cells
    and in power buses
  • May need more contacts on transistor sources and
    drains to meet EM limits
  • Width of power buses must support both IR and EM
    requirements
  • Issues in IR and EM constraint generation
  • Topology is most likely not a tree
  • How do we determine current patterns?
  • Effects of R, L
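
The EM rules above reduce to simple arithmetic, sketched here. The 1 mA/µm default comes from the slide's example; the sheet resistance parameter, the function names, and the idea of taking the larger of the EM width and the IR width are assumptions made for illustration, not a statement of any tool's method.

```python
def min_width_um(current_ma, j_spec_ma_per_um=1.0):
    """Smallest wire width (um) that keeps I/w at or below the EM limit."""
    return current_ma / j_spec_ma_per_um

def relative_mttf(j_new, j_ref):
    """MTTF at j_new relative to MTTF at j_ref, using MTTF ~ 1/J^2."""
    return (j_ref / j_new) ** 2

def bus_width_um(current_ma, r_sheet_ohm_sq, length_um, v_drop_max,
                 j_spec_ma_per_um=1.0):
    """Width meeting both the EM limit and an IR-drop limit (larger wins).

    r_sheet_ohm_sq is a hypothetical sheet resistance in ohms per square.
    """
    w_em = min_width_um(current_ma, j_spec_ma_per_um)
    # R = r_sheet * length / width, so I * R <= v_drop_max bounds the width.
    w_ir = (current_ma * 1e-3) * r_sheet_ohm_sq * length_um / v_drop_max
    return max(w_em, w_ir)
```

For instance, doubling the current density in a wire cuts its expected lifetime to a quarter under the 1/J² relation, which is why power buses are sized for EM as well as IR drop.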

11
What Happens?
  • Example of an AlCu line seen under microscope.
  • Accelerated by higher temperature and high
    currents
  • Voids form on grain boundaries
  • Metal atoms move with current away from voids and
    collect at boundaries

Catastrophic failure
Courtesy K. Yang, UCLA
12
Taken from http://www.nd.edu/micro/fig20.html
Taken from Sverre Sjøthun, Electromigration In-Depth,
from www.dpwg.com
Courtesy S. Sapatnekar, UMinn
13
Power Supply Rules of Thumb
  • Rules depend on technology
  • Tech file has rules for resistance and
    electromigration
  • Examples
  • Must have a contact for each 16λ of transistor
    width (more is better)
  • Wire must have less than 1 mA/µm of width
  • Power/Gnd width = Length of wire × Sum (all
    transistors connected to wire) / (3×10^6 λ) (very
    approximate)
  • For small designs, power supply design is
    non-issue
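
The two numeric rules of thumb can be written down directly. This sketch assumes every length is expressed in λ and reads the garbled divisor in the width rule as 3×10^6 λ; both function names are made up for illustration, and the width rule is, as the slide says, very approximate.

```python
import math

def contacts_needed(transistor_width_lambda):
    """At least one contact per 16 lambda of transistor width."""
    return max(1, math.ceil(transistor_width_lambda / 16))

def supply_width_lambda(wire_length_lambda, total_transistor_width_lambda):
    """Very approximate power/ground wire width, all lengths in lambda."""
    return wire_length_lambda * total_transistor_width_lambda / 3e6
```

With these definitions, a 40λ-wide transistor needs three contacts, and a 3000λ supply wire feeding 10000λ of total transistor width would be about 10λ wide.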

Courtesy K. Yang, UCLA
14
Basic Methodology Concepts
  • Reliability (slotting, splitting)
  • Alignment of hierarchical rings, stripes
  • Isolation of analog power
  • Styles of power distribution
  • Rings and trunks
  • Uniform grid
  • Bottom-up grid generation
  • Depends on
  • Package: flip-chip vs. wire-bond, I/O count
    (fewer pads → denser grid)
  • Power budget
  • IR drop limits
  • Floorplan constraints (hard macros, etc.)

15
Metal Slotting vs. Splitting
  • Required by metal layout rules for uniform CMP
    (planarization)
  • Split power wires
  • Less data than traditional slotting
  • More accurate R/C analysis of power mesh
  • Not supported by all tools

[Figure: split vs. slotted GND wire on M1. Labels: "Easy connections through standard via arrays" and "Difficult to connect - where should vias go?"]
Courtesy Cadence Design Systems, Inc.
16
Trunks and Rings Methodology
  • Each Block has its own ring
  • Rings may be inside the blocks or part of the top
    level
  • Each Block has trunks connecting top level to
    block

[Figure: blocks 1-5, each with V and G rings. Labels: "Rings may be shared with abutted blocks" and "Individual trunks connecting blocks to top level"]
Courtesy Cadence Design Systems, Inc.
17
Trunks and Rings
  • Advantages
  • Power tailored to the demands of each block
    (flexible)
  • More area efficient since the demands of each
    block are uniquely met
  • Simple implementation supported by many tools
  • Rings can be shared between abutted blocks
  • Disadvantages
  • Limited redundancy, power grid built to match
    needs
  • Assumptions in design may change or be invalid
  • Non-regular structure requires more detailed IR
    drop/EM analysis
  • Missing vias/connections are fatal
  • Rings will require slotting/splitting due to wide
    widths
  • Increase in data volume

Courtesy Cadence Design Systems, Inc.
18
Uniform Chip Grid Methodology
  • Robust and redundant power network
  • mainly in microprocessors and high end large
    ASICs
  • Implementation
  • Primary distribution through upper metal layers
  • Lower layers in blocks to connect to top through
    via stacks
  • Typically pushed into blocks
  • Blocks typically abut
  • Requires block grids to align
  • Rows/Followpins should align with block pins
  • Global buffer insertion

[Figure: global V/G grid on higher layers over the blocks; fine or custom grid or no grid on lower layers]
Courtesy Cadence Design Systems, Inc.
19
Uniform Chip Grid
  • Advantages
  • Easily implemented
  • Lends itself to straightforward hand calculations
  • Path redundancy makes the grid less sensitive to
    changes in current pattern
  • Mesh of power/ground provides shielding (for
    capacitance) and current returns (for inductance)
  • Top-down propagation easy to use on this style
  • Disadvantages
  • Takes up significant routing resources (20-40%
    of all routing tracks if not already reserved for
    power/ground)
  • Fine grids may slow down P&R tools
  • Imposes grid structure into each block which may
    be unnecessary
  • Top and blocks coupled closely if top level
    routing pushed into blocks
  • Changes to block/top must be reflected in other

Courtesy Cadence Design Systems, Inc.
20
Bottom-Up Grid Generation Methodology
  • Design and optimize power grid for block, merge
    at top
  • Advantages
  • Able to tailor grid for routing resource
    efficiency in each block
  • Flexibility to choose the best grid for the block
    (i.e. ring and stripe, power plane, grid)
  • Disadvantages
  • Designing grid in context of the big picture is
    more difficult
  • Block grid may present challenging connections to
    top level
  • Assumptions for block grids connection to top
    level must be analyzed and validated

Courtesy Cadence Design Systems, Inc.
21
Power Routing in Area-Based P&R
  • Power routing approaches
  • (1) Pre-route parts of power grid during
    floorplanning
  • (2) Build grid (except connections to standard
    cells) before P&R
  • (3) Build entire grid before P&R
  • N.B. Area-based P&R tools respect pre-routes
    absolutely
  • Cadence tools: power routing done inside SE; all
    other tasks (clock, place, route, scan, ...) done
    by point tools
  • Lab 5 tomorrow has a tiny bit of power routing
    (rings, stripes)
  • Miscellany
  • ECOs: What happens to rings and trunks if blocks
    change size?
  • Layer choices: What is the cost of skipping layers
    (to get from thick top-layer metal down to finer
    layers)?
  • How wide should power wires be?
  • Post-processing strategies

Courtesy Cadence Design Systems, Inc.
22
Power Routing Wire Width Considerations
  • Slotting rules: choose maximum width below
    slotting width
  • Halation (width-dependent spacing) rules: do as
    much as possible of power routing below wide-wire
    width to save routing space
  • Choose power routing widths carefully to avoid
    blocking extra tracks (and use the space if
    blocking the track!)
  • What is a better power width here?
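
One way to reason about the width question is to count how many routing tracks a power wire blocks. The sketch below is a simplified model, not a tool rule: it assumes uniformly pitched tracks, a power wire centered on a track, a single fixed minimum spacing (so it ignores the width-dependent spacing mentioned above), and integer dimensions. The function name and parameters are hypothetical.

```python
def blocked_tracks(power_width, pitch, signal_width, spacing):
    """Count signal tracks made unusable by a power wire centred on a track.

    All dimensions are integers in the same unit (e.g. nm). A track at
    offset k*pitch from the power wire centre is blocked when a signal wire
    on it would sit closer than `spacing` to the power wire edge.
    """
    # Centre-to-centre distance needed for a legal neighbour, doubled so
    # the arithmetic stays in integers.
    need2 = power_width + signal_width + 2 * spacing
    k = 0
    # Walk outward until a track is far enough away to be legal.
    while 2 * (k + 1) * pitch < need2:
        k += 1
    return 2 * k + 1  # k tracks on each side plus the one under the wire
```

With a 200 nm pitch and 100 nm signal width and spacing, power widths of 300 nm and 500 nm both block three tracks in this model, so the wider wire costs no extra routing; at 501 nm the count jumps to five.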

Blocked tracks
Courtesy Cadence Design Systems, Inc.
23
Power Routing Tool Usage
  • 4-layer power grid example (HVHV)
  • Turn on via stacking
  • Route metal2 vertically
  • Route metal4 vertically (use same coordinates)
  • Route metal3 horizontally (make coincident with
    every N metal1 routes)
  • Turn off via stacking
  • Route metal1 horizontally

metal2/metal4 coincident
metal1 inside cells
metal3 every n micron
Courtesy Cadence Design Systems, Inc.
24
Post-Processing Flows (DEF or Layout Editing)
During PnR
After post processing
Courtesy Cadence Design Systems, Inc.
25
(Tree) Supply Network Design
  • Tree topology assumption not very useful in
    practice, but illustrates some basic ideas
  • Assume R dominates, L and C negligible
  • marginally permissible assumption
  • Current drawn at various points in the tree
    (time-varying waveform)
  • Current causes a V = IR drop
  • Ground is not at 0V
  • Vdd is not at intended level

[Figure: supply tree; labels: Supply, sinks]
Courtesy S. Sapatnekar, UMinn
26
IR Drop Constraints
  • Chowdhury and Breuer, TCAD 7/88
  • Can write V drop to each sink as
  • $\sum_i R_i I_i < V_{spec}$ for all sink current
    patterns made available
  • Tree structure: can compute $I_i$ easily
  • $R_i = \rho\, l_i / w_i$
  • Change $w_i$ to reduce IR drop
  • Objective: minimize $\sum_i a_i w_i$
  • Current density must never exceed a specification
  • For each wire, $I_i/w_i < J_{spec}$
[Figure: supply tree]
Courtesy S. Sapatnekar, UMinn
27
P/G Mesh Optimization (R only)
  • Dutta and Marek-Sadowska, DAC 89
  • Cost function: $\sum_i a_i l_i w_i = \rho \sum_i a_i c_i l_i^2$ //
    total wire area (since $c_i$ = conductance =
    $w_i/(\rho l_i)$)
  • Constraints
  • EM: $|I_i| \le \sigma_e w_i$ // current density I/w less
    than upper bound
  • Substitute $|I_i| = |v_p - v_q| \, w_i/(\rho l_i)$ // I = V/R
    $\Rightarrow |v_p - v_q| \le \sigma_e \rho\, l_i$
    // divide by $w_i$, multiply by $\rho l_i$
  • Wire width constraints: $W_{min} \le w_i \le W_{max}$
    (translate to $c_i$)
  • Voltage drop constraints: $v_a - v_b \le V_{spec1}$ and/or
    a bound $V_{spec2}$ on node voltage $v_i$
  • Circuit equations that determine the $v$'s
  • Variables: $c_i$'s ($v_i$'s depend on $c_i$'s)
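
The cost function and the EM and width constraints can be evaluated for a candidate set of conductances as sketched below. The sketch assumes unit weights (all a_i = 1), a single resistivity `rho` for every wire, and an EM bound named `sigma`; these names, and the tuple layout of a branch, are illustrative choices rather than the paper's notation.

```python
def wire_area(branches, rho):
    """Total wire area: sum of l_i * w_i with w_i = rho * c_i * l_i.

    branches: list of (p, q, length, conductance) tuples.
    """
    return sum(rho * c * l * l for (_, _, l, c) in branches)

def em_ok(branches, v, rho, sigma):
    """EM constraint per branch, written in terms of node voltages."""
    return all(abs(v[p] - v[q]) <= sigma * rho * l
               for (p, q, l, _) in branches)

def widths_ok(branches, rho, w_min, w_max):
    """Wire width bounds, translated from conductances back to widths."""
    return all(w_min <= rho * c * l <= w_max for (_, _, l, c) in branches)
```

Writing the EM check in terms of voltage differences is what removes the widths from that constraint, leaving the conductances as the only design variables.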

Courtesy S. Sapatnekar, UMinn
28
Solution Technique
  • Method of feasible directions
  • Find an initial feasible solution (satisfies all
    constraints)
  • Choose a direction that maintains feasibility
  • Make a move in that direction to reduce cost
    function
  • Given a set of $c_i$'s, must find corresponding $v_i$'s
  • Feasible direction method: move from point c to
    c'
  • c and c' must be close to each other (i.e., if
    you have the solution at c, the solution at c'
    corresponds to a minor change in conductances)
  • Solving for $v_i$'s = solving a system of linear
    equations
  • Solution at c is a good guess for the solution
    at c'
  • Converges in a few iterations
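
The warm-start idea can be illustrated with a small iterative solver for the node voltages. The slide does not say which linear solver is used; Gauss-Seidel is chosen here purely as an assumption because it makes the benefit of a good initial guess easy to see. The function name, argument layout, and tolerance are hypothetical, and the feasible-direction step itself is not shown.

```python
def solve_node_voltages(conductance, vdd_nodes, vdd, current, v_start,
                        tol=1e-9, max_iter=10000):
    """Gauss-Seidel solve of the nodal equations of a resistive supply grid.

    conductance : dict {(p, q): c} for each (undirected) branch
    vdd_nodes   : set of nodes tied to the supply voltage vdd
    current     : dict {node: amps drawn at that node}
    v_start     : dict {node: initial guess}, e.g. the previous solution
    Returns (voltages, number of sweeps used).
    """
    nbrs = {}
    for (p, q), c in conductance.items():
        nbrs.setdefault(p, []).append((q, c))
        nbrs.setdefault(q, []).append((p, c))
    v = dict(v_start)
    for n in vdd_nodes:
        v[n] = vdd
    for sweep in range(1, max_iter + 1):
        worst = 0.0
        for n in nbrs:
            if n in vdd_nodes:
                continue
            # KCL at node n: sum of c * (v_m - v_n) equals the current drawn.
            g = sum(c for _, c in nbrs[n])
            new = (sum(c * v[m] for m, c in nbrs[n]) - current.get(n, 0.0)) / g
            worst = max(worst, abs(new - v[n]))
            v[n] = new
        if worst < tol:
            return v, sweep
    return v, max_iter
```

Passing the voltages found at c as `v_start` when solving at a nearby c' typically needs fewer sweeps than starting from a flat guess, which is the point made above.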

Courtesy S. Sapatnekar, UMinn
29
Modeling Gate Currents
  • Currents in supply grid caused by
    charging/discharging of capacitances by logic
    gates
  • All analyses require generation of a worst-case
    switching scenario
  • Enumeration is infeasible → two basic approaches
  • Simulation-based methods: designer supplies
    "hot" vectors, or we try to generate these hot
    vectors automatically
  • Pattern-independent methods: try to estimate
    the worst case (can be expensive, very
    inaccurate)
  • Once current patterns are available, apply them
    to supply network to find out if constraints are
    satisfied

Courtesy S. Sapatnekar, UMinn
30
Complexity of Hot Vector Generation
  • Devadas et al., TCAD 3/92
  • Assume zero gate delays for simplicity
  • Find the maximum current drawn by a block of
    gates
  • Using a current model for each gate
  • Find a set of input patterns so that the total
    current is maximized
  • Boolean assignment problem: equivalent to
    Weighted Max-Satisfiability
  • Given a Boolean formula in conjunctive normal
    form (product of sums), is there an assignment of
    truth values to the variables such that the
    formula evaluates to True?
  • Checking for Satisfiability (for k-SAT, k > 2) is
    NP-complete
  • → Difficult even under zero gate delay assumption
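
To see why enumeration does not scale, the sketch below finds the hottest input transition for a toy block by brute force. The three gates, their peak currents, and the rule that a gate draws its peak current exactly when its output toggles are all made-up assumptions for illustration; the search visits all 2^n × 2^n pairs of input vectors, which is only possible because n = 3.

```python
from itertools import product

# Hypothetical 3-input block: each gate is (function of inputs, peak current).
GATES = [
    (lambda a, b, c: a and b, 2.0),          # AND2
    (lambda a, b, c: b or c, 1.5),           # OR2
    (lambda a, b, c: (a and b) != c, 3.0),   # XOR of the AND output with c
]

def block_current(before, after):
    """Zero-delay model: a gate draws its peak current iff its output toggles."""
    return sum(i for f, i in GATES if bool(f(*before)) != bool(f(*after)))

def hottest_pair(n_inputs=3):
    """Exhaustive search over all 2^n * 2^n input transitions."""
    vectors = list(product([False, True], repeat=n_inputs))
    return max(((block_current(u, v), u, v) for u in vectors for v in vectors),
               key=lambda t: t[0])
```

Each extra input quadruples the number of transitions to try, so anything beyond a small block needs either designer-supplied vectors or a pattern-independent bound.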

Courtesy S. Sapatnekar, UMinn
31
Pattern-Independent Methods
  • iMAX approach: Kriplani et al., TCAD 8/95
  • Current model for a single gate
  • Gates switch at different times
  • Total current drawn from Vdd (ignoring supply
    network C) is the sum of these time-shifted
    waveforms
  • Objective: find the worst-case waveform
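
The sum of time-shifted gate waveforms can be sketched numerically. The triangular pulse shape, the sampling grid, and the function names below are assumptions chosen for illustration; the slide only states that each gate has a current model with a peak and a duration related to its delay.

```python
def gate_pulse(t, t_switch, i_peak, width):
    """Triangular current pulse starting at t_switch and lasting `width`."""
    x = (t - t_switch) / width
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return i_peak * (1.0 - abs(2.0 * x - 1.0))  # peaks at mid-pulse

def total_current(t, events):
    """events: list of (t_switch, i_peak, width), one per switching gate."""
    return sum(gate_pulse(t, ts, ip, w) for ts, ip, w in events)

def peak_total(events, t_end, steps=1000):
    """Largest sampled total current on a uniform grid over [0, t_end]."""
    return max(total_current(k * t_end / steps, events)
               for k in range(steps + 1))
```

For two identical unit pulses, switching both at once gives a peak of 2, while staggering them by one pulse width gives a peak of 1, so the worst case depends on timing and not only on the individual peaks.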

[Figure: current model for a single gate; labels: Ipeak, Delay]
Courtesy S. Sapatnekar, UMinn
32
Example
  • Maximum current not just a sum of individual
    maximum currents
  • Temporal dependencies
  • Using deliberate clock skews can reduce the peak
    current, as we saw in the Useful-Skew discussion

Courtesy S. Sapatnekar, UMinn
33
Maximum Envelope Current (MEC)
  • Find the time interval during which a gate may
    switch
  • Manufacturing process variations can cause
    changes
  • Actual switching event can cause changes
  • Switching at second gate can occur at t = 1 or at
    t = 2
  • In general, a large number of paths can go
    through a gate: assume (conservatively) that
    switching occurs in $t \in [1,2]$
  • Assume that all gate inputs can switch
    independently: provides an upper bound on the
    switching current
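
The envelope construction can be sketched as follows: for each gate, take the largest current it could draw at time t over every start time in its switching window, then add the per-gate envelopes. The triangular pulse, the sampled search over start times, and all names are assumptions; because the start times are sampled, the result can sit slightly below the true envelope between samples.

```python
def pulse(dt, i_peak, width):
    """Triangular gate current; dt is the time since switching started."""
    x = dt / width
    return i_peak * (1.0 - abs(2.0 * x - 1.0)) if 0.0 < x < 1.0 else 0.0

def gate_envelope(t, window, i_peak, width, samples=200):
    """Largest current the gate could draw at time t if it may start
    switching at any instant inside window = (earliest, latest)."""
    lo, hi = window
    starts = [lo + (hi - lo) * k / samples for k in range(samples + 1)]
    return max(pulse(t - s, i_peak, width) for s in starts)

def mec(t, gates):
    """Maximum envelope current: gates are treated as independent, so the
    per-gate envelopes simply add. gates: list of (window, i_peak, width)."""
    return sum(gate_envelope(t, w, ip, wd) for w, ip, wd in gates)
```

The independence assumption is what makes the sum an upper bound, and it is also the source of the pessimism discussed on the next slide.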

(unit gate delays)
Courtesy S. Sapatnekar, UMinn
34
(Large) Errors in MEC Approach
  • Correlation Problem
  • Switching at G0, G1, G2 and G3 not independent
  • G0 = 0 implies that G1, G2, G3 switch; G0 = 1
    means that other inputs will determine gate
    activity
  • If the other inputs cannot make the gate switch
    in the same time window, then iMAX estimates are
    pessimistic
  • Reconvergent Fanout Problem
  • Signals that diverge at G0 reconverge at Gk →
    inputs to Gk are not independent
  • Assumption of independent switching is not valid
  • Many heuristic refinements proposed, but
    guardbanding (error) of power estimation still a
    huge issue

[Figure: gates G0, G1, G2, G3, and Gk]
Courtesy S. Sapatnekar, UMinn
35
Outline
  • Motivation
  • Power Supply Noise Estimation
  • Decoupling Capacitance (decap) Budget
  • Allocation of Decoupling Capacitance
  • Experimental Results
  • Conclusion

36
Why Decoupling Capacitance
  • Frequency point of view
  • Decaps form low-pass filters
  • They cancel anti- effects
  • Physical point of view
  • Decaps serve as charge reservoirs
  • They shortcut supply current paths and reduce
    voltage drop
  • No effect on DC supply currents

37
Power Supply Network: RLC Mesh
[Figure: RLC mesh; labels: VDD, Current Source, Rp, Lp, VDD pin]
Slide courtesy of S. Zhao, K. Roy, C.-K. Koh
38
Current Distribution in Power Supply Mesh
Illustration
[Figure: mesh with modules A, B, and C; labels: current contribution, current flowing path, connection point, VDD pin]
Slide courtesy of S. Zhao, K. Roy, C.-K. Koh
39
Current Distribution in Power Supply Network
  • Distribute switching current for each module in
    the power supply mesh
  • Observation: currents tend to flow along the
    least-impedance paths
  • Approximation: consider only those paths with
    minimal impedance (shortest, second shortest, ...)
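
One simple way to realize this approximation is a current divider over the k lowest-impedance paths. The sketch below assumes each path is summarized by a single real impedance magnitude and that current splits in proportion to admittance; the exact weighting used by the original authors is not given in the transcript, so this rule and the function name are assumptions.

```python
def split_current(total_current, path_impedances, k=2):
    """Share a module's switching current among its k lowest-impedance
    paths, in proportion to each path's admittance (current divider)."""
    best = sorted(path_impedances.items(), key=lambda kv: kv[1])[:k]
    y_total = sum(1.0 / z for _, z in best)
    return {name: total_current * (1.0 / z) / y_total for name, z in best}
```

For example, 3 A shared between paths of 1 Ω and 2 Ω (ignoring a third path of 10 Ω) splits as 2 A and 1 A.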

Slide courtesy of S. Zhao, K. Roy, C.-K. Koh
40
Current Flowing Paths and Power Supply Noise
Calculation
  • Power supply noise at a target module is the
    voltage difference between the VDD pin and the
    module
  • Apply KVL

[Figure: current path from VDD to the target module; labels: R2, L2, C1, k, i]
Slide courtesy of S. Zhao, K. Roy, C.-K. Koh
41
Why Decoupling Capacitance?
[Figure: supply path circuit; labels: VDD, R1, L1, C1, R2, L2, C2, k, i]
  • P/G network wire sizing won't change voltage drop
    frequency spectrum
  • To reduce Vdrop by a factor of k, wires along the
    supply current path must be sized up by a factor of k
  • Decoupling caps act as a low-pass filter
  • Efficient at removing high-frequency components of
    Vdrop
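
The low-pass behavior can be seen in a one-node model: a supply resistance R feeding a node that has a decap C to ground and a load drawing a short current pulse. The sketch below is an assumption-laden simplification (no inductance, a rectangular pulse, forward-Euler integration with a fixed step) and is not the noise model used in the slides; the function name and defaults are hypothetical.

```python
def peak_drop(r_supply, c_decap, i_load, t_pulse, dt=1e-12, t_end=5e-9):
    """Peak supply droop at a node fed through r_supply, with c_decap to
    ground, when the load draws a rectangular current pulse.

    Forward-Euler integration of C dV/dt = (VDD - V)/R - i(t), written in
    terms of drop = VDD - V. Requires dt to be well below r_supply * c_decap.
    """
    drop = 0.0          # node starts fully charged to VDD
    worst = 0.0
    steps = int(round(t_end / dt))
    for n in range(steps):
        i = i_load if n * dt < t_pulse else 0.0
        # d(drop)/dt = (i - drop / R) / C
        drop += dt * (i - drop / r_supply) / c_decap
        worst = max(worst, drop)
    return worst
```

With R = 1 Ω and a 0.1 A, 1 ns pulse, a 0.1 nF decap lets the droop reach almost the full I·R value, while a 10 nF decap holds it to roughly a tenth of that. A pulse much longer than RC would still settle at I·R, which matches the remark that decaps do not affect DC supply currents.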

42
Decoupling Capacitance Budget
  • Decap budget for each module can be determined
    based on its noise level
  • Initial budget can be estimated as follows
  • Iterations are performed if necessary until
    noise at each module in the floorplan is kept
    under certain limit

Slide courtesy of S. Zhao, K. Roy, C.-K. Koh
43
Allocation of Decoupling Capacitance
  • Decap needs to be placed in the vicinity of each
    target module
  • Decap requires white space (WS) in which to be
    manufactured
  • Use MOS capacitors
  • Decap allocation is reduced to WS allocation
  • Two-phase approach
  • Allocate the existing WS in the floorplan
  • Insert additional WS into the floorplan if
    required

Slide courtesy of S. Zhao, K. Roy, C.-K. Koh
44
Allocation of Existing White Space
[Figure: floorplan with modules A-E and white-space regions w1, w2, w3]
Slide courtesy of S. Zhao, K. Roy, C.-K. Koh
45
Allocation of Existing WS--Linear Programming
(LP) Approach
  • LP Approach
  • Objective: maximize the utilization of available
    WS
  • Existing WS can be allocated to neighboring
    modules using LP
  • Notation
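
The slide's notation did not survive in this transcript, so the following is a sketch of one plausible formulation rather than the authors' LP. It assumes a variable x(w, m) for the area of white space w given to an adjacent module m, with the total taken from each white space limited by its area and the total given to each module limited by its decap demand, and it maximizes the total allocated area. An LP of this particular shape is a bipartite flow problem, so the sketch solves it as a max-flow with integer areas (Edmonds-Karp) instead of calling an LP solver. All names are hypothetical.

```python
from collections import deque

def allocate_ws(ws_area, demand, adjacent):
    """Assign white-space area to neighbouring modules.

    ws_area  : {ws_name: available area}
    demand   : {module_name: area needed for its decap budget}
    adjacent : set of (ws_name, module_name) pairs that touch
    Areas are integers (e.g. in um^2). Returns {(ws, module): area}.
    """
    SRC, SNK = "_source", "_sink"
    cap = {}                                   # residual capacities

    def add_edge(u, v, c):
        cap.setdefault(u, {})[v] = c
        cap.setdefault(v, {}).setdefault(u, 0)

    for w, a in ws_area.items():
        add_edge(SRC, ("ws", w), a)
    for m, d in demand.items():
        add_edge(("mod", m), SNK, d)
    for w, m in adjacent:
        add_edge(("ws", w), ("mod", m), ws_area[w])

    while True:
        # Breadth-first search for an augmenting path in the residual graph.
        prev = {SRC: None}
        queue = deque([SRC])
        while queue and SNK not in prev:
            u = queue.popleft()
            for v, c in cap.get(u, {}).items():
                if c > 0 and v not in prev:
                    prev[v] = u
                    queue.append(v)
        if SNK not in prev:
            break
        # Find the bottleneck along the path, then push that much flow.
        path, v = [], SNK
        while prev[v] is not None:
            path.append((prev[v], v))
            v = prev[v]
        push = min(cap[u][v] for u, v in path)
        for u, v in path:
            cap[u][v] -= push
            cap[v][u] += push

    # Flow on a ws -> module edge equals the residual on the reverse edge.
    return {(w, m): cap[("mod", m)][("ws", w)] for w, m in adjacent
            if cap[("mod", m)][("ws", w)] > 0}
```

Any demand left unmet after this step is what the second phase, inserting additional WS, would have to cover.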

Slide courtesy of S. Zhao, K. Roy, C.-K. Koh
46
Insert Additional WS into Floorplan If Necessary
  • Update decap budget for each module after
    existing WS has been allocated
  • If additional WS is required, insert WS into the
    floorplan by extending it horizontally and
    vertically
  • Two-phase procedure
  • Insert WS band between rows based on the decap
    budgets of the modules in the row
  • Insert WS band between columns based on the decap
    budgets of the modules in the column

Slide courtesy of S. Zhao, K. Roy, C.-K. Koh
47
Moving Modules to Insert WS
Slide courtesy of S. Zhao, K. Roy, C.-K. Koh
48
Experimental Results: Comparison of Decap Budgets
(Ours vs. Greedy Solution)
49
Experimental Results for MCNC Benchmark Circuits
50
Floorplan of playout Before/After WS Insertion
51
Conclusion
  • A methodology for decoupling capacitance
    allocation at floorplan level is proposed
  • Linear programming technique is used to allocate
    existing WS to maximize its utilization
  • A heuristic is proposed for additional WS
    insertion
  • Compared with Greedy solution, our method
    produces significantly smaller decap budgets

52
Thank you