Title: Interconnect Planning


1
Interconnect Planning
  • Prof. David Z. Pan
  • dpan_at_ece.utexas.edu
  • Office ACES 5.434

2
Outline
  • Introduction
  • Two representative works on Interconnect Planning
  • Interconnect performance estimation modeling
    (IPEM: Cong-Pan, TCAD'01)
  • Buffer planning (Cong-Kong-Pan, TVLSI'01)
  • Rent's rule and SLIP workshop
  • SLIP stands for system-level interconnect
    prediction
  • SLIP Workshop web page at http://www.sliponline.org

3
Introduction
  • We have shown so far a lot of algorithms on
    interconnect optimization
  • They work effectively, but
  • They need some kind of planning for design
    closure
  • We have shown interconnect width planning
    (architecture planning) (Cong-Pan, DAC'99)
  • Interconnect performance estimation modeling
    (Cong-Pan, TCAD'01)
  • Buffer planning (Cong-Kong-Pan, TVLSI'01)

4
Interconnect Optimization Not Enough
  • Need high level tools to cooperate
  • Interconnect synthesis capability is limited
    without planning
  • What's their limit during early design planning?
  • Enough routing resources?
  • Locations for buffers for long interconnects?
  • ......

5
Needs for Efficient Interconnect Performance
Models
  • Efficiency
  • Abstraction to hide detailed design information
  • granularity of wire segmentation
  • number of wire widths, buffer sizes, ...
  • Explicit relation to enable optimal design
    decision at high levels
  • Ease of interaction with logic/high level
    synthesis tools

6
Problem Formulation
G
l
CL
  • Rd0: driver effective resistance of G0
  • Rd: driver effective resistance of G
  • l: interconnect length
  • CL: loading capacitance

7
Problem Formulation
G
l
G0
CL
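Input
Optimization options: OWS (optimal wire sizing), SDWS
(simultaneous driver and wire sizing), BIWS (buffer insertion
and wire sizing), BISWS (buffer insertion, sizing and wire sizing)
What is the optimized delay? Do not run the
optimization algorithm!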
8
Example: Delay/Area Estimation under WS
9
Delay/Area Estimation under 1-WS
  • Closed-form delay formula
  • Closed-form area formula

10
Delay/Area Estimation under OWS
  • Closed-form delay estimation formula

where W(x) is Lambert's W function, defined as the value w
that satisfies $w e^{w} = x$
  • Closed-form area estimation formula

11
Delay Comparison of Various WS Solutions
  • OWS model consistently matches TRIO
  • 1-WS and 2-WS work well for length < 8 mm in Tier1
  • All work well in Tier4 up to chip size

12
Average Width (Area) Comparison
  • Very close for the model

13
Average Width (Area) Comparison
  • Very close for 1-WS, 2-WS and OWS !

14
Property of DEM-OWS
  • Theorem: T_ows is a sub-quadratic, convex function
    of length l
  • Note: Without wiresizing, wiring delay ∝ l^2, as
    used by most layout-driven logic synthesis
    systems, e.g.
  • Ramachandran et al., ICCAD-92
  • Chen-Tsai-Kurdahi, IEE Proc. Circuits, Devices
    and Systems, 1995
  • Closed-form DEM-OWS will serve as a basis for
    deriving SDWS, BIWS and BISWS
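
The quadratic growth noted above for unsized wires can be checked with the standard Elmore delay of a uniform RC line. This is a minimal sketch, not the DEM-OWS formula from the slides; the function name and all parameter values are illustrative assumptions.

```python
def elmore_delay(rd, r, c, length, cl):
    """Elmore delay of a driver with resistance rd (ohm) driving a uniform,
    unsized wire (r ohm/um, c F/um, length in um) into a load cl (F)."""
    wire_r = r * length
    wire_c = c * length
    # The driver sees the whole wire plus the load; the distributed wire
    # resistance sees half of its own capacitance plus the load.
    return rd * (wire_c + cl) + wire_r * (wire_c / 2.0 + cl)


# Hypothetical technology values, chosen only for illustration.
RD, R, C, CL = 100.0, 0.1, 0.2e-15, 10e-15
for mm in (1, 2, 4, 8):
    print(mm, "mm:", elmore_delay(RD, R, C, mm * 1000.0, CL), "s")
```

With the driver resistance and load set to zero, doubling the length quadruples the delay, which is the l^2 behavior that wire sizing improves on.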

15
Comparison of DEM-OWS vs. TRIO
  • 0.18 µm, Rd = rg/100, CL = cg × 100
  • For the experiment, max wire width is 20× min; the wire is
    segmented every 10 µm

16
Critical Length for BI under OWS
Solve for l ⇒ critical length lcrit(b, Rd, CL)
  • Computed by bisection method
  • Constant time in practice
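
A bisection search of this kind can be sketched as follows. The equation actually solved on the slide is not preserved in this transcript, so the sketch assumes a simpler criterion: the critical length is the length at which inserting one buffer at the wire midpoint starts to reduce the Elmore delay of an unsized wire. All function names and parameter values are hypothetical.

```python
def wire_delay(rd, r, c, l, cl):
    """Elmore delay of a driver rd driving an unsized wire of length l into cl."""
    return rd * (c * l + cl) + r * l * (c * l / 2.0 + cl)


def buffered_delay(rd, r, c, l, cl, rb, cb, tb):
    """Delay with one buffer (rb, cb, intrinsic delay tb) at the midpoint."""
    half = l / 2.0
    return wire_delay(rd, r, c, half, cb) + tb + wire_delay(rb, r, c, half, cl)


def critical_length(rd, r, c, cl, rb, cb, tb, tol=1e-6):
    """Smallest length at which the midpoint buffer no longer hurts delay."""
    def gain(l):
        return wire_delay(rd, r, c, l, cl) - buffered_delay(rd, r, c, l, cl, rb, cb, tb)

    lo, hi = 0.0, 1.0
    if gain(lo) >= 0.0:
        return 0.0
    while gain(hi) < 0.0:          # grow the bracket until the buffer helps
        hi *= 2.0
    while hi - lo > tol:           # standard bisection on the sign of gain
        mid = (lo + hi) / 2.0
        if gain(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# Hypothetical values: ohm, ohm/um, F/um, F, ohm, F, s.
lc = critical_length(100.0, 0.1, 0.2e-15, 10e-15, 100.0, 10e-15, 20e-12)
print("critical length (um):", lc)
```

The number of bisection steps depends only on the bracket width and tolerance, which is why the search behaves as constant time in practice.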
17
Critical Length Meaning
18
Critical Lengths lcrit(b, Rb, Cb)
(unit: mm)
  • Denote lc = lcrit(b, Rb, Cb)
19
Logic Volume within lc
  • Defined as the number of minimum-size 2-input NAND gates
    that can be packed within an area of lc/2 × lc/2
  • Unit: million
20
Property of BIWS
  • Theorem: For BIWS, the distances between adjacent
    buffers are the same, and equal to lc, the
    critical length lcrit(b, Rb, Cb).
  • The proof is based on the convexity of T_ows

l
21
Linear DEM for BIWS
  • The original long interconnect is divided into ⌈l/lc⌉
    stages
  • The number of stages is proportional to l
  • Each stage of length lc has delay T_ows(Rb, lc,
    Cb)
  • ⇒ linear DEM for BIWS
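
The stage-counting argument above can be written as a one-line estimate. A minimal sketch, assuming the per-stage delay T_ows(Rb, lc, Cb) has already been computed elsewhere and is passed in as a number; the function name and the sample values are hypothetical.

```python
import math


def biws_delay_estimate(length, lc, stage_delay):
    """Linear delay estimate: ceil(length / lc) stages of equal delay."""
    stages = math.ceil(length / lc)
    return stages * stage_delay


# Hypothetical numbers: lc = 2.5 mm, 50 ps per stage.
for length_mm in (5.0, 10.0, 20.0):
    print(length_mm, "mm ->", biws_delay_estimate(length_mm, 2.5, 50.0), "ps")
```

Because the stage count grows in steps of lc, the estimate is a staircase that tracks a straight line through the origin.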

22
Comparison of DEM-BIWS vs. TRIO
  • TRIO is an interconnect synthesis engine
  • Rd0 = rg/10, CL = cg × 10, buffer type is 100×
    min.
  • For the experiment, max. wire width is 20× min. width;
    the wire is segmented every 100 µm.

23
Comparison of DEM-BIWS vs. TRIO
  • 0.18 µm, Rd0 = rg/10, CL = cg × 10, buffer type
    is 100× min.
  • For the experiment, max. wire width is 20× min. width;
    the wire is segmented every 100 µm.

24
DEM under BISWS
  • Observations from extensive experiments
  • Linear delay versus length
  • Internal buffers are about the same size
  • Therefore, we estimate BISWS by the best BIWS
    from available buffer types
  • Complexity: O(|B|). Since the buffer set B normally has
    fewer than 20 types, this is constant time in practice.
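
Estimating BISWS by the best BIWS over the available buffer types amounts to a single pass over the buffer library. A minimal sketch under stated assumptions: each buffer type is summarized by a precomputed critical length and per-stage delay, and the library contents and names below are hypothetical.

```python
import math


def biws_estimate(length, lc, stage_delay):
    """Linear BIWS estimate for one buffer type."""
    return math.ceil(length / lc) * stage_delay


def bisws_estimate(length, buffer_library):
    """Return (buffer name, delay) of the best BIWS estimate.

    buffer_library maps a buffer name to (lc, stage_delay).
    One pass over the library, so the cost is O(|B|).
    """
    best_name, best_delay = None, math.inf
    for name, (lc, stage_delay) in buffer_library.items():
        delay = biws_estimate(length, lc, stage_delay)
        if delay < best_delay:
            best_name, best_delay = name, delay
    return best_name, best_delay


# Hypothetical library: critical length in mm, stage delay in ps.
library = {"b50": (2.0, 60.0), "b100": (3.0, 70.0)}
print(bisws_estimate(12.0, library))
```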

25
Comparison of DEM-BISWS vs. TRIO
  • 0.18 µm, Rd0 = rg/10, CL = cg × 10
  • For the experiment, max. allowable buffer/driver size is
    400× min. device; max. wire width is 20× min.
    width; the wire is segmented every 100 µm.

26
Multiple-Pin Nets (TAU'99)
Cs1
G
S1
Sn
Csn
S2
Cs2
  • Estimation with different optimization
    objectives
  • Minimize the delay to a single critical sink
    (SCS)
  • Minimize the maximum delay (defined as the tree
    delay) for multiple critical sinks (MCS)
  • Minimize weighted delay ...

27
Key Idea
  • Estimation for Single Critical Sink
  • We first formulate the original problem into a
    single-line-multiple-load (SLML) problem
  • Then transform SLML into a single-line-single-load
    (SLSL) problem
  • Use previous 2-pin results to estimate delay and
    area on the critical path
  • Estimation for Multiple Critical Sinks
  • We obtain a lower bound delay estimation for the
    optimal tree delay
  • We show that in practice, the above lower bound
    estimation is tight and close to the optimal tree
    delay

28
Outline
  • Introduction
  • Two representative works on Interconnect Planning
  • Interconnect performance estimation modeling
    (IPEM: Cong-Pan, TCAD'01)
  • Buffer block planning during floorplanning
    (Cong-Kong-Pan, TVLSI'01)
  • Rent's rule and SLIP workshop
  • SLIP stands for system-level interconnect
    prediction
  • SLIP Workshop web page at http://www.sliponline.org

29
Why do we need Buffer Planning?
soft block
Hard (IP) block
  • Many buffers in modern designs
  • easily 10-20% of cells now,
  • projected to be 70% in future microprocessors according
    to an Intel paper (ISPD'03)
  • Restriction from hard IP blocks
  • Impact on floorplan and placement
  • ⇒ Need to plan ahead to ensure timing/design
    convergence.

30
Limitation of Previous Works
  • Buffer Insertion
  • mostly done in a net by net manner after detailed
    placement
  • mostly no obstacles (hard IP blocks, etc)
    considered
  • no global buffer planning (only manual or
    semi-manual planning)
  • buffers are distributed in almost random manner
    across the entire chip

31
Buffer Block Planning with Floorplanning
  • Given: initial floorplan and performance
    constraint for each net
  • Output: optimal location/dimension of buffer
    blocks such that the overall chip area and the
    number of buffer blocks are minimized

32
Feasible Region for BI
  • The feasible region is the maximal region in which a
    buffer can be placed while still meeting the given delay
    constraint

1 buffer
driver
CL
k buffers
driver
CL
33
Feasible Region for One Buffer
Cb
Rb
Rd
Tb
l
CL
x
34
Feasible Region for One Buffer
  • We obtain closed-form formula of FR for inserting
    one buffer to meet delay constraint
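
One way to see why a closed form exists: under the Elmore model for an unsized wire, the delay with a buffer at position x is a quadratic in x, so the feasible region is the interval between its two roots. The sketch below works under that assumption; the exact formula on the slide is not preserved in this transcript, and the function names and parameter values are hypothetical.

```python
import math


def delay_with_buffer(x, l, rd, rb, cb, tb, cl, r, c):
    """Elmore delay of driver -> wire(x) -> buffer -> wire(l - x) -> load."""
    rest = l - x
    first = rd * (c * x + cb) + r * x * (c * x / 2.0 + cb)
    second = rb * (c * rest + cl) + r * rest * (c * rest / 2.0 + cl)
    return first + tb + second


def feasible_region(l, t_req, rd, rb, cb, tb, cl, r, c):
    """Interval [x_min, x_max] where the delay is <= t_req, or None."""
    # delay_with_buffer(x) - t_req expands to k1*x^2 + k2*x + k3.
    k1 = r * c
    k2 = (rd - rb) * c + r * (cb - cl) - r * c * l
    k3 = rd * cb + tb + rb * (c * l + cl) + r * l * (c * l / 2.0 + cl) - t_req
    disc = k2 * k2 - 4.0 * k1 * k3
    if disc < 0.0:
        return None                      # constraint cannot be met
    root = math.sqrt(disc)
    x_min = max(0.0, (-k2 - root) / (2.0 * k1))
    x_max = min(l, (-k2 + root) / (2.0 * k1))
    return (x_min, x_max) if x_min <= x_max else None


# Hypothetical 6 mm net (um, ohm, F, s), delay budget 10% above the best delay.
args = (100.0, 100.0, 10e-15, 20e-12, 10e-15, 0.1, 0.2e-15)
t_opt = delay_with_buffer(3000.0, 6000.0, *args)
print(feasible_region(6000.0, 1.1 * t_opt, *args))
```

With these illustrative numbers a 10% delay slack already yields an interval covering a large fraction of the wire.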

35
KEY Observation for FR
  • Even under tight delay constraint, FR for BI can
    still be very large!

⇒ FR provides a lot of flexibility in planning
buffer locations
36
Extension I: FR for Multiple Buffers
k
1
i
Rd
Rb
Cb
Tb
xi
CL
  • More complicated, but still closed-form solution
    for FR
  • We also obtain the minimum number of buffers kmin
    needed to meet delay constraint

37
Extension II: 2D Feasible Region
  • FR extended to two dimensions with obstacles

sink
source
38
Overall Picture: BBP for Interconnect-Driven
Floorplanning
  • For each floorplan (FL) configuration
  • Apply BBP on the given FL
  • Evaluate resulting FL in terms of timing, area,
    BB trade-off, etc.
  • Return the best FL solution

39
Experimental Setting
  • Two algorithms
  • RDM: no buffer planning, i.e., a buffer is
    randomly placed at any feasible location
  • BBP: buffer block planning
  • Two scenarios
  • RES: restricted, delay-minimal BI position(s)
  • FR: feasible region
  • 6 MCNC + 5 randomly generated circuits (0.18 µm
    technology)
  • Delay budget randomly assigned between 1 and 1.2 ×
    Topt

40
Nets That Meet Delay Target
FR provides a lot more flexibility than RES to
better meet delay target (e.g., to avoid
obstacles during BI)
41
Comparison of BB
BBP reduces the number of buffer blocks (BB) relative to RDM
by a factor of up to 3×.
BBP/FR further reduces BB relative to BBP/RES by up
to 34%.
42
Normalized Total Chip Area after BI
BBP/FR can effectively cluster more individual
buffers and put them into dead area (up to 7% area
saving)
43
Summary of Experimental Results
BBP/FR provides the best solution.
44
Summary and Recent Trends
  • A key concept here is the feasible region (FR)
  • FR provides a lot more flexibility to better meet
    delay constraint and plan buffer locations
  • Many follow-up and related works
  • You can do an advanced search for "buffer"
    and "planning" and "2003" in IEEE Xplore
  • 18 papers in 2003
  • 16 papers in 2002
  • Independent feasible region: Koh's group, ISPD
    2000
  • Buffer sites, even inside macros: Alpert et al.,
    DAC 2001
  • With noise consideration: Li et al., ASPDAC'03
  • With congestion/routability consideration: Ma et
    al., DAC'03; Sham and Young, TCAD'03
  • Buffer planning should be more important with
    growing number of buffers

45
A Priori System-Level Interconnect
Prediction: Rent's Rule and Wire Length
Distribution Models
Thanks to Dirk Stroobandt, Ghent University
46
Why A Priori Interconnect Prediction?
  • Interconnect: the importance of wires increases (they
    do not scale as components do).
  • A priori:
  • For future designs, very little is known.
  • The sooner information is available, the better.
  • A priori interconnect prediction: estimating
    interconnect properties and their consequences
    before any layout step is performed.
  • Extrapolation to future systems: roadmaps.
  • To improve CAD tools for design layout
    generation.
  • To evaluate new computer architectures.

47
The Three Basic Models
Circuit model
Logic block
Net
Terminal / pin
48
Rent's Rule
Rent's rule was first described by Landman and
Russo in 1971. For the average number of terminals
and blocks per module in a partitioned design:
$T = t B^{p}$
p = Rent exponent
t ≅ average number of terminals per block
p is a measure of the complexity of the interconnection
topology (intrinsic Rent exponent p):
(simple) 0 ≤ p ≤ 1 (complex)
Normal values: 0.5 ≤ p ≤ 0.75
  • B. S. Landman and R. L. Russo. On a pin versus
    block relationship for partitions of logic
    graphs. IEEE Trans. on Comput., C-20, pp.
    1469-1479, 1971.

49
Rent's Rule (cont.)
Rent's rule is a result of the self-similarity
within circuits.
Assumption: the complexity of the interconnection
topology is equal at all levels.
50
Rent's Rule (other definition)
Consider a (dense) region with B cells and T terminals.
If ΔB cells are added, what is the increase
ΔT? In the absence of any other information we
guess
$\Delta T = \frac{T}{B}\,\Delta B$
This is an overestimate: many of the ΔT terminals connect to
the T terminals and so do not contribute to the
total. We introduce a factor p (p < 1) which
indicates how self-connected the netlist is (including the
effect of placement optimization):
$\Delta T = p\,\frac{T}{B}\,\Delta B$
(statistically homogeneous system)
Or, if ΔB and ΔT are small compared to B and T:
$\frac{dT}{T} = p\,\frac{dB}{B} \;\Rightarrow\; T = t B^{p}$
  • P. Christie and D. Stroobandt. The
    Interpretation and Application of Rent's Rule.
    IEEE Trans. on VLSI Systems, Special Issue on
    SLIP, vol. 8 (no. 6), pp. 639-648, Dec. 2000.

51
Rent's Rule (summary)
$T = t B^{p}$
Rent's rule is experimentally validated for many
benchmarks.
  • Distinguish between
  • p: intrinsic Rent exponent
  • p: placement Rent exponent
  • p: partitioning Rent exponent

[Figure: average T versus B, compared with Rent's rule]
Deviation for high B and T: Rent's region II.
Deviation for low B and T: Rent's region III.
  • B. S. Landman and R. L. Russo. On a pin versus
    block relationship for partitions of logic
    graphs. IEEE Trans. on Comput., C-20, pp.
    1469-1479, 1971.
  • D. Stroobandt. On an efficient method for
    estimating the interconnection complexity of
    designs and on the existence of region III in
    Rent's rule. Proc. GLSVLSI, pp. 330-331, 1999.
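
Since $T = t B^{p}$ is a straight line in log-log space, the Rent parameters of a partitioned design can be estimated with an ordinary least-squares fit on (log B, log T). This is a minimal sketch of that idea, not the estimation method of the cited papers; the function name and the synthetic data are assumptions.

```python
import math


def fit_rent(blocks, terminals):
    """Fit T = t * B**p by least squares on log-log data; returns (t, p)."""
    xs = [math.log(b) for b in blocks]
    ys = [math.log(t) for t in terminals]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    p = sxy / sxx                          # slope = Rent exponent
    t = math.exp(mean_y - p * mean_x)      # intercept = terminals per block
    return t, p


# Synthetic data generated from t = 3, p = 0.6 (illustration only).
sizes = [4, 16, 64, 256]
pins = [3.0 * b ** 0.6 for b in sizes]
print(fit_rent(sizes, pins))
```

On measured data, points from regions II and III deviate from the line and would normally be excluded before fitting.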

52
Wirelength Estimation
1. Partition the circuit into 4 modules of equal
size such that Rent's rule applies (minimal
number of pins).
2. Partition the Manhattan grid in 4 subgrids of
equal size in a symmetrical way.
  • W. E. Donath. Placement and Average
    Interconnection Lengths of Computer Logic. IEEE
    Trans. on Circuits Syst., vol. CAS-26, pp.
    272-277, 1979.

53
Donath's Hierarchical Placement Model
3. Each subcircuit (module) is mapped to a
subgrid.
4. Repeat recursively until all logic blocks are
assigned to exactly one grid cell in the
Manhattan grid.
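
Steps 1 to 4 can be sketched as a recursive quadrisection. This is a minimal illustration, assuming the number of blocks is a power of 4 and using a plain in-order split as a stand-in for the min-cut partitioner that the model presumes; the function name is hypothetical.

```python
def hierarchical_place(blocks, x0=0, y0=0, size=None, placement=None):
    """Map each block to one cell of a square Manhattan grid.

    blocks: list of hashable block ids; its length must be a power of 4.
    Returns a dict block -> (x, y).
    """
    if placement is None:
        n = len(blocks)
        if n < 1:
            raise ValueError("need at least one block")
        while n % 4 == 0:
            n //= 4
        if n != 1:
            raise ValueError("number of blocks must be a power of 4")
        placement = {}
        size = int(round(len(blocks) ** 0.5))   # grid side length
    if len(blocks) == 1:
        placement[blocks[0]] = (x0, y0)         # one block per grid cell
        return placement
    quarter = len(blocks) // 4
    half = size // 2
    offsets = [(0, 0), (half, 0), (0, half), (half, half)]
    for i, (dx, dy) in enumerate(offsets):
        # Stand-in for a real partitioner: take consecutive quarters.
        module = blocks[i * quarter:(i + 1) * quarter]
        hierarchical_place(module, x0 + dx, y0 + dy, half, placement)
    return placement


print(hierarchical_place(list(range(16))))
```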
54
Donath's Length Estimation Model
  • At each level, Rent's rule gives the number of
    connections
  • The number of terminals per module follows directly
    from Rent's rule (partitioning-based Rent exponent
    p)
  • The number of nets cut at level k (N_k) equals
  • where α depends on the total number of nets in
    the circuit and is bounded by 0.5 and 1.
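
The per-level terminal count can be tabulated directly from Rent's rule. A minimal sketch, assuming that level 0 is the whole circuit of G blocks and that each level splits every module into 4, so a module at level k holds G / 4^k blocks; the level numbering, the function name, and the sample values are assumptions, and the N_k formula itself is not reproduced here.

```python
def terminals_per_module(total_blocks, t, p, levels):
    """Terminals per module at levels 0..levels, using T = t * B**p."""
    result = []
    for k in range(levels + 1):
        blocks_per_module = total_blocks / 4 ** k
        result.append(t * blocks_per_module ** p)
    return result


# Hypothetical circuit: G = 256 blocks, t = 4 terminals per block, p = 0.5.
for level, terms in enumerate(terminals_per_module(256, 4.0, 0.5, 4)):
    print("level", level, "terminals per module:", terms)
```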

55
Donath's Length Estimation Model
Length of the connections at level k?
Adjacent (A-) combination
Diagonal (D-) combination
Donath assumes all connection source and
destination cells are uniformly distributed over
the grid.
56
Results: Donath
Scaling of the average length L as a function of
the number of logic blocks G
Similar to measurements on placed designs.
57
Results: Donath
Theoretical average wire length too high by
factor of 2
58
Occupation Probability Function
Same result found by using a terminal
conservation technique:
$T_{A \to C} = T_{AB} + T_{BC} - T_{B} - T_{ABC}$
Assumption: a net cannot connect A, B, and C
  • J. A. Davis et al. A Stochastic Wire-length
    Distribution for Gigascale Integration (GSI) -
    PART I Derivation and Validation. IEEE Trans. on
    Electron Dev., 45 (3), pp. 580 - 589, 1998.

59
Occupation Probability Function
For cells placed in infinite 2D plane
60
Occupation Probability Results
  • Use probability on each hierarchical level (local
    distributions).

[Figure: average wire length L (0 to 8) versus number of logic
blocks G (10 to 10000), comparing the occupation probability
model, Donath's model, and experiment]