Title: Ralph H.J.M. Otten Eindhoven University of Technology Eindhoven, The Netherlands otten@ics.ele.tue.nl
1are wires plannable?
- Ralph H.J.M. Otten, Eindhoven University of Technology, Eindhoven, The Netherlands, otten_at_ics.ele.tue.nl
- Giuseppe S. Garcea, Delft University of Technology, Delft, The Netherlands, giuseppe_at_cas.et.tudelft.nl
2wire planning
- 1987: providing floorplan design with alignment constraints
- a floorplan is a data structure capturing the relative positions (i.e. no geometry, possibly overlap, several optimizations)
- alignment to save wire area (data path generator)
- often tremendous reduction in routing complexity
- in essence not limited to "data path" regularity
- 1998: fixing (and maximizing) time budgets for modules
- remove global iteration from synthesis
- fix total path delay
- provide pre-placement and pin positioning data
- enable early retiming, layer assignments, system partitioning
- ensure satisfaction of system timing requirements
- this talk
- iteration free synthesis: what is needed?
- trends in chip industry: where do the wires go?
- some directions
3iteration free synthesis (silicon compilers)
conceptual design
gate and net list
library
foot print
technology
weighted incidence structure
wire length and area minimization under technology constraints
timing was an incidental, usually surprisingly good, result of a synthesis flow with size as its prime objective
layout synthesis
4iterative timing optimization
conceptual design
wire loads, resistances, critical paths
library
foot print
technology
layout synthesis
buffer insertion, transistor sizing, fanout trees
5timing awareness in conventional flows
synthesis uses delay models but has very limited information
resynthesis accepts additional constraints and wire load models
layout synthesis tries to reduce total wire length and area
timing is the (arbitrary) outcome of
- a sequence of optimizations with other objectives
- adding constraints and resynthesis ⇒ bringing it to a local optimum
- adding more constraints and resynthesis ⇒ bringing it to another local optimum
desired: a flow that satisfies timing constraints exactly whenever possible
6sutherland's delay formula
note the absence of resistance between driver and load
- g: computing effort
- size independent!
- depends on
- function
- topology
- device size
p: inherent (parasitic) delay, size independent
g/f: effort delay
1/f: restoring effort
if f is kept constant, then delay stays constant
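The term labels on this slide suggest a gate delay of the form p + g/f, with f the ratio of input capacitance to load capacitance (so 1/f is the restoring effort). The sketch below assumes exactly that form; the function names and the numbers are illustrative and not taken from the talk. It shows that scaling a gate and its load by the same factor leaves the delay unchanged.

```python
def gate_delay(p, g, f):
    """Delay of one gate, assumed to be p + g/f.

    p: inherent (parasitic) delay
    g: computing effort
    f: input capacitance divided by load capacitance (1/f is the restoring effort)
    """
    return p + g / f


def delay_from_capacitances(p, g, c_in, c_load):
    # f = c_in / c_load, so scaling gate and load together leaves f unchanged
    return gate_delay(p, g, c_in / c_load)


# Illustrative numbers only.
d_small = delay_from_capacitances(p=2.0, g=4.0 / 3.0, c_in=1.0, c_load=4.0)
d_large = delay_from_capacitances(p=2.0, g=4.0 / 3.0, c_in=8.0, c_load=32.0)
```

Both calls use f = 0.25, so d_small and d_large are equal even though the second gate is eight times larger.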
7continuously sized networks
gate size
f, the scaling factor, is the same for all inputs
a, the area sensitivity, is a property of the gate, that is function, topology, sizing
8continuously sized networks
the size of a gate with constant delay varies linearly with the load: gate size = a f C
in vector notation
9timing closure
the size of a gate with constant delay varies linearly with the load: gate size = a f C
Sutherland and Sproull, VLSI, 1991
Grodstein et al., ICCAD, 1995
constant delay methodology
gain-based synthesis
fixed delays
fixed timing
guaranteed timing
performance planning
10synthesis under timing constraints
no iterative loop has been created!
conceptual design
behavioral synthesis
logic synthesis
library
data preparation
foot print
technology
area optimization
layout synthesis
timing analysis
insert buffers to reduce area
11size assignment
vector of effort reciprocals that is Cin/Cout
N
f
weighted incidence matrix
a vector
TIMING GUARANTEED
- for f was fixed
- buffers inserted for area recovery only where enough slack is available!
implied by the calculated input capacitances
netlist possibly modified by inserted buffers
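A minimal sketch of size assignment with fixed f, under the assumption that the load of a gate is a fixed part (wires, primary outputs) plus the input capacitances of the gates it drives. With c the vector of input capacitances, N the weighted incidence matrix and q the fixed loads, this gives c = f (N c + q) per gate. The function name, the data layout and the example network are hypothetical.

```python
def assign_sizes(f, fanout, fixed_load, sweeps=100, tol=1e-12):
    """Solve c = f * (N c + q) by repeated substitution.

    f[i]          : fixed ratio C_in/C_out of gate i
    fanout[i]     : dict {j: weight}, row i of the weighted incidence matrix N
    fixed_load[i] : part of the load of gate i that depends on no gate size (q)
    Returns the list of input capacitances, one per gate.
    """
    n = len(f)
    c = [0.0] * n
    for _ in range(sweeps):
        new_c = [
            f[i] * (fixed_load[i] + sum(w * c[j] for j, w in fanout[i].items()))
            for i in range(n)
        ]
        if max(abs(a - b) for a, b in zip(new_c, c)) <= tol:
            return new_c
        c = new_c
    raise ValueError("no convergence within the given number of sweeps")


# Hypothetical chain: gate 0 drives gate 1, gate 1 drives gate 2, gate 2 drives a load of 8.
f = [0.25, 0.25, 0.5]
fanout = [{1: 1.0}, {2: 1.0}, {}]
fixed_load = [0.0, 0.0, 8.0]
input_caps = assign_sizes(f, fanout, fixed_load)
```

For an acyclic network the substitution settles after at most as many sweeps as the network is deep. Because every f is fixed beforehand, the gate delays do not change while the sizes are computed. Multiplying each input capacitance by the area sensitivity a of its gate would then give the gate sizes.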
12resistive interconnect
problem 1: how to cope with resistive interconnect while its delay models cannot be made size independent?
13the new synthesis problem
logic synthesis is to provide an initial netlist and the restoring effort 1/f for every gate!
- sutherland's principle of uniform stage effort
- brayton's uniform stage delay
- technology mapping for speed
problem 2: how can we prevent synthesis from generating networks that preclude satisfying timing constraints, while timing-correct networks exist?
problem 1: how to cope with resistive interconnect while its delay models cannot be made size independent?
14synthesis with wire planning
no iterative loop has been created!
timing budgets
conceptual design
behavioral synthesis
logic synthesis
wire planning
library
data preparation
foot print
technology
preplacement, pin assignment, layer assignment, wire structures
area optimization
layout synthesis
timing analysis
15global wire theory
global wires are interconnects whose delay can be improved by inserting restoring circuitry
assumptions
- global interconnections are always point-to-point wires
- first moment matching is accurate enough
- restoring circuits are modeled with sakurai's first order model
- the length of a section, the critical length, depends on the wiring layer, but not on the buffer size, and tends to be constant when measured in feature sizes
- the delay of an optimally segmented line is linear in its length; path delay is therefore independent of the position of the restoring circuits on the path
- the delay of a section of an optimally buffered line is the same for all layers
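The last three bullets can be checked numerically. The sketch below assumes a first order section delay of the form a·R·C + b·(R_drv·(C_in + C_par) + R_drv·C + R·C_in), with a about 0.4 and b about 0.7, and buffers whose resistance and capacitances scale with size. That form, the function names and all technology numbers are assumptions for illustration, not values from the talk.

```python
import math

A_COEF = 0.4  # weight of the distributed wire RC term (assumed value)
B_COEF = 0.7  # weight of the lumped RC terms (assumed value)


def section_delay(length, size, r, c, r_o, c_o, c_p):
    """First order delay of one buffer driving one wire section into the next buffer."""
    r_drv = r_o / size   # driver resistance shrinks with buffer size
    c_in = c_o * size    # input capacitance of the next buffer
    c_par = c_p * size   # parasitic output capacitance of the driver
    wire_r = r * length
    wire_c = c * length
    return (A_COEF * wire_r * wire_c
            + B_COEF * (r_drv * (c_in + c_par) + r_drv * wire_c + wire_r * c_in))


def critical_length(r, c, r_o, c_o, c_p):
    """Section length minimizing delay per unit length; the buffer size cancels out."""
    return math.sqrt(B_COEF * r_o * (c_o + c_p) / (A_COEF * r * c))


def optimal_size(r, c, r_o, c_o):
    """Buffer size minimizing delay per unit length."""
    return math.sqrt(r_o * c / (r * c_o))


def critical_section_delay(r, c, r_o, c_o, c_p):
    """Delay of one section at critical length and optimal buffer size."""
    return section_delay(critical_length(r, c, r_o, c_o, c_p),
                         optimal_size(r, c, r_o, c_o), r, c, r_o, c_o, c_p)


# Illustrative technology numbers, not taken from the talk.
BUFFER = dict(r_o=10e3, c_o=1e-15, c_p=1e-15)  # ohm, farad, farad
LOWER_LAYER = dict(r=100.0, c=0.20e-12)        # ohm/mm, farad/mm
UPPER_LAYER = dict(r=20.0, c=0.25e-12)

len_lower = critical_length(**LOWER_LAYER, **BUFFER)
len_upper = critical_length(**UPPER_LAYER, **BUFFER)
tau_lower = critical_section_delay(**LOWER_LAYER, **BUFFER)
tau_upper = critical_section_delay(**UPPER_LAYER, **BUFFER)
```

Under these assumptions the two layers get different critical lengths, while tau_lower and tau_upper coincide. A line of length L then has delay L times the section delay divided by the critical length, which is linear in L.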
16wire planning considerations
- the definition of global wires creates a two-level hierarchy
- global wires will be optimally buffered
- a wire planning scenario
- allocate delays to global paths
- assign time budgets to modules
- create net lists for the modules
- assign size to all gates
- given the path delays, and a convex trade-off between module size and delay, size optimization is efficiently solvable, and produces time budgets for each module
- logic synthesis has to create net lists for the modules with given time budgets, and assign restoring efforts to the gates
- size assignment is done by solving the leontief system
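To illustrate how a convex size/delay trade-off turns a path delay into time budgets, here is a toy sketch for modules in series on a single path. It assumes a purely hypothetical trade-off area_i = k_i / t_i; the talk does not specify the trade-off curves, and real instances involve many paths at once. For this form, setting the derivatives of the total area equal under a fixed sum of budgets gives t_i proportional to sqrt(k_i).

```python
import math


def allocate_budgets(k, total_delay):
    """Minimize sum(k_i / t_i) subject to sum(t_i) = total_delay.

    k: hypothetical trade-off constants, one per module on the path
    """
    roots = [math.sqrt(k_i) for k_i in k]
    scale = total_delay / sum(roots)
    return [scale * root for root in roots]


def total_area(k, budgets):
    # area of each module under the assumed trade-off area = k / t
    return sum(k_i / t_i for k_i, t_i in zip(k, budgets))


k = [1.0, 4.0, 9.0]
budgets = allocate_budgets(k, total_delay=12.0)
area_optimal = total_area(k, budgets)
area_equal_split = total_area(k, [4.0, 4.0, 4.0])
```

The modules with the steeper trade-off receive the larger budgets, and the resulting area is smaller than for an equal split of the same path delay.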
17remaining problems
optimally buffered lines have fixed input/output capacitance
problem 3: optimally buffered lines fix input and output capacitances, and therefore constrain the total effort along a path, and thus the delay of that path.
problem 2: how can we prevent synthesis from generating networks that preclude satisfying timing constraints, while timing-correct networks exist?
problem 1: how to cope with resistive interconnect while its delay models cannot be made size independent?
18discrete libraries
derivation assumes continuous sizability!
libraries are mostly discrete and offer limited range in sizes
problem 4: does the fact that libraries are not continuously sizable defeat timing closure by fixing individual gate delays?
problem 3: optimally buffered lines fix input and output capacitances, and therefore constrain the total effort along a path, and thus the delay of that path.
problem 2: how can we prevent synthesis from generating networks that preclude satisfying timing constraints, while timing-correct networks exist?
problem 1: how to cope with resistive interconnect while its delay models cannot be made size independent?
19some problems of timing closure
problem 5: can the efficiency of load independent mapping for speed be advantageous under a constant delay methodology?
problem 4: does the fact that libraries are not continuously sizable defeat timing closure by fixing individual gate delays?
problem 3: optimally buffered lines fix input and output capacitances, and therefore constrain the total effort along a path, and thus the delay of that path.
problem 2: how can we prevent synthesis from generating networks that preclude satisfying timing constraints, while timing-correct networks exist?
problem 1: how to cope with resistive interconnect while its delay models cannot be made size independent?
20are wires plannable?
- resistive interconnect and guiding synthesis
for that we need
- a solid basis for wire planning
- pin placement for detour free routing
- valid retiming
- early layer assignment
- . . . . . . .
21wire plans
- a wire plan for a functional network is a position for each of its function nodes, and a pin assignment for all its primary inputs and outputs
- a global wire plan is a wire plan of which all arcs represent global wires, and will be laid out as optimally buffered lines
- a wire plan is monotonic if all its arcs can be laid out such that the L1-length of every directed path in the network is equal to the L1-distance between its end points
- given a pin assignment, no global wire plan is faster than a monotonic wire plan (if functions have fixed delays)
- given a pin assignment, monotonic wire plans have the least wire capacitance
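The monotonicity condition can be tested per directed path. The sketch below assumes each arc is laid out as a shortest rectilinear route, so its length equals the L1-distance between its end points, and that positions are integer grid coordinates so the comparison is exact. The function names are illustrative.

```python
def l1_distance(p, q):
    """Rectilinear (L1) distance between two points (x, y)."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def is_monotonic_path(points):
    """True if the L1-length of the path equals the L1-distance of its end points.

    points: positions of the nodes along one directed path, in order
    """
    path_length = sum(l1_distance(a, b) for a, b in zip(points, points[1:]))
    return path_length == l1_distance(points[0], points[-1])


detour_free = is_monotonic_path([(0, 0), (2, 1), (3, 4)])
with_detour = is_monotonic_path([(0, 0), (2, 3), (1, 4)])
```

The first path never moves away from its end point in either coordinate, so it passes. The second steps back in x on its last arc and therefore fails.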
22wire plans for given pin assignment
- the inbox of a node is the smallest iso-rectangle containing its support
- the outbox of a node is the smallest iso-rectangle containing its range
- a bridge of a node is a minimum L2-length line connecting the inbox and the outbox
23existence criterion
the existence of a monotonic wire plan of a functional network for a given pin assignment can be checked on a node-by-node basis
- its in- or outbox is a single point
- its inbox and outbox are perpendicular iso-lines
- its outbox is in the projection of the inbox
a functional network has a monotonic wire plan with respect to a given pin assignment if and only if every node has one and only one bridge
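A sketch of the node-by-node check, assuming the support and range of a node are given as lists of pin positions. The uniqueness test is a plain geometric argument about shortest connections between two axis-parallel rectangles: the shortest connection is unique when, on each axis, the boxes are either separated or share only a single coordinate. It is not a transcription of the three cases listed on the slide, and all names are illustrative.

```python
import math


def iso_box(points):
    """Smallest axis-parallel rectangle containing the points: (xmin, ymin, xmax, ymax)."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))


def _axis_gap(lo_a, hi_a, lo_b, hi_b):
    """Return (gap, unique) for two closed intervals on one axis."""
    if hi_a < lo_b:
        return lo_b - hi_a, True
    if hi_b < lo_a:
        return lo_a - hi_b, True
    # intervals meet: the coordinate is forced only if they share a single value
    overlap = min(hi_a, hi_b) - max(lo_a, lo_b)
    return 0, overlap == 0


def bridge(inbox, outbox):
    """Length of a minimum L2-length connection and whether it is the only one."""
    gap_x, unique_x = _axis_gap(inbox[0], inbox[2], outbox[0], outbox[2])
    gap_y, unique_y = _axis_gap(inbox[1], inbox[3], outbox[1], outbox[3])
    return math.hypot(gap_x, gap_y), unique_x and unique_y


point_inbox = bridge(iso_box([(0, 0)]), iso_box([(3, 4), (5, 6)]))
perpendicular = bridge(iso_box([(0, 0), (2, 0)]), iso_box([(1, 1), (1, 3)]))
parallel = bridge(iso_box([(0, 0), (2, 0)]), iso_box([(0, 1), (2, 1)]))
```

In this sketch the single-point inbox and the perpendicular iso-lines each give exactly one bridge, whereas two parallel iso-lines facing each other allow infinitely many bridges of the same length.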
24are wires plannable?
- resistive interconnect and guiding synthesis
- delay prediction is needed and should be enabled
25trends in chip industry
many laws in chip industry fit a specific generic form
differential equation with an integral
(solvable by separation of variables)
26moore's law
the growth rate of chip complexity will be proportional to the achieved complexity to date
Gordon Moore, 1964
proportionality constant: the "moore exponent m", 0.2 for processors and 0.4 for memory
N: numerical complexity of the module (e.g. the chip)
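Read as a differential equation, the statement above is dN/dt = m N, which separates into dN/N = m dt and integrates to N(t) = N0 exp(m t). The sketch below evaluates that solution. It assumes time is measured in years, which the slide does not state, and the function names are illustrative.

```python
import math


def complexity(n0, m, years):
    """Solution of dN/dt = m * N with N(0) = n0."""
    return n0 * math.exp(m * years)


def doubling_time(m):
    """Time after which the complexity has doubled."""
    return math.log(2.0) / m


processor_doubling = doubling_time(0.2)
memory_doubling = doubling_time(0.4)
```

With the exponents quoted on the slide, the complexity doubles in roughly three and a half time units for processors and in half that time for memory.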
27rent's rule
the growth rate of the terminal count with the complexity of the module will be proportional to the average number of terminals per submodule
Landman, Russo, 1971
proportionality constant: the "rent exponent r"
T(N): the number of terminals of a module with numerical complexity N
N: numerical complexity of the module (e.g. the chip)
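As a differential equation the rule reads dT/dN = r T/N, which separates into dT/T = r dN/N and integrates to T(N) = K N^r. Here K is the integration constant, the terminal count of a module with N = 1. The sketch below evaluates this solution and checks it against the differential equation by a finite difference. The function name and the example values r = 0.5 and K = 1.9 are used for illustration only.

```python
def terminals(n, r, k):
    """Solution T(N) = K * N**r of dT/dN = r * T / N."""
    return k * n ** r


# Illustrative values.
r_example, k_example = 0.5, 1.9
t_at_10k = terminals(10_000, r_example, k_example)

# Finite difference check of dT/dN = r * T / N at N = 10,000.
step = 1e-3
slope = (terminals(10_000 + step, r_example, k_example)
         - terminals(10_000 - step, r_example, k_example)) / (2 * step)
expected_slope = r_example * t_at_10k / 10_000
```

For these values a module of 10,000 elements needs about 190 terminals, and the numerical slope agrees with r T/N.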
28rent's curves
[figure: log-log plot of terminal count against complexity; Bakoglu, 1987]
- high performance computers, board level: r = 0.25, K = 82
- high performance computers, chip level: r = 0.63, K = 1.4
- gate arrays: r = 0.5, K = 1.9
- microprocessors: r = 0.45, K = 0.82
- static ram: r = 0.12, K = 6
- dynamic ram: r = 0.1, K = 4
29process exponents
the reduction rate of device sizes will be proportional to the achieved device size
Status 2000, ICE, 2000
proportionality constants are pretty close in value, and will be called the "process exponent p"
30straverius laws
many laws in chip industry have generic form
differential equation with an integral
(solvable by separation of variables)
there are many more!!!
31another old rule
massive memory machines
how primary memory should be supplied to a processor with a given speed
massive parallel machines
in a balanced computer system the size of primary memory in bytes is close to the number of instructions per second
Richard P. Case, 60's
32memory-to-compute ratio
to rebalance the system, memory has to be extended
downscaling makes memory (by Sm) and processor (by Sc) smaller
processing became A times faster due to downscaling
downscaling forces the memory-to-compute ratio r to increase very fast!!!
Paul Stravers, 2000
33buffer area under global wire assumptions
note: buffer area is independent of wire resistance
34wire length distribution
P(l), the wire length distribution, is usually obtained by requiring that rent's rule must be satisfied
35relative buffer area
using the formulae of davis, de, and meindl
36buffer area growth
37are wires plannable?
- resistive interconnect and guiding synthesis
- delay prediction is needed and should enable wire planning
- the memory share of a balanced processor chip area will increase very fast with scaling
- optimal buffering forces almost all functionality from a single layer chip
38multilayer integration
main disadvantage: early layers have to go through many cycles
main disadvantage: poor alignment of inter-layer vias
the true 3D integration
layer growth
film transfer
recrystallization
sidewall metallization
already tried before 1980
39benefits
- global interconnect length considerably reduced
- folding datapaths over layers and determining optimum crossing points can shorten cycle time
- much smaller total footprint for the same functionality
- different technologies for different layers are feasible
why not fully exploited today?
- industry sustained its miraculous growth up to now without it
- technological feasibility for vlsi only shown recently
- economical feasibility not yet proven
- virtually no adequate cad-support
- no design experience with multilayer integration
40possible layer dedication
optical clock receivers, line repeaters, regular i/o
processors (the main heat source), first level memory
second level cache for performance improvement
high density advanced memory technology
Otten, 1980
M.B. Kleiner, S.A. Kühn, P. Ramm, W. Weber, 1995
41thermal analysis
M.B. Kleiner, S.A. Kühn, P. Ramm, W. Weber, 1995
42are wires plannable?
- resistive interconnect and guiding synthesis
- delay prediction is needed and should enable wire planning
- the memory share of a balanced processor chip area will increase very fast with scaling
- optimal buffering forces almost all functionality from a single layer chip
- multilayer integration may ease all of the above
today we are far from plannable wiring!
43are wires plannable?
- Ralph H.J.M. Otten, Eindhoven University of Technology, Eindhoven, The Netherlands, otten_at_ics.ele.tue.nl
- Giuseppe S. Garcea, Delft University of Technology, Delft, The Netherlands, giuseppe_at_cas.et.tudelft.nl