Ralph H.J.M. Otten Eindhoven University of Technology Eindhoven, The Netherlands otten@ics.ele.tue.nl presentation

1
are wires plannable?

Ralph H.J.M. Otten
Eindhoven University of Technology
Eindhoven, The Netherlands
otten@ics.ele.tue.nl

Giuseppe S. Garcea
Delft University of Technology
Delft, The Netherlands
giuseppe@cas.et.tudelft.nl
2
wire planning

1987: providing floorplan design with alignment constraints
floorplan is a data structure capturing the relative positions (i.e. no geometry, possibly overlap, several optimizations)
alignment to save wire area (data path generator)
often tremendous reduction in routing complexity
in essence not limited to "data path" regularity
1998: fixing (and maximizing) time budgets for modules
remove global iteration from synthesis
fix total path delay
provide pre-placement and pin positioning data
enable early retiming, layer assignments, system partitioning
ensure satisfaction of system timing requirements
this talk
iteration free synthesis: what is needed?
trends in chip industry: where do the wires go?
some directions

3
iteration free synthesis (silicon compilers)
conceptual design
gate and net list
library
foot print
technology
weighted incidence structure
wire length and area minimization under technology constraints
timing was an incidental, usually surprisingly good, result of a synthesis flow with size as its prime objective
layout synthesis
4
iterative timing optimization
conceptual design
wire loads, resistances, critical paths
library
foot print
technology
layout synthesis
buffer insertion, transistor sizing, fanout trees
5
timing awareness in conventional flows
synthesis uses delay models but has very limited information
resynthesis accepts additional constraints and wire load models
layout synthesis tries to reduce total wire length and area
timing is the (arbitrary) outcome of a sequence of optimizations with other objectives

adding constraints and resynthesis
⇒ bringing it to a local optimum
adding more constraints and resynthesis
⇒ bringing it to another local optimum

desired: a flow that satisfies timing constraints exactly whenever possible
6
sutherland's delay formula
note the absence of resistance between driver and load

g: computing effort
size independent!
depends on
function
topology
device size

p: inherent (parasitic) delay, size independent
g/f: effort delay
1/f: restoring effort
if f is kept constant, then delay stays constant
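
The formula itself did not survive in this transcript. As a minimal sketch, the code below assumes the delay is the sum of the two terms named on the slide, p + g/f, with f taken as Cin/Cout (the reading given on the size assignment slide). The function name and all numbers are illustrative only.

```python
def gate_delay(p, g, c_in, c_load):
    """Gate delay under the assumed form delay = p + g / f.

    p: inherent (parasitic) delay, g: computing effort,
    f = c_in / c_load: effort reciprocal, so 1/f is the restoring effort.
    """
    f = c_in / c_load
    return p + g / f


# Hypothetical numbers: the gate and its load are scaled by the same
# factor, so f is unchanged and the delay stays the same.
small = gate_delay(p=1.0, g=4.0 / 3.0, c_in=2.0, c_load=8.0)
large = gate_delay(p=1.0, g=4.0 / 3.0, c_in=20.0, c_load=80.0)
```

Both calls use f = 0.25, so they return the same delay even though the second gate is ten times larger.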
7
continuously sized networks
gate size
f, the scaling factor, is the same for all inputs
a, the area sensitivity, is a property of the gate, that is function, topology, sizing
8
continuously sized networks
the size of a gate with constant delay varies linearly with the load: gate size = a f C
in vector notation
9
timing closure
the size of a gate with constant delay varies linearly with the load: gate size = a f C
Sutherland and Sproull, VLSI, 1991
Grodstein et al., ICCAD, 1995
constant delay methodology
gain-based synthesis
fixed delays
fixed timing
guaranteed timing
performance planning
10
synthesis under timing constraints
no iterative loop has been created!
conceptual design
behavioral synthesis
logic synthesis
library
data preparation
foot print
technology
area optimization
layout synthesis
timing analysis
insert buffers to reduce area
11
size assignment
f: vector of effort reciprocals, that is Cin/Cout
N: weighted incidence matrix
a vector
TIMING GUARANTEED - for f was fixed - buffers inserted for area recovery only where enough slack is available!
implied by the calculated input capacitances
netlist possibly modified by inserted buffers
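
The equation on this slide was lost in transcription. The sketch below shows one way such a size assignment can be set up, assuming that each gate's input capacitance is f times its load and that the load is the weighted sum of the fanout input capacitances plus a fixed external load q, that is c = F(Nc + q) with F = diag(f). The vector q, the function names and the three-gate netlist are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def assign_sizes(f, n_matrix, q):
    """Input capacitances c satisfying (I - F N) c = F q (assumed form)."""
    n = len(f)
    lhs = [[(1.0 if i == j else 0.0) - f[i] * n_matrix[i][j]
            for j in range(n)] for i in range(n)]
    rhs = [f[i] * q[i] for i in range(n)]
    return solve_linear(lhs, rhs)


# Hypothetical netlist: gate 0 drives gates 1 and 2, and each of those
# drives a fixed external load of 8 capacitance units.
f = [0.25, 0.25, 0.5]
N = [[0, 1, 1],
     [0, 0, 0],
     [0, 0, 0]]
q = [0.0, 8.0, 8.0]
c = assign_sizes(f, N, q)
```

Because every f is fixed before the sizes are computed, the gate delays are not affected by the solution; only the capacitances, and with them the areas, are.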
12
resistive interconnect
problem 1: how to cope with resistive interconnect while its delay models cannot be made size independent?
13
the new synthesis problem
logic synthesis is to provide an initial netlist and the restoring effort 1/f for every gate!

sutherland's principle of uniform stage effort
brayton's uniform stage delay
technology mapping for speed

problem 2: how can we prevent synthesis from generating networks that preclude satisfying timing constraints, while timing correct networks exist?
problem 1: how to cope with resistive interconnect while its delay models cannot be made size independent?
14
synthesis with wire planning
no iterative loop has been created!
timing budgets
conceptual design
behavioral synthesis
logic synthesis
wire planning
library
data preparation
foot print
technology
preplacement, pin assignment, layer assignment, wire structures
area optimization
layout synthesis
timing analysis
15
global wire theory
global wires are interconnects whose delay can be improved by inserting restoring circuitry
assumptions

global interconnections are always point-to-point wires
first moment matching is accurate enough
restoring circuits are modeled with sakurai's first order model

the length of a section, the critical length, depends on the wiring layer, but not on the buffer size, and tends to be constant when measured in feature sizes
the delay of an optimally segmented line is linear in its length; path delay is therefore independent of the position of the restoring circuits on the path
the delay of a section of an optimally buffered line is the same for all layers
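
These claims can be checked numerically under a simplified model. The sketch below assumes a plain first-moment (Elmore) delay for one buffered segment, with coefficients 1 and 1/2 rather than the fitted coefficients of sakurai's model, and a buffer of size s with output resistance r0/s and input capacitance s*c0. The function names, the unit buffer and the two layers are hypothetical.

```python
import math


def segment_delay(l, s, r, c, r0, c0):
    """Assumed first-moment (Elmore) delay of one buffered wire segment.

    l: segment length, s: buffer size (multiple of a unit buffer),
    r, c: wire resistance and capacitance per unit length,
    r0, c0: output resistance and input capacitance of the unit buffer.
    """
    return (r0 / s) * (c * l + s * c0) + r * l * (c * l / 2.0 + s * c0)


def optimal_segment(r, c, r0, c0):
    """Length and buffer size that minimize segment_delay(l, s) / l."""
    l_crit = math.sqrt(2.0 * r0 * c0 / (r * c))  # does not contain s
    s_opt = math.sqrt(r0 * c / (r * c0))
    return l_crit, s_opt


# Hypothetical unit buffer and two hypothetical wiring layers.
R0, C0 = 1000.0, 1e-15
layers = {"lower": (100.0, 2e-13), "upper": (20.0, 3e-13)}

section_delays = {}
for name, (r, c) in layers.items():
    l_crit, s_opt = optimal_segment(r, c, R0, C0)
    section_delays[name] = segment_delay(l_crit, s_opt, r, c, R0, C0)
```

Under this model the critical length depends on the layer through r and c but not on s, and the delay of one optimal section comes out as (2 + 2√2)·R0·C0 for both layers. A line of length L then takes L divided by the critical length times that section delay, which is linear in L.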

16
wire planning considerations

the definition of global wires creates a two-level hierarchy
global wires will be optimally buffered
a wire planning scenario
allocate delays to global paths
assign time budgets to modules
create net lists for the modules
assign size to all gates
given the path delays, and a convex trade-off between module size and delay, size optimization is efficiently solvable, and produces time budgets for each module
logic synthesis has to create net lists for the modules with given time budgets, and assign restoring efforts to the gates
size assignment is done by solving the leontief system

17
remaining problems
optimally buffered lines have fixed input/output capacitance
problem 3: optimally buffered lines fix input and output capacitances, and therefore constrain the total effort along a path, and thus the delay of that path.
problem 2: how can we prevent synthesis from generating networks that preclude satisfying timing constraints, while timing correct networks exist?
problem 1: how to cope with resistive interconnect while its delay models cannot be made size independent?
18
discrete libraries
derivation assumes continuous sizability!
libraries are mostly discrete and offer limited range in sizes
problem 4: does the fact that libraries are not continuously sizable defeat timing closure by fixing individual gate delays?
problem 3: optimally buffered lines fix input and output capacitances, and therefore constrain the total effort along a path, and thus the delay of that path.
problem 2: how can we prevent synthesis from generating networks that preclude satisfying timing constraints, while timing correct networks exist?
problem 1: how to cope with resistive interconnect while its delay models cannot be made size independent?
19
some problems of timing closure
problem 5: can the efficiency of load independent mapping for speed be advantageous under a constant delay methodology?
problem 4: does the fact that libraries are not continuously sizable defeat timing closure by fixing individual gate delays?
problem 3: optimally buffered lines fix input and output capacitances, and therefore constrain the total effort along a path, and thus the delay of that path.
problem 2: how can we prevent synthesis from generating networks that preclude satisfying timing constraints, while timing correct networks exist?
problem 1: how to cope with resistive interconnect while its delay models cannot be made size independent?
20
are wires plannable?

iteration free synthesis

resistive interconnect and guiding synthesis

for that we need

a solid basis for wire planning
pin placement for detour free routing
valid retiming
early layer assignment
. . . . . . .

21
wire plans

a wire plan for a functional network is a position for each of its function nodes, and a pin assignment for all its primary inputs and outputs
a global wire plan is a wire plan of which all arcs represent global wires, and will be laid out as optimally buffered lines.
a wire plan is monotonic if all its arcs can be laid out such that the L1-length of every directed path in the network is equal to the L1-distance between its end points
given a pin assignment, no global wire plan is faster than a monotonic wire plan (if functions have fixed delays)
given a pin assignment, monotonic wire plans have the least wire capacitance
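
The monotonicity condition can be tested directly for a single directed path. This is a minimal sketch, assuming node positions are integer grid points and every arc is routed at its own L1-length without detours; the function names and coordinates are hypothetical.

```python
def l1(a, b):
    """L1 (Manhattan) distance between points a and b."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def is_monotonic_path(points):
    """True if the L1-length of the path equals the L1-distance of its ends.

    points: positions of the nodes along one directed path, in order.
    """
    length = sum(l1(p, q) for p, q in zip(points, points[1:]))
    return length == l1(points[0], points[-1])


# Hypothetical positions on a grid.
good = [(0, 0), (2, 1), (3, 4)]  # keeps moving right and up
bad = [(0, 0), (4, 1), (3, 4)]   # overshoots in x, then comes back
```

A wire plan is monotonic only if this check holds for every directed path in the network, not just for one.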

22
wire plans for given pin assignment

the inbox of a node is the smallest iso-rectangle containing its support

the outbox of a node is the smallest iso-rectangle containing its range

a bridge of a node is a minimum L2-length line connecting the inbox and the outbox
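
As an illustration of these definitions, the sketch below computes the boxes as axis-parallel bounding rectangles of pin positions and the L2-length of a shortest connecting line. It assumes the support and range of a node are already known as lists of pin coordinates; the names and coordinates are hypothetical, and the sketch does not decide whether the bridge is unique.

```python
import math


def iso_box(points):
    """Smallest axis-parallel rectangle (xmin, ymin, xmax, ymax) around points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))


def bridge_length(inbox, outbox):
    """L2-length of a shortest line between two iso-rectangles."""
    dx = max(0, outbox[0] - inbox[2], inbox[0] - outbox[2])
    dy = max(0, outbox[1] - inbox[3], inbox[1] - outbox[3])
    return math.hypot(dx, dy)


support_pins = [(0, 0), (0, 3), (1, 2)]  # hypothetical pins of the support
range_pins = [(5, 6), (7, 6)]            # hypothetical pins of the range
inbox = iso_box(support_pins)
outbox = iso_box(range_pins)
length = bridge_length(inbox, outbox)
```

Here the outbox degenerates to a horizontal iso-line, and the shortest connection runs from the upper right corner of the inbox to the left end of that line.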

23
existence criterion
the existence of a monotonic wire plan of a functional network for a given pin assignment can be checked on a node-by-node basis

its in- or outbox is a single point

its inbox and outbox are perpendicular iso-lines

its outbox is in the projection of the inbox

a functional network has a monotonic wire plan with respect to a given pin assignment iff every node has one and only one bridge
24
are wires plannable?

iteration free synthesis

resistive interconnect and guiding synthesis

delay prediction is needed and should be enabled

25
trends in chip industry
many laws in chip industry fit a specific generic
form
differential equation with an integral
(solvable by separation of variables)
26
moore's law
the growth rate of chip complexity will be proportional to the achieved complexity to date
Gordon Moore, 1964
proportionality constant, "moore exponent m": 0.2 for processors, and 0.4 for memory
N: numerical complexity of the module (e.g. the chip)
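
The slide's formula is missing from the transcript. Read literally, the statement above gives the following differential equation and its integral, where N_0 (the complexity at t = 0) is an assumed name for the integration constant:

```latex
\frac{dN}{dt} = m\,N
\quad\Longrightarrow\quad
\int \frac{dN}{N} = \int m\,dt
\quad\Longrightarrow\quad
N(t) = N_0\,e^{m t}
```

In this form the complexity doubles every (ln 2)/m time units, whatever unit of time m is expressed in.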
27
rent's rule
the growth rate of the terminal count with the complexity of the module will be proportional to the average number of terminals per submodule
Landman, Russo, 1971
proportionality constant, "rent exponent r"
T(N): the number of terminals of a module with numerical complexity N
N: numerical complexity of the module (e.g. the chip)
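
The formula is again missing here. Taking T/N as the average number of terminals per submodule, the statement can be written and integrated as below; K is an assumed name for the integration constant and presumably corresponds to the K values quoted with rent's curves:

```latex
\frac{dT}{dN} = r\,\frac{T}{N}
\quad\Longrightarrow\quad
\int \frac{dT}{T} = r \int \frac{dN}{N}
\quad\Longrightarrow\quad
T(N) = K\,N^{r}
```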
28
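rent's curves
[figure: rent's curves on log-log axes; vertical ticks 10 to 10,000, horizontal ticks 100 to 1,000,000]
high performance computers, board level: r = 0.25, K = 82
gate arrays: r = 0.5, K = 1.9
high performance computers, chip level: r = 0.63, K = 1.4
microprocessors: r = 0.45, K = 0.82
static ram: r = 0.12, K = 6
dynamic ram: r = 0.1, K = 4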
Bakoglu, 1987
29
process exponents
the reduction rate of device sizes will be proportional to the achieved device size
Status 2000, ICE, 2000
proportionality constants are pretty close in value, and will be called the "process exponent p"
30
straverius laws
many laws in chip industry have generic form
differential equation with an integral
(solvable by separation of variables)
there are many more!!!
31
another old rule
massive memory machines
how primary memory should be supplied to a
processor with a given speed
massive parallel machines
in a balanced computer system the size of
primary memory in bytes is close to the number of
instructions per second
Richard P. Case, 60's
32
memory-to-compute ratio
to rebalance the system, memory has to be extended
downscaling makes memory (by Sm) and processor (by Sc) smaller
processing became A times faster due to downscaling
downscaling forces the memory-to-compute ratio r to increase
very fast!!!
Paul Stravers, 2000
33
buffer area under global wire assumptions
note: buffer area is independent of wire resistance
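
The slide's own expression was not recovered. The derivation below shows how such a result can arise under an assumed first-moment (Elmore) model of one buffered segment: wire resistance r and capacitance c per unit length, a buffer of size s with output resistance R_0/s, input capacitance s C_0 and area s A_0. All of these symbols are assumptions introduced for the sketch.

```latex
\frac{t(l,s)}{l} = \frac{R_0 c}{s} + \frac{R_0 C_0}{l} + \frac{r c\, l}{2} + r s C_0
\quad\Longrightarrow\quad
l_{crit} = \sqrt{\frac{2 R_0 C_0}{r c}}, \qquad
s_{opt} = \sqrt{\frac{R_0 c}{r C_0}}

\frac{\text{buffer area}}{\text{wire length}}
= \frac{s_{opt} A_0}{l_{crit}}
= A_0 \sqrt{\frac{R_0 c}{r C_0} \cdot \frac{r c}{2 R_0 C_0}}
= \frac{A_0\, c}{\sqrt{2}\, C_0}
```

The resistance r cancels: a more resistive layer needs shorter sections but proportionally smaller buffers.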
34
wire length distribution
P(l), the wire length distribution, is usually
obtained by requiring that rent's rule must be
satisfied
35
relative buffer area
using formulae of Davis, De and Meindl
36
buffer area growth
37
are wires plannable?

iteration free synthesis

resistive interconnect and guiding synthesis

delay prediction is needed and should enable wire
planning

the memory share of a balanced processor chip
area will increase very fast with scaling

optimal buffering forces almost all
functionality from a single layer chip

38
multilayer integration
main disadvantage: early layers have to go through many cycles
main disadvantage: poor alignment of inter-layer vias
the true 3D integration
layer growth
film transfer
recrystallization
sidewall metallization
already tried before 1980
39
benefits

global interconnect length considerably reduced
folding datapaths over layers and determining
optimum crossing points can shorten cycle time
much smaller total footprint for the same
functionality
different technologies for different layers are
feasible

why not fully exploited today?

industry sustained its miraculous growth up to
now without it
technological feasibility for vlsi only shown
recently
economical feasibility not yet proven
virtually no adequate cad-support
no design experience with multilayer integration

40
possible layer dedication
optical clock receivers, line repeaters, regular i/o
Otten, 1980
processors (the main heat source), first level memory
second level cache for performance improvement
M.B. Kleiner, S.A. Kühn, P. Ramm, W. Weber, 1995
high density advanced memory technology
41
thermal analysis
M.B. Kleiner, S.A. Kühn, P. Ramm, W. Weber, 1995
42
are wires plannable?

iteration free synthesis

resistive interconnect and guiding synthesis

delay prediction is needed and should enable wire
planning

the memory share of a balanced processor chip
area will increase very fast with scaling

optimal buffering forces almost all
functionality from a single layer chip

multilayer integration may ease all of the above

today we are far from plannable wiring!
43
are wires plannable?

Ralph H.J.M. Otten
Eindhoven University of Technology
Eindhoven, The Netherlands
otten@ics.ele.tue.nl

Giuseppe S. Garcea
Delft University of Technology
Delft, The Netherlands
giuseppe@cas.et.tudelft.nl
