Circuit Retiming with Interconnect Delay

1
Circuit Retiming with Interconnect Delay
  • CUHK CSE CAD Group Meeting One
  • Evangeline Young
  • Aug 19, 2003

2
Circuit Retiming
  • Given a circuit, we want to relocate the
    registers to achieve a better clock period.

[Figure: retiming relocates the registers and reduces the clock period from 3 units to 2 units.]
3
Circuit Retiming
  • To maintain the functionality of the circuit,
    registers can only be moved in certain ways.

4
Circuit Retiming
  • Given a circuit, how should we place the
    registers to minimize the clock period?

5
Traditional Approach
  • The retiming problem was first introduced in
    the following classical paper:
  • Retiming Synchronous Circuitry, Charles E.
    Leiserson and James B. Saxe, Algorithmica,
    6:5-35, 1991
  • Only gate delay was considered.
  • Three methods were proposed. One of them solves
    the problem by mixed integer linear programming
    (MILP).

6
Traditional Approach
  • Notation
  • d(v) is the delay of node v.
  • w(e) is the original number of registers on edge e.
  • c is the clock period whose feasibility we want
    to check.
  • r(v) is the retiming value of node v, i.e., the
    number of registers moved from the output to the
    input of node v. (r(v) is what we want to find.)
  • s(v) is the longest delay from a register
    connected directly to node v to the output of v.

7
Traditional Approach
  • More about s(v)

s(v) is the delay from point A to point B, including
the delay of v.
[Figure: points A and B marked around node v.]
8
Traditional Approach
  • Integer Linear Program
  • $d(v) \le s(v)$ for all nodes $v$ (1)
  • $s(v) \le c$ for all nodes $v$ (2)
  • $r(u) - r(v) \le w(e)$ for all edges $e(u,v)$ (3)
  • $s(u) - s(v) \le -d(v)$ for every edge $e(u,v)$ such that
    $r(u) - r(v) = w(e)$ (4)
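
As an illustration of constraints (1)-(4), the following minimal Python sketch checks a given retiming r rather than solving the ILP. It assumes the retimed register count on an edge e(u,v) is w(e) + r(v) - r(u), so that constraint (3) says this count is non-negative, and it computes s(v) by propagating delays along register-free edges. The function name, the data layout, and the three-gate ring are hypothetical and not taken from the slides.

```python
from collections import defaultdict, deque

def clock_period(nodes, edges, d, r):
    """Clock period reached by retiming r, or None if r is not legal.

    edges is a list of (u, v, w): w registers on the edge from u to v.
    d maps each node to its gate delay, r maps each node to its retiming value.
    """
    succ = defaultdict(list)          # register-free edges after retiming
    indeg = {v: 0 for v in nodes}
    for u, v, w in edges:
        w_r = w + r[v] - r[u]         # registers on the edge after retiming
        if w_r < 0:                   # violates constraint (3)
            return None
        if w_r == 0:                  # constraint (4) binds only on these edges
            succ[u].append(v)
            indeg[v] += 1
    # s(v) = d(v) + max s(u) over register-free fan-in edges (topological order)
    s = {v: d[v] for v in nodes}
    queue = deque(v for v in nodes if indeg[v] == 0)
    visited = 0
    while queue:
        u = queue.popleft()
        visited += 1
        for v in succ[u]:
            s[v] = max(s[v], s[u] + d[v])
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)
    if visited != len(nodes):
        return None                   # a cycle with no register on it
    return max(s.values())            # smallest c satisfying constraint (2)

# Hypothetical ring of three unit-delay gates with two registers on one edge.
nodes = ["a", "b", "c"]
edges = [("a", "b", 0), ("b", "c", 0), ("c", "a", 2)]
d = {"a": 1, "b": 1, "c": 1}
before = clock_period(nodes, edges, d, {"a": 0, "b": 0, "c": 0})  # 3
after = clock_period(nodes, edges, d, {"a": 0, "b": 0, "c": 1})   # 2
```

In this toy ring, moving one register across gate c shortens the longest register-free path from three gates to two.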

9
Traditional Approach
  • Write $R(v) = r(v) + s(v)/c$.
  • The ILP can be written as an MILP:
  • $r(v) - R(v) \le -d(v)/c$ for all nodes $v$ (1)
  • $R(v) - r(v) \le 1$ for all nodes $v$ (2)
  • $r(u) - r(v) \le w(e)$ for all edges $e(u,v)$ (3)
  • $R(u) - R(v) \le w(e) - d(v)/c$ for all edges $e(u,v)$
    (4)
  • The above set of difference constraints can be
    solved in polynomial time, even though it contains
    both integer and real variables.

10
Traditional Approach
  • Use binary search to find the optimal clock period:
  • T0 = 0
  • T1 = e10 // a large number
  • Repeat
  • c = (T0 + T1)/2
  • Check if c is a feasible clock period by solving
    the MILP.
  • If success, T1 = c; otherwise, T0 = c.
  • Until success and (T1 - T0)/T1 < ε
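
The loop above can be sketched in Python as follows. This is a slightly restructured version, not the slide's exact loop: it stops as soon as the relative gap falls below the tolerance and returns T1, the smallest period known to be feasible. The feasibility test is passed in as a callback. The names `min_clock_period`, `is_feasible`, `upper`, and `eps` are hypothetical, and the sketch assumes that `upper` itself is a feasible clock period and that feasibility is monotone in c.

```python
def min_clock_period(is_feasible, upper=1e10, eps=1e-6):
    """Binary search for the smallest feasible clock period.

    is_feasible(c) must return True when clock period c can be achieved,
    for example by solving the MILP for that c.
    """
    t0, t1 = 0.0, float(upper)        # t0: infeasible side, t1: feasible side
    while (t1 - t0) / t1 >= eps:
        c = (t0 + t1) / 2
        if is_feasible(c):
            t1 = c                    # c works, try a smaller period
        else:
            t0 = c                    # c is too small
    return t1

# Toy oracle (hypothetical): every period of at least 2.0 is feasible.
best = min_clock_period(lambda c: c >= 2.0)
```

Because the interval is halved in every iteration, the number of feasibility checks grows only logarithmically with the ratio of the initial upper bound to the tolerance.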

11
Retiming with Interconnect Delay
  • We consider clock period minimization.
  • Retiming has been studied and applied extensively
    in logic synthesis.
  • However, most previous retiming algorithms ignore
    interconnect delay.
  • Interconnect delay should be considered for high
    performance circuits in deep-submicron (DSM) design.
  • This solution will be presented at the
    upcoming ICCAD 2003.

12
Retiming with Interconnect Delay
  • We assume that wire delay is directly
    proportional to its length.
  • This assumption is reasonable:
  • For short wires, the quadratic component of a
    wire delay is significantly smaller than its
    linear component.
  • For long wires, buffer insertion can be done.

13
Retiming with Interconnect Delay
14
Retiming with Interconnect Delay
  • Now, a retiming solution needs to specify:
  • the retiming label r(v) for each node v.
  • the positions of the registers on each edge.

[Figure: retiming example with node labels r = 0, r = -1, r = 0, r = 0.]
The positions of the registers on the edges are
important because there is interconnect delay.
15
Our Contributions
  • Optimal algorithm
  • $O(|V||E| \log |V| + |V|^2 \log^2 |V|)$ time per
    iteration.
  • Near-optimal algorithm
  • Only 0.13% larger than the optimal on average.
  • $O(|V_b||E| + |V_b||E_h|)$ time per iteration, e.g., a
    circuit with 16.1K gates and 28.6K wires can be
    retimed in 44.32s by a 1.8GHz PIII PC.
  • Based on an optimal algorithm handling
    interconnect delay only, i.e., no gate delay.

16
Optimal Approach
  • Rewrite the ILP on p.8 as follows:
  • $d(v) \le s(v)$ for all nodes $v$ (1)
  • $s(v) \le c$ for all nodes $v$ (2)
  • $r(u) - r(v) \le w(e)$ for all edges $e(u,v)$ (3)
  • $s(v) \ge s(u) + d(e) + d(v) - c\,(r(v) - r(u) + w(e))$
    for all edges $e(u,v)$ (4)

17
Optimal Approach
  • Similarly, write $R(v) = r(v) + s(v)/c$:
  • $r(v) - R(v) \le -d(v)/c$ for all nodes $v$ (1)
  • $R(v) - r(v) \le 1$ for all nodes $v$ (2)
  • $r(u) - r(v) \le w(e)$ for all edges $e(u,v)$ (3)
  • $R(u) - R(v) \le w(e) - d(v)/c - d(e)/c$
    for all edges $e(u,v)$ (4)
  • Again, the above set of constraints can be solved
    in polynomial time, though the runtime is quite
    long.
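
For reference, constraint (4) with wire delay can be obtained from the edge constraint of the rewritten ILP on p.16 by dividing by c and substituting R(v) = r(v) + s(v)/c. The derivation below is a sketch that assumes c > 0 and that the operators lost from the slide text are as written here.

```latex
\begin{align*}
s(v) &\ge s(u) + d(e) + d(v) - c\,\bigl(r(v) - r(u) + w(e)\bigr) \\
\frac{s(v)}{c} + r(v) &\ge \frac{s(u)}{c} + r(u) + \frac{d(e)}{c} + \frac{d(v)}{c} - w(e) \\
R(v) &\ge R(u) + \frac{d(e)}{c} + \frac{d(v)}{c} - w(e) \\
R(u) - R(v) &\le w(e) - \frac{d(v)}{c} - \frac{d(e)}{c}
\end{align*}
```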

18
Optimal Approach
Circuit   V       E       copt    Runtime (s)
s1488     655     1405    18.85   5.62
s1494     649     1411    20.78   4.37
s3271     1574    2707    10.24   33.70
s3330     1791    2890    27.05   43.14
s3384     1687    2782    24.16   25.19
s4863     2344    4093    23.58   87.75
s5378     2781    4261    27.25   138.68
s6669     3082    5399    22.96   177.59
s9234     5599    8005    42.73   512.86
s13207    7953    11302   72.34   1161.07
s15850    9774    13794   67.82   1545.59
s35932    16067   28590   29.54   8644.27
s38417    22181   32135   36.52   7680.79
s38584    19255   33010   -       >15000
19
Near Optimal Approach
  • Transform the original graph G by splitting each
    node v (represents a gate) into a pair of nodes
    v1 and v2 connected by an edge with delay d(v).

[Figure: node v with delay d(v) becomes nodes v1 and v2 with delay 0, connected by an edge with delay d(v).]
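
The splitting step can be sketched in Python as below. The data layout is an assumption: each edge is a tuple (u, v, w, delay), the two halves of gate v are named (v, 1) and (v, 2), and the new internal edge starts with zero registers. None of these names come from the slides.

```python
def split_gates(nodes, edges, d):
    """Build the transformed graph in which every gate becomes a wire.

    nodes: gate names; edges: list of (u, v, w, delay); d: gate delays.
    Returns (new_nodes, new_edges) in the same edge format.
    """
    new_nodes = []
    new_edges = []
    for v in nodes:
        v1, v2 = (v, 1), (v, 2)       # input side and output side of gate v
        new_nodes.extend([v1, v2])
        # Internal edge: carries the gate delay, no register initially (assumed).
        new_edges.append((v1, v2, 0, d[v]))
    for u, v, w, delay in edges:
        # Each original wire now runs from the output side of u to the input side of v.
        new_edges.append(((u, 2), (v, 1), w, delay))
    return new_nodes, new_edges

# Hypothetical two-gate example.
g1_nodes, g1_edges = split_gates(["a", "b"], [("a", "b", 1, 2.5)], {"a": 3, "b": 4})
```

After this step every node has zero delay, so an algorithm that handles wire delay only can be applied to the transformed graph.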
20
Near Optimal Approach
  • After representing each gate by a wire, we can
    find an optimal retiming solution S for the
    transformed circuit G1. (We will show how to find
    the optimal solution with no gate delay.)
  • The clock period of S will be a lower bound L for
    the optimal solution Topt of G.
  • From S, we can obtain a feasible retiming
    solution for the original circuit G.

21
Near Optimal Approach
  • Registers retimed into a wire representing a gate
    v will be moved backward to the input edges or
    forward to the output edges depending on their
    distances from v1 and v2.
  • Linear programming is used to determine the
    positions of the registers on each edge after
    this relocation step to minimize the clock period
    considering both gate and wire delay.

22
Near Optimal Approach
  • The problem is now to solve the retiming
    problem optimally, assuming that gate delay is
    zero.
  • When there is no gate delay, the set of
    constraints on p.17 becomes
  • $r(v) - R(v) \le 0$ for all nodes $v$ (1)
  • $R(v) - r(v) \le 1$ for all nodes $v$ (2)
  • $r(u) - r(v) \le w(e)$ for all edges $e(u,v)$ (3)
  • $R(u) - R(v) \le w(e) - d(e)/c$ for all edges $e(u,v)$
    (4)

23
Near Optimal Approach
  • Lemma 1: Given R(v) for all nodes v that satisfy
    constraint (4), we can obtain a solution to
    constraints (1)-(4) by setting
  • $r(v) = \mathrm{trunc}(R(v))$
  • Given Lemma 1, we only need to solve constraint
    (4): $R(u) - R(v) \le w(e) - d(e)/c$ for all edges
    $e(u,v)$.
  • Consider the input graph G(V,E) such that the
    weight of each edge e(u,v) is $-w(e) + d(e)/c$.

24
Near Optimal Approach
  • There is a solution to constraint (4) iff G has
    no positive cycles.
  • Positive cycle detection in G can be achieved by
    positive cycle detection in a smaller graph
    H(Vb,Eh) constructed from G. This technique can
    be applied in other positive cycle detection
    problems, not necessarily in circuit retiming.
  • After solving for R(v), we can find r(v) and s(v) for
    all nodes v.
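
A minimal Python sketch of this feasibility test for the wire-delay-only case follows. It runs a plain Bellman-Ford style longest-path relaxation directly on G, not on the reduced graph H described above, and it is meant only to show the link between positive cycles, R(v), and r(v). The function name and the edge format (u, v, w, delay) are hypothetical. Every R(v) starts at 0 and only increases, so truncation and floor give the same value.

```python
import math

def retime_wire_only(nodes, edges, c):
    """Test clock period c when all gate delays are zero.

    edges: list of (u, v, w, delay). Returns (R, r) if c is feasible,
    or None if the graph with edge weights d(e)/c - w(e) has a positive cycle.
    """
    # Starting every R(v) at 0 acts like a virtual source linked to all nodes.
    R = {v: 0.0 for v in nodes}
    for _ in range(len(nodes)):
        changed = False
        for u, v, w, delay in edges:
            candidate = R[u] + delay / c - w   # constraint (4) as a longest-path step
            if candidate > R[v]:
                R[v] = candidate
                changed = True
        if not changed:
            # Lemma 1: take the integer part of R(v) as the retiming value.
            r = {v: math.floor(R[v]) for v in nodes}
            return R, r
    return None   # still improving after |V| passes, so a positive cycle exists

# Hypothetical ring: three wires of delay 1.0 and two registers in total.
ring_nodes = ["a", "b", "c"]
ring_edges = [("a", "b", 0, 1.0), ("b", "c", 0, 1.0), ("c", "a", 2, 1.0)]
feasible = retime_wire_only(ring_nodes, ring_edges, 2.0)
infeasible = retime_wire_only(ring_nodes, ring_edges, 1.0)
```

In the ring, the total wire delay is 3 and there are 2 registers, so the cycle weight 3/c - 2 is positive exactly when c is below 1.5. This is why c = 1.0 is rejected and c = 2.0 is accepted.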

25
Near Optimal Approach
  • After the binary search, we have the optimal
    clock period and the corresponding r(v) and s(v) for all
    nodes v. We can then place the registers
    accordingly.

Assume that $r(v) - r(u) + w(e) = 4$.
[Figure: edge from u to v carrying four registers, with distances labelled c - s(u) and c.]
Other registers are placed right in front of v.
26
Near Optimal Approach
  • First, assume that gate delay is zero.
  • Binary search to find the minimum feasible clock
    period c.
  • To test the feasibility of a fixed c:
  • Transform the test into a positive cycle detection
    problem on a reduced graph.
  • This can be solved by a single-source longest-path
    algorithm.

27
Results
Circuit   copt    Topt (s)   cnear opt   Tnear opt (s)
s1488     18.82   5.62       18.85       0.28
s1494     20.78   4.37       20.78       0.25
s3271     10.24   33.70      10.24       1.09
s3330     27.05   43.14      27.05       0.50
s3384     24.16   25.19      24.21       0.74
s4863     23.58   87.75      23.58       3.12
s5378     27.25   138.68     27.27       1.16
s6669     22.96   177.59     23.07       1.91
s9234     42.73   512.86     42.73       4.08
s13207    72.34   1161.07    72.34       8.11
s15850    67.82   1545.59    67.82       24.02
s35932    29.54   8644.27    29.59       61.25
s38417    36.52   7680.79    36.53       83.56
s38584    -       >15000     94.26       445.63
28
Future Directions
  • Consider a more accurate model of the
    interconnect delay, e.g., Elmore delay.
  • How to map the retiming solution to the
    floorplanning or placement solution? Registers
    are large and take up silicon resources.
  • How to consider fan-out capacitance with
    interconnect delay?