A Highly Testable Pass Transistor Based Structured ASIC Design Methodology

Provided by: eceTamuE7
Learn more at: http://www.ece.tamu.edu

1
A Highly Testable Pass Transistor Based
Structured ASIC Design Methodology
  • Kanupriya Gulati
  • Nikhil Jayakumar
  • Sunil P. Khatri

2
Motivation for Structured ASICs
Process (microns)        2.0   0.8   0.6   0.35   0.25   0.18   0.13   0.1
Single Mask Cost ($K)    1.5   1.5   2.5   4.5    7.5    12     40     60
# of Masks               12    12    12    16     20     26     30     34
Mask Set Cost ($K)       18    18    30    72     150    312    1000   2000
  • A full set of lithography masks can cost between
    $1M and $3M.
  • Roughly 25% reduction in ASIC design starts in the
    past 7 years (Sematech Annual Report 2002;
    A. Sangiovanni-Vincentelli, "The Tides of EDA,"
    keynote talk, DAC 2003).

3
Our Solution
  • Use a regular array of pass transistor logic
    based if-then-else (ITE) cells with flip-flops
    along the edges of the die as the underlying
    circuit structure.
  • Stock such arrays pre-processed up to the
    metallization step.
  • Or, reuse previously generated masks for all other
    layers and make new masks for only the METAL and VIA
    layers.
  • To create an ASIC for a given design,
    technology-map the design to the smallest
    available array.
  • Only the METAL and VIA masks require changes.

4
Advantages
  • Can share masks for several layers.
  • Reduces NRE.
  • No need for the designer to worry about DFM
    issues.
  • Improved yield.
  • New designs can be implemented faster.
  • Engineering changes are simplified: a design
    modification requires only METAL and VIA mask
    changes.
  • Generating test patterns for such a design is
    easy:
      • 100% test coverage in time linear in the size of
        the network.
      • No redundant faults in the design.

5
The Gap between FPGA and ASIC
FPGA
  • Low speed
  • High power
  • Cost-effective for low volume products
ASIC
  • High speed
  • Low power
  • Cost-effective for high volume products
  • Necessary for products requiring high performance
    or low power.

What bridges the gap?
6
Taxonomy of Regular Logic Fabrics
  • As we move further away from standard cell
    (ASIC), we lose:
      • Area
      • Speed
      • Power
  • As we move closer to FPGAs, we gain:
      • Flexibility
      • Lower NRE

[Figure: taxonomy of regular logic fabrics between standard-cell ASICs and FPGAs, with "Our Approach" marked]
  • Source: L. Pileggi et al., "Exploring Regular Fabrics to Optimize the
    Performance-Cost Trade-off."

7
Overview
  • Convert a logic netlist to a partitioned Reduced
    Ordered Binary Decision Diagram (ROBDD).
  • Each ROBDD node is implemented as an ITE cell.
  • Place these ITE cells in an area and delay
    efficient manner on a pre-fabricated array of ITE
    cells.
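
As a rough illustration of the first two steps, the sketch below builds a small ROBDD with a shared node table and then evaluates it as a network of 2:1 multiplexers, one ITE cell per node. The names `Robdd`, `mk`, `build`, and `ite_eval` are hypothetical and are not taken from VIS or from the authors' flow; the brute-force `build` is exponential and is only meant for tiny examples.

```python
from itertools import product

class Robdd:
    """Tiny ROBDD store: node id -> (variable index, else-child, then-child)."""
    def __init__(self):
        self.nodes = {0: None, 1: None}   # ids 0 and 1 are the constant terminals
        self.unique = {}                  # (var, lo, hi) -> node id

    def mk(self, var, lo, hi):
        if lo == hi:                      # reduction rule: drop a redundant test
            return lo
        key = (var, lo, hi)
        if key not in self.unique:        # reduction rule: share identical nodes
            nid = len(self.nodes)
            self.nodes[nid] = key
            self.unique[key] = nid
        return self.unique[key]

    def build(self, f, nvars, var=0, env=()):
        """Shannon-expand f over variables var..nvars-1 (demo only)."""
        if var == nvars:
            return int(bool(f(*env)))
        lo = self.build(f, nvars, var + 1, env + (0,))
        hi = self.build(f, nvars, var + 1, env + (1,))
        return self.mk(var, lo, hi)

def ite_eval(bdd, root, inputs):
    """Evaluate every node as an ITE cell: out = T if i else E."""
    out = {0: 0, 1: 1}
    # Children are always created before their parents, so ascending ids work.
    for nid in sorted(k for k in bdd.nodes if k > 1):
        var, lo, hi = bdd.nodes[nid]
        out[nid] = out[hi] if inputs[var] else out[lo]
    return out[root]

# Usage: f = (a AND b) OR c under the variable order a, b, c.
f = lambda a, b, c: (a and b) or c
bdd = Robdd()
root = bdd.build(f, 3)
num_cells = len(bdd.nodes) - 2            # ITE cells needed (terminals excluded)
ok = all(ite_eval(bdd, root, v) == f(*v) for v in product((0, 1), repeat=3))
```

In this sketch the number of ITE cells equals the number of non-terminal ROBDD nodes, which is why the flow later picks the partitioning that yields the fewest nodes.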

8
ITE Cell Structure
  • Used an NMOS pass-gate based structure.
  • Each ITE cell generates a buffered output and its
    complement.
  • The delay of the NMOS pass-gate ITE cell was found to be
    similar to that of a CMOS pass-gate based ITE cell,
    with a smaller area.
  • This is probably due to the increased diffusion
    capacitance in CMOS pass-gates.

[Figure: ITE cell schematic, with select input i and its complement, data inputs T and E, and outputs out and its complement]
9
ITE Cell Design
  • MUX control signals run along the length of the
    cell.
  • Each ITE cell has 3 variable signals and three
    complemented variable signals running
    horizontally in metal 3.
  • Appropriate placement of stacked vias at the
    horizontal metal 3 wires allows the ITE cell to
    be connected to any one of the 3 variables in the
    corresponding row of the array.
  • Metal layers 1 and 2 used for most of the layout,
    metal layer 3 used to route variables and their
    complement.

[Figure: ITE cell layout, showing the VDD and GND rails]
10
Synthesis: Partitioned ROBDD
  • Synthesis of the logic netlist into a partitioned
    ROBDD structure is done in VIS.
  • Primary input variables are ordered using a DFS
    ordering.
  • Dynamic variable ordering is enabled before building
    the ROBDDs.
  • ROBDDs are constructed bottom-up:
      • Let the set of variables in the ROBDD manager be V
        (initially the PIs).
      • If the size of any ROBDD exceeds a user-specified
        threshold B, introduce a new variable v (an
        intermediate ROBDD variable) and continue building
        ROBDDs on the set of variables V ∪ {v}.
  • This results in a series of ROBDDs:
      • The size of each ROBDD is bounded by B.
      • The output of each ROBDD represents either a primary
        output or an intermediate ROBDD variable.
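
The thresholded bottom-up construction can be sketched roughly as follows. This is not the VIS implementation: the tiny `Bdd` manager, the netlist format, and the choice to cut the operands of a gate (rather than its result) are assumptions made so that every stored ROBDD stays within B. New variables are simply appended to the end of the order, there is no dynamic reordering, and the sketch assumes B >= 3 with gates listed in topological order.

```python
from itertools import product

class Bdd:
    """Minimal ROBDD manager; node ids 0 and 1 are the terminals."""
    def __init__(self):
        self.table, self.info, self.nvars = {}, {}, 0

    def mk(self, var, lo, hi):
        if lo == hi:
            return lo
        key = (var, lo, hi)
        if key not in self.table:
            self.table[key] = len(self.info) + 2
            self.info[self.table[key]] = key
        return self.table[key]

    def new_var(self):
        self.nvars += 1
        return self.mk(self.nvars - 1, 0, 1)

    def top(self, f):
        return self.info[f][0] if f > 1 else float("inf")

    def apply(self, op, f, g, memo):
        if f < 2 and g < 2:
            return op(f, g)
        if (f, g) not in memo:
            v = min(self.top(f), self.top(g))
            f0, f1 = self.info[f][1:] if self.top(f) == v else (f, f)
            g0, g1 = self.info[g][1:] if self.top(g) == v else (g, g)
            memo[(f, g)] = self.mk(v, self.apply(op, f0, g0, memo),
                                   self.apply(op, f1, g1, memo))
        return memo[(f, g)]

    def size(self, f, seen=None):
        seen = set() if seen is None else seen
        if f < 2 or f in seen:
            return 0
        seen.add(f)
        return 1 + self.size(self.info[f][1], seen) + self.size(self.info[f][2], seen)

    def eval(self, f, val):
        while f > 1:
            var, lo, hi = self.info[f]
            f = hi if val[var] else lo
        return f

OPS = {"and": lambda a, b: a & b, "or": lambda a, b: a | b, "xor": lambda a, b: a ^ b}

def partition(pis, gates, outputs, B):
    """Build ROBDDs gate by gate; cut the operands when a result exceeds B."""
    mgr, parts, var_of, sig = Bdd(), {}, {}, {}
    for p in pis:
        var_of[p] = mgr.nvars
        sig[p] = mgr.new_var()
    for name, op, a, b in gates:
        f = mgr.apply(OPS[op], sig[a], sig[b], {})
        if mgr.size(f) > B:
            for x in (a, b):
                if mgr.size(sig[x]) > 1:       # not already a single variable
                    parts[x] = sig[x]          # close the partition rooted at x
                    var_of[x] = mgr.nvars
                    sig[x] = mgr.new_var()     # intermediate variable v
            f = mgr.apply(OPS[op], sig[a], sig[b], {})
        sig[name] = f
    for o in outputs:
        parts.setdefault(o, sig[o])
    return mgr, parts, var_of

def eval_parts(mgr, parts, var_of, pis, vec):
    """Evaluate partitions in creation order, feeding intermediate variables forward."""
    val = {var_of[p]: bit for p, bit in zip(pis, vec)}
    result = {}
    for name, root in parts.items():
        result[name] = mgr.eval(root, val)
        if name in var_of:
            val[var_of[name]] = result[name]
    return result

# Usage on a small hypothetical netlist: z = (x1^x2^x3^x4) AND (x1^x2).
pis = ["x1", "x2", "x3", "x4"]
gates = [("g1", "xor", "x1", "x2"), ("g2", "xor", "g1", "x3"),
         ("g3", "xor", "g2", "x4"), ("z", "and", "g3", "g1")]
B = 3
mgr, parts, var_of = partition(pis, gates, ["z"], B)
```

Each entry of `parts` is one ROBDD of the series; its root drives either a primary output or an intermediate variable, matching the structure described above.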

11
Example
[Figure: a multi-level logic network over x1, x2, x3, x4 and its partitioned ROBDDs, with intermediate variables y1, y2 and output z]
  • Given: a multi-level logic network with primary
    inputs x1, x2, x3, x4.
  • As bottom-up ROBDD construction proceeds, new
    variables y1 and y2 are created.
  • z is built in terms of y1 and y2.

12
Placement
  • First, replicate ITE cells whose outputs are
    heavily loaded, in order to limit fanout.
      • These correspond to ROBDD nodes with high in-degrees.
      • The in-degree of an ROBDD node is compared against a
        limit k to decide how many times the node is
        replicated [the exact expression is not preserved in
        this transcript].
      • We use k = 3.
  • Compute an initial estimate of the number of ITE cells
    n in any row of the ITE array and the number of
    rows m of the ITE array [formulas not preserved in
    this transcript], where
      • x = width of each ITE cell
      • y = height of each ITE cell
      • N = total number of ITE cells

13
Placement
  • Sort the N ITE cells in increasing order of their
    ROBDD variable index.
  • Variable index is a measure of closeness of
    variable to the root of ROBDD.
  • A variable closer to the root has smaller index
    than one further from the root.
  • Assign ITE cells to rows of the ITE array

14
Assigning ITE cells to rows
  • If there are nj ITE cells with variable index vj
    such that nj > n (n = number of ITE cells that
    can fit in one row):
      • These ITE cells need to span rows.
      • Sort these nj cells in decreasing order of a cost
        C, defined in terms of ci (the children of node c)
        and cj (the parents of node c) [cost formula not
        preserved in this transcript].
      • This helps keep routes short.

[Figure: example with nodes a and b placed among Levels 2 through 6; Cost(a) = 5 - 2 = 3, Cost(b) = 3 - 3 = 0]
15
Assigning ITE cells to rows
  • If there are nj ITE cells with variable index vj
    such that nj < n:
      • Attempt to populate the corresponding row of the ITE
        array with additional ITE cells with variable
        index v(j+1).
      • If the row is still not full, add ITE cells with
        variable index v(j+2) as well.
  • Each row can hold ITE cells which depend on at
    most 3 variables, since the number of variables
    that can be routed over any ITE cell is 3.
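
A minimal sketch of this row-filling rule, assuming a simple greedy pass over the cells sorted by variable index. The function name `assign_rows` and its arguments are hypothetical, and the cost-based ordering of cells that share a variable index is left out because its formula is not given in the transcript.

```python
def assign_rows(cell_vars, n, max_vars=3):
    """Greedy row assignment.

    cell_vars: dict mapping cell name -> ROBDD variable index.
    n:         number of ITE cells that fit in one row.
    max_vars:  number of distinct variables routable over a row (3 in the talk).
    """
    order = sorted(cell_vars, key=lambda c: cell_vars[c])   # stable sort
    rows, row, row_vars = [], [], set()
    for cell in order:
        v = cell_vars[cell]
        row_full = len(row) == n
        too_many_vars = v not in row_vars and len(row_vars) == max_vars
        if row_full or too_many_vars:
            rows.append(row)              # close this row and start the next one
            row, row_vars = [], set()
        row.append(cell)
        row_vars.add(v)
    if row:
        rows.append(row)
    return rows

# Usage: five cells on variable 0 spill into a second row, which is then
# topped up with cells on variables 1 and 2 until the 3-variable limit is hit.
cells = {"a": 0, "b": 0, "c": 0, "d": 0, "e": 0, "f": 1, "g": 2, "h": 3, "i": 4}
rows = assign_rows(cells, n=4)
```

Cells with the same index that do not fit in one row span consecutive rows, and a short row is filled with cells of the next indices, as described on the two slides above.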

16
Placement of ITE cells within rows
  • ITE cells are arranged within rows to reduce
    crossings in the induced circuit graph (after
    planarization of the array of ITE cells).
  • Use DOT (graphviz.org) to do this.
  • DOT only re-arranges cells in each ITE row in a
    manner that minimizes graph crossings.
  • DOT is not allowed to modify the assignment of
    ITE cells to rows.
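
One plausible way to keep DOT from moving cells between rows is to put each row in its own `rank=same` subgraph, so that DOT only chooses the left-to-right order within a rank. The talk does not say how the authors constrained DOT, so the sketch below is an assumption; `to_dot` is a hypothetical helper that only emits the DOT text and does not run Graphviz.

```python
def to_dot(rows, edges):
    """Emit a DOT graph in which every ITE row is pinned to a single rank.

    rows:  list of rows, each a list of cell names (row 0 is the top row).
    edges: list of (parent, child) connections between cells.
    """
    lines = ["digraph ite_array {", "  rankdir=TB;", "  node [shape=box];"]
    for r, cells in enumerate(rows):
        names = "; ".join(f'"{c}"' for c in cells)
        lines.append(f"  {{ rank=same; {names}; }}  // row {r}")
    for src, dst in edges:
        lines.append(f'  "{src}" -> "{dst}";')
    lines.append("}")
    return "\n".join(lines)

# Usage with a hypothetical two-row array.
dot_text = to_dot([["a", "b"], ["c", "d"]],
                  [("a", "c"), ("a", "d"), ("b", "c")])
```

The resulting text could be passed to the `dot` layout program, and the horizontal order it picks within each rank read back as the order of cells in the row.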

17
Implementing Sequential Designs
  • Each row of ITE cells has a bank of 3 flip-flops.
  • Outputs of the flops can drive one of the inputs
    by means of a METAL and VIA mask change.

18
Route
  • Use WROUTE (in Cadence's Silicon Ensemble for
    DSM) to route the ITE cell array.
  • Use 4 metal layers for the route.

[Figure: example, alu2]
19
Summary of Design Flow
  • Convert netlist to partitioned ROBDD in VIS.
  • Perform cell replication if required to limit
    fanout.
  • Perform ITE cell assignment to rows.
  • Re-arrange ITE cells within rows using DOT to
    minimize crossings in the graph induced by the
    interconnections among the ITE cells.
  • Use the result of DOT as the final placement and
    perform routing using WROUTE (or any other
    routing tool).

20
Ease of Testability
  • In traditional scanned standard-cell based
    circuits:
      • The ATPG problem is NP-complete.
  • In our scanned ITE cell based approach:
      • In functional mode, partitioned ROBDD outputs are
        regular inputs to other partitions.
      • In test mode, primary inputs and the outputs of each
        partition are scanned in to allow independent
        testing of the different partitions.

21
Abstract View of Partitioned ROBDDs
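[Figure: abstract view of partitioned ROBDDs between the primary inputs (PIs: x1, x2, x3, x4, ...) and the primary output (PO: z); the intermediate variables y1 and y2 are additional scan-able nodes]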
22
Ease of Testability - Excitation
ROBDD of
  • Path from to
  • Linear time BDD operation

23
Ease of Testability - Propagation
ROBDD of
  • Path from to
  • Again a Linear time BDD operation
  • Support variables for both conditions are
    non-overlapping.
  • The circuit is guaranteed irredundant.
  • 100% stuck-at fault coverage is guaranteed in time
    linear in the size of the circuit.

24
Experiments
  • To compare with standard-cell based design, the
    circuits were mapped to a library of 20 gates.
      • Used SIS for optimization (script.rugged) and
        technology mapping (map).
      • Placement and routing were done using SEDSM with a
        0.1 µm process and 4 metal layers.
  • Delay of the standard-cell based designs:
      • Pre-characterized the library using SPICE (0.1 µm
        BPTM).
      • Used the sense package in SIS.
      • sense returns the longest sensitizable path (false
        paths are implicitly ignored).

25
Experiments
  • Partitioned ROBDD construction done using the
    frontier method in VIS.
  • Tried the following different partitioning
    threshold numbers (B).
  • 5, 10, 15, 20 and 1000.
  • For each circuit, the result that yielded the
    smallest number of ROBDD nodes was selected.
  • This partitioned ROBDD structure was then taken
    through our design flow.

26
Experiments
  • Delay of the ITE cell array:
      • Found by traversing the longest topological path (in
        terms of number of ITE cells) between any circuit
        PI and PO.
      • The delay at each ITE cell is given by:
          • If the variable is a primary input:
            D(cell) = MAX(D(leftchild), D(rightchild)) + D(ITE block)
          • If the variable is an internal node:
            D(cell) = MAX(D(variable), D(leftchild), D(rightchild)) + D(ITE block)
      • D(ITE block) was found from SPICE simulations (0.1 µm
        BPTM).
      • Assumed that the ITE cell drives the maximum load
        allowed; hence the delay estimates are conservative.
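
The delay recurrence can be written as a short memoized traversal. This is a sketch under assumptions: the cell dictionary format and the name `array_delay` are hypothetical, primary inputs and constants are taken to arrive at time 0, and the per-cell delay of 50 in the usage example is a made-up number, not a value from the talk.

```python
def array_delay(cells, outputs, d_ite):
    """Longest-path delay through an ITE cell network.

    cells:   dict mapping cell name -> (variable, left child, right child).
             Any name that is not a key of `cells` is a primary input or a
             constant and is assumed to arrive at time 0.
    outputs: names of the cells driving primary outputs.
    d_ite:   delay of one ITE block.
    """
    memo = {}

    def d(name):
        if name not in cells:
            return 0.0
        if name not in memo:
            var, left, right = cells[name]
            # For a primary-input variable d(var) is 0, so this single formula
            # covers both cases of the recurrence.
            memo[name] = max(d(var), d(left), d(right)) + d_ite
        return memo[name]

    return max(d(o) for o in outputs)

# Usage: cell z is controlled by the intermediate variable y1, which is
# itself the output of another cell.
cells = {
    "n1": ("x2", "0", "1"),
    "y1": ("x1", "0", "n1"),
    "z":  ("y1", "0", "1"),
}
delay = array_delay(cells, ["z"], d_ite=50.0)
```

In the usage example the longest chain is n1, y1, z, so the result is three ITE block delays.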

27
Results (Combinational designs)
Ckt.        Evaluation Delay           Area
            StdCell   ITE     Ovh      StdCell    ITE       Ovh
alu2        770       500     0.65     1314.1     2560      1.95
alu4        1020      527     0.52     2500       5068.8    2.03
apex6       500       1310    2.57     2678.1     14585.6   5.45
apex7       440       1030    2.34     885.1      4608      5.21
C1908       880       2590    2.91     1827.6     8288      4.53
C3540       1250      3050    2.44     4323.1     29491.2   6.82
C432        930       3070    3.3      715.6      4640      6.48
C499        600       1070    1.78     1827.6     3974.4    2.17
C880        1210      2750    2.27     1463.1     8985.6    6.14
dalu        1110      2460    2.22     3164.1     39916.8   12.62
frg2        810       1700    2.1      2575.6     24441.6   9.49
i8          880       1560    1.77     4064.1     40320     9.92
i9          850       810     0.95     2383.2     14035.2   5.89
t481        720       600     0.83     2626.6     6080      2.31
term1       320       730     2.28     663.1      2355.2    3.55
too_large   510       1550    3.04     1105.6     10560     9.55
vda         650       600     0.92     1508.03    6080      4.03
x1          380       950     2.5      1105.6     9625.6    8.71
x3          510       1660    3.25     2756.25    16844.8   6.11
x4          440       650     1.48     1314.1     11264     8.57
Avg                           2.01                          6.08
  • Delay penalty is 2X
  • Area Penalty is 6X
  • FPGAs typically have a 25X delay penalty and a
    10X area penalty.

28
Results (Sequential designs)
  • Delay penalty is 1.6X.
  • Area penalty is 3.4X.
  • FPGAs typically have a 25X delay penalty and a
    10X area penalty

Ckt.        Evaluation Delay           Area
            StdCell   ITE     Ovh      StdCell    ITE       Ovh
s1488       630       650     1.03     3277.6     6240      1.9
s1494       650       600     0.92     3108.1     6400      2.06
s208        270       550     2.04     105.1      1459.2    13.88
s344        390       650     1.67     715.6      2649.6    3.7
s349        410       650     1.59     742.6      2649.6    3.57
s386        290       550     1.9      885.1      2060.8    2.33
s444        380       700     1.84     1105.6     2880      2.6
s510        390       400     1.03     1105.6     3161.6    2.86
s526        330       700     2.12     1314.1     2355.2    1.79
s526n       330       700     2.12     1314.1     2457.6    1.87
s820        560       650     1.16     1827.6     3968      2.17
s832        570       650     1.14     1827.6     3968      2.17
Avg                           1.55                          3.41
29
Speed-up of ATPG
Ckt         Regular ATPG (SIS)   ATPG for ITE   Improve
C1908       0.78                 0.02           39.00
C3540       4.84                 0.02           242.00
C432        0.1                  0.52           0.19
C499        0.32                 0.01           32.00
C880        0.16                 0.01           16.00
frg2        17.21                0.45           38.24
i8          16.26                0.16           101.63
i9          0.6                  0.03           20.00
apex7       0.05                 0.04           1.25
x3          1.95                 0.19           10.26
apex6       0.94                 0.27           3.48
term1       0.56                 0.02           28.00
alu2        0.3                  0.02           15.00
alu4        1.47                 0.47           3.13
too_large   8.83                 0.41           21.54
vda         3.42                 4.37           0.78
x1          0.26                 0.43           0.60
x4          0.32                 0.28           1.14
Avg.                                            31.90
  • ATPG is about 30X faster for ITE cell based
    circuits.
  • ITE based circuits are guaranteed irredundant and
    100% testable in linear time!
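
The "about 30X" figure is the arithmetic mean of the per-circuit ratios in the Improve column. The short check below recomputes it from the two time columns of the table; the variable names are arbitrary, and the time unit is left unstated because the table does not give one.

```python
# (regular ATPG time, ITE ATPG time) per circuit, copied from the table above.
times = {
    "C1908": (0.78, 0.02), "C3540": (4.84, 0.02), "C432": (0.1, 0.52),
    "C499": (0.32, 0.01), "C880": (0.16, 0.01), "frg2": (17.21, 0.45),
    "i8": (16.26, 0.16), "i9": (0.6, 0.03), "apex7": (0.05, 0.04),
    "x3": (1.95, 0.19), "apex6": (0.94, 0.27), "term1": (0.56, 0.02),
    "alu2": (0.3, 0.02), "alu4": (1.47, 0.47), "too_large": (8.83, 0.41),
    "vda": (3.42, 4.37), "x1": (0.26, 0.43), "x4": (0.32, 0.28),
}

# Speed-up of the ITE flow for each circuit.
ratios = {ckt: regular / ite for ckt, (regular, ite) in times.items()}

mean_speedup = sum(ratios.values()) / len(ratios)
faster = sum(r > 1 for r in ratios.values())

print(f"mean speed-up: {mean_speedup:.2f}")
print(f"ITE flow faster on {faster} of {len(ratios)} circuits")
```

The mean is pulled up by a few large ratios such as C3540 and i8, while three circuits (C432, vda, x1) have a ratio below 1.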

30
Conclusions
  • We have a method that can implement circuits
    more quickly, with NRE amortized over a large
    number of designs.
  • It strikes a reasonable compromise between ASICs and
    FPGAs.
  • An ITE cell based design is easily testable:
      • 100% testable in linear time
      • Guaranteed irredundant
  • Testability gains arise from the use of the
    partitioned ROBDD based PTL design approach.
      • The same gains can be reaped in a regular PTL design
        approach.
  • Can be modified to efficiently test for other
    faults:
      • Delay faults, stuck-open faults, etc.

31
Questions ?