FIFO Chip Design Example - PowerPoint PPT Presentation

Slides: 80
Provided by: drdavid

1
FIFO Chip Design Example
  • EE166
  • SJSU
  • David Parent

2
FIFO Example
  • We will now try to put together the concepts of
  • Cell based design
  • Super Buffer
  • Clock trees
  • IP reuse
  • Getting a chip into a Pad frame
  • FIFO
  • Simple, Regular

3
Getting Started
  • The first thing we must do is decide on the pins in
    an actual pad frame with the package.
  • This will give us the context we need to make
    intelligent decisions about routing.

4
MOSIS Pad frame
  • The standard tiny chip from MOSIS can support 40
    pins.
  • You need to start with the pin out of the actual
    packaged chip to make the part useable and
    testable.
  • We will use pin 1 as VDD and 21 as GND as a
    standard. Inputs will come in at the top
    (2-20) and outputs in general will go out the
    bottom (21-40).
  • We will choose pin 2 for CK, 3 for NPRE and 4 for
    NCLR
  • A0-A15 will map to pins 5-20.
  • Y0-Y15 will map to pins 40-25.
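The pin assignments above can be written down and checked mechanically. The following is a minimal sketch, assuming the mapping exactly as listed on this slide; the function and variable names are illustrative and not part of the original design flow.

```python
# Pin list sketch. Pin numbers come from the slide; the names
# build_pin_map and assign are hypothetical helpers.
PIN_COUNT = 40


def build_pin_map():
    pins = {}

    def assign(pin, name):
        # Refuse to assign the same package pin twice.
        if pin in pins:
            raise ValueError(f"pin {pin} already used by {pins[pin]}")
        pins[pin] = name

    assign(1, "VDD")
    assign(21, "GND")
    assign(2, "CK")
    assign(3, "NPRE")
    assign(4, "NCLR")
    # A0-A15 map to pins 5-20 (ascending).
    for bit in range(16):
        assign(5 + bit, f"A{bit}")
    # Y0-Y15 map to pins 40-25 (descending).
    for bit in range(16):
        assign(40 - bit, f"Y{bit}")
    return pins


pin_map = build_pin_map()
unused = [p for p in range(1, PIN_COUNT + 1) if p not in pin_map]
print("unused pins:", unused)
```

With this mapping, 37 of the 40 pins are assigned and pins 22-24 are left over, which agrees with the later slide that changes pins 22-24 to unused.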

5
Packaged Part
Note: I will not fab a part without the pin
list!
6
Sample Pad frame
Area inside is 895 µm by 895 µm.
pin 21
pin 1
You can get more area by using fewer pins. (Read
data in serially?)
You can have larger circuits but they use up
more MOSIS money
7
Bonding Diagram
This goes in the package.
8
How big a FIFO can we make?
  • Our DFF is 72 µm x 36 µm in area.
  • A MOSIS tiny chip gives you about 900 µm x 900 µm
    of space. Assume that we can only use ½ the
    space.
  • This can be increased if you use fewer than 40
    pads.
  • Number of rows: 450/36 gives 12
  • Number of columns: 900/72 gives 12
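The row and column counts above are just integer divisions of the usable space by the cell size. A minimal sketch of that arithmetic, with dimensions in microns taken from this slide (the helper name is an assumption for illustration):

```python
import math


def cells_that_fit(available, cell_size):
    """Number of whole cells that fit in the available length."""
    return math.floor(available / cell_size)


# Half of the ~900 micron height is assumed usable, per the slide.
rows = cells_that_fit(450, 36)   # DFF height is 36
cols = cells_that_fit(900, 72)   # DFF width is 72
print(rows, cols)
```

Both divisions give 12.5, so only 12 whole cells fit in each direction.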

9
Saving Space
We could get rid of not clock by adding an
inverter and save 3 µm.
We could overlay the clock and reset signals and
save 10 µm.
We could overlay the ground wires and save 3 µm.
10
Trade offs
  • Replacing not clock with an inverter.
  • New Cell Height 33 (450/33 gives 13)
  • New Cell Width 728 (900/ gives 11)
  • Routing is easier
  • Do not have to worry about skew between not clock
    and clock
  • Will the power go up?
  • Maybe. You would need another super buffer to
    drive not clock. In this case you only need one.

11
Trade offs
  • Overlay the reset and clock signals
  • New average Cell Height 31 (450/31 gives 14)
  • No New Cell Width
  • Need two DFF parts: one flipped, with different
    wiring to the global signals, and one unchanged
  • We already need two types of FF: one with D and not
    D inputs, and the other with a D input only. This
    would make 4 different FFs!

12
Trade offs
  • Overlay the ground signals
  • New average Cell Height 34.5 (450/34.5 gives 13)
  • No New Cell Width
  • Electromigration?
  • Nothing works!
  • We have to try it all!
  • Still only 15 wide!
  • We could shrink height by 3 µm which would give us
    16 bits wide but then AOI logic would not fit
    into the cell height.
  • We beg the senior engineer for 50 µm more space.
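To compare the three trade-offs side by side, the row counts quoted on the trade-off slides can be recomputed from the cell heights. This is a sketch using only the heights stated on those slides (33, 31 and 34.5 microns) and the 450 micron budget; the dictionary keys are descriptive labels, not names from the design.

```python
import math

AVAILABLE_HEIGHT = 450  # microns, half of the tiny-chip interior
TARGET_ROWS = 16        # one row per data bit

# Average cell height for each option, as quoted on the slides.
options = {
    "replace not clock with an inverter": 33,
    "overlay reset and clock": 31,
    "overlay ground": 34.5,
}

rows_per_option = {
    name: math.floor(AVAILABLE_HEIGHT / height)
    for name, height in options.items()
}

for name, rows in rows_per_option.items():
    verdict = "enough" if rows >= TARGET_ROWS else "not enough"
    print(f"{name}: {rows} rows ({verdict} for {TARGET_ROWS} bits)")
```

Taken one at a time, none of the options reaches 16 rows in 450 microns, which is why the slides go on to combine them and ask for more space.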

13
4 DFFs
20 min
40 min
5 min
7 min
  • All the FFs need to have the not clock removed!
  • We have to verify 4 new parts from one old
    part!
  • This will take some time!
  • No choice.
  • New average cell height: 31.375. 16 bits high will
    give less than 500 microns so it will fit in
    the expanded space.

14
Derivative DFF Design
It also helped that my NAND3 was designed to
have flexible routing, rather than minimum area.
I really saved some time by reusing the same
template.
15
Not D Internal Routing Up
  • Not CK is provided by inverters to be added as
    required. Not D is generated by the NAND2 from
    D. Since we will not be operating with a clock
    period of less than 1 ns, the increased setup time
    will not matter.

16
DFF_DI_RU
NPRE
NCLR
CK
Q QN
Use the nand as an inverter.
D
17
DFF_DI_RD
CK NPRE NCLR
18
DFF_DE_RU
19
DFF_DE_RU
D ND
20
DFF_DE_RD
Q QN
D ND
21
DFF_INV
22
Design Review
  • After looking at the parts so far it looks like
    there could be an electromigration problem where
    the VDD is brought into the circuit

Since all the FFs use the same basic parts, we
just have to fix it once in each cell. You can
even edit it in place! I had to flatten the NTAP
to do this. I had to add some nwell due to a DRC
error.
23
New DFF Structure
24
Back to the FIFO
  • We can fit 16 bits high within 500 microns
  • We can fit 900/80 long (11)
  • We can do a FIFO 10 bits deep.
  • We will use 16 x 10 DFF (160)
  • 8 DFF_DI_RU
  • 8 DFF_DI_RD
  • 72 DFF_DE_RU
  • 72 DFF_DE_RD
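The part counts above can be cross-checked against the 16 x 10 array. The sketch below uses the counts from this slide; the reading that the DI parts form the first column and that RU/RD parts alternate by row is an assumption made for illustration, not something the slide states.

```python
WIDTH_BITS = 16  # bits high
DEPTH = 10       # stages long

# Counts per DFF variant, as listed on the slide.
inventory = {
    "DFF_DI_RU": 8,
    "DFF_DI_RD": 8,
    "DFF_DE_RU": 72,
    "DFF_DE_RD": 72,
}

total_dff = sum(inventory.values())

# Assumption: DI parts fill one column, DE parts fill the other nine.
di_parts = inventory["DFF_DI_RU"] + inventory["DFF_DI_RD"]
de_parts = inventory["DFF_DE_RU"] + inventory["DFF_DE_RD"]

print(total_dff, di_parts, de_parts)
```

The totals are consistent: 16 DI parts (one column of 16 bits) plus 144 DE parts (nine columns) gives the 160 DFFs the slide calls for.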

25
Gut check on power
  • 160 DFF
  • Each one has 21 NMOS and 21 PMOS
  • This is like having 21 inverters
  • Total number of inverters is 3360 (6720
    transistors)
  • The power for one inverter at 30 MHz is

The power for an alpha of one and 3360 inverters
is 58 mW. We know that not all transistors
switch every clock cycle, so this is an upper
bound.
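The per-inverter expression did not survive in this transcript, so the sketch below works backwards from the numbers that did: 160 DFFs, 21 inverter equivalents each, 58 mW total at 30 MHz. It does not guess the capacitance or supply voltage the author used. With an activity factor of one, the usual dynamic power relation P = alpha * C * V^2 * f implies that the energy per cycle computed here equals C * V^2 for one inverter.

```python
DFF_COUNT = 160
INVERTERS_PER_DFF = 21     # 21 NMOS + 21 PMOS treated as 21 inverters
TOTAL_POWER_W = 58e-3      # upper bound quoted on the slide
CLOCK_HZ = 30e6

inverters = DFF_COUNT * INVERTERS_PER_DFF
transistors = 2 * inverters

# Implied figures, derived only from the slide's totals.
power_per_inverter_w = TOTAL_POWER_W / inverters
energy_per_cycle_j = power_per_inverter_w / CLOCK_HZ

print(f"{inverters} inverters, {transistors} transistors")
print(f"{power_per_inverter_w * 1e6:.2f} uW per inverter")
print(f"{energy_per_cycle_j * 1e12:.3f} pJ per inverter per cycle")
```

This gives roughly 17 µW per inverter, or a little under 0.6 pJ switched per inverter per clock cycle.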
26
30 MHz! What happened to 200MHz?
  • With no PLL and the data coming from off chip, the
    maximum clock rate and off-chip speed is about 30
    MHz! One could design special output buffers, but
    these are tricky and would use more power!
  • We will continue to test at a higher speed
    because the simulation will go faster!
  • For the final pin to pin simulation we will have
    to simulate at 30 MHz for at least 20 clock
    periods.
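As a quick check on what the final pin-to-pin simulation requires, the minimum simulated time follows directly from the clock rate and the 20-period requirement. A small sketch (variable names are illustrative):

```python
CLOCK_HZ = 30e6
MIN_PERIODS = 20

period_ns = 1e9 / CLOCK_HZ
min_sim_time_ns = MIN_PERIODS * period_ns

print(f"clock period: {period_ns:.2f} ns")
print(f"minimum simulated time: {min_sim_time_ns:.1f} ns")
```

At 30 MHz each period is about 33.3 ns, so 20 periods means simulating roughly 667 ns of circuit time.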

27
FIFO Schematic
Start off with the basic structure that can be
copied and pasted.
28
FIFO Schematic
Hard to see!
29
FIFO Schematic Complete
30
FIFO Symbol
31
Verilog takes less than a second to verify.
32
Verilog Test bench
33
Spice Test Bench
34
Input Vectors
35
Input Vectors
36
Input Vectors
37
Output
38
Output
39
Spice Summary
  • The circuit has been validated
  • The simulation took about 10 minutes to run!

40
Layout
Pre-route CK, NPRE and NCLR
Set up the first 4 FFs
Then make it 10 across
41
Layout with only cells
42
Route VDD and Ground
43
Final FIFO Layout
GND
VDD
DATA FLOW
44
Final Layout Verification
45
Post Extraction Simulation
46
Modify a Pad frame
  • The parts we need are
  • input buffer (padinc)
  • output buffer (padio)
  • corners (fc)
  • VDD pad (padvdd)
  • GND pad (padgnd)
  • You can FTP a sample pad frame from MOSIS
  • http://www.mosis.org/Technical/Designsupport/pad-library-scmos.html
  • The Docs are there as well.

47
Sample Padframe
Load in a sample padframe. To change a pin, just
select it, press q to edit, and then change
the name to what you want.
48
Change Pin 21 from padinc to padgnd
padinc to padgnd
49
Make sure pads abut.
correct
Butt Metal 1 together and make sure the PSEL line
is on the horizontal axis.
Not correct!
Pin 26
50
Change pins 22-24 to unused
51
FIFO_PF
  • After you make a padframe, open up a new cell and
    add an instance of your padframe.
  • Then add pins.

52
Create Pin Names
Finished pin
Do not use a global Variable for VDD and GND!
Use metal 3 input/output
Use a width of 50
53
Pin to Pin Test Bench
  • VDD and GND cannot be global; they have to be
    direct pins.
  • The test bench and symbol are almost identical to
    those of the FIFO part
  • Copy the FIFO to a new cell called FIFO_PF
  • Edit the symbol to add two new pins, VDD and GND,
    which are in/out.
  • Change the schematic accordingly.

54
FIFO_PF Layout
  • Parts
  • Super buffer for clock
  • FIFO
  • Padframe
  • Stamp them down and wire them up!

55
PAD I/O
OEN: output enable
DO: sends output to the pad
DB: inverts data coming in from the pad
DI: does not invert data coming in from the pad
DB DI
OEN DO
56
PAD VDD/GND
GND
57
Connect a metal 2 path 2.7u wide to DIB
58
DRC
  • The pads do not pass DRC. Pads are an exception
    to the rule, and these pads have proven
    themselves in the field.
  • We have to do DRC on everything else.
  • Draw a "do not DRC" layer around the pads, with
    its edge just over the metal 2 connection. (We
    want to make sure we are connected, right?)

59
Do not DRC Layer
Put it just over the metal 2
60
FIFO_PF Symbol
Rearrange ports to match chip.
Note: Delete the schematic view of the FIFO_PF.
61
FIFO_PF_TB
62
Simulation Trouble
  • I ran into some major trouble that took two days
    to fix.
  • The pad frame I had built was off a little bit
  • It takes 3 minutes to extract and almost 10
    minutes to simulate. (Then you see an error.)
  • That is almost 15 minutes to try each solution.
  • Also, the pad frame kept moving on me: I would
    inadvertently move it while zoomed in on a
    small feature.

63
Solutions
  • Start off with just the pad frame and just wire
    the inputs and outputs together and see if it acts
    like a buffer.
  • Build the pad frame in the top most cell.
  • Put all the parts in the center.
  • Select everything inside the pad frame and make
    it a cell.
  • Edit the cell in place.
  • Do the shortest simulation possible to make sure
    everything is connected (Like a reset or set
    operation)
  • You can probe the extracted wires by descending
    into the extracted view and selecting the
    wire you want.
  • Get the pad frame working on its own in parallel
    with the circuit design.

64
The rise time when NPRE goes low is 4 ns!
The output Y seems ok and since the circuit is
supposed to work at 30 MHz I will not try to
buffer the signal. This simulation ran for 8 min
40 s to get 20 ns.
65
A B
C
66
Analysis
  • Point A is when NCLR goes low, thus changing the
    state of every FF. (Notice the power surge.)
  • Point B is when NCLR goes high and the FIFO
    begins to fill.
  • Point C is when all the FFs are turning on at
    the same time. (Note the power surge.)
  • We will take the average power from 600 to
    700 ns, and average over 4 clock cycles.
  • Average power: 305 mW/4 gives 76 mW

67
Outputs Y15-Y8
A
B
10 Clock Cycles for Y to get A after NCLR goes
high.
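The 10-cycle latency noted above is what one would expect from a 16-bit-wide, 10-stage chain of DFFs. The following behavioral sketch illustrates that, assuming the FIFO behaves as a plain shift register with active-low clear (NCLR) and preset (NPRE) acting on every stage. The class and method names are hypothetical, and this is a model for intuition only, not the author's Verilog.

```python
class ShiftFifo:
    """Behavioral model: `depth` columns of `width`-bit registers."""

    def __init__(self, width=16, depth=10):
        self.mask = (1 << width) - 1
        self.stages = [0] * depth

    def clear(self):
        # Models NCLR going low: every flip-flop is forced to 0.
        self.stages = [0] * len(self.stages)

    def preset(self):
        # Models NPRE going low: every flip-flop is forced to 1.
        self.stages = [self.mask] * len(self.stages)

    def clock(self, a):
        # One rising clock edge: input A enters the first column and
        # every word moves one column toward the output Y.
        self.stages = [a & self.mask] + self.stages[:-1]
        return self.stages[-1]


def cycles_until_output(fifo, word):
    """Clock `word` in after a clear; return the cycle on which Y == word.

    Use a nonzero word, since a cleared FIFO already outputs 0.
    """
    fifo.clear()
    for cycle in range(1, len(fifo.stages) + 1):
        if fifo.clock(word) == word:
            return cycle
    return None


latency = cycles_until_output(ShiftFifo(), 0xABCD)
print("cycles for Y to equal A:", latency)
```

In this model the word presented on A first appears on Y on the tenth clock edge after the clear is released, in line with the waveform annotation.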
68
Outputs Y7-Y0
69
Inputs A15-A8
70
Inputs A7-A0
71
Final FIFO Layout
See how much area is taken up by routing!
Super Buffer
A0
NCLR
D A T A F L O W
NPRE
CK
A15
VDD
Y0
GND
Y15
72
Statistics (Working alone in the middle of the
summer)
  • FIFO_PF
  • DRC: 55 s
  • Extract: 190 s
  • Quick simulation (20 ns): 520 s
  • Long 30 MHz simulation (500 ns): XXX s
  • Whole chip: 48 man hours.
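These step times add up to the turnaround for one debugging iteration. A small sketch of that sum, using only the three times that are given (the long-simulation time is not stated, so it is left out):

```python
# Times in seconds, from the statistics slide.
step_times_s = {
    "DRC": 55,
    "extract": 190,
    "quick simulation (20 ns)": 520,
}

turnaround_s = sum(step_times_s.values())
turnaround_min = turnaround_s / 60

print(f"one DRC/extract/simulate pass: {turnaround_s} s "
      f"({turnaround_min:.2f} min)")
```

One full pass comes to a little under 13 minutes, which lines up with the "almost 15 minutes to try each solution" mentioned on the simulation trouble slide.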

73
Design Statistics
  • This does not include the time to write the
    documentation.
  • The total project took somewhere between 40 and 60
    hours.
  • This means that documentation can take a lot of
    time.
  • The numbers do not add up, but one can easily see
    that the time required to complete a step goes up
    at best linearly with the number of transistors
    and at worst exponentially!
  • You need to plan accordingly!

74
DRC and Extract time vs. gate count
75
Simulation time vs. gate count
76
Design time vs. gate count
77
How fast do I work?
Can you measure your output in transistors per
hour?
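One rough way to answer this for the FIFO project is to divide the transistor count by the hours spent. The sketch below assumes the 6720 transistors from the power gut check (DFFs only, so the super buffer and pads are not counted) and the 48 man hours from the statistics slide.

```python
DFF_TRANSISTORS = 6720   # 160 DFFs x 42 transistors, DFFs only
MAN_HOURS = 48           # whole-chip figure from the statistics slide

transistors_per_hour = DFF_TRANSISTORS / MAN_HOURS
print(f"{transistors_per_hour:.0f} transistors per hour")
```

Because most of those transistors come from copying a few verified cells, this figure says more about the benefit of design re-use than about hand-drawing speed.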
78
Design Review
  • Clock rate: 30 MHz
  • Power: 76 mW
  • Area: 1500 µm x 1500 µm = 2.25x10^6 µm^2
  • Power density: 76 mW/Area gives about 0.33x10^-7
    W/µm^2. No cooling required!
  • 2.3x10^-7 W/µm^2: no cooling
  • 1.0x10^-6 W/µm^2: with expensive cooling
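The power density check can be reproduced in a few lines. This sketch assumes the chip dimensions are in microns (1500 x 1500, i.e. a 1.5 mm square die) and reads the two reference figures on the slide as thresholds: at or below the first, no cooling is needed. That reading is an interpretation, not something the slide spells out.

```python
POWER_W = 76e-3
SIDE_UM = 1500

NO_COOLING_LIMIT = 2.3e-7          # W per square micron, from the slide
EXPENSIVE_COOLING_LIMIT = 1.0e-6   # W per square micron, from the slide

area_um2 = SIDE_UM * SIDE_UM
power_density = POWER_W / area_um2
needs_cooling = power_density > NO_COOLING_LIMIT

print(f"area: {area_um2:.3g} square microns")
print(f"power density: {power_density:.3g} W per square micron")
print("cooling needed" if needs_cooling else "no cooling required")
```

The result is about 3.4x10^-8 W per square micron, well under the no-cooling figure.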

79
Lessons Learned
  • Design re-use is a faster method of design.
  • Getting the circuit into a pad frame can take a
    large amount of time.
  • Get pad frame done before you need it.
  • Verilog simulations are very fast but give no
    timing data unless it is built in.
  • The project will always take longer than
    expected! (even if you plan for it!)