SiGe HBT BiCMOS Field Programmable Gate Arrays for Fast Reconfigurable Computing

1
SiGe HBT BiCMOS Field Programmable Gate Arrays
for Fast Reconfigurable Computing
  • Bryan S. Goda
  • Rensselaer Polytechnic Institute
  • Troy, New York

2
Agenda
  • Introduction
  • BiCMOS FPGA History
  • SiGe HBT BiCMOS Process
  • Current Mode Logic
  • Xilinx 6200 FPGA Design
  • Configuration Memory
  • Performance Results
  • Conclusions and Future Work

3
Current Role of SiGe
  • More Zip per Chip
  • Wireless Phones -> Watch-Sized Phone
  • Direct Broadcast Satellite
  • Fiber-Optic Lines, Switches, and Routers

4
Programmable Bipolar Logic
  • 1983 Fairchild ECL Field Programmable Logic
    Array
  • Fuse Based
  • 4 ns Cycle Rate
  • High Power
  • Scaling Problems
  • 1990 Algotronix 1.2 µm 256-Cell Configurable
    Logic Array
  • fT of 6 GHz, 200 ps Gate Delay
  • 4 Transistor Static RAM Memory Cells
  • ASIC Emulation and Signal Processing
  • Forerunner of XC6200

5
US Patent CMOS Switchable 2 Input Multiplexer
6
SiGe Heterojunction Bipolar Transistor
  • Selectively introduce Ge into the base of a Si
    BJT
  • Smaller base bandgap increases e- injection,
    giving higher Beta (100)
  • Higher Beta allows a more heavily doped base,
    lowering RB (125 Ohm)
  • Graded bandgap decreases base transit time, raising fT

8
SiGe HBT
  • 50 GHz process, 100 GHz process within a year (30 µA
    at 50 GHz)
  • 5 layers of metal
  • Used in RPI VLSI Class
  • co-integrated with CMOS process
  • can have HBT logic with CMOS memory
  • low power and high speed

9
fT Curves for Various Emitter Lengths
10
SiGe HBT Layout
Emitter
Base
Collector Sub-Collector
11
Band Diagram
[Figure: energy band diagram across the n Si emitter, p-SiGe base, and n- Si collector, with a p-Si reference. Labels: EC, EV, e-, h, Ge profile, drift field, 0.031 eV.]
ΔEg,Ge(grade) = ΔEg,Ge(x=Wb) - ΔEg,Ge(x=0)
Dielectric Constant: Si 11.7, Ge 16.2, SiGe (7.5% Ge) 12.03
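
The SiGe dielectric constant quoted on this slide can be roughly reproduced by interpolating between the Si and Ge values. The sketch below assumes a simple linear mixing rule and reads the slide's "7.5 Ge" as a 7.5% Ge fraction; the slide does not say how its figure was obtained, and the function name is only illustrative.

```python
# Linear interpolation of the relative dielectric constant between Si and Ge.
# Assumption: a linear mixing rule; the slide does not state its method.
EPS_SI = 11.7  # from the slide
EPS_GE = 16.2  # from the slide


def sige_permittivity(ge_fraction: float) -> float:
    """Relative permittivity of Si(1-x)Ge(x) under a linear mixing rule."""
    if not 0.0 <= ge_fraction <= 1.0:
        raise ValueError("ge_fraction must be between 0 and 1")
    return EPS_SI + ge_fraction * (EPS_GE - EPS_SI)


if __name__ == "__main__":
    # 7.5% Ge, as assumed from the slide
    print(f"SiGe (7.5% Ge): {sige_permittivity(0.075):.2f}")
```

Under this assumption the result lands within about 0.01 of the 12.03 shown on the slide.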
12
CML Branch Current vs. Differential DC Voltage
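
The plot itself is not in the transcript. As a stand-in, the sketch below uses the textbook ideal bipolar differential pair relation, I1 = Itail / (1 + exp(-Vd/VT)), to show how branch current depends on differential voltage. It ignores base current and emitter resistance, assumes room temperature, and is not taken from the presentation's own simulation data.

```python
import math

V_T = 0.02585  # thermal voltage kT/q near 300 K, in volts (assumed temperature)


def branch_currents(v_diff: float, i_tail: float = 1.0):
    """Return (i_left, i_right) for an ideal bipolar differential pair.

    v_diff is the differential input voltage in volts, i_tail the tail current.
    Base currents and emitter resistance are ignored.
    """
    i_left = i_tail / (1.0 + math.exp(-v_diff / V_T))
    return i_left, i_tail - i_left


if __name__ == "__main__":
    for mv in (0, 50, 100, 250):
        left, right = branch_currents(mv / 1000.0)
        print(f"{mv:4d} mV: {left:.5f} / {right:.5f} of tail current")
```

In this idealized model a 250 mV differential input steers essentially all of the tail current into one branch.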
13
IBM SiGe and CMOS Load Gate Delays on M1, M2, LM
14
Current Steering Logic
  • Vcc: 0 V
  • Level 1: 0 V to -250 mV. Fastest logic level, limited drive capability
  • Level 2: -950 mV to -1.2 V. Inter-block signal level, good fan-out (10)
  • Level 3: -1.90 V to -2.15 V. Clock signal, slowest level (a Level 4 is possible)
  • Vee: -4.5 V
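
The level voltages on this slide follow a regular pattern: each level sits 0.95 V below the previous one and swings 250 mV. The sketch below regenerates them. The 0.95 V shift is inferred from the listed values (it is roughly one base-emitter drop), not stated on the slide, and the names are illustrative.

```python
# Level 1 high and the swing come from the slide; the per-level shift is an
# inference from the listed voltages, not a stated design value.
LEVEL1_HIGH_V = 0.0
SWING_V = 0.25
LEVEL_SHIFT_V = 0.95


def logic_levels(n_levels: int = 3):
    """Return [(high, low), ...] in volts for CML levels 1..n_levels."""
    levels = []
    for k in range(n_levels):
        high = LEVEL1_HIGH_V - k * LEVEL_SHIFT_V
        levels.append((round(high, 2), round(high - SWING_V, 2)))
    return levels


if __name__ == "__main__":
    for number, (high, low) in enumerate(logic_levels(4), start=1):
        print(f"Level {number}: {high:+.2f} V to {low:+.2f} V")
```

Extending the same pattern to a fourth level gives -2.85 V to -3.10 V, still above the -4.5 V Vee rail, consistent with the slide's remark that a Level 4 is possible.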
15
Current Steering Logic In SiGe
  • 13 ps Transistor Switching Time (75 GHz)
  • 6 ps Process Next Year
  • Small Voltage Swings (250 mV) vs 3.3 or 5 V
  • Less Power
  • Smaller Swing Is Faster
  • Steer Currents, Use Differential Logic
  • Less Switch Noise
  • Fewer Transistors Needed, Complement Signal
    Present
  • Flip-Flops and Multiplexers Easy to Implement

16
CML XOR Logic Schematic
[Figure: CML XOR gate between Vcc (0 V) and Vee (-4.5 V). Input A and its complement are at level 1 (0 to -0.25 V); input B and its complement are at level 2 (-0.95 to -1.2 V), with Vref. Outputs: A XOR B and its complement.]
17
General FPGA Structure
I/O Cell
Logic Cell
Routing Network
Configuration Memory
18
High Speed FPGA Applications
  • Real Time Image Processing
  • Radar
  • Pattern Recognition
  • Digital Networks
  • Mobile Subscriber Equipment
  • Command Information Systems
  • High Speed Switching Nodes
  • Control Systems
  • Guidance Systems
  • Reprogrammable Survivability
  • Satellite Systems

19
Image Correlation
Search Image
Desired Image
1. Desired image is programmed into chip (1 pixel = 1 CLB)
2. Load a section of search image
3. If enough pixels match, then turn found bit on
4. Load another section, or reprogram with new desired image
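
The four steps above can be sketched in software as a sliding-window comparison. This is only an illustration of the idea, assuming binary pixels and a simple match-count threshold; on the chip each pixel comparison would be done by one CLB in parallel. The function names and the test images are hypothetical.

```python
from typing import List, Tuple

Image = List[List[int]]  # rows of 0/1 pixels


def count_matches(window: Image, desired: Image) -> int:
    """Count pixels that agree; each comparison stands in for one CLB."""
    return sum(
        1
        for w_row, d_row in zip(window, desired)
        for w, d in zip(w_row, d_row)
        if w == d
    )


def find_image(search: Image, desired: Image, threshold: int) -> List[Tuple[int, int]]:
    """Return the top-left (row, col) of every section that sets the found bit."""
    h, w = len(desired), len(desired[0])
    hits = []
    for r in range(len(search) - h + 1):
        for c in range(len(search[0]) - w + 1):
            # Step 2: load a section of the search image
            window = [row[c:c + w] for row in search[r:r + h]]
            # Step 3: enough matching pixels turns the found bit on
            if count_matches(window, desired) >= threshold:
                hits.append((r, c))
    return hits


if __name__ == "__main__":
    desired = [[1, 0],
               [0, 1]]
    search = [[0, 0, 0, 0],
              [0, 1, 0, 0],
              [0, 0, 1, 0],
              [0, 0, 0, 0]]
    print(find_image(search, desired, threshold=4))
```

Lowering the threshold trades exact matching for tolerance to noisy pixels, which is the "if enough pixels match" condition in step 3.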
20
Samples From XC6200 CAD Tools
IO Blocks
CLBs
Pins
21
FPGA Drawbacks
  • Slowdown
  • 200 MHz Internal Speed down to 30-60 MHz
    External
  • Pass Transistor Low Pass Filter
  • Limited Bandwidth
  • Relatively Long Configuration Times (Seconds)
  • Vendor Guarded Information
  • More Expensive than Comparable ASIC

22
Pass Transistor Interconnect Modeling
[Figure: pass-transistor switch joining nodes 1 to 4 through transistors M controlled by memory bits (Memory); the interconnect with one pass transistor on; and the equivalent circuit from node 3 to node 2.]
23
Field Programmable Gate Arrays (FPGA)
  • Hierarchy Level Organization (Sea of Gates)
  • Simple Cells (Configurable Logic Blocks)
  • 4x4, 16x16, 64x64 groupings
  • Hierarchy of routing resources at each level
  • I/O Blocks (external interface)

24
Design Parameters
  • Logic Swing Levels
  • Based on Differential Pair Switching
  • Current Levels
  • Redesign of the Configurable Logic Block
  • Take Advantage of Differential Wiring
  • What Parts Can be Turned off if not Used?
  • Supply Levels
  • How Many Levels of Logic?
  • Routing Resources
  • CMOS Voltage Levels
  • Integrate CMOS into Bipolar Current Tree

25
Current Tree with CMOS Routing
26
Bipolar vs Bipolar/CMOS Current Trees
CMOS Bipolar
Pulse Width 50ps 60ps
70ps 100ps
27
4:1 Multiplexer
Level 1 Inputs
Level 1 Output
Level 1 Output
Level 2 Input
Level 2 Input
Level 3 Input
Level 3 Input
CMOS Version
W/L 5:1
28
Sample Logic Using Multiplexers
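A AND B: X1 = a (select), X2 = b (to Y2), X3 = a (to Y3)
  • If a = 1, select Y2: output = b
  • If a = 0, select Y3: output = 0
A OR B: X1 = a (select), X2 = a (to Y2), X3 = b (to Y3)
  • If a = 1, select Y2: output = 1
  • If a = 0, select Y3: output = b
[Figure: in each case a 2:1 multiplexer with inputs labeled 1 (Y2) and 0 (Y3).]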
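
The two multiplexer mappings on this slide can be checked with a few lines of code. The sketch below models the 2:1 multiplexer as a plain function, with X1 as the select and X2, X3 feeding Y2 and Y3 as on the slide; the function names are illustrative.

```python
def mux2(select: int, y2: int, y3: int) -> int:
    """2:1 multiplexer: pass Y2 when select is 1, otherwise Y3."""
    return y2 if select else y3


def and_gate(a: int, b: int) -> int:
    # X1 = a (select), X2 = b (feeds Y2), X3 = a (feeds Y3)
    return mux2(a, b, a)


def or_gate(a: int, b: int) -> int:
    # X1 = a (select), X2 = a (feeds Y2), X3 = b (feeds Y3)
    return mux2(a, a, b)


if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(f"a={a} b={b}  AND={and_gate(a, b)}  OR={or_gate(a, b)}")
```

Only the input assignment changes between the two functions, which is why a CLB built around multiplexers can be reprogrammed by rerouting its inputs.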
29
Redesign of XC6200 Logic
  • Original XC6200 Design
  • Have to Track Inversions
[Figure: 2:1 multiplexer with X1 = a, X2 = b (Y2), X3 = a (Y3); Inverted Output.]
  • Revised Design
  • Use Differential Pair Logic
  • Eliminate XC6200 Fast Logic
  • No Inversion Tracking
[Figure: 2:1 multiplexer with X1 = a, X2 = b (Y2), X3 = a (Y3); Non-Inverted Output.]
30
[Figure: Original XC6200 Architecture. Labels: X1, X2, X3, Y2, Y3, CS Multiplexer, RP Multiplexer, C, S, F, flip-flop (D, Q, Clk, Clr).]
[Figure: Redesigned Architecture, Bipolar with CMOS Routing. Same labels, plus a Switchable marking.]
31
10 GHz Three CLB Simulation
32
CLB Layout
4:1 Mux (off switchable) CMOS Control
Master/Slave Latch (off switchable)
(off switchable)
4:1 Mux High Speed Logic
2:1 Mux CMOS Control
Buffer
33
Sample CLB Test Circuit
Vref
CLB
8:1 Mux
Vref
Buffer
8/1 Divide
Pad Drivers
34
Actual Fabricated Test Circuit
Pads (110 µm x 110 µm)
35
Outgoing CLB Routing
Incoming CLB Routing
[Figure: CLB with inputs X1, X2, X3 and output F. Each input is drawn with the sources N, S, E, W, N4, S4, E4, W4.]
36
4x4 Block Boundary Routing
[Figure: N, S, E, and W switches at the boundaries of a 4x4 block.]
  • Length 4 FastLane (4x4)
  • Length 16 FastLane (16x16)
  • Chip Length FastLane (64x64)
  • Local Routing
  • Magic Routing
37
Local CLB Routing
[Figure: CLB with inputs X1, X2, X3, each drawn with the sources N, S, E, W, N4, S4, E4, W4, and output F. Eout is drawn with N, S, E, F; Sout is drawn with S, E, W, F.]
  • Nearest Neighbor Routing
  • Output (F) or Local Through

Example: Route East Signal Through to Next CLB
Note: Can't Route Signal Back to Origin at this Level
38
Normal CMOS Memory-CML Interface
[Figure labels: SRAM Bits in Memory Planes, CMOS to CML Buffer, VSS, Data, CLB Multiplexer Inputs, VREF, decode, New Configuration, VEE.]
39
Memory Design
D Latch M/S 40 Transistors
D Latch M/S 18 Transistors
RAM Cell 6 Transistors Parallel Load
40
3-D Chip Stacking
Memory Planes
CLBs
  • Shorter Wires
  • More CLBs/Area
  • Optimize Memory

41
CLB with Routing and RAM (2)
CLB Select
RAM2
CLB
RAM1
MUX
MUX
MUX
MUX Selects
42
Layout of Configurable Logic Block with 2 sets of RAM
[Layout labels: RAM, 2:1 Mux, Master/Slave Latch (memory), 8:1 Mux (routing) CMOS Selects, CLB (logic).]
Circuit Elements: 240 nfets, 122 pfets, 36 resistors, 98 npn1 HBTs, 16 npnhb1 HBTs
43
SiGe Performance
Circuit Type          Propagation Delay
Buffer                17 ps
CML XOR, AND, OR      22-25 ps
MUX XOR, AND, OR      23-26 ps
CLB                   100 ps

Power Decreasing Ideas
Date      Idea                                                     Power Consumption/CLB
Dec 98    Original CLB                                             73 mW
June 99   CLB Redesign I                                           34 mW
Aug 99    CLB Redesign II                                          24 mW
Dec 99    Widlar Current Mirror with CMOS Control, CMOS Routing    10.8 mW
Mar 00    Supply Voltage 4.5 -> 3.3 V                              7 mW
Dec 00    7HP Process                                              0.3 mW

Projected Power Levels for 7HP Process: At 50 GHz, 30 µA, 20x reduction in power
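
The power figures listed on this slide can be turned into per-step and cumulative reduction factors with a short calculation. The sketch below only does arithmetic on the slide's numbers; the data structure and function name are illustrative, and the 7HP entry is a projection in the source.

```python
# (date, idea, power per CLB in mW), as listed on the slide
STEPS = [
    ("Dec 98", "Original CLB", 73.0),
    ("June 99", "CLB Redesign I", 34.0),
    ("Aug 99", "CLB Redesign II", 24.0),
    ("Dec 99", "Widlar current mirror with CMOS control, CMOS routing", 10.8),
    ("Mar 00", "Supply voltage 4.5 V to 3.3 V", 7.0),
    ("Dec 00", "7HP process (projected)", 0.3),
]


def reduction_factors(steps):
    """Return [(idea, factor vs previous step, factor vs original), ...]."""
    original = steps[0][2]
    out = []
    for prev, cur in zip(steps, steps[1:]):
        out.append((cur[1], prev[2] / cur[2], original / cur[2]))
    return out


if __name__ == "__main__":
    for idea, step, total in reduction_factors(STEPS):
        print(f"{idea}: {step:.1f}x vs previous, {total:.1f}x vs original")
```

The last step works out to roughly 23x, in line with the slide's "20x reduction" projection for 7HP.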
44
Multiplexer Performance vs Temperature
Normal 250 mV Swing
200 mV Min Swing
45
Widlar Current Mirror with CMOS Control
[Schematic labels: Vcc, Input, Vref, Vee.]
46
XC6200 Design Improvements
  • Developed at the University of Edinburgh, Scotland
  • Inversion of Signal at Every CLB
  • Taken care of due to differential pair wiring
  • No Pass Transistors, Use Multiplexers for
    Routing
  • Able to turn off unused parts with CMOS
    controlled current mirror
  • No CMOS-CML Conversion circuits needed, CMOS in
    current trees
  • Handcrafted, dense layouts
  • Context Switching

47
Power Delay Product
[Figure: power-delay product in µW/gate/MHz (log scale, 0.001 to 1) versus year (1998 to 2002). Curves: PDP CMOS High, PDP CMOS Low, PDP BiCMOS, with points labeled 5HP, 7HP, and 8HP.]
48
Data Dependent Switching
Differential Logic has Complement Switching in Opposite Direction
[Figure: adjacent differential pairs A, B, C and their complements, shown for a Slow Transition and a Fast Transition.]
Bit Line Twisting
Could Vary Signals Up to 30%
Setup Time Violations
49
Future Work
  • Testing
  • Overall FPGA Architecture
  • Scaling
  • Integrate with Other Systems
  • Projected Graduation May 2001, work to
    continue at USMA
  • Power Reduction
  • 7HP Process

50
CLB Context Switch Example
Pattern 1: 0001100100 (70 ps, 7.1 GHz)
Pattern 2: 1011011100 (70 ps)
Select: AND, OR, AND, OR
Pattern 1 AND Pattern 2 = 0001000100
Pattern 1 OR Pattern 2 = 1011111100
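
The AND and OR results on this slide can be reproduced bit by bit. The sketch below models the context select as a string argument that picks the CLB function; that interface is an illustrative assumption, not the chip's actual configuration mechanism.

```python
PATTERN1 = "0001100100"  # from the slide
PATTERN2 = "1011011100"  # from the slide


def clb(a: str, b: str, context: str) -> str:
    """Bitwise AND or OR of two equal-length bit strings, chosen by context."""
    if len(a) != len(b):
        raise ValueError("patterns must have equal length")
    ops = {"AND": lambda x, y: x & y, "OR": lambda x, y: x | y}
    op = ops[context]
    return "".join(str(op(int(x), int(y))) for x, y in zip(a, b))


if __name__ == "__main__":
    # Alternate contexts as on the slide: AND, OR, AND, OR
    for context in ("AND", "OR", "AND", "OR"):
        print(context, clb(PATTERN1, PATTERN2, context))
```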
51
Redesigned CLB Cell with Routing and Memory (2x)
Three 8-1 Input Mux
2x24 Bit RAM
M1 M2 M3 M4
Four 4-1 Output Mux
CLB
52
CLB Row 4x1
N/S Input Output
Memory Bus Lines
Circuit Elements: 1520 Nfets, 792 Pfets, 260 Resistors, 140 NPN1 HB, 576 NPN1
Switch
53
XC6200 Device Family
Device          XC6209    XC6216    XC6236    XC6264
Gate Count      9-13K     16-24K    36-55K    64-100K
Number Cells    2304      4096      9216      16384
I/O Blocks      192       256       384       512
Row x Col       48x48     64x64     96x96     128x128
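
The device family numbers are internally consistent: the cell count is rows times columns, and in every listed device the I/O block count equals the array perimeter, 2 x (rows + cols). The second relation is an observation from the table, not something the slide states. A small check, with illustrative names:

```python
# (device, rows, cols, cells, io_blocks), from the slide
FAMILY = [
    ("XC6209", 48, 48, 2304, 192),
    ("XC6216", 64, 64, 4096, 256),
    ("XC6236", 96, 96, 9216, 384),
    ("XC6264", 128, 128, 16384, 512),
]


def check_family(family) -> bool:
    """True if cells == rows*cols and io_blocks == perimeter for every device."""
    return all(
        rows * cols == cells and 2 * (rows + cols) == io_blocks
        for _, rows, cols, cells, io_blocks in family
    )


if __name__ == "__main__":
    print(check_family(FAMILY))
```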
54
Typical Routing Delays
Symbol    Parameter                    XC6200    SiGe Redesign
TNN       Route Nearest Neighbor       1 ns      23 ps
Tmagic    Route X2/X3 to Magic Out     1.5 ns    47 ps
TL4       Length 4 FastLane            1.5 ns    47 ps
TL16      Length 16 FastLane           2 ns      70 ps
TCL64     Chip-Length (64) Delay       3 ns      94 ps

31x improvement
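
The improvement factor can be recomputed per row from the delays listed on this slide. The sketch below is plain arithmetic on those numbers, with the delays converted to picoseconds; the names are illustrative.

```python
# (symbol, XC6200 delay in ps, SiGe redesign delay in ps), from the slide
DELAYS_PS = [
    ("TNN", 1000, 23),
    ("Tmagic", 1500, 47),
    ("TL4", 1500, 47),
    ("TL16", 2000, 70),
    ("TCL64", 3000, 94),
]


def speedups(rows):
    """Map each routing symbol to old_delay / new_delay."""
    return {symbol: old / new for symbol, old, new in rows}


if __name__ == "__main__":
    for symbol, factor in speedups(DELAYS_PS).items():
        print(f"{symbol}: {factor:.1f}x")
```

The per-row factors range from about 29x to about 43x; the Tmagic, TL4, and TCL64 rows each come to just under 32x, which matches the slide's 31x figure.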
55
4x4 CLB Layout Cell
  • Largest Basic Block
  • Over 13,000 Transistors
  • Commercial Product Size is a 4x4 Array of this Cell

58
Example High Speed Switch of 2 Incoming Signals
[Figure: waveforms of the two incoming bit streams and the switched output, with the Switch Point marked.]
Pattern 1: 0001100100
Pattern 2: 1011011100
60
5 Stage Ring Oscillator
Condition     Speed       Relative to Schematic    Current
Schematic     6.36 GHz    --                       8.4 mA
Parasitics    5.71 GHz    89%                      8.6 mA
50 °C         5.26 GHz    82%                      8.85 mA
75 °C         4.87 GHz    76%                      9.1 mA
100 °C        4.16 GHz    65%                      9.34 mA
125 °C        3.12 GHz    49%                      9.5 mA
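
The relative-speed column and an approximate per-stage delay can both be derived from the listed frequencies. The sketch below uses the standard ring oscillator relation f = 1 / (2 N t_d) for N stages. Truncating the percentage rather than rounding it is an inference that happens to reproduce the slide's values, and the names are illustrative.

```python
N_STAGES = 5
F_SCHEMATIC_GHZ = 6.36

# (condition, oscillation frequency in GHz), from the slide
RESULTS = [
    ("Schematic", 6.36),
    ("Parasitics", 5.71),
    ("50 C", 5.26),
    ("75 C", 4.87),
    ("100 C", 4.16),
    ("125 C", 3.12),
]


def relative_speed_percent(f_ghz: float, f_ref_ghz: float = F_SCHEMATIC_GHZ) -> int:
    """Speed relative to the schematic simulation, truncated to whole percent."""
    return int(f_ghz / f_ref_ghz * 100)


def stage_delay_ps(f_ghz: float, n_stages: int = N_STAGES) -> float:
    """Per-stage delay in ps from f = 1 / (2 * N * t_d)."""
    return 1e3 / (2 * n_stages * f_ghz)


if __name__ == "__main__":
    for condition, f in RESULTS:
        print(f"{condition}: {relative_speed_percent(f)}%, "
              f"{stage_delay_ps(f):.1f} ps per stage")
```

For the schematic case this gives a little under 16 ps per stage, the same order as the 17 ps buffer delay reported earlier in the deck.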
63
BiCMOS and CMOS Characteristics
Technology          Size, V threshold                Effective Size, Vdd             PDP Level (µW/gate/MHz)
1998 CMOS           Ldrawn 0.5 µm, Vth 0.87 V        Leff 0.36 µm, Vdd 3.3 V         Hi 0.36, Low 0.2
2000 CMOS           Ldrawn 0.25 µm, Vth 0.5 V        Leff 0.18 µm, Vdd 2.5 V         Hi 0.18, Low 0.08
2002 CMOS           Ldrawn 0.22 µm, Vth 0.4 V        Leff 0.12 µm, Vdd 1.8 V         Hi 0.1, Low 0.05
1999 BiCMOS 5HP     Vbe 0.85 V                       Vdd 4.5 V                       0.36
2000 BiCMOS 7HP     TBD                              TBD                             0.01