Digital Integrated Circuits A Design Perspective - PowerPoint PPT Presentation

Slides: 59
Provided by: Boriv88
1
Digital Integrated Circuits: A Design Perspective
Jan M. Rabaey, Anantha Chandrakasan, Borivoje Nikolic
Arithmetic Circuits
2
A Generic Digital Processor
3
Building Blocks for Digital Architectures
Arithmetic unit
- Bit-sliced datapath (adder, multiplier, shifter, comparator, etc.)
Memory
- RAM, ROM, Buffers, Shift registers
Control
- Finite state machine (PLA, random logic)
- Counters
Interconnect
- Switches
- Arbiters
- Bus
4

Arithmetic building blocks
  • The speed and power of arithmetic components
    often dominate overall system performance
  • For each module, multiple topologies and design
    approaches exist, each with its own advantages
  • A global picture is of crucial importance. A
    designer should focus on the gates or
    transistors that have the largest impact on the
    goal function. Non-critical components can be
    developed routinely.
  • Typically there are two optimization processes:
    logic optimization (rearranging Boolean
    equations so that a faster or smaller circuit
    can be obtained) and circuit optimization
    (manipulating circuit topology and transistor
    sizes to optimize speed)

5
Bit-Sliced Design
Since the same operation has to be performed on
each bit of a data word, the datapath can
consist of a number of bit slices (equal to the
word length), each operating on a single bit;
hence the term bit-sliced.
6
Adders
7
Full-Adder
8
The Binary Adder
9
The Ripple-Carry Adder
Worst-case delay is linear in the number of bits:
$t_d = O(N)$
$t_{adder} = (N-1)\,t_{carry} + t_{sum}$
Goal: make the fastest possible carry path circuit
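The ripple structure can be illustrated with a small bit-level model. This is a minimal sketch, not a circuit description: the function names (`full_adder`, `ripple_carry_add`, `to_bits`, `from_bits`) are hypothetical, and bits are stored LSB first. The loop makes the linear dependence visible, since each stage must wait for the carry of the previous one.

```python
def full_adder(a, b, c_in):
    """One-bit full adder: returns (sum, carry_out)."""
    s = a ^ b ^ c_in
    c_out = (a & b) | (b & c_in) | (a & c_in)
    return s, c_out


def ripple_carry_add(a_bits, b_bits, c_in=0):
    """Add two equal-length bit lists (LSB first); returns (sum_bits, carry_out)."""
    sum_bits = []
    carry = c_in
    for a, b in zip(a_bits, b_bits):
        # The carry ripples from one bit slice to the next.
        s, carry = full_adder(a, b, carry)
        sum_bits.append(s)
    return sum_bits, carry


def to_bits(value, width):
    """Integer to a list of bits, LSB first."""
    return [(value >> i) & 1 for i in range(width)]


def from_bits(bits):
    """List of bits (LSB first) back to an integer."""
    return sum(bit << i for i, bit in enumerate(bits))
```

For example, `ripple_carry_add(to_bits(11, 4), to_bits(6, 4))` adds 11 and 6 in four bits and reports the overflow in the carry-out.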
10
Complementary Static CMOS Full Adder
28 Transistors
11
Complementary Static CMOS Full Adder
  • Large PMOS stacks are present in both the carry
    and sum generation circuits
  • The intrinsic load capacitance of the Co signal
    is large and consists of eight capacitance
    components
  • There is one more inverter delay for both carry
    and sum (worse when the load capacitance is
    large)
  • Note that the critical signal Ci is placed
    closest to the output node

12
Express Sum and Carry as a Function of P, G, D
Define three new variables that depend ONLY on A and B:
Generate (G) = $AB$
Propagate (P) = $A \oplus B$
Delete (D) = $\bar{A}\,\bar{B}$
Expressions for $S$ and $C_o$ can also be derived based on D and P.
Note that we will sometimes use an alternate definition for Propagate:
Propagate (P) = $A + B$
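The definitions of G, P and D can be checked exhaustively against ordinary addition. The sketch below is illustrative only. The helper names are hypothetical, and the sum and carry expressions it tests (S = P xor Ci, Co = G + P·Ci, and a D-based form of the carry) are the standard identities, stated here as assumptions because the transcript does not reproduce them.

```python
from itertools import product


def pgd(a, b):
    """Generate, propagate (XOR form) and delete signals for one bit position."""
    g = a & b
    p = a ^ b
    d = (1 - a) & (1 - b)
    return g, p, d


def check_identities():
    """Compare the P/G/D expressions with plain addition for all inputs."""
    for a, b, c_in in product((0, 1), repeat=3):
        g, p, d = pgd(a, b)
        total = a + b + c_in
        s_ref, c_ref = total & 1, total >> 1
        assert g + p + d == 1                 # exactly one of G, P, D is active
        assert p ^ c_in == s_ref              # S = P xor Ci
        assert g | (p & c_in) == c_ref        # Co = G + P*Ci
        assert g | ((a | b) & c_in) == c_ref  # alternate P = A + B also works for Co
        # One possible D/P form of the carry (an assumption, not from the slides):
        assert (1 - d) & ((1 - p) | c_in) == c_ref
    return True
```

Note that the alternate definition P = A + B gives the correct carry but cannot be used for the sum, which still needs the XOR form.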
13
Transmission Gate XOR
When B = 1, M1/M2 act as an inverter and M3/M4 are off, so $F = \bar{A}B$.
When B = 0, M1/M2 are off and M3/M4 act as a transmission gate, so $F = A\bar{B}$.
14
Transmission Gate Full Adder
15
Manchester Carry Chain
Generate (G) = $AB$
Propagate (P) = $A \oplus B$
Delete (D) = $\bar{A}\,\bar{B}$
Prevent floating Co
16
Full-Adder
17
Manchester Carry Chain
18
Manchester Carry Chain
Stick Diagram
19
Manchester Carry Chain
  • The delay of the Manchester carry chain can be
    modeled as a linearized RC network, as for a
    chain of transmission gates
  • This means the propagation delay is quadratic in
    the number of bits N (but this does not imply
    that the delay is larger than that of the
    ripple-carry adder)
  • It might be necessary to insert signal-buffering
    inverters
  • It is still a ripple-carry adder, typically only
    good for small word lengths (fewer than 8 to 16
    bits)
  • Faster adders are needed for computer and
    multimedia applications with word lengths of
    32 to 128 bits
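The quadratic behaviour follows from the Elmore delay of an RC ladder. The sketch below assumes a uniform chain in which every stage contributes the same resistance `r` and node capacitance `c`. Both parameters and the function names are hypothetical, and the 0.69 factor is the usual first-order approximation for the 50% point.

```python
def elmore_chain_delay(n_stages, r, c):
    """Elmore delay at the far end of a uniform RC ladder with n_stages sections."""
    # Node i (1-based) sees the sum of the i resistors upstream of it.
    return sum(c * (i * r) for i in range(1, n_stages + 1))


def propagation_delay(n_stages, r, c):
    """First-order propagation delay estimate, 0.69 times the Elmore delay."""
    return 0.69 * elmore_chain_delay(n_stages, r, c)
```

Doubling the chain length from 4 to 8 stages raises the Elmore sum from 10·RC to 36·RC, which is why buffers are inserted to split long chains into shorter segments.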

20
Carry-Bypass Adder
Also called carry-skip
[Figure: carry chain with bypass path; labels include P1 and G0; annotation: delete or generate]
Breaks the bit-slice organization
21
Carry-Bypass Adder (cont.)
$t_{adder} = t_{setup} + M\,t_{carry} + (N/M - 1)\,t_{bypass} + (M-1)\,t_{carry} + t_{sum}$
(worst case)
$t_{setup}$: overhead time to create the G, P, D signals
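The worst-case expression can be evaluated next to the ripple-carry delay to see where bypassing pays off. This is a sketch with hypothetical function names and unit-less delay parameters. It assumes N is a multiple of the block size M.

```python
def ripple_delay(n, t_carry, t_sum):
    """Ripple-carry adder: (N - 1) carry delays plus one sum delay."""
    return (n - 1) * t_carry + t_sum


def bypass_delay(n, m, t_setup, t_carry, t_bypass, t_sum):
    """Worst-case carry-bypass delay for N bits split into blocks of M bits."""
    if n % m != 0:
        raise ValueError("n must be a multiple of m in this sketch")
    stages = n // m
    return (t_setup
            + m * t_carry               # ripple through the first block
            + (stages - 1) * t_bypass   # skip over the remaining blocks
            + (m - 1) * t_carry         # ripple inside the last block
            + t_sum)
```

With all unit delays set to 1, a 32-bit adder with 4-bit blocks needs 16 time units under this model, against 32 for the plain ripple-carry adder.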
22
Carry Ripple versus Carry Bypass (both still linear)
23
Carry-Select Adder
24
Carry Select Adder Critical Path
25
Linear Carry Select
$t_{adder} = t_{setup} + M\,t_{carry} + (N/M)\,t_{mux} + t_{sum}$
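A functional model shows the idea behind carry select: each block computes its result for both possible carry-in values, and the true incoming carry only drives a multiplexer. The sketch below is an assumption-laden illustration. The names `add_block`, `carry_select_add` and `linear_select_delay` are hypothetical, operands are plain integers, and the word length must be a multiple of the block size.

```python
def add_block(a, b, c_in, width):
    """Reference adder for one block: returns (sum, carry_out)."""
    total = a + b + c_in
    return total & ((1 << width) - 1), total >> width


def carry_select_add(a, b, n_bits, block_bits, c_in=0):
    """Carry-select addition of two n_bits-wide unsigned integers."""
    mask = (1 << block_bits) - 1
    result = 0
    carry = c_in
    for shift in range(0, n_bits, block_bits):
        a_blk = (a >> shift) & mask
        b_blk = (b >> shift) & mask
        # Both candidates are computed without waiting for the real carry.
        sum0, carry0 = add_block(a_blk, b_blk, 0, block_bits)
        sum1, carry1 = add_block(a_blk, b_blk, 1, block_bits)
        # The incoming carry only selects between them (the multiplexer).
        blk_sum, carry = (sum1, carry1) if carry else (sum0, carry0)
        result |= blk_sum << shift
    return result, carry


def linear_select_delay(n, m, t_setup, t_carry, t_mux, t_sum):
    """Delay of the linear carry-select adder with N bits and M-bit blocks."""
    return t_setup + m * t_carry + (n // m) * t_mux + t_sum
```

In hardware the two candidate sums of all blocks are computed in parallel, so only the multiplexer chain contributes a term that grows with N/M.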
26
Square Root Carry Select
27
Adder Delays - Comparison
[Plot: adder delay comparison; recoverable curve label: Bypass]
28
LookAhead - Basic Idea
29
Look-Ahead Topology
Expanding Lookahead equations
All the way
30
Look-Ahead Adder Logarithmic adder
31
Carry Look-Ahead Trees
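$C_0 = G_0 + P_0 C_{in}$
$C_1 = G_1 + P_1 C_0 = G_1 + G_0 P_1 + P_1 P_0 C_{in}$
$C_2 = G_2 + P_2 C_1 = G_2 + G_1 P_2 + G_0 P_2 P_1 + P_2 P_1 P_0 C_{in} = G_{2:1} + P_{2:1} C_0$
($G_{2:1} = G_2 + P_2 G_1$, $P_{2:1} = P_2 P_1$)
$C_3 = G_3 + P_3 C_2 = G_3 + G_2 P_3 + G_1 P_3 P_2 + G_0 P_3 P_2 P_1 + P_3 P_2 P_1 P_0 C_{in}$
Grouping the terms in pairs:
$C_1 = G_{1:0} + P_{1:0} C_{in}$
($G_{1:0} = G_1 + P_1 G_0$, $P_{1:0} = P_1 P_0$)
$C_3 = G_{3:2} + P_{3:2} C_1 = G_{3:2} + P_{3:2}(G_{1:0} + P_{1:0} C_{in}) = (G_{3:2} + P_{3:2} G_{1:0}) + P_{3:2} P_{1:0} C_{in}$
The tree can be built further hierarchically.
$G_{3:2}$ ($= G_3 + P_3 G_2$) and $P_{3:2}$ ($= P_3 P_2$) are called dot
products.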
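The grouping can be expressed with a single combining function applied hierarchically. The following sketch uses hypothetical names (`dot`, `lookahead_carries`, `check_lookahead`) and a 4-bit word. It builds the pairs (G1:0, P1:0) and (G3:2, P3:2), combines them once more, and checks the resulting carries against ordinary addition.

```python
def dot(hi, lo):
    """Dot operator: merge (G, P) of a more significant group with a less significant one."""
    g_hi, p_hi = hi
    g_lo, p_lo = lo
    return g_hi | (p_hi & g_lo), p_hi & p_lo


def lookahead_carries(a, b, c_in):
    """Return (C1, C3) of a 4-bit addition using a two-level dot-operator tree."""
    gp = [(((a >> i) & (b >> i)) & 1, ((a >> i) ^ (b >> i)) & 1) for i in range(4)]
    g10, p10 = dot(gp[1], gp[0])
    g32, p32 = dot(gp[3], gp[2])
    g30, p30 = dot((g32, p32), (g10, p10))
    c1 = g10 | (p10 & c_in)
    c3 = g30 | (p30 & c_in)
    return c1, c3


def check_lookahead():
    """Exhaustively compare the tree carries with plain integer addition."""
    for a in range(16):
        for b in range(16):
            for c_in in (0, 1):
                c1, c3 = lookahead_carries(a, b, c_in)
                assert c1 == ((a & 3) + (b & 3) + c_in) >> 2
                assert c3 == (a + b + c_in) >> 4
    return True
```

Because the dot operator is associative, the pairs can be combined in any tree shape, which is what the different tree adders exploit.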
32
Tree Adders
16-bit radix-2 Kogge-Stone tree (radix 2 means
that the tree is binary: it combines two dot
products or carry words at a time at each level
of the hierarchy)
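A radix-2 Kogge-Stone prefix computation can be modeled in a few lines. This is a behavioral sketch, not the circuit from the slides. The function name `kogge_stone_add` is hypothetical, the carry-in is folded into the generate signal of bit 0, and each pass of the loop corresponds to one level of the tree (4 levels for 16 bits).

```python
def kogge_stone_add(a, b, width=16, c_in=0):
    """Add two unsigned integers with a radix-2 Kogge-Stone prefix tree."""
    g = [(a >> i) & (b >> i) & 1 for i in range(width)]
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(width)]
    bit_p = p[:]             # per-bit propagate, needed for the final sum
    g[0] |= p[0] & c_in      # fold the carry-in into bit 0
    dist = 1
    while dist < width:
        new_g, new_p = g[:], p[:]
        for i in range(dist, width):
            # Dot operator on the groups ending at bit i and at bit i - dist.
            new_g[i] = g[i] | (p[i] & g[i - dist])
            new_p[i] = p[i] & p[i - dist]
        g, p = new_g, new_p
        dist *= 2
    # g[i] now holds the carry out of bit i.
    carries_in = [c_in] + g[:-1]
    total = 0
    for i in range(width):
        total |= (bit_p[i] ^ carries_in[i]) << i
    return total, g[-1]
```

The number of levels grows as log2 of the word length, at the cost of many dot operators and long wires per level.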
33
Tree Adders
16-bit radix-4 Kogge-Stone Tree
34
Sparse Trees
16-bit radix-2 sparse tree with sparseness of 2
35
Tree Adders
Brent-Kung Tree
36
Intel Itanium Microprocessor
Itanium has 6 integer execution units like this
37
Bit-Sliced Design
38
Bit-Sliced Datapath
The adder is implemented as a radix-4 carry
look-ahead adder; the red lines forward the
results of the different stages
39
Itanium Integer Datapath
Courtesy of Intel
40
Multipliers
41
The Binary Multiplication
42
The Binary Multiplication
43
The Array Multiplier (4 by 4)
[Figure labels: half adder (HA), carry, sum]
The carry-out of the last adder for $Y_i$ is
forwarded to $Y_{i+1}$
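The array multiplier mirrors pencil-and-paper multiplication: each multiplier bit Y_i gates a row of AND terms, and the rows are added with a shift of i positions. The sketch below models only this arithmetic, not the adder cells themselves. The function name and the default 4-by-4 size are assumptions.

```python
def array_multiply(x, y, m_bits=4, n_bits=4):
    """Multiply an m_bits-wide x by an n_bits-wide y by summing partial products."""
    acc = 0
    for i in range(n_bits):
        y_i = (y >> i) & 1
        # One row of AND gates: every bit of x is gated by y_i.
        row = sum((((x >> j) & 1) & y_i) << j for j in range(m_bits))
        # Each row is added with a weight of 2**i.
        acc += row << i
    return acc
```

In the hardware array each `acc += ...` step is a row of adders, which is why the critical path runs through both the carry and the sum chains.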
44
The MxN Array Multiplier Critical Path
[Figure: critical paths 1 and 2 through the array]
45
Carry-Save Multiplier
  • A more efficient realization can be obtained by
    noticing that the multiplication result does not
    change when the output carry bits are passed
    diagonally downwards instead of to the right
  • This requires an extra adder (the vector-merging
    adder), which can be a fast carry look-ahead
    adder (since its inputs arrive at the same time)
  • The critical path is uniquely defined
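Carry-save addition can be sketched on whole words: three operands are reduced to a sum word and a carry word without propagating any carry, and only the final vector-merging step performs a real addition. The names `carry_save_add` and `carry_save_multiply` are hypothetical, and Python integers stand in for the bit vectors.

```python
def carry_save_add(x, y, z):
    """3:2 compression: returns (sum_word, carry_word) with x + y + z == sum + carry."""
    sum_word = x ^ y ^ z
    # Carries are saved and passed one position to the left (diagonally down).
    carry_word = ((x & y) | (y & z) | (x & z)) << 1
    return sum_word, carry_word


def carry_save_multiply(x, y, n_bits=4):
    """Accumulate partial products in carry-save form, then merge once."""
    sum_word, carry_word = 0, 0
    for i in range(n_bits):
        row = (x << i) if (y >> i) & 1 else 0
        sum_word, carry_word = carry_save_add(sum_word, carry_word, row)
    # Vector-merging adder: the only place where carries propagate.
    return sum_word + carry_word
```

For instance, `carry_save_add(5, 6, 7)` yields the pair (4, 14), whose sum is 18.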

46
Multiplier Floorplan
47
Wallace-Tree Multiplier
Reduces the number of full adders
Increases the complexity of routing
48
Wallace-Tree Multiplier
[Figure label: HA (half adder)]
A carry look-ahead adder can be used for the last stage
49
Wallace-Tree Multiplier
50
Booth encoding
  • Multiplying by 01111110 gives 8 partial
    products, but two of them are all zero. Adding
    these zeros is a waste of time.
  • Instead, multiply by the recoded digits
    (1, 0, 0, 0, 0, 0, -1, 0), that is, 2^7 - 2^1.
    Then only two partial products need to be
    combined (one added, one subtracted), which
    improves speed
  • This kind of transformation is called Booth
    encoding. It reduces the number of partial
    products to at most half of the original
    multiplier width.
  • The encoding logic is easily incorporated in the
    overall multiplier design.
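The recoding of 01111110 can be reproduced with the basic radix-2 Booth rule, in which digit i equals bit (i-1) minus bit i. This is a minimal sketch under stated assumptions: the function names are hypothetical, the multiplier is read as a two's-complement number of the given width, and this simple radix-2 form does not by itself guarantee the halving of partial products mentioned above (that property belongs to the higher-radix variant).

```python
def booth_recode(multiplier, width):
    """Radix-2 Booth digits (LSB first), each in {-1, 0, 1}."""
    digits = []
    prev = 0  # implicit bit to the right of the LSB
    for i in range(width):
        bit = (multiplier >> i) & 1
        digits.append(prev - bit)
        prev = bit
    return digits


def booth_multiply(multiplicand, multiplier, width):
    """Multiply using only the nonzero Booth digits as partial products."""
    digits = booth_recode(multiplier, width)
    return sum(d * (multiplicand << i) for i, d in enumerate(digits) if d != 0)
```

For the multiplier 01111110 this leaves two nonzero digits, +1 at weight 2^7 and -1 at weight 2^1.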

51
Multipliers Summary
This is also why algorithmic invention has
significant meaning to VLSI design.
52
Shifters
53
The Binary Shifter
54
The Barrel Shifter
Column maximum shift
Word length
Area Dominated by Wiring
55
4x4 barrel shifter
  • A coder/decoder is required to set the shift bits
  • The signal passes through only one gate,
    independent of the shift amount (parasitic
    capacitance may change the picture)
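The role of the decoder can be seen in a behavioral model: the binary shift amount is first converted into a one-hot select vector, and exactly one select line then connects the inputs to the outputs. The sketch below uses hypothetical names and assumes a right shift with zero fill, which may differ from the circuit shown on the slide.

```python
def decode_shift(shift, max_shift):
    """Binary shift amount to one-hot select lines Sh0..Sh(max_shift)."""
    if not 0 <= shift <= max_shift:
        raise ValueError("shift amount out of range")
    return [1 if k == shift else 0 for k in range(max_shift + 1)]


def barrel_shift_right(word, select, width=4):
    """Shift right by the position of the single active select line (zero fill)."""
    if sum(select) != 1:
        raise ValueError("select must be one-hot")
    mask = (1 << width) - 1
    out = 0
    for k, enabled in enumerate(select):
        if enabled:
            # Only the enabled row of pass transistors drives the output.
            out |= (word & mask) >> k
    return out
```

A call such as `barrel_shift_right(0b1100, decode_shift(2, 3))` models a 4-bit word shifted by two positions through a single selected row.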

56
Logarithmic Shifter
No separate coder/decoder is required
57
0-7 bit Logarithmic Shifter
[Figure: 0-7 bit logarithmic shifter; inputs A3, A2, A1, A0 and outputs Out3, Out2, Out1, Out0]
Good for large shift amounts (note that cascaded
pass transistors slow down the gate and generate
weak signals; buffers may be needed)
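A logarithmic shifter applies the bits of the binary shift amount directly: stage k shifts by 2^k when bit k is set, so three stages cover shifts of 0 to 7 and no decoder is needed. The sketch below is a behavioral model with a hypothetical function name. The right-shift direction and zero fill are assumptions.

```python
def logarithmic_shift_right(word, shift, width=4, stages=3):
    """Shift right by 0..2**stages - 1 using one conditional stage per shift bit."""
    mask = (1 << width) - 1
    value = word & mask
    for k in range(stages):
        # Stage k either passes the word through or shifts it by 2**k.
        if (shift >> k) & 1:
            value >>= 1 << k
    return value
```

Every signal passes through all stages in series, which is the source of the slowdown and weak levels noted above.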
58
Building Blocks for Digital Architectures
Arithmetic unit
- Bit-sliced datapath (adder, multiplier, shifter, comparator)
- (comparator, divider, sin, cos, etc.)