Chap. 4: Datapath Design

Slides: 49
Provided by: sungg

Transcript and Presenter's Notes



1
Chap. 4 Datapath Design
  • Discusses the design of arithmetic units
  • Basic computer arithmetic methods
  • 4.1. Addition, subtraction, multiplication, and
    division
  • All arithmetic functions can be approximated
  • 4.2. Arithmetic Logic Units (ALUs)
  • 4.3. Floating-point and pipeline processing

2
Unsigned Binary Addition
  • Decimal addition with a fixed number of digits:
    3 + 4 = 7, 8 + 9 = 7 (with overflow 10)
  • Half Adder
  • Binary 1-digit adder: 0 + 0 = 0, 0 + 1 = 1,
    1 + 0 = 1, 1 + 1 = 0
  • Full Adder
  • Binary 1-digit adder with carry-in and carry-out:
    1 + 1 = 0 (cout = 1), 1 + 1 (cin = 1) = 1 (cout = 1)

modulo addition
3
Half Adder (HA) Implementation
  • Inputs: x and y; output: sum

      x  y | sum
      0  0 |  0
      0  1 |  1
      1  0 |  1
      1  1 |  0

  • sum = x'y + xy' = x EX-OR y

4
Full Adder (FA) Implementation
  • Inputs: x, y, cin; outputs: sum, cout

      x  y  cin | sum  cout
      0  0   0  |  0    0
      0  0   1  |  1    0
      0  1   0  |  1    0
      0  1   1  |  0    1
      1  0   0  |  1    0
      1  0   1  |  0    1
      1  1   0  |  0    1
      1  1   1  |  1    1
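
The two truth tables can be checked with a minimal Python sketch. The function names half_adder and full_adder are illustrative, and building the full adder from two half adders plus an OR gate is one common construction, not necessarily the circuit drawn on the slide.

```python
def half_adder(x, y):
    """Return (sum, carry) for two 1-bit inputs."""
    return x ^ y, x & y


def full_adder(x, y, cin):
    """Return (sum, cout) for two 1-bit inputs and a carry-in."""
    s1, c1 = half_adder(x, y)      # add the two operand bits
    s, c2 = half_adder(s1, cin)    # add the carry-in to the partial sum
    return s, c1 | c2              # at most one of c1, c2 can be 1


# Print the full-adder truth table in the column order x, y, cin, sum, cout.
for x in (0, 1):
    for y in (0, 1):
        for cin in (0, 1):
            s, cout = full_adder(x, y, cin)
            print(x, y, cin, s, cout)
```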

5
Lee 2000
6
Simple Adder Designs
  • Serial Binary Adder
  • data enters serially, summed data exits serially
  • Fig. 4.2 (p. 225)
  • Parallel Adder
  • Fig. 4.3 (p. 226) n-bit ripple-carry adder (RCA)
  • Fig. 4.4 (p. 226) n-bit adder-subtracter
  • Fast Parallel Adder
  • based on carry lookahead
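
As a rough behavioral sketch of the parallel designs listed above (not a reproduction of Figs. 4.3 and 4.4), an n-bit ripple-carry adder chains n full adders, and an adder-subtracter XORs each y bit with a control bit that also serves as the carry-in. The function names and the subtract flag are assumptions made for this example.

```python
def full_adder(x, y, cin):
    """1-bit full adder: return (sum, cout)."""
    s = x ^ y ^ cin
    cout = (x & y) | (y & cin) | (x & cin)
    return s, cout


def ripple_carry_add(x, y, n, cin=0):
    """Add two n-bit unsigned integers bit by bit; return (sum, carry_out)."""
    total = 0
    carry = cin
    for i in range(n):
        # The carry out of bit i ripples into bit i + 1.
        s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        total |= s << i
    return total, carry


def add_sub(x, y, n, subtract):
    """n-bit adder-subtracter: x + y, or x - y in two's complement."""
    mask = (1 << n) - 1
    y_in = y ^ (mask if subtract else 0)   # invert every y bit when subtracting
    return ripple_carry_add(x, y_in, n, cin=1 if subtract else 0)
```

For example, add_sub(3, 7, 4, True) returns the 4-bit pattern 1100, which is -4 in two's complement.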

7
Carry Lookahead Addition
  • Generates the carry-out signal using only
    primary input signals (no rippling of carries)
  • Key observations
  • ci is generated, regardless of the values of any
    other carries, if (xi AND yi) is equal to 1
  • ci is propagated, depending on the value of ci-1,
    if (xi EX-OR yi) is equal to 1
  • NOTE: we can also use (xi OR yi) for the
    propagate term
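
The generate and propagate observations can be sketched in Python as follows. This is an illustrative model, not a gate-level design: each carry is written as a function of the generate terms, the propagate terms, and the external carry-in only, so no carry depends on another carry. The function name and the list-based representation are assumptions.

```python
def carry_lookahead_add(x, y, n, cin=0):
    """Add two n-bit unsigned integers; return (sum, carry_out)."""
    g = [(x >> i) & (y >> i) & 1 for i in range(n)]     # generate: xi AND yi
    p = [((x >> i) ^ (y >> i)) & 1 for i in range(n)]   # propagate: xi EX-OR yi

    carries = []
    for i in range(n):
        # c_i = g_i + p_i g_(i-1) + ... + p_i ... p_1 g_0 + p_i ... p_0 cin
        c = 0
        prefix = 1                  # running product p_i p_(i-1) ... p_(j+1)
        for j in range(i, -1, -1):
            c |= prefix & g[j]
            prefix &= p[j]
        c |= prefix & cin
        carries.append(c)

    total = 0
    for i in range(n):
        carry_in = cin if i == 0 else carries[i - 1]
        total |= (p[i] ^ carry_in) << i   # sum bit needs the EX-OR form of p
    return total, carries[n - 1]
```

The OR form of the propagate term gives the same carries, but the sum bit in this sketch reuses p and therefore needs the EX-OR form.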

11
Multiplication
  • Combinational Multiplier
  • Typically uses an array of CSA (carry save adder)
    modules
  • Trades off space (hardware) for time (calculation
    speed)
  • Sequential Multiplier
  • Executes a sequence of add-and-shift operations
  • Tries to minimize the number of add-and-shifts
    required
  • Advantage: can use existing registers and ALU
  • Disadvantage: slower than combinational version

12
Multiplication H/W
  • Based on paper-and-pencil method of repeated
    shift-and-add operations
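
A minimal sketch of the repeated shift-and-add method for unsigned operands, assuming a double-length accumulator that is shifted right after each step. The name shift_add_multiply is illustrative and the register organization of the actual hardware may differ.

```python
def shift_add_multiply(x, y, n):
    """Multiply two n-bit unsigned integers with n add-and-shift steps."""
    acc = 0                    # double-length partial product
    multiplier = x
    for _ in range(n):
        if multiplier & 1:
            acc += y << n      # add the multiplicand into the upper half
        acc >>= 1              # shift the partial product right by one bit
        multiplier >>= 1       # expose the next multiplier bit
    return acc


print(shift_add_multiply(3, 5, 2))
```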

13
Observations
  • Multiplication of single digits in binary
    multiplication is just an AND operation
  • Multiplication of two n-bit numbers can be
    accomplished with (n-1) additions
  • Can use array of AND gates, HAs, and FAs
  • Figs. 4.17, 4.18, 4.19 (pp. 242-243) -> CSA
  • Question: Where is most of the delay in this
    design?
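
The observations above can be modeled behaviorally: partial-product bits come from AND operations, and carry-save adders (CSAs) reduce the rows without propagating carries. The sketch below operates on whole integers rather than individual gates, and the names carry_save_add and csa_multiply are assumptions. It is not the array of Figs. 4.17 to 4.19.

```python
def carry_save_add(a, b, c):
    """Reduce three operands to a (sum, carry) pair with no carry propagation."""
    s = a ^ b ^ c                                  # bitwise sum without carries
    carry = ((a & b) | (b & c) | (a & c)) << 1     # carries, moved up one place
    return s, carry


def csa_multiply(x, y, n):
    """Multiply two n-bit unsigned integers using carry-save reduction."""
    # Row i holds the bits x_j AND y_i, shifted left by i places.
    rows = [(x if (y >> i) & 1 else 0) << i for i in range(n)]
    s, c = 0, 0
    for row in rows:
        s, c = carry_save_add(s, c, row)
    # A single carry-propagate addition combines the last two operands.
    return s + c
```

In this sketch the only step in which carries propagate across bit positions is the final addition.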

14
Sequential Multiplication
  • Use one parallel adder, a set of registers
    (capable of shifting), and control logic
  • Use the ASM design method to design this circuit
  • Multiplier recoding can be used to reduce the
    number of adds and subtracts required
  • Booth's algorithm, Booth multiplier
  • Modified Booth Multiplier

15
Multiplication with Signed Numbers
  • Case 1: multiplier X and multiplicand Y are
    positive
  • Case 2: X is positive and Y is negative
  • sign-extend the partial products during shifting
  • use the msb (most significant bit) of the partial
    product
  • Case 3: X is negative and Y is positive
  • add one final step of subtracting Y from the
    partial product
  • Case 4: both X and Y are negative
  • apply the methods for both Case 2 and Case 3
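
The four cases can be sketched for n-bit two's-complement operands as below. The sketch assumes that Python's unbounded integers stand in for explicit sign extension of the partial products, and that the final subtraction of Y is weighted by the position of the multiplier's sign bit. Function names are illustrative.

```python
def to_signed(value, n):
    """Interpret the low n bits of value as a two's-complement number."""
    value &= (1 << n) - 1
    return value - (1 << n) if value >> (n - 1) else value


def signed_multiply(x_bits, y_bits, n):
    """Multiply two n-bit two's-complement bit patterns; return a Python int."""
    y = to_signed(y_bits, n)       # a negative Y stays sign-extended in Python
    acc = 0
    for i in range(n - 1):
        if (x_bits >> i) & 1:
            acc += y << i          # Case 2: partial products keep the sign of Y
    if (x_bits >> (n - 1)) & 1:
        acc -= y << (n - 1)        # Case 3: final subtraction when X is negative
    return acc
```

For example, signed_multiply(0b1110, 0b0011, 4) multiplies -2 by 3.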

23
Booth's Algorithm
  • Suppose X = 0111 1110. What is X in base 10?
  • X = 64 + 32 + 16 + 8 + 4 + 2 = 126
  • X = 128 - 2 = 126
  • This works in general (refer to p. 239)
  • A run of 1s can be replaced by 1 add and 1
    subtract
  • X can be recoded as X = 1000 00(-1)0, where 1
    denotes add and -1 denotes subtract
  • Called differentiating recoding
  • Algorithm shown in Figs. 4.15, 4.16 (pp. 240-241)
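
A small sketch of the recoding idea, assuming the usual rule that each recoded digit is the bit to the right minus the current bit. It is not a transcription of Figs. 4.15 and 4.16, and the names booth_recode and booth_multiply are made up for this example.

```python
def booth_recode(x_bits, n):
    """Return the recoded digits (+1, 0, -1) of an n-bit value, LSB first."""
    digits = []
    prev = 0                        # implicit 0 to the right of the LSB
    for i in range(n):
        bit = (x_bits >> i) & 1
        # Scanning right to left: the start of a run of 1s gives -1 (subtract),
        # the first 0 after a run gives +1 (add), anything else gives 0.
        digits.append(prev - bit)
        prev = bit
    return digits


def booth_multiply(x_bits, y, n):
    """Multiply the n-bit two's-complement multiplier x_bits by the integer y."""
    acc = 0
    for i, d in enumerate(booth_recode(x_bits, n)):
        if d == 1:
            acc += y << i
        elif d == -1:
            acc -= y << i
    return acc


print(booth_recode(0b01111110, 8))
```

For X = 0111 1110 the recoding contains a single -1 at weight 2 and a single +1 at weight 128, so only one subtraction and one addition are needed.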

24
Division
  • Sequential Divider
  • Executes a sequence of subtract-and-shift
    operations
  • Tries to minimize the number of subtract-and-shifts
    required
  • Advantage: can use existing registers and ALU
  • Disadvantage: slower than combinational version
  • Combinational Divider
  • Uses an array of 1-bit subtracter modules
  • Trades off space (hardware) for time (calculation
    speed)

25
Sequential Division H/W
  • Based on paper-and-pencil method of repeated
    subtract operations
  • Note: quotient bit needs to be guessed
  • Two basic methods available
  • Restoring division
  • restore partial remainder if guess is wrong
  • Nonrestoring division
  • change next subtract step to addition if guess is
    wrong
  • More advanced methods based on other guessing
    methods
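
The two basic methods can be compared with a minimal sketch for unsigned n-bit operands and a nonzero divisor. Both functions guess that each quotient bit is 1 by subtracting the divisor; they differ only in how a wrong guess is handled. Function names are illustrative.

```python
def restoring_divide(dividend, divisor, n):
    """Unsigned n-bit restoring division; return (quotient, remainder)."""
    r, q = 0, 0
    for i in range(n - 1, -1, -1):
        r = (r << 1) | ((dividend >> i) & 1)   # shift in the next dividend bit
        r -= divisor                           # guess: quotient bit is 1
        if r < 0:
            r += divisor                       # wrong guess: restore
            q = q << 1
        else:
            q = (q << 1) | 1
    return q, r


def nonrestoring_divide(dividend, divisor, n):
    """Unsigned n-bit nonrestoring division; return (quotient, remainder)."""
    r, q = 0, 0
    for i in range(n - 1, -1, -1):
        bit = (dividend >> i) & 1
        if r >= 0:
            r = (r << 1) + bit - divisor       # previous guess was right
        else:
            r = (r << 1) + bit + divisor       # previous guess was wrong: add
        q = (q << 1) | (1 if r >= 0 else 0)
    if r < 0:
        r += divisor                           # one final correction step
    return q, r
```

Both functions return the same result, for example (3, 1) for 7 divided by 2 with n = 4.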

26
Paper-and-pencil Division Method
34
Arithmetic Logic Unit (ALU)
  • Uses of the ALU
  • process arithmetic and logical instructions
  • address calculations
  • act as a data conduit (route data between two
    points)
  • ALU Design Techniques
  • many advanced transistor-level design techniques
    used to achieve fast ALU designs
  • gate-level designs can be flattened for better
    performance
  • basic ALU design is fairly simple

35
Design of One Bit of ALU
  • ALU can be designed as an adder that can
    conditionally perform other functions based on
    the selection of control inputs
  • ALU designed as a chain of identical 1-bit adders
  • may not be efficient for large numbers of bits
  • Adder functions
  • sum = x EX-OR y EX-OR cin
  • cout = (x AND y) OR (y AND cin) OR (x AND cin)
  • Alternative ALU designs shown in Sec. 4.2
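
As a hedged illustration of an adder that conditionally performs other functions, the sketch below models one bit slice and chains n identical slices. The control input op and its four string values are hypothetical; a real design would use encoded select lines, and the slides do not specify them.

```python
def alu_bit(x, y, cin, op):
    """One ALU bit slice; return (result_bit, cout). op is a hypothetical control."""
    if op == "AND":
        return x & y, 0
    if op == "OR":
        return x | y, 0
    if op == "XOR":
        return x ^ y, 0
    if op == "ADD":
        s = x ^ y ^ cin
        cout = (x & y) | (y & cin) | (x & cin)
        return s, cout
    raise ValueError(f"unknown op: {op}")


def alu(x, y, n, op):
    """Chain n identical bit slices; return (result, carry_out)."""
    result, carry = 0, 0
    for i in range(n):
        bit, carry = alu_bit((x >> i) & 1, (y >> i) & 1, carry, op)
        result |= bit << i
    return result, carry
```

Because the carry passes through every slice in turn, the delay of this chained organization grows with the number of bits.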

37
Floating-Point Arithmetic
  • IEEE Standard for floating-point numbers is based on
    a draft proposed by Kahan et al. in 1979.
  • X = (FX, EX), where FX = mantissa, EX = exponent
  • Multiplication: multiply mantissas, add exponents
  • Division: divide mantissas, subtract exponents
  • Addition: shift one mantissa to align the exponents, then add
  • Subtraction: shift one mantissa to align the exponents, then
    subtract
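
The rules above can be tried out on (FX, EX) pairs with a simplified sketch. It assumes base 2, nonnegative integer mantissas, and a value of FX * 2^EX, and it omits normalization, rounding, and the IEEE encoding entirely. All function names are illustrative.

```python
def fp_multiply(a, b):
    """Multiply mantissas, add exponents."""
    (fa, ea), (fb, eb) = a, b
    return fa * fb, ea + eb


def fp_add(a, b, subtract=False):
    """Align the exponents, then add or subtract the mantissas."""
    (fa, ea), (fb, eb) = a, b
    # Shift the mantissa with the smaller exponent right; low bits are dropped.
    if ea < eb:
        fa >>= eb - ea
        ea = eb
    else:
        fb >>= ea - eb
    return (fa - fb if subtract else fa + fb), ea


def fp_value(x):
    """Numeric value of an (FX, EX) pair under the assumptions above."""
    f, e = x
    return f * 2.0 ** e
```

For example, adding (12, 0) and (1, 2) shifts 12 right by two places to 3 and returns (4, 2), which has the value 16.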

39
Floating-Point Addition Process (Assuming Positive Numbers)
40
ASM Method Step 1: Pseudocode
41
Floating-Point Addition Units
  • Similar algorithm shown in Fig. 4.42
  • Example of algorithm execution shown in Fig. 4.43
  • Floating-point addition unit for IBM System/360
    shown in Fig. 4.44

42
Pipeline Processing: Basic Structure
43
  • Speedup
  • Speedup(pipeline) = Time(no pipeline) / Time(pipeline)
  • Space-Time Diagram
  • Efficiency
  • Ratio of numbered blocks to total number of
    space-time blocks
  • What is the efficiency of an ideal m-stage
    pipeline operating on N data items?
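
The speedup and efficiency definitions can be evaluated under simple assumptions: every stage takes one clock cycle, the pipeline never stalls, and the non-pipelined unit needs m cycles per data item. Under those assumptions an m-stage pipeline finishes N items in m + N - 1 cycles. The function names are illustrative.

```python
def pipeline_cycles(m, n_items):
    """Cycles for an ideal m-stage pipeline to finish n_items."""
    return m + n_items - 1


def speedup(m, n_items):
    """Time(no pipeline) / Time(pipeline), with m cycles per item unpipelined."""
    return (m * n_items) / pipeline_cycles(m, n_items)


def efficiency(m, n_items):
    """Occupied space-time blocks divided by all space-time blocks."""
    busy = m * n_items                      # each item visits every stage once
    total = m * pipeline_cycles(m, n_items)
    return busy / total


print(speedup(4, 100), efficiency(4, 100))
```

Under these assumptions the efficiency simplifies to N / (m + N - 1), which approaches 1 as N grows.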

44
Example Pipeline Structure
Linear Pipeline Structure for Floating-point Multiplication
45
Example Timing Diagram
After 4 clock cycles, there is one output result
every clock cycle
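
The timing statement can be reproduced with a small space-time model. It assumes a linear pipeline of four one-cycle stages (inferred from the "4 clock cycles" above), no stalls, and items numbered from 1. The function names are made up for this example.

```python
def space_time_diagram(m, n_items):
    """Return one row per stage; entries are item numbers, 0 means idle."""
    cycles = m + n_items - 1
    rows = []
    for stage in range(m):
        row = []
        for t in range(cycles):
            item = t - stage + 1            # item occupying this stage at cycle t
            row.append(item if 1 <= item <= n_items else 0)
        rows.append(row)
    return rows


def output_cycles(m, n_items):
    """Clock cycles (counted from 1) in which the last stage holds an item."""
    last_stage = space_time_diagram(m, n_items)[-1]
    return [t + 1 for t, item in enumerate(last_stage) if item]


for row in space_time_diagram(4, 3):
    print(*row)
```

With four stages and three items, the last stage is occupied in cycles 4, 5, and 6: the first result takes four cycles and one more follows in each later cycle.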
46
Categorization of Pipeline Structures
  • Based on Function
  • Instruction pipeline
  • Arithmetic pipeline (e.g., multiplier pipeline)
  • Based on Structure
  • Linear / Nonlinear
  • Static / Dynamic (multi-function)
  • Scalar / Vector

47
Simple Instruction Pipelines
  • Static linear pipeline of about 2-8 stages
  • Difficulties with simple static linear pipelines
  • Variations in instruction execution times
  • Variations in instruction lengths
  • Different number of accesses to memory to fetch
    instruction
  • Cannot quickly determine location of next
    instruction
  • Thus, instruction sets should be designed so that
    the resulting architectures are easily pipelined
  • Set of fixed-length, similar-complexity
    instructions

48
Instruction Pipeline Control
  • ASM Chart Method
  • Changes ASM chart to fetch the next instruction
    while the current instruction is being executed.
  • Pipelined Control Signals
  • Control logic generates control signals in the
    first stage
  • Control signals are pipelined along with the
    instructions