Computer Architecture Lecture Notes, Spring 2005, Dr. Michael P. Frank

Transcript and Presenter's Notes



1
Computer Architecture Lecture Notes, Spring 2005
Dr. Michael P. Frank
  • Competency Area 4: Computer Arithmetic

2
Introduction
  • In previous chapters we've discussed:
  • Performance (execution time, clock cycles,
    instructions, MIPS, etc.)
  • Abstractions: Instruction Set Architecture,
    Assembly Language, and Machine Language
  • In this chapter
  • Implementing the Architecture
  • How does the hardware really add, subtract,
    multiply and divide?
  • Signed and unsigned representations
  • Constructing an ALU (Arithmetic Logic Unit)

3
Introduction
  • Humans naturally represent numbers in base 10;
    computers, however, work in base 2.

Example: the 8-bit pattern (1111 1111)2 represents -1 as a
signed (two's complement) number, but 255 as an unsigned number.
Note: Signed representations include sign-magnitude,
one's complement, and two's complement.
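
To make this concrete, here is a minimal C sketch (an addition to
the original slides, not part of them) that reads the same 8-bit
pattern both ways:

    /* Same bits, two interpretations: 1111 1111 as unsigned (255)
     * versus two's-complement signed (-1). */
    #include <stdio.h>
    #include <stdint.h>

    int main(void) {
        uint8_t u = 0xFF;          /* 1111 1111 read as unsigned   */
        int8_t  s = (int8_t)0xFF;  /* the same bits read as signed */
        printf("unsigned: %u\n", (unsigned)u);   /* prints 255 */
        printf("signed:   %d\n", (int)s);        /* prints -1  */
        return 0;
    }
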
4
Possible Representations
  • Sign Magnitude One's Complement
    Two's Complement 000 0 000 0 000
    0 001 1 001 1 001 1 010 2 010
    2 010 2 011 3 011 3 011 3 100
    -0 100 -3 100 -4 101 -1 101
    -2 101 -3 110 -2 110 -1 110
    -2 111 -3 111 -0 111 -1
  • Sign Magnitude (first bit is sign bit, others
    magnitude)
  • Twos Complement (negation invert bits and add
    1)
  • Ones Complement (first bit is sign bit, invert
    other bits for magnitude)
  • NOTE Computers today use twos complement binary
    representations for signed numbers.

5
Two's Complement Representations
  • 32-bit signed numbers (MIPS):

    0000 0000 0000 0000 0000 0000 0000 0000two =              0ten
    0000 0000 0000 0000 0000 0000 0000 0001two =              1ten
    0000 0000 0000 0000 0000 0000 0000 0010two =              2ten
    ...
    0111 1111 1111 1111 1111 1111 1111 1110two =  2,147,483,646ten
    0111 1111 1111 1111 1111 1111 1111 1111two =  2,147,483,647ten
    1000 0000 0000 0000 0000 0000 0000 0000two = -2,147,483,648ten
    1000 0000 0000 0000 0000 0000 0000 0001two = -2,147,483,647ten
    1000 0000 0000 0000 0000 0000 0000 0010two = -2,147,483,646ten
    ...
    1111 1111 1111 1111 1111 1111 1111 1101two =             -3ten
    1111 1111 1111 1111 1111 1111 1111 1110two =             -2ten
    1111 1111 1111 1111 1111 1111 1111 1111two =             -1ten

  • The hardware need only test the first (most significant) bit
    to determine the sign.

6
Two's Complement Operations
  • Negating a two's complement number:
  • invert all bits and add 1
  • or, preserve the rightmost 1 and the 0s to its right, and flip
    all bits to the left of that rightmost 1
  • Converting n-bit numbers into m-bit numbers with m > n:
  • Example: convert a 4-bit signed number into an 8-bit number.
  • 0010 -> 0000 0010 (2ten)
  • 1010 -> 1111 1010 (-6ten)
  • "Sign extension" is used: the most significant bit is copied
    into the leftmost bits of the new word (see the sketch below).
    For unsigned numbers, the leftmost bits are simply filled
    with 0s.
  • Example instructions: lbu/lb, slt/sltu, etc.
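
As a software illustration (not from the slides; the helper names
negate8 and sign_extend4 are made up here), the following C sketch
performs two's-complement negation and 4-bit-to-8-bit sign
extension exactly as described above:

    #include <stdio.h>
    #include <stdint.h>

    /* Negate an 8-bit value the way the slide describes: invert and add 1. */
    static uint8_t negate8(uint8_t x) {
        return (uint8_t)(~x + 1);
    }

    /* Sign-extend the low 4 bits of x to 8 bits: copy bit 3 (the sign
     * bit of the 4-bit field) into the upper, leftmost bits. */
    static int8_t sign_extend4(uint8_t x) {
        x &= 0x0F;
        if (x & 0x08)          /* 4-bit sign bit is set          */
            x |= 0xF0;         /* fill the leftmost bits with 1s */
        return (int8_t)x;
    }

    int main(void) {
        printf("negate(0000 0010) = 0x%02X\n", negate8(0x02)); /* 0xFE (-2) */
        printf("sign_extend(0010) = %d\n", sign_extend4(0x2)); /*  2        */
        printf("sign_extend(1010) = %d\n", sign_extend4(0xA)); /* -6        */
        return 0;
    }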

7
Addition and Subtraction
  • Just like in grade school (carry/borrow 1s):

        0111      0111      0110
      + 0110    - 0110    - 0101

  • Two's complement makes these operations easy:
  • subtraction becomes addition of the negated number:

        0111
      + 1010

  • Overflow (result too large for the finite computer word):
  • e.g., adding two n-bit numbers may not yield an n-bit number:

        0111
      + 0001
        1000

  • Note that the term "overflow" is somewhat misleading: it does
    not mean a carry "overflowed" out of the word (see the
    overflow-detection sketch below).
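
For illustration, a hedged C sketch of the overflow rule for
two's-complement addition (the helper add4_overflows is a
hypothetical name; it treats its arguments as 4-bit patterns):
overflow occurs exactly when both operands have the same sign but
the 4-bit sum does not.

    #include <stdio.h>
    #include <stdint.h>

    static int add4_overflows(uint8_t a, uint8_t b) {
        uint8_t sum = (uint8_t)((a + b) & 0x0F);   /* keep only 4 bits  */
        int sa = (a >> 3) & 1, sb = (b >> 3) & 1;  /* operand sign bits */
        int ss = (sum >> 3) & 1;                   /* result sign bit   */
        return (sa == sb) && (ss != sa);
    }

    int main(void) {
        /* 0111 (+7) + 0001 (+1) = 1000 (-8 in 4 bits): overflow. */
        printf("0111 + 0001 overflows: %d\n", add4_overflows(0x7, 0x1));
        /* 0111 (+7) + 1010 (-6) = 0001 (+1): no overflow.        */
        printf("0111 + 1010 overflows: %d\n", add4_overflows(0x7, 0xA));
        return 0;
    }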

8
32-bit ALU with Zero Detect
Recall that, given the following control lines, we get these
functions:
  • 000 = AND
  • 001 = OR
  • 010 = add
  • 110 = subtract
  • 111 = set on less than (slt)
We've learned how to build each of these functions in hardware.
9
So far
  • We've studied how to implement a 1-bit ALU in
    hardware that supports the MIPS instruction set
  • key idea use multiplexor to select desired
    output function
  • we can efficiently perform subtraction using
    two's complement
  • we can replicate a 1-bit ALU to produce a 32-bit
    ALU
  • Important issues about hardware
  • all of the gates are always working
  • the speed of a gate is affected by the number of
    inputs to the gate
  • the speed of a circuit is affected by the number
    of gates in series (on the critical path or
    the deepest level of logic)
  • Changes in hardware organization can improve
    performance
  • we'll look at examples for addition (the carry-lookahead
    adder), multiplication, and division

10
Better adder design
  • For adder design:
  • Problem: the ripple-carry adder is slow due to the sequential
    evaluation of the carry-in/carry-out bits.
  • Consider the carry-in inputs:

      c1 = (b0 · c0) + (a0 · c0) + (a0 · b0)
      c2 = (b1 · c1) + (a1 · c1) + (a1 · b1)

  • Using substitution, we can see the ripple effect:

      c2 = (b1 · ((b0·c0) + (a0·c0) + (a0·b0)))
         + (a1 · ((b0·c0) + (a0·c0) + (a0·b0)))
         + (a1 · b1)

11
Carry-Lookahead Adder
  • Faster carry schemes exist that improve the speed of adders in
    hardware and reduce the complexity of the equations, namely
    the carry-lookahead adder.
  • Let ci represent the ith carry-in bit. Then:

      ci+1 = (bi · ci) + (ai · ci) + (ai · bi)

  • We can now define the terms generate and propagate:

      gi = ai · bi        pi = ai + bi

  • Then:

      ci+1 = gi + (pi · ci)
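
The following C sketch (an illustrative addition, not the
textbook's code) evaluates the fully expanded lookahead equations
for a 4-bit adder, so each carry is computed from the gi, pi, and
c0 signals alone rather than by rippling:

    #include <stdio.h>

    int main(void) {
        /* Example operands a = 0110 (6) and b = 0101 (5), LSB first,
         * with carry-in c0 = 0. */
        int a[4] = {0, 1, 1, 0};
        int b[4] = {1, 0, 1, 0};
        int g[4], p[4], c0 = 0;

        for (int i = 0; i < 4; i++) {
            g[i] = a[i] & b[i];    /* generate:  gi = ai AND bi */
            p[i] = a[i] | b[i];    /* propagate: pi = ai OR  bi */
        }

        /* Expanded equations: every carry depends only on g, p, c0. */
        int c1 = g[0] | (p[0] & c0);
        int c2 = g[1] | (p[1] & g[0]) | (p[1] & p[0] & c0);
        int c3 = g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0])
                      | (p[2] & p[1] & p[0] & c0);
        int c4 = g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
                      | (p[3] & p[2] & p[1] & g[0])
                      | (p[3] & p[2] & p[1] & p[0] & c0);

        printf("c1=%d c2=%d c3=%d c4=%d\n", c1, c2, c3, c4);
        return 0;
    }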

12
Carry-Lookahead Adder
  • Suppose gi is 1. The adder generates a carry-out regardless of
    the value of the carry-in, i.e.

      ci+1 = gi + (pi · ci) = 1

  • Now suppose gi is 0 and pi is 1:

      ci+1 = 0 + (1 · ci) = ci

  • The adder propagates a carry-in to a carry-out. In summary,
    the carry-out is 1 if either gi is 1, or pi and ci are both 1.
  • This new approach creates the first level of abstraction.

13
Carry-Lookahead Adder
  • Sometimes the first level of abstraction produces large
    equations. It is then beneficial to look at a second level of
    abstraction, produced by computing propagate and generate
    signals over 4-bit groups.
  • For a 16-bit adder built from four 4-bit groups, the "super"
    propagate signals are:

      P0 = p3 · p2 · p1 · p0
      P1 = p7 · p6 · p5 · p4
      P2 = p11 · p10 · p9 · p8
      P3 = p15 · p14 · p13 · p12

  • There is a corresponding "super" generate signal (next slide).
  • So Pi is true only if each of the bits in the group propagates
    a carry.

14
Carry-Lookahead Adder
  • For the super generate signals, what matters is whether a
    carry comes out of the most significant bit of the group:

      G0 = g3 + (p3 · g2) + (p3 · p2 · g1) + (p3 · p2 · p1 · g0)
      G1 = g7 + (p7 · g6) + (p7 · p6 · g5) + (p7 · p6 · p5 · g4)
      G2 = g11 + (p11 · g10) + (p11 · p10 · g9) + (p11 · p10 · p9 · g8)
      G3 = g15 + (p15 · g14) + (p15 · p14 · g13) + (p15 · p14 · p13 · g12)

  • Now we can represent the carry-out signals for the 16-bit
    adder with two levels of abstraction as:

      C1 = G0 + (P0 · c0)
      C2 = G1 + (P1 · C1)
      C3 = G2 + (P2 · C2)
      C4 = G3 + (P3 · C3)

15
2nd Level of Abstraction: Carry-Lookahead Adder Design
16
O(log n)-time carry-skip adder
With this structure, we can do a 2^n-bit add in 2(n+1) logic
stages. The hardware overhead is less than 2x that of a regular
ripple-carry adder.
  • (8-bit segment shown; the figure labels the 1st through 4th
    carry "ticks")
17
Multiplication Algorithms
  • Recall that multiplication is accomplished via shifting and
    addition.
  • Example: 0010 x 0110 (2 x 6):

            0010   (multiplicand)
          x 0110   (multiplier)
          ------
            0000   (0 x multiplicand)
           0010    (1 x multiplicand, shifted left 1 bit)
          0010     (1 x multiplicand, shifted left 2 bits)
         0000      (0 x multiplicand, shifted left 3 bits)
         -------
         0001100   (product = 12)

  • Each row is an intermediate product: the multiplicand
    multiplied by one bit of the multiplier (starting at the LSB)
    and shifted accordingly; summing the rows gives the product.
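
A rough C model of this shift-and-add scheme for unsigned 32-bit
operands (an illustrative addition; the name shift_add_multiply is
made up here). It mirrors the hardware of Algorithm 1 on the next
slides: a 64-bit product, the multiplicand shifting left and the
multiplier shifting right on each iteration.

    #include <stdio.h>
    #include <stdint.h>

    static uint64_t shift_add_multiply(uint32_t multiplicand, uint32_t multiplier) {
        uint64_t product = 0;
        uint64_t mcand   = multiplicand;   /* 64-bit register, value in right half */

        for (int i = 0; i < 32; i++) {
            if (multiplier & 1)        /* 1:  test the LSB of the multiplier  */
                product += mcand;      /* 1a: add multiplicand to the product */
            mcand <<= 1;               /* 2:  shift the multiplicand left     */
            multiplier >>= 1;          /* 3:  shift the multiplier right      */
        }
        return product;
    }

    int main(void) {
        /* The example above: 0010 x 0110 = 2 x 6 = 12. */
        printf("2 * 6 = %llu\n",
               (unsigned long long)shift_add_multiply(2, 6));
        return 0;
    }
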
18
Multiplication Algorithm 1
Hardware implementation of Algorithm 1
19
Multiplication Algorithm 1
For each bit
20
Multiplication Algorithm 1
Example (4-bit)
  Iteration | Step                       | Multiplier | Multiplicand | Product
  ----------+----------------------------+------------+--------------+----------
      0     | Initial values             |    0011    |  0000 0010   | 0000 0000
      1     | 1a: LSB = 1, add Mcand     |    0011    |  0000 0010   | 0000 0010
      1     | 2: shift Mcand left        |    0011    |  0000 0100   | 0000 0010
      1     | 3: shift Multiplier right  |    0001    |  0000 0100   | 0000 0010
      2     | 1a: LSB = 1, add Mcand     |    0001    |  0000 0100   | 0000 0110
      2     | 2: shift Mcand left        |    0001    |  0000 1000   | 0000 0110
      2     | 3: shift Multiplier right  |    0000    |  0000 1000   | 0000 0110
      3     | 1: LSB = 0, no add         |    0000    |  0000 1000   | 0000 0110
      3     | 2: shift Mcand left        |    0000    |  0001 0000   | 0000 0110
      3     | 3: shift Multiplier right  |    0000    |  0001 0000   | 0000 0110
      4     | 1: LSB = 0, no add         |    0000    |  0001 0000   | 0000 0110
      4     | 2: shift Mcand left        |    0000    |  0010 0000   | 0000 0110
      4     | 3: shift Multiplier right  |    0000    |  0010 0000   | 0000 0110
21
Multiplication Algorithms
  • For Algorithm 1 we initialize the left half of the 64-bit
    multiplicand register to 0 to accommodate its left shifts, and
    all adds are 64 bits wide. This is wasteful and slow.
  • Algorithm 2: instead of shifting the multiplicand left, shift
    the product register right. This halves the widths of the ALU
    and the multiplicand register.

22
Multiplication Algorithm 2
For each bit
23
Multiplication Algorithm 2
Example (4-bit)
  Iteration | Step                       | Multiplier | Multiplicand | Product
  ----------+----------------------------+------------+--------------+----------
      0     | Initial values             |    0011    |     0010     | 0000 0000
      1     | 1a: LSB = 1, add Mcand     |    0011    |     0010     | 0010 0000
      1     | 2: shift Product right     |    0011    |     0010     | 0001 0000
      1     | 3: shift Multiplier right  |    0001    |     0010     | 0001 0000
      2     | 1a: LSB = 1, add Mcand     |    0001    |     0010     | 0011 0000
      2     | 2: shift Product right     |    0001    |     0010     | 0001 1000
      2     | 3: shift Multiplier right  |    0000    |     0010     | 0001 1000
      3     | 1: LSB = 0, no add         |    0000    |     0010     | 0001 1000
      3     | 2: shift Product right     |    0000    |     0010     | 0000 1100
      3     | 3: shift Multiplier right  |    0000    |     0010     | 0000 1100
      4     | 1: LSB = 0, no add         |    0000    |     0010     | 0000 1100
      4     | 2: shift Product right     |    0000    |     0010     | 0000 0110
      4     | 3: shift Multiplier right  |    0000    |     0010     | 0000 0110
24
Multiplication Algorithm 3
  • The third multiplication algorithm combines the right half of
    the product register with the multiplier.
  • This reduces the number of steps needed to implement the
    multiply, and it also saves space.
  • Hardware Implementation of Algorithm 3

25
Multiplication Algorithm 3
For each bit
26
Multiplication Algorithm 3
Example (4-bit)
  Iteration | Step                       | Multiplicand | Product
  ----------+----------------------------+--------------+----------
      0     | Initial values             |     0010     | 0000 0011
      1     | 1a: LSB = 1, add Mcand     |     0010     | 0010 0011
      1     | 2: shift Product right     |     0010     | 0001 0001
      2     | 1a: LSB = 1, add Mcand     |     0010     | 0011 0001
      2     | 2: shift Product right     |     0010     | 0001 1000
      3     | 1: LSB = 0, no add         |     0010     | 0001 1000
      3     | 2: shift Product right     |     0010     | 0000 1100
      4     | 1: LSB = 0, no add         |     0010     | 0000 1100
      4     | 2: shift Product right     |     0010     | 0000 0110
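
The table above can be reproduced with a small C model of
Algorithm 3 (an illustrative sketch, not the textbook's code; the
name multiply_alg3_4bit is made up here). The 8-bit product
register starts with the multiplier in its right half; each step
adds the multiplicand into the left half when the LSB is 1 and
then shifts the whole register right, with the adder's carry-out
shifted back in.

    #include <stdio.h>
    #include <stdint.h>

    static uint8_t multiply_alg3_4bit(uint8_t multiplicand, uint8_t multiplier) {
        /* 16-bit variable so the carry out of bit 7 is kept and
         * shifted back in, as the hardware does. */
        uint16_t product = multiplier & 0x0F;

        for (int i = 0; i < 4; i++) {
            if (product & 1)                                      /* test LSB of product */
                product += (uint16_t)(multiplicand & 0x0F) << 4;  /* add to left half    */
            product >>= 1;                                        /* shift product right */
        }
        return (uint8_t)product;
    }

    int main(void) {
        /* The table's example: 0010 x 0011 -> 0000 0110 (2 x 3 = 6). */
        printf("2 * 3 = %u\n", multiply_alg3_4bit(0x2, 0x3));
        return 0;
    }
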
27
Division Algorithms
  • Example
  • Hardware implementations are similar to the multiplication
    algorithms:
  • Algorithm 1: implements the conventional division method
  • Algorithm 2: reduces the divisor register and ALU by half
  • Algorithm 3: eliminates the quotient register completely
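
For illustration, here is a hedged C sketch of restoring division,
the "conventional" method that Algorithm 1 implements in hardware
(the name restoring_divide is made up here; it assumes an unsigned,
nonzero divisor):

    #include <stdio.h>
    #include <stdint.h>

    static void restoring_divide(uint32_t dividend, uint32_t divisor,
                                 uint32_t *quotient, uint32_t *remainder) {
        uint64_t rem = 0;
        uint32_t quo = 0;

        for (int i = 31; i >= 0; i--) {
            rem = (rem << 1) | ((dividend >> i) & 1); /* bring down next bit  */
            if (rem >= divisor) {                     /* trial subtraction ok */
                rem -= divisor;
                quo |= 1u << i;                       /* quotient bit = 1     */
            }                                         /* else restore (no-op) */
        }
        *quotient  = quo;
        *remainder = (uint32_t)rem;
    }

    int main(void) {
        uint32_t q, r;
        restoring_divide(74, 8, &q, &r);
        printf("74 / 8 = %u remainder %u\n", q, r);   /* 9 remainder 2 */
        return 0;
    }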

28
Floating Point Numbers
  • We need a way to represent:
  • numbers with fractions, e.g., 3.1416
  • very small numbers, e.g., 0.000000001
  • very large numbers, e.g., 3.15576 x 10^9
  • Representation:
  • sign, exponent, significand:
    (-1)^sign x significand x 2^exponent
  • more bits for the significand gives more accuracy
  • more bits for the exponent increases the dynamic range
  • IEEE 754 floating-point standard:
  • single precision: 1 sign bit, 8 exponent bits, 23 significand
    bits
  • double precision: 1 sign bit, 11 exponent bits, 52 significand
    bits

29
IEEE 754 floating-point standard
  • The leading 1 bit of the significand is implicit.
  • The exponent is biased to make sorting easier:
  • all 0s is the smallest exponent, all 1s is the largest
  • bias of 127 for single precision, 1023 for double precision
  • Summary:

      value = (-1)^sign x (1 + fraction) x 2^(exponent - bias)

  • Example: -0.75ten = -1.1two x 2^-1
  • Single precision: (-1)^1 x (1 + .1000...) x 2^(126 - 127)

      1 01111110 10000000000000000000000

  • Double precision: (-1)^1 x (1 + .1000...) x 2^(1022 - 1023)

      1 01111111110 10000000000000000000 (and 32 more 0s)

30
FP Addition Algorithm
  • The number with the smaller exponent must be shifted right
    before adding, so that the binary points align.
  • After adding, the sum must be normalized.
  • Then it is rounded, and possibly re-normalized.
  • Possible errors include:
  • overflow (exponent too big)
  • underflow (exponent too small)
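
As an illustrative sketch only (not the slide's hardware), the C
code below walks the same steps on a toy format: positive,
already-normalized numbers stored as an unbiased exponent plus a
24-bit significand with the leading 1 kept explicit in bit 23.
Rounding is plain truncation, and signs, overflow, underflow, and
special values are ignored.

    #include <stdio.h>
    #include <stdint.h>

    /* value = sig * 2^(exp - 23), with 2^23 <= sig < 2^24 */
    typedef struct { int exp; uint32_t sig; } toyfp;

    static toyfp toy_add(toyfp a, toyfp b) {
        /* Step 1: shift the smaller number's significand right so
         * the binary points align. */
        if (a.exp < b.exp) { toyfp t = a; a = b; b = t; }
        int d = a.exp - b.exp;
        b.sig = (d < 24) ? (b.sig >> d) : 0;   /* lost entirely if far smaller */

        /* Step 2: add the significands. */
        toyfp r = { a.exp, a.sig + b.sig };

        /* Step 3: normalize (the sum may need one right shift). */
        while (r.sig >= (1u << 24)) { r.sig >>= 1; r.exp++; }
        return r;
    }

    int main(void) {
        toyfp a = { 0, 0xC00000 };   /* 1.5 x 2^0       */
        toyfp b = { 1, 0x800000 };   /* 1.0 x 2^1 = 2.0 */
        toyfp s = toy_add(a, b);
        /* 1.5 + 2.0 = 3.5 = 1.75 x 2^1 -> exp=1, sig=0xE00000 */
        printf("exp=%d sig=0x%06X\n", s.exp, (unsigned)s.sig);
        return 0;
    }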

31
Floating-Point Addition Hardware
  • Implements the algorithm from the previous slide.
  • Note the high complexity compared with integer-addition
    hardware.

32
FP Multiplication Algorithm
  • Add the exponents, adjusting for the bias.
  • Multiply the significands.
  • Normalize,
  • check for overflow/underflow,
  • then round (repeating the normalization if necessary).
  • Compute the sign.
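
A matching toy C sketch of these multiply steps (again an
illustration, not the slide's hardware), using the same format as
the addition sketch above: positive normalized numbers with an
unbiased exponent and an explicit-leading-1, 24-bit significand.
Because the toy exponents are unbiased, the "adjust for bias" step
disappears; with IEEE biased exponents one would compute
e1 + e2 - bias instead. Rounding is truncation, and the sign and
overflow/underflow checks are left out.

    #include <stdio.h>
    #include <stdint.h>

    /* value = sig * 2^(exp - 23), with 2^23 <= sig < 2^24 */
    typedef struct { int exp; uint32_t sig; } toyfp;

    static toyfp toy_mul(toyfp a, toyfp b) {
        toyfp r;
        r.exp = a.exp + b.exp;                    /* 1: add the exponents     */
        uint64_t prod = (uint64_t)a.sig * b.sig;  /* 2: multiply significands */
        prod >>= 23;                              /*    rescale to ~24 bits   */
        while (prod >= (1u << 24)) {              /* 3: normalize             */
            prod >>= 1;
            r.exp++;
        }
        r.sig = (uint32_t)prod;                   /* 4: round (truncate here) */
        return r;
    }

    int main(void) {
        toyfp a = { 0, 0xC00000 };   /* 1.5  x 2^0       */
        toyfp b = { 1, 0xA00000 };   /* 1.25 x 2^1 = 2.5 */
        toyfp p = toy_mul(a, b);
        /* 1.5 * 2.5 = 3.75 = 1.875 x 2^1 -> exp=1, sig=0xF00000 */
        printf("exp=%d sig=0x%06X\n", p.exp, (unsigned)p.sig);
        return 0;
    }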

33
Ethics Addendum: the Intel Pentium FP bug
  • In July 1994, Intel discovered there was a bug in the
    Pentium's FP division hardware.
  • But it decided not to announce the bug, and to go ahead and
    ship chips having the flaw anyway, to save time and money.
  • Based on their analysis, they thought errors could arise only
    rarely.
  • Even after the bug was discovered by users, Intel initially
    refused to replace the bad chips on request!
  • They got a lot of bad PR from this.
  • Lesson: Good, ethical engineers fix problems when they first
    find them, and don't cover them up!

34
Summary
  • Computer arithmetic is constrained by limited
    precision.
  • Bit patterns have no inherent meaning, but standards do exist:
  • two's complement
  • IEEE 754 floating point
  • Computer instructions determine the meaning of the bit
    patterns.
  • Performance and accuracy are important, so there are many
    complexities in real machines (i.e., in the algorithms and
    their implementation).
  • Please read the remainder of the chapter on your own. However,
    for exams you will only be responsible for the material
    covered in the lectures.