CSE 8383 - Advanced Computer Architecture

1
CSE 8383 - Advanced Computer Architecture
  • Week-3
  • Week of Jan 26, 2004
  • engr.smu.edu/rewini/8383

2
Contents
  • Linear Pipelines
  • Nonlinear pipelines
  • Instruction Pipelines
  • Arithmetic Operations
  • Design of Multifunction Pipeline

3
Linear Pipeline
  • Processing Stages are linearly connected
  • Perform fixed function
  • Synchronous Pipeline
  • Clocked latches between Stage i and Stage i+1
  • Equal delays in all stages
  • Asynchronous Pipeline (Handshaking)

4
Latches
[Figure: stages S1, S2, S3 in series, separated by latches L1 and L2]
Slowest stage determines delay
Equal delays → clock period
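The timing rule on this slide can be sketched in a few lines. This is only an illustration: the stage delays and the latch overhead below are hypothetical values, not figures from the slides, and adding a latch delay to the slowest stage is an assumed model.

```python
def clock_period(stage_delays, latch_delay=0.0):
    """Clock period of a synchronous linear pipeline.

    Assumed model: the slowest stage sets the period, plus a fixed latch overhead.
    """
    return max(stage_delays) + latch_delay


delays_ns = [10.0, 12.0, 9.0]  # hypothetical delays for S1, S2, S3
print(clock_period(delays_ns, latch_delay=1.0))  # 13.0
```

With equal stage delays the period equals that common delay (plus any latch overhead), so no stage sits idle waiting for a slower one.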
5
Reservation Table
Time  1  2  3  4
S1    X
S2       X
S3          X
S4             X
6
5 tasks on 4 stages
Time  1  2  3  4  5  6  7  8
S1    X  X  X  X  X
S2       X  X  X  X  X
S3          X  X  X  X  X
S4             X  X  X  X  X
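The space-time diagram for this slide can be generated mechanically. The sketch below assumes one new task enters the pipeline every cycle and that each task spends exactly one cycle per stage; the function name `space_time` is made up for illustration.

```python
def space_time(num_stages, num_tasks):
    """Rows (one per stage) of a space-time diagram for a linear pipeline.

    Task t (0-based) occupies stage s (0-based) in cycle s + t.
    """
    total_cycles = num_stages + num_tasks - 1
    rows = []
    for s in range(num_stages):
        row = ["."] * total_cycles
        for t in range(num_tasks):
            row[s + t] = "X"
        rows.append(row)
    return rows


rows = space_time(num_stages=4, num_tasks=5)
for s, row in enumerate(rows, start=1):
    print(f"S{s}", " ".join(row))
print("total cycles:", len(rows[0]))  # 4 + 5 - 1 = 8
```

Under these assumptions, n tasks on k stages finish in k + n - 1 cycles, compared with k * n cycles without overlap.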
7
Non Linear Pipelines
  • Variable functions
  • Feed-Forward
  • Feedback

8
3 stages 2 functions
[Figure: three-stage pipeline (S1, S2, S3) with two function outputs, X and Y]
9
Reservation Tables for X and Y
Function X
S1: X X X
S2: X X
S3: X X X
Function Y
S1: Y Y
S2: Y
S3: Y Y Y
10
Linear Instruction Pipelines
  • Assume the following instruction execution phases:
  • Fetch (F)
  • Decode (D)
  • Operand Fetch (O)
  • Execute (E)
  • Write results (W)

11
Pipeline Instruction Execution
Cycle  1   2   3   4   5   6   7
F      I1  I2  I3
D          I1  I2  I3
O              I1  I2  I3
E                  I1  I2  I3
W                      I1  I2  I3
12
Dependencies
  • Data Dependency
  • (Operand is not ready yet)
  • Instruction Dependency
  • (Branching)
  • Will that Cause a Problem?

13
Data Dependency
  • I1 -- Add R1, R2, R3
  • I2 -- Sub R4, R1, R5

Cycle  1   2   3   4   5   6
F      I1  I2
D          I1  I2
O              I1  I2
E                  I1  I2
W                      I1  I2
14
Solutions
  • STALL
  • Forwarding
  • Write and Read in one cycle
  • .
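The Add/Sub pair above can be checked for a read-after-write dependency and the cost of the STALL option estimated. The sketch assumes that the first register of each instruction is its destination, that operands are read in the O phase, and that results are written in the W phase. Forwarding is not modeled, and all names are hypothetical.

```python
STAGES = ["F", "D", "O", "E", "W"]
READ_STAGE = STAGES.index("O")   # assumption: registers are read in O
WRITE_STAGE = STAGES.index("W")  # assumption: registers are written in W


def raw_hazard(producer, consumer):
    """True if consumer reads the register that producer writes."""
    _, dest, *_ = producer
    _, _, *sources = consumer
    return dest in sources


def stalls_needed(distance, same_cycle_write_read=False):
    """Stall cycles for a consumer issued `distance` cycles after the producer.

    Cycles are counted relative to the producer's fetch cycle.
    """
    write_cycle = WRITE_STAGE
    read_cycle = distance + READ_STAGE
    earliest_read = write_cycle if same_cycle_write_read else write_cycle + 1
    return max(0, earliest_read - read_cycle)


i1 = ("Add", "R1", "R2", "R3")
i2 = ("Sub", "R4", "R1", "R5")
if raw_hazard(i1, i2):
    print(stalls_needed(1))                              # 2
    print(stalls_needed(1, same_cycle_write_read=True))  # 1
```

Under these assumptions, allowing a write and a read of the same register in one cycle saves one stall cycle; forwarding would aim to remove the rest.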

15
Instruction Dependency
  • I1 Branch o
  • I2

Cycle  1   2   3   4   5   6
F      I1  I2
D          I1  I2
O              I1  I2
E                  I1  I2
W                      I1  I2
16
Solutions
  • STALL
  • Predict Branch taken
  • Predict Branch not taken
  • .

17
Floating Point Multiplication
  • Inputs: (Mantissa1, Exponent1), (Mantissa2, Exponent2)
  • Add the two exponents → Exponent-out
  • Multiply the two mantissas
  • Normalize the mantissa and adjust the exponent
  • Round the product mantissa to a single-length mantissa; the exponent may need adjusting again
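The steps above can be sketched with integer mantissas. This is an illustrative model, not the format used in the course: it assumes an 8-bit normalized mantissa with the top bit set (representing a value in [1, 2)), an unbiased integer exponent, positive operands only, and round-to-nearest with ties rounded up.

```python
P = 8  # mantissa width in bits (an assumption chosen to keep numbers small)


def to_float(mantissa, exponent):
    """Value of a (mantissa, exponent) pair: mantissa / 2**(P-1) * 2**exponent."""
    return mantissa / (1 << (P - 1)) * 2.0 ** exponent


def fp_multiply(a, b):
    (m1, e1), (m2, e2) = a, b
    exponent = e1 + e2               # add the two exponents
    product = m1 * m2                # double-length mantissa product
    # Normalize: the product of two mantissas in [1, 2) lies in [1, 4).
    if product >= 1 << (2 * P - 1):  # product in [2, 4): shift one extra place
        shift = P
        exponent += 1
    else:                            # product in [1, 2)
        shift = P - 1
    # Round back to a single-length (P-bit) mantissa.
    mantissa = (product + (1 << (shift - 1))) >> shift
    if mantissa == 1 << P:           # rounding overflowed: renormalize
        mantissa >>= 1
        exponent += 1
    return mantissa, exponent


result = fp_multiply((192, 0), (160, 1))  # 1.5 * 2.5
print(result, to_float(*result))          # (240, 1) 3.75
```

The final `if` corresponds to the remark that the exponent may need adjusting after rounding: multiplying (181, 0) by itself rounds up to a mantissa of 256, which is renormalized to (128, 1).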

18
Linear Pipeline for floating-point multiplication
[Figure: four-step pipeline (Add Exponents, Multiply Mantissa, Normalize, Round), refined into the stages Add Exponents, Partial Products, Accumulator, Normalize, Round, Renormalize]
19
Linear Pipeline for floating-point Addition
[Figure: pipeline stages Subtract Exponents, Partial Shift, Add Mantissa, Find Leading 1, Partial Shift, Round, Renormalize]
20
Combined Adder and Multiplier
[Figure: combined pipeline with stages labeled A to H; block labels: Exponents Subtract / Add, Partial Products, Partial Shift, Add Mantissa, Find Leading 1, Partial Shift, Round, Renormalize]
21
Reservation Table for Multiply
1 2 3 4 5 6 7
A X
B X X
C X X
D X X
E X
F
G
H
22
Reservation Table for Addition
1 2 3 4 5 6 7 8 9
A Y
B
C Y
D Y
E Y
F Y Y
G Y
H Y Y
23
Nonlinear Pipeline Design
  • Latency
  • The number of clock cycles between two
    initiations of a pipeline
  • Collision
  • Resource Conflict
  • Forbidden Latencies
  • Latencies that cause collisions
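A collision check can be sketched by overlaying two initiations of the same reservation table. The two-stage table below is hypothetical (it is not one of the tables in this deck), and `collides` is a made-up helper name.

```python
def collides(table, latency):
    """True if a task started `latency` cycles after another needs a stage in the same cycle.

    `table` maps a stage name to the list of cycles in which one task uses that stage.
    """
    for cycles in table.values():
        first = set(cycles)
        second = {c + latency for c in cycles}  # same usage pattern, shifted in time
        if first & second:
            return True
    return False


table = {"S1": [1, 4], "S2": [2, 3]}  # hypothetical reservation table
forbidden = [k for k in range(1, 4) if collides(table, k)]
print(forbidden)  # [1, 3]
```

Latency 2 is permissible for this table because neither stage is needed by both tasks in the same cycle.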

24
Nonlinear Pipeline Design cont
  • Latency Sequence
  • A sequence of permissible latencies between
    successive task initiations
  • Latency Cycle
  • A sequence that repeats the same subsequence
  • Collision vector
  • C = (Cm, Cm-1, ..., C2, C1), m ≤ n-1
  • n = number of columns in the reservation table
  • Ci = 1 if latency i causes a collision, 0 otherwise
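The definitions above translate into a short computation: the forbidden latencies are the distances between any two marks in the same row of the reservation table, and the collision vector has a 1 in position i for each forbidden latency i. The column positions used below are an assumption (the transcript does not preserve them); they are chosen so that the result agrees with the forbidden latencies 2, 4, 5, 7 quoted for function X later in the deck.

```python
from itertools import combinations


def forbidden_latencies(table):
    """Set of latencies that make two initiations need one stage in the same cycle."""
    forbidden = set()
    for cycles in table.values():
        for a, b in combinations(cycles, 2):
            forbidden.add(abs(a - b))
    return forbidden


def collision_vector(forbidden):
    """Bit string (Cm ... C1), where m is the largest forbidden latency."""
    m = max(forbidden, default=0)
    return "".join("1" if i in forbidden else "0" for i in range(m, 0, -1))


# Assumed column positions for function X (stage -> cycles used).
table_x = {"S1": [1, 6, 8], "S2": [2, 4], "S3": [3, 5, 7]}
latencies = forbidden_latencies(table_x)
print(sorted(latencies))            # [2, 4, 5, 7]
print(collision_vector(latencies))  # 1011010
```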

25
Mul Mul Collision (launch after 1 cycle)
1 2 3 4 5 6 7
A X Z
B X X Z Z
C X X Z Z
D X Z X
E X Z
F
G
H
26
Mul Mul Collision (launch after 2 cycles)
1 2 3 4 5 6 7
A X Z
B X X Z Z
C X X Z Z
D X X Z
E X
F
G
H
27
Mul Mul Collision (launch after 3 cycles)
1 2 3 4 5 6 7
A X Z
B X X Z Z
C X X Z Z
D X X
E X
F
G
H
28
Collision Vector for Multiply after Multiply
  • Forbidden Latencies: 1, 2
  • Collision vector
  • 0 0 0 0 1 1 → 11
  • Maximum forbidden latency = 2 → m = 2

29
Example
[Figure: three-stage pipeline (S1, S2, S3) with two function outputs, X and Y]
30
Reservation Tables for X and Y
Function X
S1: X X X
S2: X X
S3: X X X
Function Y
S1: Y Y
S2: Y
S3: Y Y Y
32
Forbidden Latencies
  • X after X
  • X after Y
  • Y after X
  • Y after Y

33
X after X
Latency 2
S1: X1 X2 X1 X2 X1
S2: X1 X2 X1 X2
S3: X1 X2 X1 X2 X1
Latency 5
S1: X1 X2 X1 X1
S2: X1 X1 X2
S3: X1 X1 X1 X2
34
X after X
Latency 4
S1: X1 X2 X1 X1
S2: X1 X1 X2 X2
S3: X1 X1 X2 X1
Latency 7
S1: X1 X1 X2 X1
S2: X1 X1
S3: X1 X1 X1
35
Collision Vector
  • Forbidden Latencies: 2, 4, 5, 7
  • Collision Vector
  • 1 0 1 1 0 1 0

36
Y after Y
First overlay
S1: Y Y Y
S2: Y Y
S3: Y Y Y Y Y
Second overlay
S1: Y Y Y
S2: Y
S3: Y Y Y Y
37
Collision Vector
  • Forbidden Latencies: 2, 4
  • Collision Vector
  • 1 0 1 0

38
Exercise: Find the collision vector
1 2 3 4 5 6 7
A X X X
B X X
C X X
D X
39
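State Diagram for X
  • Initial state 1 0 1 1 0 1 0: latency 1 → 1 1 1 1 1 1 1; latency 3 or 6 → 1 0 1 1 0 1 1; latency 8 → 1 0 1 1 0 1 0
  • State 1 0 1 1 0 1 1: latency 3 or 6 → 1 0 1 1 0 1 1; latency 8 → 1 0 1 1 0 1 0
  • State 1 1 1 1 1 1 1: latency 8 → 1 0 1 1 0 1 0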
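The state diagram for X can be derived from the collision vector 1 0 1 1 0 1 0 alone. The sketch below uses the usual construction: a latency k is permissible when bit Ck of the current state is 0, and the next state is the current state shifted right by k positions and ORed with the initial collision vector. Representing "any latency greater than m" by the single value m + 1 is a simplification made here.

```python
def state_diagram(initial):
    """Map each reachable state to {latency: next_state}.

    `initial` is the collision vector as a bit string, leftmost bit = Cm.
    States are stored as integers whose least significant bit is C1.
    """
    m = len(initial)
    init = int(initial, 2)
    states = {}
    pending = [init]
    while pending:
        state = pending.pop()
        if state in states:
            continue
        edges = {}
        for latency in range(1, m + 1):
            if not (state >> (latency - 1)) & 1:  # C_latency == 0: permissible
                edges[latency] = (state >> latency) | init
        edges[m + 1] = init  # any latency above m leads back to the initial state
        states[state] = edges
        pending.extend(edges.values())
    return states


diagram = state_diagram("1011010")
for state, edges in sorted(diagram.items()):
    targets = {k: format(v, "07b") for k, v in edges.items()}
    print(format(state, "07b"), targets)
```

For this collision vector the construction yields three states, and the edge labels 1, 3, 6, and 8 match those on the slide.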
40
Cycles
  • Simple cycles → each state appears only once
  • (3), (6), (8), (1, 8), (3, 8), and (6, 8)
  • Greedy cycles → simple cycles whose edges are all made with minimum latencies from their respective starting states
  • (1, 8), (3) → one of them gives the minimum average latency (MAL): (3) averages 3 cycles, (1, 8) averages 4.5
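The greedy cycles can be found by starting from every reachable state and always taking the smallest permissible latency until a state repeats. The sketch below does this for the collision vector 1 0 1 1 0 1 0 and reports the smallest average latency among the greedy cycles; treating that value as the MAL follows the slide and is not a general guarantee. Helper names are hypothetical.

```python
from fractions import Fraction


def permissible(state, m):
    """Latencies 1..m whose collision bit is 0, plus m + 1 for 'any latency > m'."""
    return [k for k in range(1, m + 1) if not (state >> (k - 1)) & 1] + [m + 1]


def next_state(state, latency, init):
    return (state >> latency) | init


def reachable_states(init, m):
    seen, pending = set(), [init]
    while pending:
        s = pending.pop()
        if s not in seen:
            seen.add(s)
            pending.extend(next_state(s, k, init) for k in permissible(s, m))
    return seen


def greedy_cycle(start, init, m):
    """Follow minimum-latency edges from `start`; return the latencies of the cycle reached."""
    path, latencies, state = [], [], start
    while state not in path:
        path.append(state)
        k = min(permissible(state, m))
        latencies.append(k)
        state = next_state(state, k, init)
    cycle = tuple(latencies[path.index(state):])
    # Rotate to a canonical form so (8, 1) and (1, 8) count as the same cycle.
    return min(cycle[i:] + cycle[:i] for i in range(len(cycle)))


vector = "1011010"
m, init = len(vector), int(vector, 2)
cycles = {greedy_cycle(s, init, m) for s in reachable_states(init, m)}
best = min(cycles, key=lambda c: Fraction(sum(c), len(c)))
print(sorted(cycles))                     # [(1, 8), (3,)]
print(best, Fraction(sum(best), len(best)))
```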