Title: CS 2200


1
CS 2200
  • Presentation 7b
  • Pipelining Hazards

2
  • R0 is always 0
  • R1 Loop counter
  • R2 Loop termination value
  • R3 contains address of array A
  • R4 contains address of array B
  • add R1, R0, R0 ; Initialize loop counter
  • addi R2, R0, 10 ; Termination value
  • loop beq R1, R2, done ; Are we done?
  • lw R5, 0(R3) ; Get A[i]
  • addi R5, R5, 7 ; Add 7
  • sw R5, 0(R4) ; Store in B[i]
  • addi R3, R3, 1 ; Increment A address
  • addi R4, R4, 1 ; Increment B address
  • addi R1, R1, 1 ; Increment counter
  • beq R0, R0, loop ; Go back and do it again
  • done halt
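
For reference, the loop on this slide corresponds roughly to the following Python sketch. The function name `add_seven` and the use of Python lists for arrays A and B are illustrative assumptions, not part of the slides.

```python
def add_seven(a, count=10):
    """Mirror the assembly loop: B[i] = A[i] + 7 for `count` elements."""
    b = []
    i = 0                      # R1: loop counter, initialized from R0 (always 0)
    while i != count:          # beq R1, R2, done
        b.append(a[i] + 7)     # lw, addi, sw
        i += 1                 # addi R1, R1, 1
    return b


print(add_seven(list(range(10))))
```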

3
[Figure: five-stage pipeline datapath. Stages: IF, ID, EX, MEM, WB. Labeled components: PC, Instr Mem, DPRF, SE, A, D, Data Mem, and three MX blocks.]
4
[Figure: the same pipeline datapath (IF, ID, EX, MEM, WB) with beq in flight and "???" marking the instruction that follows it.]
5
Solution
  • Branch Delay Slot

6
  • R0 is always 0
  • R1 Loop counter
  • R2 Loop termination value
  • R3 contains address of array A
  • R4 contains address of array B
  • add R1, R0, R0
  • addi R2, R0, 10
  • loop beq R1, R2, done
  • nop ; Branch delay slot
  • lw R5, 0(R3)
  • addi R5, R5, 7
  • sw R5, 0(R4)
  • addi R3, R3, 1
  • addi R4, R4, 1
  • addi R1, R1, 1
  • beq R0, R0, loop
  • nop ; Branch delay slot
  • done halt

7
  • R0 is always 0
  • R1 Loop counter
  • R2 Loop termination value
  • R3 contains address of array A
  • R4 contains address of array B
  • add R1, R0, R0
  • addi R2, R0, 10
  • loop beq R1, R2, done
  • nop ; Delay slot
  • lw R5, 0(R3)
  • addi R5, R5, 7 ; Dependency
  • sw R5, 0(R4)
  • addi R3, R3, 1
  • addi R4, R4, 1
  • addi R1, R1, 1
  • beq R0, R0, loop
  • nop ; Delay slot
  • done halt

8
[Figure: pipeline datapath (IF, ID, EX, MEM, WB) with beq, addi, and add in flight. Labels: opc, R1, R2, R1 value, PC, Instr Mem, DPRF, SE, A, D, Data Mem, MX.]
9
Extreme Closeup
[Figure: close-up of the DPRF and a MUX. A comparator checks register numbers Rx and R1 (i.e. are they equal?); the MUX inputs are the Rx value and the R1 value from MEM.]
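
The comparator-and-mux arrangement named on this slide can be sketched in a few lines of Python. This is one illustrative reading of the figure, not the course's implementation; the function name `select_operand` and its parameters are assumptions introduced here.

```python
def select_operand(src_reg, regfile_value, fwd_reg, fwd_value):
    """Pick the operand for a register read.

    src_reg       -- register number the current instruction reads (Rx)
    regfile_value -- value the register file (DPRF) returns for src_reg
    fwd_reg       -- destination register of an older, unfinished instruction,
                     or None if there is none
    fwd_value     -- result that older instruction has already computed
    """
    # Comparator: are the two register numbers equal?
    # R0 is always 0, so it is never forwarded.
    if fwd_reg is not None and src_reg != 0 and src_reg == fwd_reg:
        return fwd_value      # mux selects the forwarded value
    return regfile_value      # mux selects the register file output


# beq R1, R2, done shortly after add R1, R0, R0: R1 is not written back yet,
# so the stale register file value (99 here) must be bypassed.
print(select_operand(src_reg=1, regfile_value=99, fwd_reg=1, fwd_value=0))
```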
10
[Figure: pipeline datapath (IF, ID, EX, MEM, WB) with beq, addi, and add in flight. Labels: opc, R1, R2, R1 value, R2 value, PC, Instr Mem, DPRF, SE, A, D, Data Mem, MX.]
11
  • R0 is always 0
  • R1 Loop counter
  • R2 Loop termination value
  • R3 contains address of array A
  • R4 contains address of array B
  • add R1, R0, R0
  • addi R2, R0, 10
  • loop beq R1, R2, done
  • nop ; Delay slot
  • lw R5, 0(R3)
  • addi R5, R5, 7 ; Dependency
  • sw R5, 0(R4)
  • addi R3, R3, 1
  • addi R4, R4, 1
  • addi R1, R1, 1
  • beq R0, R0, loop
  • nop ; Delay slot
  • done halt

12
[Figure: pipeline datapath (IF, ID, EX, MEM, WB) with addi and lw in flight and "???" on the R5 operand. Labels: R5, addr, 7, PC, Instr Mem, DPRF, SE, A, D, Data Mem, MX.]
13
[Figure: pipeline datapath (IF, ID, EX, MEM, WB) with addi and lw in flight. Labels: old R5 value, R5 value, R5, addr, 7, PC, Instr Mem, DPRF, SE, A, D, Data Mem, MX.]
14
[Figure: pipeline datapath (IF, ID, EX, MEM, WB) with addi and lw in flight, marked "???" and "STALL!". Labels: R5, addr, 7, PC, Instr Mem, DPRF, SE, A, D, Data Mem, MX.]
15
[Figure: pipeline datapath (IF, ID, EX, MEM, WB) with addi, a bubble, and lw in flight. Labels: R5, R5 value, addr, 7, PC, Instr Mem, DPRF, SE, A, D, Data Mem, MX.]
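
The stall decision illustrated on these slides can be written as a small predicate. This is a minimal sketch, assuming ALU results are forwarded so that only a load followed immediately by a consumer needs a bubble; `must_stall` and its arguments are hypothetical names introduced for illustration.

```python
def must_stall(ex_is_load, ex_dest, id_sources):
    """Return True if the instruction in ID must wait one cycle.

    ex_is_load -- True if the instruction currently in EX is a lw
    ex_dest    -- destination register number of the instruction in EX
    id_sources -- register numbers read by the instruction in ID
    """
    # R0 is always 0, so a "write" to it never creates a dependency.
    return ex_is_load and ex_dest != 0 and ex_dest in id_sources


# lw R5, 0(R3) in EX while addi R5, R5, 7 is in ID: insert a bubble.
print(must_stall(True, 5, [5]))
# lw R5, 0(R3) in EX while addi R3, R3, 1 is in ID: no bubble needed.
print(must_stall(True, 5, [3]))
```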
16
  • R0 is always 0
  • R1 Loop counter
  • R2 Loop termination value
  • R3 contains address of array A
  • R4 contains address of array B
  • add R1, R0, R0
  • addi R2, R0, 10
  • loop beq R1, R2, done
  • nop ; Delay slot
  • lw R5, 0(R3)
  • addi R5, R5, 7 ; Dependency
  • sw R5, 0(R4) ; Dependency
  • addi R3, R3, 1
  • addi R4, R4, 1
  • addi R1, R1, 1
  • beq R0, R0, loop
  • nop ; Delay slot
  • done halt

17
[Figure: pipeline datapath (IF, ID, EX, MEM, WB) with sw and addi in flight. Labels: R5 value, R5, addr, PC, Instr Mem, DPRF, SE, A, D, Data Mem, MX.]
18
  • R0 is always 0
  • R1 Loop counter
  • R2 Loop termination value
  • R3 contains address of array A
  • R4 contains address of array B
  • add R1, R0, R0
  • addi R2, R0, 10
  • loop beq R1, R2, done
  • nop ; Delay slot
  • lw R5, 0(R3)
  • addi R5, R5, 7 ; Dependency (delay)
  • sw R5, 0(R4)
  • addi R3, R3, 1
  • addi R4, R4, 1
  • addi R1, R1, 1
  • beq R0, R0, loop
  • nop ; Delay slot
  • done halt

19
  • R0 is always 0
  • R1 Loop counter
  • R2 Loop termination value
  • R3 contains address of array A
  • R4 contains address of array B
  • add R1, R0, R0
  • addi R2, R0, 10
  • loop beq R1, R2, done
  • nop ; Delay slot
  • lw R5, 0(R3)
  • addi R5, R5, 7 ; Dependency (delay)
  • sw R5, 0(R4)
  • addi R3, R3, 1
  • addi R4, R4, 1
  • addi R1, R1, 1
  • beq R0, R0, loop
  • nop ; Delay slot
  • done halt

20
  • R0 is always 0
  • R1 Loop counter
  • R2 Loop termination value
  • R3 contains address of array A
  • R4 contains address of array B
  • add R1, R0, R0
  • addi R2, R0, 10
  • loop beq R1, R2, done
  • lw R5, 0(R3) ; Delay slot
  • addi R5, R5, 7 ; Dependency (delay)
  • sw R5, 0(R4)
  • addi R3, R3, 1
  • addi R4, R4, 1
  • addi R1, R1, 1
  • beq R0, R0, loop
  • nop ; Delay slot
  • done halt

21
  • R0 is always 0
  • R1 Loop counter
  • R2 Loop termination value
  • R3 contains address of array A
  • R4 contains address of array B
  • add R1, R0, R0
  • addi R2, R0, 10
  • loop beq R1, R2, done
  • lw R5, 0(R3) ; Delay slot
  • addi R5, R5, 7 ; Dependency (delay)
  • sw R5, 0(R4)
  • addi R3, R3, 1
  • addi R1, R1, 1
  • beq R0, R0, loop
  • addi R4, R4, 1 ; Delay slot
  • done halt

22
  • Optimized
  • R0 is always 0
  • R1 Loop counter
  • R2 Loop termination value
  • R3 contains address of array A
  • R4 contains address of array B
  • add R1, R0, R0
  • addi R2, R0, 10
  • loop beq R1, R2, done
  • lw R5, 0(R3) ; Delay slot
  • addi R3, R3, 1 ; No dependency
  • addi R5, R5, 7
  • sw R5, 0(R4)
  • addi R1, R1, 1
  • beq R0, R0, loop
  • addi R4, R4, 1 ; Delay slot
  • done halt
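
One way to check that filling the delay slots preserves the program's behavior is a small functional simulator in which the instruction after every beq always executes. The sketch below is an assumption-laden illustration: the tuple encoding, the `run` function, the array addresses 100 and 200, and the rule that unmapped memory reads return 0 are all invented here. It counts executed instructions, not pipeline cycles, and does not model data hazards.

```python
def run(program, labels, regs, mem, max_steps=10_000):
    """Execute a tiny LC-2200-like program where each beq has one delay slot."""
    regs, mem = list(regs), dict(mem)
    pc, pending, executed = 0, None, 0
    while executed < max_steps:
        op, *args = program[pc]
        executed += 1
        next_pc = pc + 1
        if pending is not None:          # this instruction sits in a delay slot
            next_pc, pending = pending, None
        if op == "halt":
            return mem, executed
        if op == "add":
            d, s, t = args
            regs[d] = regs[s] + regs[t]
        elif op == "addi":
            d, s, imm = args
            regs[d] = regs[s] + imm
        elif op == "lw":
            d, off, base = args
            regs[d] = mem.get(regs[base] + off, 0)   # unmapped reads give 0
        elif op == "sw":
            s, off, base = args
            mem[regs[base] + off] = regs[s]
        elif op == "beq":
            x, y, label = args
            if regs[x] == regs[y]:
                pending = labels[label]  # takes effect after the delay slot
        # "nop" does nothing
        regs[0] = 0                      # R0 is always 0
        pc = next_pc
    raise RuntimeError("step limit exceeded")


# Version with nops in both delay slots.
WITH_NOPS = [
    ("add", 1, 0, 0), ("addi", 2, 0, 10),
    ("beq", 1, 2, "done"), ("nop",),                 # loop
    ("lw", 5, 0, 3), ("addi", 5, 5, 7), ("sw", 5, 0, 4),
    ("addi", 3, 3, 1), ("addi", 4, 4, 1), ("addi", 1, 1, 1),
    ("beq", 0, 0, "loop"), ("nop",),
    ("halt",),                                       # done
]
NOP_LABELS = {"loop": 2, "done": 12}

# Optimized version with both delay slots filled.
OPTIMIZED = [
    ("add", 1, 0, 0), ("addi", 2, 0, 10),
    ("beq", 1, 2, "done"), ("lw", 5, 0, 3),          # loop, delay slot
    ("addi", 3, 3, 1), ("addi", 5, 5, 7), ("sw", 5, 0, 4),
    ("addi", 1, 1, 1),
    ("beq", 0, 0, "loop"), ("addi", 4, 4, 1),        # delay slot
    ("halt",),                                       # done
]
OPT_LABELS = {"loop": 2, "done": 10}

A_BASE, B_BASE = 100, 200                            # hypothetical addresses
init_regs = [0, 0, 0, A_BASE, B_BASE, 0, 0, 0]
init_mem = {A_BASE + i: i for i in range(10)}

mem_nops, count_nops = run(WITH_NOPS, NOP_LABELS, init_regs, init_mem)
mem_opt, count_opt = run(OPTIMIZED, OPT_LABELS, init_regs, init_mem)
print(mem_nops == mem_opt, count_nops, count_opt)
```

Under these assumptions both versions leave the same values in B, and the optimized version executes fewer instructions because each loop iteration no longer spends two slots on nops.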

23
With LC-2200 and a 5-stage pipeline with no forwarding, squashing, or flushing
  • RAW hazard
    • 3 bubbles in pipeline
  • BEQ
    • 2 bubbles if branch not taken
    • 3 bubbles if branch taken
  • RAW with forwarding
    • 0 bubbles if ALU instructions
    • 1 bubble if load followed by ALU
  • With branch determination in IDRR stage
    • 1 bubble if branch not taken
    • 2 bubbles if branch taken
    • This can be reduced to 0 bubbles if the branch is not taken, provided the result of the comparison is not buffered through the Z register
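
The RAW figures on this slide can be turned into a rough bubble counter for a straight-line instruction sequence. This is a sketch under stated assumptions, not the course's model: the `(dest, sources, is_load)` encoding and the function `count_raw_bubbles` are invented here, branch bubbles are ignored, and each independent instruction placed between a producer and its consumer is assumed to hide one bubble.

```python
def count_raw_bubbles(instrs, forwarding):
    """Count RAW bubbles for a list of (dest, sources, is_load) tuples.

    Penalties for back-to-back dependent instructions follow the slide:
    3 bubbles without forwarding; with forwarding, 0 after an ALU
    instruction and 1 after a load.
    """
    ready = {}    # register -> earliest issue slot for an instruction reading it
    slot = 0      # next free issue slot
    bubbles = 0
    for dest, sources, is_load in instrs:
        # R0 is always 0 and never causes a dependency.
        needed = [ready.get(r, 0) for r in sources if r != 0]
        issue = max([slot] + needed)
        bubbles += issue - slot
        if dest not in (None, 0):
            if forwarding:
                penalty = 1 if is_load else 0
            else:
                penalty = 3
            ready[dest] = issue + 1 + penalty
        slot = issue + 1
    return bubbles


# Loop body in its original order: lw, addi R5, sw, addi R3, addi R4, addi R1.
original = [
    (5, [3], True), (5, [5], False), (None, [5, 4], False),
    (3, [3], False), (4, [4], False), (1, [1], False),
]
# Same body with addi R3 moved between the load and its consumer.
reordered = [
    (5, [3], True), (3, [3], False), (5, [5], False),
    (None, [5, 4], False), (4, [4], False), (1, [1], False),
]

print(count_raw_bubbles(original, forwarding=False))
print(count_raw_bubbles(original, forwarding=True))
print(count_raw_bubbles(reordered, forwarding=True))
```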

24
Questions?