CS 4290/6290 Lecture 07: Out-of-order execution, Out-of-order completion (a.k.a. the cool stuff)

Slides: 70

Provided by: michaelt8


1
CS 4290/6290 Lecture 07: Out-of-order execution, out-of-order completion (a.k.a. the cool stuff)

(Lectures based on the work of Jay Brockman,
Sharon Hu, Randy Katz, Peter Kogge, Bill Leahy,
Ken MacKenzie, Richard Murphy, Michael Niemier,
and Milos Prvulovic)

2
Scheduling

Finds instructions to execute in each cycle
Static (in-order) scheduling: looks only at the next instruction
Dynamic (out-of-order) scheduling: looks at a window of instructions
How many instructions are we looking for?
3-4 is typical today, 8 is in the works
A CPU that can ideally do N instrs per cycle is called N-way superscalar, N-issue superscalar, or simply N-way or N-issue.

3
Static Scheduling

Cycle 1
Start I1.
Can we also start I2? No.
Cycle 2
Start I2.
Can we also start I3? Yes.
Can we also start I4? No.
If the next instruction can not start, the processor stops looking for things to do in this cycle!

Program code
I1 ADD R1, R2, R3
I2 SUB R4, R1, R5
I3 AND R6, R1, R7
I4 OR R8, R2, R6
I5 XOR R10, R2, R11
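
The in-order rule above can be sketched in a few lines of Python. This is only an illustration under assumptions the slide does not state: every instruction has a one-cycle latency, a result is usable in the cycle after its producer starts, and the issue width (`width`) is 4. The names `PROGRAM` and `static_schedule` are made up for this sketch.

```python
# Each entry: (name, destination register, source registers)
PROGRAM = [
    ("I1", "R1", ("R2", "R3")),    # ADD R1, R2, R3
    ("I2", "R4", ("R1", "R5")),    # SUB R4, R1, R5
    ("I3", "R6", ("R1", "R7")),    # AND R6, R1, R7
    ("I4", "R8", ("R2", "R6")),    # OR  R8, R2, R6
    ("I5", "R10", ("R2", "R11")),  # XOR R10, R2, R11
]

def static_schedule(program, width=4):
    """Return, for each cycle, the names of the instructions started in it."""
    usable_at = {}              # register -> first cycle its new value can be read
    cycles, next_idx, cycle = [], 0, 1
    while next_idx < len(program):
        started = []
        while next_idx < len(program) and len(started) < width:
            name, dest, srcs = program[next_idx]
            if any(usable_at.get(s, 0) > cycle for s in srcs):
                break           # in-order: stop at the first instruction that can not start
            usable_at[dest] = cycle + 1   # assumed one-cycle latency
            started.append(name)
            next_idx += 1
        cycles.append(started)
        cycle += 1
    return cycles

print(static_schedule(PROGRAM))
```

Under these assumptions the first two cycles match the slide: I1 alone, then I2 and I3 together, with I4 blocked behind I3.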
4
Dynamic Scheduling

Cycle 1
Operands ready? I1, I5.
Start I1, I5.
Cycle 2
Operands ready? I2, I3.
Start I2,I3.
Window size (W): how many instructions ahead we look.
Do not confuse with issue width (N).
E.g. a 4-issue out-of-order processor can have a
128-entry window (it can look at the next 128
instructions).

Program code
I1 ADD R1, R2, R3
I2 SUB R4, R1, R5
I3 AND R6, R1, R7
I4 OR R8, R2, R6
I5 XOR R10, R2, R11
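
For comparison, here is a hedged sketch of out-of-order selection on the same program. It assumes one-cycle latencies, an issue width of 4, and that only true data dependences matter (this program happens to have no name dependences). The window is approximated as the oldest `window` instructions that have not started; all names are illustrative.

```python
PROGRAM = [
    ("I1", "R1", ("R2", "R3")),    # ADD R1, R2, R3
    ("I2", "R4", ("R1", "R5")),    # SUB R4, R1, R5
    ("I3", "R6", ("R1", "R7")),    # AND R6, R1, R7
    ("I4", "R8", ("R2", "R6")),    # OR  R8, R2, R6
    ("I5", "R10", ("R2", "R11")),  # XOR R10, R2, R11
]

def dynamic_schedule(program, width=4, window=128):
    """Return, for each cycle, the names of the instructions started in it."""
    # producers[i]: indices of the earlier instructions whose results instruction i reads
    last_writer, producers = {}, []
    for i, (_, dest, srcs) in enumerate(program):
        producers.append([last_writer[s] for s in srcs if s in last_writer])
        last_writer[dest] = i

    start = {}                  # instruction index -> cycle in which it started
    cycles, cycle = [], 1
    while len(start) < len(program):
        candidates = [i for i in range(len(program)) if i not in start][:window]
        started = []
        for i in candidates:
            if len(started) == width:
                break
            # Ready only if every producer started in an earlier cycle
            if all(p in start and start[p] < cycle for p in producers[i]):
                start[i] = cycle
                started.append(program[i][0])
        cycles.append(started)
        cycle += 1
    return cycles

print(dynamic_schedule(PROGRAM))
```

With a window of one instruction this degenerates to strictly sequential issue, which is one way to see that W and N are independent parameters.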
5
Dynamic Scheduling Pipeline

Fetch: gets the next few instructions (reads the instruction stream in-order)
Decode: decodes the instructions fetched in the previous cycle (in-order)
Then we can start looking at instructions and try to execute them out of order.
Important: we fetch and decode in-order even in an out-of-order processor.

6
Register Renaming

Name dependences
I3 can not go before I2 because I3 will overwrite R5, which I2 still has to read
I5 can not go before I2 because I2, when it goes, will overwrite R2 with a stale value
These are called name dependences because the dependence comes from the register name, not from the flow of data.

Program code
I1 ADD R1, R2, R3
I2 SUB R2, R1, R5
I3 AND R5, R11, R7
I4 OR R8, R6, R2
I5 XOR R2, R4, R11
7
Register Renaming

Solution: give I3 some other name (e.g. S) for the value it produces.
But any later instruction that uses that value must also be changed to use S
In fact, all uses of R5 from I3 to the next instruction that writes to R5 again must now be changed to S!
We get rid of output dependences in the same way: change R2 in I5 (and subsequent instrs) to T.

Program code
I1 ADD R1, R2, R3
I2 SUB R2, R1, R5
I3 AND R5, R11, R7
I4 OR R8, R6, R2
I5 XOR R2, R4, R11
8
Register Renaming

Implementation
Space for T, S, etc.
How do we know when to rename a register?
Simple Solution
Do renaming in-order, just after decoding
Change the name of a register each time we decode an instruction that will write to it.
Remember what name we gave it

Program code
I1 ADD R1, R2, R3
I2 SUB R2, R1, R5
I3 AND S, R11, R7
I4 OR R8, R6, R2
I5 XOR T, R4, R11
9
Register Renaming Example
Renaming table
Original   Renamed
R1         T1        (destination of I1)
R2         R2        (source of I1)
R5         R5
R8         R8

Decoded                Renamed
I1 ADD R1, R2, R3      I1 ADD T1, R2, R3
10
Register Renaming Example
Renaming table
Original   Renamed
R1         T1        (source of I2)
R2         T2        (destination of I2)
R5         R5        (source of I2)
R8         R8

Decoded                Renamed
I1 ADD R1, R2, R3      I1 ADD T1, R2, R3
I2 SUB R2, R1, R5      I2 SUB T2, T1, R5
11
Register Renaming Example
Renaming table
Original   Renamed
R1         T1
R2         T2
R5         T3        (destination of I3)
R8         R8
(The sources of I3, R11 and R7, have not been renamed.)

Decoded                Renamed
I1 ADD R1, R2, R3      I1 ADD T1, R2, R3
I2 SUB R2, R1, R5      I2 SUB T2, T1, R5
I3 AND R5, R11, R7     I3 AND T3, R11, R7
12
Register Renaming Example
Renaming table
Original   Renamed
R1         T1
R2         T2
R5         T3
R8         T4

Decoded                Renamed
I1 ADD R1, R2, R3      I1 ADD T1, R2, R3
I2 SUB R2, R1, R5      I2 SUB T2, T1, R5
I3 AND R5, R11, R7     I3 AND T3, R11, R7
I4 OR R8, R6, R2       I4 OR T4, R6, T2
13
Register Renaming Example
Renaming table
Original   Renamed
R1         T1
R2         T5
R5         T3
R8         T4

Decoded                Renamed
I1 ADD R1, R2, R3      I1 ADD T1, R2, R3
I2 SUB R2, R1, R5      I2 SUB T2, T1, R5
I3 AND R5, R11, R7     I3 AND T3, R11, R7
I4 OR R8, R6, R2       I4 OR T4, R6, T2
I5 XOR R2, R4, R11     I5 XOR T5, R4, R11
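
The renaming walk-through in this example can be reproduced with a small table-driven loop. This is a minimal sketch, not how hardware does it: the names `rename` and `PROGRAM` are made up here, and it assumes an unlimited supply of temporary names T1, T2, and so on.

```python
PROGRAM = [
    ("ADD", "R1", "R2", "R3"),
    ("SUB", "R2", "R1", "R5"),
    ("AND", "R5", "R11", "R7"),
    ("OR",  "R8", "R6", "R2"),
    ("XOR", "R2", "R4", "R11"),
]

def rename(program):
    """Rename in program order; return (renamed instructions, final renaming table)."""
    table = {}        # original register -> most recent temporary name
    renamed = []
    for count, (op, dest, src1, src2) in enumerate(program, start=1):
        # Sources use the current mapping, or keep their name if never renamed
        new_src1 = table.get(src1, src1)
        new_src2 = table.get(src2, src2)
        # Every instruction that writes a register gets a fresh name
        table[dest] = f"T{count}"
        renamed.append((op, table[dest], new_src1, new_src2))
    return renamed, table

renamed, table = rename(PROGRAM)
for op, dest, s1, s2 in renamed:
    print(op, dest, s1, s2)
```

Note that the sources are looked up before the destination is renamed, which matters for an instruction that reads and writes the same register.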
14
Register Names

We keep using new names
Each name needs a place to keep its value
We can have only so many of those places
What happens when we run out of names?
There must be a way to recycle names
When can we recycle a name?
When we have given its value to all instructions that use it as a source operand!
This is not as easy as it sounds

15
Implementing Dynamic Scheduling

Tomasulo's Algorithm
Used in the IBM 360/91 (in the 60s)
Tracks when operands are available to satisfy data dependences
Removes name dependences through register renaming
Very similar to what is used today

16
Tomasulo's Algorithm: The Picture
17
Tomasulo's Algorithm: Issue

Get next instruction from instruction queue.
Find a free reservation station for it (if none are free, stall until one is)
Read operands that are in the registers
If the operand is not in the register, find which reservation station will produce it
In effect, this step renames registers (reservation station IDs are temporary names)

18
Tomasulo's Algorithm: Execute

Monitor results as they are produced
Put a result into all reservation stations waiting for it (missing source operand)
When all operands are available for an instruction, it is ready (we can actually execute it)
Several ready instrs for one functional unit? Pick one.
Except for load/store: loads and stores must be done in the proper order to avoid hazards through memory

19
Tomasulo's Algorithm: Write Result

When result is computed, make it available on the common data bus (CDB), where waiting reservation stations can pick it up
Stores write to memory
Result stored in the register file
This step frees the reservation station
For our register renaming, this recycles the temporary name (future instructions can again find the value in the actual register, until it is renamed again)
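
The Issue and Write Result steps can be sketched as follows. This is a simplified illustration, not the IBM design: the class and field names (`Station`, `vj`, `qj`, and so on) are chosen for this sketch, the Execute step is left out (the caller supplies the computed value), and reservation stations are assumed never to run out.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Station:
    op: str
    vj: Optional[int] = None   # value of operand 1, once known
    vk: Optional[int] = None   # value of operand 2, once known
    qj: Optional[str] = None   # tag of the station that will produce operand 1
    qk: Optional[str] = None   # tag of the station that will produce operand 2
    dest: str = ""

    def ready(self):
        return self.qj is None and self.qk is None

class Tomasulo:
    def __init__(self, regs):
        self.regs = dict(regs)   # register file
        self.status = {}         # register -> tag of the station that will write it
        self.stations = {}       # tag -> Station

    def _operand(self, src):
        """Return (value, tag); exactly one of the two is None."""
        if src in self.status:
            return None, self.status[src]    # wait for the producing station
        return self.regs[src], None          # value is already in the register

    def issue(self, tag, op, dest, src1, src2):
        vj, qj = self._operand(src1)
        vk, qk = self._operand(src2)
        self.stations[tag] = Station(op, vj, vk, qj, qk, dest)
        self.status[dest] = tag              # dest is now "renamed" to this station

    def write_result(self, tag, value):
        done = self.stations.pop(tag)        # frees the reservation station
        for st in self.stations.values():    # broadcast on the common data bus
            if st.qj == tag:
                st.vj, st.qj = value, None
            if st.qk == tag:
                st.vk, st.qk = value, None
        if self.status.get(done.dest) == tag:   # only the latest writer updates the register
            self.regs[done.dest] = value
            del self.status[done.dest]

cpu = Tomasulo({"R2": 5, "R3": 7, "R5": 1})
cpu.issue("RS1", "ADD", "R1", "R2", "R3")
cpu.issue("RS2", "SUB", "R4", "R1", "R5")   # must wait for RS1
cpu.write_result("RS1", 12)                  # RS2 picks up 12 from the CDB
```

The check before updating the register file reflects the renaming idea: if a later instruction has already claimed the same destination, the older result goes only to waiting stations.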

20
Tomasulo's Algorithm: Load/Store

The reservation stations take care of dependences through registers.
Dependences are also possible through memory
Stores can not be reordered with respect to other load/store operations to the same address
Example
Can I3 execute before I2?
Not if R3 is -100! (Then 100(R1) = 100 + R2 + R3 = R2, the same address I3 loads from.)

I1 ADD R1, R2, R3
I2 ST R4, 100(R1)
I3 LD R4, (R2)
21
Tomasulo's Algorithm: Load/Store

Load
Wait for all previous stores to compute their addresses
If there is any store to the same address, wait for it to actually write to memory
Alternatively, just get the value of the last such store
Store
Wait for all previous loads and stores to compute their addresses
If there is any load/store from/to the same address, wait for it to read/write
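
The load rule above can be written as a small decision function. This is a sketch using the "get the value of the last such store" alternative, and it assumes stores reach memory in program order; the function name and the tuple layout are invented for the illustration.

```python
def load_action(load_addr, earlier_stores):
    """Decide what a load may do.

    earlier_stores: program-ordered list of (address, value, written) tuples,
    where address is None while the store has not computed it yet and
    written says whether the store has already updated memory.
    """
    if any(addr is None for addr, _, _ in earlier_stores):
        return ("wait", None)            # an earlier store address is still unknown
    same = [(value, written) for addr, value, written in earlier_stores
            if addr == load_addr]
    if not same:
        return ("read_memory", None)     # no conflict with any earlier store
    value, written = same[-1]            # the last store to this address
    if written:
        return ("read_memory", None)     # memory already holds the right value
    return ("forward", value)            # take the value straight from the store

print(load_action(100, [(200, 1, False), (100, 7, False)]))
```

A store would need the symmetric check against earlier loads as well as earlier stores, as the slide says.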

22
Tomasulo's Algorithm: Example

We need to have
Instruction status
Not part of HW, but having it makes our life easier
Reservation stations
All fields for each reservation station
Register status
Which reservation station it is renamed to

Loop: L.D    F0, 0(R1)     Load 64-bit FP value
      MUL.D  F4, F0, F2    Multiply FP
      S.D    F4, 0(R1)     Store 64-bit FP value
      DADDUI R1, R1, -8    Add (int) immediate
      BNE    R1, R2, Loop  Branch if R1 != R2
23
Branches Kill!

Branches are very frequent
Approx. 20% of all instructions
Can not wait until we know where it goes
Long pipelines
Branch outcome known after B cycles
No scheduling past the branch until outcome known
Superscalars (e.g. 4-way)
Branch every cycle or so!
One cycle of work, then bubbles for B cycles?
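
A quick back-of-the-envelope check of that last bullet: if one instruction in five is a branch, a 4-way machine sees a branch about every 1.25 cycles, so stalling B cycles per branch leaves most issue slots empty. The function below is only an estimate under those assumptions; the value B = 10 is a made-up example, not a number from the lecture.

```python
def issue_utilization(width, branch_fraction, b_cycles):
    """Fraction of cycles doing useful issue if we stall b_cycles after every branch."""
    instrs_per_branch = 1 / branch_fraction   # e.g. 5 instructions per branch
    busy_cycles = instrs_per_branch / width   # cycles needed to issue them
    return busy_cycles / (busy_cycles + b_cycles)

# 4-way superscalar, 20% branches, hypothetical B = 10
print(round(issue_utilization(4, 0.2, 10), 3))
```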

24
Surviving Branches: Prediction

Predict Branches
And predict them well!
Fetch, decode, etc. on the predicted path
Option 1: No execute until branch resolved
Option 2: Execute anyway (speculation)
Recover from mispredictions
Option 1: Restart fetch from correct path
Option 2: Arrgh! Let's look at Option 1 for now

25
Branch Prediction

Need to know two things
Whether the branch is taken or not (direction)
The target address if it is taken (target)
Direct jumps, Function calls
Direction known (always taken), target easy to
compute
Conditional Branches (typically PC-relative)
Direction difficult to predict, target easy to
compute
Indirect jumps, function returns
Direction known (always taken), target difficult

26
Branch Prediction: Direction

Needed for conditional branches
Most branches are of this type
Many, many kinds of predictors for this
Static: compiler annotation (e.g. BEQL is branch if equal likely)
Dynamic: hardware prediction
Dynamic prediction is usually history-based
Example: predict the direction is the same as the last time this branch was executed

27
One-Bit Branch Predictor
K bits of the branch instruction address form the index
Branch history table of 2^K entries, 1 bit per entry
Use this entry to predict this branch: 0 = predict not taken, 1 = predict taken
When the branch direction is resolved, go back into the table and update the entry: 0 if not taken, 1 if taken
28
The Bit Is Not Enough!

Example: short loop (8 iterations)
Taken 7 times, then not taken once
Not-taken mispredicted (was taken previously)
Execute the same loop again
First always mispredicted (previous outcome was not taken)
Then 6 predicted correctly
Then last one mispredicted again
Each fluke in a stable pattern results in two mispredicts per loop
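
The loop example can be checked with a tiny simulation of a single one-bit entry. This is a sketch for one branch only (no table indexing); the function name is made up for this illustration.

```python
def one_bit_mispredicts(outcomes, initial=True):
    """Count mispredictions of a one-bit predictor on a sequence of outcomes."""
    state, misses = initial, 0
    for taken in outcomes:
        if state != taken:
            misses += 1
        state = taken       # the bit just remembers the last outcome
    return misses

loop = [True] * 7 + [False]   # taken 7 times, then not taken once

print(one_bit_mispredicts(loop))        # first time through the loop
print(one_bit_mispredicts(loop * 10))   # ten executions of the loop
```

Starting from "taken", the first execution costs one mispredict and every later execution costs two.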

29
Two Bits are Better Than One

Two-Bit Predictor
First bit is the prediction
Second bit tells if it is strong or weak
A mispredict will
Weaken a strong prediction
Change a weak prediction to the opposite strong prediction
A correct prediction will
Strengthen a weak prediction
Leave strong predictions strong
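
The update rules above translate directly into code. The sketch below follows the rules exactly as listed (a weak mispredict jumps to the opposite strong state, which differs from a plain saturating counter) and again models a single branch; the function name and defaults are assumptions for this illustration.

```python
def two_bit_mispredicts(outcomes, prediction=True, strong=True):
    """Count mispredictions of the two-bit scheme on a sequence of outcomes."""
    misses = 0
    for taken in outcomes:
        if prediction == taken:
            strong = True                         # strengthen, or stay strong
        else:
            misses += 1
            if strong:
                strong = False                    # weaken, keep the prediction
            else:
                prediction, strong = taken, True  # flip to the opposite strong prediction
    return misses

loop = [True] * 7 + [False]
print(two_bit_mispredicts(loop * 10))
```

On the 8-iteration loop the single not-taken outcome only weakens the prediction, so each execution of the loop costs one mispredict instead of two.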

30
Still Not Good Enough
[Figure: not included in the transcript. Its annotations read "We can live with these", "These are good", and "This is bad!"]
31
(N,M) Correlating Predictors

Branch outcome correlates with the outcome of some recently executed branches
Use this in our prediction
Keep N bits of history of recent outcomes
Use a different M-bit predictor for each different history
Note: an N-bit history means 2^N different predictors for each branch

32
The gShare Predictor

Correlating predictors often wasteful
Some histories are rare or even impossible
Yet we dedicate a predictor for each history
Solution: hashing
Use a single large predictor table
Hash history and branch address together
Use the hash to index into the table
The hash is just an XOR, so it's fast

33
The gShare Predictor
K bits of the branch instruction address XOR N bits of global branch history -> Index
Index selects an entry in a table of 2-bit predictors with 2^max(N,K) entries -> Prediction
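
A minimal gShare sketch, to make the indexing concrete. Several details are assumptions not given on the slide: the 2-bit entries are ordinary saturating counters initialised to "weakly not taken", instructions are 4 bytes (so the low two address bits are dropped), and the bit widths are arbitrary.

```python
class GShare:
    def __init__(self, k_bits=10, n_bits=10):
        self.k_mask = (1 << k_bits) - 1
        self.n_mask = (1 << n_bits) - 1
        self.table = [1] * (1 << max(k_bits, n_bits))  # 2-bit counters, 0..3
        self.history = 0                               # N bits of global history

    def _index(self, pc):
        # XOR K address bits with the global history
        return ((pc >> 2) & self.k_mask) ^ self.history

    def predict(self, pc):
        return self.table[self._index(pc)] >= 2        # True means "taken"

    def update(self, pc, taken):
        i = self._index(pc)
        if taken:
            self.table[i] = min(3, self.table[i] + 1)
        else:
            self.table[i] = max(0, self.table[i] - 1)
        # Shift the outcome into the global history
        self.history = ((self.history << 1) | int(taken)) & self.n_mask

predictor = GShare(k_bits=4, n_bits=4)
for _ in range(10):
    predictor.update(0x40, True)
print(predictor.predict(0x40))
```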
34
The pShare Predictor

Similar to gShare, but uses local history

K bits of the branch address index a table of local histories with 2^K entries, each entry has N bits
The N-bit local history XOR L bits of the branch address -> Index
Index selects an entry in a table of 2-bit predictors with 2^max(N,L) entries -> Prediction
35
Why pShare is Good?

Long local history (e.g. 10 bits) used to choose the actual predictor for the branch
Back to our 8-iteration loop example
The 8th (not taken) branch would always have a history of 1101111111
All other seven instances of this branch in that loop have different histories
So, after a few passes through the loop to train the predictors, we have perfect prediction each time
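
The claim can be tried out on the 8-iteration loop with a small pShare model. As with any sketch, the details are assumptions: saturating 2-bit counters starting at "weakly not taken", 4-byte instructions, and arbitrary table sizes. The exact number of warm-up passes depends on those choices.

```python
class PShare:
    def __init__(self, k_bits=6, l_bits=10, n_bits=10):
        self.k_mask = (1 << k_bits) - 1
        self.l_mask = (1 << l_bits) - 1
        self.n_mask = (1 << n_bits) - 1
        self.histories = [0] * (1 << k_bits)            # N-bit local histories
        self.table = [1] * (1 << max(n_bits, l_bits))   # 2-bit counters, 0..3

    def _slots(self, pc):
        h = (pc >> 2) & self.k_mask                     # which local history
        return h, ((pc >> 2) & self.l_mask) ^ self.histories[h]

    def predict(self, pc):
        return self.table[self._slots(pc)[1]] >= 2

    def update(self, pc, taken):
        h, i = self._slots(pc)
        if taken:
            self.table[i] = min(3, self.table[i] + 1)
        else:
            self.table[i] = max(0, self.table[i] - 1)
        self.histories[h] = ((self.histories[h] << 1) | int(taken)) & self.n_mask

def mispredicts_per_pass(predictor, pc, passes):
    """Run the taken-7-times-then-not-taken loop and count misses in each pass."""
    loop = [True] * 7 + [False]
    result = []
    for _ in range(passes):
        misses = 0
        for taken in loop:
            misses += int(predictor.predict(pc) != taken)
            predictor.update(pc, taken)
        result.append(misses)
    return result

print(mispredicts_per_pass(PShare(), 0x400, 8))
```

Once the history register is full, each of the eight positions in the loop sees its own history and therefore trains its own counter, which is why the misses eventually drop to zero.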

36
Why pShare is Bad?

Needs a lot of branch instances to train the different 2-bit predictors
Simple 2-bit predictor
Has a prediction after it sees one instance of a branch
The pShare predictor
Has a prediction after it sees an instance of that branch and that particular history
Back to our loop example
pShare needs two entire 8-iteration loops to warm up
Starts making useful predictions only when we enter the same loop for the second time

37
Tournament Predictors

No predictor is clearly the best
Simple 2-bit warms up quickly and uses only 2 bits per branch
pShare uses many bits per branch, but tends to be much better after warming up
Idea: let's have a predictor to predict which predictor will predict better
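
One common way to build such a chooser is a 2-bit counter per branch that moves toward whichever component predictor was right when the two disagree. The sketch below assumes that scheme; the class name, the dictionary-based chooser, and the convention that the two components are passed in as plain functions (and updated elsewhere) are all choices made for this illustration.

```python
class Tournament:
    def __init__(self, predict_a, predict_b):
        self.predict_a = predict_a   # e.g. a simple 2-bit predictor
        self.predict_b = predict_b   # e.g. a pShare predictor
        self.chooser = {}            # pc -> counter 0..3; >= 2 means "trust B"

    def predict(self, pc):
        if self.chooser.get(pc, 1) >= 2:
            return self.predict_b(pc)
        return self.predict_a(pc)

    def update(self, pc, taken):
        a_ok = self.predict_a(pc) == taken
        b_ok = self.predict_b(pc) == taken
        counter = self.chooser.get(pc, 1)
        if b_ok and not a_ok:
            counter = min(3, counter + 1)   # B was right, A was wrong
        elif a_ok and not b_ok:
            counter = max(0, counter - 1)   # A was right, B was wrong
        self.chooser[pc] = counter

# Toy components: A always says not taken, B always says taken
t = Tournament(lambda pc: False, lambda pc: True)
t.update(0x100, True)
print(t.predict(0x100))
```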

38
Direction Predictor Accuracy
39
Target Address Prediction

Branch Target Buffer
IF stage needs to know the fetch address every cycle
Need the target address one cycle after fetching a branch
For some branches (e.g. indirect) the target is known only after the EX stage, which is way too late
Even easily-computed branch targets need to wait until the instruction is decoded and the direction predicted in the ID stage (still at least one cycle too late)
So, we have a quick-and-dirty predictor for the target that only needs the address of the branch instruction

40
Branch Target Buffer

BTB indexed by instruction address
We don't even know if it is a branch!
If the address matches a BTB entry, it is predicted to be a branch
BTB entry tells whether it is taken (direction) and where it goes if taken
BTB takes only the instruction address, so while we fetch one instruction in the IF stage we are predicting where to fetch the next one from
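
A direct-mapped BTB can be modelled as a small tagged table. The sketch below assumes 4-byte instructions, a full-address tag, and one taken/not-taken bit per entry; these details and all names are illustrative rather than taken from the lecture.

```python
class BTB:
    def __init__(self, entries=16, instr_size=4):
        self.entries = entries
        self.instr_size = instr_size
        self.table = [None] * entries   # each entry: (branch pc, target, predict taken)

    def _index(self, pc):
        return (pc // self.instr_size) % self.entries

    def next_fetch(self, pc):
        """Predict the next fetch address using only the current fetch address."""
        entry = self.table[self._index(pc)]
        if entry is not None and entry[0] == pc and entry[2]:
            return entry[1]             # hit: predicted-taken branch
        return pc + self.instr_size     # miss or not taken: fall through

    def update(self, pc, target, taken):
        self.table[self._index(pc)] = (pc, target, taken)

btb = BTB()
btb.update(0x100, 0x40, True)
print(hex(btb.next_fetch(0x100)), hex(btb.next_fetch(0x104)))
```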

41
Branch Target Buffer
42
Return Address Stack (RAS)

Function returns are frequent, yet
Address is difficult to compute (have to wait until the EX stage is done to know it)
Address is difficult to predict with the BTB (a function can be called from multiple places)
But the return address is actually easy to predict
It is the address after the last call instruction that we haven't returned from yet
Hence the Return Address Stack

43
Return Address Stack (RAS)

Call pushes return address into the RAS
When a return instruction is decoded, pop the predicted return address from the RAS
Accurate prediction even with a small RAS
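
The RAS is just a small stack. In this sketch the overflow policy (drop the oldest entry) and the 4-byte instruction size are assumptions, and the class and method names are made up for the example.

```python
class ReturnAddressStack:
    def __init__(self, size=8, instr_size=4):
        self.size = size
        self.instr_size = instr_size
        self.stack = []

    def on_call(self, call_pc):
        if len(self.stack) == self.size:
            self.stack.pop(0)                        # full: forget the oldest return address
        self.stack.append(call_pc + self.instr_size) # address after the call

    def on_return(self):
        """Predicted return address, or None if the stack is empty."""
        return self.stack.pop() if self.stack else None

ras = ReturnAddressStack()
ras.on_call(0x1000)    # main calls f
ras.on_call(0x2000)    # f calls g
print(hex(ras.on_return()), hex(ras.on_return()))
```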

44
Life Story of a Branch

BTB predicts next address in IF stage
Later, after decoding we can get a second
prediction from RAS or direction predictor
These are usually better than BTB, so if they say
differently, we make bubbles and restart fetch
from new prediction
Finally, the actual branch outcome becomes known
eventually. If it is different from prediction,
bubbles and restart fetch again

45
Speculation

Predict branches, then do everything (execute, write result, schedule instructions)
What do we do when we mispredict?
Two things
Allow things-before-the-branch to complete
Undo things-after-the-branch we have completed
Solution
At the end, put instructions in the correct order
again

46
Speculation: Pipeline

New Structure: Reorder Buffer (ROB)
Queues instructions in the original order
Use ROB entry number as name in renaming
ROB entry keeps the result after Write Result
New stage: Commit
Takes the oldest instruction in the ROB
If the instruction has executed and its result is in the ROB entry
Write result to registers
Free the ROB entry
Do this N times per cycle in an N-way superscalar
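
The commit rule can be sketched with a queue. This is a simplified model under assumptions: ROB entries are plain dictionaries, there is no capacity limit, and only register results are handled; the names are chosen for this illustration.

```python
from collections import deque

class ROB:
    def __init__(self):
        self.entries = deque()     # oldest instruction on the left

    def allocate(self, dest):
        entry = {"dest": dest, "value": None, "done": False}
        self.entries.append(entry) # entries are queued in program order
        return entry

    def write_result(self, entry, value):
        entry["value"], entry["done"] = value, True

    def commit(self, regs, width=1):
        """Commit up to `width` of the oldest instructions; return how many committed."""
        committed = 0
        while committed < width and self.entries and self.entries[0]["done"]:
            entry = self.entries.popleft()        # free the ROB entry
            regs[entry["dest"]] = entry["value"]  # write result to registers
            committed += 1
        return committed

rob, regs = ROB(), {}
first = rob.allocate("R1")
second = rob.allocate("R2")
rob.write_result(second, 7)     # the younger instruction finishes first
print(rob.commit(regs, width=2), regs)   # nothing commits: the oldest is not done
rob.write_result(first, 3)
print(rob.commit(regs, width=2), regs)
```

The first `commit` call shows the key property: a finished instruction still waits behind an unfinished older one.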

47
Recovery From a Misprediction

Mispredicted branch eventually committed
Now precise state is in the registers
Everything before the branch done and in regs
Nothing after the branch is in regs yet
Flush all the other structures
Reservation stations, ROB, instruction queue
Restart fetch from correct destination
Precise exceptions? Same thing!

48
Speculation: Stores

ROB takes over the role of the store queue
Stores go to memory when they commit
Commit is in-order, so store order is correct
Mispredictions do not affect memory state

49
Speculation: The Picture
50
ROB vs. Register Renaming

How many ports do we need for the ROB?
Lots! Look at a single-issue processor
Issue: read two entries and write one
Write Result: write one entry
Commit: read and write one entry
ROB has a dual role
Keeps results (names)
Keeps order
Let's split the two roles

51
ROB vs. Register Renaming

Keeping results: physical registers
Have a large physical register file
Keep architected-to-physical mapping in a table
Physical registers hold all values (names)
Keeping order: simplified ROB
Only keeps info needed to commit instructions
Reservation stations also simplified
No need to keep values
Called instruction window instead of RS

52
How does it work?

Rename
Find in the rename RAT (Register Allocation Table) which physical registers are sources
Get a free physical register for the destination and change the rename RAT
Dispatch
Wait in the window until all source registers have values, then
Read source values from registers
Write Result
Send result to destination register
Send destination register number to window

53
Committing

Wait until oldest instruction done
Change commit RAT
Before it said Rn is in Pj
Now change it so Rn is in Pk (the destination)
Free physical register Pj
Everything that wants Pj is already committed
All future uses of Rn should use Pk

54
Recovering Precise State

To get precise state after instruction X, we
Wait until X commits
The commit RAT is the precise state
E.g. recovery from branch misprediction
Wait until X commits
Copy the commit map into the rename map
Flush window and ROB, restart fetch
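
The rename, commit, and recovery steps with two RATs can be sketched together. Assumptions in this sketch: architectural register Ri initially lives in physical register Pi, the free list is handed out in numerical order, the caller stalls when no physical register is free, and on recovery the free list is rebuilt from the commit RAT (the slides do not say how speculative registers are reclaimed). All names are illustrative.

```python
class RenameUnit:
    def __init__(self, num_arch=8, num_phys=16):
        self.num_phys = num_phys
        self.rename_rat = {f"R{i}": i for i in range(num_arch)}  # speculative map
        self.commit_rat = dict(self.rename_rat)                  # precise map
        self.free = list(range(num_arch, num_phys))              # free physical registers

    def rename(self, dest, srcs):
        phys_srcs = [self.rename_rat[s] for s in srcs]  # look up sources first
        phys_dest = self.free.pop(0)                    # fresh register for the destination
        self.rename_rat[dest] = phys_dest
        return phys_dest, phys_srcs

    def commit(self, dest, phys_dest):
        old = self.commit_rat[dest]        # before: Rn is in Pj
        self.commit_rat[dest] = phys_dest  # now: Rn is in Pk
        self.free.append(old)              # Pj can be recycled

    def recover(self):
        """Restore precise state, e.g. after a mispredicted branch has committed."""
        self.rename_rat = dict(self.commit_rat)
        in_use = set(self.commit_rat.values())
        self.free = [p for p in range(self.num_phys) if p not in in_use]

unit = RenameUnit()
print(unit.rename("R0", ["R0", "R0"]))   # XOR R0, R0, R0
print(unit.rename("R1", ["R0"]))         # LD.IMM R1, 416(R0)
unit.commit("R0", 8)
unit.recover()                           # pretend everything after the XOR was squashed
print(unit.rename_rat["R1"], unit.free)
```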

55
Reading Assignment

J. E. Smith and A. R. Pleszkun, "Implementing Precise Interrupts in Pipelined Processors", IEEE Transactions on Computers, 37(5), pages 562-573, May 1988.
How to get the paper: http://gtel.gatech.edu:2051/Xplore/DynWel.jsp
Then log on with your GT username and password
Search in Journals & Magazines for Computers, select ToC
Find the year 1988, the May issue

56
Register Renaming Example

8 architectural (logical) registers R0..R7
16 physical registers (numbered 0..15), 6-instruction window
Single-issue, nine-stage pipeline
1. Fetch (also use BTB to predict next fetch addr)
2. Decode
3. Rename and put in instruction window
   Also use RAS and direction predictor, calculate target address if not indirect
4. Schedule
   Instruction stays in schedule stage until operands ready
5. Read Operands
6. Execute
   Also calculate target address if indirect
7. Read Memory
8. Write Result
9. Commit
   Instruction stays in commit stage until it can actually commit

57
Register Renaming Example
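Program
I1 XOR R0, R0, R0
I2 LD.IMM R1, 416(R0)
I3 LD.IMM R2, 4(R0)
I4 LD.IMM R3, 400(R0)
I5 AND R4, R0, R0
I6 LD R5, 0(R3)
I7 ADD R4, R4, R5
I8 ADD R3, R3, R2
I9 BNE R3, R1, -12(PC)

Cycle 3: Rename I1
[Figure: instruction window, register mappings, and pipeline state. Entries that survive in the transcript: the XOR reads P0 and P0 and writes P8; R0 -> P8.]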
58
Register Renaming Example
End of Cycle 3
[Figure: instruction window, register mappings, and pipeline state. Entries that survive in the transcript: R0 -> P8.]
59
Register Renaming Example
Cycle 4: Renamed I2
[Figure: instruction window, register mappings, and pipeline state. Entries that survive in the transcript: R0 -> P8, R1 -> P9.]
60
Register Renaming Example
Cycle 4: Schedule (the XOR is scheduled)
[Figure: instruction window, register mappings, and pipeline state. Entries that survive in the transcript: R0 -> P8, R1 -> P9.]
61
Register Renaming Example
Cycle 5: Renamed I3
[Figure: instruction window, register mappings, and pipeline state. Entries that survive in the transcript: R0 -> P8, R1 -> P9, R2 -> P10; the XOR is shown next to the RR (read registers) label.]
62
Register Renaming Example
Cycle 5: Schedule
[Figure: instruction window, register mappings, and pipeline state. Entries that survive in the transcript: R0 -> P8, R1 -> P9, R2 -> P10.]
63
Register Renaming Example
Cycle 5: I1 Reads Regs
[Figure: instruction window, register mappings, and pipeline state. Entries that survive in the transcript: R0 -> P8, R1 -> P9, R2 -> P10.]
64
Register Renaming Example
Cycle 6: Renamed I4
[Figure: instruction window, register mappings, and pipeline state. Entries that survive in the transcript: R0 -> P8, R1 -> P9, R2 -> P10, R3 -> P11; the XOR is shown next to the Exe label.]
65
Register Renaming Example
Cycle 7: Renamed I5, scheduled nothing, then I1 writes result
[Figure: instruction window, register mappings, and pipeline state. Entries that survive in the transcript: R0 -> P8, R1 -> P9, R2 -> P10, R3 -> P11, R4 -> P12; the XOR is shown next to the WR (write result) label.]
66
Register Renaming Example
Cycle 8: Renamed I6
[Figure: instruction window, register mappings, and pipeline state. Entries that survive in the transcript: R0 -> P8, R1 -> P9, R2 -> P10, R3 -> P11, R4 -> P12; the LD in the window reads P11 and writes P13.]
67
Register Renaming Example
Cycle 8: Schedule (I2..I5 can be scheduled, pick I2)
[Figure: instruction window, register mappings, and pipeline state. Entries that survive in the transcript: R0 -> P8, R1 -> P9, R2 -> P10, R3 -> P11, R4 -> P12.]
68
Register Renaming Example
Cycle 8: Commit
[Figure: instruction window, register mappings, and pipeline state. Entries that survive in the transcript: R0 -> P8, R1 -> P9, R2 -> P10, R3 -> P11, R4 -> P12; the LD.IMM writing P9 (value 416) has left the window.]
69
Register Renaming Example
Cycle 8: After Commit
[Figure: instruction window, register mappings, and pipeline state. Entries that survive in the transcript: R1 -> P9, R2 -> P10, R3 -> P11, R4 -> P12; the entry for R0 has been freed by the commit.]
