ECE 361 Computer Architecture Lecture 10: Designing a Multiple Cycle Processor

Last modified by: Kenin Coloma. Created: 9/9/1996 11:33:30 AM. Presentation format: On-screen Show.

Slides: 38
Provided by: eceNorthw
1
ECE 361 Computer Architecture, Lecture 10
Designing a Multiple Cycle Processor
2
Recap: A Single Cycle Datapath
  • We have everything except control signals
    (underlined)
  • Today's lecture will show you how to generate the
    control signals

3
Recap: PLA Implementation of the Main Control
[Figure: PLA with outputs RegWrite, ALUSrc, RegDst, MemtoReg, MemWrite, Branch, Jump, ExtOp, ALUop<2>, ALUop<1>, ALUop<0>]
4
The Big Picture: Where Are We Now?
  • The Five Classic Components of a Computer
  • Today's Topic: Designing the Datapath for the
    Multiple Clock Cycle Datapath

[Figure: Processor (Control, Datapath), Memory, Input, Output]
5
Outline of Today's Lecture
  • Recap and Introduction
  • Introduction to the Concept of Multiple Cycle
    Processor
  • Multiple Cycle Implementation of R-type
    Instructions
  • What is a Multiple Cycle Delay Path and Why is it
    Bad?
  • Multiple Cycle Implementation of Or Immediate
  • Multiple Cycle Implementation of Load and Store
  • Putting it all Together

6
Abstract View of our single cycle processor
[Figure: Main Control (input op) and ALU control (input fun) generate nPC_sel, RegDst, RegWr, ExtOp, ALUSrc, ALUctr, MemRd, and MemWr; Equal feeds back to the control. Datapath blocks: Next PC, PC, Instruction Fetch, Register Fetch, Ext, ALU, Mem Access (Data Mem), Reg. Wrt, Result Store]
  • Looks like an FSM with the PC as state

7
What's Wrong with Our CPI=1 Processor?
[Figure: delay components on each instruction's path]
Arithmetic & Logical: PC, Inst Memory, Reg File, mux, ALU, mux, setup
Load (Critical Path): PC, Inst Memory, Reg File, mux, ALU, Data Mem, mux, setup
Store: PC, Inst Memory, Reg File, mux, ALU, Data Mem
Branch: PC, Inst Memory, Reg File, cmp, mux
  • Long cycle time
      • All instructions take as much time as the slowest
  • Real memory is not as nice as our idealized
    memory
      • It cannot always get the job done in one (short)
        cycle

8
Drawbacks of this Single Cycle Processor
  • Long cycle time
  • Cycle time must be long enough for the load
    instruction:
      • PC's Clock-to-Q +
      • Instruction Memory Access Time +
      • Register File Access Time +
      • ALU Delay (address calculation) +
      • Data Memory Access Time +
      • Register File Setup Time +
      • Clock Skew
  • Cycle time is much longer than needed for all
    other instructions. Examples:
      • R-type instructions do not require data memory
        access
      • Jump requires neither an ALU operation nor a data
        memory access
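
As a rough illustration of the delay terms listed above, the following sketch adds them up for a load and for an R-type instruction. Every delay value is a made-up placeholder in nanoseconds (the lecture gives no numbers), and mux delays are ignored for simplicity.

```python
# Hypothetical component delays in ns; these numbers are NOT from the lecture.
DELAYS_NS = {
    "pc_clk_to_q": 1,
    "inst_mem_access": 10,
    "reg_file_access": 5,
    "alu": 8,
    "data_mem_access": 10,
    "reg_file_setup": 1,
    "clock_skew": 1,
}

# Terms on the load path, in the order the slide lists them.
LOAD_PATH = [
    "pc_clk_to_q", "inst_mem_access", "reg_file_access", "alu",
    "data_mem_access", "reg_file_setup", "clock_skew",
]

# An R-type instruction does not access data memory.
RTYPE_PATH = [term for term in LOAD_PATH if term != "data_mem_access"]


def path_delay(path):
    """Sum the delays of every component on a path."""
    return sum(DELAYS_NS[term] for term in path)


single_cycle_time = path_delay(LOAD_PATH)   # the clock must fit the load
rtype_needed = path_delay(RTYPE_PATH)       # what an R-type actually needs

print("cycle time set by load:", single_cycle_time, "ns")
print("time an R-type needs:  ", rtype_needed, "ns")
```

With these placeholder values, every R-type instruction wastes the difference between the two printed numbers on each cycle.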

9
Overview of a Multiple Cycle Implementation
  • The root of the single cycle processor's
    problems:
      • The cycle time has to be long enough for the
        slowest instruction
  • Solution:
      • Break the instruction into smaller steps
      • Execute each step (instead of the entire
        instruction) in one cycle
      • Cycle time: the time it takes to execute the
        longest step
      • Keep all the steps similar in length
  • This is the essence of the multiple cycle
    processor
  • The advantages of the multiple cycle processor:
      • Cycle time is much shorter
      • Different instructions take different numbers
        of cycles to complete
          • Load takes five cycles
          • Jump only takes three cycles
      • Allows a functional unit to be used more than
        once per instruction
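
A minimal sketch of the timing argument, assuming hypothetical per-step delays (the step names and nanosecond values below are placeholders, not lecture data). The cycle time is the longest step, and each instruction pays only for the cycles it uses; the cycle counts for load and jump are the ones stated above.

```python
# Hypothetical step delays in ns; only the relative sizes matter here.
STEP_DELAYS_NS = {
    "fetch": 10,
    "decode": 5,
    "execute": 8,
    "memory": 10,
    "write_back": 5,
}

# The clock must be long enough for the longest single step.
cycle_time_ns = max(STEP_DELAYS_NS.values())

# Cycle counts stated on the slide.
CYCLES = {"load": 5, "jump": 3}


def exec_time_ns(instr):
    """Total time for one instruction on the multiple cycle processor."""
    return CYCLES[instr] * cycle_time_ns


for instr in CYCLES:
    print(instr, "takes", exec_time_ns(instr), "ns")
```

Because the steps in this example are not equal in length, shorter steps still occupy a full cycle, which is why the slide asks for steps of similar length.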

10
The Five Steps of a Load Instruction
[Timing diagram: the five steps are Instruction Fetch, Instr Decode / Reg. Fetch, Address, Data Memory, and Reg Wr, each one Clk period long. Each signal changes from its old value to its new value after the delay shown: PC after Clk-to-Q; Rs, Rt, Rd, Op, Func after Instruction Memory Access Time; ALUctr, ExtOp, ALUSrc, RegWr after Delay through Control Logic; busA after Register File Access Time; busB after Delay through Extender & Mux; Address after ALU Delay; busW after Data Memory Access Time, followed by Register File Write Time]
11
Register File & Memory Write Timing: Ideal vs.
Reality
  • In previous lectures, register file and memory
    are simplified:
      • Write happens at the clock tick
      • Address, data, and write enable must be stable
        one set-up time before the clock tick
  • In real life:
      • Neither the register file nor the ideal memory
        has a clock input
      • The write path is a combinational logic delay
        path:
          • Write enable goes to 1 and Din settles down
          • Memory write access delay
          • Din is written into mem[address]
  • Important: Address and Data must be stable BEFORE
    Write Enable goes to 1

12
Race Condition Between Address and Write Enable
  • This real (no clock input) register file may
    not work reliably in the single cycle processor
    because:
      • We cannot guarantee Rw will be stable BEFORE
        RegWr = 1
      • There is a race between Rw (address) and RegWr
        (write enable)
  • The real (no clock input) memory may not
    work reliably in the single cycle processor
    because:
      • We cannot guarantee Address will be stable BEFORE
        WrEn = 1
      • There is a race between Adr and WrEn

13
How to Avoid this Race Condition?
  • Solution for the multiple cycle implementation:
      • Make sure Address is stable by the end of Cycle N
      • Assert Write Enable signal ONE cycle later, at
        Cycle (N + 1)
      • Address cannot change until Write Enable is
        deasserted
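
The rule can be phrased as a check over a per-cycle trace of (address, write enable) pairs. The sketch below is only an illustration: the function name `write_is_safe` and the trace format are assumptions, not something the lecture defines.

```python
def write_is_safe(trace):
    """Check the race-avoidance rule on a list of (address, write_enable) pairs.

    One list entry per clock cycle. The rule: whenever write_enable is 1 in a
    cycle, the address must already have held the same value in the previous
    cycle. This also forbids changing the address while write_enable stays 1.
    """
    for n, (addr, write_enable) in enumerate(trace):
        if write_enable:
            if n == 0 or trace[n - 1][0] != addr:
                return False
    return True


# Address settles in cycle 0, write enable asserted in cycle 1: safe.
good_trace = [(0x10, 0), (0x10, 1), (0x20, 0)]

# Address and write enable change in the same cycle: a race.
bad_trace = [(0x10, 0), (0x20, 1)]

print(write_is_safe(good_trace), write_is_safe(bad_trace))
```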

14
Dual-Port Ideal Memory
  • Dual Port Ideal Memory
      • Independent Read (RAdr, Dout) and Write (WrAdr,
        Din) ports
      • Read and write (to different locations) can occur
        in the same cycle
  • Read Port is a combinational path:
      • Read Address Valid -->
      • Memory Read Access Delay -->
      • Data Out Valid
  • Write Port is also a combinational path:
      • MemWrite = 1 -->
      • Memory Write Access Delay -->
      • Data In is written into location[WrAdr]
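
A behavioral sketch of such a memory, ignoring the access delays. The class name, the method names, and the choice that unwritten locations read as 0 are all assumptions made for the example.

```python
class DualPortMemory:
    """Behavioral model of a dual-port ideal memory (no timing)."""

    def __init__(self):
        self.cells = {}  # address -> stored word

    def read(self, radr):
        """Read port: Dout follows RAdr. Unwritten cells read as 0 (assumed)."""
        return self.cells.get(radr, 0)

    def write(self, wradr, din, mem_write):
        """Write port: Din is stored at WrAdr only while MemWrite is 1."""
        if mem_write:
            self.cells[wradr] = din


mem = DualPortMemory()
mem.write(8, 0xABCD, mem_write=1)
mem.write(12, 0x1234, mem_write=0)   # MemWrite = 0, so nothing is stored

# A read and a write to different locations in the same cycle.
dout = mem.read(8)
mem.write(16, 7, mem_write=1)

print(hex(dout), mem.read(12), mem.read(16))
```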

15
Questions and Administrative Matters
16
Instruction Fetch Cycle: In the Beginning
  • Every cycle begins right AFTER the clock tick
  • mem[PC]; PC<31:0> + 4

[Figure: Clk waveform, One Logic Clock Cycle, "You are here!" at the start of the cycle. Datapath: PC, Ideal Memory (RAdr, WrAdr, Din, Dout), Instruction Reg, and an ALU adding 4 to the PC. Control signals not yet determined: PCWr=?, MemWr=?, IRWr=?, ALUop=?]
17
Instruction Fetch Cycle: The End
  • Every cycle ends AT the next clock tick (storage
    element updates)
  • IR <-- mem[PC]; PC<31:0> <-- PC<31:0> + 4

[Figure: Clk waveform, One Logic Clock Cycle, "You are here!" at the end of the cycle. Same datapath as at the beginning of the cycle, with control settings PCWr=1, MemWr=0, IRWr=1, ALUOp=Add]
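
The register transfers of the fetch cycle can be sketched as follows. The function name, the dictionary-based state, and the two instruction words in `memory` are illustrative placeholders only.

```python
MASK32 = 0xFFFFFFFF


def instruction_fetch(state, memory):
    """One fetch cycle: IR <-- mem[PC]; PC <-- PC + 4.

    Both new values are computed from the OLD PC during the cycle and are
    committed together at the clock tick (IRWr = 1, PCWr = 1).
    """
    next_ir = memory[state["PC"]]
    next_pc = (state["PC"] + 4) & MASK32
    return {"PC": next_pc, "IR": next_ir}


# Placeholder memory contents: two arbitrary 32-bit words.
memory = {0x0: 0x8C010004, 0x4: 0x00221820}
state = {"PC": 0x0, "IR": 0x0}

state = instruction_fetch(state, memory)
print(hex(state["IR"]), hex(state["PC"]))
```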
18
Instruction Fetch Cycle: Overall Picture
[Figure: multiple cycle datapath (PC, Ideal Memory, Instruction Reg, ALU with Zero output). Control settings: PCWr=1, PCWrCond=x, PCSrc=0, BrWr=0, IorD=0, MemWr=0, IRWr=1, ALUSelA=0, ALUSelB=00, ALUOp=Add]
19
Register Fetch / Instruction Decode
  • busA <- RegFile[rs]; busB <- RegFile[rt]
  • ALU is not being used: ALUctr = xx

[Figure: multiple cycle datapath (PC, Ideal Memory, Instruction Reg, Reg File, ALU). Op and Func go to the Control. Control settings: PCWr=0, PCWrCond=0, PCSrc=x, IorD=x, MemWr=0, IRWr=0, RegDst=x, RegWr=0, ALUSelA=x, ALUSelB=xx, ALUOp=xx]
20
Register Fetch / Instruction Decode (Continue)
  • busA <- Reg[rs]; busB <- Reg[rt]
  • Target <- PC + SignExt(Imm16)*4

[Figure: multiple cycle datapath (PC, Ideal Memory, Instruction Reg, Reg File, ALU). Op and Func go to the Control, which decodes Beq, Rtype, Ori, or Memory. Control settings: PCWr=0, PCWrCond=0, PCSrc=x, BrWr=1, IorD=x, MemWr=0, IRWr=0, RegDst=x, RegWr=0, ALUSelA=0, ALUSelB=10, ALUOp=Add, ExtOp=1]
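
The branch target computation from this slide can be sketched as below. The helper names are assumptions; the PC passed in is the value already incremented by 4 during instruction fetch.

```python
MASK32 = 0xFFFFFFFF


def sign_ext16(imm16):
    """Interpret a 16-bit field as a signed two's-complement integer."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def branch_target(pc, imm16):
    """Target <- PC + SignExt(Imm16)*4, wrapped to 32 bits."""
    return (pc + sign_ext16(imm16) * 4) & MASK32


forward = branch_target(0x100, 0x0003)    # three instructions forward
backward = branch_target(0x100, 0xFFFF)   # Imm16 = -1: one instruction back

print(hex(forward), hex(backward))
```

The multiplication by 4 converts an offset counted in instructions into a byte offset.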
21
Branch Completion
  • if (busA == busB)
  • PC <- Target

[Figure: multiple cycle datapath (PC, Ideal Memory, Instruction Reg, Reg File, ALU). Control settings: PCWr=0, PCWrCond=1, PCSrc=1, BrWr=0, IorD=x, MemWr=0, IRWr=0, RegDst=x, RegWr=0, ALUSelA=1, ALUSelB=01, ALUOp=Sub, ExtOp=x]
22
Instruction Decode: We Have an R-type!
  • Next Cycle: R-type Execution

[Figure: multiple cycle datapath (PC, Ideal Memory, Instruction Reg, Reg File, ALU). The Control decodes Op and Func as Rtype (other outcomes: Beq, Ori, Memory). Control settings: PCWr=0, PCWrCond=0, PCSrc=x, BrWr=1, IorD=x, MemWr=0, IRWr=0, RegDst=x, RegWr=0, ALUSelA=0, ALUSelB=10, ALUOp=Add, ExtOp=1]
23
R-type Execution
  • ALU Output <- busA op busB

[Figure: multiple cycle datapath (PC, Ideal Memory, Instruction Reg, Reg File, ALU). Control settings: PCWr=0, PCWrCond=0, PCSrc=x, BrWr=0, IorD=x, MemWr=0, IRWr=0, RegDst=1, RegWr=0, ALUSelA=1, ALUSelB=01, ALUOp=Rtype, ExtOp=x, MemtoReg=x]
24
R-type Completion
  • R[rd] <- ALU Output

[Figure: multiple cycle datapath (PC, Ideal Memory, Instruction Reg, Reg File, ALU). Control settings: PCWr=0, PCWrCond=0, PCSrc=x, BrWr=0, IorD=x, MemWr=0, IRWr=0, RegDst=1, RegWr=1, ALUSelA=1, ALUSelB=01, ALUOp=Rtype, ExtOp=x, MemtoReg=0]
25
A Multiple Cycle Delay Path
  • There is no register to save the results between:
      • Register Fetch: busA <- Reg[rs]; busB <- Reg[rt]
      • R-type Execution: ALU output <- busA op busB
      • R-type Completion: Reg[rd] <- ALU output

[Figure: Instruction Reg, Reg File, and ALU portion of the datapath, annotated "Register here to save outputs of Rfetch?" and "Register here to save outputs of RExec?"; control signals PCWr, ALUselA, ALUselB, ALUOp]
26
A Multiple Cycle Delay Path (Continue)
  • Register is NOT needed to save the outputs of
    Register Fetch:
      • IRWr = 0: busA and busB will not change after
        Register Fetch
  • Register is NOT needed to save the outputs of
    R-type Execution:
      • busA and busB will not change after Register
        Fetch
      • Control signals ALUSelA, ALUSelB, and ALUOp will
        not change after R-type Execution
      • Consequently, ALU output will not change after
        R-type Execution
  • In theory (P. 316, PH), you need a register to
    hold a signal value if:
      • (1) The signal is computed in one clock cycle and
        used in another,
      • (2) AND the inputs to the functional block that
        computes this signal can change before the
        signal is written into a state element.
  • You can save a register if Cond 1 is true BUT
    Cond 2 is false
  • But in practice, this will introduce a multiple
    cycle delay path:
      • A logic delay path that takes multiple cycles to
        propagate from one storage element to the next
        storage element
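
The two-condition rule can be written as a one-line predicate. The function and argument names below are hypothetical; the first example applies the rule to the ALU output of an R-type instruction as described above.

```python
def needs_register(used_in_later_cycle, inputs_can_change):
    """A holding register is required only if BOTH conditions are true."""
    return used_in_later_cycle and inputs_can_change


# ALU output of an R-type instruction: it is used in the completion cycle
# (Cond 1 true), but busA, busB, and the ALU control signals do not change
# after R-type Execution (Cond 2 false), so the register can be saved.
alu_output_needs_reg = needs_register(True, False)

# A generic signal whose inputs may change before it is stored.
volatile_signal_needs_reg = needs_register(True, True)

print(alu_output_needs_reg, volatile_signal_needs_reg)
```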

27
Pros and Cons of a Multiple Cycle Delay Path
  • A 3-cycle path example:
      • IR (storage) -> Reg File Read -> ALU -> Reg
        File Write (storage)
  • Advantages:
      • Register savings
      • We can share time among cycles:
          • If the ALU takes longer than one cycle, it is
            still OK as long as the entire path takes less
            than 3 cycles to finish

[Figure: Instruction Reg, Reg File (Ra, Rb, Rw, busA, busB, busW), and ALU (ALUselB mux, Zero output) portion of the datapath]
28
Pros and Cons of a Multiple Cycle Delay Path
(Continue)
  • Disadvantage:
      • A static timing analyzer, which ONLY looks at the
        delay between two storage elements, will report
        this as a timing violation
      • You have to ignore the static timing analyzer's
        warnings

[Figure: Instruction Reg, Reg File (Ra, Rb, Rw, busA, busB, busW), and ALU (ALUselB mux, Zero output) portion of the datapath]
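
The trade-off can be seen in a toy timing check. This is a sketch under invented numbers: the delays, the cycle time, and the `check_path` helper are assumptions and do not model any real timing tool.

```python
def check_path(delay_ns, cycle_time_ns, allowed_cycles=1):
    """Return True if a storage-to-storage path meets its timing budget."""
    return delay_ns <= allowed_cycles * cycle_time_ns


# Hypothetical delays for IR -> Reg File Read -> ALU -> Reg File Write.
REG_READ_NS, ALU_NS, REG_WRITE_NS = 5, 12, 5
CYCLE_TIME_NS = 10

path_delay_ns = REG_READ_NS + ALU_NS + REG_WRITE_NS

# A tool that assumes one cycle per path flags a violation.
default_check = check_path(path_delay_ns, CYCLE_TIME_NS)

# Treated as the 3-cycle path it really is, the same path passes,
# even though the ALU alone takes longer than one cycle.
three_cycle_check = check_path(path_delay_ns, CYCLE_TIME_NS, allowed_cycles=3)

print(default_check, three_cycle_check)
```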
29
Instruction Decode: We Have an Ori!
  • Next Cycle: Ori Execution

[Figure: multiple cycle datapath (PC, Ideal Memory, Instruction Reg, Reg File, ALU). The Control decodes Op and Func as Ori (other outcomes: Beq, Rtype, Memory). Control settings: PCWr=0, PCWrCond=0, PCSrc=x, BrWr=1, IorD=x, MemWr=0, IRWr=0, RegDst=x, RegWr=0, ALUSelA=0, ALUSelB=10, ALUOp=Add, ExtOp=1]
30
Ori Execution
  • ALU output <- busA or ZeroExt[Imm16]

[Figure: multiple cycle datapath (PC, Ideal Memory, Instruction Reg, Reg File, ALU). Control settings: PCWr=0, PCWrCond=0, PCSrc=x, BrWr=0, IorD=x, MemWr=0, IRWr=0, RegDst=0, RegWr=0, ALUSelA=1, ALUSelB=11, ALUOp=Or, ExtOp=0, MemtoReg=x]
31
Ori Completion
  • Reg[rt] <- ALU output

[Figure: multiple cycle datapath (PC, Ideal Memory, Instruction Reg, Reg File, ALU). Control settings: PCWr=0, PCWrCond=0, PCSrc=x, BrWr=0, IorD=x, MemWr=0, IRWr=0, RegDst=0, RegWr=1, ALUSelA=1, ALUSelB=11, ALUOp=Or, ExtOp=0, MemtoReg=0]
32
Memory Address Calculation
  • ALU output <- busA + SignExt[Imm16]

State AdrCal: 1: ExtOp, ALUSelA; ALUSelB=11; ALUOp=Add; x: MemtoReg, PCSrc
[Figure: multiple cycle datapath (PC, Ideal Memory, Instruction Reg, Reg File, ALU). Control settings: PCWr=0, PCWrCond=0, PCSrc=x, BrWr=0, IorD=x, MemWr=0, IRWr=0, RegDst=x, RegWr=0, ALUSelA=1, ALUSelB=11, ALUOp=Add, ExtOp=1, MemtoReg=x]
33
Memory Access for Store
  • mem[ALU output] <- busB

State SWmem: 1: ExtOp, MemWr, ALUSelA; ALUSelB=11; ALUOp=Add; x: PCSrc, RegDst, MemtoReg
[Figure: multiple cycle datapath (PC, Ideal Memory, Instruction Reg, Reg File, ALU). Control settings: PCWr=0, PCWrCond=0, PCSrc=x, BrWr=0, IorD=x, MemWr=1, IRWr=0, RegDst=x, RegWr=0, ALUSelA=1, ALUSelB=11, ALUOp=Add, ExtOp=1, MemtoReg=x]
34
Memory Access for Load
  • Mem Dout <- mem[ALU output]

[Figure: multiple cycle datapath (PC, Ideal Memory, Instruction Reg, Reg File, ALU). Control settings: PCWr=0, PCWrCond=0, PCSrc=x, BrWr=0, IorD=1, MemWr=0, IRWr=0, RegDst=0, RegWr=0, ALUSelA=1, ALUSelB=11, ALUOp=Add, ExtOp=1, MemtoReg=x]
35
Write Back for Load
  • Reg[rt] <- Mem Dout

[Figure: multiple cycle datapath (PC, Ideal Memory, Instruction Reg, Reg File, ALU). Control settings: PCWr=0, PCWrCond=0, PCSrc=x, BrWr=0, IorD=x, MemWr=0, IRWr=0, RegDst=0, RegWr=1, ALUSelA=1, ALUSelB=11, ALUOp=Add, ExtOp=1, MemtoReg=1]
36
Putting it all together: Multiple Cycle Datapath
[Figure: complete multiple cycle datapath: PC, Ideal Memory (RAdr, WrAdr, Din, Dout), Instruction Reg (Rs, Rt, Rd, Imm), Reg File (Ra, Rb, Rw, busA, busB, busW), and ALU (Zero output). Control signals: PCWr, PCWrCond, PCSrc, BrWr, IorD, MemWr, IRWr, RegDst, RegWr, ALUSelA, ALUSelB, ALUOp, ExtOp, MemtoReg]
37
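Putting it all together: Control State Diagram
[State diagram, partially recoverable. Transitions out of decode are labeled beq, Rtype, Ori, and lw or sw. States shown:]
  • AdrCal (entered on lw or sw): 1: ExtOp, ALUSelA;
    ALUSelB=11; ALUOp=Add; x: MemtoReg, PCSrc.
    Next state: LWmem on lw, SWMem on sw
  • SWMem: 1: ExtOp, MemWr, ALUSelA; ALUSelB=11;
    ALUOp=Add; x: PCSrc, RegDst, MemtoReg
  • LWmem: 1: ExtOp, ALUSelA, IorD; ALUSelB=11;
    ALUOp=Add; x: MemtoReg, PCSrc. Next state: LWwr
  • OriExec (entered on Ori). Next state: OriFinish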
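
The control sequencing walked through in this lecture can be sketched as a small state machine. The state names AdrCal, SWMem, LWmem, LWwr, OriExec, and OriFinish come from the diagram; Ifetch, Rfetch, BrComplete, RExec, and Rfinish are assumed labels for the fetch, decode, branch completion, and R-type steps. Control signal outputs are omitted.

```python
# Where decode (Rfetch) goes for each instruction class.
DECODE_TARGET = {
    "beq": "BrComplete",
    "rtype": "RExec",
    "ori": "OriExec",
    "lw": "AdrCal",
    "sw": "AdrCal",
}

# States with a single successor regardless of the opcode.
CHAIN = {"RExec": "Rfinish", "OriExec": "OriFinish", "LWmem": "LWwr"}


def next_state(state, op):
    """Return the state that follows `state` for instruction class `op`."""
    if state == "Ifetch":
        return "Rfetch"
    if state == "Rfetch":
        return DECODE_TARGET[op]
    if state == "AdrCal":
        return "LWmem" if op == "lw" else "SWMem"
    # Completing states (BrComplete, Rfinish, OriFinish, SWMem, LWwr)
    # return to instruction fetch.
    return CHAIN.get(state, "Ifetch")


def cycles(op):
    """Count the clock cycles one instruction spends in the state machine."""
    state, count = "Ifetch", 1
    while True:
        state = next_state(state, op)
        if state == "Ifetch":
            return count
        count += 1


for op in DECODE_TARGET:
    print(op, cycles(op))
```

Counting states this way reproduces the five cycles quoted for a load earlier in the lecture.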
38
Summary
  • Disadvantages of the Single Cycle Processor:
      • Long cycle time
      • Cycle time is too long for all instructions
        except the Load
  • Multiple Cycle Processor:
      • Divide the instructions into smaller steps
      • Execute each step (instead of the entire
        instruction) in one cycle
  • Do NOT confuse Multiple Cycle Processor with
    Multiple Cycle Delay Path:
      • Multiple Cycle Processor: executes
        each instruction in multiple clock cycles
      • Multiple Cycle Delay Path: a combinational logic
        path between two storage elements that takes more
        than one clock cycle to complete
  • It is possible (desirable) to build a MC
    Processor without MCDP:
      • Use a register to save a signal's value whenever
        a signal is generated in one clock cycle and used
        in another cycle later