EEM 486: Computer Architecture, Lecture 4: Designing a Multicycle Processor

Slides: 56
Provided by: homeAna
1
EEM 486: Computer Architecture
Lecture 4: Designing a Multicycle Processor
2
The Big Picture
  • Designing a Multiple Clock Cycle Datapath

3
Single-Cycle Processor
  • In our single-cycle processor, each instruction
    is realized by exactly one control command or
    microinstruction

4
Abstract View of Single-Cycle Processor
5
What's Wrong with a CPI=1 Processor?
[Figure: components on the path of each instruction class in the single-cycle datapath]
Arithmetic & Logical: PC, Inst Memory, Reg File, mux, ALU, mux, setup
Load (critical path): PC, Inst Memory, Reg File, mux, ALU, Data Mem, mux, setup
Store: PC, Inst Memory, Reg File, mux, ALU, Data Mem
Branch: PC, Inst Memory, Reg File, cmp, mux
  • Long Cycle Time
  • All instructions take as much time as the slowest
  • Real memory is not as nice as our idealized
    memory
  • Cannot always get the job done in one (short)
    cycle

6
Memory Access Time
  • Physics => fast memories are small (large
    memories are slow)
  • => Use a hierarchy of memories

7
Multicycle Approach
  • Break up the instructions into steps
  • Let each step take one smaller clock cycle
      - Balance the amount of work to be done
      - Restrict each cycle to use only one major
        functional unit
      - Major functional units: Memory,
        Register File, and ALU
  • Let different instructions take different numbers
    of cycles
  • Use a functional unit more than once within
    execution of one instruction (less hardware)
      - A single memory unit for both instructions and
        data
      - A single ALU, rather than an ALU and two adders
  • At the end of a cycle
      - Store values for use in later cycles
      - Introduce additional internal registers

8
Partitioning the CPI=1 Datapath
  • Add registers between smallest steps

[Figure: single-cycle datapath (Next PC, PC, Instruction Fetch, Operand Fetch, Exec, Mem Access, Reg. File, Data Mem) with control signals nPC_sel, ExtOp, ALUSrc, ALUctr, MemRd, MemWr, RegDst, RegWr, partitioned into five steps: Instruction fetch, Decode and Operand fetch, Execution, Memory access, Write back]
9
Recall Step-by-step Processor Design
  • Step 1: ISA => Logical Register Transfers
  • Step 2: Components of the Datapath
  • Step 3: RTL + Components => Datapath
  • Step 4: Datapath + Logical RTs => Physical RTs
  • Step 5: Physical RTs => Control

10
Step 4: R-type (add, sub, ...)
Logical Register Transfers
  inst ADDU: R[rd] <- R[rs] + R[rt]; PC <- PC + 4
Step 1. Instruction Fetch: IR <- MEM[PC], PC <- PC + 4
Step 2. Instruction Decode and Register Fetch: A <- R[rs], B <- R[rt]
Step 3. Execution: ALUOut <- A op B
Step 4. Write-back: R[rd] <- ALUOut
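The four R-type steps can be traced in software. The following is a minimal sketch, not the lecture's own code: instructions are stored pre-decoded as dictionaries, and the names `run_r_type` and `alu_op` are illustrative assumptions.

```python
# Illustrative sketch only: instructions are kept pre-decoded as dicts,
# and the names run_r_type / alu_op are not from the slides.
def run_r_type(regs, mem, pc, alu_op):
    """Walk one R-type instruction through its four steps."""
    # Step 1. Instruction fetch: IR <- MEM[PC], PC <- PC + 4
    ir = mem[pc]
    pc = pc + 4
    # Step 2. Decode and register fetch: A <- R[rs], B <- R[rt]
    a = regs[ir["rs"]]
    b = regs[ir["rt"]]
    # Step 3. Execution: ALUOut <- A op B (kept to 32 bits)
    alu_out = alu_op(a, b) & 0xFFFFFFFF
    # Step 4. Write-back: R[rd] <- ALUOut
    regs[ir["rd"]] = alu_out
    return pc, 4  # updated PC and cycles consumed


regs = [0] * 32
regs[9], regs[10] = 7, 5
mem = {0: {"rs": 9, "rt": 10, "rd": 8}}  # stands for: addu $8, $9, $10
pc, cycles = run_r_type(regs, mem, 0, lambda a, b: a + b)
```

Each local variable (`ir`, `a`, `b`, `alu_out`) plays the role of one of the internal registers that hold a value from one cycle to the next.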
11
R-type - Fetch
12
R-type - Decode/Register Fetch
13
R-type - Execution
14
R-type - Write Back
15
Step 4: Logical immediate
Logical Register Transfers
  inst ORI: R[rt] <- R[rs] OR ZExt(Im16); PC <- PC + 4
Step 1. Instruction Fetch: IR <- MEM[PC], PC <- PC + 4
Step 2. Instruction Decode and Register Fetch: A <- R[rs]
Step 3. Execution: ALUOut <- A OR ZExt(Im16)
Step 4. Write-back: R[rt] <- ALUOut
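The logical immediate uses zero extension, whereas loads, stores, and branches use sign extension. A small sketch of the difference, with helper names (`zext16`, `sext16`, `ori`) chosen here for illustration:

```python
def zext16(imm16):
    """Zero-extend a 16-bit immediate to 32 bits."""
    return imm16 & 0xFFFF


def sext16(imm16):
    """Sign-extend a 16-bit immediate; the result is a signed Python int."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def ori(regs, rs, rt, imm16):
    a = regs[rs]                  # Step 2: A <- R[rs]
    alu_out = a | zext16(imm16)   # Step 3: ALUOut <- A OR ZExt(Im16)
    regs[rt] = alu_out            # Step 4: R[rt] <- ALUOut
    return alu_out


regs = [0] * 32
regs[9] = 0x12340000
result = ori(regs, rs=9, rt=8, imm16=0x8001)
```

With the immediate 0x8001, zero extension leaves the upper 16 bits of the operand clear, so the upper half of R[rs] passes through the OR unchanged; sign extension of the same immediate would give a negative value.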
16
Logical immediate - Execution
[Figure: multicycle datapath during the execution step of a logical immediate instruction; Instruction[15-0] is zero-extended from 16 to 32 bits. Control settings: ALUSrcA=1, ALUSrcB=2, ALUctr=Or, RegWrite=0, nPCWrite=0, MemRead=0, IRWrite=0]
17
Logical immediate - Write Back
18
Step 4: Load
Logical Register Transfers
  inst LW: R[rt] <- MEM[R[rs] + SExt(Im16)]; PC <- PC + 4
Step 1. Instruction Fetch: IR <- MEM[PC], PC <- PC + 4
Step 2. Instruction Decode and Register Fetch: A <- R[rs]
Step 3. Memory address computation: ALUOut <- A + SExt(Im16)
Step 4. Memory access: MDR <- Memory[ALUOut]
Step 5. Load completion: R[rt] <- MDR
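A sketch of the five load steps, assuming a single memory (a Python dict) that holds both a pre-decoded instruction and a data word; the function name `run_lw` and the dictionary layout are illustrative assumptions.

```python
def sext16(imm16):
    """Sign-extend a 16-bit immediate to a signed Python int."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def run_lw(regs, mem, pc):
    """Walk one lw through its five steps using one shared memory."""
    # Step 1. Instruction fetch: IR <- MEM[PC], PC <- PC + 4
    ir = mem[pc]
    pc = pc + 4
    # Step 2. Decode and register fetch: A <- R[rs]
    a = regs[ir["rs"]]
    # Step 3. Memory address computation: ALUOut <- A + SExt(Im16)
    alu_out = (a + sext16(ir["imm16"])) & 0xFFFFFFFF
    # Step 4. Memory access: MDR <- Memory[ALUOut]
    mdr = mem[alu_out]
    # Step 5. Load completion: R[rt] <- MDR
    regs[ir["rt"]] = mdr
    return pc, 5


regs = [0] * 32
regs[11] = 0x108
mem = {
    0x000: {"rs": 11, "rt": 10, "imm16": 0xFFFC},  # stands for: lw $10, -4($11)
    0x104: 99,                                     # data word to be loaded
}
pc, cycles = run_lw(regs, mem, 0)
```

The same `mem` is read in step 1 (instruction) and step 4 (data), which is possible only because the two accesses fall in different cycles.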
19
Load - Address Calculation
[Figure: multicycle datapath during the memory address computation step of a load. Control settings: RegDst=x, ALUSrcA=1, ALUSrcB=2, ALUctr=Add, ExtOp=Sign, RegWrite=0, nPCWrite=0, MemRead=0, IRWrite=0]
20
Load - Memory Read
[Figure: multicycle datapath during the memory access step of a load; MemData is captured in the MDR. Control settings: IorD=1, MemRead=1, RegDst=x, ALUSrcA=x, ALUSrcB=x, ALUctr=x, ExtOp=x, RegWrite=0, nPCWrite=0, IRWrite=0]
21
Load - Write Back
22
Step 4: Store
Logical Register Transfers
  inst SW: MEM[R[rs] + SExt(Im16)] <- R[rt]; PC <- PC + 4
Step 1. Instruction Fetch: IR <- MEM[PC], PC <- PC + 4
Step 2. Instruction Decode and Register Fetch: A <- R[rs], B <- R[rt]
Step 3. Memory address computation: ALUOut <- A + SExt(Im16)
Step 4. Memory access: Memory[ALUOut] <- B
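In the store, B is read from the register file in step 2 but not used until step 4, which is why it has to be held in an internal register. A minimal sketch under the same assumptions as before (pre-decoded instruction, dict-based memory, illustrative name `run_sw`):

```python
def sext16(imm16):
    """Sign-extend a 16-bit immediate to a signed Python int."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def run_sw(regs, mem, pc):
    """Walk one sw through its four steps."""
    # Step 1. Instruction fetch: IR <- MEM[PC], PC <- PC + 4
    ir = mem[pc]
    pc = pc + 4
    # Step 2. Decode and register fetch: A <- R[rs], B <- R[rt]
    a = regs[ir["rs"]]
    b = regs[ir["rt"]]  # held until step 4
    # Step 3. Memory address computation: ALUOut <- A + SExt(Im16)
    alu_out = (a + sext16(ir["imm16"])) & 0xFFFFFFFF
    # Step 4. Memory access: Memory[ALUOut] <- B
    mem[alu_out] = b
    return pc, 4


regs = [0] * 32
regs[11], regs[10] = 0x100, 42
mem = {0x000: {"rs": 11, "rt": 10, "imm16": 8}}  # stands for: sw $10, 8($11)
pc, cycles = run_sw(regs, mem, 0)
```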
23
Store - Address Calculation
24
Store - Memory Write
25
Step 4: Branch
Logical Register Transfers
  inst BEQ: if R[rs] == R[rt] then PC <- PC + 4 + SExt(Im16) || 00
            else PC <- PC + 4
Step 1. Instruction Fetch: IR <- MEM[PC], PC <- PC + 4
Step 2. Instruction Decode and Register Fetch: A <- R[rs], B <- R[rt],
        ALUOut <- PC + SExt(Im16) || 00
Step 3. Branch completion: if A == B, PC <- ALUOut
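The branch finishes in three steps because the target address is computed during decode, before the comparison result is known. A sketch, again with a pre-decoded instruction and the illustrative name `run_beq`:

```python
def sext16(imm16):
    """Sign-extend a 16-bit immediate to a signed Python int."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def run_beq(regs, mem, pc):
    """Walk one beq through its three steps."""
    # Step 1. Instruction fetch: IR <- MEM[PC], PC <- PC + 4
    ir = mem[pc]
    pc = pc + 4
    # Step 2. Decode and register fetch, plus the branch target:
    # ALUOut <- PC + SExt(Im16) || 00 (PC already holds PC + 4)
    a = regs[ir["rs"]]
    b = regs[ir["rt"]]
    alu_out = (pc + (sext16(ir["imm16"]) << 2)) & 0xFFFFFFFF
    # Step 3. Branch completion: if A == B, PC <- ALUOut
    if a == b:
        pc = alu_out
    return pc, 3


mem = {0x100: {"rs": 8, "rt": 9, "imm16": 3}}  # stands for: beq $8, $9, +3 words

regs = [0] * 32
regs[8] = regs[9] = 5
taken_pc, cycles = run_beq(regs, mem, 0x100)

regs[9] = 6
not_taken_pc, _ = run_beq(regs, mem, 0x100)
```

Appending `00` to the sign-extended immediate is the same as shifting it left by two bits, which is how the sketch expresses it.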
26
Branch - Address Calculation
27
Branch - Execution
28
Multicycle Processor
29
Summary of Instruction Steps
30
Simple Questions
  • How many cycles will it take to execute this
    code?
        lw  $t2, 0($t3)
        lw  $t3, 4($t3)
        beq $t2, $t3, Label   # assume not taken
        add $t5, $t2, $t3
        sw  $t5, 8($t3)
      Label: ...
  • What is going on during the 8th cycle of
    execution?
  • In what cycle does the actual addition of $t2 and
    $t3 take place?
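One way to check answers to these questions is to lay the instructions out cycle by cycle. The sketch below assumes the per-instruction step counts used elsewhere in this lecture (load 5, store 4, R-type 4, branch 3) and that instructions execute strictly one after another; the function name `schedule` is illustrative.

```python
# Steps per instruction type, as in the lecture's step summaries.
CYCLES = {"lw": 5, "sw": 4, "add": 4, "beq": 3}

# The code sequence from the question, with the branch not taken.
program = ["lw", "lw", "beq", "add", "sw"]


def schedule(program):
    """Return one (cycle, instruction index, opcode, step) tuple per cycle."""
    timeline = []
    cycle = 0
    for index, op in enumerate(program):
        for step in range(1, CYCLES[op] + 1):
            cycle += 1
            timeline.append((cycle, index, op, step))
    return timeline


timeline = schedule(program)
total_cycles = len(timeline)
eighth_cycle = timeline[7]
# The addition of an R-type instruction happens in its step 3 (execution).
add_cycle = next(c for c, _, op, step in timeline if op == "add" and step == 3)
```

Under these assumptions the sequence takes 21 cycles, the 8th cycle is step 3 (memory address computation) of the second `lw`, and the addition happens in cycle 16.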

31
Finite State Machine (FSM) Controller
  • State specifies control points for Register
    Transfer
  • Transfer occurs upon exiting state (same falling
    edge)
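A controller of this kind can be sketched as a table from states to register transfers plus a next-state function. The state names below are assumptions made for illustration (the slides' own state diagram is not reproduced in this transcript); the transfers are the ones listed in the per-instruction steps.

```python
# Register transfer performed in each state (state names are illustrative).
REGISTER_TRANSFERS = {
    "fetch": "IR <- MEM[PC], PC <- PC + 4",
    "decode": "A <- R[rs], B <- R[rt]",  # beq also computes its target here
    "execute": "ALUOut <- A op B",
    "write_back": "R[rd] <- ALUOut",
    "mem_addr": "ALUOut <- A + SExt(Im16)",
    "mem_read": "MDR <- Memory[ALUOut]",
    "load_completion": "R[rt] <- MDR",
    "mem_write": "Memory[ALUOut] <- B",
    "branch_completion": "if A == B, PC <- ALUOut",
}


def next_state(state, op):
    """Next-state function for the instruction classes rtype, lw, sw, beq."""
    if state == "fetch":
        return "decode"
    if state == "decode":
        return {"rtype": "execute", "lw": "mem_addr",
                "sw": "mem_addr", "beq": "branch_completion"}[op]
    if state == "execute":
        return "write_back"
    if state == "mem_addr":
        return "mem_read" if op == "lw" else "mem_write"
    if state == "mem_read":
        return "load_completion"
    return "fetch"  # every final state returns to instruction fetch


def trace(op):
    """List the states visited by one instruction, starting at fetch."""
    states = ["fetch"]
    while True:
        following = next_state(states[-1], op)
        if following == "fetch":
            return states
        states.append(following)
```

The length of `trace(op)` is the number of cycles that instruction class takes, for example five states for `lw` and three for `beq`.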

32
FSM for Control
33
Step 4 => Control Specification
34
Step 5 => (datapath + state diagram => control)
  • Translate RTs into control points
  • Assign states
  • Then go build the controller

35
Mapping RTs to Control Points
36
Assigning States
37
Control Logic: Datapath Control Outputs
38
Control Logic: Next State Function
39
PLA Implementation
40
Performance Evaluation
  • What is the average CPI?
  • State diagram gives CPI for each instruction type
  • Workload gives frequency of each type

Type          CPIi for type   Frequency   CPIi x freqi
Arith/Logic   4               40%         1.6
Load          5               30%         1.5
Store         4               10%         0.4
Branch        3               20%         0.6
Average CPI                               4.1
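The average CPI is the frequency-weighted sum of the per-type CPIs. A short sketch that recomputes it from the values in the table (the variable names are illustrative):

```python
# (CPI for type, fraction of the workload), taken from the table.
workload = {
    "Arith/Logic": (4, 0.40),
    "Load": (5, 0.30),
    "Store": (4, 0.10),
    "Branch": (3, 0.20),
}

# Average CPI = sum over types of CPI_i * freq_i
average_cpi = sum(cpi * freq for cpi, freq in workload.values())
print(round(average_cpi, 2))
```

The four products are 1.6, 1.5, 0.4, and 0.6, which sum to 4.1.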
41
Another Implementation Style
42
Address Select Logic
43
Address Select Logic
44
Dispatch ROMs

Dispatch ROM 1
Op       Opcode name   Value
000000   R-format      0110
000010   jmp           1001
000100   beq           1000
100011   lw            0010
101011   sw            0010

Dispatch ROM 2
Op       Opcode name   Value
100011   lw            0011
101011   sw            0101
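The two dispatch ROMs can be modelled as lookup tables indexed by opcode. The ROM contents below are taken from the tables; the per-state address-control values in `ADDR_CTL` are an assumption following a common textbook scheme (0 = return to state 0, 1 = use dispatch ROM 1, 2 = use dispatch ROM 2, 3 = go to the next sequential state), since the address select logic is not reproduced in this transcript.

```python
# Opcode -> next state, from the two dispatch ROM tables.
DISPATCH_ROM_1 = {
    0b000000: 0b0110,  # R-format
    0b000010: 0b1001,  # jmp
    0b000100: 0b1000,  # beq
    0b100011: 0b0010,  # lw
    0b101011: 0b0010,  # sw
}
DISPATCH_ROM_2 = {
    0b100011: 0b0011,  # lw
    0b101011: 0b0101,  # sw
}

# ASSUMED address-control value for each state (not given in the transcript).
ADDR_CTL = {0: 3, 1: 1, 2: 2, 3: 3, 4: 0, 5: 0, 6: 3, 7: 0, 8: 0, 9: 0}


def next_state(state, opcode):
    """Address select logic: choose the next state number."""
    ctl = ADDR_CTL[state]
    if ctl == 0:
        return 0
    if ctl == 1:
        return DISPATCH_ROM_1[opcode]
    if ctl == 2:
        return DISPATCH_ROM_2[opcode]
    return state + 1


def state_trace(opcode):
    """States visited by one instruction, starting from state 0."""
    states = [0]
    while True:
        following = next_state(states[-1], opcode)
        if following == 0:
            return states
        states.append(following)
```

For example, `state_trace(0b100011)` follows a load through states 0, 1, 2, 3, 4, where state 2 comes from dispatch ROM 1 and state 3 from dispatch ROM 2.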
45
Microprogramming
Microprogramming: designing the control as a
program that implements the machine instructions
in terms of microinstructions
46
Microinstruction ???
[Figure: main memory holds the user program plus data (this can change!), e.g. ADD, SUB, AND, ..., DATA. Each of these machine instructions is mapped into a microsequence held in the CPU's control memory, which drives the execution unit. Example AND microsequence: Fetch, Calc Operand Addr, Fetch Operand(s), Calculate, Save Answer(s)]
47
Microprogramming a Multicycle Processor
  • 1) Choose datapath and sequencer architecture
  • 2) Assign states and sequence of each
    (multicycle) instruction (i.e., define the
    controller FSM)
  • 3) Choose microinstruction format (minimum bits
    to describe all allowable functions of sequencer
    and datapath)
  • 4) Map instructions into microinstruction
    sequences

48
Designing a Microinstruction Set
  • 1) Start with list of control signals
  • 2) Group signals that make sense together into
    fields
  • 3) Place fields in some logical order (e.g., ALU
    operation and ALU operands first and
    microinstruction sequencing last)
  • 4) Create a symbolic legend for the
    microinstruction format, showing name of field
    values and how they set the control signals
  • 5) To minimize the width, encode operations that
    will never be used at the same time

49
Multicycle Processor
50
Microinstruction fields
51
Sequencer
52
Microinstructions
Fetch and Decode
R-type instructions
53
Microinstructions
Memory-reference
Branch
54
Microprogram
55
Summary
  • Disadvantages of the Single Cycle Processor
      - Long cycle time
      - Cycle time is too long for all instructions
        except the Load
  • Multiple Cycle Processor
      - Divide the instructions into smaller steps
      - Execute each step (instead of the entire
        instruction) in one cycle
  • Partition datapath into equal size chunks to
    minimize cycle time
  • Follow the same 5-step method for designing a real
    processor