Modeling CPU - PowerPoint PPT Presentation

Title: Modeling CPU

1
Modeling CPUs using Different MOCs: a Case Study

Trevor C. Meyerowitz
Advisor: Alberto Sangiovanni-Vincentelli
290n Final Presentation
May 15, 2002

2
Outline

Introduction
Motivation
The Simple CPU to be modeled
The Domains Investigated
Modeling a Non-Pipelined Processor
Modeling a Pipelined Processor
Demo
Conclusions

3
Motivation

Processor Designs are becoming much larger and more complicated
Many instructions in flight at a single time
Strange Orderings, Speculation
This can be very hard to verify
We are developing a methodology to help alleviate these problems.
Using Different Models of Computation can Potentially Simplify the Design Task
PtolemyII Allows us to Compare a Variety of these MOCs in a Unified Framework

4
The Simple CPU

Processor Statistics
Small Instruction Set
ADD, SUB, ADDI, SUBI, and BNE
Only Integer Operations
128 registers, 128 entry instruction memory
This is enough to be interesting
Data dependencies
Control flow
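
The instruction set above is small enough to sketch as an interpreter. The following is a minimal sketch, not the PtolemyII model from the presentation. The three-field encoding, the class names, and the BNE semantics (compare two registers, add the third field to the branch's own address when they differ) are assumptions made for illustration; only the opcode list and the 128-entry sizes come from the slide.

```java
/** Opcodes listed on the slide. */
enum Op { ADD, SUB, ADDI, SUBI, BNE }

/** Hypothetical encoding: an opcode plus three integer fields. */
final class Inst {
    final Op op;
    final int a, b, c;
    Inst(Op op, int a, int b, int c) { this.op = op; this.a = a; this.b = b; this.c = c; }
}

final class SimpleCpu {
    static final int NUM_REGS = 128;   // from the slide
    static final int IMEM_SIZE = 128;  // from the slide

    final int[] regs = new int[NUM_REGS];
    final Inst[] imem = new Inst[IMEM_SIZE];
    int pc = 0;

    /** Executes one instruction; returns false once pc leaves the loaded program. */
    boolean step() {
        if (pc < 0 || pc >= IMEM_SIZE || imem[pc] == null) return false;
        Inst i = imem[pc];
        int next = pc + 1;
        switch (i.op) {
            case ADD:  regs[i.a] = regs[i.b] + regs[i.c]; break;
            case SUB:  regs[i.a] = regs[i.b] - regs[i.c]; break;
            case ADDI: regs[i.a] = regs[i.b] + i.c; break;
            case SUBI: regs[i.a] = regs[i.b] - i.c; break;
            // Assumed semantics: branch offset is relative to the branch itself.
            case BNE:  if (regs[i.a] != regs[i.b]) next = pc + i.c; break;
        }
        pc = next;
        return true;
    }

    /** Runs until the program ends or maxSteps is reached; returns steps executed. */
    int run(int maxSteps) {
        int steps = 0;
        while (steps < maxSteps && step()) steps++;
        return steps;
    }
}

final class SimpleCpuDemo {
    public static void main(String[] args) {
        SimpleCpu cpu = new SimpleCpu();
        cpu.imem[0] = new Inst(Op.ADDI, 1, 0, 3);  // R1 = R0 + 3 (loop counter)
        cpu.imem[1] = new Inst(Op.ADDI, 2, 2, 5);  // R2 = R2 + 5
        cpu.imem[2] = new Inst(Op.SUBI, 1, 1, 1);  // R1 = R1 - 1
        cpu.imem[3] = new Inst(Op.BNE, 1, 0, -2);  // loop back to address 1 while R1 != R0
        int steps = cpu.run(1000);
        System.out.println("steps=" + steps + " R2=" + cpu.regs[2]); // steps=10 R2=15
    }
}
```

The loop in the demo has both of the properties the slide calls interesting: R1 carries a data dependency from SUBI into BNE, and BNE changes control flow.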

5
The Domains Investigated

Process Networks
Untimed Model
Kahn-MacQueen Semantics
Infinite Queues
Blocking Reads
Fully Deterministic
Schedule Independent
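
The Process Networks properties listed above can be illustrated with plain Java threads. This is a rough sketch, not PtolemyII's PN director: the class name, the three-process network, and the end-of-stream marker are all hypothetical. Unbounded `LinkedBlockingQueue`s stand in for the infinite queues, and `take()` provides the blocking read.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

final class KahnSketch {
    static final int END = Integer.MIN_VALUE; // hypothetical end-of-stream marker

    /** Runs source -> accumulator -> sink as three threads; returns what the sink saw. */
    static List<Integer> runNetwork(int count) {
        BlockingQueue<Integer> a = new LinkedBlockingQueue<>(); // unbounded FIFO
        BlockingQueue<Integer> b = new LinkedBlockingQueue<>();
        List<Integer> seen = new ArrayList<>();

        Thread source = new Thread(() -> {
            for (int i = 1; i <= count; i++) a.add(i);
            a.add(END);
        });
        Thread accumulator = new Thread(() -> {
            try {
                int sum = 0;
                // take() blocks until a token is available; there is no way to
                // test the queue first, matching the blocking-read rule.
                for (int v = a.take(); v != END; v = a.take()) {
                    sum += v;
                    b.add(sum);
                }
                b.add(END);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        Thread sink = new Thread(() -> {
            try {
                for (int v = b.take(); v != END; v = b.take()) seen.add(v);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        // Start order is deliberately "backwards"; the result does not depend on it.
        sink.start();
        accumulator.start();
        source.start();
        try {
            source.join();
            accumulator.join();
            sink.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for the network", e);
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(runNetwork(4)); // [1, 3, 6, 10]
    }
}
```

Because each process only ever blocks on a read and never polls, the sequence seen by the sink is the same for every thread interleaving, which is the schedule independence the slide refers to.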

Synchronous Reactive
Untimed Model
Instantaneous Communication and Computation
Iterates Until a Fixed Point is Found
Signals must be monotonic
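
The fixed-point iteration described above can be sketched for a single tick. This is an illustrative sketch under stated assumptions, not PtolemyII's SR director: the `Actor` interface, the `SrTick` class, and the three-signal example are hypothetical. A `null` entry stands for an unknown signal, and each actor may only move a signal from unknown to known, which is the monotonicity requirement.

```java
import java.util.Arrays;

interface Actor {
    /** May turn unknown (null) signals into known ones, never the reverse. */
    void fire(Integer[] signals);
}

final class SrTick {
    /** Fires every actor repeatedly until no signal changes; returns passes used. */
    static int solve(Integer[] signals, Actor[] actors) {
        int passes = 0;
        boolean changed = true;
        while (changed) {
            Integer[] before = signals.clone();
            for (Actor a : actors) a.fire(signals);
            changed = !Arrays.equals(before, signals);
            passes++;
        }
        return passes;
    }

    public static void main(String[] args) {
        Integer[] s = new Integer[3]; // x, y, z start unknown
        Actor[] actors = {
            sig -> { if (sig[1] != null) sig[2] = sig[1] * 2; }, // z = y * 2
            sig -> { if (sig[0] != null) sig[1] = sig[0] + 1; }, // y = x + 1
            sig -> { sig[0] = 3; }                                // x = 3
        };
        int passes = solve(s, actors);
        // Actors are listed against the data flow, so one signal resolves per pass.
        System.out.println(passes + " passes, signals = " + Arrays.toString(s));
        // prints: 4 passes, signals = [3, 4, 8]
    }
}
```

The actors are listed in the worst possible order on purpose. The result is the same as for any other order, only the number of passes changes; the last pass is the one that confirms nothing changed.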

6
The Nonpipelined Processor

Code and netlist reused for both domains (i.e. these are domain polymorphic actors)
Represented in PtolemyII as Fetch, Regfile, Execute and a Delay.
Fetch only after previous instruction has completed

7
Non-Pipelined Processor Pseudocode (Fetch, Regfile)

public class Fetch {
  public fire() {
    pc = input_pc.get(0);
    <rs, rt, rd, val, inst> = readIMEM(pc);
    output_inst.send(0, inst);
    output_regs.send(rs, rt);
  }
}

public class Reg {
  public fire() {
    if (read_mode) {
      inst = input_get_op_codes();
      <rs_v, rt_v> = read_regs();
      output_regs.send(0, inst);
      output_regs.send(rs_v, rt_v);
    } else {
      rd_v = input_get_write_vals();
      write_values();
    }
    read_mode = !read_mode;
  }
}

8
Non-Pipelined Processor Pseudocode (Execute)

public class Exec {
  public fire() {
    if (write_mode == false) {
      reg_vals = input_reg_vals();
      inst_type = read_inst();
      results = exec_inst(inst_type, reg_vals);
    } else {
      write_values(rd, results);
      write_next_pc(results);
    }
    write_mode = !write_mode;
  }
}

9
Non-Pipelined Processor Differences between
Domains

SR required that we put the register read and register write in different iterations, and that we split execution from the writing of its results
Process networks cannot query port status
SR requires use of prefire and postfire conditions
We shared code between the two domains; SR probably has more flexibility.

10
Pipelined Processor

Only required recoding of fetch behavior
Fetch every iteration
Only stall after branches (no branch prediction)
No forwarding logic is required!?
This is because two register reads can't occur without a register write happening between them
Due to PN deterministic requirement
Also true for SR, because of states
Probably could structure SR to require forwarding logic (lower level of abstraction!!)

11
Pipelined Processor Fetch pseudo-code

public class Fetch {
  public fire() {
    if (initial_firing || prev_inst_is_branch) {
      pc = input_pc.get(0);
    }
    <rs, rt, rd, val, inst> = readIMEM(pc);
    output_inst.send(0, inst);
    output_regs.send(rs, rt);
    pc = pc + 1;
  }
}

Reading input_pc causes you to stall until the branch is finished.
Immediately fires again if there is no branch!

12
Pipelining and Forwarding (t0)

Stages shown: Program, Fetch, Reg File, Exec
Inst_2: R3, R1(?), R1(?)
Inst_1: R1, R2(4), R3(5)
Register File State: R1 = 2, R2 = 4, R3 = 5

13
Pipelining and Forwarding (t1)

Stages shown: Program, Fetch, Reg File, Exec
Inst_2: R3, R1(2), R1(2)
Inst_1: R1(9), R2(4), R3(5)
Register File State: R1 = 2, R2 = 4, R3 = 5
This is an error!! Inst_2 should read R1 as 9. We can solve this by adding forwarding logic, or stalling the pipeline
The PN and SR models don't have this problem because they enforce the order read inst_1, write inst_1, read inst_2

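
The hazard on this slide can be reproduced in a few lines. This is a minimal sketch with assumptions: both instructions are treated as ADDs (the slide shows only the operands, though R1 = 9 is consistent with 4 + 5), and the class and parameter names are hypothetical. The register values are the ones on the slide.

```java
final class ForwardingSketch {
    /**
     * Runs Inst_1 (R1 = R2 + R3) followed by Inst_2 (R3 = R1 + R1).
     * overlapped: Inst_2 reads its operands before Inst_1 writes back.
     * forwarding: Inst_2 takes Inst_1's in-flight result instead of the register file.
     */
    static int[] run(boolean overlapped, boolean forwarding) {
        int[] r = new int[4];
        r[1] = 2; r[2] = 4; r[3] = 5;       // register file state from the slide

        int inst1Result = r[2] + r[3];      // 9, computed in Exec, not yet written back

        int a, b;
        if (overlapped) {
            // Inst_2 reads R1 while Inst_1's result is still in flight.
            a = forwarding ? inst1Result : r[1];
            b = forwarding ? inst1Result : r[1];
            r[1] = inst1Result;             // Inst_1 write-back happens afterwards
        } else {
            // Order enforced by the PN and SR models: read 1, write 1, read 2.
            r[1] = inst1Result;
            a = r[1];
            b = r[1];
        }
        r[3] = a + b;                       // Inst_2 writes its result
        return r;
    }

    public static void main(String[] args) {
        System.out.println(run(true, false)[3]);  // 4: stale R1 = 2 was read (the error)
        System.out.println(run(true, true)[3]);   // 18: forwarding supplies R1 = 9
        System.out.println(run(false, false)[3]); // 18: ordering alone avoids the hazard
    }
}
```

Stalling the pipeline corresponds to the non-overlapped case: Inst_2's read is simply delayed until after the write-back.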
14
Pipelined Processor with Branch Prediction

Still in order, but branches are predicted instead of stalling.
Requires recoding of Fetch and the Register File
Fetch
Performs branch prediction
Handles mispredicts
Register File
Keeps a queue of instructions
Stall on dependencies
Only write resolved instructions to regfile
This represents one refinement path
Biased towards Process Networks

15
DEMO TIME

Program Code
Inst RD, RS, RT (Val)
ADD 5 5 5
ADD 6 5 5
BNE 5 20 -3
ADD 7 6 6
ADD 8 7 7
ADD 9 8 8
ADD 10 9 9
SUB 11 10 50

16
Outline

Introduction
Modeling a Non-Pipelined Processor
Modeling a Pipelined Processor
Demo
Conclusions
Other Architectural Features
Observations
Future Work

17
Other Architectural Features

Out of Order Execution
Requires breaking of PN model
Superscalar execution
Multiple fetches at once. Might be problematic to do in PN.
Memory systems
Initially simple, more complicated when refinements are added.

18
Observations

Process Networks are relatively easy to use and are quite predictable.
Process Networks are great for initial abstract models.
Synchronous Reactive is simpler than DE to work with, but more complicated to design than PNs.
PN doesn't deal well with ordering refinements, but SR can handle them better.
We envision a methodology where you start with a PN model and then move to an SR model.

19
Future Work