Pipelining Motivation and Basic Design - PowerPoint PPT Presentation

1
Pipelining Motivation and Basic Design
  • ECE 411 - Fall 2009
  • Lecture 9

2
LC-3b Data Path
3
LC-3b Control FSM
4
Instruction Latencies vs. Throughput
  • Multiple Cycle CPU
  [Timing diagram: Cycles 1–5, one instruction completes before the next begins]
  • Pipelined CPU
  [Timing diagram: Cycles 1–8, instructions overlap across Ifetch/Reg-Dec/Exec stages]
5
Pipelining Analogy
  • Pipelined laundry: overlapping execution
  • Parallelism improves performance
  • Four loads: Speedup = 8 h / 3.5 h ≈ 2.3
  • Non-stop: Speedup = 2n / (0.5n + 1.5) ≈ 4 = number of stages
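
The laundry arithmetic above can be checked with a short Python sketch; the 2-hour load time, 0.5-hour stage time, and 4 stages are the slide's numbers:

```python
def sequential_time(loads, load_hours=2.0):
    """Each load takes the full 2 hours (wash, dry, fold, store) before the next starts."""
    return loads * load_hours

def pipelined_time(loads, stage_hours=0.5, stages=4):
    """The first load fills all stages; each later load adds one stage time."""
    return stages * stage_hours + (loads - 1) * stage_hours

# Four loads: 8 h sequential vs. 3.5 h pipelined, speedup 8/3.5 ~ 2.3.
print(sequential_time(4) / pipelined_time(4))
# Non-stop (large n): speedup 2n/(0.5n + 1.5) approaches 4, the number of stages.
print(sequential_time(1000) / pipelined_time(1000))
```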

6
Pipeline Performance
[Timing diagram comparing single-cycle and pipelined execution]
7
Pipelining the LC-3b
  • Note that the instruction fetch, decode, and
    execution operations use different hardware
  • There is some contention for the MAR/MDR, and a conflict over the PC update
  • MAR/MDR contention is why separate instruction
    and data caches are common
  • Example pipeline organization: Load instruction
  • Instruction Fetch
  • Decode/Register read
  • Execute
  • Address calculation
  • Mem access
  • Result writeback

8
Pipelining Example
9
Pipelined Data Path
10
What's Wrong With This Pipeline?
  • Uneven pipeline stages
  • Execute part will take much longer than the other
    two stages
  • Pipeline performance is limited by the latency of
    the longest stage
  • How do we keep data from leaking across stage
    boundaries?
  • Need a latch between each pipeline stage
  • Solution: Divide the pipeline into more stages
    so that each stage has equal latency, i.e.,
    balance the pipeline stages
  • Instruction Fetch
  • Decode/Register Read
  • Execute
  • Memory Access
  • Writeback
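
The claim that pipeline performance is limited by the slowest stage can be illustrated numerically; the stage latencies below are made-up figures for the sketch, not LC-3b measurements:

```python
def pipeline_cycle_time(stage_latencies_ns):
    """The clock period is set by the slowest stage (latch overhead ignored)."""
    return max(stage_latencies_ns)

def total_time_ns(stage_latencies_ns, n_instructions):
    """Fill the pipeline for the first instruction, then one cycle each after."""
    cycle = pipeline_cycle_time(stage_latencies_ns)
    return cycle * (len(stage_latencies_ns) + n_instructions - 1)

unbalanced = [2, 2, 6]        # fetch, decode, long execute (ns) -> 6 ns clock
balanced = [2, 2, 2, 2, 2]    # same total work in five equal stages -> 2 ns clock

print(total_time_ns(unbalanced, 100))  # limited by the 6 ns execute stage
print(total_time_ns(balanced, 100))    # faster overall despite more stages
```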

11
More Realistic Pipeline
12
Example With New Pipeline
  • More stages, i.e., more clock cycles to traverse,
    but probably a shorter period (faster clock)

13
MIPS Pipeline (textbook)
  • Five stages, one step per stage
  • IF: Instruction fetch from memory
  • ID: Instruction decode and register read
  • EX: Execute operation or calculate address
  • MEM: Access memory operand
  • WB: Write result back to register
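
The stage-per-cycle behavior of these five stages can be tabulated with a minimal sketch, assuming an ideal pipeline with no stalls:

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]

def pipeline_diagram(n_instructions):
    """Map each instruction to the cycle in which it occupies each stage,
    assuming one instruction enters the pipeline per cycle with no hazards."""
    return {
        i: {stage: i + s + 1 for s, stage in enumerate(STAGES)}
        for i in range(n_instructions)
    }

for i, occupancy in pipeline_diagram(3).items():
    print(i, occupancy)
# Instruction 0 is in IF during cycle 1 and reaches WB in cycle 5;
# each later instruction trails one cycle behind.
```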

14
Pipeline Speedup
  • If all stages are balanced, i.e., all take the same time:
    Time between instructions (pipelined) =
    Time between instructions (non-pipelined) / Number of stages
  • If not balanced, speedup is less
  • Speedup due to increased throughput
  • Latency (time for each instruction) does not
    decrease
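
The balanced-stage relation and the throughput/latency distinction can be checked with a small sketch; the 800 ns non-pipelined instruction time is a hypothetical figure, not from the slides:

```python
def pipelined_issue_time_ns(nonpipelined_time_ns, n_stages):
    """Ideal time between instruction completions when all stages are balanced."""
    return nonpipelined_time_ns / n_stages

single_cycle_ns = 800   # hypothetical non-pipelined instruction time
issue = pipelined_issue_time_ns(single_cycle_ns, 5)
print(issue)            # time between instructions drops 5x: a throughput gain
print(issue * 5)        # per-instruction latency is unchanged at 800 ns
```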

15
Pipelining and ISA Design
  • MIPS ISA designed for pipelining
  • All instructions are 32 bits
  • Easier to fetch and decode in one cycle
  • cf. x86: 1- to 17-byte instructions
  • Few and regular instruction formats
  • Can decode and read registers in one step
  • Load/store addressing
  • Can calculate address in 3rd stage, access memory
    in 4th stage
  • Alignment of memory operands
  • Memory access takes only one cycle

16
Hazards
  • Situations that prevent starting the next
    instruction in the next cycle
  • Structure hazards
  • A required resource is busy
  • Data hazard
  • Need to wait for previous instruction to complete
    its data read/write
  • Control hazard
  • Deciding on control action depends on previous
    instruction
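
As one concrete illustration, a read-after-write data hazard can be detected mechanically; the tuple instruction format here is invented for the sketch, not a real LC-3b or MIPS encoding:

```python
def raw_hazard(producer, consumer):
    """True if the consumer reads a register the producer writes
    (a read-after-write data hazard). Instructions are made-up
    (op, dest, src1, src2) tuples."""
    _, dest, *_ = producer
    _, _, *sources = consumer
    return dest in sources

add = ("add", "r1", "r2", "r3")   # writes r1
sub = ("sub", "r4", "r1", "r5")   # reads r1 -> must wait for add's result
print(raw_hazard(add, sub))       # True
```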

17
Structure Hazards
  • Conflict for use of a resource
  • In MIPS pipeline with a single memory
  • Load/store requires data access
  • Instruction fetch would have to stall for that
    cycle
  • Would cause a pipeline bubble
  • Hence, pipelined data paths prefer separate
    instruction/data memories
  • Or separate instruction/data caches
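
The single-memory stall can be sketched as a toy scheduler; the instruction names and the "MEM happens three cycles after fetch" offset assume the five-stage pipeline above:

```python
def fetch_cycles_single_memory(instrs):
    """Assign a fetch cycle to each instruction, stalling fetch whenever an
    earlier load/store occupies the single shared memory in the same cycle.
    MEM is the 4th stage, so instruction j accesses memory at fetch[j] + 3."""
    fetch = []
    cycle = 1
    for i in range(len(instrs)):
        while any(instrs[j] in ("lw", "sw") and fetch[j] + 3 == cycle
                  for j in range(i)):
            cycle += 1  # bubble: fetch yields the memory port for one cycle
        fetch.append(cycle)
        cycle += 1
    return fetch

# The lw's MEM access in cycle 4 collides with the fourth fetch,
# which therefore slips from cycle 4 to cycle 5.
print(fetch_cycles_single_memory(["lw", "add", "add", "add"]))
```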