Title: Superscalar Microprocessors


1
Superscalar Microprocessors
  • Robert Hock
  • 4/23/02

2
Superscalar Microprocessors
  • Topics Covered
  • Superscalar Processor Overview
  • MIPS R10000
  • Intel IA-32
  • PowerPC

3
What does superscalar mean?
  • Definition
  • Superscalar machines are able to issue multiple
    instructions for each clock cycle from a
    conventional linear instruction stream

4
In English This Time
  • A superscalar processor can run code out of
    sequence in order to optimize it. Instructions
    with different latencies introduce stalls into
    program execution. By pipelining these
    instructions across several functional units, it
    is possible to execute multiple instructions at
    once, out of program order.

5
How Does it Work?
  • Instructions are introduced in sequence
  • These instructions are scheduled dynamically by
    the hardware
  • More than one instruction can be issued each
    clock cycle
  • The number of instructions issued is also set
    dynamically by the hardware
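
The points above can be illustrated with a small sketch. It is a toy model, not a description of any real processor: it assumes every instruction takes one cycle, that the machine issues in program order, and that an issue group ends as soon as an instruction depends on a result produced in the same group. The names Instr and issue_in_order are made up for this example.

```python
from typing import NamedTuple


class Instr(NamedTuple):
    dest: str     # register written by the instruction
    srcs: tuple   # registers read by the instruction


def issue_in_order(program, width=2):
    """Group a sequential program into per-cycle issue groups.

    Toy assumptions: single-cycle latency, in-order issue, and at most
    `width` instructions issued per cycle.
    """
    cycles = []
    i = 0
    while i < len(program):
        group = []
        written = set()  # registers written earlier in this same group
        while i < len(program) and len(group) < width:
            instr = program[i]
            # Stop the group if this instruction reads or overwrites a
            # register produced by an instruction already in the group.
            if written & set(instr.srcs) or instr.dest in written:
                break
            group.append(instr)
            written.add(instr.dest)
            i += 1
        cycles.append(group)
    return cycles


program = [
    Instr("r1", ("r2", "r3")),
    Instr("r4", ("r5", "r6")),
    Instr("r7", ("r1", "r4")),
    Instr("r8", ("r7",)),
    Instr("r9", ("r2",)),
]
# The number issued per cycle varies with the dependencies: [2, 1, 2]
print([len(group) for group in issue_in_order(program)])
```

The instructions enter in sequence, but the group size each cycle is decided from the dependencies found at run time rather than fixed in advance.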

6
Phases of the Superscalar Pipeline
  • Fetch
  • Pre-fetch
  • Decode
  • Rename
  • Issue
  • Execute
  • Complete
  • Reorder
  • Commit
  • Retire
  • Write-Back

7
Fetch & Decode
  • Fetching & Decoding can be done faster than
    Execution
  • Processor Fetches & Decodes more instructions
    than it Commits, because it discards instructions
    from mispredicted branch paths

8
Pre-Fetch & Pre-Decoding
  • Pre-Decoding is done when instructions are
    transferred from memory to the cache
  • The Pre-Decoded instruction is simpler than
    the original
  • The Decoder can decode this format faster than
    the original

9
Renaming
  • Renaming is the process of assigning physical
    registers to take the place of logical registers
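
A minimal sketch of renaming, under simplifying assumptions: the register names (r0.., p0..), the table sizes, and the function name rename are invented for illustration, and physical registers are never returned to the free list, which real hardware does when instructions commit.

```python
def rename(program, num_logical=4, num_physical=8):
    """Rewrite (dest, srcs) instructions to use physical registers.

    Each result gets a fresh physical register, so two writes to the
    same logical register no longer conflict with each other.
    """
    # Mapping table: logical register -> current physical register.
    mapping = {f"r{i}": f"p{i}" for i in range(num_logical)}
    # Physical registers not yet assigned to anything.
    free = [f"p{i}" for i in range(num_logical, num_physical)]
    renamed = []
    for dest, srcs in program:
        # Sources are looked up before the destination is remapped.
        new_srcs = tuple(mapping[s] for s in srcs)
        new_dest = free.pop(0)  # raises IndexError if no register is free
        mapping[dest] = new_dest
        renamed.append((new_dest, new_srcs))
    return renamed


program = [
    ("r1", ("r2", "r3")),  # r1 = r2 op r3
    ("r2", ("r1",)),       # r2 = op r1   (overwrites a source of the first)
    ("r1", ("r0",)),       # r1 = op r0   (second write to r1)
]
for instr in rename(program):
    print(instr)
```

After renaming, the third instruction writes p6 instead of reusing the register holding the first result, so it no longer has to wait for the second instruction to read that result.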

10
Issue
  • Waiting instructions are analyzed to find
    instructions beyond the current instructions that
    can be executed independently
  • This is Look-Ahead capability
  • Instructions can be issued in-order or
    out-of-order
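
The look-ahead step can be sketched as a scan over the waiting instructions. This is an illustrative model only: it assumes registers have already been renamed, so an instruction is ready as soon as all of its source registers hold values, and the names pick_ready, window, and ready_regs are hypothetical.

```python
def pick_ready(window, ready_regs, width=2):
    """Select up to `width` waiting instructions whose sources are ready.

    `window` is a list of (dest, srcs) tuples in program order and
    `ready_regs` is the set of registers whose values are available.
    """
    picked = []
    for dest, srcs in window:
        if len(picked) == width:
            break
        # An instruction can issue only if every source value exists.
        if all(s in ready_regs for s in srcs):
            picked.append((dest, srcs))
    return picked


window = [
    ("p4", ("p9",)),       # waits: p9 not yet produced
    ("p5", ("p1", "p2")),  # ready
    ("p6", ("p4",)),       # waits: depends on the first instruction
    ("p7", ("p3",)),       # ready
]
print(pick_ready(window, ready_regs={"p1", "p2", "p3"}))
```

Here the second and fourth instructions are issued ahead of the first, which is the out-of-order case; restricting the scan to stop at the first instruction that is not ready would give in-order issue instead.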

11
Execute
  • Instruction is Executed in either a single cycle,
    or may take multiple cycles
  • After Execution, the Completion phase is reached

12
Reorder
  • The Reorder logic determines whether the
    instruction was on a predicted branch path, and
    whether that prediction was correct
  • Execution exceptions are marked

13
Commit
  • An executed instruction is committed when:
  • All previous instructions required by the program
    have already been committed
  • No interrupt has occurred
  • If the instruction was executed speculatively
    after a branch prediction, that prediction was
    correct
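
One way to picture the commit rules is a queue of instructions held in program order, often called a reorder buffer. The sketch below is an assumption-laden toy: the dictionary fields (name, done, exception) and the function name commit_in_order are invented here, and branch misprediction is folded into the same exception flag for brevity.

```python
from collections import deque


def commit_in_order(rob):
    """Commit finished instructions from the head of the buffer.

    Commit stops at the first entry that has not finished executing or
    that is marked with an exception, so younger instructions never
    commit ahead of older ones.
    """
    committed = []
    while rob and rob[0]["done"] and not rob[0]["exception"]:
        committed.append(rob.popleft()["name"])
    return committed


rob = deque([
    {"name": "i0", "done": True, "exception": False},
    {"name": "i1", "done": False, "exception": False},  # still executing
    {"name": "i2", "done": True, "exception": False},   # finished early
])
print(commit_in_order(rob))  # only i0 commits; i2 must wait for i1
```

Even though i2 finished executing before i1, it stays in the buffer until i1 has committed, which is what keeps the visible program state in sequential order.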

14
Retire
  • An instruction is Retired when either:
  • The instruction has been committed, or
  • The instruction has been removed because of a
    branch misprediction or an exception

15
Write-Back
  • As the name implies, final instruction data is
    written back

16
MIPS R10000 Overview
  • 64-bit instruction set
  • Can decode 4 instructions per cycle
  • Has 5 execution pipelines
  • Uses dynamic scheduling and out-of-order
    execution
  • Executes speculatively past predicted branches

17
MIPS R10000 Pipeline Diagram
  • [Diagram not reproduced in this transcript]

18
R10000 Functional Units
  • Integer ALU1
  • Integer ALU2
  • Load/Store Unit
  • Float Adder
  • Float Multiply

19
R10000 Pipeline Stages
  • Stage 1
  • Fetch 4 Instructions per cycle
  • Stage 2
  • 4 Instructions are Decoded & Renamed
  • Only 1 Branch Instruction can be decoded per
    cycle
  • Stage 3
  • Decoded Instructions Issued

20
R10000 Pipeline Stages (cont)
  • Stages 4-6 (dependent on instruction)
  • Float Multiply (3 stage pipeline)
  • Float Adder (3 stage pipeline)
  • Integer ALU1 (1 stage pipeline)
  • Integer ALU2 (1 stage pipeline)

21
Intel IA-32 Overview
  • 32-bit instruction set
  • 3-way superscalar
  • 12-stage pipeline
  • Optimized (out-of-order) scheduling, which
    requires retiring instructions in program order

22
IA-32 Functional Units
  • Integer
  • Float
  • Load
  • Store1
  • Store2
  • Jump
  • MMX (Multimedia Instructions)

23
IA-32 Pipeline Stages
  • Stages 1-5
  • Fetch and Predecode
  • Stages 6-7
  • Decode
  • Stage 8
  • Renaming

24
IA-32 Pipeline (cont)
  • Stages 9-10
  • Issue
  • Stage 11
  • Execution
  • Stage 12
  • Retirement

25
IA-32 Latencies (cycles)
  • Integer Arithmetic: 1
  • Integer Mult: 4
  • Float Add: 3
  • Float Mult: 5
  • Load/Store: 3
  • MMX Arithmetic: 1
  • MMX Mult: 3
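
As a rough worked example, the figures above can be used to estimate how long a chain of dependent operations takes. This sketch assumes the latencies are in clock cycles, that each operation must wait for the previous result, and that nothing else stalls the pipeline; the dictionary keys and the function name chain_latency are made up for illustration.

```python
# Latencies copied from the slide, assumed to be in clock cycles.
LATENCY = {
    "int_arith": 1,
    "int_mult": 4,
    "float_add": 3,
    "float_mult": 5,
    "load_store": 3,
    "mmx_arith": 1,
    "mmx_mult": 3,
}


def chain_latency(ops):
    """Total cycles for a chain where each op needs the previous result."""
    return sum(LATENCY[op] for op in ops)


# Load a value, multiply it, add to it, and store the result:
# 3 + 5 + 3 + 3 = 14 cycles when every step depends on the one before.
print(chain_latency(["load_store", "float_mult", "float_add", "load_store"]))
```

Independent operations can overlap in the different functional units, so this sum is an upper bound for a chain of this length rather than a typical cost.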

26
PowerPC 750 Overview
  • 32-bit RISC processor (64-bit data bus)
  • 32-bit addressing

27
PowerPC 750 Functional Units
  • Float (3 Stage Pipeline)
  • Branch
  • Load/Store
  • Single Cycle Integer
  • Multi Cycle Integer

28
PowerPC Pipeline
  • Fetch
  • Issue
  • Integer OP (3 Depth)
  • Load OP (7 Depth)
  • Store OP (5 Depth)
  • Float OP (6 Depth)

29
Conclusion
  • While the R10000 and PowerPC are truly
    RISC-based, the IA-32 has its roots in the CISC world.
  • The IA-32 has a deeper pipeline, allowing for
    higher clock frequencies, which in turn helps
    sales. This is despite the fact that it
    delivers only mediocre performance.

30
Conclusion (cont)
  • For intensive numerical computation and 3D
    rendering the MIPS R10000 is superior
  • For everyday applications that require low
    voltage and low heat output, the PowerPC line
    has an edge.
  • For the home user, the IA-32 will be sufficient
    until the AMD 64-bit Hammer line is introduced.

31
For More Information
  • http://www.mips.com
  • http://www.intel.com
  • http://www.ibm.com
  • http://e-www.motorola.com/