CSL718: Superscalar Processors

Anshul Kumar, CSE IITD

1
CSL718: Superscalar Processors
  • Issue and Dispatch
  • 25th Jan, 2008

2
Seminar Announcement
  • Title: Beyond multicore: Cell/B.E. and trends in
    heterogeneous computing
  • Venue: CSE Seminar Room
  • Date: Friday 25 Jan 2008, Time: 11:15
  • Speaker: Peter Hofstee, PhD 1994 (Caltech)
  • Chief Scientist for the Cell Broadband Engine
  • IBM Austin Research Laboratory
  • Has 30 patents and 60 pending, mostly Cell
    related
  • Cell/B.E. is the CPU of Sony PlayStation 3

3
Early proposals/prototypes
  • IBM: Cheetah project, America project (4)
  • DEC: Multititan project (2)
  • Stanford U: Match (2), Torch (4)
  • Kyushu U: SIMP (4), DSNS (4)
  • Figure: timeline from 1982 to 1989 placing these projects, with the
    appearance of the term "superscalar" marked on it
4
Commercial superscalars
  • RISCs
  • Intel 960KA/KB → 960CA (3), 1989
  • IBM Power 1 RS/6000 (4), 1990
  • HP PA7000 → PA7100 (2), 1992
  • SUN SPARC → SuperSparc (3), 1992
  • DEC Alpha 21064 (2), 1992
  • Motorola MC88100 → MC88110 (2), 1993
  • Motorola PowerPC 601/603 (3), 1993
  • MIPS R4000 → R8000 (4), 1994

5
Commercial superscalars
  • CISCs
  • Intel 80486 → Pentium (2), 1993
  • Motorola MC68040 → MC68060 (2), 1993
  • Gmicro Gmicro/100p → Gmicro 500 (2), 1993
  • AMD K5 (2), 4 RISC instr, 1995
  • CYRIX M1 (2), 1995

6
Tasks of superscalar processing
  • Parallel decoding and issue
  • Parallel instruction execution
  • Preserving the sequential consistency of instruction execution
    and exception processing
7
Superscalar decode and issue
8
Parallel Decoding
  • Fetch multiple instructions into the instruction buffer
  • Decode multiple instructions in parallel (the instruction window)
  • Possibly check dependencies among these as well
    as with the instructions already under execution
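
The dependency check in the last bullet can be sketched as follows. This is a minimal illustration only: the `Instr` record, the register names and the function `raw_dependencies` are hypothetical and not taken from the slides, and only read-after-write dependencies are considered.

```python
from typing import NamedTuple

class Instr(NamedTuple):
    name: str
    dest: str      # destination register
    srcs: tuple    # source registers

def raw_dependencies(window, in_flight=()):
    """Map each window instruction to the earlier instructions it must wait for."""
    deps = {}
    earlier = list(in_flight)          # instructions already under execution
    for instr in window:
        producers = set()
        for src in instr.srcs:
            # The most recent earlier writer of src is its producer.
            for prev in reversed(earlier):
                if prev.dest == src:
                    producers.add(prev.name)
                    break
        deps[instr.name] = producers
        earlier.append(instr)
    return deps

in_flight = [Instr("x", "r7", ("r9",))]    # still executing, will write r7
window = [
    Instr("a", "r1", ("r2", "r3")),        # no pending producer
    Instr("b", "r4", ("r1", "r5")),        # reads r1 written by a
    Instr("c", "r6", ("r7", "r8")),        # reads r7 written by in-flight x
]
deps = raw_dependencies(window, in_flight)
print(deps)
```

In hardware these comparisons are done in parallel by comparators; the nested loops here only show which register fields are compared.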

9
Reducing decoding time
  • Pre-decoding
  • Do partial decoding while instructions are being
    loaded in I-cache
  • Decoded information is appended to the
    instruction
  • This includes instruction class, resources
    required etc.

Figure: second-level cache or main memory → (N bits/cycle) → pre-decode unit → (N + n bits/cycle) → I-cache
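
A minimal sketch of pre-decoding, under invented assumptions: a 32-bit instruction word (N = 32), two appended pre-decode bits (n = 2), and a made-up encoding in which the top two bits of the word give the instruction class. None of these values come from the slides or from a real processor.

```python
from enum import IntEnum

class InstrClass(IntEnum):
    ALU = 0
    LOAD_STORE = 1
    BRANCH = 2
    FP = 3

N = 32   # instruction width in bits (assumed)
n = 2    # pre-decode bits appended per instruction (assumed)

def classify(word):
    """Hypothetical partial decode: the top two bits select the class."""
    return InstrClass((word >> (N - 2)) & 0b11)

def predecode(word):
    """Build the (N + n)-bit I-cache entry: instruction word plus class bits."""
    return (word << n) | int(classify(word))

def fast_decode(entry):
    """Decode stage reads the class directly from the appended bits."""
    return InstrClass(entry & ((1 << n) - 1)), entry >> n

word = 0x8000_0001               # top bits 10, i.e. BRANCH in this made-up encoding
entry = predecode(word)          # done once, while the I-cache line is filled
cls, original = fast_decode(entry)
print(cls.name, hex(original), entry.bit_length())
```

The point of the scheme is that `classify` runs once at cache-fill time, so the decode stage on the critical path only has to read the extra bits.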
10
Pre-decoding examples
  • Processor: no. of predecode bits
  • PA 7200 (1995): 5
  • PA 8000 (1996): 5
  • PowerPC 620 (1996): 7
  • UltraSparc (1995): 4
  • HAL PM1 (1995): 4
  • AMD K5 (1995): 5 (per byte)
  • R 10000 (1996): 4

11
Blocking during issue
  • Decode and issue instructions directly to EUs
  • Instructions may be blocked due to data dependency
12
Non-blocking Issue
Decode and issue to buffers
From buffers dispatch to EUs
13
Handling of Issue Blockages
  • Preserving issue order: in-order or out of order
  • Alignment of instruction issue: aligned or unaligned
14
Issue Order
  • Issue in strict program order: in the figure, only a is issued from
    the issue window
  • Out of order issue: in the figure, a and c are issued from the
    issue window
  • Example (out of order issue): MC 88110, PowerPC 601
  • Figure: issue window holding instructions a to e, shown as
    instructions to be issued and instructions issued; the legend marks
    independent, dependent and issued instructions
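
The two issue orders can be sketched as below. Which instructions are independent is an assumption chosen to reproduce the figure (a and c ready, the rest waiting); the function name `issue` and its parameters are hypothetical.

```python
def issue(window, ready, out_of_order):
    """Return the instructions issued this cycle from the issue window.

    window: instruction names in program order
    ready:  names of independent instructions (operands available)
    """
    issued = []
    for name in window:
        if name in ready:
            issued.append(name)
        elif out_of_order:
            continue    # skip the dependent instruction and keep scanning
        else:
            break       # in-order: a dependent instruction blocks all later ones
    return issued

window = ["a", "b", "c", "d", "e"]
ready = {"a", "c"}      # assumed: b, d and e are dependent

in_order = issue(window, ready, out_of_order=False)
out_of_order = issue(window, ready, out_of_order=True)
print(in_order)         # ['a']
print(out_of_order)     # ['a', 'c']
```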
15
Alignment
  • Aligned issue: a fixed window is checked, followed by the next window
  • Unaligned issue: a gliding window is checked
  • Example from the figure (instructions a to h), instructions issued
    per cycle:

    Cycle   Aligned issue   Unaligned issue
    1       a               a
    2       b, c            b, c
    3       d               d, e, f
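
The difference between the fixed and the gliding window can be reproduced with a small simulation. The window width of 4 and the cycle in which each instruction becomes ready are assumptions picked so that the output matches the figure; `simulate` and `ready_cycle` are hypothetical names.

```python
def simulate(program, ready_cycle, width, aligned, cycles):
    """In-order issue of up to `width` instructions per cycle; returns issues per cycle."""
    issued_per_cycle = []
    pos = 0             # index of the oldest unissued instruction
    window_start = 0    # start of the current fixed window (aligned issue only)
    for cycle in range(1, cycles + 1):
        if aligned:
            # Move to the next window only once the current one is fully issued.
            if pos >= window_start + width:
                window_start += width
            limit = window_start + width
        else:
            # Gliding window: always the next `width` unissued instructions.
            limit = pos + width
        issued = []
        while pos < min(limit, len(program)) and ready_cycle[program[pos]] <= cycle:
            issued.append(program[pos])
            pos += 1
        issued_per_cycle.append(issued)
    return issued_per_cycle

program = list("abcdefgh")
# Assumed cycle in which each instruction's operands become available.
ready_cycle = {"a": 1, "b": 2, "c": 2, "d": 3, "e": 3, "f": 3, "g": 4, "h": 4}

aligned_run = simulate(program, ready_cycle, width=4, aligned=True, cycles=3)
unaligned_run = simulate(program, ready_cycle, width=4, aligned=False, cycles=3)
print(aligned_run)      # [['a'], ['b', 'c'], ['d']]
print(unaligned_run)    # [['a'], ['b', 'c'], ['d', 'e', 'f']]
```

In cycle 3 the aligned scheme can only issue d, because e and f belong to the next window, while the gliding window already covers them.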
16
Design space in instruction issue
  • Coping with false data dependencies: none or register renaming
  • Coping with unresolved control dependencies: wait or speculative
  • Use of shelving: blocking or shelved
  • Handling of issue blockages
  • Issue rate (2-6)
17
Frequently used issue policies in scalar processors
  • Traditional scalar issue
  • Traditional scalar issue with shelving
  • Traditional scalar issue with shelving and renaming
  • Traditional scalar issue with spec. execution
  • Processors named in the figure: CDC 6600; IBM 360/91; i386, MC68030,
    R3000, Sparc; i486, MC68040, R4000, MicroSparc
18
Frequently used issue policies in superscalar processors
  • Straightforward superscalar issue
  • Straightforward superscalar issue with shelving
  • Straightforward superscalar issue with renaming
  • Advanced superscalar issue (renaming + shelving)
  • Speculative execution in all
  • Issue alignment: aligned or unaligned
  • Processors named in the figure: R10000, PentiumPro, PowerPC602, PA8000,
    Sparc64, Am29000, K5; MC68060, PA7200, UltraSparc; Pentium, PowerPC601,
    PA7100, SuperSparc, Alpha21164; MC88110, R8000; PowerPC602
19
Design Space of Shelving
  • Scope of shelving: partial or full
  • Layout of shelving buffers
  • Operand fetch policy
  • Instruction dispatch scheme
20
Layout of Shelving Buffers
  • Type of the shelving buffers: stand-alone (RS), or combined with
    renaming and reordering
  • Number of shelving buffer entries: individual 2-4, group 6-16,
    central 20, total 15-40
  • Number of read and write ports: depends on no. of EUs connected
21
Reservation Stations (RS)
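  • Individual RSs: one RS in front of each EU
  • Group RSs: one RS shared by a group of EUs
  • Central RS: a single RS serving all EUs
  • Figure: the three layouts, each RS feeding its EUs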
22
Combined Buffer (for Shelving, Renaming, Reordering)
  • DRIS: Deferred scheduling, Register renaming and Instruction Shelving
  • Figure: instructions from decode/issue enter the DRIS, which feeds
    the EUs
23
Operand Fetch Policies
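  • Issue bound fetch: operands are fetched when the instruction is issued
  • Dispatch bound fetch: operands are fetched when the instruction is
    dispatched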
24
Issue bound operand fetch (with single register file)
  • Figure: Decode/issue, one RF and four EUs; instruction and data
    paths are drawn separately
25
Dispatch bound operand fetch (with single register file)
  • Figure: Decode/issue and four EUs
26
Issue bound operand fetch (with multiple register files)
  • Figure: Decode/issue, two RFs and four EUs; instruction and data
    paths are drawn separately
27
Dispatch bound operand fetch (with multiple register files)
  • Figure: Decode/issue and four EUs
28
Updating RFs and RSs
  • Figure: Decode/issue, two RFs and four EUs; instruction and data
    paths are drawn separately
29
Instruction dispatch scheme
  • Dispatch policy
  • Dispatch rate: single instr/cycle (individual RS) or multiple
    instr/cycle (group or central RS)
  • Checking of operand availability
  • Treatment of empty RS
30
Dispatch policy
  • Selection rule: rule for identifying instructions which are ready
    for execution (data dependency check)
  • Arbitration rule: rule for choosing one out of several ready
    instructions (earlier instruction has priority)
  • Dispatch order
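
The selection and arbitration rules can be sketched as two small functions. The reservation station is modelled as a list of dictionaries with hypothetical fields `age`, `op` and `srcs`; operand availability is modelled as a set of valid register names. This is an illustration, not the logic of any particular processor.

```python
def select_ready(station, valid_regs):
    """Selection rule: entries whose source operands are all available."""
    return [e for e in station if all(s in valid_regs for s in e["srcs"])]

def arbitrate(ready_entries):
    """Arbitration rule: the earliest instruction (lowest age) has priority."""
    if not ready_entries:
        return None
    return min(ready_entries, key=lambda e: e["age"])

station = [
    {"age": 0, "op": "mul", "srcs": ("r1", "r9")},   # r9 is not yet available
    {"age": 1, "op": "add", "srcs": ("r2", "r3")},
    {"age": 2, "op": "sub", "srcs": ("r4", "r5")},
]
valid_regs = {"r1", "r2", "r3", "r4", "r5"}

ready = select_ready(station, valid_regs)
chosen = arbitrate(ready)
print([e["op"] for e in ready], chosen["op"])
```

Here two entries pass the selection rule, and arbitration picks the older one for the single EU behind this station.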
31
Dispatch order
  • In-order
  • Partially out of order
  • Out of order
32
Checking availability of operands
  • Check of score-board bits (usual for dispatch bound operand fetch):
    control flow approach
  • Direct check of explicit status bits in RS (usual for issue bound
    operand fetch): data flow approach
  • "Control flow" and "data flow" are Flynn's terminology
33
Score-board
  • Introduced with CDC6600
  • Figure: register file in which each register holds its data together
    with a status bit
34
Checking in dispatch bound fetch
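  • Issue: the decoded instruction (Rs1, Rs2, Rd) enters the reservation
    station; the V bit of Rd is reset in the register file
  • Reservation station entry: OC Rs1 Rs2 Rd
  • Dispatch: check V bits of sources in the register file, then send
    OC (opcode), Os1 and Os2 (operand values) to the EU
  • Result (result, Rd): update Rd, set V bit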
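
A minimal sketch of checking with dispatch bound fetch, assuming a single register file with one V bit per register and a reservation station that stores register identifiers only. All class and function names are hypothetical, and the sketch ignores cases that need renaming (for example an instruction whose destination is also one of its sources).

```python
class RegisterFile:
    def __init__(self, values):
        self.value = dict(values)
        self.valid = {r: True for r in values}   # V bit per register

def issue(rf, station, oc, rs1, rs2, rd):
    """Issue: shelve OC, Rs1, Rs2, Rd and reset the V bit of Rd."""
    station.append((oc, rs1, rs2, rd))
    rf.valid[rd] = False

def dispatch(rf, station):
    """Dispatch the oldest entry whose source V bits are set; fetch operands now."""
    for i, (oc, rs1, rs2, rd) in enumerate(station):
        if rf.valid[rs1] and rf.valid[rs2]:
            del station[i]
            return oc, rf.value[rs1], rf.value[rs2], rd
    return None

def write_result(rf, rd, result):
    """Result return: update Rd and set its V bit."""
    rf.value[rd] = result
    rf.valid[rd] = True

rf = RegisterFile({"r1": 2, "r2": 3, "r3": 0, "r4": 0})
station = []
issue(rf, station, "add", "r1", "r2", "r3")   # r3 = r1 + r2
issue(rf, station, "mul", "r3", "r1", "r4")   # r4 = r3 * r1, waits for r3

first = dispatch(rf, station)     # add is ready: ('add', 2, 3, 'r3')
blocked = dispatch(rf, station)   # mul is not: V bit of r3 is reset
write_result(rf, "r3", first[1] + first[2])
second = dispatch(rf, station)    # now ('mul', 5, 2, 'r4')
print(first, blocked, second)
```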
35
Checking in issue bound fetch
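  • Issue: the decoded instruction supplies Rs1, Rs2, Rd to the register
    file; Os1 and Os2 (operand values) are fetched and the V bit of Rd
    is reset
  • Reservation station entry: OC Os1/Is1 Vs1 Os2/Is2 Vs2 Rd
  • Dispatch: check Vs1, Vs2, then send OC, Os1, Os2, Rd to the EU
  • Result (result, Rd): update Rd and set V bit in the register file;
    associative update of Is1, Is2 with Rd, set Vs bits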
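
A minimal sketch of checking with issue bound fetch. It assumes that an entry field holds the operand value when the register is valid at issue time and otherwise the register identifier (the Is field), with the Vs bit telling the two apart. The names are hypothetical, and using bare register names as identifiers assumes no two pending instructions write the same register.

```python
class RegisterFile:
    def __init__(self, values):
        self.value = dict(values)
        self.valid = {r: True for r in values}   # V bit per register

def fetch(rf, reg):
    """Return (operand value, True) if valid, else (register identifier, False)."""
    return (rf.value[reg], True) if rf.valid[reg] else (reg, False)

def issue(rf, station, oc, rs1, rs2, rd):
    """Issue: fetch operands or identifiers into the RS, then reset V bit of Rd."""
    s1, v1 = fetch(rf, rs1)
    s2, v2 = fetch(rf, rs2)
    station.append({"oc": oc, "s1": s1, "v1": v1, "s2": s2, "v2": v2, "rd": rd})
    rf.valid[rd] = False

def dispatch(station):
    """Dispatch the oldest entry whose Vs1 and Vs2 bits are both set."""
    for i, e in enumerate(station):
        if e["v1"] and e["v2"]:
            return station.pop(i)
    return None

def write_result(rf, station, rd, result):
    """Update Rd in the RF, then associatively update waiting RS entries."""
    rf.value[rd] = result
    rf.valid[rd] = True
    for e in station:
        for s, v in (("s1", "v1"), ("s2", "v2")):
            if not e[v] and e[s] == rd:
                e[s], e[v] = result, True

rf = RegisterFile({"r1": 2, "r2": 3, "r3": 0, "r4": 0})
station = []
issue(rf, station, "add", "r1", "r2", "r3")   # both operands fetched at issue
issue(rf, station, "mul", "r3", "r1", "r4")   # holds identifier r3, Vs1 = 0

first = dispatch(station)
blocked = dispatch(station)
write_result(rf, station, "r3", first["s1"] + first["s2"])
second = dispatch(station)
print(first["oc"], blocked, second["oc"], second["s1"], second["s2"])
```

Unlike the dispatch bound case, the readiness check never touches the register file: everything needed sits in the reservation station entry.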
36
Treatment of an empty RS
  • Straightforward approach: at least one cycle stay in RS
  • Bypassing RS if empty
  • Processors named in the figure: Sparc64, PowerPC 604, Nx586
37
Approaches in dispatching
  • Straightforward: in order, single instr/cycle, individual RSs
  • Enhanced: partially out of order, single instr/cycle, individual RSs
  • Advanced: out of order, multiple instr/cycle, group/central RSs
  • Processors named in the figure: Power1, PPC603; Power2; PM1,
    PentiumPro; Nx586, Am29000; PPC604, 620; PA8000, R10000