Title: Cost Effective Memory Dependence Prediction Using Speculation Levels and Color Sets
1Cost Effective Memory Dependence Prediction Using
Speculation Levels and Color Sets
- Soner Önder
- Michigan Technological University, Houghton MI
- www.cs.mtu.edu/soner
2Outline
- Background
- Memory dependence prediction.
- Pairing based approach.
- Store sets.
- Color sets
- Notion of color sets.
- Color set implementation.
- Color set predictor.
- Instruction window modifications.
- Experimental evaluation
- Basic policy.
- Aggressive policy.
3Memory Dependence Prediction
- Assume ST-2, ST-p and LD-s all access the same memory location.
- If we issue LD-s at this point in time, we'll get a memory order violation.
- If we know that load LD-s is dependent on store ST-p, we can issue the load at the right time.
4Dynamic Memory Disambiguation
- Problem
- In the presence of unresolved stores in the instruction window, which load(s) must be held?
- Ideal Solution
- Wait only for the producer store.
- Simple Solutions
- Wait for all - no speculation.
- Issue blindly - blind speculation.
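The three choices above can be sketched as a small Python model. This is an illustration only: the Store record, the may_issue_load function and the policy names are hypothetical and not from the talk, and the "ideal" policy relies on oracle knowledge of store addresses that real hardware does not have.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Store:
    seq: int        # position in program order (hypothetical field)
    addr: int       # true target address; only an oracle knows it early
    resolved: bool  # True once the store's address is known and it has executed


def may_issue_load(load_seq: int, load_addr: int,
                   stores: List[Store], policy: str) -> bool:
    """Decide whether a load may issue now under one of three policies."""
    older = [s for s in stores if s.seq < load_seq]
    if policy == "wait_for_all":
        # No speculation: every older store must be resolved first.
        return all(s.resolved for s in older)
    if policy == "blind":
        # Blind speculation: always issue, risking a memory order violation.
        return True
    if policy == "ideal":
        # Wait only for the producer: the youngest older store to the same address.
        producers = [s for s in older if s.addr == load_addr]
        if not producers:
            return True
        return max(producers, key=lambda s: s.seq).resolved
    raise ValueError(f"unknown policy: {policy}")


# ST-2 and ST-p write location A; an unrelated store writes B; LD-s reads A.
A, B = 0x100, 0x200
window = [Store(1, A, True), Store(2, B, False), Store(3, A, False)]
```

With this window, "wait_for_all" and "ideal" both hold the load at sequence number 4, while "blind" issues it and would collide with ST-p. Once ST-p resolves, only "ideal" releases the load, because the unrelated store is still unresolved.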
5Memory dependence prediction (Moshovos et al., 1997-1998)
- Earlier work, which mainly concentrated on predicting precise dependencies among pairs of load/store instructions
- To enable early issuing of loads through memory dependence prediction.
- To streamline communication so that values can be directly passed from producers to consumers instead of through memory.
- Emphasis has been given to identifying the precise store instruction a load may depend on.
6Store-set Memory Dependence Predictor (Chrysos & Emer, 1998)
- A store set is the set of all stores a load has been observed to be dependent on.
- Initially employ blind speculation for loads.
- Upon a memory order violation, create a store set for the offending load and store.
- Next time the same load is encountered, make the load wait until the store issues.
- A store set may contain multiple stores: chain the stores and make the load dependent upon the last store.
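The steps above can be sketched in Python as follows. This is a simplified model, not the published design: the tables are unbounded dictionaries rather than fixed-size hashed tables, store-set merging is reduced to "reuse whichever ID already exists", and the class and method names are hypothetical. SSID and LFST follow the labels on the next slide; the PC-indexed table is called ssit here.

```python
class StoreSetPredictor:
    """Toy store-set predictor: PC -> SSID, and SSID -> last fetched store."""

    def __init__(self):
        self.ssit = {}       # PC -> store set ID (SSID)
        self.lfst = {}       # SSID -> tag of the last fetched, not yet issued store
        self._next_ssid = 0

    def on_violation(self, load_pc, store_pc):
        # Put the offending load and store into the same store set.
        ssid = self.ssit.get(load_pc, self.ssit.get(store_pc))
        if ssid is None:
            ssid = self._next_ssid
            self._next_ssid += 1
        self.ssit[load_pc] = ssid
        self.ssit[store_pc] = ssid

    def fetch_store(self, store_pc, tag):
        # Returns the tag of the earlier store to chain behind, or None.
        ssid = self.ssit.get(store_pc)
        if ssid is None:
            return None
        previous = self.lfst.get(ssid)
        self.lfst[ssid] = tag
        return previous

    def fetch_load(self, load_pc):
        # Returns the tag of the store the load must wait for, or None
        # (None means the load is issued with blind speculation).
        ssid = self.ssit.get(load_pc)
        if ssid is None:
            return None
        return self.lfst.get(ssid)

    def issue_store(self, store_pc, tag):
        # Once the last fetched store issues, loads no longer wait on it.
        ssid = self.ssit.get(store_pc)
        if ssid is not None and self.lfst.get(ssid) == tag:
            del self.lfst[ssid]
```

A load that has never collided gets None from fetch_load and issues blindly. After on_violation, a later fetch_load returns the tag of the most recently fetched store in its set, which is the "last store" of the chain.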
7Store-set Implementation
[Figure: store-set tables, with labels PC, SSID and LFST]
- Dependence information is digested to create SETS of colliding instructions.
- Each set tells exactly which stores a load should wait for.
- Sufficiently large tables yield performance of an ORACLE.
8Color Set predictor
- Instead of
- predicting precise dependencies among pairs of loads/stores, or
- constructing sets of store and load instructions which collided in the past,
- we assign the processor, load and store instructions various speculation levels (colors) and predict the speculation level (i.e., the color) at which a load or store can be issued without a collision.
[Figure label: Predictor size]
9Color Set predictor
- Since we only try to predict the speculation level, we expect to have
- smaller storage for the predictor,
- better performance at smaller hardware budgets,
- faster implementations,
- power savings and
- more collisions.
10So, it is something like this
[Figure: the colors 00, 01, 10 and 11, shown once for the processor and once for a load]
- The rules governing the color change: policies.
- We investigate two policies, a basic policy and an aggressive policy.
11Load instruction selection
[Figure: eligible load instructions at colors 00, 01, 10 and 11, relative to the current processor color]
15Instruction window extensions
[Figure: window details. Labels: color, Global color, lt comparator, Inhibit, Issue?, Instructions entering window]
16Collisions
[Figure: collisions at colors 00, 01, 10 and 11, relative to the current processor color]
17Color Set Predictor Basic Policy
- 1. Basic policy gradually becomes aggressive when port utilization is low.
- 2. The load instruction is given a higher color and the store instruction is given a lower color upon a collision.
- 3. Processor runs at the smaller of the current processor color and the color of the store instructions.
- 4. Rules 2 and 3 together run the processor at a lower speculation level than the level at which the prior collision occurred.
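A minimal Python sketch of the four rules above, under several assumptions that the slides do not confirm: a load may speculate only when its color is less than the processor color (one reading of the "lt" comparator in the window figure); loads start at color 0 and stores at color 3; "higher" means plus one, saturating; and "lower" means one below the processor color at the time of the collision, chosen so that rule 4 holds. The class and method names are hypothetical.

```python
MAX_COLOR = 3  # two-bit colors: 00, 01, 10, 11


class BasicColorPolicy:
    """Toy model of the basic policy (assumptions listed in the lead-in)."""

    def __init__(self):
        self.colors = {}           # load/store PC -> color
        self.processor_color = 0   # assumed start: no speculation

    def load_color(self, pc):
        return self.colors.get(pc, 0)          # assumed default for loads

    def store_color(self, pc):
        return self.colors.get(pc, MAX_COLOR)  # assumed default for stores

    def on_low_port_utilization(self):
        # Rule 1: gradually become more aggressive.
        self.processor_color = min(self.processor_color + 1, MAX_COLOR)

    def on_store_enters_window(self, store_pc):
        # Rule 3: run at the smaller of the processor and store colors.
        self.processor_color = min(self.processor_color,
                                   self.store_color(store_pc))

    def may_speculate(self, load_pc):
        # Assumed issue rule: load color must be less than processor color.
        return self.load_color(load_pc) < self.processor_color

    def on_collision(self, load_pc, store_pc):
        # Rule 2: raise the load's color, lower the store's color.
        self.colors[load_pc] = min(self.load_color(load_pc) + 1, MAX_COLOR)
        self.colors[store_pc] = max(self.processor_color - 1, 0)
```

In this sketch, a collision at processor color 2 leaves the store at color 1, so the next time that store enters the window the processor drops to 1 and the load, now at color 1, is held.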
18Color Set Predictor Aggressive Policy
- 1. Aggressive policy switches to maximum speculation level when port utilization is low.
- 2. The load instruction is given a higher color and the store instruction is specifically marked upon a collision.
- 3. Processor decrements the current processor color when a colliding store is detected.
- 4. As a result, the processor runs at the highest speculation level that won't result in a collision, and at a different color than the one it had during the collision.
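The aggressive rules can be sketched the same way. As before, this is an illustration under assumptions not confirmed by the slides: a load speculates only when its color is less than the processor color, loads start at color 0, "higher" means plus one (saturating), and the processor starts with no speculation. The class and method names are hypothetical.

```python
MAX_COLOR = 3  # two-bit colors: 00, 01, 10, 11


class AggressiveColorPolicy:
    """Toy model of the aggressive policy (assumptions listed in the lead-in)."""

    def __init__(self):
        self._load_colors = {}         # load PC -> color
        self.colliding_stores = set()  # store PCs marked as colliding
        self.processor_color = 0       # assumed start: no speculation

    def load_color(self, pc):
        return self._load_colors.get(pc, 0)

    def on_low_port_utilization(self):
        # Rule 1: jump straight to the maximum speculation level.
        self.processor_color = MAX_COLOR

    def on_store_enters_window(self, store_pc):
        # Rule 3: step down by one only for stores marked as colliding.
        if store_pc in self.colliding_stores:
            self.processor_color = max(self.processor_color - 1, 0)

    def may_speculate(self, load_pc):
        # Assumed issue rule: load color must be less than processor color.
        return self.load_color(load_pc) < self.processor_color

    def on_collision(self, load_pc, store_pc):
        # Rule 2: raise the load's color and mark the store.
        self._load_colors[load_pc] = min(self.load_color(load_pc) + 1,
                                         MAX_COLOR)
        self.colliding_stores.add(store_pc)
```

Unmarked stores leave the processor color untouched, so in this sketch only previously colliding stores cost any speculation.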
19Color Set Predictor
- Accessed early in the pipeline using L/S PC
- Updated upon collision/successful speculation
[Figure: predictor table indexed by L/S PC, yielding the L/S color (example entry: 10)]
Basic Policy
- 00 No speculation
- 01 Level 1
- 10 Level 2
- 11 Level 3
Aggressive Policy
- 00 No speculation
- 01 Level 1
- 10 Level 2
- 11 Level 3/Colliding store
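A predictor of this shape can be sketched as a table of two-bit colors indexed by the load/store PC. The details are assumptions for illustration: a direct-mapped, untagged table, a power-of-two entry count, 4-byte instructions (hence the shift by 2), and the ColorTable name itself. The entry counts 128 and 4096 used below are taken from the later benchmark slide titles, on the assumption that they denote predictor entries.

```python
class ColorTable:
    """Direct-mapped, untagged table of small colors indexed by PC (a sketch)."""

    def __init__(self, entries, bits=2):
        if entries <= 0 or entries & (entries - 1):
            raise ValueError("entries must be a power of two")
        self.entries = entries
        self.bits = bits
        self.table = [0] * entries   # every entry starts at color 00

    def index(self, pc):
        # Drop the low two bits (assumed 4-byte instructions), then wrap.
        return (pc >> 2) & (self.entries - 1)

    def lookup(self, pc):
        return self.table[self.index(pc)]

    def update(self, pc, color):
        # Keep only as many bits as the entry can hold.
        self.table[self.index(pc)] = color & ((1 << self.bits) - 1)

    def storage_bits(self):
        return self.entries * self.bits


small = ColorTable(128)
large = ColorTable(4096)
```

Because the table is untagged, two instructions whose PCs map to the same index share one color; that is the kind of aliasing a small table trades for its low storage cost.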
20Processor's colorful perspective
Basic policy
- When port utilization is low, the processor moves on to the next color.
- Processor assumes the lowest-ranking store's color.
[Figure: processor color states 00, 01, 10 and 11]
21Processor's colorful perspective
Aggressive policy
- When a colliding store enters the window, the processor decrements its color.
- When port utilization is low, the processor switches to red (the maximum speculation level).
[Figure: processor color states 00, 01, 10 and 11]
22Load instruction color states
Both policies
[Figure: load instruction color states 00, 01, 10 and 11]
23Simulation Framework
- Aggressive out-of-order superscalar processor
- 8 instructions/cycle fetch/dispatch
- 16 instructions/cycle retire width
- 64 entry centralized reservation station
- 8 symmetric functional units
- Multi-block gshare fetch unit
- 2 memory ports r/w
- Perfect D-cache
- Simulated using cycle-accurate simulators
generated automatically from ADL descriptions
using the FAST system.
24Performance Spec Fp
Arithmetic Mean
25Performance Spec Fp
Harmonic Mean
26Performance Spec Int
Arithmetic Mean
27Performance Spec Int
Harmonic Mean
28Individual benchmarks 128-Fp
29Individual benchmarks 4096-Fp
30Individual benchmarks 128-Int
31Individual benchmarks 4096-Int
32So ...
- Cost effective dependence prediction.
- Why does it work?
- Design space
- Number of colors/number of entries.
- Confidence mechanisms.
- Other policies.
- Power consumption
- Disable chunks of predictor and use basic policy
- Enable and become aggressive.
33Have a colorful evening
- Soner Önder
- Michigan Technological University
Antalya, Turkey