Title: Performance Evaluation of Cache Replacement Policies for the SPEC CPU2000 Benchmark Suite
1 Performance Evaluation of Cache Replacement Policies for the SPEC CPU2000 Benchmark Suite
2 Overview
- Introduction
- Common cache replacement policies
- Experimental methodology
- Evaluating cache replacement policies: questions and answers
- Conclusion
3 Introduction
- Increasing speed gap between processor and memory
- Modern processors include multiple levels of caches, and cache associativity is increasing
- Replacement policy: which block to discard when the cache is full
4 Introduction (cont.)
- Optimal Replacement (OPT) algorithm: replace the cache block whose next reference is farthest away in the future; infeasible in practice, since it requires knowledge of future references
- State-of-the-art processors employ various policies
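The OPT rule above can be illustrated offline, where the whole reference trace is known in advance. The following is a minimal sketch, assuming a fully associative cache and a trace given as a list of block identifiers; the function name `opt_misses` and its parameters are hypothetical and not taken from the simulators used in this study.

```python
def opt_misses(trace, capacity):
    """Count misses of a fully associative cache with `capacity` blocks under OPT.

    Illustrative only: `trace` is a list of block identifiers that is fully
    known in advance, which is exactly what real hardware does not have.
    """
    # For each position, record when the same block is referenced next.
    next_use = [float("inf")] * len(trace)
    last_seen = {}
    for i in range(len(trace) - 1, -1, -1):
        next_use[i] = last_seen.get(trace[i], float("inf"))
        last_seen[trace[i]] = i

    cache = {}  # block -> position of its next reference
    misses = 0
    for i, block in enumerate(trace):
        if block not in cache:
            misses += 1
            if len(cache) == capacity:
                # Evict the block whose next reference is farthest in the future.
                victim = max(cache, key=cache.get)
                del cache[victim]
        cache[block] = next_use[i]
    return misses


print(opt_misses([1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5], 3))  # prints 7
```

Because the eviction decision looks at `next_use`, the sketch only works on a recorded trace, which is why OPT serves as a lower bound for comparison rather than as an implementable policy.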
5 Introduction (cont.)
- Random
- LRU (Least Recently Used)
- Round-robin (FIFO, First-In-First-Out)
- PLRU (Pseudo Least Recently Used): reduces the hardware cost by approximating the LRU mechanism
6 Introduction (cont.)
- Our goal: explore and evaluate common cache replacement policies
- how existing policies relate to OPT
- effect on instruction and data caches
- how good pseudo techniques are at approximating true LRU
7 Common cache replacement policies: LRU
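As a rough illustration of true LRU within a single cache set, the sketch below keeps the blocks of one set ordered from least to most recently used. The class name `LRUSet` and its interface are assumptions made for this example; they do not describe the hardware implementation or the simulator code.

```python
from collections import OrderedDict


class LRUSet:
    """One cache set with true LRU replacement (illustrative sketch)."""

    def __init__(self, ways):
        self.ways = ways
        self.blocks = OrderedDict()  # least recently used first, most recent last

    def access(self, tag):
        """Return True on a hit, False on a miss."""
        if tag in self.blocks:
            self.blocks.move_to_end(tag)      # mark as most recently used
            return True
        if len(self.blocks) == self.ways:
            self.blocks.popitem(last=False)   # evict the least recently used block
        self.blocks[tag] = None
        return False


s = LRUSet(ways=2)
print([s.access(t) for t in "ABACB"])  # prints [False, False, True, False, False]
```

In the usage line, the access to C evicts B (A was touched more recently), so the final access to B misses.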
8 Common cache replacement policies (cont.)
- Random policy: simpler, but at the expense of performance. Typically implemented with a Linear Feedback Shift Register (LFSR)
- Round Robin (or FIFO) replacement: replaces the oldest block in the cache. Implemented with a circular counter
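The two mechanisms named above can be sketched in a few lines. This is a hedged illustration, not the design of any particular processor: the 16-bit LFSR taps (16, 14, 13, 11) are one well-known maximal-length choice picked for the example, the number of ways is assumed to be a power of two, and the names `LFSR16` and `RoundRobinSet` are hypothetical.

```python
class LFSR16:
    """16-bit Fibonacci LFSR used as a pseudo-random victim selector (sketch)."""

    def __init__(self, seed=0xACE1):
        if seed & 0xFFFF == 0:
            raise ValueError("seed must be non-zero")
        self.state = seed & 0xFFFF

    def next_way(self, ways):
        # Assumption: `ways` is a power of two, so a bit mask selects the way.
        s = self.state
        bit = (s ^ (s >> 2) ^ (s >> 3) ^ (s >> 5)) & 1
        self.state = (s >> 1) | (bit << 15)
        return self.state & (ways - 1)


class RoundRobinSet:
    """One cache set with round-robin (FIFO) replacement via a circular counter."""

    def __init__(self, ways):
        self.tags = [None] * ways
        self.next_victim = 0  # circular counter pointing at the oldest block

    def access(self, tag):
        """Return True on a hit, False on a miss."""
        if tag in self.tags:
            return True  # hits do not change the replacement state
        self.tags[self.next_victim] = tag
        self.next_victim = (self.next_victim + 1) % len(self.tags)
        return False


rr = RoundRobinSet(ways=2)
print([rr.access(t) for t in "ABACA"])  # prints [False, False, True, False, False]
```

In the usage line, the last access to A misses under round-robin because the hit on A did not refresh its age; under LRU the same access would hit.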
9 Common cache replacement policies (cont.)
PLRUt
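The slide content for PLRUt did not survive extraction. Assuming PLRUt denotes the common tree-based pseudo-LRU scheme (an assumption, since the text here does not define it), a set with n ways keeps n-1 bits arranged as a binary tree; each bit points toward the half of the set that should be replaced next. The class `TreePLRU` and the bit convention below are illustrative choices.

```python
class TreePLRU:
    """Tree-based pseudo-LRU state for one set (sketch; assumes power-of-two ways)."""

    def __init__(self, ways):
        assert ways >= 2 and ways & (ways - 1) == 0
        self.ways = ways
        self.bits = [0] * (ways - 1)  # heap-ordered tree; 0 = victim on the left

    def touch(self, way):
        """Update the tree on an access so every node points away from `way`."""
        node, lo, hi = 0, 0, self.ways
        while hi - lo > 1:
            mid = (lo + hi) // 2
            if way < mid:
                self.bits[node] = 1            # accessed left half, point right
                node, hi = 2 * node + 1, mid
            else:
                self.bits[node] = 0            # accessed right half, point left
                node, lo = 2 * node + 2, mid

    def victim(self):
        """Follow the tree bits to the way that should be replaced."""
        node, lo, hi = 0, 0, self.ways
        while hi - lo > 1:
            mid = (lo + hi) // 2
            if self.bits[node] == 0:
                node, hi = 2 * node + 1, mid
            else:
                node, lo = 2 * node + 2, mid
        return lo


p = TreePLRU(ways=4)
for w in (0, 1, 2, 3, 0):
    p.touch(w)
print(p.victim())  # prints 2
```

After the access sequence 0, 1, 2, 3, 0, true LRU would replace way 1, while this tree selects way 2. The example shows in what sense the scheme only approximates LRU.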
10 Common cache replacement policies (cont.)
PLRUm
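The slide content for PLRUm is likewise missing. Assuming PLRUm denotes the MRU-bit-based pseudo-LRU scheme (an assumption, since the text here does not define it), each way has one bit that is set on access; when the last clear bit would be set, all other bits are cleared. The victim is the first way whose bit is clear. The class name `MRUBitPLRU` is hypothetical.

```python
class MRUBitPLRU:
    """MRU-bit pseudo-LRU state for one set (sketch; assumes at least two ways)."""

    def __init__(self, ways):
        assert ways >= 2
        self.mru = [0] * ways  # one bit per way, 1 = recently used

    def touch(self, way):
        """Mark `way` as recently used; reset the others once all bits are set."""
        self.mru[way] = 1
        if all(self.mru):
            self.mru = [0] * len(self.mru)
            self.mru[way] = 1

    def victim(self):
        """Return the lowest-numbered way whose MRU bit is clear."""
        return self.mru.index(0)


m = MRUBitPLRU(ways=4)
for w in (0, 1, 2):
    m.touch(w)
print(m.victim())  # prints 3
```

The reset step guarantees that at least one bit is always clear, so `victim` always finds a candidate.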
11 Experimental methodology
- sim-cache and sim-cheetah simulators
- Alpha version of SimpleScalar
- original simulators modified to support additional pseudo-LRU replacement policies
- sim-cache simulator modified to print interval statistics per specified number of instructions
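The idea of interval statistics can be sketched independently of the simulator. The snippet below is not SimpleScalar code; it is a minimal illustration, assuming a per-instruction stream of miss flags, of how misses could be accumulated per fixed number of instructions. The function name and input format are hypothetical.

```python
def interval_miss_counts(miss_flags, interval):
    """Return the number of misses in each block of `interval` instructions.

    `miss_flags` is any iterable with one truthy/falsy entry per executed
    instruction (an assumed input format chosen for this sketch).
    """
    counts = []
    current = 0
    executed = 0
    for missed in miss_flags:
        executed += 1
        current += bool(missed)
        if executed % interval == 0:
            counts.append(current)   # close the interval and start a new one
            current = 0
    if executed % interval:
        counts.append(current)       # trailing partial interval
    return counts


print(interval_miss_counts([1, 0, 0, 1, 1, 0, 1], 3))  # prints [1, 2, 1]
```

Per-interval counts of this kind make it possible to compare two policies over time rather than only by their totals.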
12 Evaluating cache replacement policies: questions and answers
- Q: How much associativity is enough for state-of-the-art benchmarks?
- A: For the data cache, the performance gain comes with the transition from a direct-mapped to a two-way set-associative cache
- For the instruction cache, the OPT replacement policy benefits from increased associativity. Realistic policies don't exploit more than 8 ways, or in some cases even more than 2 ways
13 Evaluating cache replacement policies: questions and answers (cont.)
- Q: How much room for improvement is there for each specific benchmark and cache configuration?
14 Evaluating cache replacement policies: questions and answers (cont.)
- Q: Do replacement policies behave differently for different types of memory references, such as instruction and data?
- A: In general, the LRU policy performs better than FIFO and Random, with some exceptions
15 Evaluating cache replacement policies: questions and answers (cont.)
- Q: Can a dynamic change of replacement policy reduce the total number of cache misses?
- A: If one policy is better than the other, it stays consistently better
16 Evaluating cache replacement policies: questions and answers (cont.)
- Q: Can we use most-recently-used information for cache way prediction?
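One way to explore this question on a recorded trace is to predict, for each cache hit, that it falls in the way most recently used in that set, and to measure how often the prediction holds. The sketch below assumes a list of `(set_index, way)` pairs for hits in program order; this input format and the function name are hypothetical, and the sketch reports no result from the study.

```python
def mru_way_prediction_accuracy(way_hits):
    """Fraction of hits that land in the most recently used way of their set.

    `way_hits` is a sequence of (set_index, way) pairs, one per cache hit,
    in program order (an assumed input format chosen for this sketch).
    """
    mru = {}          # set_index -> way used by the previous hit in that set
    predicted = 0
    correct = 0
    for set_index, way in way_hits:
        if set_index in mru:
            predicted += 1
            correct += (mru[set_index] == way)
        mru[set_index] = way
    return correct / predicted if predicted else 0.0


hits = [(0, 1), (0, 1), (0, 2), (1, 0), (0, 2), (1, 0)]
print(mru_way_prediction_accuracy(hits))  # prints 0.75
```

The first hit in each set has no prior MRU information, so it is excluded from the accuracy figure.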
17 Evaluating cache replacement policies: questions and answers (cont.)
- Q: How good are pseudo-LRU techniques at approximating true LRU?
- A: PLRUm and PLRUt are very efficient in approximating the LRU policy and stay close to LRU during the whole program execution
18 Conclusion
- Eliminating cache misses is extremely important for improving overall processor performance
- Cache replacement policies gain more significance in set-associative caches
- The gap between the LRU and OPT replacement policies is up to 50%; new research to close the gap is necessary