On the Importance of Optimizing the Configuration of Stream Prefetchers

Author: Ilya
Last modified by: Ben Zorn
Created: 9/28/2003 2:18:02 AM
Presentation format: On-screen Show


1
On the Importance of Optimizing the Configuration
of Stream Prefetchers
  • Ilya Ganusov
  • Martin Burtscher

Computer Systems Laboratory, Cornell University
2
Introduction
  • Memory wall
      • Increasing gap between processor and memory speeds
      • Concentration on bandwidth at the expense of latency
  • Prefetch important data
      • Do not wait until the processor requests data
      • Proactively fetch the data that is likely to be consumed in the near future

3
Stream Prefetching
  • Prefetching with outcome-based prediction
      • Use the history of previous misses to guess data addresses that are likely to miss soon
  • Stream prefetching
      • A special case of outcome-based prediction
      • Proposed 15 years ago
      • The only hardware prefetching scheme used in modern microprocessors

4
Contributions
  • Detailed sensitivity analysis of main prefetcher parameters on SPECcpu2000 programs
      • No such study in the literature
      • Many research papers fail to specify prefetcher parameters in comparative studies
  • Case study
      • Evaluate performance of Runahead execution on a baseline with different stream prefetcher parameters

5
Outline
  • Introduction
  • Stream Prefetcher Operation
  • Evaluation Methodology
  • Experimental Results
  • Conclusion

6
How Stream Prefetchers Work
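[Figure: stream prefetcher block diagram. Each miss address enters the global miss history, a queue of recent miss addresses, and is checked against the stream table, whose entries each hold a valid bit, a stream address, and a stride. If a stream exists, the address generation unit (AGU) takes the address, stride, and lookahead and produces the prefetch address.]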
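The structure in the diagram can be sketched in a few lines of Python. This is only an illustrative model, not the simulated hardware from the talk: the class and method names, the stream-detection rule (three misses separated by the same stride), the FIFO replacement of stream table entries, and the formula `addr + stride * lookahead` are all assumptions, since the slide only names the AGU inputs. The default sizes are placeholders.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Stream:
    address: int  # last miss address observed on this stream
    stride: int   # byte distance between consecutive misses


class StreamPrefetcher:
    """Toy model: global miss history + stream table + AGU (all assumed)."""

    def __init__(self, history_len=16, max_streams=8, lookahead=4):
        self.history = deque(maxlen=history_len)  # global miss history
        self.streams = []                         # stream table
        self.max_streams = max_streams
        self.lookahead = lookahead                # prefetch distance

    def on_miss(self, addr):
        """Return a prefetch address for this miss, or None."""
        # Stream exists? The miss continues a stream if it lands
        # exactly one stride past the stream's last address.
        for s in self.streams:
            if addr == s.address + s.stride:
                s.address = addr
                return addr + s.stride * self.lookahead  # assumed AGU formula

        # No stream matched: record the miss, then look for two earlier
        # misses that form a constant stride with this one.
        earlier = list(self.history)
        self.history.append(addr)
        for prev in reversed(earlier):
            stride = addr - prev
            if stride != 0 and (prev - stride) in earlier:
                if len(self.streams) == self.max_streams:
                    self.streams.pop(0)  # FIFO replacement (assumption)
                self.streams.append(Stream(addr, stride))
                return addr + stride * self.lookahead
        return None


pf = StreamPrefetcher(history_len=16, max_streams=8, lookahead=4)
for miss in (0x1000, 0x1040, 0x1080, 0x10C0):
    target = pf.on_miss(miss)
    print(hex(miss), "->", hex(target) if target is not None else None)
```

In this sketch the first two misses only fill the history, the third allocates a stream with a 64-byte stride, and the fourth is served directly from the stream table. The three constructor arguments correspond to the knobs a sensitivity study would vary.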
7
Measured Parameters
[Figure: the stream prefetcher block diagram annotated with the three measured parameters: miss history length (at the global miss history), number of supported streams (at the stream table), and prefetch distance (at the AGU).]
8
Evaluation Methodology
  • Benchmarks
      • 22 SPECcpu2000 programs, highly optimized
      • All F77, C, and C++ programs
      • Multiple reference inputs per program
      • SimPoint interval of 500 million instructions
  • Simulated architecture
      • SimpleScalar v4.0 cycle-accurate simulator
      • Aggressive superscalar Alpha 21264-like core

9
Simulated System
Execution Core
  Fetch/issue/commit     4/4/4
  I-window/ROB/LSQ       64/128/64
  LdSt/Int/FP units      2/4/2
  Execution latencies    Similar to Alpha 21264
  Branch predictor       16K-entry bimodal/gshare hybrid
Memory Subsystem
  Cache sizes            64KB IL1, 64KB DL1, 1MB L2
  Cache associativity    2-way L1, 4-way L2
  Cache latencies        2 cycles L1, 20 cycles L2
  Main memory latency    400 cycles
10
Outline
  • Introduction
  • Motivation
  • Implementation
  • Experimental Results
  • Conclusion

11
Miss History Length
  • 7 programs are very sensitive
  • A 16-entry history is enough
12
Number of Stream Table Entries
  • Only 3 programs are sensitive
  • More than 8 streams provide little benefit
13
L2 Cache Prefetch Distance
  • 11 programs are very sensitive
  • FP speedup varies between 80% and 140%
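One way to see why prefetch distance matters so much is a back-of-the-envelope timeliness model, sketched below. It is not taken from the talk: the function names and the 50-cycle gap between consecutive stream accesses are hypothetical, and the model ignores bandwidth limits and cache pollution from prefetching too far ahead. Only the 400-cycle main memory latency comes from the simulated system.

```python
import math


def min_prefetch_distance(memory_latency_cycles, cycles_per_stream_access):
    """Smallest lookahead (in stream elements) that hides the full latency."""
    return math.ceil(memory_latency_cycles / cycles_per_stream_access)


def exposed_latency(memory_latency_cycles, cycles_per_stream_access, distance):
    """Miss latency (cycles) still visible to the core at a given distance."""
    covered = distance * cycles_per_stream_access
    return max(0, memory_latency_cycles - covered)


MEMORY_LATENCY = 400   # cycles, from the simulated system
ACCESS_GAP = 50        # cycles between stream accesses (hypothetical)

print(min_prefetch_distance(MEMORY_LATENCY, ACCESS_GAP))
for distance in (1, 4, 8, 16):
    print(distance, exposed_latency(MEMORY_LATENCY, ACCESS_GAP, distance))
```

Under these assumptions a short distance leaves most of the memory latency exposed, while a distance beyond the break-even point adds no further timeliness benefit. Because the break-even point depends on how fast each program walks its streams, it is plausible that different programs prefer different distances.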
14
Case Study: Runahead Execution
  • Performance of stream prefetching is highly dependent on parameter choice
  • Another proposal: Runahead execution
      • Pseudo-retire long-latency loads stalling the pipeline and continue executing
      • Roll back to checkpoint after load comes back from memory

15
Speedup over Stream Prefetching
  • SPEC fp speedup drops by more than 2x

16
Conclusion
  • Key observations
      • The performance of the stream prefetcher is highly dependent on its configuration
      • Varying the prefetch distance alone almost doubles the average performance benefit
      • Choosing a non-optimal stream prefetcher as a baseline can distort results by a factor of two
  • Conclusion
      • Parameter optimizations are imperative when comparing stream prefetchers to other prefetching techniques
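The baseline effect can be illustrated with simple arithmetic. The IPC values below are invented purely for illustration and are not results from the talk. They only show how the reported speedup of a new technique shrinks when the stream prefetcher it is compared against is tuned.

```python
def speedup_percent(ipc_technique, ipc_baseline):
    """Speedup of a technique over a baseline, in percent."""
    return (ipc_technique / ipc_baseline - 1.0) * 100.0


# Hypothetical IPC values, for illustration only.
IPC_UNTUNED_BASELINE = 1.00   # stream prefetcher with poorly chosen parameters
IPC_TUNED_BASELINE = 1.25     # same prefetcher after parameter optimization
IPC_NEW_TECHNIQUE = 1.50      # technique under evaluation

vs_untuned = speedup_percent(IPC_NEW_TECHNIQUE, IPC_UNTUNED_BASELINE)
vs_tuned = speedup_percent(IPC_NEW_TECHNIQUE, IPC_TUNED_BASELINE)
print(f"vs untuned baseline: {vs_untuned:.1f}%")
print(f"vs tuned baseline:   {vs_tuned:.1f}%")
print(f"ratio of reported speedups: {vs_untuned / vs_tuned:.2f}x")
```

With these made-up numbers, the same technique looks more than twice as beneficial against the untuned baseline as against the tuned one, which is the kind of distortion the key observations warn about.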
