Title: Cache Models and Empirical Search in Automatic Tuning of Whole Applications
Apan Qasem, Ken Kennedy, John Mellor-Crummey
Department of Computer Science, Rice University
{qasem,ken,johnmc}@cs.rice.edu
In many cases, simple analytical models used by
traditional compilers are no longer able to yield
effectively optimized code for complex programs
because of the enormous complexity of processor
architectures. A promising alternative approach
for tuning applications has been the use of
search-based empirical methods. However, a major
obstacle to this approach is the need to
evaluate a prohibitively large number of
program variants. To address this problem, we
have developed FETA, a prototype tool for
automatic tuning of whole applications. FETA
combines analytical cost models with detailed
performance feedback to guide the search for the
best set of optimization parameters.
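To illustrate the kind of analytical cost model that can seed a search, the following minimal sketch picks an initial square tile size by requiring a few tiles to fit in a fraction of the cache. The capacity formula, the `fill_fraction` parameter, and the function name are illustrative assumptions, not FETA's actual model.

```python
import math

def initial_tile_size(cache_bytes, elem_bytes=8, arrays=3, fill_fraction=0.5):
    """Largest square tile edge T such that `arrays` T-by-T tiles of
    `elem_bytes`-sized elements fit in `fill_fraction` of the cache.

    Hypothetical capacity-only model: it ignores associativity and
    conflict misses, which is one reason empirical search is still useful.
    """
    budget_elems = (cache_bytes * fill_fraction) / (arrays * elem_bytes)
    return max(1, math.isqrt(int(budget_elems)))

# Example: a 32 KiB data cache with double-precision elements.
seed_tile = initial_tile_size(32 * 1024)
```

A value produced this way would serve only as a starting point. The search then refines it using measured performance.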
FETA: A Framework for Empirical Tuning of Applications
[Figure: FETA architecture diagram. Labeled components: F77 Source, LoopTool, Transformed Source, Vendor Compiler, Binary, hpcrun, bloop, hpcview, Annotated Source, Analytic Modeler, Initial Parameters, Parameterized Search Engine, Parameters, Feedback, Next Iteration. Search strategies shown: Random, Direct Search, Simulated Annealing. Platforms shown: Pentium 4, SGI, Alpha, Itanium2.]
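The feedback loop suggested by the diagram labels can be sketched as follows. This is a minimal illustration, not FETA's implementation: `measure` is a hypothetical stand-in for transforming the source, compiling it, and timing the binary, and the cost it returns is synthetic rather than real performance data. The strategy shown is random search, one of the search methods named in the diagram.

```python
import random

def measure(params):
    """Hypothetical stand-in for transform + compile + run.
    Returns a synthetic cost (lower is better), not a real timing."""
    tile, unroll = params
    return (tile - 48) ** 2 / 64.0 + (unroll - 4) ** 2 + 10.0

def tune(initial, candidates, budget, seed=0):
    """Evaluate `initial` first, then `budget - 1` randomly chosen
    candidates, keeping the best (params, cost) pair seen so far."""
    rng = random.Random(seed)
    best, best_cost = initial, measure(initial)
    history = [(initial, best_cost)]
    for _ in range(budget - 1):
        params = rng.choice(candidates)   # search engine proposes parameters
        cost = measure(params)            # feedback from the "run"
        history.append((params, cost))
        if cost < best_cost:
            best, best_cost = params, cost
    return best, best_cost, history

# Assumed search space: tile sizes 8..128 (step 8) x unroll factors 1..8.
space = [(t, u) for t in range(8, 129, 8) for u in range(1, 9)]
best, best_cost, history = tune(initial=(32, 1), candidates=space, budget=20)
```

The `initial` argument plays the role of model-supplied initial parameters, and `budget` caps the number of program variants that are actually evaluated.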
Experimental Results
[Figure: Speedup Across Benchmarks Using Direct Search]
[Figure: Speedup Across Platforms Using Direct Search]
[Figure: Speedup Using Cache Models for Loop Fusion]
[Figure: Tuning Time Comparison for Direct Search]
[Figure: Performance Improvement Comparison]
Conclusions
- Significant speedup across a range of architectures
- Direct search method is able to find good values for tile sizes and unroll factors by exploring only a small fraction of the search space
- Combining static models with search works well for loop fusion
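As an illustration of how a direct search can find good tile sizes and unroll factors while evaluating only a small part of the space, here is a minimal compass-style pattern search over an integer grid. It is a sketch under stated assumptions: the `cost` function is synthetic (a hypothetical stand-in for a measured run time), and this particular pattern-search variant is chosen for brevity and may differ from the direct search method used in FETA.

```python
def cost(tile, unroll):
    """Synthetic cost with its minimum at tile=48, unroll=4 (assumed,
    not measured); stands in for timing one program variant."""
    return (tile - 48) ** 2 / 64.0 + (unroll - 4) ** 2 + 10.0

def direct_search(start, bounds, step):
    """Compass search: probe +/- step along each axis, move to the best
    improving neighbor, halve the steps when no neighbor improves."""
    evaluated = {}

    def f(p):
        if p not in evaluated:            # cache: each variant is run once
            evaluated[p] = cost(*p)
        return evaluated[p]

    point, step = tuple(start), list(step)
    while any(s >= 1 for s in step):
        neighbors = []
        for axis in range(len(point)):
            if step[axis] < 1:
                continue
            for sign in (-1, 1):
                q = list(point)
                q[axis] += sign * step[axis]
                lo, hi = bounds[axis]
                if lo <= q[axis] <= hi:
                    neighbors.append(tuple(q))
        best = min(neighbors, key=f, default=point)
        if f(best) < f(point):
            point = best
        else:
            step = [s // 2 for s in step]
    return point, f(point), len(evaluated)

bounds = [(1, 128), (1, 16)]              # tile size, unroll factor
best, best_cost, n_evals = direct_search((16, 1), bounds, step=[16, 2])
total = 128 * 16                          # size of the full search space
```

On this synthetic cost the search ends at `(48, 4)`. Comparing `n_evals` with `total` shows how few variants were evaluated relative to an exhaustive sweep.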