Statistical Compiler Tuning


1
Statistical Compiler Tuning
  • M. Haneda
  • P.M.W. Knijnenburg
  • H.A.G. Wijshoff

2
Motivation
  • An optimal compiler optimization setting can be
    obtained by considering the interaction between
    applications, architectures, and compiler
    optimizations.
  • Profiling is a (the best?) way to understand this
    interaction.
  • However, a huge number of optimization settings
    are possible.

3
Example: gcc 3.3.1
  • 42 options
  • defer-pop, force-mem, force-addr,
    optimize-sibling-calls, inline-functions,
    merge-constants, strength-reduce, thread-jumps,
    cse-follow-jumps, cse-skip-blocks,
    rerun-cse-after-loop, rerun-loop-opt, gcse,
    loop-optimize, crossjumping, if-conversion,
    if-conversion2, delete-null-pointer-checks,
    expensive-optimizations, optimize-register-move,
    schedule-insns, sched-interblock, sched-spec,
    schedule-insns2, sched-spec-load,
    sched-spec-load-dangerous, caller-saves,
    move-all-movables, reduce-all-givs, peephole,
    peephole2, reorder-blocks, reorder-functions,
    strict-aliasing, align-functions, align-labels,
    align-loops, align-jumps, cprop-registers,
    function-sections, data-sections, unroll-loops

4
Example (contd)
  • 42 options, all on/off switches, lead to 2^42 ≈
    4.4 x 10^12 settings.
  • Each profile takes 10 runs, each taking
    approximately 10 sec. Total time per profile: 100
    sec.
  • Total time for a full profile:
    4.4 x 10^14 sec ≈ 7.3 x 10^8 weeks ≈ 1.4 x 10^7 years.
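
The arithmetic on this slide can be checked with a few lines of Python. The constants are the ones stated above; dividing by 52 weeks per year is an assumption, chosen because it reproduces the figure on the slide.

```python
# Size of the exhaustive search space for 42 on/off options,
# and the time needed to profile every setting once.
NUM_OPTIONS = 42
SECONDS_PER_PROFILE = 10 * 10          # 10 runs of about 10 sec each

settings = 2 ** NUM_OPTIONS
total_seconds = settings * SECONDS_PER_PROFILE
total_weeks = total_seconds / (7 * 24 * 3600)
total_years = total_weeks / 52         # assumption: 52 weeks per year

print(f"{settings:.1e} settings")
print(f"{total_seconds:.1e} sec = {total_weeks:.1e} weeks = {total_years:.1e} years")
```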

5
Challenge
  • How to find the optimal configuration with
    limited amount of profiling?

6
Three Approaches
  • Statistical approach
      • Using main effects
      • Using the Mann-Whitney test
  • Random approach
  • Approach which focuses on the interaction between
    compiler optimizations

7
Statistical Approach
  • Start with an appropriate initial representation
    of the full search space based on the orthogonal
    arrays.
  • Each time after data collection, either:
      • Approach 1: Compute main effects of compiler
        options from the profiling data.
      • Approach 2: Apply inferential statistics
        (Mann-Whitney test) to the profiling data to
        detect effective compiler options.
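
Approach 1 can be sketched as follows, assuming the usual definition of a main effect: the mean measurement with an option on minus the mean measurement with it off. The function name `main_effects`, the small design, and the run times are hypothetical placeholders for illustration, not data from the presentation.

```python
from statistics import mean

def main_effects(design, times):
    """Return, per option, mean(time with option on) - mean(time with option off).

    design: list of rows of 0/1 values, one row per compiler setting.
    times:  measured run time for each row, in the same order.
    A negative effect means the option tends to reduce run time.
    """
    effects = []
    for col in range(len(design[0])):
        on = [t for row, t in zip(design, times) if row[col] == 1]
        off = [t for row, t in zip(design, times) if row[col] == 0]
        effects.append(mean(on) - mean(off))
    return effects

# Illustrative 4-run, 3-option design in which every pair of columns
# shows 00, 01, 10, 11 exactly once.
design = [
    [0, 0, 0],
    [0, 1, 1],
    [1, 0, 1],
    [1, 1, 0],
]
times = [10.0, 9.0, 8.0, 7.0]   # made-up run times in seconds

effects = main_effects(design, times)
print(effects)   # [-2.0, -1.0, 0.0]
```

Because each option is on in exactly half of the rows, every effect is estimated from the same number of runs on each side.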

8
Orthogonal Arrays
  • Orthogonal arrays (OAs) are well-chosen
    fractional factorial designs.
  • An OA is expressed as an N x k matrix of 0s and
    1s.
  • The columns are interpreted as factors (compiler
    options).
  • Each row of an array defines a compiler setting.
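
As a minimal sketch of how one row could become a compiler setting, the snippet below relies on gcc's convention that `-fNAME` enables an option and `-fno-NAME` disables it. The helper name `row_to_flags` is hypothetical, and the three options are picked from the list of 42 only for illustration.

```python
def row_to_flags(row, option_names):
    """Map one OA row (0/1 per option) to gcc command-line flags."""
    if len(row) != len(option_names):
        raise ValueError("row and option list must have the same length")
    return [
        f"-f{name}" if bit else f"-fno-{name}"
        for bit, name in zip(row, option_names)
    ]

# Three of the 42 options listed earlier, chosen only for illustration.
options = ["gcse", "unroll-loops", "strict-aliasing"]
flags = row_to_flags([1, 0, 1], options)
print(" ".join(flags))   # -fgcse -fno-unroll-loops -fstrict-aliasing
```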

9
Orthogonal Arrays (contd)
  • An OA has the property that, for any two arbitrary
    columns, the patterns 00, 01, 10, and 11 occur
    equally often.
  • According to this property, each compiler option is
    turned on and off equally often.
  • When we drop columns of an OA, the array is still
    an orthogonal array.

10
Example
  • 0 0 0 0 0
  • 1 0 0 1 1
  • 0 1 0 1 0
  • 0 0 1 0 1
  • 1 1 0 0 1
  • 1 0 1 1 0
  • 0 1 1 1 1
  • 1 1 1 0 0

Interpreted as compiler settings:

        O1   O2   O3   O4   O5
Run1    off  off  off  off  off
Run2    on   off  off  on   on
Run3    off  on   off  on   off
Run4    off  off  on   off  on
Run5    on   on   off  off  on
Run6    on   off  on   on   off
Run7    off  on   on   on   on
Run8    on   on   on   off  off
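
The pairwise property can be checked mechanically on the 8 x 5 array from this example. The function name `is_orthogonal` is hypothetical; the check is a sketch of the property as stated on the previous slide (every pair of columns shows 00, 01, 10, 11 equally often), and it also illustrates that dropping a column preserves the property.

```python
from collections import Counter
from itertools import combinations

PATTERNS = [(0, 0), (0, 1), (1, 0), (1, 1)]

def is_orthogonal(array):
    """True if every pair of columns contains 00, 01, 10, 11 equally often."""
    n_rows, n_cols = len(array), len(array[0])
    if n_rows % 4 != 0:
        return False
    for a, b in combinations(range(n_cols), 2):
        counts = Counter((row[a], row[b]) for row in array)
        if any(counts[p] != n_rows // 4 for p in PATTERNS):
            return False
    return True

# The example array: one row per run, one column per option O1..O5.
OA = [
    [0, 0, 0, 0, 0],
    [1, 0, 0, 1, 1],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [1, 1, 0, 0, 1],
    [1, 0, 1, 1, 0],
    [0, 1, 1, 1, 1],
    [1, 1, 1, 0, 0],
]

without_o3 = [row[:2] + row[3:] for row in OA]   # drop column O3

print(is_orthogonal(OA))          # True
print(is_orthogonal(without_o3))  # True
```
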
11
Inferential Statistics
  • Inferential statistics is used to predict whether
    a factor of an experiment has a significant
    effect in the presence of other factors.
  • Inferential statistics is based on a null
    hypothesis and test statistics.

12
Null Hypothesis
  • The null hypothesis denies the effect of a factor
    in an experiment, for example: "Compiler option A
    is not effective in optimizing application B."
  • The Mann-Whitney test is used to compute the test
    statistics to evaluate the likelihood of the null
    hypothesis.
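
A minimal sketch of the test statistic follows, written in plain Python so the computation is visible. It uses the normal approximation without tie or continuity correction, so the p-value is only approximate for small samples; in practice a library routine such as `scipy.stats.mannwhitneyu` would be a reasonable choice. The function name, the run times, and any significance threshold applied to `p` are assumptions, not values from the presentation.

```python
import math

def mann_whitney_u(xs, ys):
    """Two-sided Mann-Whitney U test using a normal approximation.

    Returns (u, p), where u counts the pairs (x, y) with x < y,
    with ties counted as 1/2.
    """
    n1, n2 = len(xs), len(ys)
    u = sum(1.0 if x < y else 0.5 if x == y else 0.0 for x in xs for y in ys)
    mean_u = n1 * n2 / 2                             # expected U under the null hypothesis
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)   # standard deviation of U under the null
    z = (u - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2))             # two-sided tail probability
    return u, p

# Made-up run times (sec) for settings with option A on and off.
times_on = [7.1, 7.3, 7.0, 7.2]
times_off = [9.0, 9.4, 8.8, 9.1]

u, p = mann_whitney_u(times_on, times_off)
print(u, p)
```

A small `p` means the data are unlikely under the null hypothesis, so option A would be reported as having a significant effect.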

13
Iterative Algorithm
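  • Inputs: list of compiler options, OA, target
    application, input dataset.
  • Compile the application according to the compiler
    settings from the OA.
  • Collect profiling data.
  • Apply the Mann-Whitney test to find the significant
    options.
  • Remove the significant options from the option list
    to obtain the new option list.
  • Repeat with the new option list.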
14
Iterative Algorithm (contd)
  • Repeat until:
      • All options are set, or
      • No options with a significant effect are detected
        anymore, or
      • The experimental data does not have enough
        variation (low standard deviation) to apply the
        Mann-Whitney test meaningfully.
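
The loop and its three stopping conditions can be sketched as below. This is an illustration under stated assumptions, not the authors' implementation: the demo uses a full factorial design in place of an OA, a synthetic timing function in place of compiling and running the application, and a simple gap between means in place of the Mann-Whitney test. All names, the `min_stdev` threshold, and the rule for choosing on or off are hypothetical.

```python
from itertools import product
from statistics import mean, pstdev

def tune(options, make_design, measure, significant, min_stdev=0.01):
    """Fix options iteratively; return (fixed settings, undecided options).

    make_design: option names -> list of 0/1 rows, one column per option
    measure:     dict(option -> 0/1) -> run time
    significant: (times with option on, times with option off) -> bool
    Every column of the design must contain both 0 and 1.
    """
    fixed, remaining = {}, list(options)
    while remaining:                                  # stop: all options are set
        design = make_design(remaining)
        times = [measure({**fixed, **dict(zip(remaining, row))}) for row in design]
        if pstdev(times) < min_stdev:                 # stop: too little variation
            break
        found = {}
        for col, name in enumerate(remaining):
            on = [t for row, t in zip(design, times) if row[col]]
            off = [t for row, t in zip(design, times) if not row[col]]
            if significant(on, off):
                # Assumption: turn the option on only if it lowers the mean time.
                found[name] = 1 if mean(on) < mean(off) else 0
        if not found:                                 # stop: nothing significant
            break
        fixed.update(found)
        remaining = [n for n in remaining if n not in found]
    return fixed, remaining

def full_factorial(names):
    """Stand-in for an OA: every on/off combination of the given options."""
    return list(product([0, 1], repeat=len(names)))

def synthetic_time(setting):
    """Stand-in for profiling: option a helps, b hurts, c has no effect."""
    return 10.0 - 2.0 * setting["a"] + 1.0 * setting["b"]

def mean_gap(on, off):
    """Stand-in for the Mann-Whitney test."""
    return abs(mean(on) - mean(off)) > 0.5

fixed, remaining = tune(["a", "b", "c"], full_factorial, synthetic_time, mean_gap)
print(fixed, remaining)   # {'a': 1, 'b': 0} ['c']
```

In this demo the first iteration fixes `a` and `b`; the second iteration stops because the remaining data shows no variation, which leaves `c` undecided.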

15
Application to GCC
  • Compiler version: 3.3.1
  • Number of options: 42
  • Architecture: Pentium 4 at 2.8 GHz
  • Applications: 7 programs from the SPECint 2000
    benchmark suite
  • Measurement: Unix time command
  • Improvement of the configured setting O_new
  • O_base: setting at optimization level O with all
    options explicitly turned off

16
Case study (parser, SPECint2000)
  • 1st iteration of the experiment using the
    benchmark Parser
  • We use an OA of order 48, which yields 48
    compiler settings.
  • Option 5 is selected.

17
Case study (parser, SPECint2000)
  • 2nd iteration of the experiment using the
    benchmark Parser
  • Options 3 and 13 are selected.

18
Case study (parser, SPECint2000)
  • 3rd iteration of the experiment using the
    benchmark Parser
  • Options 4, 19, and 33 are selected.

19
Overall Results
20
Profiling Time
21
Different Architectures
  • We apply the Mann-Whitney algorithm to two other
    architectures, the IA64 dual-core Itanium2 at
    1.296 GHz and the SUN SPARC dual-core at 1.28 GHz,
    to check the robustness of the approach.
  • We apply only the Mann-Whitney test, and use only
    5 of the 12 SPECint benchmarks because of
    compilation errors.

22
Different Architectures
(Chart panels: IA64, SPARC)
  • The performance for the IA64 is comparable to the
    performance of O3
  • The results for the SPARC are better.

23
Different Input Sets (1)
  • Improvements using the resulting settings and the
    reference input datasets.
  • The improvement is modest across all architectures,
    but it seems that the resulting settings are
    applicable to different datasets.

24
Different Input Sets (2)
25
Conclusion
  • The Mann-Whitney test can be used to achieve a
    fully automated method to determine optimal
    compiler settings for a single application.
  • The resulting compiler settings are applicable to
    different input datasets; however, the results are
    better when we use the target datasets.
  • The same methodology can be applied to code size
    reduction.