1
PREDICTING UNROLL FACTORS USING SUPERVISED
LEARNING
  • Mark Stephenson Saman Amarasinghe
  • Massachusetts Institute of Technology
  • Computer Science and Artificial Intelligence Lab

2
INTRODUCTION & MOTIVATION
  • Compiler heuristics rely on detailed knowledge of
    the system
  • Compiler interactions not understood
  • Architectures are complex

3
HEURISTIC DESIGN
  • Current approach to heuristic development is
    somewhat ad hoc
  • Can compiler writers learn anything from
    baseball?
  • Is it feasible to deal with empirical data?
  • Can we use statistics and machine learning to
    build heuristics?

4
CASE STUDY
  • Loop unrolling
  • Code expansion can degrade performance
  • Increased live ranges, register pressure
  • A myriad of interactions with other passes
  • Requires categorization into multiple classes
  • i.e., what's the unroll factor?

5
ORC'S HEURISTIC (UNKNOWN TRIP COUNT)
if (trip_count_tn == NULL) {
  // Unknown trip count: start just below OPT_unroll_times and
  // shrink until the unrolled body fits the size limit.
  UINT32 ntimes = MAX(1, OPT_unroll_times - 1);
  INT32 body_len = BB_length(head);
  while (ntimes > 1 && ntimes * body_len > CG_LOOP_unrolled_size_max)
    ntimes--;
  Set_unroll_factor(ntimes);
} else {
  // known trip count: continued on the next slide
6
ORC'S HEURISTIC (KNOWN TRIP COUNT)
} else {
  BOOL const_trip = TN_is_constant(trip_count_tn);
  INT32 const_trip_count = const_trip ? TN_value(trip_count_tn) : 0;
  INT32 body_len = BB_length(head);
  CG_LOOP_unroll_min_trip = MAX(CG_LOOP_unroll_min_trip, 1);
  // Fully unroll a constant-trip-count loop if the result fits
  // within the size and times limits.
  if (const_trip && CG_LOOP_unroll_fully &&
      (body_len * const_trip_count <= CG_LOOP_unrolled_size_max ||
       (CG_LOOP_unrolled_size_max == 0 &&
        CG_LOOP_unroll_times_max >= const_trip_count))) {
    Set_unroll_fully();
    Set_unroll_factor(const_trip_count);
  } else {
    // Otherwise clip the factor to a power of two that keeps the
    // unrolled body within the size limit.
    UINT32 ntimes = OPT_unroll_times;
    ntimes = MIN(ntimes, CG_LOOP_unroll_times_max);
    if (!is_power_of_two(ntimes))
      ntimes = 1 << log2(ntimes);
    while (ntimes > 1 && ntimes * body_len > CG_LOOP_unrolled_size_max)
      ntimes /= 2;
    if (const_trip) {
      while (ntimes > 1 && const_trip_count < 2 * ntimes)
        ntimes /= 2;
    }
    Set_unroll_factor(ntimes);
  }
}

7
SUPERVISED LEARNING
  • Supervised learning algorithms try to find a
    function F(X) → Y
  • X: a vector of characteristics (features) that
    describes a loop
  • Y: the empirically found best unroll factor

(Diagram: F(X) maps each loop to one of the unroll factors 1..8.)
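As a concrete illustration of this setup, a minimal Python sketch; the data is invented and the feature names are hypothetical placeholders, not the exact features ORC extracts:

  # Sketch of the F(X) -> Y formulation: each row of X describes one
  # loop, and Y holds the empirically best unroll factor for it.
  X = [
      [12, 2, 0, 3],    # hypothetical features: instructions, FP ops,
      [45, 8, 1, 10],   # branches, memory operations
      [ 7, 0, 0, 2],
  ]
  Y = [4, 1, 8]         # best unroll factor found for each loop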
8
EXTRACTING THE DATA
  • Extract features
  • Most features readily available in ORC
  • Kitchen sink approach
  • Finding the labels (best unroll factors)
  • Added instrumentation pass
  • Assembly instructions inserted to time loops
  • Calls to a library at all exit points
  • Compile and run at all unroll factors (1..8)
  • For each loop, choose the best factor as the
    label (sketched below)
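A minimal sketch of that labeling loop. compile_with_unroll() and run_and_time() are hypothetical stand-ins for the instrumentation pass and timing library described above:

  import random

  UNROLL_FACTORS = range(1, 9)      # try unroll factors 1..8

  def compile_with_unroll(loop_id, factor):
      # Hypothetical stand-in for recompiling the benchmark with
      # this unroll factor forced for the given loop.
      return (loop_id, factor)

  def run_and_time(binary):
      # Hypothetical stand-in for an instrumented run; the real
      # system times each loop via calls inserted at loop exits.
      return random.random()        # placeholder timing

  def best_unroll_factor(loop_id):
      # The label is the factor with the lowest measured loop time.
      timings = {f: run_and_time(compile_with_unroll(loop_id, f))
                 for f in UNROLL_FACTORS}
      return min(timings, key=timings.get)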

9
LEARNING ALGORITHMS
  • Prototyped in Matlab
  • Two learning algorithms classified our data set
    well
  • Near neighbors
  • Support Vector Machine (SVM)
  • Both algorithms classify quickly
  • Train "at the factory" (offline)
  • No increase in compilation time

10
NEAR NEIGHBORS
(Scatter plot: loops plotted by counts of FP operations vs. branches,
with regions labeled "unroll" and "don't unroll".)
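A toy near-neighbors classifier over two such features, with scikit-learn standing in for the Matlab prototype (the data is invented):

  from sklearn.neighbors import KNeighborsClassifier

  # Toy training set: [FP operations, branches] per loop, labeled
  # with the empirically best unroll factor.
  X = [[10, 1], [12, 0], [1, 5], [0, 6]]
  y = [8, 8, 1, 1]

  knn = KNeighborsClassifier(n_neighbors=3).fit(X, y)
  print(knn.predict([[11, 1]]))   # majority of nearest loops: 8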
11
SUPPORT VECTOR MACHINES
  • Map the original feature space into a
    higher-dimensional space (using a kernel)
  • Find a hyperplane that maximally separates the
    data
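The same toy data through an SVM; scikit-learn again stands in for the prototype, and the RBF kernel is an assumption (the slides do not name the kernel used):

  from sklearn.svm import SVC

  # The kernel implicitly maps feature vectors to a higher-dimensional
  # space, where SVC finds a maximum-margin separating hyperplane.
  X = [[10, 1], [12, 0], [1, 5], [0, 6]]   # toy [FP ops, branches]
  y = [8, 8, 1, 1]                         # best unroll factors
  svm = SVC(kernel='rbf', C=1.0).fit(X, y)
  print(svm.predict([[11, 1]]))            # predicted unroll factor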

12
SUPPORT VECTOR MACHINES
(Illustration: kernel-mapped data separated by a maximum-margin
hyperplane into "unroll" and "don't unroll".)
13
PREDICTION ACCURACY
  • Leave-one-out cross validation
  • Filter out ambiguous training examples
  • Only keep examples whose best unroll factor is
    obviously better (at least 1.05x speedup)
  • Throw away obviously noisy examples, as in the
    sketch below
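A sketch of the filtering plus leave-one-out evaluation, with scikit-learn standing in for the Matlab prototype; reading the 1.05x cutoff as "the best factor must beat the runner-up by 5%" is our interpretation of the slide:

  import numpy as np
  from sklearn.model_selection import LeaveOneOut
  from sklearn.svm import SVC

  def filter_ambiguous(X, times, threshold=1.05):
      # times[i][f-1] holds loop i's runtime at unroll factor f.
      # Keep loop i only if its best factor beats the runner-up by
      # at least `threshold`; its label is that best factor.
      keep, labels = [], []
      for i, t in enumerate(times):
          order = np.argsort(t)
          if t[order[1]] / t[order[0]] >= threshold:
              keep.append(X[i])
              labels.append(order[0] + 1)   # factors are 1-based
      return np.array(keep), np.array(labels)

  def loocv_accuracy(X, y):
      # Train on all examples but one, test on the held-out loop.
      hits = 0
      for train, test in LeaveOneOut().split(X):
          clf = SVC(kernel='rbf').fit(X[train], y[train])
          hits += int(clf.predict(X[test])[0] == y[test][0])
      return hits / len(X)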

14
REALIZING SPEEDUPS (SWP DISABLED)
15
FEATURE SELECTION
  • Feature selection is a way to identify the best
    features
  • Start with loads of features
  • Small feature sets are better
  • Learning algorithms run faster
  • Are less prone to overfitting the training data
  • Useless features can confuse learning algorithms

16
FEATURE SELECTION CONT.: MUTUAL INFORMATION SCORE
  • Measures the reduction of uncertainty in one
    variable given knowledge of another variable
  • Does not tell us how features interact with each
    other
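A self-contained sketch of the score for discretized data; the toy feature and label values are invented to show the two extremes:

  import math
  from collections import Counter

  def mutual_information(f, y):
      # I(F; Y) = sum over (a, b) of p(a,b) * log2(p(a,b) / (p(a)p(b))):
      # how much knowing feature f reduces uncertainty about label y.
      n = len(y)
      pf, py, pfy = Counter(f), Counter(y), Counter(zip(f, y))
      return sum(c / n * math.log2((c / n) / (pf[a] * py[b] / n**2))
                 for (a, b), c in pfy.items())

  labels   = [1, 1, 4, 4, 8, 8]
  feature1 = [0, 0, 1, 1, 2, 2]  # determines the label: high score
  feature2 = [0, 1, 0, 1, 0, 1]  # independent of the label: score 0
  print(mutual_information(feature1, labels))  # log2(3), about 1.58
  print(mutual_information(feature2, labels))  # 0.0

Because the score is computed per feature, it cannot capture how features interact, which motivates the greedy search on the next slide.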

17
FEATURE SELECTION CONT.: GREEDY FEATURE SELECTION
  • Choose the single best feature
  • Choose another feature that, together with the
    best feature, improves classification accuracy
    the most (sketched below)
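A generic forward-selection sketch; evaluate() is a hypothetical callback (for instance, the LOOCV accuracy above restricted to the chosen feature columns):

  def greedy_select(num_features, evaluate, k=2):
      # evaluate(cols) -> classification accuracy using only the
      # feature columns in `cols` (hypothetical callback).
      chosen = []
      remaining = list(range(num_features))
      for _ in range(k):
          # Add whichever remaining feature helps most alongside
          # the features already chosen.
          best = max(remaining, key=lambda f: evaluate(chosen + [f]))
          chosen.append(best)
          remaining.remove(best)
      return chosen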

18
FEATURE SELECTION: THE BEST FEATURES
19
RELATED WORK
  • Monsifrot et al., A Machine Learning Approach to
    Automatic Production of Compiler Heuristics.
    2002
  • Calder et al., Evidence-Based Static Branch
    Prediction Using Machine Learning. 1997
  • Cavazos et al., Inducing Heuristics to Decide
    Whether to Schedule. 2004
  • Moss et al., Learning to Schedule Straight-Line
    Code. 1997
  • Cooper et al., Optimizing for Reduced Code Space
    using Genetic Algorithms. 1999
  • Puppin et al., Adapting Convergent Scheduling
    using Machine Learning. 2003
  • Stephenson et al., Meta Optimization: Improving
    Compiler Heuristics with Machine Learning. 2003

20
CONCLUSION
  • Supervised classification can effectively find
    good heuristics
  • Even for multi-class problems
  • SVM and near neighbors perform well
  • Could potentially have a big impact
  • We spent very little time tuning the learning
    parameters
  • Let a machine learning algorithm tell us which
    features are best

21
(No Transcript)
22
SOFTWARE PIPELINING
  • ORC has been tuned with SWP in mind
  • Every major release of ORC has had a different
    unrolling heuristic for SWP
  • Currently 205 lines long
  • Can we learn a heuristic that outperforms ORC's
    SWP unrolling heuristic?

23
REALIZING SPEEDUPS (SWP ENABLED)
24
HURDLES
  • Compiler writer must extract features
  • Acquiring labels takes time
  • Instrumentation library
  • 2 weeks to collect data
  • Predictions confined to training labels
  • Have to tweak learning algorithms
  • Noise