Title: PREDICTING UNROLL FACTORS USING SUPERVISED LEARNING
1. PREDICTING UNROLL FACTORS USING SUPERVISED LEARNING
- Mark Stephenson, Saman Amarasinghe
- Massachusetts Institute of Technology
- Computer Science and Artificial Intelligence Lab
2. INTRODUCTION / MOTIVATION
- Compiler heuristics rely on detailed knowledge of the system
- Compiler interactions not understood
- Architectures are complex
3. HEURISTIC DESIGN
- Current approach to heuristic development is somewhat ad hoc
- Can compiler writers learn anything from baseball?
- Is it feasible to deal with empirical data?
- Can we use statistics and machine learning to build heuristics?
4. CASE STUDY
- Loop unrolling
- Code expansion can degrade performance
- Increased live ranges, register pressure
- A myriad of interactions with other passes
- Requires categorization into multiple classes
- i.e., what's the unroll factor?
5. ORC'S HEURISTIC (UNKNOWN TRIP COUNT)

  if (trip_count_tn == NULL) {
    // Trip count unknown: start just below the configured factor and
    // shrink until the unrolled body fits the size budget.
    UINT32 ntimes = MAX(1, OPT_unroll_times - 1);
    INT32 body_len = BB_length(head);
    while (ntimes > 1 &&
           ntimes * body_len > CG_LOOP_unrolled_size_max)
      ntimes--;
    Set_unroll_factor(ntimes);
  }
  else   // known trip count: continued on the next slide
6. ORC'S HEURISTIC (KNOWN TRIP COUNT)

  else {
    BOOL const_trip = TN_is_constant(trip_count_tn);
    INT32 const_trip_count = const_trip ? TN_value(trip_count_tn) : 0;
    INT32 body_len = BB_length(head);
    CG_LOOP_unroll_min_trip = MAX(CG_LOOP_unroll_min_trip, 1);
    if (const_trip && CG_LOOP_unroll_fully &&
        (body_len * const_trip_count <= CG_LOOP_unrolled_size_max ||
         (CG_LOOP_unrolled_size_max == 0 &&
          CG_LOOP_unroll_times_max >= const_trip_count))) {
      // Small constant-trip loop: unroll it completely.
      Set_unroll_fully();
      Set_unroll_factor(const_trip_count);
    }
    else {
      UINT32 ntimes = OPT_unroll_times;
      ntimes = MIN(ntimes, CG_LOOP_unroll_times_max);
      // Round the factor down to a power of two.
      if (!is_power_of_two(ntimes))
        ntimes = 1 << log2(ntimes);
      // Halve until the unrolled body fits the size budget.
      while (ntimes > 1 &&
             ntimes * body_len > CG_LOOP_unrolled_size_max)
        ntimes /= 2;
      // Don't unroll past half the (constant) trip count.
      if (const_trip)
        while (ntimes > 1 && const_trip_count < 2 * ntimes)
          ntimes /= 2;
      Set_unroll_factor(ntimes);
    }
  }
7. SUPERVISED LEARNING
- Supervised learning algorithms try to find a function F(X) → Y
- X: a vector of characteristics that define a loop
- Y: the empirically found best unroll factor
[Diagram: F(X) maps each loop to an unroll factor in 1..8]
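To make the formulation concrete, here is a minimal sketch in Python (our rendering; the authors prototyped in Matlab, and the feature columns and values below are hypothetical stand-ins for ORC's loop characteristics):

  import numpy as np

  # Each row of X describes one loop; columns are hypothetical loop
  # characteristics (e.g., FP operation count, branch count, body size).
  X = np.array([[12, 1, 48],
                [ 3, 4, 20],
                [ 0, 2,  9]])

  # y holds the empirically best unroll factor (1..8) for each loop.
  y = np.array([8, 2, 1])

  # A classifier F is then trained so that F(X) predicts y for unseen loops.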
8. EXTRACTING THE DATA
- Extract features
- Most features readily available in ORC
- Kitchen-sink approach
- Finding the labels (best unroll factors)
- Added an instrumentation pass
- Assembly instructions inserted to time loops
- Calls to a library at all exit points
- Compile and run at all unroll factors (1..8)
- For each loop, choose the best one as the label (see the sketch below)
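A minimal sketch of the labeling step, assuming a hypothetical compile_and_time() helper that builds the program at a fixed unroll factor and returns per-loop runtimes from the instrumentation library:

  def find_labels(benchmark, loops, factors=range(1, 9)):
      # Compile and run once per candidate unroll factor.
      times = {f: compile_and_time(benchmark, unroll_factor=f)
               for f in factors}
      labels = {}
      for loop in loops:
          # The label is the factor with the lowest measured runtime.
          labels[loop] = min(factors, key=lambda f: times[f][loop])
      return labels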
9. LEARNING ALGORITHMS
- Prototyped in Matlab
- Two learning algorithms classified our data set well
- Near neighbors
- Support Vector Machine (SVM)
- Both algorithms classify quickly
- Train at the factory
- No increase in compilation time
10. NEAR NEIGHBORS

[Scatter plot: loops plotted by FP operations vs. branches, labeled "unroll" or "don't unroll"]
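A minimal near-neighbors sketch in Python (our rendering, not the authors' Matlab prototype): classify a loop by letting its k nearest training loops vote with their best unroll factors.

  import numpy as np

  def knn_predict(X_train, y_train, x, k=3):
      # Euclidean distance from the query loop to every training loop.
      dists = np.linalg.norm(X_train - x, axis=1)
      # The k closest loops vote with their labels (unroll factors).
      nearest = y_train[np.argsort(dists)[:k]]
      factors, counts = np.unique(nearest, return_counts=True)
      return factors[np.argmax(counts)]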
11. SUPPORT VECTOR MACHINES
- Map the original feature space into a higher-dimensional space (using a kernel)
- Find a hyperplane that maximally separates the data
12. SUPPORT VECTOR MACHINES

[Plot: "unroll" and "don't unroll" classes separated by a maximum-margin boundary]
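A sketch of the same idea using scikit-learn's SVC, which is an assumption on our part (it postdates the paper; the authors used their own Matlab prototype). The RBF kernel performs the implicit mapping to a higher-dimensional space, the fitted model is the maximum-margin separator there, and multi-class prediction over factors 1..8 is handled by built-in one-vs-one voting. X_train and y_train are the hypothetical arrays from the earlier sketches.

  from sklearn.svm import SVC

  clf = SVC(kernel="rbf", C=1.0)   # kernelized maximum-margin classifier
  clf.fit(X_train, y_train)        # loop features -> best unroll factors
  predicted_factor = clf.predict(X_test)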
13. PREDICTION ACCURACY
- Leave-one-out cross validation (sketched below)
- Filter out ambiguous training examples
- Only keep examples where the best factor is obviously better (at least 1.05x)
- Throw away obviously noisy examples
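A sketch of the evaluation, assuming the hypothetical arrays from the earlier sketches: leave-one-out trains on all loops but one and tests on the held-out loop, and the 1.05x filter drops loops whose best factor barely beats the runner-up.

  import numpy as np

  def loocv_accuracy(X, y, train_and_predict):
      correct = 0
      for i in range(len(X)):
          # Hold out loop i, train on the rest, then predict loop i.
          X_train = np.delete(X, i, axis=0)
          y_train = np.delete(y, i)
          correct += train_and_predict(X_train, y_train, X[i]) == y[i]
      return correct / len(X)

  def unambiguous(loop_times):
      # loop_times: this loop's runtime at each candidate unroll factor.
      # Keep the loop only if its best factor wins by at least 5%.
      t = sorted(loop_times)
      return t[1] / t[0] >= 1.05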
14. REALIZING SPEEDUPS (SWP DISABLED)
15. FEATURE SELECTION
- Feature selection is a way to identify the best features
- Start with loads of features
- Small feature sets are better:
- Learning algorithms run faster
- Less prone to overfitting the training data
- Useless features can confuse learning algorithms
16. FEATURE SELECTION CONT.: MUTUAL INFORMATION SCORE
- Measures the reduction of uncertainty in one variable given knowledge of another variable (see the sketch below)
- Does not tell us how features interact with each other
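A minimal sketch of the score for one discrete feature f against the labels y, computed from empirical frequencies (continuous features would be binned first):

  import numpy as np

  def mutual_information(f, y):
      f, y = np.asarray(f), np.asarray(y)
      mi = 0.0
      for fv in np.unique(f):
          for yv in np.unique(y):
              p_fy = np.mean((f == fv) & (y == yv))   # joint probability
              p_f, p_y = np.mean(f == fv), np.mean(y == yv)
              if p_fy > 0:
                  mi += p_fy * np.log2(p_fy / (p_f * p_y))
      return mi   # bits of uncertainty about y removed by knowing f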
17. FEATURE SELECTION CONT.: GREEDY FEATURE SELECTION
- Choose the single best feature
- Choose another feature that, together with the best feature, improves classification accuracy most (see the sketch below)
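A sketch of the greedy procedure, assuming a hypothetical accuracy(feature_subset) helper that runs leave-one-out cross validation restricted to those feature columns:

  def greedy_select(all_features, accuracy, k=2):
      chosen = []
      for _ in range(k):
          remaining = [f for f in all_features if f not in chosen]
          # Add the feature that most improves accuracy alongside
          # the features already chosen.
          best = max(remaining, key=lambda f: accuracy(chosen + [f]))
          chosen.append(best)
      return chosen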
18. FEATURE SELECTION: THE BEST FEATURES
19. RELATED WORK
- Monsifrot et al., "A Machine Learning Approach to Automatic Production of Compiler Heuristics," 2002
- Calder et al., "Evidence-Based Static Branch Prediction Using Machine Learning," 1997
- Cavazos et al., "Inducing Heuristics to Decide Whether to Schedule," 2004
- Moss et al., "Learning to Schedule Straight-Line Code," 1997
- Cooper et al., "Optimizing for Reduced Code Space Using Genetic Algorithms," 1999
- Puppin et al., "Adapting Convergent Scheduling Using Machine Learning," 2003
- Stephenson et al., "Meta Optimization: Improving Compiler Heuristics with Machine Learning," 2003
20. CONCLUSION
- Supervised classification can effectively find good heuristics
- Even for multi-class problems
- SVM and near neighbors perform well
- Potentially a big impact:
- We spent very little time tuning the learning parameters
- Let a machine learning algorithm tell us which features are best
21. (No transcript: image-only slide)
22. SOFTWARE PIPELINING
- ORC has been tuned with SWP in mind
- Every major release of ORC has had a different unrolling heuristic for SWP
- Currently 205 lines long
- Can we learn a heuristic that outperforms ORC's SWP unrolling heuristic?
23. REALIZING SPEEDUPS (SWP ENABLED)
24. HURDLES
- Compiler writer must extract features
- Acquiring labels takes time:
- Instrumentation library
- 2 weeks to collect data
- Predictions confined to training labels
- Have to tweak learning algorithms
- Noise