Commutativity Analysis for Software Parallelization: Letting Program Transformations See the Big Picture

Farhana Aleen, Nate Clark, Georgia Institute of Technology

1
Commutativity Analysis for Software Parallelization: Letting Program Transformations See the Big Picture
  • Farhana Aleen, Nate Clark
  • Georgia Institute of Technology
  • Modified by Michelle Goodstein
  • LBA Reading Group 6/4/09

2
Motivation
Extracting performance from multi-core is hard
Programmers need to write parallel programs
Automatic compiler-based parallelization helps
3
Source of Parallelism: Commutativity
[Figure: sum(5); sum(10) yields 15, and sum(10); sum(5) also yields 15. Likewise, an application that calls foo(a); foo(b) produces the same output as one that calls foo(b); foo(a).]

4
Existing Approach Of Detecting Commutativity
  • Execute the function in two different orders
  • Check equivalence of memory

[Figure: sum(x); sum(y) yields x + y, and sum(y); sum(x) yields y + x.]
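
As a rough illustration of the memory-equivalence check described on this slide, the sketch below runs one function with two arguments in both orders on copies of the same initial state and compares the results. It uses concrete execution on a Python dictionary, which is an assumption made for illustration; the names `commutes_by_memory`, `sum_into`, and `append_to` are hypothetical and do not come from the slides.

```python
import copy


def commutes_by_memory(func, initial_state, arg_a, arg_b):
    """Run func(a); func(b) and func(b); func(a) on copies of the same
    initial state and report whether the final states are identical."""
    state_ab = copy.deepcopy(initial_state)
    func(state_ab, arg_a)
    func(state_ab, arg_b)

    state_ba = copy.deepcopy(initial_state)
    func(state_ba, arg_b)
    func(state_ba, arg_a)

    return state_ab == state_ba


def sum_into(state, value):
    # Accumulating into a total gives the same memory in either order.
    state["total"] += value


def append_to(state, value):
    # Appending records the call order, so the final memory differs.
    state["items"].append(value)


print(commutes_by_memory(sum_into, {"total": 0}, 5, 10))   # True
print(commutes_by_memory(append_to, {"items": []}, 2, 6))  # False
```

The second call shows the limitation the next slide picks up: a function can fail this check purely because of the order in which it stores its data.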

5
Opportunities Missed By Existing Approach
Insertion of elements into a hash set (vector<linked_list>)
[Figure: insert(2); insert(6) and insert(6); insert(2) store the same elements, 2 and 6, but in a different order in memory.]
6
The Idea
[Figure: whether the elements are inserted as insert(2); insert(6) or as insert(6); insert(2), is_member(2) answers Yes! and remove(6) leaves only 2 in the set.]
class hash_set { vector<linked_list> set; insert(); remove(); is_member(); }
  • Identical memory does not matter
  • Final output matters
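
The point of this slide can be sketched with a toy hash set whose buckets are Python lists standing in for linked lists. The class below is an illustrative assumption, not the data structure analyzed in the talk: the bucket count, the head insertion, and the modulo hash are all made up for the example.

```python
class HashSet:
    """Toy hash set: a vector of buckets, each a list used like a linked list."""

    def __init__(self, num_buckets=4):
        self.buckets = [[] for _ in range(num_buckets)]

    def _bucket(self, value):
        return self.buckets[value % len(self.buckets)]

    def insert(self, value):
        bucket = self._bucket(value)
        if value not in bucket:
            bucket.insert(0, value)  # push at the head, like a linked list

    def remove(self, value):
        bucket = self._bucket(value)
        if value in bucket:
            bucket.remove(value)
            return True
        return False

    def is_member(self, value):
        return value in self._bucket(value)


first = HashSet()
first.insert(2)
first.insert(6)

second = HashSet()
second.insert(6)
second.insert(2)

# 2 and 6 land in the same bucket, so the two layouts differ.
layouts_differ = first.buckets != second.buckets

# The reader functions cannot tell the two sets apart.
same_membership = first.is_member(2) == second.is_member(2)
same_removal = first.remove(6) == second.remove(6)
layouts_equal_after_remove = first.buckets == second.buckets

print(layouts_differ, same_membership, same_removal, layouts_equal_after_remove)
```

Here the two insertion orders fail a memory comparison, yet every value the program can observe through `is_member()` and `remove()` is the same.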

7
Our Approach: Step 1
  • Symbolically execute in two different orders
  • Check for identical memory layouts

[Figure: starting from the same memory M, insert(I1); insert(I2) produces M1,2 and insert(I2); insert(I1) produces M2,1. Is M1,2 identical to M2,1?]

If not identical, check reader functions
8
Step 2: Checking Reader Functions
[Figure: each reader, is_member() and remove(), is applied with the same input I to both M1,2 and M2,1, and the results are compared.]

  • Candidate function: insert()
  • Readers of the candidate function's output: is_member(), remove()
  • Readers of the readers' output
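
A minimal sketch of the reader check follows, under several assumptions: it uses concrete values rather than the symbolic execution and random interpretation the talk relies on, it probes readers with a small hand-picked set of inputs, and it bounds the "readers of readers" recursion with a hypothetical `depth` parameter. All function names are illustrative.

```python
import copy

NUM_BUCKETS = 4


def make_set():
    return [[] for _ in range(NUM_BUCKETS)]


def insert(buckets, value):
    bucket = buckets[value % NUM_BUCKETS]
    if value not in bucket:
        bucket.insert(0, value)


def remove(buckets, value):
    bucket = buckets[value % NUM_BUCKETS]
    if value in bucket:
        bucket.remove(value)
        return True
    return False


def is_member(buckets, value):
    return value in buckets[value % NUM_BUCKETS]


def first_in_bucket(buckets, value):
    # Hypothetical reader that exposes the internal storage order.
    bucket = buckets[value % NUM_BUCKETS]
    return bucket[0] if bucket else None


def readers_agree(state_1, state_2, readers, probes, depth):
    """True if no reader can distinguish the two states (up to depth)."""
    if state_1 == state_2:
        return True   # step 1: identical memory
    if depth == 0:
        return False  # give up conservatively
    for reader in readers:
        for probe in probes:
            copy_1, copy_2 = copy.deepcopy(state_1), copy.deepcopy(state_2)
            if reader(copy_1, probe) != reader(copy_2, probe):
                return False
            # A reader that writes memory (such as remove) has readers of
            # its own, so the states it leaves behind are checked as well.
            mutated = copy_1 != state_1 or copy_2 != state_2
            if mutated and not readers_agree(copy_1, copy_2, readers,
                                             probes, depth - 1):
                return False
    return True


def commutes(candidate, arg_a, arg_b, readers, probes, depth=2):
    state_ab, state_ba = make_set(), make_set()
    candidate(state_ab, arg_a)
    candidate(state_ab, arg_b)
    candidate(state_ba, arg_b)
    candidate(state_ba, arg_a)
    return readers_agree(state_ab, state_ba, readers, probes, depth)


print(commutes(insert, 2, 6, [is_member, remove], [2, 6, 7]))
print(commutes(insert, 2, 6, [is_member, remove, first_in_bucket], [2, 6, 7]))
```

With only `is_member()` and `remove()` as readers, the two insertion orders are indistinguishable. Adding a reader that exposes bucket order makes the same candidate non-commutative, which is why the set of readers matters.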




9
Pros/Cons Of Our Approach
  • Pros
    • Identifies more commutativity
    • Finds more parallelism
  • Cons
    • More equivalence checking

10
Equivalence Checking Options
[Figure: Random Testing, Random Interpretation, and Symbolic Execution compared in terms of Speed and Accuracy.]
11
Random Interpretation Example
  • Choose random values for input variables
  • Execute taken branch of the condition
  • Execute fall-through branch
  • Replicate initial memory state
  • Adjust values
  • Affine join of v1 and v2 w.r.t. weight w
  • $\phi_w(v_1, v_2) = w \cdot v_1 + (1 - w) \cdot v_2$

Program and values at each step:

Input(x, y)                  x = 2, y = 3
a = x + y                    a = 5
if (x != y)
  taken:        b = 2x       x = 2, y = 3, a = 5, b = 4
  fall-through: b = a        x = 3, y = 3, a = 6, b = 6   (after adjusting values)
Affine join with w = 3       x = 5, y = 3, a = 8, b = 10
assert(b = 2x)               10 = 2 · 5 holds
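
The numbers on this slide can be reproduced with a short sketch. The "adjust values" step here is a simplification assumed for illustration: it sets x to y so that the fall-through condition holds, then recomputes a. The function names are hypothetical.

```python
def affine_join(state_1, state_2, w):
    """Combine two states variable by variable: w*v1 + (1 - w)*v2."""
    return {name: w * state_1[name] + (1 - w) * state_2[name]
            for name in state_1}


def interpret(x, y, w):
    a = x + y

    # Taken branch (x != y): b = 2x, using the random inputs as they are.
    taken = {"x": x, "y": y, "a": a, "b": 2 * x}

    # Fall-through branch (x == y): replicate the state, adjust the values
    # so that the condition holds (assumed here: x := y, a recomputed),
    # then execute b = a.
    adjusted_x = y
    adjusted_a = adjusted_x + y
    fall_through = {"x": adjusted_x, "y": y, "a": adjusted_a, "b": adjusted_a}

    # v1 is the fall-through state and v2 the taken state, as on the slide.
    return affine_join(fall_through, taken, w)


joined = interpret(x=2, y=3, w=3)
print(joined)                        # {'x': 5, 'y': 3, 'a': 8, 'b': 10}
print(joined["b"] == 2 * joined["x"])  # True: b = 2x holds on both paths
print(joined["b"] == joined["a"])      # False: b = a holds on one path only
```

The last two lines show why the join is useful: a linear relationship that holds on every path survives the join, while one that holds on only one path is broken by it for this choice of w.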
12
Random Interpretation In Equivalence Checking
[Figure: starting from the same initial memory, foo(x); foo(y) and foo(y); foo(x) are each executed, and the modified memory of the two orders is compared.]
13
Why Random Interpretation Works
  • Avoids scalability problem
  • Affine join superposes all execution paths
  • Linear relationships same before and after the
    join
  • The error probability is very low
    at most
  • Repeating the runs decreases the error probability exponentially

14
(Added Slide) Probability details
  • Low error probability
  • In general, at most 1 bad random value per join in the program
  • $\Pr(\text{error}) \le \#\text{joins} / 2^{64}$
  • Empirically (prior work), the # of joins increases linearly in the # of program statements
  • Coefficient of 0.5 to 5.2
  • Assume a 1000-statement function that is commutative
  • $\Pr(\text{error}) \le (5.2 \times 1000) / 2^{64} \approx 2.8 \times 10^{-16}$
  • To decrease error, increase the # of runs
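
The arithmetic on this slide can be checked directly. The sketch below takes the worst-case coefficient of 5.2 joins per statement and a 64-bit value range from the slide; treating repeated runs as independent, so that their bounds multiply, is an assumption made for the last line.

```python
def error_bound(num_statements, joins_per_statement=5.2, word_bits=64):
    """Upper bound on the error probability of a single run:
    (# of joins) / 2^word_bits."""
    joins = joins_per_statement * num_statements
    return joins / 2 ** word_bits


single_run = error_bound(1000)
print(f"{single_run:.1e}")  # 2.8e-16

# Assuming independent runs, the bound for k runs is the single-run
# bound raised to the k-th power.
two_runs = single_run ** 2
print(two_runs < single_run)  # True
```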

15
Experimental Methodology
  • Trimaran compiler
  • Scheduled them
  • Infinite issue machine
  • Perfect memory system
  • Pointer Analysis
    • Stack and heap sensitive
  • Tested on
    • SPECint2000
    • MediaBench

16
(Added) Experimental Methodology
  • In some ways, an upper bound on commutativity
  • Can issue as many instructions as are commutative
  • Memory is perfect
  • Not a true upper bound, though
  • Random interpretation will sometimes fail/give up

17
(Added) Suggested Parallelism
  • Suppose a sorting algorithm prints to stderr
    if a debug flag is set
  • Cannot be parallelized because of dependences
    between the writes
  • A human can differentiate
  • Compiler identifies things that are almost
    parallel
  • Human states that the semantic changes (e.g.,
    printf order) do not matter → parallel
  • Otherwise, ignore

18
Analysis Time: Commutativity Analysis
19
Functions Commutative
20
Parallelism Uncovered
21
Summary
  • Commutativity is a significant source of parallelism
  • Identical memory does not matter for identifying
    commutative functions
  • Our technique
  • 13% more commutative functions detected
  • 28% more parallelism uncovered

22
  • Thank you

23
Functions Commutative
24
Parallelism Uncovered
25
Analysis Time: Commutativity Analysis