1
Program Analysis via Graph Reachability
  • Thomas Reps
  • University of Wisconsin

http://www.cs.wisc.edu/~reps/
PLDI '00 Tutorial, Vancouver, B.C., June 18, 2000
2
Backward Slice
int main() {
  int sum = 0;
  int i = 1;
  while (i < 11) {
    sum = sum + i;
    i = i + 1;
  }
  printf("%d\n", sum);
  printf("%d\n", i);
}
3
Backward Slice
[Program: same as on slide 2.]
Backward slice with respect to printf("%d\n", i)
4
Slice Extraction
int main() {
  int i = 1;
  while (i < 11) {
    i = i + 1;
  }
  printf("%d\n", i);
}
Backward slice with respect to printf("%d\n", i)
5
Forward Slice
[Program: same as on slide 2.]
6
Forward Slice
[Program: same as on slide 2.]
Forward slice with respect to sum = 0
7
What Are Slices Useful For?
  • Understanding Programs
      • What is affected by what?
  • Restructuring Programs
      • Isolation of separate computational threads
  • Program Specialization and Reuse
      • Slices = specialized programs
      • Only reuse needed slices
  • Program Differencing
      • Compare slices to identify changes
  • Testing
      • What new test cases would improve coverage?
      • What regression tests must be rerun after a change?

8
Line-Character-Count Program
void line_char_count(FILE *f) {
  int lines = 0;
  int chars;
  BOOL eof_flag = FALSE;
  int n;
  extern void scan_line(FILE *f, BOOL *bptr, int *iptr);
  scan_line(f, &eof_flag, &n);
  chars = n;
  while (eof_flag == FALSE) {
    lines = lines + 1;
    scan_line(f, &eof_flag, &n);
    chars = chars + n;
  }
  printf("lines = %d\n", lines);
  printf("chars = %d\n", chars);
}
9
Character-Count Program
void char_count(FILE *f) {
  int lines = 0;
  int chars;
  BOOL eof_flag = FALSE;
  int n;
  extern void scan_line(FILE *f, BOOL *bptr, int *iptr);
  scan_line(f, &eof_flag, &n);
  chars = n;
  while (eof_flag == FALSE) {
    lines = lines + 1;
    scan_line(f, &eof_flag, &n);
    chars = chars + n;
  }
  printf("lines = %d\n", lines);
  printf("chars = %d\n", chars);
}
10
Line-Character-Count Program
[Program: same as on slide 8.]
11
Line-Count Program
void line_count(FILE *f) {
  int lines = 0;
  int chars;
  BOOL eof_flag = FALSE;
  int n;
  extern void scan_line2(FILE *f, BOOL *bptr, int *iptr);
  scan_line2(f, &eof_flag, &n);
  chars = n;
  while (eof_flag == FALSE) {
    lines = lines + 1;
    scan_line2(f, &eof_flag, &n);
    chars = chars + n;
  }
  printf("lines = %d\n", lines);
  printf("chars = %d\n", chars);
}
12
Control Flow Graph
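int main() {
  int sum = 0;
  int i = 1;
  while (i < 11) {
    sum = sum + i;
    i = i + 1;
  }
  printf("%d\n", sum);
  printf("%d\n", i);
}
[Figure: control-flow graph. Nodes: Enter, sum = 0, i = 1, while (i < 11) with T and F branches, sum = sum + i, i = i + 1, printf(sum), printf(i).]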
13
Flow Dependence Graph
[Program: same as on slide 12.]
Flow dependence: an edge p → q means the value of the variable assigned at p may be used at q.
[Figure: flow dependence graph. Nodes: Enter, sum = 0, i = 1, while (i < 11), sum = sum + i, i = i + 1, printf(sum), printf(i).]
14
Control Dependence Graph
[Program: same as on slide 12.]
Control dependence: an edge p → q labeled T means q is reached from p if condition p is true (T), not otherwise.
Similar for false (F).
[Figure: control dependence graph with edges labeled T. Nodes: Enter, sum = 0, i = 1, while (i < 11), sum = sum + i, i = i + 1, printf(sum), printf(i).]
15
Program Dependence Graph (PDG)
[Program: same as on slide 12.]
[Figure: PDG combining control dependence edges (labeled T) and flow dependence edges. Nodes: Enter, sum = 0, i = 1, while (i < 11), sum = sum + i, i = i + 1, printf(sum), printf(i).]
16
Program Dependence Graph (PDG)
int main() {
  int i = 1;
  int sum = 0;
  while (i < 11) {
    sum = sum + i;
    i = i + 1;
  }
  printf("%d\n", sum);
  printf("%d\n", i);
}
[Figure: PDG with edges labeled T. Nodes: Enter, sum = 0, i = 1, while (i < 11), sum = sum + i, i = i + 1, printf(sum), printf(i).]
17
Backward Slice
[Program: same as on slide 12.]
[Figure: PDG of the program. Nodes: Enter, sum = 0, i = 1, while (i < 11), sum = sum + i, i = i + 1, printf(sum), printf(i).]
18
Backward Slice (2)
[Figure: same program and PDG, step 2 of the backward-slice animation.]
19
Backward Slice (3)
[Figure: same program and PDG, step 3 of the backward-slice animation.]
20
Backward Slice (4)
[Figure: same program and PDG, step 4 of the backward-slice animation.]
21
Slice Extraction
int main() {
  int i = 1;
  while (i < 11) {
    i = i + 1;
  }
  printf("%d\n", i);
}
[Figure: PDG of the slice, edges labeled T. Nodes: Enter, i = 1, while (i < 11), i = i + 1, printf(i).]
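A backward slice on a PDG is plain backward reachability from the slicing criterion. The sketch below is a minimal illustration, not the tutorial's implementation: the edge lists are hand-derived from the sum program and are an assumption, as are the names `CONTROL_DEPS`, `FLOW_DEPS`, and `backward_slice`.

```python
from collections import defaultdict

# Hand-built PDG for the sum program (assumed edges), as (source, target) pairs.
CONTROL_DEPS = [
    ("Enter", "sum = 0"), ("Enter", "i = 1"), ("Enter", "while (i < 11)"),
    ("Enter", "printf(sum)"), ("Enter", "printf(i)"),
    ("while (i < 11)", "while (i < 11)"),
    ("while (i < 11)", "sum = sum + i"), ("while (i < 11)", "i = i + 1"),
]
FLOW_DEPS = [
    ("sum = 0", "sum = sum + i"), ("sum = 0", "printf(sum)"),
    ("sum = sum + i", "sum = sum + i"), ("sum = sum + i", "printf(sum)"),
    ("i = 1", "while (i < 11)"), ("i = 1", "sum = sum + i"),
    ("i = 1", "i = i + 1"), ("i = 1", "printf(i)"),
    ("i = i + 1", "while (i < 11)"), ("i = i + 1", "sum = sum + i"),
    ("i = i + 1", "i = i + 1"), ("i = i + 1", "printf(i)"),
]

def backward_slice(edges, criterion):
    """Return every node from which `criterion` is reachable."""
    preds = defaultdict(set)
    for src, dst in edges:
        preds[dst].add(src)
    seen, stack = {criterion}, [criterion]
    while stack:
        for p in preds[stack.pop()]:
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen

slice_i = backward_slice(CONTROL_DEPS + FLOW_DEPS, "printf(i)")
```

Under these assumed edges, `slice_i` contains exactly the five nodes of the extracted slice; no statement that mentions `sum` is reached.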
22
(No Transcript)
23
Interprocedural Slice
int main() {
  int sum = 0;
  int i = 1;
  while (i < 11) {
    sum = add(sum, i);
    i = add(i, 1);
  }
  printf("%d\n", sum);
  printf("%d\n", i);
}

int add(int x, int y) {
  return x + y;
}
24
Interprocedural Slice
[Program: same as on slide 23.]
Backward slice with respect to printf("%d\n", i)
25
Interprocedural Slice
[Program: same as on slide 23.]
Superfluous components included by Weiser's slicing algorithm [TSE 84].
Left out by algorithm of Horwitz, Reps, & Binkley [PLDI 88; TOPLAS 90].
26
How is an SDG Created?
  • Each PDG has nodes for
      • entry point
      • procedure parameters and function result
  • Each call site has nodes for
      • call
      • arguments and function result
  • Appropriate edges
      • entry node to parameters
      • call node to arguments
      • call node to entry node
      • arguments to parameters

27
System Dependence Graph (SDG)
Enter main
Call p
Call p
Enter p
28
SDG for the Sum Program
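[Figure: SDG for the sum program. Node labels:
  main: Enter main, sum = 0, i = 1, while (i < 11), printf(sum), printf(i)
  first call site: Call add, x_in = sum, y_in = i, sum = x_out
  second call site: Call add, x_in = i, y_in = 1, i = x_out
  add: Enter add, x = x_in, y = y_in, x = x + y, x_out = x]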
29
Interprocedural Backward Slice
Enter main
Call p
Call p
Enter p
30
Interprocedural Backward Slice (2)
Enter main
Call p
Call p
Enter p
31
Interprocedural Backward Slice (3)
Enter main
Call p
Call p
Enter p
32
Interprocedural Backward Slice (4)
Enter main
Call p
Call p
Enter p
33
Interprocedural Backward Slice (5)
Enter main
Call p
Call p
Enter p
34
Interprocedural Backward Slice (6)
Enter main
Call p
Call p
Enter p
35
Matched-Parenthesis Path
36
Interprocedural Backward Slice (6)
Enter main
Call p
Call p
Enter p
37
Interprocedural Backward Slice (7)
Enter main
Call p
Call p
Enter p
38
Slice Extraction
Enter main
Call p
Enter p
39
Slice of the Sum Program
[Figure: SDG of the slice. Node labels:
  main: Enter main, i = 1, while (i < 11), printf(i)
  call site: Call add, x_in = i, y_in = 1, i = x_out
  add: Enter add, x = x_in, y = y_in, x = x + y, x_out = x]
40
CFL-Reachability [Yannakakis 90]
  • G: Graph (N nodes, E edges)
  • L: A context-free language
  • L-path from s to t iff the word spelled by the path's edge labels is in L
  • Running time: O(N^3)
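The bullets above can be made concrete with a worklist sketch of CFL-reachability for a grammar whose right-hand sides have at most two symbols. This is an illustrative sketch under that normalization assumption, not the algorithm as presented in the tutorial; `cfl_reach` and its parameter names are hypothetical. For a fixed grammar this style of algorithm stays within the cubic bound quoted on the slide.

```python
from collections import defaultdict

def cfl_reach(nodes, edges, eps, unary, binary):
    """Close a labeled edge set under a normalized grammar.

    edges:  iterable of (u, label, v)
    eps:    nonterminals A with A -> epsilon
    unary:  pairs (A, B) for A -> B
    binary: triples (A, B, C) for A -> B C
    """
    found, work = set(), []
    out_by = defaultdict(set)  # (u, label) -> targets of processed edges
    in_by = defaultdict(set)   # (v, label) -> sources of processed edges

    def add(u, label, v):
        if (u, label, v) not in found:
            found.add((u, label, v))
            work.append((u, label, v))

    for u, label, v in edges:
        add(u, label, v)
    for a in eps:
        for n in nodes:
            add(n, a, n)

    while work:
        u, b, v = work.pop()
        out_by[(u, b)].add(v)
        in_by[(v, b)].add(u)
        for a, rhs in unary:
            if rhs == b:
                add(u, a, v)
        for a, left, right in binary:
            if left == b:    # u -b-> v -right-> w
                for w in list(out_by[(v, right)]):
                    add(u, a, w)
            if right == b:   # w -left-> u -b-> v
                for w in list(in_by[(u, left)]):
                    add(w, a, v)
    return found

# Balanced parentheses: M -> eps | M M | LPM ")",  LPM -> "(" M
chain = [(0, "(", 1), (1, "(", 2), (2, ")", 3), (3, ")", 4)]
closed = cfl_reach(range(5), chain, ["M"], [],
                   [("M", "M", "M"), ("M", "LPM", ")"), ("LPM", "(", "M")])
m_pairs = sorted((u, v) for u, lab, v in closed if lab == "M" and u != v)
```

On this chain graph the only non-trivial M-paths run from node 1 to node 3 and from node 0 to node 4.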

41
Interprocedural Slicing via CFL-Reachability
  • Graph: System dependence graph
  • L: L(matched) [roughly]
  • Node m is in the slice w.r.t. n iff there is an
    L(matched)-path from m to n

42
(No Transcript)
43
CFL-Reachability via Dynamic Programming
Graph
Grammar
B
C
44
Degenerate Case CFL-Recognition
exp → id | exp + exp | exp * exp | ( exp )
(a + b) * c ∈ L(exp)?
45
Degenerate Case CFL-Recognition
exp → id | exp + exp | exp * exp | ( exp )
a + b) * c ∈ L(exp)?
46
CYK Context-Free Recognition
M → M M | ( M ) | [ M ] | ( ) | [ ]
? ( )
Is ? ? L(M)?
47
CYK Context-Free Recognition
M → M M | ( M ) | [ M ] | ( ) | [ ]
48
CYK
Is ( ) ∈ L(M)?
M → M M | LPM ) | LBM ] | ( ) | [ ]
LPM → ( M
LBM → [ M
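A CYK-style table fill for the normalized bracket grammar can be sketched as follows. This is a minimal sketch, assuming every rule has exactly two right-hand-side symbols (terminals allowed) and no ε-rule; `cyk_recognize` and `RULES` are hypothetical names, and the rule list is the reconstruction M → M M | LPM ) | LBM ] | ( ) | [ ], LPM → ( M, LBM → [ M.

```python
def cyk_recognize(tokens, rules, start):
    """rules: triples (A, X, Y) for A -> X Y; X and Y may be terminals."""
    n = len(tokens)
    # table[i][j] holds every symbol that derives tokens[i:j]
    table = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, tok in enumerate(tokens):
        table[i][i + 1].add(tok)
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for a, x, y in rules:
                    if x in table[i][k] and y in table[k][j]:
                        table[i][j].add(a)
    return start in table[0][n]

RULES = [
    ("M", "M", "M"), ("M", "LPM", ")"), ("M", "LBM", "]"),
    ("M", "(", ")"), ("M", "[", "]"),
    ("LPM", "(", "M"), ("LBM", "[", "M"),
]

balanced = cyk_recognize(list("([])[]"), RULES, "M")
crossed = cyk_recognize(list("([)]"), RULES, "M")
```

The three nested loops over `i`, `j`, and `k` are where the cubic cost comes from; a string is the degenerate case of a chain graph.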
49
CFL-Reachability via Dynamic Programming
Graph
Grammar
B
C
50
Dynamic Transitive Closure ?!
  • Aiken et al.
      • Set-constraint solvers
      • Points-to analysis
  • Henglein et al.
      • type inference
  • But a CFL captures a non-transitive reachability
    relation [Valiant 75]

51
Program Chopping
Given source S and target T, what program points
transmit effects from S to T?
Intersect forward slice from S with backward
slice from T, right?
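The intersection idea in the question above can be sketched directly. The code below is only an illustration on a single-procedure PDG, where intersecting plain reachability sets is adequate; the edge list is a hand-derived assumption for the sum program, and `reachable` and `naive_chop` are hypothetical names. It does not handle call and return edges, which is where the following slides show the naive approach losing precision.

```python
from collections import defaultdict

# Assumed PDG edges (control and flow dependences) of the sum program.
EDGES = [
    ("Enter", "sum = 0"), ("Enter", "i = 1"), ("Enter", "while (i < 11)"),
    ("Enter", "printf(sum)"), ("Enter", "printf(i)"),
    ("while (i < 11)", "while (i < 11)"),
    ("while (i < 11)", "sum = sum + i"), ("while (i < 11)", "i = i + 1"),
    ("sum = 0", "sum = sum + i"), ("sum = 0", "printf(sum)"),
    ("sum = sum + i", "sum = sum + i"), ("sum = sum + i", "printf(sum)"),
    ("i = 1", "while (i < 11)"), ("i = 1", "sum = sum + i"),
    ("i = 1", "i = i + 1"), ("i = 1", "printf(i)"),
    ("i = i + 1", "while (i < 11)"), ("i = i + 1", "sum = sum + i"),
    ("i = i + 1", "i = i + 1"), ("i = i + 1", "printf(i)"),
]

def reachable(edges, start, forward=True):
    """Forward slice (forward=True) or backward slice (forward=False)."""
    step = defaultdict(set)
    for src, dst in edges:
        if forward:
            step[src].add(dst)
        else:
            step[dst].add(src)
    seen, stack = {start}, [start]
    while stack:
        for nxt in step[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

def naive_chop(edges, source, target):
    """Intersect the forward slice from source with the backward slice from target."""
    return reachable(edges, source, True) & reachable(edges, target, False)

chop_sum_to_i = naive_chop(EDGES, "sum = 0", "printf(i)")
chop_i_to_sum = naive_chop(EDGES, "i = 1", "printf(sum)")
```

With these edges, nothing transmits an effect from `sum = 0` to `printf(i)`, while the chop from `i = 1` to `printf(sum)` passes through the loop.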
52
Non-Transitivity and Slicing
int main() {
  int sum = 0;
  int i = 1;
  while (i < 11) {
    sum = add(sum, i);
    i = add(i, 1);
  }
  printf("%d\n", sum);
  printf("%d\n", i);
}

int add(int x, int y) {
  return x + y;
}
53
Non-Transitivity and Slicing
[Program: same as on slide 52.]
Forward slice with respect to sum = 0
54
Non-Transitivity and Slicing
[Program: same as on slide 52.]
55
Non-Transitivity and Slicing
[Program: same as on slide 52.]
Backward slice with respect to printf("%d\n", i)
56
Non-Transitivity and Slicing
[Program: same as on slide 52.]
Forward slice with respect to sum = 0
∩
Backward slice with respect to printf("%d\n", i)
57
Non-Transitivity and Slicing
[Program: same as on slide 52.]
≠
Chop with respect to sum = 0 and
printf("%d\n", i)
58
Non-Transitivity and Slicing
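[Figure: SDG for the sum program. Node labels:
  main: Enter main, sum = 0, i = 1, while (i < 11), printf(sum), printf(i)
  first call site: Call add, x_in = sum, y_in = i, sum = x_out
  second call site: Call add, x_in = i, y_in = 1, i = x_out
  add: Enter add, x = x_in, y = y_in, x = x + y, x_out = x]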
59
Program Chopping
Given source S and target T, what program points
transmit effects from S to T?
Precise interprocedural chopping [Reps & Rosay, FSE 95]
60
CF-Recognition vs. CFL-Reachability
  • CF-Recognition
      • Chain graphs
      • General grammar: sub-cubic time [Valiant 75]
      • LL(1), LR(1): linear time
  • CFL-Reachability
      • General graphs: O(N^3)
      • LL(1): O(N^3)
      • LR(1): O(N^3)
      • Certain kinds of graphs: O(NE)
      • Regular languages: O(NE)

Gen/kill IDFA
GMOD IDFA
61
Regular-Language Reachability [Yannakakis 90]
  • G: Graph (N nodes, E edges)
  • L: A regular language
  • L-path from s to t iff
  • Running time: O(NE)
  • Ordinary reachability (= transitive closure)
      • Label each edge with e
      • L is e*
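One common way to answer a single-source regular-language reachability query is to search the product of the graph with a DFA for L. The sketch below assumes that construction (the slide does not spell out an algorithm); `regular_reach`, the DFA encoding as a `(state, label) -> state` dictionary, and the example graphs are all hypothetical.

```python
from collections import deque

def regular_reach(edges, dfa, start_state, accepting, source):
    """Nodes t such that some path source -> t spells a word accepted by the DFA."""
    succ = {}
    for u, label, v in edges:
        succ.setdefault(u, []).append((label, v))
    seen = {(source, start_state)}
    queue = deque(seen)
    while queue:
        node, state = queue.popleft()
        for label, nxt in succ.get(node, []):
            nstate = dfa.get((state, label))
            if nstate is not None and (nxt, nstate) not in seen:
                seen.add((nxt, nstate))
                queue.append((nxt, nstate))
    return {n for n, s in seen if s in accepting}

# L = (ab)*: state 0 is both start and accepting.
AB_STAR = {(0, "a"): 1, (1, "b"): 0}
labeled = [(1, "a", 2), (2, "b", 3), (3, "a", 4), (2, "a", 5)]
ab_targets = regular_reach(labeled, AB_STAR, 0, {0}, 1)

# Ordinary reachability: every edge labeled e, L = e*.
E_STAR = {(0, "e"): 0}
plain = [("x", "e", "y"), ("y", "e", "z"), ("w", "e", "x")]
plain_targets = regular_reach(plain, E_STAR, 0, {0}, "x")
```

The second query reproduces ordinary reachability from `x`, matching the special case in the bullets above.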

62
Themes
  • Harnessing CFL-reachability
  • Relationship to other analysis paradigms
  • Exhaustive alg. ⇔ Demand alg.
  • Understanding complexity
  • Linear . . . cubic . . . undecidable
  • Beyond CFL-reachability

63
Relationship to Other Analysis Paradigms
  • Dataflow analysis
  • reachability versus equation solving
  • Deduction
  • Set constraints

64
Dataflow Analysis
  • Goal: For each point in the program, determine a
    superset of the facts that could possibly hold
    during execution
  • Examples
  • Constant propagation
  • Reaching definitions
  • Live variables
  • Possibly uninitialized variables
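As an illustration of the last example, here is a small iterative solver for possibly uninitialized variables on a single procedure. It is a sketch under stated assumptions: the CFG, the statement encoding `(target, used_vars)`, and the names `transfer` and `possibly_uninit` are all hypothetical, and an assignment marks its target as possibly uninitialized whenever any variable it reads is.

```python
def transfer(stmt, facts):
    """stmt is None (no-op) or (target, used_vars)."""
    if stmt is None:
        return set(facts)
    target, used = stmt
    out = set(facts) - {target}
    if any(v in facts for v in used):
        out.add(target)  # assigned from a possibly uninitialized value
    return out

def possibly_uninit(succ, stmts, entry, variables):
    preds = {n: [] for n in succ}
    for n, targets in succ.items():
        for t in targets:
            preds[t].append(n)
    fact_in = {n: set() for n in succ}
    fact_out = {n: set() for n in succ}
    work = list(succ)
    while work:
        n = work.pop()
        # Every variable is possibly uninitialized on entry; joins are unions.
        incoming = set(variables) if n == entry else set()
        for p in preds[n]:
            incoming |= fact_out[p]
        fact_in[n] = incoming
        new_out = transfer(stmts.get(n), incoming)
        if new_out != fact_out[n]:
            fact_out[n] = new_out
            work.extend(succ[n])
    return fact_in, fact_out

# n1: x = 1;  n2: if ...;  n3: y = x;  n4: skip;  n5: z = y
SUCC = {"n0": ["n1"], "n1": ["n2"], "n2": ["n3", "n4"],
        "n3": ["n5"], "n4": ["n5"], "n5": []}
STMTS = {"n1": ("x", []), "n3": ("y", ["x"]), "n5": ("z", ["y"])}
fact_in, fact_out = possibly_uninit(SUCC, STMTS, "n0", {"x", "y", "z"})
```

Because `y` is assigned on only one branch, it is still possibly uninitialized at `n5`, and so is `z` after `z = y`.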

65
Useful For . . .
  • Optimizing compilers
  • Parallelizing compilers
  • Tools that detect possible logical errors
  • Tools that show the effects of a proposed
    modification

66
Possibly Uninitialized Variables

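[Figure: control-flow graph annotated with sets of possibly uninitialized variables: {w,x,y}, {w,y}, {w,y}, {w,y}, {w}, {w,y}.]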

67
Precise Intraprocedural Analysis
C
68
if . . .
69
Precise Interprocedural Analysis
[Figure: node labels start, C, ret, n.]
[Sharir & Pnueli 81]
70
Representing Dataflow Functions
[Figure: graph representations, over the facts a, b, c, of the Identity Function and a Constant Function.]
71
Representing Dataflow Functions
[Figure: graph representations, over the facts a, b, c, of a Gen/Kill Function and a Non-Gen/Kill Function.]
72
if . . .
73
Composing Dataflow Functions
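A distributive dataflow function over a finite fact set D can be encoded as a relation on D plus one extra node, and composing two functions then amounts to composing their relations. The sketch below follows that encoding as commonly described for this style of analysis; `ZERO`, `represent`, `interpret`, `compose`, and the two example functions are illustrative assumptions rather than the slides' own definitions.

```python
from itertools import combinations

ZERO = "0"  # extra node standing for the empty set of facts

def represent(f, domain):
    """Encode a distributive f: 2^D -> 2^D as a set of (source, target) edges."""
    base = f(frozenset())
    rel = {(ZERO, ZERO)} | {(ZERO, y) for y in base}
    for x in domain:
        rel |= {(x, y) for y in f(frozenset([x])) if y not in base}
    return rel

def interpret(rel, facts):
    """Apply the encoded function to a set of facts."""
    sources = set(facts) | {ZERO}
    return {y for x, y in rel if x in sources and y != ZERO}

def compose(first, second):
    """Relational composition: apply `first`, then `second`."""
    return {(x, z) for x, y in first for y2, z in second if y == y2}

D = ["a", "b", "c"]

def gen_kill(s):      # kill a, generate b
    return (s - {"a"}) | {"b"}

def copy_a_to_c(s):   # not gen/kill: c holds afterwards iff a held before
    return (s - {"c"}) | ({"c"} if "a" in s else set())

subsets = [frozenset(c) for r in range(len(D) + 1) for c in combinations(D, r)]
g, h = represent(gen_kill, D), represent(copy_a_to_c, D)
roundtrip_ok = all(interpret(g, s) == gen_kill(s) and
                   interpret(h, s) == copy_a_to_c(s) for s in subsets)
compose_ok = all(interpret(compose(g, h), s) == copy_a_to_c(gen_kill(s))
                 for s in subsets)
```

Both checks enumerate every subset of D, so they confirm the encoding and the composition for these two example functions only.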
74
x
y
a
b
if . . .
75
matched → matched matched
        | (_i matched )_i     for 1 ≤ i ≤ CallSites
        | edge
        | ε
76
unbalLeft → matched unbalLeft
          | (_i unbalLeft     for 1 ≤ i ≤ CallSites
          | ε
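Membership of a single path's label sequence in L(matched) or L(unbalLeft) can be checked with a stack of call-site indices. This is a minimal sketch for one given path, not a reachability algorithm; the label encoding `(kind, site)` with kinds "(", ")", and "e" and the function names are hypothetical.

```python
def _scan(labels):
    """Return the stack of still-open call sites, or None on a mismatch."""
    stack = []
    for kind, site in labels:
        if kind == "(":
            stack.append(site)
        elif kind == ")":
            if not stack or stack.pop() != site:
                return None
        # kind == "e": ordinary edge, no effect on the stack
    return stack

def is_matched(labels):
    """Every call edge is closed by the return edge of the same call site."""
    stack = _scan(labels)
    return stack is not None and not stack

def is_unbal_left(labels):
    """Like matched, but call edges may remain open at the end of the path."""
    return _scan(labels) is not None

same_site = [("(", 1), ("e", None), (")", 1)]
wrong_site = [("(", 1), ("e", None), (")", 2)]
still_open = [("(", 1), ("e", None), ("(", 2), (")", 2)]
```

A path that enters a procedure from call site 1 and returns to call site 2 is rejected by both checks, which is the context sensitivity the grammars encode.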
77
Interprocedural Dataflow Analysis via
CFL-Reachability
  • Graph: Exploded control-flow graph
  • L: L(unbalLeft)
  • Fact d holds at n iff there is an
    L(unbalLeft)-path from

78
Asymptotic Running Time [Reps, Horwitz, & Sagiv 95]
  • CFL-reachability
      • Exploded control-flow graph: ND nodes
      • Running time: O(N^3 D^3)
  • Exploded control-flow graph: Special
    structure

Running time: O(ED^3)
Typically E ≈ N, hence O(ED^3) ≈ O(ND^3)
Gen/kill problems: O(ED)
Why Bother? "We're only interested in
million-line programs"
  • Know thy enemy!
  • Any algorithm must do these operations
  • Avoid pitfalls (e.g., claiming O(N^2) algorithm)
  • The essence of "context sensitivity"
  • Special cases
  • Gen/kill problems: O(ED)
  • Compression techniques
  • Basic blocks
  • SSA form, sparse evaluation graphs
  • Demand algorithms

80
Relationship to Other Analysis Paradigms
  • Dataflow analysis
  • reachability versus equation solving
  • Deduction
  • Set constraints

81
The Need for Pointer Analysis
int main() {
  int sum = 0;
  int i = 1;
  int *p = &sum;
  int *q = &i;
  int (*f)(int, int) = add;
  while (*q < 11) {
    *p = (*f)(*p, *q);
    *q = (*f)(*q, 1);
  }
  printf("%d\n", *p);
  printf("%d\n", *q);
}

int add(int x, int y) {
  return x + y;
}
82
The Need for Pointer Analysis
[Program: same as on slide 81.]
83
The Need for Pointer Analysis
int main() {
  int sum = 0;
  int i = 1;
  int *p = &sum;
  int *q = &i;
  int (*f)(int, int) = add;
  while (i < 11) {
    sum = add(sum, i);
    i = add(i, 1);
  }
  printf("%d\n", sum);
  printf("%d\n", i);
}

int add(int x, int y) {
  return x + y;
}
84
Flow-Sensitive Points-To Analysis
p = &q
p = q
p = *q
*p = q
85
Flow-Sensitive → Flow-Insensitive
86
Flow-Insensitive Points-To Analysis [Andersen 94,
Shapiro & Horwitz 97]
p = &q
p = q
p = *q
*p = q
87
Flow-Insensitive Points-To Analysis
a = &e; b = a; c = &f; *b = c; d = *a;
[Figure: points-to graph over the nodes a, e, b, c, f, d.]
88
CFL-Reachability via Dynamic Programming
Graph
Grammar
B
C
89
CFL-Reachability Chain Programs
Graph
Grammar
B
C
a(X,Z) :- b(X,Y), c(Y,Z).
90
Base Facts for Points-To Analysis
p = &q
assignAddr(p,q).
p = q
assign(p,q).
p = *q
assignStar(p,q).
*p = q
starAssign(p,q).
91
Rules for Points-To Analysis (I)
pointsTo(P,Q) :- assignAddr(P,Q).
pointsTo(P,R) :- assign(P,Q), pointsTo(Q,R).
92
Rules for Points-To Analysis (II)
pointsTo(P,S) :- assignStar(P,Q), pointsTo(Q,R), pointsTo(R,S).
pointsTo(R,S) :- starAssign(P,Q), pointsTo(P,R), pointsTo(Q,S).
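The four rules can be evaluated by naive fixpoint iteration over sets of pairs. The sketch below is one straightforward way to do that and is not claimed to be how any particular solver works; `points_to` is a hypothetical name, and the example facts encode the statements a = &e; b = a; c = &f; *b = c; d = *a.

```python
def points_to(assign_addr, assign, assign_star, star_assign):
    """Each argument is a set of (p, q) base facts; returns all pointsTo pairs."""
    pts = set(assign_addr)                 # pointsTo(P,Q) :- assignAddr(P,Q).
    changed = True
    while changed:
        new = set()
        for p, q in assign:                # p = q
            new |= {(p, r) for q2, r in pts if q2 == q}
        for p, q in assign_star:           # p = *q
            for q2, r in pts:
                if q2 == q:
                    new |= {(p, s) for r2, s in pts if r2 == r}
        for p, q in star_assign:           # *p = q
            for p2, r in pts:
                if p2 == p:
                    new |= {(r, s) for q2, s in pts if q2 == q}
        changed = not new <= pts
        pts |= new
    return pts

result = points_to(
    assign_addr={("a", "e"), ("c", "f")},
    assign={("b", "a")},
    assign_star={("d", "a")},
    star_assign={("b", "c")},
)
```

Because the analysis is flow-insensitive, statement order plays no role: `d` is found to point to `f` even though `d = *a` could be placed before `*b = c`.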
93
Creating a Chain Program
pointsTo(R,S) :- starAssign(P,Q), pointsTo(P,R), pointsTo(Q,S).
pointsTo(R,S) :- pointsTo(P,R), starAssign(P,Q), pointsTo(Q,S).
94
Base Facts for Points-To Analysis
p = &q
assignAddr(p,q).
p = q
assign(p,q).
p = *q
assignStar(p,q).
*p = q
starAssign(p,q).
95
Creating a Chain Program
pointsTo(P,Q) :- assignAddr(P,Q).
pointsTo(P,R) :- assign(P,Q), pointsTo(Q,R).
pointsTo(P,S) :- assignStar(P,Q), pointsTo(Q,R), pointsTo(R,S).
96
. . . and now to CFL-Reachability
97
Themes
  • Harnessing CFL-reachability
  • Relationship to other analysis paradigms
  • Exhaustive alg. ⇔ Demand alg.
  • Understanding complexity
  • Linear . . . cubic . . . undecidable
  • Beyond CFL-reachability

98
Exhaustive Versus Demand Analysis
  • Exhaustive analysis: All facts at all points
  • Optimization: Concentrate on inner loops
  • Program-understanding tools: Only some facts are
    of interest

99
Exhaustive Versus Demand Analysis
  • Demand analysis
      • Does a given fact hold at a given point?
      • Which facts hold at a given point?
      • At which points does a given fact hold?
  • Demand analysis via CFL-reachability
      • single-source/single-target CFL-reachability
      • single-source/multi-target CFL-reachability
      • multi-source/single-target CFL-reachability

100
if . . .
101
Experimental Results [Horwitz, Reps, & Sagiv
1995]
  • 53 C programs (200-6,700 lines)
  • For a single fact of interest
  • demand always better than exhaustive
  • "All appropriate demands" beats exhaustive when
    percentage of "yes" answers is high
  • Live variables
  • Truly live variables
  • Constant predicates
  • . . .

102
Demand Analysis and LP Queries (I)
  • Flow-insensitive points-to analysis
      • Does variable p point to q?
          • Issue query ?- pointsTo(p, q).
          • Solve single-source/single-target
            L(pointsTo)-reachability problem
      • What does variable p point to?
          • Issue query ?- pointsTo(p, Q).
          • Solve single-source L(pointsTo)-reachability
            problem
      • What variables point to q?
          • Issue query ?- pointsTo(P, q).
          • Solve single-target L(pointsTo)-reachability
            problem

103
Demand Analysis and LP Queries (II)
  • Flow-sensitive analysis
      • Does a given fact f hold at a given point p?
          • ?- dfFact(p, f).
      • Which facts hold at a given point p?
          • ?- dfFact(p, F).
      • At which points does a given fact f hold?
          • ?- dfFact(P, f).
  • E.g., flow-sensitive points-to analysis
      • ?- dfFact(p, pointsTo(x, Y)).
      • ?- dfFact(P, pointsTo(x, y)).
      • etc.