Data Structures and Algorithms for Efficient Shape Analysis - PowerPoint PPT Presentation

About This Presentation
Title:

Data Structures and Algorithms for Efficient Shape Analysis

Description:

Data Structures and Algorithms for Efficient Shape Analysis, by Roman Manevich. Prepared under the supervision of Dr. Shmuel (Mooly) Sagiv.

Transcript and Presenter's Notes

Title: Data Structures and Algorithms for Efficient Shape Analysis


1
Data Structures and Algorithms for Efficient
Shape Analysis
  • by Roman Manevich
  • Prepared under the supervision of Dr. Shmuel
    (Mooly) Sagiv

2
Motivation
  • TVLA is a powerful and general abstract
    interpretation system
  • Abstract interpretation in TVLA
  • Operational semantics is expressed with
    first-order logic with transitive closure (TC)
    formulae
  • Program states are represented as sets of
    Evolving First-Order Structures
  • Efficiency is an issue

3
Outline
  • Shape Analysis quick intro
  • Compactly representing structures
  • Tuning abstraction to improve performance

4
What is Shape Analysis
  • Determines Shape Invariants for imperative
    programs
  • Can be used to verify a wide range of properties
    over different programming languages

5
reverse Example
/* list.h */
typedef struct node {
    struct node *n;
    int data;
} *List;

/* print.c */
#include "list.h"
List reverse(List x) {
    List y, t;
    y = NULL;
    while (x != NULL) {
        t = y;
        y = x;
        x = x->n;
        y->n = t;
    }
    return y;
}

6
reverse Example
[Figure: shape before, x points to a list whose cells are linked by n; shape after, y points to a list whose cells are linked by n.]

7
Definition of a First-Order Logical Structure
  • $S = \langle U, \iota \rangle$
  • $U$: a set of individuals (node set)
  • $\iota$: a mapping $p^{(r)} \to (U^r \to \{0, 1\})$, the
    interpretation of $p$

8
Three-Valued Logic
  • 1: True
  • 0: False
  • 1/2: Unknown
  • A join semi-lattice: $0 \sqcup 1 = 1/2$

[Figure: the information order, with 1/2 above both 0 and 1.]

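The three values and the join can be sketched in a few lines of Python. This is a minimal illustration, not TVLA code; the function names (`k_and`, `k_or`, `k_not`, `join`, `leq_info`) are made up for this example.

```python
from fractions import Fraction

# Kleene's three truth values: 0 (false), 1/2 (unknown), 1 (true).
FALSE, UNKNOWN, TRUE = Fraction(0), Fraction(1, 2), Fraction(1)

def k_and(a, b):
    """Three-valued conjunction: the minimum in the truth order 0 < 1/2 < 1."""
    return min(a, b)

def k_or(a, b):
    """Three-valued disjunction: the maximum in the truth order."""
    return max(a, b)

def k_not(a):
    """Three-valued negation: swaps 0 and 1, leaves 1/2 unchanged."""
    return 1 - a

def join(a, b):
    """Information-order join: equal values are kept, conflicting ones become 1/2."""
    return a if a == b else UNKNOWN

def leq_info(a, b):
    """a is below b in the information order: b equals a, or b is unknown."""
    return a == b or b == UNKNOWN
```

Note the two different orders: `k_and` and `k_or` use the truth order, while `join` uses the information order in which 1/2 is the top element.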
9
Canonical Abstraction
  • Partition the individuals into equivalence
    classes based on the values of their unary
    predicates
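  • Collapse other predicates via join ($\sqcup$)
  • $p^{S}(u'_1, \ldots, u'_k) = \bigsqcup \{\, p^{B}(u_1, \ldots, u_k) \mid
    f(u_1) = u'_1, \ldots, f(u_k) = u'_k \,\}$
  • At most $3^n$ abstract individuals (for $n$ unary
    abstraction predicates)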
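The two steps above (partition by unary predicate values, then collapse the remaining predicates with join) can be sketched as follows. The dictionary layout, the function name `canonical_abstraction`, and the choice to treat every unary predicate as an abstraction predicate are assumptions made for this illustration only.

```python
from itertools import product

HALF = 0.5  # the "unknown" value

def join(a, b):
    """Information-order join on Kleene values."""
    return a if a == b else HALF

def canonical_abstraction(nodes, unary, binary):
    """nodes: list of node ids.
    unary:  {predicate: {node: value}}, missing entries read as 0.
    binary: {predicate: {(node, node): value}}, missing entries read as 0.
    Abstract nodes are named by their tuple of unary predicate values."""
    preds = sorted(unary)
    # f maps each concrete node to its canonical name.
    f = {u: tuple(unary[p].get(u, 0) for p in preds) for u in nodes}
    abs_nodes = sorted(set(f.values()))
    abs_unary = {p: {a: a[i] for a in abs_nodes} for i, p in enumerate(preds)}
    # An abstract node that stands for several concrete nodes is a summary node.
    counts = {}
    for u in nodes:
        counts[f[u]] = counts.get(f[u], 0) + 1
    sm = {a: (HALF if counts[a] > 1 else 0) for a in abs_nodes}
    # Binary predicates: join the values of all concrete pairs mapped to the same pair.
    abs_binary = {}
    for p, table in binary.items():
        out = {}
        for u, v in product(nodes, repeat=2):
            key = (f[u], f[v])
            val = table.get((u, v), 0)
            out[key] = val if key not in out else join(out[key], val)
        abs_binary[p] = out
    return abs_nodes, abs_unary, abs_binary, sm
```

For a three-cell list u1, u2, u3 with x pointing to u1 and x as the only unary predicate, u2 and u3 collapse into one summary node and both the edge from the x node and the self edge on the summary node become 1/2.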

10
Canonical Abstraction Example
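[Figure: a list of nodes u1, u2, u3, each labeled r[n,x], linked by n edges and pointed to by x, shown together with its canonical abstraction.]
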
11
Compactly Representing First-Order Logical
Structures
  • Space is a major bottleneck
  • Analysis explores many logical structures
  • Reduce space by sharing information across
    structures

12
Desired Properties
  • Sparse data structures
  • Share common sub-structures
  • Inherited sharing
  • Incidental sharing due to program invariants
  • But feasible time performance
  • Phase sensitive data structures

13
Chapter Outline
  • Background
  • First-order structure representations
  • Base representation (TVLA 0.91)
  • BDD representation
  • Empirical evaluation
  • Conclusion

14
First-Order Logical Structures
  • Generalize shape graphs
  • Arbitrary set of individuals
  • Arbitrary set of predicates on individuals
  • Dynamically evolving
  • Usually small changes
  • Properties are extracted by evaluating first-order
    formulae: $\exists v_1, v : x(v_1) \land n(v_1, v)$
  • Join operator requires isomorphism testing

15
First-Order Structure ADT
  • Structure new() /* empty structure */
  • SetOfNodes nodeSet(Structure)
  • Node newNode(Structure)
  • removeNode(Structure, node)
  • Kleene eval(Structure, p(r), <u1, ..., ur>)
  • update(Structure, p(r), <u1, ..., ur>, Kleene)
  • Structure copy(Structure)
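One way to picture this ADT is as a small Python class backed by a sparse two-level dictionary. This is a sketch under assumptions, not TVLA's implementation: the class layout, the snake_case method names, and the convention that missing entries read as 0 are all choices made for this example.

```python
class Structure:
    """Sparse two-level map: predicate name -> {node tuple -> Kleene value}.
    Entries that are absent are read as 0."""

    def __init__(self):
        self._nodes = set()
        self._next_id = 0
        self._preds = {}

    def node_set(self):
        return frozenset(self._nodes)

    def new_node(self):
        u = self._next_id
        self._next_id += 1
        self._nodes.add(u)
        return u

    def remove_node(self, u):
        self._nodes.discard(u)
        # Drop every predicate entry that mentions the removed node.
        for table in self._preds.values():
            for key in [k for k in table if u in k]:
                del table[key]

    def eval(self, pred, args=()):
        return self._preds.get(pred, {}).get(tuple(args), 0)

    def update(self, pred, args, value):
        table = self._preds.setdefault(pred, {})
        if value == 0:
            table.pop(tuple(args), None)  # keep the map sparse
        else:
            table[tuple(args)] = value

    def copy(self):
        s = Structure()
        s._nodes = set(self._nodes)
        s._next_id = self._next_id
        s._preds = {p: dict(t) for p, t in self._preds.items()}
        return s


# Replaying the assignment x = y on a two-node structure.
s0 = Structure()
u1, u = s0.new_node(), s0.new_node()
s0.update("y", (u1,), 1)
s0.update("sm", (u,), 0.5)
s1 = s0.copy()
for v in s0.node_set():
    s1.update("x", (v,), s0.eval("y", (v,)))
```

The `copy` here is a deep copy, so every transformer application pays for the whole structure; that cost is what the sharing-based representations aim to avoid.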

16
print_all Example
/* list.h */
typedef struct node {
    struct node *n;
    int data;
} *L;

/* print.c */
#include "list.h"
void print_all(L y) {
    L x;
    x = y;
    while (x != NULL) {
        /* assert(x != NULL) */
        printf("elem=%d", x->data);
        x = x->n;
    }
}

17
print_all Example

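[Figure: structure S0 with node u1 (y=1) and summary node u (sm=½); after the updates below, u1 in S1 also has x=1.]

Statement x = y, with update formula x'(v) = y(v):
copy(S0) = S1
nodeSet(S0) = {u1, u}
eval(S0, y, u1) = 1
update(S1, x, u1, 1)
eval(S0, y, u) = 0
update(S1, x, u, 0)
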
18
print_all Example

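[Figure: structure S1 with node u1 (x=1, y=1) and summary node u (sm=½).]

while (x != NULL): precondition $\exists v : x(v)$
x = x->n: focus formula $\exists v_1 : x(v_1) \land n(v_1, v)$;
update formula $x'(v) = \exists v_1 : x(v_1) \land n(v_1, v)$

[Figure: the three resulting structures.
 S2.0: u1 (y=1), u (sm=½).
 S2.1: u1 (y=1), u (x=1), with n=1.
 S2.2: u1 (y=1), u.1 (x=1), u.0 (sm=½), with n=1.]
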
19
Overview and Main Results
  • Two novel representations of first-order
    structures
  • New BDD representation
  • New representation using functional maps
  • Implementation techniques
  • Empirical evaluation
  • Comparison of different representations
  • Space is reduced by a factor of 4-10
  • New representations scale better

20
Base Representation (Tal Lev-Ami SAS 2000)
  • Two-level map: Predicate $\to$ (Node Tuple $\to$
    Kleene)
  • Sparse Representation
  • Limited inherited sharing by Copy-On-Write
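A minimal sketch of how copy-on-write gives inherited sharing on top of a two-level map is shown below. The per-predicate granularity, the class name `CowStructure`, and the helper `shares_table_with` are assumptions for illustration; the slides do not say at which level the base representation shares.

```python
class CowStructure:
    """Two-level map with copy-on-write at the predicate level.
    copy() shares every predicate table; a table is cloned on first write."""

    def __init__(self, preds=None):
        self._preds = dict(preds or {})  # predicate -> table (possibly shared)
        self._owned = set()              # predicates whose table we may mutate

    def eval(self, pred, args):
        return self._preds.get(pred, {}).get(tuple(args), 0)

    def update(self, pred, args, value):
        if pred not in self._owned:
            # First write since the last copy: clone only this predicate's table.
            self._preds[pred] = dict(self._preds.get(pred, {}))
            self._owned.add(pred)
        table = self._preds[pred]
        if value == 0:
            table.pop(tuple(args), None)
        else:
            table[tuple(args)] = value

    def copy(self):
        # Cost is proportional to the number of predicates, not entries.
        # After copying, neither structure may mutate the shared tables.
        self._owned.clear()
        return CowStructure(self._preds)

    def shares_table_with(self, other, pred):
        """True if both structures point at the very same table object."""
        return self._preds.get(pred) is other._preds.get(pred)
```

The sharing is "limited" in the sense that it only exists between a structure and the copies derived from it; two structures that happen to become equal along different paths share nothing.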

21
BDDs in a Nutshell (Bryant 86)
  • Ordered Binary Decision Diagrams
  • Data structure for Boolean functions
  • Functions are represented as (unique) DAGs

x1 x2 x3 | f
 0  0  0 | 0
 0  0  1 | 0
 0  1  0 | 0
 0  1  1 | 1
 1  0  0 | 0
 1  0  1 | 1
 1  1  0 | 0
 1  1  1 | 1

[Figure: decision tree for f that tests x1, then x2, then x3, with 0/1 leaves.]

22
BDDs in a Nutshell (Bryant 86)
  • Ordered Binary Decision Diagrams
  • Data structure for Boolean functions
  • Functions are represented as (unique) DAGs
  • Also achieve sharing across functions

[Figure: decision diagrams over x1, x2, x3 illustrating the three reductions: Duplicate Terminals, Duplicate Nonterminals, Redundant Tests.]

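The three reductions named on this slide can be folded into a single node constructor backed by a unique table. The sketch below is a toy reduced ordered BDD, not a real BDD package: the class `BDD`, its `mk`, `from_function`, and `evaluate` methods are invented for this example, and `from_function` enumerates all assignments, so it is exponential and only meant for tiny inputs.

```python
class BDD:
    """Toy reduced ordered BDD manager with a fixed variable order x1 < x2 < ..."""

    def __init__(self, num_vars):
        self.num_vars = num_vars
        self.unique = {}                 # (var, low, high) -> node id
        self.nodes = {0: None, 1: None}  # ids 0 and 1 are the only terminals

    def mk(self, var, low, high):
        if low == high:                  # redundant test: skip the node
            return low
        key = (var, low, high)
        if key not in self.unique:       # duplicate nonterminals are never created
            node_id = len(self.nodes)
            self.nodes[node_id] = key
            self.unique[key] = node_id
        return self.unique[key]

    def from_function(self, f, var=0, env=()):
        """Build the BDD of a Python predicate by Shannon expansion."""
        if var == self.num_vars:
            return int(bool(f(*env)))
        low = self.from_function(f, var + 1, env + (0,))
        high = self.from_function(f, var + 1, env + (1,))
        return self.mk(var, low, high)

    def evaluate(self, node, assignment):
        while node > 1:
            var, low, high = self.nodes[node]
            node = high if assignment[var] else low
        return node


bdd = BDD(3)
# The function from the truth table: f = (x1 or x2) and x3.
root = bdd.from_function(lambda x1, x2, x3: (x1 or x2) and x3)
```

Because every node goes through `mk`, two equivalent functions built in the same manager end up with the same node id, which is also how sharing across functions arises.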
23
Encoding Structures Using Integers
  • Static encoding of
  • Predicates
  • Kleene values
  • Dynamic encoding of nodes
  • 0, 1, ..., n-1
  • Encode predicate p's values as
  • ep(p) . en(u1) . en(u2) . ... . en(un) . ek(Kleene)
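As an illustration of the concatenated encoding, the sketch below packs a predicate index, node indices, and a Kleene code into one integer. The bit widths and the 2-bit Kleene code are assumptions chosen for the example; the slides do not specify them.

```python
# Hypothetical field widths, chosen only for this example.
NODE_BITS = 4
KLEENE_BITS = 2
KLEENE_CODE = {0: 0, 0.5: 1, 1: 2}
KLEENE_VALUE = {code: value for value, code in KLEENE_CODE.items()}

def encode(pred_id, nodes, value):
    """Concatenate ep(p) . en(u1) ... en(uk) . ek(value), most significant first."""
    word = pred_id
    for u in nodes:
        assert 0 <= u < (1 << NODE_BITS), "node index does not fit"
        word = (word << NODE_BITS) | u
    return (word << KLEENE_BITS) | KLEENE_CODE[value]

def decode(word, arity):
    """Inverse of encode for a predicate of the given arity."""
    code = word & ((1 << KLEENE_BITS) - 1)
    word >>= KLEENE_BITS
    nodes = []
    for _ in range(arity):
        nodes.append(word & ((1 << NODE_BITS) - 1))
        word >>= NODE_BITS
    nodes.reverse()
    return word, nodes, KLEENE_VALUE[code]
```

A structure then becomes a set of such integers, which is the form a BDD can represent through its characteristic function.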

24
BDD Representation of Integer Sets
  • Characteristic function
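  • $S = \{1, 5\}$, with $1 = \langle 001 \rangle$ and $5 = \langle 101 \rangle$:
    $\chi_S = (\neg x_1 \land \neg x_2 \land x_3) \lor (x_1 \land \neg x_2 \land x_3)$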
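The characteristic function of the set {1, 5} can be checked against the formula with a short script. The helper names `bits`, `characteristic_function`, and `formula` are made up for this sketch, and x1 is taken to be the most significant bit, matching the encodings 001 and 101.

```python
def bits(n, width):
    """Binary digits of n, most significant first, as a tuple of 0/1."""
    return tuple((n >> (width - 1 - i)) & 1 for i in range(width))

def characteristic_function(int_set, width):
    """Return chi(x1, ..., x_width) that is 1 exactly on the encodings of int_set."""
    members = {bits(n, width) for n in int_set}
    return lambda *xs: int(tuple(xs) in members)

chi = characteristic_function({1, 5}, 3)

def formula(x1, x2, x3):
    """The formula from the slide, written with Python's Boolean operators."""
    return int((not x1 and not x2 and x3) or (x1 and not x2 and x3))
```

Both functions agree on all eight assignments, and the formula simplifies to "not x2 and x3", which is the kind of compaction a reduced BDD performs automatically.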

26
BDD Representation Example

[Figure: structure S0 (node u1 with y=1, summary node u with sm=½) and the BDD that represents it.]

27
BDD Representation Example

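[Figure: structures S0 (u1 with y=1; u with sm=½) and S1 (u1 with x=1, y=1; u with sm=½), where S1 is obtained from S0 by x = y, together with their BDDs.]
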
28
BDD Representation Example
[Figure: structures S0, S1 (obtained by x = y), and S2.2 (obtained by x = x->n: u1 with y=1; u.1 with x=1; u.0 with sm=½; n=1), together with their BDDs.]

30
Improved BDD Representation
  • Using this representation directly doesn't save
    space: canonicity doesn't carry over from
    propositional to first-order logic
  • Observation
  • Node names can be arbitrarily remapped without
    affecting the ADT semantics
  • Our heuristics
  • Use canonic node names to encode nodes and obtain
    a canonic representation
  • Increases incidental sharing
  • Reduces isomorphism test to pointer comparison
  • Space reduced by a factor of 4-10

31
Reducing Time Overhead
  • Current implementation not optimized
  • Expensive formula evaluation
  • Hybrid representation
  • Distinguish between phases: mutable phase $\to$ Join
    $\to$ immutable phase
  • Dynamically switch representations

32
Functional Representation
  • Alternative representation for first-order
    structures
  • Structures represented by maps from integers to
    Kleene values
  • Tailored for representing first-order structures
  • Achieves better results than BDDs
  • Techniques similar to the BDD representation
  • More details in the thesis

33
Introduction to Functional Maps
  • A mapping $\mathbb{N} \to \{0, \tfrac{1}{2}, 1\}$

index: 0 1 2
value: ½ 0 1
34
Introduction to Functional Maps
  • Sparse maps

[Figure: a map node of size 27 over sub-maps of size 9; the leaf entries shown are:]

index: 0 1 2 | 3 4 5 | 6 7 8
value: ½ 0 1 | 0 0 0 | ½ 0 1
35
Introduction to Functional Maps
  • Share unique sub-maps

[Figure: a map node of size 27 over sub-maps of size 9; the leaf entries shown are:]

index: 0 1 2 | 6 7 8
value: ½ 0 1 | ½ 0 1
36
Introduction to Functional Maps
  • Share unique sub-maps

[Figure: a map node of size 27 over sub-maps of size 9; a single leaf block remains:]

index: 0 1 2
value: ½ 0 1
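The two ideas on these slides, sparse maps and sharing of unique sub-maps, can be sketched as a persistent ternary tree whose nodes are hash-consed. This is an illustrative toy, not the thesis implementation: the names `intern_node`, `update`, and `lookup`, the use of `None` for an all-zero sub-map, and the size-9 example are assumptions.

```python
_table = {}   # hash-consing table for sub-maps
EMPTY = None  # an all-zero sub-map is not stored at all

def intern_node(node):
    """Return the unique shared instance of an inner node (a 3-tuple)."""
    return _table.setdefault(node, node)

def update(node, size, index, value):
    """Return a map equal to `node` except that index -> value.
    `size` is a power of 3; untouched sub-maps are shared, not copied."""
    if size == 1:
        return value if value != 0 else EMPTY
    children = node if node is not EMPTY else (EMPTY, EMPTY, EMPTY)
    sub = size // 3
    i = index // sub
    new_child = update(children[i], sub, index % sub, value)
    new_children = children[:i] + (new_child,) + children[i + 1:]
    if all(c is EMPTY for c in new_children):
        return EMPTY
    return intern_node(new_children)

def lookup(node, size, index):
    while node is not EMPTY and size > 1:
        size //= 3
        node = node[index // size]
        index %= size
    return 0 if node is EMPTY else node


# The nine values from the slide: indices 0-2, 3-5, and 6-8.
values = [0.5, 0, 1, 0, 0, 0, 0.5, 0, 1]
m = EMPTY
for i, v in enumerate(values):
    m = update(m, 9, i, v)
```

In the resulting map the all-zero block for indices 3-5 is absent, and the blocks for indices 0-2 and 6-8 are one and the same object. Hashing nested tuples is not constant time, so a serious implementation would store precomputed hashes or node ids instead.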
37
Functional Representation Example

[Figure: structure S0 (u1 with y=1; summary node u with sm=½) represented by three maps (binary, unary, nullary) of size 27 over sub-maps of size 9. Unary leaf entries (y, x, sm): (1, 0, 0) and (0, 0, ½). Binary entry: n = ½.]

38
Functional Representation Example


[Figure: structures S0 (u1 with y=1; u with sm=½) and S1 (u1 with x=1, y=1; u with sm=½), each represented by binary, unary, and nullary maps of size 27 over sub-maps of size 9. Unary leaf entries (y, x, sm): (1, 0, 0), (0, 0, ½), and (1, 1, 0). Binary entry: n = ½.]

39
Functional Representation Example




[Figure: structures S0, S1, and S2.2 (u1 with y=1; u.1 with x=1; u.0 with sm=½; n=1), each represented by binary, unary, and nullary maps; the maps now include a node of size 81 over sub-maps of size 27 and 9. Unary leaf entries (y, x, sm): (1, 0, 0), (0, 0, ½), (0, 1, 0), and (1, 1, 0). Binary entries: n = ½ and n = 1.]

40
Reducing Time Overhead
  • Lazy normalization is used to balance
    time/space performance

41
Empirical Evaluation
  • Benchmarks
  • Cleanness Analysis (SAS 2000)
  • Garbage Collector
  • CMP (PLDI 2002) of Java Front-End and Kernel
    Benchmarks
  • Mobile Ambients (ESOP 2000)
  • Stress testing the representations
  • We use relational analysis
  • Save structures in every CFG location

42
Space Results
43
Space Results
44
Abstract Counters
  • Ignore language/implementation details
  • A more reliable measurement technique
  • Count only crucial space information
  • Independent of C/Java

45
Abstract Counters Results
46
Trends in the Cleanness Analysis Benchmark
47
Conclusions
  • Two novel representations of first-order
    structures
  • New BDD representation
  • New representation using functional maps
  • Implementation techniques
  • Substantially better than inherited sharing
  • Structure canonization is crucial
  • Normalization via hash-consing is the key
    technique

48
Conclusions
  • The use of BDDs for static analysis is not a
    panacea for space saving
  • Domain-specific encoding crucial for saving space
  • Failed attempts
  • Original implementation of Veith's encoding
  • PAG

49
Tuning Abstraction for Improved Performance
  • Analysis can be very costly
  • Explores many structures: the GC example explores
    more than 180,000 structures

50
Existing Analysis Modes
  • Relational analysis
  • Doubly-exponential in worst case
  • Our most precise method
  • Single-structure analysis (Tal Lev-Ami SAS 2000)
  • Singly-exponential in worst case
  • Can be very efficient
  • Can be very imprecise
  • Sometimes very inefficient

51
Single-Structure Analysis
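[Figure: structure S0 (nodes u1 and u, an n edge, x), structure S1 (node u1, x), and their join $S0 \sqcup S1$ (nodes u1 and u, an n edge, x), with parts of the join marked "May exist".]
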
52
Single-Structure Analysis
  • Active property
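  • ac = 0: doesn't exist in any concrete structure
  • ac = 1: exists in every concrete structure
  • ac = 1/2: may exist in some concrete structure

[Figure: S0 with nodes u1 (ac=1) and u (ac=1), an n edge, and x; S1 with node u1 (ac=1) and x; their join $S0 \sqcup S1$ with u1 (ac=1), u (ac=1/2), an n edge, and x.]
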
53
Single-Structure Analysis
  • Sometimes overly imprecise
  • Refine analysis by using nullary predicates to
    distinguish between different structures

54
Is there a sweet spot?
[Figure: plot with axes Efficiency and Precision, showing the position of Relational Analysis.]

55
Chapter Outline
  • Removing embedded structures
  • Merging structures with same set of canonical
    names
  • Staged analysis to localize abstraction
  • Merging pseudo-embedded structures

56
Order Relations on Structures and Sets of
Structures
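  • $S, S' \in \text{3-STRUCT}$: $S \sqsubseteq^f S'$ if for every predicate $p$
  • $p^S(u_1, \ldots, u_k) \sqsubseteq p^{S'}(f(u_1), \ldots, f(u_k))$
  • $(|\{u \mid f(u) = u'\}| > 1) \sqsubseteq sm^{S'}(u')$
  • $X, X' \in 2^{\text{3-STRUCT}}$: $X \sqsubseteq X'$ if
  • every $S \in X$ has $S' \in X'$ such that $S \sqsubseteq S'$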
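For a given mapping f, the embedding conditions above can be checked directly. The sketch below assumes a simple dictionary layout for structures and a caller-supplied arity table; `is_embedded` and `leq` are names invented for this example, and finding f (the hard part) is not attempted.

```python
from itertools import product

def leq(a, b):
    """Information order on Kleene values: b equals a, or b is 1/2."""
    return a == b or b == 0.5

def is_embedded(s, t, f, arities):
    """Check that f embeds structure s into structure t.
    s, t: {"nodes": [...], "preds": {name: {node tuple: value}}}, missing = 0.
    f: dict from nodes of s onto nodes of t.
    arities: {predicate name: arity} for the predicates to compare."""
    if set(f.values()) != set(t["nodes"]):
        return False  # f must be surjective
    for p, k in arities.items():
        for args in product(s["nodes"], repeat=k):
            a = s["preds"].get(p, {}).get(args, 0)
            b = t["preds"].get(p, {}).get(tuple(f[u] for u in args), 0)
            if not leq(a, b):
                return False
    # A node of t that is the image of several nodes must be a summary node.
    for v in t["nodes"]:
        merged = sum(1 for u in s["nodes"] if f[u] == v) > 1
        if not leq(int(merged), t["preds"].get("sm", {}).get((v,), 0)):
            return False
    return True


# A three-cell list and its abstraction (x node "a", summary node "b").
concrete = {
    "nodes": ["u1", "u2", "u3"],
    "preds": {"x": {("u1",): 1}, "n": {("u1", "u2"): 1, ("u2", "u3"): 1}},
}
abstract = {
    "nodes": ["a", "b"],
    "preds": {
        "x": {("a",): 1},
        "sm": {("b",): 0.5},
        "n": {("a", "b"): 0.5, ("b", "b"): 0.5},
    },
}
f = {"u1": "a", "u2": "b", "u3": "b"}
arities = {"x": 1, "n": 2}
```

With these inputs `is_embedded(concrete, abstract, f, arities)` holds; setting the abstract edge from a to b to 0 breaks it, because the definite concrete edge from u1 to u2 is no longer covered.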

57
Compacting Transformations
  • We look for a transformation $T : 2^{\text{3-STRUCT}} \to
    2^{\text{3-STRUCT}}$ with the following properties
  • Compacting: $|T(X)| \le |X|$
  • Conservative: $T(X) \sqsupseteq X$
  • Without sacrificing precision

58
Removing Embedded Structures
[Figure: structures S0 and S1 over variables x, y, and t, with n edges; node u1 is labeled r[n,t] and r[n,y].]

59
Removing Embedded Structures
[Figure: structures S0 and S1 over variables x, y, and t, with n edges; node u1 is labeled r[n,t] and r[n,y]. Captions: "Reversing a list with exactly 3 cells" and "Reversing a list with at least 3 cells".]

60
Detecting Embedding is hard
  • In general, as hard as GRAPH ISOMORPHISM
  • Conditions for a unique mapping
  • Canonical abstraction
  • Definite values
  • Polynomial time check

61
Results (structures explored)
62
Results (structures explored)
63
Canonical Names Method
  • Canonical abstraction merges individuals with
    same canonical names (unary abstraction
    predicate values)
  • Merge structures with same set of canonical names
  • Both transformations preserve definiteness of
    abstraction predicates
  • But ignores precision of non-abstraction
    predicates
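A rough sketch of the merge step: group structures by their set of canonical names and join the predicate values within each group. The representation (a pair of a name set and predicate tables keyed by canonical names) and the function `merge_by_canonical_names` are assumptions for this example, and the pointwise join is one plausible reading of "merge", not a statement of how TVLA implements it.

```python
def join(a, b):
    """Information-order join on Kleene values."""
    return a if a == b else 0.5

def merge_by_canonical_names(structures):
    """structures: list of (nodes, preds), where nodes is a set of canonical
    names and preds is {predicate: {tuple of names: value}}, missing = 0.
    Structures with the same node set are merged by joining their values."""
    groups = {}
    for nodes, preds in structures:
        key = frozenset(nodes)
        if key not in groups:
            groups[key] = {p: dict(t) for p, t in preds.items()}
            continue
        merged = groups[key]
        for p in set(merged) | set(preds):
            a, b = merged.get(p, {}), preds.get(p, {})
            merged[p] = {
                k: join(a.get(k, 0), b.get(k, 0)) for k in set(a) | set(b)
            }
    return list(groups.items())


# Canonical names are written as strings here: "x" for the node pointed to
# by x, "rest" for the other nodes.
s0 = ({"x", "rest"}, {"n": {("x", "rest"): 1, ("rest", "rest"): 0.5}})
s1 = ({"x", "rest"}, {"n": {("x", "rest"): 1}})
s2 = ({"x"}, {"n": {}})
result = dict(merge_by_canonical_names([s0, s1, s2]))
```

Here s0 and s1 collapse into one structure, which keeps the definite edge from "x" to "rest", while s2 has a different name set and stays separate. Values of non-abstraction predicates may drop to 1/2 in the merge, which is the precision loss the last bullet refers to.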

64
Canonical Abstraction Example
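[Figure: a list of nodes u1, u2, u3, each labeled r[n,x], linked by n edges and pointed to by x, shown together with its canonical abstraction.]
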
65
Merging Structures with Same Canonical Names
Example
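[Figure: structures S0 and S1 with the same set of canonical names, and their merge $S0 \sqcup S1$; node u is labeled r[n,x], with n edges and x.]
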
66
Merging Structures with Same Canonical Names
Example
[Figure: structures S0, S1, and their merge $S0 \sqcup S1$, each with nodes u0 and u and variable x, and with n edges.]

67
Results (structures explored)
68
Localizing Abstraction
  • Find an appropriate subset of abstraction
    predicates for every CFG node
  • Observation: programs contain dead variables;
    exploit this to make the corresponding predicates dead
  • Compute predicate liveness to determine subset
    of abstraction predicates

69
reverse Example
List reverse(List x) {
L0:   List y, t;
L1:   y = NULL;
L2:   while (x != NULL) {
L3:       t = y;
L4:       y = x;
L5:       x = x->n;
L6:       y->n = t;
      }
L7:   return y;
}

Annotations in the figure: y dead, t dead, all dead.

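The dead-variable annotations can be reproduced with a standard backward live-variable analysis over the labeled statements of reverse. The CFG encoding below (definitions, uses, successors per label) and the function name `live_variables` are assumptions for this sketch; deriving predicate liveness from variable liveness is left out.

```python
def live_variables(cfg):
    """cfg: {label: (defs, uses, successors)}.
    Returns the set of variables live on entry to each label,
    computed as the least fixpoint of the usual backward equations."""
    live_in = {label: set() for label in cfg}
    changed = True
    while changed:
        changed = False
        for label, (defs, uses, succs) in cfg.items():
            live_out = set().union(*(live_in[s] for s in succs))
            new_in = set(uses) | (live_out - set(defs))
            if new_in != live_in[label]:
                live_in[label] = new_in
                changed = True
    return live_in


# Statements L1..L7 of reverse; L0 is only a declaration and is omitted.
reverse_cfg = {
    "L1": ({"y"}, set(), ["L2"]),         # y = NULL
    "L2": (set(), {"x"}, ["L3", "L7"]),   # while (x != NULL)
    "L3": ({"t"}, {"y"}, ["L4"]),         # t = y
    "L4": ({"y"}, {"x"}, ["L5"]),         # y = x
    "L5": ({"x"}, {"x"}, ["L6"]),         # x = x->n
    "L6": (set(), {"y", "t"}, ["L2"]),    # y->n = t
    "L7": (set(), {"y"}, []),             # return y
}
live = live_variables(reverse_cfg)
```

In the result, y and t are both absent from the live set at L1, t is absent at L2 and L3, and y is absent at L4, so the predicates for those variables need not be used as abstraction predicates at those points.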
70
Results (structures explored)
71
Compaction via Pseudo-Embedding
  • Pseudo-embedding: similar to embedding, with
    respect to abstraction predicates
  • $S, S' \in \text{3-STRUCT}$: $S \sqsubseteq^f S'$ if for every abstraction
    predicate $p$
  • $p^S(u) \sqsubseteq p^{S'}(f(u))$
  • $(|\{u \mid f(u) = u'\}| > 1) \sqsubseteq sm^{S'}(u')$

72
Modified blur
  • Order relation on nodes: $u_1 \sqsubseteq u_2$ if for every
    abstraction predicate $p$, $p^S(u_1) \sqsubseteq p^S(u_2)$
  • blur merges $u_1$ with $u_2$ if $u_1 \sqsubseteq u_2$

73
blur Example
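[Figure: a structure with nodes u0 and u, both labeled r[n,x], an n edge, and x; applying blur yields a structure with the single node u labeled r[n,x], an n edge, and x.]
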
74
Merging Pseudo-Embedded Structures Example
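Abstraction predicates: x, y. Non-abstraction predicates: r[n,x], r[n,y], n.

[Figure: structures S0 and S1 over variables x and y, with nodes u (labeled r[n,y] and r[n,x]) and u0 (labeled r[n,x]) and n edges, and their merge $S0 \sqcup S1$, in which node u has r[n,y] = 1/2 and r[n,x].]
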
75
Results (structures explored)
76
Empirical Evaluation
  • Benchmarks
  • Garbage Collector
  • Mobile Ambients (ESOP 2000)
  • Sorting procedures (ISSTA 2000)
  • MA J2 completed without instrumentation
    predicates and without messages

77
Results (structures explored)
[Figure: results chart; annotations mark runs that ended Out of memory, Out of time, or with False alarms.]

78
Conclusion
  • New method is usually much more efficient (by
    orders of magnitude)
  • Doesn't lose precision on benchmarks
  • Performance more stable than other methods

79
Future and Ongoing Work
  • Time optimizations
  • Symbolic (BDD) execution of TVLA operations
  • Compactly represent sets of structures
  • Improving abstraction locality
  • Truly live predicates
  • Analyzing liveness for core predicates and
    deriving for instrumentation predicates
  • Experiment with other compacting transformations
  • Achieve polynomial complexity

80
The End