Shape Analysis via 3-Valued Logic

Description:

Title: Program Analysis via Graph Reachability Author: Thomas Reps Last modified by: sagiv Created Date: 3/24/1998 3:26:02 AM Document presentation format – PowerPoint PPT presentation

1
Shape Analysis via 3-Valued Logic
  • Mooly Sagiv
  • Tel Aviv University

http://www.cs.tau.ac.il/~msagiv/toplas02.ps
www.cs.tau.ac.il/~tvla
2
Topics
  • A new abstract domain for static analysis
  • Abstract dynamically allocated memory
  • TVLA: a system for generating abstract
    interpreters
  • Applications

3
Motivation
  • Dynamically allocated storage and pointers are
    essential programming tools
  • Object oriented
  • Modularity
  • Data structure
  • But
  • Error prone
  • Inefficient
  • Static analysis can be very useful here

4
A Pathological C Program
a = malloc();
b = a;
free(a);
c = malloc();
if (b == c)
    printf("unexpected equality");
6
Dereference of NULL pointers
typedef struct element {
    int value;
    struct element *next;
} Elements;

bool search(int value, Elements *c) {
    Elements *elem;
    for (elem = c; c != NULL; elem = elem->next)  /* tests c, not elem */
        if (elem->value == value)
            return TRUE;
    return FALSE;
}
potential null dereference
8
Memory leakage
typedef struct element {
    int value;
    struct element *next;
} Elements;

Elements *reverse(Elements *c) {
    Elements *h, *g;
    h = NULL;
    while (c != NULL) {
        g = c->next;
        h = c;
        c->next = h;
        c = g;
    }
    return h;
}

leakage of the address pointed to by h
9
Memory leakage
typedef struct element {
    int value;
    struct element *next;
} Elements;

Elements *reverse(Elements *c) {
    Elements *h, *g;
    h = NULL;
    while (c != NULL) {
        g = c->next;
        c->next = h;
        h = c;
        c = g;
    }
    return h;
}

✓ No memory leaks
10
Example: List Creation
typedef struct node {
    int val;
    struct node *next;
} *List;

List create() {
    List x, t;
    x = NULL;
    while (...) {
        t = malloc();
        t->next = x;
        x = t;
    }
    return x;
}
✓ No null dereferences
✓ No memory leaks
✓ Returns acyclic list
11
Example: Collecting Interpretation
12
Example: Abstract Interpretation
13
Challenge 1 - Memory Allocation
  • The number of allocated objects/threads is not
    known
  • Concrete state space is infinite
  • How to guarantee termination?

14
Challenge 2 - Destructive Updates
  • The program manipulates states using destructive
    updates
  • e->next = t
  • Hard to define concrete interpretation
  • Harder to define abstract interpretation

15
Challenge 2 - Destructive Update
Unsound ?
16
Challenge 2 - Destructive Update
Imprecise ?
17
Challenge 3 - Re-establishing Data Structure Invariants
  • Data-structure invariants typically only hold at
    the beginning and end of ADT operations
  • Need to verify that data-structure invariants are
    re-established

18
Challenge 3 - Re-establishing Data Structure Invariants
rotate(List first, List last) {
    if (first != NULL) {
        last->next = first;
        first = first->next;
        last = last->next;
        last->next = NULL;
    }
}

19
Plan
  • Concrete interpretation
  • Canonical abstraction
  • Abstract interpretation using canonical
    abstraction
  • The TVLA system

20
Traditional Heap Interpretation
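  • States = two-level stores
  • $Env : Var \to Values$
  • $fields : Loc \to Values$
  • $Values = Loc \cup Atoms$
  • Example
  • $Env = [x \mapsto 30,\ p \mapsto 79]$
  • $next = [30 \mapsto 40,\ 40 \mapsto 50,\ 50 \mapsto 79,\ 79 \mapsto 90]$
  • $val = [30 \mapsto 1,\ 40 \mapsto 2,\ 50 \mapsto 3,\ 79 \mapsto 4,\ 90 \mapsto 5]$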

21
Predicate Logic
  • Vocabulary
  • A finite set of predicate symbols $P$, each with a
    fixed arity
  • Logical structures $S$ provide meaning for
    predicates
  • A set of individuals (nodes) $U$
  • $p^S : (U^S)^k \to \{0, 1\}$
  • FOTC formulas over TC, $\forall, \exists, \neg, \wedge, \vee$ express logical structure
    properties

22
Representing Stores as Logical Structures
  • Locations ≈ Individuals
  • Program variables ≈ Unary predicates
  • Fields ≈ Binary predicates
  • Example
  • $U = \{u_1, u_2, u_3, u_4, u_5\}$
  • $x = \{u_1\}$, $p = \{u_3\}$
  • $n = \{\langle u_1, u_2 \rangle, \langle u_2, u_3 \rangle, \langle u_3, u_4 \rangle, \langle u_4, u_5 \rangle\}$
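
The example structure above can be written down directly as predicate tables. The following is a minimal sketch in C, not TVLA's actual representation; the names `Structure`, `example_store`, and `deref_next`, and the fixed bound `MAX_NODES`, are assumptions made only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_NODES 8  /* illustrative bound; real stores are unbounded */

/* A 2-valued logical structure: individuals are indices 0..size-1. */
typedef struct {
    int  size;                      /* |U| */
    bool x[MAX_NODES];              /* unary predicate for variable x */
    bool p[MAX_NODES];              /* unary predicate for variable p */
    bool n[MAX_NODES][MAX_NODES];   /* binary predicate for field n */
} Structure;

/* Build the slide's example: U = {u1..u5}, x = {u1}, p = {u3},
   n = {<u1,u2>, <u2,u3>, <u3,u4>, <u4,u5>} (u_i is index i-1). */
Structure example_store(void) {
    Structure s;
    memset(&s, 0, sizeof s);
    s.size = 5;
    s.x[0] = true;
    s.p[2] = true;
    for (int i = 0; i + 1 < s.size; i++)
        s.n[i][i + 1] = true;
    return s;
}

/* Index of the n-successor of the node a variable points to,
   or -1 if the variable is NULL or the node has no successor. */
int deref_next(const Structure *s, const bool var[]) {
    assert(s->size <= MAX_NODES);
    for (int v = 0; v < s->size; v++)
        if (var[v])
            for (int w = 0; w < s->size; w++)
                if (s->n[v][w])
                    return w;
    return -1;
}
```

Under this encoding, `deref_next(&s, s.x)` on the example returns the index of u2, which mirrors reading `x->n` in the store.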

23
Formal Semantics of First Order Formulae
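  • For a structure $S = \langle U^S, p^S \rangle$
  • Formulae $\varphi$ with LVar free variables
  • Assignment $z : LVar \to U^S$
  • $[\![\varphi]\!]^S(z) \in \{0, 1\}$

$[\![\mathbf{1}]\!]^S(z) = 1$
$[\![\mathbf{0}]\!]^S(z) = 0$
$[\![p(v_1, v_2, \ldots, v_k)]\!]^S(z) = p^S(z(v_1), z(v_2), \ldots, z(v_k))$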
24
Formal Semantics of First Order Formulae
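  • For a structure $S = \langle U^S, p^S \rangle$
  • Formulae $\varphi$ with LVar free variables
  • Assignment $z : LVar \to U^S$
  • $[\![\varphi]\!]^S(z) \in \{0, 1\}$

$[\![\varphi_1 \vee \varphi_2]\!]^S(z) = \max([\![\varphi_1]\!]^S(z), [\![\varphi_2]\!]^S(z))$
$[\![\varphi_1 \wedge \varphi_2]\!]^S(z) = \min([\![\varphi_1]\!]^S(z), [\![\varphi_2]\!]^S(z))$
$[\![\neg \varphi_1]\!]^S(z) = 1 - [\![\varphi_1]\!]^S(z)$
$[\![\exists v : \varphi_1]\!]^S(z) = \max \{ [\![\varphi_1]\!]^S(z[v \mapsto u]) \mid u \in U^S \}$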
25
Formal Semantics of Transitive Closure
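  • For a structure $S = \langle U^S, p^S \rangle$
  • Formulae $\varphi$ with LVar free variables
  • Assignment $z : LVar \to U^S$
  • $[\![\varphi]\!]^S(z) \in \{0, 1\}$

$[\![p^*(v_1, v_2)]\!]^S(z) = \max_{u_1, \ldots, u_k \in U,\ z(v_1) = u_1,\ z(v_2) = u_k} \ \min_{1 \le i < k} p^S(u_i, u_{i+1})$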
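
On a finite 2-valued structure, the max-over-paths, min-over-edges definition amounts to graph reachability. A minimal sketch in C follows; it uses Warshall's algorithm instead of enumerating paths, and the name `transitive_closure` and the bound `MAX_NODES` are assumptions made for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_NODES 8  /* illustrative bound on |U| */

/* Compute tc = n*, the reflexive transitive closure of the binary
   predicate n over individuals 0..size-1 (Warshall's algorithm).
   The one-node path (k = 1) makes every node reach itself. */
void transitive_closure(int size, bool n[MAX_NODES][MAX_NODES],
                        bool tc[MAX_NODES][MAX_NODES]) {
    assert(size >= 0 && size <= MAX_NODES);
    for (int i = 0; i < size; i++)
        for (int j = 0; j < size; j++)
            tc[i][j] = (i == j) || n[i][j];
    for (int k = 0; k < size; k++)
        for (int i = 0; i < size; i++)
            for (int j = 0; j < size; j++)
                if (tc[i][k] && tc[k][j])
                    tc[i][j] = true;
}
```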
26
Concrete Interpretation Rules
Statement       Update formula
x = NULL        $x'(v) = 0$
x = malloc()    $x'(v) = IsNew(v)$
x = y           $x'(v) = y(v)$
x = y->next     $x'(v) = \exists w : y(w) \wedge n(w, v)$
x->next = y     $n'(v, w) = (\neg x(v) \wedge n(v, w)) \vee (x(v) \wedge y(w))$
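
To make the last two rules concrete, here is a minimal sketch in C that applies them to a 2-valued structure stored as boolean tables. The type `Store` and the functions `assign_next` and `store_next` are hypothetical names for this illustration; TVLA itself evaluates update formulas generically instead of hard-coding them.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_NODES 8  /* illustrative bound on |U| */

typedef struct {
    int  size;
    bool x[MAX_NODES], y[MAX_NODES];   /* program variables x and y */
    bool n[MAX_NODES][MAX_NODES];      /* the next field */
} Store;

/* x = y->next:  x'(v) = exists w: y(w) && n(w, v) */
void assign_next(Store *s) {
    bool nx[MAX_NODES] = { false };
    assert(s->size <= MAX_NODES);
    for (int v = 0; v < s->size; v++)
        for (int w = 0; w < s->size; w++)
            if (s->y[w] && s->n[w][v])
                nx[v] = true;
    for (int v = 0; v < s->size; v++)
        s->x[v] = nx[v];
}

/* x->next = y:  n'(v, w) = (!x(v) && n(v, w)) || (x(v) && y(w)).
   Updating in place is safe: n'(v, w) reads only the old n(v, w). */
void store_next(Store *s) {
    assert(s->size <= MAX_NODES);
    for (int v = 0; v < s->size; v++)
        for (int w = 0; w < s->size; w++)
            s->n[v][w] = (!s->x[v] && s->n[v][w]) || (s->x[v] && s->y[w]);
}
```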
27
Invariants
  • No memory leaks: $\forall v : \bigvee_{x \in PVar} \exists w : x(w) \wedge n^*(w, v)$
  • Acyclic list(x): $\forall v, w : x(v) \wedge n^*(v, w) \to \neg n^+(w, v)$
  • Reverse(x): $\forall v, w, r : x(v) \wedge n^*(v, w) \wedge n(w, r) \to n'(r, w)$

28
Why use logical structures?
  • Naturally model pointers and dynamic allocation
  • No a priori bound on number of locations
  • Use formulas to express semantics
  • Indirect store updates using quantifiers
  • Can model other features
  • Concurrency
  • Abstract fields

29
Why use logical structures?
  • Behaves well under abstraction
  • Enables automatic construction of abstract
    interpreters from concrete interpretation rules
    (TVLA)

30
Collecting Interpretation
  • The set of reachable logical structures at every
    program point
  • Statements operate on sets of logical structures
  • Cannot be directly computed for programs with
    unbounded store and loops

x = NULL;
while (...) {
    t = malloc();
    t->next = x;
    x = t;
}
empty
31
Plan
  • Concrete interpretation
  • Canonical abstraction
  • TVLA

32
Canonical Abstraction
  • Convert logical structures of unbounded size into
    bounded size
  • Guarantees that the number of logical structures in
    every program is finite
  • Every first-order formula can be conservatively
    interpreted

33
Kleene Three-Valued Logic
  • 1: True
  • 0: False
  • 1/2: Unknown
  • A join semi-lattice: $0 \sqcup 1 = 1/2$

Logical order
34
Boolean Connectives [Kleene]
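
Kleene's connectives follow the same min/max pattern as the 2-valued semantics. A minimal sketch in C, assuming an integer encoding chosen for this illustration (0 for false, 1 for the unknown value 1/2, 2 for true); the names `Kleene`, `k_and`, `k_or`, `k_not`, and `k_join` are hypothetical.

```c
#include <assert.h>

/* Encoding chosen so that min/max implement AND/OR in the logical
   order 0 < 1/2 < 1:  K_FALSE = 0, K_HALF = 1/2, K_TRUE = 1. */
typedef enum { K_FALSE = 0, K_HALF = 1, K_TRUE = 2 } Kleene;

Kleene k_and(Kleene a, Kleene b) { return a < b ? a : b; }
Kleene k_or (Kleene a, Kleene b) { return a > b ? a : b; }

/* Negation swaps true and false and leaves 1/2 unchanged. */
Kleene k_not(Kleene a) {
    assert(a >= K_FALSE && a <= K_TRUE);
    return (Kleene)(K_TRUE - a);
}

/* Join in the information order: equal values are kept,
   disagreeing values collapse to 1/2 (so 0 join 1 = 1/2). */
Kleene k_join(Kleene a, Kleene b) { return a == b ? a : K_HALF; }
```

A definite operand can still decide a result: `k_and(K_HALF, K_FALSE)` is `K_FALSE` and `k_or(K_HALF, K_TRUE)` is `K_TRUE`.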
35
3-Valued Logical Structures
  • A set of individuals (nodes) U
  • Predicate meaning
  • $p^S : (U^S)^k \to \{0, 1, 1/2\}$

36
Canonical Abstraction
  • Partition the individuals into equivalence
    classes based on the values of their unary
    predicates
  • Every individual is mapped into its equivalence
    class
  • Collapse predicates via $\sqcup$
  • $p^S(u'_1, \ldots, u'_k) = \bigsqcup \{ p^B(u_1, \ldots, u_k) \mid f(u_1) = u'_1, \ldots, f(u_k) = u'_k \}$
  • At most $2^{|A|}$ abstract individuals
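
The steps above can be sketched in C for a structure with two unary abstraction predicates and one binary predicate. This is an illustrative sketch under those assumptions, not TVLA's implementation; `Concrete`, `Abstract`, `canonical_abstraction`, `NUM_UNARY`, and `MAX_NODES` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_NODES 8
#define NUM_UNARY 2   /* abstraction predicates, e.g. x and t */

typedef enum { K_FALSE = 0, K_HALF = 1, K_TRUE = 2 } Kleene;

typedef struct {
    int  size;
    bool unary[NUM_UNARY][MAX_NODES];
    bool n[MAX_NODES][MAX_NODES];
} Concrete;

typedef struct {
    int    size;
    bool   unary[NUM_UNARY][MAX_NODES]; /* definite on abstract nodes */
    Kleene n[MAX_NODES][MAX_NODES];
    bool   summary[MAX_NODES];          /* stands for more than one node */
} Abstract;

static Kleene k_join(Kleene a, Kleene b) { return a == b ? a : K_HALF; }

/* Map each concrete node u to the abstract node f[u] that has the same
   unary predicate values, then join n over nodes mapped together. */
void canonical_abstraction(const Concrete *c, Abstract *a, int f[MAX_NODES]) {
    int  code_of[MAX_NODES];            /* unary bit vector of each class */
    bool seen[MAX_NODES][MAX_NODES] = {{ false }};
    assert(c->size <= MAX_NODES);
    a->size = 0;
    for (int u = 0; u < c->size; u++) {
        int code = 0;
        for (int p = 0; p < NUM_UNARY; p++)
            code |= c->unary[p][u] << p;
        int k = 0;
        while (k < a->size && code_of[k] != code)
            k++;
        if (k == a->size) {             /* first member of a new class */
            code_of[k] = code;
            a->summary[k] = false;
            for (int p = 0; p < NUM_UNARY; p++)
                a->unary[p][k] = c->unary[p][u];
            a->size++;
        } else {
            a->summary[k] = true;
        }
        f[u] = k;
    }
    for (int u = 0; u < c->size; u++)
        for (int w = 0; w < c->size; w++) {
            Kleene val = c->n[u][w] ? K_TRUE : K_FALSE;
            int i = f[u], j = f[w];
            a->n[i][j] = seen[i][j] ? k_join(a->n[i][j], val) : val;
            seen[i][j] = true;
        }
}
```

For a three-node list whose head is pointed to by both variables, the two tail nodes share the same (all-false) unary values, so they collapse into one summary node and the edges into and within it become 1/2.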

37
Canonical Abstraction
x = NULL;
while (...) {
    t = malloc();
    t->next = x;
    x = t;
}
[Figure: concrete nodes u1, u2, u3 and abstract nodes u1, u2,3; variables x, t]
38
Canonical Abstraction
x = NULL;
while (...) {
    t = malloc();
    t->next = x;
    x = t;
}
[Figure: nodes u1, u2, u3 linked by n edges; variables x, t]
39
Canonical Abstraction and Equality
  • Summary nodes may represent more than one
    element
  • (In)equality need not be preserved under
    abstraction
  • Explicitly record equality
  • Summary nodes are nodes with eq(u, u) = 1/2

40
Canonical Abstraction and Equality
x = NULL;
while (...) {
    t = malloc();
    t->next = x;
    x = t;
}
[Figure: concrete nodes u1, u2, u3 and abstract nodes u1, u2,3, with n edges and eq values; variables x, t]
41
Canonical Abstraction
x = NULL;
while (...) {
    t = malloc();
    t->next = x;
    x = t;
}
[Figure: nodes u1, u2, u3 linked by n edges; variables x, t]
42
Challenges: Heap and Concurrency [Yahav, POPL'01]
  • Concurrency with the heap is evil
  • Java threads are just heap allocated objects
  • Data and control are strongly related
  • Thread-scheduling info may require understanding
    of heap structure (e.g., scheduling queue)
  • Heap analysis requires information about thread
    scheduling

Thread t1 = new Thread();
Thread t2 = new Thread();
t = t1;
t.start();
43
Configurations Example
[Figure: configuration with predicates held_by, blocked, at[l_0], at[l_1], at[l_C], rval[myLock]]
l_0: while (true) {
l_1:     synchronized (myLock) {
l_C:         // critical actions
l_2:     }
l_3: }
44
Concrete Configuration
[Figure: concrete configuration with predicates held_by, blocked, at[l_0], at[l_1], at[l_C], rval[myLock]]
45
Abstract Configuration
[Figure: abstract configuration with predicates held_by, blocked, at[l_0], at[l_1], at[l_C], rval[myLock]]
46
Examples Verified
Program                                      Property
twoLock Q                                    No interference; no memory leaks; partial correctness
Producer/consumer                            No interference; no memory leaks
Apprentice Challenge                         Counter increasing
Dining philosophers with resource ordering   Absence of deadlock
Mutex                                        Mutual exclusion
Web Server                                   No interference
47
Summary
  • Canonical abstraction guarantees finite number of
    structures
  • The concrete location of an object has no
    significance
  • But what is the significance of 3-valued logic?

48
Topics
  • Embedding
  • Instrumentation
  • Abstract Interpretation
  • Extensions

49
Embedding
50
Embedding
  • $B \sqsubseteq^f S$
  • onto function $f$
  • $p^B(u_1, \ldots, u_k) \sqsubseteq p^S(f(u_1), \ldots, f(u_k))$
  • $S$ is a tight embedding of $B$ with respect to $f$ if
  • $p^S(u'_1, \ldots, u'_k) = \bigsqcup \{ p^B(u_1, \ldots, u_k) \mid f(u_1) = u'_1, \ldots, f(u_k) = u'_k \}$
  • Canonical abstraction is a tight embedding

51
Embedding (cont)
  • $S_1 \sqsubseteq^f S_2 \Rightarrow$ every concrete state represented by $S_1$
    is also represented by $S_2$
  • The set of nodes in $S_1$ and $S_2$ may be different
  • No meaning for node names (abstract locations)
  • $\gamma(S) = \{ S' \mid S' \text{ is a 2-valued structure},\ S' \sqsubseteq^f S \}$

52
Embedding Theorem
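  • Assume $B \sqsubseteq^f S$, i.e. $p^B(u_1, \ldots, u_k) \sqsubseteq p^S(f(u_1), \ldots, f(u_k))$
  • Then every formula $\varphi$ is preserved
  • If $[\![\varphi]\!] = 1$ in $S$, then $[\![\varphi]\!] = 1$ in $B$
  • If $[\![\varphi]\!] = 0$ in $S$, then $[\![\varphi]\!] = 0$ in $B$
  • If $[\![\varphi]\!] = 1/2$ in $S$, then $[\![\varphi]\!]$ could be 0 or 1 in $B$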

53
Embedding Theorem
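  • Every formula $\varphi$ is preserved
  • If $[\![\varphi]\!] = 1$ in $S$, then $[\![\varphi]\!] = 1$ for all $B \in \gamma(S)$
  • If $[\![\varphi]\!] = 0$ in $S$, then $[\![\varphi]\!] = 0$ for all $B \in \gamma(S)$
  • If $[\![\varphi]\!] = 1/2$ in $S$, then $[\![\varphi]\!]$ could be 0 or 1 in
    $\gamma(S)$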

54
Challenge 2 - Destructive Update
[Figure: heap with variables x, p, y and n edges]
y->next = NULL
$n'(v, w) = \neg y(v) \wedge n(v, w)$
Sound ?
56
Embedding Theorem
$\exists v : x(v)$
1 = Yes
$\exists v : x(v) \wedge t(v)$
1 = Yes
$\exists v : x(v) \wedge y(v)$
0 = No
$\exists v, w : x(v) \wedge n(v, w)$
1/2 = Maybe
?v, w x(v)?n(v, w) ?n(v, w)
0No
$\exists v, w : x(v) \wedge n(v, w) \wedge n(w, w)$
1/2 = Maybe
57
Summary
  • The embedding theorem eliminates the need for
    proving near commutativity
  • Guarantees soundness
  • Applied to arbitrary logics
  • But can be imprecise

58
Limitations
  • Information on summary nodes is lost
  • Leads to useless verification

59
Increasing Precision
  • User (Programming Language) supplied global
    invariants
  • Naturally expressed in FOTC
  • Record extra information in the concrete
    interpretation
  • Tune the abstraction
  • Refine concretization

60
Cyclicity predicate
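$c[x]() = \exists v_1, v_2 : x(v_1) \wedge n^*(v_1, v_2) \wedge n^+(v_2, v_2)$

[Figure: list u1, u2, ..., un with n edges and its abstraction with nodes u1 and u2..n; variables x, t; c[x]() = 0 in both]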
61
Cyclicity predicate
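$c[x]() = \exists v_1, v_2 : x(v_1) \wedge n^*(v_1, v_2) \wedge n^+(v_2, v_2)$

[Figure: list u1, u2, ..., un with an additional n edge and its abstraction with nodes u1 and u2..n; variables x, t; c[x]() = 1 in both]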
62
Heap Sharing predicate
$is(v) = \exists v_1, v_2 : n(v_1, v) \wedge n(v_2, v) \wedge v_1 \ne v_2$

[Figure: list u1, u2, ..., un with n edges and its abstraction with nodes u1 and u2..n; variables x, t; is(v) = 0 on every node]
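
On a concrete structure, the defining formula for is(v) just asks whether v has two distinct n-predecessors. A minimal sketch in C, with `is_shared` and `MAX_NODES` as hypothetical names for this illustration:

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_NODES 8  /* illustrative bound on |U| */

/* is(v) = exists v1, v2: n(v1, v) && n(v2, v) && v1 != v2,
   i.e. v has at least two distinct n-predecessors. */
bool is_shared(int size, bool n[MAX_NODES][MAX_NODES], int v) {
    assert(size <= MAX_NODES && v >= 0 && v < size);
    int preds = 0;
    for (int v1 = 0; v1 < size; v1++)
        if (n[v1][v])
            preds++;
    return preds >= 2;
}
```

In an unshared list every node has at most one predecessor, so the predicate is 0 everywhere; recording that fact on a summary node is what lets the abstraction retain it.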
63
Heap Sharing predicate
$is(v) = \exists v_1, v_2 : n(v_1, v) \wedge n(v_2, v) \wedge v_1 \ne v_2$

[Figure: list u1, u2, ..., un with n edges; variables x, t; is(v) = 1 on one node and 0 on the others]
64
Concrete Interpretation Rules
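Statement        Update formula
x = NULL         $x'(v) = 0$
x = malloc()     $x'(v) = IsNew(v)$
x = y            $x'(v) = y(v)$
x = y->next      $x'(v) = \exists w : y(w) \wedge n(w, v)$
x->next = NULL   $n'(v, w) = \neg x(v) \wedge n(v, w)$
                 $is'(v) = is(v) \wedge \exists v_1, v_2 : n(v_1, v) \wedge n(v_2, v) \wedge \neg x(v_1) \wedge \neg x(v_2) \wedge \neg eq(v_1, v_2)$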
65
Reachability predicate
$t[n](v_1, v_2) = n^*(v_1, v_2)$
[Figure: list u1, u2, ..., un with n edges and its abstraction with nodes u1 and u2..n; variables x, t]
66
Additional Instrumentation predicates
  • reachable-from-variable-x(v)
  • $c_{fb}(v) = \forall v_1 : f(v, v_1) \to b(v_1, v)$
  • tree(v)
  • dag(v)
  • $inOrder(v) = \forall v_1 : n(v, v_1) \to dle(v, v_1)$
  • Weakest Precondition [Ramalingam, PLDI'02]

67
Instrumentation (Summary)
  • Refines the abstraction
  • Adds global invariants
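  • But requires update formulas (generated
    automatically in TVLA2)

$is(v) = \exists v_1, v_2 : n(v_1, v) \wedge n(v_2, v) \wedge v_1 \ne v_2$
$is(v) \Leftrightarrow \exists v_1, v_2 : n(v_1, v) \wedge n(v_2, v) \wedge v_1 \ne v_2$
$\gamma(S) = \{ S' \mid S' \models \varphi,\ S' \sqsubseteq^f S \}$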
68
Plan
  • Embedding Theorem
  • Instrumentation
  • Abstract interpretation using canonical
    abstraction
  • TVLA

69
Best Conservative Interpretation (CC79)
70
Best Transformer (x = x->n)
inverse embedding
71
Focus-Based Transformer (x = x->n)
[Figure: structures with variables x, y; steps: inverse embedding, canonical abstraction]
72
Focus-Based Transformer (x = x->n)
[Figure: structures with variables x, y]
73
Semantic Reduction
  • Improve the precision by recovering properties of
    the program semantics
  • A Galois connection $(L_1, \alpha, \gamma, L_2)$
  • An operation $op : L_2 \to L_2$ is a semantic reduction if
  • $\forall l \in L_2 : op(l) \sqsubseteq l$
  • $\gamma(op(l)) = \gamma(l)$
  • Can be applied before and after basic operations

74
Three Valued Logic Analysis (TVLA) [T. Lev-Ami, R. Manevich]
  • Input (FOTC)
  • Concrete interpretation rules
  • Definition of instrumentation predicates
  • Definition of safety properties
  • First Order Transition System (TVP)
  • Output
  • Warnings (text)
  • The 3-valued structure at every node (invariants)

75
Null Dereferences
typedef struct element {
    int value;
    struct element *n;
} Element;

bool search(int value, Element *x) {
    Element *c = x;
    while (x != NULL) {  /* tests x, not c */
        if (c->value == value)
            return TRUE;
        c = c->n;
    }
    return FALSE;
}
Demo
76
TVLA inputs
  • TVP - Three Valued Program
  • Predicate declaration
  • Action definitions (SOS)
  • Control flow graph
  • TVS - Three Valued Structure

Demo
77
Challenge 1
  • Write a C procedure on which TVLA reports a false
    null dereference

78
Proving Correctness of Sorting Implementations
(Lev-Ami, Reps, Sagiv, Wilhelm, ISSTA 2000)
  • Partial correctness
  • The elements are sorted
  • The list is a permutation of the original list
  • Termination
  • At every loop iteration, the set of elements
    reachable from the head decreases

79
Example: InsertSort
typedef struct list_cell {
    int data;
    struct list_cell *n;
} *List;

List InsertSort(List x) {
    List r, pr, rn, l, pl;
    r = x;
    pr = NULL;
    while (r != NULL) {
        l = x;
        rn = r->n;
        pl = NULL;
        while (l != r) {
            if (l->data > r->data) {
                pr->n = rn;
                r->n = l;
                if (pl == NULL)
                    x = r;
                else
                    pl->n = r;
                r = pr;
                break;
            }
            pl = l;
            l = l->n;
        }
        pr = r;
        r = rn;
    }
    return x;
}

pred.tvp
actions.tvp
Run Demo
80
Example: InsertSort
typedef struct list_cell {
    int data;
    struct list_cell *n;
} *List;

List InsertSort(List x) {
    List r, pr, rn, l, pl;
    if (x == NULL)
        return NULL;
    pr = x;
    r = x->n;
    while (r != NULL) {
        pl = x;
        rn = r->n;
        l = x->n;
        while (l != r) {
            if (l->data > r->data) {
                pr->n = rn;
                r->n = l;
                pl->n = r;
                r = pr;
                break;
            }
            pl = l;
            l = l->n;
        }
        pr = r;
        r = rn;
    }
    return x;
}
Run Demo
81
Example: Reverse
typedef struct list_cell {
    int data;
    struct list_cell *n;
} *List;

List reverse(List x) {
    List y, t;
    y = NULL;
    while (x != NULL) {
        t = y;
        y = x;
        x = x->n;
        y->n = t;
    }
    return y;
}
Run Demo
82
Challenge
  • Write a sorting C procedure on which TVLA fails
    to prove sortedness or permutation

83
Example: Mark and Sweep
void Mark(Node root) {
    if (root != NULL) {
        pending = ∅
        pending = pending ∪ {root}
        marked = ∅
        while (pending ≠ ∅) {
            x = SelectAndRemove(pending)
            marked = marked ∪ {x}
            t = x->left
            if (t ≠ NULL)
                if (t ∉ marked)
                    pending = pending ∪ {t}
            t = x->right
            if (t ≠ NULL)
                if (t ∉ marked)
                    pending = pending ∪ {t}
        }
    }
    assert(marked == Reachset(root))
}

void Sweep() {
    unexplored = Universe
    collected = ∅
    while (unexplored ≠ ∅) {
        x = SelectAndRemove(unexplored)
        if (x ∉ marked)
            collected = collected ∪ {x}
    }
    assert(collected == Universe − Reachset(root))
}
pred.tvp
Run Demo
84
Challenge 2
  • Use TVLA to show termination of markAndSweep

85
Verification of Safety Properties (PLDI'02, '04)
  • The Canvas Project (with IBM Watson)
  • (Component Annotation, Verification and Stuff)

Component: a library with cleanly encapsulated
state
Client: a program that uses the library
  • Lightweight Specification
  • "correct usage" rules a client must follow
  • "call open() before read()"

Certification: does the client program satisfy the
lightweight specification?
86
Prototype Implementation
  • Applied to several example programs
  • Up to 5000 lines of Java
  • Used to verify
  • Absence of concurrent modification exception
  • JDBC API conformance
  • IOStreams API conformance

87
(No Transcript)
88
(No Transcript)
89
(No Transcript)
90
Scaling
  • Staged analysis
  • Controlled complexity
  • Coarser abstractions [Manevich, SAS'04]
  • Handle libraries
  • Use procedure specifications [Yorsh, TACAS'04]
  • Decision procedures for linked data
    structures [Immerman, CAV'04; Lev-Ami, CADE'05]
  • Handling procedures
  • Compute procedure summaries [Jeannet, SAS'04]
  • Local heaps [Rinetzky, POPL'05]

91
Local heaps [Rinetzky, POPL'05]
[Figure: heap at call p(x), with variables y, g, t]
92
Why is Heap Analysis Difficult?
  • Destructive updating through pointers
  • p->next = q
  • Produces complicated aliasing relationships
  • Track aliasing on 3-valued structures
  • Dynamic storage allocation
  • No bound on the size of run-time data structures
  • Canonical abstraction ⇒ finite-sized 3-valued
    structures
  • Data-structure invariants typically only hold at
    the beginning and end of operations
  • Need to verify that data-structure invariants are
    re-established
  • Query the 3-valued structures that arise at the
    exit

93
Summary
  • Canonical abstraction is powerful
  • Intuitive
  • Adapts to the property of interest
  • Used to verify interesting program properties
  • Very few false alarms
  • But scaling is an issue

94
Summary
  • Effective Abstract Interpretation
  • Always terminates
  • Precise enough
  • But still expensive
  • Can model
  • Heap
  • Unbounded arrays
  • Concurrency
  • More instrumentation can mean a more efficient analysis
  • But canonical abstraction is limited
  • Correlation between list lengths
  • Arithmetic
  • Partial heaps

95
Summary
  • The embedding theorem eliminates the need for
    proving near commutativity
  • Guarantees soundness
  • Applied to arbitrary logics
  • But can be imprecise