Using First-Order Theorem Provers in Data Structure Verification

1
Using First-Order Theorem Provers in Data Structure Verification
  • Charles Bouillaguet
  • Ecole Normale Supérieure, Cachan, France
  • Viktor Kuncak, Martin Rinard
  • MIT CSAIL
2
Implementing Data Structures is Hard
  • Often small, but complex code
  • Lots of pointers
  • Unbounded, dynamic allocation
  • Complex shape invariants
  • DAG
  • Properties involving arithmetic (ordering)
  • Need strong invariants to guarantee correctness
  • e.g. lookup in ordered tree needs sortedness

3
How to obtain reliable data structure
implementations?
  • Approach
  • Prove that the program is correct
  • For all program executions (sound)
  • Verified properties
  • Data structure operations do not crash
  • Data structure invariants are preserved
  • Data structure content is correctly updated

4
Infrastructure
  • Jahob system for verifying data structure
    implementations
  • Kuncak, Wies, Zee, Rinard, Nguyen, Bouillaguet,
    Schmitt, Marnette, Bugrara
  • Analyzed programs: subset of Java
  • Specification: subset of Isabelle's language

5
Summary of Verified Data Structures
  • Implementations of relations
  • Add a binding
  • Remove all bindings for a given key
  • Test key membership
  • Retrieve data bound to a key
  • Test emptiness
  • Verified implementations
  • Linked list
  • Ordered tree
  • Hash table

6
An Example: Ordered Trees
  • Implementation of a finite map
  • Operations
  • insert
  • lookup
  • remove
  • Representation invariants
  • tree shaped (acyclicity, unique parent)
  • ordering constraints

[Figure: tree node with key, value, left, and right fields]
7
Sample code
  public static FuncTree update(int k, Object v, FuncTree t) {
      FuncTree new_left, new_right; Object new_data; int new_key;
      if (t == null) {
          new_data = v; new_key = k;
          new_left = null; new_right = null;
      } else {
          if (k < t.key) {
              new_left = update(k, v, t.left); new_right = t.right;
              new_key = t.key; new_data = t.data;
          } else if (t.key < k) {
              ...
          } else {
              new_data = v; new_key = k;
              new_left = t.left; new_right = t.right;
          }
      }
      FuncTree r = new FuncTree();
      r.left = new_left; r.right = new_right;
      r.data = new_data; r.key = new_key;
      return r;
  }

8
Sample code
  public static FuncTree update(int k, Object v, FuncTree t)
  /*: requires "v ~= null"
      ensures "result..content = t..content - {(x,y). x=k} + {(k,v)}" */
  {
      FuncTree new_left, new_right; Object new_data; int new_key;
      if (t == null) {
          new_data = v; new_key = k;
          new_left = null; new_right = null;
      } else {
          if (k < t.key) {
              new_left = update(k, v, t.left); new_right = t.right;
              new_key = t.key; new_data = t.data;
          } else if (t.key < k) {
              ...
          } else {
              new_data = v; new_key = k;
              new_left = t.left; new_right = t.right;
          }
      }
      FuncTree r = new FuncTree();
      r.left = new_left; r.right = new_right;
      r.data = new_data; r.key = new_key;
      return r;
  }

  • no null dereferences
  • 3 lines spec, 30 lines code
  • postcondition holds and invariants preserved
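The sample code can be made runnable roughly as follows. This is a minimal sketch, not the verified Jahob source: the class name FuncTreeSketch, the body of the right-subtree branch (elided on the slide and assumed here to mirror the left one), and the lookup method are all illustrative assumptions.

```java
// Sketch only: the right-subtree branch and lookup() are assumptions;
// the slide elides the former and only gives a contract for the latter.
final class FuncTreeSketch {
    final int key;
    final Object data;
    final FuncTreeSketch left, right;

    private FuncTreeSketch(int key, Object data,
                           FuncTreeSketch left, FuncTreeSketch right) {
        this.key = key; this.data = data; this.left = left; this.right = right;
    }

    // Functional update: builds new nodes along the search path, never mutates t.
    static FuncTreeSketch update(int k, Object v, FuncTreeSketch t) {
        if (v == null) throw new IllegalArgumentException("requires v != null");
        if (t == null) return new FuncTreeSketch(k, v, null, null);
        if (k < t.key)
            return new FuncTreeSketch(t.key, t.data, update(k, v, t.left), t.right);
        if (t.key < k) // assumed symmetric case
            return new FuncTreeSketch(t.key, t.data, t.left, update(k, v, t.right));
        return new FuncTreeSketch(k, v, t.left, t.right); // same key: rebind
    }

    // Returns the value bound to k, or null if k is unbound.
    static Object lookup(int k, FuncTreeSketch t) {
        while (t != null) {
            if (k < t.key) t = t.left;
            else if (t.key < k) t = t.right;
            else return t.data;
        }
        return null;
    }
}
```

Because update never mutates its argument, the old tree remains a valid map after the call, which is what lets the postcondition relate result..content to t..content.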
9
Ordered tree interface
  public ghost specvar content :: "(int * obj) set" = "{}";

  public static FuncTree empty_set()
      ensures "result..content = {}"

  public static FuncTree add(int k, Object v, FuncTree t)
      requires "v ~= null & (ALL y. (k,y) ~: t..content)"
      ensures "result..content = t..content Un {(k,v)}"

  public static FuncTree update(int k, Object v, FuncTree t)
      requires "v ~= null"
      ensures "result..content = t..content - {(x,y). x=k} + {(k,v)}"

  public static Object lookup(int k, FuncTree t)
      ensures "((k, result) : t..content) |
               (result = null & (ALL v. (k,v) ~: t..content))"

  public static FuncTree remove(int k, FuncTree t)
      ensures "result..content = t..content - {(x,y). x=k}"

10
Representation Invariants
  public final class FuncTree {
      private int key;
      private Object data;
      private FuncTree left, right;
      /*: public ghost specvar content :: "(int * obj) set"
          invariant ("content definition") "this ~= null -->
              content = {(key, data)} Un left..content Un right..content"
          invariant ("null implies empty") "this = null --> content = {}"
          invariant ("left children are smaller")
              "ALL k v. (k,v) : left..content --> k < key"
          invariant ("right children are bigger")
              "ALL k v. (k,v) : right..content --> k > key"
      */
  }

Callouts on the slide:
  • abstract set-valued field
  • tuples
  • implicit universal quantification over this
  • equality between sets
  • arithmetic
  • explicit quantification
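To make the invariants concrete, here is a small runtime check that computes the abstract content set and tests the two ordering invariants at every node. It is only a dynamic illustration of what the invariants say; Jahob proves them statically. The names InvariantCheck, Node, content and ordered are hypothetical.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Illustrative runtime reading of the representation invariants (names are assumptions).
final class InvariantCheck {
    static final class Node {
        final int key; final Object data; final Node left, right;
        Node(int key, Object data, Node left, Node right) {
            this.key = key; this.data = data; this.left = left; this.right = right;
        }
    }

    // "null implies empty" and "content definition":
    // content = {(key, data)} Un left..content Un right..content
    static Set<Map.Entry<Integer, Object>> content(Node t) {
        Set<Map.Entry<Integer, Object>> s = new HashSet<>();
        if (t == null) return s;
        s.add(new SimpleEntry<Integer, Object>(t.key, t.data));
        s.addAll(content(t.left));
        s.addAll(content(t.right));
        return s;
    }

    // "left children are smaller" and "right children are bigger",
    // checked for every node (the invariants quantify over all objects).
    static boolean ordered(Node t) {
        if (t == null) return true;
        for (Map.Entry<Integer, Object> e : content(t.left))
            if (!(e.getKey() < t.key)) return false;
        for (Map.Entry<Integer, Object> e : content(t.right))
            if (!(e.getKey() > t.key)) return false;
        return ordered(t.left) && ordered(t.right);
    }
}
```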
11
How could these properties be verified?
12
Standard Approach
[Slide background: excerpt from an interactive Coq proof script]
eauto intros . intuition subst . apply
Extensionality_Ensembles. unfold Same_set.
unfold Included. unfold In. unfold In in
H1. intuition. destruct H0. destruct (eq_nat_dec
x1 ArraySet_size).subst. rewrite
arraywrite_match in H0 auto. intuition. subst.
apply Union_intror. auto with sets. assert (x1 lt
ArraySet_size). omega. clear n. apply
Union_introl. rewrite arraywrite_not_same_i in
H0.unfold In. exists x1. intuition.omega.
inversion H0 subst clear H0. unfold In in
H3. destruct H3. exists x1. intuition. rewrite
arraywrite_not_same_i. intuition omega. omega.
exists ArraySet_size. intuition. inversion H3.
subst. rewrite arraywrite_match trivial.
  • Transform program into a logic formula
  • Using weakest precondition
  • The program is correct iff the formula is valid
  • Prove the formula
  • Very difficult formulas: interactively (Coq, Isabelle)
  • Decidable classes: automated (MONA, CVCL, Omega)
  • This talk: difficult formulas in an automated way :)
  • Interactive proof scripts (like the Coq excerpt above)
  • low efficiency
  • 1 line per grad student-minute
  • parallelization looks non-trivial

13
Formulas in Jahob
  • Very expressive specification language
  • Higher-Order features
  • How to prove formulas automatically?
  • Convert them to something simpler
  • Decidable classes
  • First-Order Logic

14
Automated reasoning in Jahob
15
Why FOL?
  • Existing theorem provers
  • SPASS, E, Vampire, Theo, Prover9, ...
  • continuously improving (yearly competition)
  • Effective on formulas with short proofs
  • Handle formulas with quantifiers nicely

16
HOL → FOL
  • Ideas
  • avoid axiomatizing rich theories
  • Translate what can naturally be expressed in FOL
  • soundly approximate the rest
  • Sound, incomplete approach
  • Full details in long version of the paper
  • (x,y) ∈ z.content  ⟶  Content(x,y,z)
  • w = f[x := v](y)  ⟶  (x = y ∧ w = v) ∨ (x ≠ y ∧ w = f(y))
  • λx.E = λx.F  ⟶  ∀x. E = F

17
Arithmetic
  • Numbers are uninterpreted constants in FOL
  • Provers do not know that 1+1=2!
  • Still need to reason about arithmetic
  • Our Solution
  • Provide partial, incomplete axiomatization
  • Still cannot deduce 1+1=2!
  • comparison between constants in formula
  • Satisfactory results in practice
  • ordering of elements in tree
  • array bound checks

18
Observation
  • Most formulas are easy to prove
  • i.e. in no measurable time
  • have very short proofs (in # of resolution steps)
  • Problem often concentrated in a small number of formulas that
    take very long to prove
  • We applied two existing techniques to make them
    easier
  • Eliminating type/sort information
  • Filtering unnecessary assumptions

19
Sort Information
  • Specification language has sorts
  • Integers
  • Objects
  • Boolean
  • Translate to unsorted FOL
  • ∀(x :: Obj). P(x)
  • ⇓
  • ∀x. Obj(x) → P(x)

20
Sort Information
  • Encoding sort information
  • bigger formulas
  • longer proofs
  • Formulas become harder to prove
  • Temptation to omit sort information

21
Effect on hard formulas
  • Formulas that take more than 1s to prove, from
    the Tree implementation (SPASS)

22
Omitting Sorts (cont'd)
  • Great speed-up (sometimes more than 10x)!
  • However
  • ∀(x y :: S). x = y
  • ∃(x y :: T). x ≠ y
  • Satisfiable with sorts (S = {a}, T = {b, c})
  • Unsatisfiable without!
  • Omitting sort guards breaks soundness!
  • Possible workaround: type-check the generated proof
  • When is it possible to skip type-checking?
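The counterexample can be checked by brute force. The sketch below evaluates the two formulas once with each quantifier ranging over its own sort, and once over a single shared domain, which is what erasing the sort annotations amounts to. The class and method names are hypothetical, and the bounded search over domain sizes is only illustrative: the unsorted pair is contradictory for every domain.

```java
// Brute-force illustration of the sort-erasure counterexample (names are assumptions).
final class SortErasureDemo {
    // ALL x y. x = y, with both variables ranging over dom
    static boolean allEqual(int[] dom) {
        for (int x : dom) for (int y : dom) if (x != y) return false;
        return true;
    }

    // EX x y. x ~= y, with both variables ranging over dom
    static boolean someDistinct(int[] dom) {
        for (int x : dom) for (int y : dom) if (x != y) return true;
        return false;
    }

    // Sorted reading: the first formula talks about S, the second about T.
    static boolean holdsWithSorts(int[] s, int[] t) {
        return allEqual(s) && someDistinct(t);
    }

    // Unsorted reading: both formulas talk about one domain {0, ..., n-1}.
    static boolean satisfiableWithoutSorts(int maxSize) {
        for (int n = 1; n <= maxSize; n++) {
            int[] dom = new int[n];
            for (int i = 0; i < n; i++) dom[i] = i;
            if (allEqual(dom) && someDistinct(dom)) return true;
        }
        return false;
    }
}
```

Note that S and T have different cardinalities in this example (one element versus two), which is exactly the situation a same-cardinality side condition rules out.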

23
Omitting Sorts: Result
  • We proved the following
  • Theorem. Suppose that
  • Sorts are pairwise disjoint (no sub-sorting)
  • Sorts have the same cardinality
  • Then omitting sort guards is
  • sound and complete
  • This justifies this useful optimization

24
Assumption Filtering
  • Provers get confused by too many assumptions
  • Lots of useless assumptions
  • Hardest shown benchmark needs 12 out of 56
  • Big benchmark: on average 33 necessary
  • Assumption filtering
  • Try to eliminate irrelevant assumptions
    automatically
  • Give each assumption a score based on relevance
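One simple way to score relevance is by symbol overlap with the goal. The slides do not give the scoring function actually used, so the sketch below is a hypothetical stand-in: it scores an assumption by the fraction of its symbols that also occur in the goal and keeps those above a threshold. All names and the threshold value are assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical relevance filter; not the scoring function used in Jahob.
final class AssumptionFilterSketch {
    // Fraction of the assumption's symbols that also occur in the goal.
    static double score(Set<String> assumptionSymbols, Set<String> goalSymbols) {
        if (assumptionSymbols.isEmpty()) return 0.0;
        int shared = 0;
        for (String s : assumptionSymbols)
            if (goalSymbols.contains(s)) shared++;
        return (double) shared / assumptionSymbols.size();
    }

    // Keeps the names of assumptions whose score reaches the threshold,
    // in the iteration order of the input map.
    static List<String> filter(Map<String, Set<String>> assumptions,
                               Set<String> goalSymbols, double threshold) {
        List<String> kept = new ArrayList<>();
        for (Map.Entry<String, Set<String>> e : assumptions.entrySet())
            if (score(e.getValue(), goalSymbols) >= threshold)
                kept.add(e.getKey());
        return kept;
    }
}
```

Dropping assumptions cannot make an invalid goal provable, so an over-aggressive filter costs completeness (a proof may be missed) but not soundness.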

25
Experimental results
26
Verification effort
  • Decreased as we improved the system
  • functional list was easy
  • a few days for trees
  • two hours for simple hash table
  • FOL: currently the most usable method for these kinds
    of data structures

27
Related work
  • Interactive provers: Isabelle, Coq, HOL, PVS,
    ACL2
  • First-order ATP
  • Vampire [Voronkov 04]
  • SPASS [Weidenbach 01]
  • E [Schulz IJCAR04]
  • Program checking
  • ESC/Java2 [Kiniry, Chalin, Hurlin]
  • Krakatoa [Marché, Paulin-Mohring, Urbain 03]
  • Spec# [Barnett, DeLine, Jacobs, Fähndrich,
    Leino, Schulte, Venter 05]
  • Hob system: verifies set implementations (we verify
    relations)
  • Shape analysis
  • PALE [Møller and Schwartzbach PLDI01]
  • TVLA [Sagiv, Reps, and Wilhelm TOPLAS02]
  • Roles [Kuncak, Lam, and Rinard POPL02]

28
Multiple Provers - Screenshot
29
Conclusion
  • Jahob verification system
  • Automation by translation HOL→FOL
  • omitting sorts theorem gives speedup
  • filtering automates selection of assumptions
  • Promising experimental results
  • strong properties: correct implementation
  • Do not crash
  • operations correctly update the content,
    clarifies behavior in case of duplicate keys,
  • representation invariants preserved (ordering,
    treeness, each element is in appropriate bucket)
  • relatively fast
  • verification effort much smaller than using
    interactive provers

30
Thank you
  • Formal Methods are the Future of computer
    Science.
  • Always have been
  • Always will be.
  • Questions ?

31
Converting to GCL
  • Conditional statement: easy
  • if cond then tbranch else fbranch
  • (Assume cond; tbranch) [] (Assume !cond;
    fbranch)
  • Procedure calls
  • Could inline (potentially exponential blowup)
  • Desugaring (modularity)
  • r = CALL m(x, y, z)
  • Assert (m's precondition)
  • Havoc r
  • Havoc vars modified by m
  • Assume (m's postcondition)

32
Converting to GCL (cont'd)
  • Loops: invariant required
  • while /*: invariant */ (condition) lbody
  • assert invariant;         (invariant holds initially)
  • havoc vars(lbody);        (no assumptions on variables, except that ...)
  • assume invariant;         (... the invariant holds)
  • ((assume condition;       (condition holds)
  •   lbody;
  •   assert invariant;       (invariant is preserved)
  •   assume false)           (no need to verify anything more)
  • [] (assume !condition))   (or condition does not hold and execution continues)
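The loop desugaring combines naturally with weakest preconditions. The sketch below builds formulas as plain strings using the standard rules wp(assume F, Q) = F --> Q, wp(assert F, Q) = F & Q, wp(havoc x, Q) = ALL x. Q, and wp(a [] b, Q) = wp(a, Q) & wp(b, Q). The class GclSketch and its method names are hypothetical, and variable capture under havoc is ignored for brevity.

```java
// String-based weakest-precondition sketch for guarded commands (names are assumptions).
final class GclSketch {
    interface Cmd { String wp(String post); }

    static Cmd assume(String f)     { return q -> "(" + f + " --> " + q + ")"; }
    static Cmd assertThat(String f) { return q -> "(" + f + " & " + q + ")"; }
    static Cmd havoc(String x)      { return q -> "(ALL " + x + ". " + q + ")"; }

    // Sequential composition: push the postcondition through from last to first.
    static Cmd seq(Cmd... cs) {
        return q -> {
            String r = q;
            for (int i = cs.length - 1; i >= 0; i--) r = cs[i].wp(r);
            return r;
        };
    }

    // Nondeterministic choice: both branches must establish the postcondition.
    static Cmd choice(Cmd a, Cmd b) {
        return q -> "(" + a.wp(q) + " & " + b.wp(q) + ")";
    }

    // while /*: inv */ (cond) body, desugared as on the slide.
    static Cmd loop(String inv, String cond, String modified, Cmd body) {
        return seq(
            assertThat(inv),            // invariant holds initially
            havoc(modified),            // forget the modified variables
            assume(inv),                // ... except that the invariant holds
            choice(
                seq(assume(cond), body, assertThat(inv), assume("False")),
                assume("~(" + cond + ")")));
    }
}
```

The assume("False") at the end of the first branch makes everything after the loop body vacuously true on that path, so only the exit branch carries the postcondition forward.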
33
Verification condition for remove
  • ((((fieldRead Pair_data null) null)
    ((fieldRead FuncTree_data null) null)
    ((fieldRead FuncTree_left null) null)
    ((fieldRead FuncTree_right null) null) (ALL
    (xObjobj). (xObj Object)) ((Pair Int
    FuncTree) null) ((Array Int FuncTree)
    null) ((Array Int Pair) null) (null
    Object_alloc) (pointsto Pair Pair_data Object)
    (pointsto FuncTree FuncTree_data Object)
    (pointsto FuncTree FuncTree_left FuncTree)
    (pointsto FuncTree FuncTree_right FuncTree)
    comment ''unalloc_lonely'' (ALL (xobj). ((x
    Object_alloc) --gt ((ALL (yobj). ((fieldRead
    Pair_data y) x)) (ALL (yobj). ((fieldRead
    FuncTree_data y) x)) (ALL (yobj).
    ((fieldRead FuncTree_left y) x)) (ALL
    (yobj). ((fieldRead FuncTree_right y) x))
    ((fieldRead Pair_data x) null) ((fieldRead
    FuncTree_data x) null) ((fieldRead
    FuncTree_left x) null) ((fieldRead
    FuncTree_right x) null)))) comment
    ''ProcedurePrecondition'' (True comment
    ''FuncTree_PrivateInv content definition'' (ALL
    (thisobj). (((this Object_alloc) (this
    FuncTree) ((this obj) null)) --gt
    ((fieldRead (FuncTree_content (obj gt ((int
    obj)) set)) (this obj)) ((((fieldRead
    (FuncTree_key (obj gt int)) (this obj)),
    (fieldRead (FuncTree_data (obj gt obj)) (this
    obj))) Un (fieldRead (FuncTree_content
    (obj gt ((int obj)) set)) (fieldRead
    (FuncTree_left (obj gt obj)) (this obj))))
    Un (fieldRead (FuncTree_content (obj gt ((int
    obj)) set)) (fieldRead (FuncTree_right (obj
    gt obj)) (this obj))))))) comment
    ''FuncTree_PrivateInv null implies empty'' (ALL
    (thisobj). (((this Object_alloc) (this
    FuncTree) ((this obj) null)) --gt
    ((fieldRead (FuncTree_content (obj gt ((int
    obj)) set)) (this obj)) ))) comment
    ''FuncTree_PrivateInv no null data'' (ALL
    (thisobj). (((this Object_alloc) (this
    FuncTree) ((this obj) null)) --gt
    ((fieldRead (FuncTree_data (obj gt obj)) (this
    obj)) null))) comment ''FuncTree_PrivateIn
    v left children are smaller'' (ALL (thisobj).
    (((this Object_alloc) (this FuncTree)) --gt
    (ALL k. (ALL v. (((k, v) (fieldRead
    (FuncTree_content (obj gt ((int obj)) set))
    (fieldRead (FuncTree_left (obj gt obj)) (this
    obj)))) --gt (intless k (fieldRead
    (FuncTree_key (obj gt int)) (this
    obj)))))))) comment ''FuncTree_PrivateInv right
    children are bigger'' (ALL (thisobj). (((this
    Object_alloc) (this FuncTree)) --gt (ALL k.
    (ALL v. (((k, v) (fieldRead (FuncTree_content
    (obj gt ((int obj)) set)) (fieldRead
    (FuncTree_right (obj gt obj)) (this obj))))
    --gt ((fieldRead (FuncTree_key (obj gt int))
    (this obj)) lt k))))))) comment ''t_type''
    (((t obj) (FuncTree obj set)) ((t
    obj) (Object_alloc obj set)))) --gt ((comment
    ''TrueBranch'' (((t obj) null) bool) --gt
    (comment ''ProcedureEndPostcondition''
    ((((fieldRead (FuncTree_content (obj gt ((int
    obj)) set)) (null obj)) ((fieldRead
    (FuncTree_content (obj gt ((int obj)) set))
    (t obj)) - p. (EX x y. ((p (x, y)) (x
    (k int)))))) (ALL (framedObjobj).
    (((framedObj Object_alloc) (framedObj
    FuncTree)) --gt ((fieldRead FuncTree_content
    framedObj) (fieldRead FuncTree_content
    framedObj))))) comment ''FuncTree_PrivateInv
    content definition'' (ALL (thisobj). (((this
    Object_alloc) (this FuncTree) ((this
    obj) null)) --gt ((fieldRead (FuncTree_content
    (obj gt ((int obj)) set)) (this obj))
    ((((fieldRead (FuncTree_key (obj gt int))
    (this obj)), (fieldRead (FuncTree_data (obj
    gt obj)) (this obj))) Un (fieldRead
    (FuncTree_content (obj gt
  • And 200 more kilobytes
  • Infeasible to prove directly

34
Splitting heuristic
  • Verification condition is a big conjunction
  • conjunctions in postcondition
  • proving each invariant
  • proving each branch in program
  • Solution: split VC into individual conjuncts
  • Prove each conjunct separately
  • Each conjunct has form
  • H1 /\ ... /\ Hn → Gi
  • Tree.Remove has 230 such conjuncts
  • How do we prove them?
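Splitting can be sketched as a recursive walk over the goal: a conjunction is split into its parts, and the left side of an implication is moved into the hypotheses. The exact rule set used in Jahob is not shown on the slide, so the two rules, the tiny formula classes, and the output format below are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical splitter producing sequents of the form H1 /\ ... /\ Hn -> Gi.
final class SplitSketch {
    static abstract class F {}
    static final class Atom extends F {
        final String name;
        Atom(String name) { this.name = name; }
    }
    static final class And extends F {
        final F l, r;
        And(F l, F r) { this.l = l; this.r = r; }
    }
    static final class Imp extends F {
        final F l, r;
        Imp(F l, F r) { this.l = l; this.r = r; }
    }

    static String show(F f) {
        if (f instanceof Atom) return ((Atom) f).name;
        if (f instanceof And) return "(" + show(((And) f).l) + " /\\ " + show(((And) f).r) + ")";
        return "(" + show(((Imp) f).l) + " -> " + show(((Imp) f).r) + ")";
    }

    static List<String> split(F goal) {
        List<String> out = new ArrayList<>();
        split(new ArrayList<String>(), goal, out);
        return out;
    }

    private static void split(List<String> hyps, F goal, List<String> out) {
        if (goal instanceof And) {            // H -> (A /\ B)  ~>  H -> A,  H -> B
            split(hyps, ((And) goal).l, out);
            split(hyps, ((And) goal).r, out);
        } else if (goal instanceof Imp) {     // H -> (A -> B)  ~>  H /\ A -> B
            List<String> more = new ArrayList<>(hyps);
            more.add(show(((Imp) goal).l));
            split(more, ((Imp) goal).r, out);
        } else {
            String g = show(goal);
            out.add(hyps.isEmpty() ? g : String.join(" /\\ ", hyps) + " -> " + g);
        }
    }
}
```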

35
Detupling (cont'd)
  • Complete rules

36
Handling of Fields (cont'd)
  • We dealt with field updates
  • New function expressed in terms of old one
  • Base case: field variables
  • Natural encoding in FOL using functions
  • x = y.f  ⟶  x = f(y)
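The idea of a field as a function, and of a field update as a new function defined in terms of the old one, can be illustrated as follows. This is a sketch under assumptions: the name fieldWrite echoes the fieldRead operator visible in the verification condition, but the Java encoding here is purely illustrative.

```java
import java.util.function.Function;

// A field f is modelled as a function from objects to values (illustrative encoding).
final class FieldUpdateSketch {
    // The field after the assignment x.f = v: it returns v at x and
    // agrees with the old function everywhere else.
    static <A, B> Function<A, B> fieldWrite(Function<A, B> f, A x, B v) {
        return y -> (y == x) ? v : f.apply(y); // reference equality, as for Java objects
    }
}
```

Reading y.f after the update then splits into the two cases y = x (the result is v) and y ≠ x (the result is the old f(y)), which is a case distinction a first-order prover can handle directly.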

37
Future work
  • Verify more examples
  • balanced trees
  • fancy priority queues (binomial, Fibonacci, ...)
  • hash table with dynamic resizing
  • hash function
  • verify clients of data structures
  • Improve assumption filtering
  • take rarity of symbols into account
  • check for occurring polarity