Title: Decision-Procedure Based Theorem Provers. Tactic-Based Theorem Proving. Inferring Loop Invariants
1. Decision-Procedure Based Theorem Provers. Tactic-Based Theorem Proving. Inferring Loop Invariants
2. Review
- [Diagram: source language → VCGen → FOL with theories → goal-directed proving, dispatching to satisfiability procedures 1..n]
3. Combining Satisfiability Procedures
- Consider a set of literals F
  - Containing symbols from two theories T1 and T2
- We split F into two sets of literals
  - F1 containing only literals in theory T1
  - F2 containing only literals in theory T2
- We name all subexpressions
  - p1(f2(E)) is split into f2(E) = n ∧ p1(n)
- We have unsat(F1 ∧ F2) iff unsat(F)
- unsat(F1) ∨ unsat(F2) ⇒ unsat(F)
  - But the converse is not true
- So we cannot compute unsat(F) with a trivial combination of the sat. procedures for T1 and T2
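The naming (purification) step above can be sketched as follows. The nested-tuple term encoding, the `THEORY` map, and the fresh-name scheme are all illustrative assumptions, not a fixed API:

```python
# Sketch of the naming step: alien subterms (whose head symbol belongs to
# the other theory) are replaced by fresh names, and a defining equality
# is recorded for each.

THEORY = {"p1": "T1", "f2": "T2"}   # assumed symbol-to-theory map

counter = 0
def fresh():
    global counter
    counter += 1
    return f"n{counter}"

def purify(term, defs):
    if isinstance(term, str):            # a variable: shared by both theories
        return term
    head, args = term[0], term[1:]
    new_args = []
    for a in args:
        a = purify(a, defs)
        if not isinstance(a, str) and THEORY[a[0]] != THEORY[head]:
            n = fresh()
            defs.append((n, a))          # record the naming equality n = a
            a = n
        new_args.append(a)
    return (head,) + tuple(new_args)

defs = []
lit = purify(("p1", ("f2", "E")), defs)  # p1(f2(E)) becomes p1(n1), n1 = f2(E)
```

After purification each literal mentions symbols of a single theory, so F splits cleanly into F1 and F2.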
4. Combining Satisfiability Procedures. Example
- Consider equality and arithmetic
5. Combining Satisfiability Procedures
- Combining satisfiability procedures is non-trivial
  - And that is to be expected
  - Equality was solved by Ackermann in 1924, arithmetic by Fourier even before, but the combination E + A only in 1979!
- Yet in any single verification problem we will have literals from several theories
  - Equality, arithmetic, lists, …
- When and how can we combine separate satisfiability procedures?
6. Nelson-Oppen Method (1)
- Represent all conjuncts in the same DAG
  - f(f(x) − f(y)) ≠ f(z) ∧ y ≤ x ∧ x ≤ y + z ∧ z ≤ 0
- [DAG picture: nodes f, −, f, f, f, y, x, z, 0, with shared subterms]
7. Nelson-Oppen Method (2)
- Run each sat. procedure
  - Require it to report all contradictions (as usual)
  - Also require it to report all equalities between nodes
- [Same DAG, now annotated with the discovered equalities between nodes]
8. Nelson-Oppen Method (3)
- Broadcast all discovered equalities and re-run the sat. procedures
- Until no more equalities are discovered or a contradiction arises
- [Same DAG; propagating the equality x = y leads to a contradiction with the disequality]
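The broadcast loop of slides 6-8 can be sketched as follows. The two hard-wired "procedures" stand in for real decision procedures for arithmetic and for equality with an uninterpreted f, and the interface (a procedure maps the shared equalities to either unsat or a set of new equalities) is an assumption of the sketch:

```python
# Sketch of the Nelson-Oppen driver: run each procedure, broadcast the
# equalities it discovers, and repeat until a contradiction or a fixed point.

def arith_proc(eqs):
    # pretends to own the literals y <= x and x <= y, so it reports x = y
    return ("eqs", {("x", "y")})

def euf_proc(eqs):
    # pretends to own f(x) != f(y); once x = y arrives, congruence gives
    # f(x) = f(y), contradicting the disequality
    if ("x", "y") in eqs:
        return ("unsat", None)
    return ("eqs", set())

def nelson_oppen(procs):
    eqs = set()
    changed = True
    while changed:
        changed = False
        for p in procs:
            kind, new = p(eqs)
            if kind == "unsat":
                return "unsat"
            if new - eqs:            # broadcast newly discovered equalities
                eqs |= new
                changed = True
    return "sat"                     # fixed point, no contradiction

result = nelson_oppen([arith_proc, euf_proc])   # "unsat"
```

The driver terminates because only finitely many equalities exist between the nodes of the DAG.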
9. What Theories Can be Combined?
- Only theories without common interpreted symbols
  - But OK if one theory takes the symbol uninterpreted
- Only certain theories can be combined
  - Consider integer arithmetic (Z, +, ≤) and equality
  - Consider 1 ≤ x ∧ x ≤ 2 ∧ a = 1 ∧ b = 2 ∧ f(x) ≠ f(a) ∧ f(x) ≠ f(b)
  - No equalities and no contradictions are discovered
  - Yet, unsatisfiable
- A theory is non-convex when a set of literals entails a disjunction of equalities without entailing any single equality
10. Handling Non-Convex Theories
- Many theories are non-convex
- Consider the theory of memory and pointers
  - It is non-convex
  - true ⇒ A = B ∨ sel(upd(M, A, V), B) = sel(M, B) (neither of the disjuncts is entailed individually)
- For such theories it can be the case that
  - No contradiction is discovered
  - No single equality is discovered
  - But a disjunction of equalities is discovered
- We need to propagate disjunctions of equalities
11. Propagating Disjunctions of Equalities
- To propagate disjunctions we perform a case split
- If a disjunction E1 ∨ … ∨ En is discovered
  - Save the current state of the prover
  - for i = 1 to n
    - broadcast Ei
    - if no contradiction arises then return satisfiable
    - restore the saved prover state
  - return unsatisfiable
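A runnable version of the case-split loop above. The prover state is just a set of string facts, and `broadcast` / `has_contradiction` are hypothetical hooks standing in for the real satisfiability procedures:

```python
import copy

def case_split(state, disjuncts, broadcast, has_contradiction):
    for e in disjuncts:                  # E1 \/ ... \/ En was discovered
        saved = copy.deepcopy(state)     # save the current prover state
        broadcast(state, e)
        if not has_contradiction(state):
            return "satisfiable"         # this branch survives
        state.clear()                    # restore the saved prover state
        state.update(saved)
    return "unsatisfiable"               # every disjunct is contradictory

# Example: 1 <= x <= 2 forces x = 1 \/ x = 2, but both cases are refuted.
state = {"x!=1", "x!=2"}
res = case_split(
    state, ["x=1", "x=2"],
    broadcast=lambda st, e: st.add(e),
    has_contradiction=lambda st: ("x=1" in st and "x!=1" in st)
                                 or ("x=2" in st and "x!=2" in st))
```

Note that the state must be restored after each refuted branch, which is exactly the backtracking cost discussed on the next slide.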
12. Handling Non-Convex Theories
- Case splitting is expensive
  - Must backtrack (performance −−)
  - Must implement all satisfiability procedures in incremental fashion (simplicity −−)
- In some cases the splitting can be prohibitive
  - Take pointers for example:
  - upd(upd(… upd(m, i1, x) …, in−1, x), in, x) = upd(… upd(m, j1, x) …, jn−1, x) ∧
  - sel(m, i1) ≠ x ∧ … ∧ sel(m, in) ≠ x
  - entails ∨_{j ≠ k} ij = ik
  - (a conjunction of length n entails n² disjuncts)
13. Forward vs. Backward Theorem Proving
14. Forward vs. Backward Theorem Proving
- The state of a prover can be expressed as
  - H1 ∧ … ∧ Hn ⊢? G
  - Given the hypotheses Hi, try to derive the goal G
- A forward theorem prover derives new hypotheses, in the hope of deriving G
  - If H1 ∧ … ∧ Hn ⇒ H then
  - move to state H1 ∧ … ∧ Hn ∧ H ⊢? G
  - Success state: H1 ∧ … ∧ G ∧ … ∧ Hn ⊢? G
- A forward theorem prover uses heuristics to reach G
  - Or it can exhaustively derive everything that is derivable!
15. Forward Theorem Proving
- Nelson-Oppen is a forward theorem prover
  - The state is L1 ∧ … ∧ Ln ⊢? false
  - If L1 ∧ … ∧ Ln ⇒ E (an equality) then
  - New state is L1 ∧ … ∧ Ln ∧ E ⊢? false (add the equality)
  - Success state is L1 ∧ … ∧ L ∧ … ∧ ¬L ∧ … ∧ Ln ⊢? false
- Nelson-Oppen provers exhaustively produce all derivable facts hoping to encounter the goal
- Case splitting can be explained this way too
  - If L1 ∧ … ∧ Ln ⇒ E1 ∨ E2 (a disjunction of equalities) then
  - Two new states are produced (both must lead to success)
  - L1 ∧ … ∧ Ln ∧ E1 ⊢? false
  - L1 ∧ … ∧ Ln ∧ E2 ⊢? false
16. Backward Theorem Proving
- A backward theorem prover derives new subgoals from the goal
  - The current state is H1 ∧ … ∧ Hn ⊢? G
  - If H1 ∧ … ∧ Hn ∧ G1 ∧ … ∧ Gn ⇒ G (the Gi are subgoals)
  - Produce n new states (all must lead to success)
  - H1 ∧ … ∧ Hn ⊢? Gi
- Similar to case splitting in Nelson-Oppen
  - Consider a non-convex theory
  - H1 ∧ … ∧ Hn ⇒ E1 ∨ E2
  - is the same as H1 ∧ … ∧ Hn ∧ ¬E1 ∧ ¬E2 ⇒ false
  - (thus we have reduced the goal false to the subgoals ¬E1 and ¬E2)
17. Programming Theorem Provers
- Backward theorem provers most often use heuristics
- It is useful to be able to program the heuristics
- Such programs are called tactics, and tactic-based provers have this capability
  - E.g. the Edinburgh LCF was a tactic-based prover whose programming language was called the Meta-Language (ML)
- A tactic examines the state and either
  - Announces that it is not applicable in the current state, or
  - Modifies the proving state
18. Programming Theorem Provers. Tactics.
- State = Formula list × Formula
  - A set of hypotheses and a goal
- Tactic = State → (State → α) → (unit → α) → α
  - Continuation-passing style
  - Given a state, it invokes either the success continuation with a modified state or the failure continuation
- Example: a congruence-closure based tactic
  - cc (h, g) c f =
  - let e1, …, en = the new equalities in the congruence closure of h in
  - c (h ∪ {e1, …, en}, g)
  - A forward-chaining tactic
19. Programming Theorem Provers. Tactics.
- Consider an axiom ∀x. a(x) ⇒ b(x)
  - Like the clause b(x) :- a(x) in Prolog
- This could be turned into a tactic
  - clause (h, g) c f =
  - if unif(g, b) = θ then
  - c (h, θ(a))
  - else
  - f ()
  - A backward-chaining tactic
20. Programming Theorem Provers. Tacticals.
- Tactics can be composed using tacticals
- Examples
  - THEN : tactic → tactic → tactic
  - THEN t1 t2 = λs.λc.λf.
    let newc s' = t2 s' c f in t1 s newc f
  - REPEAT : tactic → tactic
  - REPEAT t = THEN t (REPEAT t)
  - ORELSE : tactic → tactic → tactic
  - ORELSE t1 t2 = λs.λc.λf.
    let newf () = t2 s c f in t1 s c newf
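The tacticals above can be sketched in Python with closures for the continuations. REPEAT as literally defined (THEN t (REPEAT t)) would not terminate, so this sketch adds the standard base case via ORELSE with an identity tactic; the sample `clause` and `assumption` tactics are illustrative assumptions:

```python
# LCF-style tactics in continuation-passing style. A tactic takes a state
# (hyps, goal), a success continuation c, and a failure continuation f.

def THEN(t1, t2):
    # run t1; on success, feed the new state to t2
    return lambda s, c, f: t1(s, lambda s2: t2(s2, c, f), f)

def ORELSE(t1, t2):
    # run t1; if it fails, run t2 on the original state
    return lambda s, c, f: t1(s, c, lambda: t2(s, c, f))

IDTAC = lambda s, c, f: c(s)    # always succeeds, leaves the state unchanged

def REPEAT(t):
    # the slide's REPEAT t = THEN t (REPEAT t) needs a base case to
    # terminate; stopping via ORELSE IDTAC when t fails is the usual fix
    return lambda s, c, f: ORELSE(THEN(t, REPEAT(t)), IDTAC)(s, c, f)

def clause(a, b):
    # backward chaining for the Prolog clause  b :- a  (goal b becomes a)
    def t(state, c, f):
        hyps, goal = state
        return c((hyps, a)) if goal == b else f()
    return t

def assumption(state, c, f):
    # close the goal if it is among the hypotheses
    hyps, goal = state
    return c((hyps, None)) if goal in hyps else f()

# Prove goal "c" from hypothesis "a" using the clauses  c :- b  and  b :- a.
prove = THEN(REPEAT(ORELSE(clause("b", "c"), clause("a", "b"))), assumption)
result = prove((["a"], "c"), lambda s: "proved", lambda: "failed")
```

The failure continuations give backtracking for free: when one branch of an ORELSE fails, control simply returns to the alternative.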
21. Programming Theorem Provers. Tacticals.
- Prolog is just one possible tactic
  - Given tactics for each clause c1, …, cn
  - The Prolog tactic is
  - Prolog = REPEAT (c1 ORELSE c2 ORELSE … ORELSE cn)
- Nelson-Oppen can also be programmed this way
  - The result is not as efficient as a special-purpose implementation
- This is a very powerful mechanism for semi-automatic theorem proving
  - Used in Isabelle, HOL, and many others
22. Techniques for Inferring Loop Invariants
23. Inferring Loop Invariants
- Traditional program verification has several elements
  - Function specifications and loop invariants
  - Verification condition generation
  - Theorem proving
- Requiring specifications from the programmer is often acceptable
- Requiring loop invariants is not acceptable
  - Same for specifications of local functions
24. Inferring Loop Invariants
- A set of cutpoints is a set of program points such that
  - There is at least one cutpoint on each circular path in the CFG
  - There is a cutpoint at the start of the program
  - There is a cutpoint before the return
- Consider that our function uses n variables x
- We associate with each cutpoint k an assertion Ik(x)
- If a is a path from cutpoint k to cutpoint j then
  - Ra(x) : Zn → Zn expresses the effect of path a on the values of x at j as a function of those at k
  - Pa(x) : Zn → B is a path predicate that is true exactly of those values of x at k that will enable the path a
25. Cutpoints. Example.
- p01 = true
- r01 = (A ← A, K ← 0, L ← len(A), S ← 0, m ← m)
- p11 = K + 1 < L
- r11 = (A ← A, K ← K + 1, L ← L, S ← S + sel(m, A + K), m ← m)
- p12 = K + 1 ≥ L
- r12 = r11
- Easily obtained through sym. eval.
- [Flowchart: L := len(A); K := 0; S := 0; then the loop body S := S + A[K]; K := K + 1, repeated while K < L; then return S]
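The flowchart program, written out as ordinary code; the `array_sum` name is ours, and len(A) ≥ 1 is assumed because the body runs once before the test:

```python
def array_sum(A):
    # Cutpoint 0 at entry, cutpoint 1 at the loop head, cutpoint 2 before
    # the return. Assumes len(A) >= 1.
    K, L, S = 0, len(A), 0        # r01: K <- 0, L <- len(A), S <- 0
    while True:                   # cutpoint 1
        S = S + A[K]              # r11: S <- S + sel(m, A + K)
        K = K + 1                 #      K <- K + 1
        if not (K < L):           # p12: the exit path, K + 1 >= L in terms
            break                 #      of the pre-state value of K
    return S                      # cutpoint 2
```

Note that the path predicates p11 and p12 are stated over the values at cutpoint 1, which is why the source test K < L appears as K + 1 < L after the update K ← K + 1.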
26. Equational Definition of Invariants
- A set of assertions is a set of invariants if
  - The assertion for the start cutpoint is the precondition
  - The assertion for the end cutpoint is the postcondition
  - For each path from i to j we have
  - ∀x. Ii(x) ∧ Pij(x) ⇒ Ij(Rij(x))
- Now we have to solve a system of constraints with the unknowns I1, …, In−1
  - I0 and In are known
- We will consider the simpler case of a single loop
  - Otherwise we might want to try solving the inner/last loop first
27. Invariants. Example.
- I0 ⇒ I1(r0(x))
  - The invariant I1 is established initially
- I1 ∧ K + 1 < L ⇒ I1(r1(x))
  - The invariant I1 is preserved in the loop
- I1 ∧ K + 1 ≥ L ⇒ I2(r1(x))
  - The invariant I1 is strong enough (i.e. useful)
- [Flowchart: cutpoint 0, edge r0 into cutpoint 1 at the loop head, loop edge r1, cutpoint 2 before return S]
28. The Lattice of Invariants
- Weak predicates satisfy condition 1
  - Are satisfied initially
- Strong predicates satisfy condition 3
  - Are useful
- A few predicates satisfy condition 2
  - Are invariant
  - Form a lattice
- [Diagram: the lattice of predicates between true and false]
29. Finding the Invariant
- Which of the potential invariants should we try to find?
- We prefer to work backwards
  - Essentially proving only what is needed to satisfy In
- Forward is also possible, but sometimes wasteful, since we have to prove everything that holds at any point
30. Finding the Invariant
- Thus we do not know the precondition of the loop
- The weakest invariant that is strong enough has the most chances of holding initially
- This is the one that we'll try to find
  - And then check that it is weak enough
- [Diagram: the lattice of predicates between true and false]
31. Induction Iteration Method
- Equation 3 gives a predicate weaker than any invariant
  - I1 ∧ K + 1 ≥ L ⇒ I2(r1(x))
  - I1 ⇒ (K + 1 ≥ L ⇒ I2(r1(x)))
  - W0 = K + 1 ≥ L ⇒ I2(r1(x))
- Equation 2 suggests an iterative computation of the invariant I1
  - I1 ⇒ (K + 1 < L ⇒ I1(r1(x)))
32. Induction Iteration Method
- Define a family of predicates
  - W0 = K + 1 ≥ L ⇒ I2(r1(x))
  - Wj = W0 ∧ (K + 1 < L ⇒ Wj−1(r1(x)))
- Properties of the Wj
  - Wj ⇒ Wj−1 ⇒ … ⇒ W0 (they form a strengthening chain)
  - I1 ⇒ Wj (they are weaker than any invariant)
- If Wj−1 ⇒ Wj then
  - Wj is an invariant (satisfies both equations 2 and 3)
  - Wj ⇒ (K + 1 < L ⇒ Wj(r1(x)))
  - Wj is the weakest invariant
  - (recall domain theory: predicates form a domain, and we use the fixed-point theorem to obtain least solutions to recursive equations)
33. Induction Iteration Method
- W' = K + 1 ≥ L ⇒ I2(r1(x))    // This is W0
- W = true
- while not (W ⇒ W')
  - W = W'
  - W' = (K + 1 ≥ L ⇒ I2(r1(x))) ∧ (K + 1 < L ⇒ W(r1(x)))
- The only hard part is to check whether W ⇒ W'
  - We use a theorem prover for this purpose
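A toy version of this loop for the array-bounds example of the following slides. Predicates are Python functions over (K, L, len(A)), and the implication check is brute force over a small finite range instead of a theorem-prover call; both encodings are assumptions of the sketch:

```python
# Induction iteration: start W' at W0 and iterate
#   W' = W0 /\ (K + 1 < L ==> W(r1(x)))   with r1: K <- K + 1
# until W => W'.
from itertools import product

DOM = range(-2, 6)      # small finite domain standing in for Z

def implies(p, q):
    # brute-force validity of p ==> q over the finite domain
    return all(q(k, l, n) for k, l, n in product(DOM, repeat=3) if p(k, l, n))

def w0(k, l, n):        # strength condition: array bounds for A[K]
    return 0 <= k < n

def step(w):            # W' = W0 /\ (K + 1 < L ==> W(r1(x)))
    return lambda k, l, n: w0(k, l, n) and (not (k + 1 < l) or w(k + 1, l, n))

w = lambda k, l, n: True
w_next = w0
while not implies(w, w_next):
    w, w_next = w_next, step(w_next)

# On exit, w is the weakest invariant over this finite domain: it implies
# the bounds condition and is preserved by the loop body.
```

Over the finite domain the chain of Wj must stabilize, so the loop terminates; over the integers this is exactly the non-termination risk discussed on the next slide.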
34. Induction Iteration.
- The sequence of Wj approaches the weakest invariant from above
- The predicate Wj can quickly become very large
  - Checking W ⇒ W' becomes harder and harder
- This is not guaranteed to terminate
- [Diagram: W0, W1, W2, W3 descending in the lattice between true and false]
35. Induction Iteration. Example.
- Consider that the strength condition is
  - I1 ⇒ K ≥ 0 ∧ K < len(A) (array bounds)
- We compute the Ws
  - W0 = K ≥ 0 ∧ K < len(A)
  - W1 = W0 ∧ (K + 1 < L ⇒ K + 1 ≥ 0 ∧ K + 1 < len(A))
  - W2 = W0 ∧ (K + 1 < L ⇒
    (K + 1 ≥ 0 ∧ K + 1 < len(A) ∧
    (K + 2 < L ⇒ K + 2 ≥ 0 ∧
    K + 2 < len(A))))
- [Same flowchart as before: initialization r0, loop body r1, return S]
36. Induction Iteration. Strengthening.
- We can try to strengthen the inductive invariant
- Instead of
  - Wj = W0 ∧ (K + 1 < L ⇒ Wj−1(r1(x)))
- we compute
  - Wj = strengthen (W0 ∧ (K + 1 < L ⇒ Wj−1(r1(x))))
  - where strengthen(P) ⇒ P
- We still have Wj ⇒ Wj−1, and we stop when Wj−1 ⇒ Wj
- The result is still an invariant that satisfies 2 and 3
37. Strengthening Heuristics
- One goal of strengthening is simplification
  - Drop disjuncts: P1 ∨ P2 → P1
  - Drop implications: P1 ⇒ P2 → P2
- A good idea is to try to eliminate variables changed in the loop body
- If Wj does not depend on variables changed by r1 (e.g. K, S)
  - Wj+1 = W0 ∧ (K + 1 < L ⇒ Wj(r1(x)))
  - = W0 ∧ (K + 1 < L ⇒ Wj)
  - Now Wj ⇒ Wj+1 and we are done!
38. Induction Iteration. Strengthening.
- We are still in the strong-enough area
- We are making bigger steps
  - And we might overshoot the weakest invariant
- We might also fail to find any invariant
  - But we do so quickly
- [Diagram: the strengthened W'1, W'2 descend faster than W0, W1, W2, W3 in the lattice between true and false]
39. One Strengthening Heuristic for Integers
- Rewrite Wj in conjunctive normal form
  - W1 = K ≥ 0 ∧ K < len(A) ∧ (K + 1 < L ⇒ K + 1 ≥ 0 ∧ K + 1 < len(A))
  - = K ≥ 0 ∧ K < len(A) ∧ (K + 1 ≥ L ∨ K + 1 < len(A))
- Take each disjunction containing arithmetic literals
- Negate it and obtain a conjunction of arithmetic literals
  - K + 1 < L ∧ K + 1 ≥ len(A)
- Weaken the result by eliminating a variable (preferably a loop-modified variable)
  - E.g., add the two literals: L > len(A)
- Negate the result and get another disjunction
  - L ≤ len(A)
- W1' = K ≥ 0 ∧ K < len(A) ∧ L ≤ len(A) (check that W1' ⇒ W2)
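The final check can be done by brute force over small integer ranges, a stand-in for the theorem-prover call; encoding the predicates as Python functions over (K, L, len(A)) is an assumption of the sketch:

```python
# Sanity check that the strengthened candidate
#   W1' = K >= 0 /\ K < len(A) /\ L <= len(A)
# implies W2 from the example, i.e. the heuristic did not lose invariance.
from itertools import product

def w0(k, l, n):
    return 0 <= k < n
def w1(k, l, n):
    return w0(k, l, n) and (not (k + 1 < l) or w0(k + 1, l, n))
def w2(k, l, n):
    return w0(k, l, n) and (not (k + 1 < l) or w1(k + 1, l, n))
def w1s(k, l, n):                    # the strengthened W1'
    return 0 <= k < n and l <= n

ok = all(w2(k, l, n)
         for k, l, n in product(range(-3, 8), repeat=3)
         if w1s(k, l, n))
```

Since W1' no longer mentions the loop-modified variable K in its new conjunct L ≤ len(A), the iteration stops here, exactly as the variable-elimination heuristic of slide 37 predicts.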
40. Induction Iteration
- We showed a way to compute invariants algorithmically
  - Similar to fixed-point computation in domains
  - Similar to abstract interpretation on the lattice of predicates
- Then we discussed heuristics that improve the termination properties
  - Similar to widening in abstract interpretation
41. Theorem Proving. Conclusions.
- Theorem proving strengths
  - Very expressive
- Theorem proving weaknesses
  - Too ambitious
- A great toolbox for software analysis
  - Symbolic evaluation
  - Decision procedures
- Related to program analysis
  - Abstract interpretation on the lattice of predicates