Title: Open Source Model Checking Radu Grosu SUNY at Stony Brook
1Open Source Model Checking
Radu Grosu, SUNY at Stony Brook
- Joint work with X. Huang, S. Jain and S. A. Smolka
2GCC Compiler
- Early stages: a modest C compiler.
- Translation: source code translated directly to RTL.
- Optimization: at the low RTL level.
- High-level information lost: calls, structures, fields, etc.
- Nowadays: a full-blown, multi-language compiler generating code for more than 30 architectures.
- Input: C, C++, Objective-C, Fortran, Java and Ada.
- Tree-SSA: added the GENERIC, GIMPLE and SSA ILs.
- Optimization: at the GENERIC, GIMPLE, SSA and RTL levels.
- Verification: the Tree-SSA API is suitable for verification, too.
3GCC Compilation Process
4C Program and its GIMPLE IL
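C program:
  int main() {
    int a, b, c;
    a = 5;
    b = a + 10;
    c = a + foo(a, b);
    if (a > c)
      c = b++ / a + b * a;
    bar(a, b, c);
  }
GIMPLE IL:
  int main {
    int a, b, c;
    int T1, T2, T3, T4;
    a = 5;
    b = a + 10;
    T1 = foo(a, b);
    T2 = a + T1;
    if (a > T2) goto fi;
    T3 = b / a;
    T4 = b * a;
    c = T2 + T3;
    b = b + 1;
  fi:
    bar(a, b, c);
  }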
5Associated GIMPLE CFG
6GCC Model Checking (GMC)
- GMC: a suite of analysis and verification tools we are developing for the Tree-SSA level of GCC. Currently:
- Intra-procedural slicer; inter-procedural slicing is in progress.
- Symbolic execution engine for Boolean C programs.
- Interpreter: traverses the CFG using Tree-SSA iterators.
- Monte Carlo MC (GMC2): OSE, randomized algorithm for LTL MC.
- GMC2: a newly developed technique that uses the theory of geometric random variables, statistical hypothesis testing and random sampling of lassos.
7LTL MC: Finding Accepting Lassos
[Figure: lassos and their computation tree (CT); labels: recurrence diameter, LTL]
- Explore all lassos in the CT.
- DDFS, SCC: time efficient.
- DFS: memory efficient.
8Randomized Algorithms
- The next step the algorithm takes may depend on a random choice (coin flip).
- Benefits: simplicity, efficiency, and symmetry breaking.
- Monte Carlo: may produce an incorrect result, but with bounded error probability.
- Example: election result prediction.
- Las Vegas: always gives the correct result, but its running time is a random variable.
- Example: randomized Quicksort.
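The two flavors can be contrasted with a minimal Python sketch. Everything here is illustrative and not from the talk: the function names, the hypothetical electorate with 60% support, and the sample size are assumptions. `randomized_quicksort` is Las Vegas (always sorted, random running time), while `estimate_support` is Monte Carlo (the estimate can be wrong, with small probability).

```python
import random

def randomized_quicksort(items, rng):
    """Las Vegas: the output is always sorted; only the running time is random."""
    if len(items) <= 1:
        return list(items)
    pivot = rng.choice(items)          # the coin flip: a random pivot
    smaller = [x for x in items if x < pivot]
    equal = [x for x in items if x == pivot]
    larger = [x for x in items if x > pivot]
    return randomized_quicksort(smaller, rng) + equal + randomized_quicksort(larger, rng)

def estimate_support(votes, sample_size, rng):
    """Monte Carlo: poll a random sample; the estimate may be off."""
    sample = [rng.choice(votes) for _ in range(sample_size)]
    return sum(sample) / sample_size

rng = random.Random(0)
data = [5, 3, 8, 1, 9, 2, 8]
sorted_data = randomized_quicksort(data, rng)

votes = [1] * 600 + [0] * 400          # hypothetical electorate: 60% support
estimate = estimate_support(votes, 1000, rng)
```

Increasing `sample_size` shrinks the error probability of the poll, which is the same trade-off the Monte Carlo model checker exploits later in the talk.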
9Monte Carlo Approach
[Figure: lassos and their computation tree (CT); labels: recurrence diameter, LTL, "flip a k-sided coin"]
- Explore $N(\epsilon, \delta)$ independent lassos in the CT.
- Error margin $\epsilon$ and confidence ratio $\delta$.
10Bernoulli Random Variable Z (coin flip)
Probability mass function:
$p(1) = P[Z = 1] = p_Z = 1/8$
$p(0) = P[Z = 0] = q_Z = 7/8$
11Geometric Random Variable
- Value of geometric RV X with parameter $p_Z$:
- No. of independent lassos until success.
- Probability mass function:
- $p(N) = P[X = N] = q_Z^{N-1} p_Z$
- Cumulative distribution function:
- $F(N) = P[X \le N] = \sum_{i \le N} p(i) = 1 - q_Z^N = 1 - (1 - p_Z)^N$
12How Many Lassos?
- Requiring $1 - (1 - p_Z)^N = 1 - \delta$ yields
- $N = \ln(\delta) / \ln(1 - p_Z)$
- Lower bound on the number of trials N needed to achieve success with confidence ratio $\delta$.
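The step from the requirement to the bound can be spelled out as follows. The inequality form is an addition here (the slide states the boundary case), and the direction flips in the last step because $\ln(1 - p_Z) < 0$ for $0 < p_Z < 1$.

```latex
\begin{align*}
1 - (1 - p_Z)^N &\ge 1 - \delta \\
(1 - p_Z)^N &\le \delta \\
N \ln(1 - p_Z) &\le \ln \delta \\
N &\ge \frac{\ln \delta}{\ln(1 - p_Z)}
\end{align*}
```

For instance, with $p_Z = 1/8$ (the coin of the Bernoulli slide) and an assumed $\delta = 0.1$, the bound is about 17.2, so 18 lassos suffice.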
13What If $p_Z$ Is Unknown?
- Requiring $p_Z \ge \epsilon$ yields
- $M = \ln(\delta) / \ln(1 - \epsilon) \ge N = \ln(\delta) / \ln(1 - p_Z)$
- and therefore $P[X \le M] \ge 1 - \delta$
- Lower bound on the number of trials M needed to achieve success with confidence ratio $\delta$ and error margin $\epsilon$.
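A small Python sketch of this bound, under stated assumptions: the function names are hypothetical, and rounding M up to an integer is an addition here (the slides give the real-valued bound). It checks that M trials reach confidence 1 - delta when the success probability equals the error margin, and therefore also when it is larger.

```python
import math

def num_lassos(eps, delta):
    """Smallest integer M with 1 - (1 - eps)**M >= 1 - delta."""
    if not (0.0 < eps < 1.0 and 0.0 < delta < 1.0):
        raise ValueError("eps and delta must lie strictly between 0 and 1")
    return math.ceil(math.log(delta) / math.log(1.0 - eps))

def success_probability(p, trials):
    """P[X <= trials] for a geometric RV X with success probability p."""
    return 1.0 - (1.0 - p) ** trials

# Assumed example values: error margin 1/8, confidence ratio 0.1.
m = num_lassos(eps=0.125, delta=0.1)
at_margin = success_probability(0.125, m)     # p_Z exactly at the margin
above_margin = success_probability(0.5, m)    # any larger p_Z does at least as well
```

With these values `m` comes out as 18. Note that M depends only on the error margin and the confidence ratio, not on the size of the automaton.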
14Statistical Hypothesis Testing
- Null hypothesis $H_0$: $p_Z \ge \epsilon$
- Alternative hypothesis $H_1$: $p_Z < \epsilon$
- If no success after M trials, then reject $H_0$
- Type I error: $\alpha = P[X > M \mid H_0] < \delta$
- Since $P[X \le M \mid H_0] \ge 1 - \delta$
15Monte Carlo Model Checking (MC2)
input: $B = (\Sigma, Q, Q_0, \delta, F)$, $\epsilon$, $\delta$
$N = \ln(\delta) / \ln(1 - \epsilon)$
for (i = 1; i <= N; i++)
  if (RL(B) == 1) return (1, error-trace)
return (0, reject $H_0$ with $\alpha = \Pr[X > N \mid H_0] < \delta$)
where RL(B) performs a uniform random walk through B to obtain a random lasso.
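The loop above can be sketched in Python on an explicit graph. This is an illustration under assumptions, not the GMC2 implementation: the automaton is a plain successor dictionary, every state is assumed to have at least one successor, N is rounded up to an integer, and the names `random_lasso` and `mc2` are hypothetical.

```python
import math
import random

def random_lasso(succ, init, accepting, rng):
    """Uniform random walk from a random initial state until a state repeats.

    Returns (is_accepting, lasso); the lasso is the list of visited states
    followed by the repeated state that closes the cycle.
    """
    state = rng.choice(init)
    position = {}                        # state -> index on the current path
    path = []
    while state not in position:
        position[state] = len(path)
        path.append(state)
        state = rng.choice(succ[state])  # assumes every state has a successor
    cycle = path[position[state]:]
    return any(s in accepting for s in cycle), path + [state]

def mc2(succ, init, accepting, eps, delta, seed=0):
    """Return (True, lasso) if an accepting lasso is sampled, else (False, None)."""
    rng = random.Random(seed)
    n = math.ceil(math.log(delta) / math.log(1.0 - eps))
    for _ in range(n):
        found, lasso = random_lasso(succ, init, accepting, rng)
        if found:
            return True, lasso           # counterexample (error trace)
    return False, None                   # reject H0: p_Z >= eps

# Hypothetical 3-state automaton: state 2 sits on an accepting self-loop.
succ = {0: [1, 2], 1: [1], 2: [2]}
found, lasso = mc2(succ, init=[0], accepting={2}, eps=0.1, delta=0.01)
```

Only the current path is stored, so memory grows with the length of one lasso rather than with the number of states.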
16GCC MC2 (GMC2)
- Input: a set of CFGs.
- Main function: a specifically designated CFG.
- Random walks in the Büchi automaton: generated on-the-fly.
- Initial state: of the main routine, plus bookkeeping information.
- Next state: choose a process, call the interpreter on its CFG.
- Processes: created by using the fork primitive.
- Optimization: the interpreter returns only upon a context switch.
- Lassos: detected by using a hierarchic hash table.
- Local variables: removed upon return from a procedure.
17Program State
[Figure: structure of a program state]
- Shared variables valuation (channels, semaphores)
- List of process states: p1, p2, p3
- Each process state: control state, data state
- Control state: CFG name, statement
18Program State
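[Figure: structure of a program state, data state expanded]
- Shared variables valuation (channels, semaphores)
- List of process states: p1, p2, p3
- Each process state: control state, data state
- Data state: heap, global variables valuation, frame stack
- Frame stack: frames f1, f2
- Each frame: return control state, local variables valuation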
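One way to picture this state layout is as nested immutable records, so that a whole program state can be hashed and looked up when detecting lassos. The Python sketch below is an assumption-laden illustration: the class and field names are hypothetical, valuations are encoded as tuples of pairs, and a single flat hash stands in for the hierarchic hash table mentioned in the talk, whose design is not shown.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class ControlState:
    cfg_name: str       # which CFG (function) is executing
    statement: int      # index of the current statement (hypothetical encoding)

@dataclass(frozen=True)
class Frame:
    return_control: ControlState
    local_vars: Tuple[Tuple[str, int], ...]

@dataclass(frozen=True)
class ProcessState:
    control: ControlState
    heap: Tuple[Tuple[int, int], ...]
    global_vars: Tuple[Tuple[str, int], ...]
    frame_stack: Tuple[Frame, ...]

@dataclass(frozen=True)
class ProgramState:
    shared_vars: Tuple[Tuple[str, int], ...]
    processes: Tuple[ProcessState, ...]

# Two structurally equal states hash alike, so a set detects the revisit.
main0 = ControlState("main", 0)
p1 = ProcessState(main0, heap=(), global_vars=(("x", 0),), frame_stack=())
s1 = ProgramState(shared_vars=(("sem", 1),), processes=(p1,))
s2 = ProgramState(shared_vars=(("sem", 1),), processes=(p1,))
p1_next = ProcessState(ControlState("main", 1), (), (("x", 0),), ())
s3 = ProgramState(shared_vars=(("sem", 1),), processes=(p1_next,))
visited = {s1}
```

Frozen dataclasses built only from hashable fields get value-based equality and hashing, which is what makes the `visited` lookup work.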
19Interpreter
- Interprets GIMPLE statements according to their semantics. Interesting cases:
- Inter-procedural: call(), return(). Manipulate the frame stack.
- Catches and interprets function calls to various modeling and concurrency primitives:
- Modeling: toss(), assert(). Nondeterminism and checks.
- Processes: fork(). Manipulates the process list.
- Communication: send(), recv(). Manipulate shared vars. May involve a context switch.
20Results TCAS
21DPh Symmetric Fair Version
(Deadlock freedom)
22Needham-Schroeder Protocol
- Quite sophisticated C implementation.
- However, of a sequential nature:
- Essentially executes only one round of a reactive system.
23Related Work
- Software model checkers for concurrent C/C++:
- VeriSoft, Spin, Blast (Slam), Magic, C-Wolf. Bogor?
- Cooperative Bug Isolation (Liblit, Naik, Zheng):
- Compile-time instrumentation. Distribute binaries/collect bugs.
- Statistical analysis to isolate erroneous code segments.
- Random interpretation (Gulwani, Necula):
- Execute random paths and merge with random linear operators.
- Monte Carlo and abstract interpretation (Monniaux):
- Analyze programs with probabilistic and nondeterministic input.
24Conclusions
- Presented GMC2: a software MC for GCC based on Monte Carlo MC.
- At Tree-SSA level: applicable to C, C++, Ada, Java, etc.
- Open source: freely available for usage/critique/extension.
- Ongoing and future work: create a software MC branch of GCC, which also includes:
- Automated abstraction/refinement/interpolation techniques.
- Currently we manually apply a form of bounded-range abstraction (e.g. in TCAS).
25Talk Outline
- Model Checking
- Randomized Algorithms
- LTL Model Checking
- Probability Theory Primer
- Monte Carlo Model Checking
- Implementation and Results
- Conclusions and Open Problem
26Linear Temporal Logic
- An LTL formula is made up inductively of:
- atomic propositions p, boolean connectives $\wedge$, $\vee$, $\neg$
- temporal modalities X (neXt) and U (Until).
- Safety: "nothing bad ever happens"
- E.g. $G(\neg(pc_1 = cs \wedge pc_2 = cs))$, where G is a derived modality (Globally).
- Liveness: "something good eventually happens"
- E.g. $G(req \rightarrow F\, serviced)$, where F is a derived modality (Finally).
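For reference, the two derived modalities can be expressed through U in the standard textbook way (these definitions are not taken from the slide itself):

```latex
\begin{align*}
F\,\varphi &\equiv \mathit{true} \; U \; \varphi \\
G\,\varphi &\equiv \neg F\,\neg\varphi
\end{align*}
```

So F says that the formula holds at some point along the path, and G says that there is no point at which it fails.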
27Model Checking
- S is a nondeterministic/concurrent system.
- $\varphi$ is a temporal logic formula.
- in our case Linear Temporal Logic (LTL).
- Basic idea: intelligently explore S's state space in an attempt to establish $S \models \varphi$.
28LTL Model Checking
- Every LTL formula $\varphi$ can be translated to a Büchi automaton $B_\varphi$ such that $L(\varphi) = L(B_\varphi)$.
- Automata-theoretic approach:
- $S \models \varphi$ iff $L(B_S) \subseteq L(B_\varphi)$ iff $L(B_S \times B_{\neg\varphi}) = \emptyset$
- Checking non-emptiness is equivalent to finding a reachable accepting cycle (lasso).
29Emptiness Checking
- Checking non-emptiness is equivalent to finding an accepting cycle reachable from an initial state (lasso).
- The Double Depth-First Search (DDFS) algorithm can be used to search for such cycles, and this can be done on-the-fly!
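A minimal Python sketch of the classic nested (double) depth-first search, assuming an explicit successor dictionary and a single initial state; the function name is hypothetical and the recursive form is only suitable for small graphs. The outer search explores the graph and, in postorder, launches an inner search from each accepting state to look for a cycle back to it.

```python
def has_accepting_lasso(succ, init, accepting):
    """Nested DFS: True iff an accepting cycle is reachable from init."""
    visited_outer = set()
    visited_inner = set()    # shared across all inner searches

    def inner(state, seed):
        # Second search: look for a path from seed back to seed.
        visited_inner.add(state)
        for nxt in succ[state]:
            if nxt == seed:
                return True
            if nxt not in visited_inner and inner(nxt, seed):
                return True
        return False

    def outer(state):
        # First search: explore successors first, then (in postorder)
        # start the inner search if this state is accepting.
        visited_outer.add(state)
        for nxt in succ[state]:
            if nxt not in visited_outer and outer(nxt):
                return True
        return state in accepting and inner(state, state)

    return outer(init)

# Hypothetical automata for illustration.
branching = {0: [1, 2], 1: [1], 2: [2]}
ring = {0: [1], 1: [2], 2: [0]}
```

Replacing the dictionary lookup `succ[state]` with a function that computes successors on demand gives the on-the-fly variant.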
30Randomized Algorithms
- Huge impact on CS: (distributed) algorithms, complexity theory, cryptography, etc.
- The next step the algorithm takes may depend on a random choice (coin flip).
- Benefits of randomization include simplicity, efficiency, and symmetry breaking.
31Lassos Probability Space
- Sample space: lassos in $B_S \times B_{\neg\varphi}$
- Bernoulli random variable Z:
- Outcome = 1 if the randomly chosen lasso is accepting
- Outcome = 0 otherwise
- $p_Z = \sum_i p_i Z_i$ (expectation of an accepting lasso)
- where $p_i$ is the lasso probability (uniform random walk)
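For a tiny automaton the sum can be computed exactly by enumerating every lasso. The Python sketch below is illustrative only: the function name and the example graphs are hypothetical, a single initial state is assumed, and the enumeration is exponential in general, which is precisely why the talk samples lassos instead.

```python
from fractions import Fraction

def accepting_lasso_probability(succ, init, accepting):
    """Sum of p_i over accepting lassos, under a uniform random walk."""
    def walk(state, path, prob):
        if state in path:
            # The walk closed a cycle: this is one complete lasso.
            cycle = path[path.index(state):]
            is_accepting = any(s in accepting for s in cycle)
            return prob if is_accepting else Fraction(0)
        total = Fraction(0)
        for nxt in succ[state]:
            # Each successor is chosen with probability 1 / out-degree.
            total += walk(nxt, path + [state], prob / len(succ[state]))
        return total

    return walk(init, [], Fraction(1))

# Hypothetical automata for illustration.
branching = {0: [1, 2], 1: [1], 2: [2]}
p_branching = accepting_lasso_probability(branching, 0, {2})

looping = {0: [0, 1], 1: [0]}
p_looping = accepting_lasso_probability(looping, 0, {1})
```

In both examples exactly one of two equally likely lassos is accepting, so each probability is 1/2.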
32Bernoulli Random Variable (coin flip)
- Value of Bernoulli RV Z:
- Z = 1 (success), Z = 0 (failure)
- Probability mass function:
- $p(1) = \Pr[Z = 1] = p_Z$
- $p(0) = \Pr[Z = 0] = 1 - p_Z = q_Z$
- Expectation: $E[Z] = p_Z$
33Statistical Hypothesis Testing
- Example: given a fair and a biased coin.
- Null hypothesis $H_0$: fair coin selected.
- Alternative hypothesis $H_1$: biased coin selected.
- Hypothesis testing: perform N trials.
- If the number of heads is LOW, reject $H_0$.
- Else fail to reject $H_0$.
34Statistical Hypothesis Testing
35Random Lasso (RL) Algorithm
36Correctness of MC2
- Theorem: Given a Büchi automaton B, error margin $\epsilon$, and confidence ratio $\delta$, if MC2 rejects $H_0$, then its type I error has probability
- $\alpha = P[X > M \mid H_0] < \delta$
37Complexity of MC2
- Theorem: Given a Büchi automaton B having diameter D, error margin $\epsilon$, and confidence ratio $\delta$, MC2 runs in time $O(N \cdot D)$ and uses space $O(D)$, where $N = \ln(\delta) / \ln(1 - \epsilon)$.
Cf. DDFS, which runs in $O(2^{|S| + |\varphi|})$ time for $B = B_S \times B_{\neg\varphi}$.
38Alternative Sampling Strategies
- Multilasso sampling: ignores backedges that do not lead to an accepting lasso.
$\Pr[L_n] = O(2^{-n})$
- Probabilistic systems: there is a natural way to assign a probability to a RL.
- Input partitioning: partition input into classes that trigger the same behavior (guards).