How Many Paths must a Test Walk Down

1
How Many Paths must a Test Walk Down?
Willem Visser with Corina Pasareanu
NASA Ames Research Center
2
What is Symbolic Execution?

Let's ask you-know-who

3
What is Symbolic Execution?

Static Analysis Technique
Executes code in a non-standard way
Instead of concrete inputs, symbolic values are
manipulated
At each program location, the state of the system
is defined by
The current assignments to the symbolic inputs
and local variables
A path condition that must hold for the execution
to reach this location
At each branch in the code, both paths must be
followed
On the true branch the condition is added to the
path condition
On the false branch the negation of the
condition is added to the path condition
If a branch is infeasible, then execution along
that branch is terminated
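
The branching rule above can be sketched in a few lines of Java. This is only an illustration: the names SymState, assume and fork are hypothetical, constraints are kept as plain strings, and no feasibility check is made (a real tool would ask a decision procedure before following a branch).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration: a symbolic state reduced to its path condition.
final class SymState {
    final List<String> pathCondition;

    SymState(List<String> pathCondition) {
        this.pathCondition = List.copyOf(pathCondition);
    }

    // Returns a new state whose path condition has one more conjunct.
    SymState assume(String constraint) {
        List<String> pc = new ArrayList<>(pathCondition);
        pc.add(constraint);
        return new SymState(pc);
    }

    // At a branch both successors are produced: index 0 is the true branch
    // (condition added), index 1 the false branch (negation added).
    List<SymState> fork(String condition) {
        return List.of(assume(condition), assume("!(" + condition + ")"));
    }

    @Override
    public String toString() {
        return pathCondition.isEmpty() ? "true" : String.join(" && ", pathCondition);
    }
}

class ForkDemo {
    public static void main(String[] args) {
        SymState start = new SymState(List.of());   // PC: true
        for (SymState s : start.fork("X > Y")) {
            System.out.println("PC: " + s);
        }
    }
}
```

Keeping states immutable means each branch gets its own path condition, so backtracking to the other branch needs no undo step.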

4
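Symbolic Execution: Walking Many Paths at Once
Concrete execution:
pres = 460; pres_min = 640; pres_max = 960;
if ((pres < pres_min) || (pres > pres_max)) { ... } else { ... }

Symbolic execution:
pres = X; pres_min = MIN; pres_max = MAX; PC: TRUE
if ((pres < pres_min) || (pres > pres_max)) { ... } else { ... }
True branch: PC: X < MIN || X > MAX
Else branch: PC: X >= MIN && X <= MAX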
5
Concrete Execution Path (example)
x = 1, y = 0
1 >? 0
x = 1 + 0 = 1
y = 1 - 0 = 1
x = 1 - 1 = 0
0 - 1 >? 0

int x, y;
if (x > y) {
  x = x + y;
  y = x - y;
  x = x - y;
  if (x - y > 0)
    // Reachable?
}
6
Symbolic Execution Tree (example)
x = X, y = Y

int x, y;
if (x > y) {
  x = x + y;
  y = x - y;
  x = x - y;
  if (x - y > 0)
    // Reachable?
}
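
A small sketch of how the symbolic run answers the "Reachable?" question. It assumes the example is the usual swap by arithmetic (x = x + y; y = x - y; x = x - y) guarded by x > y, which is what the concrete trace on the previous slide suggests, and it ignores integer overflow. The classes Lin and SwapSymbolic are hypothetical, and the infeasibility test is a sufficient check for this one pattern, not a general decision procedure.

```java
// Linear expression a*X + b*Y + c over the symbolic inputs X and Y.
final class Lin {
    final int a, b, c;
    Lin(int a, int b, int c) { this.a = a; this.b = b; this.c = c; }
    Lin plus(Lin o)  { return new Lin(a + o.a, b + o.b, c + o.c); }
    Lin minus(Lin o) { return new Lin(a - o.a, b - o.b, c - o.c); }
    @Override public String toString() { return a + "*X + " + b + "*Y + " + c; }
}

class SwapSymbolic {
    // "e1 > 0" and "e2 > 0" cannot both hold when e1 + e2 is a constant <= 0.
    // Sufficient check only; anything else is reported as possibly feasible.
    static boolean jointlyInfeasible(Lin e1, Lin e2) {
        Lin sum = e1.plus(e2);
        return sum.a == 0 && sum.b == 0 && sum.c <= 0;
    }

    public static void main(String[] args) {
        Lin x = new Lin(1, 0, 0);   // x = X
        Lin y = new Lin(0, 1, 0);   // y = Y
        Lin outer = x.minus(y);     // outer branch taken: X - Y > 0
        x = x.plus(y);              // x = X + Y
        y = x.minus(y);             // y = X
        x = x.minus(y);             // x = Y
        Lin inner = x.minus(y);     // inner branch needs: Y - X > 0
        System.out.println("x = " + x + ", y = " + y);
        System.out.println("inner branch infeasible: " + jointlyInfeasible(outer, inner));
    }
}
```

After the three assignments the values are swapped (x holds Y, y holds X), so the inner condition Y - X > 0 contradicts the outer condition X - Y > 0 and that path is terminated.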
7
For this we believe
symbolic execution for testing programs is a
more exploitable technique in the short term than
the more general one of program verification
James King, CACM 19(7), 1976
8
Symbolic Execution Timeline
9
Generalized Symbolic Execution
void foo(Node n1, Node n2) {
  if (n1 != null && n2 != null) {
    n1.x = 2;
    n2.x = 3;
    assert n1.x == 2 && n2.x == 3;
  }
}

Deals with Object-oriented programs
Fields
Inheritance
Aliasing!
Uses lazy initialization of Objects
Only creates objects when they are required
No bound on the number of objects
Concurrency

10
Observation from King Paper
The symbolic execution of IF statements requires
theorem proving which, even for modest
programming languages, is mechanically
impossible.

Although true, it didn't account for two things:
Moore's Law
Improvements in decision procedure efficiency
Driven by the great strides made by SAT solvers

11
Decision Procedures Everywhere

Satisfiability Modulo Theories Competition
(SMT-COMP)
Competition between DPs for all kinds of theories
(integer, float, bitvectors etc.)
Participants in 2006
Ario 1.2
Barcelogic 1.1
CVC
CVC3
ExtSAT
HTP
Jat
MathSAT 3.4
NuSMV
Sateen
STP
Yices 1.0
Doesn't even include commercial bigshots such as Prover Technology

12
Symbolic Execution Everywhere

Static Analysis
Prefix
Intrinsa, now Microsoft
Testing
DART and SMART
Lucent
JTest
Parasoft
Model Checking
XRT
Microsoft
JPF
NASA
Only industry tools are listed, and this is a woefully inadequate list at best!

13
Example using Java PathFinder
public class SymExample {
  // Node has an int x field and a Node next field
  public static void foo(Node n1, Node n2) {
    if (n1 != null && n2 != null) {
      n1.x = 2;
      n2.x = 3;
      assert n1.x == 2 && n2.x == 3;
    }
  }

  public static void main(String[] args) {
    Node n1 = Node._get_Node(); // get Symbolic Node
    Node n2 = Node._get_Node(); // get Symbolic Node
    try {
      foo(n1, n2);
      System.out.println("No violation for n1 " + n1 + " and n2 " + n2);
    } catch (AssertionError ae) {
      System.out.println("Found violation for n1 " + n1 + " and n2 " + n2);
    }
  }
}
14
Example Output
No violation for n1 null and n2 null
No violation for n1 null and n2 Node_43844 (x , next )
No violation for n1 Node_43844 (x , next ) and n2 null
No violation for n1 Node_43844 (x , next ) and n2 Node_43791 (x , next )
Found violation for n1 Node_43844 (x , next ) and n2 Node_43844 (x , next )

if (n1 != null && n2 != null) {
  n1.x = 2;
  n2.x = 3;
  assert n1.x == 2 && n2.x == 3;
}
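
The five cases in this output can be reproduced without JPF by enumerating the input shapes by hand. The sketch below is a plain Java approximation of what lazy initialization explores for foo; the class AliasDemo and its methods are hypothetical, and the assert is replaced by a boolean result so the behavior does not depend on the -ea flag.

```java
import java.util.ArrayList;
import java.util.List;

class AliasDemo {
    static final class Node { int x; Node next; }

    // Same logic as foo, but returns false where the assert would fail.
    static boolean fooHolds(Node n1, Node n2) {
        if (n1 != null && n2 != null) {
            n1.x = 2;
            n2.x = 3;
            return n1.x == 2 && n2.x == 3;
        }
        return true;
    }

    // Enumerates n1 in {null, new} and n2 in {null, new, alias of n1}.
    static List<String> explore() {
        List<String> log = new ArrayList<>();
        for (boolean n1Null : new boolean[] {true, false}) {
            for (String n2Choice : new String[] {"null", "new", "alias"}) {
                // Aliasing a null n1 is the same case as n2 == null.
                if (n1Null && n2Choice.equals("alias")) continue;
                Node n1 = n1Null ? null : new Node();
                Node n2 = n2Choice.equals("null") ? null
                        : n2Choice.equals("new") ? new Node() : n1;
                String verdict = fooHolds(n1, n2) ? "No violation" : "Found violation";
                log.add(verdict + ": n1 " + (n1Null ? "null" : "new") + ", n2 " + n2Choice);
            }
        }
        return log;
    }

    public static void main(String[] args) {
        explore().forEach(System.out::println);
    }
}
```

Only the aliased case (n1 and n2 are the same object) breaks the assertion, because the write n2.x = 3 overwrites n1.x.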
15
What is Model Checking?
Oh no, we got downgraded
16
Analogy
The Web
Property
(no deadlock)
User Query
17
What is Java PathFinder (1)

explicit state model checker for Java bytecode
focus is on finding bugs in Java programs
concurrency related deadlocks, (races), missed
signals etc.
Java runtime related unhandled exceptions, heap
usage, (cycle budgets)
but also complex application specific assertions

18
What makes JPF interesting?

It can search the behaviors of a Java program in
an efficient fashion
Records all visited states and stops traversing a
path when it revisits a state
Uses various search heuristics
It treats non-deterministic behavior
Scheduling is non-deterministic in many Java
settings
The precise inputs to a program can be
non-deterministic
Data non-determinism is user definable
User can define the environment the program
executes in
However, to make JPF scale we also need symbolic
input data
Hence the extensions to do symbolic execution
Without symbolic execution the tool enumerates
all input values, which is often too large
One of the key issues is configurable extensibility: overcome scalability constraints with suitable customization (using heuristics)
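
The "records all visited states" idea can be sketched as a depth-first search with a visited set. This is a toy illustration, not JPF code: states are plain integers, and the class StateMatchingSearch and the successor function are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;
import java.util.function.IntFunction;

class StateMatchingSearch {
    // Depth-first search; returns the number of distinct states visited.
    static int explore(int initial, IntFunction<int[]> successors) {
        Set<Integer> visited = new HashSet<>();
        Deque<Integer> stack = new ArrayDeque<>();
        stack.push(initial);
        while (!stack.isEmpty()) {
            int state = stack.pop();
            // State matching: a state seen before is not expanded again.
            if (!visited.add(state)) continue;
            for (int next : successors.apply(state)) stack.push(next);
        }
        return visited.size();
    }

    public static void main(String[] args) {
        // Toy system: a counter modulo 8 with two nondeterministic moves.
        int n = explore(0, s -> new int[] {(s + 1) % 8, (s * 2) % 8});
        System.out.println("distinct states: " + n);
    }
}
```

Without the visited set the toy system has infinitely long paths; with it the search stops after each of the 8 states has been expanded once.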

19
JPF Status

developed at the Robust Software Engineering
Group at NASA Ames Research Center
currently in its fourth development cycle
v1 Spin/Promela translator - 1999
v2 backtrackable, state matching JVM - 2000
v3 extension infrastructure (listeners, MJI) -
2004
v4 symbolic execution, choice generators - 4Q
2005
open sourced since 04/2005 under NOSA 1.3
license
it's a first: no NASA system development had been hosted on a public site before
11100 downloads since publication in 04/2005

20
Implementation via Instrumentation
[Diagram: original program -> program instrumentation -> instrumented program -> model checking, together with the correctness specification; model checking consults a decision procedure (continue/backtrack) on the state; output: counterexample(s)/test suite (heap + constraint + thread scheduling)]
21
Handling Aliasing (illustration)
Consider executing next = t.next
22
Symbolic Execution in Testing

Symbolic execution gives constraints on the
inputs to reach a specific
Statement
Branch
Condition
Etc.
Using an instrumented program that will save the
test inputs whenever the testing criterion is
reached will allow automatic test generation

23
Red-Black Trees
Self-balancing Binary Search Trees (Java TreeMap Implementation)
(1) The root is BLACK
(2) Red nodes can only have black children
(3) All paths from a node to its leaves contain the same number of black nodes
(4) Acyclic
(5) Consistent Parents
24
repOk() Fragment
boolean repOk(Entry e) {
  // root has no parent, root is black,
  // RedHasOnlyBlackChildren
  workList = new LinkedList();
  workList.add(e);
  while (!workList.isEmpty()) {
    Entry current = (Entry) workList.removeFirst();
    Entry cl = current.left;
    Entry cr = current.right;
    if (current.color == RED) {
      if (cl != null && cl.color == RED) return false;
      if (cr != null && cr.color == RED) return false;
    }
    if (cl != null) workList.add(cl);
    if (cr != null) workList.add(cr);
  }
  // equal number of black nodes on left and right sub-tree
  return true;
}
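
A self-contained version of this fragment, as a sketch. It checks only three of the properties (root has no parent, root is black, red nodes have only black children); the black-height, acyclicity and parent-consistency checks are left out, so it must only be run on acyclic structures. The Entry class here is a hypothetical stand-in for the TreeMap entry.

```java
import java.util.LinkedList;

class RepOkSketch {
    static final boolean RED = false, BLACK = true;

    static final class Entry {
        final boolean color;
        Entry left, right, parent;
        Entry(boolean color) { this.color = color; }
    }

    // Partial repOk: root checks plus "red has only black children".
    static boolean repOk(Entry root) {
        if (root == null) return true;
        if (root.parent != null || root.color == RED) return false;
        LinkedList<Entry> workList = new LinkedList<>();
        workList.add(root);
        while (!workList.isEmpty()) {
            Entry current = workList.removeFirst();
            Entry cl = current.left;
            Entry cr = current.right;
            if (current.color == RED) {
                if (cl != null && cl.color == RED) return false;
                if (cr != null && cr.color == RED) return false;
            }
            if (cl != null) workList.add(cl);
            if (cr != null) workList.add(cr);
        }
        return true;
    }

    public static void main(String[] args) {
        Entry root = new Entry(BLACK);
        root.left = new Entry(RED);
        root.left.parent = root;
        System.out.println("black root, red child: " + repOk(root));
        root.left.left = new Entry(RED);
        root.left.left.parent = root.left;
        System.out.println("red child with red child: " + repOk(root));
    }
}
```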
25
Black-box Test Generation: Red-Black Trees

Symbolic execution of repOk()
Generate new structures only when repOk() returns
true
Limit the size of the structures generated
Only correct structures will be generated
repOk() returns true after all nodes in the tree
have been visited, hence they must all be
concrete
symbolic (partial) structures can fail repOk()

26
Symbolic Execution of repOk() Example
public static boolean repOk() {
  if (root == null) return true;
  if (root.color == RED) return false;
  ...
27
ISSTA 2004

Test Input Generation with Java PathFinder. W. Visser, C. Pasareanu, S. Khurshid
Paper also shows how to do white-box test
generation to obtain branch coverage using
symbolic execution
Whenever a branch is reached in the code the
input structure required to reach that code is
saved as a test input

28
ISSTA 2006

Generate test inputs for Java container classes
using sequences of API calls
Random, Model Checking, Symbolic Execution, etc.
Objective was to generate test cases to cover
basic blocks and predicates

29
Statement coverage is harder than you think

One of the basic blocks in the Binomial Heap
implementation required a minimum sequence of 13
API calls to be covered

private void merge(BinomialHeapNode binHeap) {
  BinomialHeapNode temp1 = Nodes, temp2 = binHeap;
  while ((temp1 != null) && (temp2 != null)) {
    if (temp1.degree == temp2.degree) {
      BinomialHeapNode tmp = temp2;
      temp2 = temp2.sibling;
      tmp.sibling = temp1.sibling;
      temp1.sibling = tmp;
      temp1 = tmp.sibling;
    } else {
      if (temp1.degree < temp2.degree) {
        if ((temp1.sibling == null) || (temp1.sibling.degree > temp2.degree)) {
          // HERE!
          ...
[Figure: path condition over the symbolic inputs X0 ... X11 for reaching the marked block; not recoverable from the transcript]
insert(X0) insert(X1) insert(X2) insert(X3) insert(X4) insert(X5) insert(X6) insert(X7) insert(X8) insert(X9) insert(X10) insert(X11) extractMin()

30
Orion Onboard Abort Executive

During ascent the OAE decides if an abort should
occur
It encodes 27 flight rules for when an abort should occur, and if one gets triggered
it decides which of 7 aborts it should pick
We applied JPF's symbolic execution to a prototype
Within 1 min it finds 23 tests that cover all the flight rules and all 7 of the aborts
In addition it found a case in which an abort should have occurred but none was selected
See the constraint below to get to this scenario!

[Path condition for this scenario. The comparison operators did not survive in the transcript; each input is shown as name(value) as printed, followed by the constants it is compared with.]
inputs_fr.navIN_fr.geod_alt(300000): 300000, 120000, 38000, 10000, 310000, 0
inputs_fr.sysIN_fr.cev_cm_cabin_pres_rate(-1): -1, -2, -3
inputs_fr.navIN_fr.las_jettison_cmd(-1000000): != 1, != 0
inputs_fr.navIN_fr.roll_rate(40): 40
inputs_fr.navIN_fr.pitch_rate(70): 70
inputs_fr.sysIN_fr.stage2_apu_volt(22): 22, 7
inputs_fr.navIN_fr.vmissmag(5): 5
inputs_fr.sysIN_fr.stage2_thrust(221922): 332883, 221922, 342883, 211922
inputs_fr.sysIN_fr.stage2_helium_tnk_pres(560): 560, 1040, 360
inputs_fr.sysIN_fr.stage2_lh2_tnk_pres(26): 26, 21
inputs_fr.sysIN_fr.stage2_lox_tnk_pres(17): 17, 12
inputs_fr.sysIN_fr.stage2_hpft_speed(28288): 28288, 26288
inputs_fr.sysIN_fr.stage2_lpft_speed(12948): 12948, 11948
inputs_fr.sysIN_fr.stage2_hpot_speed(22496): 22496, 20496
inputs_fr.sysIN_fr.stage2_lpot_speed(4120): 4120, 3620
inputs_fr.sysIN_fr.stage2_efi_pres(4800): 4800, 4300
inputs_fr.sysIN_fr.stage1_tvc_actual(0): 0
inputs_fr.sysIN_fr.stage1_tvc_commanded(0): 0
((inputs_fr.sysIN_fr.stage1_tvc_actual(0) MINUS inputs_fr.sysIN_fr.stage1_tvc_commanded(0)) MULT 10): compared with (inputs_fr.sysIN_fr.stage1_tvc_commanded(0) MULT 3)
inputs_fr.sysIN_fr.stage1_chmbr_pres(640): 640, 440
theQueue.q7.spred.outputs_p.vgo_Mag(115): 515, 115
theQueue.q0.spred.outputs_p.vmissmag_eo_now(410): 410
inputs_fr.navIN_fr.yaw_rate(31): 31
inputs_fr.navIN_fr.yaw(100): 100
inputs_fr.navIN_fr.roll(100): 100
inputs_fr.navIN_fr.pitch(45): 45
inputs_fr.navIN_fr.inert_vel_mag(22000): 22000
31
Did someone say Loops?

How do we terminate our symbolic analysis?
We use the following termination conditions
Limit the depth that the model checker searches
to
Limit the number of calls made from the
environment
Limit the size of structures required
Model checkers work from one core idea, called
state matching
Whenever you see a state you have visited before,
you stop traversing that path
However, doing state matching over symbolic
states that include a path condition is very hard
We implemented a subsumption check for symbolic
states over container structures
Test Input Generation for Java Containers using State Matching. W. Visser, C. Pasareanu, R. Pelanek, ISSTA 2006
We also developed a heuristic for widening to compute loop invariants
Verification of Java Programs Using Symbolic Execution and Invariant Generation. C. Pasareanu and W. Visser, SPIN 2004
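
To give a feel for a subsumption check, here is a much simplified analogue. It is not the container-shape subsumption from the ISSTA 2006 paper: a symbolic state is reduced to a program location plus an interval of values for a single input, and a new state is dropped when a stored state at the same location covers it. The names SubsumptionStore and Interval are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class SubsumptionStore {
    // Closed interval [lo, hi] of values a symbolic input may take.
    static final class Interval {
        final int lo, hi;
        Interval(int lo, int hi) { this.lo = lo; this.hi = hi; }
        boolean contains(Interval o) { return lo <= o.lo && o.hi <= hi; }
    }

    private final Map<Integer, List<Interval>> seen = new HashMap<>();

    // Returns true if the state is new and must be explored,
    // false if an already stored state at this location subsumes it.
    boolean addIfNotSubsumed(int location, Interval values) {
        List<Interval> stored = seen.computeIfAbsent(location, k -> new ArrayList<>());
        for (Interval s : stored) {
            if (s.contains(values)) return false;
        }
        stored.add(values);
        return true;
    }

    public static void main(String[] args) {
        SubsumptionStore store = new SubsumptionStore();
        System.out.println(store.addIfNotSubsumed(1, new Interval(0, 10)));  // new state
        System.out.println(store.addIfNotSubsumed(1, new Interval(2, 5)));   // covered by [0, 10]
    }
}
```

Exact state matching would treat [2, 5] as a different state from [0, 10]; subsumption stops the path because every behavior of the smaller state is already covered by the larger one.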

32
Decision Procedure Support
JPF
Formula
satisfiable/unsatisfiable
Generic Decision Procedure Interface
Omega Maryland
CVCLite Stanford
Yices SRI
STP Stanford
33
A New Idea

Path-sensitive static analysis tools for finding runtime errors are now very popular
Coverity, KlocWork, etc.
This new generation of tools can miss bugs
Path-sensitive analyses tend to have this issue
But they produce far fewer false-positive warnings than the old abstract-interpretation-style defect detectors
Those never miss a bug, but they flood you with warnings that the user must classify as bugs or not
However, wouldn't it be nice to have actual test inputs to tell you whether something is a bug or not?
We combined symbolic execution and test input
generation to do this
Joint work with Aaron Tomb from UCSC

34
Codename Jitterbug
Java classes

Starts symbolic analysis at each method in the
class file(s)
When symbolic execution detects a possible error, it passes the error and the symbolic state to the test generator
From the current state and the path condition
generate a test to try and cover the error
Execute the test and check if the expected error
is triggered

[Pipeline diagram: Java classes -> Symbolic Execution (SOOT, CVCL) -> Warnings -> Test Generation (POOC) -> Test Cases]
35
Symbolic Execution Benefit
public class Example {
  public String hexAbs(int x) {
    String result = null;
    if (x > 0)
      result = Integer.toHexString(x);
    else if (x < 0)
      result = Integer.toHexString(-x);
    return result.toUpperCase();
  }
}
Dataflow Analysis: Warning: possible null dereference on line 8
Symbolic Execution: Error: null dereference on line 8 if x == 0
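
The path-sensitive verdict is easy to confirm by running the method. The sketch below assumes the method looks as the slide suggests: x > 0 and x < 0 each assign result, and x == 0 leaves it null. The body of the x < 0 branch is a guess based on the method name, and the class HexAbsDemo is hypothetical.

```java
class HexAbsDemo {
    // Assumed reconstruction of the slide's method; the x < 0 body is a guess.
    static String hexAbs(int x) {
        String result = null;
        if (x > 0)
            result = Integer.toHexString(x);
        else if (x < 0)
            result = Integer.toHexString(-x);
        return result.toUpperCase();   // null dereference only when x == 0
    }

    // Reports whether the call fails with a NullPointerException.
    static boolean throwsNpe(int x) {
        try {
            hexAbs(x);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        for (int x : new int[] {-255, 0, 255}) {
            System.out.println("x = " + x + ", NullPointerException: " + throwsNpe(x));
        }
    }
}
```

One input per path is enough here: the three path conditions x > 0, x < 0 and x == 0 partition the input space, and only the last one reaches the dereference with result still null.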
36
Small Example
public class ArrayBound {
  public void f(int n, int m) {
    int[] array = new int[m];
    for (int i = 0; i < n; i++)
      array[i] = 0;
  }
}
WARNING: possible array upper bound violation (f:5)
Symbolic state at time of warning
  Method:
  Instruction: array[i] = 0
  Line number: 5
  Depth: instruction 6, branch 1, pc 4
  Path condition: U1 0, 0 0 len(A0), len(A0) 0
  Parameter values: U0, U1
  This object: o0
  Local vars: i=0, m=U1, n=U0, this=o0, array=A0
  Solution (1): this = o0, param0 = 1, param1 = 0
Running 5 test(s)...
2) Solution (1): Testing ArrayBound.f
REAL? Caught expected exception java.lang.ArrayIndexOutOfBoundsException: 0
Occurred at ArrayBound.f:5
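
The "execute the test and check if the expected error is triggered" step can be sketched as a tiny harness. This is not the tool's generated test code: the class ConfirmWarning and the method confirms are hypothetical, and f is an assumed reconstruction of the method on this slide. The input n = 1, m = 0 is the solution reported in the warning (param0 = 1, param1 = 0).

```java
class ConfirmWarning {
    // Assumed reconstruction of the method under test.
    static void f(int n, int m) {
        int[] array = new int[m];
        for (int i = 0; i < n; i++) {
            array[i] = 0;
        }
    }

    // Runs a candidate test and reports whether the predicted exception occurs.
    static boolean confirms(Runnable test, Class<? extends RuntimeException> expected) {
        try {
            test.run();
            return false;                  // no error: the warning is not confirmed
        } catch (RuntimeException e) {
            return expected.isInstance(e); // a different exception does not count
        }
    }

    public static void main(String[] args) {
        boolean real = confirms(() -> f(1, 0), ArrayIndexOutOfBoundsException.class);
        System.out.println("warning confirmed: " + real);
    }
}
```

A warning whose test does not raise the predicted exception stays unconfirmed; it is not proven to be a false positive.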
37
Variably Interprocedural

Experience with runtime error detection tools suggests that a very large percentage of errors are intraprocedural
However, for object-oriented programs it is
common to access instance fields through accessor
methods
We made the level of interprocedural analysis an
input to the tool

public void foo(int m) {
  m = answer(m);
  m = m/(1-m);
}
private int answer(int v) {
  return v == 42 ? 1 : 0;
}
38
Deals with Objects and Fields
public class Node {
  public int value;
  public Node next;
  public Node swap() {
    if (next != null)
      if (value > next.value) {
        Node t = next;
        next = t.next;
        t.next = this;
        return t;
      }
    return this;
  }
}
WARNING: possible null dereference (swap:8)
Symbolic state at time of warning
  Method:
  Instruction: i1 = r0.
  Line number: 8
  Depth: instruction 3, branch 0, pc 1
  Path condition: U1 == null
  Parameter values:
  This object: o0
  Local vars: this=o0, i0=U0, r0=U1
  Initial field values: o0.value=U0, o0.next=U1
  Current field values: o0.value=U0, o0.next=U1
  Solution (0): this.value = -1000000, this.next = null, this = o0
39
Deals with Objects and Fields (2)
public class Node {
  public int value;
  public Node next;
  public Node swap() {
    if (next != null)
      if (value > next.value) {
        Node t = next;
        next = t.next;
        t.next = this;
        int x = 10/(5-value);
        return t;
      }
    return this;
  }
}
WARNING: possible division by zero (swap:12)
Symbolic state at time of warning
  Method:
  Instruction: x = 10 / i1
  Line number: 12
  Depth: instruction 13, branch 2, pc 3
  Path condition: U0 /= null, U1 > U2, 5 - U1 == 0
  Parameter values:
  This object: o0
  Local vars: i1=5 - U1, r0=U0, this=o0, i0=U1, i2=U1, r1=U3, t=U0
  Initial field values: o1.value=U2, o0.next=U0, o0.value=U1, o1.next=U3
  Current field values: o1.value=U2, o0.next=U3, o0.value=U1, o1.next=o0
  Unknown value mappings: U0=o1
  Solution (1): this.next.next = U3, this.next.value = -1000000, this.value = 5, this.next = o1, this = o0
40
Doesn't do Full Aliasing yet
void foo(Node n1, Node n2) {
  if (n1 != null && n2 != null) {
    n1.x = 2;
    n2.x = 3;
    assert n1.x == 2 && n2.x == 3;
  }
}

Cannot find the bug here, since it will not consider the case where n1 == n2
To handle this we need special paths that consider possible aliasing, as we did in the JPF case
This will impact the performance
We need to study the trade-offs

41
Efficient Array Bounds Checking

Here everything is concrete in the symbolic
execution, hence it cannot find the violation in
the code
We add an acceleration heuristic to find the bug
here
For conditions such as i < n we add an assignment to set i = n-1 to accelerate the loop to its boundary condition
Side-effect is to make i symbolic

WARNING: possible array upper bound violation (k:6)
Symbolic state at time of warning
  Method:
  Instruction: l2[l3] = 0
  Line number: 6
  Depth: instruction 4, branch 1, pc 4
  Boundary hack: true branch (false), false branch (true)
  Path condition: 0 len(A0), len(A0) 0
  Parameter values: U0
  This object: o0
  Local vars: l3=U0 - 1, l0=o0, l2=A0, l1=U0
  Solution (0): this = o0, param0 = 101

public void k(int n) {
  int[] array = new int[100];
  for (int i = 0; i < n; i++)
    array[i] = 0;
}
42
Termination Revisited

This is the #1 issue in path-sensitive analyses
Set the termination criterion too weak and you miss bugs
Set it too precise and the analysis runs too long
At first we counted the number of times an instruction got executed across all paths
This criterion was too weak and it missed many bugs
We currently use size of the path condition
This works well, but it can lead to many warnings
for the same error
We plan to use abstraction based termination
conditions, such as predicate abstractions, as
proposed in
Concrete Model Checking with Abstract Matching and Refinement. Corina Pasareanu, Radek Pelanek and Willem Visser
The exact trade-offs for the termination
conditions must still be studied

43
Current State

Absolutely zero effort has gone into making the
tool fast
Finds all the same errors as Check 'n' Crash and a few more
Christoph Csallner (Georgia Tech, and Google
Summer intern this past year)
Ran it on some large NASA source bases
Found numerous errors
Quick study showed some are real
Note: not all errors reported are reachable
Not considering pre-conditions that are not
explicit in the code
Not all warnings will have tests to show the
error
Path conditions only constrain explicit inputs
and the values of other variables might be
important too, e.g. in the code below there is a
bug only if the number of files in the directory
is greater than 10 but the test case has no
control over the number of files
In general we use a random test generator to try
and create valid data if we have no constraints
on some variable

File dataDir = new File(dataDirName);
File[] conflicts = dataDir.listFiles();
TesterThread[] threadList = new TesterThread[10];
for (int i = 0; i < conflicts.length; i++) {
  ...
  fis = new FileInputStream(conflicts[i]);
  ...
44
Future Work
Termination
Call Depth
Aliasing

Turn the dials to find the sweet spot between
accuracy and speed
Address the uncontrolled environment problem
Improve efficiency
Integrate with Eclipse and JUnit
Compare with commercial tools
Both KlocWork and Coverity support Java now

45
Conclusions

Symbolic Execution is a powerful technique for
doing advanced testing
Showed two different tools supporting symbolic execution
Java PathFinder
Whole program analysis
Every error discovered is a real error
Stops when it runs into a constraint none of its
decision procedures can deal with
Codename JitterBug
Variably interprocedural
Not every warning is a real bug, not even every
failed test indicates a real bug
Can find errors even when the decision procedure
fails because it assumes feasibility
Main research goal is to combine symbolic
execution with concurrency analysis
Concurrency errors are hard to catch
Model checkers are about as good as it gets in
this domain
Model checking by itself cannot deal with all
types of programs
Static analysis based techniques will be required