Title: EXE: A System for Automatically Generating Inputs of Death Using Symbolic Execution
1. EXE: A System for Automatically Generating Inputs of Death Using Symbolic Execution
- Cristian Cadar, Vijay Ganesh, Peter Pawlowski,
Dawson Engler, and David Dill
2. The High-Level Idea
- Automatically generate test cases for systems code by using the code itself.
- Mark a set of inputs (for example, a network packet or a regular expression) as symbolic, indicating they can have any value.
- On each conditional involving symbolic data, fork execution once for the true branch and once for the false branch.
- Add constraints on each branch indicating under what conditions that branch is reachable.
- Use a constraint solver to generate concrete inputs from the constraints on symbolic data.
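The steps above can be sketched as a toy explorer. This is a minimal illustration and not EXE's implementation: it assumes a single symbolic byte `x`, branch conditions of the form `x OP k`, a program written as a hand-built branch tree, and a brute-force loop standing in for the constraint solver. Plain recursion stands in for forking execution. All names (`node_t`, `explore`, `solve`, `generate_inputs`) are hypothetical.

```c
#include <stddef.h>

/* Toy model: one symbolic byte x, constraints of the form (x OP k). */
typedef enum { OP_LT, OP_GE, OP_EQ, OP_NE } op_t;
typedef struct { op_t op; int k; } constraint_t;

/* A "program" is a tree: inner nodes branch on x, leaves are exit points. */
typedef struct node {
    constraint_t cond;              /* branch condition (inner nodes only) */
    const struct node *if_true;     /* both children NULL marks a leaf */
    const struct node *if_false;
} node_t;

static constraint_t negate(constraint_t c) {
    static const op_t flip[] = { OP_GE, OP_LT, OP_NE, OP_EQ };
    c.op = flip[c.op];
    return c;
}

static int holds(constraint_t c, int x) {
    switch (c.op) {
    case OP_LT: return x <  c.k;
    case OP_GE: return x >= c.k;
    case OP_EQ: return x == c.k;
    default:    return x != c.k;
    }
}

/* Brute-force stand-in for a solver: returns the smallest byte value that
 * satisfies all n constraints, or -1 if the path is infeasible. */
static int solve(const constraint_t *pc, int n) {
    for (int x = 0; x < 256; x++) {
        int ok = 1;
        for (int i = 0; i < n && ok; i++) ok = holds(pc[i], x);
        if (ok) return x;
    }
    return -1;
}

/* Follow both sides of every branch, extending the path condition pc.
 * Stores one concrete input per feasible exit point in out[] and
 * returns the updated count. */
static int explore(const node_t *n, constraint_t *pc, int depth,
                   int *out, int count) {
    int witness = solve(pc, depth);
    if (witness < 0) return count;          /* prune infeasible paths */
    if (!n->if_true && !n->if_false) {      /* exit point reached */
        out[count] = witness;
        return count + 1;
    }
    pc[depth] = n->cond;                    /* true branch */
    count = explore(n->if_true, pc, depth + 1, out, count);
    pc[depth] = negate(n->cond);            /* false branch */
    return explore(n->if_false, pc, depth + 1, out, count);
}

/* if (x < 10) { if (x == 3) ... else ... } else { if (x < 5) ... else ... } */
static const node_t LEAF    = { { OP_EQ, 0 },  NULL, NULL };
static const node_t INNER_A = { { OP_EQ, 3 },  &LEAF, &LEAF };
static const node_t INNER_B = { { OP_LT, 5 },  &LEAF, &LEAF };
static const node_t ROOT    = { { OP_LT, 10 }, &INNER_A, &INNER_B };

int generate_inputs(int out[4]) {
    constraint_t pc[8];
    return explore(&ROOT, pc, 0, out, 0);
}
```

Calling `generate_inputs` on this tree yields three inputs, one per feasible path; the path requiring both `x >= 10` and `x < 5` is pruned because no value satisfies it.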
3. A Bit of Background
- Program analysis comes in two principal flavors: static and dynamic analysis.
- Static analysis considers only the program's source. Type-checking is the most widely used form of static analysis.
- Dynamic analysis instruments the original program source with a variety of checks and then watches the program actually run. Purify is one of the best-known dynamic analysis tools. EXE is a dynamic tool as well.
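To make "instrumenting the source with checks" concrete, here is a small sketch of what an inserted run-time check might look like for an array read. It is only an illustration of the general idea, not how Purify or EXE instrument code; `check_log_t`, `checked_read`, and `sum_first` are hypothetical names.

```c
#include <stddef.h>

/* Hypothetical record of the first violation observed at run time. */
typedef struct { int violated; size_t bad_index; } check_log_t;

/* An instrumented read of buf[idx]: the check runs before the access,
 * so an out-of-bounds index is reported instead of being dereferenced. */
static int checked_read(const int *buf, size_t len, size_t idx,
                        check_log_t *log) {
    if (idx >= len) {
        if (!log->violated) {
            log->violated = 1;
            log->bad_index = idx;
        }
        return 0;                   /* skip the unsafe access */
    }
    return buf[idx];
}

/* Code under analysis: sums the first n elements; buggy when n > len. */
int sum_first(const int *buf, size_t len, size_t n, check_log_t *log) {
    int total = 0;
    for (size_t i = 0; i < n; i++)
        total += checked_read(buf, len, i, log);  /* was: total += buf[i]; */
    return total;
}
```

The check only fires on inputs that actually drive execution to the bad access, which is why a dynamic tool benefits from a way to generate such inputs.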
4. A Tiny Example
    #include <stdio.h>

    int main() {
        int a;

        make_symbolic(&a);
        if (a == 0)
            printf("a can be zero!\n");
        else
            printf("a can be non-zero!\n");

        return 0;
    }
Output:
    a can be zero!
    a can be non-zero!
5. A (Bit) Larger Example
    #include <assert.h>
    #include <stdlib.h>

    int main(void) {
        unsigned i, t, a[4] = { 1, 3, 5, 2 };
        make_symbolic(&i);
        if (i >= 4)
            exit(0);

        // cast + symbolic indexing + mutation
        char *p = (char *)a + i * 4;
        *p = *p - 1;    // Just modifies one byte!

        // ERROR: EXE catches overflow at i == 2
        t = a[*p];
        // At this point: i != 2.

        // ERROR: EXE catches division by 0 when i == 0
        t = t / a[i];
        // At this point: i != 0 && i != 2.

        // EXE proves neither assert fires
        if (t == 2)
            assert(i == 1);
        else
            assert(i == 3);
        return 0;
    }
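The claims in the comments can be checked by hand for each concrete value of `i`. The sketch below replays the example on an explicit little-endian byte image of `a[4]`, testing for each hazard before the risky step so that the replay itself never misbehaves. It assumes a little-endian layout with 4-byte `unsigned`; `replay`, `load32`, and `outcome_t` are hypothetical names, and this is a hand check rather than anything EXE does internally.

```c
#include <stdint.h>

typedef enum {
    PATH_EXITS_EARLY,   /* i >= 4: the program calls exit(0) */
    PATH_OK,            /* runs to completion, assert holds */
    PATH_OVERFLOW,      /* t = a[*p] would read out of bounds */
    PATH_DIV_BY_ZERO,   /* t / a[i] would divide by zero */
    PATH_ASSERT_FAILS
} outcome_t;

/* Reads a[idx] from the little-endian byte image. */
static uint32_t load32(const uint8_t *mem, unsigned idx) {
    const uint8_t *b = mem + 4 * idx;
    return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
           ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

outcome_t replay(unsigned i) {
    /* Byte image of a[4] = {1, 3, 5, 2}, assuming little-endian. */
    uint8_t mem[16] = { 1,0,0,0, 3,0,0,0, 5,0,0,0, 2,0,0,0 };
    if (i >= 4) return PATH_EXITS_EARLY;

    uint8_t *p = mem + i * 4;       /* (char *)a + i * 4 */
    *p = (uint8_t)(*p - 1);         /* modifies one byte of a[i] */

    unsigned idx = *p;              /* index used by t = a[*p] */
    if (idx >= 4) return PATH_OVERFLOW;
    uint32_t t = load32(mem, idx);

    uint32_t d = load32(mem, i);    /* divisor a[i], after the mutation */
    if (d == 0) return PATH_DIV_BY_ZERO;
    t = t / d;

    if (t == 2) return i == 1 ? PATH_OK : PATH_ASSERT_FAILS;
    return i == 3 ? PATH_OK : PATH_ASSERT_FAILS;
}
```

For `i == 2` the decremented byte is 4, which indexes one past the end of `a`; for `i == 0` the decrement turns `a[0]` into 0, so the later division fails. The remaining values, 1 and 3, reach the asserts and satisfy them.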
6. Constraints
- EXE models memory at the bit level, thus successfully handling all sorts of C ugliness: pointer arithmetic, gratuitous use of casts, etc.
- Whenever EXE takes a specific branch, it adds the constraints on the original symbolic data necessary to take that branch to the set of current constraints.
- When the program reaches an exit point for any reason, the Simple Theorem Prover (STP) generates a set of concrete values that will drive execution to that point in the code.
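As an illustration of why a bit-level view matters, consider a path that constrains the same symbolic value both as a 16-bit integer and through a byte-sized cast. The sketch below collects such a path condition and finds a satisfying value by exhaustive search, which stands in for STP here. The specific constraints, the little-endian byte view, and the names `byte_of`, `path_condition`, and `solve_path` are assumptions made for the example.

```c
#include <stdint.h>

/* Byte n of a 16-bit value, as a cast to (unsigned char *) would see it
 * on a little-endian machine. */
static uint8_t byte_of(uint16_t v, int n) {
    return (uint8_t)(v >> (8 * n));
}

/* Path condition collected along one hypothetical path:
 *   v > 0x1200                         (comparison on the whole value)
 *   v < 0x1300
 *   ((unsigned char *)&v)[0] == 0x34   (comparison on one byte of it) */
static int path_condition(uint16_t v) {
    return v > 0x1200 && v < 0x1300 && byte_of(v, 0) == 0x34;
}

/* Exhaustive stand-in for the solver: returns a value that drives
 * execution down this path, or -1 if the path is infeasible. */
long solve_path(void) {
    for (uint32_t v = 0; v <= 0xFFFF; v++)
        if (path_condition((uint16_t)v)) return (long)v;
    return -1;
}
```

Because the byte-level and word-level constraints refer to the same bits, they combine into a single answer, `0x1234`. A model that tracked the byte and the word as unrelated variables would miss that interaction.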
7. Workflow for Checking a Piece of Code
- Edit the program's source, marking appropriate input as symbolic.
- Compile the code with exe-cc to obtain an instrumented executable.
- Run this executable to generate test cases. These test cases are files that contain concrete values to be fed in as input to the program.
- Compile the original source (plus a small harness to feed in concrete input from a file) with gcc to obtain an uninstrumented binary.
- Run this executable under valgrind to look for memory-bound violations, as well as crashes.
Key Point: Test cases are run through an uninstrumented version of the original code, preventing the introduction of artifacts related to EXE's instrumentation. If a test case crashes uninstrumented code, it must be a real bug.
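The "small harness" in the fourth step might look roughly like the sketch below. The slides do not show the real harness or the test-case file format, so this assumes each test case is a raw byte file; `load_test_case`, `run_test_case`, `target_fn`, and the stand-in target `count_nonzero` are hypothetical.

```c
#include <stddef.h>
#include <stdio.h>

/* Signature assumed for the code under test. */
typedef int (*target_fn)(const unsigned char *data, size_t len);

/* Reads up to cap bytes of a generated test case into buf.
 * Returns the number of bytes read, or -1 if the file cannot be opened. */
long load_test_case(const char *path, unsigned char *buf, size_t cap) {
    FILE *f = fopen(path, "rb");
    if (!f) return -1;
    size_t n = fread(buf, 1, cap, f);
    fclose(f);
    return (long)n;
}

/* Feeds one test-case file to the uninstrumented target.
 * Returns the target's result, or -1 if the file is missing. */
int run_test_case(const char *path, target_fn target) {
    unsigned char buf[4096];
    long n = load_test_case(path, buf, sizeof buf);
    if (n < 0) return -1;
    return target(buf, (size_t)n);
}

/* Stand-in for real code under test: counts non-zero bytes. */
int count_nonzero(const unsigned char *data, size_t len) {
    int count = 0;
    for (size_t i = 0; i < len; i++)
        if (data[i] != 0) count++;
    return count;
}
```

A `main` that passes `argv[1]` to `run_test_case` would complete the harness; the resulting binary is what gets built with gcc and run under valgrind, once per generated file.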
8. Fast Statement Coverage
9. Results
- Found real bugs in:
  - Berkeley Packet Filter
  - Perl Compatible Regular Expression library (PCRE)
  - udhcp
  - ext2, ext3, JFS (in a different paper)
Response from PCRE author:
"It's a bug! An interestingly obscure one that took a bit of time to work out"
10. Real Inputs of Death
for udhcp (packet)
    unsigned char pkt[548] = {
    /*  0*/ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
    /* 10*/ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
    /* 20*/ 0, 0, 0, 0, 0, 0, 0, 0, 0, 90,
    ...
    /*240*/ 0, 33, 249, 0, 0, 0, 0, 0, 0, 0,
    ...
    /*490*/ 0, 0, 52, 39, 0, 0, 0, 0, 0, 0,
    ...
    /*520*/ 0, 0, 0, 0, 0, 0, 0, 53, 15, 3,
    /*530*/ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
    /*540*/ 0, 0, 0, 0, 50, 0, 54, 0
    };
for PCRE (regular expression)
\0\0\-?\0 \-\\0\\0\-?\0
\-\\0\0\-?\0 (?)\?\0\0\-\0
(?)\?\0\0\ \\0 (?)\?\0\0\-\0
(?)\?\0\0\0\0\0 (?)\?\0\0\0\0\
\0
11. PCRE bug, explicated
    // ptr points to the current location in
    // the string being parsed
    // consider ptr = "[\0^\0]. . ."
    if (*ptr == '[' &&
        check_posix_syntax(ptr, &ptr, &compile_block)) {
        ptr++;    // ptr now points to ']', and parsing continues
        /* . . . */
    }

    static BOOL
    check_posix_syntax(const uschar *ptr, const uschar **endptr,
                       compile_data *cd)
    {
        int terminator = *(++ptr);
        if (*(++ptr) == '^') ptr++;    // ptr now points to second \0
        /* . . . */
        // evaluates to true
        if (*ptr == terminator && ptr[1] == ']') {
            *endptr = ptr;    // ptr in caller now points to second \0
            return TRUE;
        }
        return FALSE;
    }
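The effect of the bug can be reproduced on a small buffer. The sketch below is a simplified stand-in for the code on the slide, not PCRE's source: it drops the `compile_data` argument and uses `isalpha` for the elided letter-skipping step. It measures how far past the pattern's NUL terminator the caller's pointer ends up. `check_posix_syntax_sketch` and `bytes_past_terminator` are hypothetical names, and the buffer passed in must have at least six readable bytes.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

typedef unsigned char uschar;

/* Simplified version of the check shown above. */
static int check_posix_syntax_sketch(const uschar *ptr,
                                     const uschar **endptr) {
    int terminator = *(++ptr);          /* may read the string's NUL */
    if (*(++ptr) == '^') ptr++;
    while (isalpha(*ptr)) ptr++;        /* stand-in for the elided step */
    if (*ptr == terminator && ptr[1] == ']') {
        *endptr = ptr;
        return 1;
    }
    return 0;
}

/* Returns how many bytes past the pattern's NUL terminator the caller's
 * pointer lands after a successful check, or 0 if it stays in bounds.
 * buf must provide at least 6 readable bytes. */
size_t bytes_past_terminator(const uschar *buf) {
    size_t len = strlen((const char *)buf);   /* logical pattern length */
    const uschar *ptr = buf;
    if (*ptr == '[' && check_posix_syntax_sketch(ptr, &ptr)) {
        ptr++;                                /* the caller's increment */
        size_t off = (size_t)(ptr - buf);
        return off > len ? off - len : 0;
    }
    return 0;
}
```

With the bytes `[`, `\0`, `^`, `\0`, `]`, the first `\0` is taken as the terminator character, the check succeeds on the second `\0`, and the caller resumes parsing three bytes beyond the end of the one-character pattern. A well-formed input such as `[:a:]` stays within the string.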