EXE: A System for Automatically Generating Inputs of Death Using Symbolic Execution

1
EXE: A System for Automatically Generating Inputs
of Death Using Symbolic Execution
  • Cristian Cadar, Vijay Ganesh, Peter Pawlowski,
    Dawson Engler, and David Dill

2
The High-Level Idea
  • Automatically generate test cases for systems
    code by using the code itself
  • Mark a set of inputs (for example, a network
    packet or a regular expression) as symbolic,
    indicating they can have any value
  • On each conditional involving symbolic data, fork
    execution once for the true branch and once for
    the false branch
  • Add constraints on each branch indicating under
    what conditions that branch is reachable
  • Use a constraint solver to generate concrete
    inputs from constraints on symbolic data
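
The steps above can be sketched in a few lines of C. This is a toy stand-in, not EXE itself: the two branch conditions (`x > 10`, `x % 2 == 0`) belong to a hypothetical program under test, the input is a single 8-bit value, and a brute-force loop takes the place of a real constraint solver. All function names are made up for illustration.

```c
#define MAX_DEPTH 2   /* number of conditionals in the hypothetical program */

/* Branch conditions of the hypothetical program under test:
   branch 0 is (x > 10), branch 1 is (x % 2 == 0). */
static int branch_cond(int id, unsigned x) {
    return id == 0 ? (x > 10) : (x % 2 == 0);
}

/* Brute-force "solver": smallest x in [0, 255] for which every branch
   takes the recorded outcome, or -1 if the path is infeasible. */
static int solve(const int *outcome, int depth) {
    for (unsigned x = 0; x < 256; x++) {
        int ok = 1;
        for (int d = 0; d < depth && ok; d++)
            ok = (branch_cond(d, x) == outcome[d]);
        if (ok) return (int)x;
    }
    return -1;
}

/* "Fork" at each conditional: explore the true side, then the false side.
   At the end of a path, ask the solver for a concrete input. */
static int explore(int *outcome, int depth, int *witness, int n) {
    if (depth == MAX_DEPTH) {
        int x = solve(outcome, depth);
        if (x >= 0) witness[n++] = x;
        return n;
    }
    for (int taken = 1; taken >= 0; taken--) {
        outcome[depth] = taken;   /* constraint added for this branch */
        n = explore(outcome, depth + 1, witness, n);
    }
    return n;
}

/* Fills witness (capacity >= 4) with one concrete input per feasible path
   and returns the number of inputs generated. */
int generate_tests(int *witness) {
    int outcome[MAX_DEPTH];
    return explore(outcome, 0, witness, 0);
}
```

Calling `generate_tests` with a four-element array yields one input per path. EXE forks the running process instead of recursing, and solves constraints with STP instead of enumerating values.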

3
A Bit of Background
  • Program analysis comes in two principal flavors:
    static and dynamic analysis
  • Static analysis involves only considering the
    program's source. Type-checking is the most
    widely used form of static analysis.
  • Dynamic analysis involves instrumenting the
    original program source with a variety of checks
    and then actually watching the program run.
    Purify is one of the best known dynamic analysis
    tools. EXE is a dynamic tool as well.

4
A Tiny Example
    #include <stdio.h>

    int main() {
        int a;
        make_symbolic(&a);
        if (a == 0)
            printf("a can be zero!\n");
        else
            printf("a can be non-zero!\n");
        return 0;
    }

Output:
    a can be zero!
    a can be non-zero!
5
A (bit) Larger Example
    #include <assert.h>
    #include <stdlib.h>

    int main(void) {
        unsigned i, t, a[4] = { 1, 3, 5, 2 };
        make_symbolic(&i);
        if (i >= 4)
            exit(0);
        // cast + symbolic indexing + mutation
        char *p = (char *)a + i * 4;
        *p = *p - 1;  // Just modifies one byte!
        // ERROR: EXE catches overflow at i == 2
        t = a[*p];
        // At this point i != 2.
        // ERROR: EXE catches division by 0 when i == 0
        t = t / a[i];
        // At this point i != 0 && i != 2.
        // EXE proves neither assert fires
        if (t == 2)
            assert(i == 1);
        else
            assert(i == 3);
        return 0;
    }
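
The claims in the comments can be checked by hand with a concrete replay. The sketch below runs the same statements for one concrete `i` and reports what would go wrong, using explicit checks in place of the actual faults. It assumes a 4-byte `unsigned` on a little-endian machine (so the decremented byte is the low byte) and uses `unsigned char` rather than `char` to sidestep signedness; `run_one` and the `outcome` enum are hypothetical names.

```c
enum outcome { OK, OUT_OF_BOUNDS, DIV_BY_ZERO, ASSERT_FAILED };

/* Replays the example on one concrete i.
   Assumes sizeof(unsigned) == 4 and little-endian byte order. */
enum outcome run_one(unsigned i) {
    unsigned t, a[4] = { 1, 3, 5, 2 };
    if (i >= 4)
        return OK;                        /* the program exits early */
    unsigned char *p = (unsigned char *)a + i * 4;
    *p = *p - 1;                          /* decrements the low byte of a[i] */
    if (*p >= 4)
        return OUT_OF_BOUNDS;             /* a[*p] would read past the array */
    t = a[*p];
    if (a[i] == 0)
        return DIV_BY_ZERO;               /* t / a[i] would fault */
    t = t / a[i];
    if (t == 2)
        return i == 1 ? OK : ASSERT_FAILED;
    return i == 3 ? OK : ASSERT_FAILED;
}
```

Under those assumptions, `run_one(2)` reports the out-of-bounds read, `run_one(0)` reports the zero divisor, and `i == 1` and `i == 3` pass their asserts. EXE reaches the same conclusions without enumerating `i`, by reasoning over the constraints.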
6
Constraints
  • EXE models memory at the bit level, thus
    successfully handling all sorts of C ugliness:
    pointer arithmetic, gratuitous use of casts, etc.
  • Whenever EXE takes a specific branch, it adds the
    constraints on the original symbolic data
    necessary to take that branch to the set of
    current constraints.
  • When the program reaches an exit point for any
    reason, the Simple Theorem Prover (STP) generates
    a set of concrete values that will drive
    execution to that point in the code.
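
As a rough illustration of how constraints accumulate along one path, the sketch below collects the three conditions that the larger example imposes on `i` by the time it reaches the asserts, then searches for a value that satisfies all of them. The `path_constraints` type, the function names, and the linear search are assumptions made for this sketch; STP works on bit-vector formulas, not on C callbacks.

```c
#include <stddef.h>

#define MAX_CONSTRAINTS 8

typedef int (*constraint_fn)(unsigned);

struct path_constraints {
    constraint_fn c[MAX_CONSTRAINTS];
    size_t n;
};

/* Conditions on i picked up along the path that reaches the asserts. */
static int in_bounds(unsigned i) { return i < 4; }   /* did not take exit(0) */
static int not_two(unsigned i)   { return i != 2; }  /* a[*p] stayed in bounds */
static int not_zero(unsigned i)  { return i != 0; }  /* divisor was non-zero */

/* Appends one constraint to the current set (ignored if the set is full). */
static void add_constraint(struct path_constraints *pc, constraint_fn f) {
    if (pc->n < MAX_CONSTRAINTS)
        pc->c[pc->n++] = f;
}

/* Stand-in solver: smallest value below limit satisfying every constraint,
   or -1 if none exists. */
static long solve_path(const struct path_constraints *pc, unsigned limit) {
    for (unsigned v = 0; v < limit; v++) {
        size_t k = 0;
        while (k < pc->n && pc->c[k](v))
            k++;
        if (k == pc->n)
            return (long)v;
    }
    return -1;
}

/* Builds the constraint set for that path and returns a concrete input. */
long first_input_reaching_asserts(void) {
    struct path_constraints pc = { {0}, 0 };
    add_constraint(&pc, in_bounds);
    add_constraint(&pc, not_two);
    add_constraint(&pc, not_zero);
    return solve_path(&pc, 256);
}
```

The returned value is a concrete test input that drives execution down that path.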

7
Workflow for Checking a Piece of Code
  • Edit the program's source, marking appropriate input
    as symbolic.
  • Compile code with exe-cc to obtain an
    instrumented executable.
  • Run this executable to generate test cases. These
    test cases are files that contain concrete values
    to be fed in as input to the program.
  • Compile the original source (plus a small harness
    to feed in concrete input from file) with gcc to
    obtain an uninstrumented binary.
  • Run this uninstrumented binary under valgrind to look
    for out-of-bounds memory accesses, as well as crashes.

Key Point: Test cases are run through an
uninstrumented version of the original code,
preventing the introduction of artifacts related
to EXE's instrumentation. If the test case
crashes uninstrumented code, it must be a real
bug.
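
The "small harness" in the workflow could look roughly like the function below, which reads a generated test case back into a buffer before handing it to the code under test. The file layout (raw bytes, nothing else) and the name `load_test_case` are assumptions for this sketch; the slides do not describe the actual format of EXE's test-case files.

```c
#include <stdio.h>

/* Reads up to cap bytes of a test-case file into buf.
   Returns the number of bytes read, or -1 if the file cannot be opened.
   Assumes the file holds the raw input bytes and nothing else. */
long load_test_case(const char *path, unsigned char *buf, size_t cap) {
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;
    size_t n = fread(buf, 1, cap, f);
    fclose(f);
    return (long)n;
}
```

A `main` built around it would take the file name from `argv[1]`, call `load_test_case` on a buffer of the right size, and pass that buffer to the original, uninstrumented entry point.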
8
Fast Statement Coverage
9
Results
  • Found real bugs in:
      • Berkeley Packet Filter
      • Perl Compatible Regular Expression library (PCRE)
      • udhcp
      • ext2, ext3, JFS (in a different paper)

Response from PCRE author:
"It's a bug! An interestingly obscure one that
took a bit of time to work out"
10
Real Inputs of Death
for udhcp (packet)
    unsigned char pkt[548] = {
        /*   0 */ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        /*  10 */ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        /*  20 */ 0, 0, 0, 0, 0, 0, 0, 0, 0, 90,
        ...
        /* 240 */ 0, 33, 249, 0, 0, 0, 0, 0, 0, 0,
        ...
        /* 490 */ 0, 0, 52, 39, 0, 0, 0, 0, 0, 0,
        ...
        /* 520 */ 0, 0, 0, 0, 0, 0, 0, 53, 15, 3,
        /* 530 */ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        /* 540 */ 0, 0, 0, 0, 50, 0, 54, 0
    };


for PCRE (regular expression)
\0\0\-?\0 \-\\0\\0\-?\0
\-\\0\0\-?\0 (?)\?\0\0\-\0
(?)\?\0\0\ \\0 (?)\?\0\0\-\0
(?)\?\0\0\0\0\0 (?)\?\0\0\0\0\
\0
11
PCRE bug, explicated
    // ptr points to the current location in
    // the string being parsed
    // consider ptr = "[\0^\0]. . ."
    if (*ptr == '[' &&
        check_posix_syntax(ptr, &ptr, compile_block)) {
        // ptr is now past the string's terminating \0,
        // and parsing continues
        ptr++;  /* . . . */
    }

    static BOOL
    check_posix_syntax(const uschar *ptr, const uschar **endptr,
                       compile_data *cd)
    {
        int terminator = *(++ptr);
        if (*(++ptr) == '^') ptr++;
        // ptr now points to second \0
        /* . . . */
        // evaluates to true
        if (*ptr == terminator && ptr[1] == ']') {
            // ptr in caller now points to second \0
            *endptr = ptr;
            return TRUE;
        }
        return FALSE;
    }
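
The effect can be reproduced in isolation with a simplified stand-in. The code below is not the PCRE source: `check_syntax` keeps only the pointer arithmetic, `overrun` is a hypothetical helper that measures how far the caller's pointer ends up beyond the string's terminating NUL, and the input bytes (`[`, NUL, `^`, NUL, `]`) are an assumption chosen to trigger the behavior the slide describes.

```c
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for the check on this slide (not the PCRE code).
   The caller must supply a buffer with at least 5 readable bytes. */
static int check_syntax(const char *ptr, const char **endptr) {
    int terminator = *(++ptr);        /* may already be the NUL terminator */
    if (*(++ptr) == '^')
        ptr++;
    if (*ptr == terminator && ptr[1] == ']') {
        *endptr = ptr;                /* can point beyond the end of the string */
        return 1;
    }
    return 0;
}

/* Returns how many bytes past the terminating NUL the parse position
   ends up, or 0 if it stays inside the string. */
size_t overrun(const char *buf) {
    const char *ptr = buf;
    size_t len = strlen(buf);
    if (*ptr == '[' && check_syntax(ptr, &ptr)) {
        ptr++;
        size_t pos = (size_t)(ptr - buf);
        return pos > len ? pos - len : 0;
    }
    return 0;
}
```

With the assumed input, `strlen` sees a one-character string, yet the parse position lands several bytes beyond its NUL. A well-formed input such as `"[:a:]"` leaves the position untouched in this sketch.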