Title: Static and Dynamic Protection from Vulnerabilities for Web Applications
1Static and Dynamic Protection from
Vulnerabilities for Web Applications
- Benjamin Livshits
- SUIF Compiler Group
- Computer Science Lab
- Stanford University
2Real-Life Hacking Stories
- blogger.com cracked (Aug. 2005)
- Firefox marketing site hacked (Jul. 2005)
- MS UK defaced in hacking attack (Jul. 2005)
- Hacker hits Duke system (Jun. 2005)
- MSN site hacked in South Korea (Jun. 2005)
- MSN site hacking went undetected for days (Jun. 2005)
- Phishers manipulate SunTrust site to steal data (Sep. 2004)
- Tower Records settles charges over hack attacks (Apr. 2004)
- Western Union Web site hacked (Sep. 2000)
- 75% of all security attacks today are at the application level
- 97% of 300 audited sites were vulnerable to Web application attacks
- $300K average financial loss from unauthorized access or info theft
- Average $100K/hour of downtime lost
- Source: Gartner Research
- Source: Computer Security Institute survey
3Key Insight
- Bugs in application code lead to vulnerabilities
- Vulnerabilities lead to security breaches
- Information leaks lead to stolen sensitive data
- Write access to unauthorized data leads to fraud
4Simple Web App
- Web form allows user to look up account details
- Underneath Java Web app. serving requests
5SQL Injection Example
- Happy-go-lucky SQL statement
- Leads to SQL injection
- One of the most common Web application vulnerabilities, caused by lack of input validation
- But how?
- Typical way to construct a SQL query using concatenation
- Looks benign on the surface
- But let's play with it a bit more
    String query = "SELECT Username, UserID, Password FROM Users"
                 + " WHERE username = '" + user + "'"
                 + " AND password = '" + password + "'";
6Injecting Malicious Data (1)
submit
    query = SELECT Username, UserID, Password
            FROM Users WHERE Username = 'bob' AND Password = '<password>'
7Injecting Malicious Data (2)
submit
    query = SELECT Username, UserID, Password
            FROM Users WHERE Username = 'bob'-- ' AND Password = ''
8Injecting Malicious Data (3)
submit
    query = SELECT Username, UserID, Password
            FROM Users WHERE Username = 'bob'; DROP Users-- ' AND Password = ''
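The three inputs above can be replayed against the concatenation pattern from the SQL Injection Example slide. The following is a minimal sketch that only builds the query text and never contacts a database; the class and method names (`InjectionDemo`, `buildQuery`) and the sample password are assumptions made for illustration.

```java
// Hypothetical demo: builds the query string only; no database is involved.
final class InjectionDemo {
    // Mirrors the string-concatenation pattern shown on the slide.
    static String buildQuery(String user, String password) {
        return "SELECT Username, UserID, Password FROM Users"
             + " WHERE Username = '" + user + "'"
             + " AND Password = '" + password + "'";
    }

    public static void main(String[] args) {
        // (1) Benign input: the quotes stay balanced.
        System.out.println(buildQuery("bob", "secret"));
        // (2) A closing quote plus a comment marker comments out the password check.
        System.out.println(buildQuery("bob'--", ""));
        // (3) A second statement is smuggled in ahead of the comment marker.
        System.out.println(buildQuery("bob'; DROP Users--", ""));
    }
}
```

In cases (2) and (3) the attacker-supplied quote ends the string literal early, so everything after it is interpreted as SQL rather than as data.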
9Summary of Attack Techniques
Input and output validation are at the core of the issue
- 1. Inject (taint sources)
- Parameter manipulation
- Hidden field manipulation
- Header manipulation
- Cookie poisoning
- Second-level injection
- 2. Exploit (taint sinks)
- SQL injections
- Cross-site scripting
- HTTP request splitting
- HTTP request smuggling
- Path traversal
- Command injection
Example: (1) header manipulation combined with (2) HTTP splitting yields a vulnerability
10Which Vulnerabilities are Most Prevalent?
- http://www.SecurityFocus.com
- 500 vulnerability reports
- A week: Nov. 26th to Dec. 3rd, 2005
11Focusing on Input/Output Validation
- SQL injection and cross-site scripting are most prevalent
- Buffer overruns are losing their market share
12Overview of Our Approach
Overview
Static
Dynamic
Experiments
Extensions
Conclusions
Future
13Overview of Our Approach
Static analysis
- Targets developers
- Finds vulnerabilities early in development cycle
- Sound, so finds all vulnerabilities of a particular type
- Can be run after every build, ensuring continuous security
Dynamic analysis
- Targets system administrators
- Prevents vulnerabilities from doing harm
- Safe mode for Web application execution
- Can quarantine suspicious actions while the application continues to run
- No false positives
- Dynamic analysis incurs an overhead
- Static results optimize dynamic overhead
- Overhead often drops from 50% to under 1%
14Describing Vulnerabilities
- Described using PQL [OOPSLA '05]
- General language for describing events on objects
- Used for other applications
- Finding memory leaks
- Mismatched method pairs
- Serialization errors
- Unsafe password manipulation
- Finding optimization opportunities
- etc.
- Simple example
- SQL injections caused by parameter manipulation
- Looks like a code snippet
    query simpleSQLInjection
    returns
        object String param, derived;
    uses
        object HttpServletRequest req;
        object Connection con;
        object StringBuffer temp;
    matches {
        param = req.getParameter(_);
        temp.append(param);
        derived = temp.toString();
        con.executeQuery(derived);
    }
Parameter manipulation: req.getParameter(_)
SQL injection: con.executeQuery(derived)
- Real queries are longer and more involved
- Describing all vulnerabilities we are looking for: 159 lines of PQL
- But it is suitable for pretty much all J2EE applications
15Analysis Framework Architecture
Static analysis
Dynamic analysis
Warnings
Instrumented applications
16Importance of a Sound Solution
- Sound solution can detect all bugs
- Sound means the tool cannot miss a bug
- No warnings reported => no bugs
- Especially attractive for security
- Provide guarantees about security posture of an application
- A sound solution is hard to get
- Tension between soundness and precision
- To some, soundness means lots of false positives
- With care, a sound solution can remain precise
- Need to analyze all of the program
- Hard because Java allows dynamic class loading
- Need to have a complete spec of what to look for
- Soundness statement
"Our analysis finds all vulnerabilities in statically analyzed code that are captured by the specification"
17Static Analysis
18Early-Stage Prevention
requirements
design
development
testing
deployment maintenance
19Why Pointer Analysis?
- Imagine manually auditing an application
- Two statements somewhere in the program
- Can these variables refer to the same object?
- Question answered by pointer analysis...
    // get Web form parameter
    String param = request.getParameter(...);
    ...
    // execute query
    con.executeQuery(query);
20Pointers in Java?
- Java references are pointers in disguise
Heap
21Taint Propagation
String session.ParameterParser.getRawParameter(String name)
    public String getRawParameter(String name)
            throws ParameterNotFoundException {
        String[] values = request.getParameterValues(name);
        if (values == null) {
            throw new ParameterNotFoundException(name + " not found");
        } else if (values[0].length() == 0) {
            throw new ParameterNotFoundException(name + " was empty");
        }
        return (values[0]);
    }
ParameterParser.java:586
String session.ParameterParser.getRawParameter(String name, String def)
    public String getRawParameter(String name, String def) {
        try {
            return getRawParameter(name);
        } catch (Exception e) {
            return def;
        }
    }
ParameterParser.java:570
Element lessons.ChallengeScreen.doStage2(WebSession s)
    String user = s.getParser().getRawParameter(USER, "");
    StringBuffer tmp = new StringBuffer();
    tmp.append("SELECT cc_type, cc_number from user_data WHERE userid = '");
    tmp.append(user);
    tmp.append("'");
    query = tmp.toString();
    Vector v = new Vector();
    try {
        ResultSet results = statement3.executeQuery(query);
        ...
ChallengeScreen.java:194
22What Does Pointer Analysis Do for Us?
- Statically, the same object can be passed around in the program
- Passed in as parameters
- Returned from functions
- Deposited to and retrieved from data structures
- All along it is referred to by different variables
- Pointer analysis summarizes these operations
- Doesn't matter what variables refer to it
- We can follow the object throughout the program
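The aliasing described above can be observed directly at runtime. The sketch below is only an illustration of why an auditor has to track objects rather than variable names; the class and method names (`AliasDemo`, `passThrough`, `sameObjectEverywhere`) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo: one object, reached through three different variables.
final class AliasDemo {
    static StringBuilder passThrough(StringBuilder sb) {
        return sb; // returned from a function unchanged
    }

    static boolean sameObjectEverywhere() {
        StringBuilder a = new StringBuilder("user input");
        StringBuilder b = passThrough(a);           // passed in as a parameter and returned
        List<StringBuilder> box = new ArrayList<>();
        box.add(b);                                 // deposited into a data structure
        StringBuilder c = box.get(0);               // retrieved under a new name
        return a == b && b == c;                    // reference equality: a single heap object
    }

    public static void main(String[] args) {
        System.out.println(sameObjectEverywhere());
    }
}
```

A pointer analysis answers statically the question that `==` answers here dynamically: whether `a`, `b`, and `c` may refer to the same object.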
23Pointer Analysis Background
- Question
- Determine what objects a given variable may refer to
- A classic compiler problem for 20 years
- Until recently, sound analysis implied lack of precision
- We want to have both soundness and precision
- Context-sensitive inclusion-based pointer analysis
- Whaley and Lam [PLDI '04]
- Recent breakthrough in pointer analysis technology
- An analysis that is both scalable and precise
- Context sensitivity greatly contributes to the precision
24Importance of Context Sensitivity (1)
- Distinguishing between different calling contexts
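[Figure: a tainted string passes through id() in calling context c1 and comes out tainted; an untainted string passes through id() in calling context c2 and comes out untainted.]
    String id(String str) { return str; }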
25Importance of Context Sensitivity (2)
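[Figure: without context sensitivity, the tainted and untainted inputs to id() are merged, so the result is tainted at every call site.]
    String id(String str) { return str; }
Excessive tainting!!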
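The difference between the two figures can be modeled in a few lines. This is a toy sketch, not the Whaley and Lam analysis: it treats id() as a function that returns its argument and compares one shared summary against one summary per call site. All names (`ContextDemo`, `Taint`, `insensitive`, `sensitive`) are hypothetical.

```java
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical toy model of taint flowing through String id(String str) { return str; }
final class ContextDemo {
    enum Taint { TAINTED, UNTAINTED }

    // Context-insensitive: a single summary for id(), so all arguments are merged
    // and every call site sees the merged result.
    static Map<String, Set<Taint>> insensitive(Map<String, Taint> argByCallSite) {
        Set<Taint> merged = EnumSet.noneOf(Taint.class);
        merged.addAll(argByCallSite.values());
        Map<String, Set<Taint>> result = new TreeMap<>();
        for (String site : argByCallSite.keySet()) {
            result.put(site, merged);
        }
        return result;
    }

    // Context-sensitive: id() is analyzed separately for each call site.
    static Map<String, Set<Taint>> sensitive(Map<String, Taint> argByCallSite) {
        Map<String, Set<Taint>> result = new TreeMap<>();
        for (Map.Entry<String, Taint> e : argByCallSite.entrySet()) {
            result.put(e.getKey(), EnumSet.of(e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Taint> calls = new TreeMap<>();
        calls.put("c1", Taint.TAINTED);
        calls.put("c2", Taint.UNTAINTED);
        System.out.println("insensitive: " + insensitive(calls));
        System.out.println("sensitive:   " + sensitive(calls));
    }
}
```

In the insensitive version the result at c2 includes TAINTED, which is the excessive tainting the slide warns about; the sensitive version keeps c2 clean.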
26Pointer Analysis Object Naming
- Static analysis approximates dynamic behavior
- Some approximation is necessary
- Unbounded number of dynamic objects
- Finite number of static entities for analysis
- Allocation-site object naming: de facto standard
- Dynamic objects are represented by the line of code (allocation site) that allocates them
- Can be imprecise
- Two dynamic objects allocated at the same site have the same static representation
- Works well most of the time, but not always
27Imprecision with Default Object Naming
- All objects returned by String.toLowerCase() are allocated in the same place
[Figure: calls from foo.java:45 and bar.java:30 both reach the allocation at String.java:725. Default naming gives both objects the single name String.java:725; distinguishing them requires two names, String.java:725 (1) and String.java:725 (2).]
    700: String toLowerCase(String str) {
    725:     return new String();
    726: }
28Improved Object Naming
- We introduced an enhanced object naming scheme
- Containers: HashMap, Vector, LinkedList, ...
- Factory functions: String.toLowerCase(), ...
- Very effective at increasing precision
- Avoids false positives in all apps but one
- All FPs caused by a single factory method
- Improving naming further gets rid of all FPs
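The slides do not spell out the enhanced naming scheme, so the following is only a sketch of the general idea under an assumption: that objects created by a factory method are named by the allocation site together with the call site of the factory. The class and method names (`NamingDemo`, `Alloc`, `namesBySite`, `namesBySiteAndCaller`) are hypothetical; the file and line labels are the ones from the previous slide.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical model of static object names; not the authors' implementation.
final class NamingDemo {
    static final class Alloc {
        final String site;   // line that executes `new`
        final String caller; // call site that invoked the factory method
        Alloc(String site, String caller) {
            this.site = site;
            this.caller = caller;
        }
    }

    // Default scheme: one static name per allocation site.
    static Set<String> namesBySite(List<Alloc> allocs) {
        Set<String> names = new TreeSet<>();
        for (Alloc a : allocs) {
            names.add(a.site);
        }
        return names;
    }

    // Assumed refinement: qualify the allocation site with the factory's call site.
    static Set<String> namesBySiteAndCaller(List<Alloc> allocs) {
        Set<String> names = new TreeSet<>();
        for (Alloc a : allocs) {
            names.add(a.site + "@" + a.caller);
        }
        return names;
    }

    public static void main(String[] args) {
        List<Alloc> allocs = Arrays.asList(
            new Alloc("String.java:725", "foo.java:45"),
            new Alloc("String.java:725", "bar.java:30"));
        System.out.println(namesBySite(allocs));
        System.out.println(namesBySiteAndCaller(allocs));
    }
}
```

With the default scheme both strings collapse into one static object, so taint on one appears on the other; the refined names keep them apart.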
29Simple SQL Injection Query Translated
    query simpleSQLInjection
    returns
        object String param, derived;
    uses
        object HttpServletRequest req;
        object Connection con;
        object StringBuffer temp;
    matches {
        param = req.getParameter(_);
        temp.append(param);
        derived = temp.toString();
        con.executeQuery(derived);
    }
- PQL is automatically translated into Datalog
- Syntax-driven translation
- Obviates the need for hand-written Datalog
30Analysis in Datalog
    simpleSQLInjection(hparam, hderived) :-
        ret(i1, v1),
        call(c1, i1, "HttpServletRequest.getParameter"),
        pointsto(c1, v1, hparam),
        actual(i2, v2, 0), actual(i2, v3, 1),
        call(c2, i2, "StringBuffer.append"),
        pointsto(c2, v2, htemp),
        pointsto(c2, v3, hparam),
        actual(i3, v4, 0), ret(i3, v5),
        call(c3, i3, "StringBuffer.toString"),
        pointsto(c3, v4, htemp),
        pointsto(c3, v5, hderived),
        actual(i4, v6, 0), actual(i4, v7, 1),
        call(c4, i4, "Connection.execute"),
        pointsto(c4, v6, hcon),
        pointsto(c4, v7, hderived).
31Dynamic Analysis
32Late-Stage Prevention
requirements
design
development
testing
deployment maintenance
33Preventing Vulnerabilities
    query main()
    returns
        object Object source, sink;
    uses
        object java.sql.Connection con;
        object java.sql.Statement stmt;
    matches {
        source := UserSource();
        sink := StringPropStar(source);
    }
    replaces con.prepareStatement(sink)
        with SQL.SafePrepare(con, source, sink);
    replaces stmt.executeQuery(sink)
        with SQL.SafeExecute(stmt, source, sink);
34PQL Instrumentation Engine
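[Figure: state machine generated for the derived query. Transitions fire on program events such as t = x.toString() and t.append(x), followed by y := derived(t); each state records a partial match, for example x bound to object o3, and later both x and y bound.]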
- PQL is translated into bytecode instrumentation
- State machines interpret PQL queries and run alongside program
- Keep track of partial matches
- Execute recovery code on a query match when necessary
- Provides a safe execution mode for Web applications
- Smart and customizable (compare to perl -T)
- User can insert recovery code
- Finding Application Errors and Security Flaws Using PQL: a Program Query Language, Michael Martin, Benjamin Livshits, and Monica S. Lam, OOPSLA '05
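The idea of tracking partial matches at runtime can be sketched with a very small monitor. This is a simplification and not the PQL engine: instead of general state machines it keeps one set of objects that have matched the source or derivation steps so far, compared by identity. The class and method names (`TaintMonitor`, `onSource`, `onDerive`, `onSink`) are hypothetical; in an instrumented application such calls would be inserted around the matching bytecode.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

// Hypothetical, simplified runtime monitor; not the PQL instrumentation engine.
final class TaintMonitor {
    // Objects that have partially matched the query, tracked by identity rather than equals().
    private final Set<Object> tainted =
        Collections.newSetFromMap(new IdentityHashMap<>());

    // Event: a value was returned by a source such as req.getParameter(_).
    void onSource(Object value) {
        tainted.add(value);
    }

    // Event: 'to' was derived from 'from', e.g. temp.append(param) or temp.toString().
    void onDerive(Object from, Object to) {
        if (tainted.contains(from)) {
            tainted.add(to);
        }
    }

    // Event: a value reached a sink such as con.executeQuery(_); true means a full match.
    boolean onSink(Object value) {
        return tainted.contains(value);
    }

    public static void main(String[] args) {
        TaintMonitor monitor = new TaintMonitor();
        String param = new String("bob'--");      // stands in for req.getParameter(_)
        monitor.onSource(param);
        StringBuffer temp = new StringBuffer();
        temp.append(param);
        monitor.onDerive(param, temp);
        String derived = temp.toString();
        monitor.onDerive(temp, derived);
        // A full match is the point where recovery code would run instead of the query.
        System.out.println(monitor.onSink(derived));
    }
}
```

Identity-based tracking matters here because two distinct strings with equal contents must not share taint.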
35Experimental Results
36Benchmarks for Our Experiments
- Benchmark suite: Stanford SecuriBench
- Publicly available (Google: SecuriBench)
- Defining a Set of Common Benchmarks for Web Application Security, Benjamin Livshits, Workshop on Defining the State of the Art in Software Security Tools, 2005
- SecuriBench Micro is coming out soon
- Widely used programs
- Suite of 9 large open-source Java benchmark applications
- Most are blogging/bulletin board applications
- Installed at a variety of Web sites
- Thousands of users combined
- Applied our static and dynamic analyses to these applications
- Reused the same J2EE PQL query for all
- Statically: found bugs, measured false positives
- Dynamically: prevented bugs, measured the overhead
37Benchmark Statistics
- Real-life large open source Web applications
- Released as Stanford Securibench
38Classification of Errors (1)
39Classification of Errors (2)
40Classification of Errors (3)
- Total of 29 vulnerabilities found
- We are sound: all analysis versions report them
41Some Interesting Attack Vectors
- TRACE vulnerability in J2EE
- Found a vulnerability in J2EE sources
- Appears in four of our benchmarks
- Known as cross-site tracing attacks
- Session.find in Hibernate 2.0
- SQL injection vulnerability, found other similar ones
- Causes two application vulnerabilities
- Common situation: attack vectors in libraries should be removed or at least documented
42Validating the Vulnerabilities
- Reported issues back to program maintainers
- Most of them responded
- Most reported vulnerabilities confirmed as exploitable
- More than a dozen code fixes
- Often difficult to convince maintainers that a statically detected vulnerability is exploitable
- Had to convince some people by writing exploits
- Library maintainers blamed application writers for the vulnerabilities
43Low False Positive Rate
- Very high precision
- With context sensitivity + improved object naming
- Still have some false positives
- Only 12 false positives in 9 benchmark applications
- Have the same cause and can be fixed easily
- Slight modification of our object-naming scheme
- One-line change to the pointer analysis
- However, may still have false positives
- We ignore predicates, which may be important
- Better object naming may still be needed
- No disambiguation of objects in a container
- Finding Security Vulnerabilities in Java Applications with Static Analysis, Benjamin Livshits and Monica S. Lam, Usenix Security '05
44False Positives
Remaining 12 false positives for the most precise
analysis version
45Instrumented Executables
- Experimental confirmation
- Found and prevented errors in our experiments
- Blocked exploits at runtime (2 SQL injections)
- Naïve implementation
- Instrument every string operation
- Overhead is relatively high
- Use static information to narrow down the scope of instrumentation
- Overhead
- Unoptimized version: 9-125%, 57% average
- Optimized version: 1-37%, 14% average
- Static optimization removes 82-99% of instrumentation points
46Extensions
47Beyond The Basics
Basic Analysis Framework
- Precision: containers and factories (Usenix Security '05); object sensitivity, etc. (to be published)
- Completeness: reflection (APLAS '05); derivation methods (to be published)
- Specification discovery (FSE '05, to be published)
- bddbddb
- PQL
Our analysis finds all vulnerabilities in
statically analyzed code that are captured by
the specification
48Reflection
49The Issue of Reflection
- Most analyses for Java ignore reflection
- Fine approach for a while: SpecJVM hardly uses reflection at all
- Can no longer get away with this
- Reflection extremely common
- JBoss, Tomcat, Eclipse, etc. are reflection-based
- Same is true about Web apps: EJBs are entirely reflection-based
- Call graph is incomplete
- Code not analyzed => bugs are missed
- Ignoring reflection misses ½ of the application or more
50Reflection Resolution Algorithm
- Developed the first call graph construction algorithm to explicitly deal with the issue of reflection
- Uses points-to analysis for call graph discovery
- Finds specification points
- Type casts in program are used to reduce specification effort
- Applied to 6 large apps, 190,000 LOC combined
- About 95% of calls to Class.forName are resolved at least partially without any specs
- There are some stubborn calls that require user-provided specification or cast-based approximation
- Cast-based approach reduces the specification burden
- Reflection resolution significantly increases call graph size
- As much as 7X more methods
- Adds 7,000 new methods for some benchmarks
- Reflection Analysis for Java, Benjamin Livshits, John Whaley, and Monica S. Lam, APLAS '05
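A short example shows why a cast helps when the class name is not statically known. This is a sketch of the programming pattern only, not of the resolution algorithm; the class and method names (`ReflectionDemo`, `create`) are hypothetical, while `Class.forName`, `getDeclaredConstructor`, and `newInstance` are standard Java reflection calls.

```java
import java.util.List;

// Hypothetical demo of a reflective allocation followed by a cast.
final class ReflectionDemo {
    @SuppressWarnings("unchecked")
    static List<String> create(String className) {
        try {
            // The class name is just a string here, so the call target is not
            // visible in the code unless the analysis can resolve the string.
            Object o = Class.forName(className).getDeclaredConstructor().newInstance();
            // The cast bounds the possibilities: whatever was created must
            // implement List, or this line throws ClassCastException.
            return (List<String>) o;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot instantiate " + className, e);
        }
    }

    public static void main(String[] args) {
        List<String> list = create("java.util.ArrayList");
        list.add("ok");
        System.out.println(list);
    }
}
```

When points-to information for `className` is unavailable, an analysis can still approximate the created object by the subtypes of the cast target, which is the cast-based approximation mentioned above.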
51Derivation Routines
52Flow of Taint
sink
sanitizer
String.substring
source
source
StringBuffer.append
Security violation
How do we know what these are?
sink
53Finding Derivation Routines
- Many standard derivation routines such as
- String.toLowerCase(), String.replace(), String.insert(), String.substring(), String.concat()
- StringBuffer.append(), ...
- StringTokenizer.nextElement(), StringTokenizer.nextToken(), ...
- Many application-specific derivation routines as well
- Many methods that manipulate string values
- Involve low-level character operations
- Derivation routines can play different roles
- Depends on the analysis (sources, sinks)
- Some of them work as derivation routines
- Others are sanitizers
54Derivation Routines (1)
    public static String filterNewlines(String s) {
        if (s == null) {
            return null;
        }
        StringBuffer buf = new StringBuffer(s.length());
        // loop through characters and replace if necessary
        int length = s.length();
        for (int i = 0; i < length; i++) {
            switch (s.charAt(i)) {
                case '\n':
                    break;
                default:
                    buf.append(s.charAt(i));
            }
        }
        return buf.toString();
    }
55Derivation Routines (2)
    public static String removeNonAlphanumeric(String str) {
        StringBuffer ret = new StringBuffer(str.length());
        char[] testChars = str.toCharArray();
        for (int i = 0; i < testChars.length; i++) {
            // MR: Allow periods in page links
            if (Character.isLetterOrDigit(testChars[i])
                    || testChars[i] == '.') {
                ret.append(testChars[i]);
            }
        }
        return ret.toString();
    }
56Derivation Routines Statistics
- Developed a specialized analysis that computes method summaries
- return value depends on a parameter i
- parameter i depends on parameter j
- Deals with character assignment, etc.
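One way to picture a summary of the form "return value depends on parameter i" is as a set of parameter indices per method. The sketch below is an assumption about representation only; the slides do not show how summaries are stored or computed. The names (`SummaryDemo`, `SUMMARIES`, `returnTainted`) are hypothetical, and the two entries are written by hand from the code on the previous two slides, where the result is built from characters of parameter 0.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical representation of method summaries; not the authors' analysis.
final class SummaryDemo {
    // Method name -> indices of the parameters the return value depends on.
    static final Map<String, Set<Integer>> SUMMARIES = new HashMap<>();
    static {
        SUMMARIES.put("filterNewlines", Collections.singleton(0));
        SUMMARIES.put("removeNonAlphanumeric", Collections.singleton(0));
    }

    // The return value is treated as tainted if any parameter it depends on is tainted.
    // Methods without a summary are treated here as not propagating taint.
    static boolean returnTainted(String method, boolean[] argTainted) {
        for (int i : SUMMARIES.getOrDefault(method, Collections.emptySet())) {
            if (argTainted[i]) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(returnTainted("filterNewlines", new boolean[] {true}));
        System.out.println(returnTainted("filterNewlines", new boolean[] {false}));
    }
}
```

Whether such a routine is then treated as a plain derivation step or as a sanitizer is a separate decision that depends on the sources and sinks of the analysis, as noted on the Finding Derivation Routines slide.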
57Conclusions
58Conclusions
- Web application security is a huge problem
- SQL injections, cross-site scripting, etc. are dominating vulnerability reports
- Hybrid static + dynamic solution
- Static detection early in development cycle
- Dynamic exploit prevention and recovery
- Found several dozen bugs
- Most fixed by developers right away
- Prevent exploits at runtime
- Significant reduction in overhead with static information
- Working on analyzing more code
- Extensions (common to most bug finding tools)
- Reflection
- User-defined derivation descriptors
- Specification completeness
59Project Status
- Griffin Security Project
- http://suif.stanford.edu/~livshits/work/griffin/
- Stanford SecuriBench and Stanford SecuriBench Micro
- http://suif.stanford.edu/~livshits/securibench
- PQL language and dynamic instrumentation framework
- http://pql.sourceforge.net/
- bddbddb program analysis system
- http://bddbddb.sourceforge.net/
- joeq Java compiler infrastructure
- http://joeq.sourceforge.net/
60References (1)
- Publications
- Reflection Analysis for Java. Benjamin Livshits, John Whaley, and Monica S. Lam. Presented at the Third Asian Symposium on Programming Languages and Systems, Tsukuba, Japan, November 2005.
- Finding Application Errors and Security Flaws Using PQL: a Program Query Language. Michael Martin, Benjamin Livshits, and Monica S. Lam. Presented at the 20th Annual ACM Conference on Object-Oriented Programming, Systems, Languages, and Applications, San Diego, California, October 2005.
- DynaMine: Finding Common Error Patterns by Mining Software Revision Histories. Benjamin Livshits and Thomas Zimmermann. Presented at the ACM SIGSOFT Symposium on the Foundations of Software Engineering (FSE 2005), Lisbon, Portugal, September 2005.
- Defining a Set of Common Benchmarks for Web Application Security. Benjamin Livshits. Position paper on Stanford SecuriBench for the Workshop on Defining the State of the Art in Software Security Tools, Baltimore, August 2005.
- Finding Security Vulnerabilities in Java Applications with Static Analysis. Benjamin Livshits and Monica S. Lam. In Proceedings of the Usenix Security Symposium, Baltimore, Maryland, August 2005.
- Locating Matching Method Calls by Mining Revision History Data. Benjamin Livshits and Thomas Zimmermann. In Proceedings of the Workshop on the Evaluation of Software Defect Detection Tools, Chicago, Illinois, June 2005.
61References (2)
- Context-Sensitive Program Analysis as Database Queries. Monica S. Lam, John Whaley, Benjamin Livshits, Michael Martin, Dzintars Avots, Michael Carbin, Christopher Unkel. In Proceedings of Principles of Database Systems (PODS), Baltimore, Maryland, June 2005.
- Improving Software Security with a C Pointer Analysis. Dzintars Avots, Michael Dalton, Benjamin Livshits, Monica S. Lam. In Proceedings of the 27th International Conference on Software Engineering (ICSE), May 2005.
- Turning Eclipse Against Itself: Finding Bugs in Eclipse Code Using Lightweight Static Analysis. Benjamin Livshits. In Proceedings of the Eclipsecon Research Exchange Workshop, March 2005.
- Finding Security Errors in Java Applications Using Lightweight Static Analysis. Benjamin Livshits. In Annual Computer Security Applications Conference, Work-in-Progress Report, November 2004.
- Tracking Pointers with Path and Context Sensitivity for Bug Detection in C Programs. Benjamin Livshits and Monica S. Lam. In Proceedings of the 11th ACM SIGSOFT International Symposium on the Foundations of Software Engineering (FSE-11), September 2003.
- Technical Reports
- Reflection Analysis for Java. Benjamin Livshits, John Whaley, and Monica S. Lam.
- Turning Eclipse Against Itself: Improving the Quality of Eclipse Plugins. Benjamin Livshits.
- Finding Security Vulnerabilities in Java Applications with Static Analysis.
62Future Paper Plans
- Analyzing Sanitization Routines
- Learning Specification from Runtime Histories
- Analyze Sources of Imprecision in Datalog
- Serialization or Cloning Analysis
- Analysis of Parsers Written in Java
- Partitioned BDDs to Enhance Scalability of bddbddb
- Attack Vectors in Library Code
- Using Model Checking to Break Sanitizers
- Applying Model-checking to Servlet Interaction
63The End.
Benjamin Livshits, John Whaley, and Monica S. Lam
64Frequency of Vulnerabilities
65Web Application Security Space
automatic code scanning tools
manual code reviews
black-box testing solutions
application firewalls
manual penetration testing
client-side protection
66A More Complete Description (1)
    query main()
    returns
        object Object sourceObj, sinkObj;
    matches {
        sourceObj := source();
        sinkObj := derived(sourceObj);
        sinkObj := sink();
    }

    query derived(object Object x)
    returns
        object Object y;
    uses
        object Object temp;
    matches {
        y := x
      | temp := derived(x); y := derived(temp);
    }
67A More Complete Description (2)
    query source()
    returns
        object Object sourceObj;
    uses
        object String[] sourceArray;
        object HttpServletRequest req;
    matches {
        sourceObj = req.getParameter(_)
      | sourceObj = req.getHeader(_)
      | sourceArray = req.getParameterValues(_); sourceObj = sourceArray[]
      | ...
    }
68A More Complete Description (3)
    query sink()
    returns
        object Object sinkObj;
    uses
        object java.sql.Statement stmt;
        object java.sql.Connection con;
    matches {
        stmt.executeQuery(sinkObj)
      | stmt.execute(sinkObj)
      | con.prepareStatement(sinkObj)
      | ...
    }
69A More Complete Description (4)
    query derived(object Object x)
    returns
        object Object y;
    matches {
        y.append(x)
      | y = _.append(x)
      | y = new String(x)
      | y = new StringBuffer(x)
      | y = x.toString()
      | y = x.substring(_, _)
      | y = x.toString(_)
      | ...
    }
70Cost of Web App Security Breaches
- Average $303K financial loss from unauthorized access
- Average $355K financial loss from theft of proprietary info
- Estimated 400B USD/year total cost of online fraud and abuse
- Source: Computer Security Institute survey
- Source: US Department of Justice report
71Other Extensions
- Using machine learning techniques to complete the PQL specification
- It's difficult to get a specification that's 100% complete
- If it's not, some bugs are missed
- DynaMine: Finding Common Error Patterns by Mining Software Revision Histories, Ben Livshits and Thomas Zimmermann, FSE '05
- Especially true with custom sanitization routines
- Partitioned BDDs for better scalability
- Higher precision requirements push scalability limits of bddbddb tool
- One hope is to use Partitioned BDDs (POBDDs) to scale the problem better
- Applying model-checking to servlet interaction
- Our analysis relies on a harness that we automatically generate
- Only finds bugs that appear within a single interaction