Finding Security Vulnerabilities in Java Applications with Static Analysis (USENIX Security 2005)

1
Finding Security Vulnerabilities in Java
Applications with Static Analysis USENIX
Security 2005
  • Authors: V. Benjamin Livshits and Monica S. Lam

Presented in UIUC CS527 (Fall 07) by Matt Stockton
2
Introduction / Motivation
  • Many web applications create, delete, update, and
    display sensitive information that has financial
    value to hackers
  • Many of these web applications are vulnerable to
    attacks (Imperva Application Defense Center
    Study)
  • Attacks on web applications are expensive to deal
    with after the fact (litigation, lost proprietary
    information, lost customer information, etc.)
  • The most common means of discovering web
    application vulnerabilities before application
    deployment are also expensive.

How can we solve this dilemma?
3
Static Analysis Overview
  • Definition: "the analysis of computer software that is
    performed without actually executing programs built from
    that software" (Wikipedia)
  • Many static analysis tools exist for analyzing C/C++ code.
    These look for buffer overflows, format string
    vulnerabilities, etc.
  • Java language safety features prevent direct memory access,
    so static analysis is not as necessary. Or is it?
  • Even with automatic memory management, Java applications are
    still exploitable, although the attack vector is quite
    different.
  • The paper presents a technique for identifying these attack
    vectors using static analysis.
4
Unchecked User Input
  • Unchecked user input is the source of most web
    application vulnerabilities
  • There are many ways to send data to a web application. Web
    programmers make assumptions about what data can be sent and
    how this data will be formatted. When those assumptions are
    wrong, there is an opportunity for attack.
  • Attackers must fulfill two goals to exploit unchecked input
    vulnerabilities:
      • Inject malicious data into a web application
      • Manipulate the web application using the malicious data

5
Injecting Malicious Data
  • Parameter Tampering: Enter maliciously formed data into HTML
    forms
  • URL Tampering: Directly edit the URL string (usually
    modifying an HTTP GET request after form submission)
  • Hidden Field Manipulation: Web sites sometimes use hidden
    form fields for persistence. An attacker can manually change
    the values
  • HTTP Header Manipulation: Free tools allow you to intercept
    browser requests and change HTTP headers
  • Cookie Poisoning: Manually modify web site cookies stored on
    a computer
  • Non-Web Input Sources: Modify command-line parameters sent
    to web application management scripts
6
Manipulating apps with unchecked data
  • SQL Injection: Use input to generate SQL queries that will
    leak information from the database, or perform a malicious
    insert, update, or deletion.
    Example username: Matt' OR 1=1 --
  • Cross-Site Scripting: User-controlled content that the web
    application displays without any filtering (e.g. load a
    JavaScript library, send cookie information to another
    domain)
  • HTTP Response Splitting: User-controlled content that the
    web application uses in the HTTP response header. The web
    application could send multiple responses or corrupt a proxy
    cache
  • Path Traversal: Crafted user input allows the user to read /
    write / update files that shouldn't be accessible.
    Example file to delete: Matt/../../../etc/passwd
  • Command Injection: Force the web application to execute a
    command it shouldn't be executing
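
The two example inputs on this slide can be tried against a minimal sketch of the vulnerable patterns. This is only an illustration: the `users` table, the `/var/app/files` base directory, and the class and method names are hypothetical, and the username is assumed to have the form `Matt' OR 1=1 --`. No database or file is touched; the code only shows what string or path the application would end up using.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

class InjectionDemo {
    // Builds a query by string concatenation, which is the vulnerable pattern.
    // The table and column names are made up for this sketch.
    static String buildQuery(String userName) {
        return "SELECT * FROM users WHERE name = '" + userName + "'";
    }

    // Resolves a user-supplied file name against a base directory
    // and collapses any ".." segments.
    static Path resolve(String baseDir, String fileName) {
        return Paths.get(baseDir).resolve(fileName).normalize();
    }

    // Returns true when the resolved path is still inside the base directory.
    static boolean staysInside(String baseDir, String fileName) {
        return resolve(baseDir, fileName).startsWith(Paths.get(baseDir).normalize());
    }

    public static void main(String[] args) {
        // The quote in the input closes the string literal, and OR 1=1 matches every row.
        System.out.println(buildQuery("Matt' OR 1=1 --"));
        // The ".." segments climb out of the base directory.
        System.out.println(resolve("/var/app/files", "Matt/../../../etc/passwd"));
        System.out.println(staysInside("/var/app/files", "Matt/../../../etc/passwd"));
    }
}
```

In both cases the application code is unchanged; only the data differs, which is why these problems are framed as unchecked-input vulnerabilities.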
7
Usual web app security analysis
  • Several manual techniques are prevalent for web app security
    analysis:
      • Source code audits by security professionals / white
        box analysis
      • Penetration testing / black box analysis
  • Shortcomings of manual techniques include:
      • Time / cost associated with the investigation
      • Coverage / precision of the investigation
      • If the investigation causes code changes, the changes
        may need to be re-audited


Need a less costly, more automated process for
this type of auditing. Primary motivation for
this paper.
8
Static Analysis in more detail
  • Analyze the code without actually running the application
  • Use different algorithms to analyze the code to find errors.
    The algorithms span a wide range of complexity
  • If source code is not available, bytecode can be used to
    perform the static analysis (the technique used in this
    tool)
  • Basic premise: give a static analysis tool something to look
    for (some type of pattern). If the tool finds a match, it
    will note the match
  • Simple example: grep
  • Complex example: this tool
  • General goals for static analysis: soundness, precision,
    scalability

9
Contributions of the paper
  • Proposes a methodology and tool to detect a diverse set of
    common web application vulnerabilities
  • Improves the precision of the tool by using fully
    context-sensitive pointer analysis (fewer false positives)
  • Delivers an actual implementation of the idea, built as an
    Eclipse plug-in:
    http://suif.stanford.edu/~livshits/work/lapse/
  • Validates the methodology against real web applications.
    Found real errors, with few false positives

10
Data as an attack vector: Tainted Objects
  • How does data propagate from data the user controls (user
    input) to data the application uses (e.g. a SQL query)?
  • We can model this using tainted object propagation, which is
    composed of three parts:
      • Source Descriptors: How user-provided data can enter a
        web application
      • Sink Descriptors: Potentially unsafe ways that data can
        be used in a program (e.g. if the data is tainted)
      • Derivation Descriptors: How tainted objects can be
        manipulated and still remain tainted in the application
        (e.g. which methods can be called on a tainted object,
        or can take the tainted object as an argument, to
        create another tainted object or keep the object in a
        tainted state)


Tainted Object Propagation: Modeling object flow through an
application
11
Tainted Object Propagation Descriptions
Source descriptor example:
    <HttpServletRequest.getParameter(String), -1, ε>
Sink descriptor example:
    <Connection.executeQuery(String), 1, ε>
Derivation descriptor examples:
    <StringBuffer.append(String), 1, ε, -1, ε>
    <StringBuffer.toString(), 0, ε, -1, ε>
In each descriptor the number selects a parameter of the method:
-1 is the return value, 0 is the receiver object, and 1 is the
first argument. ε is an empty access path, meaning the object
itself rather than one of its fields.
Using the descriptors, you can theoretically find all sources
and sinks in the code, and can determine when a sink uses an
object from a source that is still tainted after manipulation
under the derivation descriptor rules.

12
Tainted Object Security Violation
  • A security violation occurs when:
      • A source object is tainted (given the rules for source
        descriptors)
      • From this tainted source object, derivations are
        performed in the code that produce another tainted
        object, or keep the current object in the tainted state
        (based on the derivation descriptors)
      • The tainted object is used in a sink (based on the sink
        descriptors)
  • When the above steps occur, user-controlled input is being
    used by the application in a potentially vulnerable /
    exploitable way.
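
The three conditions above can be sketched as a toy checker over a straight-line list of calls. This is not the paper's analysis, which works on points-to results for whole programs; it tracks variable names only, and the `Call` and `Derivation` classes, the rule tables, and the example program are all assumptions made for illustration. Slot numbers follow the descriptor examples: -1 is the return value, 0 the receiver, 1 the first argument.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class TaintSketch {
    // One call in a straight-line program. slots[0] is the receiver, slots[1..n]
    // are the arguments, and ret names the variable holding the result (slot -1).
    static final class Call {
        final String method;
        final String ret;
        final String[] slots;

        Call(String method, String ret, String... slots) {
            this.method = method;
            this.ret = ret;
            this.slots = slots;
        }

        String at(int slot) {
            return slot == -1 ? ret : slots[slot];
        }
    }

    // Derivation rule: if the `from` slot is tainted, the `to` slot becomes tainted.
    static final class Derivation {
        final String method;
        final int from;
        final int to;

        Derivation(String method, int from, int to) {
            this.method = method;
            this.from = from;
            this.to = to;
        }
    }

    // Hand-written rules for this sketch only: one source, one sink, two derivations.
    static final String SOURCE = "HttpServletRequest.getParameter"; // taints slot -1
    static final String SINK = "Connection.executeQuery";           // checks slot 1
    static final List<Derivation> DERIVATIONS = Arrays.asList(
            new Derivation("StringBuffer.append", 1, -1),
            new Derivation("StringBuffer.toString", 0, -1));

    // Walks the calls in order and reports every sink call that receives a tainted variable.
    static List<String> findViolations(List<Call> program) {
        Set<String> tainted = new HashSet<String>();
        List<String> violations = new ArrayList<String>();
        for (Call c : program) {
            if (c.method.equals(SOURCE)) {
                tainted.add(c.at(-1));                       // condition 1: source
            }
            for (Derivation d : DERIVATIONS) {
                if (c.method.equals(d.method) && tainted.contains(c.at(d.from))) {
                    tainted.add(c.at(d.to));                 // condition 2: derivation
                }
            }
            if (c.method.equals(SINK) && tainted.contains(c.at(1))) {
                violations.add(c.method + "(" + c.at(1) + ")"); // condition 3: sink
            }
        }
        return violations;
    }

    public static void main(String[] args) {
        List<Call> program = Arrays.asList(
                new Call(SOURCE, "param", "req", "\"user\""),
                // append returns its receiver, so the result variable is buf itself
                new Call("StringBuffer.append", "buf", "buf", "param"),
                new Call("StringBuffer.toString", "query", "buf"),
                new Call(SINK, "rs", "con", "query"));
        System.out.println(findViolations(program));
    }
}
```

Tracking names instead of heap objects is exactly what breaks down once two variables can refer to the same object, which is the problem the points-to analysis addresses.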


13
Generating Sources, Sinks, and Derivations
  • Generating the rules for sources, sinks, and derivations is a
    manual process. Without 100% coverage of all sources, sinks,
    and derivations, the model is incomplete and can miss
    vulnerabilities!
  • For this tool, J2EE APIs were evaluated to generate sources
    and sinks, and Java String manipulation libraries were
    evaluated to generate derivation descriptors
  • What if something is missing? Definitely a possibility. This
    tool used some additional static analysis to pinpoint
    tainted sources that were never passed to a method listed in
    the derivation descriptors. This found additional derivation
    descriptors
  • Concern: What if the source is written to a file and used
    later by the application? This derivation cannot be covered
    through String manipulation descriptors

14
What about object references?
  • To have sound static analysis, your tool needs to track which
    object references (program variables) point to tainted
    objects (on the heap)
  • In a naïve implementation, to maintain soundness, you could
    end up with a very large number of potentially tainted
    object references if you do not perform a good points-to
    analysis

Example: Are buf1 and buf2 both tainted? Is this a violation?
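
A minimal sketch of the aliasing question, under assumed names: user input is appended to `buf1`, but the query is read from `buf2`. Whether the query is tainted depends entirely on whether the two variables refer to the same heap object. The `aliased` flag, the query text, and the class name are hypothetical and exist only to make both cases runnable.

```java
class AliasDemo {
    // Appends user input to buf1 only, then returns the contents of buf2.
    // When aliased is true, buf2 refers to the same heap object as buf1.
    static String queryFromBuf2(String userInput, boolean aliased) {
        StringBuffer buf1 = new StringBuffer("SELECT * FROM users WHERE name = '");
        StringBuffer buf2 = aliased ? buf1 : new StringBuffer("SELECT COUNT(*) FROM users");
        buf1.append(userInput).append("'");   // taints the object buf1 points to
        return buf2.toString();               // tainted only if buf2 aliases buf1
    }

    public static void main(String[] args) {
        System.out.println(queryFromBuf2("Matt", true));   // user input reaches the query
        System.out.println(queryFromBuf2("Matt", false));  // user input does not reach it
    }
}
```

A sound analysis that cannot rule out the alias has to treat `buf2` as tainted in both cases, which is where imprecise points-to information turns into false positives.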
15
Points-to analysis
  • Solve the tainted object scalability problem using
    approximation with static object names
  • Do not want to miss potential pointers to a tainted object,
    but at the same time, if you do not do any bounding, you end
    up with a huge number of potentially tainted objects
  • The tool uses a context-sensitive Java points-to analysis
    developed by Whaley and Lam
  • Uses Binary Decision Diagrams (BDDs) to represent points-to
    results for multiple execution contexts in a program


Not many technical details on the BDD method, but this
essentially allows the tool to perform context-sensitive static
analysis to reduce the set of objects that could be tainted.
NOTE: Exact points-to analysis is an undecidable problem. We
need a conservative estimate that is still sound (doesn't miss
any tainted objects).
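
Why context sensitivity reduces the tainted set can be shown with a deliberately tiny simulation. It is not the BDD-based algorithm of Whaley and Lam; it only contrasts one merged summary per method with one result per call site, for a hypothetical helper `id(String s) { return s; }` that is called once with tainted input and once with a constant. All names here are made up for the sketch.

```java
import java.util.Arrays;

class ContextDemo {
    // argTaintedAtSite[i] says whether the argument passed to id() at call site i
    // is tainted. The result says, per call site, whether the analysis considers
    // the value returned by id() tainted.
    static boolean[] analyze(boolean[] argTaintedAtSite, boolean contextSensitive) {
        // Context-insensitive view: a single summary shared by all callers.
        boolean merged = false;
        for (boolean t : argTaintedAtSite) {
            merged |= t;
        }
        boolean[] result = new boolean[argTaintedAtSite.length];
        for (int i = 0; i < result.length; i++) {
            // Context-sensitive view: each call site keeps its own answer.
            result[i] = contextSensitive ? argTaintedAtSite[i] : merged;
        }
        return result;
    }

    public static void main(String[] args) {
        boolean[] sites = { true, false }; // site 0: user input, site 1: constant
        System.out.println(Arrays.toString(analyze(sites, false)));
        System.out.println(Arrays.toString(analyze(sites, true)));
    }
}
```

Both variants are sound, since neither misses the tainted call site. The insensitive one also flags the constant call, which would surface as a false positive if that result reached a sink.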
16
Additional Claims Of Novelty
  • Sound and precise context-sensitive points-to analysis,
    reducing the tainted object space
  • Further reduction of the tainted object space through a
    clever way of handling containers: the analysis can
    identify / name the underlying structure of the container
  • Object naming for String manipulation methods: introduced
    logic to name Strings produced by String manipulation
    methods, further reducing the tainted object space

17
Programmatic representation of descriptions
  • Source, sink, and derivation descriptions can be created
    using Program Query Language (PQL)
  • PQL: a Java-like language that can be used to describe a
    sequence of dynamic events involving variables that refer to
    object instances
  • Two main PQL statements define the framework that is used to
    find security violations. The user must then define
    source(), derived(), and sink()

18
PQL Example: SQL Injection

Fairly simple to understand the definitions for
source, sink, and derived.
19
Evaluation / Experimental Results
  • Tested against 8 large open-source web applications
  • Created a set of source, sink, and derivation descriptors
    (derivation focused on the String, StringBuffer, and
    StringTokenizer classes)
  • Four combinations of testing (with/without context
    sensitivity, with/without improved object naming)
  • Recorded a total of 41 potential security violations. 29
    turned out to be security errors, and 12 were false
    positives
  • More precise with both context sensitivity and improved
    naming enabled (and actually faster execution time)
  • Found two errors in common library code (J2EE and Hibernate)
  • Almost all errors were confirmed by the application
    developers, resulting in code fixes

20
Errors Discovered
  • Parameter manipulation to perform HTTP response splitting
    was the most prevalent attack vector
  • Browser redirect attacks based on user-entered data (the
    HTTP referrer field was modified)
  • SQL injection vector in Hibernate library code

False Positives
  • All due to a missing object naming rule for
    StringWriter.toString()
  • Once this was added to the naming rules, there were no false
    positives
21
Shortcomings
  • Input validation / control flow is not handled. If the
    application does some parameter validation, this tool will
    not take that into account
  • Source / sink / derivation descriptions need to be manually
    created and potentially updated. J2EE sources / sinks and
    String library descriptors cover a lot, but are there more?
  • Need to manually tune the object naming rules to minimize
    false positives
  • Can you think of other paths not covered by the
    implementation? Example: user input gets stored to a file,
    then read in later and used in a sink
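
The first shortcoming can be illustrated with a hedged example. The method below rejects anything but a short alphanumeric name before building the query, so the injection string from earlier never reaches the sink. The check is pure control flow and does not create a new object, so an analysis that only follows object derivations would presumably still report a flow from the parameter to the query. The pattern, table name, and class name are hypothetical.

```java
class ValidationDemo {
    // Validates the user name against a whitelist pattern, then builds the query
    // by concatenation. The same String object flows from input to query.
    static String buildQuery(String userName) {
        if (!userName.matches("[A-Za-z0-9_]{1,32}")) {
            throw new IllegalArgumentException("invalid user name");
        }
        return "SELECT * FROM users WHERE name = '" + userName + "'";
    }

    public static void main(String[] args) {
        System.out.println(buildQuery("Matt"));
        try {
            buildQuery("Matt' OR 1=1 --");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Reports like this one have to be triaged by hand, which adds to the manual work already needed for writing descriptors.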

22
Other Techniques used in Practice
  • Penetration Testing: Black box and white box. Depending on
    the effort, may only catch a small sample of security risks.
    Will not identify parts of the system that remain untested
  • Runtime Monitoring: Pattern matching of HTTP requests at
    runtime by a proxy. Whitelist of good inputs and/or
    blacklist of bad inputs. Protection against errors already
    manifested in the application
  • Protection at levels other than the application (e.g. Oracle
    virtual private databases to minimize the amount of data
    available to the application)

23
Conclusions
  • This paper proposes applying tainted object propagation
    techniques to Java web applications, and presents a tool
    implemented as an Eclipse plug-in
  • The proposed technique maintains static analysis soundness,
    and increases scalability and precision with
    context-sensitive pointer analysis and object naming
  • Improved object naming by modifying naming for containers
    and Strings
  • Seems like a good tool requiring minimal manual integration
    work to use as an additional mechanism to measure your web
    application's security

24
LAPSE Tool

<source id="javax.servlet.ServletRequest.getParameterMap()">
  <category>Parameter tampering</category>
</source>
<source id="javax.servlet.ServletRequest.getParameterNames()">
  <category>Parameter tampering</category>
</source>
25
Initial Student Feedback
  • Shortcomings: Weak analysis(?), manually creating PQL
    descriptors
  • Can we use this with other languages (.NET, RoR, SPs)?
  • Do people actually use PQL? (http://pql.sourceforge.net/)