Software Forensics - PowerPoint PPT Presentation

1
Software Forensics
"I just want to say LOVE YOU SAN!!" (Lovsan virus)
  • Michael Price
  • CS 691.001 Fall 2004
  • Software Forensics
  • September 23, 2004

2
Presentation Outline
  • Definitions
  • Basic Ideas
  • Authorship Analysis
  • Executable Examination
  • Source Code Examination
  • Evasion Methods
  • Common Usages
  • Summary / Questions
  • References

3
Definitions
  • Linguistics
    • The study of the nature, structure, and variation
      of language, including phonetics, phonology,
      morphology, syntax, semantics, sociolinguistics,
      and pragmatics.
  • Software Metrics
    • A set of repeatable measurements of certain
      aspects of a piece of software.
  • Software Forensics - Two competing definitions.
    • The examination of failed software to determine
      the cause of failure. See
      http://www.cs.mdx.ac.uk/research/SFC/ .
    • The examination of software in an effort to
      classify its function, identify its author(s),
      etc.

4
Basic Ideas
  • When programmers program, they unwittingly
    (or perhaps not) leave fingerprints in the
    content, structure, style, and other elements
    of their code that can be used to correctly
    identify the author(s) at a later time.
  • When programmers compile, the tools they use
    leave fingerprints in the resulting executable
    code that can be used to correctly identify those
    tools and the environment in which they were used.

5
Authorship Analysis
  • Author Discrimination
    • Given some set of programs that are similar in
      function, can we discriminate between different
      authors, and can we say with high confidence that
      any subset was produced by the same author(s)?
  • Author Characterization
    • Given a set of programs produced by the same
      author(s), can we deduce certain information
      about the author(s)?
  • Author Identification
    • Given a set of programs known to have been
      produced by particular author(s), can we say
      with high confidence that a different program was
      written by the same author(s)?
  • Author Intent
    • Given a set of programs, can we determine the
      (possibly hidden) intent of the author(s)?
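
The identification question above can be sketched as a nearest-profile comparison. This is only a toy illustration, not a method taken from the talk: the three features (comment ratio, scaled lines per function, globals-to-locals ratio), the author labels, and every number below are hypothetical. A real study would use many more metrics and a statistical test before claiming "high confidence".

```python
import math

# Hypothetical style vectors per known author:
# [comment ratio, mean lines per function / 100, globals-to-locals ratio]
KNOWN = {
    "author_a": [[0.30, 0.12, 0.05], [0.28, 0.15, 0.07]],
    "author_b": [[0.05, 0.40, 0.30], [0.08, 0.35, 0.25]],
}


def centroid(vectors):
    """Average the feature vectors of one author, column by column."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def closest_author(known, unknown):
    """Return the author whose average profile is nearest to `unknown`."""
    profiles = {author: centroid(vectors) for author, vectors in known.items()}
    return min(profiles, key=lambda author: distance(profiles[author], unknown))
```

Calling `closest_author(KNOWN, [0.27, 0.14, 0.06])` picks the nearest profile, but a nearest match is not proof of authorship: the closest known author may still be the wrong one if the true author is not in the set.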

6
Executable Examination
  • Data Structures
    • Linked-lists vs. Red-Black Trees (for searching)
    • Array access (Index vs. Pointer Arithmetic)
  • Algorithms
    • Bubble Sort vs. Merge Sort
  • Compiler Information
    • Instruction Ordering / Optimizations
  • System Information
    • Unique system calls
  • Author Expertise
    • Recursion? Error-handling? Re-invention?
  • System Calls
    • Preference for one over another?
  • Errors
    • Consistent?
  • Debug Symbols
    • Were they left?
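
One low-effort way to start on the compiler and debug-symbol questions is to pull printable strings out of the binary, much as the Unix `strings` tool does. The sketch below is an assumption-laden illustration rather than a technique from the talk: the `tool_hints` heuristics (a GCC identification string, leftover source file names) and the sample bytes are hypothetical, and a real examination would parse the executable format properly.

```python
import re


def printable_strings(data: bytes, min_len: int = 4) -> list:
    """Return runs of printable ASCII at least `min_len` bytes long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]


def tool_hints(strings):
    """Keep strings that hint at the build environment (heuristic only)."""
    source_suffixes = (".c", ".cpp", ".h")
    return [
        s for s in strings
        if s.startswith("GCC: ") or s.endswith(source_suffixes)
    ]


# Hypothetical bytes standing in for part of an executable.
SAMPLE_BLOB = (
    b"\x7fELF\x02\x01\x00"
    b"GCC: (GNU) 4.8.5\x00"
    b"\x01\x02main.c\x00\xff\xfeab\x00"
)
```

Running `tool_hints(printable_strings(SAMPLE_BLOB))` keeps the compiler banner and the source file name. Both are easy for an author to strip or forge, so they are leads rather than proof.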

7
Source Code Examination
  • Programming Language Choice
    • Functional vs OO
    • Compiled vs Interpreted
  • Formatting (often language specific)
    • C - Indentation, etc.
  • Editor of Choice
    • Can often leave tell-tale marks in the source
  • Special Features
    • Pragmas (?)
  • Comments
    • Style, Frequency, Detail
  • Variable Names
    • Descriptive, Vulgar, Utility
  • Spelling and Grammar
    • Often consistent
  • Choice of language features used
    • while vs do-while vs for vs goto
  • Scoping
    • Ratio of Globals to Locals
  • Execution Paths
    • Unexecutable paths
  • Flaws / Bugs
    • Some programmers never learn
  • Metrics
    • Lines per function, several others
  • See Executable Examination for more
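
Several of the items above (comments, formatting, choice of loop construct) can be turned into repeatable measurements. The following is a minimal sketch for C-like source, assuming crude line-based rules: a comment line is one that starts with `//` or `/*`, and keywords are counted by whole-word match on the remaining lines. The function name and the sample snippet are hypothetical.

```python
import re

LOOP_KEYWORDS = ("while", "do", "for", "goto")


def style_metrics(source: str) -> dict:
    """Compute a few simple, repeatable style metrics for C-like source."""
    nonblank = [line for line in source.splitlines() if line.strip()]
    comments = [ln for ln in nonblank if ln.strip().startswith(("//", "/*"))]
    code = [ln for ln in nonblank if ln not in comments]
    code_text = "\n".join(code)
    return {
        "nonblank_lines": len(nonblank),
        "comment_lines": len(comments),
        "comment_ratio": len(comments) / len(nonblank) if nonblank else 0.0,
        # Tabs vs. spaces is a formatting habit that often survives edits.
        "tab_indented": sum(1 for ln in nonblank if ln.startswith("\t")),
        "space_indented": sum(1 for ln in nonblank if ln.startswith(" ")),
        # Note: "while" also counts the tail of each do-while loop.
        "loop_keywords": {
            kw: len(re.findall(rf"\b{kw}\b", code_text)) for kw in LOOP_KEYWORDS
        },
    }


# Hypothetical snippet used only to exercise the function.
SAMPLE_SOURCE = (
    "// sum an array\n"
    "int sum(int *a, int n) {\n"
    "\tint s = 0;\n"
    "\tfor (int i = 0; i < n; i++)\n"
    "\t\ts += a[i];\n"
    "\treturn s;\n"
    "}\n"
)
```

Metrics like these only become evidence when compared across many samples from the same author; a single file says little.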

8
Evasion Methods
  • How can we trick software forensics?
    • Be extremely knowledgeable, completely aware, and
      somewhat paranoid.
    • Run source and executables through obfuscators so
      that all of the fingerprints we left are wiped
      away.
  • But can we obfuscate everything we leave behind?

9
Common Usages (or maybe just possible ones)
  • Malicious code analysis
  • Computer fraud
  • Plagiarism (MOSS system)
  • Patent lawsuits
  • Programmer auditing
  • Others?
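
For the plagiarism case, the Winnowing paper listed in the References describes the document-fingerprinting idea associated with MOSS: hash every k-gram of the normalized text, slide a window over the hashes, and keep the minimum of each window (the rightmost one on ties). The sketch below follows that outline under simplifying assumptions: it uses `zlib.crc32` instead of a rolling hash, the `k` and `w` values are arbitrary, and the Jaccard comparison in `similarity` is an illustrative choice rather than how MOSS reports matches.

```python
import zlib


def kgram_hashes(text: str, k: int) -> list:
    """Hash every k-gram of the text after removing whitespace and case."""
    cleaned = "".join(ch.lower() for ch in text if not ch.isspace())
    return [
        zlib.crc32(cleaned[i:i + k].encode("utf-8"))
        for i in range(len(cleaned) - k + 1)
    ]


def winnow(text: str, k: int = 5, w: int = 4) -> set:
    """Select (hash, position) fingerprints: the minimum of each window."""
    hashes = kgram_hashes(text, k)
    selected = set()
    for start in range(len(hashes) - w + 1):
        window = hashes[start:start + w]
        smallest = min(window)
        # On ties, keep the rightmost occurrence of the minimum.
        offset = w - 1 - window[::-1].index(smallest)
        selected.add((smallest, start + offset))
    return selected


def similarity(a: str, b: str, k: int = 5, w: int = 4) -> float:
    """Jaccard overlap of the two fingerprint hash sets (0.0 if too short)."""
    hashes_a = {h for h, _ in winnow(a, k, w)}
    hashes_b = {h for h, _ in winnow(b, k, w)}
    if not hashes_a or not hashes_b:
        return 0.0
    return len(hashes_a & hashes_b) / len(hashes_a | hashes_b)
```

Because whitespace and case are dropped before hashing, `similarity("int sum = a + b;", "INT  SUM=a+b ;")` treats the two lines as identical. Renaming identifiers would still defeat this sketch unless the text is tokenized first.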

10
Summary / Questions
  • Software Forensics borders linguistics, software
    engineering, and criminology.
  • Programmers unwittingly leave evidence of their
    authorship.
  • Authorship analysis asks four questions: Discrimination,
    Characterization, Identification, and Intent.
  • Perhaps, if clever enough, blackhats can fool the
    whitehats.
  • Questions/Comments/Ideas/Complaints?

11
References
  • Andrew Gray, Philip Sallis, and Stephen
    MacDonell. Software Forensics: Extending
    Authorship Analysis Techniques to Computer
    Programs. In Proceedings of the 3rd Biannual
    Conference of the International Association of
    Forensic Linguists. Durham, NC, 1997.
  • Saul Schleimer, Daniel S. Wilkerson, and Alex
    Aiken. Winnowing: Local Algorithms for Document
    Fingerprinting. SIGMOD 2003. San Diego, CA,
    June 9-12, 2003. ACM Press.
  • Robert M. Slade. Software Forensics: Collecting
    Evidence from the Scene of a Digital Crime. New
    York: McGraw-Hill, 2004.
  • E.H. Spafford and S.A. Weeber. Software
    Forensics: Can We Track Code to its Authors?
    1992.