Software Reverse Engineering Education

Slides: 125
Provided by: IBMU483
Learn more at: http://reversingproject.info

1
Software Reverse Engineering Education
http://www.reversingproject.info
  • Teodoro Cipresso, tcipress_at_hotmail.com
  • San José State University, Spring 2009
  • Advisor: Dr. Mark Stamp
  • Committee: Dr. Robert Chun, Dr. David Taylor

2
Background Information
Introduction to Software Reverse Engineering
  • Software Reverse Engineering (SRE) can be
    described as the practice of analyzing a software
    system to create abstractions that identify the
    individual components and their dependencies,
    and, if possible, the overall system architecture
    [1].
  • Once the components and design of an existing
    system have been recovered, it becomes possible
    to repair and even enhance them.
  • Reverse engineering skills are also used to
    detect and neutralize viruses, worms and other
    malware, as well as to protect intellectual
    property [1].

3
Background Information (contd)
Importance of SRE Education
  • More emphasis is needed in SE and CS
    undergraduate and graduate programs on the issue
    of software evolution and change. Students need
    to be educated on the theory and practice of
    software comprehension, maintenance and
    reengineering. They need to learn how to live
    with the monsters from the past and tame them
    [2].
  • Most of the time, students are trained in
    developing very small programs starting from
    scratch. This approach is really misleading since
    most students learn to believe that software
    engineering is just about developing brand new
    software. In fact many students will be involved
    in evolution-related activities after completion
    of their studies [3].

4
Background Information (contd)
Student Feedback on SRE Education
  • Incorporation of software reverse engineering
    techniques and methodologies into regular course
    work was tried at the University of
    Missouri-Rolla [1].
  • The results of this experiment were quite
    positive:
  • 77% of students thought that the incorporation of
    SRE techniques and methodologies reinforced
    concepts taught during lectures.
  • 82% of students wanted SRE to be included in
    future courses, especially those that deal with
    software design.

5
Background Information (contd)
Development-Related Reversing Scenarios
Figure 1. Development-related software reverse
engineering scenarios.
6
Background Information (contd)
Security-Related Reversing Scenarios
Figure 2. Security-related software reverse
engineering scenarios.
7
Background Information (contd)
Legacy Software Development Process
Figure 3. Software development process in a
typical enterprise software system.
8
Project Overview Baseline Education in
Software Reverse Engineering
Figure 4. Activities related to providing a
baseline SRE education.
9
Materials and Methods
  • More than ten peer-reviewed articles on the
    topics of software reverse engineering,
    re-engineering, maintenance, reuse, and security
    were selected and used to address the research
    questions.
  • Of the articles selected, three were chosen for
    their specific coverage of experiences with
    teaching courses in software reversing,
    reengineering, and maintenance.
  • The work also draws upon my experience, just shy
    of a decade, designing and developing legacy
    software modernization tools at IBM.

10
Results Overview of Developed
SRE Course Modules
  • Reversing and Patching Wintel Machine Code
  • Reversing and Patching Java Bytecode
  • Applying Anti-Reversing Techniques to Machine
    Code
  • Applying Anti-Reversing Techniques to Java
    Bytecode
  • Reengineering and Reuse of Legacy Software
  • Identifying, Monitoring, and Reporting Malware

12
Results (contd) Reversing and
Patching Wintel Machine Code
  • An introduction to the compilation of high-level
    languages to machine code is provided. Assembly
    language is contrasted as having a one-to-one
    mapping to machine code.
  • The negative results of experimentation with two
    decompilers (Boomerang and REC) for machine code
    are documented. Given the current state of
    decompiler technology, it was concluded that
    working with disassembly is the most feasible
    approach.
  • A Wintel machine code reversing and patching
    exercise was developed against Password Vault, a
    non-trivial application that is provided with the
    exercise to avoid any legal concerns with
    reversing software written by others.

13
Results (contd) Reversing and Patching
Wintel Machine Code (contd)
  • The machine code reversing and patching exercise
    asks the learner to create a new executable
    version of the application that no longer has a
    trial limitation of five password records per
    user.
  • A reliable and repeatable reversing strategy is
    used: place a breakpoint on a memory artifact and
    trace back through the stack frames to locate the
    relevant section in the disassembly.
  • For instructional purposes, an animated solution
    that demonstrates the application of this
    reversing strategy using OllyDbg, an interactive
    debugger-disassembler, was developed using Qarbon
    Viewlet Builder.

14
Results (contd) Reversing and Patching
Wintel Machine Code (contd)
Figure 5. Animated solution to the Wintel
reversing and patching exercise.
15
Results (contd) Reversing and Patching
Wintel Machine Code (contd)
Figure 6. Animated solution to the Wintel
reversing and patching exercise.
16
Results (contd) Reversing and Patching
Wintel Machine Code (contd)
Figure 7. Animated solution to the Wintel
reversing and patching exercise.
17
Results (contd) Reversing and Patching
Wintel Machine Code (contd)
Figure 8. Animated solution to the Wintel
reversing and patching exercise.
18
Results (contd) Reversing and Patching
Wintel Machine Code (contd)
Figure 9. Animated solution to the Wintel
reversing and patching exercise.
19
Results (contd) Reversing and Patching
Wintel Machine Code (contd)
Figure 10. Animated solution to the Wintel
reversing and patching exercise.
20
Results (contd) Reversing and Patching
Wintel Machine Code (contd)
Figure 11. Animated solution to the Wintel
reversing and patching exercise.
21
Results (contd) Reversing and Patching
Wintel Machine Code (contd)
Figure 12. Animated solution to the Wintel
reversing and patching exercise.
22
Results (contd) Reversing and Patching
Wintel Machine Code (contd)
Figure 13. Animated solution to the Wintel
reversing and patching exercise.
23
Results (contd) Reversing and Patching
Wintel Machine Code (contd)
Figure 14. Animated solution to the Wintel
reversing and patching exercise.
24
Results (contd) Reversing and Patching
Wintel Machine Code (contd)
  • Idea for an advanced Wintel machine code
    exercise:
  • It should be feasible to patch additional
    functionality into the Password Vault machine
    code.
  • The GCC compiler can generate assembly language
    instead of machine code, so the programmer can
    work in a high-level language.
  • Patching in the generated assembly code would
    require a significant amount of time spent in
    the program understanding phase.
  • Final integration of the new code would require
    modification of the Windows PE header to increase
    the size of the .code section, and also of the
    .rdata and .data sections if new variables and
    constants are added.

26
Results (contd) Reversing
and Patching Java Bytecode
  • An introduction to interpreted/intermediate
    executable formats such as Java bytecode is
    provided. These formats are contrasted with
    machine code and assembly language.
  • Java bytecode disassembly using javap is
    covered for help with analysis of bytecode
    generated by javac.
  • The positive results of experimentation with the
    Jad Java bytecode decompiler are documented; it
    is concluded that direct reading/writing of
    bytecode is not necessary.
  • A Java bytecode reversing and patching exercise
    was developed against a Java version of Password
    Vault.

27
Results (contd) Reversing and
Patching Java Bytecode (contd)
  • The Java bytecode reversing and patching exercise
    asks the learner to create a new executable
    version of the application that no longer has a
    trial limitation of five password records per
    user.
  • Since the Password Vault application consists of
    a small number of classes in a single package, a
    simple reversing strategy of unpacking the Jar
    archive, batch decompiling the classes, modifying
    the generated Java source, and recompiling is
    used.
  • For instructional purposes, an animated solution
    that demonstrates the application of this
    reversing strategy using FrontEnd Plus, a
    graphical interface to Jad, was developed using
    Qarbon Viewlet Builder.

28
Results (contd) Reversing and
Patching Java Bytecode (contd)
Figure 15. Animated solution to the Java bytecode
reversing and patching exercise.
29
Results (contd) Reversing and
Patching Java Bytecode (contd)
Figure 16. Animated solution to the Java bytecode
reversing and patching exercise.
30
Results (contd) Reversing and
Patching Java Bytecode (contd)
Figure 17. Animated solution to the Java bytecode
reversing and patching exercise.
31
Results (contd) Reversing and
Patching Java Bytecode (contd)
Figure 18. Animated solution to the Java bytecode
reversing and patching exercise.
32
Results (contd) Reversing and
Patching Java Bytecode (contd)
Figure 19. Animated solution to the Java bytecode
reversing and patching exercise.
33
Results (contd) Reversing and
Patching Java Bytecode (contd)
Figure 20. Animated solution to the Java bytecode
reversing and patching exercise.
34
Results (contd) Reversing and
Patching Java Bytecode (contd)
Figure 21. Animated solution to the Java bytecode
reversing and patching exercise.
35
Results (contd) Reversing and
Patching Java Bytecode (contd)
Figure 22. Animated solution to the Java bytecode
reversing and patching exercise.
36
Results (contd) Reversing and
Patching Java Bytecode (contd)
  • Idea for an advanced Java bytecode exercise:
  • Use available Java class libraries, such as
    jclasslib, to directly read and write Java
    bytecode.
  • Write a Java program that scans through the
    bytecode for the Java Password Vault application
    and locates the instructions for the trial
    limitation.
  • Once the instructions are located, overwrite them
    with a sequence that disables the trial
    limitation.
  • This can be good practice for getting a feel for
    writing code that patches an executable.
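
The scan-and-overwrite idea in this list can be sketched at the level of raw bytes. This is a minimal sketch rather than the course's solution: it does not use jclasslib, the class name BytePatcher is hypothetical, and the instruction sequence in main (iload_1, iconst_5, if_icmplt) is only an assumption about what a record limit check might compile to. The replacement has the same length as the pattern so that branch offsets elsewhere in the method stay valid.

```java
import java.util.Arrays;

class BytePatcher {

    /** Returns the index of the first occurrence of pattern in data, or -1 if absent. */
    public static int indexOf(byte[] data, byte[] pattern) {
        if (pattern.length == 0) {
            return -1;
        }
        for (int i = 0; i + pattern.length <= data.length; i++) {
            boolean match = true;
            for (int j = 0; j < pattern.length && match; j++) {
                match = data[i + j] == pattern[j];
            }
            if (match) {
                return i;
            }
        }
        return -1;
    }

    /** Returns a copy of data with the first occurrence of pattern overwritten. */
    public static byte[] patch(byte[] data, byte[] pattern, byte[] replacement) {
        if (pattern.length != replacement.length) {
            throw new IllegalArgumentException("replacement must keep the code length unchanged");
        }
        int at = indexOf(data, pattern);
        if (at < 0) {
            throw new IllegalStateException("pattern not found");
        }
        byte[] out = Arrays.copyOf(data, data.length);
        System.arraycopy(replacement, 0, out, at, replacement.length);
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical fragment: aload_0, iload_1, iconst_5, if_icmplt +10, return
        byte[] code = {0x2A, 0x1B, 0x08, (byte) 0xA1, 0x00, 0x0A, (byte) 0xB1};
        byte[] limitCheck = {0x1B, 0x08, (byte) 0xA1}; // branch if count < 5
        byte[] alwaysTrue = {0x1B, 0x02, (byte) 0xA3}; // branch if count > -1
        System.out.println(Arrays.toString(patch(code, limitCheck, alwaysTrue)));
    }
}
```

A real patcher would restrict the search to the code attribute of the target method, since the same byte values can also occur in the constant pool or in other methods.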

38
Results (contd) Applying Anti-Reversing
Techniques to Machine Code
  • A brief introduction to basic anti-reversing
    techniques is provided: Eliminating Symbolic
    Information, Obfuscating the Program, and
    Embedding Anti-Debugger Code.
  • Machine code typically has very little symbolic
    information that can be eliminated altogether;
    therefore, a discussion illustrates how debuggers
    insert quite a bit of information that makes
    machine code easier to reverse.
  • The technique Obfuscating the Program is
    demonstrated in a Wintel machine code
    anti-reversing exercise where data, computation,
    and control flow obfuscations are applied to the
    C source code for Password Vault.

39
Results (contd) Applying Anti-Reversing
Techniques to Machine Code (contd)
  • Commercial tools such as EXECryptor
    (www.strongbit.com) fully obfuscate and pack
    Windows executables using advanced algorithms
    that are based on the elementary techniques
    described in this module.
  • It is difficult to provide a before and after
    illustration of machine code that is obfuscated
    using EXECryptor, so the examples and exercise in
    this module are implemented first at the source
    code level and then confirmed in the machine code
    using live and static analysis.
  • In the case of control-flow obfuscation, only
    static analysis is used, where subsequent run
    traces are compared using an edit-distance
    measurement.

40
Results (contd) Applying Anti-Reversing
Techniques to Machine Code (contd)
  • The Wintel machine code anti-reversing exercise
    asks the learner to create a new executable
    version of the Password Vault application where
    the following transformations are applied:
  • Encryption of string literals (data obfuscation).
  • Obfuscation of the numeric representation of the
    password record limit (computation obfuscation).
  • Obfuscation of the method that performs the
    record limit check (control flow obfuscation).

41
Results (contd) Applying Anti-Reversing
Techniques to Machine Code (contd)
  • Encryption of String Literals (data obfuscation)

Figure 23. Strings are decrypted each time they
are used using a bundled cipher.
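
The idea of decrypting string literals at each use can be sketched as follows. This is a minimal sketch in Java rather than the exercise's C code; the cipher actually bundled with Password Vault is not shown in this transcript, so a single-byte XOR with a hypothetical key stands in for it. XOR is only a placeholder and offers no real cryptographic protection.

```java
import java.nio.charset.StandardCharsets;

class StringObfuscation {

    // Hypothetical key; a real tool would use a stronger bundled cipher.
    private static final byte KEY = 0x5A;

    /** Run at build time to produce the bytes that are embedded in the program. */
    public static byte[] encrypt(String plain) {
        byte[] bytes = plain.getBytes(StandardCharsets.UTF_8);
        for (int i = 0; i < bytes.length; i++) {
            bytes[i] ^= KEY;
        }
        return bytes;
    }

    /** Called at each use, so the plaintext never appears as a literal in the binary. */
    public static String decrypt(byte[] cipher) {
        byte[] bytes = cipher.clone();
        for (int i = 0; i < bytes.length; i++) {
            bytes[i] ^= KEY;
        }
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Only the encrypted form of the message is stored as a literal.
        byte[] storedMessage = {0x15, 0x11};
        System.out.println(decrypt(storedMessage));
    }
}
```

A reverser searching the executable for a readable message would find only the encrypted bytes, although the decrypt routine itself remains a point of attack.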
42
Results (contd) Applying Anti-Reversing
Techniques to Machine Code (contd)
  • Obfuscation of the numeric representation of the
    password record limit (computation obfuscation)

Figure 24. Complex evaluations obscure the actual
condition.
43
Results (contd) Applying Anti-Reversing
Techniques to Machine Code (contd)
  • Obfuscation of the numeric representation of the
    password record limit (computation obfuscation)
    (contd)

Figure 25. Testing for a function of a number can
slow a reverser down.
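
Testing a function of the record count instead of the count itself can be sketched as follows. This is a minimal sketch, not the expressions shown in Figures 24 and 25; the class name and the chosen function (count squared plus seven) are hypothetical. The two methods agree only under the assumption that the count is non-negative and small enough that squaring does not overflow.

```java
class LimitCheck {

    /** Plain form: the constant 5 is easy to spot in a disassembly. */
    public static boolean limitReachedPlain(int count) {
        return count >= 5;
    }

    /**
     * Obfuscated form: compares a function of the count against a derived
     * constant. For count >= 0, count * count + 7 >= 32 exactly when count >= 5,
     * so the literal 5 no longer appears in the comparison.
     */
    public static boolean limitReachedObfuscated(int count) {
        int scrambled = count * count + 7;
        return scrambled >= (0x40 >> 1);
    }

    public static void main(String[] args) {
        for (int count = 0; count <= 6; count++) {
            System.out.println(count + " -> " + limitReachedObfuscated(count));
        }
    }
}
```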
44
Results (contd) Applying Anti-Reversing
Techniques to Machine Code (contd)
  • Obfuscation of the method that performs the
    record limit check (control flow obfuscation)
  • We introduce some non-essential, recursive, and
    randomized logic to the password limit check to
    make it more difficult for a reverser to perform
    static and/or live analysis.
  • Since no standards exist for control flow
    obfuscation, a custom algorithm was designed to
    hinder live and static analysis through use of
    recursive and randomized procedure calls.
  • Recursion grows the stack considerably, making
    stepping through the code difficult, while
    randomization makes execution unpredictable
    (breakpoints may not trigger; run traces differ).

45
Results (contd) Applying Anti-Reversing
Techniques to Machine Code (contd)
Depth of the recursion is randomized on each
check of the limit.
Random procedure call targets generate and return
a number that is added to an instance variable,
preventing the procedures from being identified
as NOOPs by a code optimizer.
Figure 26. A control flow obfuscation algorithm
for the record limit check.
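
The behavior described for this slide can be sketched as follows. This is a minimal sketch written from the description alone, not the algorithm in Figure 26, and it uses Java rather than the exercise's own source language; all class and method names are hypothetical. Each check recurses to a random depth, calls a randomly chosen decoy procedure at every level, and adds the decoy's result to an instance variable so that the calls cannot be discarded as dead code.

```java
import java.util.Random;

class ObfuscatedLimitCheck {

    private static final int RECORD_LIMIT = 5;

    private final Random random = new Random();
    private long sink; // accumulates decoy results so the decoys have a visible effect

    /** Performs the real limit check behind a randomized chain of recursive calls. */
    public boolean limitReached(int recordCount) {
        int depth = 1 + random.nextInt(8); // recursion depth differs on each check
        return check(recordCount, depth);
    }

    private boolean check(int recordCount, int depth) {
        if (depth == 0) {
            return recordCount >= RECORD_LIMIT; // the only essential logic
        }
        switch (random.nextInt(3)) { // random procedure call target
            case 0:
                sink += decoyA();
                break;
            case 1:
                sink += decoyB();
                break;
            default:
                sink += decoyC();
                break;
        }
        return check(recordCount, depth - 1);
    }

    private int decoyA() { return random.nextInt(100); }
    private int decoyB() { return random.nextInt(100) * 3; }
    private int decoyC() { return random.nextInt(100) + 17; }

    public long sink() { return sink; }
}
```

The result of the check is deterministic for a given record count; only the path taken to reach it varies between runs.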
46
Results (contd) Applying Anti-Reversing
Techniques to Machine Code (contd)
  • To measure the effectiveness of the control flow
    algorithm in hindering analysis, three execution
    traces of the section of the code containing the
    record limit check were compared.
  • The Levenshtein Distance (LD) was computed
    between the three traces where each instruction
    in the trace was compared. LD was modified to
    consider each line as opposed to each character.
  • The execution traces were collected using OllyDbg
    and had to be cleaned of disassembly artifacts
    such as line numbers, base addresses, and
    comments in order to ensure that the analysis was
    fair.
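
A line-level Levenshtein Distance of the kind described above can be sketched as follows. This is a minimal sketch, not the tool used in the project; the class name is hypothetical, and the traces are assumed to have already been cleaned so that each list element is one instruction line.

```java
import java.util.List;

class TraceDistance {

    /**
     * Levenshtein Distance between two traces, where the unit of comparison
     * is a whole line (one instruction) rather than a single character.
     */
    public static int distance(List<String> a, List<String> b) {
        int[] prev = new int[b.size() + 1];
        int[] curr = new int[b.size() + 1];
        for (int j = 0; j <= b.size(); j++) {
            prev[j] = j; // cost of inserting the first j lines of b
        }
        for (int i = 1; i <= a.size(); i++) {
            curr[0] = i; // cost of deleting the first i lines of a
            for (int j = 1; j <= b.size(); j++) {
                int substitution = a.get(i - 1).equals(b.get(j - 1)) ? 0 : 1;
                curr[j] = Math.min(
                        Math.min(curr[j - 1] + 1, prev[j] + 1),
                        prev[j - 1] + substitution);
            }
            int[] swap = prev;
            prev = curr;
            curr = swap;
        }
        return prev[b.size()];
    }
}
```

A distance of zero between two runs on identical input would indicate that the traces are the same; larger values indicate that the randomized control flow produced more divergent traces.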

47
Results (contd) Applying Anti-Reversing
Techniques to Machine Code (contd)
Figure 27. Comparison of executions of record
limit check on identical program input.
48
Results (contd) Applying Anti-Reversing
Techniques to Machine Code (contd)
  • The Wintel anti-reversing module also
    demonstrates source code obfuscation, an
    anti-reversing technique for software that is
    distributed as source code.
  • There may exist a requirement to ship the source
    code of an application so that the machine code
    can be generated on the end user's computer.
  • If the source code contains intellectual property
    that is worth protecting, one can perform
    transformations to the source code which make it
    difficult to read, but have no impact on the
    machine code that would ultimately be generated
    when the program is compiled.

49
Results (contd) Applying Anti-Reversing
Techniques to Machine Code (contd)
  • Demonstration of the COBF source code obfuscator

VerifyPassword.cpp

    #include <iostream>
    #include <string>
    using namespace std;

    int main(int argc, char* argv[])
    {
        const char* password = "jup!ter";
        string specified;
        cout << "Enter password: ";
        getline(cin, specified);
        if (specified.compare(password) == 0)
        {
            cout << "[OK] Access granted." << endl;
        } else
        {
            cout << "[Error] Access denied." << endl;
        }
    }

COBF invocation

    C:\cobf_1.06\src\win32\release\cobf.exe @C:\cobf_1.06\src\setup_cpp_tokens.inv -o cobfoutput -b -p C:\cobf_1.06\etc\pp_eng_msvc.bat VerifyPassword.cpp
50
Results (contd) Applying Anti-Reversing
Techniques to Machine Code (contd)
COBF obfuscated source for VerifyPassword.cpp

    #include"cobf.h"
    ls lp lk;lf lo(lf ln,ld*lj[]){ll ld*lc="\x6a\x75\x70\x21\x74\x65\x72";
    lh la;lb<<"\x45\x6e\x74\x65\x72\x20\x70\x61\x73\x73\x77\x6f\x72\x64"
    "\x3a\x20";li(lq,la);lm(la.lg(lc)==0){lb<<"\x5b\x4f\x4b\x5d\x20\x41"
    "\x63\x63\x65\x73\x73\x20\x67\x72\x61\x6e\x74\x65\x64\x2e"<<le;}lr{lb<<
    "\x5b\x45\x72\x72\x6f\x72\x5d\x20\x41\x63\x63\x65\x73\x73\x20\x64"
    "\x65\x6e\x69\x65\x64\x2e"<<le;}}

COBF generated header (cobf.h)
52
Results (contd) Applying
Anti-Reversing Techniques to Java Bytecode
  • While experiments with decompiling machine code
    were not successful, decompilation of Java
    bytecode to Java source code yielded acceptable
    results.
  • Given these results, one does need to be
    concerned with protecting Java bytecode from
    decompilation if there is significant
    intellectual property in the program.
  • Obfuscating bytecode is inherently easier than
    obfuscating source code because bytecode has a
    significantly more strict and organized
    representation than source code.

53
Results (contd) Applying Anti-Reversing
Techniques to Java Bytecode (contd)
  • Variable, class, and method names are all left
    intact when compiling Java source code to Java
    bytecode. This is a stark difference from machine
    code where variable and local method names are
    not preserved.
  • A high level of protection can be achieved for
    Java bytecode by applying three transformations:
    Name Obfuscation, String Encryption, and Control
    Flow Obfuscation.
  • Zelix Klassmaster, a commercial product, is
    capable of performing all three.
    Unfortunately, no open-source or free tool exists
    that can perform all three.

54
Results (contd) Applying Anti-Reversing
Techniques to Java Bytecode (contd)
  • The trial version of Zelix Klassmaster is
    restricted to 30 days, and the company will only
    e-mail a trial version to non-free e-mail
    addresses.
  • Not much is learned by having everything done for
    us, so this module sees how far one can get with
    open-source and free software.
  • ProGuard and RetroGuard are free Java bytecode
    obfuscators capable of Name Obfuscation.
  • SandMark, a Java bytecode watermarking and
    obfuscation tool from the University of Arizona,
    is capable of String Encryption and some weak
    control flow obfuscations.

55
Results (contd) Applying Anti-Reversing
Techniques to Java Bytecode (contd)
  • A Java bytecode anti-reversing exercise was
    developed against the Java version of Password
    Vault.
  • Since the learner will have already experienced
    manually applying obfuscations in the Wintel
    machine code anti-reversing exercise, this
    exercise focuses on the use of tools.
  • In the exercise, it is expected that the Java
    bytecode for the Password Vault application will
    be incrementally obfuscated using two or more
    tools.
  • For instructional purposes, an animated solution
    that demonstrates obfuscating the Password Vault
    Java bytecode to the point of inhibiting
    decompilation was developed using Qarbon Viewlet
    Builder.

56
Results (contd) Applying Anti-Reversing
Techniques to Java Bytecode (contd)
Figure 28. Animated solution to the Java bytecode
anti-reversing exercise.
57
Results (contd) Applying Anti-Reversing
Techniques to Java Bytecode (contd)
Figure 29. Animated solution to the Java bytecode
anti-reversing exercise.
58
Results (contd) Applying Anti-Reversing
Techniques to Java Bytecode (contd)
Figure 30. Animated solution to the Java bytecode
anti-reversing exercise.
59
Results (contd) Applying Anti-Reversing
Techniques to Java Bytecode (contd)
Figure 31. Animated solution to the Java bytecode
anti-reversing exercise.
60
Results (contd) Applying Anti-Reversing
Techniques to Java Bytecode (contd)
Figure 32. Animated solution to the Java bytecode
anti-reversing exercise.
61
Results (contd) Applying Anti-Reversing
Techniques to Java Bytecode (contd)
Figure 33. Animated solution to the Java bytecode
anti-reversing exercise.
62
Results (contd) Applying Anti-Reversing
Techniques to Java Bytecode (contd)
Figure 34. Animated solution to the Java bytecode
anti-reversing exercise.
63
Results (contd) Applying Anti-Reversing
Techniques to Java Bytecode (contd)
Figure 35. Animated solution to the Java bytecode
anti-reversing exercise.
64
Results (contd) Applying Anti-Reversing
Techniques to Java Bytecode (contd)
Figure 36. Animated solution to the Java bytecode
anti-reversing exercise.
65
Results (contd) Applying Anti-Reversing
Techniques to Java Bytecode (contd)
Figure 37. Animated solution to the Java bytecode
anti-reversing exercise.
66
Results (contd) Applying Anti-Reversing
Techniques to Java Bytecode (contd)
Figure 38. Animated solution to the Java bytecode
anti-reversing exercise.
67
Results (contd) Applying Anti-Reversing
Techniques to Java Bytecode (contd)
Figure 39. Animated solution to the Java bytecode
anti-reversing exercise.
68
Results (contd) Applying Anti-Reversing
Techniques to Java Bytecode (contd)
Figure 40. Animated solution to the Java bytecode
anti-reversing exercise.
69
Results (contd) Applying Anti-Reversing
Techniques to Java Bytecode (contd)
Figure 41. Animated solution to the Java bytecode
anti-reversing exercise.
70
Results (contd) Applying Anti-Reversing
Techniques to Java Bytecode (contd)
Figure 42. Animated solution to the Java bytecode
anti-reversing exercise.
71
Results (contd) Applying Anti-Reversing
Techniques to Java Bytecode (contd)
Figure 43. Animated solution to the Java bytecode
anti-reversing exercise.
72
Results (contd) Applying Anti-Reversing
Techniques to Java Bytecode (contd)
Figure 44. Animated solution to the Java bytecode
anti-reversing exercise.
73
Results (contd) Applying Anti-Reversing
Techniques to Java Bytecode (contd)
Figure 45. Animated solution to the Java bytecode
anti-reversing exercise.
74
Results (contd) Applying Anti-Reversing
Techniques to Java Bytecode (contd)
Figure 46. Animated solution to the Java bytecode
anti-reversing exercise.
75
Results (contd) Applying Anti-Reversing
Techniques to Java Bytecode (contd)
Figure 47. Animated solution to the Java bytecode
anti-reversing exercise.
76
Results (contd) Applying Anti-Reversing
Techniques to Java Bytecode (contd)
Figure 48. Animated solution to the Java bytecode
anti-reversing exercise.
77
Results (contd) Applying Anti-Reversing
Techniques to Java Bytecode (contd)
Figure 49. Animated solution to the Java bytecode
anti-reversing exercise.
78
Results (contd) Applying Anti-Reversing
Techniques to Java Bytecode (contd)
Figure 50. Animated solution to the Java bytecode
anti-reversing exercise.
80
Results (contd) Reengineering
and Reuse of Legacy Software
  • The question of whether to reengineer or reuse
    components of a software system most often arises
    in the context of large business or government
    organizations.
  • Over time the processes and procedures of a
    business or organization will inevitably be
    reflected in the software systems that enable
    efficient, day-to-day operations [5].
  • While reverse engineering of legacy software is
    inherently intractable, some of us will
    inevitably find ourselves in a situation where no
    other option is available because the cost of
    rewriting a large, complex software system is
    prohibitive [6].

81
Results (contd) Reengineering and
Reuse of Legacy Software (contd)
  • If good development practices were followed,
    legacy software is typically composed of three
    layers [5]:

Figure 51. Layers of a well-structured legacy
software application.
82
Results (contd) Reengineering and
Reuse of Legacy Software (contd)
  • Legacy applications that are not sufficiently
    componentized, such that their general
    organization resembles the three layers, are not
    good candidates for reengineering and reuse.
  • The most widely accepted technique to reuse
    legacy application components is that of
    wrappering [5], where a new piece of code
    provides an interface to a legacy application
    component or layer without requiring code changes
    to it.
  • Typically, candidate applications should be
    well-structured such that the business logic can
    be isolated, encapsulated, and made into reusable
    components.

83
Results (contd) Reengineering and
Reuse of Legacy Software (contd)
  • Unless enough of an application's source code
    remains such that it's possible to identify the
    names of reusable entry points (procedures) and
    their I/O data structures, attempting to reuse
    the application may be difficult.
  • While it is possible to learn the names of entry
    points that have been explicitly exported by an
    application in the case of a DLL, the names don't
    indicate the layout of the expected I/O data
    structures.
  • One way to discover the entry points and I/O data
    structures in legacy machine code is to read the
    source code of other applications which depend on
    it.

84
Results (contd) Reengineering and
Reuse of Legacy Software (contd)
  • The COBOL programming language is most often
    associated with legacy software applications.
  • Normally, COBOL programs have a single entry
    point; additional alternate entry points are
    rare.
  • Legacy COBOL programs often include functional
    discriminators in their I/O data structures.

Figure 52. Mapping legacy functional
discriminators to an object-oriented design.
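
The mapping from functional discriminators to an object-oriented design can be sketched as follows. This is a minimal sketch and not a reproduction of Figure 52: all type names are hypothetical, the legacy program is simulated in Java instead of being invoked natively, and the discriminator characters are assumptions. The single legacy entry point takes an operation code in its I/O structure; the wrapper exposes one method per discriminator value and fills in the code on the caller's behalf.

```java
class DiscriminatorMapping {

    /** Stand-in for a legacy program with one entry point and an operation code. */
    interface LegacyProgram {
        long call(int operand1, int operand2, char operation);
    }

    /** Object-oriented view: one method per functional discriminator. */
    interface Calculator {
        long add(int a, int b);
        long subtract(int a, int b);
        long multiply(int a, int b);
    }

    /** Wraps the legacy entry point without changing the legacy code. */
    static Calculator wrap(final LegacyProgram legacy) {
        return new Calculator() {
            @Override public long add(int a, int b) { return legacy.call(a, b, '+'); }
            @Override public long subtract(int a, int b) { return legacy.call(a, b, '-'); }
            @Override public long multiply(int a, int b) { return legacy.call(a, b, '*'); }
        };
    }

    /** Simulated legacy logic, used here only so the sketch can run. */
    static final LegacyProgram SIMULATED = (a, b, op) -> {
        switch (op) {
            case '+': return (long) a + b;
            case '-': return (long) a - b;
            case '*': return (long) a * b;
            default: throw new IllegalArgumentException("Unknown operation: " + op);
        }
    };

    public static void main(String[] args) {
        Calculator calculator = wrap(SIMULATED);
        System.out.println(calculator.multiply(6, 7));
    }
}
```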
85
Results (contd) Reengineering and
Reuse of Legacy Software (contd)
  • In a real-world situation, we would be looking to
    reuse legacy components whose machine code is the
    result of thousands of lines of high-level
    language statements (COBOL) that implement a
    particular business process.
  • Since our focus is more on reuse and
    reengineering of legacy code at a basic level,
    it's not necessary to encumber ourselves with a
    very large program in order to learn strategies
    for reuse and reengineering.
  • Included with this module is a small COBOL
    calculator that we wish to make reusable from
    Java. This program is assumed to be something
    from the business logic layer.

86
Results (contd) Reengineering and
Reuse of Legacy Software (contd)
      * Simple COBOL program that performs integer arithmetic
       IDENTIFICATION DIVISION.
       PROGRAM-ID. 'SMPLCALC'.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       77 MSG-NUMERIC-OVERFLOW PIC X(25)
          VALUE 'Numeric overflow occurred'.
       77 MSG-SUCCESSFUL PIC X(22)
          VALUE 'Completed successfully'.
       LINKAGE SECTION.
      * Input/Output data structure
       01 SMPLCALC-INTERFACE.
          02 SI-OPERAND-1 PIC S9(9) COMP-5.
          02 SI-OPERAND-2 PIC S9(9) COMP-5.
          02 SI-OPERATION PIC X.
             88 DO-ADD VALUE '+'.
             88 DO-SUB VALUE '-'.
             88 DO-MUL VALUE '*'.
          02 SI-RESULT PIC S9(18) COMP-3.
          02 SI-RESULT-MESSAGE PIC X(128).
       PROCEDURE DIVISION USING
          BY REFERENCE SMPLCALC-INTERFACE.
       MAINLINE SECTION.
      * Perform requested arithmetic
87
Results (contd) Reengineering and
Reuse of Legacy Software (contd)
           INITIALIZE SI-RESULT SI-RESULT-MESSAGE
           EVALUATE TRUE
              WHEN DO-ADD
                 COMPUTE SI-RESULT = SI-OPERAND-1 + SI-OPERAND-2
                    ON SIZE ERROR
                       PERFORM HANDLE-SIZE-ERROR
                 END-COMPUTE
              WHEN DO-SUB
                 COMPUTE SI-RESULT = SI-OPERAND-1 - SI-OPERAND-2
                    ON SIZE ERROR
                       PERFORM HANDLE-SIZE-ERROR
                 END-COMPUTE
              WHEN DO-MUL
                 COMPUTE SI-RESULT = SI-OPERAND-1 * SI-OPERAND-2
                    ON SIZE ERROR
                       PERFORM HANDLE-SIZE-ERROR
                 END-COMPUTE
           END-EVALUATE
      * Successful return
           MOVE MSG-SUCCESSFUL TO SI-RESULT-MESSAGE
           MOVE 2 TO RETURN-CODE
           GOBACK
           .
88
Results (contd) Reengineering and
Reuse of Legacy Software (contd)
  • Many commercial tools support importing a COBOL
    data structure and generating Java marshalling
    classes.
  • These marshalling classes are intended to be used
    with the J2EE Connector Architecture (JCA) where
    a Java application wrappers a legacy software
    application.

Figure 53. Example JCA implementation for
accessing a legacy application.
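
What a marshalling class has to do for numeric COBOL fields can be sketched by hand. This is a minimal sketch, not the output of any commercial tool; the class and method names are hypothetical. It covers the two numeric encodings used by the calculator's interface: COMP-3 (packed decimal, two digits per byte with the sign in the final nibble, 0xC for positive and 0xD for negative) and COMP-5 (native binary, whose byte order depends on the platform).

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class CobolMarshalling {

    /** Encodes value as COMP-3 packed decimal into exactly length bytes. */
    public static byte[] packDecimal(long value, int length) {
        if (value == Long.MIN_VALUE) {
            throw new ArithmeticException("value out of range");
        }
        byte[] out = new byte[length];
        long magnitude = Math.abs(value);
        // Last byte: lowest digit in the high nibble, sign in the low nibble.
        out[length - 1] = (byte) (((magnitude % 10) << 4) | (value < 0 ? 0x0D : 0x0C));
        magnitude /= 10;
        for (int i = length - 2; i >= 0; i--) {
            int low = (int) (magnitude % 10);
            magnitude /= 10;
            int high = (int) (magnitude % 10);
            magnitude /= 10;
            out[i] = (byte) ((high << 4) | low);
        }
        if (magnitude != 0) {
            throw new ArithmeticException("value does not fit in " + length + " bytes");
        }
        return out;
    }

    /** Decodes a COMP-3 packed decimal field into a long. */
    public static long unpackDecimal(byte[] packed) {
        long value = 0;
        for (int i = 0; i < packed.length; i++) {
            int high = (packed[i] >> 4) & 0x0F;
            int low = packed[i] & 0x0F;
            if (i < packed.length - 1) {
                value = value * 100 + high * 10 + low;
            } else {
                value = value * 10 + high;
                if (low == 0x0D) {
                    value = -value;
                }
            }
        }
        return value;
    }

    /** Encodes a 4-byte binary integer, as used for PIC S9(9) COMP-5. */
    public static byte[] binaryInt(int value, ByteOrder order) {
        return ByteBuffer.allocate(4).order(order).putInt(value).array();
    }
}
```

A PIC S9(18) COMP-3 field holds 18 digits plus a sign nibble, which occupies 10 bytes, so a result field of that type would be packed with a length of 10.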
89
Results (contd) Reengineering and
Reuse of Legacy Software (contd)
  • A popular alternative to using the JCA
    architecture to reengineer and reuse legacy
    applications is to implement a Service Oriented
    Architecture (SOA).
  • SOA components become capable of communicating
    without the tight and fragile coupling of
    traditional binary interfaces because they are
    wrappered with a platform-neutral interface such
    as XML and Web services.
  • When XML is used as envisioned, all data, both
    character and numeric, is represented as
    printable text, completely divorced from any
    platform-specific representation or encoding.

90
Results (contd) Reengineering and
Reuse of Legacy Software (contd)
  • The net effect of this is that two entities or
    programs can interact without having to know the
    data structures that comprise each other's binary
    interface.
  • Of course, the XML that is exchanged cannot be
    arbitrary, so industry standards such as XML
    Schema (XSD), and Web Services Definition
    Language (WSDL) fill this gap.
  • A Web service is considered to be WS-I compliant,
    or generally interoperable, if it meets many
    criteria, one of which is the use of XML for the
    input and output of each operation exposed by
    the service.

91
Results (contd) Reengineering and
Reuse of Legacy Software (contd)
  • This particular requirement of WS-I where XML is
    the interoperable interface of choice, sets the
    stage for a meaningful exercise.
  • A Legacy Software Reengineering and Reuse
    Exercise was developed for this module where the
    focus is on wrapping a COBOL program so that it
    is reusable from Java using XML in a local
    environment.
  • The learner is asked to create a language-neutral
    XML interface to the COBOL calculator program and
    invoke it from a Java program, which incidentally
    makes it reusable from other Java programs.

92
Results (contd) Reengineering and
Reuse of Legacy Software (contd)
  • Overview of the architecture for the exercise

Figure 54. Architecture for legacy application
reengineering and reuse from Java.
93
Results (contd) Reengineering and
Reuse of Legacy Software (contd)
  • Steps in the reengineering and reuse exercise
  • Create an XML Schema which represents all of the
    data in the SMPLCALC-INTERFACE COBOL data
    structure.
  • Write a Java interface ISimpleCalculator.java for
    three computation types supported by
    SMPLCALC.cbl.
  • Write a Java class JSimpleCalculator.java that
    implements the interface defined in
    ISimpleCalculator.java and provides a user
    interface.
  • Use the Java command-line utility xjc, in
    combination with the XML Schema, to generate
    Java-to-XML marshalling code (JAXB).
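The second step above might look like the following sketch. Only the interface name ISimpleCalculator comes from the exercise; the method names, parameter types, and the InMemoryCalculator stand-in are assumptions made for illustration. In the exercise itself the implementing class forwards each request to COBOL rather than computing in Java.

```java
// One method per computation type supported by SMPLCALC.cbl.
interface ISimpleCalculator {
    long add(int operand1, int operand2);
    long subtract(int operand1, int operand2);
    long multiply(int operand1, int operand2);
}

// Hypothetical pure-Java stand-in, useful for testing a user interface
// before the native bridge exists. Widening to long avoids int overflow.
class InMemoryCalculator implements ISimpleCalculator {
    public long add(int operand1, int operand2) {
        return (long) operand1 + operand2;
    }

    public long subtract(int operand1, int operand2) {
        return (long) operand1 - operand2;
    }

    public long multiply(int operand1, int operand2) {
        return (long) operand1 * operand2;
    }
}
```

Coding the user interface against the interface rather than a concrete class lets the COBOL-backed implementation be swapped in later without changing callers.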

94
Results (contd) Reengineering and
Reuse of Legacy Software (contd)
  • Steps in the reengineering and reuse exercise
    (contd)
  • Write a small C/C++ JNI program
    Java2CblXmlBridge.cpp which exports a method
    Java2SmplCalc that:
  • Invokes XML2CALC.cbl, passing the XML document
    received from JSimpleCalculator.java.
  • Returns the XML generated by XML2CALC.cbl to
    JSimpleCalculator.java.
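On the Java side, a JNI bridge of this kind is usually reached through a native method declaration plus a library load. The sketch below shows that shape only. The method name Java2SmplCalc comes from the step above; the class name JniBridgeSketch, the String-in/String-out signature, and the library name Java2CblXmlBridge are assumptions, and the real solution may declare the method elsewhere.

```java
class JniBridgeSketch {
    // Implemented in native code. Calling it before the library is loaded
    // throws UnsatisfiedLinkError.
    static native String Java2SmplCalc(String requestXml);

    // Returns true when the named native library could be loaded.
    static boolean tryLoad(String libraryName) {
        try {
            System.loadLibrary(libraryName);
            return true;
        } catch (UnsatisfiedLinkError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // "Java2CblXmlBridge" is an assumed library name for this sketch.
        if (tryLoad("Java2CblXmlBridge")) {
            System.out.println(Java2SmplCalc("<SMPLCALC-INTERFACE/>"));
        } else {
            System.out.println("Native bridge not found; build the DLL first.");
        }
    }
}
```

Checking the load result up front gives a clearer failure message than letting the first native call fail.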

95
Results (contd) Reengineering and
Reuse of Legacy Software (contd)
  • Steps in the reengineering and reuse exercise
    (contd)
  • Write a COBOL program XML2CALC.cbl
  • Marshalls XML from Java2CblXmlBridge.cpp into
    SMPLCALC-INTERFACE.
  • Invokes SMPLCALC.cbl, passing SMPLCALC-INTERFACE
    by reference.
  • Marshalls SMPLCALC-INTERFACE back to XML before
    returning to Java2CblXmlBridge.cpp.
  • Compile XML2CALC.cbl and link it with the object
    code for SMPLCALC.cbl (SMPLCALC.obj).

96
Results (contd) Reengineering and
Reuse of Legacy Software (contd)
  • Steps in the reengineering and reuse exercise
    (contd)
  • Create a DLL to be loaded by
    JSimpleCalculator.java by compiling and linking
    Java2CblXmlBridge.cpp with the object code for
    XML2CALC.cbl.
  • Update JSimpleCalculator.java to use the JAXB
    marshalling code to send/receive XML through the
    JNI layer and display the results.

97
Results (contd) Reengineering and
Reuse of Legacy Software (contd)
  • Highlights of the solution code
  • SimpleCalculator.xsd

<element name="SI-OPERAND-1">
  <simpleType>
    <restriction base="integer">
      <totalDigits value="9" />
    </restriction>
  </simpleType>
</element>
. . .
<element name="SI-OPERATION">
  <simpleType>
    <restriction base="string">
      <enumeration value="+" />
      <enumeration value="-" />
      <enumeration value="*" />
    </restriction>
  </simpleType>
</element>
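A schema restriction like the one on SI-OPERATION can be checked from Java with the JDK's javax.xml.validation API. The sketch below embeds a cut-down schema containing only that element, assuming the three operation codes are +, - and * (matching the DO-ADD, DO-SUB and DO-MUL conditions in the COBOL program). The xs prefix, the standalone global element, and the class name OperationSchemaCheck are choices made for this example, not part of SimpleCalculator.xsd.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class OperationSchemaCheck {
    // Cut-down schema: a single element restricted to three operation codes.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='SI-OPERATION'>"
        + "<xs:simpleType><xs:restriction base='xs:string'>"
        + "<xs:enumeration value='+'/>"
        + "<xs:enumeration value='-'/>"
        + "<xs:enumeration value='*'/>"
        + "</xs:restriction></xs:simpleType>"
        + "</xs:element></xs:schema>";

    // Returns true when the document conforms to the schema above.
    static boolean isValid(String xml) {
        try {
            Schema schema = SchemaFactory
                .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(XSD)));
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // schema violation or malformed XML
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<SI-OPERATION>+</SI-OPERATION>"));
        System.out.println(isValid("<SI-OPERATION>/</SI-OPERATION>"));
    }
}
```

Validating before the request crosses into native code means an unsupported operation is rejected in Java rather than reaching the COBOL program.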
98
Results (contd) Reengineering and
Reuse of Legacy Software (contd)
  • Highlights of the solution code (contd)
  • ISimpleCalculator.java

99
Results (contd) Reengineering and
Reuse of Legacy Software (contd)
  • Highlights of the solution code (contd)
  • JSimpleCalculator.java

100
Results (contd) Reengineering and
Reuse of Legacy Software (contd)
  • Highlights of the solution code (contd)
  • JSimpleCalculator.java (contd)

101
Results (contd) Reengineering and
Reuse of Legacy Software (contd)
  • Highlights of the solution code (contd)
  • Java2CblXmlBridge.cpp

102
Results (contd) Reengineering and
Reuse of Legacy Software (contd)
  • Highlights of the solution code (contd)
  • XML2CALC.cbl

103
Results (contd) Reengineering and
Reuse of Legacy Software (contd)
  • Sample run of solution code

Figure 55. Reuse of COBOL from Java using JAXB,
JNI, and COBOL XML Support.
104
Results (contd) Reengineering and
Reuse of Legacy Software (contd)
  • Sample run of solution code

Figure 56. Reuse of COBOL from Java using JAXB,
JNI, and COBOL XML Support.
105
Results (contd) Overview of
Developed SRE Course Modules
  • Reversing and Patching Wintel Machine Code
  • Reversing and Patching Java Bytecode
  • Applying Anti-Reversing Techniques to Machine
    Code
  • Applying Anti-Reversing Techniques to Java
    Bytecode
  • Reengineering and Reuse of Legacy Software
  • Identifying, Monitoring, and Reporting Malware

106
Results (contd) Identifying,
Monitoring, and Reporting Malware
  • Malware describes a category of software that
    does not always operate in a way that benefits
    the user.
  • Of course, those of us who have ever used
    software might contend that this definition of
    malware will cause programs that we use every day
    to be categorized as malware.
  • So let's qualify it a bit: the malicious or
    annoying behaviors of malware are intentional,
    not the result of one or more bugs.

107
Results (contd) Identifying,
Monitoring, and Reporting Malware (contd)
  • There are currently five types of malware that
    affect computer systems [6] [7]:
  • Viruses: require some deliberate action to help
    them spread.
  • Worms: similar to viruses, but can spread by
    themselves over computer networks.
  • Trojan horses: functional software that performs
    hidden malicious or annoying operations.
  • Backdoors: vulnerabilities purposely embedded in
    software.
  • Rabbits: programs that exhaust system resources.

108
Results (contd) Identifying,
Monitoring, and Reporting Malware (contd)
  • Malware usually isn't of just one type; for
    example, 3 of the top 10 malicious code families
    reported in 2008 were Trojans with a backdoor
    component [8].
  • Using the machine code and bytecode reversing
    experiences gained from the previous modules, one
    could try reversing malware.
  • Using virtualization tools such as VMware to
    create secondary operating system images on which
    to analyze malware can still result in infection
    of the primary operating system.

109
Results (contd) Identifying,
Monitoring, and Reporting Malware (contd)
  • The goal of this module is to help the learner
    become familiar with using tools to identify,
    monitor, and report software that might be
    malicious.
  • Since it's not practical to ask a learner to
    install a virus, worm, backdoor, or rabbit, we
    are left with the possibility of a benign
    software Trojan (discussed later).
  • In 1996, Mark Russinovich founded a company
    called Winternals Software where he was the
    chief software architect on a comprehensive suite
    of tools for diagnosing, debugging, and repairing
    Windows systems and applications [9].

110
Results (contd) Identifying,
Monitoring, and Reporting Malware (contd)
  • Mark's company has since been purchased by
    Microsoft, and his suite of tools has been
    rebranded as Windows Sysinternals and is offered
    for free on Microsoft TechNet.
  • Mark's story is an interesting one because he is
    recognized as an expert on the internals of
    Windows even though he did not participate in
    its development, a true testament to what can be
    learned about software through reverse
    engineering.
  • The Sysinternals suite contains 66 different
    utilities, but we'll focus on the one most useful
    for analyzing the behavior of malware: Process
    Monitor.

111
Results (contd) Identifying,
Monitoring, and Reporting Malware (contd)
  • The Process Monitor can capture detailed
    information about any running process in a
    Windows system including file system, registry,
    and network activity.

Figure 57. Process Monitor session for the
Password Vault application.
112
Results (contd) Identifying,
Monitoring, and Reporting Malware (contd)
  • Of course, Process Monitor itself doesn't
    identify malware; it simply reports what a
    process is doing.
  • With a little bit of ingenuity, one can identify
    Trojan Horses by looking for activities that
    don't seem to fit with the advertised
    functionality of a program.
  • It's common practice to download free software
    from the Internet, and because we've been
    convinced that open-source software, which is
    sometimes confused with free software, should
    have the fewest vulnerabilities, we do it
    without much thought.

113
Results (contd) Identifying,
Monitoring, and Reporting Malware (contd)
  • Incidentally, the data on the number of
    vulnerabilities found in popular Internet
    browsers does not support this belief.
  • Mozilla browsers were affected by 99 new
    vulnerabilities in 2008, more than any other
    browser; there were 47 new vulnerabilities
    identified in Internet Explorer, 40 in Apple
    Safari, 35 in Opera, and 11 in Google Chrome
    [8].
  • It seems counter-intuitive that an open-source
    browser would have twice as many security holes
    as a closed-source browser like Internet
    Explorer.

114
Results (contd) Identifying,
Monitoring, and Reporting Malware (contd)
  • Becoming familiar with the Windows Sysinternals
    suite can help you evaluate whether the software
    on your Windows machine is acting in your best
    interest.
  • If you suspect a particular program to be
    malware, it can be submitted online to a service
    called ThreatExpert.
  • ThreatExpert is a Web-based tool that supports
    submission of software executables that are to be
    evaluated against an on-line malware database.
  • Matching against existing malware is just one
    part of ThreatExpert's automated engine; the
    service also tries to execute suspected malware
    in an isolated environment in order to perform
    heuristic analysis of its actions.

115
Results (contd) Identifying,
Monitoring, and Reporting Malware (contd)
Figure 59. Example ThreatExpert report summary
for submitted malware.
116
Results (contd) Identifying,
Monitoring, and Reporting Malware (contd)
  • A Malware Identification and Monitoring Exercise
    was developed against a Java Alarm Clock
    application. This program was written to be a
    benign software Trojan.
  • The exercise asks the learner to identify the
    behaviors of the Alarm Clock application that
    make it a software Trojan using the Windows
    Sysinternals tool suite.
  • The Alarm Clock application bytecode has been
    aggressively obfuscated to discourage the use of
    decompilation as a strategy for learning the
    program's behavior.

117
Results (contd) Identifying,
Monitoring, and Reporting Malware (contd)
  • The Alarm Clock application is a benign software
    Trojan that, in addition to being a rudimentary
    alarm clock, performs unadvertised functions on
    background threads:
  • Logs information from the Windows registry
  • Logs locations of office documents in the file
    system.
  • Scans for computers that respond to an ICMP ping.
  • Paced background threads are used.
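One way to triage a captured event list is to group events by kind and flag anything outside what the program's advertised function would need. The Java sketch below is a hypothetical helper, not a Sysinternals feature. The Event class, the category names, and the prefix rules are assumptions for illustration, and the operation strings in the usage example are sample values modeled on the kind of names a process monitor reports.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class UnexpectedActivityFilter {
    // One captured event: what was done and to which resource.
    static final class Event {
        final String operation;
        final String path;

        Event(String operation, String path) {
            this.operation = operation;
            this.path = path;
        }
    }

    // Crude grouping by operation name; the prefixes are assumptions.
    static String category(String operation) {
        if (operation.startsWith("Reg")) return "registry";
        if (operation.startsWith("TCP") || operation.startsWith("UDP")) return "network";
        if (operation.contains("File")) return "file";
        return "other";
    }

    // Keep only events whose category the advertised functionality does not explain.
    static List<Event> unexpected(List<Event> events, Set<String> expectedCategories) {
        List<Event> flagged = new ArrayList<>();
        for (Event event : events) {
            if (!expectedCategories.contains(category(event.operation))) {
                flagged.add(event);
            }
        }
        return flagged;
    }

    public static void main(String[] args) {
        List<Event> events = Arrays.asList(
            new Event("CreateFile", "C:\\alarm\\settings.ini"),
            new Event("RegQueryValue", "HKCU\\Software\\Example"),
            new Event("TCP Connect", "192.0.2.10:80"));
        Set<String> expected = new HashSet<>(Arrays.asList("file"));
        for (Event event : unexpected(events, expected)) {
            System.out.println(event.operation + " " + event.path);
        }
    }
}
```

A flagged event is only a prompt for closer inspection; legitimate programs also touch the registry and the network.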

118
Results (contd) Identifying,
Monitoring, and Reporting Malware (contd)
Figure 60. Background threads log information
about the user's system.
119
Results (contd) Identifying,
Monitoring, and Reporting Malware (contd)
Figure 61. Process Monitor session for the Alarm
Clock application.
120
Conclusions
  • Since programmers would benefit from reverse
    engineering education, instructors need to be
    able to teach it to them.
  • At the present time, computer science instructors
    will be hard pressed to find materials for
    teaching a course that are compatible with
    classroom delivery.
  • Several books exist on reverse engineering that
    cater to industry professionals or those
    interested in self-study.
  • However, in a university setting, instructors
    engage students in ordered learning through
    exercises, quizzes, and exams.

121
Conclusions
  • Universities should continue to work toward
    establishing standard content for software
    reverse engineering and software maintenance
    courses.
  • Software Reverse Engineering is an activity that
    relies heavily on tools. Better tools can only
    make this activity more feasible and reliable.
  • The market for reverse engineering tools does not
    seem saturated; there appear to be some
    opportunities for either new open-source projects
    or commercial products.

122
Thank you!
123
References
  • [1] M. R. Ali, "Why teach reverse engineering?"
    ACM SIGSOFT SEN, v.30, n.4, pp.1-4, Jul 2005.
  • [2] M. El-Ramly, "Experience in teaching a
    software reengineering course," in Proceedings of
    the 28th International Conference on Software
    Engineering (ICSE), Shanghai, China, 2006, pp.
    699-702.
  • [3] A. V. Deursen, J. Favre, R. Koschke, and J.
    Rilling, "Experiences in Teaching Software
    Evolution and Program Comprehension," in
    Proceedings of the 11th IEEE International
    Workshop on Program Comprehension, Washington,
    DC, 2003, pp. 2834-284.
  • [4] B. W. Weide, W. D. Heym, J. E. Hollingsworth,
    "Reverse engineering of legacy code exposed," in
    Proceedings of the 17th International Conference
    on Software Engineering, Seattle, WA, 1995, pp.
    327-331.
  • [5] H. M. Sneed, "Encapsulation of legacy
    software: A technique for reusing legacy software
    components," in Annals of Software Engineering,
    v.9, n.4, pp.293-313, 2000.

124
References (contd)
  • [6] B. W. Weide, W. D. Heym, J. E. Hollingsworth,
    "Reverse engineering of legacy code exposed," in
    Proceedings of the 17th International Conference
    on Software Engineering, Seattle, WA, 1995, pp.
    327-331.
  • [7] E. Eilam, Secrets of Reverse Engineering,
    Indianapolis, IN: Wiley, 2005. M. Stamp,
    Information Security: Principles and Practice,
    Hoboken, NJ: John Wiley & Sons, 2006.
  • [8] Symantec Corp. (2009, Apr.). Symantec Global
    Internet Security Threat Report. [Online].
    Available:
    http://eval.symantec.com/mktginfo/enterprise/white_papers/bwhitepaper_internet_security_threat_report_xiv_04-2009.en-us.pdf.
    (Accessed April 26th, 2009).
  • [9] Microsoft Corporation, Windows Sysinternals:
    utilities to help manage, troubleshoot and
    diagnose Windows systems and applications.
    [Online]. Available:
    http://technet.microsoft.com/en-us/sysinternals/default.aspx.
    (Accessed April 30th, 2009).