Partial Automation of an Integrated Reverse Engineering Environment of Binary Code

1
Partial Automation of an Integrated Reverse
Engineering Environment of Binary Code
  • Author: Cristina Cifuentes
  • Proceedings of the Third Working Conference on
    Reverse Engineering, 1996
  • Pages 50-56
  • 8-10 Nov. 1996
  • Monterey, CA, USA

2
Introduction
  • What's the problem?
  • Preserving the investment made in software when
    a newer machine becomes available.
  • Two points of view on the migration of software
  • From a commercial point of view
  • Software needs to be available on the new machine
    at the same time the machine becomes available.
  • From a software developer's point of view
  • Software developed in-house is an investment and
    an asset to an organization.
  • Software migration is not a trivial problem!

3
Four approaches to solve this problem
  • Use a native compiler to compile the source code
    for the new platform.
  • Emulation of the old machine's instructions using
    micro-code hardware in the new machine.
  • Emulation of the old machine's instructions in
    software on the new machine.
  • Binary translation.

4
Problems
  • On using a native compiler to compile the source
    code:
  • Compilation requires access to all source code,
    which may not be feasible.
  • On emulation of the old machine's instructions
    using micro-code hardware:
  • It requires special micro-programmable
    hardware, which is not included in today's RISC
    machines.
  • On emulation of the old machine's instructions in
    software:
  • Software emulation is easy to implement but slow.

5
Structure of a Binary Translator and a Decompiler
  • Front-end
  • The front-end is a machine-dependent module that
    loads the source binary program, disassembles it,
    and translates it into an intermediate
    representation.
  • Middle-end
  • Analyzes the code for the translation and
    performs optimizations on it.
  • Back-end
  • It is a target machine-dependent module that
    generates code for the target machine
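
The three-stage structure above can be sketched as a small pipeline. This is only an illustration under stated assumptions: the 2-byte toy encoding, the `IRInstr` type, and the functions `front_end`, `middle_end`, `back_end`, and `translate` are hypothetical names and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class IRInstr:
    """One instruction of a hypothetical intermediate representation."""
    op: str
    arg: int


# Hypothetical source-machine encoding: each instruction is (opcode, operand).
SOURCE_OPCODES = {0x00: "nop", 0x01: "push", 0x02: "add"}


def front_end(binary):
    """Source-machine-dependent: decode the binary into IR."""
    return [IRInstr(SOURCE_OPCODES[binary[i]], binary[i + 1])
            for i in range(0, len(binary) - 1, 2)]


def middle_end(ir):
    """Machine-independent: analysis and optimization (here: drop nops)."""
    return [ins for ins in ir if ins.op != "nop"]


def back_end(ir):
    """Target-machine-dependent: emit text for an imaginary target."""
    return "\n".join(f"{ins.op.upper()} {ins.arg}" for ins in ir)


def translate(binary):
    return back_end(middle_end(front_end(binary)))
```

Only `front_end` knows the source encoding and only `back_end` knows the target syntax, so retargeting would mean swapping one end while keeping the middle.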

6
Integrated Reverse Engineering Environment for
Binary Code

7
A Compiler's Structure

8
An Integrated Reverse Engineering Environment for
Binary Code
  • Loader
  • Disassembler
  • Signature generator
  • Prototype generator
  • New Jersey machine-code toolkit (NJMC)
  • Idiom analyzer
  • Control flow graph generator
  • UBM/UDM

9
Loader
  • Works just like the operating system loader.
  • Reads the binary file by decoding the binary-file
    format used to store the program, and determines
    the file's structure (instructions, tables,
    symbol tables).
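
As a rough illustration of decoding a binary-file format, the sketch below reads the fixed-size header of a 64-bit little-endian ELF file. ELF is an assumption chosen because its layout is widely documented; the slides do not name a specific format, and `ElfHeader` and `load_header` are hypothetical names.

```python
import struct
from typing import NamedTuple

# ELF64 file header, little-endian: 16-byte ident followed by 13 fields (64 bytes).
ELF64_HEADER = struct.Struct("<16sHHIQQQIHHHHHH")


class ElfHeader(NamedTuple):
    machine: int               # e_machine: target architecture code
    entry: int                 # e_entry: virtual address of the entry point
    section_table_offset: int  # e_shoff: file offset of the section header table
    section_count: int         # e_shnum: number of section headers


def load_header(image):
    """Decode the file header; assumes a 64-bit little-endian ELF image."""
    if len(image) < ELF64_HEADER.size:
        raise ValueError("file too short for an ELF64 header")
    fields = ELF64_HEADER.unpack_from(image)
    ident = fields[0]
    if ident[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if ident[4] != 2 or ident[5] != 1:
        raise ValueError("this sketch handles only 64-bit little-endian ELF")
    (_, _type, machine, _version, entry, _phoff, shoff, _flags,
     _ehsize, _phentsize, _phnum, _shentsize, shnum, _shstrndx) = fields
    return ElfHeader(machine, entry, shoff, shnum)
```

A fuller loader would go on to walk the section header table to locate code, data, and symbol tables; the entry point returned here is where a disassembler would start.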

10
Disassembler
  • Parses the binary image of the program and
    translates it to assembler or some equivalent
    representation.
  • Parsing starts at the entry point and follows
    all paths from that point.
  • Analyzes the target addresses of indexed and
    indirect jumps or calls.
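
Following all paths from the entry point can be sketched with a worklist. The instruction set below is a made-up toy (2 bytes per instruction: opcode, operand), and `disassemble` is a hypothetical name; indirect jumps are only recorded as unresolved, since resolving them needs the further analysis the slide mentions.

```python
# Hypothetical toy ISA: every instruction is 2 bytes (opcode, operand).
NAMES = {0: "nop", 1: "jmp", 2: "jz", 3: "call", 4: "ret", 5: "jmpi"}


def disassemble(image, entry):
    """Decode only the bytes reachable from `entry`.

    Returns (listing, unresolved): listing maps address -> (mnemonic, operand),
    unresolved lists addresses of indirect jumps whose targets are unknown.
    """
    listing = {}
    unresolved = []
    worklist = [entry]
    while worklist:
        addr = worklist.pop()
        while addr not in listing and 0 <= addr < len(image) - 1:
            if image[addr] not in NAMES:
                raise ValueError(f"unknown opcode at address {addr}")
            op, arg = NAMES[image[addr]], image[addr + 1]
            listing[addr] = (op, arg)
            if op == "jmp":
                addr = arg                # unconditional: continue at the target
            elif op in ("jz", "call"):
                worklist.append(arg)      # visit the target later
                addr += 2                 # and keep following the fall-through
            elif op == "ret":
                break                     # path ends here
            elif op == "jmpi":
                unresolved.append(addr)   # target unknown without more analysis
                break
            else:
                addr += 2
    return listing, unresolved
```

Because decoding never proceeds linearly past an unconditional jump or return, data bytes embedded between code regions are not misread as instructions unless some path reaches them.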

11
Idiom analyzer
  • Detects idioms and translates each such sequence
    of instructions into intermediate instructions.
  • An idiom is a sequence of instructions that has a
    special meaning that can't be derived from the
    semantics of the individual instructions alone.
  • Examples
  • ARM
  • bl foo
  • x86
  • sub ax, immedLo
  • sbb dx, immedHi
  • becomes: sub dx:ax, immedHi:immedLo
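
The x86 example can be sketched as a peephole pass over decoded instructions. The tuple form `(mnemonic, dest, immediate)`, the `sub32` pseudo-instruction, and the function name `fold_sub32` are assumptions for illustration; the sketch also assumes the low word is in `ax` and the borrow is propagated into `dx` by `sbb`.

```python
def fold_sub32(instrs):
    """Fold 'sub ax, lo' followed by 'sbb dx, hi' into one 32-bit subtract.

    instrs is a list of (mnemonic, dest, immediate) tuples (hypothetical form).
    """
    out = []
    i = 0
    while i < len(instrs):
        cur = instrs[i]
        nxt = instrs[i + 1] if i + 1 < len(instrs) else None
        is_idiom = (
            nxt is not None
            and cur[0] == "sub" and cur[1] == "ax"
            and nxt[0] == "sbb" and nxt[1] == "dx"
            and isinstance(cur[2], int) and isinstance(nxt[2], int)
        )
        if is_idiom:
            # Combine the two 16-bit immediates into one 32-bit constant.
            out.append(("sub32", "dx:ax", (nxt[2] << 16) | cur[2]))
            i += 2
        else:
            out.append(cur)
            i += 1
    return out
```

Taken separately, the two instructions look like independent 16-bit operations; only the pair reveals a single subtraction on a 32-bit value, which is what the intermediate instruction records.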

12
Control flow graph generator
  • Constructs a control flow graph for each
    subroutine of the program.
  • The control flow graph is part of the
    intermediate representation of any reverse
    engineering tool that deals with binary code.
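
A minimal sketch of building a control flow graph for one subroutine follows, using the common leader-based basic-block construction (an assumption; the slides do not describe the algorithm). Instructions are hypothetical `(mnemonic, target)` pairs indexed by position, jump targets are assumed to be valid indices, and `build_cfg` is an illustrative name.

```python
def build_cfg(instrs):
    """Split a subroutine into basic blocks and connect them.

    instrs: list of (mnemonic, target) pairs; target is an instruction index
    for 'jmp'/'jz' and None otherwise.
    Returns (blocks, edges), both keyed by the index of each block's first
    instruction.
    """
    if not instrs:
        return {}, {}
    n = len(instrs)
    # A leader starts a block: the first instruction, every jump target,
    # and every instruction that follows a jump or return.
    leaders = {0}
    for i, (op, target) in enumerate(instrs):
        if op in ("jmp", "jz"):
            leaders.add(target)
        if op in ("jmp", "jz", "ret") and i + 1 < n:
            leaders.add(i + 1)
    starts = sorted(leaders)
    blocks = {}
    for k, start in enumerate(starts):
        end = starts[k + 1] if k + 1 < len(starts) else n
        blocks[start] = list(range(start, end))
    edges = {}
    for start, body in blocks.items():
        last = body[-1]
        op, target = instrs[last]
        succ = []
        if op in ("jmp", "jz"):
            succ.append(target)           # taken branch
        if op not in ("jmp", "ret") and last + 1 < n:
            succ.append(last + 1)         # fall-through
        edges[start] = succ
    return blocks, edges
```

For an if/else shape such as `cmp; jz 4; add; jmp 5; sub; ret`, this yields four blocks, with the block ending in `jz` having two successors.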

13
Second Generation Tools
  • Signature generator
  • Automatically determines library signatures
  • Prototype generator
  • Automatically determines the types of the formal
    arguments of library subroutines, and the type of
    the return value for functions.
  • New Jersey machine-code toolkit (NJMC)
  • Facilitates the decoding of machine instructions
    by providing a specification language to define
    machine instructions.

14
UBM/UDM
  • Universal binary-translation machine (UBM)
  • Generates binary programs for the target machine.
  • Universal decompilation machine (UDM)
  • Generates high-level language code (such as C).

15
Conclusions
  • This paper presents an integrated environment for
    the reverse engineering of binary programs.
  • Such an environment is suitable for the
    development of disassemblers, binary translators
    and decompilers.
  • Retargetable techniques are essential in order
    to develop such tools for a variety of machines
    rather than for one specific machine.