Nora Sovarel and Joel Winstead

1
Monoculture and Diversity
  • Nora Sovarel and Joel Winstead
  • 21 September 2004

2
What is monoculture?
  • the cultivation or growth of a single crop or
    organism especially on agricultural or forest
    land
  • Merriam-Webster Online

3
Monoculture in Biology
  • The Irish Potato Famine, 1845-1850
  • About half of Ireland's population depended on
    the potato crop
  • The fungus Phytophthora infestans appeared in
    Ireland in 1845
  • Every potato farm in Ireland was vulnerable
  • Consequences for Ireland
  • 1 million people died
  • 1-2 million emigrated

4
What about computing?
  • Most statistics agree that Microsoft has at least
    90% of the OS market.
  • For example, thecounter.com:
  • Win XP: 56%
  • Win 98: 20%
  • Win 2000: 15%
  • Win NT: 1%
  • Win 95, Win 3.x: less than 1%
  • http://www.thecounter.com/stats/2004/August/os.php

5
Monocultures in Computing
  • Operating Systems: 90% Microsoft
  • Browsers: IE, Opera, Netscape
  • Web Servers: Apache, IIS
  • Routers: 85% Cisco
  • Processors: x86, Sparc

6
Why are we in this situation?
  • Users: single interface
  • System Administrators: uniform software
    configurations
  • Software Companies:
  • Lower distribution and maintenance costs
  • Compatibility and file formats

7
What are the consequences?
  • Same vulnerabilities for everyone
  • One worm/virus for majority of systems
  • Virus writers also like economy of scale
  • write once, exploit everywhere

8
What can we do?
  • opposite of monoculture
  • diversity
  • more than one

9
Diversity as a defense
  • If we're not all running exactly the same code:
  • A single attack cannot compromise everybody
  • epidemic attacks cease to scale
  • An attacker won't know what specific attack to
    use against a particular target
  • targeted attacks become more expensive

10
How many?
  • Are 10 variants of each piece of software and
    hardware enough?
  • normal operations disrupted with only a small
    fraction of computers attacked
  • Witty worm
  • applications show the same vulnerabilities across
    operating systems

11
We need more....
  • We need every system to look different to the
    attacker
  • We need all systems to look exactly the same to
    the users and administrators
  • We need to be able to deploy and patch systems
    quickly and economically

12
Can we have the benefits without the disadvantages?
  • Same user interface
  • Different vulnerabilities
  • Can the right kind of diversity be generated
    automatically, without side-effects?

13
Roadmap
  • Threat Model
  • Classes of attacks
  • Diversity defences
  • Address space randomization
  • Pointer randomization
  • Instruction set randomization
  • Keyed hash functions
  • Effectiveness of these defences

14
Threat Model
  • Threat: automated, destructive worms
  • Require quick, automated, remote infection
  • Write-once, exploit everywhere
  • Assume attacker knows code, but not key material
  • We are not:
  • defending against local attackers
  • defending against expensive brute-force attacks
  • defending against targeted attacks
  • Goal: make cost of automated infection high
  • Crashing program is better than spreading worm

15
Classes of Attacks
  • Code injection attacks
  • Existing code attacks
  • Algorithmic complexity attacks

16
Code Injection Attacks
  • Stack Smashing Attack
  • SQL Code Injection
  • Perl Code Injection
  • Double Pointer Attacks

17
Stack Smashing
    main(int argc, char **argv) {
        ... foo(a, b, c); ...
        if (everything_is_kosher) exec("/bin/sh");
    }
    void foo(int a, int b, int c) {
        char buf[100];
        ... gets(buf); ...
    }

  [Figure: stack diagram with cells labelled argc, argv, a, b, c, return addr, buf, plus a separate "return addr" label]

18
Stack Smashing
  [Figure: the same code and stack diagram as the previous slide (cells argc, argv, a, b, c, return addr, buf), now with a "malicious payload" label]

19
Stack Smashing
  • Payload overwrites return address
  • New address can point to injected code or
    existing code
  • The payload can also overwrite local variables
  • Pointers to code can also occur in other places:
  • virtual functions, callbacks
  • Runtime type information on the heap can also be
    overwritten
  • See "Smashing the Stack for Fun and Profit" in
    Phrack 49 for more

  [Figure: stack diagram with cells labelled argc, argv, a, b, c, return addr, buf, plus the labels "return addr" and "malicious code"]

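
The overwrite described on this slide can be modelled in a few lines. This is a minimal sketch in Python, not the slide's C program: it treats foo's frame as a flat byte array in which the 100-byte buf sits directly below a 4-byte saved return address, and it copies input with no bounds check, as gets() does. The frame layout, the two addresses, and the helper names (make_frame, unchecked_copy, saved_return_address) are assumptions made for illustration.

```python
import struct

BUF_SIZE = 100           # matches `char buf[100]` on the slide
RET_OFFSET = BUF_SIZE    # assumption: saved return address sits right after buf
RET_ADDR = 0x08048ABC    # hypothetical legitimate return address
EVIL_ADDR = 0xBFFFF000   # hypothetical address of injected code


def make_frame():
    """Toy stack frame: 100 bytes of buf, then a 4-byte return address."""
    return bytearray(BUF_SIZE) + bytearray(struct.pack("<I", RET_ADDR))


def unchecked_copy(frame, data):
    """Copy input to the start of buf with no length check, like gets()."""
    frame[0:len(data)] = data


def saved_return_address(frame):
    """Read back the 4-byte little-endian value stored after buf."""
    return struct.unpack_from("<I", frame, RET_OFFSET)[0]


frame = make_frame()
unchecked_copy(frame, b"A" * 20)            # fits inside buf
benign = saved_return_address(frame)        # return address is untouched

# 100 filler bytes followed by the attacker's chosen address: 104 bytes
# written into a 100-byte buffer.
payload = b"\x90" * BUF_SIZE + struct.pack("<I", EVIL_ADDR)
unchecked_copy(frame, payload)
smashed = saved_return_address(frame)       # now the attacker's value

print(hex(benign), hex(smashed))
```

The model only shows why an unchecked copy reaches the return address; real frames also hold saved registers and alignment padding, so the exact offset differs per compiler and platform.
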
20
Existing Code Attacks
  • Format String Attack
  • Data Modification Attack
  • Integer Overflow
  • return-to-libc attacks

21
Why do these attacks work?
  • The way code, stack, and data are laid out in
    memory is fairly predictable

22
Why do these attacks work?
  • The way code, stack, and data are laid out in
    memory is fairly predictable

  [Figure: address space layout with regions labelled Shared Libraries, Stack, Heap, Code]

23
Defence Through Diversity
  • Solution: randomise layout of address space

  [Figure: two address space layouts, each with regions labelled Shared Libraries, Stack, Heap, Code]

24
What does this buy us?
  • This can be done at link time
  • low overhead
  • Attacker must know or guess what address to jump
    to
  • The starting addresses of code, stack, heap, and
    library segments add some entropy
  • On a 32-bit system, about 16 bits for each
    segment
  • Is this enough?
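
As a rough illustration of the idea, the sketch below shifts each segment's base address by a random, page-aligned offset carrying 16 bits of entropy, the figure quoted above. The default base addresses, the page size, and the function names are assumptions chosen for the example; real systems pick these values differently.

```python
import secrets

PAGE_SIZE = 4096
ENTROPY_BITS = 16   # per-segment entropy quoted on the slide

# Hypothetical default bases, spaced far enough apart that randomized
# segments cannot overlap in this toy model.
DEFAULT_BASES = {
    "code": 0x10000000,
    "heap": 0x30000000,
    "libs": 0x50000000,
    "stack": 0x70000000,
}


def randomized_layout(rng=secrets.randbelow):
    """Shift every segment base by a random number of pages."""
    layout = {}
    for name, base in DEFAULT_BASES.items():
        offset_pages = rng(2 ** ENTROPY_BITS)
        layout[name] = base + offset_pages * PAGE_SIZE
    return layout


def guess_probability(attempts):
    """Chance that one of `attempts` distinct guesses hits a given base."""
    return min(1.0, attempts / 2 ** ENTROPY_BITS)


for name, base in randomized_layout().items():
    print(f"{name:5s} {base:#010x}")
print("single-guess success:", guess_probability(1))
```

Each run prints a different layout, which is the point: an address hard-coded into a worm is right for at most a small fraction of targets.
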

25
Attacking Address Space Randomization
  • Attacker needs address of only one function to
    make successful attack
  • Information leaks can reveal this
  • format string vulnerability
  • 16 bits can be brute-forced
  • Shacham et al. show how to do this in 216 seconds
    over a network
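
To see why 16 bits is within reach, it helps to count the expected number of guesses. The sketch below is plain arithmetic under two assumed models, not a reconstruction of the experiment cited above: either the layout stays fixed after a failed guess, so each candidate is tried once, or it is re-drawn after every failure, so each guess succeeds independently. The function names are hypothetical.

```python
from fractions import Fraction


def expected_probes_fixed(bits):
    """Layout unchanged after a failed guess: try each value once.

    The target is equally likely to sit at any position in the search
    order, so the mean position is (n + 1) / 2.
    """
    n = 2 ** bits
    return Fraction(n + 1, 2)


def expected_probes_rerandomized(bits):
    """Layout re-drawn after every failure: geometric distribution, mean n."""
    n = 2 ** bits
    return Fraction(n)


for bits in (16, 20, 32):
    print(bits,
          float(expected_probes_fixed(bits)),
          float(expected_probes_rerandomized(bits)))
```

Under these assumptions, re-randomizing after every crash only doubles the expected work, which is worth about one extra bit.
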

26
Can we use a larger key?
  • Can't get more than 20 bits without changing
    virtual memory system
  • We can add padding to stack and code
  • We can rearrange functions and data structures in
    memory
  • but this is tricky for shared libraries
  • But an attacker needs only one address to succeed
  • 64-bit address spaces may help

27
Address Space Obfuscation and Randomization
  • start address
  • reorder
  • gaps
  • encryption

28
Defenses - Stack
  • Canary Value
  • Write/Executable Pages
  • Padding
  • Local Variables Reordering
  • Parameter Reordering

29
Defenses - Memory Layout Randomization
  • Base Address Randomization
  • stack
  • heap
  • text
  • DLL

30
Defenses - Memory Layout Randomization
  • Reordering of static variables
  • Reordering of routines
  • Gaps in heap
  • Gaps between routines

31
Pointer Encryption
  • Rearranging address spaces doesn't give us a very
    large key
  • Can we have diversity not just in how memory is
    laid out, but in what pointers mean?
  • What if we encrypted all pointers in the program?
  • We could use a larger key
  • Attacker must guess key in order to overwrite a
    return address with something meaningful

32
PointGuard
  • Developed by Cowan et al. at Immunix
  • All pointers stored in memory are encrypted
  • Pointers are decrypted immediately before
    dereference
  • Pointers are encrypted before storing in memory
  • An attacker must guess key in order to generate
    valid pointer to attack code

33
PointGuard code transformation
  • Unlike address space transformations, requires
    compiler changes
  • Cleartext pointers appear only in registers
  • Registers are not vulnerable to modification
  • Encryption must be fast and efficient
  • We don't want to encrypt non-pointer data,
    because that would mean encrypting the buffer
    containing the attacker's pointer
  • Accessing libraries is tricky

34
Effectiveness of PointGuard
  • Overhead is low
  • but requires recompilation
  • interaction with non-PG-aware code is tricky
  • Defends against most code injection and
    return-to-existing-code attacks
  • Does not defend against all data modification
    attacks
  • Information leaks may reveal ciphertext, allowing
    attacker to guess key

35
What if code gets in anyway?
  • The previous techniques work by preventing an
    attacker from jumping to malicious code in the
    system
  • What if we didn't think of every way that could
    happen?
  • Defense-in-depth
  • make sure injected code won't run no matter how
    control is transferred

36
What must an attacker know?
  • An attacker must know how to write code to run on
    the targeted system
  • SPARC exploit code will not run on x86
  • What if no two computers had the same instruction
    set?
  • It would be difficult or impossible to write
    exploit code that will run everywhere

37
Instruction Set Randomization
  • Kc, Keromytis, and Prevelakis
  • Encrypt the program's instructions with a
    different key for each copy of the program
  • Decrypt each instruction at runtime immediately
    before execution
  • Attacker must know key in order to write code
    that will decrypt to something meaningful
  • Unsuccessful attack will cause illegal
    instruction, address, or raise exception
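
A toy interpreter makes the mechanism concrete. The sketch below assumes an XOR encoding, a made-up four-opcode instruction set whose opcode is the top byte of a 32-bit word, and a fixed key for repeatability; none of these details come from the authors' implementation.

```python
# Made-up instruction set: opcode in the top byte of each 32-bit word.
VALID_OPCODES = {0x01: "LOAD", 0x02: "ADD", 0x03: "STORE", 0xFF: "HALT"}

# Hypothetical per-copy key; in practice drawn at random for each copy.
KEY = 0x9E3779B9


class IllegalInstruction(Exception):
    pass


def encode(program, key):
    """Encrypt every instruction word when the program is installed."""
    return [word ^ key for word in program]


def run(encoded, key):
    """Decrypt each word immediately before 'executing' it."""
    trace = []
    for word in encoded:
        plain = word ^ key
        opcode = plain >> 24
        if opcode not in VALID_OPCODES:
            raise IllegalInstruction(hex(plain))
        trace.append(VALID_OPCODES[opcode])
    return trace


program = [0x01000010, 0x02000020, 0x03000030, 0xFF000000]

# Legitimate code was encoded with the key, so it decodes correctly.
print(run(encode(program, KEY), KEY))

# Injected code arrives unencoded; decoding turns it into garbage.
try:
    run(program, KEY)
except IllegalInstruction as exc:
    print("injected code rejected:", exc)
```

In this toy the garbage is caught at once because few opcodes are valid; on a real processor a scrambled word may still decode to some legal instruction, so the failure can come a few instructions later.
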

38
How many bits do we need?
  • Strong symmetric cryptography typically requires
    a 128-bit key or larger to resist known-plaintext
    attacks
  • Large performance penalty to decrypt
  • If we assume attacker doesn't have our
    ciphertext, we can use much smaller key
  • 32-bit XOR may be good enough if our goal is to
    prevent large-scale automated worms

39
Encoding schemes
  • XOR
  • each word in legitimate code is XORed with the
    same key
  • Bit permutation
  • The bits in each word are rearranged according to
    a key
  • log2(32!) ≈ 118 bits, for 32-bit word
  • Can move bits from one instruction to another
  • In practice, key size is smaller
  • more than one way to encode an instruction
  • more than one harmful instruction
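
The bit permutation scheme and its key space can be checked with a short sketch. The permutation here is derived from a seed with Python's non-cryptographic random module, purely as a stand-in for a real key; the function names and the sample word are assumptions. Evaluating log2(32!) numerically gives a little under 118 bits.

```python
import math
import random

WORD_BITS = 32


def make_permutation(seed):
    """Derive a bit permutation from a seed (stand-in for a real key)."""
    perm = list(range(WORD_BITS))
    random.Random(seed).shuffle(perm)
    return perm


def permute(word, perm):
    """Move bit i of `word` to position perm[i]."""
    out = 0
    for i, dest in enumerate(perm):
        out |= ((word >> i) & 1) << dest
    return out


def invert(perm):
    """Build the permutation that undoes `perm`."""
    inv = [0] * len(perm)
    for i, dest in enumerate(perm):
        inv[dest] = i
    return inv


perm = make_permutation(2004)
word = 0x8B450C90                       # arbitrary 32-bit instruction word
encoded = permute(word, perm)
decoded = permute(encoded, invert(perm))

# Number of possible keys is 32!, so the key length in bits is log2(32!).
key_bits = math.log2(math.factorial(WORD_BITS))
print(hex(encoded), hex(decoded), round(key_bits, 2))
```

A permutation keeps the number of set bits in each word unchanged, which is one reason the effective key size is smaller than the raw count of permutations suggests.
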

40
Variable-sized instructions
  • x86 instructions vary in size
  • Some instructions are 1 byte
  • 8 bit key insufficient
  • Padding with NOPs has cost
  • generally requires source code
  • Solution 1
  • Pad branch targets only
  • Solution 2
  • Encrypt words, not instructions

41
x86 Implementation
  • Authors modified Bochs x86 emulator to decrypt
    code at runtime
  • Encrypted image consisting of kernel and
    statically-linked binaries
  • Cost of emulation is high for CPU-bound processes
  • Not so bad for I/O bound processes
  • Reprogrammable processors could reduce overhead
    (Transmeta Crusoe)

42
Interpreted Languages
  • Some code injection attacks use VBScript, SQL,
    Perl, or shell languages
  • Append key material to keywords
  • e.g. foreach becomes foreach12345
  • Overhead is negligible
  • The languages are interpreted anyway
  • Error messages may reveal key
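
The keyword scheme can be sketched with a toy SQL-like example. Trusted query templates have the key appended to every keyword once, ahead of time; before execution, a filter strips the key from tagged keywords and rejects any bare keyword, which can only have come from untrusted input. The keyword list, the template, and the function names are assumptions, and the key reuses the 12345 suffix from the example above.

```python
import re

KEYWORDS = {"select", "from", "where", "or", "and", "drop"}   # toy subset
KEY = "12345"   # suffix from the slide; a real key would be random
WORD = re.compile(r"[A-Za-z_][A-Za-z_0-9]*")


def tag_keywords(trusted_source, key=KEY):
    """Append the key to every keyword in trusted code."""
    def repl(match):
        word = match.group(0)
        return word + key if word.lower() in KEYWORDS else word
    return WORD.sub(repl, trusted_source)


def derandomize(query, key=KEY):
    """Strip the key from tagged keywords; reject untagged ones."""
    def repl(match):
        word = match.group(0)
        if word.lower() in KEYWORDS:
            raise ValueError("untagged keyword: " + word)
        stem = word[:-len(key)] if word.endswith(key) else ""
        return stem if stem.lower() in KEYWORDS else word
    return WORD.sub(repl, query)


TEMPLATE = tag_keywords("select * from users where name = '{name}'")

print(derandomize(TEMPLATE.format(name="alice")))

try:
    derandomize(TEMPLATE.format(name="x' or '1'='1"))
except ValueError as exc:
    print("rejected:", exc)
```

This filter works on raw words and so also rejects keywords inside string literals; a real implementation would sit in the interpreter's lexer. The error message printed here also illustrates the last bullet: messages that echo tagged tokens would hand the key to an attacker.
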

43
Libraries
  • Libraries present a problem
  • Use different keys for applications and libraries
  • Use single key for all system libraries
  • Change the key from time to time
  • Or
  • Statically link everything so that library code
    uses same key as application

44
Other issues
  • Self-modifying code won't run
  • (Yes, gcc sometimes generates this)
  • Significant performance penalty
  • Attacker with ciphertext could brute-force the
    key offline
  • No defense against local attackers
  • May be okay for defense against worms
  • Does not resist existing code attacks
  • Does not resist data corruption attacks

45
Algorithmic Complexity Attacks
  • The Linux networking code uses hash tables to
    classify packets
  • Hash tables, binary trees, and other data
    structures have good performance in average case
  • But poor performance in worst case
  • An attacker who knew the hash function could
    deliberately generate collisions
  • This can force worst-case behavior
  • This can cause denial of service
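
The effect of deliberate collisions is easy to reproduce with a toy table. The hash below (sum of bytes modulo the bucket count) is deliberately simple and is not the function used in the Linux networking code; it stands in for any hash function the attacker knows. The names and sizes are assumptions.

```python
NUM_BUCKETS = 64


def weak_hash(key: bytes) -> int:
    """A publicly known, unkeyed hash: sum of bytes modulo bucket count."""
    return sum(key) % NUM_BUCKETS


def longest_chain(keys):
    """Length of the fullest bucket after inserting all keys."""
    buckets = [0] * NUM_BUCKETS
    for k in keys:
        buckets[weak_hash(k)] += 1
    return max(buckets)


# Ordinary input: 256 keys spread evenly, 4 per bucket.
normal = [i.to_bytes(2, "big") for i in range(256)]

# Attack input: 256 distinct keys whose bytes always sum to 255,
# so every one lands in the same bucket.
attack = [bytes([i, 255 - i]) for i in range(256)]

print(longest_chain(normal), longest_chain(attack))
```

With every key in one bucket, each lookup walks a chain as long as the whole table, so per-operation cost grows from roughly constant to linear in the number of entries.
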

46
Diversity as a Defense
  • Attacker can find collisions only if he knows
    the hash function
  • What if every copy used a different hash
    function?
  • Solution: keyed hash functions
  • Every copy uses same code
  • Every copy uses a different key
  • Attacker cannot force collisions without key

47
Effectiveness
  • The techniques presented are orthogonal
  • Other attacks
  • integer overflow
  • data modification
  • Other threat models
  • local attacker
  • determined remote attacker
  • denial of service

48
Other approaches
  • StackGuard, StackShield, MemGuard, etc.
  • bounds checking, canaries, non-executable stack
    and heap
  • Safe library routines, wrappers
  • Sandboxes and safe languages (Java)
  • Static analysis to detect (or prove the absence
    of) buffer overflows

49
Will this prevent catastrophic failures?
  • 3. Things will be much like they are now:
    persistent threats, common annoyances, but people
    will still trust the Internet for semi-critical
    tasks.
  • 4. Technologies have emerged (and been
    successfully deployed) that make epidemic attacks
    a thing of the past. The Internet will be
    trusted for the most critical tasks.
  • Do these techniques give us hope for (4)?