Nora Sovarel and Joel Winstead - PowerPoint PPT Presentation

About This Presentation

Title:

Nora Sovarel and Joel Winstead

Description:

Monoculture and Diversity Nora Sovarel and Joel Winstead 21 September 2004 What is monoculture? the cultivation or growth of a single crop or organism especially ... – PowerPoint PPT presentation

Slides: 50

Provided by: jaw7

Learn more at: https://www.cs.virginia.edu

Transcript and Presenter's Notes

1
Monoculture and Diversity

Nora Sovarel and Joel Winstead
21 September 2004

2
What is monoculture?

the cultivation or growth of a single crop or
organism especially on agricultural or forest
land
Merriam-Webster Online

3
Monoculture in Biology

The Irish Potato Famine, 1845-1850
About half of Ireland's population depended on
the potato crop
The fungus Phytophthora infestans appeared in
Ireland in 1845
Every potato farm in Ireland was vulnerable
Consequences for Ireland
1 million people died
1-2 million emigrated

4
What about computing?

Most statistics agree that Microsoft has at least
90% of the OS market.
For example, thecounter.com:
Win XP 56%
Win 98 20%
Win 2000 15%
Win NT 1%
Win 95, Win 3.x: less than 1%
http://www.thecounter.com/stats/2004/August/os.php

5
Monocultures in Computing

Operating Systems: 90% Microsoft
Browsers: IE, Opera, Netscape
Web Servers: Apache, IIS
Routers: 85% Cisco
Processors: x86, Sparc

6
Why are we in this situation?

Users: single interface
System Administrators: uniform software
configurations
Software Companies:
Lower distribution and maintenance costs
Compatibility and file formats

7
What are the consequences?

Same vulnerabilities for everyone
One worm/virus for majority of systems
Virus writers also like economies of scale:
write once, exploit everywhere

8
What can we do?

The opposite of monoculture is diversity:
more than one

9
Diversity as a defense

If we're not all running exactly the same code:
A single attack cannot compromise everybody
(epidemic attacks cease to scale)
An attacker won't know what specific attack to
use against a particular target
(targeted attacks become more expensive)

10
How many?

Are 10 variants of each piece of software and
hardware enough?
Normal operations can be disrupted when only a small
fraction of computers is attacked
(e.g. the Witty worm)
Applications show the same vulnerabilities across OSes

11
We need more....

We need every system to look different to the
attacker
We need all systems to look exactly the same to
the users and administrators
We need to be able to deploy and patch systems
quickly and economically

12
Can we have the benefits without the
disadvantages?

Same user interface
Different vulnerabilities
Can the right kind of diversity be generated
automatically, without side-effects?

13
Roadmap

Threat Model
Classes of attacks
Diversity defences
Address space randomization
Pointer randomization
Instruction set randomization
Keyed hash functions
Effectiveness of these defences

14
Threat Model

Threat: automated, destructive worms
Require quick, automated, remote infection
Write-once, exploit everywhere
Assume the attacker knows the code, but not the key material
We are not:
defending against local attackers
defending against expensive brute-force attacks
defending against targeted attacks
Goal: make the cost of automated infection high
A crashing program is better than a spreading worm

15
Classes of Attacks

Code injection attacks
Existing code attacks
Algorithmic complexity attacks

16
Code Injection Attacks

Stack Smashing Attack
SQL Code Injection
Perl Code Injection
Double Pointer Attacks

17
Stack Smashing

main(int argc, char **argv) {
  ... foo(a, b, c); ...
  if (everything_is_kosher) exec("/bin/sh");
}
void foo(int a, int b, int c) {
  char buf[100];
  ... gets(buf); ...
}

Stack: return addr, argc, argv, a, b, c, return addr, buf
18
Stack Smashing

main(int argc, char **argv) {
  ... foo(a, b, c); ...
  if (everything_is_kosher) exec("/bin/sh");
}
void foo(int a, int b, int c) {
  char buf[100];
  ... gets(buf); ...
}

Stack: return addr, argc, argv, a, b, c, return addr, buf
The malicious payload is written into buf by gets(buf)
19
Stack Smashing

Payload overwrites return address
New address can point to injected code or
existing code
The payload can also overwrite local variables
Pointers to code can also occur in other places:
virtual functions, callbacks
Runtime type information on the heap can also be
overwritten
See "Smashing the Stack for Fun and Profit" in
Phrack 49 for more

Stack: return addr, argc, argv, a, b, c, return addr, buf
The overwritten return addr now points to malicious code
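
The overwrite described on these slides can be imitated safely with a toy model. This is only a sketch: the frame layout (a 100-byte buf directly followed by a 4-byte return address), the addresses, and the helper names make_frame, unsafe_gets and read_return_addr are assumptions made for illustration, not the layout of any real compiler.

```python
import struct

BUF_SIZE = 100   # matches char buf[100] on the slide
RET_SIZE = 4     # assumed 32-bit return address


def make_frame(return_addr):
    """Toy frame: buf at offsets 0..99, saved return address right after it."""
    return bytearray(BUF_SIZE) + bytearray(struct.pack("<I", return_addr))


def unsafe_gets(frame, data):
    """Copy input into buf byte by byte with no length check, like gets()."""
    for i, byte in enumerate(data):
        frame[i] = byte


def read_return_addr(frame):
    """Read back the saved return address from the toy frame."""
    return struct.unpack("<I", bytes(frame[BUF_SIZE:BUF_SIZE + RET_SIZE]))[0]


frame = make_frame(0x08048000)
unsafe_gets(frame, b"A" * 20)                 # short input stays inside buf
print(hex(read_return_addr(frame)))           # return address is untouched

payload = b"A" * BUF_SIZE + struct.pack("<I", 0xDEADBEEF)
unsafe_gets(frame, payload)                   # 104 bytes into a 100-byte buf
print(hex(read_return_addr(frame)))           # return address is attacker-chosen
```

The attacker only has to know how far the return address is from the start of buf and which address to write there; both are predictable when every machine runs the same binary.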
20
Existing Code Attacks

Format String Attack
Data Modification Attack
Integer Overflow
return-to-libc attacks

21
Why do these attacks work?

The way code, stack, and data are laid out in
memory is fairly predictable

22
Why do these attacks work?

The way code, stack, and data are laid out in
memory is fairly predictable

[Figure: address space containing Shared Libraries, Stack, Heap, and Code]
23
Defence Through Diversity

Solution: randomise layout of address space

[Figure: two address spaces, each containing Shared Libraries, Stack, Heap, and Code]
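
A minimal sketch of the idea, assuming a 32-bit process and page-aligned random offsets. The default base addresses below are hypothetical values picked so the shifted segments cannot overlap in this toy; a real loader also has to account for segment sizes.

```python
import secrets

PAGE = 4096
# Hypothetical default segment bases for a 32-bit process (illustrative only).
DEFAULT_BASES = {
    "code": 0x08048000,
    "heap": 0x20000000,
    "libs": 0x40000000,
    "stack": 0xB0000000,
}


def randomize_layout(bits=16):
    """Shift every segment by its own random, page-aligned offset."""
    return {
        name: base + secrets.randbelow(1 << bits) * PAGE
        for name, base in DEFAULT_BASES.items()
    }


# Each run (each "copy" of the program) sees a different layout.
for name, base in randomize_layout().items():
    print(f"{name:5s} {base:#010x}")
```

An address that works against one copy is then very unlikely to work against another, which is what breaks "write once, exploit everywhere".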
24
What does this buy us?

This can be done at link time
low overhead
Attacker must know or guess what address to jump
to
The starting addresses of code, stack, heap, and
library segments add some entropy
On a 32-bit system, about 16 bits for each
segment
Is this enough?

25
Attacking Address Space Randomization

An attacker needs the address of only one function to
mount a successful attack
Information leaks can reveal this,
e.g. through a format string vulnerability
16 bits can be brute-forced
Shacham et al. show how to do this in 216 seconds
over a network
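
A rough way to see why 16 bits is small is to count guesses. The sketch below assumes the attacker picks candidate addresses uniformly; the function name expected_guesses and the rerandomize flag are illustrative. It counts attempts only and says nothing about how long each attempt takes.

```python
def expected_guesses(bits, rerandomize=False):
    """Average number of guesses needed to hit one of 2**bits equally likely values."""
    n = 2 ** bits
    if rerandomize:
        # Layout changes after every failed guess: each try succeeds with chance 1/n.
        return float(n)
    # Layout stays fixed: the attacker can cross off every failed candidate.
    return (n + 1) / 2


for bits in (16, 20):
    print(bits, expected_guesses(bits), expected_guesses(bits, rerandomize=True))
```

Re-randomizing after each failed attempt only doubles the expected work, so it does not change the order of magnitude.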

26
Can we use a larger key?

Can't get more than 20 bits without changing the
virtual memory system
We can add padding to stack and code
We can rearrange functions and data structures in
memory
but this is tricky for shared libraries
But an attacker needs only one address to succeed
64-bit address spaces may help

27
Address Space Obfuscation and Randomization

start address
reorder
gaps
encryption

28
Defenses - Stack

Canary Value
Write/Executable Pages
Padding
Local Variables Reordering
Parameter Reordering

29
Defenses - Memory Layout Randomization

Base Address Randomization
stack
heap
text
DLL

30
Defenses - Memory Layout Randomization

Reordering of static variables
Reordering of routines
Gaps in heap
Gaps between routines

31
Pointer Encryption

Rearranging address spaces doesn't give us a very
large key
Can we have diversity not just in how memory is
laid out, but in what pointers mean?
What if we encrypted all pointers in the program?
We could use a larger key
Attacker must guess key in order to overwrite a
return address with something meaningful

32
PointGuard

Developed by Cowan et al. at Immunix
All pointers stored in memory are encrypted
Pointers are decrypted immediately before
dereference
Pointers are encrypted before storing in memory
An attacker must guess key in order to generate
valid pointer to attack code
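
The encrypt-on-store, decrypt-on-load cycle can be sketched as follows. The slides do not name the cipher, so this sketch assumes a simple XOR with a per-process key and 32-bit pointers; the names new_key, encrypt_ptr and decrypt_ptr are hypothetical.

```python
import secrets

PTR_MASK = 0xFFFFFFFF   # assumed 32-bit pointers


def new_key():
    """Per-process key; a zero key would make XOR a no-op, so it is excluded."""
    return secrets.randbelow(PTR_MASK) + 1


def encrypt_ptr(addr, key):
    """Applied just before a pointer is stored to memory."""
    return (addr ^ key) & PTR_MASK


def decrypt_ptr(stored, key):
    """Applied just before a pointer loaded from memory is dereferenced."""
    return (stored ^ key) & PTR_MASK


key = new_key()
legit = 0x08048ABC                          # illustrative code address
assert decrypt_ptr(encrypt_ptr(legit, key), key) == legit

# The attacker overwrites the stored pointer with a raw address.
# The program still decrypts it, so control goes somewhere else.
overwritten = 0xDEADBEEF
print(hex(decrypt_ptr(overwritten, key)))   # not 0xdeadbeef unless the key is guessed
```

A wrong guess most likely sends the program to an invalid address, which fits the stated goal that a crashing program is better than a spreading worm.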

33
PointGuard code transformation

Unlike address space transformations, requires
compiler changes
Cleartext pointers appear only in registers
Registers are not vulnerable to modification
Encryption must be fast and efficient
We don't want to encrypt non-pointer data,
because that would mean encrypting the buffer
containing the attacker's pointer
Accessing libraries is tricky

34
Effectiveness of PointGuard

Overhead is low
but requires recompilation
interaction with non-PG-aware code is tricky
Defends against most code injection and
return-to-existing-code attacks
Does not defend against all data modification
attacks
Information leaks may reveal ciphertext, allowing
attacker to guess key

35
What if code gets in anyway?

The previous techniques work by preventing an
attacker from jumping to malicious code in the
system
What if we didn't think of every way that could
happen?
Defense-in-depth:
make sure injected code won't run no matter how
control is transferred

36
What must an attacker know?

An attacker must know how to write code to run on
the targeted system
SPARC exploit code will not run on x86
What if no two computers had the same instruction
set?
It would be difficult or impossible to write
exploit code that will run everywhere

37
Instruction Set Randomization

Kc, Keromytis, and Prevelakis
Encrypt the program's instructions with a
different key for each copy of the program
Decrypt each instruction at runtime immediately
before execution
Attacker must know key in order to write code
that will decrypt to something meaningful
An unsuccessful attack will cause an illegal
instruction, an illegal address, or an exception

38
How many bits do we need?

Strong symmetric cryptography typically requires
a 128-bit key or larger to resist known-plaintext
attacks
Large performance penalty to decrypt
If we assume the attacker doesn't have our
ciphertext, we can use a much smaller key
32-bit XOR may be good enough if our goal is to
prevent large-scale automated worms
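
The assumption that the attacker has no ciphertext matters because a single-key XOR falls to one known word. The sketch below uses made-up 32-bit words (not real instructions) and a hypothetical key to show the recovery.

```python
def xor_encode(words, key):
    """Encode (or decode) every 32-bit word with the same key."""
    return [w ^ key for w in words]


def recover_key(known_plain_word, cipher_word):
    """One plaintext/ciphertext pair is enough to reveal an XOR key."""
    return known_plain_word ^ cipher_word


key = 0x5A17C3E9                                   # hypothetical per-copy key
program = [0x83EC0C55, 0x00000090, 0xC3C9D0FF]     # illustrative words only
image = xor_encode(program, key)

# An attacker who can read one encoded word and knows what it should be:
leaked = recover_key(program[0], image[0])
print(hex(leaked))
assert leaked == key
assert xor_encode(image, leaked) == program
```

A remote worm that never sees the encoded image has to guess all 32 bits instead.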

39
Encoding schemes

XOR
each word in legitimate code is XORed with the
same key
Bit permutation
The bits in each word are rearranged according to
a key
log2(32!) ≈ 118 bits of key space for a 32-bit word
(160 bits if the permutation is stored as 32 five-bit indices)
Can move bits from one instruction to another
In practice, key size is smaller
more than one way to encode an instruction
more than one harmful instruction
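
A bit-permutation encoding can be sketched in a few lines. The key here is a shuffled list of the 32 bit positions; deriving it from a seed with random.Random is an assumption for reproducibility only and is not a secure way to generate keys. The last line evaluates log2(32!) directly.

```python
import math
import random

WORD_BITS = 32


def make_key(seed):
    """A key is a permutation of the bit positions 0..31."""
    perm = list(range(WORD_BITS))
    random.Random(seed).shuffle(perm)
    return perm


def encode(word, perm):
    """Move bit i of the word to position perm[i]."""
    out = 0
    for i, p in enumerate(perm):
        out |= ((word >> i) & 1) << p
    return out


def decode(word, perm):
    """Inverse of encode: bit perm[i] goes back to position i."""
    out = 0
    for i, p in enumerate(perm):
        out |= ((word >> p) & 1) << i
    return out


perm = make_key(seed=2004)
word = 0x12345678                      # illustrative word, not a real instruction
assert decode(encode(word, perm), perm) == word

# Number of distinct permutation keys, expressed in bits.
print(math.log2(math.factorial(WORD_BITS)))
```

Note that a permutation never changes how many bits of a word are set, so an all-zero or all-one word encodes to itself under every key.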

40
Variable-sized instructions

x86 instructions vary in size
Some instructions are 1 byte
An 8-bit key is insufficient
Padding with NOPs has a cost
and generally requires source code
Solution 1:
Pad branch targets only
Solution 2:
Encrypt words, not instructions

41
x86 Implementation

Authors modified Bochs x86 emulator to decrypt
code at runtime
Encrypted image consisting of kernel and
statically-linked binaries
Cost of emulation is high for CPU-bound processes
Not so bad for I/O bound processes
Reprogrammable processors could reduce overhead
(Transmeta Crusoe)

42
Interpreted Languages

Some code injection attacks use VBScript, SQL,
Perl, or shell languages
Append key material to keywords
e.g. foreach becomes foreach12345
Overhead is negligible
The languages are interpreted anyway
Error messages may reveal key
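
The keyword-tagging scheme can be sketched with a toy SQL-like language. The keyword set, the key "12345", and the function names randomize and derandomize are assumptions for illustration; a real interpreter would do this inside its lexer.

```python
import re

KEYWORDS = {"select", "from", "where", "or", "drop"}   # toy keyword set
TOKEN = re.compile(r"[A-Za-z_][A-Za-z_0-9]*")


def randomize(source, key):
    """Append the key to every keyword in trusted source text."""
    def tag(match):
        word = match.group(0)
        return word + key if word.lower() in KEYWORDS else word
    return TOKEN.sub(tag, source)


def derandomize(source, key):
    """Strip the key from tagged keywords; reject any bare keyword."""
    def untag(match):
        word = match.group(0)
        if word.lower() in KEYWORDS:
            raise ValueError(f"untagged keyword: {word}")
        if word.endswith(key) and word[:-len(key)].lower() in KEYWORDS:
            return word[:-len(key)]
        return word
    return TOKEN.sub(untag, source)


key = "12345"
query = randomize("select name from users where id = ", key)

print(derandomize(query + "7", key))        # ordinary input passes through
try:
    derandomize(query + "1 or 1=1", key)    # injected keyword lacks the key
except ValueError as err:
    print("rejected:", err)
```

The rejection message in this sketch names only the offending word; echoing the expected tagged form would leak the key, which is the risk the slide points out.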

43
Libraries

Libraries present a problem
Use different keys for applications and libraries
Use single key for all system libraries
Change the key from time to time
Or
Statically link everything so that library code
uses same key as application

44
Other issues

Self-modifying code won't run
(Yes, gcc sometimes generates this)
Significant performance penalty
Attacker with ciphertext could brute-force the
key offline
No defense against local attackers
May be okay for defense against worms
Does not resist existing code attacks
Does not resist data corruption attacks

45
Algorithmic Complexity Attacks

The Linux networking code uses hash tables to
classify packets
Hash tables, binary trees, and other data
structures have good performance in average case
But poor performance in worst case
An attacker who knew the hash function could
deliberately generate collisions
This can force worst-case behavior
This can cause denial of service
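
The effect of deliberate collisions can be shown with a toy table. The hash below (sum of bytes modulo the table size) is an assumption chosen because its collisions are easy to construct; it is not the function used by the Linux networking code.

```python
import itertools


def weak_hash(key, buckets):
    """Publicly known toy hash: sum of the bytes modulo the table size."""
    return sum(key) % buckets


def longest_chain(keys, buckets, hash_fn):
    """Length of the fullest bucket, i.e. the worst-case lookup cost."""
    counts = [0] * buckets
    for key in keys:
        counts[hash_fn(key, buckets)] += 1
    return max(counts)


BUCKETS = 64
normal = [f"host{i}".encode() for i in range(720)]

# Every reordering of the same bytes has the same sum, so all 720 keys collide.
attack = [bytes(p) for p in itertools.permutations(b"abcdef")]

print(longest_chain(normal, BUCKETS, weak_hash))
print(longest_chain(attack, BUCKETS, weak_hash))
```

With every key in one bucket, each lookup walks the whole chain, and the table degrades from constant-time to linear-time behaviour.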

46
Diversity as a Defense

An attacker can find collisions only if he knows the
hash function
What if every copy used a different hash
function?
Solution: keyed hash functions
Every copy uses same code
Every copy uses a different key
Attacker cannot force collisions without key
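
A minimal sketch of a keyed hash table index, assuming Python's hashlib.blake2b in keyed mode as the hash. The slides do not say which keyed function to use, so this choice and the name keyed_bucket are illustrative.

```python
import hashlib
import secrets


def keyed_bucket(data, key, buckets):
    """Same code on every machine; the secret key decides where data lands."""
    digest = hashlib.blake2b(data, key=key, digest_size=8).digest()
    return int.from_bytes(digest, "big") % buckets


BUCKETS = 64
inputs = [f"packet{i}".encode() for i in range(100)]

key_a = secrets.token_bytes(16)     # one copy of the system
key_b = secrets.token_bytes(16)     # another copy

layout_a = [keyed_bucket(x, key_a, BUCKETS) for x in inputs]
layout_b = [keyed_bucket(x, key_b, BUCKETS) for x in inputs]

print(layout_a[:8])
print(layout_b[:8])
```

Inputs that collide under one key are spread out under another, so a collision set prepared in advance does not carry over from one copy to the next.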

47
Effectiveness

The techniques presented are orthogonal
Other attacks
integer overflow
data modification
Other threat models
local attacker
determined remote attacker
denial of service

48
Other approaches

StackGuard, StackShield, MemGuard, etc.
bounds checking, canaries, non-executable stack
and heap
Safe library routines, wrappers
Sandboxes and safe languages (Java)
Static analysis to detect (or prove the absence
of) buffer overflows

49
Will this prevent catastrophic failures?

3. Things will be much like they are now
persistent threats, common annoyances, but people
will still trust Internet for semi-critical
tasks.
4. Technologies have emerged (and been
successfully deployed) that make epidemic attacks
a thing of the past. The Internet will be
trusted for the most critical tasks.
Do these techniques give us hope for (4)?
