Deterministic Memory-Efficient String Matching Algorithms for Intrusion Detection

1
Deterministic Memory-Efficient String Matching
Algorithms for Intrusion Detection
  • Nathan Tuck, Timothy Sherwood, Brad Calder,
    George Varghese
  • Department of Computer Science and Engineering,
    University of California, San Diego
  • Department of Computer Science, University of
    California, Santa Barbara

2
Abstract
  • IDSs: Intrusion Detection Systems
  • Space- and time-efficient string matching
    algorithms
  • Providing worst-case performance
  • Amenable to H/W implementation
  • Aho-Corasick
  • Memory, performance

3
Introduction (i)
  • Combating attacks at every level
  • Automatically monitoring network traffic
  • IDS uses a set of rules
  • Apply to matching packets
  • Edge and core routers
  • Stringent worst-case performance bounds
  • Tight constraints on memory

4
Introduction (ii)
  • At the heart of IDSs is a string matching
    algorithm
  • In Snort, string matching accounts for 70% of
    total execution time and 80% of instructions
    executed
  • Contributions of this paper
  • Characterization
  • New Algorithms
  • Evaluation

5
String matching for intrusion detection
6
Quantifying the Use of String Matching (i)
  • Snort: an intrusion detection system
  • The rules are generated manually
  • Extract relevant content strings from the payload
    and header of known attacks
  • The action can include logging, alerting,
    ignoring, etc.
  • Rules are usually added as new vulnerabilities
    are discovered

7
Quantifying the Use of String Matching (ii)
  • Scalability of the intrusion detection system
    database
  • Beneficial to avoid algorithms whose run-time is
    proportional to the length of the rules in the
    database
  • New rules are being added to detect or combat new
    attacks

10
Quantifying the Use of String Matching (iii)
  • Linearly searching through the set of rules is
    becoming increasingly infeasible
  • The database is growing at a rate that is well
    within Moore's Law
  • Need a technique whose run-time performance does
    not depend on the size of the rule database

11
State of the Art in String Matching (i)
  • Single-pattern string matching
  • e.g., Boyer-Moore
  • Multi-pattern string matching
  • e.g., Aho-Corasick, Wu-Manber
  • Imprecise string matching
  • Uses hashing and signature-based techniques
  • Matches must be re-verified using a precise
    string matching algorithm
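
The imprecise-then-verify idea above can be sketched as follows. This is only an illustration, not Wu-Manber or any algorithm from the presentation: the function names and the choice of hashing the first k characters of each pattern are assumptions made for the example.

```python
def build_prefix_index(patterns, k):
    """Map the hash of each pattern's first k characters to candidate patterns."""
    index = {}
    for pat in patterns:
        if len(pat) < k:
            raise ValueError("every pattern must have at least k characters")
        index.setdefault(hash(pat[:k]), []).append(pat)
    return index


def find_matches(text, patterns, k=2):
    """Return (position, pattern) pairs using a hash filter plus exact check."""
    index = build_prefix_index(patterns, k)
    matches = []
    for i in range(len(text) - k + 1):
        # Imprecise step: a hash hit only says a pattern *might* start here.
        for pat in index.get(hash(text[i:i + k]), ()):
            # Precise step: re-verify the candidate against the input.
            if text.startswith(pat, i):
                matches.append((i, pat))
    return matches
```

The filter keeps the common case cheap, but every hash hit still costs an exact comparison, so an input crafted to hit the filter often pushes the work toward the verification step.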

12
State of the Art in String Matching (ii)
  • Bad Character Heuristics
  • Easily exploitable by attackers
  • Aho-Corasick
  • Uses a data structure that is not optimized for
    space
  • SFKSearch
  • Worst-case performance is quite poor
  • Wu-Manber
  • Requires memory accesses to the shift and hash
    tables
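
For reference, a minimal sketch of the textbook Aho-Corasick automaton, the multi-pattern matcher the later slides build on. It uses Python dictionaries instead of the 256-entry pointer arrays discussed later, and the function names are illustrative assumptions rather than anything from Snort or the paper.

```python
from collections import deque


def build_automaton(patterns):
    """Build goto, failure, and output tables for a set of patterns."""
    goto = [{}]   # goto[state][char] -> next state
    fail = [0]    # failure link for each state
    out = [[]]    # patterns recognized on entering each state
    for pat in patterns:
        state = 0
        for ch in pat:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(pat)
    # Breadth-first pass: shallower states are finished before deeper ones.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            out[nxt] = out[nxt] + out[fail[nxt]]
    return goto, fail, out


def search(text, automaton):
    """Return (start position, pattern) for every match in text."""
    goto, fail, out = automaton
    state = 0
    matches = []
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for pat in out[state]:
            matches.append((i - len(pat) + 1, pat))
    return matches
```

Each input character is consumed exactly once and the failure links only move toward the root, which is where the deterministic bound on work per input comes from.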

14
Applying IP Lookup Techniques to String Matching (i)
  • IP lookup: a set of patterns to match, finding
    the longest possible match for a set of IP
    addresses that are streaming by
  • String matching: a set of strings to match,
    finding all of the places in the input stream
    where there is a match

15
Applying IP Lookup Techniques to String Matching (ii)
  • Unibit and Multibit Tries
  • Waste space on pointers
  • Lulea Algorithm
  • Uses the concepts of leaf pushing and bitmaps to
    compress the database
  • Eatherton Algorithm
  • Internal bitmap and external bitmap
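
A minimal sketch of longest-prefix matching on a unibit trie, to make the pointer overhead concrete. The class and function names are assumptions for this example, and prefixes are written as strings of "0" and "1" for readability.

```python
class BitTrieNode:
    """One node per prefix bit: two child pointers plus an optional next hop."""
    __slots__ = ("children", "next_hop")

    def __init__(self):
        self.children = [None, None]
        self.next_hop = None


def insert_prefix(root, bits, next_hop):
    """Store next_hop under the prefix given as a string of '0'/'1'."""
    node = root
    for b in bits:
        i = int(b)
        if node.children[i] is None:
            node.children[i] = BitTrieNode()
        node = node.children[i]
    node.next_hop = next_hop


def longest_prefix_match(root, addr_bits):
    """Walk the address bit by bit, remembering the deepest stored prefix."""
    node, best = root, root.next_hop
    for b in addr_bits:
        node = node.children[int(b)]
        if node is None:
            break
        if node.next_hop is not None:
            best = node.next_hop
    return best
```

Every node carries two child pointers even when only one, or neither, is used. The bitmap-based schemes listed above aim to cut exactly this kind of overhead.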

16
Optimizations for string matching
17
Bitmap compression (i)
  • With 32-bit pointers
  • Each Aho-Corasick node has 256 next-state
    pointers
  • Instead, use a single pointer to the first valid
    next state and maintain a 256-bit bitmap
  • Find a next state by summing all the bits prior
    to its bit number and adding the sum to the base
    next-node pointer

19
Bitmap compression (ii)
  • Original optimized Aho-Corasick
  • 1028 bytes per node
  • Bitmapped version
  • Only 44 bytes per node
  • Incurs two costs
  • Doubles the worst-case amount of work
  • Requires summing up to 256 prior bits
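
The two node sizes can be reproduced with simple arithmetic. The field breakdown below is an assumption chosen to match the totals on the slide (one extra pointer per dense node, three pointers per bitmapped node); the slides themselves do not list the fields.

```python
POINTER_BYTES = 4    # 32-bit pointers, as stated on the slides
ALPHABET_SIZE = 256  # one entry per possible byte value


def dense_node_bytes():
    """256 next-state pointers plus one extra pointer (assumed field)."""
    return ALPHABET_SIZE * POINTER_BYTES + POINTER_BYTES


def bitmap_node_bytes():
    """256-bit bitmap plus three pointers (assumed: next, failure, rule list)."""
    bitmap_bytes = ALPHABET_SIZE // 8
    return bitmap_bytes + 3 * POINTER_BYTES


def size_ratio():
    """How many bitmapped nodes fit in the space of one dense node."""
    return dense_node_bytes() / bitmap_node_bytes()
```

Under these assumptions a bitmapped node is more than twenty times smaller than a dense one.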

20
Path Compression (i)
  • Bitmap is largely wasted information at the
    bottom nodes
  • Path-compressed nodes must be equal in size to
    bitmapped nodes
  • Failure pointers must include an offset

22
Path Compression (ii)
  • With 32-bit pointers
  • A single path-compressed node can contain data
    equivalent to 4 bitmap-compressed nodes
  • In practice, achieves a 2.54:1 compression ratio
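
A rough counting sketch of the idea: chains of single-child states are packed into one node holding up to 4 states. It only counts nodes in a plain trie and ignores failure pointers and the offset they need, so it illustrates the mechanism rather than reproducing the measured ratio. All names are assumptions for this example.

```python
MAX_RUN = 4  # states per path-compressed node, per the slide


def build_trie(patterns):
    """Build a plain trie as nested dictionaries (one dict per state)."""
    root = {}
    for pat in patterns:
        node = root
        for ch in pat:
            node = node.setdefault(ch, {})
    return root


def count_nodes(node):
    """Number of states when every state gets its own node."""
    return 1 + sum(count_nodes(child) for child in node.values())


def count_compressed(node):
    """Number of nodes when single-child chains are packed MAX_RUN at a time."""
    if len(node) != 1:
        # Branching state or leaf: kept as its own (bitmap) node.
        return 1 + sum(count_compressed(child) for child in node.values())
    # Absorb up to MAX_RUN consecutive single-child states into one node.
    run = 0
    while len(node) == 1 and run < MAX_RUN:
        node = next(iter(node.values()))
        run += 1
    return 1 + count_compressed(node)
```

A single long pattern compresses close to the 4:1 ideal, while heavily branching rule sets gain little, which is consistent with the practical ratio falling below 4:1.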

24
Results
25
Intrusion Detection in Hardware
  • The number of rules goes up by over a factor of
    2.5, whereas the size of memory for our algorithm
    only goes up by 30%
  • Focus our attention on the worst-case performance

27
Intrusion Detection in Software
  • Examine both average-case and worst-case
    performance
  • Wu-Manber is the fastest in the average case
    because of its hash function

30
Summary (i)
  • Current software IDSs largely rely on common-case
    optimizations to gain speed
  • Aho-Corasick is the only one with deterministic
    worst-case lookup times, and is hardware-friendly
    enough to use for wire-speed matching

31
Summary (ii)
  • Contribution of this paper
  • Apply bitmap node compression and path
    compression to Aho-Corasick
  • Gain both compact storage and worst-case
    performance