Title: Fast and Memory-Efficient Regular Expression Matching for Deep Packet Inspection
1. Fast and Memory-Efficient Regular Expression Matching for Deep Packet Inspection
- Fang Yu
- fyu_at_eecs.berkeley.edu
- Joint work with Zhifeng Chen, Yanlei Diao, Randy H. Katz, T. V. Lakshman
2. Deep Packet Inspection
- Deep packet inspection: packet processing on the entire packet content rather than just the header
- Applications
  - Network intrusion detection systems (NIDS)
    - Scan packets in the network to stop worms from reaching end hosts
    - Examples: Snort, Bro, etc.
  - Application layer protocol identification
    - Classify packets as Kazaa, eDonkey2000, Yahoo Messenger, etc. based on payload information
    - Examples: Linux layer 7 filter (L7-filter), etc.
  - An essential technique for other edge network services
    - HTTP load balancing, XML processing, annotation processing, etc.
3. Language for Deep Packet Inspection
- Regular expressions are replacing explicit string patterns as the pattern matching language of choice
  - L7-filter: all protocol identifiers use regular expressions
  - Snort: no regular expressions in April 2003; 1131 out of 4867 rules now use regular expressions
  - Bro: all patterns use regular expressions
4. Regular Expressions
- High expressive power and flexibility to describe patterns
- Example: regular expression for detecting Yahoo traffic
  - ^(ymsg|ypns|yhoo).?.?.?.?.?.?.?[lwt].*\xc0\x80
- Features
  - ^ : the pattern must be matched at the start of the packet payload
  - | : OR relationship
  - . : a single-character wildcard
  - ? : a quantifier meaning zero or one occurrence; * means zero or more
  - [] : a class of characters, e.g., [lwt] denotes a letter l, w, or t
  - [^\n] : any character except \n
  - {} : repeat, e.g., [^\n]{100} denotes no \n in the next 100 bytes
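The features listed above can be tried out with Python's standard `re` module. This is only a minimal sketch: the pattern is the Yahoo example as reconstructed on this slide, it is compiled as a bytes pattern because packet payloads are raw bytes, and both sample payloads are made up for illustration (they are not real captured traffic).

```python
import re

# Yahoo example from the slide, written as a bytes pattern.
# ^ anchors at the start of the payload, | separates alternatives,
# .? is an optional single byte, [lwt] is a character class.
YAHOO = re.compile(rb"^(ymsg|ypns|yhoo).?.?.?.?.?.?.?[lwt].*\xc0\x80")


def is_yahoo(payload: bytes) -> bool:
    """Return True if the payload matches the Yahoo pattern."""
    return YAHOO.search(payload) is not None


# Hypothetical payloads, invented for this example.
yahoo_like = b"ymsg" + b"\x00" * 7 + b"w" + b"session" + b"\xc0\x80"
http_like = b"GET /index.html HTTP/1.1\r\n"

print(is_yahoo(yahoo_like))  # the pattern matches this payload
print(is_yahoo(http_like))   # the pattern does not match this payload
```

Note that a backtracking engine such as `re` is used here only to illustrate the syntax; the slides are about automata-based matching.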
5. Challenges
- Challenges
  - Features specific to networking applications
  - Large set of patterns, on the order of 100s or 1000s
- Goal of this work
  - Fast and memory-efficient solutions
6. Background of Regular Expression Matching Techniques
- Finite automata are used to match regular expressions
  - Nondeterministic Finite Automata (NFA)
    - A group of states can be active simultaneously
  - Deterministic Finite Automata (DFA)
    - Only one state is active at any time
Pattern (AB)C and (AD)E
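The difference between the two automata can be sketched in a few lines. The example below does not use the patterns from the slide's figure; it uses a hypothetical unanchored pattern, A.B (an A, any character, then a B), with a hand-written NFA. The NFA simulation tracks a set of active states, while the DFA (built here by the textbook subset construction) does a single table lookup per input character. All names are illustrative.

```python
ANY = None  # label for a transition taken on any character

# Hand-written NFA for the unanchored pattern A.B (hypothetical example).
# State 0 loops on every character, so a match may start anywhere.
NFA = {0: [(ANY, 0), ("A", 1)], 1: [(ANY, 2)], 2: [("B", 3)], 3: []}
ACCEPT = 3


def nfa_step(active, ch):
    """Advance every active NFA state on one input character."""
    return frozenset(dst for s in active for label, dst in NFA[s]
                     if label is ANY or label == ch)


def nfa_match(text):
    """Return (matched, largest number of simultaneously active states)."""
    active, most = frozenset({0}), 1
    for ch in text:
        active = nfa_step(active, ch)
        most = max(most, len(active))
        if ACCEPT in active:
            return True, most
    return False, most


def build_dfa(alphabet):
    """Subset construction: each reachable set of NFA states is one DFA state."""
    start = frozenset({0})
    table, todo = {}, [start]
    while todo:
        cur = todo.pop()
        if cur in table:
            continue
        table[cur] = {ch: nfa_step(cur, ch) for ch in alphabet}
        todo.extend(table[cur].values())
    return start, table


def dfa_match(text, start, table, other="x"):
    """One table lookup per character; exactly one DFA state is active."""
    state = start
    for ch in text:
        state = table[state][ch if ch in table[state] else other]
        if ACCEPT in state:
            return True
    return False


# "x" stands for every character other than A and B.
START, TABLE = build_dfa("ABx")
print(nfa_match("xxAAB"), dfa_match("xxAAB", START, TABLE))
```

The NFA pays for its compactness with several active states per input character; the DFA pays for its single lookup with a larger transition table.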
7. Comparison of DFA and NFA
- A pattern of length N
- M patterns, each with length N
- Traditional NFA-based approaches turn out to be slow
  - Sometimes less than 1 Mb/s
- Naïve DFA implementations can have exponential memory costs
  - Some of the patterns cannot be compiled into a DFA
8. Outline of Our DFA-based Approach
- Analyze the computation and storage cost of individual DFAs
  - For individual regular expressions
  - Identify the structural characteristics leading to exponential growth of DFAs
  - Regular expression rewrite techniques that reduce memory usage
  - Make the DFA-based approach possible
- Compile a set of regular expressions into several engines
  - For multiple regular expressions
  - Dramatically increase the regular expression matching speed, without significant memory usage
  - Algorithms for general processor and multi-core processor architectures
9. DFA Sizes of Regular Expressions
- We identify the patterns that lead to large DFAs
Rewrite Rule 1
Rewrite Rule 2
10. Rewriting Rule 1
- The pattern can be matched anywhere in the input, and a class of characters overlaps with the prefix pattern
  - Often used for detecting buffer overflow attempts
  - Example: AUTH\s[^\n]{100}
- The DFA needs to remember all the possible effects of every AUTH\s that follows the first AUTH\s
  - For example, the second AUTH\s can either match [^\n]{100} or be counted as a new match of the start of the regular expression, AUTH\s
  - This pattern generates a DFA with more than 100,000 states
Input: AUTH\s AUTH\s \s AUTH\s \s \s AUTH\s ..
NFA for AUTH\s[^\n]{100}
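The growth described on this slide can be observed on a scaled-down version of the pattern. The sketch below hand-builds an NFA for AUTH\s[^\n]{k}, treats the pattern as unanchored (an implied leading .*), and counts the DFA states produced by plain subset construction without minimization. The reduced alphabet of seven symbol classes and the function name are assumptions made for this illustration; the counts it produces are not the figures quoted on the slide.

```python
def count_dfa_states(k):
    """Count subset-construction DFA states for the unanchored AUTH\\s[^\\n]{k}."""
    prefix = "AUTH"
    # Symbol classes: the prefix letters, ' ' (whitespace other than
    # newline), '\n', and 'x' standing for every other character.
    alphabet = ["A", "U", "T", "H", " ", "\n", "x"]
    accept = len(prefix) + 1 + k  # state reached after k counted bytes

    def step(state, ch):
        nxt = set()
        if state == 0:
            nxt.add(0)                    # implied leading .*
        if state < len(prefix):
            if ch == prefix[state]:
                nxt.add(state + 1)        # next letter of AUTH
        elif state == len(prefix):
            if ch in (" ", "\n"):
                nxt.add(state + 1)        # \s
        elif state < accept:
            if ch != "\n":
                nxt.add(state + 1)        # one more byte of [^\n]{k}
        return nxt

    start = frozenset({0})
    seen, todo = {start}, [start]
    while todo:
        cur = todo.pop()
        for ch in alphabet:
            nxt = frozenset(s for st in cur for s in step(st, ch))
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return len(seen)


for k in (5, 10, 20):
    print(k, count_dfa_states(k))
```

Every AUTH\s seen inside the counting window starts another counter, and the DFA must encode each combination of live counters as a separate state, which is why the count grows much faster than k.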
11. Rewriting Rule 1 (Cont.)
- Pattern rewriting intuition: we only care about the first AUTH\s
  - If there is a \n within the next 100 bytes, the \n must also be within 100 bytes of the second AUTH\s
  - Otherwise, the first AUTH\s and the following characters have already matched the pattern
- Rewrite the pattern to capture one match of the prefix
  - ([^A]|A[^U]|AU[^T]|AUT[^H]|AUTH[^\s]|AUTH\s[^\n]{0,99}\n)*AUTH\s[^\n]{100} generates a DFA of only 106 states
- The new pattern may report fewer matches than the original pattern
  - E.g., the input starts with AUTH AUTH with no \n in the following 105 bytes
  - The original pattern can report two matches
  - The new pattern reports only one match
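The difference in reported matches can be checked on the slide's example input with Python's `re` module. This is a sketch under assumptions: the rewritten pattern is the form reconstructed on this slide, overlapping matches of the original pattern are counted with a lookahead (one match per start offset), and the rewritten pattern is applied once from the start of the input. It illustrates the two inputs below only and is not a general equivalence test.

```python
import re

REWRITTEN = re.compile(
    r"([^A]|A[^U]|AU[^T]|AUT[^H]|AUTH[^\s]|AUTH\s[^\n]{0,99}\n)*"
    r"AUTH\s[^\n]{100}"
)


def count_original_matches(data):
    """Count start offsets where AUTH\\s[^\\n]{100} matches (overlaps allowed)."""
    return len(re.findall(r"(?=AUTH\s[^\n]{100})", data))


def count_rewritten_matches(data):
    """The rewritten pattern is anchored at the start and matches at most once."""
    return 1 if REWRITTEN.match(data) else 0


# Slide example: AUTH AUTH followed by 105 bytes without a newline.
overflow = "AUTH AUTH " + "x" * 105
print(count_original_matches(overflow), count_rewritten_matches(overflow))

# A newline within 100 bytes cancels the first AUTH; the second one matches.
restarted = "AUTH short\n" + "AUTH " + "x" * 100
print(count_original_matches(restarted), count_rewritten_matches(restarted))
```

For intrusion detection, one report per attack attempt is typically sufficient, which is why losing the second overlapping match is an acceptable trade for the much smaller DFA.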
12. Rewriting Rule 2
- The pattern can only be matched at the beginning of the input, and a class of characters with length restrictions interacts with the preceding character
  - Example: ^SEARCH\s+[^\n]{1024}
  - \s+ overlaps with [^\n]{1024}
  - White space characters can either match \s+ or be counted as part of [^\n]{1024}
  - Generates a DFA of O(1024^2) states
- Rewriting solution: split this rule into two rules
  - ^SEARCH\s[^\n]{1024}
  - ^SEARCH\s\s[^\n]{1023}
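The quadratic growth can be reproduced on a scaled-down pattern. The sketch below assumes the example is ^SEARCH\s+[^\n]{k} (repeated whitespace followed by a length-restricted class) and compares it with ^SEARCH\s[^\n]{k}, where only one whitespace character is consumed. Both NFAs are hand-built over a reduced alphabet, and DFA states are counted by subset construction without minimization; the function name and symbol classes are assumptions for this illustration.

```python
def count_dfa_states(k, repeated_space):
    """Count DFA states for ^SEARCH\\s+[^\\n]{k} (repeated_space=True)
    or ^SEARCH\\s[^\\n]{k} (repeated_space=False)."""
    prefix = "SEARCH"
    # ' ' is whitespace other than newline; 'x' is any other character.
    alphabet = list(prefix) + [" ", "\n", "x"]
    ws = len(prefix)      # state after SEARCH, expecting whitespace
    first = ws + 1        # state after the first whitespace character
    accept = first + k    # state after k bytes of [^\n]

    def step(state, ch):
        nxt = set()
        if state < ws:
            if ch == prefix[state]:
                nxt.add(state + 1)
        elif state == ws:
            if ch in (" ", "\n"):
                nxt.add(first)
        elif state < accept:
            if state == first and repeated_space and ch in (" ", "\n"):
                nxt.add(first)        # \s+ absorbs further whitespace
            if ch != "\n":
                nxt.add(state + 1)    # [^\n] counts one more byte
        return nxt

    start = frozenset({0})            # anchored: no self-loop on state 0
    seen, todo = {start}, [start]
    while todo:
        cur = todo.pop()
        for ch in alphabet:
            nxt = frozenset(s for st in cur for s in step(st, ch))
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return len(seen)


for k in (8, 16, 32):
    print(k, count_dfa_states(k, True), count_dfa_states(k, False))
```

With repeated whitespace, each additional space may either extend \s+ or start counting, so the DFA must track a whole window of counter values; with a single \s it tracks one counter and the state count stays linear in k.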
13. Rewriting Effect on the Snort Rule Set
- After rewriting, the DFAs created for all the patterns in the Snort system fit into 95 MB of memory
14. Outline of Our DFA-based Approach
- Analyze the computation and storage cost of individual DFAs
  - For individual regular expressions
  - Identify the structural characteristics leading to exponential growth of DFAs
  - Regular expression rewrite techniques that reduce memory usage
  - Make the DFA-based approach possible
- Compile a set of regular expressions into several engines
  - For multiple regular expressions
  - Dramatically increase the regular expression matching speed, without significant memory usage
  - Algorithms for general processor and multi-core processor architectures
15. State Explosion Problem
- Randomly adding patterns from the L7-filter into one DFA
16. Interactions of Regular Expressions
- Patterns with the same prefixes generate a composite DFA with fewer states
  - E.g., a DFA for patterns ABCD and ABAB
- Some patterns generate DFAs of exponential size
  - E.g., a DFA for patterns AB.*CD and EF.*GH
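Both effects can be reproduced with a small subset-construction experiment. In this sketch a pattern is represented as a list of literal segments separated by .* (so ["AB", "CD"] stands for AB.*CD), every pattern is treated as unanchored, and DFA states are counted without minimization. The representation and function names are assumptions made for this example.

```python
def compile_pattern(segments):
    """Turn ['AB', 'CD'] (meaning AB.*CD) into (literal, loop_positions)."""
    literal = "".join(segments)
    loops, offset = set(), 0
    for seg in segments:
        loops.add(offset)   # position 0 loops because the pattern is unanchored;
        offset += len(seg)  # later segment starts loop because of the .*
    return literal, frozenset(loops)


def nfa_step(patterns, state, ch):
    """NFA states reachable from (pattern index, position) on character ch."""
    p, pos = state
    literal, loops = patterns[p]
    nxt = set()
    if pos in loops:
        nxt.add(state)              # stay here on any character
    if pos < len(literal) and literal[pos] == ch:
        nxt.add((p, pos + 1))       # consume the next literal character
    return nxt


def count_dfa_states(regexes):
    """Count states of the composite DFA for a list of patterns."""
    patterns = [compile_pattern(r) for r in regexes]
    # "\0" stands for every character that appears in no pattern.
    alphabet = set("".join(lit for lit, _ in patterns)) | {"\0"}
    start = frozenset((p, 0) for p in range(len(patterns)))
    seen, todo = {start}, [start]
    while todo:
        cur = todo.pop()
        for ch in alphabet:
            nxt = frozenset(s for st in cur
                            for s in nfa_step(patterns, st, ch))
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return len(seen)


shared = count_dfa_states([["ABCD"], ["ABAB"]])
shared_sum = count_dfa_states([["ABCD"]]) + count_dfa_states([["ABAB"]])
dotstar = count_dfa_states([["AB", "CD"], ["EF", "GH"]])
dotstar_sum = (count_dfa_states([["AB", "CD"]])
               + count_dfa_states([["EF", "GH"]]))
print(shared, shared_sum, dotstar, dotstar_sum)
```

With shared prefixes the composite DFA is smaller than the two individual DFAs combined. With .* in both patterns the composite DFA has to remember, for each pattern independently, whether its first half has already been seen, so the sizes multiply rather than add.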
17. Overview of Grouping Algorithms
- Fixed local memory limitation
  - NPU or multi-core architectures
  - Strategy: group as many expressions together as will fit into the local memory
    - Compute the pairwise interaction results
    - Pick the regular expression that has the least number of interactions with the new group
    - Keep adding patterns until the composite DFA is larger than the limit
- Fixed total memory limitation
  - General single-core CPU architecture
  - Strategy: distribute the leftover memory evenly among the ungrouped expressions
    - First compute the DFAs of individual patterns and compute the leftover memory size
    - Group patterns if the increased memory usage is less than their share
    - Stop grouping when the size of the composite DFA exceeds its share of the leftover memory
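The fixed local memory strategy can be sketched as a greedy loop. This is an illustration, not the authors' implementation: `dfa_size` is a caller-supplied function (hypothetical here) that returns the composite DFA size of a list of patterns, and two patterns are assumed to "interact" when their composite DFA is larger than the sum of their individual DFAs. The toy size model at the bottom is invented purely to exercise the loop.

```python
from itertools import combinations


def group_patterns(patterns, dfa_size, limit):
    """Greedily pack patterns into groups whose composite DFA fits `limit`."""
    # Step 1: pairwise interactions (assumed definition, see lead-in).
    interacts = {p: set() for p in patterns}
    for a, b in combinations(patterns, 2):
        if dfa_size([a, b]) > dfa_size([a]) + dfa_size([b]):
            interacts[a].add(b)
            interacts[b].add(a)

    ungrouped, groups = list(patterns), []
    while ungrouped:
        # Seed a new group with the least-interacting remaining pattern.
        seed = min(ungrouped,
                   key=lambda p: len(interacts[p] & set(ungrouped)))
        group = [seed]
        ungrouped.remove(seed)
        while ungrouped:
            # Step 2: pick the pattern with the fewest interactions
            # with the patterns already in the group.
            cand = min(ungrouped,
                       key=lambda p: len(interacts[p] & set(group)))
            # Step 3: stop once the composite DFA would exceed the limit.
            if dfa_size(group + [cand]) > limit:
                break
            group.append(cand)
            ungrouped.remove(cand)
        groups.append(group)
    return groups


# Toy size model (made up): "explosive" patterns multiply, others add.
sizes = {"a": 5, "b": 5, "c": 6, "d": 6}
explosive = {"c", "d"}


def toy_size(group):
    total, product, has_explosive = 0, 1, False
    for p in group:
        if p in explosive:
            product *= sizes[p]
            has_explosive = True
        else:
            total += sizes[p]
    return total + (product if has_explosive else 0)


groups = group_patterns(["a", "b", "c", "d"], toy_size, limit=20)
print(groups)
```

In this sketch a pattern whose individual DFA already exceeds the limit still ends up in a group of its own; a real deployment would need to decide how to handle that case.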
18. Experimental Setup
- Regular expression pattern sets
  - Linux application layer filter (L7-filter): 70 regular expressions
  - Large pattern set: over 700 regular expressions
- Packet traces
  - MIT DARPA project: more than a million packets
  - Berkeley networking group dump: more than six million packets
- Scanners
  - Our generated DFA scanner
  - Most recent version of an NFA-based scanner, Pcregrep
19. Grouping Results for Patterns in L7-filter
Results of grouping algorithms for fixed total memory
Results of grouping algorithms for fixed local memory
70/10 = 7 theoretical speedup
Size of the largest individual DFA: 196 KB
No extra memory cost
70/12 = 5.83 theoretical speedup
6.83 MB of memory
70/3 = 23.3 theoretical speedup
20. Throughput Analysis
- For the Linux layer 7 filter (70 patterns)
21. Comparison to an NFA-based Approach (Pcregrep)
- For the Linux layer 7 filter (70 patterns)
22. Conclusions
- High-performance DFA-based regular expression matching for deep packet inspection is possible
  - The ungrouped DFA implementation is 5 to 10 times faster than a widely used NFA implementation
- Grouping algorithm
  - Further speeds up the matching process by 20 to 50 times compared to the ungrouped DFA implementation
  - 2 orders of magnitude performance improvement over the NFA implementation
  - General processor architectures: groups are processed sequentially
  - Multi-core or NPU architectures: one group per core; memory usage is independent between cores