Efficient Regular Expression Evaluation: Theory to Practice, Michela Becchi and Patrick Crowley

1
Efficient Regular Expression Evaluation: Theory to Practice
Michela Becchi and Patrick Crowley
ANCS'08
2
Motivation
  • Size and complexity of rule-sets have increased in
    recent years
  • Snort, as of November 2007:
  • 8,536 rules, 5,549 Perl Compatible Regular
    Expressions
  • 99% with character ranges ([c1-ck], \s, \w)
  • 16.3% with dot-star terms (.*, [^c1..ck]*)
  • 44% with counting constraints (.{n,m},
    [^c1..ck]{n,m})
  • Several proposals to accelerate regular
    expression matching
  • FPGA
  • Memory-centric architectures

3
Objectives
  • Can we combine distinct algorithmic techniques
    into a single proposal that also handles large data-sets?
  • Can we apply techniques intended for memory-centric
    architectures on FPGAs as well?

Provide a tool that allows anybody to implement a
high-throughput DPI system on the architecture of
choice
4
Target Architectures
Regex-Matching Engine
Memory-centric architectures
FPGA logic
FPGA / ASIC memory
General-purpose processors
Network processors
5
Challenges
DFA
NFA
FPGA logic
  • Logic cell utilization
  • Clock frequency
  • Memory space
  • Memory bandwidth

6
D2FA: default transition compression
  • Observations
  • DFA state: set of |Σ| next-state pointers
  • Transition redundancy
  • Idea
  • Differential state representation through use of
    non-consuming default transitions
  • In general

[Figure: states s1 (a→s3, b→s4, c→s5) and s2 (a→s3, b→s4, c→s6); after compression, s2 keeps only c→s6 plus a default transition to s1]
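A minimal Python sketch of the lookup that default transitions imply, using the state names from the slide's example. The dictionary layout and the function name `next_state` are assumptions made for illustration, not the authors' data structures.

```python
from typing import Dict, Optional, Tuple

# Labeled transitions actually stored per state (hypothetical layout).
labeled: Dict[str, Dict[str, str]] = {
    "s1": {"a": "s3", "b": "s4", "c": "s5"},
    "s2": {"c": "s6"},  # transitions on a and b are inherited from s1
}
# At most one non-consuming default transition per state.
default: Dict[str, Optional[str]] = {"s1": None, "s2": "s1"}


def next_state(state: str, ch: str) -> Tuple[str, int]:
    """Return (next state, number of states visited) for one input char."""
    visited = 1
    while ch not in labeled[state]:
        target = default[state]
        if target is None:
            raise KeyError(f"no transition on {ch!r}")
        state = target  # default transition: the input char is not consumed
        visited += 1
    return labeled[state][ch], visited
```

Looking up `a` from `s2` visits two states (s2, then s1), which is the extra memory bandwidth that default transitions trade for the smaller representation.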
7
D2FA algorithms
  • Problem: set default transitions so as to
  • Maximize memory compression
  • Minimize memory bandwidth overhead
  • Kumar et al., SIGCOMM'06
  • Bound dpMAX on max default path length
  • O(dpMAX+1) memory accesses per input char
  • Better compression for higher dpMAX
  • Becchi et al., ANCS'07
  • Only backward-directed default transitions
    (skipping k levels)
  • Amortized memory bandwidth O(((k+1)/k)N) on N input
    chars
  • Depth-first traversal → at DFA creation

Kumar et al.: memory bandwidth O((dpMAX+1)N), time complexity
O(n^2 log n), space complexity O(n^2)
vs.
Becchi et al.: memory bandwidth O(((k+1)/k)N), time complexity
O(n^2), space complexity O(n)
Compression w/ k=1 ≈ compression w/ dpMAX=8
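The backward-directed idea can be sketched in Python for the k=1 case. This is a simplified illustration, not the published algorithm: it assumes a complete DFA whose states are all reachable, takes "depth" to be the breadth-first distance from the start state, and greedily picks as default target the shallower state that shares the most transitions. The names `depths`, `compress`, `run` and `EXAMPLE` are hypothetical.

```python
from collections import deque
from typing import Dict, Optional, Tuple

DFA = Dict[int, Dict[str, int]]  # state -> {char: next state}

# DFA for the unanchored pattern "abc" over the alphabet {a, b, c}.
EXAMPLE: DFA = {
    0: {"a": 1, "b": 0, "c": 0},
    1: {"a": 1, "b": 2, "c": 0},
    2: {"a": 1, "b": 0, "c": 3},
    3: {"a": 1, "b": 0, "c": 0},
}


def depths(dfa: DFA, start: int) -> Dict[int, int]:
    """Breadth-first depth of every reachable state."""
    depth = {start: 0}
    queue = deque([start])
    while queue:
        s = queue.popleft()
        for t in dfa[s].values():
            if t not in depth:
                depth[t] = depth[s] + 1
                queue.append(t)
    return depth


def compress(dfa: DFA, start: int):
    """Give each state a default transition to a strictly shallower state."""
    depth = depths(dfa, start)
    labeled: Dict[int, Dict[str, int]] = {}
    default: Dict[int, Optional[int]] = {}
    for s, row in dfa.items():
        best, best_shared = None, 0
        for t in dfa:
            if depth[t] < depth[s]:  # backward-directed only
                shared = sum(1 for c in row if dfa[t][c] == row[c])
                if shared > best_shared:
                    best, best_shared = t, shared
        default[s] = best
        # Keep only the transitions that differ from the default target.
        labeled[s] = dict(row) if best is None else {
            c: n for c, n in row.items() if dfa[best][c] != n}
    return labeled, default


def run(labeled, default, start: int, text: str) -> Tuple[int, int]:
    """Process text; return (final state, number of state accesses)."""
    state, accesses = start, 0
    for ch in text:
        accesses += 1
        while ch not in labeled[state]:
            state = default[state]  # non-consuming default transition
            accesses += 1
        state = labeled[state][ch]
    return state, accesses
```

Each consumed character raises the depth by at most one and each default transition lowers it by at least one, so the number of default transitions taken never exceeds the number of characters. That is where the 2N bound for k=1 comes from.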
8
DFA alphabet reduction
  • Effective for
  • Ignore-case regex
  • Char-ranges
  • Never used chars

Alphabet translation table (input chars → translated symbol):
  a-z            0
  A              1
  B-Z            2
  0-9            3
  [^0-9a-zA-Z]   4

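One simple way to obtain a translation table like the one above is to give the same class to every input character whose transition-table column is identical across all states. The Python sketch below does exactly that; it is an exact-match grouping written for illustration, and the names `reduce_alphabet` and `EXAMPLE` are hypothetical.

```python
from typing import Dict, Tuple

DFA = Dict[int, Dict[str, int]]  # state -> {char: next state}

# DFA for the unanchored, case-insensitive pattern "ab" over {a, A, b, B, x}.
EXAMPLE: DFA = {
    0: {"a": 1, "A": 1, "b": 0, "B": 0, "x": 0},
    1: {"a": 1, "A": 1, "b": 2, "B": 2, "x": 0},
    2: {"a": 1, "A": 1, "b": 0, "B": 0, "x": 0},
}


def reduce_alphabet(dfa: DFA, alphabet: str):
    """Map chars with identical transition columns to one class id."""
    states = sorted(dfa)
    class_of_column: Dict[Tuple[int, ...], int] = {}
    translation: Dict[str, int] = {}
    for ch in alphabet:
        column = tuple(dfa[s][ch] for s in states)
        translation[ch] = class_of_column.setdefault(
            column, len(class_of_column))
    # Rebuild the transition table over class ids instead of chars.
    reduced = {s: {} for s in states}
    for ch, cls in translation.items():
        for s in states:
            reduced[s][cls] = dfa[s][ch]
    return translation, reduced
```

On the example, the five input characters collapse to three classes (a/A, b/B, and x), which covers the ignore-case and never-used-char situations listed on the slide.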
9
Multiple-stride DFAs
  • Brodie et al, ISCA 2006
  • Idea
  • Process stride input chars at a time
  • Observations
  • Mechanism used on small DFAs (1-2 regex)
  • No distinct accepting state handling

[Figure: a DFA and the equivalent DFA w/ stride 2]
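Stride doubling can be sketched in Python as composing the transition function with itself. The function name `double_stride` and the example automaton are assumptions for illustration. The sketch deliberately ignores matches that end in the middle of a stride, which is the accepting-state handling the slide notes as missing.

```python
from typing import Dict

DFA = Dict[int, Dict[str, int]]  # state -> {symbol: next state}

# DFA for the unanchored pattern "ab" over the alphabet {a, b}.
EXAMPLE: DFA = {
    0: {"a": 1, "b": 0},
    1: {"a": 1, "b": 2},
    2: {"a": 1, "b": 0},
}


def double_stride(dfa: DFA, alphabet: str) -> DFA:
    """Build a DFA that consumes two input chars per transition."""
    return {
        s: {a + b: dfa[dfa[s][a]][b] for a in alphabet for b in alphabet}
        for s in dfa
    }
```

The number of states is unchanged, but each state now has |Σ|^2 outgoing transitions instead of |Σ|.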
10
Multiple stride alphabet reduction
  • Stride s → alphabet Σ^s
  • Σ = ASCII alphabet → |Σ^2| = 256^2 = 65,536,
    |Σ^4| = 256^4 ≈ 4,294M
  • Effective alphabet much smaller
  • Char grouping a-cefa, b-fb
  • Alphabet reduction may be necessary to make
    stride doubling feasible on large DFAs
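A rough Python sketch of why reducing the alphabet first keeps stride doubling tractable: the stride-2 table is built over pairs of class ids rather than pairs of raw characters, and the pairs are then grouped again by identical columns. The automaton, the function name `double_and_reduce`, and the exact-match grouping are assumptions for illustration. Accepting states reached mid-stride are ignored here.

```python
from typing import Dict, Tuple

# Class-indexed DFA for unanchored, case-insensitive "ab":
# class 0 = {a, A}, class 1 = {b, B}, class 2 = every other char.
REDUCED: Dict[int, Dict[int, int]] = {
    0: {0: 1, 1: 0, 2: 0},
    1: {0: 1, 1: 2, 2: 0},
    2: {0: 1, 1: 0, 2: 0},
}


def double_and_reduce(dfa: Dict[int, Dict[int, int]], n_classes: int):
    """Double the stride over class ids, then merge equivalent pairs."""
    states = sorted(dfa)
    column_ids: Dict[Tuple[int, ...], int] = {}
    pair_class: Dict[Tuple[int, int], int] = {}
    for a in range(n_classes):
        for b in range(n_classes):
            column = tuple(dfa[dfa[s][a]][b] for s in states)
            pair_class[(a, b)] = column_ids.setdefault(
                column, len(column_ids))
    # Stride-2 transition table indexed by the reduced pair classes.
    stride2 = {s: {} for s in states}
    for (a, b), cls in pair_class.items():
        for s in states:
            stride2[s][cls] = dfa[dfa[s][a]][b]
    return pair_class, stride2
```

With 3 classes there are only 9 pairs to enumerate instead of 65,536 raw character pairs, and on this example the 9 pairs merge into 3 stride-2 symbols.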

11
Multiple stride default transitions
  • Compression
  • Default transitions eliminate transition
    redundancy
  • In multiple stride DFAs
  • # of states does not substantially change
  • # of transitions per state increases
    exponentially (|Σ| → |Σ|^stride)
  • Fraction distinct/total transitions decreases
  • Increased potential for compression!
  • Accepting state handling
  • Duplicated states have same outgoing transitions
    as original states but different depth
  • Default transition will remove all outgoing
    transitions from new accepting states

12
Multiple stride default transitions (contd)
  • Problem
  • For large Σ and stride, uncompressed DFA may be
    infeasible
  • Out of memory when generating a 2K node, stride 4
    DFA on a Linux machine w/ 4GB memory
  • Solution
  • Perform default transition compression during DFA
    creation
  • Use Becchi et al., ANCS'07 compression
    algorithm
  • In the situation above, only 10% memory used

13
Putting everything together
DFA
1-22 regex 48-1,940 states
14
NFA
  1. abcd
  2. abce
  3. abc.f
  4. bd-fa
  5. bdc

15
Multiple stride alphabet reduction
  • Stride doubling
  • Alphabet reduction
  • Clustering-based algorithm as for DFA, but sets
    of target states are compared
  • Avoid new state creation
  • Keep multiple transitions on the same symbol
    separated
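For NFAs, the comparison is between sets of target states rather than single next states. The Python sketch below groups symbols whose target sets agree in every state; it is an exact-match simplification of the clustering idea, and the names `reduce_nfa_alphabet` and `EXAMPLE` are hypothetical.

```python
from typing import Dict, FrozenSet, Set, Tuple

NFA = Dict[int, Dict[str, Set[int]]]  # state -> {char: set of next states}

# NFA for the unanchored, case-insensitive pattern "ab" over {a, A, b, B, x}.
EXAMPLE: NFA = {
    0: {"a": {0, 1}, "A": {0, 1}, "b": {0}, "B": {0}, "x": {0}},
    1: {"b": {2}, "B": {2}},
    2: {},  # accepting state, no outgoing transitions
}


def reduce_nfa_alphabet(nfa: NFA, alphabet: str) -> Dict[str, int]:
    """Give one class id to chars with identical target sets everywhere."""
    states = sorted(nfa)
    ids: Dict[Tuple[FrozenSet[int], ...], int] = {}
    translation: Dict[str, int] = {}
    for ch in alphabet:
        signature = tuple(frozenset(nfa[s].get(ch, ())) for s in states)
        translation[ch] = ids.setdefault(signature, len(ids))
    return translation
```

Because only the alphabet is relabeled, no NFA state is created or merged, which is in line with the "avoid new state creation" bullet.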

16
FPGA implementation
(c1=b OR c1=B) AND NOT (c2=a OR c2=A)
17
FPGA Results - throughput
18
FPGA Results - logic utilization
(|Σ1|, |Σ2|: alphabet size at stride 1 and stride 2)
#states=7,864 |Σ1|=64 |Σ2|=2,206
#states=2,086 |Σ1|=78 |Σ2|=1,969
#states=2,147 |Σ1|=68 |Σ2|=1,640
  • Utilization
  • 8-46% on XC5VLX50 device (7,200 slices)
  • XC5VLX330 device has 51,840 slices

19
ASIC projected results
Regex partitioning into multiple DFAs
Rule-set     Stride 1                                Stride 2
             |Σ|  # states  Memory footprint         |Σ|    # states  Memory footprint
                            Compressed  Full                          Compressed  Full
                            states      states                        states      states
k-NFA any    78   2,086     -           -            1,969  2,091     -           -
k-DFA any1   59   23,846    505 KB      200 KB       850    28,223    356 KB      32 MB
k-DFA any2   45   86,977    2.9 MB      55 KB        579    102,940   1.27 MB     81 MB
k-DFA any3   60   14,084    299 MB      48 KB        627    19,344    244 KB      16 MB
  • Throughput (SRAM at 500 MHz)
  • 2-4 Gbps for stride 1
  • 4-8 Gbps for stride 2
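One reading that reproduces these ranges is sketched in Python below. It rests on assumptions not stated on the slide: 8-bit input characters, one memory access per clock cycle, and between one and two state accesses per processed stride (two being the worst case when default transitions are bounded with k=1). The function name `throughput_gbps` is hypothetical.

```python
def throughput_gbps(clock_mhz: float, stride: int,
                    accesses_per_step: float) -> float:
    """Input bits consumed per second, in Gbps, for a memory-bound engine."""
    bits_per_step = 8 * stride                        # assumed 8-bit chars
    steps_per_second = clock_mhz * 1e6 / accesses_per_step
    return steps_per_second * bits_per_step / 1e9


# Best case: 1 access per step; worst case: 2 accesses per step.
stride1_range = (throughput_gbps(500, 1, 2), throughput_gbps(500, 1, 1))
stride2_range = (throughput_gbps(500, 2, 2), throughput_gbps(500, 2, 1))
```

Under these assumptions `stride1_range` and `stride2_range` span 2 to 4 Gbps and 4 to 8 Gbps respectively, matching the figures quoted on the slide.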

20
Conclusion
  • Algorithm
  • Combination of default transition compression,
    alphabet reduction and stride multiplying on
    potentially large DFAs
  • Extension of alphabet reduction and stride
    multiplying to NFAs
  • FPGA Implementation
  • Use of one-hot encoding w/ incremental
    improvement schemes
  • Logic minimization scheme for alphabet reduction
    decoding
  • Additional aspects
  • Multiple flow handling: FPGA vs. memory-centric
    architectures
  • Design improvements tailored to specific
    architectures and data-sets
  • Clustering into smaller NFAs and DFAs to allow
    smaller alphabets w/ larger strides

21
Thank you!
  • Questions?
  • http://regex.wustl.edu