Advanced Algorithms for Fast and Scalable Deep Packet Inspection - PowerPoint PPT Presentation

About This Presentation
Slides: 17
Provided by: csWu4
Learn more at: https://www.cs.wustl.edu

Transcript and Presenter's Notes

1
Advanced Algorithms for Fast and Scalable Deep
Packet Inspection
Sailesh Kumar Jonathan Turner John Williams
2
Why Regular Expressions Acceleration?
  • RegEx are now widely used
  • Network intrusion detection systems (NIDS)
  • Layer 7 switches, load balancing
  • Firewalls, filtering, authentication and
    monitoring
  • Content-based traffic management and routing
  • RegEx matching is expensive
  • Space: large amount of memory
  • Bandwidth: requires 1 state traversal per byte
  • RegEx is a performance bottleneck
  • In enterprise switches from Cisco, etc.
  • Many security appliances
  • Use DFAs, 1 GB memory, still sub-gigabit
    throughput
  • Need to accelerate RegEx!

3
Can we do better?
  • Well studied in compiler literature
  • What's different in networking?
  • Can we do better?
  • Construction time versus execution time (grep)
  • Traditionally, (construction + execution) time is
    the metric
  • In the networking context, execution time is critical
  • Also, there may be thousands of patterns
  • DFAs are fast
  • But can have an exponentially large number of states
  • Algorithms exist to minimize the number of states
  • Still 1) low performance and 2) gigabytes of
    memory

4
Delayed Input DFA (D2FA), SIGCOMM'06
  • Many transitions
  • 256 transitions per state
  • 50 distinct transitions per state (real world
    datasets)
  • Need 50 words per state
  • Reduce number of transitions in a DFA

Three rules: a, bc, cd
Look at state pairs: there are many common
transitions. How can we remove them?
4 transitions per state
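To make the "many common transitions" observation concrete, here is a minimal sketch. The 3-state transition table is invented for illustration and is not the automaton drawn on the slide; the function names are likewise hypothetical. For every pair of states it counts the input characters on which both states move to the same next state.

```python
from itertools import combinations

# Hypothetical 3-state DFA over the alphabet {a, b, c, d}; the table is
# made up for illustration and is not the automaton from the slide.
DFA = {
    1: {"a": 2, "b": 3, "c": 1, "d": 1},
    2: {"a": 2, "b": 3, "c": 1, "d": 1},
    3: {"a": 2, "b": 3, "c": 2, "d": 1},
}

def shared_transitions(dfa, s, t):
    """Count input characters on which states s and t go to the same next state."""
    return sum(1 for ch in dfa[s] if dfa[s][ch] == dfa[t][ch])

def pair_weights(dfa):
    """Map each unordered state pair to its number of common transitions."""
    return {
        (s, t): shared_transitions(dfa, s, t)
        for s, t in combinations(sorted(dfa), 2)
    }

weights = pair_weights(DFA)
```

In this toy table, states 1 and 2 agree on all four characters and every other pair agrees on three, so most transitions are candidates for removal if one state of a pair can borrow them from the other.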
5
Delayed Input DFA (D2FA), SIGCOMM'06
  • Many transitions
  • 256 transitions per state
  • 50 distinct transitions per state (real world
    datasets)
  • Need 50 words per state
  • Reduce number of transitions in a DFA

Alternative Representation
Three rules: a, bc, cd
4 transitions per state
Fewer transitions, less memory
6
D2FA Operation
Heavy edges are called default transitions. Take
a default transition whenever a labeled
transition is missing.
DFA
D2FA
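The lookup rule can be sketched as follows. The tables below are a hypothetical D2FA invented for illustration (not the one in the slide figure), and `step` and `run` are illustrative names. The sketch also counts memory accesses, since each default transition costs an extra state lookup without consuming input.

```python
# Hypothetical D2FA: each state stores only its labeled transitions plus
# one default transition. State 1 is the root and keeps a full table.
LABELED = {
    1: {"a": 2, "b": 3, "c": 1, "d": 1},
    2: {},          # behaves exactly like state 1, so everything is inherited
    3: {"c": 2},    # differs from state 1 only on 'c'
}
DEFAULT = {2: 1, 3: 1}  # the root has no default transition

def step(state, ch):
    """Return (next_state, memory_accesses) for one input character."""
    accesses = 1
    while ch not in LABELED[state]:
        state = DEFAULT[state]  # default transition: the input is not consumed
        accesses += 1
    return LABELED[state][ch], accesses

def run(start, text):
    """Feed text through the D2FA; return (final_state, total_accesses)."""
    state, total = start, 0
    for ch in text:
        state, n = step(state, ch)
        total += n
    return state, total

final_state, total_accesses = run(1, "abca")
```

On the 4-character input "abca" this toy automaton performs 6 state lookups instead of the 4 an uncompressed DFA would need, which is the cost the following slides set out to remove.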
7
D2FA versus DFA
  • D2FAs are compact but require multiple memory
    accesses
  • Up to 20x more memory accesses
  • Not desirable in an off-chip architecture
  • Can D2FAs match the performance of DFAs?
  • YES!
  • Content Addressed D2FAs (CD2FA)
  • CD2FAs require only one memory access per byte
  • Match the performance of a DFA in a cacheless
    system
  • In systems with a data cache, CD2FAs are 2-3x faster
  • CD2FAs are 10x more compact than DFAs

8
Introduction to CD2FA
  • How to avoid the multiple memory accesses of D2FAs?
  • Avoid the lookup to decide if the default path needs to
    be taken
  • Avoid default path traversal
  • Solution: assign a label to each state; labels
    contain
  • Characters for which it has labeled transitions
  • Information about all of its default states
  • Characters for which its default states have
    labeled transitions

Content Labels
find node R at location R
find node U at hash(c,d,R)
find node V at hash(a,b,hash(c,d,R))
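A minimal sketch of how a content label could be turned into a memory address, and how the label alone tells which state on the default path owns the transition for an input character. The slides do not specify the hash function, the table size, or where root states live, so `h`, `TABLE_SIZE`, and `ROOTS` below are assumptions made for illustration.

```python
import zlib

TABLE_SIZE = 1 << 16       # assumed number of memory words
ROOTS = {"R": 0, "Z": 1}   # assumed fixed locations for root states

def h(chars, inner):
    """Stand-in hash; the slides do not say which hash function is used."""
    return zlib.crc32(f"{chars}|{inner}".encode()) % TABLE_SIZE

def address(label):
    """label = (chars_0, chars_1, ..., root), e.g. ("ab", "cd", "R")."""
    *levels, root = label
    addr = ROOTS[root]
    for chars in reversed(levels):
        addr = h(chars, addr)   # hash(a,b,hash(c,d,R)) is built inside-out
    return addr

def owner_address(label, ch):
    """Address of the state on the default path that has a labeled transition on ch."""
    for i, chars in enumerate(label[:-1]):
        if ch in chars:
            return address(label[i:])
    return address(label[-1:])  # no labeled transition anywhere: read the root

V = ("ab", "cd", "R")
U = ("cd", "R")
```

With the label of V in hand, an input `d` resolves directly to the address of U and any other character resolves to the root, so no intermediate state has to be fetched just to discover that a default transition is needed.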
9
Introduction to CD2FA
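[Figure: two default-transition trees and their memory layout. Root R (transitions on all characters) is the default state of U (labeled transitions c, d; content label cd,R), which is the default state of V (labeled transitions a, b; content label ab,cd,R). Root Z (transitions on all characters) is the default state of Y (labeled transitions l, m; content label lm,Z), which is the default state of X (labeled transitions p, q; content label pq,lm,Z).]
Memory layout: R holds δ(R, a), δ(R, b), ...; Z holds δ(Z, a), δ(Z, b), ...; U is at hash(c,d,R); V holds δ(V, a), δ(V, b) at hash(a,b,hash(c,d,R)); X holds δ(X, p), δ(X, q) at hash(p,q,hash(l,m,Z)).
Example: current state V (label ab,cd,R); input char (d, a in the figure) -> X (label pq,lm,Z)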
10
Construction of CD2FA
  • We seek to keep the content labels small
  • Twin objectives
  • Ensure that states have few labeled transitions
  • Ensure that default paths are as short as
    possible
  • The D2FA construction heuristic based upon a maximum
    weight spanning tree creates long default paths
  • Limiting default paths => less space-efficient D2FAs
  • Proposed a new heuristic called CRO to construct
    D2FAs
  • Runs in 3 phases: Construction, Reduction and
    Optimization
  • Default path bound of 2 edges => CRO algorithm
    constructs up to 10x more space-efficient D2FAs
  • CD2FAs are constructed from these D2FAs

11
Memory Mapping in CD2FA
[Figure: the two default trees (R, U, V and Z, Y, X) mapped to memory, with states stored at hash(c,d,R), hash(a,b,hash(c,d,R)) and hash(p,q,hash(l,m,Z)); two of the hashed addresses land on the same memory location (COLLISION).]
WE HAVE ASSUMED THAT HASHING IS COLLISION FREE
12
Collision-free Memory Mapping
[Figure: four states, with labeled transitions {a, b, c}, {p, q, r}, {l, m, n} and {d, e, f}, mapped to 4 memory locations. Candidate addresses shown: hash(abc, ), hash(pqr, ), hash(def, ), hash(edf, ), hash(lmn, ), hash(mln, ).]
WE NEED A SYSTEMATIC APPROACH
13
Bipartite Graph Matching
  • Bipartite Graph
  • Left nodes are state content labels
  • Right nodes are memory locations
  • Map state labels to unique memory locations
  • An edge for every choice of content label
  • Perfect matching problem
  • With n left and right nodes
  • Need O(log n) random edges per node
  • n = 1M implies we need about 20 edges per node
  • With slight memory over-provisioning
  • We can uniquely map state labels with far fewer
    edges
  • In our experiments, we found a perfect matching
    without memory over-provisioning
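The mapping step can be illustrated with a standard augmenting-path bipartite matching (Kuhn's algorithm). This is a generic textbook method used here as a sketch, not necessarily the procedure the authors implemented. The `hashed_candidates` helper is hypothetical: it uses salted CRC32 values as a stand-in for the different encodings of a content label that give each state its choice of memory locations.

```python
import zlib

def hashed_candidates(label, n_slots, k):
    """k candidate slots for a label. Salted CRC32 is a stand-in for the
    alternative encodings of a content label (an assumption)."""
    return [zlib.crc32(f"{i}:{label}".encode()) % n_slots for i in range(k)]

def perfect_matching(cands):
    """cands: {label: [candidate slots]}. Returns {label: slot} with all
    slots distinct, or None if no such assignment exists."""
    owner = {}  # slot -> label currently assigned to it

    def place(label, seen):
        for slot in cands[label]:
            if slot in seen:
                continue
            seen.add(slot)
            # Take a free slot, or evict its owner if the owner can move.
            if slot not in owner or place(owner[slot], seen):
                owner[slot] = label
                return True
        return False

    for label in cands:
        if not place(label, set()):
            return None
    return {label: slot for slot, label in owner.items()}

example = perfect_matching({"A": [0, 1], "B": [0], "C": [1, 2]})
```

In the example, A first takes slot 0, is moved to slot 1 when B arrives with slot 0 as its only choice, and C then settles in slot 2. For real labels, the candidate lists could be generated with `hashed_candidates(label, n_slots, k)`; with n around one million, log2(n) is about 20, which is where the "20 edges per node" figure comes from.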

14
Memory Reduction Results
15
Throughput Results
3x faster with a 4KB cache
16
Conclusion
  • We have proposed CD2FAs
  • Matches/surpasses a DFA in throughput
  • 10x less memory than a table-compressed DFA
  • Novel randomized memory mapping algorithm based
    upon maximum matching in a bipartite graph
  • Zero space overhead
  • Zero bandwidth overhead
  • Thank you and Questions???