Fast Firewall Implementation for Software and Hardware-based Routers

1
Fast Firewall Implementation for Software and
Hardware-based Routers
  • Lili Qiu, Microsoft Research
  • George Varghese, UCSD
  • Subhash Suri, UCSB
  • 9th International Conference on Network Protocols
  • Riverside, CA, November 2001

2
Outline
  • Motivation for packet classification
  • Performance metrics
  • Related work
  • Our approaches
  • Performance results
  • Summary

3
Motivation
  • Traditionally, routers forward packets based on
    the destination field only
  • Firewalls and diff-serv require packet
    classification
  • Forward packets based on multiple fields in the
    packet header
  • E.g., source IP address, destination IP address,
    source port, destination port, protocol, type of
    service (ToS)

4
Packet Classification Based Router
[Figure: the header of each incoming packet goes to the forwarding engine, whose packet classification step consults a classifier (policy database) of predicate/action entries and returns an action.]
5
Problem Specification
  • Given a set of filters (or rules), find the least
    cost matching filter for each incoming packet
  • Each filter specifies
  • Some criterion on K fields
  • Associated directive
  • Cost
  • Example
    Rule 1: 24.128.0.0/16    4.0.0.0/8      udp  deny
    Rule 2: 64.248.128.0/20  8.16.192.0/24  tcp  permit
    Rule N: 24.2.0.0/16      4.16.128.0/20  any  permit
    Incoming packet: 24.128.34.8, 4.16.128.3, udp
    Answer: rule 1
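
The problem statement can be illustrated with a naive linear scan over the three example rules. This is only a baseline sketch, not the scheme proposed in the talk; the `Rule` class, the `classify` function, and the use of rule order as the cost are assumptions made for illustration.

```python
import ipaddress
from dataclasses import dataclass

@dataclass
class Rule:
    name: str
    src: str      # source prefix in CIDR notation
    dst: str      # destination prefix in CIDR notation
    proto: str    # "tcp", "udp", or "any"
    action: str
    cost: int     # assumption: lower cost wins, here the rule's position

def matches(rule, src_ip, dst_ip, proto):
    """True if the packet satisfies the rule on all three fields."""
    return (ipaddress.ip_address(src_ip) in ipaddress.ip_network(rule.src)
            and ipaddress.ip_address(dst_ip) in ipaddress.ip_network(rule.dst)
            and rule.proto in ("any", proto))

def classify(rules, src_ip, dst_ip, proto):
    """Return the least-cost matching rule, or None if nothing matches."""
    hits = [r for r in rules if matches(r, src_ip, dst_ip, proto)]
    return min(hits, key=lambda r: r.cost) if hits else None

RULES = [
    Rule("Rule 1", "24.128.0.0/16", "4.0.0.0/8", "udp", "deny", 1),
    Rule("Rule 2", "64.248.128.0/20", "8.16.192.0/24", "tcp", "permit", 2),
    Rule("Rule N", "24.2.0.0/16", "4.16.128.0/20", "any", "permit", 3),
]

best = classify(RULES, "24.128.34.8", "4.16.128.3", "udp")
print(best.name, best.action)  # prints "Rule 1 deny"
```

A scan like this costs O(N) per packet, which is what the trie-based schemes in the rest of the talk aim to avoid.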

6
Performance Metrics
  • Classification speed
  • Wire-rate lookup for minimum-size (40-byte)
    packets at OC-192 (10 Gbps) speeds
  • Memory usage
  • Should use memory linear in the number of rules
  • Update time
  • Slow updates are acceptable
  • Impact on search speed should be minimal
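
To make the classification-speed target concrete, the per-packet time budget can be worked out from the two numbers on the slide. This back-of-the-envelope sketch assumes the full 10 Gbps carries back-to-back 40-byte packets and ignores framing overhead, which the slide does not discuss.

```python
LINK_RATE_BPS = 10e9   # OC-192 rate as stated on the slide (10 Gbps)
PACKET_BYTES = 40      # minimum-size packet

# Assumption: packets arrive back to back with no framing overhead.
packets_per_second = LINK_RATE_BPS / (PACKET_BYTES * 8)
ns_per_packet = 1e9 / packets_per_second

print(f"{packets_per_second / 1e6:.2f} Mpps, {ns_per_packet:.0f} ns per packet")
# prints "31.25 Mpps, 32 ns per packet"
```

Under these assumptions each lookup has roughly 32 ns, which is why the number of memory accesses per lookup is the metric tracked in the results.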

7
Related Work
  • Given N rules in K dimensions, the worst-case
    bounds are
  • O(log N) search time with O(N^(K-1)) memory, or
  • O(N) memory with O((log N)^(K-1)) search time
  • Grid-of-tries (Srinivasan et al., SIGCOMM '98)
  • Cross-producting (Srinivasan et al., SIGCOMM '98)
  • Lucent bit vector scheme (Lakshman et al.,
    SIGCOMM '98)
  • RFC (Gupta et al., SIGCOMM '99)
  • Tuple Space Search (Srinivasan et al., SIGCOMM '99)
  • Fat Inverted Segment Tree (Feldmann et al.,
    INFOCOM '00)
  • Entry Tuple Space Pruning (Srinivasan, INFOCOM '01)

8
Backtracking Search
  • A trie is a binary branching tree, with each
    branch labeled 0 or 1
  • The prefix associated with a node is the
    concatenation of all the bits from the root to
    the node

[Figure: one-dimensional trie with nodes A to E, storing filters F1 = 00 and F2 = 10.]
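
A one-dimensional binary trie of this kind can be sketched as follows, using the two filters from the slide (F1 = 00, F2 = 10). The names `TrieNode`, `insert`, and `longest_match` are hypothetical, and returning the most specific stored prefix is one possible lookup policy chosen for illustration.

```python
class TrieNode:
    def __init__(self):
        self.children = {}   # "0" or "1" -> TrieNode
        self.filter = None   # filter stored at this prefix, if any

def insert(root, prefix, name):
    """Walk (and create) one node per bit, then store the filter there."""
    node = root
    for bit in prefix:
        node = node.children.setdefault(bit, TrieNode())
    node.filter = name

def longest_match(root, key):
    """Follow the key's bits from the root and remember the last filter seen."""
    node, best = root, root.filter
    for bit in key:
        node = node.children.get(bit)
        if node is None:
            break
        if node.filter is not None:
            best = node.filter
    return best

root = TrieNode()
insert(root, "00", "F1")
insert(root, "10", "F2")
print(longest_match(root, "0011"))  # prints "F1"
```
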
9
Backtracking Search (Cont.)
  • Extend to multiple dimensions
  • Standard backtracking
  • Depth-first traversal of the tree visiting all
    the nodes satisfying the given constraints
  • Example: search for (00, 0, 0). Result: F8

[Figure: multi-dimensional backtracking trie with nodes A to K and filters F1 to F8.]
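
The depth-first backtracking search can be sketched on a small multi-dimensional trie. This is a minimal illustration under stated assumptions, not the authors' implementation: each node may point to the root of the next dimension's trie (`next_dim`, a hypothetical name), an empty prefix stands for a wildcard, and the filters and costs below are made up rather than taken from the slide's figure.

```python
class Node:
    def __init__(self):
        self.children = {}    # "0" or "1" -> Node in the same dimension
        self.next_dim = None  # root of the next dimension's trie, if any
        self.filters = []     # (cost, name) pairs, used in the last dimension

def insert(root, prefixes, cost, name):
    """Insert a filter given one bit-string prefix per dimension."""
    node = root
    for d, prefix in enumerate(prefixes):
        for bit in prefix:
            node = node.children.setdefault(bit, Node())
        if d < len(prefixes) - 1:
            if node.next_dim is None:
                node.next_dim = Node()
            node = node.next_dim
    node.filters.append((cost, name))

def search(node, fields, d=0):
    """Visit every node whose prefix matches fields[d]; recurse into the next
    dimension from each of them, then back up and continue down the path."""
    best, bits, depth = None, fields[d], 0
    while node is not None:
        if d == len(fields) - 1:
            candidates = list(node.filters)
        elif node.next_dim is not None:
            found = search(node.next_dim, fields, d + 1)
            candidates = [found] if found else []
        else:
            candidates = []
        for cand in candidates:
            if best is None or cand < best:
                best = cand
        if depth == len(bits):
            break
        node = node.children.get(bits[depth])
        depth += 1
    return best

root = Node()
insert(root, ("0", "1"), 1, "F1")    # made-up filters for illustration
insert(root, ("", "10"), 2, "F2")    # "" acts as a wildcard prefix
insert(root, ("00", "0"), 3, "F3")
print(search(root, ("00", "10")))    # prints (1, 'F1')
```

The query ("00", "10") matches both F1 and F2, so the search has to explore two second-dimension tries before it can return the cheaper one; that repeated descent is the backtracking cost the later slides try to reduce.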
10
Set Pruning Tries
  • Multiplane trie
  • Fully specify all search paths so that no
    backtracking is necessary
  • Performance
  • O(log N) search time
  • O(N^(k-1)) storage

11
Set Pruning Tries Conversion
  • Terminology
  • Descendant string
  • String S' is a descendant of string S if S is a
    prefix of S'
  • E.g. 00 is a descendant of * (the empty prefix)
  • Descendant filter
  • Filter A is a descendant of filter B if for all
    dimensions j, string A(j) is a descendant of
    string B(j)
  • E.g. filter (00, 00) is a descendant of filter
    (*, *)
  • Converting a backtracking trie to a set pruning
    trie essentially replaces each general filter
    with its descendant filters
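
The descendant relation can be written as a pair of small predicates. This sketch assumes prefixes are represented as bit strings and that a wildcard field is the empty string; both function names are hypothetical.

```python
def is_descendant_string(s, t):
    """True if s is a descendant of t, i.e. t is a prefix of s."""
    return s.startswith(t)

def is_descendant_filter(a, b):
    """True if filter a is a descendant of filter b in every dimension."""
    return len(a) == len(b) and all(
        is_descendant_string(x, y) for x, y in zip(a, b))

# Assumption: "" stands for a wildcard field.
print(is_descendant_filter(("00", "00"), ("", "")))  # prints True
```
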

12
Set Pruning Tries Example
[Figure: Backtracking Trie and Set Pruning Trie, with nodes A to F and leaves F1, F2, F3, Min(F1,F2), and Min(F2,F3).]
Replace (*, *, *) with (0, 0, *), (0, 0, 0), (0, 1, *), (1, 0, *), (1, 1, *), and (1, 1, 1).
13
Performance Evaluation
  • 5 real databases from various sites
  • Performance metrics
  • Total storage
  • Total number of nodes in the multiplane trie
  • Worst-case lookup time
  • Total number of memory accesses in the worst case,
    assuming 1-bit-at-a-time trie traversal

14
Performance Results
Database  Rules  Backtracking            Set pruning tries
                 Lookup time  Storage    Lookup time  Storage
1         67     146          1848       86           5541
2         158    153          4914       102          51785
3         183    169          3949       102          59180
4         279    202          6785       102          123951
5         266    208          6555       102          165920
Backtracking has small storage and affordable
lookup time.
15
Major Optimizations
  • Trie compression algorithm
  • Pipelining the search
  • Selective pushing

16
Trie Compression Algorithm
  • If a path AB satisfies the Compressible Property:
  • All nodes on its left point to the same place L
  • All nodes on its right point to the same place R
  • then we replace the entire set of branches with 3
    edges:
  • Center edge, taken when the key equals the bit
    string of AB, pointing to B
  • Left edge, taken when the key is less than the
    bit string of AB, pointing to L
  • Right edge, taken when the key is greater than
    the bit string of AB, pointing to R
  • Advantages of compression: saves time and storage

[Figure: a trie path with bit string 01010 compressed into three branches, labeled = 01010, < 01010, and > 01010, leading to filters F1, F2, and F3.]
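
The three-way branch at a compressed path can be sketched as a single comparison. This assumes the compressed edge stores the bit string spelled by the path from A to B (01010 in the figure) and that the search key supplies the same number of bits; `compressed_branch` is a hypothetical helper, not a name from the talk.

```python
def compressed_branch(key_bits, path_bits):
    """Pick the edge to follow at a compressed path AB.

    key_bits:  the next len(path_bits) bits of the search key
    path_bits: the bit string spelled by the path from A to B
    """
    if len(key_bits) != len(path_bits):
        raise ValueError("key_bits and path_bits must have equal length")
    if key_bits == path_bits:
        return "center"  # the key stays on the path all the way to B
    # For equal-length bit strings, lexicographic order equals numeric order.
    # A key that leaves the path through a 0-branch is smaller (goes to L);
    # one that leaves through a 1-branch is larger (goes to R).
    return "left" if key_bits < path_bits else "right"

print(compressed_branch("00111", "01010"))  # prints "left"
```

One comparison replaces up to five single-bit steps here, which is where the time saving comes from; the off-path nodes that all led to L or R no longer need to be stored.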
17
Performance Evaluation of Compression
Database  Lookup time (uncompressed)  Lookup time (compressed)
1         146                         30
2         153                         51
3         169                         49
4         202                         98
5         208                         59
Compression reduces the lookup time by a factor
of 2 to 5.
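
The per-database speedup behind that range can be recomputed directly from the table; the snippet below only divides the reported lookup times and adds no new measurements.

```python
# (uncompressed, compressed) worst-case lookup times from the table above
lookup = {1: (146, 30), 2: (153, 51), 3: (169, 49), 4: (202, 98), 5: (208, 59)}

speedup = {db: u / c for db, (u, c) in lookup.items()}
for db, factor in speedup.items():
    print(f"database {db}: {factor:.1f}x")
```

The smallest ratio is about 2.1 (database 4) and the largest about 4.9 (database 1).
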
18
Pipelining Backtracking
  • Use pipeline to speed up backtracking
  • Issues
  • The amount of register memory passed between
    pipelining stages needs to be small
  • The amount of main memory needs to be small
  • Our approaches
  • Propose a backtracking search that only needs K+1
    registers (K is the number of dimensions)
  • Have pipeline stage i store only the trie nodes
    that will be visited in the stage i

19
Pipelining Backtracking: Limit the amount of
register memory
  • Standard backtracking requires O(WK) state for
    filters with K fields, each W bits long
  • Our approach
  • Visit more general filters first, and more
    specific filters later
  • Example
  • Search for (00, 0, 0): A-B-H-J-K-C-D-E-F-G.
    Result: F8
  • Performance
  • K+1 32-bit registers

[Figure: multi-dimensional trie showing the visiting order of the example search, with filters F1 to F8.]
20
Pipelining Backtracking: Limit the amount of
memory
  • Simple approach
  • Store an entire backtracking search trie at every
    pipelining stage
  • Storage increases linearly with the number of
    pipelining stages
  • Our approach
  • Have pipeline stage i store only the trie nodes
    that will be visited in the stage i

21
Storage Requirement for Pipeline
22
Trading Storage for Time
  • Smoothly trade off storage for time
  • Observations
  • Set pruning tries eliminate all backtracking by
    pushing down all filters, at the cost of intensive storage
  • Eliminate backtracking for filters with large
    backtracking time
  • Selective push
  • Push down the filters with large backtracking
    time
  • Iterate until the worst-case backtracking time
    satisfies our requirement

[Figure: tradeoff spectrum from O((log N)^(k-1)) time (e.g. backtracking) to O(N^(k-1)) space (e.g. set pruning).]
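
The selective push iteration can be sketched as a greedy loop. The two callbacks, `lookup_cost` and `push_down`, are hypothetical hooks into a trie implementation that the slides do not define, and the toy cost model below uses made-up numbers purely to exercise the loop.

```python
def selective_push(filters, lookup_cost, push_down, target):
    """Push down the costliest filter until the worst case meets the target.

    lookup_cost(f) and push_down(f) are assumed callbacks, not part of the talk.
    """
    pushed = []
    while True:
        worst = max(filters, key=lookup_cost)
        if lookup_cost(worst) <= target:
            return pushed
        if worst in pushed:
            raise RuntimeError(f"pushing {worst} did not reduce its cost")
        push_down(worst)
        pushed.append(worst)

# Toy model (made-up numbers): pushing F1 shortens its own search but makes
# F2 the new worst case, so a second iteration is needed.
costs = {"F1": 12, "F2": 11, "F3": 10}

def toy_push_down(name):
    if name == "F1":
        costs["F1"], costs["F2"] = 9, 12
    else:
        costs[name] = 9

result = selective_push(list(costs), costs.get, toy_push_down, target=11)
print(result)  # prints ['F1', 'F2']
```

Only the filters that violate the target are replicated, so storage grows far less than with full set pruning.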
23
Example of Selective Pushing
  • Goal: worst-case memory accesses < 12
  • The filter 0, 0, 000 has 12 memory accesses.
  • Push the filter down to reduce lookup time
  • Now the search cost of the filter 0,0,001
    becomes 12 memory accesses, so we need to push it
    down too. Done!

[Figure: trie before and after selective pushing, with filters F1, F2, and F3.]
24
Performance of Selective Push
[Figure: two panels, Compressed Trie and Uncompressed Trie.]
25
Summary
Approach: Trie compression algorithm
Description: Effectively exploits redundancy in trie nodes
Performance gain: Reduces lookup time by a factor of 2 to 5; saves storage by a factor of 2.8 to 8.7

Approach: Pipelining the search
Description: Splits the search into multiple pipelining stages, each responsible for a portion of the search
Performance gain: Increases throughput with a marginal increase in memory cost

Approach: Selective push
Description: Pushes down the filters with large backtracking time
Performance gain: Reduces lookup time by 10 to 25% with only a marginal increase in storage
26
Traditional Routers: Destination Address Lookup
[Figure: the destination address of each incoming packet goes to the forwarding engine, whose next hop computation consults a forwarding table of destination-prefix/next-hop entries and returns the next hop.]
  • Unicast destination-address-based lookup

27
Selective Push
  • Main idea
  • Push down the filters with large backtracking
    time
  • Iterate until the worst-case backtracking time
    satisfies our requirement

28
Packet Classification
  • Motivation for packet classification
  • Needed for implementing firewalls and diff-serv
  • Problem specification
  • Given a classifier of N rules, find the least
    cost matching filter for the incoming packets
  • Example
    Rule 1: 24.128.0.0/16    4.0.0.0/8      udp  deny
    Rule 2: 64.248.128.0/20  8.16.192.0/24  tcp  permit
    Rule N: 24.0.0.0/8       4.16.128.0/20  any  permit
    Incoming packet: 24.128.34.8, 4.17.135.3, udp
    Matches rule 1
  • Performance metrics
  • Classification speed
  • Memory usage
  • Update time

29
Related Work
  • Given N rules in K dimensions, the worst-case
    bounds are
  • O(log N) search time with O(N^(K-1)) memory, or
  • O(N) memory with O((log N)^(K-1)) search time
  • Tree based
  • Grid-of-tries (Srinivasan et al., SIGCOMM '98)
  • Fat Inverted Segment Tree (Feldmann et al.,
    INFOCOM '00)
  • Lucent bit vector scheme (Lakshman et al.,
    SIGCOMM '98)

30
Related Work (Cont.)
  • Cross-producting (Srinivasan et al., SIGCOMM '98)
  • RFC (Gupta et al., SIGCOMM '99)
  • Tuple space search
  • Tuple space search (Srinivasan et al., SIGCOMM '99)
  • Entry Tuple Space Pruning (Srinivasan, INFOCOM '01)

31
Related Work
  • Given N rules in K dimensions, the worst-case
    bounds are
  • O(log N) search time with O(N^(K-1)) memory, or
  • O(N) memory with O((log N)^(K-1)) search time
  • Grid-of-tries (Srinivasan et al., SIGCOMM '98)
  • Fat Inverted Segment Tree (Feldmann et al.,
    INFOCOM '00)
  • Lucent bit vector scheme (Lakshman et al.,
    SIGCOMM '98)
  • Cross-producting (Srinivasan et al., SIGCOMM '98)
  • RFC (Gupta et al., SIGCOMM '99)
  • Tuple space search (Srinivasan et al., SIGCOMM '99)

32
Trie Compression Algorithm
  • If a path AB satisfies the Compressible
    Property:
  • All nodes on its left point to the same place L
  • All nodes on its right point to the same place R
  • then we replace the entire set of branches with 3
    edges:
  • Center edge, taken when the key equals the bit
    string of AB, pointing to B
  • Left edge, taken when the key is less than the
    bit string of AB, pointing to L
  • Right edge, taken when the key is greater than
    the bit string of AB, pointing to R
  • Advantages of compression: saves time and storage

33
Trie Compression Algorithm
[Figure: a trie path with bit string 01010 compressed into three branches, labeled = 01010, < 01010, and > 01010, leading to filters F1, F2, and F3.]
34
Backtracking Search (Cont.)
  • Extend to multiple dimensions
  • Backtracking is a depth-first traversal of the
    tree which visits all the nodes satisfying the
    given constraints
  • Example: search for (00, 0, 0)

35
Example of Selective Push
  • Goal: worst-case memory accesses < 12
  • The filter 0, 0, 0000 has 12 memory
    accesses.
  • Push the filter down to reduce lookup time
  • Now the search cost of the filter 0,0,001
    becomes 12 memory accesses, so we need to push it
    down too. Done!

37
Using Available Hardware
  • So far, we have focused on software techniques
    for packet classification.
  • Further improve the performance by taking
    advantage of limited hardware if it is available
  • By moving some filters (or rules) from software
    to hardware
  • Key issue: which filters should move from software
    to hardware?
  • Answer: to reduce lookup time, move the filters
    with the largest number of memory accesses under
    the software approach
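
The selection rule can be sketched as a top-k pick over per-filter software costs. The function name, the `capacity` parameter (how many filters the hardware can hold), and the cost figures are assumptions for illustration; the slides give no concrete numbers.

```python
import heapq

def pick_for_hardware(software_cost, capacity):
    """Return the `capacity` filters with the most software memory accesses.

    software_cost maps a filter name to its number of memory accesses
    under the software search (made-up values in the example below).
    """
    return heapq.nlargest(capacity, software_cost, key=software_cost.get)

costs = {"F1": 12, "F2": 7, "F3": 10, "F4": 3}
print(pick_for_hardware(costs, 2))  # prints ['F1', 'F3']
```
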

38
Challenge of Packet Classification
  • The general packet classification problem has
    poor worst-case cost
  • Given N arbitrary filters with k packet fields
  • either the worst-case search time is
    O((log N)^(k-1))
  • or the worst-case storage is O(N^(k-1))