Packet Classification On Multiple Fields presentation

Title: Packet Classification On Multiple Fields

1
Packet Classification On Multiple Fields

Pankaj Gupta and Nick McKeown
Computer Systems Laboratory,
Stanford University
{pankaj,nickm}@stanford.edu

2
Why classify packets?

To determine which flow they belong to
=> to decide what service they should receive

Router needs to identify the flow of every
incoming packet and then perform appropriate
special processing
3
Special Processing Requires Identification of
Flows

All packets of a flow obey a pre-defined rule and
are processed similarly by the router
Classification is based on an arbitrary number of
fields in the packet header
E.g. a flow (src-IP-address, dst-IP-address),
or a flow (dst-IP-prefix, protocol) etc.

4
Network services

Routing
Access-control in firewalls
Policy-based routing
Provision of differentiated qualities of service
Traffic billing

5
What to determine?

Forward or filter a packet?
Where to forward it to?
What class of service to receive?
How much to charge for transporting it?

6
Packet Classification
[Figure: block diagram with the labels Incoming Packet, Header, Packet Classifier, Classifier (policy database) holding rules and actions, Action, and Forwarding Engine.]
7
Need for Differentiated Services
[Figure: example network with the labels ISP1, ISP2, ISP3, NAP, X, Y, Z, E1, and E2.]
8
Table 2
Class                        Relevant Packet Fields
Email and from ISP2          Source Link-layer Address, Source Transport port number
From ISP2                    Source Link-layer Address
From ISP3 and going to E2    Source Link-layer Address, Destination Network-Layer Address
All other packets            ---------
9
Packet Classification Problem Definition

Given a classifier $C$ with $N$ rules, $R_j$, $1 \le j \le N$,
where $R_j$ consists of three entities:
A regular expression $R_j[i]$, $1 \le i \le d$, on each of
the $d$ header fields,
A number, $\mathrm{pri}(R_j)$, indicating the priority of the
rule in the classifier, and
An action, referred to as $\mathrm{action}(R_j)$.

For an incoming packet $P$ with the header
considered as a $d$-tuple of points $(P_1, P_2, \ldots, P_d)$,
the $d$-dimensional packet classification
problem is to find the rule $R_m$ with the highest
priority among all the rules $R_j$ matching the
$d$-tuple, i.e., $\mathrm{pri}(R_m) > \mathrm{pri}(R_j)$, $\forall j \ne m$, $1 \le j \le N$,
such that $P_i$ matches $R_j[i]$, $1 \le i \le d$. We
call rule $R_m$ the best matching rule for packet $P$.
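
The definition above can be read as a brute-force search. The following is a minimal sketch, not the presenters' algorithm: it assumes each field specification is an inclusive integer range (the slides allow general regular expressions), that a larger number means higher priority, and the rule set, `Rule`, `matches`, and `best_matching_rule` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class Rule:
    name: str
    fields: Tuple[Tuple[int, int], ...]  # one inclusive (low, high) range per header field
    priority: int                        # assumption: larger value = higher priority
    action: str


def matches(rule: Rule, packet: Sequence[int]) -> bool:
    # P_i must match R_j[i] for every one of the d fields.
    return all(lo <= p <= hi for (lo, hi), p in zip(rule.fields, packet))


def best_matching_rule(rules: Sequence[Rule], packet: Sequence[int]) -> Optional[Rule]:
    # Among all matching rules, return the one with the highest priority.
    candidates = [r for r in rules if matches(r, packet)]
    return max(candidates, key=lambda r: r.priority, default=None)


TCP, UDP = 6, 17
# Hypothetical 2-field classifier: (destination port, protocol).
rules = [
    Rule("R1", ((80, 80), (TCP, TCP)), 3, "permit"),
    Rule("R2", ((1024, 65535), (TCP, TCP)), 2, "permit"),
    Rule("R3", ((0, 65535), (0, 255)), 1, "deny"),
]

print(best_matching_rule(rules, (80, TCP)).name)   # R1
print(best_matching_rule(rules, (53, UDP)).name)   # R3
```

This search visits every rule, so its cost grows linearly with N.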
10
Classification is a Generalization of Lookup

Classifier = routing table
One-dimension (destination address)
Rule = routing table entry
Regular expression = prefix
Action = (next-hop-address, port)
Priority = prefix-length
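
The correspondence above can be illustrated with a small sketch in which a routing lookup is treated as one-dimensional classification: each table entry is a rule, the prefix is the field specification, and the prefix length serves as the priority. The routing table contents and the function name `lookup` are assumptions made for this example.

```python
import ipaddress


def lookup(routing_table, dst):
    """Return the action of the longest matching prefix, or None if nothing matches."""
    addr = ipaddress.ip_address(dst)
    best = None  # (network, action) of the best match so far
    for prefix, action in routing_table:
        net = ipaddress.ip_network(prefix)
        # Priority = prefix length: a longer prefix beats a shorter one.
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, action)
    return best[1] if best else None


# Hypothetical routing table: prefix -> (next-hop address, port).
routing_table = [
    ("0.0.0.0/0", ("192.0.2.1", 0)),
    ("10.0.0.0/8", ("192.0.2.2", 1)),
    ("10.1.0.0/16", ("192.0.2.3", 2)),
]

print(lookup(routing_table, "10.1.2.3"))  # ('192.0.2.3', 2)
print(lookup(routing_table, "8.8.8.8"))   # ('192.0.2.1', 0)
```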

11
Example 4D classifier
[Table: example 4D classifier; the only row fragment surviving in the transcript reads: 152.163.198.4/255.255.255.255, 152.163.36.0/255.255.255.255, tcp, R6, gt 1023, Permit]
12
Example Classification Results
13
General characteristics of Classifiers

Number of rules
not a large number
0.7% of classifiers have more than 1000 rules; mean of 50 rules
Number of fields
max of 8 fields: src/dst network-layer address,
src/dst transport-layer port numbers,
type-of-service (TOS) field, protocol field,
transport-layer protocol flags
17% of rules specify 1 field, 23% specify 3 fields,
60% specify 4 fields

14
General characteristics of Classifiers (contd.)

Transport-layer protocol field
TCP, UDP, ICMP, IGMP, (E)IGRP, GRE, IPINIP or
Transport-layer field specification
10.2% have range specification
Rules with non-contiguous mask
14% of classifiers; 10.2% of all rules
Many different rules in the same classifier share
a number of field specifications
Redundant rules
8% of rules in classifiers
4.4% of rules are backward redundant
3.6% of rules are forward redundant

15
Goals

The algorithm should
Be fast enough to operate at OC48c line rates
and preferably at OC192c line rates
Allow matching on arbitrary fields
Support general classification rules:
prefixes, operators, wildcards
Be suitable for implementation in both software
and hardware
Not have expensive memory requirements
Scale in terms of both memory and speed with the
size of the classifier

16
Previous work

simplest classification algorithm
evaluating rules
sequentially
simple and efficient in its use of memory
poor scaling properties
time grows linearly with the number of
rules

17
Classification with Ternary-CAMs
[Figure: TCAM memory array searched with the packet header, with a priority encoder selecting the first matching rule.]
Too expensive, too small, and consume too much
power for large classifiers
18
Structure of the Classifiers
[Figure: rules R1, R2, and R3 forming 4 regions.]
A classification algorithm must keep a record of
each region and be able to determine the region
to which each newly arriving packet belongs
19
Structure of the Classifiers
[Figure: rules R1, R2, and R3 forming 7 regions.]
The more regions the classifier contains, the more
storage is required and the longer it takes to
classify a packet
20
Algorithm

Packet Classification problem
S bits in the packet header => T bits of classID
T = log N, where N is the number of classifier rules
A simple and fast way of doing this mapping:
pre-compute the value of classID for each of
the $2^S$ different packet headers
Yield the answer in one step in one memory
access
Require too much memory
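
The one-step mapping above can be sketched on a toy header. The following assumes a header of only S = 8 bits and three hypothetical range rules listed in priority order; it is meant to show why the table needs $2^S$ entries, not to suggest the approach is practical for real headers.

```python
import math

S = 8  # toy header width in bits (an assumption; real headers are far wider)

# Hypothetical rules over the S-bit header: inclusive (low, high), highest priority first.
rules = [(0, 15), (16, 127), (0, 255)]


def class_id(header: int) -> int:
    # Reference classification: index of the first (highest-priority) matching rule.
    for i, (lo, hi) in enumerate(rules):
        if lo <= header <= hi:
            return i
    raise ValueError("no rule matches")


# Pre-compute the classID for each of the 2^S possible headers.
table = [class_id(h) for h in range(2 ** S)]

# T = log N bits are enough to name one of N rules (rounded up to whole bits).
T = math.ceil(math.log2(len(rules)))

print(len(table), T)  # 256 2
print(table[100])     # 1: a single memory access gives the answer
```

With S = 128 the same table would need `2 ** 128` entries, which is the memory problem the slide points out.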

21
Recursive Flow Classification: perform the same
mapping but over several stages
One-step: $2^S = 2^{128}$ => $2^T = 2^{12}$
22
Recursive Flow Classification

Consists of P phases
each with a set of parallel memory lookups
Each lookup is a reduction
the value returned by the memory lookup is
shorter than the index of the memory
access

23
Chunking of a Packet
Used to index into multiple memories in parallel
[Figure: the packet header (Source L3 Address, Destination L3 Address, L4 protocol and flags, Source L4 port, Destination L4 port, Type of Service) split into chunks, Chunk 0 through Chunk 7.]
24
Packet Flow
[Figure: the header passing through Phase 0, Phase 1, Phase 2, and Phase 3; labels: index, eqID, action.]
25
Example 4D classifier
[Table: example 4D classifier; the only row fragment surviving in the transcript reads: 152.163.198.4/255.255.255.255, 152.163.36.0/255.255.255.255, tcp, R6, gt 1023, Permit]
26

In phase 0
chunk 6
1. www (80)  2. 20, 21  3. gt 1023  4. remaining numbers
can be encoded by eqIDs 00b to 11b
reduction: 16 to 2 bits
chunk 4
1. tcp  2. udp  3. remaining numbers
can be encoded by 2 bits
reduction: 8 to 2 bits
In phase 1
CESs: 1. (80, udp)  2. (20-21, udp)  3. (80, tcp)
4. (gt 1023, tcp)  5. all remaining crossproducts
concatenating: reduction 4 to 3 bits
can be encoded by 3 bits
total reduction: 24 to 3 bits

27
RFC preprocessing for chunk j of phase 0

For each rule rl in the classifier
  project the component of rl covered by chunk j onto the number
  line (from 0 to 2^b - 1, where b is the chunk width in bits),
  marking the start and end points of each of its
  constituent intervals
End for
bmp := 0
For n in 0 .. 2^b - 1
  If (any rule starts or ends at n)
    update bmp
    if (bmp not seen earlier)
      eq := new_Equivalence_Class()
      eq->cbm := bmp
    else
      eq := the equivalence class whose cbm is bmp
    end if
  End if
  table_0_j[n] := eq->ID
End for
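
A runnable sketch of the scan above, under stated assumptions: each rule contributes one inclusive interval to the chunk, the class bitmap (cbm) is a Python integer with bit r set when rule r matches, and the function name `preprocess_phase0_chunk` and the three port rules are hypothetical, chosen to mirror the port example from the earlier slide.

```python
def preprocess_phase0_chunk(intervals, b):
    """intervals[r] = inclusive (low, high) range of rule r within this b-bit chunk.

    Returns (table, cbms): table[n] is the eqID for chunk value n, and
    cbms[eqID] is the class bitmap (bit r set means rule r matches).
    """
    size = 1 << b
    starts = [[] for _ in range(size + 1)]
    ends = [[] for _ in range(size + 1)]
    for r, (lo, hi) in enumerate(intervals):
        starts[lo].append(r)
        ends[hi + 1].append(r)  # rule r stops matching just past its end point

    bmp = 0
    eq_of_bmp = {}   # bitmap -> eqID, so "bmp not seen earlier" is a dict lookup
    cbms = []
    table = [0] * size
    for n in range(size):
        for r in starts[n]:
            bmp |= 1 << r
        for r in ends[n]:
            bmp &= ~(1 << r)
        if bmp not in eq_of_bmp:
            eq_of_bmp[bmp] = len(cbms)  # new equivalence class
            cbms.append(bmp)
        table[n] = eq_of_bmp[bmp]
    return table, cbms


# Hypothetical 16-bit port chunk: rule 0 = port 80, rule 1 = ports 20-21, rule 2 = gt 1023.
table, cbms = preprocess_phase0_chunk([(80, 80), (20, 21), (1024, 65535)], 16)
print(len(cbms))                          # 4 classes, so 16 bits reduce to 2 bits
print(table[80], table[21], table[2000])  # 2 1 3
```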

28
RFC preprocessing for chunk i of phase j (j > 0)

index := 0
listEqs := nil
For each CES, c1eq, of chunk c1
  For each CES, c2eq, of chunk c2
    ...
    For each CES, cmeq, of chunk cm
      intersectedBmp := c1eq->cbm & c2eq->cbm & ... & cmeq->cbm
      neweq := searchList(listEqs, intersectedBmp)
      if (not found in listEqs)
        neweq := new_Equivalence_Class()
        neweq->cbm := intersectedBmp
        add neweq to listEqs
      end if
      table_j_i[index] := neweq->ID
      index++
    End for
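
The nested loops above can be sketched with `itertools.product`, which enumerates parent classes in the same order (last parent varying fastest). The four rules behind the bitmaps are an assumption chosen so that the result has the five CESs of the earlier port/protocol example; `combine_chunks` is a hypothetical name.

```python
from functools import reduce
from itertools import product
from operator import and_


def combine_chunks(parent_cbms):
    """parent_cbms[k] is the list of class bitmaps of parent chunk k, indexed by eqID.

    Returns (table, cbms) for the combined chunk; table is indexed by the
    crossproduct position of the parents' eqIDs.
    """
    table = []
    cbms = []
    eq_of_bmp = {}
    for combo in product(*parent_cbms):
        intersected = reduce(and_, combo)  # rules matching in every parent chunk
        if intersected not in eq_of_bmp:
            eq_of_bmp[intersected] = len(cbms)
            cbms.append(intersected)
        table.append(eq_of_bmp[intersected])
    return table, cbms


# Assumed rules: r0 = (80, udp), r1 = (20-21, udp), r2 = (80, tcp), r3 = (gt 1023, tcp).
port_cbms = [0b0000, 0b0010, 0b0101, 0b1000]  # remaining, 20-21, 80, gt 1023
proto_cbms = [0b1100, 0b0011, 0b0000]         # tcp, udp, remaining

table, cbms = combine_chunks([port_cbms, proto_cbms])
print(len(table), len(cbms))  # 12 5: a 4-bit index reduces to a 3-bit eqID
```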

29
Performance of RFC

1. number of phases P
we combine those chunks together which have
the most correlation
2. the reduction tree used
we combine as many chunks as we can without
causing unreasonable memory consumption

30
Choice of Reduction Tree
[Figure: two candidate reduction trees, Tree_A and Tree_B, each reducing chunks 0 to 5 to a ClassID.]
Number of phases P = 3; 10 memory accesses
31
Choice of reduction tree
[Figure: two candidate reduction trees, Tree_A and Tree_B, each reducing chunks 0 to 5 to a ClassID.]
Number of phases P = 4; 11 memory accesses
32
RFC lookup in Hardware
[Figure: RFC hardware lookup pipeline with Phase 0, Phase 1, and Phase 2; memories SRAM1, SRAM2, SDRAM1, SDRAM2; labels: Chks0-2, Chks3-5, Chks0 and 1 replicated, Chk0, Chk0 (replicated).]
Clk 125 MHz => 31.25
million packets per second
33
RFC lookup in software

30 lines of code in C
compiled on a 333 MHz Pentium II PC running
Windows NT
worst case path for the code took
(140 clks + 9 tm) for three phases and (146 clks + 11 tm)
for four phases
tm = memory access time = 60 ns
=> 0.98 us for 3 phases, 1.1 us for 4 phases
close to one million packets per second
the average lookup time is 50% faster than the
worst case

34
RFC lookup operation

For (each chunk, chkNum, of phase 0)
  eqNums[0][chkNum] := contents of appropriate
    rfctable at memory address pktFields[chkNum]
For (phaseNum = 1 .. numphases-1)
  For (each chunk, chkNum, in phase phaseNum)
    chd := parent descriptor of (phaseNum, chkNum)
    index := eqNums[phaseNum of chd->chkParents[0]][chkNum of chd->chkParents[0]]
    For (i = 1 .. chd->numChkParents-1)
      index := index * (total equivIDs of chd->chkParents[i])
               + eqNums[phaseNum of chd->chkParents[i]][chkNum of chd->chkParents[i]]
    End for
    eqNums[phaseNum][chkNum] := contents of
      appropriate rfctable at address index
  End for
Return eqNums[numphases-1][0]
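
A minimal runnable sketch of this lookup, assuming a two-phase reduction over two chunks (a 16-bit port and an 8-bit protocol). The table contents, the descriptor layout `(phase, chunk, number of eqIDs)`, and the name `rfc_lookup` are illustrative assumptions, not the presenters' C code.

```python
def rfc_lookup(pkt_fields, phase0_tables, later_phases):
    """later_phases[p] is a list of chunks; each chunk is (parents, table), where
    parents is a list of (phase, chunk, number_of_eqIDs) descriptors."""
    # Phase 0: one memory lookup per chunk, indexed directly by the packet field.
    eq_nums = [[table[field] for table, field in zip(phase0_tables, pkt_fields)]]
    for chunks in later_phases:
        row = []
        for parents, table in chunks:
            p0, c0, _ = parents[0]
            index = eq_nums[p0][c0]
            # Fold the remaining parents' eqIDs into a single table index.
            for p, c, n_eq in parents[1:]:
                index = index * n_eq + eq_nums[p][c]
            row.append(table[index])
        eq_nums.append(row)
    return eq_nums[-1][0]  # the single chunk of the last phase holds the classID


# Hypothetical phase-0 tables: port eqIDs (0 remaining, 1 = 20-21, 2 = 80, 3 = gt 1023).
port_table = [0] * 65536
port_table[20] = port_table[21] = 1
port_table[80] = 2
for n in range(1024, 65536):
    port_table[n] = 3
# Protocol eqIDs: 0 = tcp (6), 1 = udp (17), 2 = remaining.
proto_table = [2] * 256
proto_table[6], proto_table[17] = 0, 1

# Hypothetical phase-1 table over the 4 x 3 crossproduct of the eqIDs above.
phase1 = [([(0, 0, 4), (0, 1, 3)], [0, 0, 0, 0, 1, 0, 2, 3, 0, 4, 0, 0])]

print(rfc_lookup((80, 6), [port_table, proto_table], [phase1]))   # 2
print(rfc_lookup((53, 17), [port_table, proto_table], [phase1]))  # 0
```

Each packet costs one lookup per chunk per phase, here three memory accesses in total.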

35
Table 6

Src L3 31..16
Src L3 15..0
Dst L3 15..0
Dstn L4 16 bits
Dst L3 31..16
L4 protocol 8 bits
Action

0.77/1
0.0/0.0
0
0.83/1..
4.6/1
udp

permit
0.0/0.0
4.6/1
1.0/255.0
1
0.83/1..
udp
20-30
permit

21
0.0/1
2
0.83/1..
0.0/0.0
0.77/1
permit
3
0.0/0.0
0.0/0.0
0.0/1

21
deny
0.0/0.0
4
0.0/0.0
0.0/0.0
0.0/1
0.0/0.0

permit
36
Variations and improvements of RFC

1. RFC can be extended to process a larger number
of fields in each packet header
2. speed up RFC by taking advantage of available
fast lookup algorithms
3. employ the adjacency groups technique
to reduce the memory requirements when
processing large classifiers

37
Adjacency Groups

Size of the RFC table depends on the number of CESs
Rules R and S are adjacent in dimension i if
1. they have the same action
2. all but the ith field have the exact same
specification in the two rules
3. all rules appearing between them have either
the same action or are disjoint from R
Two rules are simply adjacent if they are adjacent in
some dimension
So we merge adjacent rules

38
Example of adjacency groups
R(a1,b1,c1,d1) S(a1,b1,c2,d1) T(a2,b1,c2,d1) U(a2,b1,c1,d1)
V(a1,b1,c4,d2) W(a1,b1,c3,d2) X(a2,b1,c3,d2) Y(a2,b1,c4,d2)
Merge along Dimension 3:
RS(a1,b1,c1+c2,d1) TU(a2,b1,c1+c2,d1) VW(a1,b1,c3+c4,d2) XY(a2,b1,c3+c4,d2)
Merge along Dimension 1:
RSTU(a1+a2,b1,c1+c2,d1) VWXY(a1+a2,b1,c3+c4,d2)
Carry out an RFC phase. Assume chunks 1 & 2 are
combined, and also chunks 3 & 4 are combined:
RSTU(m1,n1) VWXY(m1,n2)
Merge:
RSTUVWXY(m1,n1+n2)
Continue with RFC
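
The first two merge steps of this example can be reproduced with a short sketch. It makes simplifying assumptions: all eight rules share one action, each field specification is modelled as a set of symbolic values, and the check on rules lying between two candidates (condition 3 of adjacency) is omitted. The helper names `rule` and `merge_along` are hypothetical.

```python
def rule(name, *values, action="permit"):
    # Each field starts as a single-value set; merging takes set unions.
    return (name, tuple(frozenset([v]) for v in values), action)


def merge_along(rules, dim):
    """Merge rules with the same action that agree on every field except `dim`.

    Simplification (assumption): rules appearing between two candidates are
    not examined, so condition 3 of adjacency is not enforced here.
    """
    groups = {}
    for name, fields, action in rules:
        key = (fields[:dim] + fields[dim + 1:], action)
        if key in groups:
            gname, gfields, _ = groups[key]
            merged = gfields[:dim] + (gfields[dim] | fields[dim],) + gfields[dim + 1:]
            groups[key] = (gname + name, merged, action)
        else:
            groups[key] = (name, fields, action)
    return list(groups.values())


rules = [
    rule("R", "a1", "b1", "c1", "d1"), rule("S", "a1", "b1", "c2", "d1"),
    rule("T", "a2", "b1", "c2", "d1"), rule("U", "a2", "b1", "c1", "d1"),
    rule("V", "a1", "b1", "c4", "d2"), rule("W", "a1", "b1", "c3", "d2"),
    rule("X", "a2", "b1", "c3", "d2"), rule("Y", "a2", "b1", "c4", "d2"),
]

step1 = merge_along(rules, 2)  # dimension 3 (zero-based index 2)
step2 = merge_along(step1, 0)  # dimension 1 (zero-based index 0)
print([name for name, _, _ in step1])  # ['RS', 'TU', 'VW', 'XY']
print([name for name, _, _ in step2])  # ['RSTU', 'VWXY']
```

Eight rules shrink to two before the next RFC phase, which is how adjacency groups cut the number of CESs.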
39
RFC Pros and Cons
