CAMP: Fast and Efficient IP Lookup Architecture - PowerPoint PPT Presentation

Slides: 28
Provided by: michela
Learn more at: https://www.cs.wustl.edu

1
CAMP: Fast and Efficient IP Lookup Architecture
  • Sailesh Kumar, Michela Becchi,
  • Patrick Crowley, Jonathan Turner
  • Washington University in St. Louis

2
Context
  • Trie based IP lookup
  • Circular pipeline architectures

3
Context
  • Trie based IP lookup
  • Circular pipeline architectures

IP address: 111010

Prefix dataset:
Prefix   Label
0        P1
000      P2
0010     P3
0011     P4
011      P5
10       P6
11       P7
110      P8
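
The lookup on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the names TrieNode, build_trie, and longest_prefix_match are hypothetical, and the labels P1 to P8 are treated as opaque lookup results. The sketch builds a binary trie from the prefix dataset and walks it bit by bit, remembering the last labeled node it passed.

```python
# Prefix dataset from the slide (bit string -> label).
PREFIXES = {
    "0": "P1", "000": "P2", "0010": "P3", "0011": "P4",
    "011": "P5", "10": "P6", "11": "P7", "110": "P8",
}


class TrieNode:
    """One node of a binary trie; label is None for unlabeled nodes."""

    def __init__(self):
        self.children = {}
        self.label = None


def build_trie(prefixes):
    """Insert every prefix, creating intermediate nodes as needed."""
    root = TrieNode()
    for bits, label in prefixes.items():
        node = root
        for bit in bits:
            node = node.children.setdefault(bit, TrieNode())
        node.label = label
    return root


def longest_prefix_match(root, address):
    """Walk the trie along the address bits and keep the deepest label seen."""
    node, best = root, root.label
    for bit in address:
        node = node.children.get(bit)
        if node is None:
            break
        if node.label is not None:
            best = node.label
    return best


trie = build_trie(PREFIXES)
print(longest_prefix_match(trie, "111010"))  # P7
```

For the address 111010, the walk passes the node for 11 (P7) and then stops, because the only longer prefix under it is 110.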

4
Context
[Figure: binary trie built from the prefix dataset, looked up with IP address 111010; nodes are labeled P1 to P8]

5
Context
[Figure: the same trie, annotated with Stage 1 through Stage 4]

6
Context
[Figure: the same trie next to a circular pipeline, Stage 1 through Stage 4]
7
CAMP: Circular Adaptive and Monotonic Pipeline
  • Problems
    • Optimize global memory requirement
    • Avoid bottleneck stages
    • Make the per-stage utilization uniform
  • Idea
    • Exploit a circular pipeline
    • Each stage can be a potential entry/exit point
    • Possible wrap-around
    • Split the trie into sub-trees and map each of
      them independently to the pipeline

8
CAMP (contd)
  • Implications
    • Pros
      • Flexibility: decoupling of maximum prefix length
        from pipeline depth
      • Upgradeability: memory bank updates involve only
        partial remapping
    • Cons
      • A stage can be simultaneously an entry point and
        a transition stage for two distinct requests
      • This is the origin of conflicts
      • Scheduling mechanism required
      • Possible efficiency degradation

9
Trie splitting
  • Define initial stride x
  • Use a direct index table with 2^x entries for
    the first x levels
  • Expand short prefixes to length x
  • Map the sub-trees

[Figure: direct index table pointing to Subtree 1, Subtree 2, and Subtree 3; e.g., initial stride x = 2]
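
The splitting step can be sketched as follows, using the prefix dataset from the Context slides and an initial stride of 2. This is an illustrative sketch under stated assumptions: the function name build_direct_index and the dictionary layout are hypothetical, and the sub-trees are kept as plain dictionaries of remaining bits rather than real trie nodes.

```python
# Prefix dataset from the Context slides (bit string -> label).
PREFIXES = {
    "0": "P1", "000": "P2", "0010": "P3", "0011": "P4",
    "011": "P5", "10": "P6", "11": "P7", "110": "P8",
}


def build_direct_index(prefixes, stride):
    """Split a prefix set into a 2**stride table plus one sub-tree per entry.

    Prefixes no longer than the stride are expanded to exactly `stride` bits;
    when two expansions collide, the longer original prefix wins. Longer
    prefixes go into the sub-tree of the entry selected by their first
    `stride` bits, keyed by their remaining bits.
    """
    table = [None] * (2 ** stride)
    owner_len = [-1] * (2 ** stride)  # length of the prefix that filled each entry
    subtrees = {}
    for bits, label in prefixes.items():
        if len(bits) <= stride:
            pad = stride - len(bits)
            for suffix in range(2 ** pad):
                tail = format(suffix, "b").zfill(pad) if pad else ""
                index = int(bits + tail, 2)
                if len(bits) > owner_len[index]:
                    table[index] = label
                    owner_len[index] = len(bits)
        else:
            index = int(bits[:stride], 2)
            subtrees.setdefault(index, {})[bits[stride:]] = label
    return table, subtrees


table, subtrees = build_direct_index(PREFIXES, 2)
print(table)             # ['P1', 'P1', 'P6', 'P7']
print(sorted(subtrees))  # [0, 1, 3]
```

With stride 2 the short prefix 0 is expanded to 00 and 01, and three sub-trees remain (under entries 00, 01, and 11), which is consistent with the three sub-trees drawn on the slide.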

10
Dealing with conflicts
  • Idea: use a request queue in front of each stage
  • Intuition: without request queues,
    • a request may wait up to n cycles before entering
      the pipeline
    • a waiting request causes all subsequent requests
      to wait as well, even if they are not competing for the
      same stages
  • Issue: ordering
    • Reordering is limited to requests with different entry
      stages (addressed to different destinations)
    • An optional output reorder buffer can be used

11
Pipeline Efficiency
  • Metrics
    • Pipeline utilization: fraction of time the
      pipeline is busy, provided that there is a
      continuous backlog of requests
    • Lookups per Cycle (LPC): average request
      dispatching rate
  • Linear pipeline
    • LPC = 1
    • Pipeline utilization generally low
    • Stage utilization not uniform
  • CAMP pipeline
    • High pipeline utilization
    • Uniform stage utilization
    • LPC close to 1
      • Complete pipeline traversal for each request
      • # pipeline stages = # trie levels
    • LPC > 1
      • Most requests don't make complete circles around
        the pipeline
      • # pipeline stages > # trie levels
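
The LPC metric can be explored with a toy model of a circular pipeline that has a request queue in front of each stage. This is a simplified sketch, not the simulator behind the results in the talk: simulate_lpc and its parameters are hypothetical names, every request is assumed to visit the same number of consecutive stages (hops), and all requests are queued at their entry stage before the first cycle to model a continuous backlog. A request that enters stage s at cycle c stays in the rotating slot (s - c) mod n, so two requests collide only if they share a slot at the same time.

```python
from collections import deque


def simulate_lpc(num_stages, hops, entry_stages, cycles):
    """Return lookups per cycle for a toy circular pipeline.

    entry_stages lists the entry stage of each pending request, in arrival
    order. Each request occupies one rotating slot for `hops` cycles.
    """
    assert 1 <= hops <= num_stages, "a request must finish within one loop"
    queues = [deque() for _ in range(num_stages)]
    for request_id, stage in enumerate(entry_stages):
        queues[stage].append(request_id)
    free_at = [0] * num_stages  # first cycle at which each slot is free again
    dispatched = 0
    for cycle in range(cycles):
        for stage in range(num_stages):
            slot = (stage - cycle) % num_stages
            if queues[stage] and free_at[slot] <= cycle:
                queues[stage].popleft()
                free_at[slot] = cycle + hops
                dispatched += 1
    return dispatched / cycles


spread = [s for _ in range(100) for s in range(4)]  # backlog on every stage
print(simulate_lpc(4, 4, spread, 8))     # 1.0
print(simulate_lpc(4, 2, spread, 8))     # 2.0
print(simulate_lpc(4, 2, [0] * 100, 8))  # 1.0
```

In this toy model, full traversals cap LPC at 1, shorter traversals let LPC exceed 1 when requests are spread over the entry stages, and a long burst to a single entry stage falls back to one dispatch per cycle.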

12
Pipeline efficiency: all stages traversed
  • Setup
    • 24 stages, all traversed by each packet
    • Packet bursts: sequences of packets to the same entry
      point
  • Results
    • Long bursts result in high utilization and LPC
    • For all burst sizes, enough queuing (32)
      guarantees 0.8 LPC

13
Pipeline efficiency: LPC > 1
  • Setup
    • 32 stages, rightmost 24 bits, Tree Bitmap of
      stride 3
    • Average prefix length 24
  • Results
    • LPC between 3 and 5
    • Long bursts result in lower utilization and LPC

14
Nodes-to-stages mapping
  • Objectives
    • Uniform distribution of nodes to stages
      • Minimize the size of the biggest stage
    • Correct operation of the circular pipeline
      • Avoid multiple loops around pipeline
    • Simplified update operation
      • Avoid skipping levels

15
Nodes-to-stages mapping (contd)
  • Problem formulation (constrained graph coloring)
    • Given
      • A list of sub-trees
      • A list of colors represented by numbers
    • Color nodes so that
      • Every color is nearly equally used
      • A monotonic ordering relationship without gaps
        among colors is respected when traversing
        sub-trees from root to leaves
  • Algorithm (min-max coloring heuristic)
    • Color sub-trees in decreasing order of size
    • At each step
      • Try all possible colors on the root (the rest of the
        sub-tree is colored accordingly)
      • Pick the local optimum
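
A compact sketch of the heuristic, under a few assumptions: each sub-tree is summarized by the number of nodes on each of its levels, colors are numbered from 0 in the code (color 1 on the slides is index 0), and ties are broken in favor of the lowest color. The names place_subtree and min_max_coloring are hypothetical. The starting usage and the two level-size lists are chosen to be consistent with the numbers in the worked example on the following slides; the slides do not state the sub-tree shapes explicitly.

```python
def place_subtree(usage, level_sizes, root_color):
    """Return per-color usage after coloring a sub-tree from root_color.

    Level d of the sub-tree gets color (root_color + d) mod number of colors,
    so colors follow a monotonic, gap-free order with wrap-around.
    """
    num_colors = len(usage)
    new_usage = list(usage)
    for depth, size in enumerate(level_sizes):
        new_usage[(root_color + depth) % num_colors] += size
    return new_usage


def min_max_coloring(subtrees, num_colors, usage=None):
    """Greedy min-max heuristic: largest sub-trees first, best root color each."""
    usage = list(usage) if usage is not None else [0] * num_colors
    chosen_roots = []
    for level_sizes in sorted(subtrees, key=sum, reverse=True):
        # Try every root color and keep the one with the smallest maximum usage.
        best = min(
            range(num_colors),
            key=lambda color: max(place_subtree(usage, level_sizes, color)),
        )
        usage = place_subtree(usage, level_sizes, best)
        chosen_roots.append(best)
    return usage, chosen_roots


usage, roots = min_max_coloring([[1, 2, 1, 2], [1, 1, 2, 4]], 4, [1, 2, 4, 5])
print(usage)  # [5, 7, 7, 7]
print(roots)  # [2, 1]
```

The roots are reported in processing order (largest sub-tree first): the 8-node sub-tree gets root color index 2, which brings the usage to 3, 6, 5, 6, and the 6-node sub-tree gets index 1, which ends at 5, 7, 7, 7.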

16
Min-max coloring heuristic - example
[Figure: sub-trees T1, T2, T3, and T4]

          Present    If 1 on    If 2 on    If 3 on    If 4 on
          coloring   new root   new root   new root   new root
Color 1   1
Color 2   2
Color 3   4
Color 4   5

17
Min-max coloring heuristic - example

          Present    If 1 on    If 2 on    If 3 on    If 4 on
          coloring   new root   new root   new root   new root
Color 1   1          2          5          3          2
Color 2   2          3          3          6          4
Color 3   4          6          5          5          8
Color 4   5          9          7          6          6

19
Min-max coloring heuristic - example

          Present    If 1 on    If 2 on    If 3 on    If 4 on
          coloring   new root   new root   new root   new root
Color 1   3          4          5          4          5
Color 2   6          8          7          8          7
Color 3   5          6          7          6          7
Color 4   6          8          7          8          7

21
Min-max coloring heuristic - example

          Present
          coloring
Color 1   5
Color 2   7
Color 3   7
Color 4   7

22
Evaluation settings
  • Trends in BGP tables
    • Increasing number of prefixes
    • Most prefixes are <26 bits (24 bits) long
    • Route updates can concentrate in a short period of
      time; however, they rarely change the shape of
      the trie
  • 50 BGP tables containing from 50K to 135K
    prefixes

23
Memory requirements
[Figure: memory requirements compared for CAMP, level based mapping, and height based mapping]
  • Balanced distribution across stages
  • Reduced total memory requirements
  • Memory overhead 2.4 w/ initial stride 8, 0.02
    w/ initial stride 12, 0.01 w/ initial stride 16

24
Updates
  • Techniques for handling updates
    • Single updates inserted as bubbles in the
      pipeline
    • Rebalancing computed offline and involving only a
      subset of tries
  • Scenario
    • migration between different BGP tables
    • imbalance leads to 4 increase in occupancy of
      larger stage

25
Summary
  • Analysis of a circular pipeline architecture for
    trie based IP lookup
  • Goals
    • Minimize memory requirement
    • Maximize pipeline utilization
    • Handle updates efficiently
  • Design
    • Decoupling of stages from maximum prefix length
    • LPC analysis
    • Nodes-to-stages mapping heuristic
  • Evaluation
    • On real BGP tables
    • Good memory utilization and ability to keep
      40 Gbps line rate through small memory banks
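
To relate a line rate to the LPC metric, a back-of-the-envelope calculation helps. The sketch below rests on assumptions that are not in the slides: one lookup per packet, back-to-back minimum-size packets of 40 bytes, and no framing overhead. The function name required_clock_mhz is hypothetical.

```python
def required_clock_mhz(line_rate_gbps, packet_bytes, lpc):
    """Pipeline clock (MHz) needed to sustain a line rate at a given LPC.

    Assumes one lookup per packet and back-to-back packets of packet_bytes.
    """
    lookups_per_second = line_rate_gbps * 1e9 / (packet_bytes * 8)
    return lookups_per_second / lpc / 1e6


print(required_clock_mhz(40, 40, 1.0))  # 125.0
```

Under these assumptions, 40 Gbps corresponds to 125 million lookups per second, so an LPC above 1 lowers the clock rate the memory banks must sustain.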

26
  • Thank you!

27
Addressing the worst case
  • Observations
    • We addressed practical datasets
    • Worst case tries may have long and skinny
      sections that are difficult to split
  • Idea: adaptive CAMP
    • Split the trie into parent and child sub-tries
    • Map the parent sub-trie into the pipeline
    • Use more pipeline stages to mitigate the effect of
      multiple loops around the pipeline