Scalability Considerations for Programmable Networks - PowerPoint PPT Presentation

Title: Scalability Considerations for Programmable Networks


1
Scalability Considerations for Programmable Networks
  • Jim Griffioen
  • Laboratory for Advanced Networking
  • University of Kentucky, USA
  • Collaborator: Prof. Ken Calvert
  • Students: Su Wen, Andy Martin, Najati Imam,
    Muthulakshmi Muthukumaraswamy

Research supported by DARPA and Intel
2
Programmable Services (Outline)
  • Background and problem description
  • Design goals and consequences
  • Overview of ESP, a lightweight network service
  • A simple example
  • Engineering considerations
  • Other uses
  • Status

3
The Internet
  • The Internet has proven itself to be a robust,
    scalable, and flexible communication
    infrastructure
  • Simple service abstraction
  • Basic service is useful to a wide range of
    applications
  • Additional services must be implemented at
    end-systems (end-to-end principle)
  • Best-effort service
  • Service can be implemented with low cost
  • Processing cycles: can route packets quickly
  • Router state: aggregates router state
  • Result: can support hundreds of thousands of flows
    and millions of end-systems

4
The Evolving Internet
  • New and emerging applications require new
    network-level services not supported by the
    Internet Protocol
  • Examples include
  • Flow Prioritization (forwarding/dropping/routing)
  • Congestion management (RED, ECN)
  • Reliable dissemination (PGM)
  • Specification-based anycasting
  • Layered multicasting (RLM)
  • Scalable aggregation services (Concast)
  • Single source multicast (Express)
  • ... and many more application-specific
    services

5
New Service Approaches
  • Closed Approach: Rely on router vendors to add
    new application-specific services to their
    equipment
  • Extremely Open Approach: Allows end-systems to
    dynamically load arbitrary code into network
    routers
  • Application Layer Approach: Implement new
    services solely at the application layer without
    involving the network layer

6
Basic Router Functions
receive
Lookup enqueue
transmit
Routing Table
congestion control
7
New Service Approaches
  • Closed Approach: Rely on router vendors to add
    new application-specific services to their
    equipment
  • Extremely Open Approach: Allows end-systems to
    dynamically load arbitrary code into network
    routers
  • Application Layer Approach: Implement new
    services solely at the application layer without
    involving the network layer

8
Programmable Router I
transmit
execute
receive
Virtual Machine
program
input
state store
9
Programmable Router II
transmit
execute
Active Application
Virtual Machine
state store
10
New Service Approaches
  • Closed Approach: Rely on router vendors to add
    new application-specific services to their
    equipment
  • Extremely Open Approach: Allows end-systems to
    dynamically load arbitrary code into network
    routers
  • Application Layer Approach: Implement new
    services only at the application layer without
    involving the network layer

11
A New Approach (ESP/LWP)
  • We need a new approach, a middle ground, that
    opens the traditionally opaque network layer
    just enough.
  • We want to allow end-systems to extract
    information about the network or control the way
    the network processes/handles packets without
    exposing too much or creating scalability or
    security problems

12
Our Design Goals
  • A programmable network service that has IP-like
    characteristics
  • Flexible
  • Applicable to more than one kind of problem,
    including presently unknown problems
  • Useful
  • Deployable today
  • Solves (or assists in solving) one or more real
    problems
  • Scalable
  • Can potentially be used by every end system
  • Accommodates 100,000 simultaneous flows
  • Robust
  • Best effort

13
Requirements
  • Allow user-specified information to be stored,
    modified, retrieved inside the network
  • Necessary to solve interesting problems
  • Too cheap to meter
  • Service must be accessible without special
    authorization
  • Crypto authentication/access policies don't scale
    to 100K flows
  • Negligible management overhead
  • DoS-resistance
  • Leverage existing IP forwarding infrastructure
  • Don't re-invent this wheel
  • Necessary for deployability

14
User-controlled State
  • Conventional wisdom: Not scalable
  • Too expensive to provide for 100K flows
  • Overhead of managing setup, soft-state refresh,
    garbage collection
  • Limiting factors
  • Time-space product of memory usage (per flow)
  • Signaling overhead and robustness against errors
  • Goal
  • Bound the Time-Space product
  • Reduce or avoid signaling overhead
  • Expect and tolerate errors

15
Per-Packet Processing
  • Centralized computation at routers does not scale
  • Cannot assume all packets processed by a single
    processor
  • Need per-port processing
  • Should be comparable to IP forwarding
  • Approximately constant cost per packet (i.e.,
    active network capsules are not reasonable)
  • Hardware-friendly
  • Goal
  • Bounded per-packet processing costs
  • Processing may not modify the IP header!
  • Instructions may modify packet payload or state
    store

16
Ephemeral State Processing
  • ESP Solution
  • fixed-lifetime state store
  • no management overhead
  • fixed-length computations
  • ESP Components
  • 1) Ephemeral State Store
  • Associative memory: set of (tag, value) pairs
  • Fixed-size tags and values (e.g. 64- or 128-bit
    unstructured bit strings)
  • Tags are like names of variables
  • Tags randomly selected => private stores
  • Bindings persist for a (short) fixed time τ, then
    vanish
  • Bindings cannot be refreshed
  • No management overhead
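
The store described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the class name, the method names, the injectable clock, and the 10-second default lifetime are hypothetical choices, not the authors' implementation. The point it shows is that an update to a live binding keeps the original expiry time, so bindings cannot be refreshed.

```python
import time


class EphemeralStateStore:
    """Associative store of (tag, value) bindings with a fixed lifetime."""

    def __init__(self, lifetime=10.0, clock=time.monotonic):
        self.lifetime = lifetime   # tau, in seconds (assumed default)
        self.clock = clock         # injectable clock, handy for testing
        self._bindings = {}        # tag -> (value, expiry time)

    def get(self, tag):
        """Return the bound value, or None if unbound or expired."""
        entry = self._bindings.get(tag)
        if entry is None:
            return None
        value, expires = entry
        if self.clock() >= expires:
            del self._bindings[tag]   # the binding has vanished
            return None
        return value

    def put(self, tag, value):
        """Create a binding, or update the value of a live one.

        The expiry time is fixed when the binding is created, so an
        update never extends the lifetime.
        """
        now = self.clock()
        entry = self._bindings.get(tag)
        if entry is not None and now < entry[1]:
            self._bindings[tag] = (value, entry[1])
        else:
            self._bindings[tag] = (value, now + self.lifetime)
```

Because nothing is ever refreshed or explicitly freed, the store needs no setup, teardown, or garbage-collection signaling; expired entries are simply dropped when next touched.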

17
Ephemeral State Processing
  • 2) Set of packet-borne instructions
  • Fixed-length computations (one per packet)
  • create/update bindings
  • update fields in packet payload (only)
  • operands control behavior
  • On termination, forward or discard packet
  • 3) Wire protocol
  • Instructions are carried in specially marked ESP
    packets recognized and executed hop-by-hop by
    routers on the way to the destination
  • Contain the instruction to execute and its
    parameters
  • Piggy-backed computations are possible

18
ESP Probes
  • Two Common Types of ESP Probes
  • Setup Probe
  • create or modify value bindings for the given ESP
    tags
  • Collect Probe
  • retrieve values associated with the given ESP tags

19
Example: Nack Suppression
1. Sender multicasts packet w/ seq N
20
Example: Nack Suppression
21
Example: Nack Suppression
Count-filter (CF) instruction
Operands: counter tag c, threshold value thr
if (c is unbound) { c = 1; fwd } else if (c < pkt.thr) { c++; fwd } else discard
State shown in figure: (N,1); (N,1); (N,1)
2. Receivers who did not receive pkt N send
Nack with piggybacked ESP instruction CF
Example: Nack Suppression
Count-filter (CF) instruction
Operands: counter tag c, threshold value thr
if (c is unbound) { c = 1; fwd } else if (c < pkt.thr) { c++; fwd } else discard
State shown in figure: (N,1); (N,1); (N,1); (N,2); (N,2)
23
Example: Nack Suppression
Count-filter (CF) instruction
Operands: counter tag c, threshold value thr
if (c is unbound) { c = 1; fwd } else if (c < pkt.thr) { c++; fwd } else discard
State shown in figure: (N,1); (N,1); (N,1); (N,2); (N,2); (N,2)
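
As a rough sketch, the count-filter logic at a single router could look like the following. A plain dict stands in for the ephemeral state store (expiry is left out), and the function name and return convention are assumptions made for illustration.

```python
def count_filter(store, tag, thr):
    """Run CF at one router: True means forward, False means discard."""
    count = store.get(tag)
    if count is None:        # first packet carrying this tag
        store[tag] = 1
        return True
    if count < thr:          # still under the threshold
        store[tag] = count + 1
        return True
    return False             # threshold reached: suppress


# Four Nacks for sequence number N arrive at the same router, thr = 2.
store = {}
decisions = [count_filter(store, tag="N", thr=2) for _ in range(4)]
```

With a threshold of 2, the first two Nacks are forwarded and the rest are discarded, so the binding moves from (N,1) to (N,2) and then stays there.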
24
Another Example: Finding Slowest Receiver, Phase 1
COUNTCH instruction
if (c is bound) { c++; discard } else { c = 1; forward }
[Figure: network with nodes S, r1, r2, r3, A, B, C, D, E]
Stimulus
Send COUNTCH, use tag c
Time 0
25
Example: Finding Slowest Receiver, Phase 1
COUNTCH instruction
if (c is bound) { c++; discard } else { c = 1; forward }
[Figure: network with nodes S, r1, r2, r3, A, B, C, D, E]
Send COUNTCH, use tag c
Time 1
26
Example: Finding Slowest Receiver, Phase 1
COUNTCH instruction
if (c is bound) { c++; discard } else { c = 1; forward }
[Figure: network with nodes S, r1, r2, r3, A, B, C, D, E]
Send COUNTCH, use tag c
Time 2
27
Example: Finding Slowest Receiver, Phase 1
COUNTCH instruction
if (c is bound) { c++; discard } else { c = 1; forward }
[Figure: network with nodes S, r1, r2, r3, A, B, C, D, E]
State shown in figure: (c,1)
Send COUNTCH, use tag c
Time 3
28
Example: Finding Slowest Receiver, Phase 1
COUNTCH instruction
if (c is bound) { c++; discard } else { c = 1; forward }
[Figure: network with nodes S, r1, r2, r3, A, B, C, D, E]
State shown in figure: (c,1); (c,1); (c,1)
Send COUNTCH, use tag c
Time 4
29
Example: Finding Slowest Receiver, Phase 1
COUNTCH instruction
if (c is bound) { c++; discard } else { c = 1; forward }
[Figure: network with nodes S, r1, r2, r3, A, B, C, D, E]
State shown in figure: (c,1); (c,1); (c,2); (c,1)
Send COUNTCH, use tag c
Time 5
30
Example: Finding Slowest Receiver, Phase 1
COUNTCH instruction
if (c is bound) { c++; discard } else { c = 1; forward }
[Figure: network with nodes S, r1, r2, r3, A, B, C, D, E]
State shown in figure: (c,2); (c,1); (c,2); (c,2)
Send COUNTCH, use tag c
Time 6
31
Example: Finding Slowest Receiver, Phase 1
COUNTCH instruction
if (c is bound) { c++; discard } else { c = 1; forward }
[Figure: network with nodes S, r1, r2, r3, A, B, C, D, E]
State shown in figure: (c,3); (c,1); (c,2); (c,2)
Send COUNTCH, use tag c
Time 7
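
Phase 1 can be simulated with a short sketch. The tree used here (S, r1, r2, r3 in a chain, with B attached to r1, E to r2, and A, C, D to r3) is an assumption: the figure's layout is not recoverable from the transcript, and this topology was chosen only because it yields the same set of counts as the final Phase 1 slide. The function names are hypothetical, and state expiry is ignored.

```python
def countch(store, tag):
    """COUNTCH at one node: count arrivals, forward only the first."""
    if tag in store:
        store[tag] += 1
        return False       # discard
    store[tag] = 1
    return True            # forward toward the sender


def run_phase1(parent, receivers, tag="c"):
    """Each receiver sends one COUNTCH packet up the tree toward S."""
    stores = {node: {} for node in set(parent.values())}
    for receiver in receivers:
        node = parent[receiver]
        while node is not None and countch(stores[node], tag):
            node = parent.get(node)   # None once past the root
    return {n: s[tag] for n, s in stores.items() if tag in s}


# Assumed topology: child -> parent.
parent = {"A": "r3", "C": "r3", "D": "r3", "E": "r2", "B": "r1",
          "r3": "r2", "r2": "r1", "r1": "S"}
counts = run_phase1(parent, receivers=["A", "C", "D", "E", "B"])
```

After the run, each node's counter holds the number of children that reported through it, which is exactly what Phase 2 needs in order to know when all values have arrived.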
32
Example: Finding Slowest Receiver, Phase 2
COLLECT instruction
v = min(v, pkt.v); if (--c == 0) { pkt.v = v; forward }
[Figure: network with nodes S, r1, r2, r3, A, B, C, D, E]
State shown in figure: (c,3); (c,1); (c,2); (c,2)
Send COLLECT, use tags c, v
Time 8
33
Example: Finding Slowest Receiver, Phase 2
COLLECT instruction
v = min(v, pkt.v); if (--c == 0) { pkt.v = v; forward }
[Figure: network with nodes S, r1, r2, r3, A, B, C, D, E]
State shown in figure: (c,3); (c,1); (c,2); (c,2)
Send COLLECT, use tags c, v
Time 9
34
Example: Finding Slowest Receiver, Phase 2
COLLECT instruction
v = min(v, pkt.v); if (--c == 0) { pkt.v = v; forward }
[Figure: network with nodes S, r1, r2, r3, A, B, C, D, E]
State shown in figure: (c,3); (c,1); (c,2); (c,2)
Send COLLECT, use tags c, v
Time 10
35
Example: Finding Slowest Receiver, Phase 2
COLLECT instruction
v = min(v, pkt.v); if (--c == 0) { pkt.v = v; forward }
[Figure: network with nodes S, r1, r2, r3, A, B, C, D, E]
State shown in figure: (c,3); (c,1); (c,1) (v,3); (c,2)
Send COLLECT, use tags c, v
Time 11
36
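Example: Finding Slowest Receiver, Phase 2
COLLECT instruction
v = min(v, pkt.v); if (--c == 0) { pkt.v = v; forward }
[Figure: network with nodes S, r1, r2, r3, A, B, C, D, E]
Value shown in figure: 5
State shown in figure: (c,3); (c,1); (c,1) (v,3); (c,1) (v,2)
Time 12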
37
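Example: Finding Slowest Receiver, Phase 2
COLLECT instruction
v = min(v, pkt.v); if (--c == 0) { pkt.v = v; forward }
[Figure: network with nodes S, r1, r2, r3, A, B, C, D, E]
Value shown in figure: 2
State shown in figure: (c,2) (v,5); (c,1); (c,1) (v,3); (c,1) (v,2)
Time 13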
38
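Example: Finding Slowest Receiver, Phase 2
COLLECT instruction
v = min(v, pkt.v); if (--c == 0) { pkt.v = v; forward }
[Figure: network with nodes S, r1, r2, r3, A, B, C, D, E]
Value shown in figure: 4
State shown in figure: (c,1) (v,2); (c,1); (c,1) (v,3); (c,1) (v,2)
Time 14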
39
Example: Finding Slowest Receiver, Phase 2
COLLECT instruction
v = min(v, pkt.v); if (--c == 0) { pkt.v = v; forward }
[Figure: network with nodes S, r1, r2, r3, A, B, C, D, E]
State shown in figure: (c,0) (v,2); (c,1); (c,1) (v,3); (c,1) (v,2)
Time 15
40
Example: Finding Slowest Receiver, Phase 2
COLLECT instruction
v = min(v, pkt.v); if (--c == 0) { pkt.v = v; forward }
[Figure: network with nodes S, r1, r2, r3, A, B, C, D, E]
State shown in figure: (c,0) (v,2); (c,1); (c,1) (v,3); (c,0) (v,2)
Time 16
41
Example: Finding Slowest Receiver, Phase 2
COLLECT instruction
v = min(v, pkt.v); if (--c == 0) { pkt.v = v; forward }
[Figure: network with nodes S, r1, r2, r3, A, B, C, D, E]
State shown in figure: (c,0) (v,2); (c,1); (c,0) (v,2); (c,0) (v,2)
Time 17
42
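Example: Finding Slowest Receiver, Result
COLLECT instruction
v = min(v, pkt.v); if (--c == 0) { pkt.v = v; forward }
[Figure: network with nodes S, r1, r2, r3, A, B, C, D, E]
Value shown in figure: 2
State shown in figure: (c,0) (v,2); (c,0) (v,2); (c,0) (v,2); (c,0) (v,2)
Time 18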
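
The COLLECT step at a single node can be sketched as follows. The dict-based store, the function name, and the choice to discard when the counter is missing (for example, because it has expired) are assumptions for illustration. The input values 5, 2, 4 are the ones that arrive at the node holding (c,3) in the slides.

```python
def collect(store, pkt_v, c_tag="c", v_tag="v"):
    """COLLECT at one node.

    Returns the value to forward upstream once the last child has
    reported, or None if the packet is absorbed at this node.
    """
    if c_tag not in store:      # no child count (assumed: discard)
        return None
    v = min(store.get(v_tag, pkt_v), pkt_v)
    store[v_tag] = v
    store[c_tag] -= 1
    if store[c_tag] == 0:
        return v                # pkt.v = v; forward
    return None


# A node that counted three children in Phase 1 receives 5, 2, 4.
store = {"c": 3}
forwarded = [collect(store, value) for value in (5, 2, 4)]
```

Only the third packet is forwarded, carrying the minimum (2), so each node sends exactly one packet upstream no matter how many children it has.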
43
Wire Protocol
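[Figure: ESP packet formats]
Standalone: IP (RA, Proto = ESP) carrying Flow ID, Op Code, Loc, Err, Operands
Piggyback: ESP fields carried together with another protocol, e.g. RTP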
  • Location field specifies where processing occurs
  • (and where state is stored)
  • Input port
  • Output port
  • Both
  • Neither (aborted instruction)
  • Centralized context
  • Flow ID sorts packets for serial execution
  • Error field carries error code from exceptions
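
A minimal sketch of how the header fields above might be modeled. Field widths and field order are not given in the slides, so plain Python types are used; the class names and the `contexts` helper are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Loc(Enum):
    INPUT = "input port"
    OUTPUT = "output port"
    BOTH = "both"
    NEITHER = "neither"            # aborted instruction
    CENTRAL = "centralized context"


@dataclass
class EspHeader:
    flow_id: int      # sorts packets for serial execution
    opcode: int       # which ESP instruction to run
    loc: Loc          # where processing occurs and state is stored
    err: int          # error code from exceptions
    operands: tuple   # instruction parameters, e.g. tags and values


def contexts(header):
    """List the router contexts that run this instruction, in order."""
    return {
        Loc.INPUT: ["input"],
        Loc.OUTPUT: ["output"],
        Loc.BOTH: ["input", "output"],
        Loc.NEITHER: [],
        Loc.CENTRAL: ["central"],
    }[header.loc]


hdr = EspHeader(flow_id=7, opcode=1, loc=Loc.BOTH, err=0, operands=("c", 2))
```

Here `contexts(hdr)` lists the input context followed by the output context, while a header marked `Loc.NEITHER` is executed nowhere and is simply forwarded.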

44
Input/Output/Central Processing Contexts
Switch Fabric
ESP
Output Context
Normal IP Input Processing
Normal IP Output Processing
45
Input/Output/Central Processing Contexts
Switch Fabric
ESP
Output Context
Normal IP Input Processing
Normal IP Output Processing
46
Input/Output/Central Processing Contexts
Switch Fabric
ESP
Output Context
Normal IP Input Processing
Input Context
Normal IP Output Processing
47
Input/Output/Central Processing Contexts
Switch Fabric
ESP
Output Context
Normal IP Input Processing
Input Context
Normal IP Output Processing
48
Input/Output/Central Processing Contexts
Switch Fabric
ESP
Output Context
Normal IP Input Processing
Input Context
Normal IP Output Processing
Both Contexts
49
Input/Output/Central Processing Contexts
Switch Fabric
ESP
Output Context
Normal IP Input Processing
Input Context
Normal IP Output Processing
Both Contexts
Central Context
50
Engineering ESP
  • Tag, value sizes
  • 64 bits yields acceptably low collision
    probabilities
  • Store capacity is independent of tag size
  • Setting store lifetime τ
  • Store capacity maximized by minimizing τ
  • For scalability: minimize τ
  • But need to be able to complete useful
    computations
  • For robustness: maximize τ
  • 10 seconds (enough for 2-3 round-trips through
    the network)
  • Challenges
  • Wire-speed implementation
  • Minimize store access time
  • Minimize store cost

51
Probability of Tag Collision (64 bits)
[Figure: plot of tag-collision probability for 64-bit tags; axis tick labels run from 18 to 30]
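
To get a feel for the magnitudes, the standard birthday approximation can be evaluated for randomly chosen 64-bit tags. This is a sketch under assumptions: the plot's exact definition of collision probability is not recoverable from the transcript, and treating the axis ticks as log2 of the number of live tags is only a guess.

```python
import math


def collision_probability(n_tags, tag_bits=64):
    """Birthday approximation: chance that any two of n random tags match."""
    space = 2.0 ** tag_bits
    # 1 - exp(-n(n-1) / (2 * space)), computed stably for tiny values
    return -math.expm1(-n_tags * (n_tags - 1) / (2.0 * space))


p_small = collision_probability(2 ** 20)   # about a million live tags
p_large = collision_probability(2 ** 30)   # about a billion live tags
```

Under this approximation a million simultaneous tags collide with probability of roughly 3 in 100 million, and even a billion tags stay near 3 percent.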
52
Prototype Status
  • FPGA
  • Microcoded processor, small ESS
  • Extendable ESP instruction set
  • Non-pipelined proof-of-concept
  • Design runs full speed on 100 MHz Virtex chip
  • ESS: 6-cycle access time
  • Network Processor
  • ESP/ESS running on StrongARM
  • Moving to μEngines
  • Software ESS
  • 2 μsec access time

53
H/W Ephemeral State Processor
Input Packet
Output Packet
Packet Register
Input Control
Output Control
Macro Controller
μ Instruction Store
Ephemeral State Store
μ Controller
μ Instruction Reg
Tag Registers
Value Registers
Location Registers
ALU
54
Network Processor Implementation
Router
  • Per-port ESP facility
  • Transparent to router
  • Intel BridalVeil IXP1200 board
  • 8 × 100M Ethernet ports
  • ESP/ESS running on StrongARM core
  • 6 microengines, four threads each
  • SRAM, DRAM
  • Moving to μEngines
  • Software ESS
  • 2 μsec access time

55
Leveraging ESP
  • Observation
  • Application-specific processing often only needed
    at a few nodes
  • Idea
  • Use ESP to identify where processing needs to
    occur
  • Deploy functionality directly from the end-systems

56
Lightweight Packet Processing Modules
  • Simple, pre-defined functions in routers
  • Enabled by end systems via signaling
  • Signaling protocol identifies
  • the functionality to be enabled
  • the parameters to be used by the function
  • the packets to apply the processing to
    (classifier)
  • a timeout value
  • Signaling independent of forwarding (not
    hop-by-hop)
  • Note: can use direct point-to-point
    authentication

57
dup(): An Example LWP Module
  • dup() - a simple duplication function
  • snoop specially marked packets (the signaling
    protocol identifies which packets)
  • duplicate the packet
  • change source and destination in the new
    packet's IP header (destination specified by the
    signaling protocol)
  • forward the new packet
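
The steps above can be sketched as a small function. The `Packet` type and its fields are hypothetical, and two details are assumptions the slide does not spell out: the copy's source is set to the router's own address, and the original packet keeps travelling to its original destination.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    marked: bool      # True if the classifier selected this packet
    payload: bytes


def dup(pkt, router_addr, new_dst):
    """Return the packets to forward after dup() has seen `pkt`."""
    if not pkt.marked:
        return [pkt]                          # not ours: pass through
    copy = replace(pkt, src=router_addr, dst=new_dst)
    return [pkt, copy]                        # original plus readdressed copy


out = dup(Packet("S", "r3", True, b"data"), router_addr="r1", new_dst="B")
```

The function does a fixed amount of work per packet and keeps no per-packet state, which is what makes it a candidate for a pre-defined router module.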

58
Existing Multicast Services
  • IP multicast
  • Transmit a single packet that is delivered to
    multiple destinations
  • Advantages
  • Single address abstraction
  • Bandwidth savings
  • Scalability - (anonymity/best-effort)
  • Drawbacks
  • Network defines the abstraction (group membership
    and topology)
  • Protocol Heterogeneity

59
Existing Multicast Services
  • Application-Level Multicast
  • Requires no network-level support
  • End-systems construct overlay networks to connect
    group members
  • Advantages
  • Membership and topology controlled by app
  • Provides multicast service everywhere there is
    unicast service
  • Drawbacks
  • Not particularly efficient
  • Not scalable

60
New Building Block Services
  • Our goal: achieve the best of both
  • Greater flexibility and control in developing
    network services for applications
  • Performance and scalability similar to that of
    the network-based solutions
  • Use simple building block services that give
    applications very limited control over the
    network to
  • invoke lightweight packet processing modules at
    routers
  • initiate ephemeral state probes that compute or
    gather information about the network

61
An Example Multicast Implementation
  • Sender maintains the tree topology
  • For multicast delivery
  • Sender activates dup()s at each branch point
  • Data transmitted hop-by-hop

dup→X() -- dup() that copies data to X
[Figure: multicast tree with nodes S, r1, r2, r3, A, B, C, D, E]
dup() functions shown: dup→A() dup→D() dup→C(); dup→B() dup→r3()
Sender's table (BP: children): r1: r3, B; r3: A, C, D
62
An Example Multicast Implementation
  • Q: How does the sender know where to insert the
    dup() function?
  • Caveat: Don't want to know network topology
  • A: Through Ephemeral State Processing (ESP)

dup→X() -- dup() that copies data to X
[Figure: multicast tree with nodes S, r1, r2, r3, A, B, C, D, E]
dup() functions shown: dup→A() dup→D() dup→C(); dup→B() dup→r3()
Sender's table (BP: children): r1: r3, B; r3: A, C, D
63
Tree Construction
  • Finding the new Branch Point (BP)
  • The sender multicasts a Setup ESP Probe

[Figure: multicast tree with nodes S, r1, r2, r3, A, B, C, D, E]
dup() functions shown: dup→A() dup→D() dup→C(); dup→B() dup→r3()
State shown in figure: (next, r3)
Setup Probe: if no dup(), next = pkt.dst
64
Tree Construction
  • Pinpoint the new BP
  • unicast to the new receiver (E)

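[Figure: multicast tree with nodes S, r1, r2, r3, A, B, C, D, E]
dup() functions shown: dup→A() dup→D() dup→C(); dup→B() dup→r3()
State shown in figure: (next, r3)
Packet fields shown: (pkt.best, r1) (pkt.next, null); (pkt.best, null) (pkt.next, null)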
65
Tree Construction
  • Pinpoint the new BP
  • unicast to the new receiver (E)

[Figure: multicast tree with nodes S, r1, r2, r3, A, B, C, D, E]
dup() functions shown: dup→A() dup→D() dup→C(); dup→B() dup→r3()
State shown in figure: (next, r3)
66
Tree Construction
  • Pinpoint the new BP
  • unicast to the new receiver (E)

[Figure: multicast tree with nodes S, r1, r2, r3, A, B, C, D, E]
dup() functions shown: dup→A() dup→D() dup→C(); dup→B() dup→r3()
State shown in figure: (next, r3)
67
A Sender-Managed Multicast Tree
  • Sender updates topology

[Figure: multicast tree with nodes S, r1, r2, r3, A, B, C, D, E]
dup() functions shown: dup→A() dup→D() dup→C(); dup→r3() dup→E(); dup→B() dup→r2()
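
The sender's bookkeeping across these slides can be sketched as a table from branch point to children. The table contents before and after the update are read off the dup() labels in the slides; the function names and the dict representation are assumptions for illustration.

```python
def reached(table, node):
    """Receivers reached when a packet arrives at `node` and dup()s fan out."""
    children = table.get(node)
    if children is None:
        return [node]                 # not a branch point: an end receiver
    out = []
    for child in children:
        out.extend(reached(table, child))
    return out


def add_branch_point(table, parent_bp, old_child, new_bp, new_receiver):
    """Splice a new branch point between `parent_bp` and `old_child`."""
    table[parent_bp] = [new_bp if c == old_child else c
                        for c in table[parent_bp]]
    table[new_bp] = [old_child, new_receiver]


# Sender's view before E joins: r1 -> (r3, B), r3 -> (A, C, D).
table = {"r1": ["r3", "B"], "r3": ["A", "C", "D"]}
before = reached(table, "r1")

# ESP probes pinpoint r2 as the new branch point for receiver E.
add_branch_point(table, "r1", "r3", "r2", "E")
after = reached(table, "r1")
```

After the update, r1 duplicates to B and r2, and r2 duplicates to r3 and E, which matches the dup() labels on the final slide. The sender only tracks branch points and receivers, not the full network topology.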