VROOM: Virtual ROuters On the Move

1
VROOM: Virtual ROuters On the Move
  • Yi Wang (Princeton)

With Eric Keller (Princeton), Brian Biskeborn (Princeton), Kobus van der Merwe (AT&T Labs - Research), and Jennifer Rexford (Princeton)
2
Virtual ROuters On the Move (VROOM)
  • Key idea: routers should be free to roam around
  • Useful for many different applications:
  • Simplify network maintenance
  • Simplify service deployment and evolution
  • Reduce power consumption
  • Feasible in practice:
  • No performance impact on data traffic
  • No visible impact on routing protocols

3
VROOM: The Basic Idea
  • Virtual routers (VRs) form the logical topology

[Figure: virtual routers 1-5 hosted on physical routers and connected by logical links]
4
VROOM: The Basic Idea
  • VR migration does not affect the logical topology

[Figure: a virtual router migrates to a different physical router; the logical topology is unchanged]
5
The Rest of the Talk is Q&A
  • Why is VROOM a good idea?
  • What are the challenges?
  • Or is it just technically trivial?
  • How does VROOM work?
  • The migration process
  • Is VROOM practical?
  • Prototype system
  • Performance evaluation
  • Where to migrate?
  • The scheduling problem
  • Still have questions? Feel free to ask!

6
The Coupling of Logical and Physical
  • Today, the physical and logical configurations of a router are tightly coupled
  • Physical changes break protocol adjacencies and disrupt traffic
  • Logical reconfiguration can reduce the disruption
  • E.g., the "cost-out/cost-in" of IGP link weights
  • But it cannot eliminate the disruption
  • Such events account for over 73% of network maintenance events

7
VROOM Separates the Logical and Physical
  • Make a logical router instance migratable among
    physical nodes
  • All logical configurations/states remain the same
    before/after the migration
  • IP addresses remain the same
  • Routing protocol configurations remain the same
  • Routing-protocol adjacencies stay up
  • No protocol (BGP/IGP) reconvergence
  • Network topology stays intact
  • No disruption to data traffic

8
Case 1: Planned Maintenance
  • Today's best practice: cost-out/cost-in
  • Router reconfiguration triggers protocol reconvergence
  • VROOM
  • NO reconfiguration of VRs, NO reconvergence

[Figure: the VR moves from physical router PR-A to PR-B during maintenance]
11
Case 2: Service Deployment & Evolution
  • Deploy a new service in a controlled test
    network first

[Figure: CE routers attached to a test network slice running alongside the production network]
12
Case 2: Service Deployment & Evolution
  • Roll out the service to the production network
    after it matures
  • VROOM guarantees seamless service to existing
    customers during the roll-out and later evolution

[Figure: the matured service moves from the test network into the production network]
13
Case 3: Power Savings
  • Routers consume a lot of power
  • Millions of routers in the U.S.
  • Electricity bill: hundreds of millions of dollars per year

[Chart omitted. Source: National Technical Information Service, Department of Commerce, 2000; figures for 2005-2010 are projections.]
14
Case 3: Power Savings
  • Observation: traffic follows a diurnal pattern
  • Idea: contract and expand the physical network according to the traffic demand

15
Case 3: Power Savings
Dynamically contract and expand the physical network over the day - 3PM
16
Case 3: Power Savings
Dynamically contract and expand the physical network over the day - 9PM
17
Case 3: Power Savings
Dynamically contract and expand the physical network over the day - 4AM
18
Virtual Router Migration: the Challenges
  • Migrate an entire virtual router instance
  • All control-plane and data-plane processes/states
  • Minimize disruption
  • Data plane: up to millions of packets per second
  • Control plane: less stringent (routing-message retransmission tolerates brief gaps)
  • Migrate links

19
Outline
  • Why is VROOM a good idea?
  • What are the challenges?
  • How does VROOM work?
  • The migration enablers
  • The migration process
  • What needs to be migrated?
  • How? (in order to minimize disruption)
  • Is VROOM practical?
  • Where to migrate?

20
VROOM Architecture
  • Three enablers that make VR migration possible
  • Router virtualization
  • Control and data plane separation
  • Dynamic interface binding

21
A Naive Migration Process
  • Freeze the virtual router
  • Copy states
  • Restart
  • Migrate links
  • Practically unacceptable
  • Packet forwarding should not stop during migration

22
VROOM's Migration Process
  • Key idea: separate the migration of the control and data planes
  • No data-plane interruption
  • Low control-plane interruption
  • Control-plane migration
  • Data-plane cloning
  • Link migration

23
Control-Plane Migration
  • Two things to be copied
  • Router image: binaries, configuration files, etc.
  • Memory
  • 1st stage: pre-copy
  • 2nd stage: stall-and-copy (while the control plane is frozen); both stages are sketched in the code below

[Timeline: (1) router-image copy, then (2) memory copy; pre-copy runs from t1 to t3, stall-and-copy from t3 to t4]
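To make the two-stage copy concrete, here is a minimal Python sketch. Everything in it is invented for illustration (the ToyVR class, page counts, thresholds); OpenVZ's live migration performs the real work internally, and this is not VROOM's code.

```python
import random

class ToyVR:
    """Invented stand-in for a virtual router's control plane."""
    def router_image(self):
        return ["zebra", "ospfd", "bgpd", "vr.conf"]   # binaries, config files
    def snapshot_pages(self):
        return list(range(10_000))                     # all memory pages
    def dirty_pages(self):
        # pages written since the last copy round (randomized for the toy)
        return random.sample(range(10_000), random.randint(0, 400))
    def freeze(self):
        print("control plane frozen (data plane keeps forwarding)")
    def resume_on(self, node):
        print(f"control plane resumed on {node}")

def transfer(node, items):
    print(f"copied {len(items)} items to {node}")

def migrate_control_plane(vr, node, max_rounds=5, stall_threshold=64):
    transfer(node, vr.router_image())      # (1) router-image copy
    transfer(node, vr.snapshot_pages())    # (2) 1st stage: pre-copy everything
    for _ in range(max_rounds):            # re-copy pages dirtied meanwhile
        dirty = vr.dirty_pages()
        if len(dirty) <= stall_threshold:  # residue small enough to stall
            break
        transfer(node, dirty)
    vr.freeze()                            # (2) 2nd stage: stall-and-copy
    transfer(node, vr.dirty_pages())       # final dirty pages, then resume
    vr.resume_on(node)

migrate_control_plane(ToyVR(), "new-node")
```

The point of the pre-copy rounds is to shrink the amount left to copy while frozen, which is what keeps the control-plane downtime short.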
24
Data-Plane Cloning
  • Clone the data plane by repopulation (toy example below)
  • Copying the data-plane state is wasteful, and could be hard
  • Instead, repopulate the new data plane from the migrated control plane
  • The old data plane continues working during migration

[Timeline: (1) router-image copy, (2) memory copy, (3) data-plane cloning, spanning t1 to t5]
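Since the key trick here is rebuilding rather than copying the FIB, a toy illustration may help; plain dicts stand in for the real RIB and FIB structures, which is an assumption of this sketch:

```python
# Data-plane cloning by repopulation: the new FIB is re-derived from the
# migrated control plane's RIB; no FIB state is copied, and the old FIB
# keeps forwarding throughout.
rib = {"10.0.0.0/8": "if0", "172.16.0.0/12": "if1", "192.168.1.0/24": "if2"}

old_fib = dict(rib)   # still in service on the old node during migration
new_fib = {}

for prefix, next_hop in rib.items():   # repopulate, entry by entry
    new_fib[prefix] = next_hop

assert new_fib == old_fib              # both data planes now forward identically
```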
25
Remote Control Plane
  • The migrated control plane plays two roles
  • Act as a remote control plane for the old data
    plane
  • Populate the new data plane

[Timeline: between memory copy and the end of cloning (t4 to t5), the control plane on the new node remotely controls the old node's data plane while populating the new one]
26
Keep the Control Plane Online
  • Data-plane cloning takes time
  • Around 110 µs per FIB-entry update (on a high-end router)
  • Installing 250k routes could then take over 20 seconds (250,000 x 110 µs ≈ 27.5 s)
  • The control plane needs connectivity during this period
  • Redirect its routing messages through tunnels

P. Francois et al., "Achieving sub-second IGP convergence in large IP networks," ACM SIGCOMM CCR, vol. 35, no. 3, 2005.
27
Double Data Planes
  • At the end of data-plane cloning, two data planes
    are ready to forward traffic (i.e., double data
    planes)

[Timeline: (0) tunnel setup, (1) router-image copy, (2) memory copy, (3) data-plane cloning, (4) asynchronous link migration; from t5 to t6 the old and new data planes both forward traffic (the double data plane)]
28
Asynchronous Link Migration
  • With the double data planes, each link can be migrated independently
  • Eliminates the need for a synchronized switchover (see the sketch below)

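The following sketch shows why no synchronization system is needed: once both FIBs are equivalent, a packet entering either node is forwarded correctly, so links can cut over one at a time in any order. switch_link is an invented stand-in for a transport-network reconfiguration:

```python
def switch_link(link, node):
    print(f"{link} now terminates at {node}")

def migrate_links(links, old_node="old", new_node="new"):
    serving = {link: old_node for link in links}
    for link in links:                # any order, one link at a time
        switch_link(link, new_node)
        serving[link] = new_node
        remaining = sum(n == old_node for n in serving.values())
        # traffic on the remaining links still enters the old data plane,
        # which forwards it correctly until those links migrate too
        print(f"  {remaining} link(s) still on {old_node}")

migrate_links(["link-to-VR1", "link-to-VR2", "link-to-VR4"])
```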
29
Outline
  • Why is VROOM a good idea?
  • What are the challenges?
  • How does VROOM work?
  • Is VROOM practical?
  • Prototype system
  • Performance evaluation
  • Where to migrate?

30
Prototype Implementation
  • PC + OpenVZ
  • OpenVZ: OS-level virtualization
  • Lighter-weight than full virtualization
  • Supports live migration
  • Two prototypes
  • Software-based data plane (SD): Linux kernel
  • Hardware-based data plane (HD): NetFPGA
  • NetFPGA: a 4-port gigabit Ethernet PCI card with an FPGA
  • Why two prototypes? To validate the data-plane hypervisor design (e.g., migration between SD and HD)

31
The Out-of-the-box OpenVZ Approach
  • Packets are forwarded inside each VE
  • When a VE is being migrated, packets are dropped

32
Control and Data Plane Separation
  • Move the FIBs out of the VEs
  • shadowd in each VE, pushing down route updates
  • virtd in VE0, as the data-plane hypervisor

33
Dynamic Interface Binding
  • bindd provides two types of bindings
  • Map substrate interfaces to the right FIB
  • Map substrate interfaces to the right virtual interfaces
  • A sketch of the shadowd/virtd/bindd split follows

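Here is a compact sketch of the split described on the last two slides. The class names mirror the daemon roles (shadowd, virtd, bindd) but every interface is invented for illustration; this is not the prototype's API:

```python
class Virtd:
    """Data-plane hypervisor in VE0: owns one FIB per virtual router."""
    def __init__(self):
        self.fibs = {}        # vr_id -> {prefix: next_hop}
        self.bindings = {}    # substrate interface -> (vr_id, virtual interface)

    def install_route(self, vr_id, prefix, next_hop):
        self.fibs.setdefault(vr_id, {})[prefix] = next_hop

    def bind(self, substrate_if, vr_id, virtual_if):
        # bindd's two mappings: substrate if -> FIB, substrate if -> virtual if
        self.bindings[substrate_if] = (vr_id, virtual_if)

class Shadowd:
    """Runs inside each VE; pushes the routing daemon's updates down to virtd."""
    def __init__(self, vr_id, virtd):
        self.vr_id, self.virtd = vr_id, virtd

    def on_route_update(self, prefix, next_hop):
        self.virtd.install_route(self.vr_id, prefix, next_hop)

virtd = Virtd()
virtd.bind("eth0", vr_id=5, virtual_if="ve5-if0")
Shadowd(5, virtd).on_route_update("10.0.0.0/8", "ve5-if0")
print(virtd.fibs)   # {5: {'10.0.0.0/8': 've5-if0'}}
```

Moving the FIBs out of the VEs is what lets a VE migrate while its packets keep being forwarded from VE0.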
34
Putting It All Together: Realizing Migration
1. The migration program notifies shadowd about
the completion of the control plane migration
35
Putting It All Together: Realizing Migration
2. shadowd requests zebra to resend all the
routes, and pushes them down to virtd
36
Putting It All Together: Realizing Migration
3. virtd installs routes into the new FIB, while continuing to update the old FIB
37
Putting It All Together: Realizing Migration
4. virtd notifies the migration program to start link migration once it finishes populating the new FIB
5. After link migration is completed, the migration program notifies virtd to stop updating the old FIB
(A sketch of the whole five-step sequence follows.)
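Read as one coordination sequence, the five steps look like the sketch below. Every actor is reduced to a printing stub so the control flow runs; none of these method names are VROOM's real interfaces:

```python
class Stub:
    """Prints each call; the real actors are separate daemons and programs."""
    def __init__(self, name):
        self.name = name
    def __getattr__(self, method):
        return lambda *args, **kw: print(f"{self.name}.{method}{args}") or []

def realize_migration(migration_prog, shadowd, zebra, virtd):
    shadowd.cp_migration_done()           # 1. control-plane migration finished
    routes = zebra.resend_all_routes()    # 2. zebra re-sends all routes
    for r in routes:                      # 3. install into the new FIB while
        virtd.install(r, fib="new")       #    continuing to update the old one
        virtd.install(r, fib="old")
    migration_prog.migrate_links()        # 4. new FIB populated: migrate links
    virtd.stop_updating(fib="old")        # 5. links moved: retire the old FIB

realize_migration(Stub("migrationd"), Stub("shadowd"), Stub("zebra"), Stub("virtd"))
```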
38
Evaluation
  • Answer three questions
  • Performance of individual migration steps?
  • Impact on data traffic?
  • Impact on routing protocol?
  • Experiments on Emulab

39
Performance of Migration Steps
  • Memory copy time
  • With different numbers of routes (dump file
    sizes)

40
Performance of Migration Steps
  • FIB population time grows linearly with the number of route entries
  • Installing a FIB entry into NetFPGA: 7.4 microseconds
  • Installing a FIB entry into the Linux kernel: 1.94 milliseconds
  • FIB update time: the time for virtd to install entries into the FIB
  • Total time: FIB update time + the time for shadowd to send routes to virtd
  • A back-of-envelope check follows

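As a back-of-envelope check of those per-entry figures (taken from this slide, not re-measured), and assuming the linear growth stated above:

```python
ROUTES = 250_000                 # full-table size used in the evaluation
NETFPGA_PER_ENTRY = 7.4e-6       # seconds per FIB entry (slide figure)
LINUX_PER_ENTRY = 1.94e-3        # seconds per FIB entry (slide figure)

print(f"NetFPGA FIB population: {ROUTES * NETFPGA_PER_ENTRY:.1f} s")  # ~1.9 s
print(f"Linux FIB population:   {ROUTES * LINUX_PER_ENTRY:.1f} s")    # ~485 s
```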
41
Data Plane Impact
  • The diamond testbed
  • 64-byte UDP packets, round-trip traffic

42
Data Plane Impact
  • HD router with separate migration bandwidth
  • No delay increase or packet loss
  • SD router with separate migration bandwidth
  • Up to 3.7% delay increase at 5k packets/s
  • Less than 0.4% delay increase at 25k packets/s

[Plot: delay of the SD router at 5k packets/s]
43
The Importance of Separate Migration Bandwidth
  • The dumbbell testbed
  • 250k routes in the RIB

44
Separate Migration Bandwidth is Important
  • Throughput of the migration traffic

45
Separate Migration Bandwidth is Important
  • Delay increase of the data traffic

46
Separate Migration Bandwidth is Important
  • Loss rate of the data traffic

47
Control Plane Impact
  • The Abilene testbed
  • Assume a backbone running MPLS
  • VR5 configured as
  • Core router (running OSPF only)
  • Edge router (running OSPF + BGP)

48
Core Router Migration
  • No events during migration
  • Average control-plane downtime: 0.972 seconds (0.924 - 1.008 seconds in 10 runs)
  • Supports a 1-second OSPF hello-interval (with a 4-second dead-interval)
  • At most one hello message is missed, so adjacencies stay up

49
Core Router Migration
  • Events happen during migration
  • Events (LSAs) introduced by flapping link VR2-VR3
  • At most one LSA is missed
  • The missed LSA is retransmitted 5 seconds later (the default LSA retransmission-interval)
  • A smaller LSA retransmission-interval (e.g., 1 second) can be used

50
Edge Router Migration
  • 255k BGP routes + OSPF
  • Dump file size grows from 3.2MB to 76.0MB
  • Average control-plane downtime: 3.560 seconds (3.484 - 3.594 seconds in 10 runs)
  • Supports a 2-second OSPF hello-interval (with an 8-second dead-interval)
  • BGP sessions stay up
  • In practice, ISPs often use the default values: 10-second hello-interval, 40-second dead-interval

51
Outline
  • Why is VROOM a good idea?
  • What are the challenges?
  • How does VROOM work?
  • Is VROOM practical?
  • Where to migrate?

52
Deciding Where To Migrate
  • Physical constraints
  • Latency: e.g., NYC to Washington D.C. is about 2 msec
  • Link capacity: enough remaining capacity for the extra traffic
  • Platform compatibility: routers from different vendors
  • Router capability: e.g., the number of access control lists (ACLs) supported
  • Good news: these constraints limit the search space

53
Two Optimization Problems
  • For planned maintenance/service deployment
  • Minimize path stretch
  • With constraints on link capacity, platform compatibility, router capability, etc.
  • For power savings
  • Maximize power savings, given regional differences in electricity prices
  • With constraints on path stretch, link capacity, etc.
  • An illustrative formulation of the first problem follows

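For concreteness, the first problem could be written as the placement program below. The notation is invented here (the talk only lists the objective and constraints informally): let x_{vp} = 1 when virtual router v is placed on physical node p, d_x(s,t) the latency between s and t under placement x, and d_0 the latency under the original placement.

```latex
\begin{align*}
\min_{x}\quad & \sum_{(s,t)} \bigl( d_x(s,t) - d_0(s,t) \bigr)
    && \text{total path stretch over traffic pairs}\\
\text{s.t.}\quad & \sum_{p} x_{vp} = 1 && \forall v
    \quad \text{(each VR on exactly one node)}\\
& \mathrm{load}_\ell(x) \le c_\ell && \forall \ell
    \quad \text{(link capacity)}\\
& x_{vp} = 0 \ \text{if } p \text{ cannot host } v
    && \text{(platform compatibility, capability)}\\
& x_{vp} \in \{0,1\}
\end{align*}
```

The power-savings variant swaps the objective for electricity cost saved (weighted by regional prices) and adds the path-stretch bound as a constraint.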
54
Conclusions
  • VROOM offers a useful network-management primitive
  • Breaks the tight coupling between the physical and logical configurations of routers
  • Simplifies network management and enables new applications
  • Live router migration with minimal disruption
  • The data-plane hypervisor enables:
  • Data-plane cloning
  • A remote control plane
  • Double data planes and asynchronous link migration
  • No data-plane disruption
  • No visible control-plane disruption

55
Thanks!
  • Questions? Comments? Please!

56
Backup Slides
57
Packet-aware Access Network
58
Packet-aware Access Network
  • Pseudo-wires (virtual circuits) from CE to PE

[Figure: pseudo-wires between CE and PE. P/G-MSS: Packet-aware/Gateway Multi-Service Switch; MSE: Multi-Service Edge]
59
Events During Migration
  • Network failure during migration
  • The old VR image is not deleted until the
    migration is confirmed successful
  • Routing messages arrive during the migration of
    the control plane
  • BGP: TCP retransmission
  • OSPF: LSA retransmission

60
Flexible Transport Networks
  • Migrate links affixed to the virtual routers
  • Enabled by programmable transport networks
  • Long-haul links are reconfigurable
  • Layer 3 point-to-point links are multi-hop at
    layer 1/2

[Figure: a programmable transport network spanning New York, Chicago, and Washington D.C., built from multi-service optical switches (e.g., Ciena CoreDirector)]
61
Requirements & Enabling Technologies
  • Migrate links affixed to the virtual routers
  • Enabled by programmable transport networks
  • Long-haul links are reconfigurable
  • Layer 3 point-to-point links are multi-hop at
    layer 1/2

[Figure: the same programmable transport network as on the previous slide]
62
Requirements & Enabling Technologies
  • Enable edge router migration
  • Enabled by packet-aware access networks
  • Access links are becoming inherently virtualized
  • Customers connect to provider edge (PE) routers via pseudo-wires (virtual circuits)
  • Physical interfaces on PE routers can be shared
    by multiple customers

[Figure: from a dedicated physical interface per customer to a shared physical interface]
63
Link Migration in Transport Networks
  • With programmable transport networks, long-haul
    links are reconfigurable
  • IP-layer point-to-point links are multi-hop at
    transport layer
  • VROOM leverages this capability in a new way to
    enable link migration

64
Link Migration in Flexible Transport Networks
  • 2. With packet-aware transport networks
  • Logical links share the same physical port
  • Packet-aware access network (pseudo wires)
  • Packet-aware IP transport network (tunnels)

65
Power Consumption of Routers
  • A synthetic large tier-1 ISP backbone
  • 50 POPs (Point-of-Presence)
  • 20 major POPs, each has
  • 6 backbone routers, 6 peering routers, 30 access
    routers
  • 30 smaller POPs, each has
  • 6 access routers

66
Future Work
  • Algorithms that solve the constrained
    optimization problems
  • Control-plane hypervisor to enable cross-vendor
    migration
