Internet Routing (COS 598A) Today: Intradomain Routing Convergence

1
Internet Routing (COS 598A)
Today: Intradomain Routing Convergence
  • Jennifer Rexford
  • http://www.cs.princeton.edu/~jrex/teaching/spring2005
  • Tuesdays/Thursdays 11:00am-12:20pm

2
General Course Stuff
  • Tuesday March 1
  • Lecture by Larry Peterson on PlanetLab
  • No assignment for written reviews of papers
  • I'll be attending a Computing Research
    Association (CRA) Board of Directors meeting in
    D.C.
  • Course projects
  • Good to start thinking up a project idea
  • Make an appointment to chat with me
  • Written report due on Dean's Date, Tue May 10
  • Oral presentations during exam period
  • No formal "leading a class discussion" item

3
Splitting Convergence into Two Lectures
  • One class is just not enough
  • Today on intradomain routing
  • Next Thursday on interdomain routing
  • Impact on reading you already did
  • One paper on intradomain, one on interdomain
  • Though class will focus just on intradomain
  • Impact on reading for Thursday March 3
  • One paper on intradomain, one on interdomain
  • Though class will focus just on interdomain
  • Push off the next topic to following class

4
Outline
  • Routing-protocol convergence
  • Steps in reacting to a link failure
  • Effects on the data packets
  • Intradomain routing protocols
  • Steps in protocol convergence
  • Implementation overhead and timers
  • Operational practices to reduce convergence
  • Multiple shortest paths between routers
  • Costing out during planned maintenance

5
Routing Convergence
  • The only constant is change
  • Equipment failures, or new deployment
  • Routing-protocol configuration changes
  • Planned maintenance on the network
  • Routing protocols adapt
  • Detect the change
  • Propagate messages
  • Compute new routes
  • Update the forwarding tables

6
Converging After a Failure
  • Failure detection
  • Router recognizes an incident link has failed
  • Failure notification
  • Router informs other routers about the change
  • Path re-computation
  • Routers compute new paths avoiding the link
  • Forwarding-table update
  • Routers update their forwarding tables
  • Data traffic starts to flow over the new path

7
Bad Things Happen During Convergence
  • Transient inconsistencies
  • Routers have different views of the network
  • Forwarding decisions may be inconsistent
  • Effects on data traffic
  • Black holes: packet loss
  • Loops: packets going in circles
  • Delay: packets going on very long paths
  • Out-of-order: new packets arrive before old ones
  • Want to minimize convergence delay
  • ... and especially the effects on the data traffic

8
Example: Black Hole Causing Packet Loss
  • Router forwarding to dead link
  • Doesn't know (yet) that the link is dead
  • Or, hasn't computed a new forwarding entry

[Figure: network with source s and destination d]
Fortunately, IP only promises best-effort delivery!

9
Example: Forwarding Loop
  • Set of routers disagree
  • One router acting on old information
  • Another router acting on new information

[Figure: network with source s and destination d]
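
The black-hole and forwarding-loop examples can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the five-router topology (s, a, b, c, d), the per-router tables, and the `trace` helper are all hypothetical, chosen only to reproduce the two transient problems after link b-d fails.

```python
def trace(tables, dead_links, src, dst):
    """Follow next hops from src toward dst.

    tables maps router -> {destination: next hop}.
    dead_links is a set of frozenset({u, v}) links that have failed.
    Returns (outcome, path); outcome is "delivered", "black hole" or "loop".
    """
    path = [src]
    node = src
    while node != dst:
        nxt = tables[node].get(dst)
        # No route, or forwarding onto a dead link: the packet is lost.
        if nxt is None or frozenset((node, nxt)) in dead_links:
            return "black hole", path
        # Revisiting a router means the packet is going in circles.
        if nxt in path:
            return "loop", path + [nxt]
        path.append(nxt)
        node = nxt
    return "delivered", path


# Hypothetical topology: s-a, a-b, b-d, a-c, c-d. Link b-d has just failed.
dead = {frozenset(("b", "d"))}
old = {"s": {"d": "a"}, "a": {"d": "b"}, "b": {"d": "d"}, "c": {"d": "d"}}

# 1. Nobody has reacted yet: b still forwards onto the dead link.
stale = trace(old, dead, "s", "d")

# 2. b has new information (go back through a), a still has old information.
mixed = trace({**old, "b": {"d": "a"}}, dead, "s", "d")

# 3. Both a and b have converged: traffic flows over a-c-d.
converged = trace({**old, "b": {"d": "a"}, "a": {"d": "c"}}, dead, "s", "d")
```

In this sketch `stale` ends in a black hole at b, `mixed` loops between a and b, and `converged` is delivered over s, a, c, d.
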
10
Intradomain Routing Convergence
11
Interior Gateway Protocols (IGPs)
  • Routers running OSPF or IS-IS
  • Flood link-state advertisements (LSAs)
  • Compute shortest paths from link weights
  • Determine next hop to other routers

[Figure: example network topology annotated with link weights]

12
Knowing a Link is Dead: Heartbeats
  • Periodic hello packets (hello_interval, 10 sec)
  • Timeout if not received (dead_interval, 40 sec)
  • Declare failure and flood the info to others
  • Small values lead to faster detection, but also
  • Higher bandwidth consumption for hellos
  • False detection during a congestion interval
  • False detection if router CPU falls a little
    behind

[Figure: routers exchanging hello packets]
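
The timer trade-off can be made concrete with a little arithmetic. This is a rough sketch, assuming hellos are sent at exact multiples of hello_interval, arrive instantly, and that the neighbor declares the link dead once dead_interval has passed since the last hello it received. The function name `detection_delay` is hypothetical.

```python
import math


def detection_delay(fail_time, hello_interval=10.0, dead_interval=40.0):
    """Seconds between the failure and the moment the neighbor declares it.

    Assumes hellos leave at t = 0, hello_interval, 2 * hello_interval, ...
    and that a hello sent at or before fail_time is still received.
    """
    last_hello = math.floor(fail_time / hello_interval) * hello_interval
    return last_hello + dead_interval - fail_time


# With the values on the slide (10 sec, 40 sec) the delay falls between
# dead_interval - hello_interval and dead_interval, i.e. 30 to 40 seconds.
just_after_hello = detection_delay(0.0)    # failure right after a hello
just_before_hello = detection_delay(9.0)   # failure just before the next hello
mid_interval = detection_delay(25.0)

# Smaller timers detect faster but send proportionally more hellos.
fast = detection_delay(2.5, hello_interval=1.0, dead_interval=4.0)
```

Under these assumptions the slide's default timers give a detection delay of roughly 30 to 40 seconds, which is why operators crank the timers down or rely on interface support.
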
13
Knowing the Link is Dead: Interface Support
  • Smart interface hardware
  • Detects loss of connectivity at lower layer
  • Interrupts the router CPU about the failure
  • Common in Packet over SONET technology
  • E.g., Sprint paper sees delays of less than 100 msec
  • But...
  • Some media don't support it (e.g., Ethernet, ATM)
  • ... so, you often need heartbeats anyway
  • Also, want heartbeats to detect failures the
    hardware cannot detect on its own

14
Flooding the Link-State Advertisement
  • After detecting the failure
  • Router sends LSA out each link
  • Each router does the same
  • ... and so on
  • Flooding delay
  • (CPU delay at each hop) * (diameter of the
    network)
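
A minimal sketch of flooding, assuming every router forwards a new LSA to all its neighbors after a fixed per-hop processing delay and drops copies it has already seen. The topology, the 10 msec per-hop delay, and the `flood` function are hypothetical.

```python
from collections import deque


def flood(adjacency, origin, per_hop_delay_ms):
    """Return the time (msec) at which each router first receives the LSA."""
    arrival = {origin: 0}
    queue = deque([origin])
    while queue:
        router = queue.popleft()
        for neighbor in adjacency[router]:
            # A router that already has this LSA ignores the duplicate.
            if neighbor not in arrival:
                arrival[neighbor] = arrival[router] + per_hop_delay_ms
                queue.append(neighbor)
    return arrival


# Hypothetical topology: a-b, b-c, b-e, c-d.
adjacency = {
    "a": ["b"],
    "b": ["a", "c", "e"],
    "c": ["b", "d"],
    "d": ["c"],
    "e": ["b"],
}
arrival = flood(adjacency, "a", per_hop_delay_ms=10)
flooding_delay = max(arrival.values())
```

Here the farthest router is three hops from the origin, so the LSA reaches everyone after 3 * 10 = 30 msec. The network diameter bounds this hop count for any origin.
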

15
Computing the Shortest Paths
  • Each router re-computes
  • Shortest-path tree rooted at this router
  • Determine next-hop to every other router
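
The re-computation step can be sketched with Dijkstra's algorithm, the usual choice for link-state protocols. This is an illustrative sketch, not any router's actual code: the four-router graph and the function name are hypothetical, and ties are broken arbitrarily (a single next hop per destination).

```python
import heapq


def shortest_path_next_hops(graph, root):
    """Dijkstra from root. graph maps router -> {neighbor: link weight}.

    Returns (dist, next_hop): the cost to every router and the first hop
    on the shortest path from root to it.
    """
    dist = {root: 0}
    next_hop = {}
    done = set()
    heap = [(0, root, None)]  # (distance, router, first hop from root)
    while heap:
        d, node, first = heapq.heappop(heap)
        if node in done:
            continue  # stale heap entry
        done.add(node)
        if first is not None:
            next_hop[node] = first
        for nbr, weight in graph[node].items():
            nd = d + weight
            if nbr not in done and nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                # Neighbors of the root are their own first hop.
                heapq.heappush(heap, (nd, nbr, nbr if node == root else first))
    return dist, next_hop


# Hypothetical topology with link weights.
graph = {
    "A": {"B": 1, "C": 4},
    "B": {"A": 1, "C": 2, "D": 5},
    "C": {"A": 4, "B": 2, "D": 1},
    "D": {"B": 5, "C": 1},
}
dist, next_hop = shortest_path_next_hops(graph, "A")
```

In this example router A reaches every destination through B, even though it has a direct (but more expensive) link to C.
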

16
Reducing the Computational Overhead
  • Good system
  • Fast processor
  • High-speed memory
  • Good algorithms
  • Traditional approach computes from scratch
  • Incremental algorithms compute only the changes
  • Especially nice if only one edge changes
  • Pre-computation
  • Pre-compute effects of certain failure scenarios
  • E.g., all single-link or single-router failures

17
Updating the Forwarding Table
  • Forwarding table
  • Map destination prefix to outgoing link(s)
  • Copy of table on each interface card
  • Highly optimized for fast lookups
  • Updating the forwarding table
  • Computing the new forwarding table
  • Making updates to the copy of the line card
  • Important source of delay
  • Sprint end-to-end study: around 1 second
  • AT&T router-level study: 100 msec to 300 msec
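
To illustrate what "map destination prefix to outgoing link(s)" means, here is a small sketch of a forwarding table with longest-prefix-match lookup. It assumes a plain dictionary and a linear scan, which is nothing like the highly optimized structures on a real line card. The class, prefixes, and link names are hypothetical.

```python
import ipaddress


class ForwardingTable:
    """Toy forwarding table: prefix -> list of outgoing links."""

    def __init__(self):
        self.entries = {}

    def update(self, prefix, links):
        # Install or overwrite the entry for this prefix.
        self.entries[ipaddress.ip_network(prefix)] = list(links)

    def lookup(self, address):
        # Longest-prefix match: the most specific covering prefix wins.
        addr = ipaddress.ip_address(address)
        matches = [net for net in self.entries if addr in net]
        if not matches:
            return None
        best = max(matches, key=lambda net: net.prefixlen)
        return self.entries[best]


fib = ForwardingTable()
fib.update("10.0.0.0/8", ["link1"])
fib.update("10.1.0.0/16", ["link2", "link3"])  # two outgoing links
```

A lookup for 10.1.2.3 matches both prefixes and returns the links of the /16 entry. After a routing change, every affected entry has to be rewritten this way in each line card's copy, which is one reason the update step takes time.
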

18
All Together: Looking Inside the Router
[Figure: inside the router. The route processor (CPU) runs the OSPF process (LSA processing, LSA flooding, topology view, SPF calculation, FIB update); each interface card holds a FIB and does forwarding; the cards connect through the switching fabric.]

19
Significance of Protocol Timers
  • Hello and dead intervals
  • Failure-detection delay vs. false diagnosis
  • Pacing the link-state advertisements
  • Combining LSAs vs. longer convergence delay
  • Some routers wait till after re-running Dijkstra!
  • Delaying start of shortest-path computation
  • Reducing computations vs. convergence delay
  • Especially useful if failure affects multiple
    links
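
The trade-off in delaying the shortest-path computation can be sketched as follows. This is a simplified model, assuming a single fixed delay: the first LSA that arrives while no computation is pending schedules one, and every LSA that arrives before it runs is folded into the same run. Real routers use more elaborate (often adaptive) timers. The function name and the arrival times are hypothetical.

```python
def spf_runs(lsa_arrivals_ms, spf_delay_ms):
    """Return the times (msec) at which the router runs SPF."""
    runs = []
    scheduled = None
    for t in sorted(lsa_arrivals_ms):
        # An LSA arriving at or before the pending run is batched into it.
        if scheduled is None or t > scheduled:
            scheduled = t + spf_delay_ms
            runs.append(scheduled)
    return runs


# A failure that affects several links produces a burst of LSAs,
# followed later by an unrelated change.
arrivals = [0, 5, 20, 300]
no_delay = spf_runs(arrivals, spf_delay_ms=0)
with_delay = spf_runs(arrivals, spf_delay_ms=50)
```

Without a delay the router recomputes four times. With a 50 msec delay it recomputes twice, but the first new routes appear 50 msec later.
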

20
Operational Practices
21
Reducing the Effects of Convergence
  • Long convergence delay is bad
  • Transient problems with loss and delay
  • Disruptive for VoIP and online gaming
  • Solution 1: better equipment
  • Interfaces that detect failures automatically
  • Cranking down the values of the timers
  • Faster CPUs and path-computation algorithms
  • Solution 2: network design and operation
  • Improve forwarding-plane convergence
  • Improve convergence during maintenance

22
Equal-Cost Multi-Path (ECMP)
  • Multiple shortest paths
  • Router can compute multiple shortest paths
  • Forwarding table has multiple outgoing links
  • Router splits traffic evenly over the links

[Figure: example network topology annotated with link weights]
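
One common way to split traffic over equal-cost links is to hash each flow onto a link, so that packets of the same flow stay on one path and are not reordered. The slide only says the split is even, so per-flow hashing is an assumption here, and the flow tuple, link names, and `pick_link` helper are hypothetical.

```python
import hashlib


def pick_link(flow, links):
    """Choose an outgoing link for a flow by hashing its identifying fields."""
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    return links[int.from_bytes(digest[:4], "big") % len(links)]


links = ["link_to_a", "link_to_b"]

# (source IP, destination IP, source port, destination port, protocol)
flow = ("10.0.0.1", "10.0.1.9", 40000, 80, "tcp")
choice = pick_link(flow, links)

# Many flows that differ only in source port spread over both links.
used = {
    pick_link(("10.0.0.1", "10.0.1.9", port, 80, "tcp"), links)
    for port in range(40000, 40200)
}
```

The split is even only on average over many flows. A few very large flows can still leave the links unbalanced.
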
23
ECMP Reduces Forwarding-Plane Convergence
  • Suppose one of the outgoing links fails
  • Incident router detects the failure
  • Quick recomputation of paths without this link
  • Local forwarding table updated to use other link
  • Other routers have no forwarding-table change!!!

[Figure: example topology with link weights; only the red router changes its forwarding table]
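
The claim that only the incident router changes its forwarding table can be checked on a small example. This sketch computes the set of equal-cost next hops toward one destination from all-pairs distances (Floyd-Warshall, chosen for brevity, not because routers do it this way). The topology with unit weights and all names are hypothetical, and the result holds for this topology, not for every network.

```python
import itertools

INF = float("inf")


def all_pairs(nodes, weights):
    """Floyd-Warshall over undirected links {(u, v): weight}."""
    dist = {(u, v): (0 if u == v else INF) for u in nodes for v in nodes}
    for (u, v), w in weights.items():
        dist[u, v] = dist[v, u] = w
    for k, i, j in itertools.product(nodes, repeat=3):  # k is the outer loop
        if dist[i, k] + dist[k, j] < dist[i, j]:
            dist[i, j] = dist[i, k] + dist[k, j]
    return dist


def ecmp_next_hops(nodes, weights, dest):
    """Map each router to the set of neighbors on a shortest path to dest."""
    dist = all_pairs(nodes, weights)
    table = {}
    for (u, v), w in weights.items():
        for here, nbr in ((u, v), (v, u)):
            reachable = dist[here, dest] < INF
            if here != dest and reachable and w + dist[nbr, dest] == dist[here, dest]:
                table.setdefault(here, set()).add(nbr)
    return table


# Router r has two equal-cost paths to d: via a and via b.
nodes = ["s", "r", "a", "b", "d"]
weights = {("s", "r"): 1, ("r", "a"): 1, ("r", "b"): 1, ("a", "d"): 1, ("b", "d"): 1}

before = ecmp_next_hops(nodes, weights, "d")
after_failure = {link: w for link, w in weights.items() if link != ("r", "a")}
after = ecmp_next_hops(nodes, after_failure, "d")
changed = {n for n in nodes if before.get(n) != after.get(n)}
```

When link r-a fails, r simply drops a from its next-hop set, and no other router's entry for d changes.
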
24
Exploiting This Observation in Traffic Engineering
  • Traffic engineering
  • Given a topology and a traffic matrix
  • set link weights to control the flow of traffic
  • to minimize some objective function
  • Bias toward solutions with ties
  • Penalize solutions with just one shortest path
  • Favor solutions that lead to multiple paths
  • even if the link loads are a little less
    balanced
  • Applied in some traffic-engineering tools
  • Demand from ISPs buying the tools
  • with customers demanding fast convergence

25
Examples of Planned Failures
  • Upgrades
  • Changing link to higher capacity
  • Loading new operating system on a router
  • Swapping out an old interface card
  • Maintenance
  • Fixing a flaky optical amplifier
  • Configuration changes that require a reboot
  • Cable intrusions
  • Construction activities near a fiber

26
Planned Events Happen Often
  • Sprint study
  • Maintenance window
  • From 10pm to 6am EST, covering east to west
  • Period of low network traffic, so less congestion
  • Not much business-critical traffic
  • Responsible for 50% of intradomain failures
  • Significance
  • Planned events should be easier to handle
  • The operator knows the failure(s) will happen
  • but, how to tell the routing protocol?
  • or, how to prepare the network in advance?

27
Costing Out of Equipment
  • Increase cost of link to high value
  • Triggers immediate flooding of LSAs
  • Leads to new shortest paths avoiding the link
  • ... while the link still exists to forward traffic during
    convergence
  • Then, can safely disconnect the link
  • New flooding of LSAs, but no influence on
    forwarding

[Figure: example network topology annotated with link weights]
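
Costing out can be illustrated on a tiny network. This sketch finds the cheapest path by exhaustive search, which is only reasonable for toy examples. The topology, the weight of 1000 used as the "high value", and the `best_path` helper are hypothetical.

```python
def best_path(weights, src, dst):
    """Cheapest simple path over undirected links {(u, v): weight}."""
    adj = {}
    for (u, v), w in weights.items():
        adj.setdefault(u, {})[v] = w
        adj.setdefault(v, {})[u] = w
    best = (float("inf"), None)
    stack = [(0, [src])]
    while stack:
        cost, path = stack.pop()
        if path[-1] == dst:
            best = min(best, (cost, path))
            continue
        for nbr, w in adj[path[-1]].items():
            if nbr not in path:  # simple paths only
                stack.append((cost + w, path + [nbr]))
    return best


weights = {("s", "a"): 1, ("a", "d"): 1, ("s", "b"): 2, ("b", "d"): 2}
normal = best_path(weights, "s", "d")

# Step 1: cost out link a-d. The link is still up, so packets already
# heading for it are still delivered while routers converge.
costed_out = best_path({**weights, ("a", "d"): 1000}, "s", "d")

# Step 2: disconnect the link. The shortest path does not change again.
removed = {link: w for link, w in weights.items() if link != ("a", "d")}
disconnected = best_path(removed, "s", "d")
```

The path moves from s, a, d to s, b, d when the weight is raised, and stays there when the link is actually removed, so the second round of LSAs has no effect on forwarding.
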
28
Bigger Picture
  • Learn about a planned event
  • E.g., replace optical amplifier
  • Map the event to the IP equipment
  • E.g., find link(s) that traverse the amplifier
  • Increase the weight on each link
  • Slowly, perhaps one at a time to reduce overhead
  • Disable the equipment
  • Disconnect amplifier and replace with new one
  • Reintroduce the links into the network
  • Slowly, change one link weight at a time

29
Even Bigger Picture
  • What if maintenance would cause congestion?
  • Reducing the capacity of the network
  • Link weights not optimized to new topology
  • Compute weight changes to make
  • Re-optimize the setting of the link weights
  • based on the soon-to-be new topology
  • Then, do the maintenance
  • Cost out the IP links
  • Fix/upgrade the equipment
  • Cost in the IP links
  • Then, go back to the old weight setting

30
Project Ideas
  • Multi-path routing
  • Protocols that allow more multi-path routing
  • not just the equal-cost paths (as in ECMP)
  • Maintenance schedules
  • Compute a sequence of weight changes
  • Avoid link congestion in each step
  • Convergence models
  • What actually happens during convergence?
  • Simulation of forwarding-plane behavior
  • Effective pre-computation on routers
  • Routers precompute reactions to certain failures
  • E.g., all single-link failures or single-router
    failures

31
For Next Time, on Tuesday: PlanetLab
  • Guest lecture
  • Professor Larry Peterson
  • Three papers (two short, one regular)
  • A Blueprint for Introducing Disruptive
    Technology into the Internet
  • Overcoming the Internet Impasse through
    Virtualization
  • Operating System Support for Planetary-Scale
    Network Services
  • No written reviews
  • But, be ready to ask hard questions
  • PlanetLab is very useful for course projects

32
Next Thursday: Interdomain Convergence
  • Two papers (intradomain and interdomain)
  • Experience in Black-box OSPF Measurement
  • Route Flap Damping Exacerbates Internet Routing
    Convergence
  • Written reviews
  • Summary
  • Reasons to accept
  • Reasons to reject
  • Avenues for future work
  • Optional
  • NANOG video about the second paper
  • Really great essay on "You and Your Research"