1
Infrastructure-based Resilient Routing
  • Ben Y. Zhao, Ling Huang, Jeremy Stribling,
    Anthony Joseph and John Kubiatowicz
  • University of California, Berkeley
  • Sahara Winter Retreat, 2004

2
Challenges Facing Network Applications
  • Network connectivity is not reliable
  • Disconnections frequent in the wide-area Internet
  • IP-level repair is slow
  • Wide-area BGP: about 3 mins
  • Local-area IS-IS: about 5 seconds
  • Next generation network applications
  • Mostly wide-area
  • Streaming media, VoIP, B2B transactions
  • Low tolerance of delay, jitter and faults
  • Our work: a transparent resilient routing
    infrastructure that adapts to faults not in
    seconds, but in milliseconds

3
Talk Overview
  • Motivation
  • A Structured Overlay Infrastructure
  • Mechanisms and policy
  • Evaluation
  • Summary

4
The Challenge
  • Routing failures are diverse
  • Many causes
  • Router misconfigurations, cut fiber, planned
    downtime, protocol implementation bugs
  • Occur anywhere with local or global impact
  • Single fiber cut can disconnect AS pairs
  • Isolating failures is difficult
  • Wide-area measurement is ongoing research
  • Single event leads to complex inter-protocol
    interactions
  • End user symptoms often dynamic or intermittent
  • Requires
  • Fault detection from multiple distributed vantage
    points
  • In-network decision making necessary for timely
    responses

5
An Infrastructure Approach
  • Our goals
  • Overlay focused on resiliency
  • Route around failures to maintain connectivity
  • Respond in milliseconds (react instantaneously to
    faults)
  • Our approach
  • Large-scale infrastructure for fault and route
    discovery
  • Nodes are observation points (similar to Plato's
    NEWS service)
  • Nodes are also points of traffic
    redirection (forwarding path determination and
    data forwarding)
  • Automated fault detection and circumvention
  • No edge node involvement: fast response time,
    security focused on infrastructure
  • Fully transparent, no application awareness
    necessary

6
An Illustration
Goal: fast fault detection and route-around
Key: on-the-fly in-network traffic redirection
7
Why Structured Overlays
  • Resilient Overlay Networks (MIT)
  • Fully connected mesh
  • Allows each node full knowledge of network
  • Fast, independent calculation of routes
  • Nodes can construct any path, maximum flexibility
  • Cost of flexibility
  • Protocol needs to choose the right route/nodes
  • Per node O(n) state
  • Monitors n - 1 paths
  • O(n²) total path monitoring is expensive

8
The Big Picture
  • Locate nearby overlay proxy
  • Establish overlay path to destination host
  • Overlay routes traffic resiliently

9
Traffic Tunneling
[Figure: legacy nodes A and B (A, B are IP addresses) connect through proxies P(A) and P(B) across a structured peer-to-peer overlay]
  • Store mapping from end host IP to its proxy's
    overlay ID
  • Similar to approach in Internet Indirection
    Infrastructure (I3)
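
As a rough illustration of the mapping described above, the Python sketch below stores each end host's IP address under a hashed key and resolves it to its proxy's overlay ID. A plain dictionary stands in for the overlay's distributed storage. The names `ProxyDirectory`, `register`, `resolve` and `overlay_key`, the use of SHA-1, and the sample address are assumptions for illustration, not details from the talk.

```python
import hashlib
from typing import Dict, Optional


def overlay_key(ip: str) -> str:
    """Hash an end-host IP address into an overlay key (SHA-1 is an assumption)."""
    return hashlib.sha1(ip.encode("ascii")).hexdigest()


class ProxyDirectory:
    """Hypothetical stand-in for overlay storage: a local dict, not a distributed lookup."""

    def __init__(self) -> None:
        self._store: Dict[str, str] = {}

    def register(self, host_ip: str, proxy_id: str) -> None:
        # Publish "host_ip is reachable via proxy_id" under the hashed IP.
        self._store[overlay_key(host_ip)] = proxy_id

    def resolve(self, host_ip: str) -> Optional[str]:
        # Return the overlay ID of the proxy serving host_ip, or None if unknown.
        return self._store.get(overlay_key(host_ip))


directory = ProxyDirectory()
directory.register("10.0.0.2", "P(B)")        # legacy node B sits behind proxy P(B)
next_proxy = directory.resolve("10.0.0.2")    # P(A) would tunnel B's traffic to this ID
```

In a real deployment the lookup itself would be routed through the overlay, so the directory would inherit the overlay's own fault handling.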

10
Tradeoffs of Tunneling via P2P
  • Fewer neighbor paths to monitor per node:
    O(log(n))
  • Large reduction in probing bandwidth: O(n) →
    O(log(n))
  • Faster fault detection with low bandwidth
    consumption
  • Actively maintain path redundancy
  • Manageable for a small number of paths
  • Redirect traffic immediately when a failure is
    detected
  • Eliminate on-the-fly calculation of new routes
  • Restore redundancy when a path fails
  • Fast fault detection + precomputed paths =
    increased responsiveness
  • Cons: overlay imposes routing stretch (mostly < 2)
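
To make the drop from O(n) to O(log(n)) monitored paths per node more concrete, the sketch below compares a full mesh with a structured overlay for a few overlay sizes. The per-node count of ceil(log2(n)) is an illustrative assumption. The real constant depends on the overlay's routing table base and on how many backup entries each node keeps.

```python
import math


def mesh_paths_per_node(n: int) -> int:
    """Full mesh: every node monitors a path to each of the other n - 1 nodes."""
    return n - 1


def overlay_paths_per_node(n: int) -> int:
    """Structured overlay: roughly log(n) neighbor paths (base 2 is an assumption)."""
    return math.ceil(math.log2(n))


for n in (64, 300, 4096):
    mesh = mesh_paths_per_node(n)
    overlay = overlay_paths_per_node(n)
    print(f"n={n}: mesh {mesh} paths/node ({n * mesh} total), "
          f"overlay {overlay} paths/node ({n * overlay} total)")
```

Because each node probes far fewer paths, the same bandwidth budget allows a shorter probe period, which is where the faster fault detection comes from.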

11
In-network Resiliency Mechanisms
  • Efficient fault detection
  • Use soft-state to periodically probe log(n)
    neighbor paths
  • Small number of routes → reduced bandwidth
  • Exponentially weighted moving average for link
    quality estimation
  • Avoid route flapping due to short term loss
    artifacts
  • Loss rate: L_n = (1 − α) · L_{n−1} + α · p
  • Simple approach taken, ongoing research available
  • Smart fault-detection / propagation (Zhuang04)
  • Intelligent and cooperative path selection
    (Seshardri04)
  • Maintaining backup paths
  • Each hop has flexible routing constraint
  • Create and store backup routes at node insertion
  • Restore redundancy via intelligent gossip after
    failures
  • Simple policies to choose among redundant paths

12
First Reachable Link Selection (FRLS)
  • Use estimated loss results to choose shortest
    usable path
  • Sort next hop paths by latency
  • Use shortest path with minimal quality > T
  • Correlated failures
  • Reduce with intelligent topology construction
  • Key is to leverage redundancy available
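
The selection policy can be sketched as follows: sort the candidate next hops by latency and take the first whose estimated quality exceeds the threshold T. This is a minimal sketch under stated assumptions. The `NextHop` type, the definition of quality as 1 minus the estimated loss rate, and the sample latencies and loss values are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class NextHop:
    node_id: str
    latency_ms: float
    loss_estimate: float  # e.g. an EWMA loss rate in [0, 1]

    @property
    def quality(self) -> float:
        # Assumption: link quality is defined as 1 - estimated loss rate.
        return 1.0 - self.loss_estimate


def first_reachable_link(candidates: List[NextHop], threshold: float) -> Optional[NextHop]:
    """Return the lowest-latency next hop whose quality exceeds the threshold."""
    for hop in sorted(candidates, key=lambda h: h.latency_ms):
        if hop.quality > threshold:
            return hop
    return None  # no usable path among the primary and backup routes


routes = [
    NextHop("primary", latency_ms=20.0, loss_estimate=0.60),   # fast but lossy
    NextHop("backup-1", latency_ms=35.0, loss_estimate=0.05),
    NextHop("backup-2", latency_ms=50.0, loss_estimate=0.00),
]
chosen = first_reachable_link(routes, threshold=0.7)
```

Here the primary route is skipped because its quality falls below the threshold, so traffic moves to the fastest backup that is still usable. Because the backups are precomputed, the switch needs no new route calculation.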

13
Evaluation
  • Metrics for evaluation
  • How much routing resiliency can we exploit?
  • How fast can we adapt to faults (responsiveness)?
  • Experimental platforms
  • Event-based simulations on transit stub
    topologies
  • Data collected over multiple 5000-node topologies
  • PlanetLab measurements
  • Microbenchmarks on responsiveness
  • More details in paper (ICNP03) and poster session

14
Exploiting Route Redundancy (Sim)
  • Simulation of Tapestry, 2 backup paths per
    routing entry
  • Transit-stub topology shown, results from TIER
    and AS graphs similar

15
Responsiveness to Faults (PlanetLab)
  • Response time increases linearly with probe
    period
  • Minimum link quality threshold T = 70, 20 runs
    per data point

16
Link Probing Bandwidth (PlanetLab)
  • Medium sized routing overlays incur low probing
    bandwidth
  • Bandwidth increases logarithmically with overlay
    size

17
Conclusion
  • Pros and cons of infrastructure approach
  • Structured routing has low path maintenance costs
  • Allows caching of backup paths for quick
    failover
  • Transparent to user applications
  • Can no longer construct arbitrary paths
  • Structured routing with low redundancy close to
    ideal connectivity
  • Incur low routing stretch
  • Fast enough for highly interactive applications
  • 300 ms beacon period → response time < 700 ms
  • On overlay networks of 300 nodes, bandwidth cost
    is 7 KB/s
  • Ongoing questions
  • Is there a lower bound on desired responsiveness?
  • Should we use multipath redundant routing for
    resilience?
  • How to deploy as a single network across ISPs?
  • VPN-like routing service?

18
Related Work
  • Redirection overlays
  • Detour (IEEE Micro 99)
  • Resilient Overlay Networks (SOSP 01)
  • Internet Indirection Infrastructure (SIGCOMM 02)
  • Secure Overlay Services (SIGCOMM 02)
  • Topology estimation techniques
  • Adaptive probing (IPTPS 03)
  • Internet tomography (IMC 03)
  • Routing underlay (SIGCOMM 03)
  • Many, many other structured peer-to-peer
    overlays
  • Thanks to Dennis Geels / Sean Rhea for their work
    on BMark