Title: On the Interaction between Dynamic Routing in the Native and Overlay Layers
1On the Interaction between Dynamic Routing in
the Native and Overlay Layers
- INFOCOM 2006
- Srinivasan Seetharaman
- Mostafa Ammar
- College of Computing
- Georgia Institute of Technology
2Inter-Layer Interaction Problem
- Infrastructure overlay networks offer better services by deploying intelligent routing schemes.
- Uncoordinated dynamic routing in the two layers leads to many problems.
- We focus on the effect of native link failures, as they trigger each layer to reroute independently (dual rerouting).
3Temporal Dynamics
- Consider a native link failure in CE
- Only one overlay link is affected.
- The native path AE is rerouted over F (ACE → ACFDE)
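[Figure: overlay topology (nodes G, A, I, E) drawn above the native topology (nodes A to I), with link costs.]
[Figure: cost of overlay path AE versus time. Cost labels: Original 2, Overlay recovery 8, Overlay rerouting 4, Native rerouting 2. Time-axis events: Native Failure, Native Recovery, Native Repair.]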
4Downside to Dual Rerouting
- Overlap of functionality between layers, causing a large number of route flaps (oscillations)
- Unawareness of the other layer's decisions, leading to
- resource overloading,
- multiple simultaneous failures
- a low success rate in rerouting
- sub-optimal paths after rerouting
- Lack of flexibility and control
5Problem Statement I
- Assume the two ends of each link (native and overlay) use a keepAlive protocol for link verification.
- 3 keepAlive messages lost → Failure
- Understand the effects of different parameters on the rerouting performance.
- KeepAlive-time: time between two keepAlive messages
- Hold-time: time window to declare link as down
- Overlay link cost scheme (e.g., native hops, overlay hops)
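The interplay of keepAlive-time and hold-time can be sketched in a few lines. This is a minimal model, not the simulator used in the talk: it assumes keepAlives are sent at t = 0, k, 2k, ..., arrive instantly, and that the receiver restarts its hold timer on every keepAlive it gets. The function name `detection_time` and its parameters are hypothetical.

```python
import math


def detection_time(failure_at, keepalive_time, hold_time):
    """Delay between a link failure and the moment the peer declares it down.

    Assumed model: keepAlives are sent at multiples of keepalive_time starting
    at t = 0, are delivered instantly, and each one restarts the hold timer.
    """
    # Last keepAlive sent at or before the failure instant still gets through.
    last_received = math.floor(failure_at / keepalive_time) * keepalive_time
    # The hold timer expires hold_time after the last keepAlive was received.
    declared_down_at = last_received + hold_time
    return declared_down_at - failure_at


# Hold-time set to three keepAlive intervals (three lost messages => failure).
print(detection_time(failure_at=7.0, keepalive_time=2.0, hold_time=6.0))
```

Under these assumptions the detection time always falls between hold-time minus keepAlive-time (exclusive) and hold-time (inclusive), which is why both timers affect how quickly a layer reacts.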
6Performance Metrics
- Hit-time: time taken for traffic to be recovered.
  Hit-time = Detection time (depends on timers) + Convergence time (protocol specific) + Device time (negligible)
- Success rate of recovery
- Success rate of a layer = Number of paths recovered / Number of failed overlay paths
- Number of route flaps
- Average route flaps = Number of route flaps / Number of failed overlay paths
- Peak and stabilized inflation (before repair)
- Path cost inflation = Path cost after rerouting / Path cost before failure
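The ratio metrics above are simple enough to state as code. The sketch below is illustrative only: the function names are hypothetical, and it assumes that peak inflation uses the highest path cost seen before repair while stabilized inflation uses the last path cost before repair. The sample costs (original 2, then 8, then 4) are illustrative values.

```python
def success_rate(paths_recovered, failed_overlay_paths):
    """Fraction of failed overlay paths that a layer managed to recover."""
    return paths_recovered / failed_overlay_paths


def average_route_flaps(route_flaps, failed_overlay_paths):
    """Route flaps per failed overlay path."""
    return route_flaps / failed_overlay_paths


def path_cost_inflation(cost_after_rerouting, cost_before_failure):
    """Ratio of the rerouted path cost to the pre-failure path cost."""
    return cost_after_rerouting / cost_before_failure


def peak_and_stabilized_inflation(costs_before_repair, cost_before_failure):
    """Assumed definitions: peak uses the maximum cost seen before repair,
    stabilized uses the final cost before repair."""
    peak = path_cost_inflation(max(costs_before_repair), cost_before_failure)
    stabilized = path_cost_inflation(costs_before_repair[-1], cost_before_failure)
    return peak, stabilized


# One failed overlay path, recovered, whose cost goes 2 -> 8 -> 4 before repair.
print(success_rate(1, 1))
print(peak_and_stabilized_inflation([8, 4], 2))
```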
7Temporal Dynamics
- Overlay path AE; the overlay detects the failure first
- 100% success rate
- 3 route flaps
- Peak inflation = 8/2
- Stabilized inflation = 4/2
[Figure: cost of overlay path AE versus time, with the hit time marked. Cost labels: Original 2, Overlay recovery 8, Overlay rerouting 4, Native rerouting 2. Time-axis events: Native Failure, Native Recovery, Native Repair.]
8Performance Evaluation ns2
- Using GT-ITM, we randomly generate
- 25 topologies = (5 overlay networks) x (5 native networks)
- Two scenarios
- Inspect intra-domain failures in a single-domain native network
- Inspect inter-domain failures in a multi-domain native network
- In each scenario, tabulate failure recovery statistics of all overlay paths by breaking one native link at a time
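The "break one native link at a time" procedure can be sketched as a loop over native links that records which overlay links lose their native path. This is a simplified stand-in for the ns2 experiments, not the authors' setup: it assumes hop-count shortest paths computed by breadth-first search, and the tiny topology is a hypothetical fragment built only from the ACE / ACFDE paths mentioned earlier.

```python
from collections import deque


def shortest_path(adj, src, dst):
    """Hop-count shortest path via BFS; returns a node list, or None if unreachable."""
    parent = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in sorted(adj[node]):  # sorted for deterministic tie-breaking
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None


def affected_overlay_links(native_links, overlay_links):
    """For each native link, list the overlay links whose native path crosses it."""
    adj = {}
    for u, v in native_links:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    native_paths = {ol: shortest_path(adj, ol[0], ol[1]) for ol in overlay_links}
    affected = {}
    for u, v in native_links:  # break one native link at a time
        broken = frozenset((u, v))
        affected[(u, v)] = [
            ol for ol, path in native_paths.items()
            if path and any(frozenset(hop) == broken for hop in zip(path, path[1:]))
        ]
    return affected


# Hypothetical fragment of the native network and a single overlay link A-E.
native = [("A", "C"), ("C", "E"), ("C", "F"), ("F", "D"), ("D", "E")]
overlay = [("A", "E")]
for link, hit in affected_overlay_links(native, overlay).items():
    print(link, hit)
```

A fuller study would follow each failure with the timer-driven rerouting of both layers; this sketch only identifies which overlay paths each native failure touches.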
9Effect of Routing Parameters
- Observations: by varying the overlay keepAlive-time, hold-time and cost scheme, we observe
- hold-time ↑, hit time ↑ (only until overlay hold-time < native hold-time)
- hold-time ↑, route flaps ↓
- hold-time ↑, sub-optimality ↓
- keepAlive-time ? hit-time ?hold-time
10Conclusion I
- Dual rerouting can be made optimal by adopting the following recommendations
- Overlay hold-time very close to the native hold-time.
- Overlay keepAlive-time that is half of the hold-time, as it leads to earlier detection.
11Problem Statement II
- Main observation from previous simulations
- Native rerouting yields the optimal path, albeit a bit later
- Make the overlay layer aware of this observation and give higher precedence to native rerouting attempts
- Improve overlay routing performance by adjusting the overlay layer functioning
12Three Levels of Layer Awareness
- No awareness
- Dual rerouting
- Awareness of the native layer's existence
- Probabilistically Suppressed Overlay Rerouting (PSOR)
- Suppress overlay rerouting attempt with probability p
- Deferred Overlay Rerouting (DOR)
- Delay overlay recovery by time d
13Three Levels of Layer Awareness (contd.)
- Awareness of the native layer's parameters
- Follow-on Suppressed Overlay Rerouting (FSOR)
- If follow-on time < threshold f, then suppress overlay rerouting
[Figure: timeline marking Failure, Overlay layer detects failure, and Native layer detects failure, with the follow-on time indicated.]
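The three layer-aware schemes reduce to small decision rules, sketched below. The function names are hypothetical, and the sketch assumes that the follow-on time is the interval from the overlay's detection to the native layer's expected detection (as the timeline suggests), with the native detection time estimated from known native timers.

```python
import random


def psor_should_reroute(p, rng=random):
    """PSOR: suppress the overlay rerouting attempt with probability p."""
    return rng.random() >= p


def dor_reroute_time(overlay_detect_time, d):
    """DOR: defer overlay recovery by d after the overlay detects the failure."""
    return overlay_detect_time + d


def fsor_should_reroute(overlay_detect_time, native_detect_time, f):
    """FSOR: suppress overlay rerouting when the follow-on time is below f.

    Assumption: follow-on time = native detection time - overlay detection time.
    """
    follow_on_time = native_detect_time - overlay_detect_time
    return not (follow_on_time < f)


# Illustrative values only (time units are arbitrary).
print(psor_should_reroute(p=0.5, rng=random.Random(1)))
print(dor_reroute_time(overlay_detect_time=10.0, d=3.0))
print(fsor_should_reroute(overlay_detect_time=10.0, native_detect_time=12.0, f=5.0))
```

PSOR and DOR need only know that a native layer exists, while FSOR needs an estimate of the native timers, which matches the two awareness levels above.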
14Effect of Adjusting Overlay
- All three schemes are simple and offer significant control over the tradeoffs between hit-time and the other metrics.
- PSOR
- Least number of route flaps
- Least peak inflation
- DOR and FSOR behave similarly (FSOR has slightly better hit-time)
- Better success rate
- Lower stabilized inflation
15Conclusion II
- By appropriately tuning
- keepAlive-time
- hold-time
- suppression probability
- delay
- follow-on threshold
- we can improve results for
- Hit-time
- Route flaps
- Path cost inflation
- Stabilization time
- Success rate
16Problem Statement III
- Main observation from previous simulations
- It is not possible to improve all metrics simultaneously. Hence, performance is still bounded!
- As overlay applications proliferate, the native layer should gradually evolve to suit them
- Improve overlay routing performance by adjusting the native layer functioning
17Tuning the Native keepAlive-time
- We adopt a non-invasive procedure to advance the native layer rerouting
- Tuning of the native layer keepAlive-time
- Constraints
- Tuning should not generate any extra overhead
- Effective detection time should remain the same
18Tuning the Native keepAlive-time (contd.)
- Consider the following scenarios for tuning.
- Scenario B is vanilla Dual rerouting
- Scenario A is the layer-aware overlay rerouting scheme
- Scenario C is the tuning we recommend here
19Conclusions III
- The native layer tuning we propose achieves the best performance in all our metrics
20Summary
- We propose means to mitigate the problems associated with inter-layer interaction
- We explore two directions
- Adjusting the overlay layer functioning
- Adjusting the native layer functioning