Internet Iso-bar: A Scalable Overlay Distance Monitoring System presentation


1
Internet Iso-bar: A Scalable Overlay Distance Monitoring System

Yan Chen, Lili Qiu, Chris Overton and Randy H. Katz

2
Motivations

Applications of end-to-end distance monitoring/estimation:
Overlay Routing/Location
Peer-to-peer Systems
VPN Management/Provisioning
Service Redirection/Placement
Cache-infrastructure Configuration
Requirements for an E2E distance monitoring system:
Scalable: a small amount of probing traffic and system load
Accurate: captures congestion/failures and estimates latency
Fast: small computation cost for real-time estimation
Incrementally deployable
Easy to use
Benefits for applications:
Application-driven measurement
Inference techniques for troubleshooting and root-cause analysis
Improved application performance and reliability

6
E2E Estimation/Monitoring Systems Comparison
Systems compared: GNP, Akamai, IDMaps, RON, Internet Iso-bar
Dynamic monitoring:
GNP: static estimation
Akamai: yes
IDMaps: yes
RON: yes
Internet Iso-bar: yes
Scalability (N hosts, AP address prefixes, K landmarks, C clusters; N > AP C C K):
GNP: O(N * K) probes; each landmark takes O(N)
Akamai: O(F * AP) probes, where F is the number of CDN edge server farms
IDMaps: clustering needs pairwise distances between all pairs of APs; O(C^2 + AP) probes
RON: O(N^2) probes
Internet Iso-bar: O(C^2 + N) probes
Estimation accuracy:
GNP: accurate, but only symmetric distance
Akamai: no existing comparison
IDMaps: inaccurate (triangulation inequality, proximity-based clustering)
RON: exact measurements, hence most accurate
Internet Iso-bar: similar accuracy to GNP
Monitor deployment:
GNP: end hosts
Akamai: CDN edge servers
IDMaps: transit ASs (hard to deploy)
RON: end hosts
Internet Iso-bar: end hosts
7
Problem Formulation

Given N end hosts, how can a subset of them be selected as monitors to build a scalable overlay distance monitoring service without knowing the underlying topology?
Distance information desired: report congestion/failure if it occurs, otherwise latency

8
E2E Congestion/Failures Analysis

Based on the National Laboratory for Applied Network Research (NLANR) AMP data set
104 sites in the US (including Alaska and Hawaii) and Australia; every host pings all other hosts every minute
Sliding window of 10 samples, with the minimum RTT used as the latency sample
105M measurements, 6/25/01 - 7/1/01
Congestion/failures (uniformly denoted as congestion) are defined as measurement loss or latency above a threshold set from the geometric mean and geometric standard deviation
Congestion is not common: only 0.96% of samples
A few congested links dominate the E2E congestion
Besides congestion at the last mile, E2E congestion exhibits strong spatial correlation
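
The sampling and congestion rules on this slide can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, lost probes are represented as None (an assumption), and the latency threshold is passed in rather than computed, since the slide names only its ingredients (geometric mean and geometric standard deviation).

```python
from collections import deque
from typing import Iterable, Iterator, Optional


def windowed_min_rtt(samples: Iterable[Optional[float]],
                     window: int = 10) -> Iterator[Optional[float]]:
    """Yield, per probe, the minimum RTT over the last `window` probes.

    A lost probe is None (assumed encoding). If every probe in the
    window was lost, the latency sample is None as well.
    """
    recent: deque = deque(maxlen=window)
    for rtt in samples:
        recent.append(rtt)
        answered = [r for r in recent if r is not None]
        yield min(answered) if answered else None


def is_congested(latency: Optional[float], threshold: float) -> bool:
    """Congestion = measurement loss, or latency above the given threshold."""
    return latency is None or latency > threshold


# Usage with a window of 3 for brevity (the slide uses 10).
rtts = [50.0, 48.0, None, 120.0, 47.0]
latency_samples = list(windowed_min_rtt(rtts, window=3))
```

Taking the window minimum filters out transient queueing spikes, so a single slow probe such as the 120 ms sample above does not by itself raise the latency sample.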

9
NLANR AMP Sites
10
Internet Iso-bar

Procedure:
Cluster hosts that perceive similar performance to a small set of sites (landmarks)
For each cluster, select a monitor for active and continuous probing
Estimate the distance between any pair of hosts using inter- and intra-cluster distances

11
Internet Iso-bar (I): Host Clustering

Define a correlation distance between each pair of hosts
Existing work uses network proximity: cor_dist(i,j) = net_dist(i,j) (denoted p_{ij})
Iso-bar uses a network distance vector (k landmarks, used for clustering only): netV_i = [p_{i1}, p_{i2}, ..., p_{ik}]^T
Euclidean-distance based
Cosine vector similarity based
Apply generic clustering methods
Optimize the worst case: minimize the maximum radius of all clusters (limit_num_minRmax)
Optimize the average case: minimize the sum of total host-monitor distances (limit_num_minDistSum)
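
As a rough illustration of the two vector-based correlation distances and of the worst-case objective, the sketch below clusters hosts by their landmark distance vectors. The slides do not say which clustering algorithm implements limit_num_minRmax; the greedy farthest-point heuristic used here is a standard approximation for the minimize-the-maximum-radius problem and is swapped in as an assumption. All function names are hypothetical, and treating each cluster center as its monitor is likewise an assumption.

```python
import math
from typing import Callable, Dict, List, Sequence

Vector = Sequence[float]


def euclidean_dist(u: Vector, v: Vector) -> float:
    """Euclidean distance between two landmark distance vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def cosine_dist(u: Vector, v: Vector) -> float:
    """1 - cosine similarity; assumes neither vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm


def greedy_k_center(vectors: Dict[str, Vector], k: int,
                    dist: Callable[[Vector, Vector], float] = euclidean_dist
                    ) -> Dict[str, List[str]]:
    """Pick up to k centers by farthest-point traversal, then assign hosts.

    Returns {center host: [member hosts]}.
    """
    hosts = sorted(vectors)
    centers = [hosts[0]]  # arbitrary but deterministic first center
    while len(centers) < min(k, len(hosts)):
        # Next center: the host farthest from its nearest existing center.
        farthest = max(hosts, key=lambda h: min(
            dist(vectors[h], vectors[c]) for c in centers))
        centers.append(farthest)
    clusters: Dict[str, List[str]] = {c: [] for c in centers}
    for h in hosts:
        nearest = min(centers, key=lambda c: dist(vectors[h], vectors[c]))
        clusters[nearest].append(h)
    return clusters


# Usage: four hosts, each with latencies (ms) to two landmarks.
net_vectors = {"a": [10, 50], "b": [12, 52], "c": [80, 20], "d": [82, 18]}
clusters = greedy_k_center(net_vectors, k=2)
```

Hosts a and b see both landmarks at nearly the same latencies, so they land in one cluster, as do c and d, even though no host-to-host distance was measured.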

12
Diagram of Internet Iso-bar (legend: landmark, end host)
13
Diagram of Internet Iso-bar (clusters A, B and C; legend: landmark, monitor, end host)
14
Internet Iso-bar (II): Distance Estimation

Intra-cluster estimation (hosts i and j share monitor m; pDist is the predicted distance, mDist the measured distance):
If path(m, i) or path(m, j) is congested, report path(i, j) as congested
Otherwise pDist(i,j) = (mDist(m, i) + mDist(m, j)) / 2
Inter-cluster estimation (m_i and m_j are the monitors of i and j):
If path(m_i, i), path(m_i, m_j) or path(m_j, j) is congested, report path(i, j) as congested
Otherwise pDist(i,j) = mDist(m_i, m_j)
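
The two estimation rules can be written as one small function. This is a sketch under stated assumptions: the data layout (dictionaries keyed by ordered (source, destination) pairs), the function name, and the use of None to signal "report congestion" are all hypothetical, and a monitor's distance to itself is assumed to be 0.

```python
from typing import Dict, Optional, Set, Tuple

Path = Tuple[str, str]  # (source, destination), as in path(m, i)


def estimate_distance(i: str, j: str,
                      monitor_of: Dict[str, str],
                      measured: Dict[Path, float],
                      congested: Set[Path]) -> Optional[float]:
    """Predict latency between hosts i and j, or return None for congestion."""
    mi, mj = monitor_of[i], monitor_of[j]

    def m_dist(a: str, b: str) -> float:
        # Assumption: a monitor is at distance 0 from itself.
        return 0.0 if a == b else measured[(a, b)]

    if mi == mj:
        # Intra-cluster: both hosts are probed by the same monitor m.
        if (mi, i) in congested or (mi, j) in congested:
            return None
        return (m_dist(mi, i) + m_dist(mi, j)) / 2

    # Inter-cluster: check both last hops and the monitor-to-monitor path.
    if any(p in congested for p in ((mi, i), (mi, mj), (mj, j))):
        return None
    return m_dist(mi, mj)


# Usage: hosts a and b share monitor m1; host c is monitored by m2.
monitor_of = {"a": "m1", "b": "m1", "c": "m2", "m1": "m1", "m2": "m2"}
measured = {("m1", "a"): 10.0, ("m1", "b"): 30.0,
            ("m2", "c"): 5.0, ("m1", "m2"): 80.0}
intra = estimate_distance("a", "b", monitor_of, measured, set())
inter = estimate_distance("a", "c", monitor_of, measured, set())
flagged = estimate_distance("a", "c", monitor_of, measured, {("m2", "c")})
```

Note that the inter-cluster estimate ignores the host-to-monitor legs entirely, which is reasonable only when clusters are tight.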

15
Evaluation Methodology

Internet measurement data:
NLANR AMP data set
Clustering uses the geometric mean of the training-date measurements
Estimation dates: 6/25/01 - 7/24/01, 12/06/01
Keynote CDN measurement data:
63 agents covering all major ISPs in the US, Europe, Asia and Australia
2 targets (CDN redirectors) in Boston and Texas
TCP connection time (2/3 of the handshake) measured from each agent to each target every minute
Training date: 10/21/2002
Estimation dates: 10/21/2002 - 11/25/2002
Latency estimation results are similar for both datasets; the NLANR results are presented

16
Evaluation Methodology (II)

Estimation metrics:
Relative accuracy error for uncongested latency
Stability
For dynamic monitoring systems: amount of congestion captured and false positive ratio
Internet distance estimation techniques evaluated:
Omniscient: uses the geometric-mean data of each (source, dest) pair on the training date
Global Network Positioning (GNP)
Clustering with network distance vector (Iso-bar)
Clustering with network proximity
15 clusters vs. 15 landmarks for GNP

17
Latency Prediction Accuracy and Stability

Training date: 06/25/01
Estimation dates: 06/25/01 - 12/06/01
Summary of the 90th-percentile relative error for various distance estimation methods

18
Distance Estimation Results

Latency estimation when uncongested:
Omniscient is the most accurate, but unscalable
GNP and Iso-bar come second
Both have good accuracy and stability for distance estimation
GNP is a static approach and unscalable for online monitoring
Iso-bar outperforms proximity-based clustering by 50%
90th-percentile relative error < 0.5: for a 60 ms latency, 45 ms < prediction < 90 ms
Congestion/failure estimation:
6/25/01 - 7/01/01: on average 148K congested measurements per day
Iso-bar captures 78% of them, with a 32% false positive ratio
Only 3% of the monitoring overhead of RON
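
The 3% overhead figure is consistent with a simple probe count. The check below assumes the probe costs from the comparison slide, read as N^2 for RON and C^2 + N for Iso-bar, with N = 104 hosts (the NLANR AMP sites) and C = 15 clusters (the evaluation setup). The slides do not state how the 3% was actually derived, so this is a plausibility check, not the authors' calculation; the function names are hypothetical.

```python
def full_mesh_probes(num_hosts: int) -> int:
    """RON-style monitoring: every host probes every other host, O(N^2)."""
    return num_hosts ** 2


def isobar_probes(num_hosts: int, num_clusters: int) -> int:
    """Iso-bar: monitors probe each other (C^2) and their own members (N)."""
    return num_clusters ** 2 + num_hosts


ratio = isobar_probes(104, 15) / full_mesh_probes(104)
print(f"{ratio:.1%}")  # prints 3.0%
```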

19
Conclusions

Proposed Internet Iso-bar:
Clusters hosts based on network similarity
Inter- and intra-cluster latency estimation with a first-step heuristic for congestion/failure detection
Preliminary results are promising:
High accuracy and stability for normal latency estimation
Simple congestion-estimation heuristics capture 78% of congestion with a 32% false positive ratio, at only 3% of the monitoring overhead of RON

20
Ongoing Work

Current focus: switch from latency estimation to congestion/failure estimation
Apply topology information, e.g. lossy-link detection with network tomography
Cluster and choose monitors based on the lossy links
Benefit applications:
Dynamic node join/leave for P2P systems
A joining client pings the landmark sites to get its distance vector, compares it with those of the monitors, and chooses the closest one to join
Split/merge clusters
Multi-path selection
More comprehensive evaluation:
Simulate with a large network
Deploy on PlanetLab, and operate at a finer level
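
The join step for P2P systems could look like the sketch below. It assumes Euclidean distance between landmark vectors (one of the two correlation distances named on the clustering slide) and a hypothetical function name and data layout; it is not a description of a deployed implementation.

```python
import math
from typing import Dict, Sequence


def closest_monitor(new_vector: Sequence[float],
                    monitor_vectors: Dict[str, Sequence[float]]) -> str:
    """Return the monitor whose landmark distance vector is nearest.

    Ties are broken by monitor name so the result is deterministic.
    """
    return min(sorted(monitor_vectors),
               key=lambda m: math.dist(new_vector, monitor_vectors[m]))


# Usage: the joining client has pinged three landmarks (latencies in ms).
monitors = {"m1": [10.0, 50.0, 90.0], "m2": [80.0, 20.0, 15.0]}
chosen = closest_monitor([12.0, 48.0, 85.0], monitors)
```

The joining client needs only k landmark probes, independent of the number of hosts already in the system.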

21
Internet Iso-bar

Problem formulation:
Given N end hosts, how can a subset of them be selected as monitors to build a scalable overlay distance monitoring service without knowing the underlying topology?
Distance information desired: report congestion/failure if it occurs, otherwise latency
Our approach:
Cluster hosts that perceive similar performance to a small set of sites (landmarks)
For each cluster, select a monitor for active and continuous probing
Estimate the distance between any pair of hosts using inter- and intra-cluster distances
Performance evaluation:
Using real Internet measurement data
Compared with other distance estimation services: GNP, RON
Performance metrics: accuracy and stability

22
Internet Iso-bar (II): Distance Estimation

Congestion/failures analysis:
Congestion/failures (uniformly denoted as congestion) are not common
Defined as measurement loss or latency above a threshold set from the geometric mean and geometric standard deviation
Only 0.96% of 105M NLANR ping measurements over a week
This suggests a few congested links dominate the E2E congestion
Besides congestion at the last mile, E2E congestion exhibits strong spatial correlation
Estimation algorithms:
Intra-cluster estimation (i and j use the same monitor m):
If path(m, i) or path(m, j) is congested, report path(i, j) as congested
Otherwise predictedDist(i,j) = (measuredDist(m, i) + measuredDist(m, j)) / 2
Inter-cluster distance estimation:
If path(monitor_i, i), path(monitor_i, monitor_j) or path(monitor_j, j) is congested, report path(i, j) as congested
Otherwise predictedDist(i,j) = measuredDist(monitor_i, monitor_j)
Self-diagnostics of monitors: check for last-mile congestion
