Internet Iso-bar: A Scalable Overlay Distance Monitoring System - PowerPoint PPT Presentation

Internet Iso-bar: A Scalable Overlay Distance Monitoring System Yan Chen, Lili Qiu, Chris Overton and Randy H. Katz – PowerPoint PPT presentation

Transcript and Presenter's Notes

1
Internet Iso-bar: A Scalable Overlay Distance
Monitoring System
  • Yan Chen, Lili Qiu, Chris Overton and Randy H.
    Katz

2
Motivations
  • Applications of end-to-end distance
    monitoring/estimation
  • Overlay Routing/Location
  • Peer-to-peer Systems
  • VPN Management/Provisioning
  • Service Redirection/Placement
  • Cache-infrastructure Configuration
  • Requirements for an E2E distance monitoring system
  • Scalable: a small amount of probing traffic and
    system load
  • Accurate: capture congestion/failures and
    estimate latency
  • Fast: small computation for real-time estimation
  • Incrementally deployable
  • Easy to use
  • Benefit applications
  • Application-driven measurement
  • Inference techniques for troubleshooting and root
    cause analysis
  • Improve application performance and reliability

3
E2E Estimation/Monitoring Systems Comparison
Properties | GNP | Akamai | IDMaps | RON | Internet Iso-bar
Dynamic monitoring | | | | |
Scalability (N hosts, AP address prefixes, K landmarks, C clusters; N > AP C C K) | | | | |
Estimation accuracy | | | | |
Monitors deployment | | | | |
4
E2E Estimation/Monitoring Systems Comparison
Properties | GNP | Akamai | IDMaps | RON | Internet Iso-bar
Dynamic monitoring | Static estimation | | | |
Scalability (N hosts, AP address prefixes, K landmarks, C clusters; N > AP C C K) | O(N·K) probes, each landmark takes O(N) | | | |
Estimation accuracy | Accurate, but only symmetric distance | | | |
Monitors deployment | End hosts | | | |
5
E2E Estimation/Monitoring Systems Comparison
Properties | GNP | Akamai | IDMaps | RON | Internet Iso-bar
Dynamic monitoring | Static estimation | Yes | Yes | Yes |
Scalability (N hosts, AP address prefixes, K landmarks, C clusters; N > AP C C K) | O(N·K) probes, each landmark takes O(N) | O(F·AP) probes, F = number of CDN edge server farms | Clustering needs pair-wise distance between all pairs of APs; O(C² + AP) probes | O(N²) probes |
Estimation accuracy | Accurate, but only symmetric distance | No existing comparison | Inaccurate: triangulation inequality, proximity-based clustering | Exact measurements, most accurate |
Monitors deployment | End hosts | CDN edge servers | Transit ASs (hard to deploy) | End hosts |
6
E2E Estimation/Monitoring Systems Comparison
Properties | GNP | Akamai | IDMaps | RON | Internet Iso-bar
Dynamic monitoring | Static estimation | Yes | Yes | Yes | Yes
Scalability (N hosts, AP address prefixes, K landmarks, C clusters; N > AP C C K) | O(N·K) probes, each landmark takes O(N) | O(F·AP) probes, F = number of CDN edge server farms | Clustering needs pair-wise distance between all pairs of APs; O(C² + AP) probes | O(N²) probes | O(C² + N) probes
Estimation accuracy | Accurate, but only symmetric distance | No existing comparison | Inaccurate: triangulation inequality, proximity-based clustering | Exact measurements, most accurate | Similar accuracy to GNP
Monitors deployment | End hosts | CDN edge servers | Transit ASs (hard to deploy) | End hosts | End hosts
7
Problem Formulation
  • Given N end hosts, how to select a subset of them
    as monitors and build a scalable overlay distance
    monitoring service without knowing the underlying
    topology?
  • Distance info desired: report congestion/failure
    if it occurs, otherwise latency

8
E2E Congestion/Failures Analysis
  • Based on the National Laboratory for Applied
    Network Research (NLANR) AMP data set
  • 104 sites in the US (including Alaska, Hawaii) and
    Australia; every host pings all other hosts every
    minute
  • Sliding window of 10 samples; the minimum RTT is
    used as the latency sample
  • 105M measurements, 6/25/01 - 7/1/01
  • Congestion/failures (uniformly denoted as
    congestion) defined as measurement loss or
    (latency > geo mean + geo stdev)
  • Congestion is not common: only 0.96% of samples
  • A few congested links dominate the E2E
    congestion
  • Apart from congestion at the last mile, E2E
    congestion exhibits strong spatial correlation
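
The sampling and congestion rule on this slide can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, lost pings are represented as None, and the threshold is assumed to combine the geometric mean and geometric standard deviation by addition.

```python
import math
from statistics import geometric_mean


def latency_samples(rtts, window=10):
    """Minimum RTT over each sliding window of `window` consecutive pings.

    Lost pings are None; a window with no successful ping yields None.
    """
    samples = []
    for end in range(window, len(rtts) + 1):
        ok = [r for r in rtts[end - window:end] if r is not None]
        samples.append(min(ok) if ok else None)
    return samples


def geometric_stdev(values):
    """exp of the population standard deviation of log(values)."""
    logs = [math.log(v) for v in values]
    mean = sum(logs) / len(logs)
    return math.exp(math.sqrt(sum((x - mean) ** 2 for x in logs) / len(logs)))


def is_congested(sample, baseline):
    """True on loss, or when latency exceeds geo mean + geo stdev (assumed operator)."""
    if sample is None:
        return True
    return sample > geometric_mean(baseline) + geometric_stdev(baseline)
```

Here `baseline` stands for whatever set of past latency samples the mean and deviation are computed over for a given path; the slide does not say which period is used.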

9
NLANR AMP Sites
10
Internet Iso-bar
  • Procedures
  • Cluster hosts that perceive similar performance
    to a small set of sites (landmarks)
  • For each cluster, select a monitor for active and
    continuous probing
  • Estimate distance between any pair of hosts using
    inter- and intra-cluster distance

11
Internet Iso-bar (I): Host Clustering
  • Define correlation distance between each pair of
    hosts
  • Existing work uses network proximity:
    cor_dist(i,j) = net_dist(i,j) (denoted p_ij)
  • Iso-bar uses a network distance vector (k landmarks,
    for clustering only): netV_i = [p_i1, p_i2, ...,
    p_ik]^T
  • Euclidean distance based
  • Cosine vector similarity based
  • Apply generic clustering methods
  • Optimize the worst case: minimize the maximum
    radius of all clusters (limit_num_minRmax)
  • Optimize the average case: minimize the sum of
    total host-monitor distance (limit_num_minDistSum)
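
A rough sketch of clustering on landmark distance vectors follows. The slide names the objectives but not the algorithms behind them, so the greedy farthest-point heuristic below is a stand-in for the worst-case (minimum maximum radius) objective, not the authors' limit_num_minRmax method. All function names are hypothetical.

```python
import math


def euclidean_dist(u, v):
    """Euclidean distance between two landmark distance vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def cosine_dist(u, v):
    """1 - cosine similarity, so 0 means the vectors point the same way."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm


def greedy_k_center(vectors, k, dist=euclidean_dist):
    """Farthest-point heuristic for the min-max-radius objective.

    Returns (centers, assignment): indices of the chosen centers and,
    for each host, the index of its nearest center.
    """
    centers = [0]  # arbitrary first center
    nearest = [dist(v, vectors[0]) for v in vectors]
    while len(centers) < min(k, len(vectors)):
        # Pick the host currently farthest from every chosen center.
        far = max(range(len(vectors)), key=nearest.__getitem__)
        centers.append(far)
        nearest = [min(d, dist(v, vectors[far])) for d, v in zip(nearest, vectors)]
    assignment = [min(centers, key=lambda c: dist(v, vectors[c])) for v in vectors]
    return centers, assignment
```

With four hosts whose vectors to two landmarks are [10, 100], [12, 98], [100, 10] and [98, 12], asking for two clusters groups the first two hosts together and the last two together, even though no host-to-host distance was measured.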

12
Diagram of Internet Iso-bar
(Figure; legend: Landmark, End Host)
13
Diagram of Internet Iso-bar
(Figure; legend: Clusters A, B, C; Landmark; Monitor; End Host)
14
Internet Iso-bar (II): Distance Estimation
  • Intra-cluster estimation
  • If path(m, i) or path(m, j) is congested, report
    path(i, j) as congested
  • Otherwise, pDist(i,j) = (mDist(m, i) + mDist(m, j)) / 2
  • Inter-cluster estimation
  • If path(m_i, i), path(m_i, m_j) or path(m_j, j) is
    congested, report path(i, j) as congested
  • Otherwise, pDist(i,j) = mDist(m_i, m_j)
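
The two estimation rules can be written as one small function. This is a sketch under assumptions: the data structures (a host-to-monitor map, a dictionary of measured latencies keyed by path, and a set of congested paths) and all names are hypothetical, and the intra-cluster branch assumes i and j are ordinary hosts rather than the monitor itself.

```python
CONGESTED = "congested"


def estimate(i, j, monitor_of, m_dist, congested):
    """Predict the distance of path (i, j) from monitor measurements.

    monitor_of: host -> monitor of its cluster
    m_dist:     (a, b) -> latest measured latency on path (a, b)
    congested:  set of (a, b) paths currently flagged as congested
    """
    mi, mj = monitor_of[i], monitor_of[j]
    if mi == mj:
        # Intra-cluster: both hosts share monitor m.
        if (mi, i) in congested or (mi, j) in congested:
            return CONGESTED
        return (m_dist[(mi, i)] + m_dist[(mi, j)]) / 2
    # Inter-cluster: use the monitor-to-monitor measurement.
    if (mi, i) in congested or (mi, mj) in congested or (mj, j) in congested:
        return CONGESTED
    return m_dist[(mi, mj)]
```

For example, with hosts a and b under monitor m1 at measured latencies 10 and 30, the intra-cluster estimate for (a, b) is 20; for a host c under monitor m2, the estimate for (a, c) is simply the measured m1-to-m2 latency.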

15
Evaluation Methodology
  • Internet measurement data
  • NLANR AMP data set
  • Clustering with the geometric mean of the training
    date's measurements
  • Estimation dates: 6/25/01 - 7/24/01, 12/06/01
  • Keynote CDN measurement data
  • 63 agents covering all major ISPs in the US, Europe,
    Asia, and Australia
  • 2 targets (CDN re-directors) in Boston and Texas
  • Measure TCP connection time (2/3 of handshake)
    from each agent to target every minute
  • Training date: 10/21/2002
  • Estimation dates: 10/21/2002 - 11/25/2002
  • Latency estimation results are similar for both
    datasets; the NLANR results are presented

16
Evaluation Methodology (II)
  • Estimation metric
  • Relative accuracy error for un-congested latency
  • Stability
  • For dynamic monitoring systems, amount of
    congestion captured and false positive ratio
  • Internet distance estimation techniques evaluated
  • Omniscient: uses geometric-mean data of (source,
    dest) from the training date
  • Global Network Positioning (GNP)
  • Clustering with network distance vector (Iso-bar)
  • Clustering with network proximity
  • 15 clusters vs. 15 landmarks of GNP

17
Latency Prediction Accuracy and Stability
  • Training date: 06/25/01
  • Estimation dates: 06/25/01 - 12/06/01
  • Summary of the 90th percentile relative error for
    various distance estimation methods

18
Distance Estimation Results
  • Latency estimation when un-congested
  • Omniscient is the most accurate, but unscalable
  • GNP and Iso-bar rank second
  • Both have good accuracy and stability for
    distance estimation
  • GNP is a static approach, unscalable for online
    monitoring
  • Iso-bar outperforms proximity-based clustering by
    50%
  • 90th-percentile relative error < 0.5; e.g., for a
    60 ms latency, 45 ms < prediction < 90 ms
  • Congestion/failures estimation
  • 6/25/01 - 7/01/01: on average, 148K congested
    measurements per day
  • Iso-bar captures 78% of them, with a 32% false
    positive ratio
  • Only 3% of the monitoring overhead of RON
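
A back-of-the-envelope check of the overhead figure, using the 104 AMP sites and 15 clusters mentioned elsewhere in the deck. The accounting is an assumption, not taken from the slides: it counts directed paths, has RON probe the full mesh, and has Iso-bar probe only monitor-to-monitor paths plus each monitor's path to the other hosts in its own cluster.

```python
def ron_probes(n):
    """Full mesh: every host probes every other host."""
    return n * (n - 1)


def isobar_probes(n, c):
    """Monitors probe each other; each monitor probes the rest of its cluster."""
    return c * (c - 1) + (n - c)


n, c = 104, 15
ratio = isobar_probes(n, c) / ron_probes(n)
print(f"{isobar_probes(n, c)} vs {ron_probes(n)} probes per round: {ratio:.1%}")
# prints "299 vs 10712 probes per round: 2.8%"
```

Under this accounting the ratio comes out close to the roughly 3% reported on the slide, though the slide does not state how its figure was derived.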

19
Conclusions
  • Propose Internet Iso-bar
  • Cluster hosts based on network similarity
  • Inter- and intra-cluster latency estimation with a
    first-step heuristic for congestion/failure
    detection
  • Preliminary results are promising
  • High accuracy and stability for normal latency
    estimation
  • Simple congestion-estimation heuristics capture
    78% of congestion, with 32% false positives, and
    only 3% of the monitoring overhead of RON

20
Ongoing Work
  • Current focus: switching from latency estimation
    to congestion/failures estimation
  • Apply topology information, e.g. lossy link
    detection with network tomography
  • Cluster and choose monitors based on the lossy
    links
  • Benefit applications
  • Dynamic node join/leave for P2P systems
  • Joining client pings landmark sites to get
    distance vector, compare with those of monitors,
    and choose closest one to join
  • Split/merge clusters
  • Multi-path selection
  • More comprehensive evaluation
  • Simulate with a large network
  • Deploy on PlanetLab, and operate at a finer level
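
The node-join step described above (ping the landmarks, compare the resulting distance vector with those of the monitors, join the closest) might look like the sketch below. The function name and the dictionary layout are hypothetical, and Euclidean distance is assumed as the comparison; the slide does not say which measure is used for joining.

```python
import math


def closest_monitor(new_vector, monitor_vectors):
    """Return the monitor whose landmark distance vector is nearest (Euclidean).

    new_vector:      the joining client's latencies to the landmarks
    monitor_vectors: monitor name -> that monitor's latencies to the same landmarks
    """
    return min(monitor_vectors, key=lambda m: math.dist(new_vector, monitor_vectors[m]))
```

A client measuring [12, 95, 48] to three landmarks would join the cluster of a monitor at [10, 100, 50] rather than one at [100, 10, 60], using only a number of probes equal to the number of landmarks.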

21
Internet Iso-bar
  • Problem formulation
  • Given N end hosts, how to select a subset of them
    as monitors and build a scalable overlay distance
    monitoring service without knowing the underlying
    topology?
  • Distance info desired: report congestion/failure
    if it occurs, otherwise latency
  • Our approach
  • Cluster hosts that perceive similar performance
    to a small set of sites (landmarks)
  • For each cluster, select a monitor for active and
    continuous probing
  • Estimate distance between any pair of hosts using
    inter- and intra-cluster distance
  • Performance evaluation
  • Using real Internet measurement data
  • Compared with other distance estimation services:
    GNP, RON
  • Performance metrics: accuracy and stability

22
Internet Iso-bar (II): Distance Estimation
  • Congestion/failures analysis
  • Congestion/failures (uniformly denoted as
    congestion) are not common
  • Defined as measurement loss or (latency > geo
    mean + geo stdev)
  • Only 0.96% of 105M NLANR ping measurements
    over a week
  • Suggests a few congested links dominate the E2E
    congestion
  • Apart from congestion at the last mile, E2E
    congestion exhibits strong spatial correlation
  • Estimation algorithms
  • Intra-cluster estimation (i and j use the same
    monitor m)
  • If path(m, i) or path(m, j) is congested, report
    path(i, j) as congested
  • Otherwise, predictedDist(i,j) = (measuredDist(m, i)
    + measuredDist(m, j)) / 2
  • Inter-cluster distance estimation
  • If path(monitor_i, i), path(monitor_i, monitor_j) or
    path(monitor_j, j) is congested, report path(i, j)
    as congested
  • Otherwise, predictedDist(i,j) =
    measuredDist(monitor_i, monitor_j)
  • Self-diagnostics of monitors: check for last-mile
    congestion