On the Power of Off-line Data in Approximating Internet Distances

About This Presentation

Provided by: Timel8

Learn more at: http://web.cse.ohio-state.edu

Transcript and Presenter's Notes

1
On the Power of Off-line Data in Approximating
Internet Distances

Danny Raz (danny_at_cs.technion.ac.il)
Technion - Israel Institute of Technology
and
Prasun Sinha (prasunsinha_at_lucent.com)
Bell Labs., Lucent Technologies

2
Outline

Internet Distance
Off-line metrics
Geographic distance, hops, AS, depth
Linear regression for Internet distance estimation
Multi-variable linear regression
Accuracy of picking closest mirror site
The next step

3
Internet Distance

Internet Distance: one-way delay between hosts
Components of Internet Distance
Dynamic
Server load
Network congestion / router load
Static
Propagation delay over the links
Router processing delay
Edge-router processing delay

Goal: to study the power of estimating the static Internet Distance using off-line metrics
4
Importance of Internet Distance Estimation

Picking closest mirror-site/cache
For use in Content Distribution Networks

5
Approaches

Dynamic
Dynamic probing (Dykes et al., Infocom '00)
Passive monitoring (Andrews et al., Infocom '02)
Static
Semi-active probing (IDMaps) (Jamin et al., Infocom '00)
Other relevant work
Geographic distance and RTT (Padmanabhan, Sigcomm '02)

6
Static Internet Distance
[Figure: network diagram showing AS 1, AS 2, AS 3, core routers, and edge routers]

Propagation delay corresponds to geographical distance
Router processing delay corresponds to hops
Edge-router processing delay corresponds to AS

AS: Autonomous System
Static Internet Distance ≈ α · geo-distance + β · hop-count + γ · AS-count + δ
7
Data Collection

Clients: 2500 public libraries in the US
Servers (mirrors/caches): 8 traceroute locations in the US
The location (latitude, longitude) is known for every host.
For every client-server pair:
Run multiple (10) traceroutes
Pick the traceroute result with the smallest RTT
Compute:
Geo-distance, based on latitude and longitude
Hop-count, from traceroute
AS-count, from traceroute, based on router names and IP address prefixes
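
The two purely computational steps on this slide can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes geo-distance means great-circle (haversine) distance, and the run dictionaries with `rtt_ms` and `hops` keys are a hypothetical data layout.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed constant)


def geo_distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # Clamp against floating-point overshoot before taking asin.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def best_run(runs):
    """Return the traceroute run with the smallest RTT.

    Each run is a hypothetical dict with 'rtt_ms' and 'hops' keys.
    """
    return min(runs, key=lambda run: run["rtt_ms"])


# Illustrative values only, not data from the study.
runs = [{"rtt_ms": 30.0, "hops": 12}, {"rtt_ms": 25.5, "hops": 13}]
chosen = best_run(runs)
```

Taking the minimum over repeated runs is what filters out the dynamic components (congestion, load), leaving an approximation of the static distance.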

8
Linear Regression (Geo-distance and Hop-count)
minRTT vs. Geo-distance: SE (Std. Error) = 26.93
minRTT vs. Hop-count: SE (Std. Error) = 25.71
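
A single-variable fit of this kind can be sketched with ordinary least squares. The slides do not define SE precisely; the sketch assumes it is the standard error of the regression, sqrt(SSE / (n - 2)). Function names are illustrative.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def standard_error(xs, ys, slope, intercept):
    """Standard error of the regression: sqrt(SSE / (n - 2)).

    Assumption: this is what the slides call SE.
    """
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return math.sqrt(sse / (len(xs) - 2))
```

Here `xs` would hold one off-line metric (geo-distance or hop-count) per client-server pair and `ys` the corresponding minRTT values.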
9
Multiple Linear Regression (Multiple metrics)
minRTT vs. Geo-distance, Hop-count: SE = 21.52
minRTT vs. Geo-distance, AS-count: SE = 23.80
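
With more than one predictor, the same least-squares fit can be obtained from the normal equations. The sketch below is a plain-Python illustration under two assumptions: the model includes an intercept, and SE is sqrt(SSE / (n - k)) with k fitted coefficients. It is not the tool the authors used.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_multi(rows, ys):
    """Fit y = c0 + c1*x1 + c2*x2 + ...; returns [c0, c1, c2, ...]."""
    design = [[1.0] + list(row) for row in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(k)]
    return solve(xtx, xty)


def standard_error(rows, ys, coefs):
    """sqrt(SSE / (n - k)), k = number of fitted coefficients (assumed SE)."""
    sse = 0.0
    for row, y in zip(rows, ys):
        predicted = coefs[0] + sum(c * v for c, v in zip(coefs[1:], row))
        sse += (y - predicted) ** 2
    return math.sqrt(sse / (len(ys) - len(coefs)))
```

Each entry of `rows` would be a tuple such as (geo-distance, hop-count) for one client-server pair.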
10
minRTT ≈ α · geo-distance + β · hop-count + γ · AS-count + δ

Term          Coefficient   p-value
Geo-distance  12.53 (α)     <0.0001
Hop-count      2.45 (β)     <0.0001
AS-count      -0.64 (γ)      0.0387

High correlation between hop-count and AS-count (the highest among all pairs of metrics)
Hop-count and AS-count should not be used together
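
One way to check which predictors are too correlated to use together is to compute the Pearson correlation for every pair of metrics. The slides do not say which correlation measure was used, so Pearson is an assumption here, and the sample values are made up for illustration.

```python
import math
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def most_correlated_pair(metrics):
    """Return the pair of metric names with the largest |correlation|."""
    return max(
        combinations(metrics, 2),
        key=lambda pair: abs(pearson(metrics[pair[0]], metrics[pair[1]])),
    )


# Made-up sample values, one entry per client-server pair.
metrics = {
    "hop_count": [10, 12, 15, 18, 20],
    "as_count": [2, 3, 3, 4, 5],
    "geo_distance": [500, 100, 900, 300, 700],
}
pair = most_correlated_pair(metrics)
```

When two predictors move together this closely, their individual regression coefficients become unstable, which is consistent with the small negative AS-count coefficient in the table.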

11
A New Off-line Metric: Depth

Hop-count requires dynamic probing
Introduce an alternative metric: Depth
Average hop-count to the nearest backbone network (from a hand-made list of 30 big core networks)
Constant per host (client/server)
Alternatively, measure in units of time rather than hops
(Client depth + Server depth) as a metric
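
The Depth definition above can be sketched as follows. Several details are assumptions: each path is a list of network names, one per hop, counted from 1; the backbone list is a set of names; and the per-pair metric is the sum of the two depths. The network names are hypothetical.

```python
def hops_to_backbone(path, backbone):
    """Hops until the path first reaches a backbone network, or None."""
    for hop_number, network in enumerate(path, start=1):
        if network in backbone:
            return hop_number
    return None


def depth(paths, backbone):
    """Average hop-count to the nearest backbone network over several paths."""
    counts = [hops_to_backbone(path, backbone) for path in paths]
    counts = [count for count in counts if count is not None]
    if not counts:
        raise ValueError("no path reaches a backbone network")
    return sum(counts) / len(counts)


def pair_depth(client_depth, server_depth):
    """Per-pair metric, assumed here to be the sum of the two depths."""
    return client_depth + server_depth


# Hypothetical backbone list and paths, for illustration only.
backbone = {"BackboneA", "BackboneB"}
client_paths = [
    ["library-lan", "local-isp", "BackboneA"],
    ["library-lan", "BackboneB", "other-net"],
]
client_depth = depth(client_paths, backbone)
```

Because the depth is a constant per host, it can be measured once and reused for every client-server pair, which is what makes the metric off-line.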

12
Linear Regression (Depth)
minRTT vs. Depth: SE = 41.02
minRTT vs. Depth and Geo-distance: SE = 24.52
13
Squared Errors in Estimating minRTT
Metric                    SE (Standard Error)
Geo-distance, Hop-count   21.52
Geo-distance, AS-count    23.80
Geo-distance, Depth       24.52
Hop-count                 25.71
Geo-distance              26.93
Depth                     41.02
14
Accuracy of picking the nearest mirror site
Allowed Delta   Random   Geo-distance   Hop-count   Geo-distance, Hop-count   Geo-distance, Depth
0 ms            12.50    37.84          44.32       38.41                     33.98
10 ms           21.15    53.07          58.98       55.91                     50.45
20 ms           33.75    73.18          76.70       74.89                     70.91
30 ms           46.25    90.91          88.75       91.36                     89.43
880 clients and 8 servers; values are accuracy in percent
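
The accuracy figures can be read as follows, under one assumed interpretation of "allowed delta": a selection counts as correct when the measured minRTT of the chosen server is within the delta of the best server's minRTT. The function and argument names are illustrative.

```python
def selection_accuracy(estimated, actual, allowed_delta_ms):
    """Percent of clients whose chosen mirror is within the allowed delta.

    estimated[c][s]: estimated distance from client c to server s
    actual[c][s]:    measured minRTT (ms) from client c to server s
    """
    hits = 0
    for est_row, act_row in zip(estimated, actual):
        # Choose the server that the off-line estimate ranks as closest.
        chosen = min(range(len(est_row)), key=est_row.__getitem__)
        if act_row[chosen] - min(act_row) <= allowed_delta_ms:
            hits += 1
    return 100.0 * hits / len(actual)
```

With 8 servers, a uniformly random choice is exactly right one time in eight, which matches the 12.50 in the Random column at zero delta.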
15
Summary

Combination of hop-count and geographic distance improves over individual metrics
Using Depth along with Geo-distance improves performance and is completely off-line
For closest mirror selection with 30 ms allowed deviation, almost any metric gives 90% accuracy

Is there much room to improve?
16
The Next Step

Global data
Collection and analysis of data based on clients and servers spread across the globe
Using both off-line and on-line
Techniques to combine the power of off-line estimation with on-line estimation
