Title: Internet Distance Estimation through Embedding: from Euclidean to Hyperbolic spaces
1Internet Distance Estimation through Embedding: from Euclidean to Hyperbolic spaces
- Yuval Shavitt and Tomer Tankel
- School of Electrical Engineering
2Why Distance Estimation?
- Many applications can improve performance
- WWW (closest mirror selection)
- Peer-to-peer downloads
- Application layer routing (App Layer Mcast)
- Why
- Shorter RTT → faster communication
- Optimizing Map discovery algorithms
- Sense of neighborhood
June 2004
Yuval Shavitt
3The Mirror Selection Problem
4The Closest Mirror Selection Problem
- Given a set of servers, $s_i$, $1 \le i \le K$,
- and a client, $c$, find a server $s_j$,
- s.t. $\forall i,\ 1 \le i \le K:\ d(s_j, c) \le \alpha\, d(s_i, c) + \beta$
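The selection rule can be sketched in a few lines. This is a minimal illustration, assuming the acceptance condition is d(s_j, c) ≤ α·d(s_i, c) + β for every server s_i, with α and β as slack parameters. The function names and the RTT tables are hypothetical and are not taken from the talk.

```python
def select_mirror(servers, client, est_dist):
    """Pick the server with the smallest estimated distance to the client."""
    return min(servers, key=lambda s: est_dist(s, client))


def is_acceptable(chosen, servers, client, true_dist, alpha=1.0, beta=0.0):
    """Check d(chosen, c) <= alpha * d(s_i, c) + beta for every server s_i."""
    d_chosen = true_dist(chosen, client)
    return all(d_chosen <= alpha * true_dist(s, client) + beta for s in servers)


# Hypothetical data: true RTTs in ms, and imperfect estimates of them.
true_rtt = {"s1": 40.0, "s2": 55.0, "s3": 120.0}
est_rtt = {"s1": 60.0, "s2": 50.0, "s3": 110.0}

# The estimates mislead the client into picking s2 instead of s1.
chosen = select_mirror(true_rtt, "c", lambda s, c: est_rtt[s])
```

With these numbers the chosen mirror is not the true closest one, so it fails the strict test (α = 1, β = 0) but passes once some slack is allowed (for example α = 1.5).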
5How to Get Distance Estimation?
- 1. Client will measure
- cost? billions of measurements!
- 2. Server will measure
- cost? still too many measurements
- 3. IDMaps
- a cost effective ubiquitous service
6What is IDMaps?
- An Internet-wide infrastructure to collect and disseminate distance information.
- Distance metrics: hop count, round trip delay, minimum bandwidth, propagation delay, etc.
- Participants: Sugih Jamin, Paul Francis, Danny Raz, Yuval Shavitt, Lixia Zhang
7IDMaps components
- Tracers: autonomous instrumentation boxes.
- Address Prefixes (APs): the measurement granularity
8Distance Measurements
- Tracers measure distance to each other.
- Tracers measure distance to APs
- Each distance measurement is called a virtual link (VL).
- A distance map is a graph of Tracers, APs, and VLs.
9First Solution
- The original IDMaps solution [ToN01] sums segment distances to estimate end-to-end distance.
- No sense of geometry
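Summing segment distances can be sketched as below. The sketch assumes each AP is attached to a single Tracer and that the estimate is AP-to-Tracer plus Tracer-to-Tracer plus Tracer-to-AP. The data layout, names, and numbers are hypothetical.

```python
def idmaps_estimate(src_ap, dst_ap, ap_to_tracer, tracer_to_tracer):
    """Estimate AP-to-AP distance by summing virtual-link (VL) lengths."""
    t_src, d_src = ap_to_tracer[src_ap]
    t_dst, d_dst = ap_to_tracer[dst_ap]
    if t_src == t_dst:
        # Both APs hang off the same Tracer: only two VLs are summed.
        return d_src + d_dst
    return d_src + tracer_to_tracer[frozenset((t_src, t_dst))] + d_dst


# Hypothetical distance map: AP -> (its Tracer, VL length in ms).
ap_to_tracer = {"A": ("T1", 5.0), "B": ("T1", 7.0), "C": ("T2", 4.0)}
tracer_to_tracer = {frozenset(("T1", "T2")): 30.0}

est_ac = idmaps_estimate("A", "C", ap_to_tracer, tracer_to_tracer)
est_ab = idmaps_estimate("A", "B", ap_to_tracer, tracer_to_tracer)
```

The estimate for A and B is always 12 ms here, whether the two APs are actually 2 ms or 12 ms apart. That is the missing sense of geometry.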
(Figure: Tracers T1 and T2)
10Problem statement
(Figure: Tracers T1 and T2 with nodes A, B, and C)
11Problem statement
(Figure: Tracers T1 and T2 with nodes A, B, and C)
12Embedding Solutions
- [Ng and Zhang, 02] suggest embedding the graph in d dimensions.
- Use downhill simplex (DHS)
- Slow to converge even for 15 Tracers
- Not accurate, high max/var distortion
- We present a better algorithm for this calculation
Distortion = max(real distance / computed distance, computed distance / real distance)
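The distortion measure can be computed per node pair and then summarized. This is a small sketch; the function names and the sample distances are hypothetical.

```python
import statistics


def pair_distortion(real, computed):
    """Symmetric distortion: always >= 1, and equal to 1 for a perfect estimate."""
    return max(real / computed, computed / real)


def distortion_summary(real_d, computed_d):
    """Return (max, mean) distortion over all node pairs in real_d."""
    values = [pair_distortion(real_d[p], computed_d[p]) for p in real_d]
    return max(values), statistics.mean(values)


# Hypothetical distances keyed by node pair.
real_d = {("a", "b"): 10.0, ("a", "c"): 20.0}
computed_d = {("a", "b"): 12.5, ("a", "c"): 16.0}

worst, average = distortion_summary(real_d, computed_d)
```

Over-estimating by 25% and under-estimating by a factor of 1.25 yield the same distortion, so the measure penalizes both directions equally.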
13Basic Idea
- In our model
- Particles: network nodes (Tracers, clients)
- Inter-particle force, friction: difference between real and estimated distance
- Kinetic energy drives particles out of local minima of the error function
- Simulation step is set dynamically for numerical efficiency
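A toy version of the particle model is sketched below. It is not the authors' simulation. It assumes a squared-error potential, a constant friction coefficient, Euclidean space, and a fixed time step, whereas the talk sets the step dynamically. All names and parameter values are hypothetical.

```python
import math
import random


def embedding_error(pos, target):
    """Sum of squared differences between embedded and target distances."""
    n = len(pos)
    return sum(
        (math.dist(pos[i], pos[j]) - target[i][j]) ** 2
        for i in range(n) for j in range(i + 1, n)
    )


def simulate(target, dim=2, steps=500, dt=0.1, friction=0.2, seed=0):
    """Move particles under error-driven forces, with friction damping velocity."""
    rng = random.Random(seed)
    n = len(target)
    pos = [[rng.random() for _ in range(dim)] for _ in range(n)]
    vel = [[0.0] * dim for _ in range(n)]
    for _ in range(steps):
        force = [[0.0] * dim for _ in range(n)]
        for i in range(n):
            for j in range(n):
                if i == j:
                    continue
                d = math.dist(pos[i], pos[j]) or 1e-9  # avoid division by zero
                err = d - target[i][j]  # > 0: too far apart, pull together
                for k in range(dim):
                    force[i][k] -= err * (pos[i][k] - pos[j][k]) / d
        for i in range(n):
            for k in range(dim):
                # Friction removes a fraction of the velocity at every step.
                vel[i][k] = (1.0 - friction) * vel[i][k] + dt * force[i][k]
                pos[i][k] += dt * vel[i][k]
    return pos


# Target distances taken from the corners of a unit square (hypothetical input).
corners = [(0, 0), (1, 0), (1, 1), (0, 1)]
target = [[math.dist(p, q) for q in corners] for p in corners]
final_pos = simulate(target)
```

Because velocity carries over between steps, a particle can coast through a shallow local minimum that a pure gradient step would settle in.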
22The Internet looks like a Jellyfish [Faloutsos]
- Distances between nodes in tentacles are distorted in geo. embedding
- We need to bend the routes through the center
23Our Main Idea
Embed the graph in hyperbolic space
Example: Poincaré disk D2
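For reference, the standard distance formula in the Poincaré disk model (curvature -1) is d(u, v) = arcosh(1 + 2|u - v|² / ((1 - |u|²)(1 - |v|²))). The sketch below evaluates it. The sample points are hypothetical, and the curvature scaling used in the talk is not modeled.

```python
import math


def poincare_distance(u, v):
    """Hyperbolic distance between two points strictly inside the unit disk."""
    du = 1.0 - sum(x * x for x in u)
    dv = 1.0 - sum(x * x for x in v)
    diff = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.acosh(1.0 + 2.0 * diff / (du * dv))


# Two points near the rim and the origin (hypothetical coordinates).
a, b, o = (0.9, 0.0), (0.0, 0.9), (0.0, 0.0)

direct = poincare_distance(a, b)
via_origin = poincare_distance(a, o) + poincare_distance(o, b)
euclid_ratio = math.dist(a, b) / (math.dist(a, o) + math.dist(o, b))
```

For points near the rim, the direct hyperbolic distance is much closer to the detour through the origin than it is in the Euclidean plane. This matches routes between peripheral nodes that pass through the network core.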
24Embedding Example in D2
25Curvature and Distances Ratio
26Metric Curvature Equation
- Arranging points A, B, C, and D symmetrically around the hyperbolic origin and equating distances to the scaled metric, we get
- $\cosh(C_{max}) - 2\cosh(C_{max} \cdot AB/AC) + 1 = 0$
- Equation for a metric without distortion
- $\cosh(C_{max}) - 2\cosh\left(C_{max}\,\frac{r+2}{2r+2}\right) + 1 = 0$
- where $r = b/a$ is the ratio between edges of the graph.
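The curvature equation has no closed-form solution in general, but it can be solved numerically. The sketch below assumes the equation has the form cosh(C) - 2·cosh(C·q) + 1 = 0 with q = AB/AC. This form follows from the hyperbolic law of cosines for four points placed at right angles around the origin, but it is a reconstruction and not a formula confirmed by the slides. The function names and the bisection bracket are hypothetical.

```python
import math


def curvature_residual(c, q):
    """Left-hand side of the assumed equation cosh(c) - 2*cosh(c*q) + 1 = 0."""
    return math.cosh(c) - 2.0 * math.cosh(c * q) + 1.0


def solve_cmax(q, lo=0.1):
    """Find a positive root by bisection; valid for 1/sqrt(2) < q < 1."""
    if not (1.0 / math.sqrt(2.0) < q < 1.0):
        raise ValueError("q must lie strictly between 1/sqrt(2) and 1")
    if curvature_residual(lo, q) >= 0.0:
        raise ValueError("root is below the lower bracket; decrease lo")
    hi = 2.0 * lo
    while curvature_residual(hi, q) <= 0.0:  # grow bracket until sign changes
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if curvature_residual(mid, q) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


c_max = solve_cmax(0.8)
```

As q approaches 1/sqrt(2), the flat-plane ratio between the side and the diagonal of a square, the root tends to zero. This is consistent with a Euclidean metric needing no curvature.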
27Metric Curvature Graph
28Embedding Methods
- All pair (AP): not scalable
- Embed n-node metric, n(n-1)/2 distance pairs, at once.
- Two phase (TP)
- Embed small subset of t Tracers, t(t-1)/2 distance pairs.
- For each of the other nodes, embed its distances to several nearest Tracers.
- Random Neighbors (RN)
- Embed with distances to
- The 1-neighborhood
- Order of log(n) peer nodes, selected uniformly at random.
- No fixed Tracers
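The measurement sets used by these methods can be sketched as follows. The sketch assumes the natural logarithm with a tunable constant `c` for the "order of log(n)" sample size, and `k` nearest Tracers per node in the two-phase count. These choices and all names are hypothetical.

```python
import math
import random


def random_neighbor_set(node, adjacency, rng, c=1.0):
    """1-neighborhood of node plus about c*log(n) peers sampled uniformly."""
    nodes = list(adjacency)
    k = max(1, math.ceil(c * math.log(len(nodes))))
    direct = set(adjacency[node])
    others = [v for v in nodes if v != node and v not in direct]
    sampled = rng.sample(others, min(k, len(others)))
    return direct | set(sampled)


def pair_counts(n, t, k):
    """Distance pairs to embed: all-pairs vs. two-phase with k Tracers per node."""
    return {
        "all_pairs": n * (n - 1) // 2,
        "two_phase": t * (t - 1) // 2 + (n - t) * k,
    }


# Hypothetical 16-node ring topology.
ring = {i: [(i - 1) % 16, (i + 1) % 16] for i in range(16)}
peers = random_neighbor_set(0, ring, random.Random(0))
counts = pair_counts(n=1000, t=15, k=5)
```

For 1000 nodes, the all-pairs method embeds 499500 distances, while the two-phase method with 15 Tracers and 5 Tracers per node embeds 5030.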
29Rand. Neigh. vs. Two Phase
- Random Neighbors
- Symmetric
- Central calculation?
- Equally accurate for all distances.
- Suitable for P2P
- How to calculate distributively?
- Two Phase
- Non-symmetric
- Distributed
- Over-estimation of short distances
- Sensitive to Tracer failure
30All-Pairs Embedding for AS Graph 1/00
31Two-Phase Embedding for AS Graph 1/00
32Two-Phase Embedding Rel. Error AS 1/00
Rel. Error = (Est - Real) / min(Est, Real)
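The relative error used in these plots is signed, and dividing by the smaller value makes it symmetric between over-estimation and under-estimation. This is a minimal sketch; the function name is hypothetical.

```python
def relative_error(est, real):
    """Signed relative error: positive for over-estimates, negative for under-estimates."""
    return (est - real) / min(est, real)


over = relative_error(200.0, 100.0)   # estimate is twice the real distance
under = relative_error(50.0, 100.0)   # estimate is half the real distance
```

An estimate that is off by a factor of two gives an error of magnitude 1 in either direction, unlike (Est - Real) / Real, which caps under-estimates at -1 while leaving over-estimates unbounded.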
33Random Neighbors Embedding Rel. Error
34Random Neighbors Embedding Rel. Error AS 3/01
35Random Neighbors Embedding Rel. Error AS 3/01
37Iterative Geometric Tree Algorithm
- Span the space outwards from core nodes at origin
- Join all nodes according to following rules
- Parent closer to origin
- Stretch limit: Parent-Child < Child-origin + ΔTh.
- Search nearest parent at each tree level, starting from group of orphans
- Limit degree of candidate parent
- Rewire based on nodes' Geometric Stretch (GS)
- Initiate rewire only for nodes with large GS
- Rewire only with significant GS improvement
- Until stabilizing majority of nodes
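The join phase of such a tree can be sketched as a greedy procedure. This is not the authors' algorithm. It uses Euclidean distance as a stand-in for hyperbolic distance, reads the stretch limit as dist(parent, child) < dist(child, origin) + threshold, counts only children toward the degree limit, and omits the rewiring phase. All names and parameter values are hypothetical.

```python
import math


def join_tree(points, max_degree=3, delta_th=0.5):
    """Greedy join: points[0] is assumed to be the core node at the origin."""
    origin = points[0]
    radius = [math.dist(p, origin) for p in points]
    # Span outwards: nodes nearer the origin join first, so every node that
    # has already joined is at least as close to the origin as the new node.
    order = sorted(range(1, len(points)), key=lambda i: radius[i])
    parent = {0: None}
    degree = {0: 0}  # number of children per joined node
    for i in order:
        joined = list(parent)
        candidates = [
            j for j in joined
            if degree[j] < max_degree
            and math.dist(points[i], points[j]) < radius[i] + delta_th
        ]
        if not candidates:
            candidates = joined  # fallback (an assumption): relax both limits
        best = min(candidates, key=lambda j: math.dist(points[i], points[j]))
        parent[i] = best
        degree[best] += 1
        degree[i] = 0
    return parent


# Hypothetical coordinates; index 0 is the core node.
pts = [(0, 0), (1, 0), (2, 0), (0, 1), (0, 2), (-1, 0), (-2, 0.5)]
tree = join_tree(pts)
```

In this example the core fills its three child slots with the inner ring, so the outer nodes attach to the nearest inner node instead of to the core.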
423D Hyperbolic IGT
(Figure: 3D Hyperbolic Multicast Tree, AS 1/00 graph, 800 LANCORE members; Complementary Frequency vs. Relative Delay Penalty (RDP))
432D Hyperbolic IGT
(Figure: 2D Hyperbolic Multicast Tree, AS 1/00 graph, 800 LANCORE members; Complementary Frequency vs. Relative Delay Penalty (RDP))
44AS 1/00 Multicast Trees with Two-Phase Curvatures
(Figure panels: Stretch, Stress)
45Algorithm Performance
- Tested on
- real AS graphs
- GTech generated graphs
- Low stretch
- Medium stress
46GTech Multicast Group Size 100-3200 Two-Phase
47GTech Multicast Group Size 100-3200 Random Neighbors
48Conclusions and Further Study
- Hyperbolic space improves embedding for all methods (e.g., GNP) and all curvatures.
- Closest mirror selection results.
- Scalable, symmetric, and distributed Tree Construction
- Future work
- Features of HYP Coordinates
- Density vs. Origin Distance
- Degree vs. Origin Distance