Sylvia Ratnasamy, Paul Francis, Mark Handley, Richard Karp, Scott Shenker

1
A Scalable, Content-Addressable Network

Sylvia Ratnasamy, Paul Francis, Mark Handley,
Richard Karp, Scott Shenker

Tahoe Networks
U.C. Berkeley
ACIRI

2
Outline

Introduction
Design
Evaluation
Strengths & Weaknesses

3
Internet-scale hash tables

Hash tables
essential building block in software systems
Internet-scale distributed hash tables
equally valuable to large-scale distributed
systems?
peer-to-peer systems
Napster, Gnutella, Groove, FreeNet, MojoNation
large-scale storage management systems
Publius, OceanStore, PAST, Farsite, CFS ...
mirroring on the Web

4
Content-Addressable Network (CAN)

CAN: Internet-scale hash table
Interface
insert(key,value)
value = retrieve(key)
Properties
scalable
operationally simple
good performance
Related systems: Chord/Pastry/Tapestry/Buzz/Plaxton ...

5
Problem Scope

Design a system that provides the interface
scalability
robustness
performance
security
Application-specific, higher level primitives
keyword searching
mutable content
anonymity

6
Outline

Introduction
Design
Evaluation
Strengths & Weaknesses
Ongoing Work

7
CAN basic idea
8
CAN basic idea
insert(K1,V1)
9
CAN basic idea
insert(K1,V1)
10
CAN basic idea
(K1,V1)
11
CAN basic idea
retrieve (K1)
12
CAN solution

virtual Cartesian coordinate space
entire space is partitioned amongst all the nodes
every node owns a zone in the overall space
abstraction
can store data at points in the space
can route from one point to another
point = node that owns the enclosing zone
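
The abstraction above (every point in the space belongs to the node that owns the enclosing zone) can be sketched as follows. This is a minimal illustration, assuming a 2-d unit square and rectangular zones; the `Zone` class, `owner_of`, and the node names A to D are hypothetical and not from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Zone:
    """Axis-aligned rectangle [x0, x1) x [y0, y1) in the unit square (assumed shape)."""
    x0: float
    y0: float
    x1: float
    y1: float

    def contains(self, point):
        x, y = point
        return self.x0 <= x < self.x1 and self.y0 <= y < self.y1


def owner_of(point, zones):
    """Return the name of the node whose zone encloses the point."""
    for node, zone in zones.items():
        if zone.contains(point):
            return node
    raise KeyError(f"no zone covers {point}")


# Four hypothetical nodes partition the whole space into quadrants.
zones = {
    "A": Zone(0.0, 0.0, 0.5, 0.5),
    "B": Zone(0.5, 0.0, 1.0, 0.5),
    "C": Zone(0.0, 0.5, 0.5, 1.0),
    "D": Zone(0.5, 0.5, 1.0, 1.0),
}
```

A real CAN node never holds the full `zones` map; it only knows its own zone and its neighbors' zones, which is why lookups are routed hop by hop.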

13
CAN simple example
node I: insert(K,V)
I
14
CAN simple example
node I: insert(K,V)
I
(1) a = hx(K)
x = a
15
CAN simple example
node I: insert(K,V)
I
(1) a = hx(K), b = hy(K)
y = b
x = a
16
CAN simple example
node I: insert(K,V)
I
(1) a = hx(K), b = hy(K)
(2) route(K,V) -> (a,b)
17
CAN simple example
node I: insert(K,V)
I
(1) a = hx(K), b = hy(K)
(K,V)
(2) route(K,V) -> (a,b)
(3) (a,b) stores (K,V)
18
CAN simple example
node J: retrieve(K)
(1) a = hx(K), b = hy(K)
(K,V)
(2) route retrieve(K) to (a,b)
J
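
The insert and retrieve steps above can be sketched in a single process. This is only an illustration under assumptions: the slides do not say which hash functions hx and hy are, so salted SHA-256 stands in for them, and the `ToyCAN` class replaces routing with a direct lookup over equal-width column zones.

```python
import hashlib


def _unit_hash(key, salt):
    """Map a key to a float in [0, 1] using salted SHA-256 (an assumed choice)."""
    digest = hashlib.sha256((salt + ":" + key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def key_to_point(key):
    """(a, b) = (hx(K), hy(K)): two independent hashes give the coordinates."""
    return (_unit_hash(key, "x"), _unit_hash(key, "y"))


class ToyCAN:
    """Hypothetical stand-in: node i owns the i-th vertical strip of the space."""

    def __init__(self, n_nodes):
        self.stores = [dict() for _ in range(n_nodes)]

    def _owner(self, point):
        a, _b = point
        # min() guards the edge case where the hash rounds to exactly 1.0.
        return min(int(a * len(self.stores)), len(self.stores) - 1)

    def insert(self, key, value):
        # Steps (1)-(3): hash the key, find the owner of (a, b), store there.
        self.stores[self._owner(key_to_point(key))][key] = value

    def retrieve(self, key):
        # Any node can recompute (a, b) from the key alone.
        return self.stores[self._owner(key_to_point(key))].get(key)
```

Because the point is derived from the key, node J needs no knowledge of which node performed the insert.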
19
CAN

Data stored in the CAN is addressed by name
(i.e. key), not location (i.e. IP address)

20
CAN routing table
21
CAN routing
(a,b)
(x,y)
22
CAN routing

A node only maintains state for its immediate
neighboring nodes
Compared to geographic routing:
CAN routing can be viewed as greedy forwarding in
Cartesian coordinate space instead of physical space.
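
Greedy forwarding with neighbor-only state can be sketched as below. The setup is an assumption made for brevity: a k x k grid of equal square zones without wrap-around, with each node identified by its grid cell; `make_grid`, `center`, `owns`, and `greedy_route` are illustrative names.

```python
import math


def make_grid(k):
    """Return {cell: [adjacent cells]} for a k x k grid of equal square zones."""
    neighbors = {}
    for i in range(k):
        for j in range(k):
            adj = [(i + di, j + dj) for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1))]
            neighbors[(i, j)] = [(a, b) for a, b in adj if 0 <= a < k and 0 <= b < k]
    return neighbors


def center(cell, k):
    i, j = cell
    return ((i + 0.5) / k, (j + 0.5) / k)


def owns(cell, point, k):
    i, j = cell
    return i / k <= point[0] < (i + 1) / k and j / k <= point[1] < (j + 1) / k


def greedy_route(start, point, k, neighbors):
    """Forward to the neighbor closest to the destination point until it is owned."""
    path = [start]
    current = start
    while not owns(current, point, k):
        if len(path) > k * k:
            raise RuntimeError("routing did not converge")
        # Each node consults only its own neighbor list.
        current = min(neighbors[current], key=lambda c: math.dist(center(c, k), point))
        path.append(current)
    return path
```

Each hop uses only the current node's neighbor list, so per-node state stays constant as the grid grows.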

23
CAN node insertion
Bootstrap node
new node
1) Discover some node I already in CAN
24
CAN node insertion
I
new node
1) discover some node I already in CAN
25
CAN node insertion
(p,q)
2) pick random point in space
I
new node
26
CAN node insertion
(p,q)
J
I
new node
3) I routes to (p,q), discovers node J
27
CAN node insertion
new
J
4) split J's zone in half; new node owns one half
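
Steps 2 to 4 of the join can be sketched as follows. Several details are assumptions made for illustration: zones are `(x0, y0, x1, y1)` tuples, the split runs along the longer side, the new node always receives the second half, and a linear scan stands in for routing to (p,q).

```python
def in_zone(zone, point):
    x0, y0, x1, y1 = zone
    return x0 <= point[0] < x1 and y0 <= point[1] < y1


def split_zone(zone):
    """Split a zone in half along its longer side (ties split along x)."""
    x0, y0, x1, y1 = zone
    if (x1 - x0) >= (y1 - y0):
        mid = (x0 + x1) / 2
        return (x0, y0, mid, y1), (mid, y0, x1, y1)
    mid = (y0 + y1) / 2
    return (x0, y0, x1, mid), (x0, mid, x1, y1)


def join(zones, stores, new_node, point):
    """zones: node -> zone; stores: node -> {key: (point, value)}.

    Finds the owner J of the random point, splits J's zone, and hands the
    new node the entries that fall in its half. Returns J's name.
    """
    owner = next(n for n, z in zones.items() if in_zone(z, point))
    kept, given = split_zone(zones[owner])
    zones[owner], zones[new_node] = kept, given
    stores[new_node] = {k: e for k, e in stores[owner].items() if in_zone(given, e[0])}
    for k in stores[new_node]:
        del stores[owner][k]
    return owner
```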
28
CAN node insertion

Inserting a new node affects only a single other
node and its immediate neighbors
Problem
Inefficient if the new node and its neighbor (J)
are far away from each other in terms of
communication cost.

29
CAN node failures

Need to repair the space
recover database (weak point)
soft-state updates
use replication, rebuild database from replicas
repair routing
takeover algorithm

30
CAN takeover algorithm

Simple failures
know your neighbors' neighbors
a node periodically broadcasts its zone
coordinates and a list of its neighbors and their
zone coordinates.
when a node fails, one of its neighbors takes
over its zone
a self-set timer decides which neighbor takes
over.
More complex failure modes
simultaneous failure of multiple adjacent nodes
scoped flooding to discover neighbors
hopefully, a rare event
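
The timer-based choice of a takeover neighbor can be sketched as below. The slide only says that a self-set timer decides; this sketch assumes, following the CAN paper, that each neighbor sets its timer in proportion to its own zone volume, so the neighbor with the smallest zone fires first. The constant `seconds_per_unit_volume` and the function names are illustrative.

```python
def zone_volume(zone):
    x0, y0, x1, y1 = zone
    return (x1 - x0) * (y1 - y0)


def takeover_winner(neighbor_zones, seconds_per_unit_volume=10.0):
    """Return (winner, timers) for the neighbors of a failed node.

    neighbor_zones: node -> (x0, y0, x1, y1). The neighbor whose timer
    expires first (smallest zone, by assumption) takes over the zone.
    """
    timers = {
        node: seconds_per_unit_volume * zone_volume(zone)
        for node, zone in neighbor_zones.items()
    }
    winner = min(timers, key=timers.get)
    return winner, timers
```

Favoring the smallest zone also nudges the partitioning back toward uniform zone volumes after a failure.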

31
CAN node failures

Only the failed node's immediate neighbors are
required for recovery

32
Design recap

Basic CAN
completely distributed
self-organizing
nodes only maintain state for their immediate
neighbors
Comment
the basic CAN does not perform very well; additional
design features are necessary

33
Design improvements

The neighbor relationship in the coordinate space
may be completely different from that in the
underlying IP network.
How can the coordinate space be mapped approximately
onto the physical network?
Topologically-sensitive CAN construction
distributed binning

34
Distributed Binning

Goal
bin nodes such that co-located nodes land in the same
bin
neighbors in the coordinate space are likely to be
close in the IP network
reduce per-hop latency, prevent overlay-network
routing anomalies
Idea
a well-known set of landmark machines
each CAN node measures its RTT to each landmark
and orders the landmarks by increasing RTT
CAN construction
place nodes from the same bin close together on
the CAN
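
The binning step can be sketched as follows: a node's bin is simply its ordering of the landmarks by increasing RTT, so nodes with the same ordering are grouped together. The landmark names, RTT values, and function names here are hypothetical.

```python
def landmark_bin(rtts_ms):
    """Return the landmark names ordered by increasing RTT; the ordering is the bin."""
    return tuple(sorted(rtts_ms, key=rtts_ms.get))


def group_by_bin(measurements):
    """measurements: node -> {landmark: rtt_ms}. Returns bin -> [nodes]."""
    bins = {}
    for node, rtts in measurements.items():
        bins.setdefault(landmark_bin(rtts), []).append(node)
    return bins


# Hypothetical measurements: n1 and n2 see the landmarks in the same order.
measurements = {
    "n1": {"L1": 10, "L2": 50, "L3": 90},
    "n2": {"L1": 12, "L2": 40, "L3": 95},
    "n3": {"L1": 80, "L2": 30, "L3": 15},
}
```

With m landmarks there are m! possible orderings; CAN construction would then assign each ordering its own portion of the coordinate space.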

35
Distributed Binning

4 landmarks (placed 5 hops away from each
other), naïve partitioning

[Figure: latency stretch vs. number of nodes (256, 1K, 4K), w/o binning and w/ binning; panels for dimensions = 2 and dimensions = 4]
36
Design improvements

Multi-dimensional coordinate spaces
To reduce path length
path length is O(d n^(1/d))
Hash function more complex?

37
Design improvements

Multiple, independent spaces (realities)
To forward a message, a node checks all its
neighbors on each reality instead of one reality,
and does greedy forwarding.
Reduces routing path length
other benefits
Improves data availability (the hash table is
replicated on each reality).
Improve routing fault tolerance.

38
Design improvements

Better CAN routing metrics.
Use RTT instead of Cartesian distance when
selecting a next hop neighbor.
To improve per-hop (CAN hop) latency
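
One way to fold RTT into next-hop selection is sketched below. The slide does not give the exact rule; this sketch assumes the rule described in the CAN paper, which forwards to the neighbor with the largest ratio of progress toward the destination to RTT. The function name and data layout are illustrative.

```python
import math


def next_hop(current, dest, neighbors):
    """Pick the neighbor with the best progress-per-millisecond ratio.

    neighbors: name -> (position, rtt_ms). Only neighbors that are closer
    to dest than the current node are considered. Returns None if no
    neighbor makes progress.
    """
    here = math.dist(current, dest)
    best, best_ratio = None, 0.0
    for name, (pos, rtt_ms) in neighbors.items():
        progress = here - math.dist(pos, dest)
        if progress > 0 and progress / rtt_ms > best_ratio:
            best, best_ratio = name, progress / rtt_ms
    return best
```

Requiring positive progress keeps the greedy guarantee that every hop moves closer to the destination, while the ratio prefers cheap hops over long, slow ones.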

39
Design improvements

Overloading coordinate zone
allow multiple nodes to share the same zone.
A node maintains a list of its peers in addition
to its neighbors.
A node selects one neighbor from the peers of each
neighboring zone
The contents of the hash table may be either
divided or replicated across the nodes in a zone.
Reduces path length
reduces the number of zones
Reduces per-hop latency
a node has more choice in selecting a neighbor.

40
CAN load balancing

Two pieces
Dealing with hot-spots
popular (key,value) pairs
nodes cache recently requested entries
overloaded node replicates popular entries at
neighbors
Need to deal with cache consistency and update
policy problems.
Uniform coordinate space partitioning
uniformly spread (key,value) entries
uniformly spread out routing load

41
Uniform Partitioning

Added check
at join time, pick a zone
check neighboring zones
pick the largest zone and split that one
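
The added check can be sketched as below: instead of always splitting the zone found at join time, the candidate compares its volume with its neighbors' and the largest one is split. Zones are assumed to be `(x0, y0, x1, y1)` tuples, and `zone_to_split` and the node names are illustrative.

```python
def volume(zone):
    x0, y0, x1, y1 = zone
    return (x1 - x0) * (y1 - y0)


def zone_to_split(candidate, neighbors_of, zones):
    """Return the node with the largest zone among candidate and its neighbors."""
    options = [candidate] + list(neighbors_of[candidate])
    # max() keeps the first maximum, so ties favor the original candidate.
    return max(options, key=lambda node: volume(zones[node]))


# Hypothetical layout: J was picked at join time, but neighbor N2 is larger.
zones = {
    "J": (0.0, 0.0, 0.25, 0.5),
    "N1": (0.25, 0.0, 1.0, 0.5),
    "N2": (0.0, 0.5, 1.0, 1.0),
}
neighbors_of = {"J": ["N1", "N2"]}
```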

42
Uniform Partitioning
65,000 nodes, 3 dimensions
[Figure: percentage of nodes vs. zone volume (V, 2V, 4V, 8V), w/o check and w/ check]
43
CAN Robustness

Completely distributed
no single point of failure (not applicable to
pieces of the database when a node failure happens)
Database recovery is not explored (in case there
are multiple copies of the database)
Resilience of routing
can route around trouble

44
Outline

Introduction
Design
Evaluation
Strengths & Weaknesses

45
Evaluation

Scalability
Low-latency
Load balancing
Robustness

46
CAN scalability

For a uniformly partitioned space with n nodes
and d dimensions
per node, number of neighbors is 2d
average routing path is (d n^(1/d))/4 hops
simulations show that the above results hold in
practice
Can scale the network without increasing per-node
state
Chord/Plaxton/Tapestry/Buzz
log(n) nbrs with log(n) hops
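
The scaling formulas above can be evaluated directly to see the trade-off. This is a small sketch: the function names are illustrative, and the base-2 logarithm for the Chord/Plaxton/Tapestry comparison is an assumption, since the slide only says log(n).

```python
import math


def can_neighbors(d):
    """Per-node neighbor count in a uniformly partitioned d-dimensional CAN."""
    return 2 * d


def can_avg_path_hops(n, d):
    """Average routing path length (d * n^(1/d)) / 4 for n nodes, d dimensions."""
    return (d / 4) * n ** (1 / d)


def log_n_scale(n):
    """log(n) neighbors and hops for the log-based systems (base 2 assumed)."""
    return math.log2(n)
```

For n = 4096, a 2-d CAN keeps 4 neighbors per node at about 32 hops, while a 3-d CAN keeps 6 neighbors at about 12 hops, which is comparable to log2(4096) = 12.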

47
CAN low-latency

Problem
latency stretch = (CAN routing delay) /
(IP routing delay)
application-level routing may lead to high
stretch
Solution
increase dimensions, realities (reduce the path
length)
Heuristics (reduce the per-CAN-hop latency)
RTT-weighted routing
multiple nodes per zone (peer nodes)
deterministically replicate entries

48
CAN low-latency
[Figure: latency stretch vs. number of nodes (16K, 32K, 65K, 131K), dimensions = 2, w/o heuristics and w/ heuristics]
49
CAN low-latency
[Figure: latency stretch vs. number of nodes (16K, 32K, 65K, 131K), dimensions = 10, w/o heuristics and w/ heuristics]
50
Outline

Introduction
Design
Evaluation
Strengths & Weaknesses

51
Strengths

More resilient than flooding broadcast networks
Efficient at locating information
Fault tolerant routing
Node & Data High Availability (w/ improvement)
Manageable routing table size & network traffic

52
Weaknesses

Impossible to perform a fuzzy search
Susceptible to malicious activity
Must maintain coherence of all the indexed data
(network overhead, efficient distribution)
Still relatively high routing latency
Poor performance w/o improvements

53
Comparing CAN and Pastry

CAN uses greedy forwarding in a Cartesian coordinate
space.
Pastry uses longest address-prefix matching in a
tree-structured routing table.
The routing table at each node in Pastry
maintains more information.
Routing table maintenance
Both are local.

54
Comparing CAN and Pastry