Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications

Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications Dr. Yingwu Zhu – PowerPoint PPT presentation

Slides: 52

Provided by: Robert2786

Learn more at: http://fac-staff.seattleu.edu

Transcript and Presenter's Notes

Title: Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications

1
Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications

Dr. Yingwu Zhu

2
A peer-to-peer storage problem

10,000 scattered music enthusiasts
Willing to store and serve replicas
Contributing resources, e.g., storage, bandwidth, etc.
How do you find the data?
Efficient lookup mechanism needed!

3
The lookup problem
(Figure: nodes N1 to N6 connected through the Internet. A publisher holds Key = title, Value = MP3 data; a client issues Lookup(title) and does not know which node has the data.)
4
Centralized lookup (Napster)
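(Figure: nodes N1, N2, N3, N4, N6, N7, N8, N9 and a central DB. The publisher at N4 holds Key = title, Value = MP3 data and registers SetLoc(title, N4) with the DB; the client sends Lookup(title) to the DB.)
Simple, but O(N) state and a single point of failure. Legal issues!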
5
Napster Publish
insert(X, 123.2.21.23) ...
I have X, Y, and Z!
123.2.21.23
6
Napster Search
123.2.0.18
search(A) --> 123.2.0.18
Where is file A?
7
Napster

Central Napster server
Can ensure correct results?
Bottleneck for scalability
Single point of failure
Susceptible to denial of service
Malicious users
Lawsuits, legislation
Search is centralized
File transfer is direct (peer-to-peer)

8
Flooded queries (Gnutella)
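(Figure: nodes N1, N2, N3, N4, N6, N7, N8, N9. The client's Lookup(title) is flooded from node to node until it reaches the publisher at N4, which holds Key = title, Value = MP3 data.)
Robust, but worst case O(N) messages per lookup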
9
Gnutella Query Flooding
Breadth-First Search (BFS)
10
Gnutella Query Flooding

A node/peer connects to a set of Gnutella neighbors
Forwards queries to its neighbors
The peer that has the information responds
Floods the network, with a TTL for termination
Results are complete
Bandwidth wastage
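The flooding steps above can be sketched as a TTL-limited breadth-first search. This is only an illustration: the five-peer overlay, the file name, and the function name `flood_query` are assumptions made for the example, not part of the Gnutella protocol.

```python
from collections import deque

def flood_query(graph, start, wanted, ttl):
    """Breadth-first flood from `start`.

    Returns (peers holding `wanted`, number of messages sent).
    """
    seen = {start}
    hits = []
    messages = 0
    frontier = deque([(start, ttl)])
    while frontier:
        peer, remaining = frontier.popleft()
        if wanted in graph[peer]["files"]:
            hits.append(peer)
        if remaining == 0:
            continue  # TTL expired: this peer does not forward
        for nb in graph[peer]["neighbors"]:
            messages += 1  # every forward costs a message, even a duplicate
            if nb not in seen:
                seen.add(nb)
                frontier.append((nb, remaining - 1))
    return hits, messages

# Hypothetical overlay: only peer E stores the file.
overlay = {
    "A": {"neighbors": ["B", "C"], "files": set()},
    "B": {"neighbors": ["A", "D"], "files": set()},
    "C": {"neighbors": ["A", "D"], "files": set()},
    "D": {"neighbors": ["B", "C", "E"], "files": set()},
    "E": {"neighbors": ["D"], "files": {"song.mp3"}},
}

print(flood_query(overlay, "A", "song.mp3", ttl=2))
print(flood_query(overlay, "A", "song.mp3", ttl=3))
```

In this toy overlay a TTL of 2 stops one hop short of the file, while a TTL of 3 finds it at the cost of more messages, several of which are duplicates. That is the bandwidth wastage the slide refers to.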

11
Gnutella Random Walk

Improved over query flooding

Same overlay structure as Gnutella
Forward the query to a random subset of its neighbors
Reduced bandwidth requirements
Incomplete results
High latency

Peer nodes
12
KaZaA (FastTrack network)

Hybrid of centralized Napster and decentralized Gnutella
Super-peers act as local search hubs
Each super-peer is similar to a Napster server for a small portion of the network
Super-peers are automatically chosen by the system based on their capacities (storage, bandwidth, etc.) and availability (connection time)
Users upload their list of files to a super-peer
Super-peers periodically exchange file lists
You send queries to a super-peer for files of interest
If the local super-peer cannot satisfy a query, it may flood the query to other super-peers
Exploits the heterogeneity of peer nodes

13
KaZaA

Uses supernodes to improve scalability and establish a hierarchy
Uptime, bandwidth
Closed-source
Uses HTTP to carry out downloads
Encrypted protocol; queuing, QoS

14
KaZaA Network Design
15
KaZaA File Insert
insert(X, 123.2.21.23) ...
I have X!
123.2.21.23
16
KaZaA File Search
Where is file A?
17
Routed queries (Freenet, Chord, etc.)
(Figure: nodes N1, N2, N3, N4, N6, N7, N8, N9. The client's Lookup(title) is routed hop by hop toward the node where the publisher stored Key = title, Value = MP3 data.)
18
Routing challenges

Define a useful key nearness metric
Keep the hop count small
Keep the tables small
Stay robust despite rapid change (node
addition/removal)
Freenet emphasizes anonymity
Chord emphasizes efficiency and simplicity

19
Chord properties

Efficient: O(log(N)) messages per lookup
N is the total number of servers
Scalable: O(log(N)) state per node
Robust: survives massive failures
Proofs are in paper / tech report
Assuming no malicious participants

20
Chord overview

Provides peer-to-peer hash lookup
Lookup(key) → IP address
Mapping: key → IP address
How does Chord route lookups?
How does Chord maintain routing tables?

21
Chord IDs

Key identifier = SHA-1(key)
Node identifier = SHA-1(IP address)
Both are uniformly distributed
Both exist in the same ID space
How to map key IDs to node IDs?
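A minimal sketch of how both kinds of identifier can be derived with Python's standard `hashlib`. The helper name `chord_id`, the sample title, and the sample IP address are assumptions for illustration. Reducing the hash modulo 2^m to get a small ring is also an assumption made here so the 7-bit examples on later slides can be reproduced.

```python
import hashlib

M = 160  # SHA-1 yields 160-bit identifiers

def chord_id(text, m=M):
    """Hash a key or an IP address into the m-bit identifier space."""
    digest = hashlib.sha1(text.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** m)

key_id = chord_id("some-song-title")   # hypothetical key
node_id = chord_id("192.0.2.10")       # hypothetical node IP address

# Keys and nodes land in the same ID space, so they can be compared directly.
print(hex(key_id))
print(hex(node_id))
print(chord_id("some-song-title", m=7))  # same key on a toy 7-bit ring
```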

22
Consistent hashing [Karger 97]
(Figure: circular 7-bit ID space with nodes N32, N90, N105 and keys K5, K20, K80. K5 denotes key 5 and N105 denotes node 105.)
A key is stored at its successor: the node with the next higher ID
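The placement rule can be sketched with a sorted list and a binary search, using the node and key IDs from the figure (N32, N90, N105; K5, K20, K80). The function name `successor` and the wrap-around test with key 110 are illustrative assumptions.

```python
from bisect import bisect_left

def successor(node_ids, key_id):
    """Return the first node ID >= key_id, wrapping around the ring."""
    ring = sorted(node_ids)
    i = bisect_left(ring, key_id)
    return ring[i % len(ring)]  # index len(ring) wraps to the lowest node

nodes = [32, 90, 105]
for key in (5, 20, 80, 110):
    print(f"K{key} is stored at N{successor(nodes, key)}")
```

Key 110 has no node with a higher ID, so it wraps past zero and lands on N32.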
23
Basic lookup
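(Figure: ring with nodes N10, N32, N60, N90, N105, N120. The query "Where is key 80?" is passed from successor to successor until the answer "N90 has K80" comes back.)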
24
Simple lookup algorithm

Lookup(my-id, key-id)
  n = my successor
  if my-id < n < key-id
    call Lookup(key-id) on node n   // next hop
  else
    return my successor             // done
Correctness depends only on successors
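A runnable sketch of the successor-only lookup, using the ring from the "Basic lookup" slide. The helper `between` handles intervals that wrap past zero, which the pseudocode leaves implicit. The dictionary `succ` and the hop counter are assumptions added for illustration.

```python
def between(a, x, b):
    """True if x lies strictly inside the ring interval (a, b), clockwise."""
    if a < b:
        return a < x < b
    return x > a or x < b  # the interval wraps past zero

def simple_lookup(succ, my_id, key_id, hops=0):
    """Follow successor pointers until the key's successor is found."""
    n = succ[my_id]
    if between(my_id, n, key_id):
        return simple_lookup(succ, n, key_id, hops + 1)  # next hop
    return n, hops  # done: n is responsible for key_id

ring = [10, 32, 60, 90, 105, 120]
succ = {a: b for a, b in zip(ring, ring[1:] + ring[:1])}

print(simple_lookup(succ, 10, 80))  # starts at N10, asks for key 80
```

Each hop advances by only one node, so the number of hops grows linearly with the number of nodes between the start and the key.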

25
Finger table allows log(N)-time lookups
(Figure: N80's fingers cover 1/2, 1/4, 1/8, 1/16, 1/32, 1/64, and 1/128 of the ring: a fast track / express lane.)
26
Finger i points to successor of n + 2^i
(Figure: N80's fingers cover 1/2, 1/4, 1/8, 1/16, 1/32, 1/64, and 1/128 of the ring; the finger aimed at ID 112 points to N120.)
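A small sketch of how a finger table could be computed on a 7-bit ring. The ring membership below is hypothetical, since the figure only names N80 and N120, and the 0-based finger index is an assumption.

```python
from bisect import bisect_left

M = 7  # bits in the identifier space

def successor(ring, ident):
    """First node ID >= ident on a sorted ring, wrapping around."""
    return ring[bisect_left(ring, ident) % len(ring)]

def finger_table(ring, n, m=M):
    """Finger i (0-based) is successor((n + 2**i) mod 2**m)."""
    return [successor(ring, (n + 2 ** i) % 2 ** m) for i in range(m)]

ring = sorted([20, 32, 60, 80, 99, 120])  # hypothetical membership
for i, node in enumerate(finger_table(ring, 80)):
    target = (80 + 2 ** i) % 2 ** M
    print(f"finger {i}: target {target:3d} -> N{node}")
```

With this membership the finger aimed at 112 (80 + 2^5) resolves to N120, as in the figure, and the last finger wraps past zero to N20.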
27
Lookup with fingers

Lookup(my-id, key-id)
  look in local finger table for
    highest node n s.t. my-id < n < key-id
  if n exists
    call Lookup(key-id) on node n   // next hop
  else
    return my successor             // done
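A runnable sketch of the finger-based lookup on a 7-bit ring. The node IDs are taken from the "Lookups take O(log(N)) hops" slide; the helper names, the 0-based finger index, and the hop counter are assumptions for illustration.

```python
from bisect import bisect_left

M = 7  # bits in the identifier space

def between(a, x, b):
    """True if x lies strictly inside the ring interval (a, b), clockwise."""
    if a < b:
        return a < x < b
    return x > a or x < b

def successor(ring, ident):
    return ring[bisect_left(ring, ident) % len(ring)]

def build_fingers(ring, m=M):
    """Map each node to its fingers, ordered from nearest to farthest."""
    ring = sorted(ring)
    return {n: [successor(ring, (n + 2 ** i) % 2 ** m) for i in range(m)]
            for n in ring}

def finger_lookup(fingers, my_id, key_id, hops=0):
    # Scan from the farthest finger down, so the first match is the
    # "highest" node that still precedes the key.
    for n in reversed(fingers[my_id]):
        if between(my_id, n, key_id):
            return finger_lookup(fingers, n, key_id, hops + 1)  # next hop
    return fingers[my_id][0], hops  # finger 0 is the immediate successor

fingers = build_fingers([5, 10, 20, 32, 60, 80, 99, 110])
print(finger_lookup(fingers, 32, 19))  # start at N32, look up K19
```

Starting at N32, the query visits N99, N5, and N10 before N10 reports that its successor N20 holds K19. Each hop covers a large part of the remaining distance to the key.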

28
Lookups take O(log(N)) hops
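(Figure: ring with nodes N5, N10, N20, N32, N60, N80, N99, N110. Lookup(K19) hops across the ring via fingers and ends at N20, the successor of K19.)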
29
Joining: linked list insert
(Figure: N36 joins between N25 and N40; N40 holds K30 and K38.)
1. Lookup(36)
30
Join (2)
2. N36 sets its own successor pointer
(Figure: N25, N36, N40; N40 holds K30 and K38.)
31
Join (3)
3. Copy keys 26..36 from N40 to N36
(Figure: N36 now holds K30; N40 holds K30 and K38.)
32
Join (4)
4. Set N25's successor pointer
(Figure: N25, N36 with K30, N40 with K30 and K38.)
Update finger pointers in the background.
Correct successors produce correct lookups.
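The four join steps can be sketched as an insert into a circular linked list, using the node and key IDs from the figures. The `Node` class and helper names are assumptions for illustration, and the sketch ignores concurrency and finger updates.

```python
class Node:
    def __init__(self, ident):
        self.id = ident
        self.successor = self  # a lone node is its own successor
        self.keys = set()

def in_half_open(a, x, b):
    """True if x is in the ring interval (a, b], clockwise."""
    if a < b:
        return a < x <= b
    return x > a or x <= b

def find_predecessor(start, ident):
    """Walk successor pointers until ident falls in (node, node.successor]."""
    node = start
    while not in_half_open(node.id, ident, node.successor.id):
        node = node.successor
    return node

def join(new, existing):
    pred = find_predecessor(existing, new.id)  # 1. Lookup(new.id)
    succ = pred.successor
    new.successor = succ                       # 2. new node sets its successor
    new.keys = {k for k in succ.keys           # 3. copy keys in (pred, new]
                if in_half_open(pred.id, k, new.id)}
    pred.successor = new                       # 4. set predecessor's pointer

n25, n36, n40 = Node(25), Node(36), Node(40)
n25.successor, n40.successor = n40, n25
n40.keys = {30, 38}

join(n36, n25)
print(n25.successor.id, n36.successor.id, sorted(n36.keys))
```

As on the slides, the keys are copied rather than moved, so N40 still holds K30 after the join.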
33
Failures might cause incorrect lookup
(Figure: ring with nodes N10, N80, N85, N102, N113, N120; Lookup(90) arrives at N80.)
N80 doesn't know its correct successor, so the lookup is incorrect
34
Solution: successor lists

Each node knows r immediate successors
After failure, will know first live successor
Correct successors guarantee correct lookups
Guarantee is with some probability

35
Choosing the successor list length

Assume 1/2 of nodes fail
P(successor list all dead) = (1/2)^r
I.e. P(this node breaks the Chord ring)
Depends on independent failure
P(no broken nodes) = (1 - (1/2)^r)^N
r = 2 log(N) makes prob. = 1 - 1/N
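A quick numeric check of the last line, assuming the logarithm is base 2. With r = 2 log2(N), (1/2)^r equals 1/N^2, and (1 - 1/N^2)^N is approximately 1 - 1/N. The function name and the choice N = 1000 are illustrative.

```python
import math

def p_ring_intact(n_nodes, r, p_fail=0.5):
    """P(no node has an all-dead successor list), assuming independent failures."""
    p_list_dead = p_fail ** r
    return (1 - p_list_dead) ** n_nodes

N = 1000
r = 2 * math.log2(N)  # about 19.9 entries

print(f"r = {r:.1f}")
print(f"P(no broken nodes) = {p_ring_intact(N, r):.6f}")
print(f"1 - 1/N            = {1 - 1 / N:.6f}")
```

Rounding r up to 20 entries keeps the probability slightly above 1 - 1/N for N = 1000.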

36
Lookup with fault tolerance

Lookup(my-id, key-id)
  look in local finger table and successor-list
    for highest node n s.t. my-id < n < key-id
  if n exists
    call Lookup(key-id) on node n   // next hop
    if call failed,
      remove n from finger table
      return Lookup(my-id, key-id)
  else return my successor          // done
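A toy simulation of the fault-tolerant lookup. Several things here are assumptions for illustration: a failed remote call is modeled by raising `NodeDown`, the finger table and successor list are merged into one `routes` set, the successor list has length R = 3, and "my successor" is read as the first live entry of the successor list.

```python
from bisect import bisect_left

M, R = 7, 3  # ID bits; assumed successor-list length

class NodeDown(Exception):
    """Stands in for a failed remote call."""

def between(a, x, b):
    """True if x lies strictly inside the ring interval (a, b), clockwise."""
    return a < x < b if a < b else (x > a or x < b)

def build(ring, m=M, r=R):
    """Per-node routing state: fingers merged with the successor list."""
    ring = sorted(ring)
    tables = {}
    for idx, n in enumerate(ring):
        fingers = {ring[bisect_left(ring, (n + 2 ** i) % 2 ** m) % len(ring)]
                   for i in range(m)}
        succ_list = [ring[(idx + j) % len(ring)] for j in range(1, r + 1)]
        tables[n] = {"routes": fingers | set(succ_list), "succ_list": succ_list}
    return tables

def lookup(tables, alive, my_id, key_id):
    if my_id not in alive:
        raise NodeDown(my_id)  # the call to this node fails
    node = tables[my_id]
    while True:
        candidates = [n for n in node["routes"] if between(my_id, n, key_id)]
        if not candidates:
            # done: the first live successor is responsible for the key
            return next(s for s in node["succ_list"] if s in alive)
        n = max(candidates, key=lambda c: (c - my_id) % 2 ** M)  # highest node
        try:
            return lookup(tables, alive, n, key_id)  # next hop
        except NodeDown:
            node["routes"].discard(n)  # remove the dead node, then retry

ring = [5, 10, 20, 32, 60, 80, 99, 110]
print(lookup(build(ring), set(ring) - {99}, 32, 19))  # a hop on the path is down
print(lookup(build(ring), set(ring) - {20}, 32, 19))  # the key's successor is down
```

When N99 is down the query routes around it and still reaches N20. When N20 itself is down, the answer becomes N32, the first live successor of K19. The sketch fails with `StopIteration` if every entry of a successor list is dead, which is the broken-ring case discussed on the previous slide.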

37
Chord status

Working implementation as part of CFS
Chord library: 3,000 lines of C++
Deployed in small Internet testbed
Includes
Correct concurrent join/fail
Proximity-based routing for low delay
Load control for heterogeneous nodes
Resistance to spoofed node IDs

38
Experimental overview

Quick lookup in large systems
Low variation in lookup costs
Robust despite massive failure
See paper for more results
Experiments confirm theoretical results

39
Chord lookup cost is O(log N)
(Figure: average messages per lookup vs. number of nodes.)
Constant is 1/2
40
Failure experimental setup

Start 1,000 CFS/Chord servers
Successor list has 20 entries
Wait until they stabilize
Insert 1,000 key/value pairs
Five replicas of each
Stop X% of the servers
Immediately perform 1,000 lookups

41
Massive failures have little impact
(1/2)^6 is 1.6%
(Figure: failed lookups (percent) vs. failed nodes (percent).)
42
Related Work

CAN (Ratnasamy, Francis, Handley, Karp, Shenker)
Pastry (Rowstron, Druschel)
Tapestry (Zhao, Kubiatowicz, Joseph)
Chord emphasizes simplicity

43
Chord Summary

Chord provides peer-to-peer hash lookup
Efficient: O(log(n)) messages per lookup
Robust as nodes fail and join
Good primitive for peer-to-peer systems
http://www.pdos.lcs.mit.edu/chord

44
Reflection on Chord

Strict overlay structure
Strict data placement
If data keys are uniformly distributed, and # of keys >> # of nodes
Load balanced
Each node has O(1/N) fraction of keys
Node addition/deletion only moves O(1/N) of the load, so load movement is minimized!

45
Reflection on Chord

Routing table (successor list + finger table)
Deterministic
Network topology unaware
Routing latency could be a problem
Proximity Neighbor Selection (PNS)
m neighbor candidates, choose min latency
Still O(logN) hops

46
Reflection on Chord

Predecessor and successor must be correct, aggressively maintained
Finger tables are lazily maintained
Tradeoff: bandwidth vs. routing correctness

47
Reflection on Chord

Assume uniform node distribution
In the wild, nodes are heterogeneous
Load imbalance!
Virtual servers
A node hosts multiple virtual servers: O(log N)

48
(No Transcript)
49
Join: lazy finger update is OK
(Figure: ring with nodes N2, N25, N36, N40 and key K30.)
N2's finger should now point to N36, not N40.
Lookup(K30) visits only nodes < 30, so it will undershoot.
50
CFS: a peer-to-peer storage system

Inspired by Napster, Gnutella, Freenet
Separates publishing from serving
Uses spare disk space, net capacity
Avoids centralized mechanisms
Delete this slide?
Mention distributed hash lookup

51
CFS architecture (move later?)
Block storage
Availability / replication
Authentication
Caching
Consistency
Server selection
Keyword search
Lookup
DHash: distributed block store
Chord

Powerful lookup simplifies other mechanisms
