Title: CS 347: Parallel and Distributed Data Management Notes 08: P2P Systems
2. Peer-to-Peer Systems
- Distributed application where nodes are:
  - Autonomous
  - Very loosely coupled
  - Equal in role or functionality
- Nodes share and exchange resources with each other
3. Related Terms
- File Sharing
  - Ex: Napster, Gnutella, Kazaa, E-Donkey, BitTorrent, FreeNet, LimeWire, Morpheus
- Grid Computing
- Autonomic Computing
4-6. Search in a P2P System
[Figure, shown in three steps: a node broadcasts the query "Who has X?" to peers holding resources R1,1, R1,2, ..., R2,1, R2,2, ..., R3,1, R3,2, ...; peers that have X send back answers; the querying node then requests the resource from one of them and receives it directly.]
7. Distributed Lookup
- Have <k, v> pairs
- Given k, find matching values

  k  v
  1  a
  1  b
  4  a
  7  c
  3  a
  1  a
  4  d

  lookup(4) -> a, d
8. Data Distributed Over Nodes
- N nodes
- Each holds some <k, v> data
9. Notation
- X.func(params) means an RPC of procedure func(params) at node X
- X.A means send a message to X to get the value of A
- If X is omitted, we refer to a local procedure or data structure
[Figure: node Y, holding its own data A, B, ..., runs "B = X.A; A = X.f(B); ..." against node X, which holds its own data A, B, ...]
10. Distributed Hashing
Chord paper: Ion Stoica, Robert Morris, David Liben-Nowell, David R. Karger, M. Frans Kaashoek, Frank Dabek, Hari Balakrishnan. Chord: A Scalable Peer-to-peer Lookup Protocol for Internet Applications. IEEE/ACM Transactions on Networking. http://pdos.csail.mit.edu/chord/papers/paper-ton.pdf
11. Hashing Values and Nodes
- H(v) is an m-bit number (v is a value)
- H(X) is an m-bit number (X is a node id)
- The hash function is good (spreads ids uniformly)
[Figure: a ring of positions 0 to 2^m - 1; a key k hashes to position H(k), which falls between H(X) and H(Y), so the pair (k, v) is stored at Y.]
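The placement rule on this slide can be sketched in code. This is an illustrative Python sketch, not from the slides: the names `hash_m` and `ring_successor`, and the choice of SHA-1 truncated to m bits, are my own.

```python
import hashlib

M = 8  # m-bit identifier space: ring positions 0 .. 2^M - 1

def hash_m(s):
    """Map a node id or key (a string) to an m-bit position on the ring."""
    return int(hashlib.sha1(s.encode()).hexdigest(), 16) % (2 ** M)

def ring_successor(key, node_ids):
    """A key is stored at the first node clockwise from H(key)."""
    k = hash_m(key)
    by_pos = sorted(node_ids, key=hash_m)
    # First node whose position is >= H(key); wrap to the smallest otherwise.
    for n in by_pos:
        if hash_m(n) >= k:
            return n
    return by_pos[0]  # wrap around the ring past 2^M - 1
```

Any set of node ids induces a deterministic key-to-node assignment, and adding or removing one node only moves the keys in one arc of the ring.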
12. Chord Circle
[Figure: Chord ring with m = 6. Nodes N1, N8, N14, N21, N32, N38, N42, N48, N51, N56. Keys are stored at their clockwise successors: K10 at N14; K24 and K30 at N32; K38 at N42; K54 at N56.]
13. Rule
- Consider nodes X, Y such that Y follows X clockwise
- Node Y is responsible for keys k such that H(k) in (H(X), H(Y)]
- We use hashed values throughout, e.g., N54 is the node whose id hashes to 54
[Figure: N54 is followed clockwise by N3; N3 stores K55, K56, ..., K3.]
14. Succ, Pred Links
[Figure: the m = 6 ring again; each node keeps a successor and a predecessor link, e.g., N1.succ = N8 and N1.pred = N56.]
15. Search Using Succ Links
[Figure: find_succ(K52) is forwarded along successor links, e.g., through N14.find_succ(K52) and N51.find_succ(K52), until it reaches N56, the node responsible for K52.]
16. Code
- X.find_succ(k): if k in (X, succ] return succ; else return succ.find_succ(k)
17. Notation: Use of Hashed Values
- X.find_succ(k): if k in (X, succ] return succ; else return succ.find_succ(k)
should be
- X.find_succ(k): if H(k) in (H(X), H(succ)] return succ; else return succ.find_succ(k)
18. Code
- X.find_succ(k): if k in (X, succ] return succ; else return succ.find_succ(k)
- Note: What happens if k is stored by X itself?
  - Case 1: k < X
  - Case 2: k = X
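The successor-only search above can be simulated directly. A minimal Python sketch (the `Node` and `make_ring` names are my own, and plain integer ids stand in for hashed ids):

```python
def in_half_open(x, a, b):
    """True iff x lies in the ring interval (a, b], with wrap-around."""
    if a < b:
        return a < x <= b
    return x > a or x <= b  # interval wraps past position 0

class Node:
    def __init__(self, ident):
        self.id = ident
        self.succ = self

    def find_succ(self, k):
        # Slide rule: if k in (X, succ] the successor stores k;
        # otherwise pass the query along the ring.
        if in_half_open(k, self.id, self.succ.id):
            return self.succ
        return self.succ.find_succ(k)

def make_ring(ids):
    nodes = [Node(i) for i in sorted(ids)]
    for a, b in zip(nodes, nodes[1:] + nodes[:1]):
        a.succ = b
    return {n.id: n for n in nodes}
```

With the m = 6 ring from the earlier figure, `ring[8].find_succ(54)` returns N56, and a query for a key equal to a node's own id (case 2 above) travels the whole circle until that node's predecessor answers.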
19. Looking Up Data
- X.DHTlookup(k): Y = find_succ(k); S = Y.lookup(k); return S
- Y.lookup(k) returns the local values associated with k
20. Another Version
- X.DHTlookup(k): X.find_succ(k, X); wait for ans(k, S); return S
- X.find_succ(k, Y): if k in (X, succ] then Y.ans(k, succ.lookup(k)); else succ.find_succ(k, Y)
This combines find and lookup, and avoids a chain of returns.
21. Inserting Data
- X.DHTinsert(k, v): Y = find_succ(k); Y.insert(k, v)
- Y.insert(k, v) inserts (k, v) in local storage
22. Another Version
- X.DHTinsert(k, v): if k in (X, succ] then succ.insert(k, v); else succ.DHTinsert(k, v)
23. Finger Table
[Figure: m = 6 ring. Finger table for N8: N8+1 -> N14, N8+2 -> N14, N8+4 -> N14, N8+8 -> N21, N8+16 -> N32, N8+32 -> N42.]

24. Finger Table
[Same figure, noting that the entry N8+32 points to N42, the node responsible for key 8+32 = 40.]
25-27. Example
[Figure, shown in three steps: find_succ(K54) is issued at N8; the finger table routes the query across the ring (e.g., via N42 and N51) until it reaches N56, which stores K54.]
28. Code
- X.find_succ(id): if id in (X, succ] return succ; else Y = closest_preceding(id); return Y.find_succ(id)
- X.closest_preceding(id): for i = m downto 1: if finger[i] in (X, id) return finger[i]; return X
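A runnable sketch of this finger-table search, in Python. The helper names and the pre-computed finger tables are illustrative scaffolding; the extra `y is self` guard is my own workaround for the corner case flagged on the next slides (a key that hashes exactly to a node id):

```python
def in_half_open(x, a, b):
    """Ring interval (a, b], with wrap-around."""
    if a < b:
        return a < x <= b
    return x > a or x <= b

def in_open(x, a, b):
    """Ring interval (a, b), exclusive both ends, with wrap-around."""
    if a < b:
        return a < x < b
    if a > b:
        return x > a or x < b
    return x != a  # degenerate one-node interval

class FingerNode:
    def __init__(self, ident):
        self.id = ident
        self.succ = self
        self.fingers = []  # fingers[i] ~ successor of id + 2^i

    def closest_preceding(self, k):
        # Scan fingers from farthest to nearest for a node in (X, k).
        for f in reversed(self.fingers):
            if in_open(f.id, self.id, k):
                return f
        return self

    def find_succ(self, k):
        if in_half_open(k, self.id, self.succ.id):
            return self.succ
        y = self.closest_preceding(k)
        if y is self:  # no useful finger: fall back to the succ link
            return self.succ.find_succ(k)
        return y.find_succ(k)

def make_finger_ring(ids, m):
    nodes = [FingerNode(i) for i in sorted(ids)]
    for a, b in zip(nodes, nodes[1:] + nodes[:1]):
        a.succ = b
    def successor_of(pos):
        return next((n for n in nodes if n.id >= pos), nodes[0])
    for n in nodes:
        n.fingers = [successor_of((n.id + 2 ** i) % 2 ** m) for i in range(m)]
    return {n.id: n for n in nodes}
```

On the m = 6 example ring this reproduces the finger table shown for N8 and the routing of the example queries.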
29-31. Another Example
[Figure, shown in three steps: find_succ(K7) is routed across the ring; the node preceding N8 answers "return N8", since K7 is stored at N8.]
32-33. Yet Another Example
[Figure: find_succ(K8), whose key hashes exactly to a node id, eventually answers "return N8".]
There seems to be a bug in the find_succ code above (which is the same as Figure 5 of the Chord paper) for this case.
34. Adding Nodes to Circle
- Need to:
  - update links
  - move data
- For now, assume nodes never die
[Figure: node N26 joins the ring between N21 and N32; keys in (N21, N26], e.g., K24, will have to move to it.]
35. New Node X Joins
- Node Y is known to be in the ring
- X.join(Y): pred = nil; succ = Y.find_succ(X)
36. Periodic Stabilization
- X.stabilize(): Y = succ.pred; if Y in (X, succ) then succ = Y; succ.notify(X)
- X.notify(Z): if pred = nil OR Z in (pred, X) then pred = Z
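The join/stabilize/notify protocol can be exercised on a tiny ring. A hedged Python sketch (the `RingNode` and `in_open` names are mine, and real Chord runs stabilize periodically rather than on demand):

```python
def in_open(x, a, b):
    """True iff x is in the ring interval (a, b), exclusive, with wrap."""
    if a < b:
        return a < x < b
    if a > b:
        return x > a or x < b
    return x != a  # one-node ring: everything but a

class RingNode:
    def __init__(self, ident):
        self.id = ident
        self.succ = self
        self.pred = None

    def join(self, ring_node):
        # Slide 35: the joiner learns only its successor; pred is fixed later.
        self.pred = None
        self.succ = ring_node.find_succ_linear(self.id)

    def find_succ_linear(self, k):
        n = self
        while not (in_open(k, n.id, n.succ.id) or k == n.succ.id):
            n = n.succ
        return n.succ

    def stabilize(self):
        y = self.succ.pred
        if y is not None and in_open(y.id, self.id, self.succ.id):
            self.succ = y
        self.succ.notify(self)

    def notify(self, z):
        if self.pred is None or in_open(z.id, self.pred.id, self.id):
            self.pred = z
```

Running one round of stabilize at the new node and then at its predecessor splices the new node fully into the ring, as the join example on the following slides shows.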
37. Periodic Finger Check
- X.fix_fingers(): next = next + 1; if next > m then next = 1; finger[next] = find_succ(X + 2^(next-1))
38-41. Join Example
[Figure, shown in four steps, for Nx joining between Np and Ns:
- initially, Np.succ = Ns and Ns.pred = Np;
- after Nx.join: Nx.succ = Ns, Nx.pred = nil;
- after Nx.stabilize (Y = Np): Ns.notify(Nx) sets Ns.pred = Nx;
- after Np.stabilize (Y = Nx): Np.succ = Nx, and Nx.notify(Np) sets Nx.pred = Np.]
42. Moving Data: When?
[Figure: before the join, Ns holds all keys in (Np, Ns]; after Nx joins, the keys in (Np, Nx] belong at Nx.]
43. Move Data After Ns.notify(Nx)
[Figure: Nx.pred is still nil.]
Send all keys in range (Np, Nx] when Ns.pred is updated...
44. Periodic Stabilization (with data movement)
- X.notify(Z): if pred = nil OR Z in (pred, X) then: Z.give(data in range (pred, Z]); pred = Z; X.remove(data in range (pred, Z])
- Question: What data do we give when pred = nil?
- Note: We are glossing over concurrency issues, e.g., what happens to lookups while we are moving data?
45. Lookup May Be at the Wrong Node!
[Figure: Ns.pred = Nx, so Nx is responsible for (Np, Nx] and Ns for (Nx, Ns], but Nx.pred is still nil.]
A lookup for k in (Np, Nx] may still be directed to Ns!
46. Looking Up Data (Revised)
- X.DHTlookup(k): ok = false; while not ok do: Y = find_succ(k); (ok, S) = Y.lookup(k); return S
- Y.lookup(k): if k in (pred, Y] return (true, local values for k); else return (false, -)
47. Inserting Data (Revised)
- X.DHTinsert(k, v): ok = false; while not ok do: Y = find_succ(k); ok = Y.insert(k, v)
- Y.insert(k, v): if k in (pred, Y] then insert (k, v) in local storage and return true; else return false
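The retry loops in the two revised procedures hinge on the owner re-checking k in (pred, Y] before answering. A small illustrative Python sketch (the `StoreNode` class and the fake `find_succ` that first returns a stale answer are my own scaffolding):

```python
def in_half_open(x, a, b):
    """Ring interval (a, b], with wrap-around."""
    if a < b:
        return a < x <= b
    return x > a or x <= b

class StoreNode:
    def __init__(self, ident, pred_id):
        self.id = ident
        self.pred_id = pred_id
        self.store = {}

    def lookup(self, k):
        # Only answer if we are really responsible: k in (pred, Y].
        if in_half_open(k, self.pred_id, self.id):
            return True, self.store.get(k, [])
        return False, None

    def insert(self, k, v):
        if in_half_open(k, self.pred_id, self.id):
            self.store.setdefault(k, []).append(v)
            return True
        return False

def dht_lookup(k, find_succ):
    while True:                  # retry until the owner accepts the key
        ok, values = find_succ(k).lookup(k)
        if ok:
            return values

def dht_insert(k, v, find_succ):
    while True:
        if find_succ(k).insert(k, v):
            return
```

If routing momentarily points at the wrong node (as in the previous slide), that node refuses and the client simply retries.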
48. Why Does This Work?
- pred, succ links are eventually correct
- Data ends up at the correct node: key k is at node X where k in (X.pred, X]
- Finger pointers speed up searches but do not cause problems
49. Results for an N Node System
- With high probability, the number of nodes that must be contacted to find a successor is O(log N)
- Although the finger table contains room for m entries, only O(log N) distinct entries need to be stored
- Experimental results show the average lookup takes (log N)/2 hops
50. Node Failures
[Figure: Nx, between Np and Ns, holds (k7, v7), (k8, v8), (k9, v9). Nx dies: links are broken and its data is lost.]
51. To Fix Links
- X.check_pred(): if pred has failed then pred = nil
- Also, keep links to the r nearest successors in the ring
52-55. Failure Example
[Figure, shown in four steps, with backup successor links (r = 2):
- initially, Np -> Nx -> Ns; then Nx dies;
- after Ns.check_pred: Ns.pred = nil;
- after Np discovers Nx is down: Np.succ = Ns, via its backup link;
- after stabilization: Np and Ns point at each other again.]
56. Protecting Against Data Loss
- One idea: robust node (see notes on replicated data)
[Figure: robust node X is implemented by nodes X.1, X.2, X.3, each holding a full copy of (k7, v7), (k8, v8), (k9, v9), with takeover and backup protocols among them.]
57. Replicated Hash Table
58. Looking Up Data
- X.DHTlookup(k): id = H(k); if X = X.node(id) then return local values for k; else Y = X.node(id); return Y.DHTlookup(k)
59. Concurrency Control
- X.DHTlookup(k): id = H(k); lock(table); if X = X.node(id) then: temp = local values for k; unlock(table); return temp
- else: unlock(table); Y = X.node(id); return Y.DHTlookup(k)
Control for bucket id could move to another node; assume single statements are atomic.
60. Shorthand
- X.DHTlookup(k): id = H(k); if X = X.node(id) then return local values for k; else Y = X.node(id); return Y.DHTlookup(k)
61. Node Joins
[Figure: node N0 holds data for keys that hash to 0, 1; node N2 holds data for keys that hash to 2, 3.]
N0 is overloaded and asks N1 for help...
62. Node Joins
First, set up N1... [Figure: N0 holds buckets 0, 1; N1 is empty; N2 holds buckets 2, 3.]

63. Node Joins
Copy data to N1... [Figure: N1 now holds a copy of the data for bucket 1.]

64. Node Joins
Change control... [Figure: the hash-table entries for bucket 1 at N0 and N1 now point to N1.] Which do we update first, N0 or N1?

65. Node Joins
Drop data at N0... [Figure: N0 keeps bucket 0 only; N1 holds bucket 1; N2 holds buckets 2, 3.]

66. Update Other Nodes
[Figure: N2's table entry for bucket 1 is also updated to N1.]
- How do other nodes get updated?
  - Eagerly, by N0?
  - Lazily, by future lookups?
67. What About Inserts?
[Figure: an insert for bucket 1 arrives while the data copy from N0 to N1 is in progress.]
- Apply it at N0 and then copy?
- Redirect it to N1?
68. Control of Hash Table
Each hash-table entry for a bucket a has two fields:
- node[a]: the node that holds the data for bucket a; if node[a] is the local node, then the local node controls bucket a
- next[a]: when a move is in progress, this field records the node the data is moving to
69. Different Scenarios
At node X, for a bucket a with entries (node[a], next[a]):
- node[a] = X, next[a] = nil: X controls a; lookups local, inserts local
- node[a] = X, next[a] = Y (Y != X): lookups local, inserts forwarded to Y
- node[a] = Y: forward lookups and inserts to Y
- at the incoming node Y itself, mid-move: inserts local, lookups forwarded to X
70. New Node Y Joins
- X.join(Y): select an HT entry j that X is willing to give up (note: next[j] should be nil); next[j] = Y; copy the HT to Y, making Y.next[j] = Y; copy the data for bucket j to Y; Y.node[j] = Y; Y.next[j] = nil; node[j] = Y; next[j] = nil; remove the data for bucket j
71. Looking Up Data (Revised)
- X.DHTlookup(k): id = H(k); if X = X.node(id) then return local values for k; else: Y = X.node(id); tt = Y.node(id); if (tt != Y) and (X.node(id) = Y) then X.node(id) = tt; return Y.DHTlookup(k)
72. Storing Data
- X.DHTinsert(k, v): id = H(k); tt = node(id); nn = next(id); if (X = tt and nn = nil) or (nn = X) then insert (k, v) locally; else: if nn = nil then Y = tt else Y = nn; Y.DHTinsert(k, v)
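To see the lazy table repair of Slide 71 in action, here is an illustrative Python sketch of a replicated hash table. This is my own scaffolding: `k % nbuckets` stands in for H(k), and the move-in-progress `next` field is omitted for brevity.

```python
class HTNode:
    def __init__(self, name, table):
        self.name = name
        self.table = dict(table)  # every node holds a full bucket -> node map
        self.data = {}            # k -> values, for buckets this node owns

    def dht_lookup(self, net, k, nbuckets):
        bucket = k % nbuckets     # stand-in for H(k)
        owner = self.table[bucket]
        if owner == self.name:
            return self.data.get(k, [])
        y = net[owner]
        tt = y.table[bucket]      # ask the supposed owner who owns it now
        if tt != owner and self.table[bucket] == owner:
            self.table[bucket] = tt  # lazily repair our stale entry
        return y.dht_lookup(net, k, nbuckets)
```

Suppose bucket 0 has just moved from N0 to N1 and N2's table is stale: N2's lookup is forwarded via N0, succeeds at N1, and N2's entry gets corrected as a side effect.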
73. Example
[Figure: N0 holds buckets 0, 1; N2 holds buckets 2, 3; N1 is empty.]
N0.join(N1) is executed...
74. Example: Step 1
[Figure: N0's entry next[0] is set to N1.]
Bucket 0 is reserved for N1... Note: What happens to lookups / inserts at this point in time? (Inserts can bounce back and forth from N0 to N1... Problem?)

75. Example: Step 2
[Figure: N1 now holds data for bucket 0.]
N1 starts accepting inserts and data from N0... Note: What happens to lookups / inserts at this point in time? (Lookups will miss recently inserted data...)

76. Example: Step 3
Copy complete; node N1 activated (part 1). Note: What happens to lookups / inserts at this point in time?

77. Example: Step 4
Copy complete; node N1 activated (part 2). Note: What happens to lookups / inserts at this point in time?

78. Example: Step 5
Node N0 records the new master (part 1). Note: What happens to lookups / inserts at this point in time?

79. Example: Step 6
Node N0 records the new master (part 2). Note: What happens to lookups / inserts at this point in time?

80. Example: Step 7
[Figure: N0 now holds bucket 1 only; N1 holds bucket 0.]
Node N0 removes the data for bucket 0... Note: What happens to lookups / inserts at this point in time?
81-83. Lookup Example
[Figure, shown in three steps: N2.DHTlookup(id=0) is sent to N0 because N2's table entry is out of date; N0's own table has been updated to N1, so the lookup is forwarded as N1.DHTlookup(id=0) while the table updates are still in progress; in the final state, N1 holds bucket 0 and N0 holds bucket 1.]
84. Can We Prove the Scheme Works?
- Assume only joins, no failures
- Remember: it is OK to miss recent inserts
  - if not, we need 2PC to hand off control...
85. Exercise
- Make the HT extensible
  - Start with m entries in the HT, then dynamically double its size
  - Make sure we don't get confused when HTs of different sizes coexist
86. Chord vs Replicated HT
- Which code is simpler?
  - note that the Chord code does not show data migration!
- Lookups: O(log N) vs O(1)
  - impact of caching
- Routing table size: log N vs N
- Anonymity?
- Bootstrapping
87. Neighborhood Search (Gnutella)
- Each node stores its own data and searches nearby nodes
[Figure: each node holds its own key/value table, e.g., {12: a, 7: b, 13: c, 25: a}, {41: g, 99: c, 14: d}, {47: f, 12: d, 51: x, 9: y}.]
88. Storing Data
- X.DTinsert(k, v): insert (k, v) locally at X
89. Lookup
- X.DTlookup(k): TTL = desired value; return X.find(k, TTL, X)
- X.find(k, TTL, Z): TTL = TTL - 1; S = local data pairs with key k; if TTL > 0 then for all Y in X.neighbors (Y != Z) do S = S union Y.find(k, TTL, X); return S
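This TTL-limited flooding is easy to simulate. An illustrative Python sketch (the `GNode` class is my own; real Gnutella carries the TTL in the message itself):

```python
class GNode:
    def __init__(self, data):
        self.data = dict(data)   # local key -> list of values
        self.neighbors = []

    def dt_lookup(self, k, ttl):
        return self.find(k, ttl, sender=None)

    def find(self, k, ttl, sender):
        ttl -= 1
        s = set(self.data.get(k, []))
        if ttl > 0:
            for y in self.neighbors:
                if y is not sender:      # don't bounce straight back
                    s |= y.find(k, ttl, sender=self)
        return s
```

On a chain of nodes, a query gathers answers up to the TTL horizon and silently misses everything beyond it, exactly the behavior illustrated in the next slides.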
90. Example
[Figure: N1.DTlookup(13) with TTL = 4. Nodes in the neighborhood hold key 13 with values c, f, x, d, and t at various distances from N1.]
91. Example
[Same figure, with TTL counters 3, 2, 1 marked as the query spreads. Answer so far: c, f, x.]
92. Example
[Same figure. Final answer: c, f, x, d, but not t: the node holding t lies beyond the TTL horizon.]
93. Optimization
- Queries carry a unique identifier
- Nodes keep a cache of recent queries (query id plus TTL)
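A sketch of this duplicate suppression applied to the flooding search: each node remembers the query ids it has already processed. The `seen` set is my rendering of the slide's query cache; the slide also caches the TTL, which is omitted here.

```python
class CNode:
    def __init__(self, data):
        self.data = dict(data)
        self.neighbors = []
        self.seen = set()        # query ids already processed

    def dt_lookup(self, qid, k, ttl):
        return self.find(qid, k, ttl, sender=None)

    def find(self, qid, k, ttl, sender):
        if qid in self.seen:     # duplicate copy of the query: drop it
            return set()
        self.seen.add(qid)
        ttl -= 1
        s = set(self.data.get(k, []))
        if ttl > 0:
            for y in self.neighbors:
                if y is not sender:
                    s |= y.find(qid, k, ttl, sender=self)
        return s
```

On a cyclic topology this keeps each node from processing the same query more than once, which bounds the message count even with a generous TTL.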
94. Example
[Figure: N1.DTlookup(13) with TTL = 4 and Qid = 77. Nodes record (77, TTL) entries such as (77,1), (77,2), (77,3), (77,4) and use them to avoid re-processing Qid 77.]
95. Example
[Same figure: a copy of query 77 arriving at a node that has already seen Qid 77 is dropped.]
96. Joins
- X.join: neighbors = {}; cand = nodes a friend recommends; Z = a bootstrap server we hear about; cand = cand union Z.getNodes; for Y in cand do: ok = Y.wantMe(X); if ok then neighbors = neighbors union {Y}; if |neighbors| > limit, return
97. Joins (continued)
- Y.wantMe(X): if I want X as a neighbor then: neighbors = neighbors union {X}; return true; else return false
98. Bootstrap Server
[Figure: the server keeps a set S of known nodes {X1, X2, X3, ...}; joining nodes like X7 ask it for neighbors ("get neighbors"), new nodes such as X2 are added to S, and nodes that stop responding are removed from S.]
Sometimes called a pong server.
99. Problems with Neighborhood Search
- Unnecessary messages
  - High load and traffic
  - Example: if nodes have M new neighbors each, the number of messages is M^TTL
- Low-capacity nodes are a bottleneck
- Does not find all answers
[Figure: the area searched within the TTL covers only part of the network.]
100. Why Is Neighborhood Search Good?
- Can pose complex queries
- Simple, robust algorithm
- Works well if data is highly replicated
[Figure: many sites in the searched area have the latest Justin Bieber song.]
101. Super-Nodes
- Regular nodes index their content at super-nodes
- Super-nodes run neighborhood search
[Figure: a network of super-nodes (SN), each serving a cluster of regular nodes.]
102. Motivation for Super-Nodes
- Take advantage of powerful nodes
- Searching one larger index beats searching many smaller ones
[Figure: the same super-node topology.]
103-105. Napster (the Original One)
[Figure, shown in three steps: a single super-node (the Napster server) answers queries; clients then get files directly from each other.]
- Actually, Napster had several disconnected SNs
106. Performance Evaluation
- Yang, Beverly and Garcia-Molina, Hector. Designing a Super-peer Network. IEEE International Conference on Data Engineering, 2003.
- Setup: 10,000 nodes; varying cluster size and topology
- Metrics
  - Bandwidth
  - Compute load
107. Aggregate Bandwidth
[Figure: plot omitted.]

108. Individual Bandwidth Load
[Figure: plot omitted; a single SP bears the peak load.]

109. Individual Compute Load
[Figure: plot omitted.]

110. SP Redundancy
[Figure: a virtual SP made of SP 1 and SP 2, each holding the full index; regular nodes send data to index to both, and one of them additionally handles the queries.]
111. Content Addressable Network (CAN)
[Figure: nodes and data placed in a 2-dimensional coordinate space (dimensions 1 and 2).]
112. What Is a P2P System?
- Multiple sites (at the edge)
- Distributed resources
- Sites are autonomous (different owners)
- Sites are both clients and servers
- Sites have equal functionality
(The more of these hold, the greater the "P2P purity".)
113. P2P Benefits
- Pooling available (inexpensive) resources
- High availability and fault tolerance
- Self-organization
114-115. Comparison
[Table comparing the approaches above on these criteria; the check-mark entries did not survive conversion.]
116. Open Problems
- Functionality
  - Applications
  - Anonymity
- Performance
  - Efficiency
  - Load-balancing
- Correctness
  - Authentic services
  - Prevention of DoS
- Participation
117. Open Problems: Bad Guys
- Availability (e.g., coping with DoS attacks)
- Authenticity
- Anonymity
- Access control (e.g., IP protection, payments, ...)
118-119. Authenticity
[Figure, shown twice: a document with title "origin of species", author "charles darwin", date 1859, body "In an island far, far away ..."; the question mark asks whether the document is authentic.]
120-121. More than Fetching One File
[Figure, shown twice: a query T=origin, Y=?, A=darwin, B=? matches several candidate documents: (T=origin, Y=1859, A=darwin), (T=origin, Y=1800, A=darwin), (T=origin, Y=1859, A=darwin); which ones are authentic?]
122. Solutions
- Authenticity function A(doc) -> T or F
  - at expert sites? at all sites?
  - can use signatures: expert sends sig(doc) to user
- Voting based
  - authentic is what the majority says
- Time based
  - e.g., the oldest (available) version is authentic
123. Added Challenge: Efficiency
- Example: current music sharing
  - everyone has an authenticity function
  - but downloading files is expensive
- Solution: track peer behavior
[Figure: good peers are preferred over a bad peer.]
124-125. How to Track Peer Behavior?
- Trust vector [v1, v2, v3, v4] for peers a, b, c, d
- A single value between 0 and 1?
- A pair of values (total downloads, good downloads)?
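The "pair of values" option can be sketched directly. Illustrative Python (the `TrustTable` name, the neutral 0.5 prior for unknown peers, and the good/total ratio are my own choices, not from the slides):

```python
class TrustTable:
    """Track (good downloads, total downloads) per peer."""

    def __init__(self):
        self.stats = {}              # peer -> (good, total)

    def record(self, peer, good):
        g, t = self.stats.get(peer, (0, 0))
        self.stats[peer] = (g + (1 if good else 0), t + 1)

    def score(self, peer):
        g, t = self.stats.get(peer, (0, 0))
        return g / t if t else 0.5   # assumed neutral prior for strangers
```

Keeping the raw pair rather than a single score preserves how much evidence is behind the number: 9 good downloads out of 10 is a different situation than 900 out of 1000, even though both score 0.9.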
126. Trust Operations
[Figure: nodes a through e hold trust vectors such as [1, .9, .5, 0, 0], [1, 1, 0, .3, 1], [1, 0, 1, 1, .2], with edge weights like .9, .5, .3, .2 between peers; how should a node update its vector from its neighbors' opinions?]
127. Issues
- Trust computations in dynamic system
- Overloading good nodes
- Bad nodes can provide good content sometimes
- Bad nodes can build up reputation
- Bad nodes can form collectives
- ...
128. Sample Results
[Figure: plot of the fraction of inauthentic downloads vs the fraction of malicious peers.]
129. Participation Incentives
- Autonomous nodes need incentives to work together
- Forward messages
- Perform computations
- Share/store files
- Provide services
- Etc.
130. Incentive Types
- Three main kinds of incentives (thus far)
- Tit for tat
- Reputation
- Money/Currency
131. Tit-for-Tat
- I do for you what you do for me
- Example: two peers exchange traffic; they need each other to reach more nodes, so each can retaliate if the other misbehaves
132. Reputation and Currency
- If you do something for me, I will give you
reputation/money
133. Pros and Cons
- Tit-for-tat
  - Pros: requires minimal infrastructure and overhead; least prone to cheating
  - Cons: requires symmetric relationships
- Currency
  - Pros: everyone wants money! In some applications it is required
  - Cons: requires the heaviest infrastructure
- Reputation
  - Applies to most situations, but has some overhead, as well as its own incentive issues
134. Reputation and Currency
- For these techniques, there are two questions:
  - If we have money / reputation scores, how do we use them to give peers incentives?
  - How do we implement money / reputation scores in a P2P fashion?
135. P2P Summary
- Search
- Chord DHT
- Replicated DHT
- Gnutella
- Super-Peers
- Dealing with Bad Guys
- Dealing with Lazy Guys