Distributed Hash-based Lookup for Peer-to-Peer Systems

1
Distributed Hash-based Lookup for Peer-to-Peer
Systems

Mohammed Junaid Azad 09305050
Gopal Krishnan 09305915
M.Tech 1, CSE

2
Agenda

Peer-to-Peer System
Initial Approaches to Peer-to-Peer Systems
Their Limitations
Distributed Hash Tables
CAN-Content Addressable Network
CHORD

3
Peer-to-Peer Systems

Distributed and Decentralized Architecture
No centralized server (unlike client-server
architecture)
Any peer can behave as a server

4
Napster

P2P file sharing system
A central server stores the index of all the files
available on the network
To retrieve a file, the central server is contacted to
obtain the location of the desired file
Not a completely decentralized system
Central directory is not scalable
Single point of failure

5
Gnutella

P2P file sharing system
No central server stores an index of the files
available on the network
File location process is decentralized as well
Requests for files are flooded on the network
No single point of failure
Flooding on every request is not scalable

6
File Systems for P2P systems

The file system would store files and their
metadata across nodes in the P2P network
The nodes containing blocks of files could be
located using hash based lookup
The blocks would then be fetched from those nodes

7
Scalable File Indexing Mechanism

In any P2P system, the file transfer process is
inherently scalable
However, the indexing scheme that maps file
names to locations is crucial for scalability
Solution: Distributed Hash Table

8
Distributed Hash Tables

Traditional name and location services provide a
direct mapping between keys and values
What are examples of values? A value can be an
address, a document, or an arbitrary data item
Distributed hash tables such as CAN/Chord
implement a distributed service for storing and
retrieving key/value pairs

9
DNS vs. Chord/CAN

DNS
provides a host name to IP address mapping
relies on a set of special root servers
names reflect administrative boundaries
is specialized to finding named hosts or services

Chord
can provide the same service: name = key, value = IP address
requires no special servers
imposes no naming structure
can also be used to find data objects that are
not tied to certain machines

10
Example Application using Chord: Cooperative
Mirroring

The highest layer provides a file-like interface to
the user, including user-friendly naming and
authentication
This file system maps operations to lower-level
block operations
Block storage uses Chord to identify the node
responsible for storing a block and then talks to the
block storage server on that node

11
CAN
12
What is CAN?

CAN is a distributed infrastructure that provides
hash-table-like functionality
CAN is composed of many individual nodes
Each CAN node stores a chunk (zone) of the entire
hash table
A request for a particular key is routed through
intermediate CAN nodes towards the node whose zone
contains that key
The design can be implemented at the application
level (no changes to the kernel required)

13
Co-ordinate space in CAN
14
Design Of CAN

Involves a virtual d-dimensional Cartesian
Co-ordinate space
The co-ordinate space is completely logical
Lookup keys are hashed into this space
The co-ordinate space is partitioned into zones
among all nodes in the system
Every node in the system owns a distinct zone
The assignment of zones to nodes forms an
overlay network

15
Design of CAN (continued)

To store (Key,value) pairs, keys are mapped
deterministically onto a point P in the co-ordinate
space using a hash function
The (Key,value) pair is then stored at the node
which owns the zone containing P
To retrieve the entry corresponding to key K, the
same hash function is applied to map K to the
point P
The retrieval request is routed from the requesting
node to the node owning the zone containing P
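
The store and retrieve steps above can be sketched as follows. This is a minimal illustration, not the CAN implementation: the names `key_to_point`, `Zone`, `owner`, `put` and `get` are hypothetical, the use of SHA-1 bytes as co-ordinates is an assumption, and the zone owner is found by a linear scan instead of CAN routing.

```python
import hashlib

def key_to_point(key: str, d: int = 2) -> tuple:
    """Map a key deterministically to a point in the unit d-cube [0, 1)^d."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()  # 20 bytes
    # Assumption of this sketch: 4 digest bytes per co-ordinate (so d <= 5).
    coords = []
    for i in range(d):
        chunk = int.from_bytes(digest[4 * i: 4 * i + 4], "big")
        coords.append(chunk / 2**32)
    return tuple(coords)

class Zone:
    """Axis-aligned box [lo, hi) owned by one node, with its local store."""
    def __init__(self, lo, hi):
        self.lo, self.hi = tuple(lo), tuple(hi)
        self.store = {}

    def contains(self, p):
        return all(l <= x < h for l, x, h in zip(self.lo, p, self.hi))

def owner(zones, p):
    """Return the zone containing p (linear scan; CAN would route instead)."""
    return next(z for z in zones if z.contains(p))

def put(zones, key, value):
    owner(zones, key_to_point(key)).store[key] = value

def get(zones, key):
    return owner(zones, key_to_point(key)).store.get(key)

# Four equal zones tiling the unit square, one per node.
zones = [Zone((x, y), (x + 0.5, y + 0.5)) for x in (0.0, 0.5) for y in (0.0, 0.5)]
put(zones, "song.mp3", "10.0.0.7")
print(get(zones, "song.mp3"))
```

Because the same hash function is used for both operations, `get` always lands in the zone that `put` wrote to.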

16
Routing in CAN

Every CAN node holds the IP address and virtual
co-ordinates of each of its neighbours
Every message to be routed holds the destination
co-ordinates
Using its neighbours' co-ordinate set, a node
routes a message towards the neighbour with
co-ordinates closest to the destination
co-ordinates
Progress: how much closer the message gets to the
destination after being routed to one of the
neighbours
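
Greedy forwarding of this kind can be sketched on a simplified model. The assumptions here are not from the slides: zones are equal cells of a k-per-side grid with wrap-around, a node is identified by its cell, and distance is measured as Manhattan distance on the torus. The function names are hypothetical.

```python
def torus_dist(a, b, k):
    """Manhattan distance between grid cells a and b on a k-per-side torus."""
    return sum(min(abs(x - y), k - abs(x - y)) for x, y in zip(a, b))

def neighbours(cell, k):
    """The 2d neighbours of a cell: +/-1 along each dimension, wrapping around."""
    result = []
    for dim in range(len(cell)):
        for step in (-1, 1):
            n = list(cell)
            n[dim] = (n[dim] + step) % k
            result.append(tuple(n))
    return result

def route(src, dst, k):
    """Greedy forwarding: each hop goes to the neighbour closest to dst."""
    path = [src]
    while path[-1] != dst:
        here = path[-1]
        path.append(min(neighbours(here, k), key=lambda c: torus_dist(c, dst, k)))
    return path

path = route((0, 0), (3, 6), k=8)
print(len(path) - 1, "hops:", path)
```

Each hop reduces the remaining distance by one cell, which is the "progress" the slide refers to.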

17
Routing in CAN (continued)

For a d-dimensional space partitioned into n
equal zones, routing path length is $O(d \cdot n^{1/d})$
hops
With increase in no. of nodes, routing path
length grows as $O(n^{1/d})$
Every node has 2d neighbours
With increase in no. of nodes, per-node state
does not change
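
To get a feel for these expressions, the short sketch below evaluates them for a few values of n and d. It computes the order-of-growth expressions only (constants are ignored), and the helper name `can_costs` is hypothetical.

```python
def can_costs(n: int, d: int):
    """Return (neighbours per node = 2d, path-length expression d * n**(1/d))."""
    return 2 * d, d * n ** (1.0 / d)

for n in (10**4, 10**6):
    for d in (2, 4, 8):
        state, hops = can_costs(n, d)
        print(f"n={n:>8} d={d}: neighbours={state}, hop expression ~ {hops:.1f}")
```

For a fixed d the neighbour count stays the same as n grows, while the path-length expression grows as the d-th root of n.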

18
Before a node joins CAN
19
After a Node Joins
20
Allocation of a new node to a zone

First the new node must find a node already in
CAN (using bootstrap nodes)
The new node randomly chooses a point P in the
co-ordinate space
It sends a JOIN request to point P via any
existing CAN node
The request is forwarded using the CAN routing
mechanism to the node D owning the zone
containing P
D then splits its zone in half and assigns one
half to the new node
The new neighbour information is determined for
both the nodes
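
The zone split performed by D can be sketched as below. This is an illustration under assumptions: the `Zone` class and `split` function are hypothetical, the split is made along the longest side (the slides do not say which dimension is chosen), and neighbour updates are left out.

```python
from dataclasses import dataclass, field

@dataclass
class Zone:
    lo: tuple
    hi: tuple
    store: dict = field(default_factory=dict)   # key -> (point, value)

    def contains(self, p):
        return all(l <= x < h for l, x, h in zip(self.lo, p, self.hi))

def split(zone: Zone) -> Zone:
    """Halve `zone` along its longest side; return the half for the new node."""
    sides = [h - l for l, h in zip(zone.lo, zone.hi)]
    dim = sides.index(max(sides))
    mid = (zone.lo[dim] + zone.hi[dim]) / 2
    new_lo = tuple(mid if i == dim else l for i, l in enumerate(zone.lo))
    new_zone = Zone(new_lo, zone.hi)
    zone.hi = tuple(mid if i == dim else h for i, h in enumerate(zone.hi))
    # Hand over the (key, value) pairs whose points now fall in the new half.
    for key in [k for k, (p, _) in zone.store.items() if new_zone.contains(p)]:
        new_zone.store[key] = zone.store.pop(key)
    return new_zone

d_zone = Zone((0.0, 0.0), (1.0, 1.0))
d_zone.store = {"a": ((0.2, 0.3), "v1"), "b": ((0.8, 0.1), "v2")}
new = split(d_zone)
print(d_zone.lo, d_zone.hi, sorted(d_zone.store))
print(new.lo, new.hi, sorted(new.store))
```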

21
Failure of node

Even if one of the neighbours fails, messages can
be routed through other neighbours in that
direction
If a node leaves CAN, the zone it occupies is
taken over by the remaining nodes
If a node leaves voluntarily, it can hand over
its database to some other node
When a node simply becomes unreachable, the
database of the failed node is lost
CAN depends on sources to resubmit data, to
recover lost data

22
CHORD
23
Features

CHORD is a distributed hash table implementation
Addresses a fundamental problem in P2P
Efficient location of the node that stores
desired data item
One operation: given a key, it maps the key onto a node
Data location by associating a key with each data
item
Adapts Efficiently
Dynamic with frequent node arrivals and
departures
Automatically adjusts internal tables to ensure
availability
Uses Consistent Hashing
Load balancing in assigning keys to nodes
Little movement of keys when nodes join and leave

24
Features (continued)

Efficient Routing
Distributed routing table
Maintains information about only O(logN) nodes
Resolves lookups via O(logN) messages
Scalable
Communication cost and state maintained at each
node scales logarithmically with number of nodes
Flexible Naming
Flat key-space gives applications flexibility to
map their own names to Chord keys
Decentralized

25
Some Terminology

Key
Hash key or its image under the hash function, as per
context
m-bit identifier, using SHA-1 as a base hash
function
Node
Actual node or its identifier under the hash
function
Length m chosen such that the probability of a hash
collision is low
Chord Ring
The identifier circle on which the $2^m$ possible
identifiers are ordered
Successor Node
First node whose identifier is equal to or
follows key k in the identifier space
Virtual Node
Introduced to limit the bound on keys per node to
K/N
Each real node runs O(logN) virtual nodes, each with
its own identifier

26
Chord Ring
27
Consistent Hashing

A consistent hash function is one whose key-to-node
mapping changes minimally when the set of nodes
changes, so that a total remapping is not required
Desirable properties
High probability that the hash function balances
load
Minimum disruption: only an O(1/N) fraction of the keys
is moved when a node joins or leaves
A node need not know about every other node,
only a small amount of routing information
m-bit identifier for each node and key
Key k is assigned to its successor node
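
The assignment rule can be sketched in a few lines. This is a simplified illustration: identifiers are given directly as small integers on a ring with m = 3 instead of being derived from SHA-1, and the function names are hypothetical.

```python
from bisect import bisect_left

def successor(nodes, k, m):
    """First node whose identifier is equal to or follows k on the 2**m ring."""
    ring = sorted(nodes)
    i = bisect_left(ring, k % 2**m)
    return ring[i % len(ring)]   # wrap around past the largest identifier

def assign(nodes, keys, m):
    """Map every key to the node responsible for it."""
    return {k: successor(nodes, k, m) for k in keys}

m = 3
before = assign({0, 1, 3}, [1, 2, 6], m)
after = assign({0, 1, 3, 6}, [1, 2, 6], m)     # a node with identifier 6 joins
moved = [k for k in before if before[k] != after[k]]
print(before, after, moved)
```

Only the keys that fall between the new node and its predecessor change owner, which is the "minimum disruption" property listed above.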

28
Simple Key Location
29
Example
30
Scalable Key Location

A very small amount of routing information
suffices to implement consistent hashing in a
distributed environment
Each node need only be aware of its successor
node on the circle
Queries for a given identifier can be passed
around the circle via these successor pointers
The resolution scheme is correct, BUT inefficient: it
may require traversing all N nodes!
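
A minimal sketch of this successor-only lookup, assuming the ring is represented as a plain dictionary from each node to its successor (the names `in_half_open` and `simple_lookup` are hypothetical):

```python
def in_half_open(x, a, b):
    """True if x lies in the ring interval (a, b], walking clockwise from a."""
    if a < b:
        return a < x <= b
    return x > a or x <= b      # interval wraps past zero (a == b: whole ring)

def simple_lookup(succ, start, key):
    """Follow successor pointers from `start` until the key's owner is found."""
    n, hops = start, 0
    while not in_half_open(key, n, succ[n]):
        n = succ[n]
        hops += 1
    return succ[n], hops

succ = {0: 1, 1: 3, 3: 0}      # ring with nodes 0, 1, 3 (m = 3)
print(simple_lookup(succ, 0, 2))
print(simple_lookup(succ, 1, 6))
```

The hop count grows linearly with the number of nodes between the starting node and the key, which is why the scheme does not scale.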

31
Acceleration of Lookups

Lookups are accelerated by maintaining additional
routing information
Each node maintains a routing table with (at
most) m entries (where $N \le 2^m$), called the finger
table
The i-th entry in the table at node n contains the
identity of the first node, s, that succeeds n by
at least $2^{i-1}$ on the identifier circle
(clarification on next slide)
$s = \mathrm{successor}(n + 2^{i-1})$ (all arithmetic mod $2^m$)
s is called the i-th finger of node n, denoted by
n.finger(i).node
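
The definition can be checked on a small ring. The sketch below builds a finger table with global knowledge of the ring, which a real node would not have; it is only meant to illustrate the formula, and the function names are hypothetical.

```python
from bisect import bisect_left

def successor(ring, k):
    """First node in sorted `ring` with identifier >= k, wrapping around."""
    return ring[bisect_left(ring, k) % len(ring)]

def finger_table(n, nodes, m):
    """Rows (start, interval end, successor) for fingers i = 1..m of node n."""
    ring = sorted(nodes)
    rows = []
    for i in range(1, m + 1):
        start = (n + 2 ** (i - 1)) % 2 ** m
        end = (n + 2 ** i) % 2 ** m          # finger interval is [start, end)
        rows.append((start, end, successor(ring, start)))
    return rows

for row in finger_table(0, {0, 1, 3}, m=3):
    print(row)
```

For node 0 on a ring with nodes 0, 1 and 3 and m = 3, the rows are (1, 2, 1), (2, 4, 3) and (4, 0, 0).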

32
Finger Tables (1)

Finger table of node 0 (m = 3)
start  interval  successor
1      [1,2)     1
2      [2,4)     3
4      [4,0)     0
33
Finger Tables (2) - characteristics

Each node stores information about only a small
number of other nodes, and knows more about nodes
closely following it than about nodes farther
away
A node's finger table generally does not contain
enough information to determine the successor of
an arbitrary key k
Repetitive queries to nodes that immediately
precede the given key will eventually lead to the key's
successor
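
One possible rendering of a finger-based lookup is sketched below. It is an illustrative simulation, not the pseudo code from the slides: finger tables are built with global knowledge of the ring, the lookup runs in a single process, and all function names apart from `find_successor` and `closest_preceding_node` are hypothetical.

```python
from bisect import bisect_left

def build_fingers(nodes, m):
    """fingers[n][i] = successor(n + 2**i) for i = 0..m-1 (global knowledge)."""
    ring = sorted(nodes)
    def succ(k):
        return ring[bisect_left(ring, k % 2**m) % len(ring)]
    return {n: [succ(n + 2**i) for i in range(m)] for n in ring}

def in_open(x, a, b):
    """True if x lies in the open ring interval (a, b)."""
    if a < b:
        return a < x < b
    return x > a or x < b

def closest_preceding_node(fingers, n, key):
    """Highest finger of n that lies strictly between n and key."""
    for f in reversed(fingers[n]):
        if in_open(f, n, key):
            return f
    return n

def find_successor(fingers, n, key):
    """Hop towards the key until it falls in (n, successor(n)]."""
    hops = 0
    while not (key == fingers[n][0] or in_open(key, n, fingers[n][0])):
        n = closest_preceding_node(fingers, n, key)
        hops += 1
    return fingers[n][0], hops

fingers = build_fingers({0, 1, 3}, m=3)
print(find_successor(fingers, 3, 1))
```

In this example node 3 asks node 0, the closest preceding finger it knows, and node 0 finds that its own successor, node 1, is responsible for key 1.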

34
Pseudo code
35
Example
36
Node Joins with Finger Tables
[Figure: finger tables (columns: start, interval, successor) and stored keys at each node after a node joins the ring]
37
Node Departures with Finger Tables
[Figure: finger tables (columns: start, interval, successor) and stored keys at each node after a node leaves the ring]
38
Source of Inconsistencies: Concurrent Operations
and Failures

A basic stabilization protocol is used to keep
nodes' successor pointers up to date, which is
sufficient to guarantee correctness of lookups
Those successor pointers can then be used to
verify the finger table entries
Every node runs stabilize periodically to find
newly joined nodes

39
Pseudo code
40
Pseudo Code (continued)
41
Stabilization after Join

n joins
predecessor = nil
n acquires n_s as its successor via some existing node n'
n notifies n_s that it is the new predecessor
n_s acquires n as its predecessor
n_p runs stabilize
n_p asks n_s for its predecessor (now n)
n_p acquires n as its successor
n_p notifies n
n will acquire n_p as its predecessor
all predecessor and successor pointers are now
correct
fingers still need to be fixed, but old fingers
will still work

[Figure: nodes n_p, n and n_s on the ring; succ(n_p) changes from n_s to n, and pred(n_s) changes from n_p to n]
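
The sequence above can be replayed with a small in-process sketch. It is an assumption-laden illustration and not the pseudo code from the slides: nodes are plain Python objects, calls are direct method calls instead of network messages, lookups walk successor pointers only, and finger tables are omitted.

```python
def between(x, a, b):
    """True if x lies in the open ring interval (a, b)."""
    if a < b:
        return a < x < b
    return x > a or x < b

class Node:
    def __init__(self, ident):
        self.id = ident
        self.successor = self       # a lone node is its own successor
        self.predecessor = None

    def find_successor(self, key):
        n = self
        while not (key == n.successor.id or between(key, n.id, n.successor.id)):
            n = n.successor
        return n.successor

    def join(self, known):
        """Join via an existing node; only the successor pointer is set."""
        self.predecessor = None
        self.successor = known.find_successor(self.id)

    def stabilize(self):
        """Adopt a closer successor if one has appeared, then notify it."""
        x = self.successor.predecessor
        if x is not None and between(x.id, self.id, self.successor.id):
            self.successor = x
        self.successor.notify(self)

    def notify(self, candidate):
        """candidate believes it might be our predecessor."""
        if self.predecessor is None or between(candidate.id, self.predecessor.id, self.id):
            self.predecessor = candidate

# Build a stable two-node ring: n_p = 0 and n_s = 3.
n_p, n_s = Node(0), Node(3)
n_s.join(n_p)
for _ in range(2):
    for node in (n_s, n_p):
        node.stabilize()

# Node n = 1 joins between them, then n and n_p each run stabilize once.
n = Node(1)
n.join(n_p)
n.stabilize()
n_p.stabilize()
print(n_p.successor.id, n.successor.id, n_s.predecessor.id, n.predecessor.id)
```

After the two stabilize calls the four pointers around the new node match the final state described in the list above.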
42
Failure Recovery

Key step in failure recovery is maintaining
correct successor pointers
To help achieve this, each node maintains a
successor-list of its r nearest successors on the
ring
If node n notices that its successor has failed,
it replaces it with the first live entry in the
list
stabilize will correct finger table entries and
successor-list entries pointing to failed node
Performance is sensitive to the frequency of node
joins and leaves versus the frequency at which
the stabilization protocol is invoked
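
The successor-list fallback can be sketched as follows. The ring is given as a sorted list and liveness as a set, both simplifications for illustration; the function names are hypothetical.

```python
def successor_list(n, ring, r):
    """The r nodes that follow n on the sorted ring."""
    i = ring.index(n)
    return [ring[(i + j) % len(ring)] for j in range(1, r + 1)]

def first_live(candidates, alive):
    """First entry of the successor list that still responds, or None."""
    for node in candidates:
        if node in alive:
            return node
    return None

ring = [0, 1, 3, 6]
succs = successor_list(1, ring, r=2)
alive = {0, 1, 6}                      # node 3 has failed
print(succs, first_live(succs, alive))
```

If all r entries have failed, the sketch returns `None`; in that case the lookup would have to be retried after stabilization has repaired the pointers.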

43
Impact of Node Joins on Lookups: Correctness

For a lookup before stabilization has finished,
Case 1: All finger table entries involved in the
lookup are reasonably current; the lookup then finds the
correct successor in O(logN) steps
Case 2: Successor pointers are correct, but
finger pointers are inaccurate. This scenario
yields correct lookups, but they may be slower
Case 3: Incorrect successor pointers, or keys not
yet migrated to newly joined nodes. The lookup may
fail. There is the option of retrying after a short pause,
during which stabilization fixes the successor
pointers

44
Impact of Node Joins on Lookups: Performance

After stabilization, no effect other than
increasing the value of N in O(logN)
Before stabilization is complete
Possibly incorrect finger table entries
Does not significantly affect lookup speed, since
the distance-halving property depends only on
ID-space distance
If new nodes' IDs are between the target's
predecessor and the target, then lookup speed is
influenced
Still takes O(logN) time for N new nodes

45
Handling Failures

Problem: what if a node does not know who its new
successor is after the failure of its old successor?
May be in a gap in the finger table
Chord would be stuck!
Maintain a successor list of size r, containing
the node's first r successors
If the immediate successor does not respond,
substitute the next entry in the successor list
Modified version of the stabilize protocol to
maintain the successor list
Modified closest_preceding_node to search not
only the finger table but also the successor list for
the most immediate predecessor
If find_successor fails, retry after some time
Voluntary Node Departures
Transfer keys to successor before departure
Notify predecessor p and successor s before
leaving

46
Theorems

Theorem IV.3: Inconsistencies in successor pointers are
transient
If any sequence of join operations is executed
interleaved with stabilizations, then at some time
after the last join the successor pointers will
form a cycle on all the nodes in the network.
Theorem IV.4: Lookups take O(log N) time with high
probability even if N nodes join a stable N-node
network, once successor pointers are correct,
even if finger pointers are not updated
Theorem IV.6: If the network is initially stable,
even if every node fails with probability ½,
the expected time to execute find_successor is O(log
N)

47
Simulation

Implements the iterative style (the other one is the
recursive style)
The node resolving a lookup initiates all
communication, unlike the recursive style, where
intermediate nodes forward the request
Optimizations
During stabilization, a node updates its immediate
successor and 1 other entry in its successor list or
finger table
Each of the k unique entries gets refreshed once
every k stabilization rounds
Size of successor list is 1
Immediate notification of a predecessor change to the
old predecessor, without waiting for the next
stabilization round

48
Parameters

Mean delay of each packet is 50 ms
Round trip time is 500 ms
Number of nodes is $10^4$
Number of keys varies from $10^4$ to $10^6$

49
Load Balance

Tests the ability of consistent hashing to allocate
keys to nodes evenly
Number of keys per node exhibits large variations
that increase linearly with the number of keys
Association of keys with virtual nodes
Makes the number of keys per node more uniform and
significantly improves load balance
Asymptotic value of query path length not
affected much
Total identifier space covered remains the same on
average
Worst-case number of queries does not change
Not much increase in routing state maintained
Asymptotic number of control messages not affected
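
A small experiment along these lines can be run with the sketch below. It is not the simulator used for the slides: the 32-bit identifier space, the node and key naming scheme, and the function names are all assumptions made for illustration.

```python
import hashlib
from bisect import bisect_left
from collections import Counter

M = 32  # identifier bits used by this sketch (an assumption)

def ident(name: str) -> int:
    """M-bit identifier: the SHA-1 digest reduced modulo 2**M."""
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big") % 2**M

def keys_per_node(num_nodes, num_keys, vnodes):
    """Count keys per real node when each runs `vnodes` virtual nodes."""
    ring = sorted((ident(f"node{n}/v{v}"), n)
                  for n in range(num_nodes) for v in range(vnodes))
    ids = [i for i, _ in ring]
    load = Counter({n: 0 for n in range(num_nodes)})
    for k in range(num_keys):
        pos = bisect_left(ids, ident(f"key{k}")) % len(ring)
        load[ring[pos][1]] += 1      # credit the real node behind the virtual one
    return load

for v in (1, 10):
    load = keys_per_node(num_nodes=100, num_keys=10_000, vnodes=v)
    print(f"vnodes={v}: min={min(load.values())} max={max(load.values())}")
```

Comparing the minimum and maximum load for the two settings gives a rough picture of how virtual nodes affect the spread of keys per node.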

50
In the absence of Virtual Node
51
In Presence of Virtual Nodes
52
Path Length

Number of nodes that must be visited to resolve
a query, measured as the query path length
As per Theorem IV.2
With high probability, the number of nodes that must
be contacted to find a successor in an N-node network
is O(log N)
Observed Results
Mean query path length increases logarithmically
with the number of nodes
Average: same as the expected average query path length

53
Path Length: Simulator Parameters

A network with $N = 2^k$ nodes
No. of keys: $100 \times 2^k$
k is varied from 3 to 14 and the path length is
measured

54
Graph
55
Future Work

Resilience against Network Partitions
Detect and heal partitions
Every node keeps a set of initial nodes
Maintain a long-term memory of a random set of
nodes
Likely to include nodes from the other partition
Handle Threats to Availability of Data
Malicious participants could present an incorrect
view of data
Periodic global consistency checks for each
node
Better Efficiency
O(logN) messages per lookup is too many for some
apps
Increase the number of fingers

56
References

Chord: A Scalable Peer-to-Peer Lookup Service for
Internet Applications. I. Stoica, R. Morris, D.
Karger, M. Frans Kaashoek, H. Balakrishnan. In
Proc. ACM SIGCOMM 2001. Expanded version appears
in IEEE/ACM Trans. Networking, 11(1), February
2003.
A Scalable Content-Addressable Network. S.
Ratnasamy, P. Francis, M. Handley, R. Karp, S.
Shenker. In Proc. ACM SIGCOMM 2001.
Querying the Internet with PIER. R. Huebsch,
J. M. Hellerstein, N. Lanham, B. T.
Loo, S. Shenker, I. Stoica. VLDB 2003.

57
Thank You !