Distributed Hash-based Lookup for Peer-to-Peer Systems

1
Distributed Hash-based Lookup for Peer-to-Peer Systems
  • Mohammed Junaid Azad 09305050
  • Gopal Krishnan 09305915
  • M.Tech 1, CSE

2
Agenda
  • Peer-to-Peer System
  • Initial Approaches to Peer-to-Peer Systems
  • Their Limitations
  • Distributed Hash Tables
  • CAN-Content Addressable Network
  • CHORD

3
Peer-to-Peer Systems
  • Distributed and Decentralized Architecture
  • No centralized server (unlike the client-server
    architecture)
  • Any Peer can behave as Server

4
Napster
  • P2P file sharing system
  • Central Server stores the index of all the files
    available on the network
  • To retrieve a file, the central server is contacted
    to obtain the location of the desired file
  • Not completely decentralized system
  • Central directory not scalable
  • Single point of failure

5
Gnutella
  • P2P file sharing system
  • No central server stores an index of the files
    available on the network
  • File location process decentralized as well
  • Requests for files are flooded on the network
  • No Single point of failure
  • Flooding on every request not scalable

6
File Systems for P2P systems
  • The file system would store files and their
    metadata across nodes in the P2P network
  • The nodes containing blocks of files could be
    located using hash based lookup
  • The blocks would then be fetched from those nodes

7
Scalable File indexing Mechanism
  • In any P2P system, the file transfer process is
    inherently scalable
  • However, the indexing scheme which maps file
    names to locations is crucial for scalability
  • Solution: Distributed Hash Table

8
Distributed Hash Tables
  • Traditional name and location services provide a
    direct mapping between keys and values
  • What are examples of values? A value can be an
    address, a document, or an arbitrary data item
  • Distributed hash tables such as CAN/Chord
    implement a distributed service for storing and
    retrieving key/value pairs

9
DNS vs. Chord/CAN
  • DNS
      • provides a host name to IP address mapping
      • relies on a set of special root servers
      • names reflect administrative boundaries
      • is specialized to finding named hosts or
        services
  • Chord
      • can provide the same service: name = key,
        IP address = value
      • requires no special servers
      • imposes no naming structure
      • can also be used to find data objects that
        are not tied to certain machines

10
Example Application using Chord: Cooperative Mirroring
  • Highest layer provides a file-like interface to
    user including user-friendly naming and
    authentication
  • This file system maps operations to lower-level
    block operations
  • Block storage uses Chord to identify responsible
    node for storing a block and then talk to the
    block storage server on that node

11
CAN
12
What is CAN ?
  • CAN is a distributed infrastructure that provides
    hash table like functionality
  • CAN is composed of many individual nodes
  • Each CAN node stores a chunk (zone) of the entire
    hash table
  • A request for a particular key is routed by
    intermediate CAN nodes towards the node whose
    zone contains that key
  • The design can be implemented at the application
    level (no changes to the kernel required)

13
Co-ordinate space in CAN
14
Design Of CAN
  • Involves a virtual d-dimensional Cartesian
    Co-ordinate space
  • The co-ordinate space is completely logical
  • Lookup keys hashed into this space
  • The co-ordinate space is partitioned into zones
    among all nodes in the system
  • Every node in the system owns a distinct zone
  • The assignment of zones to nodes forms an
    overlay network

15
Design of CAN (continued)
  • To store (Key,value) pairs, keys are mapped
    deterministically onto a point P in co-ordinate
    space using a hash function
  • The (Key,value) pair is then stored at the node
    which owns the zone containing P
  • To retrieve an entry corresponding to Key K, the
    same hash function is applied to map K to the
    point P
  • The retrieval request is routed from requestor
    node to node owning zone containing P

16
Routing in CAN
  • Every CAN node holds IP address and virtual
    co-ordinates of each of its neighbours
  • Every message to be routed holds the destination
    co-ordinates
  • Using its neighbours' co-ordinate set, a node
    routes a message towards the neighbour with
    co-ordinates closest to the destination
    co-ordinates
  • Progress: how much closer the message gets to the
    destination after being routed to one of the
    neighbours
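
A minimal sketch of this greedy forwarding rule follows, under simplifying assumptions that are not in the slides: zones are equal cells of a k-per-side grid that wraps around, each node is identified by its integer cell co-ordinates, and the destination is given as a cell rather than an arbitrary point. The function names are hypothetical.

```python
def torus_dist(a, b, k):
    """Sum over dimensions of the wrap-around distance on a k-per-side grid."""
    return sum(min(abs(x - y), k - abs(x - y)) for x, y in zip(a, b))

def neighbours(node, k):
    """The 2d cells adjacent to `node`, one step away in a single dimension."""
    result = []
    for dim in range(len(node)):
        for step in (-1, 1):
            nb = list(node)
            nb[dim] = (nb[dim] + step) % k
            result.append(tuple(nb))
    return result

def route(src, dst, k):
    """Forward hop by hop to the neighbour closest to the destination."""
    path = [src]
    current = src
    while current != dst:
        current = min(neighbours(current, k),
                      key=lambda nb: torus_dist(nb, dst, k))
        path.append(current)
    return path

path = route((0, 0), (3, 2), k=8)
print(path, "hops:", len(path) - 1)
```

Each hop reduces the remaining distance by one cell, which is the "progress" the last bullet refers to.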

17
Routing in CAN (continued)
  • For a d-dimensional space partitioned into n
    equal zones, the routing path length is
    O(d · n^(1/d)) hops
  • With an increase in the number of nodes, the
    routing path length grows as O(n^(1/d))
  • Every node has 2d neighbours
  • With an increase in the number of nodes, the
    per-node state does not change
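
To make the trade-off concrete, the sketch below evaluates the bound d · n^(1/d) and the neighbour count 2d for a few dimensions. It ignores the constant factor hidden by the O-notation, so the numbers are indicative only; the function names are hypothetical.

```python
def can_path_bound(n, d):
    """Path-length bound d * n^(1/d), ignoring the constant factor."""
    return d * n ** (1.0 / d)

def neighbours_per_node(d):
    """Per-node routing state: two neighbours per dimension, independent of n."""
    return 2 * d

for d in (2, 3, 4):
    for n in (10 ** 4, 10 ** 6):
        print(f"d={d} n={n}: bound {can_path_bound(n, d):.1f} hops, "
              f"{neighbours_per_node(d)} neighbours")
```

Raising d shortens paths at the cost of a few more neighbours per node, while growing n leaves the per-node state untouched.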

18
Before a node joins CAN
19
After a Node Joins
20
Allocation of a new node to a zone
  1. First the new node must find a node already in
    CAN (using bootstrap nodes)
  2. The new node randomly chooses a point P in the
    co-ordinate space
  3. It sends a JOIN request to point P via any
    existing CAN node
  4. The request is forwarded using CAN routing
    mechanism to the node D owning the zone
    containing P
  5. D then splits its zone in half and assigns one
    half to the new node
  6. The new neighbour information is determined for
    both the nodes
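
The zone split in step 5 can be sketched as follows. This is an illustration only: the zone representation (one (low, high) interval per dimension), the choice of split dimension and the names split_zone, contains and join are assumptions, and the owner D is found by scanning a table instead of routing the JOIN request.

```python
def split_zone(zone, dim=0):
    """Split a zone in half along `dim`; return (kept_half, new_half)."""
    lo, hi = zone[dim]
    mid = (lo + hi) / 2
    kept = list(zone)
    kept[dim] = (lo, mid)
    new = list(zone)
    new[dim] = (mid, hi)
    return kept, new

def contains(zone, point):
    return all(lo <= x < hi for (lo, hi), x in zip(zone, point))

def join(zones, new_node, point, dim=0):
    """Find the owner D of `point`, split D's zone, give one half away."""
    owner = next(name for name, z in zones.items() if contains(z, point))
    kept, new = split_zone(zones[owner], dim)
    zones[owner] = kept
    zones[new_node] = new
    return owner

zones = {"D": [(0.0, 1.0), (0.0, 1.0)]}   # D initially owns the whole space
join(zones, "N", point=(0.7, 0.2))
print(zones)
```

After the split, both nodes would recompute their neighbour sets (step 6), which this sketch leaves out.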

21
Failure of node
  • Even if one of the neighbours fails, messages can
    be routed through other neighbours in that
    direction
  • If a node leaves CAN, the zone it occupies is
    taken over by the remaining nodes
  • If a node leaves voluntarily, it can hand over
    its database to some other node
  • When a node simply becomes unreachable, the
    database of the failed node is lost
  • CAN depends on sources to resubmit data, to
    recover lost data

22
CHORD
23
Features
  • CHORD is a distributed hash table implementation
  • Addresses a fundamental problem in P2P
      • Efficient location of the node that stores
        a desired data item
      • One operation: given a key, it maps the key
        onto a node
      • Data location by associating a key with each
        data item
  • Adapts efficiently
      • Handles dynamic networks with frequent node
        arrivals and departures
      • Automatically adjusts internal tables to
        ensure availability
  • Uses consistent hashing
      • Load balancing in assigning keys to nodes
      • Little movement of keys when nodes join and
        leave
24
Features (continued)
  • Efficient routing
      • Distributed routing table
      • Each node maintains information about only
        O(logN) other nodes
      • Resolves lookups via O(logN) messages
  • Scalable
      • Communication cost and state maintained at
        each node scale logarithmically with the
        number of nodes
  • Flexible naming
      • Flat key-space gives applications flexibility
        to map their own names to Chord keys
  • Decentralized

25
Some Terminology
  • Key
      • Hash key or its image under the hash
        function, as per context
      • m-bit identifier, using SHA-1 as a base hash
        function
  • Node
      • Actual node or its identifier under the hash
        function
      • The identifier length m is chosen large
        enough that the probability of a hash
        collision is low
  • Chord Ring
      • The identifier circle on which the 2^m
        identifiers are ordered
  • Successor Node
      • First node whose identifier is equal to or
        follows key k in the identifier space
  • Virtual Node
      • Introduced to bring the bound on keys per
        node closer to K/N
      • Each real node runs O(logN) virtual nodes,
        each with its own identifier

26
Chord Ring
27
Consistent Hashing
  • A consistent hash function is one whose mapping
    changes minimally when the range of the function
    (the set of nodes) changes, so a total remapping
    is not required
  • Desirable properties
      • High probability that the hash function
        balances load
      • Minimum disruption: only an O(1/N) fraction
        of the keys moves when a node joins or leaves
      • A node need not know about every other node,
        only a small amount of routing information
  • m-bit identifier for each node and key
  • Key k is assigned to its successor node
28
Simple Key Location
29
Example
30
Scalable Key Location
  • A very small amount of routing information
    suffices to implement consistent hashing in a
    distributed environment
  • Each node need only be aware of its successor
    node on the circle
  • Queries for a given identifier can be passed
    around the circle via these successor pointers
  • The resolution scheme is correct but inefficient:
    it may require traversing all N nodes!
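
The successor-only scheme can be sketched as a walk around the circle. The dictionary of successor pointers stands in for remote calls between nodes, and the helper names are hypothetical.

```python
def in_half_open(x, a, b):
    """True if x lies in the ring interval (a, b]."""
    if a < b:
        return a < x <= b
    return x > a or x <= b        # interval wraps around (or a == b: whole ring)

def simple_lookup(succ, start, key):
    """Follow successor pointers until the key falls in (n, successor(n)]."""
    n = start
    hops = 0
    while not in_half_open(key, n, succ[n]):
        n = succ[n]
        hops += 1
    return succ[n], hops

succ = {0: 1, 1: 3, 3: 0}         # ring with nodes 0, 1, 3
print(simple_lookup(succ, 0, 2))  # key 2 is stored at node 3
print(simple_lookup(succ, 0, 6))  # key 6 wraps around to node 0
```

The hop count grows linearly with the number of nodes between the start and the key, which is the inefficiency noted above.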

31
Acceleration of Lookups
  • Lookups are accelerated by maintaining additional
    routing information
  • Each node maintains a routing table with (at
    most) m entries (where 2^m is the size of the
    identifier space), called the finger table
  • The ith entry in the table at node n contains the
    identity of the first node, s, that succeeds n by
    at least 2^(i-1) on the identifier circle
    (clarification on next slide)
  • s = successor(n + 2^(i-1)) (all arithmetic
    mod 2^m)
  • s is called the ith finger of node n, denoted by
    n.finger(i).node
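
The finger definition s = successor(n + 2^(i-1)) mod 2^m can be computed directly. The sketch below assumes a global sorted list of node identifiers, which real nodes do not have, and uses an example ring with nodes 0, 1 and 3 on a 3-bit circle; the function names are hypothetical.

```python
import bisect

def successor(node_ids, k):
    """First node whose identifier is equal to or follows k (with wrap-around)."""
    i = bisect.bisect_left(node_ids, k)
    return node_ids[i % len(node_ids)]

def finger_table(n, node_ids, m):
    """List of (start, finger node) pairs for i = 1..m at node n."""
    table = []
    for i in range(1, m + 1):
        start = (n + 2 ** (i - 1)) % (2 ** m)
        table.append((start, successor(node_ids, start)))
    return table

ring = [0, 1, 3]
for node in ring:
    print(node, finger_table(node, ring, m=3))
```

For node 0 this gives starts 1, 2, 4 with finger nodes 1, 3, 0.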

32
Finger Tables (1)
Finger table of node 0 (3-bit identifiers):
  start   int.     succ.
  1       [1,2)    1
  2       [2,4)    3
  4       [4,0)    0
33
Finger Tables (2) - characteristics
  • Each node stores information about only a small
    number of other nodes, and knows more about nodes
    closely following it than about nodes farther
    away
  • A node's finger table generally does not contain
    enough information to determine the successor of
    an arbitrary key k
  • Repeated queries to nodes that immediately
    precede the given key will eventually lead to the
    key's successor
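
A sketch of this finger-based lookup follows. It mirrors the idea of forwarding to the closest finger that precedes the key, but it is not the deck's pseudo code: finger tables are built from a global list of identifiers, recursion replaces remote calls, and all names are hypothetical.

```python
import bisect

def successor(ids, k):
    i = bisect.bisect_left(ids, k)
    return ids[i % len(ids)]

def between_open(x, a, b):
    """x in the ring interval (a, b)."""
    if a < b:
        return a < x < b
    return x > a or x < b

def between_half_open(x, a, b):
    """x in the ring interval (a, b]."""
    if a < b:
        return a < x <= b
    return x > a or x <= b

def build_fingers(ids, m):
    """Finger nodes for every node; entry 0 is the immediate successor."""
    return {n: [successor(ids, (n + 2 ** (i - 1)) % 2 ** m)
                for i in range(1, m + 1)] for n in ids}

def find_successor(fingers, n, key, hops=0):
    succ = fingers[n][0]
    if between_half_open(key, n, succ):
        return succ, hops
    # Forward to the closest finger that precedes the key.
    for f in reversed(fingers[n]):
        if between_open(f, n, key):
            return find_successor(fingers, f, key, hops + 1)
    return succ, hops  # not reached when the finger tables are correct

ids = [0, 1, 3]
fingers = build_fingers(ids, m=3)
print(find_successor(fingers, 3, 2))   # node 3 asks for key 2
```

Starting at node 3, the query for key 2 is forwarded to node 0 and then node 1, whose successor 3 holds the key.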

34
Pseudo code
35
Example
36
Node Joins with Finger Tables
[Figure: finger tables (columns: start, int., succ.) and stored keys of nodes 0 and 3, with entries updated after node 6 joins]
37
Node Departures with Finger Tables
[Figure: finger tables (columns: start, int., succ.) and stored keys of nodes 0, 1, 3 and 6, with entries updated after a node departs]
38
Source of Inconsistencies: Concurrent Operations and Failures
  • A basic stabilization protocol is used to keep
    nodes' successor pointers up to date, which is
    sufficient to guarantee correctness of lookups
  • Those successor pointers can then be used to
    verify the finger table entries
  • Every node runs stabilize periodically to find
    newly joined nodes

39
Pseudo code
40
Pseudo Code (continued)
41
Stabilization after Join
  • n joins (ns is n's successor; np is the old
    predecessor of ns)
      • predecessor = nil
      • n acquires ns as its successor via some
        existing node n'
      • n notifies ns that it is ns's new predecessor
      • ns acquires n as its predecessor
  • np runs stabilize
      • np asks ns for its predecessor (now n)
      • np acquires n as its successor
      • np notifies n
      • n acquires np as its predecessor
  • All predecessor and successor pointers are now
    correct
  • Fingers still need to be fixed, but old fingers
    will still work
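
The join sequence above can be replayed with a small in-memory model. This is a sketch, not the deck's pseudo code: nodes are plain objects, the joining node is handed its successor directly instead of locating it through an existing node, and the identifiers 10, 20 and 30 are arbitrary.

```python
def between_open(x, a, b):
    """x in the ring interval (a, b)."""
    if a < b:
        return a < x < b
    return x > a or x < b

class Node:
    def __init__(self, ident):
        self.id = ident
        self.successor = self
        self.predecessor = None

    def join(self, known_successor):
        # Simplification: a real node would look up its successor via n'.
        self.predecessor = None
        self.successor = known_successor

    def stabilize(self):
        """Adopt a closer successor if one appeared, then notify it."""
        x = self.successor.predecessor
        if x is not None and between_open(x.id, self.id, self.successor.id):
            self.successor = x
        self.successor.notify(self)

    def notify(self, candidate):
        """`candidate` thinks it might be our predecessor."""
        if (self.predecessor is None
                or between_open(candidate.id, self.predecessor.id, self.id)):
            self.predecessor = candidate

# Stable two-node ring: np -> ns -> np
np_, ns = Node(10), Node(30)
np_.successor, ns.successor = ns, np_
np_.predecessor, ns.predecessor = ns, np_

n = Node(20)
n.join(ns)
n.stabilize()     # n notifies ns; ns acquires n as its predecessor
np_.stabilize()   # np learns about n from ns, adopts it, and notifies n
print(np_.successor.id, n.successor.id, ns.predecessor.id, n.predecessor.id)
```

After the two stabilize calls the successor and predecessor pointers of np, n and ns are mutually consistent; finger tables are outside this sketch.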

[Figure: ring segment np, n, ns. Before the join: succ(np) = ns and pred(ns) = np. After stabilization: succ(np) = n and pred(ns) = n.]
42
Failure Recovery
  • Key step in failure recovery is maintaining
    correct successor pointers
  • To help achieve this, each node maintains a
    successor-list of its r nearest successors on the
    ring
  • If node n notices that its successor has failed,
    it replaces it with the first live entry in the
    list
  • stabilize will correct finger table entries and
    successor-list entries pointing to failed node
  • Performance is sensitive to the frequency of node
    joins and leaves versus the frequency at which
    the stabilization protocol is invoked
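
The successor-list fallback can be sketched as below. The ring contents, the list length r = 3 and the set of failed nodes are made-up example values, and liveness is modelled as set membership rather than a timeout on a remote call.

```python
def successor_list(ring_ids, n, r):
    """The r nodes that follow n clockwise on the sorted ring."""
    i = ring_ids.index(n)
    return [ring_ids[(i + j) % len(ring_ids)] for j in range(1, r + 1)]

def first_live_successor(succ_list, failed):
    """First entry of the successor list that has not failed, else None."""
    for candidate in succ_list:
        if candidate not in failed:
            return candidate
    return None

ring = [1, 8, 14, 21, 32]
succs = successor_list(ring, 8, r=3)
print(succs, first_live_successor(succs, failed={14}))
```

A lookup only gets stuck if all r entries fail at once, which is why larger r tolerates more simultaneous failures.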

43
Impact of Node Joins on Lookups: Correctness
  • For a lookup before stabilization has finished:
      • Case 1: All finger table entries involved in
        the lookup are reasonably current; the lookup
        then finds the correct successor in O(logN)
        steps
      • Case 2: Successor pointers are correct, but
        finger pointers are inaccurate. Lookups are
        still correct but may be slower
      • Case 3: Successor pointers are incorrect, or
        keys have not yet migrated to newly joined
        nodes. The lookup may fail. It can be retried
        after a short pause, during which
        stabilization fixes the successor pointers

44
Impact of Node Joins on Lookups: Performance
  • After stabilization
      • No effect other than increasing the value of
        N in O(logN)
  • Before stabilization is complete
      • Finger table entries are possibly incorrect
      • This does not significantly affect lookup
        speed, since the distance-halving property
        depends only on ID-space distance
      • If new nodes' IDs lie between the target's
        predecessor and the target, lookup speed is
        affected
      • Lookups still take O(logN) time for N new
        nodes

45
Handling Failures
  • Problem: what if a node does not know who its new
    successor is after its old successor fails?
      • The new successor may lie in a gap in the
        finger table
      • Chord would be stuck!
  • Maintain a successor list of size r, containing
    the node's first r successors
      • If the immediate successor does not respond,
        substitute the next entry in the successor
        list
      • A modified version of the stabilize protocol
        maintains the successor list
      • closest_preceding_node is modified to search
        not only the finger table but also the
        successor list for the most immediate
        predecessor
      • If find_successor fails, retry after some time
  • Voluntary node departures
      • Transfer keys to the successor before
        departure
      • Notify predecessor p and successor s before
        leaving

46
Theorems
  • Theorem IV.3: Inconsistencies in successor
    pointers are transient
      • If any sequence of join operations is
        executed interleaved with stabilizations,
        then at some time after the last join the
        successor pointers will form a cycle on all
        the nodes in the network
  • Theorem IV.4: Lookups take O(log N) time with
    high probability even if N nodes join a stable
    N-node network, once successor pointers are
    correct, even if finger pointers are not yet
    updated
  • Theorem IV.6: If the network is initially stable,
    then even if every node fails with probability ½,
    the expected time to execute find_successor is
    O(log N)

47
Simulation
  • Implements the iterative style (the other one is
    the recursive style)
      • The node resolving a lookup initiates all
        communication, unlike the recursive style,
        where intermediate nodes forward the request
  • Optimizations
      • During stabilization, a node updates its
        immediate successor and 1 other entry in the
        successor list or finger table
      • Each of the k unique entries gets refreshed
        once in k stabilization rounds
      • Size of successor list is 1
      • Immediate notification of a predecessor
        change to the old predecessor, without
        waiting for the next stabilization round

48
Parameters
  • Mean delay of each packet is 50 ms
  • Round trip time is 500 ms
  • Number of nodes is 10^4
  • Number of keys varies from 10^4 to 10^6

49
Load Balance
  • Tests the ability of consistent hashing to
    allocate keys to nodes evenly
  • The number of keys per node exhibits large
    variations, which increase linearly with the
    number of keys
  • Associating keys with virtual nodes
      • Makes the number of keys per node more
        uniform and significantly improves load
        balance
      • Asymptotic value of query path length is not
        affected much
      • Total identifier space covered remains the
        same on average
      • Worst-case number of queries does not change
      • Not much increase in routing state maintained
      • Asymptotic number of control messages is not
        affected
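
The effect of virtual nodes on load balance can be reproduced with a small experiment. This is not the simulator used in the deck: the node and key names, the 32-bit identifier length, the network size (100 nodes, 10,000 keys) and the choice of 20 virtual nodes per real node are assumptions made for illustration.

```python
import bisect
import hashlib

def h(s, m=32):
    """m-bit identifier derived from SHA-1."""
    return int.from_bytes(hashlib.sha1(s.encode("utf-8")).digest(), "big") % 2 ** m

def key_counts(num_nodes, num_keys, vnodes):
    """Keys stored per real node when each runs `vnodes` virtual nodes."""
    ring = sorted((h(f"node-{n}#{v}"), n)
                  for n in range(num_nodes) for v in range(vnodes))
    points = [p for p, _ in ring]
    counts = [0] * num_nodes
    for k in range(num_keys):
        i = bisect.bisect_left(points, h(f"key-{k}")) % len(ring)
        counts[ring[i][1]] += 1      # credit the owning real node
    return counts

plain = key_counts(100, 10_000, vnodes=1)
virtual = key_counts(100, 10_000, vnodes=20)
print("no virtual nodes: min", min(plain), "max", max(plain))
print("20 virtual nodes: min", min(virtual), "max", max(virtual))
```

With virtual nodes each real node owns many small arcs of the circle instead of one arc of random length, so the per-node key counts cluster more tightly around K/N.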

50
In the absence of Virtual Node
51
In Presence of Virtual Nodes
52
Path Length
  • Number of nodes that must be visited to resolve
    a query, measured as the query path length
  • As per Theorem IV.2
      • The number of nodes that must be contacted to
        find a successor in an N-node network is
        O(log N)
  • Observed results
      • Mean query path length increases
        logarithmically with the number of nodes
      • The average matches the expected average
        query path length

53
Path Length Simulator Parameters
  • A network with N = 2^k nodes
  • Number of keys: 100 × 2^k
  • k is varied from 3 to 14 and the path length is
    measured

54
Graph
55
Future Work
  • Resilience against network partitions
      • Detect and heal partitions
      • For every node, keep a set of initial nodes
      • Maintain a long-term memory of a random set
        of nodes
      • Likely to include nodes from the other
        partition
  • Handle threats to availability of data
      • Malicious participants could present an
        incorrect view of data
      • Periodic global consistency checks for each
        node
  • Better efficiency
      • O(logN) messages per lookup is too many for
        some apps
      • Increase the number of fingers

56
References
  • Chord: A Scalable Peer-to-Peer Lookup Service for
    Internet Applications. I. Stoica, R. Morris, D.
    Karger, M. Frans Kaashoek, H. Balakrishnan. In
    Proc. ACM SIGCOMM 2001. Expanded version appears
    in IEEE/ACM Trans. Networking, 11(1), February
    2003.
  • A Scalable Content-Addressable Network. S.
    Ratnasamy, P. Francis, M. Handley, R. Karp, S.
    Shenker. In Proc. ACM SIGCOMM 2001.
  • Querying the Internet with PIER. Ryan Huebsch,
    Joseph M. Hellerstein, Nick Lanham, Boon Thau
    Loo, Scott Shenker, and Ion Stoica. VLDB 2003.

57
Thank You!
58
  • Any Questions?