Data in P2P Systems - PowerPoint PPT Presentation

1
Data in P2P Systems
  • Deepak Verma
  • Sanjay prasad

2
What are P2P systems
  • Every node is designed to provide some service
    that helps other nodes in the network to get
    service.
  • The main goal of peer-to-peer networking is to
    allow users to share files without putting them
    on central servers. P2P systems are used mainly
    for file sharing.
  • P2P is not a new concept. IP routers are peer
    to peer.

3
Client Server vs Peer-to-Peer
4
P2P System Goals
  • Cost sharing/reduction.
  • Improved scalability/reliability.
  • Resource aggregation and interoperability
  • Increased autonomy
  • Anonymity/privacy
  • Dynamism
  • Enabling ad-hoc communication and collaboration.

5
Computer system Taxonomy
  • Computer systems: centralized systems
    (mainframes, SMPs, workstations) and distributed
    systems
  • Distributed systems: client-server and peer to
    peer
  • Client-server: flat or hierarchical
  • Peer to peer: pure or hybrid
6
P2P Taxonomy
  • P2P systems: distributed computing, file
    sharing, collaboration, platforms
7
  • Parallelizable: Parallelizable P2P applications
    split a large task into smaller sub-pieces that
    can execute in parallel over a number of
    independent peer nodes.
  • Content and file management: Content and file
    management P2P applications focus on storing
    information on and retrieving information from
    various peers in the network.

8
  • Collaborative: Collaborative P2P applications
    allow users to collaborate, in real time, without
    relying on a central server to collect and relay
    information.

9
P2P System Architecture
  • Application-specific layer: tools, applications,
    services
  • Class-specific layer: scheduling, meta-data,
    messaging, management
  • Robustness layer: security, resource
    aggregation, reliability
  • Group management layer: discovery, locating and
    routing
  • Communication layer: communication
10
  • Communication: The fundamental challenge of
    communication in a P2P community is overcoming
    the problems associated with the dynamic nature
    of peers.
  • Group management: This includes discovery of
    other peers in the community and location and
    routing between those peers.
  • Robustness: There are three main components that
    are essential to maintaining robust P2P systems:
    security, resource aggregation and reliability.

11
  • Class specific: Components at this layer
    abstract functionality from each class of P2P
    application.
  • Application specific: Tools, applications, and
    services implement application-specific
    functionality, such as content and file
    management.

12
P2P system characteristics
  • Systems discover topology and maintain it.
  • Systems are neither client nor server.
  • Systems continually talk to each other.
  • They are inherently fault tolerant.
  • Systems are autonomous.

13
P2P system characteristics
  • Systems have no distinguished role.
  • No single point of bottleneck or failure.
  • Routing will depend on service and data.
  • Some applications (like Napster) are a mix of P2P
    and centralized systems.

14
P2P file sharing vs. web resource sharing
  • For an ordinary user, a P2P file-sharing system
    is easier to set up than maintaining a web
    server. Applications like Napster provide an
    easy-to-use interface.
  • P2P users can share specific types of content,
    such as music and video.
  • Massive storage is possible because of
    distributed sharing.

15
P2P Architecture Classification
  • Centralized service location (CSL)
  • Napster
  • Distributed service location with flooding (DSLF)
  • Gnutella
  • Distributed service location with hashing (DSLH)
  • CAN, Pastry, Tapestry, Chord

16
Centralized Service Architecture
17
Distributed Search/Flooding
18
Distributed Search/Flooding
19
Distributed search with hash table
20
Distributed search with hash table
21
Comparison
22
Comparison
  • CSL has a service bottleneck
  • DSLF occasionally fails to find a file
  • DSLH is more scalable
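
The occasional misses under flooding can be illustrated with a minimal sketch. It assumes a Gnutella-style query that each peer forwards to its neighbours until a hop limit (TTL) runs out; the overlay graph, the peer names, and the `flood_search` function are hypothetical and not taken from the slides.

```python
from collections import deque

def flood_search(neighbours, files, start, wanted, ttl):
    """Breadth-first flood from `start`; return the peers holding `wanted`
    that the query reaches within `ttl` hops."""
    seen = {start}
    queue = deque([(start, ttl)])
    hits = []
    while queue:
        peer, hops_left = queue.popleft()
        if wanted in files.get(peer, ()):
            hits.append(peer)
        if hops_left == 0:
            continue  # the query expires here and is not forwarded further
        for nxt in neighbours[peer]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, hops_left - 1))
    return hits

# Hypothetical overlay: a chain A - B - C - D, with the file stored only at D.
neighbours = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
files = {"D": {"song.mp3"}}

near = flood_search(neighbours, files, "A", "song.mp3", ttl=2)  # stops at C
far = flood_search(neighbours, files, "A", "song.mp3", ttl=3)   # reaches D
```

With a TTL of 2 the query never reaches D, so the file is reported as not found even though it exists in the network; raising the TTL finds it at the cost of more messages.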

23
Data-Sharing P2P Systems: Open Problems
24
Implementation Choices
  • Topology
  • How elements are put together
  • From free to rigid topologies
  • Data and meta-data placement
  • Gnutella, super-peer networks, Chord
  • Message routing
  • Query language
  • Keyword-based (less expressive), SQL-like (more
    expressive)

25
Resulting properties
  • Autonomy
  • How, to whom and what to share, how to answer,
    etc
  • Robustness
  • Maintain the quality of searches in the
    presence of failures
  • Efficiency
  • Absolute resources consumed (processing power,
    disk storage, bandwidth, etc)
  • Greater efficiency means higher throughput
  • Accuracy of answers
  • Depends mostly on the query language
  • Comprehensiveness
  • Sometimes not achievable (partial search)

26
Research Challenges
  • Autonomy/Efficiency correlation
  • Autonomy/Robustness correlation
  • Finding techniques to decouple them, or the best
    tradeoffs
  • More autonomy, but a more complex search
    algorithm
  • Data and meta-data replication techniques

27
Research Challenges, contd
  • Quality of Service (QoS)
  • Different metrics to measure (number of results,
    response time, comprehensiveness, etc)
  • Problem of achieving the required QoS as
    efficiently as possible
  • E.g., number of results is important for QoS in
    Gnutella: the QoS/efficiency tradeoff
  • Problem of making QoS invariant to changes in
    other factors (topology, number of nodes, etc.)

28
The Lookup Problem
  • The lookup problem is simple to state: given a
    data item X stored at some dynamic set of nodes
    in the system, find it.
  • One approach is to maintain a central database
    that maps a file name to the locations of the
    servers that store the file.
  • The traditional approach to achieving scalability
    is to use hierarchy.
  • Symmetric lookup algorithms: unlike in a
    hierarchy, no node is more important than any
    other node as far as the lookup process is
    concerned.

29
Distributed Hash Table (DHT)
  • A hash-table interface is an attractive
    foundation for a distributed lookup algorithm
    because it places few constraints on the
    structure of keys or the values they name.
  • The main requirements are that data be identified
    using unique numeric keys, and that nodes be
    willing to store keys for each other.

30
Distributed Hash Table (cont.)
  • A DHT implements just one operation: lookup(key)
    yields the network location of the node currently
    responsible for the given key.
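
As a rough illustration of the `lookup(key)` interface, the sketch below hashes node addresses and keys onto one ring of numeric IDs and returns the first node at or after the key. The `ToyDHT` class, the 16-bit ID space, the use of SHA-1, and the addresses are assumptions made for this example. The table here has global knowledge of all nodes, which a real DHT avoids by routing between nodes.

```python
import hashlib
from bisect import bisect_left

ID_BITS = 16  # small ID space for readability; an assumption of this sketch

def to_id(name: str) -> int:
    """Hash a string onto the ring of 2**ID_BITS numeric identifiers."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** ID_BITS)

class ToyDHT:
    def __init__(self, addresses):
        # Sorted (node_id, address) pairs form the ring.
        self.ring = sorted((to_id(a), a) for a in addresses)
        self.ids = [node_id for node_id, _ in self.ring]

    def lookup(self, key: str) -> str:
        """Return the address of the node currently responsible for `key`:
        the first node whose ID is equal to or follows the key's ID."""
        i = bisect_left(self.ids, to_id(key))
        return self.ring[i % len(self.ring)][1]  # wrap around past the top

# Hypothetical node addresses.
dht = ToyDHT(["10.0.0.1:4000", "10.0.0.2:4000", "10.0.0.3:4000"])
owner = dht.lookup("song.mp3")
```

Because both nodes and keys pass through the same hash function, keys spread roughly evenly over the nodes, and adding or removing a node only changes the owner of the keys next to it on the ring.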

31
Distributed Hash Table (cont.)
  • Mapping keys to nodes in a load-balanced way.
  • Forwarding a lookup for a key to an appropriate
    node.
  • Distance function.
  • Building routing tables adaptively.

32
P2P Algorithms
  • Routing in One Dimension
  • Chord: skiplist-like routing
  • Pastry: tree-like data structure
  • Routing in Multiple Dimensions
  • Content-Addressable Network (CAN)

33
Example of Chord
34
Lookup in Chord
  • Each node in Chord has a finger table containing
    the IP address of a node halfway around the ID
    space from it, a quarter of the way, and so forth
    in powers of two.
  • A node forwards a query for key k to the node in
    its finger table with the highest ID not
    exceeding k. The node that stores k is its
    successor: the first node whose ID is equal to
    or follows k.

35
Lookup in Chord (cont.)
  • The power-of-two structure of the finger table
    ensures that the node can always forward the
    query at least half of the remaining ID-space
    distance to k.
  • As a result, Chord lookups use O(log N) messages
    to resolve a query.
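
The finger-table routing described on these slides can be sketched as a small simulation. The 6-bit ID space, the node IDs, and the function names are assumptions chosen for illustration. Every function reads one global node list, whereas a real Chord node would only know its own fingers.

```python
from bisect import bisect_left

M = 6                 # bits in the ID space (assumed for this sketch)
SIZE = 2 ** M
NODES = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]   # hypothetical node IDs, sorted

def dist(a, b):
    """Clockwise distance from a to b on the ring."""
    return (b - a) % SIZE

def successor(ident):
    """First node whose ID equals or follows ident."""
    i = bisect_left(NODES, ident % SIZE)
    return NODES[i % len(NODES)]

def finger_table(n):
    """Finger i points at successor(n + 2**i); the last entry is halfway round."""
    return [successor(n + 2 ** i) for i in range(M)]

def lookup(start, key):
    """Route a query for key from start; return (responsible node, messages)."""
    n, hops = start, 0
    while key != n:
        succ = successor(n + 1)
        if dist(n, key) <= dist(n, succ):
            return succ, hops + 1      # key lies in (n, succ]: succ owns it
        # Forward to the finger farthest round the ring that still precedes key.
        preceding = [f for f in finger_table(n) if 0 < dist(n, f) < dist(n, key)]
        n = max(preceding, key=lambda f: dist(n, f))
        hops += 1
    return n, hops

owner, hops = lookup(8, 54)
```

In this ring the query from node 8 for key 54 travels 8 → 42 → 51 → 56, three messages in total. Each forwarding step at least halves the remaining distance to the node just before the key, so no lookup in an M-bit ring needs more than M + 1 messages.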

36
Lookup in Chord (cont.)
  • Chord ensures correct lookups in the face of node
    failures and arrivals using a successor list:
    each node keeps track of the IP addresses of the
    next r nodes immediately after it in ID space.
  • This allows a query to make incremental progress
    in ID space even if many finger-table entries
    turn out to point to failed or nonexistent nodes.
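
A minimal sketch of the successor-list fallback follows, under the pessimistic assumption that every finger is unusable and a query can only step through successor lists. The node IDs, the value r = 3, and the function names are hypothetical.

```python
SIZE = 64                                        # assumed ID space size
NODES = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]   # hypothetical node IDs, sorted
R = 3                                            # list length r; must be < len(NODES)

def successor_list(n):
    """The R nodes immediately after n in ID space."""
    i = NODES.index(n)
    return [NODES[(i + j) % len(NODES)] for j in range(1, R + 1)]

def lookup_via_successors(start, key, alive):
    """Walk the ring using successor lists only, skipping failed nodes.
    Returns (live node responsible for key, messages), or (None, messages)
    if all R successors of some node have failed."""
    n, hops = start, 0
    while key != n:
        nxt = next((s for s in successor_list(n) if s in alive), None)
        if nxt is None:
            return None, hops              # R consecutive failures: stuck
        hops += 1
        if (key - n) % SIZE <= (nxt - n) % SIZE:
            return nxt, hops               # key lies in (n, nxt]
        n = nxt
    return n, hops

two_down = set(NODES) - {14, 21}
three_down = set(NODES) - {14, 21, 32}
ok = lookup_via_successors(8, 20, two_down)
stuck = lookup_via_successors(8, 20, three_down)
```

With nodes 14 and 21 down, node 8 still reaches node 32, which is now the first live node after key 20. Only when all r entries of a successor list fail at once does the lookup stall, which is why a larger r tolerates more simultaneous failures.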

37
Lookup in Chord (cont.)
Finger tables and key locations for a net with
nodes 0, 1, and 3, and keys 1, 2, and 6.
successor(6) = 0, successor(2) = 3.
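
The figure's contents can be recomputed from the node and key IDs given in the caption. The sketch assumes a 3-bit identifier space (IDs 0 to 7), which the slide does not state explicitly, and finger i of node n pointing at successor(n + 2^i).

```python
M = 3                 # assumed: 3-bit ID space, so IDs run from 0 to 7
SIZE = 2 ** M
nodes = [0, 1, 3]
keys = [1, 2, 6]

def successor(ident):
    """First node whose ID equals or follows ident, wrapping past 7 to 0."""
    ident %= SIZE
    return next((n for n in nodes if n >= ident), nodes[0])

# For each node: a list of (finger start, node that finger points to).
finger_tables = {
    n: [((n + 2 ** i) % SIZE, successor(n + 2 ** i)) for i in range(M)]
    for n in nodes
}

# Each key is stored at its successor.
key_locations = {k: successor(k) for k in keys}
```

Under these assumptions key 1 is stored at node 1, key 2 at node 3, and key 6 wraps around to node 0, which agrees with the successor(6) and successor(2) labels in the figure.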
38
Node Joins Chord
  • When a node n joins the network:
  • Initialize the predecessor and fingers of node n.
  • Update the fingers and predecessors of existing
    nodes to reflect the addition of n.
  • Copy to n all keys for which node n has become
    the successor.

39
Example for Addition
Finger tables and key locations after node 6
joins.
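
The effect of the join can be sketched by recomputing the tables for the enlarged ring. As a simplification, fingers are rebuilt from a global node list instead of being updated node by node as in the real protocol. The 3-bit ID space, the initial key placement (keys 1, 2, 6 at nodes 1, 3, 0), and the `join` helper are assumptions of this example.

```python
M = 3                 # assumed 3-bit ID space
SIZE = 2 ** M

def successor(nodes, ident):
    """First node whose ID equals or follows ident, wrapping round the ring."""
    ident %= SIZE
    return next((n for n in sorted(nodes) if n >= ident), min(nodes))

def finger_table(nodes, n):
    return [successor(nodes, n + 2 ** i) for i in range(M)]

def join(nodes, stored, new):
    """Add node `new` and move to it the keys it is now the successor of."""
    old_owner = successor(nodes, new)      # held those keys until now
    nodes = sorted(nodes + [new])
    moved = {k for k in stored[old_owner] if successor(nodes, k) == new}
    stored = {n: set(ks) for n, ks in stored.items()}   # copy, do not mutate
    stored[old_owner] -= moved
    stored[new] = moved
    return nodes, stored

nodes = [0, 1, 3]
stored = {0: {6}, 1: {1}, 3: {2}}          # key locations before the join
nodes, stored = join(nodes, stored, 6)
fingers = {n: finger_table(nodes, n) for n in nodes}
```

Only key 6 moves, from node 0 to the new node 6. Fingers at the existing nodes that used to wrap around to node 0 now point at node 6 wherever it is the first node at or after the finger's start.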
40
Ad-hoc networks and Peer to Peer
  • Wireless ad-hoc networks have many similarities
    to peer-to-peer systems.
  • No a priori knowledge.
  • No given infrastructure.

41
References
  • www.oreilly.com
  • P2P Journal
  • P2P tutorial, Don Towsley