Early Measurements of a Cluster-based Architecture for P2P Systems

1
Early Measurements of a Cluster-based
Architecture for P2P Systems
  • Yinglian Xie
  • Carnegie Mellon University
  • Balachander Krishnamurthy, Jia Wang
  • AT&T Labs-Research

2
Motivation
  • Peer-to-peer (P2P) applications provide a new
    content service model
  • End hosts self-organize into an overlay network
    and share content with each other
  • For wide deployment of P2P applications:
      • We need a scalable content location and routing
        scheme in the application layer
      • We need to study and understand P2P traffic
        patterns

3
Recent Work
  • Existing approaches for content location:
      • Napster uses a centralized server
      • Gnutella relies on flooding of queries
  • Recent designs:
      • Distributed indexing schemes based on hash
        functions
      • CAN, Chord, Pastry, Tapestry
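
To make "distributed indexing based on hash functions" concrete, here is a minimal sketch of a hash ring in which each key is owned by the first node clockwise from the key's identifier. It only illustrates the general idea; it is not the routing protocol of CAN, Chord, Pastry, or Tapestry, and the names `ring_id` and `HashRing` are hypothetical.

```python
import hashlib
from bisect import bisect_left


def ring_id(key: str, bits: int = 32) -> int:
    """Map a string to a point on a 2**bits identifier ring."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** bits)


class HashRing:
    """Toy index: a key belongs to the first node at or after its id."""

    def __init__(self, node_names):
        # Sort nodes by their position on the ring.
        self._ring = sorted((ring_id(name), name) for name in node_names)
        self._ids = [node_id for node_id, _ in self._ring]

    def owner(self, key: str) -> str:
        # Wrap around to the first node when the key hashes past the last one.
        idx = bisect_left(self._ids, ring_id(key)) % len(self._ring)
        return self._ring[idx][1]


ring = HashRing(["node-a", "node-b", "node-c"])
print(ring.owner("song.mp3"))
```

Every peer that knows the node list computes the same owner for a key, so a lookup needs no central server and no flooding.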

4
Our Work
  • A cluster-based architecture for P2P systems (CAP)
      • Example application: distributed search (supports
        keyword searching)
      • Design uses network-aware clustering
  • Early measurements of CAP
      • Trace analysis and simulations

5
CAP System Design
  • Network-aware clustering
      • B. Krishnamurthy and J. Wang. On Network-Aware
        Clustering of Web Clients. In Proceedings of ACM
        SIGCOMM, August 2000.
      • An effective technique for grouping clients that
        are topologically close and under a common
        administrative domain
  • Applying network-aware clustering to P2P
    applications
      • An additional level in the hierarchy
      • Less dynamism
      • More scalability
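
The grouping step can be sketched as a longest-prefix match of each client address against a list of routing prefixes. This is a simplified illustration that assumes the prefix list is already available; the function name `cluster_by_prefix` and the sample prefixes are hypothetical, and the linear scan stands in for whatever lookup structure a real implementation would use.

```python
import ipaddress
from collections import defaultdict


def cluster_by_prefix(addresses, prefixes):
    """Group addresses by the longest prefix that contains them."""
    # Longest prefixes first, so the first match is the most specific one.
    networks = sorted(
        (ipaddress.ip_network(p) for p in prefixes),
        key=lambda net: net.prefixlen,
        reverse=True,
    )
    clusters = defaultdict(list)
    for addr in addresses:
        ip = ipaddress.ip_address(addr)
        for net in networks:
            if ip in net:
                clusters[str(net)].append(addr)
                break
        else:
            # No prefix covers this address.
            clusters["unclustered"].append(addr)
    return dict(clusters)


prefixes = ["10.0.0.0/8", "10.1.0.0/16", "192.0.2.0/24"]
clients = ["10.1.2.3", "10.2.3.4", "192.0.2.7", "198.51.100.1"]
print(cluster_by_prefix(clients, prefixes))
```

Here `10.1.2.3` lands in `10.1.0.0/16` rather than `10.0.0.0/8` because the more specific prefix wins.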

6
CAP Architecture
  • Three entities:
      • Clustering server
      • Delegate
      • Client
  • Two operations:
      • Node join and node leave
      • Query lookup
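
A minimal sketch of how the three entities and two operations might fit together. The slide does not specify the division of responsibilities, so everything here is an assumption: the clustering server maps a client address to a cluster, and each cluster has one delegate that indexes the files its clients share. The class and method names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Delegate:
    """Assumed role: per-cluster index of which client shares which file."""
    cluster_id: str
    index: dict = field(default_factory=dict)  # file name -> set of clients

    def join(self, client, files):
        for name in files:
            self.index.setdefault(name, set()).add(client)

    def leave(self, client):
        # Drop the client everywhere and remove entries left empty.
        for name in list(self.index):
            self.index[name].discard(client)
            if not self.index[name]:
                del self.index[name]

    def lookup(self, name):
        return set(self.index.get(name, ()))


class ClusteringServer:
    """Assumed role: map a client address to the delegate of its cluster."""

    def __init__(self, cluster_of):
        self._cluster_of = cluster_of  # callable: address -> cluster id
        self.delegates = {}

    def delegate_for(self, client):
        cluster_id = self._cluster_of(client)
        if cluster_id not in self.delegates:
            self.delegates[cluster_id] = Delegate(cluster_id)
        return self.delegates[cluster_id]


# Hypothetical clustering rule for the demo: first three octets.
server = ClusteringServer(lambda addr: addr.rsplit(".", 1)[0])
server.delegate_for("10.0.0.1").join("10.0.0.1", ["song.mp3"])
print(server.delegate_for("10.0.0.2").lookup("song.mp3"))
```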

7
Inter-cluster Routing
  • Each query has a maximum search depth
  • Each delegate keeps a neighbor list
      • Assigned randomly when the delegate joins the
        network
      • Updated gradually based on application
        requirements
  • Depth-first search among neighbors
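
The routing rule above can be sketched as a depth-limited depth-first search over delegate neighbor lists. This is an illustration under assumed data structures (`neighbors` and `index` as plain dictionaries), not the actual CAP implementation; the function name `dfs_lookup` is hypothetical.

```python
def dfs_lookup(start, filename, neighbors, index, max_depth):
    """Return (delegate, path) for the first delegate found that indexes
    `filename`, or None if none is reached within `max_depth` hops."""
    visited = set()  # avoids revisiting delegates when neighbor lists loop

    def visit(node, depth, path):
        visited.add(node)
        path = path + [node]
        if filename in index.get(node, ()):
            return node, path
        if depth == max_depth:
            return None  # the query's maximum search depth is exhausted
        for nxt in neighbors.get(node, ()):
            if nxt not in visited:
                found = visit(nxt, depth + 1, path)
                if found:
                    return found
        return None

    return visit(start, 0, [])


neighbors = {"A": ["B", "C"], "B": ["D"], "C": [], "D": []}
index = {"C": {"paper.pdf"}, "D": {"song.mp3"}}
print(dfs_lookup("A", "song.mp3", neighbors, index, max_depth=2))
print(dfs_lookup("A", "song.mp3", neighbors, index, max_depth=1))
```

With a depth of 2 the query reaches delegate D through B; with a depth of 1 it stops at A's direct neighbors and fails.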

8
CAP Evaluation
  • Collect Gnutella traces and apply network-aware
    clustering in the trace analysis
      • To examine the potential advantage of using
        network-aware clustering
  • Trace-driven simulations
  • Measure CAP system performance in a real
    deployment (ongoing work)

9
Collecting Gnutella Trace
  • A modified open-source Gnutella client (gnut)
    passively monitors and logs all Gnutella messages

Table 1: Traces with unlimited connections

Location   Trace length   Number of IP addresses
CMU        10 hours       799,386
AT&T       14 hours       302,262
ACIRI      6 hours        185,905

Table 2: Traces with limited connections

Location   Trace length   Number of IP addresses
CMU        89 hours       301,025
AT&T       139 hours      261,094
UKY        96 hours       409,084
UKY        75 hours       292,759
WPI        10 hours       69,285

10
Cluster Distribution
  • CMU trace
      • 5/24/2001 to 5/25/2001, 799,386 IP addresses,
        45,129 clusters
  • Clustering helps reduce query latency by caching
    repeated queries

11
Client and Cluster Distribution along Time
  • Network-aware clustering helps reduce dynamism in
    the P2P network

12
Simulation
  • Trace-driven simulation
      • Use the Gnutella trace to generate join, leave,
        and search events
      • Assume the query distribution follows the file
        distribution
  • Performance metrics
      • Hit rate
      • Overhead
      • Search latency
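
One way to read "the query distribution follows the file distribution" is that a file is queried with probability proportional to how many nodes share it. The sketch below draws queries that way and computes a hit rate; it is an illustration under that assumption, not the authors' simulator, and the function names are hypothetical.

```python
import random
from collections import Counter


def sample_queries(shared_files, n_queries, seed=0):
    """Draw file names with probability proportional to replica count.

    shared_files: dict mapping node -> list of file names it shares.
    """
    counts = Counter(f for files in shared_files.values() for f in files)
    names = sorted(counts)
    weights = [counts[name] for name in names]
    rng = random.Random(seed)  # seeded so runs are repeatable
    return rng.choices(names, weights=weights, k=n_queries)


def hit_rate(queries, reachable_files):
    """Fraction of queries whose file is among the reachable files."""
    if not queries:
        return 0.0
    hits = sum(1 for q in queries if q in reachable_files)
    return hits / len(queries)


shared = {"n1": ["a", "b"], "n2": ["a"], "n3": ["a"]}
queries = sample_queries(shared, 1000)
print(hit_rate(queries, reachable_files={"a"}))
```

In this toy input, file `a` has three replicas and `b` has one, so roughly three quarters of the sampled queries ask for `a`.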

13
Hit Rate
  • Use CMU trace
  • 1,000 node stationary network
  • 311 clusters
  • 4,615 search messages
  • 3,793 unique files

14
Overhead and Search Latency
  • Overhead
      • Messages per search, forward operations per
        delegate
      • In Gnutella, overhead grows exponentially
      • In CAP, overhead grows linearly
  • Search latency
      • Application-level hop length
      • In CAP, search path length is short
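
The slide does not say which variable the overhead grows with. As a hedged illustration of the Gnutella side only, the sketch below counts messages in an idealized, loop-free flooding model where every node forwards a query to `fanout` new neighbors for `ttl` hops; both parameters and the model itself are assumptions, not measurements from the traces.

```python
def flooding_messages(fanout: int, ttl: int) -> int:
    """Messages sent when each node forwards to `fanout` new neighbors
    for `ttl` hops (geometric series: fanout + fanout**2 + ...)."""
    return sum(fanout ** hop for hop in range(1, ttl + 1))


for ttl in range(1, 5):
    print(ttl, flooding_messages(4, ttl))
```

With a fan-out of 4, the count goes 4, 20, 84, 340 as the hop limit rises from 1 to 4, which is the exponential growth the slide attributes to flooding.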

15
Summary
  • CAP shows promise for increasing the stability and
    scalability of distributed applications
  • Ongoing work: we are implementing CAP, deploying
    it on machines around the world, and measuring
    its performance