Title: Protocol and Connectivity based Overlay Level Capacity Calculation of P2P Networks
Kasim Öztoprak and Hürevren Kiliç
Computer Engineering Department, Atilim University, Ankara, TURKEY
kasim, hurevren_at_atilim.edu.tr
2CONTENT
- Introduction
- Combinatorial Capacity Metric
- Experimental Results
- Conclusions
3INTRODUCTION
- Paradigm shift in programming: from stand-alone, single bodies to many simple, primitive, interacting entities
- Designed protocols and run-time connectivity
- Run-time dynamics, efficiency
- Fault-tolerance
- Self-organization ability
- Existing P2P metrics are based on run-time measurements only: hit ratio, hit time, traffic overhead
4INTRODUCTION
- A hybrid metric: both design-time (e.g. protocol) and run-time (e.g. connectivity) considerations
- P2P system as a discrete noiseless channel
- - Connection topology
- - Labelled transitions with durations
- Together these describe a Shannon language
5INTRODUCTION
- Shannon's L-channel capacity calculation idea
- Combinatorial capacity: the maximum amount of information (in bits per second) that can be transmitted over a P2P network
- Example applications of the metric
- - Pure Gnutella 0.6 (known traffic explosion problem)
- - Time-based clustered Gnutella
- Potential correlations between the metric and the number of query hits and query-hit response time
6COMBINATORIAL CAPACITY METRIC
- Definitions and a theorem compiled from (Shannon, 1948) by (Khandekar et al., 2000)
- Definition 1: A discrete noiseless channel is a channel which allows the noiseless transmission of a sequence of symbols chosen from a finite alphabet A (called a q-letter alphabet), each symbol, say a, having a certain duration t(a) in time, possibly different for different symbols.
7COMBINATORIAL CAPACITY METRIC
8COMBINATORIAL CAPACITY METRIC
- Definition 2: A word of length k over A is a finite string of k letters from A. If $a = a_1 a_2 \dots a_k$ is such a word, its duration is defined to be $t(a) = t(a_1) + t(a_2) + \dots + t(a_k)$.
- Definition 3: A language L over A is a collection of words over A. The discrete noiseless channel associated with L (the L-channel for short) is the channel which is only allowed to transmit sequences from L without error.
- Question: What is the capacity of such an L-channel to transmit information?
- Capacity: The maximum rate (in bits per second) at which information can be transmitted over the P2P network.
9COMBINATORIAL CAPACITY METRIC
- Definition 4: A Shannon language is defined by a directed graph whose edges are labeled with letters from the alphabet A. It is the set of words that result from reading off the edge labels on paths of the graph.
- P2P setup with its connection topology and protocol vs. Shannon language.
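Definitions 2 to 4 can be made concrete with a small sketch. The two-node graph, the labels "a" and "b", and the helper name `words_up_to` below are hypothetical illustrations, not taken from the slides; the code only reads edge labels off paths and sums their durations.

```python
def words_up_to(edges, max_len):
    """Return (word, duration) for every path of 1..max_len edges.

    edges: list of (start, end, label, duration) tuples (hypothetical format).
    A word is the tuple of edge labels read along a path (Definition 4);
    its duration is the sum of the label durations (Definition 2).
    """
    out_edges = {}
    for v, w, label, dur in edges:
        out_edges.setdefault(v, []).append((w, label, dur))

    results = []

    def extend(node, word, duration):
        if word:
            results.append((tuple(word), duration))
        if len(word) == max_len:
            return
        for w, label, dur in out_edges.get(node, []):
            extend(w, word + [label], duration + dur)

    # Paths may start at any node that has an outgoing edge.
    for start in sorted({v for v, _, _, _ in edges}):
        extend(start, [], 0.0)
    return results


# Toy graph: node 1 -> node 2 labeled "a" (1 s), node 2 -> node 1 labeled "b" (2 s).
toy_edges = [(1, 2, "a", 1.0), (2, 1, "b", 2.0)]
toy_words = words_up_to(toy_edges, 3)
```

For this toy graph the words of at most three letters are a, ab, aba, b, ba and bab; for example, aba has duration 1 + 2 + 1 = 4 seconds.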
10COMBINATORIAL CAPACITY METRIC
Fig. 1: An example Gnutella-based P2P network setup describing a Shannon language.
11COMBINATORIAL CAPACITY METRIC
- Points and assumptions
- 1. $t(i_{mn}) = t(o_{mn}) = t(u_{mn}) = t(h_{mn})$, where m and n are any two different peers (for computational efficiency)
- 2. For any message type x and different peers m and n: if $x_{mn}$ is a label then $x_{nm}$ is also a label, and $t(x_{mn}) = t(x_{nm})$
- 3. No self ping, pong, query, or query-hit, i.e. $m \neq n$
- 4. The number of peers may change during system evolution.
- The graph does not describe any messaging scenario; it considers the potential of sending a message from one peer to another!
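Assumptions 1 to 3 can be sketched as a label generator. This is a minimal illustration under stated assumptions: the function name `build_labels`, the label spelling `x_m_n`, and the reading of i, o, u, h as ping, pong, query, query-hit are inferred for illustration and are not given explicitly by the slides.

```python
# Assumed reading of the four message types: ping, pong, query, query-hit.
MESSAGE_TYPES = ("i", "o", "u", "h")


def build_labels(links, duration=1.0):
    """Return {label: (sender, receiver, duration)} for undirected overlay links.

    Assumption 1: all four message types share one duration.
    Assumption 2: every label exists in both directions with equal duration.
    Assumption 3: no self messages (m != n).
    """
    labels = {}
    for m, n in links:
        if m == n:
            raise ValueError("self links are not allowed (m != n)")
        for sender, receiver in ((m, n), (n, m)):
            for x in MESSAGE_TYPES:
                labels[f"{x}_{sender}_{receiver}"] = (sender, receiver, duration)
    return labels


# Hypothetical overlay: peer 1 linked to peer 2, peer 2 linked to peer 3.
labels = build_labels([(1, 2), (2, 3)])
```

Two links give 2 x 2 x 4 = 16 labels; peers 1 and 3 are not linked here, so no label connects them.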
12COMBINATORIAL CAPACITY METRIC
- Definition 5: Let L be a Shannon language. The combinatorial capacity of the L-channel is defined as
- $C_{comb} = \limsup_{t \to \infty} \frac{1}{t} \log N(t)$
- where N(t) is the total number of words in L of duration t.
- An easy algebraic method to calculate $C_{comb}$ was developed by (Shannon) using the notion of a partition function.
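Before turning to the algebraic method, the definition can be checked by brute force on a small case. The sketch below counts N(t) by dynamic programming, assuming positive integer durations and unique edge labels (so each path is a distinct word); the three-peer fully connected example and the name `count_words` are hypothetical.

```python
import math


def count_words(edges, peers, t):
    """Count paths (words) whose total duration is exactly t.

    edges: list of (start, end, duration) with positive integer durations.
    """
    ends_at = {p: [0] * (t + 1) for p in peers}
    for p in peers:
        ends_at[p][0] = 1  # the empty path that starts and ends at p
    for d in range(1, t + 1):
        for v, w, dur in edges:
            if dur <= d:
                ends_at[w][d] += ends_at[v][d - dur]
    return sum(ends_at[p][t] for p in peers)


# Hypothetical setup: 3 fully connected peers, 4 message types per direction,
# every message lasting 1 time unit.
peers = ["A", "B", "C"]
edges = [(m, n, 1) for m in peers for n in peers if m != n for _ in "iouh"]

n1 = count_words(edges, peers, 1)
n2 = count_words(edges, peers, 2)
rate_10 = math.log2(count_words(edges, peers, 10)) / 10  # bits per time unit
```

Here N(1) = 24 and N(2) = 192, in general N(t) = 3 * 8^t, so (1/t) log2 N(t) = 3 + log2(3)/t approaches 3 bits per time unit as t grows.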
13COMBINATORIAL CAPACITY METRIC
- Definition 6: Let s be a nonnegative real number. For a given pair of vertices (v, w) describing an edge, the branch duration partition function is defined as
- $P_{v,w}(s) = \sum_{b \in B_{v,w}} s^{-t(b)}$
- where b is a member of the set $B_{v,w}$ of edges whose starting node is v and ending node is w.
- The functions $P_{v,w}(s)$ constitute the entries of the M by M matrix P(s), where M is the number of peers at the time of capacity calculation.
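As a small worked illustration (a hypothetical two-peer setup, not the network of Fig. 1), suppose peers 1 and 2 are connected and each direction carries the four labels i, o, u, h with a common duration t. Assuming the branch duration partition function takes Shannon's form, a sum of $s^{-t(b)}$ over the edges b from v to w, the sets $B_{1,2}$ and $B_{2,1}$ each hold four edges and the diagonal sets are empty:

```latex
P_{v,w}(s) = \sum_{b \in B_{v,w}} s^{-t(b)}
\quad\Longrightarrow\quad
P(s) =
\begin{pmatrix}
0 & 4\,s^{-t} \\
4\,s^{-t} & 0
\end{pmatrix}
```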
14COMBINATORIAL CAPACITY METRIC
- For Fig. 1, we obtain the following matrix: [matrix P(s), not reproduced here]
- The partition function for the language defined by the given graph G (i.e. $L_{G,\tau}$) is the spectral radius of the matrix P(s), and it is denoted by $\rho(s)$.
15COMBINATORIAL CAPACITY METRIC
- Theorem: The combinatorial capacity of the $L_{G,\tau}$ language is given by
- $C_{comb} = \ln(s_0)$
- where $s_0$ is the unique solution to the equation $\rho(s) = 1$.
- (for the original proof see (Shannon, 1948); for a simplified version see (Khandekar et al., 2000))
- An alternative way of computing $s_0$
- - Find $s_0$ as the greatest positive solution to the equation $\det(I - P(s)) = 0$
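The theorem can be sketched numerically. The code below is an illustration under stated assumptions, not the authors' Matlab implementation: it assumes the entries of P(s) are sums of s**(-t(b)) over edges b, that the overlay graph is connected, and it estimates the spectral radius by power iteration instead of solving the determinant equation. All function names and the three-peer example are hypothetical.

```python
import math


def partition_matrix(edges, peers, s):
    """P(s)[v][w] = sum of s**(-t(b)) over edges b from v to w (assumed form)."""
    index = {p: k for k, p in enumerate(peers)}
    P = [[0.0] * len(peers) for _ in peers]
    for v, w, duration in edges:
        P[index[v]][index[w]] += s ** (-duration)
    return P


def spectral_radius(P, iterations=500):
    """Power iteration on I + P; the shift by I avoids oscillation on periodic graphs."""
    n = len(P)
    x = [1.0] * n
    growth = 1.0
    for _ in range(iterations):
        y = [x[r] + sum(P[r][c] * x[c] for c in range(n)) for r in range(n)]
        growth = max(y)
        x = [value / growth for value in y]
    return growth - 1.0  # rho(I + P) = rho(P) + 1 for a nonnegative matrix


def solve_s0(edges, peers, steps=60):
    """Bisection for rho(s) = 1, using that rho(s) decreases as s grows."""
    def rho(s):
        return spectral_radius(partition_matrix(edges, peers, s))

    lo, hi = 1.0, 2.0
    if rho(lo) < 1.0:
        raise ValueError("rho(1) < 1: no solution s0 >= 1 (e.g. acyclic graph)")
    while rho(hi) >= 1.0:
        lo, hi = hi, hi * 2.0
    for _ in range(steps):
        mid = (lo + hi) / 2.0
        if rho(mid) >= 1.0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# Hypothetical setup: 3 fully connected peers, 4 message types per direction,
# every message lasting 1 second.
peers = ["A", "B", "C"]
edges = [(m, n, 1.0) for m in peers for n in peers if m != n for _ in "iouh"]

s0 = solve_s0(edges, peers)
capacity_nats = math.log(s0)   # ln(s0), as written in the theorem
capacity_bits = math.log2(s0)  # the same quantity expressed in bits per second
```

In this example every off-diagonal entry of P(s) is 4/s, so the spectral radius is 8/s and s0 = 8, giving ln(8) nats, or 3 bits, per second. Note that ln yields nats; dividing by ln(2) converts to the bits per second used elsewhere in the slides.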
16EXPERIMENTAL RESULTS
- Aim of the experiments
- - To figure out the effect of clustering on the P2P system by using the proposed capacity metric
- - To observe potential correlations (if any exist) between the proposed metric's results and known metrics (such as the number of query hits)
- Tools
- - Matlab 7.2.0 for capacity calculation
- - GnutellaSim (overlay level)
- - NS-2 (network simulator)
- - GT-ITM (backbone generation) (Nguyen & Zakhor, 2002)
17EXPERIMENTAL RESULTS
- Experimental Set-up
- Gnutella Network
- Configurations (identical initial topology setup for both)
- - Pure-Gnutella overlay network
- - Time-based clustered version
- Initially 27 peers
- Uniformly distributed random peer arrivals (Sripanidkulchai et al., 2003)
- 3-node transit-stub network topology using GT-ITM
- 1 ultra-peer and 8 leaf peers for each stub node
18EXPERIMENTAL RESULTS
- Modeled real world activities
- - Join, leave (adding or deleting a graph node)
- - Request, serve, forward or block a query (edges $u_{mn}$ and $h_{mn}$)
- - Ping, pong (edges $i_{mn}$ and $o_{mn}$)
- Delays are assumed to be equal for all message types
- High probability of not flushed but dependent queries (typical user behavior)
- Experiments
- - Snapshots: starting at the 100th second and ending at the 1000th second, once every 5 minutes
- - 3 experiments for each configuration; total number of capacity calculations: 2 x 3 x 4 = 24
- - How to pick up and merge connectivities?
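The count of 24 capacity calculations follows from the snapshot schedule. A short check, with variable names chosen here for illustration:

```python
# Snapshots from the 100th to the 1000th second, one every 5 minutes (300 s).
snapshot_times = list(range(100, 1001, 300))

configurations = 2                 # pure-Gnutella and time-based clustered
experiments_per_configuration = 3
total_calculations = (
    configurations * experiments_per_configuration * len(snapshot_times)
)
```

The schedule yields snapshots at 100, 400, 700 and 1000 seconds, so 2 x 3 x 4 = 24.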
19EXPERIMENTAL RESULTS
Table 1: Capacity calculation results for pure-Gnutella experimentation
Table 2: Capacity calculation results for time-based clustered Gnutella experimentation
20EXPERIMENTAL RESULTS
- Observations
- Similar capacity values for both configurations
- Capacity decrease trend in pure-Gnutella
- Time-based clustered version: initiated 150 queries and got 140 hits. Pure-Gnutella: initiated 86 queries and got 76 hits. Similar hit ratio!
- However, because of high channel capacity, doing more jobs (i.e. query initiation) is possible.
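The reported counts can be turned into ratios directly. This sketch assumes the hit ratio is simply hits divided by initiated queries, which the slides do not define explicitly; the variable names are illustrative.

```python
# (queries initiated, query hits) as reported in the observations above.
results = {
    "time-based clustered": (150, 140),
    "pure-Gnutella": (86, 76),
}

# Assumed definition: hit ratio = hits / queries.
hit_ratio = {name: hits / queries for name, (queries, hits) in results.items()}

# How many more queries the clustered version initiated.
query_gain = results["time-based clustered"][0] / results["pure-Gnutella"][0]
```

Under that assumption the hit ratios are about 0.93 and 0.88, while the clustered version initiated roughly 1.7 times as many queries.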
21CONCLUSIONS
- A metric for P2P networks based on Shannon's L-channel capacity calculation idea
- Able to observe and figure out the efficiency obtained through clustering
- Cannot be used directly for run-time self-organization of peers' connectivity (peer-side partial observability).
- - Solution: Capacity calculation for partial graphs? Use of the metric for efficient clustering?
- High and explosive computational time for capacity calculation.
- - Solution: Problem-specific shortcuts using graph properties?