Performance Comparison of Scheduling Algorithms for Peer-to-Peer Collaborative File Distribution


1
Performance Comparison of Scheduling Algorithms
for Peer-to-Peer Collaborative File Distribution
  • Presented by: Chan Siu Kei, Jonathan
  • Supervisors: Prof. VOK Li, Dr. KS Lui

2
Overview
  • Introduction
  • Communication Model
  • Analysis
  • Scheduling Algorithms
    - Rarest Piece First
    - Most Demanding Node First
    - Maximum-Flow Algorithms
  • Simulation Results
  • Future Work
  • Conclusion

3
Introduction
  • P2P file sharing applications are highly popular
    on the Internet, e.g. BitTorrent, Gnutella,
    Kazaa, Napster, etc.
  • More scalable (faster) than the traditional
    client/server approach (e.g. FTP)
  • Previous research has focused on topics such as
    overlay topology formation, peer discovery,
    content search, fairness and incentive issues,
    but has seldom examined the data distribution
    scheduling problem
  • We present the first effort to address this
    problem and propose a novel Maximum-Flow
    algorithm that solves it more effectively

4
Communication Model
  • Synchronous Scheduling
    - the transmission time is the same for every pair of nodes
  • Asymmetric Bandwidth
    - each node sends out at most p pieces and receives
      at most q pieces in each cycle

5
Notations and Definitions
  • $N$ = no. of peers, $M$ = no. of file pieces
  • $F = \{F_1, F_2, \dots, F_M\}$
  • $P$ = $N \times M$ possession matrix,
  • $P_{ij} = 1$ iff node $i$ possesses file piece $F_j$,
    otherwise $P_{ij} = 0$
  • $P^t$ = possession matrix at time $t$
  • $p = (p_1, p_2, \dots, p_N)$ (upload limit vector),
  • $q = (q_1, q_2, \dots, q_N)$ (download limit vector)

Example: $p = (1, 1, 2, 2, 2)$, $q = (2, 3, 2, 3, 3)$
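The notation above maps directly onto plain lists. The following is a minimal sketch, assuming a hypothetical 3-peer, 4-piece instance (it is not the matrix from the slide figure) and hypothetical helper names `validate_instance` and `possesses`.

```python
# Hypothetical instance for illustration only (not the slide's example matrix).
P = [
    [1, 1, 0, 0],  # node 1 holds pieces 1 and 2
    [0, 1, 1, 0],  # node 2 holds pieces 2 and 3
    [0, 0, 0, 1],  # node 3 holds piece 4
]
p = [1, 1, 2]  # upload limit vector
q = [2, 3, 2]  # download limit vector


def validate_instance(P, p, q):
    """Check that P is a rectangular 0/1 matrix and p, q have one entry per node."""
    n, m = len(P), len(P[0])
    assert all(len(row) == m for row in P), "P must be rectangular"
    assert all(x in (0, 1) for row in P for x in row), "P must be 0/1"
    assert len(p) == n and len(q) == n, "one limit per node"
    return n, m  # (N, M) in the slide notation


def possesses(P, i, j):
    """True iff node i possesses piece F_j (1-based indices, as on the slides)."""
    return P[i - 1][j - 1] == 1
```

Here `validate_instance(P, p, q)` returns the pair (N, M), and `possesses(P, 1, 2)` answers whether node 1 holds piece 2.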
6
Schedule (1)
  • Specifies which file pieces each peer has to send
    out and to whom
  • A possible schedule for $P^0$ with $p = (1, 1, 2, 2, 2)$,
    $q = (2, 3, 2, 3, 3)$:
    - Node 1 sends piece 3 to node 2
    - Node 2 sends piece 4 to node 1
    - Node 3 sends piece 5 to node 1 and piece 5 to node 2
    - Node 4 sends piece 6 to node 2 and piece 6 to node 3
    - Node 5 sends piece 2 to node 4 and piece 7 to node 4
  • Formally, we use an $N \times M$ matrix $S^k$ to represent the
    schedule at cycle $k$. From $S^k$, we can derive the
    $N \times M$ transmission matrix $T^k$


e.g. Node 1 receives piece 4 from Node 2 and piece 5 from Node 3
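As a sanity check, the example schedule on this slide can be tested against the upload and download limits. The sketch below assumes a schedule is encoded as a list of (sender, piece, receiver) triples and that the transmission matrix marks a 1 where a node receives a piece. Both encodings, the function names, and the choice of 7 pieces (the largest piece index in the example) are assumptions, not the slides' definitions of $S^k$ and $T^k$.

```python
def check_schedule(transfers, p, q):
    """Return True if no node exceeds its upload/download limit and no
    receiver is sent the same piece twice. Indices are 1-based."""
    sent = [0] * len(p)
    received = [0] * len(q)
    seen = set()
    for sender, piece, receiver in transfers:
        if sender == receiver or (receiver, piece) in seen:
            return False
        seen.add((receiver, piece))
        sent[sender - 1] += 1
        received[receiver - 1] += 1
    within_upload = all(s <= limit for s, limit in zip(sent, p))
    within_download = all(r <= limit for r, limit in zip(received, q))
    return within_upload and within_download


def transmission_matrix(transfers, n, m):
    """Build an n x m 0/1 matrix with a 1 where node i receives piece j."""
    T = [[0] * m for _ in range(n)]
    for _sender, piece, receiver in transfers:
        T[receiver - 1][piece - 1] = 1
    return T


# The schedule listed on the slide, as (sender, piece, receiver) triples.
slide_schedule = [
    (1, 3, 2), (2, 4, 1), (3, 5, 1), (3, 5, 2),
    (4, 6, 2), (4, 6, 3), (5, 2, 4), (5, 7, 4),
]
p = [1, 1, 2, 2, 2]
q = [2, 3, 2, 3, 3]
T = transmission_matrix(slide_schedule, n=5, m=7)
```

With these inputs `check_schedule(slide_schedule, p, q)` confirms that the example respects both limit vectors, and row 1 of `T` marks pieces 4 and 5.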
7
Schedule (2)
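  • Given $P^{k-1}$ and the schedule $S^{k-1}$ with its transmission
    matrix $T^{k-1}$, the possession matrix at the next cycle $k$ is
    $P^k = P^{k-1} + T^{k-1}$ ($k > 0$)
  • The distribution terminates after a certain number of cycles,
    say $k_0$, when every node possesses every piece, i.e.
    $P^{k_0}_{ij} = 1$ for all $i, j$
  • Our goal is to minimize $k_0$, which is the time
    needed for complete distribution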
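The cycle-to-cycle update can be sketched in a few lines. This assumes the update is the element-wise sum of the possession and transmission matrices, and that a transmission matrix only marks pieces the receiver does not yet hold; the function names are hypothetical.

```python
def apply_cycle(P, T):
    """Return the next possession matrix as the element-wise sum of P and T.

    Assumes T only marks pieces that the receiving node does not yet hold,
    so the result stays a 0/1 matrix.
    """
    for row_p, row_t in zip(P, T):
        for have, incoming in zip(row_p, row_t):
            assert not (have and incoming), "piece sent to a node that already has it"
    return [[a + b for a, b in zip(row_p, row_t)] for row_p, row_t in zip(P, T)]


def is_complete(P):
    """True once every node possesses every piece."""
    return all(x == 1 for row in P for x in row)


# Two nodes swap their only pieces in a single cycle.
P0 = [[1, 0], [0, 1]]
T0 = [[0, 1], [1, 0]]
P1 = apply_cycle(P0, T0)
```

In this toy case the distribution finishes after one cycle, so the number of cycles needed is 1.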

8
Analysis on Lower Bound (1)
  • Let $p = (p_1, p_2, \dots, p_N)$ and $q = (q_1, q_2, \dots, q_N)$ be the
    upload and download limit vectors.
  • Let $r_i$ be the total no. of 0s across row $i$, i.e.
    $r_i = \sum_{j=1}^{M} (1 - P_{ij})$; the min. value of $k_0$ is given by (1)
  • Let $c_j$ be the total no. of 1s along column $j$,
    i.e. $c_j = \sum_{i=1}^{N} P_{ij}$; we can find the minimum no. of
    1s along all columns, $c_{\min} = \min_j c_j$; the
    min. value of $k_0$ is given by (2)
  • Let $z$ be the total no. of 0s in $P$, i.e.
    $z = \sum_{i=1}^{N} \sum_{j=1}^{M} (1 - P_{ij})$; the min. value of $k_0$ is given by (3)
9
Analysis on Lower Bound (2)
  • Combining (1), (2) and (3), the lower bound on $k_0$ is
    given by (4)
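To illustrate how the row, column and total zero counts can each bound the number of cycles, the sketch below computes three simple lower bounds and takes their maximum. The specific bound formulas are assumptions chosen for illustration (a per-node download bound, a replication bound for the rarest piece, and an aggregate capacity bound); they are valid lower bounds under the synchronous model but may differ from the slides' equations (1) to (4).

```python
def ceil_div(a, b):
    """Integer ceiling of a / b for non-negative a and positive b."""
    return -(-a // b)


def lower_bounds(P, p, q):
    """Return (download, replication, capacity, combined) lower bounds on k0."""
    n, m = len(P), len(P[0])
    r = [row.count(0) for row in P]                     # 0s across each row
    c = [sum(row[j] for row in P) for j in range(m)]    # 1s along each column
    z = sum(r)                                          # total 0s in P
    if min(c) == 0 or max(p) == 0 or min(q) == 0:
        raise ValueError("needs every piece held somewhere and positive limits")

    # Node i must receive r_i pieces at no more than q_i per cycle.
    by_download = max(ceil_div(r[i], q[i]) for i in range(n))

    # Copies of the rarest piece can grow by at most a factor (1 + max p) per cycle.
    copies, by_replication = min(c), 0
    while copies < n:
        copies *= 1 + max(p)
        by_replication += 1

    # At most min(sum p, sum q) transmissions fit in one cycle.
    by_capacity = ceil_div(z, min(sum(p), sum(q)))

    return by_download, by_replication, by_capacity, max(
        by_download, by_replication, by_capacity
    )


# Hypothetical 3-peer, 4-piece instance.
P = [[1, 1, 0, 0], [0, 1, 1, 0], [0, 0, 0, 1]]
bounds = lower_bounds(P, p=[1, 1, 2], q=[2, 3, 2])
```

For this instance the download and capacity bounds both give 2 cycles, while the replication bound gives 1, so the combined bound is 2.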
10
Rarest Piece First (RPF)
  • Borrowed from the Rarest Element First algorithm
    employed in BitTorrent
  • Rarity $c_j$ of piece $j$ is the no. of peers who have
    piece $j$, i.e. $c_j = \sum_{i=1}^{N} P_{ij}$

RPF Node-Oriented ($p = (1, 1, 2, 2, 2)$, $q = (2, 3, 2, 3, 3)$)

RPF Piece-Oriented ($p = (1, 1, 2, 2, 2)$, $q = (2, 3, 2, 3, 3)$)
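A rarest-first pass can be sketched as a greedy loop over pieces in order of increasing rarity. This is one possible piece-oriented reading, not necessarily the variant shown in the slide figures: the loop order, the tie-breaking by lowest index, and the function names are all assumptions. Indices are 0-based here.

```python
def rarity(P):
    """Number of peers holding each piece (column sums of P)."""
    return [sum(row[j] for row in P) for j in range(len(P[0]))]


def rpf_cycle(P, p, q):
    """Greedily schedule one cycle, visiting pieces rarest first.

    Returns a list of (sender, piece, receiver) triples with 0-based indices.
    """
    n, m = len(P), len(P[0])
    up, down = list(p), list(q)      # remaining upload / download capacity
    c = rarity(P)
    transfers = []
    for j in sorted(range(m), key=lambda piece: c[piece]):
        holders = [i for i in range(n) if P[i][j] == 1]
        for receiver in range(n):
            if P[receiver][j] == 1 or down[receiver] == 0:
                continue
            sender = next((i for i in holders if up[i] > 0), None)
            if sender is None:
                break                # no holder of piece j can upload any more
            up[sender] -= 1
            down[receiver] -= 1
            transfers.append((sender, j, receiver))
    return transfers


# Hypothetical 3-peer, 3-piece instance.
P = [[1, 1, 0], [0, 1, 0], [0, 0, 1]]
schedule = rpf_cycle(P, p=[1, 1, 1], q=[2, 2, 2])
```

Pieces 0 and 2 are each held by a single peer, so they are scheduled before the more common piece 1.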

11
Most Demanding Node First (MDNF)
  • Demand $d_i$ of node $i$ is the no. of un-received
    pieces for node $i$, i.e. $d_i = \sum_{j=1}^{M} (1 - P_{ij})$
  • When choosing recipients, prefer sending to the
    node with the largest $d_i$

MDNF Node-Oriented ($p = (1, 1, 2, 2, 2)$, $q = (2, 3, 2, 3, 3)$); figure labels: 6, 6, 4, 4, 5

MDNF Piece-Oriented ($p = (1, 1, 2, 2, 2)$, $q = (2, 3, 2, 3, 3)$); figure labels: 6, 6, 4, 4, 5
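The recipient rule of MDNF can be sketched as follows: compute each node's demand, then pick the most demanding node among those that still miss the piece and have spare download capacity. The helper names and the tie-breaking (lowest index wins) are assumptions; indices are 0-based.

```python
def demand(P):
    """Number of un-received pieces per node (0s across each row of P)."""
    return [row.count(0) for row in P]


def pick_recipient(P, piece, down, d):
    """Return the node with the largest demand that lacks `piece` and can
    still download this cycle, or None if no node qualifies."""
    candidates = [
        i for i in range(len(P)) if P[i][piece] == 0 and down[i] > 0
    ]
    return max(candidates, key=lambda i: d[i], default=None)


# Hypothetical 3-peer, 4-piece instance.
P = [[1, 1, 0, 0], [0, 1, 1, 0], [0, 0, 0, 1]]
d = demand(P)
```

For piece 0, nodes 1 and 2 both qualify, and node 2 is preferred because it misses three pieces rather than two.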
12
Problem with RPF and MDNF
  • Neither algorithm guarantees that the max. no. of
    transmissions is achieved in each cycle

Using MDNF Piece-Oriented ($p = (2, 2, 2, 1)$, $q = (2, 1, 2, 2)$), only 6 transmissions can be
scheduled, but the max. is 7
13
Maximum-Flow (MaxFlow)
Let $G = (V, E)$ be the flow network graph
$L = \{L_1, L_2, \dots, L_N\}$
$R = \{R_1, R_2, \dots, R_N\}$
14
Maximum-Flow (MaxFlow)
  • Edmonds-Karp Algorithm
  • Finds augmenting paths using BFS
  • Guaranteed to find the maximum no. of transmissions in
    each cycle
  • Complexity: $O(|V| \cdot |E|^2)$
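A single scheduling cycle can be posed as a max-flow problem and solved with Edmonds-Karp. The sketch below is a minimal version under an assumed network layout: source to sender $L_i$ with capacity $p_i$, sender to a (receiver, piece) node with capacity 1 when the sender holds a piece the receiver lacks, that node to receiver $R_r$ with capacity 1, and receiver to sink with capacity $q_r$. The layout, node labels and function name are assumptions, not the slides' exact construction.

```python
from collections import defaultdict, deque


def max_transmissions(P, p, q):
    """Maximum number of piece transfers in one cycle (Edmonds-Karp)."""
    n, m = len(P), len(P[0])
    cap = defaultdict(lambda: defaultdict(int))   # residual capacities
    src, sink = "s", "t"
    for i in range(n):
        cap[src][("L", i)] = p[i]                 # upload limit of node i
        cap[("R", i)][sink] = q[i]                # download limit of node i
    for r in range(n):
        for j in range(m):
            if P[r][j] == 0:                      # receiver r still needs piece j
                cap[("B", r, j)][("R", r)] = 1    # and should get it only once
                for i in range(n):
                    if P[i][j] == 1:
                        cap[("L", i)][("B", r, j)] = 1

    flow = 0
    while True:
        # BFS for a shortest augmenting path in the residual graph.
        parent = {src: None}
        queue = deque([src])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, c in list(cap[u].items()):
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow
        # Find the bottleneck along the path, then push flow through it.
        bottleneck, v = float("inf"), sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            u = parent[v]
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck
            v = u
        flow += bottleneck
```

For example, `max_transmissions([[1, 0], [0, 1]], [1, 1], [1, 1])` lets the two nodes swap pieces, while an upload limit of 1 on the only holder caps the cycle at a single transfer.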

15
MaxFlow Counter Example
  • Pure MaxFlow can perform poorly, as it maximizes the
    transmissions in the current cycle without considering
    whether more can be matched in subsequent cycles

Using MaxFlow, a total of 3 cycles are needed
($p = (2, 2, 2, 2, 2)$, $q = (3, 3, 3, 3, 3)$)

Using RPF Node-Oriented, only 2 cycles are
needed ($p = (2, 2, 2, 2, 2)$, $q = (3, 3, 3, 3, 3)$)
16
MaxFlow - Weighted
  • Put weights on both sides to give priority to
    some nodes during the search
  • Weights on $L_i$: sum of the no. of 0s in
    other peers for those pieces that peer $i$ has
  • Weights on $B_{ij}$: $d_{ij}$ (sum of the no. of 0s across
    row $i$ and column $j$)
  • E.g. $d_{42} = 7$
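Both kinds of weight can be computed directly from the possession matrix. The sketch below is one reading of the verbal definitions: the sender weight sums the column zero counts over the pieces a peer holds, and $d_{ij}$ adds the zero count of row $i$ to the zero count of column $j$. Counting a zero at cell $(i, j)$ in both the row and the column is an assumption, as are the function names; indices are 0-based.

```python
def column_zeros(P):
    """Number of peers missing each piece (0s along each column of P)."""
    return [sum(1 - row[j] for row in P) for j in range(len(P[0]))]


def sender_weights(P):
    """For each peer i, total no. of 0s in other peers over the pieces i holds.

    Peer i holds these pieces itself, so every 0 in those columns belongs
    to another peer.
    """
    col0 = column_zeros(P)
    return [
        sum(col0[j] for j in range(len(row)) if row[j] == 1) for row in P
    ]


def pair_weights(P):
    """d[i][j] = (0s across row i) + (0s along column j)."""
    row0 = [row.count(0) for row in P]
    col0 = column_zeros(P)
    return [[row0[i] + col0[j] for j in range(len(col0))] for i in range(len(P))]


# Hypothetical 3-peer, 3-piece instance.
P = [[1, 1, 0], [0, 1, 0], [0, 0, 1]]
w = sender_weights(P)
d = pair_weights(P)
```

Here peer 0 has the largest sender weight because it holds two pieces that other peers are still missing.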

17
MaxFlow - Weighted Counter Example
For $p = (2, 2, 2, 2, 2)$, $q = (3, 3, 3, 3, 3)$
Using MaxFlow Weighted, a total of 3 cycles are
needed
$P^3 = \mathbf{1}$
Using MDNF Piece-Oriented, only 2 cycles are
needed
$P^2 = \mathbf{1}$
18
MaxFlow Dynamically-Weighted
  • Allows the weights to be varied dynamically
    within each scheduling cycle

E.g. weights $(15, 14, 25, 13, 15, 10, 16, 16)$ and $d_{43} = 9$, which
is the greatest value among all $d_{ij}$
19
Simulation Results (1)
Fig. 1: Performance comparison of various
scheduling algorithms (All) with varying peer
sizes (file size = 100, $p_i = 2$, $q_i = 3$, equal
probability for 1s and 0s)
20
Simulation Results (2)
Fig. 2: Performance comparison of various
scheduling algorithms (Representative) with
varying peer sizes (file size = 100, $p_i = 2$,
$q_i = 3$, equal probability for 1s and 0s)
21
Simulation Results (3)
Fig. 3: Performance comparison of various
scheduling algorithms (Representative) with
varying file sizes (peer size = 10, $p_i = 2$,
$q_i = 3$, equal probability for 1s and 0s)
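The simulation setup in the captions (equal probability for 1s and 0s) can be sketched with a seeded random generator. Forcing every piece to have at least one holder is an added assumption so that complete distribution is possible; the function name and the seed parameter are hypothetical.

```python
import random


def random_possession(n, m, seed=None):
    """Random n x m possession matrix with equal probability for 1s and 0s.

    Assumption: any piece that ends up with no holder is assigned to one
    randomly chosen peer, so every piece can eventually reach every node.
    """
    rng = random.Random(seed)
    P = [[rng.randint(0, 1) for _ in range(m)] for _ in range(n)]
    for j in range(m):
        if not any(P[i][j] for i in range(n)):
            P[rng.randrange(n)][j] = 1
    return P


# Peer size 10 and file size 100, as in Fig. 3; uniform limits p_i = 2, q_i = 3.
P = random_possession(10, 100, seed=1)
p = [2] * 10
q = [3] * 10
```

Passing the same seed reproduces the same matrix, which makes runs of different scheduling algorithms comparable on identical inputs.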
22
Future Work
  • Study the case of asynchronous scheduling, where
    the transmission time is different for different
    pairs of nodes
  • Study the case when the network is dynamic in
    nature, where peers can come and go at any
    instant and they may shift to communicate with
    different sets of peers during the distribution
    process

23
Conclusion
  • The data distribution problem in P2P networks has
    not been well studied in previous research
  • We formally define the collaborative file
    distribution problem with the possession and
    transmission matrix formulations
  • We also derive a theoretical lower bound on the
    minimum distribution time required
  • We develop several types of algorithms (RPF,
    MDNF, MaxFlow) for solving the problem
  • Our novel dynamically-weighted max-flow algorithm
    outperforms all the other algorithms in our simulations

24
Thank You!
  • Q&A