Stochastic Analysis of File Swarming Systems - PowerPoint PPT Presentation

1
Stochastic Analysis of File Swarming Systems
John C.S. Lui
  • The Chinese University of Hong Kong

Collaborators D.M. Chiu, M.H. Lin, B. Fan
2
Background
  • Traditional Client/Server Sharing
  • Performance deteriorates rapidly as the number of
    clients increases
  • IP Multicast
  • Application Multicast (e.g., CDN, ESM)
  • reliability, unused resources at leaf nodes
  • P2P (e.g., Napster, Gnutella)
  • Free riders only download without contributing
    to the network.
  • BitTorrent P2P systems
  • Good scalability
  • Built-in incentive mechanism to contribute

3
BT Components
  • On a public-domain site, obtain the torrent file,
    for example:
  • http://bt.btchina.net
  • http://bt.ydy.com/

[Figure: a web server hosting The Lord of Ring.torrent, Harry Potter.torrent, and Transformer.torrent]
4
BT Components
  • The .torrent file
  • Static metainfo file containing the necessary
    information
  • File name
  • Number of chunks, size
  • Checksum
  • IP address of the tracker, etc.
  • A BitTorrent tracker
  • Non-content-sharing node
  • Track peers
  • File
  • Split into chunks (256 KB each); each chunk has
    its own hash code in the torrent file
  • Types of peers
  • Leechers
  • Seeders

5
BT: publishing a file
[Figure: web server, tracker, seeder John, and downloaders Larry, Curly, and Moe]
6
Simple example
[Figure: seeder John holds chunks 1-10; downloaders Larry and Moe hold growing subsets of the chunks (1,2,3; 1,2,3,4; 1,2,3,5; 1,2,3,4,5)]
7
BT internal Chunk Selection mechanisms
  • Strict Priority
  • First Priority
  • Rarest First
  • General rules
  • Random First Piece
  • Special case, at the beginning
  • Endgame Mode
  • Special case
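
The Rarest First rule can be sketched in a few lines. This is a minimal illustration, not the BitTorrent client's actual code: the function name `rarest_first` and the representation of bitmaps as Python sets are assumptions made for the example.

```python
import random
from collections import Counter

def rarest_first(my_chunks, neighbor_bitmaps, rng=random):
    """Pick a missing chunk held by the fewest neighbors (ties broken at random).

    my_chunks: set of chunk indices this peer already holds.
    neighbor_bitmaps: list of sets, one per neighbor, of the chunks it holds.
    Returns a chunk index, or None if no neighbor has anything useful.
    """
    # Count how many neighbors hold each chunk we are still missing.
    availability = Counter(
        chunk
        for bitmap in neighbor_bitmaps
        for chunk in bitmap
        if chunk not in my_chunks
    )
    if not availability:
        return None
    fewest = min(availability.values())
    candidates = sorted(c for c, n in availability.items() if n == fewest)
    return rng.choice(candidates)
```

Random First Piece and Endgame Mode would replace this rule at the start and end of a download; they are not modeled here.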

8
BT internal mechanism
  • Built-in incentive mechanism (where all the magic
    happens)
  • Choking Algorithm
  • Optimistic Unchoking

9
BT internal mechanism
  • Choking is a temporary refusal to upload
  • Each peer unchokes a fixed number of peers
  • Reasons for choking
  • Avoid free riders
  • Network congestion
  • Contribute to useful peers

[Figure: peers Andy Yao and John C.S. Lui; two connections are marked "Choked"]
10
BT internal mechanism (optimistic unchoking)
  • A BitTorrent peer has a single optimistic
    unchoke, to which it uploads regardless of the
    current download rate from that peer. The
    optimistically unchoked peer rotates every 30 s
  • Reasons
  • To discover whether currently unused connections
    are better than the ones being used
  • To provide minimal service to new peers
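
A minimal sketch of one choking round, assuming the peer ranks neighbors by the download rate it observes from them. The function name `choose_unchoked` and the default of three regular slots are hypothetical; the slides only say the number of unchoked peers is fixed.

```python
import random

def choose_unchoked(download_rates, regular_slots=3, rng=random):
    """Return (regular, optimistic) for one choking round.

    download_rates: dict mapping peer name -> observed download rate from it.
    regular_slots: how many top uploaders to unchoke (hypothetical default).
    """
    # Reward the peers that currently upload to us the fastest.
    ranked = sorted(download_rates, key=download_rates.get, reverse=True)
    regular = ranked[:regular_slots]
    # Optimistic unchoke: one remaining peer, chosen regardless of its rate.
    rest = ranked[regular_slots:]
    optimistic = rng.choice(rest) if rest else None
    return regular, optimistic
```

Calling this every 30 s with a fresh random choice would give the rotation described above; everyone not returned stays choked.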

11
Example: optimistic unchoking
[Figure: Andy Yao and downloaders Moe, Melinda, Larry, Curly, and John Lui, with per-link rates ranging from 5 kb/s to 110 kb/s]
12
P2P content distribution
  • BitTorrent
  • Sending a file to a large number of peers, with
    the help of peers
  • Producing the most Internet traffic today (over
    50% of traffic, creates contention but ....)
  • What IP multicast tried to support
  • Modeling these systems yields insights

13
Why Study BitTorrent-like System?
  • BitTorrent is very efficient.
  • Which features make it perform so well?
  • Motivating questions
  • What is the effect of bandwidth constraints?
  • Is the Rarest First policy really necessary?
  • Must nodes perform seeding after file
    downloading?
  • How serious is the Last Piece Problem?
  • Is source coding useful?
  • Does the incentive mechanism affect the
    performance much?

Our aim is to develop mathematical models of
file swarming systems, allowing us to
investigate these issues via analytical means.
14
Model for the File Swarming System
  • A file has K non-overlapping chunks.
  • Peers arrive according to a Poisson process. Each
    peer is initialized with one random chunk.
  • Peers leave the system as soon as they finish
    downloading.
  • The system is slotted: downlink bandwidth is one
    chunk per time slot for all peers (download
    constraint).
  • In each time slot, each peer contacts m neighbors
    chosen uniformly at random from the system to see
    whether they are useful. If some neighbors are
    useful, it randomly chooses one and requests a
    random useful chunk.
  • If a peer receives several requests, it satisfies
    all of them (without upload constraint) or one
    chosen at random (with upload constraint).
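
One time slot of this model can be sketched as follows. This is an illustrative simulation under stated assumptions, not the authors' simulator: `run_slot` is a hypothetical name, neighbors are sampled without replacement, requests are decided from the chunk sets held at the start of the slot, and arrivals and departures are left to the caller.

```python
import random

def run_slot(peers, m, upload_constraint, rng):
    """Advance one time slot; peers is a list of sets of chunk indices.

    Returns a new list of chunk sets (arrivals/departures are up to the caller).
    """
    requests = {}  # uploader index -> list of (requester index, chunk)
    for i, mine in enumerate(peers):
        others = [j for j in range(len(peers)) if j != i]
        if not others:
            continue
        # Contact m neighbors uniformly at random (without replacement here).
        contacted = rng.sample(others, min(m, len(others)))
        useful = [j for j in contacted if peers[j] - mine]
        if not useful:
            continue
        # One request per peer per slot enforces the download constraint.
        j = rng.choice(useful)
        chunk = rng.choice(sorted(peers[j] - mine))
        requests.setdefault(j, []).append((i, chunk))

    new_peers = [set(p) for p in peers]
    for uploader, reqs in requests.items():
        # With the upload constraint, only one random request is served.
        served = [rng.choice(reqs)] if upload_constraint else reqs
        for requester, chunk in served:
            new_peers[requester].add(chunk)
    return new_peers
```

A full run would add Poisson arrivals (each new peer holding one random chunk) before each slot and drop peers that hold all K chunks after it.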

15
Model for the File Swarming System
Example: m = 2
[Figure: message exchange among peers A-E (HELLO, Bitmap, Request C5 / C5, Request C1 / C1), shown without and with the upload constraint]
The case m = 1 with no upload constraint was
studied by L. Massoulié et al. in "Coupon
replication systems".
16
Model 1: Download Constraint Only
  • Classify peers into K-1 types. Peers holding i
    chunks are named type i peers. Denote the
    number of type i peers by [symbol not preserved].
  • We are interested in the average sojourn time Ti
    for type i peers.
  • The average downloading time: [formula not preserved]
  • For a type i peer, the probability that a type j
    peer is useful: [formula not preserved]
  • For a type i peer, the probability that a
    randomly picked peer is useful: [formula not preserved]
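
One way to write the two usefulness probabilities, offered as a sketch rather than as the slides' own formulas. It assumes that a type j peer's chunks form a uniformly random j-subset of the K chunks; the symbols n_j (number of type j peers), N, p_{i,j}, and q_i are notation introduced here for illustration.

```latex
% A type j peer is useless to a type i peer only if its j chunks
% all lie inside the i chunks already held (impossible when j > i).
p_{i,j} \;=\; 1 - \frac{\binom{i}{j}}{\binom{K}{j}},
\qquad \text{with } \binom{i}{j} = 0 \text{ for } j > i .

% Averaging over the type of a uniformly picked peer:
q_i \;=\; \sum_{j=1}^{K-1} \frac{n_j}{N}\, p_{i,j},
\qquad N = \sum_{j=1}^{K-1} n_j .

% If the m contacted neighbors are treated as independent draws,
% at least one of them is useful with probability
1 - (1 - q_i)^m .
```

For example, with K = 4, a peer holding i = 3 chunks finds a peer holding j = 1 chunk useful with probability 1 - 3/4 = 1/4, which hints at why the last chunks are the hardest to find.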

17
Model 1: Download Constraint Only
  • The system state (the number of peers of each
    type) evolves as a multi-dimensional, infinite
    state-space Markov process.
  • It is hard to solve this Markov chain directly.
  • Transform the Markov chain into a density
    dependent jump Markov process.
  • Focus on its steady state and asymptotic
    behavior.
  • We derive tight bounds.

18
Model 1: Download Constraint Only
The average downloading time: [bounds not preserved]
The case m = 1 has been studied in [1], in which
the authors gave a looser bound.
[1] L. Massoulié and M. Vojnović, "Coupon replication
systems," SIGMETRICS, 2005.
19
Lower bound vs. upper bound (K = 200)
[Figure: panels for m = 1 and m = 2]
Last Piece Problem: it takes a peer longer
to download the last few chunks of the file,
since it gets increasingly difficult to find
other peers that can help.
20
Bounds vs. simulation (K = 200)
[Figure: panels for m = 1 and m = 2]
The simulation shows the accuracy of our model.
How can the last piece problem be relieved?
21
System with Source Coding
[Figure: a source with K = 4 and Q = 6 feeding peers A-E; chunk C1 is shown in transit]
22
System with Source Coding
The source encodes the original K chunks into Q
chunks. Any peer can reconstruct the original
file once it receives any K distinct chunks.
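
The "any K of Q" property can be illustrated with a toy erasure code based on polynomial interpolation over a small prime field. The slides do not say which code is used, so this is only one possible construction; the names `P`, `lagrange_eval`, `encode`, and `decode` are hypothetical, and each chunk is reduced to a single integer symbol to keep the sketch short.

```python
P = 257  # small prime field; every symbol must be an integer in [0, P)

def lagrange_eval(points, x):
    """Evaluate at x the unique polynomial through the (xi, yi) points, mod P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        # Modular inverse via Fermat's little theorem (P is prime).
        total = (total + yi * num * pow(den, P - 2, P)) % P
    return total

def encode(data, q):
    """Turn K data symbols into q coded chunks (x, y); the first K are the data."""
    points = [(i + 1, d) for i, d in enumerate(data)]
    return [(x, lagrange_eval(points, x)) for x in range(1, q + 1)]

def decode(chunks, k):
    """Rebuild the K data symbols from any k distinct coded chunks."""
    points = list(chunks)[:k]
    return [lagrange_eval(points, x) for x in range(1, k + 1)]
```

With K = 4 and Q = 6, `encode([10, 20, 30, 40], 6)` yields six chunks, and `decode` applied to any four of them returns the original four symbols, because a polynomial of degree at most K-1 is fixed by any K of its values.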
23
Source Coding vs. No Coding (K = 200)
[Figure: curves for "m = 1, no coding" and "m = 1, source coding ( )"]
Source coding eliminates the Last Piece Problem!
24
Download Constraint Only
[Figure: panels for K = 500, m = 1 and K = 200, m = 1]
25
Download Constraint Only
[Figure: panels for K = 500, m = 2 and K = 200, m = 2]
26
Model 2: Download and Upload Constraints
m = 1
[Figure: message exchange among peers A-E (HELLO, Bitmap, Request C5, Request C1 / C1)]
27
Model 2: Download and Upload Constraints
m = 1
  • Stage one: requesting
  • The same as Model 1.
  • Stage two: downloading
  • Requires the distribution of the number of
    requests a peer receives (depending on its type).
  • Only one request will be satisfied.
  • Still a density dependent jump Markov process
  • The transition rates are more complicated.

28
Model 2: Download and Upload Constraints
m = 1
1.58
29
Bounds vs. simulation (K = 200, without source
coding)
m = 1, satisfying one request
Ti is no longer close to 1, i.e., the downloading
time is far from optimal.
30
Model 3: An Incentive Mechanism
Assume peers are matched randomly at the
beginning of each time slot. A pair performs
a chunk transfer if and only if each peer is
useful to the other.
[Figure: peers A-E with labels Request C5, C5, Request C2, C2, and Request C1]
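
A minimal sketch of one slot under this pairwise rule, assuming chunk sets are Python sets and that an unmatched peer (when the peer count is odd) simply sits the slot out. The function name `exchange_slot` is hypothetical.

```python
import random

def exchange_slot(peers, rng):
    """One slot of random pairwise matching with mutual exchange.

    peers: list of sets of chunk indices. Returns a new list of sets.
    """
    order = list(range(len(peers)))
    rng.shuffle(order)  # random matching: consecutive entries form a pair
    new_peers = [set(p) for p in peers]
    for a, b in zip(order[0::2], order[1::2]):
        a_needs = peers[b] - peers[a]  # chunks b can give to a
        b_needs = peers[a] - peers[b]  # chunks a can give to b
        # Transfer only if each side has something the other lacks.
        if a_needs and b_needs:
            new_peers[a].add(rng.choice(sorted(a_needs)))
            new_peers[b].add(rng.choice(sorted(b_needs)))
    return new_peers
```

A newcomer holding a chunk that its partner already has gets nothing in that slot, which is one way to see why the first few chunks are slow to arrive under this rule.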
31
Model 3: An Incentive Mechanism
32
Bounds vs. simulation (K = 200, without source
coding)
First Piece Problem
It is not easy to download the first few chunks
when a peer enters the system, but this can be
solved in various ways.
33
Incentive Mechanism
[Figure: panels for K = 500, m = 1 and K = 200, m = 1]
34
Conclusion
  • Assuming many peers, steady state, and some
    mechanism to ensure file availability (e.g.,
    some seeders):
  • The nature of swarming makes P2P systems very
    efficient.
  • The Rarest First policy is not necessary for
    performance. If peers are cooperative, a random
    policy is good enough, though Rarest First may
    help enhance file availability.
  • Peers need not seed after downloading the file.
  • Simple strategies (everything is random) can make
    the downloading time near optimal.
  • Source coding is useful for relieving the last
    piece problem.
  • With a suitable incentive mechanism, the
    downloading time can still approach optimal.

Our mathematical models provide a basis for
designing new BT-like protocols.
35
Research Questions
  • What about fairness?
  • How can file swarming be extended to multimedia
    streaming (e.g., for Joost)?
  • What about wide area network exchange?
  • What happens if there is network congestion?
    What is the impact?
  • Network Coding?
  • Security?

36
Q & A
  • Thank You