Title: Challenges, Design and Analysis of a Large-scale P2P-VoD System
1. Challenges, Design and Analysis of a Large-scale P2P-VoD System
Yan Huang, Tom Z. J. Fu, Dah-Ming Chiu, John C. S. Lui and Cheng Huang
2. Outline
- P2P overview
- An architecture of a P2P-VoD system
- Performance metrics
- Measurement results and analysis
- Future work
3. P2P Overview
- Advantages of P2P
  - Users help each other, so the server load is significantly reduced.
  - P2P increases robustness in case of failures by replicating data over multiple peers.
- P2P services
  - P2P file downloading: BitTorrent and eMule
  - P2P live streaming: CoolStreaming and PPLive
  - P2P video-on-demand (P2P-VoD): PPLive
    - Like P2P streaming systems, P2P-VoD systems also deliver content by streaming, but peers can watch different parts of a video at the same time.
    - P2P-VoD systems require each user to contribute a small amount of storage (usually 1 GB), instead of only the in-memory playback buffer used in P2P streaming systems.
4. Ref: P2P Protocols and Applications
5. P2P-VoD system
- Major components
  - Peers
  - Servers: the source of content
  - Trackers: help peers connect to other peers to share the same content
  - A bootstrap server: helps peers find a suitable tracker and perform other bootstrapping functions
  - Other servers
    - Log servers: log significant events for data measurement
    - Transit servers: help peers behind NAT boxes
6. Segment sizes
- How to divide a video into multiple pieces
  - A small segment size gives more flexibility in scheduling which piece should be uploaded from which neighboring peer.
  - The larger the segment size, the smaller the overhead.
    - Header overhead
    - Bitmap overhead
    - Protocol overhead
  - The video player expects a certain minimum size for a piece of content to be viewable.
- Segmentation of a movie in PPLive's VoD system
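The trade-off between segment size and overhead can be made concrete with a little arithmetic. The sketch below is illustrative only: the function names, the one-bit-per-segment bitmap and the fixed per-segment header are assumptions, not details taken from PPLive.

```python
def bitmap_bytes(movie_bytes: int, segment_bytes: int) -> int:
    """Bytes needed for a bitmap holding one bit per segment (assumed layout)."""
    segments = -(-movie_bytes // segment_bytes)  # ceiling division
    return -(-segments // 8)


def header_overhead(segment_bytes: int, header_bytes: int) -> float:
    """Fraction of transmitted bytes spent on a fixed per-segment header."""
    return header_bytes / (segment_bytes + header_bytes)


# Hypothetical 400 MiB movie, compared at two segment sizes.
movie = 400 * 1024 * 1024
small = bitmap_bytes(movie, 16 * 1024)        # many small segments
large = bitmap_bytes(movie, 2 * 1024 * 1024)  # few large segments
```

With these assumed numbers, the small-segment bitmap is over a hundred times larger than the large-segment one, which is the overhead side of the flexibility trade-off listed above.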
7. Replication Strategy
- Goal
  - To make chunks as available to the user population as possible, meeting users' viewing demand without incurring excessive additional overhead
- Considerations
  - Whether to allow multiple movies to be cached
    - Multiple movie cache (MVC) / single movie cache (SVC)
  - Whether to pre-fetch or not
  - Which chunk/movie to remove when the disk cache is full
    - Least recently used (LRU) / least frequently used (LFU)
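As an illustration of the replacement question, here is a minimal sketch of LRU eviction at movie granularity. The class name, the byte-based capacity and the `touch` method are hypothetical; the slides do not describe how PPLive implements its cache.

```python
from collections import OrderedDict


class LRUMovieCache:
    """Toy disk cache that evicts the least recently used movie first."""

    def __init__(self, capacity_bytes: int):
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        self._movies = OrderedDict()  # movie_id -> size, least recent first

    def touch(self, movie_id: str, size_bytes: int) -> list:
        """Record a view of movie_id and return the movie ids evicted for it."""
        if size_bytes > self.capacity_bytes:
            raise ValueError("movie is larger than the whole cache")
        if movie_id in self._movies:
            self._movies.move_to_end(movie_id)  # now the most recently used
            return []
        evicted = []
        while self.used_bytes + size_bytes > self.capacity_bytes:
            old_id, old_size = self._movies.popitem(last=False)
            self.used_bytes -= old_size
            evicted.append(old_id)
        self._movies[movie_id] = size_bytes
        self.used_bytes += size_bytes
        return evicted
```

An LFU variant would keep a view counter per movie and evict the movie with the smallest count instead of the oldest access.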
8. Content Discovery
- Content advertising and look-up methods
  - Trackers
    - Used to keep track of which peers replicate a given movie
    - As soon as a user starts watching a movie, the peer informs its tracker that it is replicating that movie.
    - When a peer wants to start watching a movie, it goes to the tracker to find out which other peers have that movie.
  - Gossip method
    - Peers discover where chunks are located by the gossip method.
    - This cuts down on the reliance on the tracker and makes the system more robust.
  - DHT
    - Used to automatically assign movies to trackers to achieve some level of load balancing.
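The tracker's announce/look-up role can be sketched in a few lines. Everything here is hypothetical: the `Tracker` class, its method names, and `tracker_for`, which uses a plain hash-modulo mapping as a simplified stand-in for the DHT assignment mentioned above rather than a real DHT.

```python
import hashlib
from collections import defaultdict


class Tracker:
    """Toy tracker: remembers which peers replicate which movie."""

    def __init__(self):
        self._peers_by_movie = defaultdict(set)

    def announce(self, peer_id: str, movie_id: str) -> None:
        """Called by a peer when it starts watching (and so replicating) a movie."""
        self._peers_by_movie[movie_id].add(peer_id)

    def lookup(self, movie_id: str, asking_peer=None) -> list:
        """Return the other peers known to hold the movie, in sorted order."""
        peers = self._peers_by_movie.get(movie_id, set())
        return sorted(peers - {asking_peer})


def tracker_for(movie_id: str, trackers: list):
    """Pick a tracker for a movie by hashing its id (stand-in for a DHT)."""
    digest = hashlib.sha256(movie_id.encode("utf-8")).digest()
    return trackers[int.from_bytes(digest[:8], "big") % len(trackers)]
```

Because the mapping depends only on the movie id, every peer that hashes the same id reaches the same tracker, which spreads movies across trackers without central coordination.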
9. Piece Selection
- Which piece to download first
  - Sequential: select the piece that is closest to what is needed for video playback.
  - Rarest first: selecting the rarest piece helps speed up the spread of pieces, and hence indirectly helps streaming quality.
  - Anchor-based: when a user tries to jump to a particular location in the movie and the piece for that location is missing, the closest anchor point is used instead.
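The three selection policies can be sketched as small functions. The data shapes are assumptions made for illustration: pieces and anchors are integer indices, `missing` is the set of pieces the peer still lacks, and `copies` maps a piece to the number of neighbors holding it.

```python
def sequential(missing, playback_pos):
    """Pick the missing piece at or after the playback position that is nearest to it."""
    ahead = [p for p in missing if p >= playback_pos]
    return min(ahead) if ahead else None


def rarest_first(missing, copies):
    """Pick the missing piece held by the fewest neighbors (ties: lowest index)."""
    available = [p for p in missing if copies.get(p, 0) > 0]
    if not available:
        return None
    return min(available, key=lambda p: (copies[p], p))


def nearest_anchor(target, anchors):
    """Snap a seek target to the closest anchor point (ties: earlier anchor)."""
    return min(anchors, key=lambda a: (abs(a - target), a))
```

A real client would likely mix these policies, for example giving priority to sequential selection near the playback point and using rarest first for the remainder; the slides do not specify the exact mix.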
10. Transmission Strategy
- Goals
  - Maximize the downloading rate
  - Minimize the overhead
- Strategies (by level of aggressiveness)
  - A peer can send a request for the same content to multiple neighbors simultaneously.
  - A peer can request different content from multiple neighbors simultaneously (PPLive's choice).
    - For a playback rate of 500 Kbps, 8-20 neighbors is best. More neighbors can still improve the achieved rate, but at the expense of a heavy duplication rate.
  - A peer can work with one neighbor at a time.
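The middle strategy, requesting different content from several neighbors at once, can be sketched as a simple assignment problem. The function name, the input shapes and the least-loaded tie-breaking rule are assumptions for illustration, not PPLive's actual scheduler.

```python
def assign_requests(pieces, neighbor_has):
    """Map each wanted piece to one neighbor that holds it, balancing load.

    pieces: iterable of wanted piece indices.
    neighbor_has: dict of neighbor id -> set of piece indices it holds.
    Pieces that no neighbor holds are left out of the plan.
    """
    load = {n: 0 for n in neighbor_has}
    plan = {}
    for piece in pieces:
        holders = [n for n, have in neighbor_has.items() if piece in have]
        if not holders:
            continue
        chosen = min(holders, key=lambda n: (load[n], n))  # least loaded first
        plan[piece] = chosen
        load[chosen] += 1
    return plan
```

Since each piece is requested from exactly one neighbor, this plan produces no duplicate transfers, at the cost of stalling if the chosen neighbor is slow; the most aggressive strategy above makes the opposite trade.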
11. Other Design Issues
- NAT and firewalls
  - Discovering different types of NAT boxes
  - Pacing the upload rate and request rate
- Content authentication
  - Chunk level authentication
    - Some pieces may be polluted and cause a poor viewing experience locally at a peer.
    - If a peer detects that a chunk is bad, it discards the chunk.
  - Piece level authentication
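Chunk level authentication can be sketched as a digest check before a chunk is stored. This assumes the peer obtains a trusted digest for each chunk from somewhere (for example the content server); the use of SHA-256 and the function names are illustrative choices, not details from the slides.

```python
import hashlib


def chunk_digest(data: bytes) -> str:
    """Hex digest of a chunk (SHA-256 chosen here only as an example)."""
    return hashlib.sha256(data).hexdigest()


def accept_chunk(chunk_id, data: bytes, expected_digest: str, store: dict) -> bool:
    """Store the chunk only if it matches the trusted digest; otherwise discard it."""
    if chunk_digest(data) != expected_digest:
        return False  # polluted chunk: drop it so it is never played or re-shared
    store[chunk_id] = data
    return True
```

Piece level authentication would apply the same check to smaller units, which detects pollution earlier but requires more digests to be distributed.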
12. What to measure
- User behavior
  - Includes user arrival patterns and how long users stay watching a movie
  - Used to improve the design of the replication strategy
- External performance metrics
  - Include user satisfaction and server load
  - Used to measure the system performance perceived externally
- Health of replication
  - Measures how well a P2P-VoD system is replicating content
  - Used to infer how well an important component of the system is doing
13. User Behavior
- MVR (movie viewing record)
14. User Satisfaction
- Simple fluency
  - Measures the fraction of time a user spends watching a movie out of the total time he spends waiting for and watching that movie
  - R(m, i): the set of all MVRs for a given movie m and user i
  - n(m, i): the number of MVRs in R(m, i)
  - r: one of the MVRs in R(m, i)
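The verbal definition of simple fluency (time spent watching, divided by time spent waiting plus watching) can be sketched directly. The `ViewRecord` type and its two fields are hypothetical simplifications; they are not the actual MVR fields, which this transcript does not list.

```python
from dataclasses import dataclass


@dataclass
class ViewRecord:
    """Hypothetical stand-in for one MVR of a given user and movie."""
    watch_seconds: float  # time the video was actually playing
    wait_seconds: float   # time spent buffering or waiting


def simple_fluency(records):
    """Watching time over total (waiting + watching) time across all records."""
    watching = sum(r.watch_seconds for r in records)
    total = watching + sum(r.wait_seconds for r in records)
    return watching / total if total > 0 else None
```

A value close to 1 means the user almost never waited; a value close to 0 means most of the session was spent waiting.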
15. User Satisfaction (cont.)
- User satisfaction index
  - Considers the quality of the delivery of the content
  - r(Q): a grade for the average viewing quality for an MVR r
16. Health of Replication
- Three levels
  - Movie level
    - The number of active peers who have advertised storing chunks of that movie
    - The information that the tracker collects about movies
  - Weighted movie level
    - Considers the fraction of chunks a peer has in computing the index
  - Chunk bitmap level
    - The number of copies of each chunk of a movie stored by peers
    - Various other statistics can be computed: the average number of copies of a chunk in a movie, the minimum number of chunks, and the variance of the number of chunks.
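The three levels can be sketched from per-peer chunk bitmaps. The input shape (peer id mapped to a list of 0/1 flags, one per chunk) is an assumption, and so is the reading of the weighted movie level as the sum of each peer's held fraction; the slides do not give the exact formula.

```python
from statistics import mean, pvariance


def health_indices(bitmaps):
    """bitmaps: dict of peer id -> list of 0/1 flags, one flag per chunk."""
    if not bitmaps:
        raise ValueError("need at least one peer bitmap")
    holders = [b for b in bitmaps.values() if any(b)]
    # Chunk bitmap level: copies of each chunk, summed column by column.
    copies = [sum(column) for column in zip(*bitmaps.values())]
    return {
        "movie_level": len(holders),                             # peers storing any chunk
        "weighted_movie_level": sum(sum(b) / len(b) for b in holders),
        "copies_per_chunk": copies,
        "mean_copies": mean(copies),
        "min_copies": min(copies),
        "variance_copies": pvariance(copies),
    }
```

The chunk-level statistics reveal problems the movie-level count hides: a movie held by many peers can still have a chunk with very few copies.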
17. Statistics on video objects
- Overall statistics of the three typical movies
18. Statistics on user behavior (1)
- Interarrival time distribution of viewers
19. Statistics on user behavior (2)
- View duration distribution
20. Statistics on user behavior (3)
- Start position distribution
21. Health index of Movies (1)
- Number of peers that own the movie
22. Health index of Movies (2)
- Average owning ratios for different chunks
23. Health index of Movies (3)
- Chunk availability and chunk demand
24. Health index of Movies (4)
- The available to demand ratios
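An available to demand ratio can be sketched per chunk as the number of peers holding the chunk divided by the number of peers currently requesting it. This reading of the ratio, the function name and the input shapes are assumptions; the slide shows only the plot title.

```python
def available_to_demand(availability, demand):
    """Per-chunk ratio of holders to requesters.

    availability: dict of chunk index -> number of peers holding the chunk.
    demand: dict of chunk index -> number of peers currently requesting it.
    Chunks with zero demand are skipped, since the ratio is undefined for them.
    """
    ratios = {}
    for chunk, wanted in demand.items():
        if wanted > 0:
            ratios[chunk] = availability.get(chunk, 0) / wanted
    return ratios
```

Under this reading, a ratio below 1 flags a chunk whose demand exceeds the number of peers able to serve it.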
25. User Satisfaction Index (1)
- Generating the fluency index
  - The computation of F(m, i) is carried out by the client software.
  - The client software reports all MVRs and the fluency F(m, i) to the log server whenever a stop-watching event occurs:
    - The STOP button is pressed
    - Another movie/programme is selected
    - The user turns off the P2P-VoD software
26. User Satisfaction Index (2)
- The number of fluency records
  - A good indicator of the number of viewers of the movie
27. User Satisfaction Index (3)
- The distribution of fluency index
28. Future work
- Further research in P2P-VoD systems
  - How to design a highly scalable P2P-VoD system to support millions of simultaneous users
  - How to perform dynamic movie replication, replacement and scheduling so as to reduce the workload at the content servers
  - How to quantify various replication strategies so as to guarantee a high health index
  - How to select proper chunk and piece transmission strategies so as to improve the viewing quality
  - How to accurately measure and quantify the user satisfaction level