Title: An Empirical Study of Flash Crowd Dynamics in a P2P-based Live Video Streaming System
1. An Empirical Study of Flash Crowd Dynamics in a P2P-based Live Video Streaming System
- Bo Li, Gabriel Y. Keung, Susu Xie, Fangming Liu, Ye Sun, and Hao Yin
- Email: lfxad@cse.ust.hk
- Hong Kong University of Science and Technology
- Dec 2, 2008 @ IEEE GLOBECOM, New Orleans
2. Overview: Internet Video Streaming
- Enables video distribution from any place to anywhere in the world, in any format
3. Overview (cont.)
- Recently, significant deployments adopting Peer-to-Peer (P2P) technology for Internet live video streaming
  - Protocol designs: Overcast, CoopNet, SplitStream, Bullet, etc.
  - Real deployments: ESM, CoolStreaming, PPLive, etc.
- Key advantages
  - Requires minimum support from the infrastructure: easy to deploy
  - Greater demands also generate more resources: each peer not only downloads the video content but also uploads it to other participants, giving good scalability
4. Challenges
- Real-time constraints: requiring timely and sustained streaming delivery to all participating peers
- Performance-demanding: involving bandwidth requirements of hundreds of kilobits per second, and even more for higher-quality video
- Large-scale and extreme peer dynamics: tens of thousands of users simultaneously participating in the streaming, with high peer dynamics (join and leave at will), especially under flash crowd
5. Motivation
- Challenge: large-scale, extreme peer dynamics
- Current P2P live streaming systems still potentially suffer from
  - long startup delays and unstable streaming quality
  - especially under realistic, challenging scenarios such as flash crowd
- Flash crowd
  - A large increase in the number of users joining the streaming in a short period of time (e.g., during the initial few minutes of a live broadcast program)
  - Difficult to quickly accommodate new peers within a stringent time constraint without significantly impacting the video streaming quality of existing and newly arrived peers
  - Different from flash crowds in file sharing
6. Focus
- Little prior study on the detailed dynamics of P2P live streaming systems during flash crowd and its impacts
  - E.g., Hei et al.'s measurement of PPLive covered the dynamics of the user population during the annual Spring Festival Gala on Chinese New Year
- How to capture the various effects of flash crowd in P2P live streaming systems?
- What are the impacts of flash crowd on user experience, user behavior, and system scale?
- What are the rationales behind them?
7. Outline
- System Architecture
- Measurement Methodology
- Important Results
  - Short Sessions under Flash Crowd
  - User Retry Behavior under Flash Crowd
  - System Scalability under Flash Crowd
- Summary
8. Some Facts about the CoolStreaming System
- CoolStreaming: Cooperative Overlay Streaming
- First released in 2004
- Roxbeam Inc. received a USD 30M investment; currently runs through Yahoo! BB, the largest video streaming portal in Japan
- Downloads: 2,000,000
- Average online users: 20,000
- Peak-time online users: 150,000
- Google entries (keyword "Coolstreaming"): 400,000
9. CoolStreaming System Architecture
- Membership manager
  - Maintains a partial view of the overlay via gossip
- Partnership manager
  - Establishes and maintains TCP connections (partnerships) with other nodes
  - Exchanges data availability via Buffer Maps (BM); a sketch follows below
- Stream manager
  - Provides stream data to the local player
  - Decides where and how to retrieve stream data
  - Hybrid push and pull
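To make the BM exchange concrete, here is a minimal Python sketch of encoding and comparing buffer maps. The sliding-window encoding, window size, and field names are assumptions for illustration, not the actual CoolStreaming message format.

# Minimal sketch of a Buffer Map (BM) exchange between partners.
# Field names and encoding are illustrative assumptions.

def make_buffer_map(buffered_seqs, window_start, window_size=120):
    """Encode block availability in [window_start, window_start+window_size)
    as a bit vector, one bit per block sequence number."""
    bits = [0] * window_size
    for seq in buffered_seqs:
        if window_start <= seq < window_start + window_size:
            bits[seq - window_start] = 1
    return {"window_start": window_start, "bits": bits}

def missing_blocks(my_seqs, partner_bm):
    """Blocks the partner advertises that we do not yet have."""
    start = partner_bm["window_start"]
    return [start + i
            for i, bit in enumerate(partner_bm["bits"])
            if bit and (start + i) not in my_seqs]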
10. Mesh-based (Data-driven) Approaches
- No explicit structures are constructed and maintained
  - e.g., Coolstreaming, PPLive
- Data flow is guided by the availability of data
  - The video stream is divided into segments of uniform length; the availability of segments in a peer's buffer is represented by a buffer map (BM)
  - Peers periodically exchange data availability information with a set of partners (a partial view of the overlay) and retrieve currently unavailable data from each other
  - A segment scheduling algorithm determines which segments are to be fetched from which partners (a sketch follows below)
- Overhead and delay: peers need to explore content availability with one another, which is usually achieved with the use of a gossip protocol
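The following Python sketch illustrates one plausible scheduling rule in the data-driven spirit described above: request the rarest missing segments first, subject to a per-partner request budget. The rarest-first preference and the budget model are illustrative assumptions, not the exact CoolStreaming algorithm.

# Sketch of a data-driven segment scheduler: decide which missing
# segments to fetch from which partners this round.
def schedule(missing, partner_bms, partner_budget):
    """missing: iterable of segment ids; partner_bms: {partner: set(ids)};
    partner_budget: {partner: max segments to request this round}."""
    plan = {}
    def supplier_count(seg):
        return sum(1 for bm in partner_bms.values() if seg in bm)
    for seg in sorted(missing, key=supplier_count):  # rarest first
        suppliers = [p for p, bm in partner_bms.items()
                     if seg in bm and partner_budget[p] > 0]
        if suppliers:
            # Among eligible suppliers, pick the one with most budget left.
            p = max(suppliers, key=lambda q: partner_budget[q])
            plan.setdefault(p, []).append(seg)
            partner_budget[p] -= 1
    return plan

print(schedule([5, 6, 7], {"A": {5, 6}, "B": {6, 7}}, {"A": 1, "B": 2}))
# -> {'A': [5], 'B': [7, 6]}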
11. Measurement Methodology
- Each user periodically reports its activities and internal status to the log server
  - Reports use HTTP, with the peer log compacted into the parameter part of the URL string (a sketch follows below)
- 3 types of status report
  - QoS report: % of video data missing the playback deadline
  - Traffic report
  - Partner report
- 4 events in each session
  - Join event
  - Start subscription event
  - Media player ready event: the peer has received sufficient data to start playing
  - Leave event
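As an illustration of compacting a status report into URL parameters over HTTP, here is a Python sketch. The endpoint and field names are hypothetical, not the actual CoolStreaming log format.

# Sketch of a QoS report compacted into the parameter part of a URL.
# Endpoint and field names are hypothetical.
from urllib.parse import urlencode

def qos_report_url(peer_id, session_id, pct_missed_deadline):
    params = {
        "type": "qos",
        "peer": peer_id,
        "session": session_id,
        "miss": f"{pct_missed_deadline:.2f}",  # % of data missing playback deadline
    }
    return "http://logserver.example.com/report?" + urlencode(params)

print(qos_report_url("p123", "s456", 3.75))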
12. Log Data Collection
- Real-world traces obtained from a live event broadcast on Yahoo! Japan using the CoolStreaming system
- A sports channel on Sept. 27, 2006 (24 hours)
- Live baseball game broadcast at 18:00
- Stream bit-rate: 768 Kbps
- 24 dedicated servers with 100 Mbps connections
13. How to capture flash crowd effects?
- Two key measures
- Short session distribution
  - Counts sessions that either fail to start viewing a program or whose service is disrupted during the flash crowd
  - Session duration is the time interval between a user joining and leaving the system
- User retry behavior
  - To cope with the service disruptions often observed during flash crowd, each peer can re-connect (retry) to the program
14. Short Sessions under Flash Crowd
- Filter out normal sessions (i.e., users who successfully join the program)
- Focus on short sessions with durations < 120 sec and < 240 sec (a counting sketch follows below)
- The number of short sessions increases significantly at around 18:00, when the flash crowd occurs with a large number of peers joining the live broadcast program
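A Python sketch of the two measures from slide 13 applied to the trace: compute session durations from join/leave timestamps and count short sessions per time bin. The log record layout is assumed for illustration.

# Count short sessions (duration below a cutoff) per time bin.
from collections import Counter

def short_session_counts(sessions, cutoff=120, bin_sec=60):
    """sessions: iterable of (join_ts, leave_ts) in seconds.
    Returns {bin_start: number of sessions shorter than cutoff}."""
    counts = Counter()
    for join_ts, leave_ts in sessions:
        if leave_ts - join_ts < cutoff:          # short session
            counts[(join_ts // bin_sec) * bin_sec] += 1
    return counts

# Example: two short sessions around the 18:00 flash crowd, one normal.
print(short_session_counts([(64800, 64830), (64810, 64900), (64820, 70000)]))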
15. Strong Correlation between the Number of Short Sessions and Peer Joining Rate
16. What are the rationales behind these observations?
- Relevant factors
  - User client connection faults
  - Insufficient uploading capacity from at least one of the parents
  - Poor sustainable bandwidth at the beginning of the stream subscription
  - Long waiting time (timeout) to accumulate sufficient video content in the playback buffer
- Newly arriving peers do not have adequate content to share with others; initially they can only consume the uploading capacity of existing peers
- With only partial knowledge (gossip), the delay in gathering enough upload bandwidth resources among peers, together with the heavy resource competition, could be the fundamental bottleneck
17. Approximating User Impatient Time
- In the face of poor playback continuity, users either reconnect or opt to leave
- Compare the total downloaded bytes of a session with the expected total playback video bytes implied by the session duration
- Extract sessions with insufficient downloaded bytes (a sketch follows below)
- The average user impatient time is between 60 s and 120 s
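A Python sketch of this approximation: flag sessions whose total download falls well short of what smooth playback at the stream bit-rate (768 Kbps in this trace) would require, then examine their durations. The 0.5 shortfall ratio is an assumption.

# Extract durations of sessions with insufficient download.
BITRATE_BPS = 768_000  # stream bit-rate of the measured channel

def impatient_durations(sessions, shortfall=0.5):
    """sessions: iterable of (duration_sec, downloaded_bytes).
    Returns durations of sessions with insufficient download."""
    out = []
    for dur, dl_bytes in sessions:
        expected = dur * BITRATE_BPS / 8  # bytes needed for smooth playback
        if dl_bytes < shortfall * expected:
            out.append(dur)
    return out

print(impatient_durations([(90, 2_000_000), (90, 8_000_000)]))  # -> [90]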
18. User Retry Behavior under Flash Crowd
- Retry rate: counts the number of peers that opt to re-join the overlay with the same IP address and port, per unit time (a sketch follows below)
- User perspective: playback could be restored
- System perspective: retries amplify the join rate
- Users could have tried many times before successfully starting a video session
- Again shows that flash crowd has a significant impact on the initial joining phase
19. System Scalability under Flash Crowd
- Media player ready: the peer has received sufficient data to start playing, i.e., has successfully joined
20. Media Player Ready Time under Different Time Periods
- Considerably longer during the periods when the peer join rate is higher
21. Scale-Time Relationship
- System perspective
  - Though newly arriving peers could bring enough aggregate resources, these resources cannot be utilized immediately
  - It takes time for the system to exploit such resources
  - i.e., newly arriving peers (with a partial view of the overlay) need to find and consume existing resources to obtain adequate content for startup before they can contribute to others
- User perspective
  - Causes long startup delays and disrupted streaming (hence short sessions, retries, impatience)
- Future work: how does system scale relate to the amount of initial buffering?
  - Long initial buffering → long startup delay
  - Short initial buffering → poor continuity
22. Summary
- Based on real-world measurements, we capture flash crowd effects
- The system can scale up only to a limit during the flash crowd
- There is a strong correlation between the number of short sessions and the joining rate
- User behavior during flash crowd is best captured by the number of short sessions, the retries, and the impatient time
- We discussed the relevant rationales behind these findings
23. Future Work
- Modeling to quantify and analyze flash crowd effects
  - What is the correlation among initial system capacity, the user joining rate/startup delay, and system scale?
  - Intuitively, a larger initial system size can tolerate a higher joining rate
  - Challenge: how to formulate the factors and performance gaps relevant to partial knowledge (gossip)?
24. Future Work (cont.)
- Based on the above study, and perhaps more importantly for practical systems: how can servers help alleviate the flash crowd problem, i.e., shorten users' startup delays and boost system scaling?
- Commercial systems have utilized self-deployed servers or CDNs
  - Coolstreaming on Yahoo! Japan used 24 servers in different regions, which allowed users to join a program in the order of seconds
  - PPLive is utilizing CDN services
- On the measurement side, examine what real-world systems do and experience
- On the technical side, derive the relationship between ...
25. References
- B. Li, S. Xie, Y. Qu, G. Y. Keung, C. Lin, J. Liu, and X. Zhang, "Inside the New Coolstreaming: Principles, Measurements and Performance Implications," in Proc. of IEEE INFOCOM, Apr. 2008.
- S. Xie, B. Li, G. Y. Keung, and X. Zhang, "Coolstreaming: Design, Theory and Practice," IEEE Transactions on Multimedia, 9(8):1661-1671, Dec. 2007.
- B. Li, S. Xie, G. Y. Keung, J. Liu, I. Stoica, H. Zhang, and X. Zhang, "An Empirical Study of the Coolstreaming System," IEEE Journal on Selected Areas in Communications, 25(9):1-13, Dec. 2007.
26. Q&A
28. Comparison with the First Release
- The initial system adopted a simple pull-based scheme
  - Content availability information exchanged using buffer maps
  - Per-block overhead
  - Longer delay in retrieving the video content
- The new system implements a hybrid pull and push mechanism
  - Blocks are pushed by a parent node to a child node, except for the first block
  - Lower overhead associated with each video block transmission
  - Reduces the initial delay and increases the video playback quality
- A multiple sub-stream scheme is implemented
  - Enables multi-source and multi-path delivery of video streams
- The gossip protocol was enhanced to handle the push function
- Buffer management and scheduling schemes were re-designed to handle the dissemination of multiple sub-streams
29. Gossip-based Dissemination
- Gossip protocol, as used in BitTorrent
- Iteration
  - Nodes send messages to random sets of nodes
  - Each node does similarly in every round
  - Messages gradually flood the whole overlay (a sketch follows below)
- Pros
  - Simple, robust to random failures, decentralized
- Cons
  - Latency trade-off
- Relation to Coolstreaming
  - Updated membership content
  - Multiple sub-streams
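A toy Python sketch of the flooding behavior described above: each round, every informed node forwards the message to a few random peers, and the message gradually covers the overlay. The population size, fanout, and push-only forwarding are illustrative choices, not CoolStreaming parameters.

# Simulate rounds of push-style gossip until everyone is informed.
import random

def gossip_rounds(n_nodes=1000, fanout=3, seed=1):
    rng = random.Random(seed)
    informed = {0}           # node 0 originates the message
    rounds = 0
    while len(informed) < n_nodes:
        for node in list(informed):
            # each informed node pushes to `fanout` random nodes
            informed.update(rng.randrange(n_nodes) for _ in range(fanout))
        rounds += 1
    return rounds

print(gossip_rounds())  # typically O(log N) rounds to reach all nodes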
30. Multiple Sub-streams
- The video stream is divided into blocks
- Each block is assigned a sequence number
- An example of the stream decomposition is sketched below
- Adopts the gossip concept from P2P file-sharing applications
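A small Python sketch of a round-robin stream decomposition consistent with the description above: block i (by sequence number) belongs to sub-stream i mod K, so a peer can pull each sub-stream from a different parent. K = 4 is an illustrative choice.

# Round-robin decomposition of a block sequence into K sub-streams.
K = 4

def substream_of(seq):
    return seq % K

blocks = range(12)
for s in range(K):
    print(f"sub-stream {s}:", [b for b in blocks if substream_of(b) == s])
# sub-stream 0: [0, 4, 8], sub-stream 1: [1, 5, 9], ...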
31. Buffering
- Synchronization buffer
  - A received block is first put into the synchronization buffer of the corresponding sub-stream
  - Blocks with continuous sequence numbers are then combined
- Cache buffer
  - Combined blocks are stored in the cache buffer (a sketch follows below)
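A Python sketch of this two-level buffering: blocks land in a per-sub-stream synchronization buffer, and runs of consecutive sequence numbers are merged into the shared cache buffer. The data structures are illustrative, not the actual implementation.

# Two-level buffering: per-sub-stream sync buffers feed a cache buffer.
class Buffers:
    def __init__(self, k):
        self.k = k
        self.sync = [dict() for _ in range(k)]   # per-sub-stream: seq -> block
        self.cache = {}                          # combined blocks for playback
        self.next_seq = 0                        # next sequence to combine

    def on_block(self, seq, block):
        self.sync[seq % self.k][seq] = block     # 1) park in sync buffer
        # 2) move runs of consecutive blocks into the cache buffer
        while self.next_seq in self.sync[self.next_seq % self.k]:
            self.cache[self.next_seq] = \
                self.sync[self.next_seq % self.k].pop(self.next_seq)
            self.next_seq += 1

b = Buffers(2)
for seq in [1, 0, 3, 2]:          # blocks may arrive out of order
    b.on_block(seq, f"blk{seq}")
print(sorted(b.cache))            # -> [0, 1, 2, 3]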
32. Comparison with the First Release (II)
33. Comparison with the First Release (III)
34. Parent-Children and Partnership
- Partners are connected via TCP connections
- Parents supply video streams to their children over TCP connections
35. System Dynamics
36. Peer Join and Adaptation
- Stream bit-rate normalized to ONE
- Two sub-streams
- The weight of a node is its outgoing bandwidth
- Node E is a newly arriving node (a join sketch follows below)
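A Python sketch of the join example on this slide: with the stream rate normalized to 1 and two sub-streams, each sub-stream needs rate 0.5, and a newcomer picks parents out of its partial view by residual outgoing bandwidth. The greedy rule here is an illustrative assumption, not the exact CoolStreaming selection policy.

# Greedy parent selection by residual outgoing bandwidth (weight).
def pick_parents(residual, n_substreams=2, rate=1.0):
    """residual: {node: leftover outgoing bandwidth}. Returns
    {substream_index: chosen parent} or None if capacity is short."""
    need = rate / n_substreams
    parents = {}
    for s in range(n_substreams):
        candidates = [n for n, r in residual.items() if r >= need]
        if not candidates:
            return None  # not enough capacity in the partial view
        best = max(candidates, key=lambda n: residual[n])
        residual[best] -= need
        parents[s] = best
    return parents

# Node E joins with partial view {A: 1.0, B: 0.5, C: 0.2}.
print(pick_parents({"A": 1.0, "B": 0.5, "C": 0.2}))  # -> {0: 'A', 1: 'A'}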
37. Peer Adaptation
38. Peer Adaptation in Coolstreaming
- Inequality (1) monitors the buffer status of the sub-streams received by node A (both rules are sketched below)
  - If it does not hold, at least one sub-stream is delayed beyond the threshold value Ts
- Inequality (2) monitors the buffer status of the parents of node A
  - If it does not hold, a parent node is considerably lagging, in the number of blocks received, behind at least one of the partners that is currently not a parent of node A
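The exact inequalities (1) and (2) appear in the Coolstreaming papers; the Python sketch below encodes only the verbal description above, with HS[i] assumed to be the latest received block sequence number of sub-stream i.

# Monitoring rules for triggering peer adaptation at node A.
def substreams_in_sync(hs, ts):
    """Rule (1): no received sub-stream lags the most advanced one
    by more than the threshold ts."""
    return max(hs) - min(hs) <= ts

def parent_keeps_up(parent_hs, partner_hs_list, tp):
    """Rule (2): the parent is not considerably behind the best
    non-parent partner (in number of blocks received)."""
    return max(partner_hs_list) - parent_hs <= tp

# If either check fails, node A adapts by switching the lagging
# sub-stream to a better partner.
print(substreams_in_sync([100, 97], ts=5), parent_keeps_up(95, [98, 96], tp=5))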
39. User Types Distribution
40. Contribution Index
41. Conceptual Overlay Topology
- Source node: O
- Super-peers: A, B, C, D
- Moderate-peers: a
- Casual-peers: b, c, d
42. Event Distributions
43. Media Player Ready Time under Different Time Periods
44. Session Distribution