Title: A Server-less Architecture for Building Scalable, Reliable, and Cost-Effective Video-on-demand Systems
- Jack Lee Yiu-bun, Raymond Leung Wai Tak
- Department of Information Engineering
- The Chinese University of Hong Kong
Contents
- 1. Introduction
- 2. Challenges
- 3. Server-less Architecture
- 4. Performance
- 5. Conclusion
1. Introduction
- Traditional Client-server Architecture
- Clients connect to the server and request streaming
- Server capacity limits the system capacity
- Cost increases with system scale
- Server-less Architecture
- Motivated by the availability of powerful user devices
- Each user node contributes to the system
- Memory
- Network bandwidth
- Storage
- Costs shared by users
1. Introduction
- Composed of clusters
- Each node serves as a mini server
2. Challenges
- Video Data Storage
- Retrieval and Transmission Scheduling
- Fault Tolerance
- Distributed Directory Service
- Heterogeneous User Nodes
- System Adaptation: node joining/leaving
3. Server-less Architecture
- Storage Policy
- Video data is divided into fixed-size blocks and then distributed among nodes in the cluster (data striping)
- Low storage requirement, load balanced
- Capable of fault tolerance using redundant blocks (discussed later)
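The striping idea above can be illustrated with a minimal sketch. It assumes round-robin placement of blocks across nodes, which the slides do not specify; the function names `stripe` and `reassemble` are hypothetical and chosen only for this example.

```python
def stripe(video: bytes, block_size: int, num_nodes: int) -> list[list[bytes]]:
    """Split video into fixed-size blocks and place them on nodes round-robin.

    Round-robin placement is an assumption; the last block may be shorter
    than block_size if the video length is not a multiple of it.
    """
    nodes: list[list[bytes]] = [[] for _ in range(num_nodes)]
    for offset in range(0, len(video), block_size):
        index = offset // block_size              # global block number
        nodes[index % num_nodes].append(video[offset:offset + block_size])
    return nodes


def reassemble(nodes: list[list[bytes]]) -> bytes:
    """Rebuild the video by reading one block from each node per round."""
    blocks = []
    rounds = max(len(node) for node in nodes)
    for r in range(rounds):
        for node in nodes:
            if r < len(node):                     # later nodes may hold one block fewer
                blocks.append(node[r])
    return b"".join(blocks)


video = bytes(range(10))
placement = stripe(video, block_size=2, num_nodes=3)
restored = reassemble(placement)
```

With this placement every node stores roughly 1/N of the video, and consecutive blocks come from different nodes, so the retrieval load is spread across the cluster.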
3. Server-less Architecture
- Retrieval and Transmission Scheduling
- Round-based scheduler
- Retrieval scheduling in terms of macro rounds composed of GSS groups (micro rounds)
- Transmission lasts for one macro round
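As a rough illustration of the round structure, the sketch below assumes that GSS refers to Grouped Sweeping Scheduling: the streams served in one macro round are partitioned into groups, each group is served in its own micro round, and requests within a group are sorted by disk position. The function name `gss_schedule`, the request tuples, and the round-robin group assignment are assumptions made for this example, not details taken from the slides.

```python
def gss_schedule(requests, num_groups):
    """Partition retrieval requests into GSS groups (one micro round each).

    requests: list of (stream_id, disk_position) tuples for one macro round.
    Returns a list of groups; each group is sorted by disk position so the
    disk head can sweep in one direction within a micro round.
    """
    groups = [[] for _ in range(num_groups)]
    for i, request in enumerate(requests):
        groups[i % num_groups].append(request)    # assumed round-robin assignment
    return [sorted(group, key=lambda r: r[1]) for group in groups]


requests = [("a", 50), ("b", 10), ("c", 30), ("d", 20)]
macro_round = gss_schedule(requests, num_groups=2)
```

Each inner list is one micro round; the full list is one macro round, after which the retrieved blocks are transmitted over the following macro round.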
3. Server-less Architecture
- Fault Tolerance
- Recover not only from a single node failure, but also from multiple simultaneous node failures
- Redundancy by Forward Error Correction (FEC) Code
- e.g. Reed-Solomon Erasure Code (REC)
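To show how an erasure code lets the system survive several simultaneous node failures, here is a toy Reed-Solomon-style code. It is a sketch under stated assumptions, not the authors' implementation: it works over the prime field GF(257) rather than the byte-oriented fields used in practice (so a parity symbol may take the value 256), and the names `encode` and `recover` are hypothetical. Any k of the k + h symbols are enough to rebuild the k data symbols.

```python
P = 257  # prime modulus (assumption); symbols are integers in [0, 256]


def _interpolate(points, x):
    """Evaluate at x the unique polynomial through points (Lagrange, mod P)."""
    total = 0
    for xi, yi in points:
        num, den = 1, 1
        for xj, _ in points:
            if xj != xi:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, -1, P)) % P  # modular inverse, Python 3.8+
    return total


def encode(data, h):
    """Append h parity symbols: the data polynomial evaluated at x = k..k+h-1."""
    k = len(data)
    points = list(enumerate(data))        # data symbol i is the value at x = i
    return list(data) + [_interpolate(points, x) for x in range(k, k + h)]


def recover(received, k):
    """Rebuild the k data symbols from any k surviving {node index: symbol}."""
    if len(received) < k:
        raise ValueError("too many failures: fewer than k symbols survived")
    points = list(received.items())[:k]
    return [_interpolate(points, x) for x in range(k)]


data = [10, 20, 30, 40]                   # one symbol per data node
code = encode(data, h=2)                  # 4 data nodes + 2 redundant nodes
survivors = {i: code[i] for i in (0, 2, 4, 5)}   # nodes 1 and 3 have failed
rebuilt = recover(survivors, k=4)
```

With h redundant nodes the cluster tolerates any h simultaneous failures, which is the property the reliability analysis relies on.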
4. Performance Evaluation
- Reliability Analysis
- Find out the system mean time to failure (MTTF)
- Assuming independent node failure/repair rate
- Tolerate up to h failures by redundancy
- Analysis by Markov chain model
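One way to write down such a model is as a birth-death Markov chain; the formulation below is a sketch under assumptions the slides do not state. Let state i be the number of failed nodes in a cluster of N nodes, let each working node fail at rate λ and each failed node be repaired at rate μ (both symbols and the per-node repair assumption are introduced here), and let state h + 1 be the absorbing system-failure state. Writing T_i for the expected time to move from state i to state i + 1:

```latex
\begin{align}
\lambda_i &= (N - i)\,\lambda, \qquad \mu_i = i\,\mu, \qquad i = 0, 1, \dots, h \\
T_0 &= \frac{1}{\lambda_0} \\
T_i &= \frac{1}{\lambda_i} + \frac{\mu_i}{\lambda_i}\, T_{i-1}, \qquad i = 1, \dots, h \\
\mathrm{MTTF} &= \sum_{i=0}^{h} T_i
\end{align}
```

The recurrence for T_i follows from conditioning on the first transition out of state i: with probability λ_i / (λ_i + μ_i) the chain moves up directly, otherwise it drops to state i - 1 and must climb back.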
4. Performance Evaluation
- Redundancy Level
- Defined as the proportion of nodes serving redundant data
- Redundancy level required to achieve the target system MTTF versus number of nodes
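The relationship between cluster size and required redundancy can be explored numerically. The sketch below assumes a birth-death Markov chain in which each working node fails at rate `fail_rate` and each failed node is repaired independently at rate `repair_rate`; the function names and the numeric rates are hypothetical and are not taken from the slides.

```python
def mttf(n, h, fail_rate, repair_rate):
    """Expected time until more than h of n nodes are down at once.

    Uses T_i = 1/up_i + (down_i/up_i) * T_(i-1), where T_i is the expected
    time to go from i failed nodes to i + 1 failed nodes.
    """
    total, t_prev = 0.0, 0.0
    for i in range(h + 1):
        up = (n - i) * fail_rate          # rate of one more failure
        down = i * repair_rate            # rate of one repair completing
        t_prev = 1.0 / up + (down / up) * t_prev
        total += t_prev
    return total


def min_redundancy(n, target_mttf, fail_rate, repair_rate):
    """Smallest h (and redundancy level h/n) that meets the target MTTF."""
    for h in range(n):
        if mttf(n, h, fail_rate, repair_rate) >= target_mttf:
            return h, h / n
    raise ValueError("target MTTF not reachable with this cluster size")


# Hypothetical rates, in events per hour, for illustration only.
h_needed, level = min_redundancy(n=100, target_mttf=10_000,
                                 fail_rate=1 / 1_000, repair_rate=1 / 24)
```

Sweeping `n` and plotting `level` would give a curve of the kind this slide describes, under the stated rate assumptions.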
4. Performance Evaluation
- System Response Time
- Sum of the scheduling delay and the prefetch delay
- Prefetch Delay
- Time required to receive the first group of blocks from all nodes
- Increases linearly with system scale: not scalable
- Ultimately limits the cluster size
- What is the Solution?
- Multiple parity groups
4. Performance Evaluation
- Multiple Parity Groups
- Instead of a single parity group, the redundancy is encoded with multiple parity groups
- Playback begins after receiving the data of the first parity group
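A small sketch can make the effect on prefetching concrete. It assumes that nodes are split into contiguous, near-equal parity groups and that the amount of data needed before playback is proportional to the size of the first group; both assumptions, and the function names, are introduced here for illustration only.

```python
def parity_groups(node_ids, num_groups):
    """Split the node list into num_groups contiguous, near-equal groups."""
    size, extra = divmod(len(node_ids), num_groups)
    groups, start = [], 0
    for g in range(num_groups):
        end = start + size + (1 if g < extra else 0)   # spread the remainder
        groups.append(node_ids[start:end])
        start = end
    return groups


def nodes_before_playback(node_ids, num_groups):
    """Number of nodes whose first blocks must arrive before playback starts."""
    return len(parity_groups(node_ids, num_groups)[0])


cluster = list(range(12))
single = nodes_before_playback(cluster, 1)    # one parity group: wait for all nodes
split = nodes_before_playback(cluster, 4)     # four parity groups: wait for the first
```

Under these assumptions, going from one parity group to four cuts the number of nodes that must be heard from before playback from 12 to 3.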
4. Performance Evaluation
- Multiple Parity Groups
- Performance gain: shortens the prefetch delay
- Drawback: higher redundancy level to maintain the same system MTTF
- Tradeoff between response time and redundancy level
4. Performance Evaluation
- System Response Time
- Increases with cluster size
- Shortened by using multiple parity groups
4. Performance Evaluation
- System Dimensioning
- What are the system configurations if the system
- achieves an MTTF of 10,000 hours, and
- keeps within a response time constraint of 5 seconds?
5. Conclusion
- Server-less Architecture
- Scalable
- Acceptable redundancy level to achieve reasonable response time in a cluster
- Further scale up by forming new autonomous clusters
- Reliable
- Fault tolerance by redundancy
- Reliability comparable to that of a high-end server, according to the Markov chain analysis
- Cost-Effective
- Costs shared by all users
5. Conclusion
- Future Work
- Distributed Directory Service
- Heterogeneous User Nodes
- Dynamic System Adaptation
- Node joining/leaving
- Data re-distribution
End of Presentation