Title: Characterizing User Access To Videos On The World Wide Web
1Characterizing User Access To Videos On The World Wide Web
Soam Acharya, Inktomi Corporation, Foster City, CA
Peter Parnes, Center For Distance Spanning Technology, Luleå University of Technology, Sweden
Brian Smith, Department of Computer Science, Cornell University, Ithaca, NY
MMCN 2000
2Overview
- Analysis of traces from an ongoing VoW trial (VoD over the Web)
- 2 year period
- 13100 requests
- 246 titles
3Why?
- Audio/Video content
- coming online rapidly
- constitutes a large percentage (17%) of bytes transferred online
- Useful to
- Cache Designers
- Codec Engineers
- Network Engineers
- Other Multimedia Researchers
- MM Storage Systems
4Questions We Asked
- Do accesses to videos exhibit temporal locality?
- How frequently are videos accessed?
- Do users exhibit specific browsing patterns when viewing videos?
- What are the file size trends?
5Roadmap
- VoW Setup
- Analysis Methodology
- Results
- Conclusion
- Future Work
6VoW Setup
- Luleå University, Sweden
- Center for Distance Spanning Technology
- High speed network (34 Mbps)
- mMOD software system
7VoW Setup II
- Two years (end of Aug 97 - mid Oct 99)
- 246 video titles
- encoded using H.261 (CIF - 320x240)
- 500 campus machines involved in access, 1400 outside
- title categories
- general
- movies
- educational
- courses
- tutorials, seminars
8Analysis
- Video file characteristics
- size
- duration
- bitrate distribution
- Trace access analysis
- Trace refinement
- Actual analysis on refined data
9Median Movie Size 96 MBytes
10Median Duration 70 minutes
11
- Quality of video streams deliberately kept low (for external users)
- Compression scheme designed to produce lower bitrates
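As a rough consistency check on the two medians above, a 96 MByte file lasting 70 minutes would correspond to an average bitrate of roughly the following. This is only indicative: the medians come from separate distributions, and 1 MByte is assumed here to mean 10^6 bytes.

```latex
\[
\frac{96 \times 8\ \text{Mbit}}{70 \times 60\ \text{s}}
  = \frac{768\ \text{Mbit}}{4200\ \text{s}}
  \approx 0.18\ \text{Mbit/s}
\]
```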
12Trace Access Analysis - Log Filtering
- Initially eliminate from the trace
- HTML documents
- Java applet requests
- images
- Joining a session already in progress
020133 salt.cdt.luth.se GET Movie1
020323 spock.cdt.luth.se GET TVSerial_970206
030412 aniara.cdt.luth.se GET Movie2
031011 aniara.cdt.luth.se STOP Movie2
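The sample entries above suggest a four-field record: a timestamp, the requesting host, an action (GET or STOP), and a title. A minimal Python sketch of parsing such lines follows. Reading the timestamp as HHMMSS is an assumption, and the `LogEntry` and `parse_line` names are hypothetical; the slides do not document the actual mMOD log format.

```python
from typing import NamedTuple, Optional


class LogEntry(NamedTuple):
    seconds: int  # seconds since midnight, ASSUMING an HHMMSS timestamp
    host: str
    action: str   # "GET" or "STOP"
    title: str


def parse_line(line: str) -> Optional[LogEntry]:
    """Parse one whitespace-separated record; return None if it does not fit."""
    parts = line.split()
    if len(parts) != 4:
        return None
    stamp, host, action, title = parts
    if len(stamp) != 6 or not stamp.isdecimal():
        return None
    if action not in ("GET", "STOP"):
        return None
    seconds = int(stamp[0:2]) * 3600 + int(stamp[2:4]) * 60 + int(stamp[4:6])
    return LogEntry(seconds, host, action, title)


# The four sample records shown on the slide.
sample = [
    "020133 salt.cdt.luth.se GET Movie1",
    "020323 spock.cdt.luth.se GET TVSerial_970206",
    "030412 aniara.cdt.luth.se GET Movie2",
    "031011 aniara.cdt.luth.se STOP Movie2",
]
entries = [e for e in map(parse_line, sample) if e is not None]
```

Lines that do not match the assumed layout are skipped rather than raising, which suits a first filtering pass over a noisy trace.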
13Log Filtering II
- Eliminate from trace
- requests from demo machines
- resolve IP addresses for machine names
- reduce user errors
- hitting STOP button too many times
- hitting GET requests too many times
- Removed 1160 requests, 11965 remaining
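The filtering rules above could be sketched as follows. The slides do not say how "too many times" was defined, so the 10-second `window`, the `demo_hosts` parameter, and the `drop_repeats` name are all assumptions chosen for illustration.

```python
def drop_repeats(entries, demo_hosts=frozenset(), window=10):
    """Filter a time-ordered trace of (seconds, host, action, title) tuples.

    Drops requests from demo machines, and drops a request when the same
    host issued the same action on the same title within `window` seconds
    of its previous attempt (ASSUMED definition of a repeated button press).
    """
    last_seen = {}  # (host, action, title) -> time of the latest attempt
    kept = []
    for seconds, host, action, title in entries:
        if host in demo_hosts:
            continue
        key = (host, action, title)
        previous = last_seen.get(key)
        last_seen[key] = seconds  # updated even when dropped, so bursts collapse
        if previous is not None and seconds - previous <= window:
            continue
        kept.append((seconds, host, action, title))
    return kept


# Hypothetical trace: three rapid GETs, two rapid STOPs, one demo request.
raw = [
    (100, "a", "GET", "Movie1"),
    (102, "a", "GET", "Movie1"),
    (104, "a", "GET", "Movie1"),
    (300, "demo", "GET", "Movie2"),
    (500, "a", "STOP", "Movie1"),
    (501, "a", "STOP", "Movie1"),
    (900, "a", "GET", "Movie1"),
]
clean = drop_repeats(raw, demo_hosts={"demo"})
```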
14Trace Analysis Methodology
- General
- How do video requests vary by day?
- Mathematical distributions?
- Do some machines request more than others?
- Pattern Detection
- Inter-access times
- Do users access videos all the way?
- Type of file
- Temporal locality
15 11965 accesses over twenty-five months
16Movie Popularity
Movie popularity did not follow Zipf's law, $P \propto 1/p^{1-t}$, where $P$ is the frequency of access to a document and $p$ is its rank in popularity
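One common way to check a trace against a Zipf-like form is to fit a straight line to log(frequency) against log(rank). The sketch below does this with ordinary least squares using only the standard library. It is not necessarily the test the authors used, the `fit_loglog_slope` name is hypothetical, and the input data is synthetic.

```python
import math


def fit_loglog_slope(counts):
    """Least-squares slope of log(count) versus log(rank).

    If P is proportional to 1/p**(1 - t), the slope is -(1 - t), so a
    slope near -1 corresponds to classic Zipf behaviour (t near 0).
    """
    ranked = sorted((c for c in counts if c > 0), reverse=True)
    if len(ranked) < 2:
        raise ValueError("need at least two titles with nonzero counts")
    xs = [math.log(rank) for rank in range(1, len(ranked) + 1)]
    ys = [math.log(c) for c in ranked]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Synthetic, exactly Zipf-distributed counts (NOT data from the trace).
synthetic = [1000 / rank for rank in range(1, 51)]
slope = fit_loglog_slope(synthetic)
t_estimate = 1 + slope
```

A fitted slope far from -1, or a visibly curved log-log plot, would both be signs that the popularity distribution departs from Zipf's law.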
17Distribution of Requests By Machine
- About 73% of all requests from campus and surrounding community
- For requests from within campus
- 2% of all machines (11) account for > 21% of requests
- 10% of machines (53) account for > 50% of requests
- Lab machines
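Concentration figures of this kind can be computed by counting requests per machine and summing the busiest ones. A minimal sketch follows; the `top_share` name and the rounding-up of the machine count are assumptions, and the host list is made up for illustration.

```python
import math
from collections import Counter


def top_share(hosts, fraction):
    """Share of all requests issued by the most active `fraction` of machines.

    `hosts` holds one host name per request. The number of machines taken
    is rounded up (ASSUMED convention) and is never less than one.
    """
    per_host = Counter(hosts)
    if not per_host:
        return 0.0
    k = max(1, math.ceil(fraction * len(per_host)))
    top_requests = sum(count for _, count in per_host.most_common(k))
    return top_requests / sum(per_host.values())


# Hypothetical requests: 4 machines, 10 requests in total.
requests = ["a"] * 6 + ["b"] * 2 + ["c", "d"]
share_top_quarter = top_share(requests, 0.25)  # busiest 1 of 4 machines
share_top_half = top_share(requests, 0.5)      # busiest 2 of 4 machines
```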
19Partial Access
- 61% of accesses went to completion
- 39% stopped early
- Suggests browsing pattern
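The slides do not state how a completed access was identified. One plausible approach, sketched here purely as an assumption, is to pair each GET with a later STOP from the same host for the same title and call the access partial when the STOP arrives before the title's full duration has elapsed. The `completion_rate` name, the `durations` table, and the sample data are hypothetical.

```python
def completion_rate(entries, durations):
    """Fraction of GET requests not followed by an early STOP.

    entries:   time-ordered (seconds, host, action, title) tuples, with
               `seconds` ASSUMED to increase monotonically over the trace.
    durations: mapping of title -> playing time in seconds.
    """
    in_progress = {}  # (host, title) -> start time of the open access
    total = 0
    stopped_early = 0
    for seconds, host, action, title in entries:
        key = (host, title)
        if action == "GET":
            in_progress[key] = seconds
            total += 1
        elif action == "STOP" and key in in_progress:
            start = in_progress.pop(key)
            if seconds - start < durations[title]:
                stopped_early += 1
    return (total - stopped_early) / total if total else 0.0


# Hypothetical trace: one early stop, one access with no STOP, one full viewing.
durations = {"Movie1": 600, "Movie2": 3600}
trace = [
    (0, "a", "GET", "Movie1"),
    (60, "a", "STOP", "Movie1"),
    (100, "b", "GET", "Movie1"),
    (200, "c", "GET", "Movie2"),
    (3800, "c", "STOP", "Movie2"),
]
rate = completion_rate(trace, durations)
```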
20File Category Variations
- Access patterns vary by file category
- Lectures have temporal locality of access
- Many accesses shortly after going online
- Entertainment videos do not
21Temporal Locality
Trace: GET Movie1, GET Movie2, GET Movie2, GET Movie2, GET Movie3, GET Movie1
State before the first request
Stack (top first): (empty)
Previous Stack Position Counter
Position: 1 2 3
Counter: 0 0 0
(increment previous location of currently referenced document)
22Temporal Locality
Trace: GET Movie1, GET Movie2, GET Movie2, GET Movie2, GET Movie3, GET Movie1
State after request 1 (GET Movie1)
Stack (top first): Movie1
Previous Stack Position Counter
Position: 1 2 3
Counter: 0 0 0
(increment previous location of currently referenced document)
23Temporal Locality
Trace: GET Movie1, GET Movie2, GET Movie2, GET Movie2, GET Movie3, GET Movie1
State after request 2 (GET Movie2)
Stack (top first): Movie2, Movie1
Previous Stack Position Counter
Position: 1 2 3
Counter: 0 0 0
(increment previous location of currently referenced document)
24Temporal Locality
Trace: GET Movie1, GET Movie2, GET Movie2, GET Movie2, GET Movie3, GET Movie1
State after request 3 (GET Movie2)
Stack (top first): Movie2, Movie1
Previous Stack Position Counter
Position: 1 2 3
Counter: 1 0 0
(increment previous location of currently referenced document)
25Temporal Locality
Trace: GET Movie1, GET Movie2, GET Movie2, GET Movie2, GET Movie3, GET Movie1
State after request 4 (GET Movie2)
Stack (top first): Movie2, Movie1
Previous Stack Position Counter
Position: 1 2 3
Counter: 2 0 0
(increment previous location of currently referenced document)
26Temporal Locality
Trace: GET Movie1, GET Movie2, GET Movie2, GET Movie2, GET Movie3, GET Movie1
State after request 5 (GET Movie3)
Stack (top first): Movie3, Movie2, Movie1
Previous Stack Position Counter
Position: 1 2 3
Counter: 2 0 0
(increment previous location of currently referenced document)
27Temporal Locality
Trace: GET Movie1, GET Movie2, GET Movie2, GET Movie2, GET Movie3, GET Movie1
State after request 6 (GET Movie1)
Stack (top first): Movie1, Movie3, Movie2
Previous Stack Position Counter
Position: 1 2 3
Counter: 2 0 1
Plot this after running through the entire trace
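The walkthrough on the preceding slides can be sketched in a few lines of Python. This is a minimal reading of the procedure as the slides show it (move the referenced title to the top of the stack and increment the counter for the position where it was previously found). The `stack_position_counts` name is hypothetical, and first-time references are assumed not to increment any counter, which matches the slide states.

```python
def stack_position_counts(trace):
    """Return (stack, counts) after replaying a list of requested titles.

    stack:  titles ordered from most to least recently referenced.
    counts: counts[i] is the number of requests whose title was found at
            stack position i + 1 (positions are 1-based on the slides).
    """
    stack = []
    counts = []
    for title in trace:
        if title in stack:
            position = stack.index(title)
            while len(counts) <= position:
                counts.append(0)
            counts[position] += 1
            stack.pop(position)
        # First references fall through here without touching any counter.
        stack.insert(0, title)
    # Pad so that every stack position has a counter, as on the slides.
    counts.extend([0] * (len(stack) - len(counts)))
    return stack, counts


# The six-request trace used on the slides.
trace = ["Movie1", "Movie2", "Movie2", "Movie2", "Movie3", "Movie1"]
stack, counts = stack_position_counts(trace)
# stack  is ["Movie1", "Movie3", "Movie2"]
# counts is [2, 0, 1]
```

Counts concentrated at the low positions indicate that recently requested titles tend to be requested again soon, which is what the plotted result is meant to reveal.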
28Temporal Locality Result
29Conclusion
- Videos are relatively large (to capture entire lectures, movies)
- Users browse portions of video
- A small number of machines accounted for a large number of accesses
- High temporal locality of trace accesses
30Future Work
- Further analysis on inter-access patterns
- Repeat analysis on traces from other VoW type
experiments, cache traces ...