I Tube, You Tube, Everybody Tubes (Pablo Rodriguez, Telefonica Research, Barcelona)

1
I Tube, You Tube, Everybody Tubes
Pablo Rodriguez, Telefonica Research, Barcelona
2
YouTube Video Example
3
Content is NOT king
4
Content Explosion
How to search content?
5
Aggregation and Recommendation
Infinite Choice Overwhelming Confusion
Filters are required to connect users with content
that appeals to their interests
6
Video and Social Networks
  • Trends in video services
  • Users generate new videos
  • Users help each other find videos
  • Need to understand users and content
  • Video characteristics in YouTube
  • User-behavior and potential for recommendations

7
Particularities of
  • "Bite-size bits for high-speed munching" (Wired magazine, Mar 2007)
  • Plethora of YouTube clones
  • UGC is very different
  • How different?

8
UGC vs. Non-UGC
  • Massive production scale
  • In 15 days, YouTube produces the equivalent of 120 years' worth of
    movies in IMDb!
  • Extreme publishers
  • 1000 uploads over a few years vs. 100 movies over
    50 years
  • Short video length
  • 30 sec to 5 min vs. 100-min movies in LoveFilm
  • The rest: consumption patterns

9
User Participation/Finding Videos
  • Despite Web 2.0 features, user participation
    remains low
  • Only 0.16-0.22% of viewers rate or comment on videos
  • 47% of videos have pointers from external sites
  • But requests from such sites account for less
    than 3% of the total views

10
Goals and Data
Goals
  • Potential for recommendation systems?
  • Popularity evolution
  • Content duplication
Data
  • Crawled YouTube and other UGC systems
  • Metadata: video ID, length, views
  • 1.6M Entertainment and 250K Science videos

11
Part 1: Popularity Distribution
  • Static popularity characteristics
  • Underlying mechanism

12
Pareto Principle
  • The 10% most popular videos account for 80% of total views
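
The Pareto claim on this slide can be checked on any list of per-video view counts. The following is a minimal sketch; the function name `top_share` and the ten example view counts are hypothetical and are not data from the talk.

```python
def top_share(views, top_fraction=0.10):
    """Fraction of total views captured by the most-viewed `top_fraction` of videos."""
    if not views:
        raise ValueError("views must be non-empty")
    ranked = sorted(views, reverse=True)
    # Always include at least one video in the "top" set.
    k = max(1, int(len(ranked) * top_fraction))
    return sum(ranked[:k]) / sum(ranked)

# Hypothetical view counts for ten videos (illustrative only).
example_views = [800, 40, 35, 30, 25, 20, 20, 15, 10, 5]
share = top_share(example_views)
print(f"Top 10% of videos hold {share:.0%} of views")
```

On a real crawl, `views` would be the per-video view counts from the metadata, and a result near 0.8 would match the slide.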

13
Dominant Power-Law Behavior
  • Richer-get-richer principle
  • If a video has K views, then users will watch the
    video at a rate proportional to K
  • Also seen in word frequency, citations of papers, scale of
    earthquakes, and web hits
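
The richer-get-richer mechanism can be illustrated with a small simulation. This is a sketch of a generic preferential-attachment process (Simon's model), not the analysis used in the talk; the upload probability `p_new`, the step count, and the function name are assumptions chosen for illustration.

```python
import random

def simulate_rich_get_richer(steps, p_new=0.1, seed=0):
    """Simulate views where each request picks a video with probability
    proportional to its current view count; new videos arrive with
    probability p_new per step (hypothetical parameter)."""
    rng = random.Random(seed)
    views = [1]    # views[i] = view count of video i; start with one video
    tokens = [0]   # one entry per view, holding the id of the video viewed
    for _ in range(steps):
        if rng.random() < p_new:
            # A new video is uploaded and receives its first view.
            views.append(1)
            tokens.append(len(views) - 1)
        else:
            # Uniform choice over tokens == choice proportional to views.
            video = rng.choice(tokens)
            views[video] += 1
            tokens.append(video)
    return views

sim_views = simulate_rich_get_richer(steps=20000)
ranked = sorted(sim_views, reverse=True)
print("top video:", ranked[0], "median video:", ranked[len(ranked) // 2])
```

Plotting `ranked` against rank on log-log axes should give a roughly straight line, which is the signature of the power-law behavior named on the slide.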

14
UGC Video Distribution
  • Straight-line waist, truncated at both ends

15
Focusing on Popular Videos
  • Why do popular videos deviate from the power law?
  • Fetch-at-most-once (SOSP 2003)
  • Behavior of fetching immutable objects once (cf.
    visiting popular web sites many times)
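
A small simulation shows why fetch-at-most-once flattens the head of the distribution. This is a simplified sketch under stated assumptions: users draw requests from a Zipf distribution with exponent 1, and a repeat request for an already-watched video is simply dropped. The population sizes and the function name are hypothetical.

```python
import random

def simulate_requests(num_users, num_videos, requests_per_user, at_most_once, seed=0):
    """Count views per video (index 0 is the most popular) under Zipf demand."""
    rng = random.Random(seed)
    weights = [1.0 / rank for rank in range(1, num_videos + 1)]  # Zipf, exponent 1
    counts = [0] * num_videos
    for _ in range(num_users):
        picks = rng.choices(range(num_videos), weights=weights, k=requests_per_user)
        if at_most_once:
            picks = set(picks)  # an immutable video is fetched only once per user
        for video in picks:
            counts[video] += 1
    return counts

once = simulate_requests(200, 500, 50, at_most_once=True)
repeat = simulate_requests(200, 500, 50, at_most_once=False)
print("top video views, repeats allowed:", repeat[0])
print("top video views, fetch-at-most-once:", once[0])
```

With fetch-at-most-once, no video can exceed one view per user, so the most popular videos fall below the straight line that repeated visits would produce.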

16
Why the Unpopular Tail Falls Off
  • Natural shape is curved
  • Sampling bias or pre-filters
  • Publishers tend to upload interesting videos
  • Information filtering or post-filters
  • Search results or suggestions favor popular items

17
Impact of Post-Filters
  • Videos exposed longer to the filtering effect
    appear more truncated

[Figure: x-axis is video rank]
18
Is it Naturally Curved?
  • Matlab curve fitting for Science

19
Is it Naturally Curved?
  • Matlab curve fitting for Science

Zipf is scale-free, while exponential is scaled.
The underlying mechanism is Zipf, and truncation
is due to bottlenecks.
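
The Zipf-versus-exponential comparison can be sketched without Matlab. The example below is not the authors' fitting procedure: it uses ordinary least squares on transformed axes and synthetic, exactly Zipf-distributed data. A Zipf law is a straight line in log(views) against log(rank), while an exponential is a straight line in log(views) against rank.

```python
import math

def least_squares(xs, ys):
    """Return (slope, intercept, sum of squared residuals) of the line fit to (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    return slope, intercept, sse

# Synthetic views following views = 1000 / rank (illustrative, not crawl data).
ranks = list(range(1, 101))
views = [1000.0 / r for r in ranks]
log_views = [math.log(v) for v in views]

# Zipf (power law): log(views) is linear in log(rank).
zipf_slope, _, zipf_sse = least_squares([math.log(r) for r in ranks], log_views)
# Exponential: log(views) is linear in rank itself.
exp_slope, _, exp_sse = least_squares(ranks, log_views)

print(f"Zipf fit: slope={zipf_slope:.3f}, SSE={zipf_sse:.3g}")
print(f"Exponential fit: slope={exp_slope:.4f}, SSE={exp_sse:.3g}")
```

The fit with the smaller residual is the better description of the data; on truncated real data, the comparison would be restricted to the straight-line waist.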
20
Implication of Our Findings
  • Latent demand for products that is suppressed by
    bottlenecks in the system
  • Chris Anderson, The Long Tail


40% additional views! How?
Personalized recommendation, enriched metadata, abundant videos
21
Part 2: Popularity Evolution
  • Relationship between popularity and age

22
Popularity Evolution
  • So far, we focused on static popularity
  • Now focus on popularity dynamics
  • How are requests on any given day distributed
    across video ages?
  • 6-day daily trace of Science videos
  • Step 1: Group videos requested at least once by
    age
  • Step 2: Count request volume per age group
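
The two steps above can be sketched as follows. The age bins, video ids, and request log are hypothetical; the real analysis used the 6-day trace of Science videos.

```python
from collections import defaultdict

# Hypothetical age bins: (upper limit in days, label).
AGE_BINS = [(1, "<= 1 day"), (7, "<= 1 week"), (30, "<= 1 month")]

def age_group(age_days):
    """Map a video age in days to the label of its age bin."""
    for limit, label in AGE_BINS:
        if age_days <= limit:
            return label
    return "> 1 month"

def request_volume_by_age(requests, video_age_days):
    """Return {age group: (distinct videos requested, total requests)}."""
    videos = defaultdict(set)   # Step 1: videos requested at least once, by age
    volume = defaultdict(int)   # Step 2: request volume per age group
    for video_id in requests:
        group = age_group(video_age_days[video_id])
        videos[group].add(video_id)
        volume[group] += 1
    return {group: (len(videos[group]), volume[group]) for group in volume}

# Illustrative one-day request log and video ages (not trace data).
video_age_days = {"a": 0.5, "b": 3, "c": 45, "d": 400}
requests = ["a", "c", "c", "d", "b", "c"]
by_age = request_volume_by_age(requests, video_age_days)
old_share = by_age["> 1 month"][1] / len(requests)
print(by_age, f"share older than a month: {old_share:.0%}")
```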

23
Request Volume Across Age
User preference is relatively insensitive to age:
80% of requests are for videos older than a month
The probability of a video being watched is 43%,
18%, 17%, and 14% for the first 24 hours, 6 days,
3 weeks, and 1 month, respectively
24
Part 4: Content Duplication
  • Level of duplication
  • Birth of duplicates

25
Content Duplication
  • Alias: an identical or similar copy of the same
    content
  • Aliases dilute popularity of a single event
  • Views distributed across multiple copies
  • Difficulty in recommendation ranking systems
  • Test with 51 volunteers
  • Find alias using keyword search
  • Identified 1,224 aliases for 184 original videos

26
The Level of Popularity Dilution
  • Popularity is diluted by up to a few orders of magnitude
  • Aliases often got more requests than the original
  • (e.g. an alias got 1000 times more requests)
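
Dilution can be quantified by aggregating views over an original video and its aliases. This is a minimal sketch with made-up view counts; the function and field names are hypothetical, and it assumes the original has at least one view.

```python
def dilution_report(original_views, alias_views):
    """Summarize how an event's views are spread across original and aliases."""
    total = original_views + sum(alias_views)
    return {
        # Views the event would have if all copies were counted as one video.
        "total_views": total,
        # Fraction of the event's views credited to the original.
        "original_share": original_views / total,
        # How many times more requests the biggest alias got than the original.
        "max_alias_ratio": max(alias_views) / original_views if alias_views else 0.0,
    }

# Illustrative numbers only: one alias with 1000 times the original's views.
report = dilution_report(original_views=50, alias_views=[50_000, 300, 20])
print(report)
```

Ranking by `total_views` rather than by each copy's own count is one way a recommendation or ranking system could compensate for dilution.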

27
How Late Do Aliases Appear?
  • Significant aliases appear within one week
  • Within the first day of posting the original
    video, sometimes you get more than 80 aliases

28
Conclusions
  • UGC is a new form of video social interaction
  • User interaction remains low
  • Lots of potential for social recommendations

29
Questions?
  • Dataset available at http://an.kaist.ac.kr/traces/IMC2007.html