Click anywhere to start the presentation - PowerPoint PPT Presentation

Slides: 53
Provided by: Yehuda3
Transcript and Presenter's Notes

Title: Click anywhere to start the presentation


1
Click anywhere to start the presentation
2
Lessons from the Netflix Prize
Yehuda Koren, The BellKor team (with Robert Bell and
Chris Volinsky)
movie 15868
movie 7614
movie 3250
3
Recommender systems
We Know What You Ought To Be Watching This Summer
4
Collaborative filtering
  • Recommend items based on past transactions of
    users
  • Analyze relations between users and/or items
  • Specific data characteristics are irrelevant
  • Domain-free: user/item attributes are not
    necessary
  • Can identify elusive aspects

5
(No Transcript)
6
Movie rating data
Training data
score  movie  user
1      21     1
5      213    1
4      345    2
4      123    2
3      768    2
5      76     3
4      45     4
1      568    5
2      342    5
2      234    5
5      76     6
4      56     6

Test data
score  movie  user
?      62     1
?      96     1
?      7      2
?      3      2
?      47     3
?      15     3
?      41     4
?      28     4
?      93     5
?      74     5
?      69     6
?      83     6
7
Netflix Prize
  • Training data
      • 100 million ratings
      • 480,000 users
      • 17,770 movies
      • 6 years of data: 2000-2005
  • Test data
      • Last few ratings of each user (2.8 million)
      • Evaluation criterion: root mean squared error
        (RMSE)
      • Netflix Cinematch RMSE: 0.9514
  • Competition
      • 2700 teams
      • $1 million grand prize for 10% improvement on
        the Cinematch result
      • $50,000 2007 progress prize for 8.43%
        improvement
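
The evaluation criterion named on this slide can be sketched in a few lines. This is a minimal illustration, not contest code: the function name `rmse` and the three example ratings are made up for the demonstration.

```python
import math

def rmse(predicted, actual):
    """Root mean squared error between two equal-length rating sequences."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical predictions and true ratings, not taken from the Netflix data.
example_rmse = rmse([3.5, 4.0, 2.0], [4, 4, 1])
print(f"RMSE = {example_rmse:.4f}")
```

Because errors are squared before averaging, one prediction that is off by a full star costs more than several that are off by a fraction of a star.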

8
Overall rating distribution
  • A third of ratings are 4s
  • Average rating is 3.68

From TimelyDevelopment.com
9
Ratings per movie
  • Avg. number of ratings per movie: 5627

10
Ratings per user
  • Avg. number of ratings per user: 208

11
Average movie rating by movie count
  • More ratings to better movies

From TimelyDevelopment.com
12
Most loved movies
Count   Avg rating  Title
137812  4.593       The Shawshank Redemption
133597  4.545       Lord of the Rings: The Return of the King
180883  4.306       The Green Mile
150676  4.460       Lord of the Rings: The Two Towers
139050  4.415       Finding Nemo
117456  4.504       Raiders of the Lost Ark
180736  4.299       Forrest Gump
147932  4.433       Lord of the Rings: The Fellowship of the Ring
149199  4.325       The Sixth Sense
144027  4.333       Indiana Jones and the Last Crusade
13
Important RMSEs
Global average: 1.1296
User average: 1.0651
Movie average: 1.0533
Personalization
Cinematch: 0.9514 (baseline)
BellKor: 0.8693 (8.63% improvement)
Grand Prize: 0.8563 (10% improvement)
Inherent noise: ????
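
The improvement percentages on this slide are relative reductions of the Cinematch RMSE. A small sketch of that arithmetic follows; the function name `improvement_percent` is chosen here for illustration, and the RMSE values are the ones listed on the slide.

```python
def improvement_percent(baseline_rmse, new_rmse):
    """Relative RMSE reduction over a baseline, in percent."""
    return 100.0 * (baseline_rmse - new_rmse) / baseline_rmse

CINEMATCH_RMSE = 0.9514  # baseline from the slide

bellkor_gain = improvement_percent(CINEMATCH_RMSE, 0.8693)
grand_prize_target = CINEMATCH_RMSE * (1 - 0.10)  # RMSE needed for a 10% gain

print(f"BellKor improvement: {bellkor_gain:.2f}%")
print(f"Grand Prize target RMSE: {grand_prize_target:.4f}")
```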
14
Challenges
  • Size of data
      • Scalability
      • Keeping data in memory
  • Missing data
      • 99 percent missing
      • Very imbalanced
  • Avoiding overfitting
  • Test and training data differ significantly

movie 16322
15
The BellKor recommender system
  • Use an ensemble of complementing predictors
  • Two half-tuned models are worth more than a single,
    fully tuned model

16
(No Transcript)
17
The BellKor recommender system
  • Use an ensemble of complementing predictors
  • Two half-tuned models are worth more than a single,
    fully tuned model
  • But: many seemingly different models expose
    similar characteristics of the data, and won't
    mix well
  • Concentrate efforts along three axes...
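
One simple way to combine complementing predictors is a least-squares blend of their outputs. The sketch below is an illustration only: the slides do not say how BellKor blended its models, and the ratings, the two predictors, and the names `blend_weights` and `rmse` are all hypothetical.

```python
import numpy as np

def blend_weights(predictions, targets):
    """Least-squares weights for combining the columns of `predictions`."""
    weights, *_ = np.linalg.lstsq(predictions, targets, rcond=None)
    return weights

def rmse(pred, actual):
    return float(np.sqrt(np.mean((pred - actual) ** 2)))

# Hypothetical true ratings and two imperfect predictors (illustrative numbers).
actual = np.array([4.0, 3.0, 5.0, 2.0, 1.0, 4.0])
model_a = np.array([3.6, 3.4, 4.1, 2.5, 1.9, 3.8])
model_b = np.array([4.5, 2.2, 4.9, 2.8, 1.2, 3.1])

stacked = np.column_stack([model_a, model_b])
w = blend_weights(stacked, actual)
blended = stacked @ w

print("weights:", w)
print("RMSE a, b, blend:", rmse(model_a, actual), rmse(model_b, actual), rmse(blended, actual))
```

On the data used to fit the weights, the blend can never be worse than either input, because using one model alone is a special case of the blend. In practice the weights would be fitted on held-out ratings, and the gain is largest when the models make different kinds of errors.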

18
The three dimensions of the BellKor system
  • The first axis
      • Multi-scale modeling of the data
      • Combine top-level, regional modeling of the data
        with a refined, local view
      • k-NN: extracting local patterns
      • Factorization: addressing regional effects

[Figure labels along the axis: Global effects, Factorization, k-NN]
19
Multi-scale modeling 1st tier
Global effects
  • Mean rating: 3.7 stars
  • The Sixth Sense is 0.5 stars above avg
  • Joe rates 0.2 stars below avg
  • => Baseline estimation: Joe will rate The Sixth
    Sense 4 stars
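
The first-tier estimate is the sum of the three global effects listed above. A minimal sketch using the slide's numbers; the function and argument names are chosen here for illustration.

```python
def baseline_estimate(global_mean, item_offset, user_offset):
    """Global mean plus the movie's and the user's deviations from it."""
    return global_mean + item_offset + user_offset

# The Sixth Sense is 0.5 above average; Joe rates 0.2 below average.
joe_sixth_sense = baseline_estimate(3.7, 0.5, -0.2)
print(f"Baseline estimate: {joe_sixth_sense:.1f} stars")
```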

20
Multi-scale modeling 2nd tier
Factors model
  • Both The Sixth Sense and Joe are placed high on
    the Supernatural Thrillers scale
  • => Adjusted estimate: Joe will rate The Sixth Sense
    4.5 stars

21
Multi-scale modeling 3rd tier
Neighborhood model
  • Joe didn't like the related movie Signs
  • => Final estimate: Joe will rate The Sixth Sense
    4.2 stars

22
The three dimensions of the BellKor system
  • The second axis
      • Quality of modeling
      • Make the best out of a model
      • Strive for
          • Fundamental derivation
          • Simplicity
          • Avoid overfitting
          • Robustness against iterations, parameter
            setting, etc.
      • Optimizing is good, but don't overdo it!

[Figure: axes labeled global, local and quality]
23
The three dimensions of the BellKor system
  • The third dimension will be discussed later...
  • Next: moving the multi-scale view along the
    quality axis

[Figure: axes labeled global, local and quality; the third axis is marked ???]
24
Local modeling through k-NN
  • Earliest and most popular collaborative filtering
    method
  • Derive unknown ratings from those of similar
    items (movie-movie variant)
  • A parallel user-user flavor relies on ratings of
    like-minded users (not in this talk)

25
k-NN
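Ratings matrix (rows: movies, columns: users, "." = unknown rating)
          users
          1  2  3  4  5  6  7  8  9 10 11 12
movie 1   1  .  3  .  .  5  .  .  5  .  4  .
movie 2   .  .  5  4  .  .  4  .  .  2  1  3
movie 3   2  4  .  1  2  .  3  .  4  3  5  .
movie 4   .  2  4  .  5  .  .  4  .  .  2  .
movie 5   .  .  4  3  4  2  .  .  .  .  2  5
movie 6   1  .  3  .  3  .  .  2  .  .  4  .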
26
k-NN
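Ratings matrix (rows: movies, columns: users, "." = unknown rating)
          users
          1  2  3  4  5  6  7  8  9 10 11 12
movie 1   1  .  3  .  ?  5  .  .  5  .  4  .
movie 2   .  .  5  4  .  .  4  .  .  2  1  3
movie 3   2  4  .  1  2  .  3  .  4  3  5  .
movie 4   .  2  4  .  5  .  .  4  .  .  2  .
movie 5   .  .  4  3  4  2  .  .  .  .  2  5
movie 6   1  .  3  .  3  .  .  2  .  .  4  .
- estimate rating of movie 1 by user 5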
27
k-NN
[Figure: the ratings matrix of the previous slide, with user 5's rating of movie 1 marked "?"]
Neighbor selection: identify movies similar to movie 1
that were rated by user 5
28
k-NN
[Figure: the ratings matrix of the previous slide, with user 5's rating of movie 1 marked "?"]
Compute similarity weights: s13 = 0.2, s16 = 0.3
29
k-NN
[Figure: the same ratings matrix, with user 5's rating of movie 1 filled in as 2.6]
Predict by taking the weighted average:
(0.2*2 + 0.3*3) / (0.2 + 0.3) = 2.6
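
The prediction step can be written as a short function. This sketch reproduces the slide's arithmetic: similarity weights 0.2 and 0.3 for movies 3 and 6, which user 5 rated 2 and 3. The function name `knn_predict` is chosen here for illustration.

```python
def knn_predict(similarities, neighbor_ratings):
    """Similarity-weighted average of the user's ratings for the neighbor items."""
    total = sum(similarities)
    if total == 0:
        raise ValueError("similarity weights must not sum to zero")
    weighted = sum(s * r for s, r in zip(similarities, neighbor_ratings))
    return weighted / total

# s13 = 0.2, s16 = 0.3; user 5 rated movie 3 with 2 and movie 6 with 3.
estimate = knn_predict([0.2, 0.3], [2, 3])
print(f"Estimated rating of movie 1 by user 5: {estimate:.1f}")
```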
30
Properties of k-NN
  • Intuitive
  • No substantial preprocessing is required
  • Easy to explain reasoning behind a recommendation
  • Accurate?

31
k-NN on the RMSE scale
Global average: 1.1296
User average: 1.0651
Movie average: 1.0533
Cinematch: 0.9514
k-NN: 0.96 to 0.91
BellKor: 0.8693
Grand Prize: 0.8563
Inherent noise: ????
32
k-NN - Common practice
  1. Define a similarity measure between items: s_ij
  2. Select neighbors -- N(i;u): items most similar
    to i that were rated by u
  3. Estimate unknown rating, r_ui, as the weighted
    average

[Formula not transcribed; one of its terms is labeled "baseline estimate for r_ui"]
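
The slide's formula did not survive in the transcript. A commonly used form of this weighted average, consistent with the "baseline estimate" label, is given below as an assumption rather than a quotation; the symbol b_ui for the baseline estimate is notation introduced here.

```latex
\hat{r}_{ui} \;=\; b_{ui} \;+\;
\frac{\sum_{j \in N(i;u)} s_{ij}\,\bigl(r_{uj} - b_{uj}\bigr)}
     {\sum_{j \in N(i;u)} s_{ij}}
```

With all baselines set to zero this reduces to a plain similarity-weighted average of the user's ratings for the neighboring items.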
33
k-NN - Common practice
  1. Define a similarity measure between items: s_ij
  2. Select neighbors -- N(i;u): items similar to i,
    rated by u
  3. Estimate unknown rating, r_ui, as the weighted
    average

Problems
  1. Similarity measures are arbitrary; no fundamental
    justification
  2. Pairwise similarities isolate each neighbor and
    neglect interdependencies among neighbors
  3. Taking a weighted average is restricting, e.g.,
    when neighborhood information is limited

34
Interpolation weights
  • Use a weighted sum rather than a weighted average

(We allow the weights to sum to a value other than 1)
  • Model relationships between item i and its
    neighbors
  • Can be learnt through a least squares problem
    from all other users that rated i
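
The least squares idea can be sketched on a toy example. This is an illustration under strong assumptions, not the BellKor method: every other user is assumed to have rated all neighbors (no missing values), ratings are given as offsets from a baseline, and the function name, the `ridge` parameter and the data are hypothetical.

```python
import numpy as np

def fit_interpolation_weights(neighbor_residuals, target_residuals, ridge=0.0):
    """Solve min_w ||A w - b||^2 + ridge * ||w||^2 via the normal equations.

    A holds one row per other user and one column per neighbor item;
    b holds the same users' (baseline-adjusted) ratings of item i.
    """
    A = np.asarray(neighbor_residuals, dtype=float)
    b = np.asarray(target_residuals, dtype=float)
    gram = A.T @ A + ridge * np.eye(A.shape[1])
    return np.linalg.solve(gram, A.T @ b)

# Four hypothetical users, two neighbor items; targets built from weights (0.5, 0.8).
neighbors = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 1.0]])
target = neighbors @ np.array([0.5, 0.8])

weights = fit_interpolation_weights(neighbors, target)
prediction = float(np.array([1.0, -0.5]) @ weights)  # weighted sum for a new user

print("weights:", weights, "sum:", weights.sum())
print("predicted offset from baseline:", prediction)
```

The recovered weights sum to 1.3, which a weighted average could not express. A positive `ridge` value is one simple guard against overfitting when few users rated both items.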

35
Interpolation weights
  • Interpolation weights are derived based on their
    role; no use of an arbitrary similarity measure
  • Explicitly account for interrelationships among
    the neighbors
  • Challenges
      • Deal with missing values
      • Avoid overfitting
      • Efficient implementation

36
Latent factor models
[Figure: a two-dimensional map. One axis runs from serious to escapist, the other from geared towards males to geared towards females.]
Labels on the map: Braveheart, The Color Purple, Amadeus, Lethal Weapon, Sense and Sensibility, Ocean's 11, Dave, The Lion King, Dumb and Dumber, The Princess Diaries, Independence Day, Gus
37
Latent factor models
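[Figure: the ratings matrix is approximated by the product of an item-factor matrix and a user-factor matrix]

Ratings (rows: items 1-6, columns: users 1-12, "." = unknown rating)
 1  .  3  .  .  5  .  .  5  .  4  .
 .  .  5  4  .  .  4  .  .  2  1  3
 2  4  .  1  2  .  3  .  4  3  5  .
 .  2  4  .  5  .  .  4  .  .  2  .
 .  .  4  3  4  2  .  .  .  .  2  5
 1  .  3  .  3  .  .  2  .  .  4  .

Item factors (6 items x 3 factors)
  .1  -.4   .2
 -.5   .6   .5
 -.2   .3   .5
 1.1  2.1   .3
 -.7  2.1  -2
 -1    .7   .3

User factors (3 factors x 12 users)
 1.1  -.2   .3   .5  -2   -.5   .8  -.4   .3  1.4  2.4  -.9
 -.8   .7   .5  1.4   .3  -1   1.4  2.9  -.7  1.2  -.1  1.3
 2.1  -.4   .6  1.7  2.4   .9  -.3   .4   .8   .7  -.6   .1

A rank-3 SVD approximation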
38
Estimate unknown ratings as inner-products of
factors
[Figure: the ratings matrix and rank-3 factor matrices of the previous slide, with one unknown rating marked "?"]
A rank-3 SVD approximation
39
Estimate unknown ratings as inner-products of
factors
[Figure: the ratings matrix and rank-3 factor matrices, with one unknown rating marked "?"]
A rank-3 SVD approximation
40
Estimate unknown ratings as inner-products of
factors
[Figure: the ratings matrix and rank-3 factor matrices, with the unknown rating estimated as 2.4]
A rank-3 SVD approximation
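
The 2.4 estimate can be reproduced as an inner product of one item-factor row and one user-factor column. The sketch assumes the unknown entry is the rating of item 2 by user 5, and reads the factor values in left-to-right user order (the transcript lists them mirrored). Both are assumptions, chosen because they reproduce the slide's number.

```python
import numpy as np

# Assumed factors: second row of the item-factor matrix and
# fifth column of the user-factor matrix from the slide.
item_factors = np.array([-0.5, 0.6, 0.5])
user_factors = np.array([-2.0, 0.3, 2.4])

estimate = float(item_factors @ user_factors)
print(f"inner product = {estimate:.2f}, shown on the slide as {estimate:.1f}")
```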
41
Latent factor models
[Figure: the ratings matrix approximated by the product of the item-factor and user-factor matrices shown on the earlier slides]
  • Properties
      • SVD isn't defined when entries are unknown => use
        specialized methods
      • Very powerful model => can easily overfit,
        sensitive to regularization
      • Probably the most popular model among contestants
          • 12/11/2006: Simon Funk describes an SVD-based
            method
          • 12/29/2006: Free implementation at
            timelydevelopment.com

42
Factorization on the RMSE scale
Global average: 1.1296
User average: 1.0651
Movie average: 1.0533
Cinematch: 0.9514
Factorization: 0.93 to 0.89
BellKor: 0.8693
Grand Prize: 0.8563
Inherent noise: ????
43
Our approach
  • User factors: model a user u as a vector p_u ~
    N_k(μ, Σ)
  • Movie factors: model a movie i as a vector q_i ~
    N_k(γ, Λ)
  • Ratings: measure agreement between u and i: r_ui ~
    N(p_u^T q_i, ε^2)
  • Maximize the model's likelihood
  • Alternate between recomputing user factors,
    movie factors and model parameters
  • Special cases
      • Alternating ridge regression
      • Nonnegative matrix factorization
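
The "alternating ridge regression" special case can be sketched as follows. This is a generic textbook version under assumed settings, not the BellKor implementation: the function name, the regularization value `lam`, the iteration count and the toy rank-1 data are all hypothetical.

```python
import numpy as np

def alternating_ridge(R, mask, k=2, lam=0.1, iters=20, seed=0):
    """Fit R ~ P @ Q.T on observed entries by alternating ridge regressions."""
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    P = rng.normal(scale=0.1, size=(n_users, k))
    Q = rng.normal(scale=0.1, size=(n_items, k))
    reg = lam * np.eye(k)
    for _ in range(iters):
        # Fix movie factors, solve a small ridge problem per user.
        for u in range(n_users):
            rated = mask[u]
            if rated.any():
                Qu = Q[rated]
                P[u] = np.linalg.solve(Qu.T @ Qu + reg, Qu.T @ R[u, rated])
        # Fix user factors, solve a small ridge problem per movie.
        for i in range(n_items):
            rated = mask[:, i]
            if rated.any():
                Pi = P[rated]
                Q[i] = np.linalg.solve(Pi.T @ Pi + reg, Pi.T @ R[rated, i])
    return P, Q

# Toy rank-1 "ratings" with two entries hidden from training.
R = np.outer([1.0, 2.0, 3.0, 4.0], [1.0, 0.5, 2.0])
mask = np.ones_like(R, dtype=bool)
mask[0, 2] = mask[3, 1] = False

P, Q = alternating_ridge(R, mask, k=1, lam=0.01)
fitted = P @ Q.T
train_rmse = float(np.sqrt(np.mean((fitted[mask] - R[mask]) ** 2)))
print("training RMSE:", train_rmse)
print("estimates for the hidden entries:", fitted[0, 2], fitted[3, 1])
```

Each inner step only involves the entries a user or movie actually has, which is how the method sidesteps the problem that SVD is undefined for unknown entries.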

44
Combining multi-scale views
  • Residual fitting
  • Weighted average
  • A unified model

[Figure labels: global effects; factorization (regional effects); k-NN (local effects); the unified model joins factorization and k-NN]
45
Localized factorization model
  • Standard factorization: user u is a linear
    function parameterized by p_u: r_ui = p_u^T q_i
  • Allow user factors p_u to depend on the item
    being predicted: r_ui = p_u(i)^T q_i
  • Vector p_u(i) models behavior of u on items like i

46
Results on Netflix Probe set
More accurate
47
(No Transcript)
48
Seek alternative perspectives of the data
  • Can exploit movie titles and release year
  • But the movie side is pretty much covered anyway...
  • It's about the users!
  • Turning to the third dimension...

[Figure: axes labeled global, local and quality; the third axis is marked ???]
49
The third dimension of the BellKor system
  • A powerful source of information: characterize
    users by which movies they rated, rather than how
    they rated
  • => A binary representation of the data

Ratings (rows: movies 1-6, columns: users 1-12, "." = no rating)
 1  .  3  .  .  5  .  .  5  .  4  .
 .  .  5  4  .  .  4  .  .  2  1  3
 2  4  .  1  2  .  3  .  4  3  5  .
 .  2  4  .  5  .  .  4  .  .  2  .
 .  .  4  3  4  2  .  .  .  .  2  5
 1  .  3  .  3  .  .  2  .  .  4  .

Binary representation (1 = rated, 0 = not rated)
 1  0  1  0  0  1  0  0  1  0  1  0
 0  0  1  1  0  0  1  0  0  1  1  1
 1  1  0  1  1  0  1  0  1  1  1  0
 0  1  1  0  1  0  0  1  0  0  1  0
 0  0  1  1  1  1  0  0  0  0  1  1
 1  0  1  0  1  0  0  1  0  0  1  0
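
Deriving the binary view from a ratings matrix is a one-line operation. The sketch below uses the first two movies of the slide's example, read with users in order 1 to 12 (an assumption about the mirrored transcript), and represents unknown ratings as NaN, which is a choice made here for illustration.

```python
import numpy as np

nan = np.nan
# Ratings of movies 1 and 2 by users 1..12; NaN marks "not rated".
ratings = np.array([
    [1, nan, 3, nan, nan, 5, nan, nan, 5, nan, 4, nan],
    [nan, nan, 5, 4, nan, nan, 4, nan, nan, 2, 1, 3],
])

# 1 where a rating exists, 0 where it does not; the rating values are discarded.
binary = (~np.isnan(ratings)).astype(int)
print(binary)
```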
50
The third dimension of the BellKor system
  • Great news for recommender systems
      • Works even better on real-life datasets (?)
      • Improve accuracy by exploiting implicit feedback
      • Implicit behavior is abundant and easy to
        collect
          • Rental history
          • Search patterns
          • Browsing history
          • ...
      • Implicit feedback allows predicting personalized
        ratings for users that never rated!

movie 17270
51
The three dimensions of the BellKor system
  • Where do you want to be?
      • All over the global-local axis
      • Relatively high on the quality axis
      • Can we go in between the explicit-implicit axis?
      • Yes! Relevant methods
          • Conditional RBMs (ML@UToronto)
          • NSVD (Arek Paterek)
          • Asymmetric factor models

[Figure: three axes: global to local, quality, and implicit (binary) to explicit (ratings)]
52
Lessons
  • What it takes to win
      • Think deeper: design better algorithms
      • Think broader: use an ensemble of multiple
        predictors
      • Think different: model the data from different
        perspectives
  • At the personal level
      • Have fun with the data
      • Work hard, long breath
      • Good teammates
  • Rapid progress of science
      • Availability of large, real-life data
      • Challenge, competition
      • Effective collaboration

movie 13043
53
[Figure: the example ratings matrix]
Yehuda Koren, AT&T Labs Research, yehuda@att.com
BellKor homepage: www.research.att.com/volinsky/netflix/