User Profiling: Collaborative Filtering

Provided by: mih95
Transcript and Presenter's Notes


1
User Profiling: Collaborative Filtering
  • Miha Grcar,
  • JSI
  • October 2004

2
User Profiling
  • User profiling: constructing a model of a user
  • Goal: build the profile without any effort from
    the user
  • The profile is machine-readable
  • The profile reflects the user's interests
  • Two profiles can be compared to determine the
    degree of similarity in interests of two users
  • Long-term, short-term interests

3
Collaborative Filtering
  • Goal: recommend items of interest to the active
    user / try to predict the user's rating of an
    item
  • Idea: consult other users (explore their
    profiles)
  • A simple example: we want to predict how much we
    will like the movie A Beautiful Mind
  • Naïve approach: look at its average rating at
    IMDb.com (already a collaborative approach)
  • Better approach: ask a friend who is considered
    to have a similar taste (true collaborative
    filtering)

4
The k Nearest Neighbors Approach
  • Define a similarity measure in order to be able
    to compare two user profiles
  • For the active user, find the k most similar
    users that have rated the item in question
  • Predict the user's rating of the item by
    calculating the weighted average of the ratings
    given to the item by those users (the weights
    are usually the similarity values)
  • Two fundamental problems:
    • Sparsity of the data
    • Scalability problem
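
The three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the nested-dict rating layout, the function names, and the toy `agreement` similarity (a stand-in for whichever similarity measure is chosen) are hypothetical and do not come from the slides.

```python
def agreement(a, b):
    """Toy similarity (an assumption): 1 / (1 + mean absolute rating difference)."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    mad = sum(abs(a[i] - b[i]) for i in common) / len(common)
    return 1.0 / (1.0 + mad)


def predict_rating(active, item, ratings, similarity, k=2):
    """Predict `active`'s rating of `item` from the k most similar raters.

    ratings: dict user -> dict item -> rating (hypothetical layout).
    """
    # Only users who rated the item in question can serve as neighbors.
    candidates = [
        (similarity(ratings[active], ratings[u]), u)
        for u in ratings
        if u != active and item in ratings[u]
    ]
    # Keep the k most similar users that have a positive weight.
    neighbors = sorted((c for c in candidates if c[0] > 0), reverse=True)[:k]
    total_weight = sum(w for w, _ in neighbors)
    if total_weight == 0:
        return None  # no usable neighbor, so no prediction
    # Weighted average of the neighbors' ratings; weights are the similarities.
    return sum(w * ratings[u][item] for w, u in neighbors) / total_weight


ratings = {
    "ann": {"m1": 5, "m2": 4},
    "bob": {"m1": 5, "m2": 4, "m3": 5},
    "cat": {"m1": 4, "m2": 4, "m3": 4},
    "dan": {"m1": 1, "m2": 1, "m3": 1},
}
prediction = predict_rating("ann", "m3", ratings, agreement, k=2)
```

With k = 2 the neighbors are bob and cat, so dan's low rating does not influence the prediction. Scanning all users for every prediction, as done here, is what makes scalability a problem on large data sets.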

5
Similarity Measures
  • Cosine similarity
    • Used in information retrieval
  • Pearson correlation coefficient
    • Used in statistics to evaluate the degree of
      linear relationship between two variables
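
Both measures can be written directly from their textbook definitions. The sketch below assumes that a user profile is a dict mapping item ids to ratings, that cosine similarity treats missing ratings as 0, and that Pearson correlation is computed only over items both users rated; these conventions and the function names are assumptions, not something the slides specify.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine of the angle between two rating vectors (missing ratings = 0)."""
    common = set(a) & set(b)
    dot = sum(a[i] * b[i] for i in common)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def pearson_correlation(a, b):
    """Pearson correlation over the items both users rated."""
    common = sorted(set(a) & set(b))
    n = len(common)
    if n < 2:
        return 0.0
    mean_a = sum(a[i] for i in common) / n
    mean_b = sum(b[i] for i in common) / n
    cov = sum((a[i] - mean_a) * (b[i] - mean_b) for i in common)
    var_a = sum((a[i] - mean_a) ** 2 for i in common)
    var_b = sum((b[i] - mean_b) ** 2 for i in common)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant rater has no defined correlation
    return cov / sqrt(var_a * var_b)


alice = {"m1": 5, "m2": 3, "m3": 4}
bob = {"m1": 4, "m2": 2, "m3": 3}    # same taste, rates one point lower
carol = {"m1": 1, "m2": 5, "m3": 2}  # roughly opposite taste
```

Pearson correlation ignores the offset between alice and bob and reports a perfect linear relationship, whereas cosine similarity stays high even for alice and carol because all ratings are positive.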

6
Weights
  • Weight is proportional to the similarity value
  • Confidence in computed weights:
    • many overlapping values → high
    • few overlapping values → low
  • Weight amplification: reward weights closer to 1
    and punish those closer to 0
  • Inverse user frequency: universally liked items
    are less relevant for predictions
    • $\log(n / n_j)$, where $n$ is the number of
      users and $n_j$ the number of users who rated
      item $j$
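
The three adjustments can each be expressed as a one-line function. The sketch below uses commonly seen forms; the overlap threshold of 50 and the exponent 2.5 are illustrative assumptions, and the function names are hypothetical.

```python
from math import log


def significance_weight(weight, n_common, threshold=50):
    """Scale a weight down when it is based on few overlapping ratings."""
    return weight * min(n_common, threshold) / threshold


def amplify(weight, rho=2.5):
    """Amplification: keeps the sign, shrinks weights near 0 far more than near 1."""
    return weight * abs(weight) ** (rho - 1)


def inverse_user_frequency(n_users, n_raters):
    """log(n / n_j): items rated by almost everyone get a factor near 0."""
    return log(n_users / n_raters)


low_confidence = significance_weight(0.8, n_common=5)     # few overlaps
high_confidence = significance_weight(0.8, n_common=100)  # many overlaps
```

For example, `amplify(0.9)` keeps about 85% of the original weight while `amplify(0.1)` keeps only about 3%, so strong neighbors dominate the weighted average.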

7
The Sparsity Problem
  • Many missing values in feature vectors
  • Solving the sparsity problem:
    • Default rating
    • Using content
    • Dimensionality reduction techniques (these
      also counter the scalability issue!)
      • Simple approaches: remove users with low
        support, random sampling
      • More advanced: Latent Semantic Analysis
        (LSA), Probabilistic LSA (pLSA)
8
Latent Semantic Analysis (LSA)
  • Based on Singular Value Decomposition (SVD)
  • $M = U \Sigma V^T \;\rightarrow\; M' = U \Sigma' V^T$
    • $M$: the user-item matrix
    • $\Sigma$: the diagonal matrix of the singular
      values of $M$
    • $U$, $V$: two unitary matrices
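
The decomposition and its reduced form can be tried out with NumPy. This sketch assumes that the reduced diagonal matrix is obtained by keeping the k largest singular values and zeroing the rest, which is the usual way LSA is applied; the matrix values and the function name are made up for illustration.

```python
import numpy as np


def low_rank_approximation(M, k):
    """Return U Sigma' V^T, where Sigma' keeps only the k largest singular values."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    s_reduced = s.copy()
    s_reduced[k:] = 0.0  # singular values are sorted in descending order
    return U @ np.diag(s_reduced) @ Vt


# Toy user-item matrix: rows are users, columns are items.
M = np.array([
    [5.0, 4.0, 1.0, 1.0],
    [4.0, 5.0, 1.0, 2.0],
    [1.0, 1.0, 5.0, 4.0],
    [2.0, 1.0, 4.0, 5.0],
])
M_reduced = low_rank_approximation(M, k=2)
```

Here the two retained dimensions roughly correspond to the two groups of users with opposite tastes, and `M_reduced` is a dense, smoothed version of `M`.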

9
Probabilistic Latent Semantic Analysis (pLSA)
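  • Based on LSA (SVD)
  • Observation: pair $(u, i)$
  • $P(u, i) = \sum_z P(z) P(u \mid z) P(i \mid z)$,
    where $z$ is a latent variable
  • $P(u, i) \rightarrow P(i \mid u)$
  • Incorporating a rating value:
    • $P(u, i, r) \rightarrow P(i, r \mid u)$
  • Relation to LSA (SVD):
    • $M = U \Sigma V^T$, $u_{i,k} = P(u_i \mid z_k)$,
      $\sigma_{k,k} = P(z_k)$, $v_{j,k} = P(i_j \mid z_k)$
    • $M = U \Sigma V^T \;\rightarrow\; M' = U \Sigma' V^T$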
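
The relation to the SVD form can be checked numerically: if the columns of U and V hold the conditional probabilities and the diagonal holds P(z), the matrix product is the joint distribution P(u, i). The sketch below uses randomly generated probabilities purely for illustration; it does not fit a pLSA model (that would normally be done with EM), and all names and sizes are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, n_latent = 4, 5, 2


def random_conditional(rows, cols):
    """Random matrix whose columns each sum to 1, read as P(row | column)."""
    m = rng.random((rows, cols))
    return m / m.sum(axis=0, keepdims=True)


U = random_conditional(n_users, n_latent)  # U[i, k] = P(u_i | z_k)
V = random_conditional(n_items, n_latent)  # V[j, k] = P(i_j | z_k)
p_z = np.array([0.3, 0.7])                 # diagonal entries P(z_k)

# Joint distribution: P(u, i) = sum_z P(z) P(u | z) P(i | z) = U Sigma V^T
P_ui = U @ np.diag(p_z) @ V.T

# Conditional used for recommendation: P(i | u) = P(u, i) / P(u)
P_i_given_u = P_ui / P_ui.sum(axis=1, keepdims=True)
```

Unlike the SVD factors, these factors are non-negative and normalized, so every entry of the product can be read as a probability.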

10
CF as a Classification Task
  • Classes: different rating scores
  • For each user we train a separate classifier
  • Training set: feature vectors representing items
    the user already rated; the user's rating = given
    class label
  • Test set: feature vectors representing items the
    user did not rate yet; classification =
    prediction of the user's rating of the item
  • A problem with the sparsity of the data
  • Possible solution: representing one user with
    many instances
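
The setup above can be sketched for a single user. Two choices here are assumptions rather than something the slides prescribe: each item is represented by the ratings the other users gave it (0 when missing), and a 1-nearest-neighbor rule stands in for whatever classifier is actually trained. All names are hypothetical.

```python
def item_vector(item, ratings, other_users, default=0):
    """Feature vector of an item: the ratings given to it by the other users."""
    return [ratings[u].get(item, default) for u in other_users]


def train_and_predict(active, ratings, items):
    """Build the active user's training set and classify the unrated items."""
    others = sorted(u for u in ratings if u != active)
    rated = [i for i in items if i in ratings[active]]
    unrated = [i for i in items if i not in ratings[active]]
    # Training set: (feature vector, class label = the active user's rating).
    train = [(item_vector(i, ratings, others), ratings[active][i]) for i in rated]
    if not train:
        return {}
    predictions = {}
    for i in unrated:
        x = item_vector(i, ratings, others)
        # 1-NN: copy the label of the closest training instance.
        _, label = min(
            train,
            key=lambda t: sum((a - b) ** 2 for a, b in zip(t[0], x)),
        )
        predictions[i] = label
    return predictions


ratings = {
    "ann": {"m1": 5, "m2": 1},
    "bob": {"m1": 5, "m2": 1, "m3": 5, "m4": 1},
    "cat": {"m1": 4, "m2": 2, "m3": 4, "m4": 2},
}
predictions = train_and_predict("ann", ratings, ["m1", "m2", "m3", "m4"])
```

The sparsity problem shows up directly: ann has only two training instances, and most entries of a realistic item vector would be the default value.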

11
Item-based CF
  • First compute item-to-item similarities
  • To predict user u's rating of item i, compute a
    weighted average of u's ratings of the k items
    most similar to item i that u already rated
  • User-user similarity measures can be applied to
    compute item-item similarities
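
A minimal item-based sketch follows. It assumes cosine similarity between items, computed over the users who rated both, and it computes similarities on demand for brevity, whereas a real system would compute the item-to-item table first as the slide says. The data and function names are illustrative only.

```python
from math import sqrt


def item_similarity(i, j, ratings):
    """Cosine similarity between items i and j over users who rated both."""
    users = [u for u in ratings if i in ratings[u] and j in ratings[u]]
    if not users:
        return 0.0
    dot = sum(ratings[u][i] * ratings[u][j] for u in users)
    norm_i = sqrt(sum(ratings[u][i] ** 2 for u in users))
    norm_j = sqrt(sum(ratings[u][j] ** 2 for u in users))
    if norm_i == 0 or norm_j == 0:
        return 0.0
    return dot / (norm_i * norm_j)


def predict_item_based(user, item, ratings, k=2):
    """Weighted average of the user's ratings of the k items most similar to `item`."""
    scored = sorted(
        ((item_similarity(item, j, ratings), j) for j in ratings[user] if j != item),
        reverse=True,
    )[:k]
    total = sum(s for s, _ in scored)
    if total == 0:
        return None
    return sum(s * ratings[user][j] for s, j in scored) / total


ratings = {
    "ann": {"m1": 5, "m2": 1},
    "bob": {"m1": 5, "m2": 1, "m3": 5},
    "cat": {"m1": 4, "m2": 2, "m3": 4},
    "dan": {"m1": 1, "m2": 5, "m3": 1},
}
prediction = predict_item_based("ann", "m3", ratings, k=2)
```

Item m3 is rated exactly like m1 by everyone who rated both, so ann's rating of m1 receives the larger weight in the prediction.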

12
  • Thank you.