1
Filtering and Recommender Systems: Content-Based
and Collaborative
Some of the slides are based on Mooney's slides
2
Feature Selection vs. LSI
  • Both MI and LSI are dimensionality-reduction
    techniques
  • MI reduces dimensions by keeping a subset of the
    original dimensions
  • LSI instead looks at linear combinations of the
    original dimensions (Good: it can automatically
    capture sets of dimensions that are more
    predictive. Bad: the new features may not have
    any significance to the user)
  • MI does feature selection w.r.t. a classification
    task (MI is computed between a feature and a
    class)
  • LSI does dimensionality reduction independent of
    the classes (it just looks at data variance)...
  • ...whereas MI needs to increase variance across
    classes and reduce variance within a class
  • Doing this is called LDA (linear discriminant
    analysis)
  • LSI is a special case of LDA where each point
    defines its own class

Digression
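Continuing the digression, MI-based feature selection can be sketched in a few lines. The toy documents, labels, and the `mutual_information` helper below are illustrative assumptions, not part of the slides:

```python
import math
from collections import Counter

def mutual_information(docs, labels, feature):
    """I(F; C) between a binary 'feature appears in doc' variable and the class."""
    n = len(docs)
    joint = Counter((feature in doc, label) for doc, label in zip(docs, labels))
    pf = Counter(feature in doc for doc in docs)   # marginal over the feature
    pc = Counter(labels)                            # marginal over the class
    mi = 0.0
    for (f, c), cnt in joint.items():
        p_fc = cnt / n
        mi += p_fc * math.log2(p_fc / ((pf[f] / n) * (pc[c] / n)))
    return mi

docs = [{"goal", "match"}, {"goal", "team"}, {"election", "vote"}, {"vote", "poll"}]
labels = ["sports", "sports", "politics", "politics"]
# "goal" perfectly predicts the class (1 bit of MI); "match" appears in
# only one sports doc, so it scores lower and would be dropped first.
```

Selecting the top-k features by this score is exactly "reducing dimensions by looking at a subset of the original dimensions"; LSI would instead build new axes as linear combinations of all the words.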
3
Personalization
  • Recommenders are instances of personalization
    software.
  • Personalization concerns adapting to the
    individual needs, interests, and preferences of
    each user.
  • Includes
  • Recommending
  • Filtering
  • Predicting (e.g. form or calendar appt.
    completion)
  • From a business perspective, it is viewed as part
    of Customer Relationship Management (CRM).

4
Feedback Prediction/Recommendation
  • Traditional IR has a single user, probably
    working in single-shot mode
  • Relevance feedback
  • Web search engines have users working continually
  • User profiling
  • A profile is a model of the user
  • (and also relevance feedback)
  • Many users
  • Collaborative filtering
  • Propagate user preferences to other users

You know this one
5
Recommender Systems in Use
  • Systems for recommending items (e.g. books,
    movies, CDs, web pages, newsgroup messages) to
    users based on examples of their preferences.
  • Many on-line stores provide recommendations (e.g.
    Amazon, CDNow).
  • Recommenders have been shown to substantially
    increase sales at on-line stores.

6
Feedback Detection
Non-Intrusive
  • Click certain pages in a certain order while
    ignoring most pages.
  • Read some clicked pages longer than other
    clicked pages.
  • Save/print certain clicked pages.
  • Follow some links in clicked pages to reach more
    pages.
  • Buy items / put them in wish-lists or shopping
    carts.
Intrusive
  • Explicitly ask users to rate items/pages.

7
Content-based vs. Collaborative Recommendation
8
Collaborative Filtering
The correlation analysis here is similar to the
association-clusters analysis!
9
Item-User Matrix
  • The input to the collaborative filtering
    algorithm is an m×n matrix whose rows are items
    and columns are users
  • Sort of like a term-document matrix (items are
    terms and documents are users)
  • Can think of items as vectors in the space of
    users (or users as vectors in the space of items)
  • Can do scalar clusters, etc.

10
Collaborative Filtering Method
  • Weight all users with respect to similarity with
    the active user.
  • Select a subset of the users (neighbors) to use
    as predictors.
  • Normalize ratings and compute a prediction from a
    weighted combination of the selected neighbors'
    ratings.
  • Present items with highest predicted ratings as
    recommendations.

11
Similarity Weighting
  • Typically use Pearson correlation coefficient
    between ratings for active user, a, and another
    user, u.

w_{a,u} = covar(r_a, r_u) / (σ_{r_a} · σ_{r_u})

r_a and r_u are the rating vectors over the m items
rated by both a and u; r_{i,j} is user i's rating
for item j
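In code, the Pearson weight over co-rated items might look like the following sketch (the dict-of-ratings representation and the movie names are illustrative assumptions):

```python
import math

def pearson(ratings_a, ratings_u):
    """Pearson correlation over the items rated by both users.

    ratings_a, ratings_u: dicts mapping item -> rating.
    """
    common = ratings_a.keys() & ratings_u.keys()
    m = len(common)
    if m < 2:
        return 0.0  # too few co-rated items to correlate
    mean_a = sum(ratings_a[i] for i in common) / m
    mean_u = sum(ratings_u[i] for i in common) / m
    cov = sum((ratings_a[i] - mean_a) * (ratings_u[i] - mean_u) for i in common)
    sd_a = math.sqrt(sum((ratings_a[i] - mean_a) ** 2 for i in common))
    sd_u = math.sqrt(sum((ratings_u[i] - mean_u) ** 2 for i in common))
    if sd_a == 0 or sd_u == 0:
        return 0.0  # a constant rater carries no correlation signal
    return cov / (sd_a * sd_u)

a = {"Alien": 5, "Brazil": 3, "Clue": 1}
u = {"Alien": 4, "Brazil": 2, "Clue": 0, "Dune": 5}
# a and u rank the three co-rated movies identically, so w_{a,u} = 1.0
```

Note the 1/m factors in the covariance and standard deviations cancel, so they are omitted here.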
12
Neighbor Selection
  • For a given active user, a, select correlated
    users to serve as source of predictions.
  • The standard approach is to use the n most
    similar users, u, based on similarity weights,
    w_{a,u}
  • Alternate approach is to include all users whose
    similarity weight is above a given threshold.

13
Rating Prediction
  • Predict a rating, p_{a,i}, for each item i for
    the active user, a, by using the n selected
    neighbor users, u ∈ {1, 2, …, n}.
  • To account for users' different rating levels,
    base predictions on differences from a user's
    average rating.
  • Weight users' rating contributions by their
    similarity to the active user:

p_{a,i} = r̄_a + Σ_{u=1..n} w_{a,u}(r_{u,i} − r̄_u) / Σ_{u=1..n} |w_{a,u}|

r_{i,j} is user i's rating for item j
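The mean-offset prediction described above can be sketched as follows (the neighbor-triple representation and example ratings are illustrative assumptions):

```python
def predict(active_mean, neighbors, item):
    """Mean-offset weighted prediction p_{a,i}.

    neighbors: list of (weight, user_mean, user_ratings) triples for the
    n selected neighbors; user_ratings maps item -> rating.
    """
    num = den = 0.0
    for w, u_mean, u_ratings in neighbors:
        if item in u_ratings:
            num += w * (u_ratings[item] - u_mean)  # deviation from that user's mean
            den += abs(w)
    if den == 0:
        return active_mean  # no neighbor rated the item: fall back to the mean
    return active_mean + num / den

neighbors = [
    (0.9, 3.0, {"Dune": 5}),  # very similar user who rated Dune above their mean
    (0.3, 4.0, {"Dune": 3}),  # weakly similar user, slightly below their mean
]
# p = 3.5 + (0.9·2 + 0.3·(−1)) / (0.9 + 0.3) = 3.5 + 1.25 = 4.75
```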
14
Covariance and Standard Deviation
  • Covariance:
    covar(r_a, r_u) = Σ_{i=1..m} (r_{a,i} − r̄_a)(r_{u,i} − r̄_u) / m
  • Standard deviation:
    σ_{r_a} = sqrt( Σ_{i=1..m} (r_{a,i} − r̄_a)² / m )

15
Significance Weighting
  • It is important not to trust correlations based
    on very few co-rated items.
  • Include significance weights, s_{a,u}, based on
    the number of co-rated items, m.
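The slide does not fix a particular formula for s_{a,u}; a common choice (due to Herlocker et al.) is to linearly devalue correlations built on fewer than 50 co-rated items, sketched here as an assumption:

```python
def significance_weight(m, cutoff=50):
    """s_{a,u} = min(m, cutoff) / cutoff.

    Correlations from few co-rated items (small m) get scaled down;
    beyond `cutoff` co-rated items the Pearson weight is trusted fully.
    The cutoff of 50 is a common choice, not stated on the slide.
    """
    return min(m, cutoff) / cutoff

# Applied on top of the similarity weight: w'_{a,u} = s_{a,u} * w_{a,u}
```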

16
Problems with Collaborative Filtering
  • Cold start: there need to be enough other users
    already in the system to find a match.
  • Sparsity: if there are many items to be
    recommended, then even with many users the
    user/ratings matrix is sparse, and it is hard to
    find users that have rated the same items.
  • First rater: cannot recommend an item that has
    not been previously rated.
  • New items
  • Esoteric items
  • Popularity bias: cannot recommend items to
    someone with unique tastes.
  • Tends to recommend popular items.
  • WHAT DO YOU MEAN YOU DON'T CARE FOR BRITNEY
    SPEARS, YOU DUNDERHEAD?

17
Content-Based Recommending
  • Recommendations are based on information about
    the content of items rather than on other users'
    opinions.
  • Uses machine learning algorithms to induce a
    profile of the user's preferences from examples,
    based on a featural description of content.
  • Lots of systems

18
Advantages of Content-Based Approach
  • No need for data on other users.
  • No cold-start or sparsity problems.
  • Able to recommend to users with unique tastes.
  • Able to recommend new and unpopular items
  • No first-rater problem.
  • Can provide explanations of recommended items by
    listing content features that caused an item to
    be recommended.
  • Well-known technology: the entire field of
    classification learning is at (y)our disposal!

19
Disadvantages of Content-Based Method
  • Requires content that can be encoded as
    meaningful features.
  • Users' tastes must be represented as a learnable
    function of these content features.
  • Unable to exploit quality judgments of other
    users.
  • Unless these are somehow included in the content
    features.

20
Movie Domain
  • EachMovie dataset (Compaq Research Labs)
  • Contains user ratings for movies on a 0–5 scale.
  • 72,916 users (avg. 39 ratings each).
  • 1,628 movies.
  • Sparse user-ratings matrix (2.6% full).
  • Crawled Internet Movie Database (IMDb)
  • Extracted content for titles in EachMovie.
  • Basic movie information
  • Title, Director, Cast, Genre, etc.
  • Popular opinions
  • User comments, Newspaper and Newsgroup reviews,
    etc.

21
Content-Boosted Collaborative Filtering
EachMovie
IMDb
22
Content-Boosted CF - I
23
Content-Boosted CF - II
User Ratings Matrix
Pseudo User Ratings Matrix
Content-Based Predictor
  • Compute the pseudo user-ratings matrix
  • The full matrix approximates the actual full
    user-ratings matrix
  • Perform CF
  • Using Pearson correlation between pseudo
    user-rating vectors
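Building the pseudo user-ratings matrix amounts to keeping real ratings where they exist and filling every gap with the content-based predictor's output. A minimal sketch, where the dict layout and the dummy predictor are illustrative assumptions:

```python
def pseudo_ratings(actual, all_items, content_predict):
    """Densify a sparse ratings matrix with content-based predictions.

    actual: user -> {item: rating} (only the items the user really rated)
    content_predict(user, item) -> predicted rating for an unrated item
    Returns a full user -> {item: rating} matrix.
    """
    dense = {}
    for user, rated in actual.items():
        dense[user] = {
            item: rated.get(item, content_predict(user, item))
            for item in all_items
        }
    return dense

actual = {"ann": {"Alien": 5}, "bob": {"Dune": 2}}
items = ["Alien", "Dune"]
filled = pseudo_ratings(actual, items, lambda u, i: 3)  # dummy content predictor
# Standard CF (Pearson over the now-dense vectors) runs on `filled`.
```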

24
Why can't the pseudo ratings be used to help
content-based filtering?
  • How about using the pseudo ratings to improve a
    content-based filter itself?
  • Learn a naïve Bayes classifier (NBC) C0 using the
    few items for which we have user ratings
  • Use C0 to predict the ratings for the rest of the
    items
  • Loop:
  • Learn a new classifier C1 using all the ratings
    (real and predicted)
  • Use C1 to (re-)predict the ratings for all the
    unknown items
  • Until no change in ratings
  • With a small change, this actually works in
    finding a better classifier!
  • The change: keep the class posterior prediction
    (rather than just the max class)
  • This is called expectation maximization (EM)
  • Very useful on the web, where you have tons of
    data but very little of it is labeled
  • Reminds you of k-means, doesn't it?
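The EM loop above can be sketched as follows. To keep it self-contained, a one-dimensional nearest-centroid classifier stands in for the slide's naïve Bayes, and the data is a toy assumption; the essential point (carrying soft posteriors between iterations rather than hard labels) is preserved:

```python
import math

def em_self_train(labeled, unlabeled, n_iter=20):
    """EM-style self-training with a 1-D centroid classifier (illustrative).

    labeled: list of (x, cls) with cls in {0, 1}; unlabeled: list of x.
    Returns the soft posteriors P(cls = 1 | x) for the unlabeled points.
    """
    post = [0.5] * len(unlabeled)  # initial E-step guess
    for _ in range(n_iter):
        # M-step: refit centroids from real labels plus the *soft*
        # posteriors (the "small change" that makes this EM, not hard
        # self-labeling).
        w0 = [1.0 - cls for _, cls in labeled] + [1.0 - p for p in post]
        w1 = [float(cls) for _, cls in labeled] + list(post)
        xs = [x for x, _ in labeled] + list(unlabeled)
        c0 = sum(w * x for w, x in zip(w0, xs)) / sum(w0)
        c1 = sum(w * x for w, x in zip(w1, xs)) / sum(w1)
        # E-step: re-predict soft labels from distance to each centroid
        post = [
            math.exp(-(x - c1) ** 2)
            / (math.exp(-(x - c0) ** 2) + math.exp(-(x - c1) ** 2))
            for x in unlabeled
        ]
    return post

labeled = [(0.0, 0), (10.0, 1)]   # the few items with real user ratings
unlabeled = [1.0, 2.0, 8.0, 9.0]  # items whose ratings get (re-)predicted
post = em_self_train(labeled, unlabeled)
```

The loop converges when the posteriors stop changing; the unlabeled points near 0 end up confidently in class 0, those near 10 in class 1.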

25
(No Transcript)
26
(boosted) content filtering
27
Co-Training Motivation
  • Learning methods need labeled data
  • Lots of ⟨x, f(x)⟩ pairs
  • Hard to get (who wants to label data?)
  • But unlabeled data is usually plentiful
  • Could we use this instead?

28
Co-training
You train meI train you
Small labeled data needed
  • Suppose each instance has two parts
  • x = (x1, x2)
  • x1, x2 conditionally independent given f(x)
  • Suppose each half can be used to classify the
    instance
  • ∃ f1, f2 such that f1(x1) = f2(x2) = f(x)
  • Suppose f1, f2 are learnable
  • f1 ∈ H1, f2 ∈ H2, ∃ learning algorithms A1, A2

[Diagram: A1 labels the unlabeled instances (x1, x2),
producing labeled instances ⟨x1, x2, f1(x1)⟩, from
which A2 learns the hypothesis f2.]
29
Observations
  • Can apply A1 to generate as much training data as
    one wants
  • If x1 is conditionally independent of x2 given
    f(x),
  • then the errors in the labels produced by A1
  • will look like random noise to A2!!!
  • Thus there is no limit to the quality of the
    hypothesis A2 can make
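A minimal co-training loop under these assumptions can be sketched as follows. Simple midpoint-threshold learners on scalar views stand in for real classifiers, and the toy data is an assumption; the "you train me, I train you" exchange is the point:

```python
def train_threshold(pairs):
    """Learn a midpoint-threshold classifier from (value, label) pairs."""
    m0 = [v for v, y in pairs if y == 0]
    m1 = [v for v, y in pairs if y == 1]
    t = (sum(m0) / len(m0) + sum(m1) / len(m1)) / 2
    return lambda v: int(v > t)

def co_train(labeled, unlabeled, rounds=3):
    """labeled: list of ((x1, x2), y); unlabeled: list of (x1, x2).

    Each round, the view-1 learner labels the unlabeled pool for the
    view-2 learner, and vice versa.
    """
    l1 = [(x1, y) for (x1, _), y in labeled]
    l2 = [(x2, y) for (_, x2), y in labeled]
    for _ in range(rounds):
        f1 = train_threshold(l1)
        f2 = train_threshold(l2)
        # each classifier hands its labels to the *other* view's training set
        l2 = [(x2, y) for (_, x2), y in labeled] + \
             [(x2, f1(x1)) for x1, x2 in unlabeled]
        l1 = [(x1, y) for (x1, _), y in labeled] + \
             [(x1, f2(x2)) for x1, x2 in unlabeled]
    return train_threshold(l1), train_threshold(l2)

labeled = [((0.0, 0.1), 0), ((1.0, 0.9), 1)]          # tiny labeled seed
unlabeled = [(0.2, 0.1), (0.1, 0.3), (0.9, 0.8), (0.8, 0.7)]
f1, f2 = co_train(labeled, unlabeled)
```

With only two labeled instances, both view-classifiers end up separating the pool correctly, because each view's labeling errors are uninformative noise to the other view.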

30
It really works!
  • Learning to classify web pages as course pages
  • x1 = bag of words on a page
  • x2 = bag of words from all anchors pointing to a
    page
  • Naïve Bayes classifiers
  • 12 labeled pages
  • 1,039 unlabeled

31
(No Transcript)
32
Focussed Crawling
  • Cho paper
  • Looks at heuristics for managing the URL queue
  • Aim 1: completeness
  • Aim 2: just topic pages
  • Prioritize if the word is in the anchor / URL
  • Heuristics:
  • PageRank
  • Backlinks

33
Modified Algorithm
  • A page is hot if
  • it contains the keyword in the title, or
  • it contains at least 10 instances of the keyword
    in the body, or
  • distance(page, hot-page) < 3
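The hotness test is a simple disjunction; a sketch in code, where the `page` dict shape and field names are assumptions for illustration:

```python
def is_hot(page, keyword, dist_to_hot):
    """The slide's 'hot page' test: keyword in the title, OR at least 10
    occurrences in the body, OR within link distance 3 of a known hot page.
    `page` is assumed to be a dict with 'title' and 'body' strings.
    """
    return (
        keyword in page["title"].lower()
        or page["body"].lower().count(keyword) >= 10
        or dist_to_hot < 3
    )

page = {"title": "Reinforcement Learning", "body": "..."}
```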

34
Results
35
More Results
36
Conclusions
  • Recommending and personalization are important
    approaches to combating information overload.
  • Machine Learning is an important part of systems
    for these tasks.
  • Collaborative filtering has problems.
  • Content-based methods address these problems (but
    have problems of their own).
  • Integrating both is best.