1
Survey of Recommendation Systems and Algorithms
Yuan Qu Xiaoyun Yang Tianping Huang
2
WHY?

The amount of available information is growing
steadily, which calls for automated methods that locate
and retrieve information according to users'
individual preferences.
The number of users accessing the Internet is
also growing, which opens up new possibilities for
organizing and recommending information.

3
OBJECTIVES

Survey the recommendation systems
available on the web
Introduce the algorithms used by well-known
recommendation systems

4
INTRODUCTION

The techniques used in today's recommendation
systems fall into two categories:
Content-based filtering
uses the actual content features of products.
Collaborative filtering
predicts the active user's preferences from other
users' ratings, assuming that like-minded people
tend to make similar choices.

5
Recommendation systems
6
Firefly

Recommends items based on similarities between the
interest profile of the active user and those of other users.
Initially used for music and movies, later
extended to other media.
Characteristics
system maintains a user profile
word-of-mouth recommendations
vector matching based on a simple rating scale

7
Tapestry

Uses annotations for recommendation
Characteristics
first collaborative filtering system
free-text annotations and explicit "like it" or
"hate it" annotations
depends on many people reading and voting
hard to use for exploring new areas

8
GroupLens

People who agreed in their subjective evaluations
of past articles are likely to agree again in the
future. Recommendations are provided according to
this similarity.
Characteristics
openness
easy to use: explicit votes on a 1-5 scale
scalability

9
Lotus Notes

Within a group, users have similar goals and
information interests. The system provides a
feature that lets people annotate documents and send
them to others.
Characteristics
closed system of similar users
annotation of documents
uses an agent to represent an individual, to protect
privacy

10
PHOAKS (People Helping One Another Know Stuff)

Automatically recognizes web resource
references in newsgroup messages, attempts to
classify them, and then introduces them to
other users.
Characteristics
scans newsgroup messages for occurrences of URLs
and surfaces the most important ones to users
uses implicit feedback
role specialization
reuse: recommendations are taken from existing
online conversations

11
Pointer

Acts as a mediator for distributing information. A pointer
consists of a URL link, contextual information, and
optional comments by the sender.
Characteristics
packages contextual information with hypertext links
easy to use (easy to add annotations)
not anonymous

12
Mosaic

Idea: let users publish and distribute notes as
comments added to web pages
Characteristics
users could publish or distribute bookmarks
comments added to web pages

13
WebWatcher

The server acts as a tour guide for the web. It
provides interactive communication between server
and users, and provides recommendations.
Characteristics
uses previous tours to calculate the similarity of
users
reinforcement learning
differs from a keyword-based search engine

14
GAB

Idea: collects and merges the bookmark/hotlist files
of participating users and then serves these
files to users
Characteristics
ability to obtain users' bookmarks
multi-tree data structure
sibling relation to avoid getting lost in hyperspace
cousin relation to avoid sparse connectivity in a
merged subject-tree database
monitors changes in content

15
Yahoo!

Idea: a manual approach in which people use tools to update
the index as quickly as possible. Every site submitted to
Yahoo! is examined by an expert.
Characteristics
expert classifiers
user contributions: end users also suggest how an
article should be classified

16
Fab

Idea: combines two filters to overcome
problems such as cold start and changing
user interests
Characteristics
updates the user's profile based on content-based
filtering
uses a 7-point scale for ranking
uses a series of agents to collect web pages

17
Bayesian-Mixed

Incorporates components of both content-based
filtering and collaborative filtering
Characteristics
uses all of the available data
addresses the cold-start problem

18
Mark-combination of two filters

Idea: combines the two filters (content-based and
collaborative) as a weighted average
Characteristics
fully realizes the advantages of both filters
weights are determined on a per-user basis
weights are determined on a per-item basis
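
A minimal sketch of the weighted-average idea, not the original system's code. The function name `combine`, the convention that `weight` is the share given to the content-based score, and the example per-user weights are all assumptions made for illustration.

```python
def combine(content_score, collaborative_score, weight):
    """Weighted average of two predictions.

    weight is the share given to the content-based prediction (assumed
    convention); the collaborative prediction receives 1 - weight.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * content_score + (1.0 - weight) * collaborative_score


# Hypothetical per-user weights; a per-item table would look the same.
user_weights = {"alice": 0.25, "bob": 0.75}

def predict_for(user, content_score, collaborative_score):
    # Unknown users fall back to an even split (an arbitrary choice).
    return combine(content_score, collaborative_score,
                   user_weights.get(user, 0.5))
```

How the weights are learned is left open here; one plausible choice is to shift weight toward whichever filter has predicted that user's past ratings more accurately.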

19
Trend

cold-start problem
easy for users to participate or vote
algorithm
privacy

20
Algorithms on Collaborative Filtering
Example of a user rating database
21
Collaborative filtering Algorithms

Breese et al. classified CF algorithms into two
general classes.
Memory-based methods
Predictions are calculated over the entire
database of user ratings.
Model-based methods
An underlying model of user preferences is first
constructed, from which predictions are inferred.

22
Memory-based Methods

The prediction of the active user's rating is a weighted
sum of the ratings of other users.
Weights are calculated based on:

Mean Squared Differences

Correlation (Pearson Correlation)

Vector Similarity
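
As an illustration of the weighted-sum prediction with Pearson correlation weights, here is a small sketch in plain Python. The toy `ratings` table and the function names are hypothetical, and computing the means inside `pearson` over co-rated items only is one common variant, assumed here rather than taken from the slides.

```python
from math import sqrt

# Hypothetical toy data: user -> {item: rating on a 1-5 scale}.
ratings = {
    "alice": {"A": 5, "B": 3, "C": 4},
    "bob":   {"A": 4, "B": 2, "C": 3, "D": 4},
    "carol": {"A": 1, "B": 5, "C": 2, "D": 1},
}

def mean_rating(user):
    votes = ratings[user].values()
    return sum(votes) / len(votes)

def pearson(a, b):
    """Pearson correlation over the items both users have rated."""
    common = set(ratings[a]) & set(ratings[b])
    if len(common) < 2:
        return 0.0
    mean_a = sum(ratings[a][i] for i in common) / len(common)
    mean_b = sum(ratings[b][i] for i in common) / len(common)
    num = sum((ratings[a][i] - mean_a) * (ratings[b][i] - mean_b)
              for i in common)
    var_a = sum((ratings[a][i] - mean_a) ** 2 for i in common)
    var_b = sum((ratings[b][i] - mean_b) ** 2 for i in common)
    den = sqrt(var_a * var_b)
    return num / den if den else 0.0

def predict(active, item):
    """Active user's mean plus the weighted sum of neighbours' deviations."""
    neighbours = [u for u in ratings if u != active and item in ratings[u]]
    weights = {u: pearson(active, u) for u in neighbours}
    norm = sum(abs(w) for w in weights.values())
    if norm == 0:
        return mean_rating(active)  # no usable neighbours
    offset = sum(w * (ratings[u][item] - mean_rating(u))
                 for u, w in weights.items())
    return mean_rating(active) + offset / norm
```

Swapping `pearson` for a mean-squared-difference or vector (cosine) similarity changes only the weight function; the prediction step stays the same.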

23
Improvements to Memory-based Algorithm

Default voting
Inverse User Frequency
Case Amplification
Voting by Category
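
Two of these refinements are simple enough to sketch, following the definitions commonly attributed to Breese et al.: case amplification raises each weight to a power while keeping its sign, and inverse user frequency down-weights items that almost everyone has rated. The function names and the default exponent of 2.5 are assumptions for illustration.

```python
from math import log

def case_amplify(weight, rho=2.5):
    """Emphasise weights near 1 and shrink weak ones, preserving sign."""
    if weight >= 0:
        return weight ** rho
    return -((-weight) ** rho)

def inverse_user_frequency(ratings):
    """Map each item to log(n / n_j).

    n is the number of users and n_j the number of users who rated item j,
    so an item rated by everyone gets a factor of 0.
    """
    n = len(ratings)
    counts = {}
    for votes in ratings.values():
        for item in votes:
            counts[item] = counts.get(item, 0) + 1
    return {item: log(n / c) for item, c in counts.items()}
```

Default voting, by contrast, changes the data rather than the weights: a default rating is assumed for items that only one of the two users has rated, so that the similarity is computed over more items.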

24
Voting by Category
25
Model-based Methods

Probabilistic Models
Cluster Models
Standard cluster method: uses the EM algorithm to
find clusters.
Repeated Clustering
Gibbs Sampling
Bayesian Network Models
Neural Network Models

26
Repeated Clustering
27
Other Algorithms

A hybrid memory- and model-based approach
(Personality Diagnosis)
Assumes that all users report their ratings with
Gaussian noise.
Calculates the expectation of the active user's
rating.
Has the same time complexity as memory-based methods.
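
A rough sketch of this idea under simplifying assumptions: every other user is treated as an equally likely candidate for the active user's "true" rating profile, and observed ratings differ from true ones by Gaussian noise with a fixed `sigma`. The function name, the 1-5 scale, and `sigma=1.0` are illustrative choices, not values from the slides.

```python
from math import exp

def personality_diagnosis(active, others, item,
                          scale=(1, 2, 3, 4, 5), sigma=1.0):
    """Return (rating distribution, expected rating) for `item`.

    active: {item: rating} for the active user.
    others: {user: {item: rating}} for everyone else.
    """
    def gauss(x, y):
        # Unnormalised Gaussian likelihood of observing x when truth is y.
        return exp(-((x - y) ** 2) / (2 * sigma ** 2))

    dist = {x: 0.0 for x in scale}
    for votes in others.values():
        if item not in votes:
            continue
        # Likelihood that the active user shares this user's profile,
        # judged on the items both have rated.
        likelihood = 1.0
        for j, r in active.items():
            if j in votes:
                likelihood *= gauss(r, votes[j])
        for x in scale:
            dist[x] += likelihood * gauss(x, votes[item])

    total = sum(dist.values())
    if total == 0:
        raise ValueError("no other user has rated this item")
    dist = {x: p / total for x, p in dist.items()}
    expected = sum(x * p for x, p in dist.items())
    return dist, expected
```

The loop visits every other user once per prediction, which is consistent with the slide's remark that the cost matches that of memory-based methods.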

28
Problems & Improvements

Problems
Cold start
High dimension, Sparse data
Missing data
Missing real similarity
Improvements
Predictions based on users' ratings of latent
features.
Combination of CBF and CF.

29
Missing Similarity
30
Ratings to Latent Features

Assumes that users rate products
based on the products' latent features. All
products in the database share a set of common
features.
Singular Value Decomposition (SVD)
U represents the response of each user
to certain features.
V represents the amount of each feature
present in each product.
S is a matrix related to the importance of each
feature in the overall determination of the rating.
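
In matrix form, the decomposition can be written as below. The symbols R (the m-by-n user-product rating matrix), k (the number of latent features kept), and s_f (the f-th diagonal entry of S) are notation assumed here for illustration; the slides name only U, S, and V.

```latex
R = U S V^{T}, \qquad
R \approx R_k = U_k S_k V_k^{T}, \qquad
\hat{r}_{ij} = \sum_{f=1}^{k} U_{if}\, s_f\, V_{jf}
```

Keeping only the k largest singular values gives the reduced representation, and the last expression reads off a predicted rating for user i and product j from it.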

31
An Example of Dimension Reduction Using SVD
32
Recommendation Systems
Data Cleaning
Beginning
Capturing Latent Features (SVD)
Content-based Filtering
Memory-based Models
Model-based Models
