Capturing User Interests by Both Exploitation and Exploration

Learn more at: http://oak.cs.ucla.edu

1
Capturing User Interests by Both Exploitation and Exploration
  • Richard Sia (joint work with NEC), Feb 9, 2007

2
Background
  • Paid experiment carried out in Fall 2006
  • 11 participants
  • Personal information manager
  • N-armed bandit problem

3
Overview of the PIM
  • Recorder
  • Learner
  • Crawler
  • Recommendation Provider

4
N-armed bandit problem
  • Well-studied problem in reinforcement learning / statistics
  • Problem statement
    • Background: you are given n different options
    • Decision: for each choice you receive a numerical reward
      chosen from an unknown stationary probability distribution
    • Goal: maximize the total reward over some time period
  • Solutions
    • Action-value methods (greedy, ε-greedy)
    • Softmax action selection (decaying)
    • Pursuit methods
    • Associative search
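
The action-value methods listed on this slide can be illustrated with a minimal sketch. It is not taken from the slides: the function name `run_bandit`, the Gaussian reward noise, and the sample-average update are assumptions chosen for illustration. Setting `epsilon=0` gives the purely greedy player; a small positive `epsilon` gives ε-greedy.

```python
import random

def run_bandit(true_means, epsilon, plays, seed=0):
    """Play an n-armed bandit with sample-average action-value estimates.

    true_means: hidden mean reward of each arm (assumed Gaussian, unit variance).
    epsilon:    probability of picking a random arm instead of the best estimate.
    """
    rng = random.Random(seed)
    n = len(true_means)
    estimates = [0.0] * n  # current action-value estimate per arm
    counts = [0] * n       # how many times each arm has been played
    total = 0.0
    for _ in range(plays):
        if rng.random() < epsilon:
            arm = rng.randrange(n)                           # explore
        else:
            arm = max(range(n), key=lambda a: estimates[a])  # exploit
        reward = rng.gauss(true_means[arm], 1.0)  # stationary reward distribution
        counts[arm] += 1
        # Incremental form of the sample mean of rewards seen for this arm.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total += reward
    return total, counts
```

Calling `run_bandit([0.1, 0.5, 0.9], epsilon=0.1, plays=1000)` returns the accumulated reward and the per-arm play counts, which can be compared against the same call with `epsilon=0`.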

5
Reward vs plays
  • ε-greedy gives better reward than greedy ⇒ exploration is important

6
How are they related?
  • You are the n-armed bandit
  • PIM is the player
  • PIM recommends topics to you; you give it a reward (by clicking)

7
Model
  • Assumptions
    • K topics (K-armed bandit)
    • Each page is classified into one topic
    • T_i = P(click | read, topic i)
      • It follows a Bernoulli distribution with parameter p; we
        assume p follows a beta distribution with parameters α, β
    • g(j) = P(read | j)
      • From various search result click studies
  • Utility function U(R | T)
    • R: ranking of topics
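
One way to read the two probabilities in this model is that a click requires the item to be read first, so the chance of a click is the read probability times the per-topic click probability. The sketch below makes that reading explicit. It assumes that j is the position in the display list and reuses the read probabilities quoted on the simulation slide; the names `READ_PROB`, `click_probability`, and `simulate_click` are hypothetical.

```python
import random

# g(j): read probabilities quoted on the simulation slide, indexed here by
# display position (assumption: j is the 0-based position in the list).
READ_PROB = [1.0, 0.95, 0.9, 0.85, 0.8, 0.75, 0.7]

def click_probability(position, topic_interest):
    """P(click) = P(read | position) * P(click | read, topic)."""
    return READ_PROB[position] * topic_interest

def simulate_click(position, topic_interest, rng):
    """Draw one Bernoulli click outcome for an item shown at `position`."""
    return rng.random() < click_probability(position, topic_interest)
```

For example, a topic with click probability 0.5 shown in the third slot has `click_probability(2, 0.5)` equal to 0.9 × 0.5.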

8
Model
  • Updating the posterior distribution after an observation
    • Not clicked: β_new = β_old + g(r_i)
    • Clicked: α_new = α_old + 1
  • Ranking function of topics
    • Exploitation and exploration: mean and variance
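
The update rules and the mean/variance ranking on this slide can be sketched as follows. This is an illustration under stated assumptions, not the presenter's implementation: the uniform Beta(1, 1) prior, the class name `TopicBelief`, and the additive score `mean + weight * variance` with a free `weight` parameter are all hypothetical, since the slide does not state how the two terms are combined.

```python
from dataclasses import dataclass

@dataclass
class TopicBelief:
    """Beta(alpha, beta) belief about one topic's click probability."""
    alpha: float = 1.0  # assumed uniform prior; the slides do not state one
    beta: float = 1.0

    def update(self, clicked, read_prob):
        """Apply the slide's update: +1 to alpha on a click,
        +g(r_i) to beta when the item was shown but not clicked."""
        if clicked:
            self.alpha += 1.0
        else:
            self.beta += read_prob

    def mean(self):
        return self.alpha / (self.alpha + self.beta)

    def variance(self):
        s = self.alpha + self.beta
        return self.alpha * self.beta / (s * s * (s + 1.0))

def rank_topics(beliefs, weight):
    """Order topic indices by mean + weight * variance (assumed score)."""
    def score(i):
        return beliefs[i].mean() + weight * beliefs[i].variance()
    return sorted(range(len(beliefs)), key=score, reverse=True)
```

With `weight=0` the ranking is purely greedy on the mean; a larger `weight` promotes topics whose click probability is still uncertain.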

9
Experiments
  • Simulation
    • 45 topics (degree of interest assigned randomly)
    • Size of display list: 7
    • Read probability: 1, 0.95, 0.9, 0.85, 0.8, 0.75, 0.7
    • 3 strategies
      • EE
      • Greedy
      • Random
  • User study
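
The simulation setup listed on this slide could be reproduced roughly as in the sketch below. Several details are assumptions rather than facts from the slides: utility is approximated by the raw click count, interests are drawn uniformly from [0, 1), beliefs start from Beta(1, 1), and the EE score is `mean + weight * variance` with a hypothetical default `weight`. The function names `choose` and `simulate` are likewise illustrative.

```python
import random

READ_PROB = [1.0, 0.95, 0.9, 0.85, 0.8, 0.75, 0.7]  # one value per display slot
NUM_TOPICS = 45

def choose(strategy, alpha, beta, weight, rng):
    """Pick the topics to display, best first, for the given strategy."""
    k = len(READ_PROB)
    topics = list(range(len(alpha)))
    if strategy == "random":
        return rng.sample(topics, k)

    def score(i):
        s = alpha[i] + beta[i]
        mean = alpha[i] / s
        if strategy == "greedy":
            return mean
        variance = alpha[i] * beta[i] / (s * s * (s + 1.0))
        return mean + weight * variance  # "ee": assumed combination

    return sorted(topics, key=score, reverse=True)[:k]

def simulate(strategy, iterations=200, weight=10.0, seed=0):
    """Return the total number of clicks collected by one strategy."""
    rng = random.Random(seed)
    interest = [rng.random() for _ in range(NUM_TOPICS)]  # random degree of interest
    alpha = [1.0] * NUM_TOPICS  # assumed Beta(1, 1) prior per topic
    beta = [1.0] * NUM_TOPICS
    clicks = 0
    for _ in range(iterations):
        shown = choose(strategy, alpha, beta, weight, rng)
        for pos, topic in enumerate(shown):
            if rng.random() < READ_PROB[pos] * interest[topic]:
                alpha[topic] += 1.0          # clicked
                clicks += 1
            else:
                beta[topic] += READ_PROB[pos]  # shown but not clicked
    return clicks
```

Running `simulate("ee")`, `simulate("greedy")`, and `simulate("random")` with the same seed compares the three strategies against the same randomly assigned interests.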

10
Utility
  • Plots for parameter values 10 and 20

11
Utility under interest drift (non-stationary probability distribution)
  • Plots for parameter values 10 and 20

12
Estimation accuracy of ?
  • Random > EE > Greedy

13
Experiments
  • Simulation
  • User study
    • 45 categories from dmoz.org
      • Arts/Architecture
      • Computers/E-books
      • Science/Biology
      • etc.
    • Survey of user interest before experiment
    • Present 7 items each time
    • Interleave 3 strategies randomly

14
Click utility improvement
  • First 25 iterations; drift at 25th iteration

15
Conclusion
  • Modeling of a learning framework to infer users' interests
    from users' clicks
  • Possible future work
    • One page (or item) belongs to multiple classes
    • Dependency between classes