1
Fast Learning of Document Ranking Functions with
the Committee Perceptron
  • Jonathan Elsas
  • LTI Student Research Symposium
  • Sept. 14, 2007

2
Briefly
  • Joint work with Vitor Carvalho and Jaime
    Carbonell
  • Submitted to Web Search and Data Mining
    conference (WSDM 2008)
  • http://wsdm2008.org

3
Evolution of Features in IR
  • In the beginning, there was TF
  • It became clear that other features were needed
    for effective document ranking
  • IDF, document length
  • Along came HTML
  • doc. structure and link network features
  • Now, we have collective annotation
  • social bookmarking features

4
Challenges
  • Which features are important? How to best choose
    the weights for each feature?
  • With just a few features, manual tuning or
    parameter sweeps sufficed.
  • This approach becomes impractical with more than
    5-6 features.

5
Learning Approach to Setting Feature Weights
  • Goal: Utilize existing relevance judgments to
    learn an optimal weight setting
  • This has recently become a hot research area in
    IR: Learning to Rank
  • (See the SIGIR 2007 Learning to Rank workshop:
    http://research.microsoft.com/users/LR4IR-2007/)

6
Pair-wise Preference Learning
  • Learning a document scoring function
  • Treated as a classification problem on pairs of
    documents
  • Resulting scoring function is used as the learned
    document ranker.

[Diagram: correctly vs. incorrectly ordered document pairs]
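The pairwise reduction above can be sketched in a few lines (illustrative Python, not the authors' code; all names are hypothetical): every (relevant, non-relevant) document pair becomes one classification example on the feature difference, which a correct linear scorer should label positive.

```python
import numpy as np

def pairwise_examples(X_rel, X_nonrel):
    """Reduce ranking to classification: each (relevant, non-relevant)
    pair yields the feature-difference vector x_rel - x_nonrel.
    A linear ranker w orders the pair correctly iff w . diff > 0."""
    return np.array([xr - xn for xr in X_rel for xn in X_nonrel])
```

A pair is ordered "correctly" exactly when w · (x_rel − x_nonrel) > 0, so the learned w doubles as the document scoring function.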
7
Perceptron Algorithm
  • Proposed in 1958 by Rosenblatt
  • Online algorithm (instance-at-a-time)
  • Whenever a ranking mistake is made, update the
    hypothesis
  • Provable mistake bounds and convergence
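A minimal sketch of the online update (hypothetical code, assuming each training example is a feature-difference vector d = x_rel − x_nonrel):

```python
import numpy as np

def perceptron_pass(w, diffs, lr=1.0):
    """One online pass over feature-difference vectors d = x_rel - x_nonrel.
    A ranking mistake (w . d <= 0) triggers the additive update w <- w + lr*d."""
    mistakes = 0
    for d in diffs:
        if w @ d <= 0:        # pair is mis-ranked by the current hypothesis
            w = w + lr * d    # move the hypothesis toward ranking it correctly
            mistakes += 1
    return w, mistakes
```

On linearly separable pairs, repeated passes drive the mistake count to zero, which is what the classical mistake-bound analysis guarantees.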

8
Perceptron Algorithm Variants
  • Pocket Perceptron (Gallant, 1990)
  • Keep the one-best hypothesis
  • Voted Perceptron (Freund & Schapire, 1999)
  • Keep all the intermediate hypotheses and combine
    them at the end
  • In practice, the hypotheses are often averaged

9
Committee Perceptron Algorithm
  • Ensemble method
  • Selectively chooses N best hypotheses encountered
    during training
  • Significant advantages over previous perceptron
    variants
  • Many ways to combine output of hypotheses
  • Voting, score averaging, hybrid approaches
  • Weight by a retrieval performance metric
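The committee idea can be sketched like this (hypothetical code: hypotheses are scored by pairwise accuracy here for simplicity, whereas the talk also weights by a retrieval performance metric):

```python
import numpy as np

def pairwise_accuracy(w, diffs):
    """Fraction of feature-difference vectors that w ranks correctly."""
    return float(np.mean(diffs @ w > 0))

def committee_perceptron(diffs, committee_size=3, epochs=10):
    """Committee perceptron sketch: retain the committee_size best
    hypotheses encountered during training, then combine them by
    performance-weighted score averaging."""
    w = np.zeros(diffs.shape[1])
    committee = []
    for _ in range(epochs):
        for d in diffs:
            if w @ d <= 0:
                w = w + d
        acc = pairwise_accuracy(w, diffs)
        committee.append((acc, w.copy()))
        # keep only the N best hypotheses seen so far
        committee = sorted(committee, key=lambda t: -t[0])[:committee_size]
    total = sum(a for a, _ in committee)
    return sum(a * wi for a, wi in committee) / total
```

Voting and hybrid combinations would replace only the last two lines; the selective retention of good hypotheses is the part that distinguishes this from the pocket and averaged variants.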

10
Committee Perceptron Training
[Animated diagram, slides 10-13: the current hypothesis is updated on the training data, and the best hypotheses encountered are retained in the committee]
14
Evaluation
  • Compared the Committee Perceptron to
  • RankSVM (Joachims, 2002)
  • RankBoost (Freund et al., 2003)
  • Learning To Rank (LETOR) dataset
  • http://research.microsoft.com/users/tyliu/LETOR/default.aspx
  • Provides three test collections, standardized
    feature sets, and train/validation/test splits

15
Committee Perceptron Learning Curves
16
Committee Perceptron Performance
17
Committee Perceptron Performance (OHSUMED)
18
Committee Perceptron Performance (TD2004)
19
Committee Perceptron Training Time
  • Much faster than other rank learning algorithms.
  • Training time on OHSUMED dataset
  • CP: 450 seconds for 50 iterations
  • RankSVM: over 21,000 seconds
  • 45-fold reduction in training time with
    comparable performance.

20
Committee Perceptron Summary
  • CP is a fast perceptron-based learning algorithm,
    applied to document ranking.
  • Significantly outperforms the pocket and average
    perceptron variants on learning document ranking
    functions.
  • Performs comparably to two strong baseline rank
    learning algorithms, but trains in much less time.

21
Future Directions
  • Performance of the Committee Perceptron is good,
    but it could be better
  • What are we really optimizing?
  • (not MAP or NDCG)

22
Loss Functions for Pairwise Preference Learners
  • Minimizing the number of mis-ranked document
    pairs
  • This only loosely corresponds to ranked-based
    evaluation measures
  • Problem: All rank positions are treated the same
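A tiny worked example of the mismatch (numbers are assumed for illustration): two rankings with the same number of mis-ranked pairs can have quite different Average Precision, because AP cares where in the ranking the relevant documents sit.

```python
def misranked_pairs(labels):
    """Count (relevant, non-relevant) pairs where the non-relevant
    document is ranked above the relevant one."""
    swaps = 0
    for i, li in enumerate(labels):
        if li == 1:
            swaps += sum(1 for lj in labels[:i] if lj == 0)
    return swaps

def average_precision(labels):
    """AP: mean of precision@k over the ranks k holding a relevant doc."""
    hits, total = 0, 0.0
    for k, rel in enumerate(labels, start=1):
        if rel == 1:
            hits += 1
            total += hits / k
    return total / hits

r1 = [1, 0, 0, 0, 1]   # relevant at ranks 1 and 5
r2 = [0, 1, 0, 1, 0]   # relevant at ranks 2 and 4
# Both rankings have 3 mis-ranked pairs, yet AP(r1) = 0.7 and AP(r2) = 0.5:
# the pairwise loss cannot tell them apart, but AP prefers r1.
```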

23
Problems with Optimizing the Wrong Metric
24
Ranked Retrieval Pairwise-Preference Loss Functions
  • Average Precision places more emphasis on
    higher-ranked documents.

25
Ranked Retrieval Pairwise-Preference Loss Functions
  • Re-writing AP as a pairwise loss function
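The slide's rewritten formula is not reproduced in this transcript; one standard identity in that spirit expresses AP through pairwise order counts: for each relevant document, precision at its rank equals (1 + number of relevant documents ranked above it) / rank, so higher ranks contribute with larger weight.

```python
def ap_pairwise_form(labels):
    """AP via pairwise order counts: for each relevant document, precision
    at its rank is (1 + #relevant ranked above it) / rank, and AP is the
    mean of these terms over all relevant documents."""
    rel_ranks = [k for k, rel in enumerate(labels, start=1) if rel == 1]
    return sum((1 + above) / rank
               for above, rank in enumerate(rel_ranks)) / len(rel_ranks)
```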

26
Preliminary Results
[Charts: results using MAP-loss vs. pairs-loss]
27
  • Questions?