Title: Fast Learning of Document Ranking Functions with the Committee Perceptron

Slide 1: Fast Learning of Document Ranking Functions with the Committee Perceptron
- Jonathan Elsas
- LTI Student Research Symposium
- Sept. 14, 2007
Slide 2: Briefly
- Joint work with Vitor Carvalho and Jaime Carbonell
- Submitted to the Web Search and Data Mining conference (WSDM 2008)
- http://wsdm2008.org
Slide 3: Evolution of Features in IR
- In the beginning, there was TF
- It became clear that other features were needed for effective document ranking: IDF, document length
- Along came HTML
  - document structure, link network features
- Now, we have collective annotation
  - social bookmarking features
Slide 4: Challenges
- Which features are important? How do we best choose the weight for each feature?
- With just a few features, manual tuning or parameter sweeps sufficed.
- This approach becomes impractical with more than 5-6 features.
Slide 5: Learning Approach to Setting Feature Weights
- Goal: utilize existing relevance judgments to learn an optimal weight setting
- "Learning to Rank" has recently become a hot research area in IR
- (See the SIGIR 2007 Learning to Rank workshop: http://research.microsoft.com/users/LR4IR-2007/)
Slide 6: Pair-wise Preference Learning
- Learning a document scoring function: score(d) = w · x_d, where x_d is the feature vector of document d
- Treated as a classification problem on pairs of documents: a pair with d_i preferred over d_j is classified correctly when w · (x_i - x_j) > 0
- The resulting scoring function is used as the learned document ranker.
- (Diagram: correctly vs. incorrectly ranked document pairs)
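The pair-wise formulation can be sketched in a few lines (a minimal illustration of the standard setup; the function and variable names are mine, not from the talk):

```python
import numpy as np

def pair_instance(x_i, x_j):
    # A preference "d_i should rank above d_j" becomes one binary
    # classification instance with feature vector x_i - x_j and label +1.
    return x_i - x_j

def pair_correct(w, x_i, x_j):
    # Under a linear scoring function score(d) = w . x_d, the pair is
    # ranked correctly exactly when w . (x_i - x_j) > 0.
    return float(np.dot(w, pair_instance(x_i, x_j))) > 0.0
```

For example, with w = [1, 0] the pair ([2, 1], [1, 1]) is ranked correctly, and the reversed pair is not.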
Slide 7: Perceptron Algorithm
- Proposed in 1958 by Rosenblatt
- Online algorithm (one instance at a time)
- Whenever a ranking mistake is made, update the hypothesis
- Provable mistake bounds and convergence
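The mistake-driven update on preference pairs can be sketched as follows (my own simplified rendering of the classic algorithm, not the talk's exact pseudocode):

```python
import numpy as np

def perceptron_rank(pairs, n_features, n_epochs=10):
    """Train a linear ranker on preference pairs.

    pairs: list of (x_pref, x_nonpref) feature-vector tuples, where
    x_pref belongs to the document that should rank higher.
    """
    w = np.zeros(n_features)
    for _ in range(n_epochs):
        for x_pref, x_nonpref in pairs:
            # Ranking mistake: the preferred document does not score higher.
            if np.dot(w, x_pref - x_nonpref) <= 0:
                w = w + (x_pref - x_nonpref)  # standard perceptron update
    return w
```

On separable training pairs, the learned w ranks every training pair correctly.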
Slide 8: Perceptron Algorithm Variants
- Pocket Perceptron (Gallant, 1990)
  - Keep the single best hypothesis
- Voted Perceptron (Freund & Schapire, 1999)
  - Keep all the intermediate hypotheses and combine them at the end
  - In practice, the hypotheses are often averaged
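The averaging variant can be sketched by accumulating the hypothesis after every training instance (a simplified illustration of the idea, not Freund & Schapire's exact voted scheme):

```python
import numpy as np

def averaged_perceptron_rank(pairs, n_features, n_epochs=10):
    # Like the plain perceptron on preference pairs, but the returned
    # hypothesis is the average of w over every training step, which
    # in practice approximates voting over all intermediate hypotheses.
    w = np.zeros(n_features)
    w_sum = np.zeros(n_features)
    seen = 0
    for _ in range(n_epochs):
        for x_pref, x_nonpref in pairs:
            if np.dot(w, x_pref - x_nonpref) <= 0:
                w = w + (x_pref - x_nonpref)
            w_sum += w
            seen += 1
    return w_sum / seen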
Slide 9: Committee Perceptron Algorithm
- Ensemble method
- Selectively keeps the N best hypotheses encountered during training
- Significant advantages over previous perceptron variants
- Many ways to combine the hypotheses' outputs
  - Voting, score averaging, hybrid approaches
  - Weight by a retrieval performance metric
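A rough sketch of the committee idea follows. The selection and combination rules here are my simplifying assumptions (scoring hypotheses by training-pair accuracy rather than a retrieval metric such as MAP), not the paper's exact algorithm:

```python
import numpy as np

def committee_perceptron(pairs, n_features, committee_size=3, n_epochs=10):
    def n_correct(w):
        # Stand-in quality score: number of training pairs ranked correctly
        # (the paper weights by a retrieval performance metric instead).
        return sum(1 for a, b in pairs if np.dot(w, a - b) > 0)

    w = np.zeros(n_features)
    committee = []  # (score, hypothesis) pairs, best first
    for _ in range(n_epochs):
        for x_pref, x_nonpref in pairs:
            if np.dot(w, x_pref - x_nonpref) <= 0:
                w = w + (x_pref - x_nonpref)
                # Keep only the committee_size best hypotheses seen so far.
                committee.append((n_correct(w), w.copy()))
                committee.sort(key=lambda t: -t[0])
                committee = committee[:committee_size]
    # Combine committee members by performance-weighted score averaging.
    total = sum(s for s, _ in committee) or 1
    return sum(s * h for s, h in committee) / total
```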
Slides 10-13: Committee Perceptron Training
- (Animated diagram, built up over four slides, showing the Training Data, the Current Hypothesis, and the Committee)
Slide 14: Evaluation
- Compared the Committee Perceptron to
  - RankSVM (Joachims, 2002)
  - RankBoost (Freund et al., 2003)
- Learning to Rank (LETOR) dataset
  - http://research.microsoft.com/users/tyliu/LETOR/default.aspx
  - Provides three test collections, standardized feature sets, and train/validation/test splits
Slide 15: Committee Perceptron Learning Curves (figure)
Slide 16: Committee Perceptron Performance (figure)
Slide 17: Committee Perceptron Performance (OHSUMED) (figure)
Slide 18: Committee Perceptron Performance (TD2004) (figure)
Slide 19: Committee Perceptron Training Time
- Much faster than other rank-learning algorithms
- Training time on the OHSUMED dataset:
  - CP: 450 seconds for 50 iterations
  - RankSVM: > 21,000 seconds
- A 45-fold reduction in training time with comparable performance
Slide 20: Committee Perceptron Summary
- CP is a fast perceptron-based learning algorithm, applied here to document ranking.
- Significantly outperforms the pocket and averaged perceptron variants at learning document ranking functions.
- Performs comparably to two strong baseline rank-learning algorithms, but trains in much less time.
Slide 21: Future Directions
- Performance of the Committee Perceptron is good, but it could be better
- What are we really optimizing?
  - (not MAP or NDCG)
Slide 22: Loss Functions for Pairwise Preference Learners
- These learners minimize the number of mis-ranked document pairs
- This only loosely corresponds to rank-based evaluation measures
- Problem: all rank positions are treated the same
Slide 23: Problems with Optimizing the Wrong Metric (figure)
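To illustrate the mismatch, here is a small worked example (my own, not from the slides): two rankings with the same pairwise loss but different Average Precision, because pair-counting ignores where in the ranking the errors occur.

```python
def misranked_pairs(ranking):
    # ranking: relevance labels (1 = relevant, 0 = not) in rank order.
    # Counts (non-relevant above relevant) pairs, i.e. the pairwise loss.
    return sum(1 for i in range(len(ranking))
                 for j in range(i + 1, len(ranking))
                 if ranking[i] == 0 and ranking[j] == 1)

def average_precision(ranking):
    hits, total = 0, 0.0
    for k, rel in enumerate(ranking, start=1):
        if rel:
            hits += 1
            total += hits / k
    return total / max(hits, 1)

# A ranking error at the top (a) vs. lower down (b):
a = [0, 1, 1, 0, 0]
b = [1, 0, 0, 1, 0]
# Both have pairwise loss 2, but AP(a) = 7/12 while AP(b) = 3/4.
```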
Slides 24-25: Ranked Retrieval Pairwise-Preference Loss Functions
- Average Precision places more emphasis on higher-ranked documents.
- Re-writing AP as a pairwise loss function
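The decomposition can be sketched as follows: each relevant document's precision contribution depends only on pairwise score comparisons, i.e. how many documents (and how many relevant documents) outscore it. This is an illustrative rewriting assuming no score ties, not the paper's exact loss function:

```python
def ap_from_pairwise_comparisons(scores, labels):
    # AP written purely in terms of pairwise comparisons: for each
    # relevant document i, its precision term is
    #   (1 + #relevant docs scoring above i) / (1 + #docs scoring above i).
    # Assumes no tied scores.
    rel = [i for i, l in enumerate(labels) if l == 1]
    if not rel:
        return 0.0
    total = 0.0
    for i in rel:
        above = sum(1 for j in range(len(scores)) if scores[j] > scores[i])
        rel_above = sum(1 for j in rel if scores[j] > scores[i])
        total += (1 + rel_above) / (1 + above)
    return total / len(rel)
```

Because every term is a pairwise comparison, a pairwise learner can in principle weight pair violations to track AP rather than raw pair counts.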
Slide 26: Preliminary Results
- (Figure: results using MAP-loss vs. using pairs-loss)