Less is More: Probabilistic Models for Retrieving Fewer Relevant Documents

Transcript and Presenter's Notes

1
Less is More: Probabilistic Models for Retrieving
Fewer Relevant Documents
  • Harr Chen, David R. Karger
  • MIT CSAIL
  • ACM SIGIR 2006
  • August 9, 2006

2
Outline
  • Motivations
  • Expected Metric Principle
  • Metrics
  • Bayesian Retrieval
  • Objectives
  • Heuristics
  • Experimental Results
  • Related Work
  • Future Work and Conclusions

3
Motivation
  • In IR, we have formal models and formal metrics
  • Models provide a framework for retrieval
  • E.g. probabilistic models
  • Metrics provide a rigorous evaluation mechanism
  • E.g. precision and recall
  • Probability ranking principle (PRP) provably
    optimal for precision/recall
  • Ranking by probability of relevance
  • But other metrics capture other notions of result
    set quality, and the PRP isn't necessarily optimal

4
Example: Diversity
  • User may be satisfied with one relevant result
  • Navigational queries, question answering
  • In this case, we want to hedge our bets by
    retrieving for diversity in the result set
  • Better to satisfy different users with different
    interpretations than one user many times over
  • Reciprocal rank/search length metrics capture
    this notion
  • PRP is suboptimal

5
IR System Design
  • Metrics define a preference ordering on result sets
  • Metric(Result set 1) > Metric(Result set 2)
    ⇒ Result set 1 is preferred to Result set 2
  • Traditional approach: try out heuristics that we
    believe will improve relevance performance
  • Heuristics not directly motivated by the metric
  • E.g. synonym expansion, pseudorelevance feedback
  • Observation: given a model, we can try to
    directly optimize for some metric

6
Expected Metric Principle (EMP)
  • Knowing which metric to use tells us what to
    maximize: the expected value of the metric
    for each result set, given a model

(Diagram: enumerate the candidate result sets over the corpus, e.g. the ordered
document pairs (1,2), (1,3), (2,1), (2,3), (3,1), (3,2) drawn from Documents 1-3;
calculate E[metric] for each using the model; return the set with the maximum
score. A brute-force sketch follows below.)
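A minimal brute-force sketch of this idea (illustrative only, not the paper's
implementation), assuming a hypothetical expected_metric(result_set) function
supplied by the probabilistic model:

```python
from itertools import permutations

def emp_retrieve(corpus, expected_metric, n=2):
    """Exhaustive EMP retrieval: return the ordered result set of size n
    with the highest expected metric value under the model.

    expected_metric is a hypothetical callable mapping an ordered tuple of
    documents to E[metric], computed from the probabilistic model.
    """
    candidates = permutations(corpus, n)         # every ordered result set
    return max(candidates, key=expected_metric)  # set with max expected score
```

This enumeration is exponential in n, which is why later slides fall back to a
greedy approximation.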
7
Our Contributions
  • Primary: EMP, using the metric as the retrieval goal
  • Metric designed to measure retrieval quality
  • Metrics we consider: precision/recall @ n, search
    length, reciprocal rank, instance recall, k-call
  • Build a probabilistic model
  • Retrieve to maximize an objective: the expected
    value of the metric
  • Expectations calculated according to our
    probabilistic model
  • Use computational heuristics to make the
    optimization problem tractable
  • Secondary: retrieving for diversity (a special
    case)
  • A natural side effect of optimizing for certain
    metrics

8
Detour: What is a Heuristic?
  • Ad hoc approach
  • Use heuristics that are believed to be correlated
    with good performance
  • Heuristics used to improve relevance
  • Heuristics (probably) make system slower
  • Infinite number of possibilities, no formalism
  • Model, heuristics intertwined
  • Our approach
  • Build model that directly optimizes for good
    performance
  • Heuristics used to improve efficiency
  • Heuristics (probably) make optimization worse
  • Well-known space of optimization techniques
  • Clean separation between model and heuristics

9
Our Contributions
  • (Recap of the contributions outline from slide 7.)

10
Search Length/Reciprocal Rank
  • (Mean) search length (MSL): number of irrelevant
    results before the first relevant one
  • (Mean) reciprocal rank (MRR): one over the rank of
    the first relevant result

(Diagram: example result list whose first relevant document is at rank 3.)
Example: search length = 2, reciprocal rank = 1/3
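For reference, the underlying quantities can be written as follows (standard
formulations; the notation is ours, not from the slide):

```latex
% For a single query, let r be the rank of the first relevant document.
\mathrm{SL} = r - 1, \qquad \mathrm{RR} = \tfrac{1}{r}
% Averaging over a query set Q gives the mean versions:
\mathrm{MSL} = \tfrac{1}{|Q|}\sum_{q \in Q} (r_q - 1), \qquad
\mathrm{MRR} = \tfrac{1}{|Q|}\sum_{q \in Q} \tfrac{1}{r_q}
```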
11
Instance Recall
  • Each topic has multiple instances (subtopics,
    aspects)
  • Instance recall @ n: fraction of the topic's
    instances covered (in union) by the first n results

(Diagram: example result list and the instances each result covers.)
Example: instance recall @ 5 = 0.75
12
k-call @ n
  • Binary metric: 1 if the top n results contain at
    least k relevant documents, 0 otherwise
  • 1-call corresponds to (1 - %no)
  • See TREC robust track


(Diagram: example result list with exactly 2 relevant documents in its top 5.)
Example: 1-call @ 5 = 1, 2-call @ 5 = 1, 3-call @ 5 = 0
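Written out (our notation, consistent with the definition above), with r_i = 1
when the i-th ranked result is relevant:

```latex
\text{k-call@}n =
\begin{cases}
  1 & \text{if } \sum_{i=1}^{n} r_i \ge k, \\
  0 & \text{otherwise.}
\end{cases}
```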
13
Motivation for k-call
  • 1-call: want one relevant document
  • Many queries are satisfied with one relevant result
  • Only need one relevant document, so there is more
    room to explore, which promotes result set diversity
  • n-call: want all relevant documents
  • Perfect precision
  • Home in on one interpretation and stick to it!
  • Intermediate k:
  • Risk/reward tradeoff
  • Plus, easily modeled in our framework
  • Binary variable

14
Our Contributions
  • (Recap of the contributions outline from slide 7.)

15
Bayesian Retrieval Model
  • There exist distributions that generate relevant
    documents and irrelevant documents
  • PRP: rank each document by its probability of
    relevance under these distributions
  • Remaining modeling questions: the form of the
    relevant/irrelevant distributions, and the
    parameters for those distributions
  • In this paper, we assume multinomial models, and
    choose parameters by maximum a posteriori estimation
  • The prior is the background corpus word distribution
    (a scoring sketch follows below)
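A minimal sketch of this style of scoring, under simplifying assumptions of our
own (unigram multinomials for the relevant and irrelevant classes,
MAP/Dirichlet-style smoothing toward the background corpus distribution, and a
log-odds ranking score); the paper's exact estimator may differ:

```python
import math
from collections import Counter

def map_multinomial(counts, background, mu=1000.0):
    """MAP estimate of a multinomial: observed word counts smoothed toward
    the background corpus word distribution (Dirichlet prior scaled by mu)."""
    total = sum(counts.values())
    return {w: (counts.get(w, 0) + mu * background[w]) / (total + mu)
            for w in background}

def log_odds_score(doc_tokens, rel_model, irrel_model):
    """Score a document by log Pr(doc | relevant) - log Pr(doc | irrelevant),
    a standard probabilistic-retrieval ranking criterion (both models are
    assumed to share the background vocabulary)."""
    counts = Counter(doc_tokens)
    return sum(n * (math.log(rel_model[w]) - math.log(irrel_model[w]))
               for w, n in counts.items()
               if w in rel_model and w in irrel_model)
```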

16
Our Contributions
  • (Recap of the contributions outline from slide 7.)

17
Objective
  • Probability Ranking Principle (PRP): maximize
    Pr(d_i is relevant) at each step i in the ranking
  • Expected Metric Principle (EMP): maximize
    E[metric(d_1, ..., d_n)] for the complete result
    set
  • In particular, for k-call, maximize
    Pr(at least k of d_1, ..., d_n are relevant);
    a worked form for k = 1 follows below
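For k = 1 the expectation of the binary metric is just a probability, so the
objective can be written as (our notation; only the definition of 1-call is
used):

```latex
\mathbb{E}\bigl[\text{1-call@}n \mid d_1,\dots,d_n\bigr]
  = \Pr\bigl(\exists\, i \le n : r_i = 1 \mid d_1,\dots,d_n\bigr)
  = 1 - \Pr\bigl(r_1 = 0,\dots,r_n = 0 \mid d_1,\dots,d_n\bigr)
```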

18
Our Contributions
  • (Recap of the contributions outline from slide 7.)

19
Optimization of Objective
  • Exact optimization of the objective is usually
    NP-hard
  • E.g. exact optimization for k-call is reducible to
    the NP-hard maximum graph clique problem
  • Approximation heuristic: greedy algorithm
  • Select documents successively in rank order
  • Hold previous documents fixed, and optimize the
    objective at each rank

(Diagram, step 1: choose d1 to maximize E[metric(d1)].)
20
Optimization of Objective
  • (Bullets repeated from slide 19.)

(Diagram, step 2: d1 held fixed; choose d2 to maximize E[metric(d1, d2)].)
21
Optimization of Objective
  • (Bullets repeated from slide 19.)

(Diagram, step 3: d1 and d2 held fixed; choose d3 to maximize E[metric(d1, d2, d3)].)
22
Greedy on 1-call and n-call
  • 1-greedy
  • The greedy algorithm reduces to ranking each
    successive document assuming all previous
    documents are irrelevant
  • The algorithm has discovered incremental negative
    pseudorelevance feedback
  • n-greedy: assume all previous documents are relevant
    (a sketch of 1-greedy follows below)
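A minimal sketch of the 1-greedy selection described above (illustrative only,
not the paper's code), assuming a hypothetical prob_relevant(doc,
assumed_irrelevant) supplied by the underlying model, which returns the
probability that doc is relevant given that the listed documents are
irrelevant:

```python
def greedy_one_call(corpus, prob_relevant, n=10):
    """Greedy heuristic for expected 1-call: pick each next document
    assuming every previously selected document is irrelevant
    (incremental negative pseudorelevance feedback)."""
    selected, remaining = [], list(corpus)
    for _ in range(n):
        # Score each candidate conditioned on the current picks being irrelevant.
        best = max(remaining, key=lambda d: prob_relevant(d, selected))
        selected.append(best)
        remaining.remove(best)
    return selected
```

Conditioning on the previously selected documents being relevant instead gives
the n-greedy variant.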

23
Greedy on Other Metrics
  • Greedy with precision/recall reduces to PRP!
  • Greedy on k-call for general k (k-greedy)
  • More complicated
  • Greedy with MSL, MRR, and instance recall works out
    to the 1-greedy algorithm
  • Intuition: to make the first relevant document appear
    earlier, we want to hedge our bets as to the query
    interpretation (i.e., diversify)

24
Experiments: Overview
  • Experiments verify that optimizing for a metric
    improves performance on that metric
  • They do not tell us which metrics to use
  • Looked at ad hoc and diversity examples
  • TREC topics/queries
  • Tuned weights on a separate development set
  • Tested on:
  • Standard ad hoc (robust track) topics
  • Topics with multiple annotators
  • Topics with multiple instances

25
Diversity on Google Results
  • Task: reranking the top 1,000 Google results
  • In optimizing 1-call, our algorithm finds more
    diverse results than the PRP and the original
    Google ordering

26
Experiments: Robust Track
  • TREC 2003, 2004 robust tracks
  • 249 topics
  • 528,000 documents
  • 1-call and 10-call results are statistically significant

27
Experiments: Instance Retrieval
  • TREC-6, 7, 8 interactive tracks
  • 20 topics
  • 210,000 documents
  • 7 to 56 instances per topic
  • PRP baseline: instance recall @ 10 = 0.234
  • Greedy 1-call: instance recall @ 10 = 0.315

28
Experiments: Multi-annotator
  • TREC-4, 6 ad hoc retrieval
  • Independent annotators assessed the same topics
  • TREC-4: 49 topics, 568,000 documents, 3
    annotators
  • TREC-6: 50 topics, 556,000 documents, 2
    annotators
  • Result: more annotators are satisfied when using 1-greedy

29
Related Work
  • Fits in risk minimization framework (objective as
    negative loss function)
  • Other approaches look at optimizing for metrics
    directly, with training data
  • Pseudorelevance feedback
  • Subtopic retrieval
  • Maximal marginal relevance
  • Clustering
  • See paper for references

30
Future Work
  • General k-call (k = 2, etc.)
  • Determine whether this is what users want
  • Better underlying probabilistic model
  • Our contribution is in the ranking objective, not
    the model, so the model can be arbitrarily
    sophisticated
  • Better optimization techniques
  • E.g., local search would differentiate the
    algorithms for MRR and 1-call
  • Other metrics
  • Preliminary work on mean average precision and
    precision @ recall
  • (Perhaps) surprisingly, these metrics are not
    optimized by the PRP!

31
Conclusions
  • EMP: the metric can motivate the model; choosing
    and believing in a metric already gives us a
    reasonable objective, E[metric]
  • Can potentially apply EMP on top of a variety of
    different underlying probabilistic models
  • Diversity is one practical example of a natural
    side effect of using EMP with the right metric

32
Acknowledgments
  • Harr Chen was supported by the Office of Naval
    Research through a National Defense Science and
    Engineering Graduate Fellowship
  • Jaime Teevan, Susan Dumais, and anonymous
    reviewers provided constructive feedback
  • ChengXiang Zhai, William Cohen, and Ellen
    Voorhees provided code and data