Document Language Models, Query Models, and Risk Minimization for Information Retrieval

Slides: 20
Provided by: Ale2
Learn more at: http://www.cs.cmu.edu
Transcript and Presenter's Notes

1
Document Language Models, Query Models, and Risk
Minimization for Information Retrieval
John Lafferty, Chengxiang Zhai
School of Computer Science, Carnegie Mellon University
2
LM Applied to IR
  • First proposed in (Ponte & Croft, 98)
  • Also explored in (Miller et al., 99; Hiemstra et
    al., 99; Berger & Lafferty, 99; Song & Croft, 99;
    Hiemstra, 00; etc.)
  • A very promising approach
    • Good empirical performance
    • Statistical foundation
    • Re-use existing LM techniques
  • But,
    • Lack of understanding of the approach (lack of
      relevance)
    • Conceptual inconsistency in feedback (text vs.
      terms)
    • Real empirical advantage still needs to be
      demonstrated

3
Research Questions
  • How can we extend the language modeling approach
    to
    • Allow modeling of both queries and documents
    • Exploit language modeling to perform natural and
      effective feedback
    • Estimate translation models without training data
      (Berger & Lafferty, 99)
  • What is the relationship between the language
    modeling approach and the traditional
    probabilistic retrieval models?
  • Can we go beyond the Probability Ranking
    Principle?

4
Outline
  • A Risk Minimization Retrieval Framework
  • Special Cases
  • Markov Chain Translation Model
  • Query Language Model Estimation
  • Evaluation

5
Risk Minimization Framework: Basic Idea
  • Utility/Risk as retrieval criterion
  • Retrieval as a sequence of presenting decisions
  • Application of Bayesian decision theory

6
Risk Minimization Framework: Features
  • Modeling utility-based retrieval (beyond binary
    relevance)
  • Modeling interactive retrieval (dynamic user
    model)
  • Covering many existing retrieval models
  • Fully probabilistic (language model estimation)
    • Document language model and query language model
    • Feedback as query model estimation

7
Generative Model
8
Actions, Loss functions, and Bayes risk
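Given a collection $C = \{d_1, d_2, \ldots, d_k\}$ from source $S$ and a query $q$
from user $U$, which list of documents should be presented?
Action $a$: a list of documents
Loss: $L(a, \theta)$, where $\theta = (\theta_Q, \theta_{D_1}, \ldots, \theta_{D_k}, R_1, \ldots, R_k)$
Bayes risk: $R(a \mid U, q, S, C) = \int_\Theta L(a, \theta)\, p(\theta \mid U, q, S, C)\, d\theta$
  • Bayes optimal decision (risk minimization)
  • $a^* = \arg\min_a R(a \mid U, q, S, C)$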
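The decision rule on this slide, choosing the action with the smallest Bayes risk, can be illustrated with a minimal Python sketch. It assumes a discrete posterior over θ so that the expected loss becomes a finite sum, and it simplifies the action to a single document rather than a ranked list. The names `bayes_risk`, `bayes_optimal_action`, and `zero_one_loss`, as well as the toy probabilities, are hypothetical and not taken from the presentation.

```python
from typing import Callable, Dict, Hashable, Iterable


def bayes_risk(action: Hashable,
               posterior: Dict[Hashable, float],
               loss: Callable[[Hashable, Hashable], float]) -> float:
    """Expected loss of `action` under a discrete posterior over theta."""
    return sum(p * loss(action, theta) for theta, p in posterior.items())


def bayes_optimal_action(actions: Iterable[Hashable],
                         posterior: Dict[Hashable, float],
                         loss: Callable[[Hashable, Hashable], float]) -> Hashable:
    """Return the action with the smallest Bayes risk (risk minimization)."""
    return min(actions, key=lambda a: bayes_risk(a, posterior, loss))


def zero_one_loss(action: Hashable, theta: Hashable) -> float:
    """Hypothetical loss: 0 if the presented document is the relevant one, else 1."""
    return 0.0 if action == theta else 1.0


# Toy posterior (made-up numbers): theta names the document that is relevant.
posterior = {"d1": 0.2, "d2": 0.7, "d3": 0.1}
best = bayes_optimal_action(["d1", "d2", "d3"], posterior, zero_one_loss)
```

With a zero-one loss the rule reduces to picking the most probable hypothesis; other loss functions change the ranking criterion without changing the decision rule itself.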
9
Risk Minimization Ranking Function
[Annotated equation; labeled terms: loss function, posterior distribution, query model, doc model, relevance model]
10
Special Cases
Loss function $L(\theta_Q, \theta_D, R) = {?}$
11
A Markov Chain Method for Estimating Query Model
12
Markov Chain Translation Probabilities
[Diagram: Markov chain starting at word $w_0$]
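The diagram for this slide did not survive in the transcript, so the following Python sketch is only one plausible reading of a Markov chain over words and documents, not the authors' exact formulation. It assumes a walk that starts at a word, continues with probability `alpha` at each step, and on each continuation moves from the current word to a document containing it and then to a word drawn from that document. The uniform document prior, the truncation after `steps` iterations, the value of `alpha`, and the function names are all assumptions made for illustration.

```python
from collections import Counter


def word_step_matrix(docs):
    """One word -> document -> word step: W[w][v] = sum_d p(d|w) * p(v|d)."""
    doc_counts = {d: Counter(tokens) for d, tokens in docs.items()}
    # Maximum likelihood estimate of p(word | doc).
    p_w_given_d = {
        d: {w: c / sum(cnt.values()) for w, c in cnt.items()}
        for d, cnt in doc_counts.items()
    }
    vocab = sorted({w for cnt in doc_counts.values() for w in cnt})
    W = {}
    for w in vocab:
        # p(d | w) via Bayes' rule with a uniform document prior (assumption).
        weights = {d: p.get(w, 0.0) for d, p in p_w_given_d.items()}
        z = sum(weights.values())
        row = Counter()
        for d, weight in weights.items():
            for v, p in p_w_given_d[d].items():
                row[v] += (weight / z) * p
        W[w] = dict(row)
    return W


def walk_distribution(W, start, alpha=0.5, steps=50):
    """Distribution over the word at which a walk from `start` stops.

    The walk stops with probability 1 - alpha before each step; the sum over
    walk lengths is truncated after `steps` iterations.
    """
    current = {start: 1.0}
    result = Counter()
    for _ in range(steps):
        for w, mass in current.items():
            result[w] += (1 - alpha) * mass
        nxt = Counter()
        for w, mass in current.items():
            for v, p in W[w].items():
                nxt[v] += alpha * mass * p
        current = nxt
    return dict(result)


# Toy collection (hypothetical documents).
docs = {
    "d1": ["star", "wars", "film"],
    "d2": ["star", "galaxy", "telescope"],
}
dist = walk_distribution(word_step_matrix(docs), "star")
```

Words that co-occur with the starting word in some document receive nonzero probability, which is the sense in which such a walk can expand a short query into a richer query model.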
13
Sample Query Probabilities
14
Evaluation
  • KL-divergence Unigram Retrieval Model
    • Fixed linear interpolation smoothing for the
      document model
    • Comparing two methods for estimating the query
      model
      • Maximum likelihood (= query likelihood, simple
        LM)
      • Markov chain on top 50 docs
  • Three testing collections
    • AP89 (250 MB, 50 queries)
    • TREC8 ad hoc (2 GB, 50 queries)
    • TREC8 web track (2 GB, 50 queries)
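As a rough illustration of the retrieval setup listed on this slide, the sketch below ranks documents by negative KL divergence between a query model and a document model smoothed by linear interpolation with a collection model. The interpolation weight `lam=0.5`, the function names, and the toy documents are hypothetical; the slide says only that the smoothing weight is fixed, not what its value is. The query model here is the maximum likelihood estimate, which corresponds to the simple LM baseline.

```python
import math
from collections import Counter


def mle(tokens):
    """Maximum likelihood unigram model estimated from a token list."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def smoothed_doc_model(doc_tokens, collection_model, lam=0.5):
    """Linear interpolation: (1 - lam) * p_ml(w | d) + lam * p(w | collection)."""
    doc_ml = mle(doc_tokens)
    return {
        w: (1 - lam) * doc_ml.get(w, 0.0) + lam * p_c
        for w, p_c in collection_model.items()
    }


def neg_kl_score(query_model, doc_model):
    """-KL(query model || doc model); a higher score means a better match.

    Assumes every query word occurs in the collection vocabulary, so the
    smoothed document probability is never zero.
    """
    return -sum(
        p_q * math.log(p_q / doc_model[w])
        for w, p_q in query_model.items()
        if p_q > 0
    )


# Toy collection and query (hypothetical data).
docs = {
    "d1": "language model retrieval model".split(),
    "d2": "markov chain query expansion".split(),
}
collection_model = mle([w for tokens in docs.values() for w in tokens])
query_model = mle("language model".split())

scores = {
    d: neg_kl_score(query_model, smoothed_doc_model(tokens, collection_model))
    for d, tokens in docs.items()
}
ranking = sorted(scores, key=scores.get, reverse=True)
```

Replacing `query_model` with a distribution estimated some other way, for example from feedback documents, changes only the first argument of `neg_kl_score`, which is why the comparison on this slide can hold the document side fixed.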

15
Results: Simple LM vs. Markov Chain
[Result panels: AP89, TREC8, Web]
16
Results: Rocchio vs. Markov Chain
[Result panels: AP89, TREC8, Web]
17
Rocchio vs. Markov Chain: TREC8
18
Rocchio vs. Markov Chain: Web
19
Conclusions and Future Work
  • Risk minimization as a new general retrieval
    framework
    • Goes beyond the Probability Ranking Principle
      (PRP)
    • Recovers existing models
    • Extends existing work on language modeling
  • Markov chain model expansion
    • Efficient translation model
    • Applicable to both query model and document model
    • Empirically effective
  • Future Work
    • Explore utility-based ranking criterion (e.g.,
      MMR)
    • Explore new models and new estimation methods