Modern Information Retrieval - PowerPoint PPT Presentation

About This Presentation
Title:

Modern Information Retrieval

Slides: 15
Provided by: WELS152

Transcript and Presenter's Notes

1
Modern Information Retrieval
  • Chapter 2 Modeling

2
  • Can keywords be used to represent a document or a
    query?
  • using keywords as the query and simple matching as
    query processing does not, in general, produce
    good results
  • ranking algorithm, document relevance and IR model

3
  • Taxonomy of IR models

4
  • Ad hoc and filtering retrieval
  • ad hoc retrieval: static document collection,
    new queries submitted
  • filtering retrieval: static queries, streaming
    documents
  • a user profile describes the user's preferences
  • keywords, relevance feedback and dynamic keyword
    adjustment
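
The filtering setting above can be sketched in a few lines. This is only an illustration under stated assumptions: the profile is taken to be a plain set of keywords, and the names matches_profile, filter_stream and min_overlap are hypothetical, not from the slides.

```python
def matches_profile(profile_keywords, document_tokens, min_overlap=1):
    """Return True if the document shares enough keywords with the profile.

    min_overlap is a hypothetical threshold, not something the slides define.
    """
    return len(set(profile_keywords) & set(document_tokens)) >= min_overlap


def filter_stream(profile_keywords, stream):
    """Yield the ids of incoming documents that match a static user profile.

    stream is any iterable of (doc_id, tokens) pairs arriving over time.
    """
    for doc_id, tokens in stream:
        if matches_profile(profile_keywords, tokens):
            yield doc_id
```

In ad hoc retrieval the roles are reversed: the collection stays fixed and each new query is matched against it. Adjusting the profile from relevance feedback would amount to adding or removing keywords between calls.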

5
  • Formal characterization of IR models

6
  • Classic IR
  • Index terms
  • deciding on the importance of a term is difficult
  • consider a term's semantics as well as its
    distribution across all documents
  • weights are used to quantify the importance of
    the index terms for describing the document
    contents

7
  • the assumption that index terms are mutually
    independent simplifies fast ranking computation

8
  • Boolean model
  • index term weights are binary
  • query as a Boolean expression
  • not, and, or as connectives
  • Users might find it difficult to specify their
    information needs

9
  • advantages and disadvantages
  • each document is either relevant or non-relevant
  • given (0,1,0), is document dj an answer?
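
A minimal sketch of Boolean matching with binary index term weights. The three-term vocabulary (ka, kb, kc) and the query ka AND (kb OR NOT kc) are assumptions chosen for illustration; the slides do not state which query the vector (0,1,0) is tested against.

```python
from itertools import product


def query(ka: bool, kb: bool, kc: bool) -> bool:
    """Hypothetical Boolean query: ka AND (kb OR NOT kc)."""
    return ka and (kb or not kc)


def matching_patterns():
    """List every binary weight vector (ka, kb, kc) that satisfies the query."""
    return [bits for bits in product((1, 0), repeat=3)
            if query(*map(bool, bits))]


def is_answer(doc_vector):
    """A document is an answer iff its binary weight vector satisfies the query."""
    return query(*map(bool, doc_vector))
```

Under this assumed query, a document with vector (0,1,0) is not an answer because ka is absent. The result is all-or-nothing: there is no score that would rank one matching document above another.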

10
  • Vector model
  • Allows partial matching and ranking by a
    similarity measure
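
A sketch of partial matching and ranking in the vector model, assuming the cosine of the angle between the document and query vectors as the similarity measure (the transcript does not say which measure is used). The function names are illustrative.

```python
import math


def cosine_similarity(doc, query):
    """Cosine of the angle between two weight vectors of equal length."""
    dot = sum(d * q for d, q in zip(doc, query))
    norm_d = math.sqrt(sum(d * d for d in doc))
    norm_q = math.sqrt(sum(q * q for q in query))
    if norm_d == 0 or norm_q == 0:
        return 0.0  # an all-zero vector matches nothing
    return dot / (norm_d * norm_q)


def rank(docs, query):
    """Return (name, score) pairs sorted by decreasing similarity to the query.

    docs maps a document name to its weight vector.
    """
    scored = [(name, cosine_similarity(vec, query)) for name, vec in docs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Unlike the Boolean model, a document that contains only some of the query terms still receives a non-zero score, so it can appear lower in the ranking instead of being discarded.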

11
(No Transcript)
12
  • Computing index term weights
  • term frequency (tf factor): how well the term
    describes the document contents
  • inverse document frequency (idf factor): how well
    the term distinguishes the document from the
    rest of the collection
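
One common way to combine the two factors is sketched below, assuming tf is the raw count normalized by the largest count in the document and idf is log(N / n_i), where N is the number of documents and n_i the number of documents containing term i. The exact formula on the untranscribed slides may differ, and the function name tf_idf is illustrative.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Compute tf-idf weights.

    docs is a list of token lists; the result is one {term: weight} dict
    per document, in the same order.
    """
    n_docs = len(docs)
    # n_i: number of documents in which term i appears at least once
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        max_count = max(counts.values(), default=1)
        weights.append({
            term: (count / max_count) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights
```

A term that occurs in every document gets idf = log(1) = 0, so it contributes nothing to the ranking however often it appears.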

13
(No Transcript)
14
  • the vector model is a popular retrieval model due
    to its simplicity and performance