An Implicit Feedback approach for Interactive Information Retrieval - PowerPoint PPT Presentation

Transcript and Presenter's Notes
1
An Implicit Feedback approach for Interactive
Information Retrieval
  • Ryen W. White, Joemon M. Jose, Ian Ruthven
  • University of Glasgow

Hamza Hydri Syed, Course Presentation - Web Information Retrieval
2
  • Roadmap
  • Introduction
  • Searcher Interaction
  • Binary Voting Model
  • Evaluation
  • Results Analysis
  • Conclusions

3
  • Roadmap
  • Introduction
  • Searcher Interaction
  • Binary Voting Model
  • Evaluation
  • Results Analysis
  • Conclusions

4
  • Introduction
  • Relevance Feedback (RF)
  • Automatically improving a system's representation
    of a searcher's information need through an
    iterative process of feedback [1].
  • Depends on a series of relevance assessments made
    explicitly by the user.
  • Assumes that the underlying need stays the same
    across all iterations.
  • Implicit RF
  • The IR system unobtrusively monitors search
    behaviour.
  • Removes the need for the searcher to explicitly
    indicate which documents are relevant [2].
  • A variety of surrogate measures have been
    employed: hyperlinks clicked, mouseovers,
    scrollbar activity [3, 4].
  • These can be unreliable indicators; instead, use
    interaction with the full-text of documents as
    implicit feedback.

5
  • Introduction
  • Approach
  • Searchers can interact with different
    representations of each document.
  • Representations are of varying length, focused
    on the query, and logically connected at the
    interface to form an interactive search path.
  • This develops a means of better representing
    searcher needs while minimizing the burden of
    explicitly reformulating queries.

6
  • Roadmap
  • Introduction
  • Searcher Interaction
  • Binary Voting Model
  • Evaluation
  • Results Analysis
  • Conclusions

7
  • Searcher Interaction
  • Document Representations
  • Focus on query-relevant parts of documents.
  • Reduce the likelihood of selecting erroneous
    terms.
  • Interface uses multiple document representations
  • Top-ranking sentences (TRS) from each of the top
    30 documents retrieved
  • Title
  • Query-biased summary of the documents
  • Summary Sentence
  • Sentence in Context
  • Document itself

8
(No Transcript)
9
  • Searcher Interaction
  • Relevance Path
  • The further along the path a searcher travels,
    the more relevant the information in that path.
  • Paths can vary in length and searchers can access
    the full-text of the document from any step in
    the path.

10
  • Roadmap
  • Introduction
  • Searcher Interaction
  • Binary Voting Model
  • Evaluation
  • Results Analysis
  • Conclusions

11
  • Binary Voting Model
  • Features
  • Heuristics-based model which implicitly selects
    terms for query modification.
  • Utilizes searcher interaction with document
    representations and relevance paths.
  • A term present in a viewed representation
    receives a vote; a term not present receives no
    vote.
  • Winning terms are those with the most votes and
    hence best describe the information viewed by the
    searcher.
  • The contribution of a vote is weighted based on
    the indicative worth of the representation
  • 0.1 - Title
  • 0.2 - TRS
  • 0.2 - Summary Sentence
  • 0.2 - Sentence in Context
  • 0.3 - Summary
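The weighted voting described on this slide can be sketched as follows. This is a minimal illustration, not the paper's implementation; the function and variable names are hypothetical, while the weights are the ones listed above.

```python
# Indicative weight per representation type, as given on the slide.
REP_WEIGHTS = {
    "title": 0.1,
    "top_ranking_sentence": 0.2,
    "summary_sentence": 0.2,
    "sentence_in_context": 0.2,
    "summary": 0.3,
}

def cast_votes(votes, viewed_terms, representation):
    """Each term in a viewed representation receives a vote,
    weighted by the representation's indicative worth."""
    weight = REP_WEIGHTS[representation]
    for term in viewed_terms:
        votes[term] = votes.get(term, 0.0) + weight
    return votes

votes = {}
cast_votes(votes, {"t1", "t2", "t7"}, "title")    # 0.1 per term
cast_votes(votes, {"t2", "t5"}, "summary")        # 0.3 per term
# t2 has now accumulated 0.1 + 0.3: scoring is cumulative
```

Winning terms are simply those with the highest accumulated vote totals.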

12
  • Binary Voting Model
  • Features
  • Each document is represented by a vector of
    length n
  • n = total number of unique non-stop-word terms
  • The list holding these terms is the vocabulary
  • A document x term matrix is built of size
    (d+1) x n
  • d = no. of documents the searcher has seen

13
  • Binary Voting Model
  • Example: Simple Updating
  • The original query Q0 contains t5 and t9
  • The vector is normalised to give each term a
    value in [0, 1]
  • A term that occurs is assigned a weight w_t,r
  • p = no. of steps taken
  • D = document
  • t = term
  • r = representation
  • w_t,r = weight of term t for representation r
  • The weight for each term is added to the
    appropriate term/document entry in the matrix

14
  • Binary Voting Model
  • Example: Simple Updating
  • Initial state of the document x term matrix
  • The searcher expresses interest in the Title of
    document D1, which has a step weight of 0.1 and
    contains terms t1, t2, and t7
  • The matrix changes to
  • The weights of terms t1, t2, and t7 are directly
    updated
  • t2 is now seen as being important to D1
  • t1 and t7 are seen as more important than before
    to D1
  • Scoring is cumulative
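The cumulative matrix update in this example can be sketched as below. The vocabulary, document count, and helper names are made up for illustration; only the shape of the update follows the slide.

```python
# (d+1) x n document-by-term matrix, initially all zeros.
vocab = ["t1", "t2", "t3", "t4", "t5", "t6", "t7", "t8", "t9"]
d = 3  # documents the searcher has seen; the model keeps d+1 rows
matrix = [[0.0] * len(vocab) for _ in range(d + 1)]

def view_representation(matrix, doc_row, terms, step_weight):
    """Cumulatively add the step weight to each viewed term's
    entry for the given document row."""
    for t in terms:
        matrix[doc_row][vocab.index(t)] += step_weight

# Searcher views the Title of D1 (row 0): weight 0.1, terms t1, t2, t7
view_representation(matrix, 0, ["t1", "t2", "t7"], 0.1)
# A later step (e.g. the Summary) touching t1 again; scoring is
# cumulative, so t1's entry keeps growing
view_representation(matrix, 0, ["t1"], 0.3)
```

After these two steps t1's entry for D1 holds 0.4 while t2 and t7 hold 0.1, mirroring how terms become "more important than before" as interaction accumulates.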

15
  • Binary Voting Model
  • Query Creation
  • A new query is computed after every 5 relevance
    paths, by which point sufficient implicit
    evidence has been gathered from searcher
    interaction
  • To compute the new query we calculate the
    average score for each term across all documents
  • Terms are ranked by this score
  • A high average score implies the term has
    appeared in many viewed representations and/or
    in those with high indicative weights
  • The top 6 terms chosen are
  • t9, t5, t1, t7, t3 and t2
  • Although t2, t3 and t8 have the same score, t8 is
    not included since t3 occurs more recently and t2
    occurs in more than one document

16
  • Binary Voting Model
  • Tracking Information Need
  • A change in the information need can be measured
    by computing the change in term ordering between
    the term lists at different steps, i.e., q_m and
    q_m+1
  • Since the vocabulary is static, only the order of
    the terms in the list will change
  • The searcher's information need is tracked via ρ,
    the Spearman rank-order correlation coefficient,
    which computes the difference between two lists
    of unique terms
  • The correlation returns values between -1 and 1
  • A result closer to -1 means the term lists are
    dissimilar w.r.t. rank ordering
  • A result closer to 1 means the similarity
    between the term rankings increases
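Because the static vocabulary guarantees both lists contain the same unique terms (no ties), the coefficient reduces to the closed form ρ = 1 - 6·Σd² / (n(n² - 1)), which can be sketched as:

```python
def spearman(list_a, list_b):
    """Spearman rank-order correlation between two orderings of the
    same unique terms: rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)),
    where d is the difference in a term's rank between the lists."""
    n = len(list_a)
    rank_b = {term: i for i, term in enumerate(list_b)}
    d_squared = sum((i - rank_b[term]) ** 2
                    for i, term in enumerate(list_a))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

spearman(["t1", "t2", "t3", "t4"], ["t1", "t2", "t3", "t4"])  # 1.0
spearman(["t1", "t2", "t3", "t4"], ["t4", "t3", "t2", "t1"])  # -1.0
```

Identical orderings give 1, a fully reversed ordering gives -1, matching the interpretation on the slide.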

17
  • Binary Voting Model
  • Strategies Implemented
  • Re-searching: a coefficient value < 0.2 indicates
    a large change between the term lists, i.e., they
    are substantially different w.r.t. rank ordering;
    this reflects a large change in the information
    need. A new search is run to retrieve a new set
    of documents.
  • Reordering documents: a result in the range
    [0.2, 0.5) indicates weak correlation and
    consequently a less substantial change in the
    information need. The new query is used to
    reorder the top 30 retrieved documents, using
    best-match tf-idf scoring.
  • Reordering TRS: a coefficient in the range
    [0.5, 0.8) indicates strong correlation between
    the two term lists and only a small change in the
    predicted information need. The new query is used
    to re-rank the TRS list.
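These thresholds amount to a simple dispatch, sketched below. The behaviour for ρ ≥ 0.8, where the term lists are essentially unchanged, is not stated on the slide and is assumed here to be "take no action"; the strategy names are placeholders.

```python
# Map the Spearman coefficient to the strategy the system applies.
# Thresholds follow the slide; the rho >= 0.8 case is an assumption.
def choose_strategy(rho):
    if rho < 0.2:
        return "re-search"           # large change: retrieve new set
    if rho < 0.5:
        return "reorder-documents"   # weak correlation: re-rank top 30
    if rho < 0.8:
        return "reorder-trs"         # strong correlation: re-rank TRS
    return "no-action"               # lists essentially unchanged

choose_strategy(0.3)  # falls in [0.2, 0.5): reorder the documents
```

The graded response means small drifts in the information need trigger cheap re-ranking, while only large shifts cost a full re-search.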

18
  • Roadmap
  • Introduction
  • Searcher Interaction
  • Binary Voting Model
  • Evaluation
  • Results Analysis
  • Conclusions

19
  • Evaluation
  • Manual Baseline System
  • Similar to the implicit feedback system, except
    that the searcher is solely responsible for
    adding new query terms and selecting what action
    is undertaken.
  • The baseline interface has an additional
    component, the term/strategy control panel, which
    allows searchers to decide how best to use the
    query.
  • This allows us to evaluate how well the implicit
    feedback system detected information needs from
    the perspective of the subject.

20
(No Transcript)
21
  • Evaluation
  • Experimental Subjects
  • Mainly undergraduate and postgraduate students at
    the University of Glasgow, divided into 2 groups
  • Experienced
  • Inexperienced
  • Experimental Tasks
  • Each subject was asked to complete one search
    task from each of 4 categories
  • The categories were
  • fact search (finding a person's mail address)
  • background search (finding information on dust
    allergies)
  • decision search (choosing the best financial
    instrument)
  • search for a number of items (finding contact
    details of a number of employees)
  • The search scenarios reflect real-life search
    situations and allow the subject to make personal
    assessments of what constitutes relevant
    material [5]

22
  • Roadmap
  • Introduction
  • Searcher Interaction
  • Binary Voting Model
  • Evaluation
  • Results Analysis
  • Conclusions

23
  • Results Analysis
  • Hypotheses tested
  • The terms selected for implicit feedback
    represent the information needs of the subject
    (i.e., term selection support)
  • The implicit feedback approach estimates changes
    in the subject's information need
  • The implicit feedback approach makes search
    decisions that correspond closely with those of
    the subject

24
  • Results Analysis
  • Hypothesis 1: Information need detection
  • We measure the degree of term overlap using the
    baseline system.
  • The BVM runs in the background, invisible to the
    subject and not involved directly in any query
    modification decisions.
  • High values of term overlap suggest that the
    terms chosen by the BVM are of good value and
    match the subject's own impression of the
    information need

25
  • Results Analysis
  • Hypothesis 1: Information need detection
  • Shows the average percentage of occasions where
    the top 6 terms chosen by the BVM also included
    at least one of the subject's terms
  • The difference between inexperienced and
    experienced subjects was not significant.
  • Term overlap for experienced subjects was
    generally higher than that for inexperienced
    subjects.

26
  • Results Analysis
  • Hypothesis 1: Information need detection
  • Shows the average number of query iterations and
    the average query length
  • An iteration is the use of a query for any
    action: reordering the TRS, reordering the
    documents, or re-searching the Web
  • Average query length is the number of terms in
    the new query that were not in the original query

27
  • Results Analysis
  • Hypothesis 1: Information need detection
  • Shows the average frequency of query manipulation
    for each subject performing different types of
    search
  • Query manipulation = adding and/or removing terms
  • Subjects added terms to queries more often for
    decision and background searches than for fact
    searches and searches for a number of items.
  • Implicit feedback is better suited to decision
    searches than to fact searches

28
  • Results Analysis
  • Hypothesis 2: Information need tracking
  • Shows the average number of actions carried out
    on each system across all search tasks
  • Differences appear in the no. of times the TRS
    were reordered
  • Experienced subjects make more use of unfamiliar
    actions
  • Both groups reordered the list of TRS more, and
    reordered the documents less frequently, than
    the implicit feedback system did
  • Reordering of sentences/documents allows the
    system to reshape the information space

29
  • Results Analysis
  • Hypothesis 2: Information need tracking
  • Shows the proportion of each type of action that
    was undone
  • A reversal indicates dissatisfaction with the
    outcome of the action or the terms suggested
  • Subjects responded well to the search strategies
    employed on their behalf
  • Inexperienced subjects disliked the effects of
    the TRS reordering
  • Experienced subjects liked the TRS re-ranking,
    but reversed the re-searching operation more
    often

30
  • Results Analysis
  • Hypothesis 3: Relevance paths
  • Subjects were asked to rate the worth of
    following a relevance path from one document
    representation to another.
  • The relevance paths were significantly more
    helpful, beneficial, appropriate and useful to
    experienced subjects than to inexperienced
    subjects
  • The distance travelled along the relevance path
    was a good indicator of the relevance of the
    information in that path

31
  • Results Analysis
  • Hypothesis 3: Relevance paths
  • Shows the most common path taken, the average
    number of steps followed, the average number of
    complete and partial paths, etc.
  • Subjects used relevance paths consistently,
    although experienced subjects followed the paths
    further
  • Experienced subjects interacted more with the
    retrieved documents and more frequently used the
    document representations for viewing the
    full-text of a document

32
  • Roadmap
  • Introduction
  • Searcher Interaction
  • Binary Voting Model
  • Evaluation
  • Results Analysis
  • Conclusions

33
  • Conclusions
  • The interface uses query-relevant document
    representations to facilitate access to
    potentially useful information and allow
    searchers to closely examine results.
  • This form of implicit feedback is at the extreme
    end of a spectrum of searcher support. Such
    models may be best used to make decisions in
    conjunction with, not in place of, the searcher.
  • The approach has the potential to alleviate some
    of the problems inherent in explicit relevance
    feedback, while preserving many of its benefits.
  • The success of the approach bodes well for the
    construction of effective implicit RF systems
    that will work in concert with the searcher.

34
References
  1. Salton, G., & Buckley, C. (1990). Improving
     retrieval performance by relevance feedback.
     Journal of the American Society for Information
     Science, 41(4), 288-297.
  2. Morita, M., & Shinoda, Y. (1994). Information
     filtering based on user behavior analysis and
     best match text retrieval. In Proceedings of the
     17th annual ACM SIGIR conference on research and
     development in information retrieval (pp.
     272-281).
  3. Lieberman, H. (1995). Letizia: An agent that
     assists web browsing. In Proceedings of the 14th
     international joint conference on artificial
     intelligence (pp. 475-480).
  4. Joachims, T., Freitag, D., & Mitchell, T. (1997).
     WebWatcher: A tour guide for the World Wide Web.
     In Proceedings of the 15th international joint
     conference on artificial intelligence (pp.
     770-775).
  5. Ingwersen, P. (1992). Information retrieval
     interaction. London: Taylor Graham.

35
Questions
36
Thank You!