An Implicit Feedback approach for Interactive Information Retrieval - PowerPoint PPT Presentation

Transcript and Presenter's Notes
1
An Implicit Feedback approach for Interactive
Information Retrieval
  • Ryen W. White, Joemon M. Jose, Ian Ruthven
  • University of Glasgow

Hamza Hydri Syed, Course Presentation - Web Information Retrieval
2
  • Roadmap
  • Introduction
  • Searcher Interaction
  • Binary Voting Model
  • Evaluation
  • Results Analysis
  • Conclusions

3
  • Roadmap
  • Introduction
  • Searcher Interaction
  • Binary Voting Model
  • Evaluation
  • Results Analysis
  • Conclusions

4
  • Introduction
  • Relevance Feedback (RF)
  • Automatically improving a system's representation
    of a searcher's information need through an
    iterative process of feedback [1].
  • Depends on a series of relevance assessments made
    explicitly by the user.
  • Assumes that the underlying need stays the same
    across all iterations.
  • Implicit RF
  • The IR system unobtrusively monitors search
    behaviour.
  • Removes the need for the searcher to explicitly
    indicate which documents are relevant [2].
  • A variety of surrogate measures have been
    employed: hyperlinks clicked, mouseovers,
    scrollbar activity [3, 4].
  • These can be unreliable indicators; instead, use
    interaction with the full-text of documents as
    implicit feedback.

5
  • Introduction
  • Approach
  • Searchers can interact with different
    representations of each document.
  • Representations are of varying length, focused
    on the query, and logically connected at the
    interface to form an interactive search path.
  • This develops a means of better representing
    searcher needs while minimizing the burden of
    explicitly reformulating queries.

6
  • Roadmap
  • Introduction
  • Searcher Interaction
  • Binary Voting Model
  • Evaluation
  • Results Analysis
  • Conclusions

7
  • Searcher Interaction
  • Document Representations
  • Focus on query-relevant parts of documents.
  • Reduce the likelihood of selecting erroneous
    terms.
  • Interface uses multiple document representations
  • Top-ranking sentences (TRS) from each of the top
    30 documents retrieved
  • Title
  • Query-biased summary of the documents
  • Summary Sentence
  • Sentence in Context
  • Document itself

8
(No Transcript)
9
  • Searcher Interaction
  • Relevance Path
  • The further along the path a searcher travels,
    the more relevant the information in that path.
  • Paths can vary in length and searchers can access
    the full-text of the document from any step in
    the path.

10
  • Roadmap
  • Introduction
  • Searcher Interaction
  • Binary Voting Model
  • Evaluation
  • Results Analysis
  • Conclusions

11
  • Binary Voting Model
  • Features
  • Heuristics-based model which implicitly selects
    terms for query modification.
  • Utilizes searcher interaction with document
    representations and relevance paths.
  • A term present in a viewed representation
    receives a vote; a term not present receives no
    vote.
  • Winning terms are those with the most votes and
    hence best describe the information viewed by the
    searcher.
  • The contribution of a vote is weighted based on
    the indicative worth of the representation
  • 0.1 - Title
  • 0.2 - TRS
  • 0.2 - Summary Sentence
  • 0.2 - Sentence in Context
  • 0.3 - Summary
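The weighted voting described on this slide can be sketched as follows. This is a minimal illustration, not the paper's implementation; the function and variable names are hypothetical, while the weights are the ones listed above.

```python
# Indicative weight per representation type, as given on the slide.
REP_WEIGHTS = {
    "title": 0.1,
    "top_ranking_sentence": 0.2,
    "summary_sentence": 0.2,
    "sentence_in_context": 0.2,
    "summary": 0.3,
}

def cast_votes(votes, viewed_terms, representation):
    """Each term in a viewed representation receives a vote,
    weighted by the representation's indicative worth."""
    weight = REP_WEIGHTS[representation]
    for term in viewed_terms:
        votes[term] = votes.get(term, 0.0) + weight
    return votes

votes = {}
cast_votes(votes, {"t1", "t2", "t7"}, "title")    # 0.1 per term
cast_votes(votes, {"t2", "t5"}, "summary")        # 0.3 per term
# t2 has now accumulated 0.1 + 0.3: scoring is cumulative
```

Winning terms are simply those with the highest accumulated vote totals.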

12
  • Binary Voting Model
  • Features
  • Each document is represented by a vector of
    length n
  • n = total number of unique non-stop-word terms
  • The list holding these terms is the vocabulary
  • A document x term matrix is built of size
    (d+1) x n
  • d = no. of documents the searcher has seen

13
  • Binary Voting Model
  • Example: Simple Updating
  • The original query Q0 contains t5 and t9
  • The vector is normalised to give each term a
    value in [0, 1]
  • A term that occurs is assigned a weight w_t,r
  • p = no. of steps taken
  • D = document
  • t = term
  • r = representation
  • w_t,r = weight of term t for representation r
  • The weight for each term is added to the
    appropriate term/document entry in the matrix

14
  • Binary Voting Model
  • Example: Simple Updating
  • Initial state of the document x term matrix
  • The searcher expresses interest in the Title of
    document D1, which has a step weight of 0.1 and
    contains terms t1, t2, and t7
  • The matrix changes to
  • The weights of terms t1, t2, and t7 are directly
    updated
  • t2 is now seen as being important to D1
  • t1 and t7 are seen as more important than before
    to D1
  • Scoring is cumulative
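The cumulative matrix update in this example can be sketched as below. The vocabulary, document count, and helper names are made up for illustration; only the shape of the update follows the slide.

```python
# (d+1) x n document-by-term matrix, initially all zeros.
vocab = ["t1", "t2", "t3", "t4", "t5", "t6", "t7", "t8", "t9"]
d = 3  # documents the searcher has seen; the model keeps d+1 rows
matrix = [[0.0] * len(vocab) for _ in range(d + 1)]

def view_representation(matrix, doc_row, terms, step_weight):
    """Cumulatively add the step weight to each viewed term's
    entry for the given document row."""
    for t in terms:
        matrix[doc_row][vocab.index(t)] += step_weight

# Searcher views the Title of D1 (row 0): weight 0.1, terms t1, t2, t7
view_representation(matrix, 0, ["t1", "t2", "t7"], 0.1)
# A later step (e.g. the Summary) touching t1 again; scoring is
# cumulative, so t1's entry keeps growing
view_representation(matrix, 0, ["t1"], 0.3)
```

After these two steps t1's entry for D1 holds 0.4 while t2 and t7 hold 0.1, mirroring how terms become "more important than before" as interaction accumulates.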

15
  • Binary Voting Model
  • Query Creation
  • A new query is computed after every 5 relevance
    paths, by which point sufficient implicit
    evidence has been gathered from searcher
    interaction
  • To compute the new query we calculate the
    average score for each term across all documents
  • Terms are ranked by this score
  • A high average score implies the term has
    appeared in many viewed representations and/or
    in those with high indicative weights
  • The top 6 terms chosen are
  • t9, t5, t1, t7, t3 and t2
  • Although t2, t3 and t8 have the same score, t8 is
    not included since t3 occurs more recently and t2
    occurs in more than one document

16
  • Binary Voting Model
  • Tracking Information Need
  • A change in the information need can be measured
    by computing the change in term ordering between
    the term lists at different steps, i.e., q_m and
    q_m+1
  • Since the vocabulary is static, only the order of
    the terms in the list will change
  • The searcher's information need is tracked via ρ,
    the Spearman rank-order correlation coefficient,
    which computes the difference between two lists
    of unique terms
  • The correlation returns values between -1 and 1
  • A result closer to -1 means the term lists are
    dissimilar w.r.t. rank ordering
  • A result closer to 1 means the similarity
    between the term rankings increases
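Because the static vocabulary guarantees both lists contain the same unique terms (no ties), the coefficient reduces to the closed form ρ = 1 - 6·Σd² / (n(n² - 1)), which can be sketched as:

```python
def spearman(list_a, list_b):
    """Spearman rank-order correlation between two orderings of the
    same unique terms: rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)),
    where d is the difference in a term's rank between the lists."""
    n = len(list_a)
    rank_b = {term: i for i, term in enumerate(list_b)}
    d_squared = sum((i - rank_b[term]) ** 2
                    for i, term in enumerate(list_a))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

spearman(["t1", "t2", "t3", "t4"], ["t1", "t2", "t3", "t4"])  # 1.0
spearman(["t1", "t2", "t3", "t4"], ["t4", "t3", "t2", "t1"])  # -1.0
```

Identical orderings give 1, a fully reversed ordering gives -1, matching the interpretation on the slide.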

17
  • Binary Voting Model
  • Strategies Implemented
  • Re-searching: a coefficient value < 0.2 indicates
    a large change between the term lists, i.e., they
    are substantially different w.r.t. rank ordering;
    this reflects a large change in the information
    need. A new search is run to retrieve a new set
    of documents.
  • Reordering documents: a result in the range
    [0.2, 0.5) indicates weak correlation and
    consequently a less substantial change in the
    information need. The new query is used to
    reorder the top 30 retrieved documents, using
    best-match tf-idf scoring.
  • Reordering TRS: a coefficient in the range
    [0.5, 0.8) indicates strong correlation between
    the two term lists and only a small change in the
    predicted information need. The new query is used
    to re-rank the TRS list.
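These thresholds amount to a simple dispatch, sketched below. The behaviour for ρ ≥ 0.8, where the term lists are essentially unchanged, is not stated on the slide and is assumed here to be "take no action"; the strategy names are placeholders.

```python
# Map the Spearman coefficient to the strategy the system applies.
# Thresholds follow the slide; the rho >= 0.8 case is an assumption.
def choose_strategy(rho):
    if rho < 0.2:
        return "re-search"           # large change: retrieve new set
    if rho < 0.5:
        return "reorder-documents"   # weak correlation: re-rank top 30
    if rho < 0.8:
        return "reorder-trs"         # strong correlation: re-rank TRS
    return "no-action"               # lists essentially unchanged

choose_strategy(0.3)  # falls in [0.2, 0.5): reorder the documents
```

The graded response means small drifts in the information need trigger cheap re-ranking, while only large shifts cost a full re-search.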

18
  • Roadmap
  • Introduction
  • Searcher Interaction
  • Binary Voting Model
  • Evaluation
  • Results Analysis
  • Conclusions

19
  • Evaluation
  • Manual Baseline System
  • Similar to the implicit feedback system, except
    that the searcher is solely responsible for
    adding new query terms and selecting what action
    is undertaken.
  • The baseline interface has an additional
    component, the term/strategy control panel, which
    allows searchers to decide how best to use the
    query.
  • This allows us to evaluate how well the implicit
    feedback system detected information needs from
    the perspective of the subject.

20
(No Transcript)
21
  • Evaluation
  • Experimental Subjects
  • Mainly undergraduate and postgraduate students at
    the University of Glasgow, divided into 2 groups
  • Experienced
  • Inexperienced
  • Experimental Tasks
  • Each subject was asked to complete one search
    task from each of 4 categories
  • The categories were
  • fact search (finding a person's mail address)
  • background search (finding information on dust
    allergies)
  • decision search (choosing the best financial
    instrument)
  • search for a number of items (finding contact
    details of a number of employees)
  • The search scenarios reflect real-life search
    situations and allow the subject to make personal
    assessments of what constitutes relevant
    material [5]

22
  • Roadmap
  • Introduction
  • Searcher Interaction
  • Binary Voting Model
  • Evaluation
  • Results Analysis
  • Conclusions

23
  • Results Analysis
  • Hypotheses tested
  • The terms selected for implicit feedback
    represent the information needs of the subject
    (i.e., term selection support)
  • The implicit feedback approach estimates changes
    in the subject's information need
  • The implicit feedback approach makes search
    decisions that correspond closely with those of
    the subject

24
  • Results Analysis
  • Hypothesis 1: Information need detection
  • We measure the degree of term overlap using the
    baseline system.
  • The BVM runs in the background, invisible to the
    subject and not involved directly in any query
    modification decisions.
  • High values of term overlap suggest that the
    terms chosen by the BVM are of good value and
    match the subject's own impression of the
    information need

25
  • Results Analysis
  • Hypothesis 1: Information need detection
  • Shows the average percentage of occasions where
    the top 6 terms chosen by the BVM also included
    at least one of the subject's terms
  • The difference between inexperienced and
    experienced subjects was not significant.
  • Term overlap for experienced subjects was
    generally higher than that for inexperienced
    subjects.

26
  • Results Analysis
  • Hypothesis 1: Information need detection
  • Shows the average number of query iterations and
    the average query length
  • An iteration is the use of a query for any
    action: reordering the TRS, reordering the
    documents, or re-searching the Web
  • Average query length is the number of terms in
    the new query that were not in the original query

27
  • Results Analysis
  • Hypothesis 1: Information need detection
  • Shows the average frequency of query manipulation
    for each subject performing different types of
    search
  • Query manipulation = adding and/or removing terms
  • Subjects added terms to queries more often for
    decision and background searches than for fact
    searches and searches for a number of items.
  • Implicit feedback is better suited to decision
    searches than to fact searches

28
  • Results Analysis
  • Hypothesis 2: Information need tracking
  • Shows the average number of actions carried out
    on each system across all search tasks
  • Differences appear in the no. of times the TRS
    were reordered
  • Experienced subjects make more use of unfamiliar
    actions
  • Both groups reordered the list of TRS more, and
    reordered the documents less frequently, than
    the implicit feedback system did
  • Reordering of sentences/documents allows the
    system to reshape the information space

29
  • Results Analysis
  • Hypothesis 2: Information need tracking
  • Shows the proportion of each type of action that
    was undone
  • A reversal indicates dissatisfaction with the
    outcome of the action or the terms suggested
  • Subjects responded well to the search strategies
    employed on their behalf
  • Inexperienced subjects disliked the effects of
    the TRS reordering
  • Experienced subjects liked the TRS re-ranking,
    but reversed the re-searching operation more
    often

30
  • Results Analysis
  • Hypothesis 3: Relevance paths
  • Subjects were asked to rate the worth of
    following a relevance path from one document
    representation to another.
  • The relevance paths were significantly more
    helpful, beneficial, appropriate and useful to
    experienced subjects than to inexperienced
    subjects
  • The distance travelled along the relevance path
    was a good indicator of the relevance of the
    information in that path

31
  • Results Analysis
  • Hypothesis 3: Relevance paths
  • Shows the most common path taken, the average
    number of steps followed, the average number of
    complete and partial paths, etc.
  • Subjects used relevance paths consistently,
    although experienced subjects followed the paths
    further
  • Experienced subjects interacted more with the
    retrieved documents and more frequently used the
    document representations for viewing the
    full-text of a document

32
  • Roadmap
  • Introduction
  • Searcher Interaction
  • Binary Voting Model
  • Evaluation
  • Results Analysis
  • Conclusions

33
  • Conclusions
  • The interface uses query-relevant document
    representations to facilitate access to
    potentially useful information and allow
    searchers to closely examine results.
  • This form of implicit feedback is at the extreme
    end of a spectrum of searcher support. Such
    models may be best used to make decisions in
    conjunction with, not in place of, the searcher.
  • The approach has the potential to alleviate some
    of the problems inherent in explicit relevance
    feedback, while preserving many of its benefits.
  • The success of the approach bodes well for the
    construction of effective implicit RF systems
    that will work in concert with the searcher.

34
References
  1. Salton, G., & Buckley, C. (1990). Improving
     retrieval performance by relevance feedback.
     Journal of the American Society for Information
     Science, 41(4), 288-297.
  2. Morita, M., & Shinoda, Y. (1994). Information
     filtering based on user behavior analysis and
     best match text retrieval. In Proceedings of the
     17th annual ACM SIGIR conference on research and
     development in information retrieval (pp.
     272-281).
  3. Lieberman, H. (1995). Letizia: An agent that
     assists web browsing. In Proceedings of the 14th
     international joint conference on artificial
     intelligence (pp. 475-480).
  4. Joachims, T., Freitag, D., & Mitchell, T. (1997).
     WebWatcher: A tour guide for the World Wide Web.
     In Proceedings of the 15th international joint
     conference on artificial intelligence (pp.
     770-775).
  5. Ingwersen, P. (1992). Information retrieval
     interaction. London: Taylor Graham.

35
Questions
36
Thank You!