Unobtrusively Tracking Information Needs: Implicit Solutions to Explicit Problems

1
Unobtrusively Tracking Information Needs:
Implicit Solutions to Explicit Problems

Ryen W. White
Information Retrieval Group
Department of Computing Science
University of Glasgow
ryen_at_dcs.gla.ac.uk

2
Me, me, me

Final year Ph.D. student in the Department of
Computing Science, University of Glasgow
Thank you for inviting me!
Main aim of visit to UW was to develop a plan for
the final user evaluation of my Ph.D.
Think I have done this!
I will describe a bit of my work so far

3
Aims

Searching can be problematic: help struggling
searchers find what they seek
Develop a means of better representing searcher
needs whilst minimising the burden of explicitly
reformulating queries or directly providing
relevance information
Use implicit (unobtrusive, hidden) monitoring of
interaction to generate an expanded query that
estimates the information need of the searcher

4
Information Seeking Metaphor

Information seeking consists of three major
components
the user with their request for information
the document collection on which to apply their
request
the response of the IR system to this request
Interactive IR systems allow the user to conduct
search tasks dynamically and to react to system
responses over the course of a session

[Figure: the user, the query, the IR system, the corpus and the results]
5
The Information Need

The transformation of a user's information need
into a query is known as query formulation
One of the most challenging activities in
information seeking
The difficulty is amplified if the information need
is vague or knowledge of the collection is poor
IR systems assume that the query is a close
representation of the real information need; this
is often not true
Systems need a way of understanding relevance

6
Ostension

Searchers know what is relevant, but can have
problems choosing terms to express relevance
How would you describe red to a child?
Perhaps by using red things as examples
We can describe relevance to an IR system in a
similar way, by identifying which documents have
relevant attributes

7
Relevance Feedback
Relevance Feedback (RF) is an automatic, iterative
process designed to produce improved query
formulations following an initial retrieval
operation
RF typically expands a searcher's query

[Figure: documents in the corpus are marked relevant (1) or not relevant (0); the original query becomes a revised query that is moved closer to the relevant documents]
8
RF Procedure

1. User poses initial request
2. The retrieval system returns a list of
documents judged relevant
   i. Based on query and internal document
   representations and retrieval algorithms
   ii. Usually returns a list of titles and abstracts
3. User selects relevant documents from this set
4. Query is modified automatically and the search
process is repeated
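
The query modification in step 4 can be sketched as follows. This is a minimal illustration only: the slides do not say which update rule is used, so a Rocchio-style positive-feedback update is assumed here, and the function name and the parameters alpha, beta and top_k are hypothetical.

```python
from collections import Counter


def revise_query(query, relevant_docs, alpha=1.0, beta=0.75, top_k=3):
    """Expand a query with terms from documents the user marked relevant.

    query: list of query terms.
    relevant_docs: list of documents, each given as a list of terms.
    Assumption: Rocchio-style weighting; the slides name no formula.
    """
    # Original query terms start with weight alpha.
    weights = Counter({term: alpha for term in query})
    # Each relevant document adds its averaged term frequencies, scaled by beta.
    for doc in relevant_docs:
        for term, count in Counter(doc).items():
            weights[term] += beta * count / len(relevant_docs)
    # Rank by weight (ties broken alphabetically) and keep only new terms.
    ranked = sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))
    new_terms = [term for term, _ in ranked if term not in query][:top_k]
    return list(query) + new_terms


if __name__ == "__main__":
    marked = [["dust", "allergy", "symptoms"], ["allergy", "treatment"]]
    print(revise_query(["dust"], marked, top_k=2))
```

The revised query keeps the original terms and appends the strongest new ones, so the search can be repeated with it.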

10
Relevance Feedback

The initial query is enhanced to become more
attuned to the searcher's information need
through an iterative process of feedback
However, relevance feedback
relies on explicit relevance assessments
visiting documents to gauge relevance is a
demanding and time-consuming process
uses a binary notion of relevance; what about
partially relevant documents?
searchers may be unwilling/unable to provide
feedback

11
Implicit Feedback

Search system unobtrusively monitors search
behaviour
Removes the need for the searcher to explicitly
indicate which documents are relevant
Searchers no longer required to assess the
relevance of a number of documents
System makes inferences based on interaction and
selects terms that approximate searcher needs

12
Implicit Feedback Approach

Implicit Feedback traditionally uses surrogate
measures as evidence of searcher interests
Document reading time, scrolling, mouse clicks,
etc.
or forms of document retention
Printing, saving, bookmarking, etc.
Useful, but can be highly context-dependent and
vary greatly between users

13
Implicit Feedback Approach

Our approach assumes only that searchers will
view information that pertains to their
information needs
Whole documents can contain good and bad
expansion terms
use smaller representations of documents and
extract terms from these
reduce likelihood that erroneous terms will be
chosen for query expansion

14
Document Representations

For each document there are five different
representations
title, as created by the author
query-biased summary of the document
list of top-ranking sentences (TRS) from the top
30 documents, scored in relation to the query
each sentence is considered as a representation
for that document
sentence in the query-biased summary
sentence in the context it occurs in the document

16
Top-Ranking Sentences

For each document in the top thirty retrieved
Extract sentences from documents
Score sentences in relation to query
Choose top 4 sentences for summary
Pool all summaries from all docs, rank with score

[Figure: sentences 1 to 7 of a Web document, shown in document order and in query score order]
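
The extract, score, choose and pool steps on this slide can be sketched as below. The slides do not give the sentence-scoring function, so a simple count of distinct query terms per sentence is assumed; the function names and the regex-based sentence splitter are illustrative only.

```python
import re


def score_sentence(sentence, query_terms):
    """Assumed score: number of distinct query terms in the sentence."""
    words = set(re.findall(r"[a-z0-9]+", sentence.lower()))
    return len(words & query_terms)


def top_ranking_sentences(documents, query, per_doc=4):
    """Return (score, doc_id, sentence) tuples pooled over all documents.

    documents: dict mapping a document id to its full text.
    per_doc: how many sentences each document contributes (4 on the slide).
    """
    query_terms = set(query.lower().split())
    pooled = []
    for doc_id, text in documents.items():
        # Naive sentence splitting on end-of-sentence punctuation.
        sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
        scored = [(score_sentence(s, query_terms), doc_id, s) for s in sentences]
        # sorted() is stable, so equally scored sentences keep document order.
        pooled.extend(sorted(scored, key=lambda item: -item[0])[:per_doc])
    # Rank the pooled sentences from all documents by score.
    return sorted(pooled, key=lambda item: -item[0])


if __name__ == "__main__":
    docs = {
        "d1": "Dust causes allergy. Cats are nice. Dust allergy treatment helps.",
        "d2": "Weather today. Allergy season.",
    }
    for score, doc_id, sentence in top_ranking_sentences(docs, "dust allergy", per_doc=2):
        print(score, doc_id, sentence)
```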
17
Relevance Path

Searcher can view titles and access full-texts as
in standard Web search interfaces
Through their interaction searchers have control
over which representations they view
Distance travelled along a path can provide
information on the relevance of terms used in
path representations

[Figure: relevance path linking the representations TRS, Title, Summary, Summary Sentence, Sentence in Context and Doc]
18
Binary Voting Model (BVM)

We choose terms to better represent information
needs from representations viewed by searcher
Each representation votes for the terms it
contains
All terms are candidates in the voting process
and these votes accumulate across all viewed
representations
Useful terms will be those contained in many of
the representations the user chooses to view

19
Indicative worth

Document representations can vary in length and
can hence be regarded as being more or less
indicative of document content
e.g. a top-ranking sentence is less indicative
than a query-biased summary (typically 4
sentences)
it contains less information about the content of
the document
We weight the contribution of a
representation's vote based on the indicative
worth (typical length) of the representation

20
Implementing the BVM

Documents are represented by a vector containing
all unique non-stemmed, non-stopword terms in the
top 30 web documents
The list is the vocabulary

[Figure: document-term matrix for a vocabulary of four terms. Each document D1, D2, ..., Dn has a separate row in the matrix. Each cell holds a term weight w(.), which is increased along the relevance path by the representation weight (based on indicativity)]
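
A minimal sketch of building the vocabulary and the per-document rows described on this slide. The stopword list and function names are placeholders, not taken from the slides, and tokenisation by a simple regex is an assumption.

```python
import re

# Illustrative stopword list only; a real system would use a fuller one.
STOPWORDS = {"a", "an", "and", "in", "is", "of", "the", "to"}


def build_vocabulary(documents):
    """Collect all unique, non-stemmed, non-stopword terms from the documents."""
    vocabulary = set()
    for text in documents:
        for term in re.findall(r"[a-z0-9]+", text.lower()):
            if term not in STOPWORDS:
                vocabulary.add(term)
    return sorted(vocabulary)


def empty_weight_matrix(documents, vocabulary):
    """One row per document, one zero-initialised weight per vocabulary term."""
    return [[0.0] * len(vocabulary) for _ in documents]


if __name__ == "__main__":
    top_docs = ["The dust in the office", "Dust allergy symptoms"]
    vocab = build_vocabulary(top_docs)
    print(vocab)
    print(empty_weight_matrix(top_docs, vocab))
```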
21
Creating the new query
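[Figure: a relevance path through a TRS, a Title and a Summary; each viewed representation adds its indicativity to the weight w(.) of every term it contains]
TRS indicativity 0.2, Title indicativity 0.1, Summary indicativity 0.3
w(term 1) = .2 + .1 + .3 = .6
w(term 2) = .2 + .3 = .5
w(term 3) = .2 + .3 = .5
w(term 4) = .1 + .3 = .4
Take average w(.) across 10 paths and use
top-scoring terms to expand query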
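
The voting along one relevance path can be sketched as follows, using the three indicativity values given on this slide (TRS 0.2, Title 0.1, Summary 0.3). The term names t1 to t4, the function names and the top_k parameter are hypothetical; weights for the other representations are not given in the slides and are left out.

```python
from collections import defaultdict

# Indicativity values as stated on the slide.
INDICATIVITY = {"trs": 0.2, "title": 0.1, "summary": 0.3}


def path_weights(path):
    """Sum the votes along one relevance path.

    path: list of (representation_type, terms) pairs in viewing order.
    Each representation votes once for every distinct term it contains,
    and the vote is weighted by that representation's indicativity.
    """
    weights = defaultdict(float)
    for rep_type, terms in path:
        for term in set(terms):
            weights[term] += INDICATIVITY[rep_type]
    return dict(weights)


def expansion_terms(paths, top_k=2):
    """Average term weights across paths and return the top-scoring terms."""
    totals = defaultdict(float)
    for path in paths:
        for term, weight in path_weights(path).items():
            totals[term] += weight
    averages = {term: total / len(paths) for term, total in totals.items()}
    return sorted(averages, key=lambda t: (-averages[t], t))[:top_k]


if __name__ == "__main__":
    path = [
        ("trs", ["t1", "t2", "t3"]),
        ("title", ["t1", "t4"]),
        ("summary", ["t1", "t2", "t3", "t4"]),
    ]
    print({t: round(w, 1) for t, w in path_weights(path).items()})
    print(expansion_terms([path]))
```

With this path the rounded weights match the slide's worked values of .6, .5, .5 and .4.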
22
Using the expanded query

Traditional relevance feedback systems require
searcher to control relevance feedback
Instruct system to perform query modification and
produce a new set of documents
May not always be appropriate
Information needs are dynamic and can develop in
a dramatic or gradual manner
Gradual changes → generation of a new result set
is perhaps too severe
Revisions that reflect the degree of development
perhaps more suitable

23
Different changes, different actions

Use the evidence gathered to track potential
changes in information need and tailor the
results presentation to suit degree of change
Large changes → new searches
Small changes → less radical operations
Reordering the list of documents or reordering
the top-ranking sentences

24
Changing Needs

We detect changes in terms suggested by the
system for query expansion and based on the
degree of change we decide how to use the new
query
The weights of all terms in the vocabulary change
Vocabulary is static: terms in the list will not
change, but their weights and order will

25
Spearman rank-order correlation

Tests for degree of similarity between two lists
of rankings
Non-parametric, ranks not scores used

[Figure: information viewed by the searcher changes the order of the term list; the original order is compared with the order after 10 relevance paths]
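
A minimal sketch of the Spearman rank-order coefficient applied to two orderings of the same term list. It uses the standard no-ties formula rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), where d is the difference in a term's rank between the two lists; how the original system handles tied weights is not stated in the slides, and the function name is hypothetical.

```python
def spearman_rho(order_before, order_after):
    """Spearman rank-order correlation between two orderings of the same terms.

    Both arguments list the same unique terms, best-ranked first.
    Assumes no tied ranks.
    """
    if len(order_before) < 2 or len(set(order_before)) != len(order_before):
        raise ValueError("need at least two unique terms")
    if set(order_before) != set(order_after) or len(order_before) != len(order_after):
        raise ValueError("both lists must contain the same terms")
    n = len(order_before)
    rank_after = {term: rank for rank, term in enumerate(order_after)}
    # Sum of squared rank differences for each term.
    d_squared = sum(
        (rank - rank_after[term]) ** 2 for rank, term in enumerate(order_before)
    )
    return 1 - 6 * d_squared / (n * (n * n - 1))


if __name__ == "__main__":
    original = ["dust", "allergy", "office", "symptoms"]
    print(spearman_rho(original, original))
    print(spearman_rho(original, list(reversed(original))))
```

Only the ranks enter the calculation, not the underlying term weights, which is what makes the test non-parametric.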
26
Choosing the action

We have a coefficient in the range -1 to 1, where
a result closer to -1 means the term lists are
dissimilar with respect to their rank ordering
As the coefficient gets closer to 1, the lists
become more similar, and the change in
information need is assumed to be smaller

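Use the Spearman rank-order correlation coefficient
to predict the extent of change

[Figure: coefficient scale from -1 to 1 with marks at 0, .2, .5 and .8; as the order of terms changes more, the action moves from no action, to re-order TRS, to re-order documents, to re-search]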
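
The mapping from coefficient to action can be sketched as below. The cut points .2, .5 and .8 appear on the slide's scale, but assigning each band to an action is an assumption read off the figure (more change in term order leads to a more radical action); the function name is hypothetical.

```python
def choose_action(rho):
    """Pick a results-presentation action from a Spearman coefficient.

    Assumed bands: below .2 re-search, .2 to .5 re-order documents,
    .5 to .8 re-order top-ranking sentences (TRS), .8 and above no action.
    """
    if not -1.0 <= rho <= 1.0:
        raise ValueError("coefficient must lie between -1 and 1")
    if rho < 0.2:
        return "re-search"
    if rho < 0.5:
        return "re-order documents"
    if rho < 0.8:
        return "re-order TRS"
    return "no action"


if __name__ == "__main__":
    for rho in (-0.3, 0.3, 0.6, 0.9):
        print(rho, choose_action(rho))
```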
27
Take stock

We have an approach for
Detecting information needs
Through monitoring the information viewed by the
searcher
Tracking information needs
Through the differences in information viewed
over time
We evaluate the success of the approach from the
perspective of the searcher

28
Pilot Study

Assess how well our approach detects information
needs and tracks changes in these needs
Compared it against a baseline system that placed
responsibility for query reformulation and action
on the searcher
Did not compare it with a traditional relevance
feedback system
Test how well it detects/tracks information needs
before claiming it can improve on relevance feedback

29
Hypotheses

Hypothesis 1
Terms selected by the system relate to the
information need of the searcher
Hypothesis 2
System successfully perceives developments in the
information need and acts appropriately

30
Subjects

24 subjects
Inexperienced and experienced searchers
12 in each group
Differences in internet/computer usage and search
experience
Inexperienced users: 3.1 hours online per week
Experienced users: 34.9 hours online per week
Average age 26 yrs (max 54, min 16)

31
Tasks

Searchers asked to complete one task from each of
four categories
Fact search
finding a named person's email address
Decision search
choosing the best financial instrument
Background search
finding information on dust allergies
Search for a number of items
finding contact details of some potential
employers

32
Example Task (Background search)

Simulated work task situation: Imagine you work
in an old building and one of your colleagues has
developed a severe dust allergy that you believe
is caused by his working environment. He is
writing a letter to complain about the lack of
cleanliness in his work environment and has asked
you to help him find information about dust
allergies.

33
Baseline System

Searcher responsible for selecting expansion
terms and the action
Increased control, but also increased
responsibility
Term/strategy control panel added to interface
Allowed us to evaluate the worth of the implicit
feedback system, not a strict baseline!

[Figure: control panel with term selection and action selection]
34
Methodology

Presentation of tasks to subjects was held
constant: each subject performed the tasks in the
same order (factorial design)
10 minutes for each task
Background logging was used to record user
interaction

35
Methodology

1. Short tutorial and training task
2. Collected background data on aspects such as
subjects' experience and training in online
searching
3. Introduced to tasks/systems
4. Attempted tasks, completed questionnaires
5. Final questionnaire
6. Informal interview

36
Brief Results

Searchers used the interface components
Relevance paths, top-ranking sentences, etc.
Implicit query expansion produced good terms that
searchers found useful
Need tracking was helpful and selected
appropriate actions
Downside: interface at times erratic, took away
searcher control

37
More results

White, R.W., Jose, J.M. and Ruthven, I. Adapting
to Evolving Needs: Evaluating a Behaviour-Based
Search Interface. Proceedings of the 17th Annual
HCI Conference, 2003.
White, R.W., Jose, J.M. and Ruthven, I.
Implicitly Tracking Information Needs.
Information Processing and Management, in
preparation.
White, R.W., Jose, J.M. and Ruthven, I. An
Approach for Implicitly Detecting Information
Needs. Proceedings of the 12th Annual CIKM
Conference, 2003.

38
Plans for the future