Unobtrusively Tracking Information Needs: Implicit Solutions to Explicit Problems


1
Unobtrusively Tracking Information
Needs: Implicit Solutions to Explicit
Problems
  • Ryen W. White
  • Information Retrieval Group
  • Department of Computing Science
  • University of Glasgow
  • ryen_at_dcs.gla.ac.uk

2
Me, me, me
  • Final year Ph.D. student in the Department of
    Computing Science, University of Glasgow
  • Thank you for inviting me!
  • Main aim of visit to UW was to develop a plan for
    the final user evaluation of my Ph.D.
  • Think I have done this!
  • I will describe a bit of my work so far

3
Aims
  • Searching can be problematic: help struggling
    searchers find what they seek
  • Develop a means of better representing searcher
    needs whilst minimising the burden of explicitly
    reformulating queries or directly providing
    relevance information
  • Use implicit (unobtrusive, hidden) monitoring of
    interaction to generate an expanded query that
    estimates the information need of the searcher

4
Information Seeking Metaphor
  • Information seeking consists of three major
    components:
  • the user with their request for information
  • the document collection on which to apply their
    request
  • the response of the IR system to this request
  • Interactive IR systems allow the user to conduct
    search tasks dynamically and to react to system
    responses over the course of a session

[Figure: information seeking loop linking user, query, IR system, corpus and results]
5
The Information Need
  • The transformation of a user's information need
    into a query is known as query formulation
  • One of the most challenging activities in
    information seeking
  • The difficulty is amplified if the information
    need is vague or the searcher's knowledge of the
    collection is poor
  • IR systems assume that the query is a close
    representation of the real information need;
    this is often not true
  • Systems need a way of understanding relevance

6
Ostension
  • Searchers know what is relevant, but can have
    problems choosing terms to express relevance
  • How would you describe red to a child?
  • Perhaps by using red things as examples
  • We can describe relevance to an IR system in a
    similar way, by identifying which documents have
    relevant attributes

7
Relevance Feedback
Relevance Feedback (RF) is an automatic iterative
process designed to produce improved query
formulations following an initial retrieval
operation. RF typically expands a
searcher's query.
[Figure: original query and revised query plotted against the corpus; the query is moved closer to relevant documents (1 = relevant, 0 = not relevant)]
8
RF Procedure
  • 1. User poses initial request
  • 2. The retrieval system returns a list of
    documents judged relevant
  • i. Based on query and internal document
    representations and retrieval algorithms
  • ii. Usually returns a list of titles and abstracts
  • 3. User selects relevant documents from this set
  • 4. Query is modified automatically and the search
    process is repeated
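
Step 4 can be sketched as follows. The slides do not say which query modification formula is used, so this sketch swaps in a Rocchio-style update (positive feedback only) as a stand-in. The function name `rocchio` and the `alpha` and `beta` values are assumptions made for illustration.

```python
from collections import Counter


def rocchio(query, relevant_docs, alpha=1.0, beta=0.75):
    """Move a term-weight query towards the centroid of the relevant documents.

    query and each document are dicts mapping term -> weight.
    alpha and beta are illustrative values, not taken from the slides.
    """
    new_query = Counter({term: alpha * weight for term, weight in query.items()})
    for doc in relevant_docs:
        for term, weight in doc.items():
            # Each relevant document contributes an equal share of beta.
            new_query[term] += beta * weight / len(relevant_docs)
    return dict(new_query)


original = {"dust": 1.0}
selected = [{"dust": 1.0, "allergy": 1.0}, {"allergy": 1.0, "mite": 1.0}]
revised = rocchio(original, selected)
```

Terms that appear only in the selected documents ("allergy", "mite") enter the revised query, which is the expansion behaviour described for RF.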

10
Relevance Feedback
  • The initial query is enhanced to become more
    attuned to the searcher's information need
    through an iterative process of feedback
  • However, RF:
  • relies on explicit relevance assessments
  • visiting documents to gauge relevance is a
    demanding and time-consuming process
  • uses a binary notion of relevance; what about
    partially relevant documents?
  • searchers may be unwilling/unable to provide
    feedback

11
Implicit Feedback
  • Search system unobtrusively monitors search
    behaviour
  • Removes the need for the searcher to explicitly
    indicate which documents are relevant
  • Searchers no longer required to assess the
    relevance of a number of documents
  • System makes inferences based on interaction and
    selects terms that approximate searcher needs

12
Implicit Feedback Approach
  • Implicit Feedback traditionally uses surrogate
    measures as evidence of searcher interests
  • Document reading time, scrolling, mouse clicks,
    etc.
  • or forms of document retention
  • Printing, saving, bookmarking, etc.
  • Useful, but can be highly context-dependent and
    vary greatly between users

13
Implicit Feedback Approach
  • Our approach assumes only that searchers will
    view information that pertains to their
    information needs
  • Whole documents can contain good and bad
    expansion terms
  • use smaller representations of documents and
    extract terms from these
  • reduce likelihood that erroneous terms will be
    chosen for query expansion

14
Document Representations
  • For each document there are five different
    representations
  • title, as created by the author
  • query-biased summary of the document
  • list of top-ranking sentences (TRS) from the top
    30 documents, scored in relation to the query
  • each sentence is considered as a representation
    for that document
  • sentence in the query-biased summary
  • sentence in the context it occurs in the document

16
Top-Ranking Sentences
  • for each document in the top thirty retrieved
  • pool all summaries from all docs, rank with score

[Figure: sentences of a Web document numbered in document order, then re-ranked into query score order]
Extract sentences from documents
Score sentences in relation to query
Choose top 4 sentences for summary
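
The extract, score, choose and pool steps can be sketched as below. The slides do not give the sentence scoring function, so a plain count of query-term occurrences is assumed here; the function names and the sentence splitter are also illustrative.

```python
import re


def score_sentence(sentence, query_terms):
    """Assumed score: number of word occurrences that are query terms."""
    words = re.findall(r"[a-z0-9]+", sentence.lower())
    return sum(1 for word in words if word in query_terms)


def summarise(document, query_terms, size=4):
    """Return the `size` best (score, sentence) pairs for one document."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()]
    scored = [(score_sentence(s, query_terms), s) for s in sentences]
    scored.sort(key=lambda pair: pair[0], reverse=True)  # stable sort keeps document order on ties
    return scored[:size]


def top_ranking_sentences(documents, query_terms, size=4):
    """Pool the summaries of all documents and rank the sentences by score."""
    pooled = []
    for document in documents:
        pooled.extend(summarise(document, query_terms, size))
    pooled.sort(key=lambda pair: pair[0], reverse=True)
    return pooled


docs = [
    "Dust causes allergy. Cats sleep. Dust allergy is common dust.",
    "Weather is nice. Allergy to dust mites.",
]
trs = top_ranking_sentences(docs, {"dust", "allergy"}, size=2)
```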
17
Relevance Path
  • Searcher can view titles and access full-texts as
    in standard Web search interfaces
  • Through their interaction searchers have control
    over which representations they view
  • Distance travelled along a path can provide
    information on the relevance of terms used in
    path representations

[Figure: relevance path linking the representations TRS, Title, Summary, Summary Sentence, Sentence in Context and Doc]
18
Binary Voting Model (BVM)
  • We choose terms to better represent information
    needs from the representations viewed by the
    searcher
  • Each representation votes for the terms it
    contains
  • All terms are candidates in the voting process
    and these votes accumulate across all viewed
    representations
  • Useful terms will be those contained in many of
    the representations the user chooses to view
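
A minimal sketch of the voting idea, assuming each viewed representation is available as plain text and that whitespace tokenisation is good enough. The function name `binary_votes` is hypothetical.

```python
from collections import Counter


def binary_votes(viewed_representations):
    """Each viewed representation casts one vote for every distinct term it contains."""
    votes = Counter()
    for text in viewed_representations:
        for term in set(text.lower().split()):  # set(): one vote per term per representation
            votes[term] += 1
    return votes


viewed = ["dust allergy symptoms", "dust mites", "treating dust allergy"]
votes = binary_votes(viewed)
```

Here "dust" collects a vote from all three representations, so it would rank above "allergy" (two votes) and the remaining terms (one vote each).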

19
Indicative worth
  • Document representations can vary in length and
    can hence be regarded as being more or less
    indicative of document content
  • e.g. a top-ranking sentence is less indicative
    than a query-biased summary (typically 4
    sentences)
  • it contains less information about the content
    of the document
  • We weight the contribution of each
    representation's vote based on the indicative
    worth (typical length) of the representation

20
Implementing the BVM
  • Documents are represented by a vector containing
    all unique non-stemmed, non-stopword terms in the
    top 30 web documents
  • The list is the vocabulary
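
Building the vocabulary might look like the sketch below. The stopword list and the tokenising regular expression are placeholders chosen for illustration; the slides do not specify either.

```python
import re

# Illustrative stopword list only; a real system would use a fuller one.
STOPWORDS = {"a", "an", "and", "in", "is", "of", "the", "to"}


def build_vocabulary(documents, top_n=30):
    """Collect the unique, unstemmed, non-stopword terms of the top_n documents."""
    vocabulary = set()
    for document in documents[:top_n]:
        for term in re.findall(r"[a-z0-9]+", document.lower()):
            if term not in STOPWORDS:
                vocabulary.add(term)
    return sorted(vocabulary)


vocab = build_vocabulary(["The dust of the house", "Dust and allergy"])
```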

[Figure: term-weight matrix built from relevance paths. Each document D1 ... Dn has a separate row, each of the four vocabulary terms t has a column, and each entry w(.) is a representation weight (based on indicativity)]
21
Creating the new query
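[Figure: worked example for one relevance path through TRS, Title and Summary, with four terms in the vocabulary]
TRS indicativity 0.2, Title indicativity 0.1, Summary indicativity 0.3
w(term 1) = .2 + .1 + .3 = .6
w(term 2) = .2 + .3 = .5
w(term 3) = .2 + .3 = .5
w(term 4) = .1 + .3 = .4
Take average w(.) across 10 paths and use
top-scoring terms to expand query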
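
The weighting and averaging can be sketched as follows, using the indicativity values from the slide. Which terms occur in which representation is inferred here from the sums on the slide, and the names `path_weights` and `expansion_terms` are hypothetical.

```python
INDICATIVITY = {"trs": 0.2, "title": 0.1, "summary": 0.3}  # values from the slide


def path_weights(path, vocabulary):
    """Sum, per term, the indicativity of every representation on the path that contains it."""
    weights = {term: 0.0 for term in vocabulary}
    for representation, terms in path:
        for term in set(terms):
            if term in weights:
                weights[term] += INDICATIVITY[representation]
    return weights


def expansion_terms(paths, vocabulary, k):
    """Average the weights across paths and return the k top-scoring terms."""
    totals = {term: 0.0 for term in vocabulary}
    for path in paths:
        for term, weight in path_weights(path, vocabulary).items():
            totals[term] += weight
    averages = {term: total / len(paths) for term, total in totals.items()}
    return sorted(averages, key=lambda term: (-averages[term], term))[:k]


vocabulary = ["t1", "t2", "t3", "t4"]
# Assumed term membership, chosen to reproduce the sums on the slide.
path = [
    ("trs", {"t1", "t2", "t3"}),
    ("title", {"t1", "t4"}),
    ("summary", {"t1", "t2", "t3", "t4"}),
]
weights = path_weights(path, vocabulary)
best = expansion_terms([path], vocabulary, k=2)
```

Ties between equally weighted terms are broken alphabetically in this sketch; the slides do not say how ties are handled.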
22
Using the expanded query
  • Traditional relevance feedback systems require
    searcher to control relevance feedback
  • Instruct system to perform query modification and
    produce a new set of documents
  • May not always be appropriate
  • Information needs are dynamic and can develop in
    a dramatic or gradual manner
  • Gradual changes → generation of a new result set
    is perhaps too severe
  • Revisions that reflect the degree of development
    perhaps more suitable

23
Different changes, different actions
  • Use the evidence gathered to track potential
    changes in information need and tailor the
    results presentation to suit degree of change
  • Large changes → new searches
  • Small changes → less radical operations
  • Reordering the list of documents or reordering
    the top-ranking sentences

24
Changing Needs
  • We detect changes in terms suggested by the
    system for query expansion and based on the
    degree of change we decide how to use the new
    query
  • The weights of all terms in the vocabulary change
  • The vocabulary is static: the terms in the list
    will not change, but their weights and order will

25
Spearman rank-order correlation
  • Tests for degree of similarity between two lists
    of rankings
  • Non-parametric: ranks, not scores, are used

[Figure: as the searcher views information, the order of the term list changes; the original order is compared with the order after 10 relevance paths]
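
For two orderings of the same static vocabulary with no tied ranks, the coefficient can be computed with the standard formula 1 - 6 * sum(d^2) / (n * (n^2 - 1)), where d is the difference in a term's rank between the two lists. The sketch below assumes tie-free lists; the function name `spearman` is illustrative.

```python
def spearman(order_a, order_b):
    """Spearman rank-order correlation between two orderings of the same terms (no ties)."""
    if len(set(order_a)) != len(order_a) or set(order_a) != set(order_b):
        raise ValueError("both lists must contain the same distinct terms")
    n = len(order_a)
    if n < 2:
        raise ValueError("need at least two terms")
    rank_b = {term: rank for rank, term in enumerate(order_b)}
    d_squared = sum((rank - rank_b[term]) ** 2 for rank, term in enumerate(order_a))
    return 1 - 6 * d_squared / (n * (n * n - 1))


before = ["dust", "allergy", "mite", "house"]
after = ["allergy", "dust", "mite", "house"]
rho = spearman(before, after)
```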
26
Choosing the action
  • We have a coefficient in the range -1 to 1, where
    a result closer to -1 means the term lists are
    dissimilar with respect to their rank ordering
  • As the coefficient gets closer to 1, the lists
    become more similar, and the change in
    information need is assumed to be smaller

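use Spearman rank order correlation coefficient
to predict extent of change
[Figure: coefficient scale from -1 to 1 with marks at 0, .2, .5 and .8; as the order of terms changes more, the action moves from no action through re-order TRS and re-order documents to re-search]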
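
One possible reading of the scale, sketched in code. The marks .2, .5 and .8 come from the slide, but which action belongs to which band is an assumption here, based on the idea that lower correlation means a larger change in need and so a more radical action.

```python
def choose_action(rho):
    """Map a Spearman coefficient to an interface action (band assignment is assumed)."""
    if not -1.0 <= rho <= 1.0:
        raise ValueError("coefficient must lie in [-1, 1]")
    if rho < 0.2:
        return "re-search"           # term order changed a lot: run a new search
    if rho < 0.5:
        return "re-order documents"
    if rho < 0.8:
        return "re-order TRS"
    return "no action"               # term order almost unchanged


actions = [choose_action(r) for r in (-0.4, 0.3, 0.6, 0.9)]
```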
27
Take stock
  • We have an approach for
  • Detecting information needs
  • Through monitoring the information viewed by the
    searcher
  • Tracking information needs
  • Through the differences in information viewed
    over time
  • We evaluate the success of the approach from the
    perspective of the searcher

28
Pilot Study
  • Assess how well our approach detects information
    needs and tracks changes in these needs
  • Compared it against a baseline system that placed
    responsibility for query reformulation and action
    on the searcher
  • Did not compare it with a traditional relevance
    feedback system
  • Test how well it detects/tracks information needs
    before claiming it can improve on relevance
    feedback

29
Hypotheses
  • Hypothesis 1
  • Terms selected by the system relate to the
    information need of the searcher
  • Hypothesis 2
  • System successfully perceives developments in the
    information need and acts appropriately

30
Subjects
  • 24 subjects
  • Inexperienced and experienced searchers
  • 12 in each group
  • Differences in internet/computer usage and search
    experience
  • Inexperienced users: 3.1 hours online per week
  • Experienced users: 34.9 hours online per week
  • Average age 26 yrs (max 54, min 16)

31
Tasks
  • Searchers asked to complete one task from each of
    four categories
  • Fact search
  • finding a named person's email address
  • Decision search
  • choosing the best financial instrument
  • Background search
  • finding information on dust allergies
  • Search for a number of items
  • finding contact details of some potential
    employers

32
Example Task (Background search)
  • Simulated work task situation: Imagine you work
    in an old building and one of your colleagues has
    developed a severe dust allergy that you believe
    is caused by his working environment. He is
    writing a letter to complain about the lack of
    cleanliness in his work environment and has asked
    you to help him find information about dust
    allergies.

33
Baseline System
  • Searcher responsible for selecting expansion
    terms and the action
  • Increased control, but also increased
    responsibility
  • Term/strategy control panel added to interface
  • Allowed us to evaluate the worth of the implicit
    feedback system, not strict baseline!

[Figure: term/strategy control panel with term selection and action selection]
34
Methodology
  • Presentation of tasks to subjects was held
    constant: each subject performed the tasks in the
    same order (factorial design)
  • 10 minutes for each task
  • Background logging was used to record user
    interaction


35
Methodology
  • 1. Short tutorial and training task
  • 2. Collected background data on aspects such as
    subjects' experience and training in online
    searching
  • 3. Introduced to tasks/systems
  • 4. Attempted tasks, completed questionnaires
  • 5. Final questionnaire
  • 6. Informal interview

36
Brief Results
  • Searchers used the interface components
  • Relevance paths, top-ranking sentences, etc.
  • Implicit query expansion produced good terms that
    searchers found useful
  • Need tracking was helpful and selected
    appropriate actions
  • Downside: interface at times erratic, took away
    searcher control

37
More results
  • White, R.W., Jose, J.M. and Ruthven, I. Adapting
    to Evolving Needs: Evaluating a Behaviour-Based
    Search Interface. Proceedings of the 17th Annual
    HCI Conference, 2003.
  • White, R.W., Jose, J.M. and Ruthven, I.
    Implicitly Tracking Information Needs.
    Information Processing and Management, in
    preparation.
  • White, R.W., Jose, J.M. and Ruthven, I. An
    Approach for Implicitly Detecting Information
    Needs. Proceedings of the 12th Annual CIKM
    Conference, 2003.

38
Plans for the future
  • Need to evaluate a similar interface
  • 36 subjects
  • Comparative evaluation between three systems
  • Implicit feedback system that recommends terms
    (IQE) and actions
  • Implicit feedback system that chooses terms and
    action
  • Explicit feedback system that gives searcher
    control over terms and action