1
RELEVANCE? in information science
  • Tefko Saracevic, Ph.D.
  • tefkos@rutgers.edu

2
Two worlds in information science
  • IR systems offer as answers their version of
    what may be relevant
  • by ever-improving algorithms
  • People go their own way and assess relevance
  • by their problem-at-hand, context, and criteria
  • The two worlds interact

Covered here: the human world of relevance. NOT
covered: how IR deals with relevance.
3
Relevance interaction
URLs, references, and inspirations are in Notes
4
Our work is to understand a person's real-time
goal and match it with relevant information.
5
Definitions
Merriam-Webster Dictionary Online:
1 a : relation to the matter at hand; b :
practical and especially social applicability :
pertinence <giving relevance to college courses>
2 : the ability (as of an information retrieval
system) to retrieve material that satisfies the
needs of the user
6
Relevance by any other name...
  • Many names, e.g.:
  • pertinent, useful, applicable, significant,
    germane, material, bearing, proper, related,
    important, fitting, suited, apropos ...
    nowadays even truthful ...
  • Connotations may differ, but the concept is still
    relevance
  • "A rose by any other name would smell as sweet"
    (Shakespeare, Romeo and Juliet)

7
What is the matter at hand?
  • Context in relation to which
  • a problem is addressed
  • an information need is expressed
  • a question is asked
  • an interaction is conducted
  • There is no such thing as considering relevance
    without a context

Axiom: One cannot not have a context
in information interaction.
8
Context, information seeking, intent
from Latin contextus, "a joining together";
contexere, "to weave together"
  • Context: circumstance, setting
  • "The set of facts or circumstances that surround a
    situation or event; the historic context"
    (WordNet)
  • However, in information science (and in computer
    science as well):

"There is no term more often used, less often
defined and, when defined, defined so variously,
as context. Context has the potential to be
virtually anything that is not defined as the
phenomenon of interest." (Dervin, 1997)
9
Context, information seeking, intent
  • "Process in which humans purposefully engage in
    order to change their state of knowledge"
    (Marchionini, 1995)
  • "A conscious effort to acquire information in
    response to a need or gap in your knowledge"
    (Case, 2007)
  • "...fitting information in with what one already
    knows and extending this knowledge to create new
    perspectives" (Kuhlthau, 2004)

10
Information seeking concentrations
  • A purposeful process, all cognitive, to:
  • change a state of knowledge
  • respond to an information need or gap
  • fit information in with what one already knows
  • In seeking information, people seek to change the
    state of their knowledge
  • Critique: broader social, cultural, and
    environmental factors are not included

11
Context, information seeking, intent
  • Many information-seeking studies involved TASK as
    context, accomplishment of the task as intent
  • Distinguished as simple, difficult, complex ...
  • But there is more to a task than the task itself
  • timeline, stages of the task, changes over time

12
Two large questions
  • Why did relevance become a central notion of
    information science?
  • What did we learn about relevance through
    research in information science?

13
Why relevance?
  • A bit of history

14
It all started with ...
  • Vannevar Bush, article "As We May Think," 1945
  • Defined the problem as "... the massive task of
    making more accessible a bewildering store of
    knowledge."
  • the problem is still with us, and growing
  • Suggested a solution, a machine, Memex: "...
    association of ideas ... duplicate mental
    processes artificially."
  • A technological fix to the problem

Vannevar Bush, 1890-1974
15
Information Retrieval (IR): definition
  • The term "information retrieval" was coined and
    defined by Calvin Mooers, 1951
  • IR: "... intellectual aspects of description of
    information, ... and its specification for
    search ... and systems, technique, or
    machines ... to provide information useful to
    user"

Calvin Mooers, 1919-1994
16
Technological determinant
  • In IR the emphasis was not only on organization
    but even more on searching
  • the technology was suitable for searching
  • in the beginning, information organization was
    done by people, searching by machines
  • nowadays information organization is done mostly
    by machines (sometimes by humans as well);
    searching almost exclusively by machines

17
Some of the pioneers
  • Mortimer Taube, 1910-1965: at Documentation Inc.
    pioneered coordinate indexing; first to describe
    searching as Boolean algebra
  • Hans Peter Luhn, 1896-1964: at IBM pioneered many
    IR computer applications; first to describe
    searching using Venn diagrams

18
Searching and relevance
  • Searching became a key component of information
    retrieval
  • extensive theoretical and practical concern with
    searching
  • the technology is uniquely suitable for searching
  • And searching is about retrieval of relevant
    answers

Thus RELEVANCE emerged as a key notion
19
Why relevance?
  • Aboutness
  • a fundamental notion related to organization of
    information
  • relates to subject; in a broader sense, to
    epistemology
  • Relevance
  • a fundamental notion related to searching for
    information
  • relates to the problem-at-hand and context; in a
    broader sense, to pragmatism

Relevance emerged as a central notion in
information science because of practical and
theoretical concerns with searching
20
What have we learned about relevance?
  • Relevance research

21
Claims and counterclaims in IR
  • Historically, from the outset: "My system is
    better than your system!"
  • Well, which one is it? Let's test it. But:
  • what criterion to use?
  • what measures based on the criterion?
  • Things got settled by the end of the 1950s and
    remain mostly the same to this day

22
Relevance and IR testing
  • In 1955 Allen Kent and James W. Perry were the
    first to propose two measures for testing IR
    systems
  • "relevance" (later renamed "precision") and
    "recall"
  • A scientific and engineering approach to testing

Allen Kent, 1921-
James W. Perry, 1907-1971
23
Relevance as criterion for measures
  • Recall: the probability that what is relevant in
    a file is retrieved
  • conversely: how much relevant stuff is missed?
  • Precision: the probability that what is retrieved
    is relevant
  • conversely: how much junk is retrieved?
  • Each is a probability of agreement between what
    the system retrieved/did not retrieve as relevant
    (system relevance) and what the user assessed as
    relevant (user relevance), where user relevance
    is the gold standard for comparison; set-based
    formulations follow below

24
First test: the law of unintended consequences
  • Mid-1950s: a test of two competing systems
  • subject headings, by the Armed Services Technical
    Information Agency
  • uniterms (keywords), by Documentation Inc.
  • 15,000 documents indexed by each group, 98
    questions searched
  • but relevance was judged by each group separately

Results
  • First group: 2,200 relevant
  • Second group: 1,998 relevant
  • but low agreement
  • Then "peace talks"
  • but even after them, agreement came to only 30.9%
  • The test collapsed on relevance disagreements

Lesson learned: never, ever use more than a single
judge per query. Since then, to this day, IR tests
don't. (One plausible agreement measure is sketched
below.)
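The slides do not say how the 30.9% agreement figure
was computed; intersection over union of the two
groups' relevant sets is one conventional overlap
measure. A minimal sketch under that assumption, with
invented document IDs (the real judgment sets are not
given):

    # Hypothetical sketch: overlap agreement between two judging groups,
    # measured as intersection over union of their "relevant" sets.
    # Document IDs are invented for illustration.
    def overlap_agreement(relevant_a: set, relevant_b: set) -> float:
        union = relevant_a | relevant_b
        return len(relevant_a & relevant_b) / len(union) if union else 1.0

    group_1 = {"d01", "d02", "d03", "d04"}  # judged relevant by group 1
    group_2 = {"d02", "d03", "d05"}         # judged relevant by group 2
    print(overlap_agreement(group_1, group_2))  # 2 shared / 5 total = 0.4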
25
Cranfield tests, 1957-1967
Cyril Cleverdon, 1914-1997
  • Funded by NSF
  • Controlled testing: different indexing languages,
    same documents, same relevance judgments
  • Used the traditional, non-interactive IR model
  • Many results, some surprising
  • e.g., simple keywords ranked high on many counts
  • Developed the Cranfield methodology for testing
  • Still in use today, including in TREC

TREC started in 1992, still strong in 2014
26
Tradeoff in recall vs. precision
Cleverdon's law
  • Generally, there is a tradeoff:
  • recall can be increased by retrieving more, but
    precision decreases
  • precision can be increased by being more
    specific, but recall decreases
  • Some users want high precision, others high
    recall
  • Example from TREC (a numeric sketch follows)
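The TREC chart itself is not reproduced in this
transcript, but the tradeoff shows up on any single
ranked list: deepening the retrieval cutoff can only
raise recall, while precision tends to fall. A minimal
sketch with an invented ranking (1 = relevant, 0 = not):

    # Precision and recall at increasing cutoffs over one ranked result list.
    # The ranking and relevance labels are invented for illustration.
    ranked_labels = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]
    total_relevant = sum(ranked_labels)

    for k in (1, 3, 5, 10):
        hits = sum(ranked_labels[:k])
        print(f"k={k:2d}  precision={hits / k:.2f}  "
              f"recall={hits / total_relevant:.2f}")
    # Recall climbs from 0.25 to 1.00 as k grows;
    # precision falls from 1.00 to 0.40.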

27
Assumptions in Cranfield methodology
  • IR, and thus relevance, is static (traditional IR
    model)
  • Relevance is:
  • topical
  • binary
  • independent
  • stable
  • consistent
  • if pooling, complete
  • Inspired relevance experimentation on every one
    of these assumptions
  • Main finding: none of them holds

but the simplified assumptions enabled rich IR
tests and many developments
28
IR relevance: static vs. dynamic
  • Q: Do relevance inferences and criteria change
    over time for the same user and task? A: They do.
  • For a given task, users' inferences depend on the
    stage of the task. Different stages bring
    differing selections, but similar criteria with
    different weights. Increased focus brings
    increased discrimination and more stringent
    relevance inferences.

IR relevance inferences are highly dynamic
processes
29
Experimental results
  • Topical: Topicality plays a very important but
    not exclusive role. Cognitive, situational, and
    affective variables also play a role, e.g., user
    background (cognitive), task complexity
    (situational), intent and motivation (affective).
  • Binary: Continuum: users judge on a continuum,
    comparatively, not only binary (relevant / not
    relevant). Bi-modality: assessments seem to have
    high peaks at the end points of the range (not
    relevant, relevant), with smaller peaks in the
    middle range.
  • Independent: The order in which documents are
    presented to users seems to have an effect. Near
    the beginning: documents presented early seem to
    have a higher probability of being inferred as
    relevant.
30
Experimental results (cont.)
  • Stable: Time: relevance judgments are not
    completely stable; they change over time as tasks
    progress and learning advances. Criteria for
    judging relevance are fairly stable.
  • Consistent: Expertise: higher expertise brings
    higher agreement and fewer differences; lower
    expertise brings lower agreement and more
    leniency. Individual differences are the most
    prominent factor in relevance inferences. Experts
    agree up to 80%, others around 30%. Number of
    judges: more judges, less agreement.
  • If pooling: Complete: if only a sample of the
    collection, or a pool from several searches, is
    evaluated, then with more pools or increased
    sampling more relevant objects are found. (A
    pooling sketch follows.)
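"If pooling, complete" refers to the TREC-style
shortcut of judging only the pool, i.e., the union of
each participating system's top-k results, with
unpooled documents assumed not relevant. A minimal
sketch with invented runs, showing why deeper pools
surface more candidates for judging:

    # Pooling sketch: only the union of each run's top-k results gets judged.
    # Runs and cutoffs are invented; real TREC pools use many runs and larger k.
    runs = {
        "system_a": ["d1", "d2", "d3", "d4", "d5"],
        "system_b": ["d3", "d6", "d1", "d7", "d8"],
        "system_c": ["d9", "d2", "d6", "d10", "d4"],
    }

    def pool(runs: dict, k: int) -> set:
        """Union of the top-k documents from every run."""
        return {doc for ranking in runs.values() for doc in ranking[:k]}

    print(sorted(pool(runs, 2)))  # shallow pool: 5 distinct documents judged
    print(sorted(pool(runs, 5)))  # deeper pool: all 10 candidates judged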
31
Clues: on what basis (criteria) do users make
relevance judgments?
  • Content: topic, quality, depth, scope, currency,
    treatment, clarity
  • Object: characteristics of information objects,
    e.g., type, organization, representation, format,
    availability, accessibility, costs
  • Validity: accuracy of information provided,
    authority, trustworthiness of sources,
    verifiability
32
Clues (cont.): matching with users
  • Use or situational match: appropriateness to
    situation or tasks, usability, urgency, value in
    use
  • Cognitive match: understanding, novelty, mental
    effort
  • Affective match: emotional responses to
    information, fun, frustration, uncertainty
  • Belief match: personal credence given to
    information, confidence
33
Summary of relevance experiments
  • First experiment reported in 1961
  • compared effects of various representations
    (titles, abstracts, full text)
  • Over the years, about 300 experiments
  • Little funding
  • only two funded by a US agency (1967)

Most important general finding: relevance is
measurable
34
In conclusion
  • Information technology systems will change
    dramatically
  • even in the short run
  • and in unforeseeable directions
  • But relevance is here to stay!

and relevance has many faces, some unusual
35
Innovation ... as well ... not all are digital
36
and here is its use
37
Unusual services: library therapy dogs
U Michigan, Ann Arbor, Shapiro Library
38
Presentation in Wordle
39
Thank you
Gracias
Merci
Hvala
Obrigado
Thank you for inviting me!
Grazie