RELEVANCE? in information science



RELEVANCE? in information science
  • Tefko Saracevic, Ph.D.

Two worlds in information science
  • IR systems offer as answers their version of
    what may be relevant
  • by ever improving algorithms
  • People go their own way and assess relevance
  • by their problem-at-hand and context criteria
  • The two worlds interact

Covered here: the human world of relevance. NOT
covered: how IR deals with relevance.
Relevance interaction
URLs, references, and inspirations are in Notes.
"Our work is to understand a person's real-time
goal and match it with relevant information."
Merriam-Webster Dictionary Online:
1 a: relation to the matter at hand; b: practical
and especially social applicability: pertinence
<giving relevance to college courses>
2: the ability (as of an information retrieval
system) to retrieve material that satisfies the
needs of the user
Relevance by any other name...
  • Many names, e.g.:
  • pertinent, useful, applicable, significant,
    germane, material, bearing, proper, related,
    important, fitting, suited, apropos ...
    nowadays even truthful ...
  • Connotations may differ, but the concept is still
    the same
  • "A rose by any other name would smell as sweet"
    (Shakespeare, Romeo and Juliet)

What is matter at hand?
  • Context in relation to which
  • a problem is addressed
  • an information need is expressed
  • a question is asked
  • an interaction is conducted
  • There is no such thing as considering relevance
    without a context

Axiom: One cannot not have a context
in information interaction.
context → information seeking → intent
from Latin contextus, "a joining together";
contexere, "to weave together"
  • Context: circumstance, setting
  • The set of facts or circumstances that surround a
    situation or event ("the historic context")
  • However, in information science (and computer
    science as well) ...

"There is no term more often used, less often
defined and, when defined, defined so variously,
as context. Context has the potential to be
virtually anything that is not defined as the
phenomenon of interest." (Dervin, 1997)
context → information seeking → intent
  • Process in which humans purposefully engage in
    order to change their state of knowledge
    (Marchionini, 1995)
  • A conscious effort to acquire information in
    response to a need or gap in your knowledge
    (Case, 2007)
  • ...fitting information in with what one already
    knows and extending this knowledge to create new
    perspectives (Kuhlthau, 2004)

Information seeking: concentrations
  • A purposeful process, all cognitive, to:
  • change the state of knowledge
  • respond to an information need or gap
  • fit information in with what one already knows
  • To seek information: people seek to change the
    state of their knowledge
  • Critique: broader social, cultural, environmental
    factors are not included

context → information seeking → intent
  • Many information seeking studies involved TASK as
    context, accomplishment of the task as intent
  • Distinguished as simple, difficult, complex
  • But there is more to a task than the task itself
  • timeline: stages of a task change over time

Two large questions
  • Why did relevance become a central notion of
    information science?
  • What did we learn about relevance through
    research in information science?

Why relevance?
  • A bit of history

It all started with
  • Vannevar Bush, article "As We May Think," 1945
  • Defined the problem as "... the massive task of
    making more accessible a bewildering store of
    knowledge"
  • the problem is still with us, and growing
  • Suggested a solution, a machine, Memex: "...
    association of ideas ... duplicate mental
    processes artificially."
  • A technological fix to the problem

Information Retrieval (IR): definition
  • Term "information retrieval" coined and defined by
    Calvin Mooers, 1951
  • IR "... intellectual aspects of description of
    information ... and its specification for
    search ... and systems, techniques, or machines ..."
    to provide information useful to a user

Technological determinant
  • In IR the emphasis was not only on organization but
    even more on searching
  • the technology was suitable for searching
  • in the beginning information organization was
    done by people searching by machines
  • nowadays information organization mostly by
    machines (sometimes by humans as well)
    searching almost exclusively by machines

Some of the pioneers
Mortimer Taube, 1910-1965
  • at Documentation Inc.; pioneered coordinate
    indexing
  • first to describe searching as Boolean algebra
Hans Peter Luhn, 1896-1964
  • at IBM; pioneered many IR computer applications
  • first to describe searching using Venn diagrams
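Searching a coordinate (uniterm) index with Boolean algebra, as Taube described it, maps directly onto set operations. A minimal sketch, with an invented three-term inverted index for illustration:

```python
# Inverted index: uniterm -> set of document IDs carrying that term.
# The terms and document IDs below are invented for illustration.
index = {
    "aerodynamics": {1, 2, 5},
    "wing":         {2, 3, 5},
    "turbine":      {3, 4},
}

# Boolean AND is set intersection, OR is union, NOT is difference
def AND(a, b): return index[a] & index[b]
def OR(a, b):  return index[a] | index[b]
def NOT(a, b): return index[a] - index[b]

print(AND("aerodynamics", "wing"))  # both terms: {2, 5}
print(OR("wing", "turbine"))        # either term: {2, 3, 4, 5}
print(NOT("wing", "turbine"))       # wing but not turbine: {2, 5}
```

The Venn-diagram picture Luhn used is exactly these three set operations drawn as overlapping circles.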

Searching → relevance
  • Searching became a key component of information
    systems
  • extensive theoretical and practical concern with
    searching
  • technology uniquely suitable for searching
  • And searching is about retrieval of relevant
    information
Thus RELEVANCE emerged as a key notion
Why relevance?
  • Aboutness: a fundamental notion related to
    organization of information; relates to subject,
    in a broader sense to semantics
  • Relevance: a fundamental notion related to
    searching for information; relates to
    problem-at-hand and context, in a broader sense to
    pragmatism

Relevance emerged as a central notion in
information science because of practical
theoretical concerns with searching
What have we learned about relevance?
  • Relevance research

Claims and counterclaims in IR
  • Historically, from the outset: "My system is
    better than your system!"
  • Well, which one is it? Let's test it. But
  • what criterion to use?
  • what measures based on the criterion?
  • Things got settled by the end of the 1950s and
    remain mostly the same to this day

Relevance in IR testing
  • In 1955, Allen Kent and James W. Perry were first to
    propose two measures for testing IR systems
  • "relevance," later renamed precision and recall
  • A scientific and engineering approach to testing

Allen Kent, 1921-
James W. Perry, 1907-1971
Relevance as criterion for measures
  • Recall: probability that what is relevant in a
    file is retrieved
  • conversely: how much relevant stuff is missed?
  • Precision: probability that what is retrieved is
    relevant
  • conversely: how much junk is retrieved?
  • Both are probabilities of agreement between what
    the system retrieved/did not retrieve as relevant
    (system relevance) and what the user assessed as
    relevant (user relevance), where user relevance is
    the gold standard for comparison
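The two measures can be stated as simple set ratios. A minimal sketch, with invented document IDs and the user's judgments taken as the gold standard:

```python
# Invented example: 8 documents judged relevant by the user,
# 5 documents returned by the system.
relevant  = {1, 2, 3, 4, 5, 6, 7, 8}   # user relevance (gold standard)
retrieved = {1, 2, 3, 9, 10}           # system relevance

hits = relevant & retrieved            # relevant AND retrieved

precision = len(hits) / len(retrieved) # P(a retrieved item is relevant)
recall    = len(hits) / len(relevant)  # P(a relevant item is retrieved)

print(f"precision = {precision:.2f}")  # 3/5 = 0.60 (2 junk items retrieved)
print(f"recall    = {recall:.2f}")     # 3/8 = 0.38 (5 relevant items missed)
```

The "conversely" questions on the slide are the complements: junk retrieved is 1 - precision, relevant stuff missed is 1 - recall.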

First test: law of unintended consequences
  • Mid-1950s: test of two competing systems
  • subject headings by Armed Services Technical
    Information Agency
  • uniterms (keywords) by Documentation Inc.
  • 15,000 documents indexed by each group, 98
    questions searched
  • but relevance judged by each group separately

  • First group: 2,200 relevant
  • Second: 1,998 relevant
  • but low agreement
  • Then "peace talks"
  • but even after, agreement came to only 30.9%
  • Test collapsed on relevance disagreements
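One common way to express such agreement between two sets of judgments is overlap: the size of the intersection over the size of the union. A minimal sketch with invented document IDs (the slide's actual figure, 30.9%, came from the real judgment sets):

```python
# Invented judgment sets matching the slide's counts:
# 2,200 documents judged relevant by group A, 1,998 by group B,
# with an arbitrary partial overlap chosen for illustration.
group_a = set(range(1, 2201))      # 2,200 relevant per group A
group_b = set(range(1500, 3498))   # 1,998 relevant per group B

# Agreement as overlap: |A ∩ B| / |A ∪ B|
overlap = len(group_a & group_b) / len(group_a | group_b)
print(f"agreement = {overlap:.1%}")
```

Even when each group retrieves a similar number of "relevant" documents, the agreement can be far lower than either count suggests, which is exactly what sank the test.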

Lesson learned: never, ever use more than a single
judge per query. Since then, to this day, IR tests don't.
Cranfield tests, 1957-1967
Cyril Cleverdon, 1914-1997
  • Funded by NSF
  • Controlled testing: different indexing languages,
    same documents, same relevance judgments
  • Used the traditional IR model (non-interactive)
  • Many results, some surprising
  • e.g. simple keywords ranked high on many counts
  • Developed the Cranfield methodology for testing
  • Still in use today, incl. in TREC
TREC started in 1992, still strong in 2014
Tradeoff in recall vs. precision:
Cleverdon's law
  • Generally, there is a tradeoff:
  • recall can be increased by retrieving more, but
    precision decreases
  • precision can be increased by being more specific,
    but recall decreases
  • Some users want high precision, others high recall
  • Example from TREC
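The tradeoff shows up on any single ranked result list: retrieving more (a deeper cutoff) raises recall but lowers precision. A minimal sketch with an invented ranking and relevant set:

```python
# Invented example: system's ranked output and user-judged relevant docs.
ranking  = [3, 1, 7, 2, 8, 4, 9, 5, 10, 6]  # best-first document IDs
relevant = {1, 2, 3}                        # user-judged relevant docs

for k in (3, 5, 10):                        # cutoff depth = "retrieve more"
    top_k = set(ranking[:k])
    hits = len(top_k & relevant)
    print(f"k={k:2d}  precision={hits/k:.2f}  recall={hits/len(relevant):.2f}")
```

Here precision falls as the cutoff deepens (0.67, 0.60, 0.30) while recall climbs to 1.00: Cleverdon's law in miniature. A precision-oriented user stops early; a recall-oriented user digs deep and accepts the junk.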

Assumptions in Cranfield methodology
  • IR, and thus relevance, is static (traditional IR
    model)
  • Relevance is
  • topical
  • binary
  • independent
  • stable
  • consistent
  • if pooling, complete
  • Inspired relevance experimentation on every one
    of these assumptions
  • Main finding: none of them holds

but simplified assumptions enabled rich IR tests
and many developments
IR relevance: static vs. dynamic
  • Q: Do relevance inferences and criteria change over
    time for the same user and task? A: They do.
  • For a given task, users' inferences are dependent
    on the stage of the task. Different stages:
    differing selections; but different stages:
    similar criteria with different weights. Increased
    focus: increased discrimination and more stringent
    relevance inferences

IR relevance inferences are highly dynamic
Experimental results
  • Topical → Topicality: very important, but not an
    exclusive role. Cognitive, situational, affective
    variables play a role, e.g. user background
    (cognitive), task complexity (situational),
    intent, motivation (affective)
  • Binary → Continuum: users judge on a continuum,
    comparatively, not only binary (relevant / not
    relevant). Bi-modality: assessments seem to have
    high peaks at the end points of the range (not
    relevant, relevant) with smaller peaks in the
    middle range
  • Independent → The order in which documents are
    presented to users seems to have an effect. Near
    beginning: documents presented early seem to have
    a higher probability of being inferred as relevant
Experimental results (cont.)
  • Stable → Time: relevance judgments are not
    completely stable; they change over time as tasks
    progress and learning advances. Criteria for
    judging relevance are fairly stable
  • Consistent → Expertise: higher → higher agreement,
    fewer differences; lower → lower agreement, more
    leniency. Individual differences are the most
    prominent feature/factor in relevance inferences.
    Experts agree up to 80%, others around 30%.
    Number of judges: more judges, less agreement
  • If pooling → Complete: (if only a sample of the
    collection, or a pool from several searches, is
    evaluated) with more pools or increased sampling,
    more relevant objects are found
Clues: on what basis (criteria) do users make
relevance judgments?
Content: topic, quality, depth, scope, currency,
treatment, clarity
Object: characteristics of information objects, e.g.,
type, organization, representation, format,
availability, accessibility, costs
Validity: accuracy of information provided, authority,
trustworthiness of sources, verifiability
Clues (cont.): matching users
Use or situational match: appropriateness to situation
or tasks, usability, urgency, value in use
Cognitive match: understanding, novelty, mental effort
Affective match: emotional responses to information,
fun, frustration, uncertainty
Belief match: personal credence given to information,
confidence
Summary of relevance experiments
  • First experiment reported in 1961
  • compared effects of various representations
    (titles, abstracts, full text)
  • Over the years about 300 or so experiments
  • Little funding
  • only two funded by a US agency (1967)

Most important general finding: relevance is ...
In conclusion
  • Information technology systems will change
  • even in the short run
  • and in unforeseeable directions
  • But relevance is here to stay!

and relevance has many faces, some unusual
Innovation ... as well ... not all are digital
and here is its use
Unusual services: library therapy dogs
(U Michigan, Ann Arbor, Shapiro Library)
Presentation in Wordle
Thank you
Thank you for inviting me!