Relevance II: Document Representations and Clues to Document Relevance

1
Relevance II: Document Representations and Clues
to Document Relevance
  • Article by Carol L. Barry
  • Class discussion led by Thomas Stipanowich
  • October 21, 2004

2
Document Representation
  • Barry was concerned with how users use document
    representations to judge a document's
    relevance. She wished to discover which
    representations and traits best help to predict
    the usefulness of documents.
  • The representations she focused on were abstracts,
    titles, document/source traits, and indexing terms.

3
Previous Research
  • Kent (1967) interviewed 70 users of intermediate
    response products (IRPs) who were engaged in
    ongoing research studies. While conducting literature
    searches, each user saw one of five IRPs:
  • 1. the full bibliographic citation
  • 2. the abstract, including citation
  • 3. the first paragraph of the document,
    including citation
  • 4. the last paragraph of the document, including
    citation
  • 5. both the first and last paragraphs, including
    citation
  • Based on yes-or-no responses as to whether the
    information was useful, Kent found that
    the first and last paragraphs together could
    establish relevancy over 81% of the time
    (compared to 68.7% for citations and 73.2% for
    the abstract).

4
Previous Research
  • Saracevic (1969, 1971) examined 99 questions
    given to 22 users concerning the relevancy of
    titles and abstracts compared to the full text.
    The title included the bibliographic citation,
    and the abstract included the title and the
    bibliographic citation.
  • Saracevic found that titles alone provided
    85% accuracy in selecting documents, while
    including abstracts raised the accuracy to 90%.
    He concluded that abstracts served users better
    than titles in selecting articles.

5
Previous Research
  • Marcus, Kugel, and Benenfeld (1978) examined
    four document representations: titles,
    abstracts, all indexing terms assigned to a
    document, and the indexing terms assigned to a
    document that matched terms used in the original
    search. Users were shown the four
    representations in random order and asked to
    decide on each article's relevancy.
  • The authors found that abstracts were the most
    useful, 73% of the time, and proposed a
    length hypothesis: the usefulness of a
    document representation correlates with its length.
    This held true in all recorded instances except
    that abstracts still proved more useful than
    indexing terms even when they contained fewer words.

6
Previous Research
  • Janes (1991) examined how the incremental
    presentation of document representations affects
    users, splitting the study group into four
    groups that each received the information in a
    different order. The groups were:
  • 1. title, abstract, bibliographic citation
  • 2. title, abstract, indexing terms
  • 3. title, bibliographic citation, abstract
  • 4. title, indexing terms, abstract
  • Janes discovered, through a motion index filled
    out by users, that abstracts were the most useful,
    followed by titles, then, after a considerable drop,
    bibliographic citations, and then indexing terms.

7
Research
  • Barry interviewed 18 students from
    Louisiana State University in a two-hour session
    in which they rated fifteen documents, chosen
    after a prior phone interview with each
    participant, using the document representations
    originally found with each article. The responses
    were broken down into three groups:
  • 1. Responses that included an evaluative judgment
    about the documents were sorted into 21 relevance
    criteria groups. These represented 36% of
    responses.
  • 2. Responses that mentioned a bibliography,
    references, or footnotes were listed as
    reference trait only. These represented
    2% of responses.
  • 3. The remaining 62% were coded as information
    content only, meaning that the users identified
    some aspect of information content but no
    relevance criteria.
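
The three-way coding described above can be sketched as a simple decision rule. This is only an illustration, not Barry's actual procedure: the function names, the boolean flags, and the order in which the rules are checked are all assumptions, and in the study the flags would come from a human coder reading each response.

```python
from collections import Counter


def code_response(has_evaluative_judgment: bool, mentions_references: bool) -> str:
    """Assign one response to one of the three groups (hypothetical rule order)."""
    if has_evaluative_judgment:
        # Would be sorted further into one of the 21 relevance criteria groups.
        return "relevance criteria"
    if mentions_references:
        # Bibliography, references, or footnotes mentioned without a judgment.
        return "reference trait only"
    return "information content only"


def share_by_group(responses):
    """Return the percentage of responses that fall into each group."""
    counts = Counter(code_response(*flags) for flags in responses)
    total = sum(counts.values())
    return {group: 100 * n / total for group, n in counts.items()}


# Made-up flags for four responses, purely to exercise the functions.
example = [(True, False), (False, True), (False, False), (False, False)]
print(share_by_group(example))
```

Run over a full set of coded responses, a tally of this kind would yield a percentage breakdown by group like the one reported on this slide.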

8
Table 1. Relevance criterion categories
  • Criterion category: Brief definition of category
  • Depth/scope: Extent to which information is
    in-depth and focused
  • Tangibility: Extent to which information relates
    to real, tangible issues or events, or the
    extent to which definite, proven information
    is presented
  • Effectiveness: Extent to which a technique or
    procedure is judged to be effective or
    successful
  • Accuracy/validity: Extent to which information
    is correct, valid, accurate; extent to which
    information supports respondent's position,
    or extent to which respondent agrees with
    information
  • Clarity: Extent to which information is
    presented in a clear or readable manner
  • Recency: Extent to which information is recent,
    current, up-to-date
  • Background/experience: The degree of knowledge
    with which the respondent approaches the document
9
Table 1. Relevance criterion categories (cont.)
  • Criterion category: Brief definition of category
  • Ability to understand: The respondent's judgment
    that he/she would be able to understand or
    follow the information presented
  • Consensus within the field: Extent to which there
    is consensus within the intellectual field
    relating to the information provided
  • External verification: Extent to which
    information provided is supported by
    other sources of information
  • Content novelty: Extent to which information is
    novel to the respondent
  • Source novelty: Extent to which a source of
    information is novel to the respondent
  • Stimulus document novelty: Extent to which the
    document itself is novel to the respondent
  • Relationship with author: Respondent indicates a
    personal or professional relationship with an author
  • Affectiveness: An affective or emotional
    reaction to the information
10
Table 1. Relevance criterion categories (cont.)
  • Criterion category: Brief definition of category
  • Source quality: General standards of quality
    that are predicted based on a source of the
    document
  • Source reputation/visibility: Extent to which a
    source of the document is well known or
    reputable
  • Availability: Extent to which information
    provided is available through other sources
  • Personal availability: Extent to which respondent
    already possesses information like that
    provided
  • Access: Extent to which it would be easy to
    obtain a copy of the document; extent to which
    a cost would be involved in obtaining a copy
    of the document
  • Time constraints: Extent to which time
    constraints are a factor in the
    respondent's decision to pursue, or not
    pursue, a document

11
Table 6. Co-occurrence of selected document
characteristics and selected relevance criterion
categories
  • Columns: Relevance criterion category, Full text,
    Abstract, Abstract, Title, Document/source traits,
    Indexing terms
  • Depth/scope: X X X X X
  • Accuracy/validity: X X X X X
  • Content novelty: X X X X X
  • Tangibility: X X X X X
  • Ability to understand: X X X X
  • Recency: X X X X
  • External verification: X X X
  • Effectiveness: X X X
  • Access: X X X
  • Clarity: X X
12
Questions
  • What relevance criteria do you feel were left out
    of each study's list? What criteria do you
    feel are most important, particularly considering
    the increasing use of Web documents?
  • Barry said she wished she could look into the
    users' level of confidence in their predictions
    about documents based on the representations. What
    further information would this clarify about the
    representations?
  • What traits/document representations do you find
    most useful in locating documents?
  • Information quality is cited by Bateman as a
    construct contributing to the meaning of
    relevance. But Amento (2000) maintains that
    information quality is a concept separate from
    relevance. Which do you think is true? Should
    we define information quality apart from
    relevance?