1
Deep-Indexing the OPAC: Integrating Contents
Information into Search Results
  • Mary M. Strouse, CUA DuFour Law Library
  • 7th MAIUG Annual Conference, October 2005.

2
Functions of Contents Information (TOC Data)
  • Evaluation
    • Does this resource suit my purpose?
  • Navigation
    • Which volume(s), pages do I need?
  • Identification/Collocation
    • What does the library have about ...?
    • Written by ...?
    • Containing known title?

3
Local Priorities for TOC Inclusion
  • Edited collections on broad themes
  • Diverse geographic treatments
  • Conferences, symposia, anthologies
  • Local interest (our faculty, etc.)
  • Unifying themes: multiple authorship and/or
    non-predictable content

4
Where, When and How
  • Choice of vendors
  • Blackwell
  • Syndetic Solutions
  • Marcive (using Syndetics data)
  • Scanning/local input
  • Loading mechanisms
  • III loading services (Blackwell)
  • Matching on demand
  • Choice of formats

5
505 Enhanced Contents Note
  • Standards-compliant
  • Existing tools (macros, spell-check)
  • Includes volume and chapter, but not page
  • Keyword access to titles and authors
  • Titles indexed (all or nothing)
  • Can't index authors (not in inverted form)
  • Can be difficult to read

6
Vendor TOC Format (97x)
  • Displays as a table of contents
  • Includes page numbers
  • Indexing flexibility: authors and titles as well
    as keyword
  • Can exclude generic titles (Introduction,
    Preface, etc.) from indexing
  • Space for both transcribed and authorized forms
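
As a rough illustration of the table-like display described above, the following Python sketch renders a few entries with indentation and page numbers. The `render_toc` function, its dictionary keys, and the output layout are hypothetical and are not part of any vendor or III software. The two sample entries come from 970 examples later in this presentation, with the level taken from the second indicator.

```python
def render_toc(entries, indent="  "):
    """Render TOC entries as indented text lines.

    Each entry is a dict with 'level' (1 = outermost), 'title', and
    optional 'author' and 'page' keys. This layout is an assumption
    made for illustration only.
    """
    lines = []
    for entry in entries:
        text = entry["title"]
        if entry.get("author"):
            text += " / " + entry["author"]
        if entry.get("page"):
            text += f" (p. {entry['page']})"
        # Indent one step per level below the first.
        lines.append(indent * max(entry["level"] - 1, 0) + text)
    return "\n".join(lines)


sample = [
    {"level": 1, "title": "Reith Lecture 2000",
     "author": "The Prince of Wales", "page": "11"},
    {"level": 2, "title": "Excerpts from Antigone",
     "author": "Sophocles", "page": "11"},
]
print(render_toc(sample))
```

Even in this toy form, every chapter occupies its own line, which is why the format costs screen space and printing compared with a run-on contents note.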

7
97x TOC Format Detail
  • 970 field (one per chapter or section title)
  • Indicator 1: title indexing
    • 0: Non-distinctive title (don't index)
    • 1: General chapter title or heading
    • 2: Citable title (no longer used)
  • Indicator 2: hierarchy, degree of indentation
  • |l: Section or chapter label
  • |t: Section or chapter title
  • |c: Personal author
  • |f: Personal author in inverted form
  • |d: Non-personal author
  • |e: Editor
  • |p: Starting page number
  • By default, authors of non-distinctive titles
    are not indexed. May be indexed on request.
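
The field layout above can be sketched in Python as a small parser. This is a minimal illustration, not vendor or III code: it assumes a textual form in which subfields are introduced by a pipe character (`|`), and the names `Toc970` and `parse_970` are hypothetical. The first sample line is taken from a later slide; the second (`Introduction`) is made up to show a non-distinctive title.

```python
from dataclasses import dataclass


@dataclass
class Toc970:
    ind1: str        # indicator 1: title indexing
    ind2: str        # indicator 2: hierarchy / degree of indentation
    subfields: list  # (code, value) pairs in field order

    def values(self, code):
        """Return every value recorded under the given subfield code."""
        return [value for c, value in self.subfields if c == code]

    def title_is_indexed(self):
        """Indicator 1 = 0 marks a non-distinctive title (don't index)."""
        return self.ind1 != "0"


def parse_970(line, delimiter="|"):
    """Parse a line such as '970 12 |tTitle |dAuthor |p11'.

    The pipe delimiter is an assumption about how the field is
    written out as text.
    """
    head, _, rest = line.partition(delimiter)
    parts = head.split()
    if len(parts) != 2 or parts[0] != "970" or len(parts[1]) != 2:
        raise ValueError(f"not a 970 field: {line!r}")
    subfields = []
    for chunk in rest.split(delimiter):
        chunk = chunk.strip()
        if chunk:
            # First character is the subfield code, the rest is its value.
            subfields.append((chunk[0], chunk[1:].strip()))
    return Toc970(parts[1][0], parts[1][1], subfields)


chapter = parse_970("970 12 |tExcerpts from Antigone |dSophocles |p11")
generic = parse_970("970 01 |tIntroduction |p1")  # hypothetical example
print(chapter.values("t"), chapter.title_is_indexed())
print(generic.values("t"), generic.title_is_indexed())
```

Keeping subfields as an ordered list rather than a dictionary preserves repeated codes, such as several |c and |f pairs in one field.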

8
Disadvantages of TOC Format
  • Table-like format takes up screen space, adds
    significantly to printing
  • Limitations on use of vendor data (TOC blocked in
    exported results)
  • Implicit burden on library staff
  • Commingling of library and vendor data

9
Display bug: Corporate author doesn't display

10
Effect on Keyword Search
  • Adds significantly to retrieval in keyword
    searches: authors and titles
  • Display element in keyword results is always the
    book title; also true of sorted/limited
    results.
  • Search terms are highlighted in full record
    display

11
Effect on Title Search
  • Library determines which titles to exclude (970
    first indicator)
  • Chapter titles will appear in unsorted results
    browse
  • Chapter titles not identified as such
  • English initial articles automatically excluded
  • Search terms highlighted in full record display
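
The exclusion of English initial articles can be illustrated with a small Python sketch. It shows the general idea only and makes no claim about how the system implements it: the article list, the function name `title_browse_key`, and the sample chapter titles are all assumptions made for this example.

```python
# Assumed list of English initial articles; the system's real list
# and matching rules are not documented here.
ENGLISH_INITIAL_ARTICLES = {"a", "an", "the"}


def title_browse_key(title):
    """Return a lowercase filing key with a leading article dropped."""
    words = title.split()
    # Keep a one-word title intact even if that word is an article.
    if len(words) > 1 and words[0].lower() in ENGLISH_INITIAL_ARTICLES:
        words = words[1:]
    return " ".join(words).lower()


# Hypothetical chapter titles, used only to show the filing order.
chapter_titles = [
    "The Law of Evidence",
    "An Introduction to Torts",
    "Excerpts from Antigone",
]
for title in sorted(chapter_titles, key=title_browse_key):
    print(title)
```

A key of this kind files chapter titles next to book titles in the browse list, which is one reason chapters and books are hard to tell apart there.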

12
Effect on Author Search
  • Individual authors are linked in record (as
    transcribed) and appear in browse list (indexed
    form)
  • Authority work often needed to match with
    existing names
  • Corporate authors from 970 |d and editors from
    970 |e not indexed
  • Display of titles in extended browse follows same
    rules as title search

13
Keyword-only Indexing Option
  • Includes authors and titles
  • Must specify inclusion in author and title
    segments
  • Avoids collocation issue/authority work
  • Avoids noise retrieval, confusion between
    chapters and books
  • Limits access to documents and reports (distinct
    works)
  • Limits effectiveness of known-author and
    known-title searches

14
Formatting Controls
  • BIB_TOC_HEADER WWWoption
    • Places a caption at head of TOC display
    • Default: no caption
    • Accepts HTML for formatting or link to a help
      file
  • TABLEPARAM_BIB_TOC
  • Stylesheet class: bibTOC
  • No link in brief citation

15
Search result display options
  • DISPLAY_245 does not apply to chapter titles
  • EXTENDED_TU will not force a book title to
    display
  • Beware confusion from forcing extended display
    (INDEX_EXTta)

16
BROWSE WWWoption
  • Controls first line of index browse
  • BROWSE_T controls first line of record browse
    (in absence of briefcit.htm)
  • If no 970 subfields are specified, all subfields
    will display
  • If you specify default subfields for non-245
    titles, you must include subfield t

17
Example 1: BROWSE_T=245/abnp/c or
BROWSE_T=245/abnp/c 970/t/cd/a/c (ALL TOC
subfields display)

18
Example 2: BROWSE_T=245/abnp/c 970/t/cd
and BROWSE_T=245/abnp/c 970/t/cd /at/c

19
Briefcit Format
  • <span class="briefcitTitle">
  • <!--linkfieldspecVbT-->
  • </span>
  • All record browse screens show book titles
    (includes limited and keyword results)
  • All index browse screens show chapter titles
    (includes sorted results)
  • Use BROWSE_T (define 970 |t to avoid a "no title"
    display)

20
Briefcit Format
  • <span class="briefcitTitle">
  • <!--linkfieldspecVbt245abnp-->
  • </span>
  • All record browse screens show book titles
    (includes sorted, limited and keyword results)
  • Only system-sorted index browse shows chapter
    titles

21
Loading and Workflow Issues
  • False adds: monographic series w/ ISSNs,
  • False drops: CIP and other title discrepancies
  • Coding consistency
  • Authority control
  • Volume of work
  • Lack of tools
  • No mechanism to identify/protect library-added
    data

22
Coding Issues: Vendor-supplied Data
  • Titles and names transcribed from TOC, not from
    fullest form available
  • No space for formal titles of included works
    -- we add 7xx
  • Inconsistent coding of index-worthiness
  • Is "Appendix" a title or a number?

23
Non-personal Authors (|d)
  • Used for corporate author
  • 970 11 |l9 |tRedefining Discrimination
    'Disparate Impact' and the Institutionalization
    of Affirmative Action |dUnited States Department
    of Justice Office of Legal Policy |p121
  • Also used for personal authors in direct order
    (but sometimes not)
  • 970 12 |tExcerpts from Antigone |dSophocles |p11
  • 970 11 |tReith Lecture 2000 |dThe Prince of
    Wales |p11

24
Non-personal Authors (|d)
  • Also used for other transcribed phrases and "et
    al."
  • 970 21 |tWorkshop Discussion Civil Litigation
    Against Terrorism |dWorkshop Participants |p185
  • 970 21 |tPublic Support for Access to Government
    Records A National Survey |cPaul D. Driscoll
    |fDriscoll, Paul D. |cSigman L. Splichal
    |fSplichal, Sigman L. |cMichael B. Salwen
    |fSalwen, Michael B. |det al. |p23
  • Library can add index link in |f (not
    vendor-provided)
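
When the library adds an inverted form, a first draft can be generated from the transcribed name and then reviewed by staff. The Python sketch below is a naive illustration under stated assumptions: the function name `invert_name` is hypothetical, it treats the last word as the surname, and the pipe-delimited output is only one way to write the subfields as text. It is not a substitute for authority work.

```python
def invert_name(direct):
    """Naively invert a direct-order personal name.

    Treats the final word as the surname, so compound surnames,
    suffixes, and titles of nobility will come out wrong and need
    manual correction.
    """
    parts = direct.split()
    if len(parts) < 2:
        return direct.strip()
    return f"{parts[-1]}, {' '.join(parts[:-1])}"


# Names transcribed in the 970 example above.
for name in ["Paul D. Driscoll", "Sigman L. Splichal", "Michael B. Salwen"]:
    print(f"|c{name} |f{invert_name(name)}")
```

For the three authors in the example this reproduces the inverted forms shown in the record. A phrase such as "The Prince of Wales" would be inverted incorrectly, which illustrates why the result still has to be checked against existing name headings.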

25
Recap: User Issues
  • Cost in screen space, added printing
  • Multiple forms of author entry (split files)
  • Can't distinguish between chapter and book-length
    treatment (increased noise)
  • License limitations on data use

26
Wish List
  • Fix corporate author display bug
  • Identify chapter titles in search results
  • Option to force display of both chapter title and
    book title in extended browse
  • Link to full TOC display from brief citation
    format (briefcit.html)
  • Allow limited data export for legitimate
    scholarly use

27
Recap: Workflow Issues
  • Vendor-dependent format
  • Staff burden: need coding regularization
  • Commingling of vendor and library data
  • False positives (multiple ISBN)

28
Wish List
  • Additional Subfields/Codes
  • Indexed/authorized form for corporate author
  • Data source and ownership
  • Authority history
  • Subfield code(s) to identify library-added TOC
    data, overlay-protect library-added authority
    work
  • Better coding conventions, transparency

29
References
  • CSDirect TOC Data FAQ (password required)
    • http://csdirect.iii.com/faq/tocfaq.shtml
  • Blackwell TOC Enrichment brochure
    • http://www.blackwell.com/pdf/TOCEnrichment.pdf
  • Vendors
    • http://www.blackwell.com/level2/TOC.asp
    • http://www.syndetics.com/index.htm
    • http://www.marcive.com/HOMEPAGE/MARCres.htm
      (Marcive uses Syndetics data)

30
Contact
  • Mary M. Strouse
  • Head of Technical Services
  • Judge Kathryn J. DuFour Law Library
  • Catholic University of America
  • strouse at law.cua.edu

Thank You!