An Inquiry and Analysis of Metadata Utilization: A Case Study of MARC

1
An Inquiry and Analysis of Metadata Utilization
A Case Study of MARC
2005 ASIST Annual Meeting, November 1, 2005,
Charlotte, North Carolina
  • William E. Moen <wemoen@unt.edu>
    School of Library and Information Sciences
    Texas Center for Digital Knowledge
    University of North Texas
    Denton, TX 76203

2
Two quality criteria
  • Fullness/completeness
  • Usefulness

3
Context for the initial analysis
  • Z39.50 Interoperability Testbed project
  • An Institute of Museum and Library Services
    National Leadership Grant
  • Goal: Improve Z39.50 semantic interoperability
    among libraries for information access and
    resource sharing
  • Interoperability across library online catalogs
  • Indexing of MARC records to support searching
  • Richness of MARC content designation available
  • Inform indexing guidelines and policies

4
Indexing MARC
  • Indexing Guidelines to Support Z39.50 Profile
    Searches (available on Z-Interop website)
  • Identified all MARC 21 fields/subfields that can
    contain author, title, or subject data
  • Author-related fields/subfields: 119
  • Author/Title-related fields/subfields: 21
  • Title-related fields/subfields: 253
  • Subject-related fields/subfields: 144

5
Z-Interop test dataset
  • Approximately 1% sample of MARC records from
    OCLC's WorldCat database
  • Weighted sampling based on number of libraries
    holding the object represented by the record
  • 419,657 total MARC records
  • 89% of records: full-level cataloging
  • Formats represented in test dataset:
  • Books: 91%
  • Cartographic Materials: < 1%
  • Electronic Resources: < 1%
  • Archival/Mixed Materials: < 1%
  • Sound Recordings: 4%
  • Visual Materials: 1%
  • Serials: 3%
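The slides do not say how the holdings-based weighting was implemented. As a minimal sketch of one way to draw such a sample, the Python below picks records without replacement, with probability tied to the number of holding libraries. The record layout (`id`, `holdings`) and the function name are assumptions made for illustration, not the project's actual procedure.

```python
import random


def weighted_sample(records, k, seed=0):
    """Draw up to k distinct record ids, favoring widely held records.

    Uses the Efraimidis-Spirakis key method: each record gets the key
    u ** (1 / weight), with u uniform in [0, 1), and the k largest
    keys are selected.
    """
    rng = random.Random(seed)
    keyed = []
    for rec in records:
        weight = rec["holdings"]  # assumed field: number of holding libraries
        if weight <= 0:
            continue  # a record held by no library is never drawn
        keyed.append((rng.random() ** (1.0 / weight), rec["id"]))
    keyed.sort(reverse=True)
    return [rec_id for _, rec_id in keyed[:k]]


# Hypothetical records, for illustration only.
catalog = [
    {"id": "rec-a", "holdings": 1200},
    {"id": "rec-b", "holdings": 3},
    {"id": "rec-c", "holdings": 45},
    {"id": "rec-d", "holdings": 0},
]
print(weighted_sample(catalog, 2))
```

With a fixed seed the draw is repeatable, which helps when a test dataset has to be regenerated.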

6
MARC 21 content designation
MARC 21 Field Group  Currently Defined  Obsolete  Total  MARC 1972 (Books Format Only)
00x                  6                  1         7      3
0xx                  238                7         245    28
1xx                  66                 1         67     40
2xx                  137                32        169    15
3xx                  109                32        141    4
4xx                  69                 0         69     37
5xx                  323                38        361    8
6xx                  184                5         189    66
7xx                  452                47        499    41
8xx                  141                20        161    36
TOTAL                1725               183       1908   278

7
Content designation in dataset
MARC 21 Field Group  Currently Defined  Obsolete  Unlikely Used  Total
00x                  6                  0         0              6
0xx                  96                 1         33             130
1xx                  49                 0         2              51
2xx                  81                 0         19             100
3xx                  23                 6         0              29
4xx                  10                 0         30             40
5xx                  128                1         3              132
6xx                  104                1         7              112
7xx                  205                0         5              210
8xx                  105                3         8              116
TOTAL                807                12        107            926

8
Summary frequency results
Total occurrences of fields/subfields in the
dataset: 13,849,499

Frequency of Occurrence  # of Fields/Subfields  % of All Occurrences
> 600,000                1                      4.4
500,000 - 599,999        0                      0
400,000 - 499,999        13                     39.9
300,000 - 399,999        6                      14.3
200,000 - 299,999        6                      10.6
100,000 - 199,999        10                     10.3
TOTAL                    36                     79.5

Only 4% of all fields/subfields account for 80%
of all occurrences; put another way, 96% of all
fields/subfields account for 20% of all occurrences
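The 4%/80% figure can be checked from per-field frequency counts. The sketch below is illustrative: the function name and the demo counts are hypothetical, and only the figures 36 and 926 come from the slides.

```python
def fields_needed(counts, target_share):
    """Return how many of the most frequent fields/subfields are needed
    to cover at least target_share (0.0 to 1.0) of all occurrences."""
    total = sum(counts.values())
    running = 0
    for rank, count in enumerate(sorted(counts.values(), reverse=True), start=1):
        running += count
        if running >= target_share * total:
            return rank
    return len(counts)


# Hypothetical counts, for illustration only (not the Z-Interop figures).
demo_counts = {"650$a": 50, "245$a": 30, "260$a": 15, "500$a": 5}
print(fields_needed(demo_counts, 0.75))

# Figures from the slides: 36 of the 926 fields/subfields found in the dataset.
share_of_fields = 36 / 926
print(f"{share_of_fields:.1%}")
```

The last line gives just under 4%, which matches the rounded figure on the slide.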
9
Characteristics of top 36
  • Most frequently occurring: 650 $a (Subject data)
  • 2nd most frequently occurring: 040 $d (Cataloging
    source)
  • 3rd and 4th most frequently occurring: 260 $a and $b
    (Publication information)
  • 5th most frequently occurring: 245 $a (Title)
  • Top 36 fields/subfields by type of data:
  • Contain data useful to end users: 28
  • Contain control numbers, etc.: 5
  • Contain data useful to catalogers: 3

10
Implications for indexing
  • 537 fields/subfields can contain author, title,
    or subject data
  • 381 of these actually occur in the Z-Interop dataset
  • Total occurrences of the 381: 4,397,712
  • 19 of the 381 (5%) account for 80% of all
    occurrences
  • 9 of 19 are subject-related
  • 5 of 19 are author-related
  • 5 of 19 are title-related
  • Preliminary testing using only 19 indexed fields:
  • 95-100% of correct records retrieved!
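The slides do not describe the test procedure. One plausible reading is that "correct records" means the records a search retrieves when every relevant field is indexed. Under that assumption, the sketch below builds a word index over a chosen set of fields and measures how much of the full-index result a reduced index still retrieves. All names and the tiny record set are hypothetical.

```python
def build_index(records, indexed_fields):
    """Map each lowercase word to the ids of records that contain it
    in one of the indexed fields/subfields."""
    index = {}
    for rec_id, fields in records.items():
        for field, value in fields.items():
            if field not in indexed_fields:
                continue
            for word in value.lower().split():
                index.setdefault(word, set()).add(rec_id)
    return index


def recall(reduced_hits, full_hits):
    """Share of the full-index hits that the reduced index also finds."""
    if not full_hits:
        return 1.0
    return len(reduced_hits & full_hits) / len(full_hits)


# Hypothetical records keyed by id; fields are written as tag$subfield.
records = {
    1: {"245$a": "Metadata quality", "650$a": "Metadata"},
    2: {"740$a": "Metadata primer"},
    3: {"245$a": "Cataloging rules"},
}
full = build_index(records, {"245$a", "650$a", "740$a"})
reduced = build_index(records, {"245$a", "650$a"})

term = "metadata"
print(recall(reduced.get(term, set()), full.get(term, set())))
```

In this toy case the reduced index misses the record whose only match sits in the unindexed 740 $a, so recall falls below 1.0. The slide reports that on the real dataset this loss was small.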

11
The MCDU Project
  • The MARC Content Designation Utilization Project
  • What is the extent of catalogers' use of content
    designation available in MARC 21?
  • Develop and implement systematic methods,
    procedures, and software tools to produce
    reliable and valid analysis of MARC 21 content
    designation use
  • MARC record as artifact of cataloging enterprise

For more information, visit the project website:
http://www.mcdu.unt.edu/

12
The MCDU dataset analysis
  • 56 million MARC records: all WorldCat bibliographic
    records
  • Parsed and stored in MySQL
  • 20 databases
  • LC-created and non-LC-created records
  • 10 databases each, based on type of record/format
  • Frequency counts of all fields/subfields
  • Non-LC Book Format field occurrence results
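The project stored parsed records in MySQL. As a rough in-memory stand-in, the sketch below shows the counting step itself using Python's standard library. The record layout (a list of `(tag, subfields)` pairs) and the function name are assumptions for illustration, not the project's schema.

```python
from collections import Counter


def count_content_designation(records):
    """Count occurrences of every (tag, subfield code) pair.

    Control fields (00x) carry no subfields, so they are counted
    under an empty subfield code.
    """
    counts = Counter()
    for record in records:
        for tag, subfields in record:
            if not subfields:
                counts[(tag, "")] += 1
                continue
            for code, _value in subfields:
                counts[(tag, code)] += 1
    return counts


# Two hypothetical, heavily abbreviated records.
sample = [
    [("001", []),
     ("245", [("a", "Metadata quality")]),
     ("650", [("a", "Metadata"), ("a", "Cataloging")])],
    [("001", []),
     ("245", [("a", "Cataloging rules"), ("b", "a primer")]),
     ("650", [("a", "Cataloging")])],
]
for (tag, code), n in count_content_designation(sample).most_common(3):
    print(tag, code, n)
```

Keeping the tag and subfield code together as the counting key matches how the slides report results, one row per field/subfield combination.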

13
Making sense of the numbers
  • The numbers don't stand on their own:
    contextualizing, qualifying, exploring,
    understanding
  • Metadata quality: Fullness/completeness
  • Identify core elements of bibliographic records
    based on the analysis of format-specific samples
    and compare with existing recommendations for
    core records
  • Metadata quality: Usefulness
  • Compare the FRBR conceptual framework's user
    tasks, the MARC content designation supporting
    those tasks, and utilization of that content
    designation in the records

14
References
  • MARC Content Designation Utilization Project
  • http://www.mcdu.unt.edu
  • Z39.50 Interoperability Testbed
  • http://www.unt.edu/zinterop/