Title: Pattern Recognition in Action for Cataloging and Metadata
1Pattern Recognition in Action for Cataloging and Metadata
- 2006 OLC Technical Services Retreat
- Chris Grabenstatter
- April 25, 2006
2Agenda
- OCLC Cataloging/Metadata strategic directions
- Architecture to support strategy
- Examples of projects
3Cataloging Environment
- Fewer catalogers, reduced budgets
- Little growth in print materials acquisitions
- E-resources increasing: cataloged?
4Deliver more automatically
- Build on PromptCat, Cataloging Partners program success
- Partner with major materials providers
- Cataloging tied to selection: possible new service
5More Scripts/Language Support
- Growing WorldCat
- Supporting libraries' diverse collections
- Easier to get materials cataloged
- Growing membership
- One stop shopping
- Both US and global libraries
6Metadata support for e-content
- Support automated metadata generation for e-resources
- Facilitate storage and discovery of digital content
- Support new metadata schemes: crosswalks
- Enrich WorldCat with e-serials records and holdings
7Continue to deliver value
- Ongoing Connexion maintenance
- Standards
- Simplify pricing
8Lego Era
- The Internet is entering its Lego era. Indeed, blocks of interchangeable software components are proliferating and developers are joining them together to create a potentially infinite array of useful new programs.
- --John Markoff, The New York Times, April 5, 2006
9Library 2.0
- Library 2.0 is about small pieces of software loosely joined, requires business models where multiple vendors bring value to consumers together to reduce duplication of effort and reduce barriers to innovation
- --Paul Miller, Library 2.0: the challenge of disruptive innovation.
- http://www.talis.com/resources/documents/447_Library_2_prf1.pdf
10OCLC Metadata Management Service
[Architecture diagram; labels recovered from the slide]
- Connexion, Digital Archive, Content Coop, ILS, PICA, NetLibrary, Material Vendors, Publishers, OAI Repositories, Local DBs
- Web Services/Portal/API Layer
- Local Holdings (MFHD), OAI Harvest, Validate, DA Ingest, DA Extract, DA Access, Format Crosswalks, Metadata Creation, Acquisitions/Selection, Z39.50 (authorities), Non-roman, Terminologies, Pan/Zoom, Language Service, Reports & Stats, SRW/Zing Update, Shelf Ready, Profiling
- Metadata Capture, Profiling Data, Usage Stats, Digital Archive
11Projects: Metadata support for e-content
- Extraction/Creation Web Service
- Crosswalk Web service
- OCLC Terminologies Service
- Content Cooperative Pilot
- OCLC eSerials Holdings Service
12Extraction/Creation Web Service
- Extract metadata from Web sites, PDF files, and Word files
- Re-implementing and enhancing functionality currently available in Connexion browser
- Connexion browser: May 2006
- Connexion client 1.60: June 2006
13Connexion extract metadata
- Enter URL or path to extract metadata
- Supported file types: .htm, .doc, .pdf
- Create multiple records from Web sites linked to the parent URL
- Specify to display or save created workforms, apply default constant data, and define My Status value
- Future: add tools to create metadata
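The extraction step on this slide can be pictured with a small sketch. It is not Connexion's implementation, which the slides do not describe; it only shows, assuming a plain HTML page, how a title and meta tags might be pulled into a draft record. The names MetaExtractor and extract_metadata and the sample page are hypothetical.

```python
from html.parser import HTMLParser


class MetaExtractor(HTMLParser):
    """Collect the <title> text and <meta name=... content=...> pairs."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attrs.get("name") and attrs.get("content"):
            # Lowercase the name so "DC.creator" and "dc.creator" collide.
            self.meta[attrs["name"].lower()] = attrs["content"]

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_metadata(html_text):
    """Return a draft record (plain dict) built from one HTML page."""
    parser = MetaExtractor()
    parser.feed(html_text)
    record = dict(parser.meta)
    record["title"] = parser.title.strip()
    return record


# Hypothetical sample page, used only to exercise the sketch.
SAMPLE = """<html><head>
<title> Oral History Collection </title>
<meta name="DC.creator" content="Example University Library">
<meta name="keywords" content="oral history, interviews">
</head><body><p>Welcome.</p></body></html>"""

record = extract_metadata(SAMPLE)
print(record)
```

A cataloger would still review and complete such a draft; PDF and Word files would need format-specific readers that this sketch does not cover.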
19Crosswalk Web Service
- Batchload/PromptCat
- ONIX to MARC
- OAI harvesting: Dublin Core to MARC
- Future
- Import and export Dublin Core data from Connexion client
- Support for other metadata schemes in both browser and client interfaces
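As an illustration of what a Dublin Core to MARC crosswalk does, here is a minimal sketch that reads one harvested oai_dc record and emits MARC-style field strings. The mapping table is a small illustrative subset assumed for this example, not the OCLC service's actual rules; the function name crosswalk and the sample record are hypothetical. Blank indicators are shown as "#".

```python
import xml.etree.ElementTree as ET

# Namespace of the unqualified Dublin Core elements used in oai_dc records.
DC_NS = "{http://purl.org/dc/elements/1.1/}"

# Assumed, illustrative mapping: DC element -> (MARC tag, indicators, subfield).
# Identifiers are treated as URLs here, which is a simplification.
DC_TO_MARC = {
    "title": ("245", "00", "a"),
    "creator": ("720", "##", "a"),
    "subject": ("653", "##", "a"),
    "description": ("520", "##", "a"),
    "identifier": ("856", "40", "u"),
}


def crosswalk(oai_dc_xml):
    """Map each known DC element in the record to one MARC-style line."""
    root = ET.fromstring(oai_dc_xml)
    fields = []
    for element in root.iter():
        if not element.tag.startswith(DC_NS):
            continue  # skip the oai_dc wrapper and any non-DC elements
        name = element.tag[len(DC_NS):]
        value = (element.text or "").strip()
        if name in DC_TO_MARC and value:
            tag, indicators, code = DC_TO_MARC[name]
            fields.append(f"{tag} {indicators} ${code} {value}")
        # Elements without a mapping (e.g. rights) are dropped in this sketch.
    return fields


# Hypothetical harvested record.
SAMPLE = """<oai_dc:dc
    xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Oral History Collection</dc:title>
  <dc:creator>Example University Library</dc:creator>
  <dc:subject>Oral history</dc:subject>
  <dc:identifier>http://example.org/oral-history</dc:identifier>
  <dc:rights>Unmapped in this sketch</dc:rights>
</oai_dc:dc>"""

fields = crosswalk(SAMPLE)
for line in fields:
    print(line)
```

Keeping the mapping in a table rather than in code is one way to support further schemes later: a new crosswalk becomes a new table, not a new program.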
20OCLC Terminologies service
- Introduction: June 2006
- Add more access points using other controlled vocabularies, e.g., MeSH, GSAFD
- Available to all OCLC Cataloging subscribers
- Subscriptions available for non-Cataloging users
- Use with a variety of metadata editors, e.g., Connexion browser and client
21List of Terminologies in initial release
- aat - Art & Architecture Thesaurus (J. Paul Getty Trust)
- dct - Dublin Core Metadata Initiative Type Vocabulary (Dublin Core Metadata Initiative)
- gmgpc - Thesaurus for Graphic Materials, TGM II (Library of Congress)
- gsafd - Guidelines On Subject Access To Individual Works Of Fiction, Drama, Etc. (American Library Association)
- lctgm - Thesaurus for Graphic Materials, TGM I (Library of Congress)
- mesh - Medical Subject Headings (MeSH) (National Library of Medicine)
- ngl - Newspaper Genre List (University of Washington)
- tgn - Thesaurus of Geographic Names (J. Paul Getty Trust)
- ulan - Union List of Artists' Names (J. Paul Getty Trust)
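To show how the vocabulary codes listed here can end up in a record, the sketch below formats a term as a MARC 650 (topical) or 655 (genre/form) access point. It relies on the MARC 21 convention that second indicator 2 denotes Medical Subject Headings and second indicator 7 means the source is named in subfield $2. The function name access_point is hypothetical, the sample terms are illustrative and not checked against the vocabularies, and nothing here reflects how the Terminologies service itself inserts terms.

```python
def access_point(term, vocabulary, genre=False):
    """Format one controlled-vocabulary term as a MARC-style 650/655 line.

    A blank first indicator is shown as "#".
    """
    tag = "655" if genre else "650"
    if vocabulary == "mesh":
        # MeSH has its own second-indicator value, so no $2 is needed.
        return f"{tag} #2 $a {term}"
    # Any other vocabulary is identified by its source code in $2.
    return f"{tag} #7 $a {term} $2 {vocabulary}"


# Illustrative terms only.
fields = [
    access_point("Neoplasms", "mesh"),
    access_point("Watercolors", "aat"),
    access_point("Mystery fiction", "gsafd", genre=True),
]
for line in fields:
    print(line)
```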
22Terminology Pane
A separate application
23Content Cooperative Pilot
- Upload content objects to the OCLC Digital Archive from Connexion browser and client interfaces
- Digital image, thesis, dissertation, oral history, e-book, video, etc.
- Replace WorldCat records to automatically add a URL pointing to the content object
- Access digital content from FirstSearch, Group Catalogs, and Open WorldCat
30Planned Open WorldCat Page for Digital Image
31Planned Zoom & Pan
32OCLC eSerials Holdings service
- Automatically updates eSerials holdings in WorldCat
- Access to eSerials via WorldCat Resource Sharing
- Access to eSerials through OCLC discovery platform
- Compare electronic and print serials collections
- No additional work for the library!
34Pilot Partners
- 35 Pilot libraries
- EBSCO
- Ex Libris
- Serials Solutions
- TDNet
- More to come
35Benefits to the library
- Increased operational efficiencies in ILL
- Filling where possible
- A revenue opportunity for some
- You control requests via automatic deflection
- Increased visibility at the point of need
- Leverages investment in services
36Progress
- Initial production system available late June 2006
- Web-based registration form
- No charge to participate in the eSerials holdings service
- Future enhancements projected to include options for local holdings data, MARC record update service, and additional deflection choices
37Projects: Deliver more automatically
- Improve shelf-ready cataloging
- PromptCat/Cataloging Partners: 100% goal
- Partner with major vendors
- Selection
- Possible future service
- OCLC partnering with materials vendors to help with notification slip selection process
- Cataloging a by-product of selection
- Watch for more information in the future!
38Projects: More Scripts/Language support
- New scripts support
- Cyrillic, Greek and Hebrew: July 2005
- Thai and Tamil scripts for use with Connexion client 1.50 (investigating Devanagari, Sinhala, and Bengali next)
- Connexion interface translations
- Client
- Chinese (Traditional and Simplified) and Japanese: July 2005
- German and Korean: Nov. 2005
- CatExpress: French (Nov. 2006)
- Unicode export: Nov. 2005
- Automatic transliteration Web service: June 2006
41Questions, Answers & Sharing
42Cataloging future directions: Contact us
- Eric Childress eric_childress_at_oclc.org
- Chris Grabenstatter c_grabenstatter_at_oclc.org