Interoperability in Digital Libraries: Open Archives Initiative and the NSDL

Interoperability in Digital Libraries: Open Archives Initiative and the NSDL. CS 502, 2002-03-26. Carl Lagoze, Cornell University.

1
Interoperability in Digital Libraries: Open
Archives Initiative and the NSDL
  • CS 502, 2002-03-26
  • Carl Lagoze, Cornell University

Acknowledgements: Bill Arms, Herbert Van de
Sompel
2
Beyond the walls
The Library should selectively adopt the portal
model for targeted program areas. By creating
links from the Library's Web site, this approach
would make available the ever-increasing body of
research materials distributed across the
Internet. The Library would be responsible for
carefully selecting and arranging for access to
licensed commercial resources for its users, but
it would not house local copies of materials or
assume responsibility for long-term
preservation. LC21: A Digital Strategy for the
Library of Congress, page 5
3
A portal should mean more than access
  • Traditional portal (e.g., Yahoo!)
  • linkage with limited responsibility
  • Hybrid Portal
  • Asserting some semblance of curatorial role over
    linked resources
  • Providing a rich fabric of services across those
    resources

4
Interoperability standards enable service
creation.
  • Search and discovery
  • Z39.50
  • Metadata vocabularies and syntax
  • MARC
  • Dublin Core
  • XML/RDF
  • Object models
  • METS
  • FEDORA

5
Interoperability Trade-offs
6
Yes, it's about resource discovery over
distributed collections
metadata
Author, Title, Abstract, Identifier
7
Facilitating/Monitoring Longevity of Distributed
Content
Preservation Service
8
Personalization of Content
9
Cross-Repository Reference Linking
Linkage Service
10
Origins of the OAI
  • Increasing interest in alternative scholarly
    publishing solutions, e.g., LANL arXiv
  • Increasing impact through federation
  • UPS Mtg., Santa Fe, October 1999
  • Representatives of various ePrint, library, and
    publishing communities
  • Goal: definition of an interoperability framework
    among ePrint providers
  • Result: Santa Fe Convention, interoperability
    through metadata harvesting

11
Open Archives
  • Political Agenda?
  • Author self-archiving of E-Prints
  • Mission to reformulate scholarly publishing
    framework
  • Technical?
  • Infrastructure to facilitate interoperability
    across multiple domains

12
Technical Umbrella for Practical Interoperability
Metadata Harvesting
Reference Libraries
Museums
Publishers
E-Print Archives
that can be exploited by different communities
13
OAI Technical Infrastructure: Key technical
features
  • Deploy-now technology: 80/20 rule
  • Two-party model: providers (data providers) and
    consumers (service providers)
  • Simple HTTP encoding
  • XML schema for some degree of protocol
    conformance
  • Extensibility
  • Multiple item-level metadata
  • Collection level metadata

14
The World According to OAI
Service Providers
Discovery
Current Awareness
Preservation
Data Providers
15
Content and Metadata
Item (metadata)
repository
resource
record
010010
17
OAI-PMH History
  • Version 1.0 January 21, 2001
  • Version 1.1 July 2, 2001
  • W3C XML schema changes
  • Version 2.0a March 1, 2002
  • Production release June 3, 2002
  • No major functionality changes
  • Numerous functional tweaks
  • Harvesting granularity, flow control, error
    handling

18
Key Features of the OAI Metadata Harvesting
Protocol
  • definitions and concepts
  • repository
  • record
  • identifier
  • datestamp
  • set
  • protocol features
  • HTTP encoding
  • metadata prefix and schema
  • flow control
  • protocol requests
  • supporting requests
  • harvesting requests

19
repository
20
record
<record>
  <header>
    <identifier>oai:eg:001</identifier>
    <datestamp>1999-01-01</datestamp>
  </header>
  <metadata>
    <dc xmlns="http://purl.org/dc">
      <title>My Example</title>
    </dc>
  </metadata>
  <about>
    <ea xmlns="http://www.arXiv.org/ea">
      <usage>No restrictions</usage>
    </ea>
  </about>
</record>
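A minimal Python sketch of how a service provider might read such a record. The XML literal restores the punctuation (angle brackets, colons, quotes) that this transcript lost, so the identifier `oai:eg:001` and the namespace URIs are a best reading of the slide, not confirmed values. `summarize_record` is an illustrative name and not part of any OAI toolkit.

```python
import xml.etree.ElementTree as ET

# Assumed reconstruction of the slide's example record.
RECORD_XML = """<record>
  <header>
    <identifier>oai:eg:001</identifier>
    <datestamp>1999-01-01</datestamp>
  </header>
  <metadata>
    <dc xmlns="http://purl.org/dc">
      <title>My Example</title>
    </dc>
  </metadata>
  <about>
    <ea xmlns="http://www.arXiv.org/ea">
      <usage>No restrictions</usage>
    </ea>
  </about>
</record>"""


def summarize_record(xml_text):
    """Pull the header fields and the DC title out of one record."""
    root = ET.fromstring(xml_text)
    header = root.find("header")
    # The metadata part lives in its own namespace, unlike the header.
    dc_ns = {"dc": "http://purl.org/dc"}
    title = root.find("metadata/dc:dc/dc:title", dc_ns)
    return {
        "identifier": header.findtext("identifier"),
        "datestamp": header.findtext("datestamp"),
        "title": title.text if title is not None else None,
        "has_about": root.find("about") is not None,
    }


if __name__ == "__main__":
    print(summarize_record(RECORD_XML))
```

The header (identifier, datestamp) is what selective harvesting works on; the metadata and about parts are carried along as namespaced payload.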
21
identifiers
locally unique key for extracting a record from a
repository
oai-identifier = oai:archive-identifier:record-identifier
example: oai:ncstrl:ncstrl.cornellcs/TR94-1418
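The identifier layout can be illustrated with a short Python sketch. It assumes the three parts are separated by colons (the transcript dropped that punctuation) and that only the first two colons are structural, so the record part may contain any further characters. `parse_oai_identifier` and `OaiIdentifier` are hypothetical names used for illustration.

```python
from typing import NamedTuple


class OaiIdentifier(NamedTuple):
    archive: str  # archive-identifier: names the repository
    record: str   # record-identifier: locally unique key within it


def parse_oai_identifier(value: str) -> OaiIdentifier:
    """Split 'oai:<archive-identifier>:<record-identifier>' into its parts."""
    scheme, sep1, rest = value.partition(":")
    archive, sep2, record = rest.partition(":")
    if scheme != "oai" or not sep1 or not sep2 or not archive or not record:
        raise ValueError(f"not an oai-identifier: {value!r}")
    return OaiIdentifier(archive, record)


if __name__ == "__main__":
    print(parse_oai_identifier("oai:ncstrl:ncstrl.cornellcs/TR94-1418"))
```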
22
selective harvesting - datestamps
23
selective harvesting - sets
S2
24
set specifics
  • repositories define hierarchical organization
  • each item in a repository may be organized in one
    set, several sets, or no sets at all
  • meaning of sets or of set hierarchy is not
    defined in protocol
  • individual communities may formulate common set
    configurations

25
HTTP encoding - requests
BASE-URL -----------> an.oa.org/OAI-script
keyword arguments --> verb=ListIdentifiers&set=S1
GET   http://an.oa.org/OAI-script?verb=ListIdentifiers&set=S1
POST  POST http://an.oa.org/OAI-script HTTP/1.0
      Content-Length: 78
      Content-Type: application/x-www-form-urlencoded

      verb=ListIdentifiers&set=S1
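A small Python sketch of the two encodings, using only the standard library. The base URL is the slide's example host, not a live endpoint; `get_url` and `post_request` are illustrative helper names, and nothing here sends a request.

```python
from urllib.parse import urlencode

BASE_URL = "http://an.oa.org/OAI-script"  # example host from the slide


def get_url(base_url, verb, **arguments):
    """GET form: keyword arguments are appended to the base URL as a query."""
    return f"{base_url}?{urlencode({'verb': verb, **arguments})}"


def post_request(base_url, verb, **arguments):
    """POST form: the same keyword arguments travel in a form-encoded body."""
    body = urlencode({"verb": verb, **arguments}).encode("ascii")
    headers = {
        "Content-Type": "application/x-www-form-urlencoded",
        "Content-Length": str(len(body)),
    }
    return base_url, headers, body


if __name__ == "__main__":
    print(get_url(BASE_URL, "ListIdentifiers", set="S1"))
    print(post_request(BASE_URL, "ListIdentifiers", set="S1"))
```

Because `urlencode` percent-escapes reserved characters, an identifier such as `oai:arXiv:0001` is sent as `oai%3AarXiv%3A0001`.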
26
HTTP encoding - responses
<?xml version="1.0" encoding="UTF-8" ?>
<GetRecord xmlns="http://oai.namespace.uri"
    xmlns:xsi="http://w3.namespace.uri"
    xsi:schemaLocation="http://oai.namespace.uri http://oai.schemaURL">
  <responseDate>2000-19-01T19:30:30-04:00</responseDate>
  <requestURL>http://an.oa.org/OAI-script?verb=GetRecord&amp;identifier=oai%3AarXiv%3A0001&amp;metadataPrefix=oai_dc</requestURL>
  <record> record contents </record>
  additional records
</GetRecord>
27
metadata prefix and schema
  • support for harvesting multiple metadata formats
  • metadata schema: each format must have a
    validating XML schema at a publicly accessible
    URL (communities may define shared formats and
    schemas)
  • metadata prefix: each repository maps a prefix to
    each schema it supports; the prefix is used in
    protocol requests
  • support for unqualified Dublin Core is mandatory
  • DC OAI record syntax that builds on base DCMI
    schema
  • reserved prefix: oai_dc

28
flow control
29
flow control specifics
  • applies to all protocol requests that return
    lists: ListRecords, ListIdentifiers, ListSets
  • resumptionToken is opaque
  • semantics of partitioning of responses within
    resumption requests are undefined
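The flow-control loop on the harvester side can be sketched in Python as follows. This is a sketch under assumptions: `fetch_page` stands in for whatever code issues the HTTP request and parses the response, `fake_repository` is an in-memory stand-in for a data provider, and the follow-up request is assumed to carry only the verb and the token. The harvester never inspects the token; it just hands it back.

```python
from typing import Callable, Dict, Iterator, List, Optional, Tuple

# One partial response: the items it carried plus an optional resumptionToken.
Page = Tuple[List[str], Optional[str]]


def harvest_all(fetch_page: Callable[[Dict[str, str]], Page],
                first_args: Dict[str, str]) -> Iterator[str]:
    """Yield every item of a list request, following resumption tokens."""
    args = dict(first_args)
    while True:
        items, token = fetch_page(args)
        yield from items
        if not token:
            return  # no token: the list is complete
        # The token is opaque; it replaces the original selection arguments.
        args = {"verb": first_args["verb"], "resumptionToken": token}


def fake_repository(args: Dict[str, str]) -> Page:
    """Hypothetical data provider that splits four identifiers over three pages."""
    pages = {
        None: (["oai:eg:001", "oai:eg:002"], "t1"),
        "t1": (["oai:eg:003"], "t2"),
        "t2": (["oai:eg:004"], None),
    }
    return pages[args.get("resumptionToken")]


if __name__ == "__main__":
    request = {"verb": "ListIdentifiers", "set": "S1"}
    print(list(harvest_all(fake_repository, request)))
```

Since the partitioning is undefined, the loop makes no assumption about page size or the number of pages.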

30
Extensibility Feature Summary
  • Multiple metadata formats
  • Collection level metadata
  • Identify about container
  • Record data
  • Terms and conditions
  • Provenance
  • Set structure
  • Pre-configured queries

31
OAI Protocol
service provider
data provider
  • Supporting protocol requests
  • Identify
  • ListMetadataFormats
  • ListSets
  • Harvesting protocol requests
  • ListRecords
  • ListIdentifiers
  • GetRecord

32
Supporting Protocol Requests
service provider
data provider
Identify
  • Repository name
  • Base-URL
  • Admin e-mail
  • OAI protocol version
  • Description Container

33
Supporting Protocol Requests
service provider
data provider
ListMetadataFormats
  • REPEAT
  • Format prefix
  • Format XML schema
  • /REPEAT

34
Supporting Protocol Requests
service provider
data provider
ListSets
  • REPEAT
  • Set Specification
  • Set Name
  • /REPEAT

35
Harvesting Protocol Requests
service provider
data provider
ListRecords from=a until=b set=klm metadataPrefix=oai_dc
  • REPEAT
  • Identifier
  • Datestamp
  • Metadata
  • About Container
  • /REPEAT
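On the data provider side, selective harvesting amounts to filtering records by datestamp and set membership. The Python sketch below is an illustration under assumptions: the bounds are treated as inclusive, datestamps are plain YYYY-MM-DD strings, and `Header`, `select_headers`, and the sample data are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Optional


@dataclass(frozen=True)
class Header:
    identifier: str
    datestamp: str  # YYYY-MM-DD, so string order matches date order
    sets: FrozenSet[str] = frozenset()


def select_headers(headers: Iterable[Header],
                   from_: Optional[str] = None,
                   until: Optional[str] = None,
                   set_spec: Optional[str] = None) -> List[Header]:
    """Keep headers with from_ <= datestamp <= until that belong to set_spec."""
    selected = []
    for header in headers:
        if from_ is not None and header.datestamp < from_:
            continue
        if until is not None and header.datestamp > until:
            continue
        if set_spec is not None and set_spec not in header.sets:
            continue
        selected.append(header)
    return selected


# Hypothetical repository contents.
HEADERS = [
    Header("oai:eg:001", "1999-01-01", frozenset({"klm"})),
    Header("oai:eg:002", "2001-07-02", frozenset({"klm", "xyz"})),
    Header("oai:eg:003", "2002-03-01"),
]

if __name__ == "__main__":
    for header in select_headers(HEADERS, from_="2000-01-01", set_spec="klm"):
        print(header.identifier, header.datestamp)
```

Each argument is optional, so omitting all three returns the whole repository, which matches a harvest with no selection criteria.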

36
Harvesting Protocol Requests
service provider
data provider
ListIdentifiers from=a until=b set=klm
  • REPEAT
  • Identifier
  • Datestamp
  • /REPEAT

37
Harvesting Protocol Requests
service provider
data provider
GetRecord identifier=oai:mlib:123a metadataPrefix=oai_dc
  • Identifier
  • Datestamp
  • Metadata
  • About

40
Measures of Success
  • >100 implementers of the protocol
  • 64 registered
  • Basis for much research and implementation
  • JCDL 2002
  • A subject category for paper submission!
  • Numerous papers building on OAI
  • Research Projects and Funding

41
Externally funded initiatives
  • European Community
  • Open Archives Forum
  • Cyclades Project
  • Andrew W. Mellon Foundation
  • Funding for 7 service providers
  • Digital Library Federation
  • Gateways for access to members' digital
    collections
  • National Science Foundation
  • National Science Foundation Core Infrastructure

42
DP9 Architecture
  • Giving search engines access to the deep web

45
NSDL (National Digital Library for Science,
Mathematics, and Engineering)
  • Large-scale digital library technology
  • 1,000,000 users
  • 10,000,000 items
  • 100,000 collections
  • Diverse participants
  • Libraries
  • Academic/research institutions
  • Individuals

46
NSDL References
  • http://comm.nsdlib.org/
  • Zia, L., Growing a National Learning Environments
    and Resources Network for Science, Mathematics,
    Engineering, and Technology Education, D-Lib,
    March 2001
  • Arms, W. et al., A Spectrum of Interoperability:
    The Site for Science Prototype for the NSDL,
    D-Lib, January 2002
  • Lagoze, C. et al., Core Services in the
    Architecture of the NSDL, JCDL 2002, July 2002

47
The Challenge
Provide coherent services for users across
diverse collections, while retaining the
individuality and richness of the collections.
48
The strategy
  • A Spectrum of Interoperability
  • Open framework for collections and services
  • Embrace collections with rich metadata and support
    for standards ... accommodate collections with
    limited metadata and limited support for
    interoperability.
  • Technical basis
  • Follow library tradition of metadata sharing
  • Use automated methods to generate, normalize,
    translate metadata
  • Distribute metadata to service providers
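As an illustration of automated normalization and translation, here is a minimal Python sketch that crosswalks a collection's native record into unqualified Dublin Core elements. The native field names, the `CROSSWALK` table, and the accepted date formats are all hypothetical; the slides do not describe the NSDL's actual rules.

```python
from datetime import datetime

# Hypothetical mapping from one collection's field names to DC elements.
CROSSWALK = {
    "name": "title",
    "maker": "creator",
    "summary": "description",
    "made_on": "date",
}


def normalize_date(value):
    """Rewrite a few assumed date layouts as YYYY-MM-DD; pass others through."""
    text = value.strip()
    for layout in ("%Y-%m-%d", "%m/%d/%Y"):
        try:
            return datetime.strptime(text, layout).date().isoformat()
        except ValueError:
            continue
    return text


def to_dublin_core(native):
    """Translate mapped fields, tidy whitespace, and drop unmapped or empty ones."""
    dc = {}
    for field, value in native.items():
        element = CROSSWALK.get(field)
        if element is None or not value.strip():
            continue
        if element == "date":
            dc[element] = normalize_date(value)
        else:
            dc[element] = " ".join(value.split())
    return dc


if __name__ == "__main__":
    record = {"name": "  My   Example ", "made_on": "03/01/2002", "shelf": "B4"}
    print(to_dublin_core(record))
```

Dropping unmapped fields keeps the output within the lowest-common-denominator format, at the cost of losing collection-specific richness.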

49
The Metadata Repository
Services
Users
Metadata repository
The metadata repository is a resource for service
providers. It holds information about every
collection and item known to the NSDL.
Collections
50
MR Ingest and Exposure
OAI-PMH
OAI-PMH
Normalization Generation Cross-walking
MR Front Porch
OAI-PMH
gathering
Direct entry
51
Challenges and Questions
  • Utility of lowest common denominator metadata
    such as DC
  • Quality of metadata from non-professional
    contributors
  • Machine processing to reduce and complement
    human effort
  • Functionality of service structure