Title: Flexible and Extensible Digital Object and Repository Architecture (FEDORA)
1. Flexible and Extensible Digital Object and Repository Architecture (FEDORA)
Sandra Payette, Cornell University, payette_at_cs.cornell.edu
Dritter Workshop der Digitalisierungszentren (Third Workshop of the Digitization Centers), October 5, 1999
http://www.cs.cornell.edu/payette/presentations/fedora-gdz.ppt
2. Cornell Digital Library Research Group
- Computer Science Department
- Bill Arms
- Carl Lagoze
- Sandy Payette
- Naomi Dushay
- David Fielding
- Affiliates
- Anne Kenney (Cornell Library)
- Geri Gay (Human Computer Interaction)
- CNRI
3. CDLRG - Projects
- Prism (DLI2)
- Fedora
- Harmony (IDL)
- Dienst and NCSTRL
- Electronic Scholarly Publishing
- D-Lib
- Citation Linking (IDL)
4. Digital Library Interoperability
5. Principles for Digital Library Architecture
- Open Architecture
  - functionality partitioned into set of well-defined services
  - services accessible via well-defined protocol
- Modularization
  - promotes interoperability
  - scalable to different clientele (library, informal web)
- Federation
  - enable aggregations into logical collections
- Distribution
  - of content and services
  - of administration and management
6. Component-Ware Digital Libraries
[Diagram labels: UI Gateway Service; Name Service; Identifiers; Collection Service; Query Mediator Service; Index Service; Repository Service; Digital Objects]
7. FEDORA
- Digital Object Model
  - container for aggregating any digital material
  - disseminations of complex types
  - global extensibility mechanisms
  - access management
- Repository Service
  - service layer for contained DigitalObjects
  - object lifecycle management
  - secure environment
  - open interface
8. FEDORA Goals
- Distribution - of digital content and services
- Interface Stability - for digital objects
- Interoperability - for digital objects and repositories
- Extensibility - naturally evolving type system
- Flexibility - community-driven type development
- Security - rights management and access control
- Preservation - longevity of digital objects
9. FEDORA History
- Kahn/Wilensky
- Warwick Framework
- Distributed Active Relationships
- Cornell FEDORA (Lagoze, Payette)
- CNRI Repository (Arms, Blanchi, Overly)
- CNRI/FEDORA - Interoperability Project
- UVA - Complex disseminators, distribution
- Project Prism (DLI2)
10. FEDORA DigitalObjects can be...
- Simple, familiar entities
- Complex, compound, dynamic objects
11. FEDORA DigitalObject Model
[Diagram labels: MIME-typed stream of bytes; Book; Dissemination; Service Request upon external source; Internal DataStream; Reference DataStream]
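The container idea on this slide can be illustrated with a minimal sketch. It assumes, based only on the diagram labels, that an Internal DataStream holds its bytes inside the object while a Reference DataStream obtains them through a request to an external source. All class names, method names, and the URN below are hypothetical and are not taken from the FEDORA reference implementation.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class InternalDataStream:
    """Bytes stored inside the object itself (hypothetical class name)."""
    mime_type: str
    content: bytes

    def read(self) -> bytes:
        return self.content


@dataclass
class ReferenceDataStream:
    """Bytes obtained on demand from an external source.

    `fetch` stands in for the service request upon an external source;
    a real repository would issue a network request here.
    """
    mime_type: str
    fetch: Callable[[], bytes]

    def read(self) -> bytes:
        return self.fetch()


@dataclass
class DigitalObject:
    """Container that aggregates DataStreams under one identifier."""
    urn: str
    datastreams: Dict[str, object] = field(default_factory=dict)

    def add(self, name: str, stream) -> None:
        self.datastreams[name] = stream

    def get(self, name: str):
        return self.datastreams[name]


book = DigitalObject(urn="urn:example:book-1")  # made-up identifier
book.add("DS1", InternalDataStream("application/MARC", b"marc-record-bytes"))
book.add("DS2", ReferenceDataStream("application/postscript",
                                    fetch=lambda: b"%!PS fetched remotely"))
```

A caller reads both kinds of stream the same way, for example `book.get("DS2").read()`, without knowing where the bytes live.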
12. Disseminator Type
- A set of behaviors that formally describes the functionality of any global or community-specific notion of content.
[Diagram labels: getSection; getArticle]
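One way to picture a Disseminator Type is as an abstract interface: it declares behaviors but holds no data and no mechanism. The sketch below borrows the getSection and getArticle labels from the slide; the `JournalType` name, the method signatures, and the list-backed mechanism are assumptions made for illustration only.

```python
from abc import ABC, abstractmethod


class JournalType(ABC):
    """A hypothetical Disseminator Type: a named set of behaviors, no data."""

    @abstractmethod
    def get_section(self, number: int) -> str: ...

    @abstractmethod
    def get_article(self, number: int) -> str: ...


class ListBackedJournal(JournalType):
    """One possible mechanism that satisfies the type (illustrative only)."""

    def __init__(self, sections, articles):
        self._sections = sections
        self._articles = articles

    def get_section(self, number: int) -> str:
        return self._sections[number - 1]

    def get_article(self, number: int) -> str:
        return self._articles[number - 1]


def behaviors(disseminator_type) -> list:
    """List the behavior names that a type formally declares."""
    return sorted(disseminator_type.__abstractmethods__)
```

Because the type is abstract, it cannot be instantiated on its own; only a mechanism that supplies every declared behavior can.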
13. Disseminator
- A generic component that associates a set of behaviors with a DigitalObject.
[Diagram labels: Extensible Type Disseminator; Generic behaviors; Extended behaviors]
14. FEDORA DigitalObject
[Diagram labels: application/MARC; application/postscript; image/gif (four times); Primitive Disseminator]
15. Client communicates with generic requests
[Diagram labels: GetChapter, GetTOC, GetPage; application/MARC; DS1; Primitive Disseminator; application/postscript; DS2]
16. A Disseminator...
[Diagram labels: GetDCField, GetDCRecord; DC; application/MARC; DS1; GetMethods(DC); application/postscript; DS2; GetDCField(Title), GetDCRecord]
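The request sequence suggested by the diagram labels (discover the methods of the DC disseminator, then call one of them) can be sketched as follows. This is a toy under stated assumptions: DS1 holds simple "name=value" lines rather than real MARC, the Python method names are stand-ins for GetMethods, GetDCField, and GetDCRecord, and the `invoke` dispatcher is hypothetical.

```python
class DCDisseminator:
    """Hypothetical Dublin Core view over an object's metadata DataStream."""

    def __init__(self, datastreams: dict):
        # Toy assumption: DS1 holds "name=value" lines rather than real MARC.
        self._fields = dict(
            line.split("=", 1)
            for line in datastreams["DS1"].decode("utf-8").splitlines()
        )

    def get_dc_field(self, name: str) -> str:
        return self._fields[name]

    def get_dc_record(self) -> dict:
        return dict(self._fields)


class DigitalObject:
    def __init__(self, datastreams: dict):
        self._datastreams = datastreams
        self._disseminators = {"DC": DCDisseminator(datastreams)}

    def get_methods(self, type_name: str) -> list:
        """Answer a GetMethods(DC)-style discovery request."""
        d = self._disseminators[type_name]
        return sorted(m for m in dir(d) if m.startswith("get_"))

    def invoke(self, type_name: str, method: str, *args):
        """Dispatch a generic request to the named disseminator."""
        return getattr(self._disseminators[type_name], method)(*args)


obj = DigitalObject({
    "DS1": b"Title=A Sample Book\nCreator=Unknown",
    "DS2": b"%!PS ...",
})
```

A client that knows nothing about the object in advance could first call `obj.get_methods("DC")` and then `obj.invoke("DC", "get_dc_field", "Title")`.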
17. DigitalObject Interface Stability
[Diagram labels: Disseminator Type; Interface; Mechanism]
18. DigitalObject Extensibility: Adding New Types
[Diagram labels: Book; The same underlying data...; Mechanism; Structure; Interface]
19. Extensibility: a look under the hood
[Diagram labels: Servlet URNDC1; DC; application/MARC; application/postscript]
20. Proliferation of Disseminator Types
- We use FEDORA DigitalObjects to store Disseminator Signatures and Servlets.
- Type Registration (via name service)
  - a Disseminator Type's global identifier is the URN of a DigitalObject containing a Signature
  - a Servlet's global identifier is the URN of a DigitalObject containing a Servlet
Types can be globally recognizable and mechanisms can be shared.
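The registration scheme described on this slide can be sketched with an in-memory name service. The sketch assumes that a signature object lists method names and that a servlet object records the URN of the type it implements; the dictionary layout, the URNs, and the `load_mechanism` helper are all hypothetical.

```python
class NameService:
    """Toy in-memory name service mapping URNs to DigitalObjects."""

    def __init__(self):
        self._registry = {}

    def register(self, urn: str, digital_object: dict) -> None:
        if urn in self._registry:
            raise ValueError(f"URN already registered: {urn}")
        self._registry[urn] = digital_object

    def resolve(self, urn: str) -> dict:
        return self._registry[urn]


names = NameService()

# A Disseminator Type is named by the URN of the object holding its signature.
names.register("urn:example:type/DC", {
    "kind": "signature",
    "methods": ["GetDCField", "GetDCRecord"],
})

# A Servlet is named by the URN of the object holding the mechanism.
names.register("urn:example:servlet/DC1", {
    "kind": "servlet",
    "implements": "urn:example:type/DC",
    "code": lambda record, field_name: record.get(field_name),
})


def load_mechanism(service: NameService, servlet_urn: str):
    """Fetch a shared servlet and the signature it claims to implement."""
    servlet = service.resolve(servlet_urn)
    signature = service.resolve(servlet["implements"])
    return signature["methods"], servlet["code"]
```

Because both the type and the mechanism are ordinary named objects, any repository that can resolve the URNs can recognize the type and reuse the mechanism.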
21. Interoperable Digital Objects and Repositories
[Diagram labels: RAP Client; Name Service; Repository; Identifiers; Cornell Library Collections; Audio/Visual Archive; Image Database System]
22. Persistent Identifiers
- In FEDORA, use them for
  - Repositories
  - DigitalObjects
  - Disseminator Types
  - Servlet Mechanisms
- Benefits
  - Ensure uniqueness
  - Provide stability (location independence)
  - Promote global extensibility
  - Promote interoperability
23. Identifiers - A Brief Primer
- IETF Uniform Resource Name (URN) Spec
- Naming Scheme
  - The policies and procedures for creating and assigning URNs within a particular domain.
- Resolution System
  - A system that translates URNs into their location-specific identifiers (e.g., URLs).
- Registries
  - A set of global directories that provide information on which resolution systems can translate any particular URN.
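The division of labor between registries and resolution systems can be sketched in a few lines. This is a simplified illustration, not an implementation of any IETF resolution protocol: the registry is keyed on the namespace part of the URN, and the `example` namespace, the URLs, and the class names are made up.

```python
class ResolutionSystem:
    """Translates URNs of one naming scheme into location-specific URLs."""

    def __init__(self, table: dict):
        self._table = table

    def resolve(self, urn: str) -> list:
        return self._table.get(urn, [])


class Registry:
    """Global directory: which resolution system handles which namespace."""

    def __init__(self):
        self._systems = {}

    def add(self, namespace: str, system: ResolutionSystem) -> None:
        self._systems[namespace.lower()] = system

    def lookup(self, urn: str) -> ResolutionSystem:
        # URN syntax is "urn:<namespace>:<namespace-specific string>".
        scheme, namespace, _ = urn.split(":", 2)
        if scheme.lower() != "urn":
            raise ValueError(f"not a URN: {urn}")
        return self._systems[namespace.lower()]


def resolve(registry: Registry, urn: str) -> list:
    """Two steps: find the right resolution system, then ask it for locations."""
    return registry.lookup(urn).resolve(urn)


registry = Registry()
registry.add("example", ResolutionSystem({
    "urn:example:book-1": ["http://repo-a.example.org/book-1",
                           "http://repo-b.example.org/book-1"],
}))
```

The caller only ever holds the URN; moving the object to a new host means updating the resolution table, not the identifier.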
24. Identifiers - Existing Solutions
- CNRI's Handle System
  - good implementation of URN specification
  - 1 Handle >> one or more locations
  - resolve to different data types (URL, IOR, ...)
- OCLC's PURL
  - persistent URLs, not really URNs
  - 1 PURL >> only one location (an HTTP redirect)
- Community-specific Initiatives
  - Digital Object Identifier (DOI) - publishers
    - Handle System, Rights Metadata
  - PubMedID - Medline
  - BibCode - astrophysics journals
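The contrast between the two resolution models (one Handle to one or more typed values, one PURL to a single redirect target) can be shown with two small record types. This is a conceptual sketch only: it does not use the Handle System or PURL software, the 302 status code is chosen for illustration, and all values are placeholders.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class HandleRecord:
    """One handle maps to one or more typed values (e.g., URL, IOR)."""
    values: List[Tuple[str, str]]  # (type, data) pairs

    def resolve(self, wanted_type: str) -> List[str]:
        return [data for kind, data in self.values if kind == wanted_type]


@dataclass
class PurlRecord:
    """One PURL maps to exactly one target, delivered as an HTTP redirect."""
    target: str

    def resolve(self) -> Tuple[int, Dict[str, str]]:
        # Status code and header as an ordinary HTTP redirect would carry them.
        return 302, {"Location": self.target}


handle = HandleRecord(values=[
    ("URL", "http://mirror-a.example.org/obj"),
    ("URL", "http://mirror-b.example.org/obj"),
    ("IOR", "IOR:placeholder"),  # stand-in for a CORBA object reference
])
purl = PurlRecord(target="http://www.example.org/obj")
```

The handle can answer different questions depending on the requested type, while the PURL always yields the same single location.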
25. FEDORA Status
- Reference Implementation
  - CORBA IDL defines open interfaces for Repository Access Protocol (RAP)
  - Java/CORBA repository and clients
- Collaborations
  - CNRI
    - core design and interoperability
    - complex disseminations (dynamic)
  - U of Virginia
    - web integration
    - complex disseminations (e.g., e-texts)
26. New Research
- DLI2 - Project Prism
  - security (associating enforceable policies and mechanisms with DigitalObjects)
  - preservation (enable long-term survival of DigitalObjects in distributed environment)
- IDL - Harmony
  - aggregation and interaction of multiple, complex metadata sets in DigitalObjects
  - RDF and XML
27. PRISM Security Policy Enforcement
- Challenges
  - what is enforceable?
  - distributed object environment
  - interoperability and extensibility
- Monitor all operations, generic and extended
- Enforce a wide array of policies
  - basic security violations
  - rights management
  - access control
[Diagram labels: GetDCField, GetDCRecord; DC; application/MARC; text/x-acl]
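The idea of monitoring every operation can be sketched as a wrapper that checks a policy before forwarding each call. The sketch assumes that the text/x-acl label in the diagram refers to an access control list carried with the object; the ACL syntax ("principal: op1, op2"), the `PolicyMonitor` class, and the sample principals are invented for illustration and do not describe the Prism design.

```python
class AccessDenied(Exception):
    pass


def parse_acl(text: str) -> dict:
    """Parse a made-up ACL format: one 'principal: op1, op2' rule per line."""
    acl = {}
    for line in text.splitlines():
        if line.strip():
            principal, ops = line.split(":", 1)
            acl[principal.strip()] = {op.strip() for op in ops.split(",")}
    return acl


class PolicyMonitor:
    """Wraps a disseminator so every operation passes a policy check first."""

    def __init__(self, target, acl: dict, principal: str):
        self._target = target
        self._acl = acl
        self._principal = principal

    def __getattr__(self, operation):
        # Only reached for names not defined on the monitor itself,
        # i.e. for the wrapped disseminator's generic or extended behaviors.
        method = getattr(self._target, operation)

        def checked(*args, **kwargs):
            if operation not in self._acl.get(self._principal, set()):
                raise AccessDenied(f"{self._principal} may not call {operation}")
            return method(*args, **kwargs)

        return checked


class DC:
    """Stand-in disseminator with two behaviors named after the slide."""

    def GetDCField(self, name):
        return {"Title": "A Sample Book"}[name]

    def GetDCRecord(self):
        return {"Title": "A Sample Book"}


acl = parse_acl("alice: GetDCField, GetDCRecord\nguest: GetDCField")
guest = PolicyMonitor(DC(), acl, "guest")
```

Because the check sits in `__getattr__`, a behavior added to the disseminator later is monitored without any change to the wrapper.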
28. PRISM Preservation
[Diagram labels: Fedora Repositories; Preservation Service]
29. PRISM Preservation Policy Enforcement
Preservation Surrogate Object: monitors DigitalObject state and catches unacceptable or risky transitions
[Diagram labels: Preserve; Book; P; DS1; preservation metadata; Preservation Service; application/postscript; DS2]
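A surrogate that watches state changes and vetoes risky ones might look like the sketch below. The policy shown (refusing to delete DataStreams of protected MIME types) is an invented example of a "risky transition", the log is a stand-in for preservation metadata, and none of the names come from the Prism project.

```python
class RiskyTransition(Exception):
    pass


class PreservationSurrogate:
    """Watches changes to an object's DataStreams and vetoes risky ones."""

    def __init__(self, datastreams: dict, protected_types: set):
        self._datastreams = dict(datastreams)  # name -> (mime_type, bytes)
        self._protected = protected_types
        self.log = []                          # stand-in for preservation metadata

    def _check(self, action: str, name: str) -> None:
        mime_type, _ = self._datastreams[name]
        if mime_type in self._protected:
            self.log.append(f"blocked {action} of {name}")
            raise RiskyTransition(f"{action} of {name} ({mime_type}) refused")

    def delete(self, name: str) -> None:
        self._check("delete", name)
        del self._datastreams[name]
        self.log.append(f"deleted {name}")

    def names(self) -> list:
        return sorted(self._datastreams)


book = PreservationSurrogate(
    {
        "DS1": ("application/MARC", b"marc-record-bytes"),
        "DS2": ("application/postscript", b"%!PS"),
    },
    protected_types={"application/postscript"},
)
```

Every attempted change, allowed or blocked, leaves a log entry, so the object's history can be inspected later.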
30. References
- Payette, Blanchi, Lagoze, and Overly: Interoperability for Digital Objects and Repositories: The Cornell/CNRI Experiments, D-Lib Magazine, May 1999. http://www.dlib.org/dlib/may99/payette/05payette.html
- Payette and Lagoze: Flexible and Extensible Digital Object and Repository Architecture (FEDORA), ECDL 1998. http://www.cs.cornell.edu/payette/papers/ECDL98/FEDORA.html
- Lagoze and Payette: An Infrastructure for Open-Architecture Digital Libraries. http://ncstrl.cs.cornell.edu/Dienst/UI/1.0/Display/ncstrl.cornell/TR98-1690
- Daniel, Lagoze, and Payette: A Metadata Architecture for Digital Libraries, IEEE ADL 1998. http://www.cs.cornell.edu/lagoze/papers/ADL98/dar-adl.html
- FEDORA Home Page: http://www.cs.cornell.edu/NCSTRL/CDLRG/FEDORA.html
- Payette: Persistent Identifiers on the Digital Terrain, RLG DigiNews, April 1998, Volume 2, Number 2. http://www.rlg.org/preserv/diginews/diginews22.html