Title: Archiving Workflow between a Local Repository and the National Library Archive
1. Archiving Workflow between a Local Repository and the National Library Archive
- Experiences from the DiVA Project
- Eva Müller, Peter Hansson, Uwe Klosa, Stefan Andersson
- Electronic Publishing Centre
- Uppsala University Library, Sweden
2. Focus on
- Using URN:NBN as a persistent identifier for referencing and as an identifier in an archiving workflow
- Workflow between a local repository and the national library archive
3. Outline
- DiVA project and its objectives
- DiVA publishing system
- DiVA Archive and archiving workflow
- URN:NBN and its role in the workflow
- Conclusions and next steps
4. DiVA Project: Digitala Vetenskapliga Arkivet (Digital Scientific Archive)
- Since September 2000
- Objectives
  - Technical solutions and a well-functioning workflow supporting full-text publication of doctoral theses, working papers
  - Explore ways to ensure the future use and understanding of the digital objects in the archive
5. DiVA Publishing System
- Focus on workflow
  - reuse and enhance the data directly from the source document, originally created by authors, for metadata and a digital master for an electronic and printed version
  - store checksums of the files
  - assign a persistent identifier
  - send a copy to the National Library Archive
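
The checksum step in the list above can be sketched as follows. This is a minimal illustration, not the DiVA implementation: the slides do not name a checksum algorithm, so SHA-256 and the function names `file_checksum` and `verify_checksum` are assumptions.

```python
import hashlib


def file_checksum(path, algorithm="sha256", chunk_size=65536):
    """Return the hex digest of a file, reading it in chunks to bound memory use.

    The algorithm is an assumption; the slides only say that checksums are stored.
    """
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checksum(path, expected, algorithm="sha256"):
    """Compare a file's current digest with a stored one (case-insensitive hex)."""
    return file_checksum(path, algorithm) == expected.lower()
```

Storing the digest at publication time and recomputing it later lets any holder of a copy detect silent corruption without access to the original.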
6. [Diagram. Labels: Author; Word Processor; Word Processing Format (Template); DiVA Document Format; DiVA Manager; DiVA Document Format; Local Repository; Local Long-term Storage; Long-term storage packages]
7. Implementation
- Java and XML technologies
- Currently an Oracle database is used for indexing and searching
- Architecture: component-based design
  - Modularity and reusability of the components
  - Possibility to seamlessly replace modules with improved implementations of the component
8. DiVA system
- Used by 5 universities in Sweden
  - Stockholm, Umeå, Uppsala and Örebro universities, and Södertörns högskola
- Soon 1 university in Denmark
  - Århus University (Statsbiblioteket i Århus)
9. Issues
- How can we ensure the future access and understanding of documents we produce locally?
- What factors increase potential for success?
- Can these factors be integrated into an automated and low-cost workflow?
10. How can we ensure accessibility in the future?
- A stable point of reference (persistent identifier)
- Use human-readable, non-proprietary storage format
- Storage in several locations
11. How can we minimize risks for data loss?
- Multiple copies in different locations
- Mechanism to keep track of copies
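
One way to picture a mechanism for keeping track of copies is a small registry keyed by identifier. The slides do not describe how DiVA tracks copies, so the class `CopyRegistry`, its methods and the location names are all hypothetical.

```python
from collections import defaultdict


class CopyRegistry:
    """Hypothetical in-memory record of where each item's copies are stored."""

    def __init__(self):
        self._locations = defaultdict(set)

    def record(self, identifier, location):
        """Note that a copy of the item exists at the given location."""
        self._locations[identifier].add(location)

    def locations(self, identifier):
        """Return the known locations of an item, sorted for stable output."""
        return sorted(self._locations.get(identifier, ()))

    def under_replicated(self, minimum=2):
        """Return identifiers with fewer than `minimum` recorded copies."""
        return sorted(
            identifier
            for identifier, places in self._locations.items()
            if len(places) < minimum
        )
```

A periodic call to `under_replicated()` would flag items that exist in only one place, which is the situation the two bullets above aim to avoid.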
12. Analysis of interest
- Authors
  - Dissemination of their intellectual output
- Universities
  - Track research output
  - Reduce publishing cost
  - Increase impact
- National libraries
  - Legal deposit
13. Strategies
- Decision to use XML as a primary storage format
- Decision to use URN:NBN
- Decision to cooperate with the Royal Library
- Decision to fit all needs into an automated workflow
14. DiVA Document Format
- Internal format
- Version 1.0 (described in XML Schema)
- 99 elements
- Component based
- Extensible
- Administrative elements are combined with descriptive elements
- DocBook DTD is used for the content part of the document
15. [Diagram. Labels: Metadata Dissemination Services; Word Processor; Web Services; DiVA Document Format; Word Processing Format (Template); Author; Local Repository]
16. Implementation of the Archiving Workflow
- Assignment of the URN:NBN to the resources
- Implementation of URN:NBN Resolution Service
- URN:NBN as a unique identifier within the archive
- URN:NBN as a naming convention for files, directories and archival packages
- URN:NBN as a part of disseminated metadata
17. Assignment of the URN:NBN
- Sub-domain managed locally
- Structure: URN:NBN:se:X:diva
- URN:NBN:se:uu:diva-<locally managed serial number>
- URN:NBN is used as the identifier for each item; an item is a single publication without consideration of format
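
The identifier structure described above can be illustrated with a short sketch for minting and parsing. The exact delimiters (colons between the parts, a hyphen before the serial number) and the helper names `mint_urn` and `parse_urn` are assumptions made for the example; the slide only states that the sub-domain and the serial number are managed locally.

```python
import re

# Assumed pattern: urn:nbn:se:<institution>:diva-<serial>
URN_PATTERN = re.compile(r"urn:nbn:se:([a-z0-9]+):diva-([0-9]+)", re.IGNORECASE)


def mint_urn(institution, serial):
    """Build a URN:NBN from an institution code and a locally managed serial number."""
    if not re.fullmatch(r"[a-z0-9]+", institution):
        raise ValueError(f"invalid institution code: {institution!r}")
    if serial < 1:
        raise ValueError("serial number must be positive")
    return f"urn:nbn:se:{institution}:diva-{serial}"


def parse_urn(urn):
    """Split a URN of the assumed form into (institution, serial)."""
    match = URN_PATTERN.fullmatch(urn.strip())
    if match is None:
        raise ValueError(f"not a recognised DiVA URN: {urn!r}")
    return match.group(1).lower(), int(match.group(2))
```

Because the identifier names the publication and not a file, the same URN can cover every format in which the item is stored.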
18. Implementation of URN:NBN Resolution Service
- Only basic functionality, more development planned
- Implemented as a Java servlet; it contains a harvester which can harvest URN-to-URL bindings from many different repositories
19. [Diagram: URN:NBN resolution service at the Royal Library. Labels: User; request (e.g. http://urn.kb.se/resolve?urn); response (user redirected to a URL); Royal Library; URN:NBN resolution service; Resolution Service Configuration File; URN:NBN:se to URL mappings; request/response between the resolution service and the repositories; Repositories (DiVA, Other); URN:NBN Register Format]
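
The harvest-then-resolve behaviour of the resolution service can be sketched in a few lines. This is an in-memory illustration only, not the servlet described above: the class `UrnResolver`, its methods and the example URLs are hypothetical, and lowercasing the whole URN for lookup is a simplification.

```python
class UrnResolver:
    """Hypothetical registry of URN-to-URL bindings harvested from repositories."""

    def __init__(self):
        self._bindings = {}

    def harvest(self, bindings):
        """Merge (urn, url) pairs from one repository; a later harvest overrides."""
        for urn, url in bindings:
            self._bindings[urn.strip().lower()] = url

    def resolve(self, urn):
        """Return the URL bound to a URN, or None if the URN is unknown."""
        return self._bindings.get(urn.strip().lower())


resolver = UrnResolver()
# Example bindings as one repository might expose them (placeholder URLs).
resolver.harvest([
    ("urn:nbn:se:uu:diva-1", "http://example.org/diva/1"),
    ("urn:nbn:se:uu:diva-2", "http://example.org/diva/2"),
])
```

A web front end would answer a resolve request by redirecting the user to `resolver.resolve(urn)` when a binding exists.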
20. URN:NBN as a naming convention for files, directories and archival packages
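
The figure for this slide is not preserved, so the actual naming convention is unknown. As a purely hypothetical illustration of the idea, the sketch below derives a directory and file names from a URN by replacing colons, which are not allowed in file names on some systems, with underscores. The functions `urn_to_name` and `package_paths` and the chosen extensions are assumptions.

```python
from pathlib import PurePosixPath


def urn_to_name(urn):
    """Turn a URN into a name usable for files and directories.

    Replacing ':' with '_' is an assumed convention, not the documented one.
    """
    return urn.strip().lower().replace(":", "_")


def package_paths(urn, formats=("xml", "pdf")):
    """Return one path per format, all inside a directory named after the URN."""
    base = urn_to_name(urn)
    directory = PurePosixPath(base)
    return [str(directory / f"{base}.{extension}") for extension in formats]
```

Deriving every name from the identifier means a file found on its own can still be traced back to its publication.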
21. Workflow to the National Library Archive
- Archiving packages
  - Administrative and descriptive metadata stored in XML
  - Today content in PDF and, where possible, also in XML
  - Each manifestation is bundled into an AP (archiving package)
    - General data
    - Format specific data
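
To make the idea of an archiving package with XML metadata concrete, here is a minimal sketch that builds a manifest with general data and per-file, format-specific data. The element and attribute names (`package`, `general`, `manifestations`, `file`, `checksum`, `size`) are invented for the example and do not reflect the DiVA Document Format; SHA-256 is likewise an assumption.

```python
import hashlib
import xml.etree.ElementTree as ET


def build_manifest(urn, title, files):
    """Return an XML manifest string for a package.

    `files` maps a file name to its content as bytes. All element names are
    hypothetical placeholders, not the real archiving package schema.
    """
    root = ET.Element("package", {"urn": urn})

    # General data shared by all manifestations of the item.
    general = ET.SubElement(root, "general")
    ET.SubElement(general, "title").text = title

    # Format-specific data, one entry per stored file.
    manifestations = ET.SubElement(root, "manifestations")
    for name, content in sorted(files.items()):
        file_format = name.rsplit(".", 1)[-1].lower()
        entry = ET.SubElement(
            manifestations, "file", {"name": name, "format": file_format}
        )
        checksum = ET.SubElement(entry, "checksum", {"algorithm": "sha256"})
        checksum.text = hashlib.sha256(content).hexdigest()
        ET.SubElement(entry, "size").text = str(len(content))

    return ET.tostring(root, encoding="unicode")
```

Keeping the manifest in plain XML next to the content is in line with the deck's stated preference for human-readable, non-proprietary storage formats.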
22. Archiving Package
23. [Diagram: central and local components. Labels: Central; URN:NBN Resolution Service; Long-term Storage; Library Catalogue; Long-term storage packages; MARC 21; XML; Local; Long-term Storage; List of URN:NBN to URL mappings (urn:nbn:se:... -> http://www...); Long-term storage packages; Metadata; Repository; Metadata; Content]
24. Conclusions
- A low-cost system that supports a fully automated workflow from the point of submission works well
- Using a harvesting model for updates to the mapping registry makes the management of URN:NBN simple
- Automatic creation of MARC 21 records makes cataloguing faster and less expensive
- A push model for delivering archival packages makes this process more reliable and easier to manage
25. Conclusions (continued)
- Use of XML for all metadata associated with each archival package increases the likelihood of future understanding of the digital objects in the archive and offers the potential to easily extract document metadata, if necessary
- Modularity of the technical solution offers the advantage of component reusability and is a solid basis for further local and cooperative development
- Long-term access to institutional research publications can be assured with cooperation from national libraries
26. Next steps
- New project funded by the Royal Library's Department for National Co-ordination and Development (BIBSAM)
  - Examine and evaluate current solutions
  - Develop and implement a generalized archiving workflow between a local repository and a national archive, focusing on the variety of publishing platforms and systems
27. More information
- http://publications.uu.se/
- http://publications.uu.se/epcentre/
- http://publications.uu.se/conferences/ecdl2003/archiving_ECDL_2003.pdf