Best Practices in Developing a Digital Library Repository Using Fedora at the UVa Library - PowerPoint PPT Presentation

1 / 18

About This Presentation

Title:

Best Practices in Developing a Digital Library Repository Using Fedora at the UVa Library

Description:

... or work, a text volume, an episode of a television news show, an oral history interview) ... the same XML file), or supporting the download of image files. ... – PowerPoint PPT presentation

Number of Views:78

Avg rating:3.0/5.0

Slides: 19

Provided by: lesliej152

Category:

more less

Transcript and Presenter's Notes

Title: Best Practices in Developing a Digital Library Repository Using Fedora at the UVa Library

1
Best Practices in Developing a Digital Library
Repository Using Fedora at the UVa Library

Leslie Johnston
Head of Digital Access Services, University of
Virginia Library
Fedora Users Meeting, Open Repositories 2007

2
Underlying Object Architecture Must be First

The object architecture guides all repository
development steps.
Based on core architectural and service
assumptions, such as these at UVa
We have complex objects with many relationships.
We have simple objects that users just want to
find and use.
Any given resource can be associated with and
presented in any number of contexts.
The Repository will have a public interface to
support discovery and use of the collections by
the UVa community, accompanied by tools for their
use in instruction and research.

3
Underlying UVa Object Architecture

At UVA, an atomic or granular approach
Media Objects (art and architecture images, page
images, video, audio) are the most granular, and
are always children of work objects. The
datastreams are metadata and media files
Work Objects represent a logical object or
concept (a visual resources concept of site or
work, a text volume, an episode of a television
news show, an oral history interview). Work
Objects provide the principal mode of discovery
for Media Objects. Work Objects are not required
to have Media Object children. The datastreams
are metadata and inline content, such as TEI
transcriptions and EAD finding aids
A Work Object can be a child of an Aggregation
Object, representing a real or virtual
collection, a serial publication, a television
news series, etc. The datastreams are metadata
for the aggregation set and rules specific to the
set used to present all or subsets of its members.

4
Underlying UVa Object Architecture

Many relationships are possible
Media Objects can be the children of multiple
Work Objects.
A drawing of a building in a letter can be part
of a visual resources object representing the
structure, as well as the child of a text object
for the letter).
Work Objects may be part of many Aggregation
Objects.
A painting may be part of a collection of work by
that one artist, and part of a virtual collection
representing a particular subject theme.
Aggregation Objects may be the children of other
Aggregation Objects, where appropriate.

5
Disseminators in a Granular Architecture

Media Objects have disseminators that primarily
provide access to the datastreams.
Examples
The uvaDefault disseminator includes a
getDefaultContent behavior which gets whichever
datastream is considered the default view for a
user.
The uvaImage disseminator includes the behaviors
getPreview and getScreen, which simply get
specific datastreams. It also includes a
getImageViewer behavior which gets an image
datastream and wraps it in an applet that allows
manipulation of the image.
The uvaDefault and uvaMeta disseminators include
behaviors that return xml (get) or styled html
versions (view) of descriptive or administrative
metadata, or Dublin Core.

6
Disseminators in a Granular Architecture

Work Objects have disseminators that provide
access to the work and its media children.
Examples
The default disseminator includes the getFullView
behavior which returns a styled presentation of a
text and its child page images, if there are any.
The uvaPageBook disseminator includes the
getPageTurner behavior which gets the child
images and presents them for navigation in a page
turner.
The uvaDefault and uvaMeta disseminators include
behaviors that return xml (get) or styled html
versions (view) of descriptive or administrative
metadata, or Dublin Core.

7
Disseminators in a Granular Architecture

Aggregation Objects primarily have disseminators
that provide access to a set of Work Objects.
Examples
getCollection presents the default view of the
collection.
getMembers returns a list of members all or
only those which meet criteria expressed through
an XPath statement.
getPids returns a list of all PIDS for member
Work Objects.
viewDublin Core presents styled Dublin Core data
for the aggregation.

8
Content Model Development

Not surprisingly, the first step is the
identification of functional requirements
Coordinated in a format-by-format basis involving
research and development staff, repository
operations staff, and public services staff who
are specific format experts.
Identify all functional requirements for end-user
use, including default delivery options,
interaction with the objects through the web
interface, and downloading possibilities.

9
Content Model Development

Inventorying of digital media
Detailed review of the legacy collections as to
media formats, configurations of media files,
available metadata, and metadata format.

10
Content Model Development

Development of production standards
Production standards are developed out of the
inventory of legacy materials and metadata and
the wish list for functional requirements, where
we identify the appropriate formats and
configuration of files to enable the desired
functionality.
The standards must jibe with what we have, what
we can migrate legacy collections to, and
generate in a scaleable way in all future
production.
http//www.lib.virginia.edu/digital/reports/uvalib
_production_standards.htm
http//www.lib.virginia.edu/digital/metadata/

11
Content Model Development

Translation of the output of all those processes
into content models and disseminators
The key to developing content models is limiting
the number of content models you will have for
each format class, such as images or electronic
texts or video. Multiple content models for each
format class are often needed.
Each content model for each format class
corresponds to a necessary variation of the
configuration of media files, which require
different mechanisms for the behaviors and
therefore constitute a different content model.
All content models have a default disseminator,
and metadata disseminator, and an asset
definition disseminator, plus disseminators that
are specific to the media type.

12
Content Model Development

Translation of functional specifications into
content models and disseminators
Each functional requirement for the end-user
interface translates to one or more disseminators
with behaviors that objects should be able to
present, such as delivering subsets of their
content or metadata, delivery of static files or
on-the-fly transformations (such as raw XML
delivery versus delivery of XHTML generated from
the same XML file), or supporting the download of
image files.
Great care was needed to confirm that the desired
functionality for the search and delivery in the
public interface matched behaviors in the content
models, and that the correct number and types of
media file and metadata datastreams were present
to support those behaviors.

13
UVa Content Models

Media Objects
Bitonal
1 Bitonal TIFF datastream
1 static JPEG thumbnail datastream
HighRes
1 MRSID datastream
1 static JPEG screen size datastream
1 static JPEG thumbnail datastream
LowRes
1 JPEG screen size datastream
1 JPEG thumbnail datastream
Work Objects
GDMS (image metadata for works)
EAD (metadata describing archival collection)
GenText (TEI transcription only)
Book (TEI transcription and page images)
PageBook (page images only)
Aggregation Objects Still under review for
final content models

14
Supporting it all Production Workflows

Translation of the production standards and
configuration of content models into scaleable,
sustainable workflows.
Coordinated by research and development staff,
repository operations staff, and production staff.

15
Production Workflows

Subject librarians select all content, with the
additional step of a technical assessment.
Assessment includes review of media formats and
metadata against production standards.
After the assessment, new production and legacy
collection migration are prioritized.
Ease of production, e.g., existing workflows in
place
Need for metadata enrichment or normalization
Time constraints, such as instructional deadlines
Funding availability for staff and infrastructure
Workflows are in place for Images, Electronic
Texts, and EAD Finding Aids. This includes all
steps for handoffs of materials for digitization,
digitization, cataloging and export of cataloging
records from native systems to XML, programmatic
creation of administrative and technical
metadata, QA, handoff to repository operations,
object building and ingest, and indexing for
delivery.

16
(No Transcript)
17
(No Transcript)
18
Ongoing Process

Review and revision of what weve done
architecture, standards, and disseminators.
Dont be afraid to assess and update.
Still adding new format types video, datasets,
and printed music are under way, with audio and
GIS next.

Write a Comment

User Comments (0)