Creating Working Digital Libraries - PowerPoint PPT Presentation

Creating Working Digital Libraries Howard Besser UCLA School of Education & Information http://www.gseis.ucla.edu/~howard – PowerPoint PPT presentation

Slides: 48

1
Creating Working Digital Libraries
  • Howard Besser
  • UCLA School of Education & Information
  • http://www.gseis.ucla.edu/~howard

2
Creating Working Digital Libraries
  • Moving from Digital Collections to Digital
    Libraries
  • Interoperability
  • Importance of Standards
  • Longevity
  • Best Practices for Managing Digital Projects
  • Some Wild Musings

3
Moving from Digital Collections to Digital
Libraries
  • What's the difference?
  • Recent history of Library Automation

4
Developmental Stages
  • Experiment with methods
  • Build real operational systems
  • Build interoperable operational systems

5
Traditional Digital Library Model
6
Ideal Digital Library Model
7
Developmental Stages
  • Experiment with methods
  • Build real operational systems
  • Build interoperable operational systems
  • For DL Initiatives
  • For OPACs
  • For I&A Services
  • For Image Retrieval

8
Key problems we're facing
  • Discovery
  • Interoperability
  • Longevity

9
For Interoperability, Digital Libraries Need
Standards
  • Descriptive Metadata for consistent description
  • Discovery Metadata for finding
  • Administrative Metadata for viewing and
    maintaining
  • Structural Metadata for navigation
  • ... Terms & Conditions Metadata for controlling
    access...

10
Metadata is not just indexing terms
  • CBIR attributes used for retrieval on color,
    shape, texture, etc.
  • Structural attributes used for page-turning
  • Administrative attributes used for managing a
    digital work over time
  • IPR attributes to limit unauthorized use
  • Identification attributes to determine what
    application software is needed to view a
    particular digital work
  • Can be located anywhere
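
To make the page-turning point concrete, here is a minimal sketch of structural attributes driving navigation. The `StructuralMap` class, its field names, and the file names are hypothetical illustrations, not part of any standard named in the talk.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class StructuralMap:
    """Hypothetical structural metadata: the reading order of page images."""
    work_id: str
    pages: List[str] = field(default_factory=list)  # image files in reading order

    def next_page(self, current: str) -> Optional[str]:
        """Return the page after `current`, or None at the end of the work."""
        i = self.pages.index(current)
        return self.pages[i + 1] if i + 1 < len(self.pages) else None

    def previous_page(self, current: str) -> Optional[str]:
        """Return the page before `current`, or None at the start of the work."""
        i = self.pages.index(current)
        return self.pages[i - 1] if i > 0 else None


book = StructuralMap("example-book", ["p001.tif", "p002.tif", "p003.tif"])
print(book.next_page("p001.tif"))      # p002.tif
print(book.previous_page("p001.tif"))  # None
```

The page order lives in the metadata rather than in the file names, so a viewer can turn pages without guessing at a naming scheme.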

11
Why are Standards and Metadata consensus
important?
  • Managing digital files over time
  • Longevity
  • Interoperability
  • Veracity
  • Recording in a consistent manner
  • Will give vendors incentive to create
    applications that support this

12
Why Standards?
  • Why do we need standards?
  • To make information universally available to
    users
  • To facilitate sharing and interchange of
    information
  • To preserve information (make it safe from
    changes in hardware and software)
  • Standards only work if communities widely accept
    them, but they're necessary for communities to
    work together

13
Serious Longevity Problems
  • What we know from prior widespread digital file
    formats
  • Images separating from their metadata
  • Inaccessibility of software needed to view an
    image
  • Inability to even decode the file format of an
    image

14
Journal Archiving
  • License, don't own: may not even be able to
    obtain the right to make an archival copy
  • Increasingly no paper back-up at all
  • Usually we don't have the important redundancy
    factor
  • Stanford's LOCKSS Project (Lots of Copies Keeps
    Stuff Safe) and its problems
    (http://lockss.stanford.edu)

15
The Short Life of Digital Info: Digital Longevity
Problems
  • Disappearing Information
  • The Viewing Problem
  • The Scrambling Problem
  • The Inter-relation Problem
  • The Custodial Problem
  • The Translation Problem

16
The Viewing Problem
  • Digital Info requires a whole infrastructure to
    view it
  • Each piece of that infrastructure is changing at
    an incredibly rapid rate
  • How can we ever hope to deal with all the
    permutations and combinations?

17
The Scrambling Problem: Dangers from
  • Compression to ease storage & delivery
  • Container Architecture to enhance digital commerce

18
The Inter-relation Problem
  • Info is increasingly inter-related to other info
  • How do we make our own Info persist when it
    points to and integrates with Info owned by
    others?
  • What is the boundary of a set of information (or
    even of a digital object)?

19
The Custodial Problem
  • How do we decide what to save?
  • Who should save it?
  • How should they save it?
  • Methods for later access: emulation, migration,
    etc.
  • Issues of authenticity and evidence

20
The Translation Problem
  • Content translated into new delivery devices
    changes meaning
  • A photo vs. a painting
  • If Info is produced originally in digital form
    in one encoded format, will it be the same when
    translated into another format?
  • Behaviors

21
Pieces of the Solution (1/2)
  • We need to insist upon clearly readable
    standardized ways for digital objects to
    self-identify their formats
  • We should discourage scrambling
  • We need to better understand how information
    inter-relates to other Info, and what constitutes
    boundaries of Info objects
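
One existing form of format self-identification is the leading byte signature ("magic number") that many image formats carry. The sketch below checks a file header against a few well-known signatures. The `identify_format` function and the choice of formats are illustrative assumptions, and this is far weaker than the standardized self-identification the slide calls for.

```python
from typing import Optional

# Leading byte signatures ("magic numbers") for a few common image formats.
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"\xff\xd8\xff", "JPEG"),
    (b"GIF87a", "GIF"),
    (b"GIF89a", "GIF"),
    (b"II*\x00", "TIFF (little-endian)"),
    (b"MM\x00*", "TIFF (big-endian)"),
]


def identify_format(header: bytes) -> Optional[str]:
    """Return a format name if the first bytes match a known signature."""
    for magic, name in SIGNATURES:
        if header.startswith(magic):
            return name
    return None


print(identify_format(b"II*\x00\x08\x00\x00\x00"))  # TIFF (little-endian)
print(identify_format(b"unknown bytes"))             # None
```

A signature names only the container format. It says nothing about the compression scheme or the application needed, which is why fuller metadata is still required.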

22
Pieces of the Solution (2/2)
  • People and organizations wishing to make
    information persist need guidelines on how to go
    about doing it
  • We need to better understand how translating
    from one storage or display format to another
    affects the meaning of a work
  • We need to save the behaviors of a digital
    object, not just its contents

23
Metadata can be the first line of defense
  • Can tell you
  • where the file is (if you can't find the file)
  • where more info about the file is (if you have
    the file but most other metadata has become
    separated)
  • what the file format is
  • what the compression scheme is
  • what application program and version is needed
    for the file
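
As a rough sketch, the facts listed above could be kept in a small record stored apart from the file itself. The `PreservationRecord` class, its field names, and all sample values are hypothetical and do not follow any particular metadata standard.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class PreservationRecord:
    """Hypothetical record holding the facts listed above for one file."""
    file_location: str        # where the file is
    more_info_location: str   # where fuller metadata about the file lives
    file_format: str
    compression: str
    application: str          # program needed to open the file
    application_version: str


def to_sidecar(record: PreservationRecord) -> str:
    """Serialize the record as JSON so it can be stored separately."""
    return json.dumps(asdict(record), indent=2, sort_keys=True)


record = PreservationRecord(
    file_location="masters/img0001.tif",
    more_info_location="catalog/img0001.xml",
    file_format="TIFF",
    compression="none",
    application="ExampleViewer",
    application_version="1.0",
)
print(to_sidecar(record))
```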

24
Groups Working on the Big Longevity Problem
http://sunsite.Berkeley.EDU/Imaging/Databases/Longevity/
  • CPA Task Force
  • Getty Time & Bits Conference follow-up
  • NEDLIB, CURL, Michigan
  • Internet Archive
  • Long Now

25
Migration/Refreshing
  • Impact on evidential value

26
Best Practices for Managing Digital Projects
  • Who will your users be?
  • Best Practices Guidelines
  • Workflow and Management Issues

27
Why are you Managing this Information?
  • Organizational mission type
  • Users
  • Uses

28
Scanning Best Practices
  • Think about users (and potential users), uses,
    and type of material/collection
  • Scan at the highest quality that does not exceed
    the needs of the likely potential
    users/uses/material
  • Do not let today's delivery limitations influence
    your scanning file sizes; understand the
    difference between digital masters and derivative
    files used for delivery
  • Many documents which appear to be bitonal
    actually are better represented with greyscale
    scans
  • Include color bar and ruler in the scan
  • Use objective measurements to determine scanner
    settings (do NOT attempt to make the image good
    on your particular monitor or use image
    processing to color correct)
  • Don't use lossy compression
  • Store in a common (standardized) file format
  • Capture as much metadata as is reasonably
    possible (including metadata about the scanning
    process itself)
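
To see why master file sizes should be planned separately from delivery files, it helps to estimate the raw pixel data a scan produces. The sketch below does that arithmetic. The 8 x 10 inch original and the 600 dpi setting are arbitrary assumptions for illustration, not recommendations from the talk.

```python
def uncompressed_size_bytes(width_in: float, height_in: float,
                            dpi: int, bits_per_pixel: int) -> int:
    """Size of the raw pixel data for a scan, ignoring headers and compression."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bits_per_pixel // 8


# An 8 x 10 inch original at 600 dpi in three capture modes.
for label, bits in [("bitonal", 1), ("greyscale", 8), ("24-bit color", 24)]:
    size = uncompressed_size_bytes(8, 10, 600, bits)
    print(f"{label}: {size / 1_000_000:.1f} MB")
# bitonal: 3.6 MB
# greyscale: 28.8 MB
# 24-bit color: 86.4 MB
```

A greyscale master is eight times the size of a bitonal one at the same resolution, which is the cost of the better representation the slide recommends for documents that only appear bitonal.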

29
Why Scale is important
30
Digital Object Behaviors
  • Book example

31
Metadata Standards (from MOA2)
  • Administrative Metadata
  • for enhancing resource management
  • Structural Metadata
  • for reflecting internal hierarchies and
    relationships between parts
  • Raw/Seared/Cooked

32
Workflow and Management Issues
  • Managing multiple image files
  • Persistent Identification
  • Making your works accessible throughout the Net

33
The number of variant forms of a work can be
enormous
  • different views of the same object
  • different scans of the same photo
  • different resolutions
  • different compression schemes
  • different compression ratios
  • different file storage formats
  • different details of the same image
  • ...

34
Image Families
35
Identification/Provenance
  • how to deal with different versions (browse,
    hi-res, medium res) derived from the same scan or
    different encoding schemes (TIFF, PICT, JFIF)
  • Vocabulary Standards to express this
  • VRA Surrogate Categories
  • CIMI's "Image Elements"
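
One way to picture the versioning problem is to record, for each file, what it was derived from. The sketch below does this for a small image family. The `ImageFile` class, the role names, and the identifiers are hypothetical and are not drawn from the VRA or CIMI vocabularies.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ImageFile:
    """Hypothetical record for one member of an image family."""
    file_id: str
    role: str                    # e.g. "master", "medium res", "browse"
    encoding: str                # e.g. "TIFF", "JFIF"
    derived_from: Optional[str]  # file_id of the source file, None for the scan


def provenance(files: Dict[str, ImageFile], file_id: str) -> List[str]:
    """Walk derived_from links back to the original scan."""
    chain = [file_id]
    while files[chain[-1]].derived_from is not None:
        chain.append(files[chain[-1]].derived_from)
    return chain


family = {
    f.file_id: f
    for f in [
        ImageFile("scan-01", "master", "TIFF", None),
        ImageFile("scan-01-med", "medium res", "JFIF", "scan-01"),
        ImageFile("scan-01-browse", "browse", "JFIF", "scan-01-med"),
    ]
}
print(provenance(family, "scan-01-browse"))
# ['scan-01-browse', 'scan-01-med', 'scan-01']
```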

36
Persistent IDs--the Problem
  • Need to separate work ID from work location
  • URNs probably won't be ready until 2003
  • Becomes a business process issue when one
    organization maintains the resource and another
    organization references it (i.e., licensed from
    vendors or managed by separate administrative
    structures)

37
More Persistent IDs--the Approach for today
  • PURLs
  • Handles
  • HTTP redirects
  • And worry about costs now and conversion costs
    when URNs become feasible
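
The redirect-based approaches above share one idea: references cite a stable ID, and a resolver maps that ID to wherever the work currently lives. Here is a minimal in-memory sketch of that separation. The `resolve` and `relocate` functions, the ID scheme, and the example.org URLs are all hypothetical and are not the API of PURLs or Handles.

```python
from typing import Dict, Tuple

# Hypothetical resolver table: persistent work ID -> current location.
LOCATIONS: Dict[str, str] = {
    "work/0001": "http://server-a.example.org/images/0001.tif",
}


def resolve(work_id: str) -> Tuple[int, str]:
    """Return an HTTP-style (status, location) pair for a persistent ID."""
    if work_id in LOCATIONS:
        return 302, LOCATIONS[work_id]  # redirect to wherever the work lives now
    return 404, ""


def relocate(work_id: str, new_location: str) -> None:
    """Record a move; references that cite the ID keep working."""
    LOCATIONS[work_id] = new_location


print(resolve("work/0001"))
relocate("work/0001", "http://server-b.example.org/archive/0001.tif")
print(resolve("work/0001"))
```

Only the resolver table changes when a file moves. Keeping that table current is the ongoing cost, and it becomes a business process issue when the organization holding the file is not the one citing it.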

38
Data Set Management: More issues with referencing
IDs
  • References for mirror sites
  • References for back-up sites when main site is
    down or bottle-necked
  • References for off-site copies and archival copies

39
Making your works accessible throughout the Net
  • The DLF/Mellon meeting
  • An administrative and political issue as much as
    a technical one

40
Some Wild Musings
  • Movement towards packages and away from MARC
  • The disappearance of OPACs

41
Containers and Packages of Metadata: Warwick, not
MARC
  • modular
  • overlapping
  • extensible
  • community-based
  • designed for a networked world to aid commonality
    between communities while still providing full
    functionality within each community

42
DC Qualifiers
  • allows one community to express important nuances
    and qualifications, while still making the basic
    importance available to communities with simple
    needs
  • our community can reflect alternate title,
    transliterated title, and main title, yet they
    will all be found under a simple Web search under
    title
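
The title example can be sketched as a "drop the qualifier" step: a specialist community stores qualified elements, and a simple search sees every value under the base element. The record layout, the qualifier names, and the sample values below are hypothetical and are not the official Dublin Core qualifier vocabulary.

```python
from typing import Dict, List, Tuple

# Hypothetical qualified record: (element, qualifier) -> values.
QualifiedRecord = Dict[Tuple[str, str], List[str]]


def simplify(record: QualifiedRecord) -> Dict[str, List[str]]:
    """Drop qualifiers so every value is filed under its base element."""
    simple: Dict[str, List[str]] = {}
    for (element, _qualifier), values in record.items():
        simple.setdefault(element, []).extend(values)
    return simple


record: QualifiedRecord = {
    ("title", "main"): ["Example Main Title"],
    ("title", "alternate"): ["Example Alternate Title"],
    ("title", "transliterated"): ["Example Transliterated Title"],
    ("creator", ""): ["Example Author"],
}
simple = simplify(record)
print(simple["title"])
```

All three titles end up searchable under `title`, while the original record still distinguishes them for users who need the nuance.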

43
Crosswalks
  • mapping between differing metadata structures
  • eliminate the need for monolithic, universally
    adopted standards
  • focus on flexibility and interoperability
  • RDF-based metadata registries
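
At its simplest, a crosswalk is a lookup table from fields in one scheme to fields in another. The sketch below applies such a table to a MARC-like record. The four-entry mapping is a deliberately simplified assumption for illustration, not a published crosswalk, and the `crosswalk` function and sample record are hypothetical.

```python
from typing import Dict, List

# Illustrative field mapping only; a real MARC-to-Dublin Core crosswalk is
# far more detailed (subfields, indicators, repeated fields).
MARC_TO_DC: Dict[str, str] = {
    "100": "creator",
    "245": "title",
    "260": "publisher",
    "650": "subject",
}


def crosswalk(source: Dict[str, List[str]],
              mapping: Dict[str, str]) -> Dict[str, List[str]]:
    """Copy values from source fields to their mapped target fields."""
    target: Dict[str, List[str]] = {}
    for field_name, values in source.items():
        if field_name in mapping:  # unmapped fields are dropped
            target.setdefault(mapping[field_name], []).extend(values)
    return target


marc_like = {
    "245": ["Example title"],
    "100": ["Example, Author"],
    "650": ["Digital libraries", "Metadata"],
    "999": ["local data with no mapped equivalent"],
}
print(crosswalk(marc_like, MARC_TO_DC))
```

The dropped `999` field shows the usual trade-off: mapping to a simpler scheme loses detail, so crosswalks complement richer community standards rather than replacing them.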

44
Crosswalk Example
45
Do we still need OPACs?
  • Why repeat almost identical bibliographic
    descriptions in each local system?
  • Why not store only local information locally, and
    link to bibliographic descriptions stored in the
    major utilities?
  • Could our acquisition systems for monographs
    begin to use the acquisition systems imposed on
    us by our parent organizations (like those for
    supplies)?

46
Creating Working Digital Libraries
  • Moving from Digital Collections to Digital
    Libraries
  • Interoperability
  • Importance of Standards
  • Longevity
  • Best Practices for Managing Digital Projects
  • Some Wild Musings

47
Creating Working Digital Libraries
  • Howard Besser
  • UCLA School of Education & Information
  • http://www.getty.edu/gri/standard/intrometadata/
  • http://www.ifla.org/II/metadata.htm
  • http://sunsite.Berkeley.EDU/Imaging/Databases/standards
  • http://sunsite.Berkeley.EDU/moa2/
  • http://sunsite.Berkeley.EDU/Longevity/
  • http://purl.oclc.org/metadata/dublin_core/
  • http://www.gseis.ucla.edu/~howard/image-meta.html
  • http://www.gseis.ucla.edu/~howard/Metadata/UC-May00/
  • http://sunsite.berkeley.edu/Metadata/sp2000.html
  • http://www.gseis.ucla.edu/~howard/