CS 502: Computing Methods for Digital Libraries - PowerPoint PPT Presentation

About This Presentation
Slides: 22
Provided by: wya54

Transcript and Presenter's Notes

1
CS 502: Computing Methods for Digital Libraries
  • Lecture 22
  • Repositories

2
Administration
Final examination: May 19, 2000, 1:00 - 2:30 pm, Phillips 219.
5 or 6 questions on the whole course.
Do you want an open-laptop examination?
There will be a make-up examination near the beginning of the examination period.
Please send me email if you might wish to take it.
3
Administration
Discussion class: Wednesday, April 19. One class only, from 7:30 to 8:30 p.m.
Online survey: http://create.hci.cornell.edu/cssurvey.cfm
4
Repositories
Definitions
A repository is any computer system whose primary function is to store digital material for use in a library.
An archive is a repository that is organized to emphasize the long-term preservation of information.
5
Requirements 1
Information hiding: internal organization should be hidden from client computers.
6
Repository layers and interfaces
Persistent Store
Store API
Object Management Layer
Shell API
Interface
External Interface
Clients
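The layering on this slide can be sketched in Python. This is only an illustration of information hiding: the class names (`PersistentStore`, `ObjectManager`, `ExternalInterface`) and their methods are hypothetical, not the API of any particular repository. Clients call the top layer only and never see how the store is organized.

```python
class PersistentStore:
    """Lowest layer: stores raw bytes by key (hypothetical Store API)."""

    def __init__(self):
        self._blobs = {}  # internal organization, never exposed to clients

    def put(self, key, data):
        self._blobs[key] = bytes(data)

    def get(self, key):
        return self._blobs[key]


class ObjectManager:
    """Middle layer: maps object identifiers onto store keys."""

    def __init__(self, store):
        self._store = store

    def save(self, object_id, content):
        self._store.put("obj/" + object_id, content.encode("utf-8"))

    def load(self, object_id):
        return self._store.get("obj/" + object_id).decode("utf-8")


class ExternalInterface:
    """Top layer: the only part of the repository that clients call."""

    def __init__(self, manager):
        self._manager = manager

    def deposit(self, object_id, content):
        self._manager.save(object_id, content)

    def access(self, object_id):
        return self._manager.load(object_id)


repository = ExternalInterface(ObjectManager(PersistentStore()))
repository.deposit("lecture-22", "Repositories")
print(repository.access("lecture-22"))
```

Because each layer talks only to the one below it, the in-memory dictionary could be swapped for a file system or a database without changing client code.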
7
Requirements 2
  • Object models
      • Support for a flexible range of object models.
      • Few restrictions on data, metadata, external links, and internal
        relationships.
      • New categories of information do not require fundamental changes
        to other aspects of the digital library.

8
Multiple disseminations
  • A client can access a choice of forms of a digital object
      • Format -- PDF or HTML
      • Performance -- 8 bit/pixel or 24 bit/pixel color
      • Content -- thumbnail, medium-resolution, high-resolution
  • The repository might store alternative disseminations or derive
    them when requested.
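A minimal sketch of the store-or-derive choice, assuming a hypothetical `DigitalObject` class. Plain text stands in for the PDF and image formats named on the slide, so the derivations here are toy ones.

```python
import html


class DigitalObject:
    """A digital object that can be disseminated in several forms."""

    def __init__(self, master_text):
        self.master_text = master_text
        # Disseminations the repository keeps ready-made.
        self.stored = {"text": master_text}
        # Disseminations computed only when a client asks for them.
        self.derivers = {
            "html": lambda text: "<p>" + html.escape(text) + "</p>",
            "thumbnail": lambda text: text[:12] + "...",
        }

    def disseminate(self, form):
        if form in self.stored:
            return self.stored[form]
        if form in self.derivers:
            return self.derivers[form](self.master_text)
        raise KeyError("no dissemination named " + repr(form))


report = DigitalObject("Repositories & archives store digital material.")
print(report.disseminate("html"))
print(report.disseminate("thumbnail"))
```

The client names the form it wants and cannot tell whether the result was stored or derived.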

9
Dynamic content
  • Dissemination is produced by executing code at the time the client
    makes a request
      • Real-time sensor, e.g., traffic camera, satellite picture
      • User characteristics, e.g., location, user profile
  • Dissemination is intrinsically dynamic, e.g.,
      • simulation
      • virtual reality
      • computer program
      • Java applet
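A sketch of a dissemination computed at request time, assuming a hypothetical sensor callback and a user profile held in a dictionary. The function name, the profile keys, and the readings are all illustrative.

```python
KMH_TO_MPH = 0.621371


def make_traffic_dissemination(read_sensor):
    """Return a dissemination function bound to a live data source."""

    def disseminate(user_profile):
        speed_kmh = read_sensor()  # evaluated at request time, not at deposit
        if user_profile.get("units") == "mph":
            value, units = speed_kmh * KMH_TO_MPH, "mph"
        else:
            value, units = speed_kmh, "km/h"
        place = user_profile.get("location", "unknown location")
        return f"Traffic near {place}: {value:.0f} {units}"

    return disseminate


readings = iter([100.0, 80.0])  # stand-in for a real-time sensor
disseminate = make_traffic_dissemination(lambda: next(readings))
print(disseminate({"location": "Ithaca", "units": "mph"}))
print(disseminate({"location": "Ithaca"}))
```

Two requests for the same object can return different content, because both the sensor reading and the user profile feed into the result.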

10
Metadata
  • Metadata can be linked to a digital object
      • external catalog or index
      • embedded in the digital object
      • generated at run time
  • Granularity of metadata
      • collection of digital objects
      • digital object
      • element of digital object
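The three levels of granularity can be sketched with Python dataclasses. The class and field names are hypothetical. The rule that finer-grained metadata overrides coarser metadata is an assumption made for the example, not something the slide states.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    name: str
    metadata: dict = field(default_factory=dict)


@dataclass
class DigitalObject:
    identifier: str
    metadata: dict = field(default_factory=dict)
    elements: list = field(default_factory=list)


@dataclass
class Collection:
    title: str
    metadata: dict = field(default_factory=dict)
    objects: list = field(default_factory=list)


def effective_metadata(collection, obj, element):
    """Combine metadata from all three levels (assumed override order)."""
    merged = dict(collection.metadata)   # coarsest: collection
    merged.update(obj.metadata)          # then: digital object
    merged.update(element.metadata)      # finest: element of the object
    merged["element_count"] = len(obj.elements)  # generated at run time
    return merged


page = Element("page-1", {"language": "fr"})
book = DigitalObject("book-1", {"creator": "A. Author"}, [page])
shelf = Collection("Course readings", {"language": "en", "rights": "open"}, [book])
print(effective_metadata(shelf, book, page))
```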

11
Requirements 3
  • Open protocols and formats
      • Clients use well-defined protocols, data types, and formats.
      • Architecture must allow incremental changes of protocols.
  • Access management
      • Allow a broad set of policies
      • All levels of granularity
      • Prepared for future developments.
  • Reliability and performance
      • Very large volumes of data
      • Absolutely reliable in retention of data
      • Good performance

12
Repository systems
Core Repository
13
Repository systems
Core Repository
Load Services
14
Repository systems
Core Repository
Presentation Services
Load Services
15
Common repository systems
  • Web server
      • File-based object model plus hyperlinks
      • Good tools for access
      • Weak on long-term preservation
  • Relational database
      • Table-based object model -- schema and data dictionary
      • Good tools for data management
      • Used for long-term preservation in data processing

16
Dumb and smart objects
  • Smart repositories
      • behaviors provided by the repository
      • e.g., relational database
  • Smart clients
      • behaviors provided by the client
      • e.g., web server
  • Smart objects
      • repository is very simple
      • digital objects provide their own behaviors
      • compare with object-oriented programming (data + code)
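The smart-object case can be sketched as follows, assuming hypothetical `SimpleRepository` and `SmartObject` classes. The repository only stores and returns objects, and each object carries the code for its own behaviors.

```python
class SimpleRepository:
    """A very simple repository: it only stores and returns objects."""

    def __init__(self):
        self._objects = {}

    def put(self, key, obj):
        self._objects[key] = obj

    def get(self, key):
        return self._objects[key]


class SmartObject:
    """Data plus code: the object provides its own behaviors."""

    def __init__(self, text):
        self._text = text

    def behaviors(self):
        return ["summary", "word_count"]

    def invoke(self, behavior):
        if behavior == "summary":
            return self._text.split(".")[0] + "."
        if behavior == "word_count":
            return len(self._text.split())
        raise ValueError("unknown behavior: " + behavior)


repo = SimpleRepository()
repo.put("defs", SmartObject(
    "A repository stores digital material. An archive emphasizes preservation."))
obj = repo.get("defs")
print(obj.behaviors())
print(obj.invoke("summary"))
print(obj.invoke("word_count"))
```

In the smart-repository and smart-client cases, the body of `invoke` would instead live in the repository or in the client.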

17
Example: CNRI repository
  • Dumb repository for access to digital objects
  • All information stored as typed data in digital objects.
  • A single digital object has both data and metadata.
  • Identification of digital objects is by location-independent,
    persistent URNs.
  • Access controls built into methods for accessing digital object.

18
Repository Access Protocol (RAP)
  • RAP is a simple protocol with two main groups of commands
      • Deposit digital object
      • Verify digital object
      • Delete digital object
      • Edit digital object
      • Access digital object
      • Access metadata
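The six commands listed above can be sketched as methods on an in-memory class. This is not the real RAP wire protocol or its actual signatures: the class name, the method names, the example URN, and the reading of "verify" as a checksum comparison are all assumptions made for illustration.

```python
import hashlib


class InMemoryRepository:
    """Toy repository exposing one method per command on the slide."""

    def __init__(self):
        self._records = {}

    def deposit(self, urn, data, metadata):
        if urn in self._records:
            raise KeyError("already deposited: " + urn)
        self._records[urn] = {
            "data": bytes(data),
            "metadata": dict(metadata),
            "digest": hashlib.sha256(data).hexdigest(),
        }

    def verify(self, urn):
        # Assumed meaning: the object exists and matches its stored checksum.
        record = self._records.get(urn)
        if record is None:
            return False
        return hashlib.sha256(record["data"]).hexdigest() == record["digest"]

    def delete(self, urn):
        del self._records[urn]

    def edit(self, urn, data):
        record = self._records[urn]
        record["data"] = bytes(data)
        record["digest"] = hashlib.sha256(data).hexdigest()

    def access(self, urn):
        return self._records[urn]["data"]

    def access_metadata(self, urn):
        return dict(self._records[urn]["metadata"])


repo = InMemoryRepository()
urn = "urn:example:cs502-lecture22"  # hypothetical identifier
repo.deposit(urn, b"Repositories", {"title": "Lecture 22"})
print(repo.verify(urn), repo.access(urn), repo.access_metadata(urn))
```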

19
Repository layers and interfaces
Persistent Store
Store API
Object Management Layer
Shell API
RAP Interface
RAP Interface
RAP Command
20
Client and repository architectures
Store
End Client
Digital Object Processing
Object Persistence
Object Management
Object Management
Client
Repository
RAP Interface
RAP Interface
RAP Requests
RAP Replies
ORB
21
Components
  • Hardware
      • Repository: Sun SPARC with Solaris or IBM RS/6000 with AIX.
  • Software
      • Communications: CORBA/IIOP distributed object system.
      • Repository shell and object management layer: CORBA and Python.
      • Persistent store: Unix file system, Oracle, Shore.
      • Client: CGI scripts, Java applets.