Digital Preservation Case Study: National Archives and Records Administration

1
Digital Preservation Case Study: National
Archives and Records Administration
  • Paul Watry
  • University of Liverpool

2
NARA prototype
  • World's largest digital preservation initiative
  • Runs until 2012 with a 350m spend.
  • Deals with all aspects of information life-cycle
    management
  • Looks at any data, not just HTML or web-based
    data.

3
Major issues
  • Preservation of records is a two-fold problem:
  • First, preserving the original record
    and making it available over time.
  • Second, creating a digital representation of the
    original record in a software-independent fashion
    and using this proxy for the original record.
  • The technologies required to solve these twin
    challenges come from the data grid, persistent
    archive, and digital library communities.

4
Combination of technologies
  • Data Grids provide the means to manage
    information (the actual bits that comprise
    documents).
  • We are talking about petabytes of data.
  • Digital Library technologies provide ways of
    accessing data (finding the documents you
    want).
  • Persistent Archives technologies provide ways of
    presenting (or viewing) and manipulating
    documents.
  • The scope is NOT simply the web: it also covers
    email, scientific simulation programs, etc.

5
Joint Work Liverpool and SDSC
  • SDSC (San Diego Supercomputer Center) is providing
    the data grid technologies through the SRB
    (Storage Resource Broker, pronounced "serb").
  • Liverpool is providing the digital library and
    persistent archive technologies (Cheshire and
    Multivalent systems).
  • San Diego provides the capability of storing and
    keeping track of the data (including replication)
    over distributed supercomputers
  • Liverpool provides the capability of finding,
    presenting, and reusing the data held in these
    large repositories.
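
As a rough illustration of the storage side described above, the sketch below models a catalog that maps one logical record name to several physical replicas and checks each replica against a checksum recorded at registration. It is a toy in-memory model and not the SRB API: `ReplicaCatalog`, `register`, `verify`, and the site names are all hypothetical.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class ReplicaCatalog:
    """Toy in-memory stand-in for a data grid catalog (hypothetical, not the SRB API)."""
    # logical name -> {site name -> bytes stored at that site}
    replicas: dict = field(default_factory=dict)
    # logical name -> SHA-256 hex digest recorded at registration
    checksums: dict = field(default_factory=dict)

    def register(self, logical_name: str, data: bytes, sites: list) -> None:
        """Store one copy of `data` per site and record its checksum."""
        self.checksums[logical_name] = hashlib.sha256(data).hexdigest()
        self.replicas[logical_name] = {site: data for site in sites}

    def verify(self, logical_name: str) -> dict:
        """Return {site: bool} saying whether each replica still matches."""
        expected = self.checksums[logical_name]
        return {
            site: hashlib.sha256(blob).hexdigest() == expected
            for site, blob in self.replicas[logical_name].items()
        }


catalog = ReplicaCatalog()
catalog.register("records/memo-001", b"original bitstream", ["site-a", "site-b"])
# Simulate silent corruption of one replica.
catalog.replicas["records/memo-001"]["site-b"] = b"original bitstreaN"
status = catalog.verify("records/memo-001")
```

The point of the logical name is that users and higher-level tools never address a particular machine; the catalog decides which replica to serve and can repair a damaged copy from a good one.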

6
Establishing Provenance or Chain of Custody
  • Automating archival processes: ingest into the
    data grid. Who created the data? In what format?
    Who has access to view or edit it?
  • Developing collection management systems: how do
    you keep track of the data in distributed data
    grids?
  • Developing digital ontologies: once you have
    stored the data, how do you retrieve it? How do
    you view it where there is no software? How do
    you reuse it?
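
The ingest questions above (who created the data, what format, who may view or edit it) can be captured in a small provenance record. The sketch below is one possible shape for such a record, assuming a SHA-256 checksum as the fixity value; `IngestRecord`, `ingest`, the field names, and the sample values (including the fixed date) are hypothetical and are not taken from the NARA prototype.

```python
import hashlib
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass(frozen=True)
class IngestRecord:
    """Hypothetical provenance record captured when a file enters the archive."""
    logical_name: str
    creator: str       # who created the data
    media_type: str    # what format it is in
    readers: tuple     # who may view it
    editors: tuple     # who may edit it
    sha256: str        # fixity value for later integrity checks
    ingested_at: str   # ISO 8601 timestamp (UTC)


def ingest(logical_name, data, creator, media_type, readers, editors, now=None):
    """Build the provenance record for one bitstream."""
    now = now or datetime.now(timezone.utc)
    return IngestRecord(
        logical_name=logical_name,
        creator=creator,
        media_type=media_type,
        readers=tuple(readers),
        editors=tuple(editors),
        sha256=hashlib.sha256(data).hexdigest(),
        ingested_at=now.isoformat(),
    )


record = ingest(
    "records/memo-001",
    b"original bitstream",
    creator="example-agency",
    media_type="application/pdf",
    readers=["public"],
    editors=["archivist"],
    now=datetime(2008, 1, 1, tzinfo=timezone.utc),  # arbitrary fixed date for the example
)
# Serialise the record so it can be stored alongside the data.
record_json = json.dumps(asdict(record), indent=2, sort_keys=True)
```

Making the record immutable (`frozen=True`) reflects the chain-of-custody idea: later events would be appended as new records rather than by editing the original.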

7
Obsolescence of digital document data formats
  • Big problem: how do you view documents for which
    there is no software?
  • Strategies in the past: migration, emulation,
    universal format (PDF).
  • Problems:
  • Migration: degrades data over time (20 years).
  • Emulation: you have to keep a room full of
    functioning machines.
  • Universal format: get a life.

8
Multivalent preservation architecture
  • A browser which will read documents from the
    original bitstream through the use of media
    adapters.
  • It does this by transforming the document into
    the browser's internal structure.
  • We can then present and manipulate the document
    independently of the software or infrastructure
    that created it.
  • The manipulation capability means that we can
    reuse legacy documents in any number of ways.
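
The media adapter idea above can be sketched as a registry of small parsers that each turn one format's original bytes into a shared document tree. This is only an illustration of the pattern under simplifying assumptions, not Multivalent's actual code or API: `Node`, `plain_text_adapter`, `csv_adapter`, `ADAPTERS`, `open_document`, and `extract_text` are hypothetical names, and the two formats were chosen because they are easy to parse.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One node of a format-neutral document tree (hypothetical internal structure)."""
    kind: str
    text: str = ""
    children: list = field(default_factory=list)


def plain_text_adapter(raw: bytes) -> Node:
    """Media adapter for plain text: one paragraph node per non-empty line."""
    doc = Node("document")
    for line in raw.decode("utf-8").splitlines():
        if line.strip():
            doc.children.append(Node("paragraph", line.strip()))
    return doc


def csv_adapter(raw: bytes) -> Node:
    """Media adapter for simple comma-separated data: one row node per line."""
    doc = Node("document")
    for line in raw.decode("utf-8").splitlines():
        cells = [Node("cell", c.strip()) for c in line.split(",")]
        doc.children.append(Node("row", children=cells))
    return doc


# Registry: format label -> adapter. Supporting a new format means adding
# an adapter, not converting the stored bitstream.
ADAPTERS = {"text/plain": plain_text_adapter, "text/csv": csv_adapter}


def open_document(raw: bytes, media_type: str) -> Node:
    """Parse the original bitstream into the shared tree via the matching adapter."""
    return ADAPTERS[media_type](raw)


def extract_text(node: Node) -> list:
    """Format-independent manipulation: collect all text in the tree."""
    found = [node.text] if node.text else []
    for child in node.children:
        found.extend(extract_text(child))
    return found


memo = open_document(b"Dear archive,\n\nPlease keep this.\n", "text/plain")
table = open_document(b"id, title\n1, memo\n", "text/csv")
```

Because `extract_text` works on the tree rather than on any one file format, the same reuse operation applies to every format that has an adapter, while the stored bytes are never modified.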