Risk Assessment and Cost Appraisal Delos - November 15-16 Academia Nazionale Dei Lincei - Rome

1
Risk Assessment and Cost Appraisal
Delos - November 15-16, Academia Nazionale Dei Lincei -
Rome
  • Rory McLeod Digital Preservation Manager
  • The British Library, London

2
Overview
  • Risk assessment
  • Background
  • Components
  • Key findings and activities 2007/08
  • Cost appraisal
  • Lifecycle management
  • LIFE1
  • LIFE2

3
Content analysis from 2006 - DPC Mind the Gap
  • Over 200 terabytes of data growing by over 50
    terabytes a year
  • Majority is from the sound archive, however
  • Manuscripts 1.5 terabytes
  • Asia, Pacific and Africa 9.5 terabytes
  • European and American 0.5 terabytes
  • Commercial services 5 terabytes
  • Voluntary deposit of electronic publications 1.5
    terabytes
  • Newspapers potentially 70 terabytes at project
    conclusion
  • Microsoft digitisation 30 terabytes
  • Sound recordings 12 terabytes
  • A wide variety of formats is represented in the
    data: the most common formats are found, but
    there are also smaller amounts of rare and
    proprietary data
  • Update: based upon the risk assessment, we
    estimate the total size to be closer to 300 TB.

4
Background - Risk assessment 2007
  • The objective of the Digital Preservation Team is
    to address the risk of deterioration of digital
    material, from short-term access through to
    long-term preservation.
  • Building on a 2003 study
  • When we examined the 2003 risk assessment we
    concluded:
  • Having the object isn't enough (CDR)
  • Knowing the format of the object isn't enough
    (EPS)
  • You need software to use it, and a computer to
    run it on (Postscript Parser)
  • The functionality and access of the object can
    intimately depend on the details of the
    environment, most of which we don't have
    (operating system, hardware requirements)
  • Taking those basic concepts a little further...

5
Background - Risk assessment 2007
  • Physical media deterioration
  • The lifetimes of physical media can be measured
    in years (or even months, e.g. recordable CD/DVD)
  • Unlike books, which can be kept for centuries in
    the right conditions
  • Technical obsolescence
  • 3.5-inch floppy disk drives used to be
    ubiquitous; now only a few computers have them
  • File format obsolescence
  • Keeping the bitstream isn't enough: we need to
    understand it
  • Many file formats are undocumented: we can't
    understand the files, and need the software
  • Software gets abandoned (who uses WordStar any
    more?) and new versions can be incompatible with
    old files
  • Environmental obsolescence
  • Keeping the software isn't enough: we need to
    run it
  • But old hardware doesn't work, or isn't available

6
Background - Risk assessment 2007
  • So our starting point
  • Identify the digital assets currently held
  • Identify the environmental requirements of those
    assets
  • Assess the risks jeopardizing the use of those
    assets
  • React to those risks (save those assets to
    which access no longer exists or will soon be
    lost)
  • Proactively respond to those risks (prevent
    assets from becoming inaccessible in the future)

7
Background - Risk assessment 2007
  • This produced an accurate and detailed digital
    holdings list
  • Be as near to exhaustive as possible
  • Be detailed
  • Physical formats
  • Disk/file-system formats
  • File formats
  • Operating system requirements
  • Application requirements

Not covered in 2003
8
Background - Risk assessment
  • From this holdings list, we have performed a risk
    assessment to
  • Enumerate the risks faced
  • Evaluate the likelihood and impact of each risk
  • Rank holdings according to risk
  • Perform risk-based triage on holdings
  • And longer term, from the assessment and ranking
    we will
  • Prioritise ingest into DOM
  • Write preservation plans and take preservation
    actions to target the highest-risk material
  • Possibly migrate it to less risky file formats
  • Preserve software environments (emulators etc.)
  • Guide future ingest
  • Determine a set of preferred long-term
    preservation formats
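
The enumerate, evaluate and rank steps above can be sketched in a few lines of Python. This is only an illustration: the 1-5 scales, the likelihood x impact score, the "worst risk wins" ranking rule and the example holdings are all assumptions made for the sketch, not the scales or figures used in the BL assessment.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    name: str
    likelihood: int  # hypothetical scale: 1 (rare) to 5 (almost certain)
    impact: int      # hypothetical scale: 1 (none) to 5 (cataclysmic)

    @property
    def score(self) -> int:
        # Common qualitative convention: score = likelihood x impact.
        return self.likelihood * self.impact


def rank_holdings(holdings: dict[str, list[Risk]]) -> list[tuple[str, int]]:
    """Rank holdings by their single worst risk score, highest first."""
    worst = {
        name: max((risk.score for risk in risks), default=0)
        for name, risks in holdings.items()
    }
    return sorted(worst.items(), key=lambda item: item[1], reverse=True)


# Illustrative holdings only; names and numbers are invented.
holdings = {
    "Hard-disk store": [Risk("File format obsolescence", 2, 3)],
    "CD-R collection": [
        Risk("Media degradation", 5, 4),
        Risk("Software obsolescence", 2, 3),
    ],
}

ranking = rank_holdings(holdings)
for name, score in ranking:
    print(f"{name}: {score}")
```

Taking the top of such a ranking first is one simple way to perform the risk-based triage mentioned above.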

9
Components - Risk assessment 2007
  • AS/NZS 4360:2004 risk standard
  • Follows on from 2003 study
  • Methodology is split into context, identify,
    analyse, evaluate and treat
  • A new value scale was used for BL holdings, plus
    DRAMBORA, an established risk toolkit, for impact
    (cataclysmic to none)
  • Representative content was analysed from all
    areas of the collections, the first time this has
    been done.
  • 23 different risks were identified; these were
    gathered into 6 direct and 2 indirect risks
  • Direct: Media degradation, Media obsolescence,
    File format obsolescence, Hardware obsolescence,
    Operating system file system obsolescence,
    Software obsolescence
  • Indirect: Poor policy (Cataloguing, Metadata),
    Poor policy other (Handling, Training)

10
Components - Risk assessment 2007
  • The AS/NZS 4360:2004 Risk Management standard
    defines a seven-step approach to risk management:
  • Communicate and consult
  • Communicate and consult with internal and
    external stakeholders as appropriate at each
    stage of the risk management process and
    concerning the process as a whole.
  • Establish the context
  • Stakeholders are identified, and the objectives
    of the stakeholders and the organization as a
    whole are established.
  • Identify the risks.
  • In this stage, the risks (that is, what can go
    wrong) are enumerated and described. We used a
    combination of industry analysis and real-life
    scenarios.

11
Components - Risk assessment 2007
  • Analyse the risks
  • This step covers the evaluation of the impact of
    the risks, and the likelihood of those risks. The
    evaluation may be qualitative or quantitative
  • Evaluate the analysis
  • At this stage, negligible risks might be
    discarded (to simplify analysis), and evaluations
    (especially qualitative evaluations) adjusted.
  • Treat the risks
  • The options to address the risks are identified,
    and the best option is chosen and implemented.
    This may include taking no action if no risk is
    sufficiently serious.
  • This step was felt to be beyond the remit of
    this assessment project.

12
Components - Risk assessment 2007
  • Monitor and review
  • It is necessary to monitor the effectiveness of
    all steps of the risk management process. This is
    important for continuous improvement. Risks and
    the effectiveness of treatment measures need to
    be monitored to ensure changing circumstances do
    not alter priorities.
  • The assessment also uses the impact scale
    devised in the DRAMBORA methodology (Digital
    Repository Audit Method Based on Risk
    Assessment, http://www.repositoryaudit.eu/)
  • Summary
  • The first part of the analysis was to create an
    inventory of the digital assets. Each collection
    area was visited and interviewed, and a partial
    audit of their digital material conducted. This
    provided an indicative sample of the current
    state of play within the Library. It is likely
    that continued annual updating of this list will
    form part of the long-term maintenance of the
    analysis.

13
Components - Risk assessment 2007
  • The broad results can be split into technical
    and policy findings. The headlines are:
  • 12 of the 13 case studies returned results
    consistent with the highest category of risk
    identified (Media Degradation)
  • Secondary risks associated with software and
    hardware are less severe, but without addressing
    Media Degradation all data is at the same high
    level of risk.
  • Failure rates for disks within the BL collections
    have reached a high level (up to 3%)
  • No central store or service for this digital
    content
  • The proposed timeframes for ingest of this
    material mean that an interim solution must now
    be considered to safeguard this material (and
    prepare it for ingest)
  • There is a lack of awareness of the fragility of
    these collection items across the BL
  • There is a need for training in both handling and
    data stewardship skills across the collection
    areas

14
Components - Risk assessment 2007
Risk ranking  Risk                                          Access type jeopardized
8             Media degradation                             Bit-stream
7             Media obsolescence                            Bit-stream
6             File format obsolescence                      File/Semantic
5             Hardware obsolescence                         File/Semantic
4             Operating system file system obsolescence     File/Semantic
3             Software obsolescence                         File/Semantic
2             Poor policy (improper cataloguing, metadata)  Semantic
1             Poor policy (other)                           Semantic/File/Bit-stream
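
As a rough illustration of how the ranking table might be used in practice, the sketch below encodes it as a lookup and picks the highest-ranked risk observed for a holding. The table values come from the slide; the function name `dominant_risk` and the idea of selecting a single dominant risk are assumptions made for the example.

```python
# Ranking (8 = highest) and access type jeopardized, copied from the table.
RISK_RANKING = {
    "Media degradation": (8, "Bit-stream"),
    "Media obsolescence": (7, "Bit-stream"),
    "File format obsolescence": (6, "File/Semantic"),
    "Hardware obsolescence": (5, "File/Semantic"),
    "Operating system file system obsolescence": (4, "File/Semantic"),
    "Software obsolescence": (3, "File/Semantic"),
    "Poor policy (improper cataloguing, metadata)": (2, "Semantic"),
    "Poor policy (other)": (1, "Semantic/File/Bit-stream"),
}


def dominant_risk(observed: list[str]) -> tuple[str, int, str]:
    """Return (risk, ranking, access type) for the highest-ranked observed risk.

    Hypothetical helper: raises KeyError for a risk not in the table and
    ValueError if `observed` is empty.
    """
    name = max(observed, key=lambda risk: RISK_RANKING[risk][0])
    ranking, access_type = RISK_RANKING[name]
    return name, ranking, access_type


example = dominant_risk(["Software obsolescence", "Media degradation"])
print(example)
```

This mirrors the headline finding that media degradation dominates: a holding exposed to it ranks at the top regardless of its secondary risks.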
15
Risk assessment - Final Prioritisation
16
Components - Risk assessment 2007
  • BL preservation value system
  • Our obligation to preserve the material
  • Estimates of the cost/effort to mitigate the
    risks
  • Estimates of the resource available to the
    Digital Preservation Team
  • Estimates of the cultural significance and value
    of the collection
  • The commercial significance and value of the
    collection
  • The need for further analysis of the collection
    to inform future preservation activities
  • Reader and researcher needs
  • Interest and demand

18
Key findings - Risk Assessment 2007
  • DPT needs to create and implement a policy that
    deals with all digital content consistently
  • This reduces the variations seen in how digital
    material is cared for
  • BL needs to move from at-risk physical media to
    online hard disk-based managed storage.
  • This addresses media deterioration, physical
    damage, environmental damage, and media
    obsolescence, and is believed to be the best
    long-term storage mechanism option available
  • This also enhances manageability of the digital
    collection
  • Where migration to hard disk is not immediately
    possible, move to climate-controlled (etc.)
    storage to ensure that the physical media last as
    long as possible (and back up)
  • This reduces the problems due to media
    deterioration, physical damage, and environmental
    damage
  • Failure rates for disks within the BL collections
    have reached unacceptably high levels (up to 3%)
    for hand-held media.

19
Activities to mitigate 07/08 - Risk Assessment 2007
  • Monitor and review
  • DPT will use a continuous improvement approach,
    constantly reducing the level of risk
  • Annual update to the risk assessment to
    continuously improve the condition of the
    collection-based digital objects
  • Annual identification of resulting actions to
    mitigate risks
  • Management of the digital preservation
    prioritisation table
  • Key performance indicators to be drawn from the
    risk factors within the prioritisation table, to
    be monitored by the digital preservation steering
    group. (Ideally all risk factors should be in a
    continuous process of reduction)
  • Resource Plan
  • DPT will take responsibility for this effort by
    writing a resource plan to establish next stage
    activity. This will involve people, equipment,
    storage and policy issues.
  • Establishment of the British Library centre for
    digital preservation based upon this risk work.
  • This work is already underway.

20
The cost of digitisation and preservation: The
LIFE Project
21
Cost appraisal overview
  • What is the LIFE Project?
  • LIFE1 and LIFE2
  • LIFE Models
  • Burney Case Study
  • Benefits
  • Further Information

22
Lifecycle Information for E-literature
  • Project phases
  • LIFE1 (12 months)
  • LIFE2 (18 months)

23
LIFE starts to answer the question
  • What is the long-term cost of preserving digital
    material?

24
Why use lifecycle costing?
  • Enables evaluation of all the financial
    commitments for an item in a collection
  • Important for digital collections, where many
    costs are largely unknown

25
Lifecycle management - aims
  • Better understanding of the digital lifecycle
  • Plan and prepare for digital preservation
    activities
  • Evaluate and improve efforts
  • Compare analogue and digital

26
LIFE1 project
  1. Literature Review
  2. Economic Lifecycle Model
  3. Generic Preservation Model
  4. Case Studies
  5. International Conference

27
LIFE1 Case Studies
  • e-Journals
  • Web Archiving
  • Voluntary Deposit
28
LIFE2
  • LIFE1

29
Aim of LIFE2
  • To evaluate, refine and further develop the
    techniques developed in phase one of LIFE

30
LIFE2 deliverables
  • Economic Evaluation of LIFE1
  • Revision of the LIFE Model
  • Version 1.1 (October 2007)
  • Version 2 (Summer 2008)
  • Updated Preservation Model (Summer 2008)
  • Final report
  • End of project conference

31
The LIFE Model v1.1
Lifecycle Stage          Lifecycle Elements
Ingest                   Quality Assurance, Deposit, Holdings Update, Reference Linking
Metadata Creation        Re-use Existing Metadata, Metadata Creation, Metadata Extraction
Bit-stream Preservation  Repository Admin, Storage Provision, Refreshment, Backup, Inspection
Content Preservation     Preservation Watch, Preservation Planning, Preservation Action, Re-ingest
Access                   Access Provision, Access Control, User Support
32
LIFE Model v1.1 Non-lifecycle Elements
Non-Lifecycle Stage            Non-Lifecycle Elements
Management and Administration  Management, Administration
Systems / Infrastructure       Repository Software
Economic Adjustments           Inflation, Discounting
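
The two economic adjustments named above, inflation and discounting, can be illustrated with the standard present-value calculation. This is a generic sketch, not the LIFE project's own spreadsheet: the function name and all the rates and costs below are hypothetical.

```python
def present_value(annual_cost: float, years: int,
                  inflation: float, discount_rate: float) -> float:
    """Sum a recurring cost over `years`, inflated then discounted to year 0.

    Year 0 is counted at face value. Both rates are fractions per year
    (e.g. 0.03 for 3%) and are illustrative assumptions.
    """
    total = 0.0
    for year in range(years):
        nominal = annual_cost * (1 + inflation) ** year  # cost in that year's prices
        total += nominal / (1 + discount_rate) ** year   # value in today's terms
    return total


# Hypothetical example: a storage cost of 1,000 per year over 10 years.
flat = present_value(1000.0, 10, inflation=0.0, discount_rate=0.0)
adjusted = present_value(1000.0, 10, inflation=0.02, discount_rate=0.035)
print(round(flat, 2), round(adjusted, 2))
```

When the discount rate exceeds inflation, as in the second call, the adjusted lifecycle cost comes out lower than the simple sum of annual costs.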
33
Generic LIFE Preservation Model
  • The GPM predicted large cost and much activity -
    the challenge is reducing both.

34
Generic LIFE Preservation Model
Preservation cost of n objects of a particular
format for the period 0 to t,
e.g. 200,000 objects of the GIF format for a
period of 10 years.
Components of the Preservation cost shown in the diagram:
  • Tech Watch: monitoring formats and software for
    obsolescence, preservation planning, updating
    metadata
  • Frequency of action: the number of preservation
    actions within the time period calculated
  • Preservation action: Q/A, update object and
    event metadata, perform preservation action,
    cost of preservation tool
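
One plausible reading of how these components combine is sketched below. The slide transcript names the components but does not preserve the formula, so the combination used here (Tech Watch plus frequency of action times the cost of one preservation action) is an assumption, as are the class and function names and every figure in the example.

```python
from dataclasses import dataclass


@dataclass
class ActionCost:
    """Cost of one preservation action, split as on the slide."""
    quality_assurance: float
    metadata_update: float
    perform_action: float
    preservation_tool: float

    def total(self) -> float:
        return (self.quality_assurance + self.metadata_update
                + self.perform_action + self.preservation_tool)


def preservation_cost(tech_watch: float, actions_in_period: int,
                      action: ActionCost) -> float:
    """Assumed combination: tech watch + frequency x cost per action."""
    return tech_watch + actions_in_period * action.total()


# Invented figures for a collection of one format over a 10-year period.
gif_action = ActionCost(quality_assurance=100.0, metadata_update=50.0,
                        perform_action=300.0, preservation_tool=50.0)
cost = preservation_cost(tech_watch=1000.0, actions_in_period=2,
                         action=gif_action)
print(cost)
```

Under this reading, the two levers for reducing cost are the number of preservation actions in the period and the cost of each action.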

35
Complexity of file formats
Model components shown: Tech Watch; Frequency of action;
Preservation action (Q/A, update metadata, perform
preservation action, cost of preservation tool)

Format Complexity
Category    Complexity  Examples
Simple      0.1         ASCII, Unicode
Bitmap      0.2         JPEG, GIF
Mark-up     0.3         XML, HTML
Vector      0.4         EMF, Draw
Multimedia  0.6         MPEG3, WAV
Document    0.8         Word, PDF
Complex     1           Oracle database dump
  • Size
  • Complexity
  • Proprietary
  • Open
  • Standardised
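
The complexity weights in the table lend themselves to a simple lookup. The sketch below assumes, purely for illustration, that the weight scales the cost of a preservation action linearly; the transcript does not say how the model actually applies it, and the function name and base cost are hypothetical.

```python
# Complexity weights per format category, copied from the table above.
FORMAT_COMPLEXITY = {
    "Simple": 0.1,      # ASCII, Unicode
    "Bitmap": 0.2,      # JPEG, GIF
    "Mark-up": 0.3,     # XML, HTML
    "Vector": 0.4,      # EMF, Draw
    "Multimedia": 0.6,  # MPEG3, WAV
    "Document": 0.8,    # Word, PDF
    "Complex": 1.0,     # Oracle database dump
}


def scaled_action_cost(base_cost: float, category: str) -> float:
    """Scale a base action cost by format complexity (assumed linear)."""
    return base_cost * FORMAT_COMPLEXITY[category]


# Hypothetical base cost of 1,000 for an action on the most complex category.
bitmap_cost = scaled_action_cost(1000.0, "Bitmap")
document_cost = scaled_action_cost(1000.0, "Document")
print(bitmap_cost, document_cost)
```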

36
LIFE2 Case Studies
  • Institutional Repositories
  • Primary Data
  • Digitised Newspapers
37
The Burney Collection
  • Purchased by the British Library in 1818 for
    £13,500
  • 1,100 volumes of the earliest known newspapers
  • 1,000,000 pages from the 17th, 18th and 19th
    centuries.
  • Re-scanning or re-microfilming is not possible.
  • Microfilmed in the 1970s
  • Digitisation started in 1995-96 and ran until
    2004.

38
Questions that arise from Burney
  • Comparing digital and analogue lifecycles
  • What is the lifecycle cost to an institution of
    producing digitised surrogates?
  • What are the key preservation issues common
    across digitisation projects of differing scales?

39
Benefits of LIFE
  • Assess the financial commitment for acquiring or
    creating new digital materials
  • More effective planning for preservation
    activities
  • Comparison of digital lifecycles across
    collections
  • Evaluation and optimisation of existing digital
    lifecycles
  • Predictive future cost of digital preservation

40
LIFE Website and Blog
  • Website: www.life.ac.uk
  • LIFE Blog: www.life.ac.uk/blog

41
Thanks and Acknowledgements
  • Thanks for your attention.
  • Risk Assessment 2007 (Peter Bright and Paul
    Wheatley)
  • LIFE Team (Paul Ayris, Helen Shenton, Paul
    Wheatley and Richard Davies)