Title: A LOOMING CRISIS: MAINTAINING ACCESS TO ELECTRONIC RESEARCH PRODUCTS
1A LOOMING CRISIS: MAINTAINING ACCESS TO
ELECTRONIC RESEARCH PRODUCTS
- Daphne Fautin
- University of Kansas
- Gail Kampmeier
- Illinois Natural History Survey
2Electronic PEET Products
- Project web pages
- Images
- Literature - publications, reports, field journals
- Gene sequences and other molecular data
- Character matrices and keys
- Databases - data structure
3What Happens
- When project funding ceases
- When project members disperse
- When PIs retire, change research topics, move, or ...
Who will champion access to the electronic
resources produced by PEETs, AToLs, BSIs, PBIs, ...?
4Fate of Our Electronic Resources
- Who should be responsible?
- Institutions originally receiving project funding?
- Funding agencies?
- Those creating the resources?
- Professional societies?
5Issues
- Who owns the products? (not an issue only for electronic media)
- How can the products continue to be served?
- How should the products best be preserved?
6This is a global issue
- Among efforts to grapple with it is the 2005
National Science Board Report 05-40 -
www.nsf.gov/pubs/2005/nsb0540
(NPR this morning on electronic art and art
museums)
7Issues
- Who owns the products? (not an issue only for electronic media)
- How can the products continue to be served?
- How should the products best be preserved?
8Archiving
- LIBRARIES have historically been the repository of scholarly output (publications)
- MUSEUMS have been custodians of specimens
- Some other physical objects end up in TRADITIONAL ARCHIVES
9Archiving
- WHICH products should be preserved
- HOW should they be preserved
- WHERE should they be preserved
- Locally, supercomputers, electronic archives,
etc.
Metadata: retrieval requires excellent documentation
Software versions: a practical challenge, not a technical one (remember Gene Stoermer!)
10Electronic PEET Products
- Project web pages
- Images
- Literature - publications, reports, field journals
- Gene sequences and other molecular data
- Character matrices and keys
- Databases - data structure
11Internet Archive
12Mr. Peabody's WayBack Machine
13Caveats: Pages Not Archived
- Anything requiring interaction with the server
- Forms, database-generated content
- JavaScript links that do not resolve to true URLs
- Server-side image maps
- Pages with robot exclusion headers (robots.txt)
- Orphan pages (no links into them)
- Unknown sites
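The robots.txt caveat can be checked before relying on a crawler to capture a project site. The following is a minimal sketch using Python's standard urllib.robotparser; the robots.txt contents, the example.org URLs, and the crawler name "example-archiver" are made up for illustration and are not taken from any real project or archive.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a project site (illustrative paths only).
ROBOTS_TXT = """\
User-agent: *
Disallow: /database/
Disallow: /cgi-bin/
"""


def crawlable(robots_txt: str, url: str, agent: str = "example-archiver") -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    pages = (
        "http://example.org/index.html",
        "http://example.org/database/query?taxon=Actiniaria",
    )
    for page in pages:
        status = "crawlable" if crawlable(ROBOTS_TXT, page) else "excluded"
        print(page, "->", status)
```

A page reported as excluded here would likely be skipped by a crawler that honors robots.txt, so it would need to be preserved by some other route.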
14Electronic PEET Products
- Project web pages
- Images
- Literature - publications, reports, field journals
- Gene sequences and other molecular data
- Character matrices and keys
- Databases - data structure
15Images
- Scanned
- Resolution
- Format: standard TIF?
- Produced digitally
- Format: evolution of production software if not saved as flat TIF
16Electronic PEET Products
- Project web pages
- Images
- Literature - publications, reports, field journals
- Gene sequences and other molecular data
- Character matrices and keys
- Databases - data structure
17Literature, Reports, Field Journals...
- Issues similar to images
- Format evolution
- Media migration
- Metadata for retrieval
- OCR for finding individual items
- Solutions are library-like, requiring recurring infusions of:
- Personnel
- Migrate as formats evolve, versions change
- Time
- Digital lifetime determination
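One way to picture "metadata for retrieval" together with "media migration" is a small descriptive record stored beside each file, which can also confirm that a migrated copy is byte-identical to the original. This is only a sketch under stated assumptions: the field names, the ".meta.json" suffix, and the use of a SHA-256 checksum are illustrative choices, not a standard proposed in the talk.

```python
import hashlib
import json
from pathlib import Path


def write_sidecar(path: Path, title: str, creator: str, file_format: str) -> Path:
    """Store descriptive metadata and a SHA-256 checksum next to `path`."""
    record = {
        "file": path.name,
        "title": title,            # hypothetical field names, chosen for illustration
        "creator": creator,
        "format": file_format,
        "sha256": hashlib.sha256(path.read_bytes()).hexdigest(),
    }
    sidecar = path.with_name(path.name + ".meta.json")
    sidecar.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return sidecar


def verify_copy(copy: Path, sidecar: Path) -> bool:
    """Return True if `copy` has the same bytes as the file the sidecar describes."""
    record = json.loads(sidecar.read_text(encoding="utf-8"))
    return hashlib.sha256(copy.read_bytes()).hexdigest() == record["sha256"]
```

After copying a file to new media, calling verify_copy on the new copy and the original sidecar gives a quick check that nothing was altered in transit. It does not address format evolution, which still needs people and time.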
18Literature, Reports, Field Journals...
19Electronic PEET Products
- Project web pages
- Images
- Literature - publications, reports, field journals
- Gene sequences and other molecular data
- Character matrices and keys
- Databases - data structure
20Gene sequences and other molecular data
21Electronic PEET Products
- Project web pages
- Images
- Literature - publications, reports, field journals
- Gene sequences
- Character matrices and keys
- Databases - data structure
22Character Matrices and Keys
- DELTA/INTKEY (example of standard in danger of format evolution)
- Lucid (now in Version 3.4)
- MacClade
- PAUP
- Hennig86
- MorphoBank
- Others
23Relational Databases: Content and Structure
- Archiving
- Metadata essential for discovery
- Convert to flat files
- Software-independent format (e.g. comma delimited)
- Lose relational structure but relationships can
be coded
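The flat-file conversion described above can be sketched as follows, using SQLite as a stand-in for whatever relational system a project actually runs. The table and column names (taxon, specimen, taxon_id) are hypothetical. The relationship between tables survives only as matching key values in the exported columns, which is one way of "coding" the relationships.

```python
import csv
import sqlite3
import tempfile
from pathlib import Path


def export_to_csv(conn: sqlite3.Connection, out_dir: Path) -> list:
    """Write every user table to <out_dir>/<table>.csv and return the table names."""
    out_dir.mkdir(parents=True, exist_ok=True)
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' "
        "AND name NOT LIKE 'sqlite_%' ORDER BY name")]
    for table in tables:
        cursor = conn.execute(f'SELECT * FROM "{table}"')
        with open(out_dir / f"{table}.csv", "w", newline="", encoding="utf-8") as fh:
            writer = csv.writer(fh)
            writer.writerow([col[0] for col in cursor.description])  # header row
            writer.writerows(cursor)                                 # data rows
    return tables


def demo_database() -> sqlite3.Connection:
    """Build a tiny in-memory database with one foreign-key relationship."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE taxon (taxon_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE specimen (
            specimen_id INTEGER PRIMARY KEY,
            taxon_id INTEGER REFERENCES taxon(taxon_id),
            locality TEXT);
        INSERT INTO taxon VALUES (1, 'Taxon A'), (2, 'Taxon B');
        INSERT INTO specimen VALUES (10, 1, 'Site X'), (11, 2, 'Site Y');
    """)
    return conn


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(export_to_csv(demo_database(), Path(tmp)))
```

The exported specimen.csv keeps the taxon_id column, so a future reader can rejoin it to taxon.csv without the original database software. Documenting which columns are keys remains a metadata task.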
24Relational Databases: Content and Structure
- Continued service
- Version changes
- High maintenance (some require professional DBA)
- One size generally does not fit all, which makes it difficult to pass on
- Must also maintain the front end (required for queries)
- Scripting language, e.g. ColdFusion, PHP
25A SILVER BULLET or SILVER BUCKSHOT?
TO MAINTAIN ACCESS TO ELECTRONIC RESEARCH PRODUCTS
- Concentration of resources vs. discovery of new
methods by diversification
26Demonstrate value / usefulness
- Hits / citations: can be problematic for taxonomy / systematics
- Become part of a large entity
27www.iobis.org
the main provider of marine data to
28- Maintaining functionality
LIBRARIES have been custodians of scholarly
knowledge
A distributed resource
Diagram labels: PORTAL, CONTRIBUTORS, OBIS, GBIF, FishBase Consortium, Individuals, Institutions
29DIGITAL LIBRARIES
30- Develop a clear technical and financial strategy.
- Create policy for key issues consistent with the
technical and financial strategy.
- The Foundation should actively engage with the
community to ensure that community policies and
priorities are established and then updated in a
timely way.
www.nsf.gov/pubs/2005/nsb0540
31Recurring Challenges
-
- Personnel
- Time
- Format evolution / backward compatibility
- Metadata: complete, appropriate (controlled vocabulary)
- Digital lifetime: determining what, if anything, should be truly discarded
32IT'S UP TO US