Title: The Alliance for Data Archive Technologies: Looking towards a Common Future
1The Alliance for Data Archive Technologies
Looking towards a Common Future
Myron Gutmann, ICPSR
Ben Evans, ASSDA
Deborah Mitchell, ASSDA
Kevin Schürer, UK Data Archive
2Overview
- Why?
- What?
- Why Now?
- Early Steps
- Understanding Process
- Understanding Needs
- Next Steps
3Why?
- Data curation has been an ad hoc process, with local practices and expertise
- Since the 1990s:
- Enormous investment in technology
- Significant successes in social science (SDA, Nesstar, DVN, IPUMS, even ICPSR)
- Major new ways to find and use content (Google) and architectures to deliver content (web services)
4More Why
- Proprietary systems unsustainable
- Market too small for commercial systems
- Partnerships will help avoid unnecessary duplication of effort and assure efficiency
- Need to be truly global
5What?
- New organization to support technologies for curation, preservation, and delivery that are:
- Open
- Community-developed
- Standards-based
- Built on existing networks of social science data archives and technology centers, and
- Open to all who want to contribute
6Why Now? Three Standards
- DDI Metadata Standard
- OAIS Preservation Reference Model
- Repository Architecture Standards: Fedora, DSpace, DuraSpace
- Organizational models like the DDI Alliance, CESSDA, Data-PASS (even the new HathiTrust)
7Why Now? Community Tech
- Community-developed software has become widely used
- Examples: Drupal/Plone
- Examples: Fedora
- Examples: SOLR/Lucene
- But we shouldn't ignore all the challenges that this software has faced
8Why Now? Workflows
- Improved workflow technologies are operating in many of our institutions
- Some are shared in CESSDA and Data-PASS
- And in other communities: Virtual Observatory
- Another challenge: not the same as sharing business practices in complex organizations
9Why Now? Progress So Far
- SDA
- Nesstar
- DVN
- All used in more than one archive
- Not all open-source
- Potential shared technologies that we can
leverage in the future
10First Steps: October 2008 Meeting
- ICPSR
- ASSDA
- UKDA
- Roper Center - UConn
- Odum Inst. - N. Carolina
- Harvard - IQSS
- Minnesota Pop. Center
- Berkeley SDA
- DANS - Netherlands
- DDA - Denmark
- GESIS - ZA
- South Africa
- DDI Alliance
- IASSIST
- Library of Congress
- U.S. NSF
- U.S. NIH
- Canadian SSHRC
Thanks to Library of Congress for hosting
11First Steps: After October 2008
- Solicit needs in the form of wish lists
- Authorize creation of an organization at an appropriate time
- Work on raising money and finding common ground for future work
12Process: Begin with OAIS Model
13Design: OAIS for ICPSR
14Focus on Ingest
15ICPSR Standards Compliance
- Ingest tools
- AIP Creation-Validation
- SIP Creation-Validation
- DIP Creation-Validation
- Audit tools
- Tools for full variable-level metadata creation not dependent on proprietary software (such as SPSS)
- DDI Editor
- DDI Converter
- DDI 2 to 3 translator
16Needs: Wish Lists from
- ICPSR
- UKDA
- ASSDA
- Harvard
- Roper Center
- Odum Institute
- DANS (Netherlands)
- DDA (Denmark)
- GESIS (Germany)
- NSD (Norway)
- Minnesota Pop. Center
17Needs: A Catalog
Administration
- Identity management
- OAIS workflow audit (SIP/AIP/DIP)
Access
- Data format conversion
- Setup file creation
- International data sharing
- Community data/User comments/Web 2.0
- Search
- Confidentiality
- Persistent identifiers
- Visualization
- Data citation
- Semantic data access
- Security
Data Management
Production
- Open metadata curation
- Data format curation
- Data management analysis
- Qualitative data management
- Data integration
- Metadata registries
- Survey question management
- Data citation
Ingest
- Open metadata curation
- Confidentiality
- Software/algorithm archiving
Archival Storage
- Storage fabric/architecture (FEDORA or ?)
- Replication (LOCKSS)
- Persistent identifiers
- Content model development
18Next Steps: Canberra Meeting
- Prime Goal: Strategic Planning
- What's the business model?
- What are the links to:
- Standards?
- Security?
- Archiving practice and workflows?
- Training and research?
- How do we measure success?
19Three Major Outcomes
- Goal 1: A few critical decisions
- Standards, repository framework, software approaches
- Goal 2: Initial Common Interests. Examples:
- Fedora data/content models
- Open source metadata tools (DDI 3?)
- Goal 3: How do we collaborate?
20Thank you!
- gutmann_at_umich.edu
- deborah.mitchell_at_anu.edu.au
- schurer_at_essex.ac.uk
- ben.evans_at_anu.edu.au