Title: Preservation Strategies in the North Carolina Geospatial Data Archiving Project (NCGDAP) NCSU Libraries Steve Morris Head of Digital Library Initiatives
1. Preservation Strategies in the North Carolina Geospatial Data Archiving Project (NCGDAP)
NCSU Libraries
Steve Morris, Head of Digital Library Initiatives
Digital Preservation in State Government Best
Practices Exchange 2006
2. Overview
- Digital geospatial data preservation issues
- Technical solutions
- Organizational/cultural solutions
3. NC Geospatial Data Archiving Project
- Partnership between university library (NCSU) and state agency (NCCGIA), with Library of Congress under the National Digital Information Infrastructure and Preservation Program (NDIIPP)
- One of 8 initial NDIIPP partnerships (only state project)
- Focus on state and local geospatial content in North Carolina (state demonstration)
- Tied to NC OneMap initiative, which provides for seamless access to data, metadata, and inventories
- Objective: engage existing state/federal geospatial data infrastructures in preservation
4. Targeted Content
- Resource Types
  - GIS vector data
  - Digital orthophotography
  - Digital maps
  - Tabular data
- Content Producers
  - Mostly state, local, regional
  - Some university, commercial
  - Selected local federal projects
5. Today's geospatial data as tomorrow's cultural heritage
Future uses of data are difficult to anticipate
(as with Sanborn Maps).
6. Risks to Digital Geospatial Data
- Producer focus on current data
  - Time-versioned content generally not archived
- Future support of data formats in question
  - Vast range of data formats in use--complex
- Shift to streaming data for access
  - Archives have been a by-product of providing access
- Preservation metadata requirements
  - Descriptive, administrative, technical, DRM
- Geodatabases
  - Complex functionality
7. Different Ways to Approach Preservation
- Technical solutions: How do we preserve acquired content over the long term?
- Cultural/Organizational solutions: How do we make the data more preservable, and more prone to be preserved, at point of production?
8. Vector Data Format Options
- Option A: use an open format and have a really unfortunate transformation and limited vendor support for the output object
- Option B: use closed format but retain the original content and count on short- and medium-term vendor support
- Option C: do both to buy time and look for an open, ASCII solution (watch GML activity)
- No sweet spot, just an evolving and changing mix of flawed options that are used in combination
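Option C can be pictured as a record that keeps the untouched original next to an open-format derivative and notes which copy was derived from which. The following is only a minimal sketch under that reading: the class, the field names, the file names, and the format labels are all hypothetical and are not taken from the project.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class StoredCopy:
    """One stored copy of a dataset (hypothetical record layout)."""
    filename: str
    data_format: str
    is_open_format: bool
    derived_from: Optional[str] = None  # filename of the source copy, if any


def satisfies_option_c(copies: List[StoredCopy]) -> bool:
    """True if an original is retained AND an open-format copy derived from it exists."""
    originals = {c.filename for c in copies if c.derived_from is None}
    has_open_derivative = any(
        c.is_open_format and c.derived_from in originals for c in copies
    )
    return bool(originals) and has_open_derivative


# Illustrative data only: a closed-format original plus an open-format derivative.
copies = [
    StoredCopy("parcels.gdb.zip", "geodatabase", is_open_format=False),
    StoredCopy("parcels.gml", "gml", is_open_format=True,
               derived_from="parcels.gdb.zip"),
]
```

Keeping the `derived_from` link explicit means the transformation can be redone later from the original if a better open format emerges.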
9. Preserving Cartographic Representation
Counterpart to the map is not just the dataset
but also models, symbolization, classification,
annotation, etc.
10. Preserving Geodatabases
- Spatial databases in general vs. ESRI Geodatabase format
- Not just data layers and attributes, but also topology, annotation, relationships, behaviors
- Growing use of geodatabases by municipal, county agencies
- Some looking to Geodatabase as archive platform (in addition to feature class export)
- ESRI Geodatabase archiving approaches
  - Feature Class Export, XML Export, Geodatabase History, File Geodatabase, Geodatabase Replication
11. Harnessing Geospatial Web Services
Image atlases from WMS services?
Capturing cartographic representation?
Recording records from decision-making processes?
Later data transfer via WFS/GML? Other?
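One way to picture an image atlas harvested from a WMS service is a grid of GetMap requests covering an area of interest. The sketch below only builds the request URLs; it assumes WMS 1.1.1 parameter names, and the base URL, layer name, and bounding box are hypothetical placeholders rather than real NC OneMap endpoints.

```python
from urllib.parse import urlencode


def getmap_url(base_url, layer, bbox, width=512, height=512,
               srs="EPSG:4326", image_format="image/png"):
    """Build a WMS 1.1.1 GetMap URL for one bounding box (minx, miny, maxx, maxy)."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",
        "SRS": srs,
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": image_format,
    }
    return base_url + "?" + urlencode(params)


def atlas_bboxes(minx, miny, maxx, maxy, cols, rows):
    """Split an extent into cols x rows equal bounding boxes, row by row from the south."""
    dx = (maxx - minx) / cols
    dy = (maxy - miny) / rows
    for r in range(rows):
        for c in range(cols):
            yield (minx + c * dx, miny + r * dy,
                   minx + (c + 1) * dx, miny + (r + 1) * dy)


# Hypothetical service and layer; the extent is an arbitrary example box.
boxes = list(atlas_bboxes(-84.0, 34.0, -76.0, 36.0, cols=4, rows=2))
urls = [getmap_url("https://example.org/wms", "orthophoto", b) for b in boxes]
```

Images captured this way record the cartographic rendering at a point in time, but not the underlying data, which is why the slide also asks about later transfer via WFS.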
12. Project Repository Approach
- Interest in how geospatial content interacts with widely available digital repository software
- Focus on salient, domain-specific issues
- Challenge: remain repository agnostic
- Avoid imprinting on repository software environment
- Preservation package should not be the same as the ingest object of the first environment
- Tension between exploiting repository software features vs. becoming software dependent
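A repository-agnostic preservation package could be as simple as the content files plus a plain-text checksum manifest that any future system can read. This is a minimal sketch of that idea, not the project's actual package format; the manifest layout and the file names are assumptions made for illustration.

```python
import hashlib


def build_manifest(files):
    """Return manifest text for a mapping of relative path -> bytes.

    One 'sha256-hex, two spaces, path' line per file, sorted by path.
    """
    lines = []
    for path in sorted(files):
        digest = hashlib.sha256(files[path]).hexdigest()
        lines.append(f"{digest}  {path}")
    return "\n".join(lines) + "\n"


def verify_manifest(manifest, files):
    """True if every file listed in the manifest is present with a matching checksum."""
    for line in manifest.splitlines():
        digest, path = line.split("  ", 1)
        if path not in files:
            return False
        if hashlib.sha256(files[path]).hexdigest() != digest:
            return False
    return True


# Illustrative in-memory package; real content would be read from disk.
package = {
    "data/parcels.shp": b"binary vector data",
    "metadata/parcels.xml": b"<metadata/>",
}
manifest = build_manifest(package)
```

Because the manifest is plain text and depends on nothing in the repository software, it can travel with the package when the content moves to a different environment.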
13. Organizational/Cultural Approaches
Provide feedback to producer organizations/inform state geospatial infrastructure
Take the data as is, in the manner in which it can be obtained
Wrangle and archive data
Note the "Project" in North Carolina Geospatial Data Archiving Project: the process, the learning experience, and the engagement with industry and infrastructure are more important than the archive.
14. Points of Engagement with Spatial Data Infrastructure
- Framework data communities
  - Snapshot frequency, naming schemes, classification, GML application schemas, format strategies
- Metadata standards and outreach
  - Persistent identifiers, versioning, feedback on metadata quality
- Content replication/transfer
  - For data improvement projects, disaster preparedness, aggregation by regional service providers, and archives
  - Where does archiving and preservation fit in?
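A naming scheme for time-versioned snapshots might combine producer, layer, and snapshot date into one identifier. The sketch below shows one hypothetical scheme, not one adopted by the project; the underscore-separated pattern and the producer and layer names are assumptions.

```python
import re
from datetime import date


def slug(text):
    """Lowercase text and collapse runs of non-alphanumerics into single hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def snapshot_id(producer, layer, snapshot_date):
    """Build an identifier of the form producer_layer_YYYY-MM-DD."""
    return f"{slug(producer)}_{slug(layer)}_{snapshot_date.isoformat()}"


def parse_snapshot_id(identifier):
    """Split an identifier back into (producer, layer, date)."""
    producer, layer, iso_date = identifier.split("_")
    return producer, layer, date.fromisoformat(iso_date)


# Illustrative producer and layer names only.
ids = [
    snapshot_id("Example County", "Parcels", date(2006, 7, 1)),
    snapshot_id("Example County", "Parcels", date(2005, 7, 1)),
]
```

Because the date is in ISO 8601 form, sorting the identifiers for one producer and layer as plain strings also puts the snapshots in chronological order.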
15. Points of Engagement with the Open Geospatial Consortium (OGC)
- Geography Markup Language (GML) for archiving (PDF/A version of GML?)
- GeoDRM
  - Adding preservation use cases
- Content Packaging
  - Will there be an industry solution?
- Web Map Context Documents
  - Can we save data state as well as application state?
- Content Replication
  - Is this a layer in the overall architecture?
- Persistent Identifiers
16. Points of Engagement with Industry
- Software vendors
  - Better support for temporal data management
  - Tools for retrospective data conversion
- Web mashup and open source communities
  - WMS caching schemes
  - Standard tiling schemes with temporal component?
- Data vendors
  - Cultivate market for older data (scaled pricing?)
  - Tech transfer on archiving practices?
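To illustrate what a tiling scheme with a temporal component might look like, the sketch below adds a year to an otherwise ordinary tile address. It assumes a simple global grid in geographic coordinates (2^(zoom+1) columns by 2^zoom rows); the grid, the key layout, and the layer name are hypothetical and do not correspond to any published standard.

```python
def tile_index(lon, lat, zoom):
    """Return (col, row) of the tile containing a point, counting from the top-left."""
    cols = 2 ** (zoom + 1)
    rows = 2 ** zoom
    size = 360.0 / cols  # tile width and height in degrees
    col = min(int((lon + 180.0) / size), cols - 1)
    row = min(int((90.0 - lat) / size), rows - 1)
    return col, row


def temporal_tile_key(layer, year, zoom, lon, lat):
    """Build a cache key that carries the imagery year alongside the tile address."""
    col, row = tile_index(lon, lat, zoom)
    return f"{layer}/{year}/{zoom}/{col}/{row}.png"


# The same location in two different years maps to the same tile, different keys.
key_2005 = temporal_tile_key("orthophoto", 2005, 2, -78.6, 35.8)
key_1999 = temporal_tile_key("orthophoto", 1999, 2, -78.6, 35.8)
```

With the year in the key, a cache can hold older imagery next to the current tiles instead of overwriting it.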
17. Project Status
Cultivating a market for older data.
18. Project Status
Cultivating tools for retrospective conversion.
19. Conclusion
- Geospatial data is complex, introducing manifold challenges to ingest processes and repository development
- Vector data and spatial databases are especially complex
- Geospatial data exists in very large quantities and is subject to frequent update
- Need to engage industry in the solution
- Need to engage point of production
20. Questions?
Contact: Steve Morris, Head, Digital Library Initiatives, NCSU Libraries
Steven_Morris_at_ncsu.edu
Web site: http://www.lib.ncsu.edu/ncgdap/