Title: GLOBAL BIODIVERSITY INFORMATION FACILITY
WWW.GBIF.ORG
The GBIF Data Repository Tool (new, updated version 3.0)
Hannu Saarenmaa, EC CHM GBIF European Regional Nodes Meeting, Copenhagen, 2005-09-15/18
Outline
- Objectives and background
- Design and installation
- Use
- Demonstration
1. Objectives and background
Challenges in data sharing
- Eventually, all data sets become orphans: archiving services are a necessity.
- The concept "share once, use many" requires available data repositories.
- Data from archives must be available to portals such as GBIF through standard mechanisms.
- IPR, confidentiality, and benefit sharing must be respected at all times.
Goals of the GBIF Data Repository Tool
- Enable data custodians to manage their data and control its publishing.
- Provide a mechanism such that spreadsheets, etc., can be used directly for sharing data.
- Hide the database complexities from users.
- Make available a simple data warehouse tool for those who want to host datasets for the community.
- I.e., lower the threshold for data sharing as far as possible.
2. Design
Functionalities
- Data must be formatted according to the Darwin Core standard and its extensions in flat spreadsheet format.
- In fact, any flat format will work (rows, columns).
- The system will check and parse the data into an embedded MySQL database that becomes available to the public as a DiGIR/TAPIR resource.
- Owners can control the level of detail released.
- Fuzzying of geographic coordinates is available.
- Collector names and time periods can be hidden.
- Approval of terms and conditions for data use can be required.
- Owner can revoke release and update data.
- Metadata can be inherited by data to replace missing values as defined.
- Includes an embedded image server.
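The release controls and metadata inheritance listed above can be pictured with a short Python sketch. This is not the tool's own code. The `ReleasePolicy` class, the field names (modern Darwin Core terms), and the use of rounding as the fuzzying method are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ReleasePolicy:
    """Hypothetical owner-controlled release settings (not the tool's API)."""
    coordinate_decimals: int = 1   # fewer decimals = coarser public location
    hide_collector: bool = True
    hide_dates: bool = False


def apply_defaults(record: dict, defaults: dict) -> dict:
    """Fill missing or empty values from collection-level metadata."""
    merged = dict(record)
    for key, value in defaults.items():
        if merged.get(key) in (None, ""):
            merged[key] = value
    return merged


def release_view(record: dict, policy: ReleasePolicy) -> dict:
    """Return the public copy of a record; the stored record is untouched."""
    public = dict(record)
    # Assumed fuzzying method: round coordinates to a coarser precision.
    for key in ("decimalLatitude", "decimalLongitude"):
        if public.get(key) not in (None, ""):
            public[key] = round(float(public[key]), policy.coordinate_decimals)
    if policy.hide_collector:
        public.pop("recordedBy", None)
    if policy.hide_dates:
        public.pop("eventDate", None)
    return public


# Illustrative data only.
defaults = {"country": "Finland", "basisOfRecord": "PreservedSpecimen"}
record = {
    "scientificName": "Parus major",
    "decimalLatitude": "60.16952",
    "decimalLongitude": "24.93545",
    "recordedBy": "A. Collector",
    "eventDate": "2005-06-01",
    "country": "",
}
public = release_view(apply_defaults(record, defaults), ReleasePolicy())
print(public)
```

Working on a copy keeps the full-detail record in the repository, so the owner can later change the policy or revoke the release without losing data.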
Component architecture
Installation
- For Linux and Windows
- Based on Python, Zope 2.10 and MySQL
- Supports the DiGIR and TAPIR protocols of TDWG
- Turn-key installation
- Fits directly into the EC CHM software package
3. Use
Steps for data owners
- Prepare the data files
- Create a nested folder structure on the Repository for the collection
- Enter default metadata scope (to cover missing values in data, etc.)
- Decide on access policies
- Upload the files
- Publish the data files
Create a collection and assign it to a data custodian...
Enter metadata scope for the collection (inherited)
Create the resources (databases) of the collection and folders
Upload the files to the folder, validate and release them
Prepare the data file(s) in tab-separated format
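As a rough illustration of preparing and checking a tab-separated file before upload, the Python sketch below writes a small file and runs a few sanity checks on it. The column names are modern Darwin Core terms and are an assumption; the tool may expect the Darwin Core version it was built against. The checks shown are examples, not the validation the Repository itself performs.

```python
import csv
import io

# Assumed column names (modern Darwin Core terms); verify against the tool.
COLUMNS = ["catalogNumber", "scientificName", "decimalLatitude",
           "decimalLongitude", "eventDate", "recordedBy"]


def write_tsv(rows, handle):
    """Write records as tab-separated text with a header row."""
    writer = csv.DictWriter(handle, fieldnames=COLUMNS, delimiter="\t",
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)


def check_tsv(handle):
    """Return a list of problems found; an empty list means no issues."""
    problems = []
    reader = csv.DictReader(handle, delimiter="\t")
    missing = [c for c in COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        return [f"missing columns: {', '.join(missing)}"]
    # Data starts on line 2 because line 1 is the header.
    for line_no, row in enumerate(reader, start=2):
        if not row["scientificName"]:
            problems.append(f"line {line_no}: empty scientificName")
        try:
            lat = float(row["decimalLatitude"])
            lon = float(row["decimalLongitude"])
        except (TypeError, ValueError):
            problems.append(f"line {line_no}: non-numeric coordinates")
            continue
        if not (-90 <= lat <= 90 and -180 <= lon <= 180):
            problems.append(f"line {line_no}: coordinates out of range")
    return problems


# Illustrative data only; the second row is deliberately faulty.
rows = [
    {"catalogNumber": "1", "scientificName": "Parus major",
     "decimalLatitude": "60.17", "decimalLongitude": "24.94",
     "eventDate": "2005-06-01", "recordedBy": "A. Collector"},
    {"catalogNumber": "2", "scientificName": "",
     "decimalLatitude": "95.0", "decimalLongitude": "24.94",
     "eventDate": "2005-06-02", "recordedBy": "A. Collector"},
]
buffer = io.StringIO()
write_tsv(rows, buffer)
buffer.seek(0)
problems = check_tsv(buffer)
for problem in problems:
    print(problem)
```

An in-memory buffer stands in for the file here; opening a real file with `open(path, "w", newline="", encoding="utf-8")` would work the same way.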
Access policy options
- Fully open
- Standard GBIF policy of acknowledgements
- No direct download, and fuzzying for web service access
Data is now searchable locally and through the DiGIR/TAPIR protocols
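To give a feel for protocol access, the Python sketch below builds request URLs for a TAPIR resource. It assumes TAPIR's key-value-pair (HTTP GET) request style, in which the `op` parameter names the operation. The access point URL is hypothetical, and the helper `tapir_url` is introduced here for illustration only; DiGIR requests, which are XML-based, are not covered.

```python
from urllib.parse import urlencode

# Hypothetical access point; substitute the URL of a real TAPIR resource.
ACCESS_POINT = "http://example.org/tapir/my_collection"


def tapir_url(operation: str, **params: str) -> str:
    """Build a key-value-pair request URL for the given TAPIR operation."""
    query = {"op": operation, **params}
    return f"{ACCESS_POINT}?{urlencode(query)}"


ping_url = tapir_url("ping")                  # is the service alive?
capabilities_url = tapir_url("capabilities")  # what does it support?
print(ping_url)
print(capabilities_url)
```

The resulting URLs could be fetched with any HTTP client; the sketch stops at building them because the real access point depends on the installation.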
4. Demonstration
http://fmnh.eaudeweb.ro/