GLOBAL BIODIVERSITY - PowerPoint PPT Presentation

Transcript and Presenter's Notes

Title: GLOBAL BIODIVERSITY


1
GLOBAL BIODIVERSITY INFORMATION FACILITY
WWW.GBIF.ORG
GBIF Network as a Model for GISIN
Hannu Saarenmaa
AAAS Annual Meeting, Washington, DC, February 20, 2005

2
Outline
  • Sharing and using primary biodiversity data
    through GBIF
  • Overview of the GBIF information system
  • Sharing species information from Species Banks
  • GISIN architecture and GISIN as a Species Bank
  • Conclusion

3
1. Sharing and using primary biodiversity data
through GBIF
4
GBIF's objective is
  • to establish a distributed information
    infrastructure that serves scientific
    biodiversity data
  • with initial focus on primary data at specimen
    and observation levels, and on names
  • expanding to species-level information,
  • with links to molecular, genetic and ecosystem
    levels
  • to function as a global integrator

5
Pyramid of information
  • Policy and decisions can benefit from
  • Knowledge and
  • Information, which depend on
  • Primary data

Diagram labels: CHM; Refinement, analysis, synthesis; Other information networks; GISIN; GBIF area of responsibility
6
What primary data exists?
  • 1-3 billion physical specimens in museums
  • Label data to be digitised
  • 300-400 million digital data records off-line
  • Museums, observation networks, natural resource
    surveys, etc.
  • 46 million records are online today through GBIF
  • Using standard formats

7
What is primary data?
Secondary information
  • Point occurrence data with the basic attributes
  • Identification
  • Location
  • Time

Primary data
Slide by A. Townsend Peterson
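The three basic attributes above (identification, location, time) can be pictured as one small record type. The following is a minimal sketch only; the class name, field names and the sample values are assumptions made for illustration and are not taken from any GBIF schema.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Occurrence:
    """One point occurrence: what was found, where, and when (hypothetical type)."""
    scientific_name: str  # identification
    latitude: float       # location, decimal degrees
    longitude: float      # location, decimal degrees
    observed_on: date     # time

    def is_valid(self) -> bool:
        """Return True when the coordinates lie inside the valid ranges."""
        return -90.0 <= self.latitude <= 90.0 and -180.0 <= self.longitude <= 180.0


# Made-up sample values, used only to show the shape of a record.
record = Occurrence("Hydrilla verticillata", 13.75, 100.5, date(1998, 6, 14))
```

A record that carries only these attributes is already enough input for the distribution models mentioned on the following slides.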
8
Predicting geographic distributions with primary
data makes possible ...
  • Projecting species invasions
  • Designing reintroduction programs
  • Understanding the effects of global climate
    change and other types of change
  • Understanding rare and endangered species
    distributions
  • Designing biodiversity conservation plans
  • Many models such as Bioclim, GARP

Slide by A. Townsend Peterson
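To give a feel for how occurrence points turn into a predicted distribution, here is a deliberately simplified climate-envelope sketch in the spirit of Bioclim. It is an assumption-laden illustration, not the Bioclim or GARP algorithm: it uses plain minimum/maximum limits, and the variable names and numbers are invented.

```python
from typing import Dict, List, Tuple

Climate = Dict[str, float]
Envelope = Dict[str, Tuple[float, float]]


def fit_envelope(samples: List[Climate]) -> Envelope:
    """Record the min and max of each climate variable at the occurrence points."""
    variables = samples[0].keys()
    return {
        v: (min(s[v] for s in samples), max(s[v] for s in samples))
        for v in variables
    }


def is_suitable(envelope: Envelope, site: Climate) -> bool:
    """A site is predicted suitable when every variable falls inside the envelope."""
    return all(low <= site[v] <= high for v, (low, high) in envelope.items())


# Invented climate values at three native-range occurrence points.
native_range = [
    {"temp": 18.0, "rain": 900.0},
    {"temp": 26.0, "rain": 1500.0},
    {"temp": 22.0, "rain": 1200.0},
]
envelope = fit_envelope(native_range)
```

Testing candidate sites on another continent against `envelope` is the basic idea behind projecting a possible invasion from native-range data.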
9
Hydrilla: primary data of native range
Slide by A. Townsend Peterson
10
Hydrilla: native modeled distribution
Slide by A. Townsend Peterson
11
Hydrilla: North America
Slide by A. Townsend Peterson
12
Hydrilla: North American infestations
Slide by A. Townsend Peterson
13
2. Overview of the GBIF information system
14
GBIF component architecture
(Diagram; labels recovered from the figure)
  • Components: User, Data Portal, Registry (UDDI), Index, Request Marshaller, Query Engine, Cache, Accounting, Metadata
  • Registry contents: Institutions, Providers, Services
  • Provider side: Name provider, Data provider, Provider Services, Resource, Metadata
  • Message flows: Metadata and name query, Provider query, Available providers, Metadata response, Publish availability, Metadata and statistics, Full data query, Full data response, Synonyms
  • Protocols: DiGIR, SOAP, HTTP, other
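The flow suggested by the diagram labels (look up available providers, send each a query, gather the responses) can be sketched as a simple fan-out. This is a toy model under stated assumptions: the registry is a plain dictionary, providers are ordinary Python callables rather than DiGIR or SOAP endpoints, and all names are hypothetical.

```python
from typing import Callable, Dict, List

# A provider is modelled as a function from a name query to a list of records.
Provider = Callable[[str], List[dict]]


def federated_search(registry: Dict[str, Provider], name: str) -> List[dict]:
    """Send the same query to every registered provider and merge the answers."""
    results: List[dict] = []
    for provider_id, query in registry.items():
        try:
            records = query(name)
        except ConnectionError:
            continue  # an unreachable provider must not break the whole search
        for record in records:
            # Tag each record with its origin so ownership stays visible.
            results.append({**record, "provider": provider_id})
    return results
```

Tagging every record with its provider reflects the point made later in the talk that control and ownership of data remain with the providers.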
16
  • Turn-key packages available implementing
    DiGIR/DarwinCore and BioCASe/ABCD
  • Available for Linux and Windows
  • Supported by helpdesk_at_gbif.org

17
GBIF (prototype) Data Portal
  • Gateway to data of the providers
  • Name service is a part of the data portal
  • Search and browse data by name, country, etc.
  • Drill in and download data, display simple maps
  • Multilingual
  • Maintains a cache of key data in case provider
    goes off-line
  • Opened 6 February 2004
  • Based on Java and MySQL
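The caching behaviour described above (serve cached key data when a provider goes off-line) can be sketched as follows. The real portal is described as Java and MySQL based; this Python sketch only illustrates the idea, and the class and the `fetch` callback are hypothetical.

```python
from typing import Callable, Dict, List, Tuple


class CachingPortal:
    """Keep the last good answer per (provider, name) and reuse it on failure."""

    def __init__(self, fetch: Callable[[str, str], List[str]]) -> None:
        self._fetch = fetch  # hypothetical function that contacts a provider
        self._cache: Dict[Tuple[str, str], List[str]] = {}

    def get(self, provider_id: str, name: str) -> List[str]:
        key = (provider_id, name)
        try:
            records = self._fetch(provider_id, name)
        except ConnectionError:
            # Provider is off-line: fall back to cached data, if there is any.
            return self._cache.get(key, [])
        self._cache[key] = records
        return records
```

A query that was never answered before returns an empty list while the provider is down, since nothing has been cached for it yet.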

23

DiGIR and BioCASe protocols
  • Simple web services
  • XML messaging between computer applications
  • This is open data sharing -- not data exchange
    with trade partner agreements
  • Enables search and retrieval of structured data
  • Enables a single point of access (portal/search) to
    distributed information resources
  • Created by the TDWG/CODATA subgroup on biological
    collection data
  • Unified protocol in 2005: merger of DiGIR and
    BioCASe into TAPIR (TDWG Access Protocol for
    Information Retrieval)

24
Darwin Core and ABCD data formats
  • Two XML schemata for data exchange are available
    to choose from
  • Darwin Core is a minimal set
  • 48 elements in flat structure
  • Can be extended, for instance: curatorial,
    bacteriological, observational...
  • ABCD (Access to Biological Collection Data) is a
    superset
  • 600 elements in hierarchical structure
  • Can describe entire collection
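To make "flat structure" concrete, the sketch below writes one record as a single level of XML elements and reads it back. The element names (`ScientificName`, `Country`, `YearCollected`) and the `record` wrapper are illustrative assumptions and have not been checked against the Darwin Core schema; a hierarchical format such as ABCD would nest elements inside one another instead.

```python
import xml.etree.ElementTree as ET
from typing import Dict


def record_to_xml(fields: Dict[str, object]) -> str:
    """Serialise a flat record: one child element per field, no nesting."""
    root = ET.Element("record")
    for name, value in fields.items():
        ET.SubElement(root, name).text = str(value)
    return ET.tostring(root, encoding="unicode")


def xml_to_record(xml_text: str) -> Dict[str, str]:
    """Parse the flat XML back into a dictionary of strings."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text or "" for child in root}


# Illustrative field names and values only.
sample = {"ScientificName": "Hydrilla verticillata", "Country": "TH", "YearCollected": 1998}
xml_text = record_to_xml(sample)
```

Because the structure is flat, a round trip through XML loses nothing except the original Python types: every value comes back as a string.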

25
Image data standards
  • ABCD can handle links to images now
  • Metadata from Dublin Core
  • Annotation standards for what is in an image are needed
  • JPEG2000 in future

26
Identification data standards
  • DELTA
  • Standard tied to aging software
  • LUCID
  • Data format less tied to new, evolving software
  • Many electronic key products available
  • SDD (Structured Descriptive Data)
  • Character description
  • New standard without software yet

27
3. Sharing species information through Species
Banks
  • GBIF is starting to climb the pyramid of
    information

28
Encyclopedia of Life
  • Imagine an electronic page for each species of
    organism on Earth, ...
  • Linking dynamically to data, information, and
    knowledge sources, such as
  • ARKive
  • EcoPort
  • GBIF data and name providers
  • GenBank
  • MORPHOBANK
  • Tree of Life

34
Species banks
  • Species home pages mushrooming, but no standard
    exists for species information pages or how they
    can be registered, accessed and virtualised
  • GBIF believes that there is not going to be a
    single Species Bank but a distributed
    cyber-infrastructure for species information and
    knowledge
  • Integrate information for various uses like
    identification, invasives, pest control,
    taxonomic review, ...
  • GBIF approach (again) is to integrate
  • Standardise the sharing of species home pages and
    their fragments, and enable interoperability of
    their providers
  • Symposium (STAG) in March 2005

35
Technical view on Species Bank
  • Species Bank is an idea of federated databases
    serving elementary chunks of knowledge.
  • Example from diagnostic knowledge: Descriplets
    have the form "Taxon t has value v for feature
    f."
  • Millions of such statements, not only about
    diagnostics, may form a global Species Bank.
  • They originate from thousands of sources
  • RDF/XML, semantic web, semantic grid, CYC
  • Multimedia, free text must also be supported
  • Link to primary data and modelling
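The descriplet form "Taxon t has value v for feature f" maps naturally onto a small tuple type, and statements from many sources can then be merged per taxon. The sketch below is one possible reading of that idea; the type, the `source` field and all sample statements are placeholders invented for illustration.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class Descriplet(NamedTuple):
    """One elementary statement: taxon t has value v for feature f."""
    taxon: str
    feature: str
    value: str
    source: str  # which provider asserted the statement (assumed field)


def profile(statements: List[Descriplet], taxon: str) -> Dict[str, List[str]]:
    """Merge all statements about one taxon into feature -> distinct values."""
    merged: Dict[str, List[str]] = defaultdict(list)
    for s in statements:
        if s.taxon == taxon and s.value not in merged[s.feature]:
            merged[s.feature].append(s.value)
    return dict(merged)


# Placeholder statements from three imaginary sources.
statements = [
    Descriplet("Taxon A", "leaf arrangement", "whorled", "source 1"),
    Descriplet("Taxon A", "leaf arrangement", "whorled", "source 2"),
    Descriplet("Taxon A", "habitat", "freshwater", "source 2"),
    Descriplet("Taxon B", "habitat", "marine", "source 3"),
]
```

Calling `profile(statements, "Taxon A")` assembles a species-page fragment from statements that live in different sources, which is the federated picture the slide describes.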

36
Slide courtesy of Kevin Thiele
37
4. GISIN needs and a possible GISIN architecture
  • A specialised Species Bank

38
Some use cases for GISIN
  • Identify IAS at point of entry
  • Decide on control measures
  • Report IAS sighting, trigger alerts
  • Model IAS spread and impact
  • Find expertise and literature on IAS

39
GISIN technology needs
  • Standards for schemata for key data types, in
    particular the species profile and its fragments
  • Tools for providers: TAPIR protocol
  • Federation via a registry - UDDI
  • Integration - portals

40
Database / provider types
  • Six were identified by the Database Content
    Working Group
  • Species profile or fact sheet / diagnostic
  • Specimens (Darwin Core)
  • Observations (Darwin Core extension)
  • Expertise
  • Bibliographic
  • Projects / research

41
Species profile standard
  • Representation of IAS fact sheets and species
    home pages
  • Well-structured
  • Consists of basic species info, plus
    community-specific extensions
  • Distribution, identification, trophic relations,
    naming, expertise, IAS status,
  • Support for distributed authoring
  • Work is underway

42
IAS observation standard
  • Elements important to invasive species science
  • Native and non-native status
  • Pathways of spread
  • Host and parasitic organisms observed
  • Impact, invasiveness, etc.
  • Control techniques used
  • Include other optional fields as guidance for new
    database developers
  • This could be a Darwin Core extension for IAS
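One way to picture "a Darwin Core extension for IAS" is a base observation record plus optional invasive-species fields. The sketch below uses Python dataclass inheritance purely as an analogy; the field names follow the bullet points above but are assumptions, not an agreed standard, and the sample values are illustrative.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Observation:
    """Core fields shared by every observation (simplified, hypothetical)."""
    scientific_name: str
    latitude: float
    longitude: float
    year: int


@dataclass
class IASObservation(Observation):
    """Optional extension fields of interest to invasive species science."""
    native: Optional[bool] = None   # native or non-native status
    pathway: Optional[str] = None   # pathway of spread
    host_organisms: List[str] = field(default_factory=list)
    impact: Optional[str] = None
    control_techniques: List[str] = field(default_factory=list)


# Illustrative values only.
sighting = IASObservation(
    "Hydrilla verticillata", 28.5, -81.4, 2004,
    native=False, pathway="aquarium trade",
)
```

Because every extension field is optional, software that only understands the core fields can still treat an `IASObservation` as an ordinary `Observation`.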

43
GISIN registry
  • In a service-oriented distributed architecture,
    dynamic discovery and location independence of
    the services is fundamental
  • Need to have a registry solution as part of the
    architecture
  • Alternatives
  • Dedicated GISIN UDDI with replication to/from
    GBIF
  • Appear in GBIF UDDI as thematic network
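The role of the registry (providers publish their services, clients discover them without knowing locations in advance) can be sketched with an in-memory stand-in. This does not model UDDI itself; the classes, the `network` tag used to mimic a thematic network, and the example.org endpoints are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ServiceEntry:
    provider: str
    endpoint: str  # where the service currently lives
    network: str   # thematic network tag, e.g. "GISIN" (assumed convention)


class Registry:
    """Minimal publish/discover registry kept in memory."""

    def __init__(self) -> None:
        self._entries: Dict[str, ServiceEntry] = {}

    def publish(self, entry: ServiceEntry) -> None:
        # Publishing again under the same provider name updates the endpoint,
        # which is what gives clients location independence.
        self._entries[entry.provider] = entry

    def discover(self, network: Optional[str] = None) -> List[ServiceEntry]:
        """Return all services, or only those tagged with the given network."""
        return [
            e for e in self._entries.values()
            if network is None or e.network == network
        ]


registry = Registry()
registry.publish(ServiceEntry("museum-a", "http://example.org/a", "GBIF"))
registry.publish(ServiceEntry("ias-db-b", "http://example.org/b", "GISIN"))
```

Filtering by `network` corresponds loosely to the second alternative above, where GISIN services appear in a shared registry as a thematic network.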

44
A possible GISIN component architecture
(Diagram; labels recovered from the figure)
  • Components: User, Knowledge Portal(s), Registry (UDDI), Index, Cache of data and descriplets, Accounting, Metadata
  • Registry contents: Institutions, Providers, Services
  • Provider side: Species knowledge provider, Data/name provider, Provider Services, Resource, Metadata, Multimedia, Texts
  • Message flows: Publish availability
45
GISIN portal(s)
  • Portals integrate data, information, and
    knowledge
  • Integration of GBIF's primary data has been
    straightforward
  • Finding working models of how to integrate
    information and knowledge is the challenge for
    Species Banks and GISIN

46
5. Conclusion
47
What makes GBIF work
  • Standards for data and protocols (and their
    interaction via web services)
  • Control and ownership of data remains with
    providers
  • Registry for advertisement of data
  • Integration at portals
  • GBIF is a multi-purpose, open-ended
    cyber-infrastructure that enables taxonomists and
    others to serve society in new ways

48
What can make GISIN work
  • Build on what has made GBIF work, but do
    recognise that...
  • GISIN has more targeted use cases than GBIF
  • How to share and integrate species knowledge is
    still not well known. Research and prototyping
    are needed.
  • Think of GISIN largely as a Species Bank