An example of data integration in a distributed network environment - PowerPoint PPT Presentation

Title: An example of data integration in a distributed network environment


1
  • An example of data integration in a distributed
    network environment
  • Barbara Stein
  • Museum of Vertebrate Zoology

2
Open-access Distributed Databases
  • ...not such a new idea
  • 1993 FishGopher (gopher server)
  • 1997 Neodat II (data warehouse)
  • 1998 N.A. Bird Data Network (Z39.50)
  • 1998 REMIB (TCP/IP secure sockets)
  • 2000 FishNET (Z39.50 XML)
  • All of these efforts were taxon-based

3
MaNIS Goals
  • Facilitate open access to combined specimen data
    from a web browser
  • Enhance the value of specimen collections
  • Conserve curatorial resources
  • Use a design paradigm that could be easily
    adopted by other disciplines

4
Institutional Considerations
  • Design of the network must benefit participants
    as well as the larger user community
  • Institutions must be able to retain their
    current database management systems
  • Institutions must be able to retain control over
    which of their data are accessible
  • Institutions must be able to document network
    use of their collections data

5
Design Considerations
  • Architecture must be simple, low cost, and
    require minimal maintenance
  • No visible long-term support for the network or
    its participants
  • Known opposition within the community to
    centralization of operations
  • Uncertain availability of in-house technical
    expertise

6
DiGIR
Distributed Generic Information Retrieval (DiGIR)
is a software application (i.e., a protocol) that
specifies how requests and responses issued
across a network are formulated. MaNIS was the
first functional implementation of DiGIR and
became a driving force behind its development.
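
As a rough illustration of what "formulating requests and responses" can look like when XML is the message format, the sketch below builds a search request and parses a response. The element names (`request`, `search`, `equals`, `response`, `record`) and both function names are hypothetical and are not taken from the actual DiGIR schema; the DiGIR documentation defines the real message structure.

```python
import xml.etree.ElementTree as ET


def build_request(field, value):
    """Serialize a simple equality search as XML (element names are hypothetical)."""
    request = ET.Element("request")
    search = ET.SubElement(request, "search")
    equals = ET.SubElement(search, "equals", {"field": field})
    equals.text = value
    return ET.tostring(request, encoding="unicode")


def parse_response(xml_text):
    """Turn each <record> element of a response into a dict of field name to value."""
    root = ET.fromstring(xml_text)
    return [
        {child.tag: (child.text or "") for child in record}
        for record in root.iter("record")
    ]


# Example round trip with a hand-written response document.
request_xml = build_request("Genus", "Peromyscus")
response_xml = (
    "<response><record>"
    "<Genus>Peromyscus</Genus><Institution>MVZ</Institution>"
    "</record></response>"
)
records = parse_response(response_xml)
```

Because the request and the response are both plain XML documents, a provider and a portal only need to agree on the message format, not on each other's database software.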
7
DiGIR Goals
  • A network protocol that would serve as a standard
    among natural history databases...
  • Avoid multiple incongruous development efforts
  • Pool resources to achieve economies of scale
  • Create a support community of experts
  • Solve scalability problems
  • Ensure easy adoption by any discipline with
    similar needs

8
Design Approach
  • Use open protocols and standards, such as HTTP
    and XML
  • Let user communities define the structure of
    their data without requiring changes to the
    networking protocol or presentation software
  • Make new data provider installations as easy as
    possible
  • Develop free, open source software under the
    GNU General Public License

9
Standards are Paramount
  • The mammal community was ready...
  • MSW (Mammal Species of the World)
  • Documentation standards for data processing
  • DwC2 (Darwin Core Version 2)
  • Georeferencing Guidelines

10
Steps in Development of MaNIS
  • Collaborative georeferencing of locality data
  • Creating the network software
  • Connecting institutional databases to the network 

11
Distribution of Origin of Mammal Specimens
  • Africa: a large institution (FMNH) holds the
    majority of specimens, but holdings may be
    biased or incomplete
  • Oceania: a smaller collection (BPBM) may be
    crucial to biogeographic investigations
  • Mesoamerica: to neglect any one institution
    might be a serious omission
12
Distribution of Origin of Mammal Specimens
13
Steps in Development of MaNIS
  • Collaborative georeferencing of locality data
  • Creating the network software
  • Connecting institutional databases to the network 

14
Key Features of the MaNIS Network
  • There is no central repository or server
  • Institutions retain control over public access
    to their data without changing their in-house
    DBMS
  • Software is optimized for query performance
  • In-house DBMS is protected from traffic and
    intrusion
  • Each data provider automatically maintains
    summary data (i.e., counts of specimen records),
    in addition to specimen data from its
    institutional database
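
The features above can be sketched as a small in-memory model: each provider holds a public copy of its records plus automatically updated counts, and a portal answers a query by asking every provider in turn instead of reading a central store. All names here (`Provider`, `publish`, `portal_search`) are hypothetical, and using the summary counts to skip providers with no matches is an assumption made for illustration, not a documented MaNIS behavior.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Provider:
    """One institution's public-facing data provider (hypothetical model)."""
    name: str
    records: list = field(default_factory=list)
    counts: Counter = field(default_factory=Counter)

    def publish(self, record):
        # Store a public copy of the record, separate from the in-house DBMS,
        # and keep the per-genus summary count up to date.
        self.records.append(record)
        self.counts[record["Genus"]] += 1

    def search(self, genus):
        # Tag every hit with the institution that supplied it.
        return [dict(r, Institution=self.name)
                for r in self.records if r["Genus"] == genus]


def portal_search(providers, genus):
    """Send the query to each provider and merge the answers; nothing is stored centrally."""
    results = []
    for provider in providers:
        if provider.counts[genus]:  # assumption: consult summary counts first
            results.extend(provider.search(genus))
    return results


mvz = Provider("MVZ")
mvz.publish({"Genus": "Peromyscus", "CatalogNumber": "1"})
umnh = Provider("UMNH")
umnh.publish({"Genus": "Sorex", "CatalogNumber": "7"})
hits = portal_search([mvz, umnh], "Peromyscus")
```

In this model an institution decides what is public simply by choosing which records it passes to `publish`; its internal database is never queried by the portal.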

15
MaNIS Network Diagram
[Diagram labels: MaNIS DiGIR Web Portal; presentation layers for MVZ-MaNIS, UMNH-MaNIS, and UWBM-MaNIS]
16
Steps in Development of MaNIS
  • Collaborative georeferencing of locality data
  • Creating the network software
  • Connecting institutional databases to the network 

17
Keys to Success
  • Shared goals
  • Standards
  • Collaboration
      • heightened sense of community
      • recognition of the enormous value of combined
        data, i.e., the whole
      • large and small collections are now recognized
        for their respective contributions to that whole
      • appreciation of what can be gained has replaced
        a sense of competition
  • Trust
      • all business public
      • all participants equal

18
Impact of the Network
  • Conservation: Aid resource managers and provide
    new tools for solving the biodiversity crisis
  • Research: Encourage development of new
    applications for ecological analysis and
    synthesis
  • Education: Make possible educational use of
    specimen data
  • Collections management: Increase use of
    collections while conserving curatorial resources

19
MaNIS Developers
  • John Wieczorek (lead MaNIS programmer)
  • PJ Schwartz (DiGIR portal)
  • Dave Vieglais (DiGIR provider)
  • Reed Beaman
  • Stan Blum
  • Renato Giovanni
Collaborators
  • Australian National Botanic Gardens (ANBG)
  • Berkeley Digital Library Project (DLP)
  • Biological Collection Access Service for Europe
    (BioCASE)
  • Committee on Data for Science and Technology
    (CODATA)
  • Centro de Referência em Informação Ambiental
    (CRIA)
  • Global Biodiversity Information Facility (GBIF)
  • Taxonomic Databases Working Group (TDWG)
  • University of Kansas Biodiversity Research
    Center (KUBRC)
20
Funded MaNIS Participants
  • Bernice P. Bishop Museum
  • California Academy of Sciences
  • Colección Nacional de Mamíferos (Mexico)
  • Field Museum
  • Los Angeles County Museum of Natural History
  • Louisiana State University Museum of Natural
    Science
  • Michigan State University Museum
  • Royal Ontario Museum
  • Texas Tech University Museum
  • University of Alaska Museum
  • University of California Museum of Vertebrate
    Zoology
  • University of Kansas Natural History Museum
  • University of Michigan Museum of Zoology
  • University of New Mexico Museum of Southwestern
    Biology
  • University of Puget Sound James R. Slater Museum
  • University of Utah Museum of Natural History
  • University of Washington Burke Museum
Non-funded participants
  • Comisión Nacional para el Conocimiento y Uso de
    la Biodiversidad (CONABIO)
  • Sternberg Museum, Fort Hays State University
  • University of Kansas Natural History Museum,
    Division of Birds
  • University of Minnesota Bell Museum
21
Project Information
  • MaNIS is an international collaboration among
    mammal specimen collections
    (http://elib.cs.berkeley.edu/manis)
  • DiGIR is a collaborative open source development
    project on SourceForge
    (https://sourceforge.net/projects/digir)
  • Software and documentation are available on the
    DiGIR web site (http://digir.net)

22
Thank you
23
Distributed Database Networks
  • Discipline-specific
      • FishNet
      • HerpNET
      • ITIS (The Integrated Taxonomic Information
        System)
      • MaNIS (The Mammal Networked Information
        System)
      • ORNIS (The Ornithological Information System)

24
Distributed Database Networks
  • International
      • AVH (Australian Virtual Herbarium)
      • BioCASE (Biological Collection Access Service
        for Europe)
      • CONABIO (Comisión Nacional para el Conocimiento
        y Uso de la Biodiversidad)
      • ENHSIN (European Natural History Specimen
        Information Network)
      • GBIF (Global Biodiversity Information Facility)
      • REMIB (Red Mundial de Información Sobre
        Biodiversidad)

25
Georeferencing Reference
Wieczorek, J., Q. Guo and R.J. Hijmans. In
press. The point-radius method for georeferencing
locality descriptions and calculating associated
uncertainty. International Journal of
Geographical Information Science.
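
For readers unfamiliar with the idea, a point-radius georeference describes a locality as a coordinate pair plus a distance that bounds the uncertainty around it. The sketch below shows only that data structure and a containment check; it does not reproduce how the cited paper calculates the radius. The class name, field names, spherical-Earth distance formula, and example coordinates are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_M = 6371008.8  # mean Earth radius in meters (spherical approximation)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


@dataclass(frozen=True)
class PointRadius:
    """A locality as a center point plus an uncertainty radius (hypothetical names)."""
    latitude: float
    longitude: float
    uncertainty_m: float

    def may_contain(self, lat, lon):
        # True if the given point lies inside the uncertainty circle.
        return haversine_m(self.latitude, self.longitude, lat, lon) <= self.uncertainty_m


# Illustrative values only: a locality with a 5 km uncertainty radius.
locality = PointRadius(37.87, -122.26, 5000.0)
inside = locality.may_contain(37.88, -122.26)
outside = locality.may_contain(38.87, -122.26)
```

Storing the radius alongside the point lets a user filter specimen records by how precisely their collecting locality is known.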