Applying Computer Science Research in Biodiversity Informatics

1
Applying Computer Science Research in
Biodiversity Informatics
  • Some Experiences and Lessons
  • Andrew Jones
  • Cardiff University
  • Andrew.C.Jones_at_cs.cardiff.ac.uk

2
Example: predicting effect of climate change
  • Typical question: Where might a species be
    expected to occur, under present or predicted
    climatic conditions?
  • Some relevant resource types
  • Data sources
  • Catalogue of life
  • Species Information Sources (SISs)
  • Species geography
  • Descriptive data
  • Specimen distribution
  • Geographical
  • Boundaries of geographical political units
  • Climate surfaces
  • Genetic sequences
  • Analytic tools
  • Biodiversity richness assessment: various
    metrics
  • Bioclimatic modelling: bioclimatic envelope
    generation
  • Phylogenetic analysis (generation of phylogenetic
    trees)
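The "bioclimatic envelope generation" bullet can be illustrated with a minimal sketch. It assumes the simplest rectilinear form of an envelope (per-variable minimum and maximum over known occurrences); the slides do not say which modelling method was used, and the variable names and climate values below are invented for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple

# Hypothetical climate record: variable name -> value at one locality.
Climate = Dict[str, float]


@dataclass(frozen=True)
class Envelope:
    """Per-variable (low, high) bounds observed at known occurrences."""
    bounds: Dict[str, Tuple[float, float]]

    def contains(self, climate: Climate) -> bool:
        # A site counts as suitable only if every variable is within bounds.
        return all(lo <= climate[var] <= hi
                   for var, (lo, hi) in self.bounds.items())


def fit_envelope(occurrence_climates: Iterable[Climate]) -> Envelope:
    """Build the envelope as the min/max of each variable over occurrences."""
    records = list(occurrence_climates)
    if not records:
        raise ValueError("need at least one occurrence record")
    variables = records[0].keys()
    return Envelope({v: (min(r[v] for r in records),
                         max(r[v] for r in records))
                     for v in variables})


# Invented climate values at three specimen localities.
occurrences = [
    {"min_temp": -20.0, "max_temp": 24.0, "annual_precip": 400.0},
    {"min_temp": -12.0, "max_temp": 28.0, "annual_precip": 550.0},
    {"min_temp": -16.0, "max_temp": 26.0, "annual_precip": 480.0},
]
envelope = fit_envelope(occurrences)

# The same question asked under present and under predicted conditions.
present_site = {"min_temp": -15.0, "max_temp": 25.0, "annual_precip": 500.0}
predicted_site = {"min_temp": -8.0, "max_temp": 31.0, "annual_precip": 500.0}
print(envelope.contains(present_site))
print(envelope.contains(predicted_site))
```

Swapping the climate surface passed to `contains` is what turns the present-day question into the climate-change one; the envelope itself is fitted once from specimen distribution data.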

3
Some potential problems
  • Availability of data
  • Originally constructed for various reasons;
    proprietary standards; no interoperability
  • Even if on Internet, may be difficult to find or
    use
  • Heterogeneity
  • Representation
  • Granularity
  • Scientific naming issues
  • Flexibility and scalability

4
LITCHI background
  • BBSRC/EPSRC- and EU- funded
  • Aim: to detect conflicts between species
    checklists and either
  • Assist in producing a consistent checklist, or
  • Generate correspondences between checklists
    (cross-map)
  • Addressed problems presented by species
    classification and naming variations when
    accessing data relating to species

5
LITCHI example
  • Checklist 1
  • Caragana arborescens Lam. (accepted name)
  • Caragana sibirica Medikus (synonym)
  • Checklist 2
  • Caragana sibirica Medikus (accepted name)
  • Caragana arborescens Lam. (synonym)
  • (Lam. = Lamarck)

A full name which is not a pro-parte name may
not appear as both an accepted name and a synonym
in the same checklist
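
The rule above can be sketched as a check over a naive merge of the two checklists. This is an illustrative Python sketch only, not the LITCHI implementation (which, per the next slide, expressed its constraints in Prolog); the `Entry` type and `find_status_conflicts` function are hypothetical names.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Set


class Entry(NamedTuple):
    checklist: str
    full_name: str          # name plus authority
    status: str             # "accepted" or "synonym"
    pro_parte: bool = False


def find_status_conflicts(entries: List[Entry]) -> Dict[str, Set[str]]:
    """Map each full name used as both accepted name and synonym
    to the checklists it came from. Pro-parte names are exempt."""
    statuses: Dict[str, Set[str]] = defaultdict(set)
    sources: Dict[str, Set[str]] = defaultdict(set)
    for e in entries:
        if e.pro_parte:
            continue
        statuses[e.full_name].add(e.status)
        sources[e.full_name].add(e.checklist)
    return {name: sources[name]
            for name, seen in statuses.items()
            if {"accepted", "synonym"} <= seen}


checklist_1 = [
    Entry("Checklist 1", "Caragana arborescens Lam.", "accepted"),
    Entry("Checklist 1", "Caragana sibirica Medikus", "synonym"),
]
checklist_2 = [
    Entry("Checklist 2", "Caragana sibirica Medikus", "accepted"),
    Entry("Checklist 2", "Caragana arborescens Lam.", "synonym"),
]

# Each checklist is consistent alone; the violation appears on merging.
conflicts = find_status_conflicts(checklist_1 + checklist_2)
for name in sorted(conflicts):
    print(name, "conflicts across", sorted(conflicts[name]))
```

Detecting the violation is the easy part; choosing which checklist's opinion to keep is a repair decision that this sketch leaves open.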
6
Issues in LITCHI
  • Used constraints (implemented in Prolog) on what
    constituted a consistent checklist
  • Repair: standard techniques for integrity repair
    fail to generate some important possible ways of
    repairing violations
  • Our approach: to add information that allows us
    to distinguish between categories of constraint
    violation
  • Sicstus Prolog/Visual Basic/Microsoft Access
    solution only worked on some PCs (!)
  • So we reimplemented in Java

7
SPICE for Species 2000
  • Initial project funded by BBSRC/EPSRC; follow-on
    by EU
  • The SPICE for Species 2000 project aimed to
  • build a federated catalogue of scientific names
    organised by taxon (species, etc.)
  • accommodate GSD (Global Species Database)
    heterogeneity, autonomy and instability
  • ensure scalability
  • Addressed availability of data and heterogeneity
    of representation

8
SPICE internal architecture
9
Implementing SPICE
  • Techniques have included
  • Common Data Model
  • Tightly coupled federation
  • Small set of supported request types (canned
    queries)
  • CORBA and HTTP (CGI)/XML implementations
  • Issues
  • Trade-off between scalability and ease of
    deployment
  • CORBA
  • Scaled up well
  • Useful platform for CAS load balancing
  • Firewall problems; heavyweight, unfamiliar
    technique for GSD providers
  • HTTP/XML
  • Less scalable
  • Easily deployed (some providers simply had to
    modify their existing Web front-end code)
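
The combination of a common data model, per-database wrappers and a small set of canned queries might look roughly like the following. This is a sketch under stated assumptions: `TaxonRecord`, `GSDWrapper`, `Hub` and the native field names are all hypothetical, the data is invented, and the real SPICE wrappers communicated over CORBA or HTTP/XML rather than in-process calls.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class TaxonRecord:
    """Hypothetical common-data-model record every wrapper must return."""
    scientific_name: str
    status: str          # "accepted" or "synonym"
    source_gsd: str


class GSDWrapper(ABC):
    """Wraps one database so it answers the same few canned queries."""

    @abstractmethod
    def search_by_name(self, name: str) -> List[TaxonRecord]:
        """Canned query: case-insensitive substring match on the name."""


class DictBackedWrapper(GSDWrapper):
    """Stand-in for a real GSD with its own (invented) native layout."""

    def __init__(self, gsd_id: str, native_rows: List[Dict[str, str]]):
        self.gsd_id = gsd_id
        self.native_rows = native_rows

    def search_by_name(self, name: str) -> List[TaxonRecord]:
        needle = name.lower()
        # Translate native field names into the common data model.
        return [TaxonRecord(row["taxon"], row["state"], self.gsd_id)
                for row in self.native_rows
                if needle in row["taxon"].lower()]


class UnavailableWrapper(GSDWrapper):
    """Simulates an unstable database that is currently offline."""

    def search_by_name(self, name: str) -> List[TaxonRecord]:
        raise ConnectionError("database offline")


class Hub:
    """Fans one canned query out to all wrappers and merges the results."""

    def __init__(self, wrappers: Iterable[GSDWrapper]):
        self.wrappers = list(wrappers)

    def search_by_name(self, name: str) -> List[TaxonRecord]:
        results: List[TaxonRecord] = []
        for wrapper in self.wrappers:
            try:
                results.extend(wrapper.search_by_name(name))
            except ConnectionError:
                # One unavailable database must not fail the whole query.
                continue
        return results


hub = Hub([
    DictBackedWrapper("gsd_a", [
        {"taxon": "Caragana arborescens Lam.", "state": "accepted"},
        {"taxon": "Caragana sibirica Medikus", "state": "synonym"},
    ]),
    UnavailableWrapper(),
])
hits = hub.search_by_name("caragana")
for record in hits:
    print(record)
```

Restricting wrappers to a fixed query set keeps each provider's work small, at the cost of not supporting arbitrary queries against the federation.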

10
GRAB (GRid And Biodiversity)
  • 6-month DTI-funded demonstrator project
  • Project aim
  • Assess Grid's potential for collaborative
    research in biodiversity informatics
  • Supporting discovery and use of diverse
    biodiversity-related databases
  • Exploring use of Globus and SRB middleware

11
(No Transcript)
12
GRAB resource types
[Figure: GRAB resource clients, GRAB interface, and resources (Catalogue of life, SIS, Climate, SIS, ...)]
  • Catalogue of life
  • Scientific and common names
  • Species Information System (SIS)
  • Images and geography
  • Climate
  • Max/min temperature; annual precipitation

13
Issues in GRAB
  • Problems installing Globus research software
  • Essentially wanted to send distributed requests
    and receive responses
  • Initial HTTP-based prototype worked well
  • Versions of SRB then available had little to
    offer
  • Globus 2 approach needed canned queries,
    temporary files, etc.; much more difficult than
    the HTTP prototype

14
The BiodiversityWorld project
  • 3-year e-Science project funded by BBSRC
  • Aim
  • Build a Biodiversity Grid (Problem Solving
    Environment to support Biodiversity research)

15
BiodiversityWorld architecture
[Figure: user interface; presentation; workflow enactment engine; metadata repository; wrapped and native BiodiversityWorld resources; BGI API; BiodiversityWorld-GRID Interface (BGI); the GRID]

16
Issues in BDW
  • Globus 3 provides Grid Services, but still
    evolving (WSRF in Globus 4)
  • Trade-off: abstraction layer (BGI) including
    invocation mechanism
  • Insulates from change
  • Wraps resources to remove needless heterogeneity
  • Wraps the wrapped resources (!) to insulate from
    infrastructure change
  • Performance penalty
  • Assume computationally intensive applications lie
    in a single BDW resource
  • Hinders interoperation with other
    Grid/Web services
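
The abstraction-layer trade-off can be made concrete with a small sketch. It is not the actual BGI API: the `Transport`, `InProcessTransport`, `SerialisingTransport` and `BGI` names are hypothetical, and the JSON round trip merely stands in for whatever a real middleware hop would cost.

```python
import json
from typing import Any, Callable, Dict, Protocol


class Transport(Protocol):
    """Middleware-specific invocation mechanism (hypothetical)."""

    def call(self, resource: str, operation: str,
             params: Dict[str, Any]) -> Any: ...


class InProcessTransport:
    """Dispatches directly to plain Python callables."""

    def __init__(self, registry: Dict[str, Dict[str, Callable[..., Any]]]):
        self._registry = registry

    def call(self, resource: str, operation: str,
             params: Dict[str, Any]) -> Any:
        return self._registry[resource][operation](**params)


class SerialisingTransport(InProcessTransport):
    """Mimics a remote hop by passing arguments and result through JSON."""

    def call(self, resource: str, operation: str,
             params: Dict[str, Any]) -> Any:
        wire_params = json.loads(json.dumps(params))
        result = super().call(resource, operation, wire_params)
        return json.loads(json.dumps(result))


class BGI:
    """Single entry point used by workflow code; the transport is swappable."""

    def __init__(self, transport: Transport):
        self._transport = transport

    def invoke(self, resource: str, operation: str, **params: Any) -> Any:
        return self._transport.call(resource, operation, params)


# A wrapped resource exposing one operation (invented for illustration).
registry = {"climate": {"mean": lambda values: sum(values) / len(values)}}

results = []
for transport in (InProcessTransport(registry), SerialisingTransport(registry)):
    bgi = BGI(transport)
    # The calling code is identical whichever transport sits underneath.
    results.append(bgi.invoke("climate", "mean", values=[1.0, 2.0, 3.0]))
print(results)
```

The caller never sees the transport, which is what insulates it from middleware change; the extra serialisation step in the second transport is a stand-in for the performance penalty, and the generic `invoke` signature hints at why interoperation with services that expect their own interfaces becomes harder.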

17
Summary
  • We have applied
  • modern, complex commercial software
  • specialised research software, and
  • Computer Science theory
  • to address biodiversity informatics problems
  • In practice, real-world applications often
    bring limitations in such software and techniques
    to light, necessitating (e.g.) compromises,
    trade-offs, work-arounds, extensions to theory,

18
Acknowledgements
  • UK DTI, EPSRC and BBSRC; EU
  • Collaborators on grants mentioned: Universities
    of Southampton and Reading; Natural History
    Museum (London)
  • Organisations that have co-operated with these
    research projects, especially
  • Species 2000
  • ILDIS
  • FishBase
  • Hadley Centre for Climate Prediction and Research