Moving Beyond Ontology Libraries: Integrating and Accessing Biomedical Ontologies to Annotate Experimental Data

1
Moving Beyond Ontology Libraries: Integrating and
Accessing Biomedical Ontologies to
Annotate Experimental Data
  • Daniel L. Rubin
  • Chris J. Mungall
  • Suzanna E. Lewis
  • Monte Westerfield
  • Michael Ashburner
  • Mark A. Musen

2
The biomedical data explosion
  • Huge growth in online biomedical data sets
  • Genomics (genetic sequences, SNPs)
  • Gene expression microarrays
  • Proteomics (mass spectrometry, protein arrays)
  • Tissue arrays, IHC
  • Need for people and machines to make sense of
    massive data sets

3
Ontologies have an important role in e-science
  • Ontologies: formal and explicit declarations of
    the entities and relationships in application domains
  • Relevance: built by humans, processed by
    machines
  • Capabilities:
  • Relate disparate data
  • Enable data summarization, data mining

4
Ontology development is fragmented
  • Separate communities of biomedical researchers
    creating and maintaining ontologies
  • Different model organism databases using
    ontologies to annotate experimental data
  • Bioinformaticians creating algorithms to analyze
    these annotations
  • These activities are not unified; unification
    could allow:
  • Integration with other data
  • Cross-species analysis

5
Problems facing ontology content curation
  • Many different groups/consortia create
    ontologies; their efforts are uncoordinated
  • Many different ontologies, overlapping content
    and variable quality
  • Ontologies are not interoperable
  • Data integration efforts are laborious: barriers
    to accessing and effectively using expanding data
    repositories

6
Problems facing experimental data annotation
  • Growing number of biomedical resources annotate
    data with ontologies (GO, MGED, BioPAX)
  • Current resources confined to using single
    ontology for annotations
  • Difficult to relate different annotation
    repositories to each other

7
NIH has funded a National Center for Biomedical
Ontology
  • Mission: Advance biomedicine with tools and
    methodologies for the structured organization of
    knowledge
  • Strategy: Develop, disseminate, and support
  • Open-source ontology development and data
    annotation tools
  • Resources (OBO, OBD) enabling scientists to
    access, review, and integrate disparate knowledge
    resources

8
http://bioontology.org/
9
(No Transcript)
10
cBiO Software and Resources
  • Open Biomedical Ontologies (OBO)
  • An integrated virtual library of biomedical
    ontologies
  • Open Biomedical Data (OBD)
  • An online repository of OBO annotations on
    experimental data accessible via BioPortal
  • BioPortal: a Web-based portal
  • Allow investigators and intelligent computer
    programs to access and use OBO
  • Use OBO to annotate experimental data in OBD
  • Visualize and analyze OBD annotations

11
Methods (1): Integrating diverse ontologies
  • obo.sourceforge.net: an initial effort at
    integration; hosts bio-ontologies
  • Variety of formats: OBO-EDIT, DAG-EDIT, Protégé,
    XML
  • Must be viewed by tool that created them
  • No mappings between them; no way to
    relate/compare them

12
obo.sourceforge.net
13
Strategy: access ontologies via common
representation
  • Protégé (http://protege.stanford.edu)
  • Platform to access/manage ontologies
  • Provides plug-in architecture
  • Existing plug-ins handle most formats hosted by
    obo.sourceforge.net
  • DAG-EDIT (Gennari, 2005)
  • GO (Yeh, Karp, et al. 2003)
  • OWL (Knublauch, Fergerson et al. 2004)
  • No plug-in for OBO-EDIT; thus, we created a Protégé
    extension to read OBO-EDIT

Recently, an OBO-EDIT plug-in was created by
Gennari (2005)
14
Creating Protégé plug-in to read OBO-EDIT
  • Approach: translate OBO-EDIT representation into
    Protégé frames representation
  • Concepts in OBO-EDIT → Protégé classes
  • Relationship types in OBO-EDIT → Protégé slots
  • Operations performed using Python script
  • Validation of import: manual comparison of
    original and imported ontologies
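
The mapping on this slide (concepts to classes, relationship types to slots) can be illustrated with a short Python sketch. This is not the authors' script. It assumes the ontology is stored as OBO flat-file text with `[Term]` stanzas of `tag: value` lines, and it does not call the Protégé API: `FrameClass` and `parse_obo_terms` are hypothetical names, and the `EX:` identifiers are made-up sample data.

```python
from dataclasses import dataclass, field


@dataclass
class FrameClass:
    """Hypothetical stand-in for a frame-style class (not a Protégé type)."""
    term_id: str
    name: str = ""
    parents: list = field(default_factory=list)  # targets of is_a lines
    slots: dict = field(default_factory=dict)    # relationship type -> target ids


def parse_obo_terms(text):
    """Map each [Term] stanza to a FrameClass; other stanzas are skipped."""
    classes = {}
    current = None
    in_term = False
    for raw in text.splitlines():
        line = raw.split("!", 1)[0].strip()  # drop trailing "! comment" text
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            in_term = line == "[Term]"
            current = None
            continue
        if not in_term or ":" not in line:
            continue
        tag, value = (part.strip() for part in line.split(":", 1))
        if tag == "id":
            current = classes.setdefault(value, FrameClass(term_id=value))
        elif current is None:
            continue
        elif tag == "name":
            current.name = value
        elif tag == "is_a":
            current.parents.append(value)
        elif tag == "relationship":
            parts = value.split()
            if len(parts) >= 2:  # "<relationship type> <target id>"
                current.slots.setdefault(parts[0], []).append(parts[1])
    return classes


# Made-up sample ontology; identifiers are illustrative only.
SAMPLE = """\
format-version: 1.2

[Term]
id: EX:0000001
name: appendage

[Term]
id: EX:0000002
name: wing
is_a: EX:0000001 ! appendage
relationship: part_of EX:0000003 ! thorax

[Typedef]
id: part_of
name: part of
"""

classes = parse_obo_terms(SAMPLE)
wing = classes["EX:0000002"]
print(wing.name, wing.parents, wing.slots)
```

A real importer would also need to handle synonyms, definitions, obsolete terms, and `[Typedef]` stanzas, which this sketch ignores.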

15
Accessing OBO-EDIT in Protégé
Ontology in OBO-EDIT → Python script → Ontology in Protégé
We access OBO-EDIT ontologies by importing them
via a Python script that maps the OBO-EDIT
ontologies into Protégé ontologies.
16
Benefits of accessing OBO ontologies in Protégé
  • Unified access to all OBO ontology content via
    single API
  • Access to Protégé tools for alignment and diff
  • Access to ontology visualization tools
  • Avoid necessity to use multiple tools to access
    OBO ontologies

17
DAG-EDIT → Protégé
The PaTO ontology (originally in DAG-EDIT format
and imported using DAG-EDIT plug-in to Protégé).
18
OBO-EDIT → Protégé
The Drosophila anatomy ontology (originally in
OBO-EDIT format and imported with OBO-EDIT
extension to Protégé). The contents of both PaTO
and Drosophila ontologies are now accessible in
the same common format.
19
Methods (2): Managing/accessing annotations
  • Numerous model organism databases being developed
    (e.g., FlyBase, ZFIN, SGD)
  • Collect experimental and computed data
  • Annotate data using OBO ontologies, providing
    computable representation of anatomy, biology,
    phenotype, etc.
  • Annotations are evolving
  • Past: single terms from one ontology
  • Current/future: multiple composed terms taken
    from several ontologies

20
New types of annotations
  • New expressive annotations are composed of:
  • Ontology entities (nouns), e.g., phenotype
    ontology
  • Ontology attributes (verbs), e.g., PaTO
  • Values to which annotation is applicable
  • For example:
  • Datum annotated: FBal0145168 allele
  • Entity: atresia
  • Attribute: shape
  • Value: abnormal
  • We created prototype resource allowing users to
    browse these annotations: OBD

i.e., the FBal0145168 allele is associated with
atresia in the shape phenotype, which is abnormal
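
One way to picture such a composed annotation is as a small record type. The sketch below is an assumption for illustration, not the OBD schema: `EAVAnnotation` and its field names are hypothetical, and the attribute and value are optional because annotations may carry entities, attributes, and/or values. The sample record reuses the allele example from this slide.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EAVAnnotation:
    """Hypothetical record for one entity-attribute-value annotation."""
    datum: str                        # identifier of the annotated item, e.g. an allele
    entity: str                       # term from an entity ontology
    attribute: Optional[str] = None   # term from an attribute ontology such as PaTO
    value: Optional[str] = None       # value the attribute takes for this datum

    def describe(self):
        """Render the annotation in E=..., A=..., V=... form."""
        parts = [f"E={self.entity}"]
        if self.attribute is not None:
            parts.append(f"A={self.attribute}")
        if self.value is not None:
            parts.append(f"V={self.value}")
        return f"{self.datum}: " + ", ".join(parts)


annotation = EAVAnnotation(
    "FBal0145168", entity="atresia", attribute="shape", value="abnormal"
)
print(annotation.describe())  # FBal0145168: E=atresia, A=shape, V=abnormal
```

In practice each field would hold an ontology term identifier rather than a free-text label, so that annotations from different databases can be compared.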
21
Open Biomedical Data (OBD) (taken from FlyBase
and ZFIN data)
OBD collects annotations on experimental data
using OBO ontologies. LEFT (All Alleles): Ontology
annotations on alleles. The annotations consist of
entities, attributes, and/or values (EAV). RIGHT:
Detailed view showing all annotations on a
particular allele in the EAV format.
22
Advantages of OBD
  • Unification of annotations in disparate model
    organism databases
  • Browse and search for genes/alleles having
    particular types of attributes or values
  • More expressive queries
  • e.g., find alleles associated with lethal embryo
    (E=embryo, A=viability, V=lethal) and
    abnormal embryonic head (E=embryonic head,
    A=qualitative, V=abnormal)
  • Potential to link similar phenotypes to similar
    genes
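
A query that combines two entity-attribute-value patterns, like the lethal embryo and abnormal embryonic head example on this slide, can be sketched as a filter over annotation rows. The code is only an illustration and does not reflect how OBD is implemented: the function name `alleles_matching_all` is hypothetical, the allele identifiers are made up, and `None` is assumed to act as a wildcard.

```python
from collections import defaultdict

# (allele, entity, attribute, value) rows; the allele ids are made up.
ANNOTATIONS = [
    ("allele_1", "embryo", "viability", "lethal"),
    ("allele_1", "embryonic head", "qualitative", "abnormal"),
    ("allele_2", "embryo", "viability", "lethal"),
    ("allele_3", "embryonic head", "qualitative", "abnormal"),
]


def alleles_matching_all(annotations, patterns):
    """Return alleles with at least one annotation for every (E, A, V) pattern.

    A None inside a pattern matches anything in that position.
    """
    by_allele = defaultdict(set)
    for allele, entity, attribute, value in annotations:
        by_allele[allele].add((entity, attribute, value))

    def matches(triple, pattern):
        return all(p is None or p == t for t, p in zip(triple, pattern))

    return sorted(
        allele
        for allele, triples in by_allele.items()
        if all(any(matches(t, p) for t in triples) for p in patterns)
    )


query = [
    ("embryo", "viability", "lethal"),
    ("embryonic head", "qualitative", "abnormal"),
]
print(alleles_matching_all(ANNOTATIONS, query))  # ['allele_1']
```

Only `allele_1` carries both annotations, so it is the only hit. A query restricted to single terms from one ontology could not express this conjunction.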

23
Example: holoprosencephaly (h.p.)
  • Locus of lesion causing human h.p. was
    incompletely understood; SHH mutations can cause
    midline defects (cleft palate or h.p.)
  • Query: find genes with similar mutant phenotypes
  • Human → SHH gene
  • Zebrafish → shh and oep genes
  • Query: Zebrafish oep gene → annotations show
    nearly all defects seen in human h.p. (suggests
    oep ortholog in human may be responsible for
    human h.p.)
  • Knowledge of ZFIN oep gene was available in 1998,
    and provided candidate for cause of human h.p.;
    mutation of human oep ortholog (TDGF1) not found
    until 2002!
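
The slide does not say how "similar mutant phenotypes" are scored. As one possible illustration, the sketch below ranks genes by the Jaccard overlap of their phenotype annotation sets. Jaccard similarity is a choice made here, not a method taken from the presentation, and the gene names and phenotype labels are placeholders rather than real annotations.

```python
def jaccard(a, b):
    """Size of the intersection divided by size of the union of two sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_similar(query_gene, phenotypes):
    """Rank all other genes by phenotype overlap with the query gene."""
    query = phenotypes[query_gene]
    scores = [
        (jaccard(query, terms), gene)
        for gene, terms in phenotypes.items()
        if gene != query_gene
    ]
    # Highest score first; ties broken by gene name for a stable order.
    ordered = sorted(scores, key=lambda s: (-s[0], s[1]))
    return [(gene, round(score, 2)) for score, gene in ordered]


# Placeholder data: gene names and phenotype labels are hypothetical.
PHENOTYPES = {
    "gene_a": {"p1", "p2", "p3", "p4"},
    "gene_b": {"p1", "p2", "p3"},
    "gene_c": {"p4", "p5"},
    "gene_d": {"p6"},
}

print(rank_similar("gene_a", PHENOTYPES))
# [('gene_b', 0.75), ('gene_c', 0.2), ('gene_d', 0.0)]
```

A cross-species comparison would additionally need the phenotype terms of both organisms expressed in shared or mapped ontologies before the sets can be compared.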

24
Discussion
ONTOLOGY DEVELOPMENT
  • Bio-ontologies being developed in vertical
    communities with little or no coordination
  • Redundancy, variable quality, confusing array of
    ontologies
  • At present, obo.sourceforge.net is a catch-all
    collection of ontologies, without integration of
    actual content
  • We demonstrated utility of Protégé to integrate
    diverse ontology content
  • Can inter-relate diverse ontologies
  • Protégé provides tools for ontology alignment,
    GUI for viewing ontologies, and API for
    applications

25
Discussion
ONTOLOGY USE IN ANNOTATION
  • Increasing number of biological databases using
    ontologies to annotate data content, but they are
    not integrated
  • Difficult to perform cross-species analysis
  • OBD will unify annotations among model organism
    databases
  • OBD will support search/query with richer
    annotations (EAV), making it possible to
    describe richer phenotypes
  • Linking OBD to OBO will permit more
    biologically-relevant queries, because of access
    to all parents of terms used for annotation

26
Future work: unifying ontologies and annotations
  • BioPortal: a Web portal accessing and linking OBO
    and OBD
  • Benefit: access semantics in OBO ontologies to
    refine search/visualization of OBD annotations
  • e.g., search based on parents of annotation terms
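
Searching on the parents of annotation terms can be sketched as a walk up the ontology hierarchy. The example below is a minimal illustration under stated assumptions, not BioPortal's implementation: the parent links, annotation data, and function names (`ancestors`, `search_by_parent`) are all hypothetical.

```python
# child term -> direct parents (hypothetical is_a links)
PARENTS = {
    "embryonic head": ["head"],
    "head": ["anatomical structure"],
    "embryonic tail": ["tail"],
    "tail": ["anatomical structure"],
}

# annotated item -> terms used to annotate it (made-up data)
ANNOTATIONS = {
    "allele_1": ["embryonic head"],
    "allele_2": ["embryonic tail"],
    "allele_3": ["head"],
}


def ancestors(term, parents):
    """Return the term itself plus every term reachable via parent links."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(parents.get(current, []))
    return seen


def search_by_parent(query_term, annotations, parents):
    """Find items annotated with the query term or any of its descendants."""
    return sorted(
        datum
        for datum, terms in annotations.items()
        if any(query_term in ancestors(term, parents) for term in terms)
    )


print(search_by_parent("head", ANNOTATIONS, PARENTS))
# ['allele_1', 'allele_3']
print(search_by_parent("anatomical structure", ANNOTATIONS, PARENTS))
# ['allele_1', 'allele_2', 'allele_3']
```

The search for "head" also returns the item annotated with the more specific "embryonic head", which a plain string match on annotation terms would miss.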

27
BioPortal
28
OBD
29
Acknowledgements
  • National Center for Biomedical Ontology
  • Executive Team: Mark Musen, Suzanna Lewis, Daniel
    Rubin, Sima Misra
  • cBiO staff: Natasha Noy, Ray Fergerson, Lynn
    Murphy, Archana Verbakam, Chris Mungall, Harold
    Solbrig
  • Collaborators: Michael Ashburner, Monte
    Westerfield, Ida Sim, Chris Chute, Barry Smith,
    Peggy Storey, Richard Olshen, Werner Ceusters,
    Deborah McGuinness
  • Students and postdocs: Kaustubh Supekar, Nigam
    Shah, Fabian Neuhaus
  • Funded through NIH Roadmap for Medical Research
    grant U54 HG004028
  • Program officer: Peter Good (NIGMS)

30
Thank you.
Contact information (Center): feedback_at_cbio.us
31
(No Transcript)
32
Planning the Center: Structure of the grant
33
cBiO Resources
  • Software resources
  • Scientific investigation
  • Community outreach
34
cBiO Scientific Team
  • Computer Science (Musen, Stanford)
  • Ontology management/alignment/diff
  • Ontology integration
  • Terminology access/query (Chute, Mayo)
  • Ontology visualization/browsing/search (Storey,
    UVIC)
  • Bioinformatics (Lewis, Berkeley)
  • Data/image Annotation tools
  • Annotation databases
  • Driving Biological Projects
  • FlyBase (Ashburner, Cambridge)
  • ZFIN (Westerfield, Oregon)
  • HIV (Sim, UCSF)
  • Education/Dissemination
  • Educational workshops (Smith, University at
    Buffalo)
  • Ontology development workshops