Title: AlzPharm: an RDF Use Case for Semantic Web in Neuroscience


1
AlzPharm: an RDF Use Case for Semantic Web in
Neuroscience
Yale Center for Medical Informatics (YCMI)
2
SeS2006 Workshop, Beijing, China (Sept. 3, 2006)
  • Authors
  • Hugo Y.K. Lam (CBB Ph.D. Program)
  • Kei Cheung (YCMI)
  • Luis Marenco (YCMI)
  • Perry Miller (YCMI)
  • Nian Liu (YCMI)
  • Chiquito Crasto (YCMI)
  • Tim Clark (Mass. General Hospital, Harvard
    University)
  • Yong Gao (Partners)
  • June Kinoshita (AlzForum)
  • Elizabeth Wu (AlzForum)
  • Gwen Wong (AlzForum)
  • Gordon Shepherd (Yale Neurobiology)
  • Tom Morse (Yale Neurobiology)
  • Susie Stephens (Oracle)

3
Overview
  • Most neuroscience databases are neither
    integrated nor interoperable
  • A domain ontology is insufficient for integrating
    neuroscience data spanning multiple domains
  • We present a Semantic Web approach to building an
    e-Neuroscience data integration framework, which
    involves using RDF as a standard data model to
    facilitate representation and integration of data

4
e-Neuroscience
  • Involves developing tools, technologies, and
    infrastructure to support multidisciplinary and
    collaborative science enabled by the Internet
  • Aims to address the data integration problem in
    neuroscience
  • Fits the informatics-oriented goal of the Human
    Brain Project initiated by NIH
  • Provides a better understanding of brain function
    by integrating different levels of brain data.

5
Current Issues
  • Registry
  • The keyword-based search approach suffers from
    problems of specificity and sensitivity
  • A centralized approach to registering resources
    (e.g., NDG) may not be scalable

6
Current Issues (cont'd)
  • No links between related databases

7
Semantic Web Approach
  • Representing and Integrating Data

8
Semantic Web
  • Exposes the semantics of Web-accessible data in a
    standard machine-readable way so that the data
    can be more easily interpreted and integrated by
    computer programs (or Web agents)
  • Components of Semantic Web technologies
  • Ontology
  • Ontological Languages
  • Semantic-Web-aware Tools (e.g., databases)

9
RDF Data Modeling
  • Uses the Oracle RDF Data Model (installed on a
    Linux server) to build a Semantic Web data
    warehouse for integrating datasets extracted
    from two independently developed neuroscience
    databases
  • BrainPharm (a subdatabase of SenseLab)
  • SWAN (Semantic Web Applications in Neuromedicine)

10
RDF Data Modeling
  • BrainPharm
  • A database under development to support research
    on drugs for the treatment of different
    neurological disorders
    (http://senselab.med.yale.edu/senselab/BrainPharm/alzData.asp)
  • Contains pharmacological agents that act on
    neuronal receptors and signal transduction
    pathways in the normal brain and in nervous
    disorders such as Alzheimer's Disease (AD)
  • Enables searches for drug actions at the level of
    key molecular constituents, cell compartments and
    individual cells

11
RDF Data Modeling
  • SWAN (http://swan.mindinformatics.org/)
  • A project to develop knowledge management tools
    and resources for Alzheimer's Disease (AD)
    researchers, based on an ecosystem model of
    scientific discourse
  • Uses an upper ontology whose components include
    scientists, experiments, publications,
    bibliographic databases, research
    collaborations, scientific web communities, and
    so on
  • Implemented using Semantic Web technology
  • Represents data in RDF format
  • Currently stores a subset of data obtained from
    the Alzheimer Research Forum
    (http://www.alzforum.org)

12
Data Conversion and Loading
  • The drug-related (chemical) information is
    extracted from BrainPharm
  • The SWAN hypotheses and publications are
    extracted from Alzforum
  • SWAN data are already available in RDF format
  • BrainPharm exports data in its own XML format
    called EDSP (Electronic DataSet Protocol)
  • Convert the EDSP/XML format into the
    corresponding RDF/XML format using XSL
    Transformation (XSLT)
  • Load both the SWAN and BrainPharm RDF datasets
    into the Oracle RDF Data Model using its data
    loader program, which takes RDF data in N-Triples
    format
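
The conversion step above can be illustrated with a minimal sketch. The slides use XSLT to produce RDF/XML; this sketch swaps in plain Python and writes N-Triples lines directly, only to show the shape of the mapping. The EDSP schema is not shown in the slides, so the element names (`drugs`, `drug`, `name`, `target`), the sample values, and the `http://example.org/brainpharm/` namespace are all assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical input: the real EDSP schema is not shown in the slides.
SAMPLE_XML = """
<drugs>
  <drug id="d1">
    <name>ExampleDrug</name>
    <target>ExampleTarget</target>
  </drug>
</drugs>
"""

BASE = "http://example.org/brainpharm/"  # placeholder namespace (assumption)


def escape_literal(text):
    """Escape the characters that must be escaped in an N-Triples literal."""
    return (text.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n")
                .replace("\r", "\\r"))


def xml_to_ntriples(xml_text):
    """Return one N-Triples line per child element of each <drug>."""
    root = ET.fromstring(xml_text.strip())
    lines = []
    for drug in root.findall("drug"):
        subject = f"<{BASE}drug/{drug.get('id')}>"
        for field in drug:
            predicate = f"<{BASE}{field.tag}>"
            value = escape_literal((field.text or "").strip())
            lines.append(f'{subject} {predicate} "{value}" .')
    return lines


ntriples = xml_to_ntriples(SAMPLE_XML)
print("\n".join(ntriples))
```

Each output line is a subject, predicate, and object terminated by a period, which is the line-oriented form a bulk loader can read one statement at a time.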

13
RDF Based Queries
  • Oracle has extended SQL to provide support for an
    RDF query language, which allows users to perform
    queries against multiple RDF datasets
  • The following two examples illustrate how such
    queries can be made to retrieve and integrate
    data from BrainPharm and SWAN

14
RDF Based Queries
  • Example One
  • Target
  • Query BrainPharm to group and count AD drugs
    based on their molecular targets
  • Result
  • There are 2 groups of drugs for AD.
  • The first one contains 5 drugs that act as
    acetylcholinesterase inhibitors.
  • The second one contains only 1 drug that is an
    N-methyl-D-aspartic acid (NMDA) receptor
    antagonist.
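
The group-and-count logic of Example One can be sketched over an in-memory list of triples. This is not the Oracle query from the slides; it only illustrates the aggregation. The predicate names (`treats`, `hasTarget`) and every drug and target identifier are placeholders, not BrainPharm data.

```python
from collections import Counter

# Hypothetical triples; all names are placeholders, not BrainPharm data.
TRIPLES = [
    ("drugA", "treats", "AD"),
    ("drugA", "hasTarget", "targetX"),
    ("drugB", "treats", "AD"),
    ("drugB", "hasTarget", "targetX"),
    ("drugC", "treats", "AD"),
    ("drugC", "hasTarget", "targetY"),
    ("drugD", "treats", "otherDisorder"),
    ("drugD", "hasTarget", "targetX"),
]


def count_drugs_by_target(triples, disorder):
    """Count the drugs for one disorder, grouped by molecular target."""
    # Step 1: select the drugs linked to the disorder.
    drugs = {s for s, p, o in triples if p == "treats" and o == disorder}
    # Step 2: group those drugs by target and count each group.
    return Counter(o for s, p, o in triples if p == "hasTarget" and s in drugs)


counts = count_drugs_by_target(TRIPLES, "AD")
for target, n in sorted(counts.items()):
    print(target, n)
```

The two steps correspond to a graph pattern match followed by a GROUP BY with COUNT, which is the combination the slides attribute to the Oracle RDF query language.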

15
RDF Based Queries
  • Example One
  • Query

16
RDF Based Queries
  • Example One
  • Remarks
  • Current implementations of RDF query languages
    (RQL) by specialized RDF stores do not support
    aggregate functions (e.g., COUNT, SUM and
    AVERAGE) via GROUP BY
  • The Oracle RDF query language supports such
    functions, as it is a hybrid between RQL and SQL

17
RDF Based Queries
  • Example Two
  • Target
  • Retrieves the information (stored in BrainPharm)
    about the AD drug Donepezil and publications
    (stored in SWAN) whose titles or abstracts
    contain the term Donepezil (case-insensitive)
  • Demonstrates the use of RDF inferencing based on
    the parent-child (is-a) relationship between the
    Publication class (e.g., original articles
    retrieved from PubMed) and ARFPublication class
    (e.g., PubMed articles that have been commented
    on by researchers/curators associated with Alzforum)
    as defined in the SWAN RDF Schema
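
The case-insensitive term match in Example Two can be sketched as follows. The record layout (`id`, `title`, `abstract`) and the sample text are assumptions for illustration; they are not SWAN data, and the real query runs inside Oracle rather than in Python.

```python
# Hypothetical records; titles and abstracts are placeholders, not SWAN data.
PUBLICATIONS = [
    {"id": "pub1", "title": "Effects of DONEPEZIL in a trial",
     "abstract": "Placeholder text."},
    {"id": "pub2", "title": "An unrelated study",
     "abstract": "Mentions donepezil once."},
    {"id": "pub3", "title": "Another unrelated study",
     "abstract": "No drug named here."},
]


def mentions(publication, term):
    """True if the term occurs in the title or abstract, ignoring case."""
    needle = term.casefold()
    return any(needle in publication.get(field, "").casefold()
               for field in ("title", "abstract"))


def publications_mentioning(publications, term):
    """Return the IDs of publications whose title or abstract has the term."""
    return [p["id"] for p in publications if mentions(p, term)]


matches = publications_mentioning(PUBLICATIONS, "Donepezil")
print(matches)
```

Folding both the search term and the text to a common case before comparing is what makes "DONEPEZIL" and "donepezil" match the same query.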

18
RDF Based Queries
  • Example Two
  • Query (Partial)

19
RDF Based Queries
  • Example Two
  • Result
  • With the is-a inference rule incorporated into
    the query, it finds a total of 19 publications
    that are linked to claims and/or hypotheses that
    have to do with the effect of Donepezil on AD
    treatment
  • One of these 19 publications belongs to the
    ARFPublication class (i.e., it is ARF-commented)
  • Given the ID (PubMed ID) of the commented
    publication, the user can retrieve the detailed
    comments through the Alzforum Web site

20
RDF Based Queries
  • Example Two
  • Remarks
  • In the SWAN dataset, publications (e.g., those
    retrieved from PubMed) are treated as instances
    of the Publication class
  • We define publications that have been commented
    on in Alzforum as instances of the
    ARFPublication class
  • The Oracle RDF Data Model allows us to create
    rules for hierarchical relationships from the
    RDFS for the data, which lets us retrieve all
    instances of the Publication class and of its
    subclasses (e.g., the ARF publications)
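
The is-a inference described above can be sketched as a walk up the subclass hierarchy. Only the class names Publication and ARFPublication come from the slides; the instance identifiers, the `Other` class, and the dictionary layout are assumptions, and this is not how the Oracle rulebase is implemented.

```python
# Subclass links: ARFPublication is-a Publication (per the SWAN RDF Schema).
SUBCLASS_OF = {"ARFPublication": {"Publication"}}

# Hypothetical type assertions; instance IDs are placeholders.
TYPES = [
    ("pub1", "Publication"),
    ("pub2", "ARFPublication"),
    ("item3", "Other"),
]


def ancestors(cls, subclass_of):
    """Return cls plus every class reachable through subclass links."""
    found = {cls}
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in subclass_of.get(current, ()):
            if parent not in found:  # guard against cycles
                found.add(parent)
                stack.append(parent)
    return found


def instances_of(cls, type_triples, subclass_of):
    """Return instances typed as cls or as any of its subclasses."""
    return sorted(s for s, c in type_triples
                  if cls in ancestors(c, subclass_of))


print(instances_of("Publication", TYPES, SUBCLASS_OF))
print(instances_of("ARFPublication", TYPES, SUBCLASS_OF))
```

Without the inference step, a query for Publication would miss `pub2`, because it is asserted only as an ARFPublication.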

21
[Diagram; labels: AlzPharm, AlzForum, BrainPharm, Oracle/RDF, SWAN]