Semantic Web for Health Care and Biomedical Informatics - PowerPoint PPT Presentation

About This Presentation
Title:

Semantic Web for Health Care and Biomedical Informatics

Description:

Keynote at NSF Biomed Web Workshop, December 4-5, 2007. Amit P. Sheth, amit.sheth_at_wright.edu. Thanks: Pablo Mendes, Satya Sahoo and Kno.e.sis team; Collaborators at ...

Slides: 55
Transcript and Presenter's Notes



1
Semantic Web for Health Care and Biomedical
Informatics
  • Keynote at NSF Biomed Web Workshop, December 4-5, 2007
  • Amit P. Sheth
  • amit.sheth_at_wright.edu
  • Thanks: Pablo Mendes, Satya Sahoo and Kno.e.sis
    team
  • Collaborators at Athens Heart Center (Dr.
    Agrawal), NLM (Olivier Bodenreider), CCRC, UGA
    (Will York), CCHMC (Bruce Aronow)

2
Outline
  • Semantic Web: very brief intro
  • Scenarios to demonstrate the applications and
    benefits of semantic web technologies
    • Health care
    • Biomedical Research

3
Biomedical Informatics... needs a connection
Semantic Web research aims at providing this
connection!
More advanced capabilities for search,
integration, analysis, linking to new
insights and discoveries!
[Figure labels: Medical Informatics; Bioinformatics; Etiology, Pathogenesis, Clinical findings, Diagnosis, Prognosis, Treatment; Genome, Transcriptome, Proteome, Metabolome, Physiome, ...ome; Hypothesis Validation, Experiment design, Predictions, Personalized medicine; Genbank, Uniprot]
4
Evolution of the Web
Web as an oracle / assistant / partner: "ask
the Web", using semantics to leverage
text, data, services, people
[Figure: timeline from 1997 to 2007]
5
Semantic Web Enablers and Techniques
  • Ontology: Agreement with Common Vocabulary and
    Domain Knowledge; Schema and Knowledge base
  • Semantic Annotation (metadata Extraction):
    Manual, Semi-automatic (automatic with human
    verification), Automatic
  • Reasoning/computation: semantics-enabled search,
    integration, complex queries, analysis (paths,
    subgraph), pattern finding, mining, hypothesis
    validation, discovery, visualization

6
Maturing capabilities and ongoing research
  • Text mining: Entity recognition, Relationship
    extraction
  • Integrating text, experimental data, curated and
    multimedia data
  • Clinical and Scientific Workflows with semantic
    web services
  • Hypothesis driven retrieval of scientific
    literature, Undiscovered public knowledge

7
Metadata and Ontology: Primary Semantic Web
enablers
8
Characteristics of Semantic Web
Self Describing
Easy to Understand
The Semantic Web: XML, RDF and Ontology
Machine and Human Readable
Issued by a Trusted Authority
Can be Secured
Convertible
Adapted from William Ruh (CISCO)
9
Open Biomedical Ontologies
Many ontologies exist
Open Biomedical Ontologies, http://obo.sourceforge.net/
10
Drug Ontology Hierarchy (showing is-a
relationships)
interaction_with_non_drug_reactant
11
N-Glycosylation metabolic pathway
GNT-I attaches GlcNAc at position 2
12
Opportunity: exploiting clinical and biomedical
data
binary
Health Information Services: Elsevier iConsult
Scientific Literature: PubMed, 300 Documents Published Online each day
User-contributed Content (Informal): GeneRifs
NCBI Public Datasets: Genome, Protein DBs, new sequences daily
Laboratory Data: Lab tests, RTPCR, Mass spec
Clinical Data: Personal health history
Search, browsing, complex query, integration,
workflow, analysis, hypothesis validation,
decision support.
13
Scenario 1
  • Status: In use today
  • Where: Athens Heart Center
  • What: Use of semantic Web technologies for
    clinical decision support

14
Operational since January 2006
15
Active Semantic Electronic Medical Records (ASEMR)
  • Goals
    • Increase efficiency with decision support
      • formulary, billing, reimbursement
      • real time chart completion
      • automated linking with billing
    • Reduce Errors, Improve Patient Satisfaction and
      Reporting
      • drug interactions, allergy, insurance
    • Improve Profitability
  • Technologies
    • Ontologies, semantic annotations and rules
    • Service Oriented Architecture

Thanks: Dr. Agrawal, Dr. Wingate, and others.
ISWC2006 paper
16
  • Demonstration

17
ASEMR Efficiency
18
Scenario 2
  • Status: Demonstration
  • Where: W3C Health Care and Life Sciences (HCLS)
    interest group
  • What: Using semantic web to aggregate and query
    data about Alzheimer's
  • http://www.w3.org/2001/sw/hcls/

19
Scenario 2: Scientific Data Sets for Alzheimer's
20
SPARQL Query spanning multiple sources
21
Scenario 3
  • Status: Completed research
  • Where: NIH
  • What: Understanding the genetic basis of nicotine
    dependence. Integrate gene and pathway
    information and show how three complex biological
    queries can be answered by the integrated
    knowledge base.
  • How: Semantic Web technologies (especially RDF,
    OWL, and SPARQL) support information integration
    and make it easy to create semantic mashups
    (semantically integrated resources).

22
Motivation
  • NIDA study on nicotine dependency
  • List of candidate genes in humans
  • Analysis objectives include
    • Find interactions between genes
    • Identification of active genes: maximum number
      of pathways
    • Identification of genes based on anatomical
      locations
  • Requires integration of genome and biological
    pathway information

23
Genome and pathway information integration
[Figure: Entrez Gene linked to the pathway resources KEGG, Reactome and HumanCyc (each via pathway, protein, pmid), to GeneOntology (via GO ID) and to HomoloGene (via HomoloGene ID)]

24
JBI
25
Entrez Knowledge Model (EKoM)
BioPAX ontology
26
Deductive Reasoning: Protein-Protein Interaction
RULE: given that two genes interact with each
other, and given that a certain number of parameters
are met, we can assert that the gene products also
interact with each other
IF (x have_common_pathway y) AND (x rdf:type
gene) AND (y rdf:type gene) AND (x has_product
m) AND (y has_product n) AND (m rdf:type
gene_product) AND (n rdf:type gene_product) THEN
(m ? n)
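The IF/THEN rule above can be read as a single forward-chaining step over a set of triples. The following is a minimal sketch under stated assumptions, not the authors' implementation: the gene and product names are made up, and the predicate name `interacts_with` is a hypothetical stand-in for the conclusion of the rule, whose symbol did not survive in the slide text.

```python
from itertools import product

# Hypothetical toy triples; the gene and product names are illustrative only.
TRIPLES = {
    ("geneA", "rdf:type", "gene"),
    ("geneB", "rdf:type", "gene"),
    ("geneC", "rdf:type", "gene"),
    ("geneA", "have_common_pathway", "geneB"),
    ("geneA", "has_product", "protA"),
    ("geneB", "has_product", "protB"),
    ("geneC", "has_product", "protC"),
    ("protA", "rdf:type", "gene_product"),
    ("protB", "rdf:type", "gene_product"),
    ("protC", "rdf:type", "gene_product"),
}


def infer_product_interactions(triples, predicate="interacts_with"):
    """Apply the rule once and return the newly inferred triples.

    `predicate` is an assumed name for the inferred relationship.
    """

    def objects(subject, pred):
        return {o for (s, p, o) in triples if s == subject and p == pred}

    def is_a(subject, cls):
        return (subject, "rdf:type", cls) in triples

    inferred = set()
    for (x, pred, y) in triples:
        # IF (x have_common_pathway y) AND both x and y are genes ...
        if pred != "have_common_pathway":
            continue
        if not (is_a(x, "gene") and is_a(y, "gene")):
            continue
        # ... AND (x has_product m) AND (y has_product n), both gene products
        for m, n in product(objects(x, "has_product"), objects(y, "has_product")):
            if is_a(m, "gene_product") and is_a(n, "gene_product"):
                # THEN assert the relationship between the two products.
                inferred.add((m, predicate, n))
    return inferred


if __name__ == "__main__":
    print(infer_product_interactions(TRIPLES))
```

In this toy data only geneA and geneB share a pathway, so the only inferred triple relates protA and protB; geneC's product is left untouched.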
27
Scenario 4
  • Status: Completed research
  • Where: NIH
  • What: queries across integrated data sources
  • Enriching data with ontologies for integration,
    querying, and automation
  • Ontologies beyond vocabularies: the power of
    relationships

28
Use data to test hypothesis
Link between glycosyltransferase activity and
congenital muscular dystrophy?
Glycosyltransferase
Congenital muscular dystrophy
Adapted from Olivier Bodenreider, presentation
at HCLS Workshop, WWW07
29
In a Web pages world
Adapted from Olivier Bodenreider, presentation
at HCLS Workshop, WWW07
30
With the semantically enhanced data
SELECT DISTINCT ?t ?g ?d
{
  ?t is_a GO:0016757 .
  ?g has_molecular_function ?t .
  ?g has_associated_phenotype ?b2 .
  ?b2 has_textual_description ?d .
  FILTER regex(?d, "muscular distrophy", "i") .
  FILTER regex(?d, "congenital", "i")
}
From medinfo paper. Adapted from Olivier
Bodenreider, presentation at HCLS Workshop, WWW07
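To illustrate how such a query is evaluated, the sketch below matches the same four triple patterns against a tiny in-memory graph and then applies the two case-insensitive text filters. It is an assumption-laden toy, not the system used in the cited work: the triples, the term name `GO:toy_child`, and the gene and phenotype identifiers are invented, and the filter accepts both spellings "dystrophy" and "distrophy".

```python
import re

# Hypothetical toy graph; only GO:0016757 comes from the slide.
TRIPLES = [
    ("GO:toy_child", "is_a", "GO:0016757"),
    ("gene_1", "has_molecular_function", "GO:toy_child"),
    ("gene_1", "has_associated_phenotype", "phenotype_1"),
    ("phenotype_1", "has_textual_description",
     "Congenital muscular dystrophy (toy description)"),
    ("gene_2", "has_molecular_function", "GO:toy_child"),
    ("gene_2", "has_associated_phenotype", "phenotype_2"),
    ("phenotype_2", "has_textual_description", "Unrelated phenotype"),
]

# The four triple patterns of the query; names starting with "?" are variables.
PATTERNS = [
    ("?t", "is_a", "GO:0016757"),
    ("?g", "has_molecular_function", "?t"),
    ("?g", "has_associated_phenotype", "?b2"),
    ("?b2", "has_textual_description", "?d"),
]


def match(patterns, triples, binding=None):
    """Yield every variable binding that satisfies all triple patterns."""
    binding = binding or {}
    if not patterns:
        yield binding
        return
    first, rest = patterns[0], patterns[1:]
    for triple in triples:
        candidate = dict(binding)
        ok = True
        for term, value in zip(first, triple):
            if term.startswith("?"):
                # Bind the variable, or check it against its earlier binding.
                if candidate.setdefault(term, value) != value:
                    ok = False
                    break
            elif term != value:
                ok = False
                break
        if ok:
            yield from match(rest, triples, candidate)


def run_query(triples):
    """Return distinct (?t, ?g, ?d) rows that pass both text filters."""
    rows = set()
    for b in match(PATTERNS, triples):
        d = b["?d"]
        if (re.search(r"muscular d[iy]strophy", d, re.IGNORECASE)
                and re.search(r"congenital", d, re.IGNORECASE)):
            rows.add((b["?t"], b["?g"], d))
    return sorted(rows)


if __name__ == "__main__":
    for row in run_query(TRIPLES):
        print(row)
```

Only gene_1 survives, because gene_2's phenotype description fails both filters.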
31
Scenario 5
  • Status: Research prototype and in progress
    • Workflow with Semantic Annotation of Experimental
      Data already in use
  • Where: UGA
  • What:
    • Knowledge driven query formulation
    • Semantic Problem Solving Environment (PSE) for
      Trypanosoma cruzi (Chagas Disease)

32
Knowledge driven query formulation
  • Complex queries can also include
    • on-the-fly Web services execution to retrieve
      additional data
    • inference rules to make implicit knowledge
      explicit

33
T. cruzi PSE Query Interface
34
N-Glycosylation Process (NGP)
35
Semantic Web Process to incorporate provenance
Semantic Annotation Applications
36
ProPreO: Ontology-mediated provenance
Mass Spectrometry (MS) Data: ms/ms peaklist data
parent ion m/z, parent ion abundance, parent ion charge:
830.9570 194.9604 2
fragment ion m/z, fragment ion abundance:
580.2985 0.3592
688.3214 0.2526
779.4759 38.4939
784.3607 21.7736
1543.7476 1.3822
1544.7595 2.9977
1562.8113 37.4790
1660.7776 476.5043
37
ProPreO: Ontology-mediated provenance
<ms-ms_peak_list>
  <parameter instrument="micromass_QTOF_2_quadropole_time_of_flight_mass_spectrometer" mode="ms-ms"/>
  <parent_ion m-z="830.9570" abundance="194.9604" z="2"/>
  <fragment_ion m-z="580.2985" abundance="0.3592"/>
  <fragment_ion m-z="688.3214" abundance="0.2526"/>
  <fragment_ion m-z="779.4759" abundance="38.4939"/>
  <fragment_ion m-z="784.3607" abundance="21.7736"/>
  <fragment_ion m-z="1543.7476" abundance="1.3822"/>
  <fragment_ion m-z="1544.7595" abundance="2.9977"/>
  <fragment_ion m-z="1562.8113" abundance="37.4790"/>
  <fragment_ion m-z="1660.7776" abundance="476.5043"/>
</ms-ms_peak_list>
Ontological Concepts
Semantically Annotated MS Data
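The step from the raw peak list to the annotated form can be sketched with Python's standard `xml.etree.ElementTree`. This is only an illustration of the mapping shown on the two slides, assuming the first row holds the parent ion (m/z, abundance, charge) and each later row holds one fragment ion (m/z, abundance); the function name `annotate_peak_list` is hypothetical and this is not the ProPreO tooling itself.

```python
import xml.etree.ElementTree as ET

# Raw ms/ms peak list values as shown on the slide.
RAW_PEAK_LIST = """\
830.9570 194.9604 2
580.2985 0.3592
688.3214 0.2526
779.4759 38.4939
784.3607 21.7736
1543.7476 1.3822
1544.7595 2.9977
1562.8113 37.4790
1660.7776 476.5043
"""


def annotate_peak_list(raw, instrument, mode):
    """Wrap a raw peak list in the element and attribute names from the slide."""
    rows = [line.split() for line in raw.strip().splitlines()]
    parent, fragments = rows[0], rows[1:]

    root = ET.Element("ms-ms_peak_list")
    ET.SubElement(root, "parameter", {"instrument": instrument, "mode": mode})
    # First row: parent ion m/z, abundance and charge.
    ET.SubElement(root, "parent_ion",
                  {"m-z": parent[0], "abundance": parent[1], "z": parent[2]})
    # Remaining rows: one fragment ion each (m/z, abundance).
    for mz, abundance in fragments:
        ET.SubElement(root, "fragment_ion", {"m-z": mz, "abundance": abundance})
    return root


if __name__ == "__main__":
    tree = annotate_peak_list(
        RAW_PEAK_LIST,
        instrument="micromass_QTOF_2_quadropole_time_of_flight_mass_spectrometer",
        mode="ms-ms",
    )
    print(ET.tostring(tree, encoding="unicode"))
```

Values are kept as strings so the original precision of the measurements is preserved in the output.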
38
Scenario 6
  • When: Research in progress
  • Where: Athens Heart Center and Cincinnati
    Children's Hospital Medical Center
  • What: scientific literature mining
    • Dealing with unstructured information
    • Extracting knowledge from text
    • Complex entity recognition
    • Relationship extraction

39
Heart Failure Clinical Pathway
Ontology
A Framework for Schema-Driven
Relationship Discovery from Unstructured Text,
Ramakrishnan et al., ISWC 2006, LNCS 4273, pp.
583-596
40
Contextual delivery of information
41
  • Two technical challenges
    • Text mining
    • Workflow adaptation

42
Extracting the Relationship
Diabetes mellitus adversely affects the outcomes
in patients with myocardial infarction (MI), due
in part to the exacerbation of left ventricular
(LV) remodeling. Although angiotensin II type 1
receptor blocker (ARB) has been demonstrated to
be effective in the treatment of heart failure,
information about the potential benefits of ARB
on advanced LV failure associated with diabetes
is lacking. To induce diabetes, male mice were
injected intraperitoneally with streptozotocin
(200 mg/kg). At 2 weeks, anterior MI was created
by ligating the left coronary artery. These
animals received treatment with olmesartan (0.1
mg/kg/day; n = 50) or vehicle (n = 51) for 4
weeks. Diabetes worsened the survival and
exaggerated echocardiographic LV dilatation and
dysfunction in MI. Treatment of diabetic MI mice
with olmesartan significantly improved the
survival rate (42% versus 27%, P < 0.05) without
affecting blood glucose, arterial blood pressure,
or infarct size. It also attenuated LV
dysfunction in diabetic MI. Likewise, olmesartan
attenuated myocyte hypertrophy, interstitial
fibrosis, and the number of apoptotic cells in
the noninfarcted LV from diabetic MI. Post-MI LV
remodeling and failure in diabetes were
ameliorated by ARB, providing further evidence
that angiotensin II plays a pivotal role in the
exacerbated heart failure after diabetic MI.
Angiotensin II type 1 receptor blocker attenuates
exacerbated left ventricular remodeling and
failure in diabetes-associated myocardial
infarction. Matsusaka H, et al.
43
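Problem: Extracting relationships between MeSH
terms from PubMed
[Figure: UMLS Semantic Network with the classes Biologically active substance, Lipid, and Disease or Syndrome, connected by the relationships complicates, affects, and causes. The MeSH terms Fish Oils and Raynaud's Disease attach to the network through instance_of links; the relationship between them (???????) is what must be extracted from PubMed.]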
44
Background knowledge used
  • UMLS: A high level schema of the biomedical
    domain
    • 136 classes and 49 relationships
    • Synonyms of all relationships using variant
      lookup (tools from NLM)
    • 49 relationships + their synonyms = 350 mostly
      verbs
  • MeSH
    • 22,000 topics organized as a forest of 16 trees
    • Used to query PubMed
  • PubMed
    • Over 16 million abstracts
    • Abstracts annotated with one or more MeSH terms

T147: effect, induce, etiology, cause, effecting, induced
45
Method: Parse Sentences in PubMed
SS-Tagger (University of Tokyo)
SS-Parser (University of Tokyo)
  • Entities (MeSH terms) in sentences occur in
    modified forms
    • "adenomatous" modifies "hyperplasia"
    • "An excessive endogenous or exogenous
      stimulation" modifies "estrogen"
  • Entities can also occur as composites of 2 or
    more other entities
    • "adenomatous hyperplasia" and "endometrium"
      occur as "adenomatous hyperplasia of the
      endometrium"

(TOP (S (NP (NP (DT An) (JJ excessive) (ADJP (JJ
endogenous) (CC or) (JJ exogenous) ) (NN
stimulation) ) (PP (IN by) (NP (NN estrogen) ) )
) (VP (VBZ induces) (NP (NP (JJ adenomatous) (NN
hyperplasia) ) (PP (IN of) (NP (DT the) (NN
endometrium) ) ) ) ) ) )
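A bracketed parse like the one above is straightforward to load into a tree and traverse. The sketch below is a simplified illustration, not the method of the cited paper: it reads the bracket notation, then takes the first NP under S as the subject phrase, the verb under VP as the candidate relationship, and the NP under VP as the object phrase. The function names are hypothetical.

```python
import re

# The parse of the example sentence, copied from the slide.
PARSE = (
    "(TOP (S (NP (NP (DT An) (JJ excessive) (ADJP (JJ endogenous) (CC or) "
    "(JJ exogenous) ) (NN stimulation) ) (PP (IN by) (NP (NN estrogen) ) ) ) "
    "(VP (VBZ induces) (NP (NP (JJ adenomatous) (NN hyperplasia) ) "
    "(PP (IN of) (NP (DT the) (NN endometrium) ) ) ) ) ) )"
)


def parse_tree(text):
    """Turn bracket notation into nested (label, children) tuples."""
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def read():
        nonlocal pos
        assert tokens[pos] == "(", "expected an opening bracket"
        pos += 1
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read())
            else:
                children.append(tokens[pos])  # a word (leaf)
                pos += 1
        pos += 1  # consume the closing bracket
        return (label, children)

    return read()


def leaves(node):
    """Return the words under a node, left to right."""
    words = []
    for child in node[1]:
        words.extend(leaves(child) if isinstance(child, tuple) else [child])
    return words


def find(node, label):
    """Yield every subtree with the given label, top-down, left to right."""
    if node[0] == label:
        yield node
    for child in node[1]:
        if isinstance(child, tuple):
            yield from find(child, label)


def subject_verb_object(tree):
    """Extract (subject phrase, verb, object phrase) from a simple S -> NP VP parse."""
    sentence = next(find(tree, "S"))
    subtrees = [c for c in sentence[1] if isinstance(c, tuple)]
    subject = next(c for c in subtrees if c[0] == "NP")
    vp = next(c for c in subtrees if c[0] == "VP")
    vp_children = [c for c in vp[1] if isinstance(c, tuple)]
    verb = next(c for c in vp_children if c[0].startswith("VB"))
    obj = next(c for c in vp_children if c[0] == "NP")
    return " ".join(leaves(subject)), " ".join(leaves(verb)), " ".join(leaves(obj))


if __name__ == "__main__":
    print(subject_verb_object(parse_tree(PARSE)))
```

Mapping the extracted phrases to MeSH terms and the verb to a UMLS relationship would be a separate lookup step that this sketch does not attempt.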
46
Method: Identify entities and Relationships in
Parse Tree
[Figure: parse tree of the example sentence, with the legend Modifiers, Modified entities, Composite Entities. Annotations: VBZ induces (UMLS ID T147); NN estrogen (MeSHID D004967); NN hyperplasia (MeSHID D006965); NN endometrium (MeSHID D004717)]
47
  • What can we do with the extracted knowledge?
  • Semantic browser demo

48
Evaluating hypotheses
Keyword query: Migraine[MH] Magnesium[MH]
PubMed
49
Workflow Adaptation: Why and How
  • Volatile nature of execution environments
    • May have an impact on multiple activities/tasks
      in the workflow
  • HF Pathway
    • New information about diseases, drugs becomes
      available
    • Affects treatment plans, drug-drug interactions
  • Need to incorporate the new knowledge into
    execution
    • capture the constraints and relationships between
      different tasks/activities

50
Workflow Adaptation: Why?
51
Workflow Adaptation: How
  • Decision theoretic approaches
    • Markov Decision Processes
  • Given the state S of the workflow when an event E
    occurs
    • What is the optimal path to a goal state G?
  • Greedy approaches rely on local optimization
    • Need to choose actions based on optimality across
      the entire horizon, not just the current best
      action
  • Model the horizon and use MDP to find the best
    path to a goal state
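The contrast between a greedy choice and an MDP-based choice can be made concrete with a small value-iteration sketch. Everything in it is an assumption for illustration: the states, actions, probabilities and costs are invented, costs are undiscounted, and the goal state G is absorbing with zero cost. It is not the adaptation mechanism used in the work described here.

```python
# Hypothetical workflow MDP.
# TRANSITIONS[state][action] = list of (probability, next_state, cost)
TRANSITIONS = {
    "S": {
        "cheap_step": [(1.0, "A", 1.0)],   # looks best locally
        "costly_step": [(1.0, "B", 2.0)],  # better over the whole horizon
    },
    "A": {"finish": [(1.0, "G", 10.0)]},
    "B": {"finish": [(0.9, "G", 1.0), (0.1, "B", 1.0)]},  # may need a retry
    "G": {},  # goal state: no further actions
}


def expected_cost(outcomes, values):
    """Expected immediate cost plus expected cost-to-go of one action."""
    return sum(p * (cost + values[nxt]) for p, nxt, cost in outcomes)


def value_iteration(transitions, tol=1e-9, max_iter=10_000):
    """Return (values, policy) minimizing expected total cost to the goal."""
    values = {state: 0.0 for state in transitions}
    for _ in range(max_iter):
        delta = 0.0
        for state, actions in transitions.items():
            if not actions:
                continue  # goal state keeps value 0
            best = min(expected_cost(o, values) for o in actions.values())
            delta = max(delta, abs(best - values[state]))
            values[state] = best
        if delta < tol:
            break
    policy = {
        state: min(actions, key=lambda a: expected_cost(actions[a], values))
        for state, actions in transitions.items()
        if actions
    }
    return values, policy


def greedy_action(transitions, state):
    """Pick the action with the lowest immediate expected cost only."""
    actions = transitions[state]
    return min(actions, key=lambda a: sum(p * c for p, _, c in actions[a]))


if __name__ == "__main__":
    values, policy = value_iteration(TRANSITIONS)
    print("greedy choice at S:", greedy_action(TRANSITIONS, "S"))
    print("MDP choice at S:   ", policy["S"])
    print("expected cost from S:", round(values["S"], 3))
```

In this toy model the greedy rule takes the cheaper first step and ends up on the expensive branch, while value iteration accepts a higher immediate cost because the total expected cost to reach G is lower.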

52
Conclusion
  • Semantic web technologies can help with
    • Fusion of data: semi-structured, structured,
      experimental, literature, multimedia
    • Analysis and mining of data, extraction,
      annotation, capture provenance of data through
      annotation, workflows with SWS
    • Querying of data at different levels of
      granularity, complex queries, knowledge-driven
      query interface
    • Perform inference across data sets

53
Take home points
  • Shift of paradigm from browsing to querying
  • Machine understanding
  • extracting knowledge from text
  • Inference, software interoperation
  • Semantic-enabled interfaces towards hypothesis
    validation

54
References
  • A. Sheth, S. Agrawal, J. Lathem, N. Oldham, H.
    Wingate, P. Yadav, and K. Gallagher, "Active
    Semantic Electronic Medical Record", Intl Semantic
    Web Conference, 2006.
  • Satya Sahoo, Olivier Bodenreider, Kelly Zeng, and
    Amit Sheth, "An Experiment in Integrating Large
    Biomedical Knowledge Resources with RDF:
    Application to Associating Genotype and Phenotype
    Information", WWW2007 HCLS Workshop, May 2007.
  • Satya S. Sahoo, Kelly Zeng, Olivier Bodenreider,
    and Amit Sheth, "From 'Glycosyltransferase' to
    'Congenital Muscular Dystrophy': Integrating
    Knowledge from NCBI Entrez Gene and the Gene
    Ontology", Amsterdam: IOS, August 2007, PMID
    17911917, pp. 1260-4.
  • Satya S. Sahoo, Olivier Bodenreider, Joni L.
    Rutter, Karen J. Skinner, and Amit P. Sheth, "An
    ontology-driven semantic mash-up of gene and
    biological pathway information: Application to
    the domain of nicotine dependence", submitted,
    2007.
  • Cartic Ramakrishnan, Krzysztof J. Kochut, and
    Amit Sheth, "A Framework for Schema-Driven
    Relationship Discovery from Unstructured Text",
    Intl Semantic Web Conference, 2006, pp. 583-596.
  • Satya S. Sahoo, Christopher Thomas, Amit Sheth,
    William S. York, and Samir Tartir, "Knowledge
    Modeling and Its Application in Life Sciences: A
    Tale of Two Ontologies", 15th International World
    Wide Web Conference (WWW2006), Edinburgh,
    Scotland, May 23-26, 2006.
  • Demos at http://knoesis.wright.edu/library/demos/