Title: HeC Technical Review, Work Package 6 LINKING EXTERNAL KNOWLEDGE TO THE PATIENT CASE
1HeC Technical Review, Work Package 6 LINKING
EXTERNAL KNOWLEDGE TO THE PATIENT CASE
Maat Gknowledge, Jaume I University (TKBG group)
PhD Student: Ernesto Jiménez-Ruiz
- 14-16 January 2008, Geneva
2Outline
- Introduction
  - Motivation
  - Use Case
  - Objectives
- Used state-of-the-art techniques
  - Text mining
  - Ontology Fragment Extraction and Segmentation
- 3D Browser-like Tool
  - Methodology
  - Prototype
- WP6 Workflows
3Motivation
- Work complementary to HeC Toolbar
- Linking, accessing and using external resources
  - Medline
  - Public online databases (e.g. ORPHANET, GeneMap)
  - Knowledge resources (e.g. biomedical ontologies)
- Helping the clinician by integrating with her everyday environment
  - Clinicians use a web browser (IE or FF) to find papers and search public databases for information
  - Novel research on a particular topic, association of one or more clinical conditions/pathologies/malformations, etc.
4Use Case (I)
- Clinicians have created a new Patient Folder within the Hypertrophic Cardiomyopathy (HCM) study (in the HeC database).
- Clinicians want to get back the list of HCM-related HeC patients diagnosed with a rare subtype of the disease, which is of interest to them for obtaining relevant treatments/opinions.
- Clinicians want to find research articles that describe related research work.
5Use Case (II)
- Clinicians present a text-based query describing a HeC patient case.
- Clinicians want to get related patient cases and research articles in order to
  - obtain additional knowledge
  - be able to refine and resubmit the query if necessary
- Clinicians want to use ontologies to help them in
  - query refinement: specialization or generalization of query concepts
  - the representation of known links between query concepts
  - the search for a semantic type (e.g. Disease or Syndrome)
6Motivation (III)
- Objectives
  - Creation of a browser-like tool to establish a connection between
    - Patient-based (text-like) queries
    - Medical terminologies like the UMLS Metathesaurus
    - Textual resources: research articles
    - Fragments from domain ontologies
- Needed HeC-related resources
  - A collection of text-rich resources annotated with UMLS
  - A set of domain ontologies mapped to UMLS
  - Patient data annotated with UMLS
7UNDERLYING STATE-OF-THE-ART TECHNIQUES
- Text mining techniques
- Ontology Segmentation techniques
8Text Mining Techniques
- Collaboration with the Rebholz Text Mining group at the European Bioinformatics Institute, Cambridge
- EBI Infrastructure and techniques
  - Biomedical Entity Recognition
  - Flexible processing of a text input stream
  - Flexibility in the use of different lexicon dictionaries
- Contribution
  - Treatment of the UMLS Metathesaurus
  - Creation of a UMLS-based input lexicon dictionary
- Relation Discovery
  - Currently based on co-occurrences at sentence level
  - Uses abstracts coming from Medline
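As a rough illustration of dictionary-based entity recognition with a UMLS-derived lexicon, the sketch below tags the longest lexicon match at each position of a text. It is not the EBI infrastructure: the tiny lexicon, the function name `tag_entities` and the case-insensitive longest-match strategy are assumptions made for illustration. The concept identifiers and semantic types are the ones that appear in the annotated query of Step 1.

```python
import re

# Toy lexicon (assumption): surface form -> (UMLS concept id, semantic type).
LEXICON = {
    "gene mutations": ("C0596611", "Genetic Function"),
    "sarcomere": ("C0036225", "Cell Component"),
    "cardiomyopathies": ("C0878544", "Disease or Syndrome"),
    "patient relatives": ("C0080103", "Family Group"),
}

# Longest lexicon entry, measured in words.
MAX_TERM_WORDS = max(len(term.split()) for term in LEXICON)


def tag_entities(text):
    """Return (surface text, concept id, semantic type) for each lexicon hit."""
    words = re.findall(r"\w+", text)
    hits = []
    i = 0
    while i < len(words):
        # Try the longest candidate first so multi-word terms win.
        for n in range(min(MAX_TERM_WORDS, len(words) - i), 0, -1):
            candidate = " ".join(words[i:i + n])
            entry = LEXICON.get(candidate.lower())
            if entry:
                hits.append((candidate, entry[0], entry[1]))
                i += n
                break
        else:
            # No lexicon entry starts at this word; move on.
            i += 1
    return hits


query = ("Gene mutations in sarcomere involved in cardiomyopathies "
         "related to patient relatives")
for surface, cui, sem in tag_entities(query):
    print(f"{cui}\t{sem}\t{surface}")
```

Swapping in a different dictionary only means replacing `LEXICON`, which mirrors the flexibility in lexicon dictionaries mentioned above.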
9Ontology Fragment Extraction Techniques
- Ontologies provide surrounding information about the knowledge of interest
  - Galen, NCI, UMLS Semantic Network, UMLS Metathesaurus
- Domain ontologies in medicine are rather large, so we need to deal with them in a modular, fragmented way.
- OntoPath Language and Tool
  - Flexibility in the extraction of fragments (knowledge on demand)
  - Small and specific fragments can be obtained
  - It lacks an underlying formal approach
  - It does not allow approximate queries
- Locality-based modularization (collaboration with IMG from Manchester)
  - Proposes a formal approach
  - Problems dealing with big knowledge resources
  - It does not allow approximate queries
- Approximate Queries (ArHex)
  - Extraction of approximate fragments by means of tree patterns
- The idea is to use the three techniques (combined or independently) depending on the case.
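To make "knowledge on demand" concrete, the sketch below extracts a small fragment from a toy is-a hierarchy by collecting every edge reachable upwards from a set of seed concepts. This is a deliberate simplification and an assumption of this example: it is neither OntoPath, nor locality-based modularization, nor ArHex, and the hierarchy is invented rather than taken from Galen or NCI.

```python
# Toy is-a hierarchy (invented for illustration): child -> list of parents.
IS_A = {
    "HypertrophicCardiomyopathy": ["Cardiomyopathy"],
    "DilatedCardiomyopathy": ["Cardiomyopathy"],
    "Cardiomyopathy": ["HeartDisease"],
    "HeartDisease": ["Disease"],
    "BrainTumor": ["Neoplasm"],
    "Neoplasm": ["Disease"],
}


def extract_fragment(seeds, is_a):
    """Return the set of is-a edges reachable upwards from the seed concepts."""
    edges = set()
    visited = set()
    stack = list(seeds)
    while stack:
        concept = stack.pop()
        if concept in visited:
            continue
        visited.add(concept)
        for parent in is_a.get(concept, []):
            edges.add((concept, parent))
            stack.append(parent)
    return edges


fragment = extract_fragment(["HypertrophicCardiomyopathy"], IS_A)
for child, parent in sorted(fragment):
    print(f"{child} is-a {parent}")
```

The fragment keeps only the branch relevant to the seed concept and leaves out unrelated parts of the hierarchy, which is the point of working with fragments instead of the whole ontology.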
10BROWSER TOOL
11Methodology for the Browser Tool
- Step 0: Annotation/Alignment of resources with UMLS
  - Extraction and annotation of a collection of text-rich resources related to a specific HeC domain (i.e. JIA, Cardiomyopathies or Brain Tumors).
  - Selection of domain ontologies of interest (Galen, NCI and the UMLS itself) and alignment with UMLS.
  - Patient data should be annotated with UMLS.
12Methodology for the Browser Tool
- Step 1: Definition of a text-based query
  - Gene mutations in sarcomere involved in cardiomyopathies related to patient relatives
  - <ze ids="C0596611" sem="Genetic Function">Gene mutations</ze> in <ze ids="C0036225" sem="Cell Component">sarcomere</ze> involved in <ze ids="C0878544" sem="Disease or Syndrome">cardiomyopathies</ze> related to <ze ids="C0080103" sem="Family Group">patient relatives</ze>
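The annotated query marks each recognized entity with a `ze` element carrying an `ids` attribute (UMLS concept identifier) and a `sem` attribute (semantic type). A minimal sketch of how such output could be read back into structured entities follows. The regular expression and the function names are assumptions for illustration and only cover the simple, non-nested form shown in this example.

```python
import re

# The annotated query from Step 1, written out with its markup.
ANNOTATED_QUERY = (
    '<ze ids="C0596611" sem="Genetic Function">Gene mutations</ze> in '
    '<ze ids="C0036225" sem="Cell Component">sarcomere</ze> involved in '
    '<ze ids="C0878544" sem="Disease or Syndrome">cardiomyopathies</ze> '
    'related to '
    '<ze ids="C0080103" sem="Family Group">patient relatives</ze>'
)

# Matches one non-nested <ze> element (assumed attribute order: ids, sem).
ZE_TAG = re.compile(r'<ze ids="([^"]*)" sem="([^"]*)">(.*?)</ze>')


def parse_entities(annotated):
    """Return one dict per <ze> element found in the annotated text."""
    return [
        {"ids": ids, "sem": sem, "text": text}
        for ids, sem, text in ZE_TAG.findall(annotated)
    ]


def strip_tags(annotated):
    """Recover the plain query by dropping the <ze> markup."""
    return ZE_TAG.sub(r"\3", annotated)


for entity in parse_entities(ANNOTATED_QUERY):
    print(entity["ids"], entity["sem"], entity["text"], sep="\t")
print(strip_tags(ANNOTATED_QUERY))
```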
13Methodology for the Browser Tool
- Step 2: Extraction of an ontology fragment (I)
  - Knowledge represented in NCI w.r.t. Cardiomyopathy
14Methodology for the Browser Tool
- Step 2: Extraction of an ontology fragment (II)
  - Knowledge represented in Galen relating/linking sarcomere with cardiomyopathies DCM and HCM
15Methodology for the Browser Tool
- Step 3: Extraction of Medline abstracts and patient cases
  - In Step 0 a set of abstracts related to the domain was extracted.
  - The most relevant abstracts w.r.t. the query entities are selected and ranked according to a relevance measure.
  - Query entities: Patient Relatives, DCM, HCM, Myocardium, Sarcomere, Kind of Genes (UMLS semantic type)
  - For the retrieval of patient records the idea is similar: we look for occurrences of the given entities within the patient data and extract the most relevant/interesting cases.
  - Other techniques could be applied.
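The slides do not say which relevance measure is used, so the sketch below assumes a very simple one: a document scores higher the more distinct query entities it mentions, with the total number of mentions as a tie-breaker. The function names and the three short texts are placeholders invented for this example; they are not real Medline abstracts or HeC patient records.

```python
def relevance(text, entities):
    """Score = (number of distinct entities mentioned, total mentions)."""
    lowered = text.lower()
    # Plain substring counting; a real system would match annotated concepts.
    counts = [lowered.count(entity.lower()) for entity in entities]
    return (sum(1 for c in counts if c > 0), sum(counts))


def rank_documents(documents, entities):
    """Return document ids sorted from most to least relevant."""
    scored = {doc_id: relevance(text, entities)
              for doc_id, text in documents.items()}
    return sorted(scored, key=lambda doc_id: scored[doc_id], reverse=True)


# Placeholder texts (assumption), keyed by a hypothetical document id.
DOCUMENTS = {
    "doc-1": "Sarcomere mutations were studied in HCM and DCM families.",
    "doc-2": "HCM affects the myocardium.",
    "doc-3": "An unrelated study of brain tumors.",
}
QUERY_ENTITIES = ["HCM", "DCM", "myocardium", "sarcomere"]

print(rank_documents(DOCUMENTS, QUERY_ENTITIES))
```

The same scoring could be applied to patient records once they are annotated with the same entities, which is the similarity the step points out.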
16Methodology for the Browser Tool
- Step 4: Extraction of co-occurrences
  - Entities: Patient Relatives, DCM, HCM, Myocardium, Sarcomere, Kind of Genes (UMLS semantic type)
  - Gene Mutation is a rather general entity, so we look for the semantic type.
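Relation discovery is described earlier as based on co-occurrences at sentence level. A minimal sketch of that idea, under stated assumptions, counts how often two entities are mentioned in the same sentence. The naive sentence splitter, the substring matching and the sample text are all assumptions of this example, not the method used in the prototype.

```python
import re
from collections import Counter
from itertools import combinations


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def cooccurrences(text, entities):
    """Count unordered entity pairs that appear in the same sentence."""
    pairs = Counter()
    for sentence in split_sentences(text):
        lowered = sentence.lower()
        # Sort so each pair is always stored in the same order.
        present = sorted(e for e in entities if e.lower() in lowered)
        pairs.update(combinations(present, 2))
    return pairs


# Placeholder text (assumption), not a real abstract.
TEXT = ("Sarcomere defects are reported in HCM. "
        "HCM and DCM both involve the myocardium. "
        "DCM was also seen in patient relatives.")
ENTITIES = ["HCM", "DCM", "myocardium", "sarcomere", "patient relatives"]

for (first, second), count in cooccurrences(TEXT, ENTITIES).most_common():
    print(first, second, count)
```

Pairs with high counts over a large set of abstracts are the candidate links that the conceptual map can display.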
17Methodology for the Browser Tool
- Step 5: Customization of the conceptual map and query
  - We want to focus on HCM
18Prototype for the Browser Tool (I)
- Integration with HeC Gateway by means of an
AJAX-like client
19Prototype for the Browser Tool (II)
20WP6 WORKFLOW
21Workflow between WP6 efforts (I)
- Joint use with HeC Toolbar
- HeC Toolbar
  - Integration between the HeC Client application (where a patient of interest is identified) and a web browser
  - URLs of interesting resources could be saved in the system in the form of a bookmark, which associates the resource with the case, the patient, or part of the case, such as an image.
- URLs of Medline abstracts extracted with the 3D Knowledge Browser could be selected and saved by means of the HeC Toolbar.
- HeC Toolbar (Firefox): http://tamas.web.cern.ch/tamas/hectoolbar/
22Workflow between WP6 efforts (II)
- Joint use with HeC CaseReasoner
- The 3D Knowledge Browser extracts patient cases of interest
  - Co-occurrences and interesting relations can also be extracted from patient cases
  - However, the techniques used when working with structured data (e.g. clustering) are different from those used in text mining
- The tools for global similarity search (HeC CaseReasoner) analyse similarity patterns of entities within the extracted set of patients, which can hold useful information such as treatments or problems in the diagnosis.
23Questions and Feedback
- Thank you!!
- Interesting Links
  - 3D Browser draft report
    - https://www.health-e-child.org/wps/wp6/documents/3d-browser
  - EBI Work
    - Report: http://krono.act.uji.es/publications/techrep/tkbg-ebi-report
    - Software: http://www.ebi.ac.uk/Rebholz/software.html
- Contact
  - E-mail: ejimenez_at_uji.es
  - Web page
    - http://krono.act.uji.es/people/Ernesto
    - http://www3.uji.es/ejimenez