Ontologies, Clinical and Genomic Information: How to say what we mean and mean what we say. Opportunities & Pitfalls

Slides: 40
Provided by: DrJerem5
Transcript and Presenter's Notes

1
Ontologies, Clinical and Genomic Information: How
to say what we mean and mean what we
say. Opportunities & Pitfalls
  • Alan Rector, Jeremy Rogers, Chris Wroe
  • Information Management Group / Bio Health
    Informatics Group, Department of Computer Science,
    University of Manchester
  • rector_at_cs.man.ac.uk
  • www.clinical-escience.org, www.co-ode.org,
    www.opengalen.org, protege.stanford.org

2
What Is An Ontology?
  • Ontology (Socrates & Aristotle, 400-360 BC)
  • The study of being
  • Word borrowed by computing for the explicit
    description of the conceptualisation of a domain
  • concepts (entities)
  • properties and attributes of concepts
  • constraints on properties and attributes
  • Individuals (often, but not always)
  • An ontology defines
  • a common vocabulary
  • a shared understanding
  • a classification
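The ingredients listed on this slide (concepts, properties, constraints on properties, individuals) can be pictured as a small data structure. What follows is only a minimal sketch: the class names, the `encodes` property and the tiny vocabulary are hypothetical illustrations, not part of the talk or of any real ontology.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    name: str
    parents: list = field(default_factory=list)     # kind-of links: the classification
    properties: dict = field(default_factory=dict)  # property name -> concept its values must belong to


@dataclass
class Individual:
    name: str
    concept: str  # the concept this individual is an instance of


# Hypothetical miniature vocabulary, for illustration only.
CONCEPTS = {c.name: c for c in [
    Concept("Substance"),
    Concept("Protein", parents=["Substance"]),
    Concept("Insulin", parents=["Protein"]),
    Concept("Gene", properties={"encodes": "Protein"}),
]}


def is_kind_of(name, ancestor):
    """True if `name` is `ancestor` or lies below it in the classification."""
    if name == ancestor:
        return True
    return any(is_kind_of(p, ancestor) for p in CONCEPTS[name].parents)


def satisfies_constraint(subject, prop, value):
    """True if `subject` declares `prop` and `value` is a kind of the allowed concept."""
    allowed = CONCEPTS[subject].properties.get(prop)
    return allowed is not None and is_kind_of(value, allowed)


# An individual (often, but not always, part of an ontology).
sample = Individual("sample_001", "Insulin")
```

Here `satisfies_constraint("Gene", "encodes", "Insulin")` holds because Insulin is a kind of Protein, while the same check with `"Substance"` fails because the constraint asks for something at least as specific as Protein.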

3
Sharing info ? Sharing meaning
  • Metadata
  • Data describing the content and meaning of
    resources and services.
  • But everyone must speak the same language
  • Terminologies
  • Shared and common vocabularies
  • For search engines, agents, curators, authors and
    users
  • But everyone must mean the same thing
  • Ontologies
  • Shared and common understanding of a domain
  • Essential for search, exchange and discovery

4
Measure the world: quantitative models (not
ontologies)
  • Quantitative
  • Numerical data
  • 2mm, 2.4V, between 4 and 5 feet
  • Unambiguous tokens
  • Main problem is accuracy at initial capture
  • Numerical analysis (e.g. statistics) well
    understood
  • Examples
  • How big is this breast lump?
  • What is the average age of patients with cancer?
  • How much time elapsed between original referral
    and first appointment at the hospital?

5
Describe the world: ontologies
  • Qualitative
  • Descriptive data
  • Cold, colder, blueish, not pink, drunk
  • Ambiguous tokens
  • What's wrong with being drunk?
  • Ask a glass of water.
  • Accuracy poorly defined
  • More examples
  • How pleomorphic are the cells in the biopsy?
  • What is a protein's function?
  • What is the derivation of a tissue?

6
Why Develop an Ontology? Naming, Classifying,
Indexing
  • To share common understanding of the structure of
    descriptive information
  • among people
  • among software agents
  • between people and software
  • To enable reuse of domain knowledge
  • to introduce standards to allow interoperability
  • To index and annotate other resources

Semantic Interoperability: Foundation of the
Semantic Web/Grid
7
More Reasons
  • To make domain assumptions explicit
  • easier to change domain assumptions (consider a
    genetics knowledge base)
  • easier to understand and update legacy data
  • To separate domain knowledge from the operational
    knowledge
  • re-use domain and operational knowledge
    separately (e.g., configuration based on
    constraints)
  • To manage the combinatorial explosion

8
A semantic continuum
  • Mike Uschold, Boeing Corp

Shared human consensus
Implicit
Further to the right:
  • Less ambiguity
  • Better inter-operation
  • More robust, less hardwiring
  • More difficult

9
An Ontology should be just the Beginning
Databases
Declare structure
Ontologies
Knowledge bases
The Semantic Web
Provide domain description
Software agents
Problem-solving methods
10
What an Ontology Isn't (It won't make the
coffee)
  • A database
  • Ontologies are about categories/classes/types/
    concepts/entities, not instances
  • ABOUT diseases, genes, proteins, ...
    NOT ABOUT specific patients, samples, studies, ...
  • A database/EHR schema
  • An ontology is about meaning rather than storage
  • Although ontology technologies are a means for
    merging schemas
  • A decision support/protocol management system
  • The entities used in the rules, not the rules
  • A metadata schema
  • The entities used in the metadata, not the schema
    itself
  • A lexicon
  • Meaning rather than language
  • But every ontology needs language tools

11
Ontology Technologies
  • Description logics (DLs), OWL
  • Designed to provide logical support for automatic
    classification and consistency checking
  • Designed for sharing and software engineering
  • Leverage off Semantic Web / Grid community
  • But not everything in OWL is an ontology
  • RDF(S)
  • Specialised for groups
  • DAGEdit and other OBO tools, FMA explorer, ...
  • UML
  • Carefully developed UML models convey much
    information for an ontology
  • But support only very simple inference and
    checking

12
Why it's hard (1)
  • Language is slippery & local; rigour & logic are
    hard
  • Classification is too easy for people (to do
    badly)
  • But logical/computational properties unintuitive
  • Combinatorial explosions
  • Philosophical & religious differences
  • Information capture
  • Data quality
  • Tools & environments
  • Different points of view
  • Oncology, Cardiology, ...
  • Adult, developmental, aetiological, ...
  • Clinical, genetic, genomic, ...

13
Why it's hard (2)
  • Need a combined model of meaning
  • The EHR/Database holding the ontology PLUS the
    ontology held
  • Hard to scope; easy to do too much
  • Just-in-time ontology
  • Better in the bio than the medical community
  • Software engineering methods poorly understood

14
Classification is easy for people (to do badly)
  • "On those remote pages it is written that animals
    are divided into
  • a. those that belong to the Emperor
  • b. embalmed ones
  • c. those that are trained
  • d. suckling pigs
  • e. mermaids
  • f. fabulous ones
  • g. stray dogs
  • h. those that are included in this classification
  • i. those that tremble as if they were mad
  • j. innumerable ones
  • k. those drawn with a very fine camel's hair
    brush
  • l. others
  • m. those that have just broken a flower vase
  • n. those that resemble flies from a distance"

From The Celestial Emporium of Benevolent
Knowledge, Borges
15
Avoiding combinatorial explosions
  • The "Exploding Bicycle": from phrase book to
    dictionary and grammar
  • 1980 - ICD-9 (E826): 8
  • 1990 - READ-2 (T30..): 81
  • 1995 - READ-3: 87
  • 1996 - ICD-10 (V10-19 Australian): 587
  • V31.22 Injury or accident to the occupant of
    three-wheeled motor vehicle in collision with
    pedal cycle, person on outside of vehicle,
    nontraffic accident, while working for income
  • and meanwhile elsewhere in ICD-10
  • W65.40 Drowning and submersion while in bath-tub,
    street and highway, while engaged in sports
    activity
  • X35.44 Victim of volcanic eruption, street and
    highway, while resting, sleeping, eating or
    engaging in other vital activities
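The arithmetic behind the explosion can be sketched in a few lines. This is an illustration only: the axes and their values below are hypothetical, loosely modelled on the wording of the codes quoted above, and are not taken from ICD-10 itself. A "phrase book" needs one entry per combination, while a "dictionary and grammar" needs one entry per value plus a rule for combining them.

```python
from itertools import product
from math import prod

# Hypothetical axes, for illustration only (not the real ICD-10 axes).
axes = {
    "victim_vehicle": ["pedal cycle", "three-wheeled motor vehicle", "car"],
    "counterpart": ["pedal cycle", "pedestrian", "car", "fixed object"],
    "victim_role": ["driver", "passenger", "person on outside of vehicle"],
    "traffic": ["traffic accident", "nontraffic accident"],
    "activity": [
        "while working for income",
        "while engaged in sports activity",
        "while resting",
    ],
}

# Phrase book: every combination enumerated as its own pre-built entry.
phrase_book = [", ".join(combo) for combo in product(*axes.values())]
phrase_book_size = len(phrase_book)

# Dictionary and grammar: one entry per value, combined on demand.
dictionary_size = sum(len(values) for values in axes.values())

# The phrase book grows as the product of the axis sizes, the dictionary as their sum.
assert phrase_book_size == prod(len(values) for values in axes.values())
```

With these five small axes the phrase book already holds 216 entries against 15 dictionary terms, and adding one more value to any axis multiplies the former but only increments the latter.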

16
The ontology nested in the EHR
the ehr (hl7 rim) moodCodeEvent
subjectRelative code


diabetes (subject person_in_family)
the ontology (snomed-ct)
? <family_hx (assoc_find Diabetes)>
the combined meaning
What is legal? Required? Mandatory?
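One way to read this slide: "family history of diabetes" can be stated either by the record structure (a subject other than the patient, plus a plain diabetes code) or by a single pre-built ontology concept, and the two must resolve to the same combined meaning. The sketch below assumes that reading. `EhrObservation`, `PRECOORDINATED` and the rule for rejecting doubly specified subjects are hypothetical simplifications, not the HL7 RIM or SNOMED-CT data models.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EhrObservation:
    code: str                 # ontology concept placed in the record
    subject: str = "patient"  # who the observation is about, per the record structure


# Hypothetical pre-built concepts: name -> (finding, subject it already implies).
PRECOORDINATED = {
    "family_hx_diabetes": ("diabetes", "person_in_family"),
}


def combined_meaning(obs):
    """Resolve record structure plus ontology code to one (finding, subject) pair."""
    if obs.code in PRECOORDINATED:
        finding, implied_subject = PRECOORDINATED[obs.code]
        # Assumed policy: the subject may be stated in the record or in the code, not both.
        if obs.subject != "patient":
            raise ValueError(
                f"{obs.code} already implies a subject; record also says {obs.subject}"
            )
        return (finding, implied_subject)
    return (obs.code, obs.subject)


via_record = combined_meaning(EhrObservation(code="diabetes", subject="person_in_family"))
via_ontology = combined_meaning(EhrObservation(code="family_hx_diabetes"))
```

Both routes yield the same pair, which is the point of a combined model. Whether the doubly specified case should be illegal, merged, or read as "a relative's family history" is exactly the kind of question the slide raises; the `ValueError` here is just one possible answer.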
17
Developing Software Engineering Methodologies for
Ontologies
  • Building a life cycle
  • Use/test cases & exemplars
  • Identifying problems & alternative solutions,
    exploring consequences, deciding amongst
    alternatives
  • Specifying solutions
  • Human and machine readable form
  • Setting conformance tests for specifications
  • Building reference implementations
  • Monitoring for problems
  • Recording of problems and changes

18
Logic-based Ontologies Conceptual Lego
gene
protein
cell
expression
chronic
acute
bacterial
deletion
polymorphism
ischaemic
19
Logic-based Ontologies Conceptual Lego
SNPolymorphism of CFTRGene causing Defect in
MembraneTransport of Chloride Ion causing
Increase in Viscosity of Mucus in CysticFibrosis
Hand which is anatomically normal
20
Logical Constructs build complex concepts from
modularised primitives
Species
Genes
Function
Disease
21
Normalising (untangling) Ontologies
22
A simplified example: build a simple tree,
easy to maintain
23
Let the classifier organise it
24
If you want more abstractions, just add new
definitions (re-use existing data)
Diseases linked to abnormal proteins
25
And let the classifier work again
26
And again, even for a quite different category
Diseases linked to genes described in the mouse
27
Untangling and Enrichment: Using a classifier to
make life easier
Substance
- Protein
- - ProteinHormone
- - - Insulin
- Steroid
- - SteroidHormone
- - - Cortisol
- Hormone
- - ProteinHormone
- - - Insulin
- - SteroidHormone
- - - Cortisol
- Catalyst
- - Enzyme
- - - ATPase

Substance
- Protein
- - ProteinHormone
- - - Insulin
- - Enzyme
- - - ATPase
- Steroid
- - SteroidHormone
- - - Cortisol
- Hormone
- - ProteinHormone
- - - Insulin
- - SteroidHormone
- - - Cortisol
- Catalyst
- - Enzyme
- - - ATPase
Hormone ≡ Substance & playsRole someValuesFrom HormoneRole
ProteinHormone ≡ Protein & playsRole someValuesFrom HormoneRole
SteroidHormone ≡ Steroid & playsRole someValuesFrom HormoneRole
Catalyst ≡ Substance & playsRole someValuesFrom CatalystRole
Enzyme ≡ Protein & playsRole someValuesFrom CatalystRole
Insulin → playsRole someValuesFrom HormoneRole
Cortisol → playsRole someValuesFrom HormoneRole
ATPase → playsRole someValuesFrom CatalystRole
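A toy version of "let the classifier organise it" can be written directly from the definitions on this slide. This is a hand-rolled sketch, not a description logic reasoner. It assumes the asserted tree is Substance > Protein > {Insulin, ATPase} and Substance > Steroid > Cortisol, and that every definition has the single shape "genus plus one role". All function names are hypothetical.

```python
# Asserted primitive tree: child -> parent (assumed simple tree, see lead-in).
PRIMITIVE_PARENT = {
    "Protein": "Substance", "Steroid": "Substance",
    "Insulin": "Protein", "ATPase": "Protein", "Cortisol": "Steroid",
}
# Necessary conditions on primitives: class -> roles it plays.
PLAYS_ROLE = {
    "Insulin": {"HormoneRole"}, "Cortisol": {"HormoneRole"}, "ATPase": {"CatalystRole"},
}
# Defined classes: name -> (genus, role), i.e. "genus that plays some role".
DEFINED = {
    "Hormone": ("Substance", "HormoneRole"),
    "ProteinHormone": ("Protein", "HormoneRole"),
    "SteroidHormone": ("Steroid", "HormoneRole"),
    "Catalyst": ("Substance", "CatalystRole"),
    "Enzyme": ("Protein", "CatalystRole"),
}


def primitive_ancestors(cls):
    """`cls` plus everything above it in the asserted tree."""
    found = {cls}
    while cls in PRIMITIVE_PARENT:
        cls = PRIMITIVE_PARENT[cls]
        found.add(cls)
    return found


def description(cls):
    """(primitives, roles) that necessarily hold for any member of `cls`."""
    if cls in DEFINED:
        genus, role = DEFINED[cls]
        return primitive_ancestors(genus), {role}
    prims = primitive_ancestors(cls)
    roles = set()
    for p in prims:
        roles |= PLAYS_ROLE.get(p, set())
    return prims, roles


def subsumes(general, specific):
    """True if every `specific` is necessarily a `general`."""
    if general == specific:
        return True
    prims, roles = description(specific)
    if general in DEFINED:
        genus, role = DEFINED[general]
        return genus in prims and role in roles
    return general in prims


def inferred_parents(cls):
    """Most specific classes that subsume `cls`: its parents after classification."""
    names = set(PRIMITIVE_PARENT) | set(PRIMITIVE_PARENT.values()) | set(DEFINED)
    above = {c for c in names if c != cls and subsumes(c, cls)}
    return sorted(c for c in above
                  if not any(d != c and subsumes(c, d) for d in above))
```

Nobody asserted that Enzyme is a kind of Protein, yet `inferred_parents("Enzyme")` places it under both Catalyst and Protein, and ATPase lands under Enzyme. In this sketch, adding a further abstraction means adding one more entry to `DEFINED` and asking again.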
28
Ontologies and Reference Information Resources
  • An ontology is just one part
  • Naming - Definitions & necessary conditions
  • Classification
  • Indexing
  • Knowledge bases
  • What we know about those entities: what is true
    in general
  • Databases
  • What we know about individuals
  • Instance stores: specialised databases that link
    to ontologies
  • Plus
  • Lexicons
  • Metadata
  • Mappings

29
Definitional knowledge
Ontology
Linguistic
Knowledge
30
Example 1: Indexing Drug Contraindications (or
guidelines or information or ...)
31
Example 2: Indexing data entry forms. Fractal
tailoring forms for clinical trials
Hypertension
Idiopathic Hypertension
In our company's studies
In Phase 2 studies
32
Example 3: PEN&PAD. Fractal Tailoring of fail-soft
forms
What is it sensible to say about ?
34
Technical Barriers to linking ontologies
  • Overlap
  • Linking independent ontologies is easy; overlap
    ALWAYS brings differences in meaning
  • To integrate, separate
  • Appropriate levels of abstraction
  • Genetics/Genomics is changing disease
    classification
  • Anti-angina drugs
  • Ingredients conjugated in the liver
  • Feedback
  • New biology → new clinical classifications;
    discipline required to keep separations
  • Views
  • Anatomy: Tissues (developmental) vs Structures
    vs Functions

35
Nontechnical barriers to linking ontologies
  • Organisational barriers
  • How to keep separation and scope of individual
    ontologies
  • All enterprises tend to expand and encroach
  • Discipline barriers
  • Task barriers
  • Fit for one purpose is not fit for all purposes
  • Language barriers
  • Between communities as well as languages
  • IP barriers
  • Process
  • Collaborative distributed vs Centralised
  • Authority
  • Life cycle and rate of change
  • GO runs at web speed: seconds to days
  • SNOMED runs at e-publishing speed: 6 months to 3 years
  • ICD runs at print/committee speed: 10-20 years

36
Good ontologies
  • Fitness for purpose
  • What's it for?
  • Defined scope
  • Ownership by users
  • A language belongs to its community
  • Human factors
  • Understandability, Reliability!
  • Evaluation criteria
  • How do we know if it meets its purpose?
  • Evolution

Process not Product!
37
Good ontologies
  • Internal Structure
  • Consistency
  • Modularity & Normalisation
  • Software engineering issues: Architecture & Tools
  • It's software! It evolves! It's a
    standard! Conformance and regression testing
    matter
  • Philosophical clarity
  • Class-instance divide correct
  • Instances are different in ontologies and
    databases
  • Ontologies are about a view of the world, not
    about how to store information in a database
  • Clear distinction between part-whole and kind-of

38
Grounding cost vs Cleanup cost
  • What do we need to share?
  • What is broken?
  • How much do we need to know to communicate?
  • Easy to build too much
  • And very costly!
  • Just in time ontology
  • Use logic
  • Use the web
  • Bio / OBO does well; medicine so far doing badly

39
Important Ontologies & related standards
  • OBO (Open Biomedical Ontologies)
  • Gene Ontology
  • MGED family
  • UMLS
  • Massive resource for cross referencing
  • Use CUIs & LUIs: Concept Unique IDs & Lexical
    Unique IDs
  • SNOMED-CT
  • SNOMED-International
  • Anatomy
  • Digital Anatomist FMA, Mouse Developmental, Mouse
    Adult
  • SAEL: Standard Anatomy Entry List
  • NCICB
  • CaCORE ontology
  • National minimum data sets & controlled
    vocabularies
  • HL7, LOINC, DICOM, CDISC, ...
  • OpenGALEN: source for experimentation and
    development
  • Bio databases: at least implicit controlled
    vocabularies
  • Swiss-Prot, OMIM, Ensembl, PRINTS, ...

40
Summary: Planning for Naming, Classifying,
Indexing
  • What is it for? Is there a gap? What is needed?
  • What are the use cases? Criteria for success?
  • Does it exist already?
  • Is an ontology the answer? Is an ontology needed
    for the answer?
  • What else is needed?
  • A reference knowledge source?
  • What is the MINIMUM that one can do?
  • Who will own it?
  • Can we build it collaboratively?
  • What is the authority?
  • How will it evolve?
  • What is the pace of change?
  • Can we do it just in time?
  • Can we evaluate and test it again and again?