The Unbearable Lightness of Biomedical Informatics - PowerPoint PPT Presentation

About This Presentation
Title:

The Unbearable Lightness of Biomedical Informatics

Description:

Title: FMA - GALEN Author: tb Last modified by: Barry Smith Created Date: 6/24/2004 2:27:42 PM Document presentation format: On-screen Show Other titles – PowerPoint PPT presentation

Transcript and Presenter's Notes

Title: The Unbearable Lightness of Biomedical Informatics


1
The Unbearable Lightness of Biomedical Informatics
  • Barry Smith
  • Saarbrücken/Buffalo
  • http://ontologist.com

2
if Medical WordNet is the solution
  • what is the problem?
  • Coling Proceedings, Vol. 1, pp. 371-380

3
(No Transcript)
4
Cerebellar tumor
5
Organism, Organ, Tissue (10^-1 m)
Cell, Organelle (10^-5 m)
Protein, DNA (10^-9 m)
6
The quantity-quality divide
  • 30,000 genes in human
  • 200,000 proteins
  • 100s of cell types
  • 100,000s of disease types
  • 1,000,000s of biochemical pathways (including
    disease pathways)
  • legacy of Human Genome Project
  • and of attempts to institute the electronic
    health record

7
Organism, Organ, Tissue (10^-1 m)
Cell, Organelle (10^-5 m)
Protein, DNA (10^-9 m)
8
FUNCTIONAL GENOMICS
  • proteomics,
  • reactomics,
  • metabonomics,
  • toxicopharmacogenomics
  • phenomics,
  • behaviouromics,

9
The method of annotations
Organism, Organ, Tissue (10^-1 m)
Cell, Organelle (10^-5 m)
Protein, DNA (10^-9 m)
10
The method of indexing
Organism, Organ, Tissue (10^-1 m)
Cell, Organelle (10^-5 m)
Protein, DNA (10^-9 m)
11
The Gene Ontology
  • menopause
  • sensitivity to blue light
  • heptolysis

12
(No Transcript)
13
How to overcome incompatibilities between different
scientific index terms?
  • immunology
  • genetics
  • cell biology
14
One answer: (statistical) computational
linguistics
  • Pattern recognition based on string searches
15
String searches need constraints
  • we can't leave it to luck to overcome
    terminological incompatibilities
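
A minimal Python sketch of why an unconstrained string search depends on luck. The search term and the three sample sentences are illustrative assumptions, not taken from the talk.

```python
import re

def naive_hits(term: str, texts: list[str]) -> list[str]:
    """Return the texts that contain `term` as a raw, case-insensitive substring."""
    needle = term.lower()
    return [t for t in texts if needle in t.lower()]

def word_hits(term: str, texts: list[str]) -> list[str]:
    """Same search, constrained to whole-word matches."""
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    return [t for t in texts if pattern.search(t)]

# Hypothetical sample sentences.
texts = [
    "The cell membrane was stained.",
    "Results were cancelled after review.",  # contains "cell" only by accident
    "Lymphocytes were counted.",             # about cells, but never says "cell"
]

unconstrained = naive_hits("cell", texts)  # picks up the accidental match
constrained = word_hits("cell", texts)     # drops it, but still misses the third text
```

The whole-word constraint removes the false positive, yet neither search finds the sentence that uses a different term for the same kind of object.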

16
Remember: different disciplines are using
different terminologies to refer to the same
objects, processes, features in reality
  • immunology
  • genetics
  • cell biology
17
An alternative answer
  • Ontology

18
Ontology, roughly
  • Overcome terminological incompatibilities by
    creating a standardized framework into which
    diverse vocabularies can be mapped
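
The idea of mapping diverse vocabularies into one framework can be sketched as a lookup table. This is only an illustration: the discipline/term pairs and the identifier `CLASS:T_lymphocyte` are hypothetical and do not come from any real terminology.

```python
from typing import Optional

# Hypothetical mapping from (discipline, local term) to a shared class identifier.
MAPPINGS = {
    ("immunology", "T cell"): "CLASS:T_lymphocyte",
    ("cell biology", "T lymphocyte"): "CLASS:T_lymphocyte",
    ("genetics", "T-lymphocyte"): "CLASS:T_lymphocyte",
}

def shared_class(discipline: str, term: str) -> Optional[str]:
    """Look up the shared class for a discipline-specific term, if one is mapped."""
    return MAPPINGS.get((discipline, term))

def same_referent(a: tuple[str, str], b: tuple[str, str]) -> bool:
    """True when both local terms are mapped to the same shared class."""
    class_a = shared_class(*a)
    return class_a is not None and class_a == shared_class(*b)
```

Terms that no string comparison would match, such as "T cell" and "T lymphocyte", compare as equal once both are mapped to the same class; an unmapped term compares as equal to nothing.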

19
Kinds of Ontologies
[Figure: spectrum of kinds of ontologies. Source: Michael Gruninger]
Labels: ad hoc Hierarchies (Yahoo!); Description Logics (DAML+OIL);
XML Schema; structured Glossaries; formal Taxonomies; XML DTDs; Terms;
Thesauri; Data Models (UML, STEP); Principled, informal hierarchies;
ordinary Glossaries; Data Dictionaries (EDI); General Logic;
Frames (OKBC); DB Schema
Groupings: Glossaries & Data Dictionaries; MetaData, XML Schemas, &
Data Models; Formal Ontologies & Inference; Thesauri, Taxonomies
20
Kinds of Ontologies
A shared vocabulary plus a specification of its
intended meaning
21
Kinds of Ontologies
[Figure: the spectrum of kinds of ontologies, as on slide 19]
22
Kinds of Ontologies
A shared vocabulary plus a specification of its
intended meaning
23
Kinds of Ontologies
A shared vocabulary plus a specification of its
intended meaning
Two extremes
24
  • Work on biomedical ontologies grew out of work
    on medical thesauri and nomenclatures

25
Kinds of Ontologies
[Figure: the spectrum of kinds of ontologies, as on slide 19]
26
NarrowerTerm
Graph with labelled edges (similarTo, Narrower,
synonymWith)
Fixed set of edge labels (a.k.a. relations)
Goble & Shadbolt
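
A thesaurus in this sense can be sketched as a graph that accepts only a closed set of edge labels. The class name `Thesaurus`, the example terms, and the reading of `Narrower` as pointing from the broader term to the narrower one are assumptions made for illustration.

```python
from collections import defaultdict

# Edge labels named on the slide; the set is fixed, so nothing else is accepted.
EDGE_LABELS = frozenset({"similarTo", "Narrower", "synonymWith"})

class Thesaurus:
    """A directed graph whose edges carry one of a fixed set of labels."""

    def __init__(self) -> None:
        self._edges = defaultdict(set)

    def add_edge(self, source: str, label: str, target: str) -> None:
        """Add a labelled edge, rejecting labels outside the fixed set."""
        if label not in EDGE_LABELS:
            raise ValueError(f"unknown edge label: {label!r}")
        self._edges[source].add((label, target))

    def neighbours(self, source: str, label: str) -> set[str]:
        """Return the targets reachable from `source` over edges with `label`."""
        return {t for (l, t) in self._edges.get(source, set()) if l == label}

thesaurus = Thesaurus()
thesaurus.add_edge("tumor", "synonymWith", "neoplasm")       # hypothetical terms
thesaurus.add_edge("tumor", "Narrower", "cerebellar tumor")  # hypothetical terms
```

Note that the labels only constrain which edges may be drawn; nothing in the structure says what an edge means.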
27
  • Unified Medical Language System (UMLS)
  • UMLS Metathesaurus
  • 1 million biomedical concepts
  • 2.8 million concept names
  • from more than 100 controlled vocabularies and
    classifications
  • built by US National Library of Medicine

28
UMLS Source Vocabularies
  • MeSH: Medical Subject Headings
  • ICD: International Classification of Diseases
  • GO: Gene Ontology
  • FMA: Foundational Model of Anatomy

29
To reap the benefits of standardization
  • we need to make ONE SYSTEM out of many different
    terminologies
  • UMLS Semantic Network
  • nearest thing to an ontology in the UMLS

30
UMLS SN
  • Alexa McCray, An Upper Level Ontology for the
    Biomedical Domain, Comparative and Functional
    Genomics, 4 (2003), 80-84.

31
UMLS SN
  • 134 Semantic Types
  • 54 types of edges (relations)
  • yielding a graph containing more than 6,000 edges

32
Fragment of UMLS SN
33
(No Transcript)
34
(No Transcript)
35
UMLS SN Top Level
  • entity
      • physical object
          • organism
      • conceptual entity
  • event

36
conceptual entity
  • Organism Attribute
  • Finding
  • Idea or Concept
  • Occupation or Discipline
  • Organization
  • Group
  • Group Attribute
  • Intellectual Product
  • Language

38
  • Idea or Concept
      • Functional Concept
      • Qualitative Concept
      • Quantitative Concept
      • Spatial Concept
          • Body Location or Region
          • Body Space or Junction
          • Geographic Area
          • Molecular Sequence
              • Amino Acid Sequence
              • Carbohydrate Sequence
              • Nucleotide Sequence

42
Lake Geneva
  • is an Idea or Concept

43
  • Idea or Concept
      • Functional Concept
      • Qualitative Concept
      • Quantitative Concept
      • Spatial Concept
          • Body Location or Region
          • Body Space or Junction
          • Geographic Area
          • Molecular Sequence
              • Amino Acid Sequence
              • Carbohydrate Sequence
              • Nucleotide Sequence

44
UMLS
  • Fingers is_a Body Location or Region
  • Hand is_a Body Part, Organ, or Organ Component
  • hand part_of body
  • BUT NOT
  • fingers part_of hand
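
The gap can be made concrete with a small sketch that stores exactly the assertions listed above and then asks what follows from them. Lower-casing the term names and closing part_of under transitivity are assumptions made for the illustration.

```python
# The is_a assertions from the slide, with lower-cased term names.
IS_A = {
    "fingers": "Body Location or Region",
    "hand": "Body Part, Organ, or Organ Component",
}

# The only part_of assertion the slide reports.
PART_OF = {("hand", "body")}

def transitive_closure(pairs: set[tuple[str, str]]) -> set[tuple[str, str]]:
    """Close a relation under transitivity: (a, b) and (b, c) yield (a, c)."""
    closure = set(pairs)
    while True:
        new = {(a, d) for (a, b) in closure for (c, d) in closure if b == c}
        if new <= closure:
            return closure
        closure |= new

inferred = transitive_closure(PART_OF)
```

No amount of inference over these assertions recovers the link between fingers and hand, because fingers are typed as a location rather than as a body part and no part_of edge leaves them.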

45
Problem: running together of concepts and
entities in reality
bioinformatics à la UMLS SN (like many
knowledge engineering disciplines) floats
free from reality in a conceptual
world of its own creation
46
Blood Pressure Ontology
  • The hydraulic equation
  • BP = CO × PVR
  • arterial blood pressure (BP) is directly
    proportional to the product of blood flow
    (cardiac output, CO) and peripheral vascular
    resistance (PVR).
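
A short numeric sketch of the hydraulic equation. It takes the constant of proportionality to be 1, and the input values are arbitrary illustrative numbers in unspecified but consistent units.

```python
def blood_pressure(cardiac_output: float, peripheral_resistance: float) -> float:
    """BP = CO x PVR, in whatever consistent units the inputs use."""
    return cardiac_output * peripheral_resistance

# Arbitrary illustrative values, not clinical reference data.
baseline = blood_pressure(5.0, 18.0)
doubled_output = blood_pressure(10.0, 18.0)
```

With resistance held fixed, doubling the cardiac output doubles the computed pressure. The equation relates magnitudes of physiological quantities.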

47
UMLS SN
  • blood pressure is an Organism Function
  • cardiac output is a Laboratory or Test Result or
    Diagnostic Procedure

48
BP = CO × PVR thus asserts that
  • blood pressure is proportional either to a
    laboratory or test result or to a diagnostic
    procedure
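
The consequence can be spelled out mechanically by substituting the semantic types for the terms. In this sketch the dictionary records only the type assignments quoted on the previous slide; the function name `readings` is hypothetical.

```python
# Semantic type assignments as quoted from the UMLS SN on the previous slide.
SN_TYPES = {
    "blood pressure": ["Organism Function"],
    "cardiac output": ["Laboratory or Test Result", "Diagnostic Procedure"],
}

def readings(lhs: str, rhs: str) -> list[str]:
    """List what 'lhs is proportional to rhs' says once terms are replaced by types."""
    return [
        f"{left} is proportional to {right}"
        for left in SN_TYPES[lhs]
        for right in SN_TYPES[rhs]
    ]

statements = readings("blood pressure", "cardiac output")
```

Neither reading relates two physiological quantities, which is what the equation requires.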

49
Problem: confusion of reality with our (ways of
gaining) knowledge about reality
50
UMLS Semantic Network
  • entity
      • physical object
      • conceptual entity

51
  • Physical Object
      • Substance
          • Food
          • Chemical
          • Body

52
  • Chemical
      • Chemical Viewed Structurally
      • Chemical Viewed Functionally

53
Problem: confusion of objects with our ways of
referring to objects
54
  • Chemical
      • Chemical Viewed Structurally
      • Chemical Viewed Functionally
  • Lower level: Inorganic Chemical, Organic Chemical, Enzyme,
    Biomedical or Dental Material

55
This multiple inheritance leads to errors in
coding
  • Gene Ontology will eliminate multiple inheritance

56
UMLS Semantic Network
  • entity
      • physical object
          • organism
      • conceptual entity

57
  • UMLS SN
  • is_a =def.
  • If one item is_a another item then the first
    item is more specific in meaning than the second
    item. (Italics added)

58
  • fish is_a vertebrate
  • copulation is_a biological process
  • both testes is_a testis
  • Nazi is_a Nazism
  • plant parts is_a plant

59
(No Transcript)
60
  • What are the nodes in this graph?
  • Almost all nodes are linked to other nodes by a
    multiplicity of different types of edges
  • Compare: swimming is healthy
  • "swimming" has 8 letters

61
  • Semantic Network Definition
  • Concept =def. An abstract concept, such as a
    social, religious, or philosophical concept
  • UMLS Definition
  • Concept =def. A class of synonymous terms

62
(No Transcript)
63
How can concepts figure as relata of these
relations?
  • part_of =def. Composes, with one or more other
    physical units, some larger whole
  • causes =def. Brings about a condition or an
    effect.
  • contains =def. Holds or is the receptacle for
    fluids or other substances.

64
  • How can a set of synonymous terms serve as a
    receptacle for fluids or other substances?
  • How can sets of synonymous terms stand in
    relations such as affects or causes?

65
connected_to =def. Directly attached to another
physical unit as tendons are connected to
muscles.
  • How can a concept be directly attached to another
    physical unit?

66
What are the relata which are linked by the edges
in the SN graph?
67
To answer this question
  • we need to distinguish clearly between concepts
    and classes
  • concepts are creatures of cognition
  • classes are invariants (types, kinds, universals)
    out there in reality

68
If ontologies are about meanings / concepts
  • it becomes impossible to deal coherently with
    those relations between entities in reality which
    involve appeal to both classes and their
    instances.

69
Illustration re part_of
  • heart part_of human
  • human heart part_of human
  • testis part_of human
  • human testis part_of human

70
  • For instances
  • part_of instance-level parthood
  • (for example between Mary and her heart)
  • For classes
  • A part_of B =def. given any instance a of A there
    is some instance b of B such that a part_of b
  • This is an assertion about As.
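
The class-level definition above can be read as a quantified check over instances. In this minimal sketch the instance names (`marys_heart`, `rex` and so on) and the tiny data set are hypothetical; only the all-some pattern comes from the slide.

```python
# Hypothetical instances of each class.
INSTANCES = {
    "heart": {"marys_heart", "rexs_heart"},
    "human heart": {"marys_heart"},
    "human": {"mary"},
    "dog": {"rex"},
}

# Instance-level parthood, e.g. between Mary and her heart.
INSTANCE_PART_OF = {("marys_heart", "mary"), ("rexs_heart", "rex")}

def class_part_of(a_class: str, b_class: str) -> bool:
    """A part_of B: every instance a of A is part of some instance b of B."""
    return all(
        any((a, b) in INSTANCE_PART_OF for b in INSTANCES[b_class])
        for a in INSTANCES[a_class]
    )
```

On this data `class_part_of("human heart", "human")` holds, while `class_part_of("heart", "human")` fails because one heart belongs to a dog. The check quantifies over every instance of the first class only, which is why it is an assertion about As. A class with no recorded instances satisfies it vacuously.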

71
  • a adjacent_to b
  • (instance-level adjacency, for example between
    Mary's head and Mary's neck)
  • For classes
  • A adjacent_to B =def. given any instance a of A
    there is some instance b of B which is such that
    a adjacent_to b

72
A adjacent_to B
  • as an assertion about classes
  • is never an assertion about As exclusively

73
  • A adjacent_to B =def.
  • given any instance a of A there is some instance
    b of B which is such that a adjacent_to b
  • and
  • given any instance b of B there is some instance
    a of A which is such that a adjacent_to b
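
The two-sided definition can be sketched as a pair of all-some checks, one in each direction. The instance data below is hypothetical and chosen so that the one-sided check succeeds where the two-sided one fails; instance-level adjacency is assumed to be symmetric.

```python
# Hypothetical instances of each class.
INSTANCES = {
    "head": {"marys_head"},
    "neck": {"marys_neck"},
    "seminal vesicle": {"johns_seminal_vesicle"},
    "urinary bladder": {"johns_bladder", "marys_bladder"},
}

# Instance-level adjacency, stored once per pair and treated as symmetric.
ADJACENT = {
    ("marys_head", "marys_neck"),
    ("johns_seminal_vesicle", "johns_bladder"),
}

def adjacent(a: str, b: str) -> bool:
    """Instance-level adjacency in either stored order."""
    return (a, b) in ADJACENT or (b, a) in ADJACENT

def all_some(a_class: str, b_class: str) -> bool:
    """Every instance of A is adjacent to some instance of B."""
    return all(
        any(adjacent(a, b) for b in INSTANCES[b_class])
        for a in INSTANCES[a_class]
    )

def class_adjacent_to(a_class: str, b_class: str) -> bool:
    """The two-sided definition: the all-some condition holds in both directions."""
    return all_some(a_class, b_class) and all_some(b_class, a_class)
```

Here every seminal vesicle is adjacent to some bladder, but one bladder is adjacent to no seminal vesicle, so the two-sided relation fails for that pair while it holds for head and neck.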

74
Almost all of the 54 types of edges in SN are
dealt with incoherently
  • part_of HAS INVERSE has_part
  • nucleus part_of cell
  • cell has_part nucleus

75
(No Transcript)
76
  • Acquired Abnormality affects Fish
  • Experimental Model of Disease affects Fungus
  • Food causes Experimental Model of Disease
  • Bacterium causes Experimental Model of Disease
  • Biomedical or Dental Material causes Mental or
    Behavioral Dysfunction
  • Manufactured Object causes Disease or Syndrome
  • Vitamin causes Injury or Poisoning

77
How to do better?
78
How to do better?
  • How to create a network of biomedically relevant
    terms/classes, with coherently defined relations
    between them, to which expert terms of the UMLS
    can be assigned in a maximally intelligible way?

79
What linguistic framework
  • is shared in common by immunologists, geneticists
    and cell biologists,
  • by phenobehavioromists and by
    toxicopharmacogenomists?

80
Answer
  • the natural language they all use to talk about
    biological (biomedical) phenomena

81
BioWordNet
  • joint work with
  • Christiane Fellbaum
  • (see paper in Proceedings)

82
BioWordNet
  • use WordNet's biomedical vocabulary to create a
    better alternative to UMLS SN

83
Strengths of WordNet 2.0
  • Open source
  • Very broad coverage
  • Is-a / part-of architecture
  • Tool for automatic sense disambiguation

84
Weaknesses of WordNet 2.0
  • Problems with relations
  • Mixes up expert and non-expert vocabulary
  • Errors
  • Gaps
  • Noise
  • all prevent WordNet from being used in a scientific
    context as a substitute for UMLS SN

85
Fix WordNet's relations by using the methodology
outlined above
  • already applied to
  • Foundational Model of Anatomy
  • Gene Ontology
  • Open Biological Ontologies

86
Institute for Formal Ontology and Medical
Information Science
  • Saarbrücken
  • http://ifomis.org

87
WordNet mixes up expert and non-expert
vocabulary,
  • both current and medieval
  • suppuration (sense 2): pus, purulence, suppuration,
    ichor, sanies, festering

88
WordNet contains biomedically relevant errors
  • snore-sleep
  • WordNet: if someone snores, then he necessarily
    also sleeps
  • snoring: the respiratory-induced vibration of
    glottal tissues
  • associated not only with sleep but also with
    relaxation or obesity

89
WordNet has too much noise for purposes of
scientific applications
90
13 senses for "feel" as a verb
  • experience: She felt resentful
  • find: I feel that he doesn't like me
  • feel: She felt small and insignificant
  • feel: We felt the effects of inflation
  • feel: The sheets feel soft
  • grope: He felt for his wallet
  • finger: Feel this soft cloth!
  • explore: He felt his way around the dark room
  • feel: It feels nice to be home again
  • feel: He felt the girl in the movie theater

91
Medical senses of feel
  • palpate: examine a body part by palpation
  • The runner felt her pulse.
  • sense: perceive by a physical sensation, e.g.
    coming from the skin or muscles
  • He felt his flesh crawl
  • feel: seem with respect to a given sensation
  • My cold is gone; I feel fine today

92
WordNet has gaps even in its coverage of
biomedical natural language
93
WordNet senses of regulation
  • 1. regulation (ordinance, rule)
  • 2. rule, regulation -- (a principle that customarily
    governs behavior; "short haircuts were the
    regulation")
  • 3. regulation -- (the state of being controlled or
    governed)
  • 4. regulation -- (the ability of an early embryo to
    continue normal development after its structure
    has been somehow damaged)
  • 5. regulation, regularization, regularisation --
    (the act of bringing to uniformity)
  • 6. regulation, regulating -- (the act of
    controlling according to rule; "fiscal
    regulations are in the hands of politicians")

94
Biological sense of regulation
  • A process that modulates the frequency, rate or
    extent of behavior
  • (Gene Ontology)

95
WordNet senses of inhibition
  • 1. inhibition, suppression -- ((psychology) the
    conscious exclusion of unacceptable thoughts or
    desires)
  • 2. inhibition -- (the quality of being inhibited)
  • 3. inhibition -- (the process whereby nerves can
    retard or prevent the functioning of an organ or
    part; "the inhibition of the heart by the vagus
    nerve")
  • 4. prohibition, inhibition, forbiddance -- (the
    action of prohibiting or forbidding)

96
Biological senses of inhibition much broader
  • inhibition = negative regulation
  • enzymes can be inhibited
  • reactions can be inhibited
  • and not only by nerves

97
WordNet senses of binding
  • 1. binding -- (the capacity to attract and hold
    something)
  • 2. binding -- (a strip sewn over or along an edge
    for reinforcement or decoration)
  • 3. dressing, bandaging -- (the act of applying a
    bandage)
  • 4. binding, book binding ("the book had a leather
    binding")

98
biological sense of binding
  • interacting selectively with
  • (Gene Ontology)

99
Remove errors, noise and gaps in a two-stage
process
  • 1. select biomedically relevant natural-language
    terms from WordNet 2.0 extended by standard
    biomedical information sources
  • 2. validate these terms and the relations between
    them

100
Validation
  • each arc in BWN is converted into a
    natural-language sentence
  • e.g. "mumps is an inflammation"
  • via controlled human-subject experiments, these
    sentences are accredited
  • 1. as intelligible by non-experts
  • 2. as true by experts
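
The validation step can be sketched as two small functions: one turns an arc into a sentence, the other combines the two kinds of judgment. The sentence templates, the article rule, the vote lists and the 0.8 threshold are all assumptions made for illustration; the talk does not specify them.

```python
# Hypothetical sentence templates, one per relation.
TEMPLATES = {
    "is_a": "{a} is {article} {b}",
    "part_of": "{a} is part of {article} {b}",
}

def article(noun: str) -> str:
    """Pick 'a' or 'an' from the first letter (a rough rule, good enough for a sketch)."""
    return "an" if noun[0].lower() in "aeiou" else "a"

def arc_to_sentence(a: str, relation: str, b: str) -> str:
    """Convert one arc of the network into a natural-language sentence."""
    return TEMPLATES[relation].format(a=a, article=article(b), b=b)

def accredited(intelligible_votes: list[bool], true_votes: list[bool],
               threshold: float = 0.8) -> bool:
    """Accept a sentence when enough non-experts find it intelligible
    and enough experts judge it true (threshold is an assumed parameter)."""
    if not intelligible_votes or not true_votes:
        return False
    intelligible = sum(intelligible_votes) / len(intelligible_votes)
    true = sum(true_votes) / len(true_votes)
    return intelligible >= threshold and true >= threshold

sentence = arc_to_sentence("mumps", "is_a", "inflammation")
```

Keeping the two vote lists separate mirrors the two audiences on the slide: intelligibility is judged by non-experts and truth by experts.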

101
we use logical methods to ensure a coherent
treatment of BWN's upper-level classes and
relations, and thereby also bring logical rigor
in a practical fashion to the whole of the UMLS
Metathesaurus
102
Bring ontological rigour to BWN
[Figure: the spectrum of kinds of ontologies, as on slide 19]
103
The long-term goal
  • BWN should serve as scaffolding/indexing system
    for the much larger and denser net of expert
    biomedical terminology which is the UMLS
    Metathesaurus

104
The End