Knowledge Driven Software

1
Knowledge Driven Software: Fractal Tailoring
Ontologies in Development Environments for Clinical Systems
Where are the boundaries of ontology? How to get back where we
were in 1985?
  • Alan Rector
  • School of Computer Science, University of
    Manchester, rector_at_cs.manchester.ac.uk
  • http://www.cs.manchester.ac.uk/rector

2
Background: A common set of problems observed in
implementing practical systems
  • Two industrial projects and one major research
    application using ontologies and OWL with software
  • Documentation and order entry software
  • Make the right information / forms / widgets
    available at the right time, tailored to patient,
    task and setting
  • Minimize cognitive overload
  • Links to many medical terminologies
  • ICD, SNOMED, Ontology for Clinical Research and
    other statistical ontologies, National Center for
    Biomedical Ontology, GALEN, ...
  • Work on Protégé-OWL, OWL, and related formalisms
  • And living in a hot-bed of DL experts
  • Experiments with NHS National Programme for IT
    standards and specifications
  • Plus a question from one of the most prominent
    and successful researchers in Health Informatics
  • "Why can't we get back to 1985?" (hands thrown up
    in despair)
  • Zak Kohane: Harvard based, MIT trained in AI and
    CS, amongst the most experienced and successful NIH
    bioinformatics and translational medicine
    researchers

3
... and the obvious observation: Ontologies have
had little impact on software
  • Artifacts called "Ontologies" have become
    ubiquitous
  • Everybody thinks they need one; sometimes just
    a synonym for "good"
  • but (outside this room)
  • Ontological methods and OWL remain niche markets
  • Compare to Model Driven Architecture (MDA)
  • Where are the Ontology Driven Architectures?
  • Constant questions about how OWL relates to UML
  • and few helpful responses
  • Constant queries about relevance of ontologies
  • Minimal impact outside of annotation
  • And limited there
  • compare with XML, UML, and even RDF(S)

Why?
4
Plan of this talk
  • Where I come from
  • And my slant on the history of KR and Ontologies
    in Information Systems
  • Our use cases
  • And why clinical systems are hard
  • Architecture issues
  • Dual use of ontologies and Fractal tailoring
  • Ontologies, data structures, and user interfaces
  • Knowledge issues
  • Language
  • Generic Knowledge Representation and Contingent
    Knowledge
  • Ontology issues
  • Ontology issues that do (and don't) matter to
    clinical systems
  • What's in a code?
  • Metadata, Annotations, Higher order
    representations
  • Evidence for choosing options, evaluation and
    quality assurance
  • Summary of requirements and issues

5
Where I come from
Best Practice
6
By way of User Centred Design: KR was a solution,
not a goal
7
... and a long struggle with poor fit between
problem and solution spaces
Problem space
Solution space
8
Three guiding principles
  • The user is always right about the problem
    space: the problems they have
  • But the user is usually wrong about the
    solution space: how to fix them
  • There is no one way!
  • But there are wrong ways
  • Enumeration does not scale
  • Medicine is a field of niches
  • Easily leads to combinatorial explosions

9
and pragmatic software development
Clinergy/ PenPAD (1997)
10
>50K potential forms and subforms
From a tiny KB based on a normalised ontology
11
The scaling problem: The combinatorial explosion
  • It keeps happening!
  • Simple brute force solutions do not scale up!
  • Conditions × anatomy × modifiers × task × setting
    × user type × ...
  • Huge number of niches: terms to author /
    data structures to specify / GUI screens to
    construct
  • Software CHAOS
  • Massive indexing
  • Massive task for quality assurance
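
A rough illustration of the arithmetic behind this slide. The axis sizes below are invented for illustration and are not figures from the talk. Enumeration needs one artifact per combination of values, while composition needs roughly one per primitive.

```python
from math import prod

# Hypothetical sizes for each axis of variation (illustrative only).
axes = {
    "conditions": 200,
    "anatomy": 50,
    "modifiers": 10,
    "task": 5,
    "setting": 4,
    "user_type": 3,
}

# Enumeration: one term / form / screen per combination of values.
enumerated = prod(axes.values())

# Composition: author each primitive once and combine on demand.
composed = sum(axes.values())

print(f"niches to enumerate: {enumerated:,}")
print(f"primitives to author: {composed}")
```

Under these assumed sizes the product runs to millions while the sum stays in the hundreds, which is the gap that composition is meant to exploit.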

12
The (combinatorially) exploding bicycle (codes
for injuries involving cyclists)
  • 1972 ICD-9 (E826): 8
  • READ-2 (T30..): 81
  • READ-3: 87
  • 1999 ICD-10: 587

ICD: International Classification of Diseases
13
1999 ICD-10: 587 codes
  • V31.22 Occupant of three-wheeled motor vehicle
    injured in collision with pedal cycle, person on
    outside of vehicle, nontraffic accident, while
    working for income
  • W65.40 Drowning and submersion while in bath-tub,
    street and highway, while engaged in sports
    activity
  • X35.44 Victim of volcanic eruption, street and
    highway, while resting, sleeping, eating or
    engaging in other vital activities

14
Beating the Combinatorial Explosion with
Conceptual Lego
[Figure labels: gene, protein, polysaccharide, cell, expression, chronic, lung, acute, infection, inflammation, bacterium, deletion, polymorphism, ischaemic, virus, mucus]
15
A grammar rather than a phrase book: Composition
rather than enumeration
"SNPolymorphism of CFTRGene causing Defect in
MembraneTransport of Chloride Ion causing
Increase in Viscosity of Mucus in CysticFibrosis"
"Hand which is anatomically normal"
16
Normalisation and Modularisation: Building complex
representations from modularised primitives
Reasoner as terminology compiler
[Figure labels: Species, Genes, Function, Disease]
17
Our use cases and what role ontologies and
reasoners play in solving them
18
In general, Why might one want a Knowledge
Driven Architecture?
  • Consistency of vocabulary and meaning
  • Controlled vocabulary and ID management
  • Composition of new entities from old
    (post-coordination)
  • Adaptability and context sensitivity
  • Dynamic extension of data structures
  • Scalable maintenance and localisation
  • Common context-sensitive index for all resources
  • Transparent declarative representation
  • Logical organisation, indexing and consistency
    checking
  • Changes made declaratively in exactly one place
    with predictable consequences

19
Fundamental Approach: Dual use of ontologies
  • Content of the information system: What is carried
    by the data structures
  • The model of meaning for the information
  • What can be said and when
  • Classifier acts as a Terminology Compiler
  • Index to the information system: Which data
    structures / UIs / procedures to use
    when (Brachman's "conceptual coatrack")
  • Fractal tailoring: dynamic assembly based on
    context
  • Context may include setting, task, user, etc.
  • Classifier acts as an Index Compiler
  • for components to assemble

20
Dual use for ontologies: Indexing and content
[Figure labels, each shown twice: Hypertension; Idiopathic Hypertension; In our company's studies; In Phase 2 studies]
21
Some issues in the approach
22
Separation of Lexicon / Language and Symbolic
Model of Ontology
  • Ontologies are symbolic models implemented in
    software / logic
  • Behaviour unchanged under any consistent
    relabelling
  • Good practice is that every entity have a clear
    linguistic description as well as a label
  • But the descriptions and labels do not affect
    behaviour of the ontology in software
  • Language and Logic use different principles
  • Lexicons are about word usage, grammar, etc.
  • Many ways to express the same symbolic expression
    in language
  • Synonymy, polysemy, metonymy etc. are linguistic
    phenomena and language specific
  • Linguists and ontologists have surprising
    difficulty understanding each other
  • Research on relation between language and formal
    ontology still limited despite
  • John Bateman, Pelletier
  • MONNET project (UPM: Asun Gomez Perez, Oscar
    Corcho, ...)
  • Udo Hahn et al.
  • SWAT project: Donia Scott, Richard Power, ...
  • Cyc's work on language understanding

23
Language is often misleading wrt model;
arguments about words tend to dogmatism
  • Misleading
  • "Heart valve" ≠ valve located in the heart
  • Means one of the four great heart valves
  • "Cardiomyopathy" ≠ disorder of cardiac muscle
  • Otherwise a Myocardial Infarction would be a
    Cardiomyopathy.
  • Words lead to more dogmatic arguments than
    substance
  • Does "neoplastic" imply "malignant"?
  • bitter controversy
  • Do we need expressions for all of: benign
    tumour; malignant tumour; tumour, benign or
    malignant?
  • No controversy
  • Separating substance and labelling can reduce
    meeting time by 75%!

24
Language Generation for ontologies
  • How does an ordinary user understand a complex
    expression in an ontology?
  • Procedure that includes (Removal that has_target
    some Appendix and occurs_in some
    (Situation that includes some
    (Inflammation that has_locus some Peritoneum)))
  • "Appendicectomy in the presence of Peritonitis"
    or "Appendectomy with co-occurring Peritonitis"
    or ...
  • Disorder that has_locus some Great_heart_valve
  • "Disorder of heart valve", or "Valvular heart
    disease", or ...
  • Powerful mechanism for QA
  • GALEN used extensively for definitions that often
    ran to 20 lines or more

25
Language generation Multilingual ontologies
26
General background knowledge: The flesh on the
ontological skeleton
  • Knowledge driven systems require more background
    knowledge than just ontologies
  • Ontologies are about what is universally true
  • Almost the only thing that ontologists can agree
    on
  • Otherwise "ontology" is just a synonym for
    "logical theory"
  • In all ontological formalisms, all statements
    begin with "for all"
  • Implied in DLs
  • Explicit in predicate logic based formulations
  • Expressed as lambda abstraction in Conceptual
    Graphs
  • Generally true in frames, but ambiguous
  • Much (most) background knowledge is
    contingent, i.e. of the forms
  • "Generally", "Typically", "may"
  • Or conventional facts

27
The issue of "may"
  • What may cause pneumonia?
  • Conventional answer (e.g. on a med school exam)
  • Bacteria, Virus, Yeast, Fungus, ...
  • Drill down and you get specific lists, e.g.
  • Bacteria: Pneumococci, Haemophilus,
    Staphylococcus, ...
  • Virus: Respiratory syncytial virus, ...
  • But
  • Not all pneumococci cause pneumonia
  • or even have the disposition to cause
    pneumonia
  • There are other things that can cause pneumonia
  • E.g. Pneumocystis in immunosuppressed patients
  • Indeed nearly any micro-organism in weird enough
    circumstances
  • Biology is rarely absolute! Almost never
    exhaustive! Rarely even mutually exclusive

28
"May" and "Typically": Characteristics of "May"
statements
  • Reciprocal
  • If A may cause B, then B may be caused by A.
  • Some alternative FoL approximations
  • ∃x,y . A(x) ∧ B(y) ∧ causes(x,y)
  • A(x) ∧ B(y) ∧ causes(x,y) is satisfiable
  • There is a subclass of causal associations C_AB
    such that (all) C_AB has_topic some A and
    has_target some B.
  • (pun) Class A causally_associated_with value
    (pun) Class B
  • (all) A may_cause some B; (all) B
    may_be_caused_by some A; cause → may_cause
  • Metalogical / procedural / uncertainty.
    Implicature: the mentioned entities are
    distinguished
  • When a text book says "Pneumonia may be caused
    by Bacteria, Virus, ..." it means more than
    ∃x,y . Pneumonia(x) ∧ Bacteria(y) ∧ causes(y,x)

29
Many sources of contingent knowledge,
e.g. statistical co-occurrence
The nodes may come from an ontology. The
contingent links do not.
From http://barabasilab.neu.edu/projects/hudine/
30
Associations by Common Metabolic Pathway
From D.-S. Lee et al. 2008
31
Characteristics of "typical" statements
  • Not reciprocal
  • "A typically causes B" does not imply that "B is
    typically caused by A", or vice versa
  • No first order interpretation
  • Defeasible logics still a research topic - not
    yet (if ever) for implementation
  • Inherited down the left side with exceptions
  • If A typically has C, then unless otherwise
    specified it is typical of subclasses of A
  • Metalogical / procedural / uncertainty
  • Around notions like normative, statistical,
    probabilistic
  • Historically, the key function of frames
  • The precursors of modern ontology systems

32
Experience: Normalised ontologies lead to clean
default inheritance
33
Inheritance and Normalised Ontologies
  • Simple inheritance works well for normalised
    ontologies
  • Each organisational principle uses a separate
    tree
  • Trees combined by composition and classification
  • Each kind of information is inherited along a
    given principle
  • Nixon diamonds are rare but cannot be excluded
  • Heuristic: incomplete but pragmatically highly
    useful
  • Heuristic procedure: Collect the set of most
    specific [1] characteristics of a given type
  • Usually a singleton, but
  • Provide a mechanism for conflict resolution when
    not
  • Good heuristic representation for
    "typically". Works well to index hierarchies of
    contexts
  • Basis of fractal tailoring
  • ([1] Use Touretzky criterion: most specific
    not over-ridden)
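
The heuristic procedure on this slide can be sketched as follows. This is a simplified reading of "most specific not over-ridden", not Touretzky's full inferential-distance ordering and not the implementation used in the projects described. The hierarchy, the class names (borrowed loosely from the hypertension example earlier in the talk) and the form names are all hypothetical.

```python
from typing import Dict, Set

# Hypothetical context hierarchy: class -> direct parents (a DAG).
PARENTS: Dict[str, Set[str]] = {
    "Hypertension": set(),
    "IdiopathicHypertension": {"Hypertension"},
    "Phase2Study": set(),
    "IdiopathicHypertensionInPhase2": {"IdiopathicHypertension", "Phase2Study"},
}

# Hypothetical defaults of one type ("which form to show") on some contexts.
FORM: Dict[str, str] = {
    "Hypertension": "generic_bp_form",
    "IdiopathicHypertension": "idiopathic_bp_form",
}


def ancestors(cls: str) -> Set[str]:
    """All strict ancestors of cls in PARENTS."""
    seen: Set[str] = set()
    stack = list(PARENTS[cls])
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(PARENTS[node])
    return seen


def most_specific(cls: str, defaults: Dict[str, str]) -> Set[str]:
    """Values held by the most specific classes at or above cls."""
    candidates = {c for c in ancestors(cls) | {cls} if c in defaults}
    # A candidate is over-ridden if it is an ancestor of another candidate.
    winners = {
        c for c in candidates
        if not any(c in ancestors(other) for other in candidates)
    }
    return {defaults[c] for c in winners}


print(most_specific("IdiopathicHypertensionInPhase2", FORM))
```

The result is a set rather than a single value on purpose: when it holds more than one entry, the caller has to apply whatever conflict-resolution mechanism the system provides.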

34
Facts: A-boxes vs Databases
  • Much of medical knowledge is just data
  • Product A is licensed for Condition B
  • The clinics in this hospital are ...
  • The specialists eligible to perform X are ...
  • The allowed values for this field are ...
  • The proformas for this condition are ...
  • The services available to perform this analysis
    are ...
  • Questions are closed world
  • Negation as failure
  • Even if they may use the classification hierarchy
    as a framework
  • A common pattern is an open world ontology as
    schema for a closed world database
  • But rarely supported by tools or formalism
  • Conjunctive queries come close, but those still
    involve an A-Box.
  • Querying of an RDF store according to OWL should
    be easy but isn't
  • OWL to UML consistency testing understood, but
    not always the point
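
A minimal sketch of the pattern described above: a closed world fact table queried with negation as failure, using a class hierarchy only as a framework. All names are hypothetical, and the rule that a licence for a condition also covers its subclasses is an assumption made for the example, not a claim from the talk.

```python
from typing import Dict, Optional, Set, Tuple

# Hypothetical class hierarchy used as a framework: class -> parent.
SUBCLASS_OF: Dict[str, Optional[str]] = {
    "Condition": None,
    "Hypertension": "Condition",
    "IdiopathicHypertension": "Hypertension",
}

# Hypothetical closed world fact table: (product, licensed condition).
LICENSED_FOR: Set[Tuple[str, str]] = {
    ("ProductA", "Hypertension"),
}


def is_a(cls: str, ancestor: str) -> bool:
    """True if cls equals ancestor or lies below it in SUBCLASS_OF."""
    current: Optional[str] = cls
    while current is not None:
        if current == ancestor:
            return True
        current = SUBCLASS_OF.get(current)
    return False


def licensed(product: str, condition: str) -> bool:
    """Negation as failure: no matching row means 'not licensed'."""
    return any(
        p == product and is_a(condition, c) for p, c in LICENSED_FOR
    )


print(licensed("ProductA", "IdiopathicHypertension"))
print(licensed("ProductB", "Hypertension"))
```

An open world reasoner would answer "unknown" for the second query; the table lookup answers "no", which is what most clinical questions of this kind expect.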

35
Procedural Knowledge, e.g.
  • Prototypical sequences
  • Plans and partial plans
  • Calculations and attached procedures
  • Links to external services
  • Service Oriented Architectures
  • Workflows and Business Process Rules
  • Helpful to index via ontology; unhelpful to
    try to represent in an ontology

From http://www.mapofmedicine.com
36
Summary of architectural issues
  • Dual role for ontologies
  • Content
  • Indexing
  • Inheritance with defaults works for normalised
    ontologies bases
  • gives a basis for fractal tailoring
  • but ontologies are only a small part of the
    knowledge required for knowledge driven systems

37
Ontological Issues: Intended and unintended
consequences of borrowing a word (a metaphor)
  • Useful, not so useful, and counterproductive
    ontological notions and dogmas
  • User oriented views, transformations and
    intermediate representations
  • Issues too often ignored and need further
    research
  • Linking ontologies to information systems
  • What's in a code?

38
Example useful ontological distinctions
  • Kind and Role
  • E.g. Diagnosis and Conditions; Evidence and
    Observation
  • Parthood and containment
  • E.g. Brain and Skull
  • Mode and Modifier (Generic Dependent vs
    Quality)
  • e.g. Family history of X and severe X
  • Observation
  • method vs result vs observed vs copy of
    data

39
Where ontology / logic can help, e.g. the
equivalence problem in SNOMED
  • Almost all attribute-value pairs can be
    transformed into (quasi) independent entities
  • A patient has a haemoglobin that is elevated
  • Iff
  • The patient has elevated haemoglobin

40
Formally (in OWL): Method 1
  • An equivalence
  • has some (Haemoglobin that has_interpretation
    some Elevated)
  • has some (Elevated that is_interpretation_of
    some Haemoglobin)
  • given the axioms
  • has o has_interpretation → has;
    has o is_interpretation_of → has
  • is_interpretation_of = inv(has_interpretation)
  • while maintaining
  • Haemoglobin that has_interpretation some
    Elevated ≠ Elevated that is_interpretation_of
    some Haemoglobin

41
Alternative: creative ambiguity in
formalism, analogous to role groups in SNOMED
  • "Patient with a fracture of the leg" equivalent
    to "Patient with a leg that is fractured"
  • Disorder that has_morphology some Fracture
    and has_locus some Leg
  • Potential advantage of regularity for software
    with simpler reasoners
  • Measurement that has_target some
    Haemoglobin and has_interpretation some
    Elevated

42
The issue of Context: Large elephant vs Large
mouse
  • OWL and related languages give a useful solution
  • Use defined classes and classifier to define and
    organise consistent hierarchies of context, e.g.
  • has_mass range Mass_quantity
  • Criterion for some (Large that is_size_of some
    Elephant) → has_mass some > 3500 kg
  • Criterion for some (Large that is_size_of some
    Mouse) → has_mass some > 30 g
  • Equally important for selection of data
    structures, procedures, ...
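
As a plain-code stand-in for the idea on this slide, the sketch below keys the criterion for "large" on the kind of thing being described. It uses a lookup table rather than defined classes and a classifier, so it shows only the context dependence, not the OWL mechanism. The thresholds are the ones on the slide; the function and table names are hypothetical.

```python
# Thresholds taken from the slide, expressed in kilograms.
LARGE_ABOVE_KG = {
    "Elephant": 3500.0,  # criterion: mass > 3500 kg
    "Mouse": 0.030,      # criterion: mass > 30 g
}


def is_large(kind: str, mass_kg: float) -> bool:
    """True if mass_kg exceeds the 'large' criterion for this kind."""
    return mass_kg > LARGE_ABOVE_KG[kind]


print(is_large("Mouse", 0.045))      # a 45 g mouse
print(is_large("Elephant", 3000.0))  # a 3000 kg elephant
```

The same value-by-context lookup is what selects data structures or procedures when the key is a clinical context instead of an animal.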

43
Conflict between clinical and ontological usage:
Why should ontologists claim monopoly on correct
use of words?
  • Example: parthood
  • Medical usage does not follow mereological theory
  • Best modelled by a different relation; language
    is ambiguous
  • "The thyroid is part of the endocrine system" is
    a matter of function rather than physical
    connectedness
  • "Faults in parts are faults in the whole" comes
    closer to clinical intuition
  • FMA driven to "Sets of heterogeneous structures"
    as a kluge; the immune system has no parts!
  • Lack of functional information is major
    limitation on the use of the FMA
  • The way we bridge the gap: by a hierarchy of
    parthood relations
  • Clinical_part, Functional_part,
    Physical_part

44
Example 2: Clinical distinctions that cut across
ontological distinctions. Example found in
almost all clinical systems
  • Observables
  • Attribute-value pairs, e.g.
  • Serum haemoglobin: has_measure value 13mg,
    has_interpretation some Normal
  • vs
  • Findings
  • Things normally absent that may be present (or
    vice versa), e.g.
  • Lump, tumour, diabetes, elevated temperature,
    fever, ...
  • has some Diabetes; not has some Diabetes

45
... but ontological categories are mixed; users
require alternative views
  • Observables may be
  • Qualities of
  • the body, of parts of the body, of functions, of
    roles, of processes, etc
  • Relations to independent entities
  • e.g. site of radiation of pain
  • Findings may be
  • Independent
  • Generic dependent
  • Reified values of interpretations
  • Reified relations

46
Bridging the gap
  • Provide alternative organisations of the ontology
  • Let the classifier do the work
  • But requires strict logical consistency; users'
    intuitions are not always strictly logical
  • Provide separate user organisation
  • Separate browsing / searching layer overlaid on
    ontology
  • Thesauri and SKOS seem the natural candidates
  • Systematic transformations between thesauri and
    ontologies: a critical research area

47
Ontological dogma counterproductive for clinical
systems: Potential epistemic status is
fundamental to medical reporting and reasoning
  • Most clinical systems distinguish at least two
    of
  • Observation: Serum Haemoglobin 7 mg
  • Interpretation: Serum Haemoglobin low
  • Belief (diagnosis): Anaemia
  • Ignoring potential epistemic status cripples an
    ontology for use in clinical systems
  • Because different behaviours are required
    depending on the potential epistemic status

48
Prohibition of entities that have no instances
Science is driven by hypotheses; medicine by
differential diagnoses
An ontology for an information system must
represent the entities in it. Whether they exist
in the world is irrelevant
  • The information system may contain a
    representation of a nonexistent entity in order to
  • Test for its existence
  • Describe it if it is suddenly found to exist
  • Hold data about it when it was thought to exist
  • Examples
  • The toxin responsible for AIDS
  • The gene for X
  • Pneumonia caused by Trypanosome
  • The Higgs Boson
  • (even Unicorns, e.g. to say that the narwhal may
    have been the origin of the myth of the unicorn,
    or just to have a catalogue of mythical
    creatures)

49
Metadata: Many ontologies exist primarily to
carry metadata. Not mere "annotations". Not
just "not first order"
  • Metadata about the artifact, e.g.
  • Mappings - to other ontologies, terminologies,
    coding systems, UMLS, standards, Web resources, ...
  • Textual definitions, IDs, ...
  • Editorial Information, e.g.
  • Authorship, provenance, authority, ...
  • Meta-models / Schemas
  • The structure of the artifact itself to aid in
    authoring, editing, QA, Interfaces, ...
  • Why don't ontologies have schemas?
  • Higher order domain information, (NOT metadata)
    e.g.
  • "Endangered species", "Category first described
    by", ...
  • Two different types of injury

50
Ontological issues needing more
attention: Prototypes
  • Relation of prototypes, individuals fulfilling
    those prototypes, and collections of those
    individuals
  • Prototypes authored "pre-hoc" and then realised
    in individuals
  • Blueprints and buildings
  • Protocols for a trial, individual patients'
    histories in the trial, data from the trial,
    analysis and description of the data, description
    of the trial
  • Prototypes abstracted post-hoc from
    observations of individuals
  • Normative anatomy and biology
  • Almost all scientific laws
  • Does the distinction matter? In theory? In
    practice? If so how?
  • What does it mean to be conformant or normal?
    abnormal? missing?

51
Ontologies and Software Engineering: Towards
multi-layer models and defined interfaces
  • Ontologies are not data models
  • Although data models may be motivated by
    ontologies
  • Our understanding of the world vs how to store
    information about the world
  • A data structure can have a missing entry for
    heart beat
  • A (live) person cannot have a missing heart beat
  • Ontology languages are really general logic
    languages
  • Can be used to describe either, BUT NOT AT THE
    SAME TIME
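
The heart beat example can be made concrete with a small sketch. The record type and field names are hypothetical. The point is that an empty field is a statement about the record, not about the patient.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VitalSignsRecord:
    """A data structure holding information about a patient."""
    patient_id: str
    # None means "nothing was recorded", not "the patient has no heart beat".
    heart_rate_bpm: Optional[int] = None


record = VitalSignsRecord(patient_id="p-001")
print(record.heart_rate_bpm is None)
```

An ontology of live persons would have no way to say "missing heart beat", which is why the two kinds of description need to be kept in separate layers.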

52
OWL / Ontologies and Template languages for Data
Structures
  • How to integrate more effectively with UML and
    Model Driven Architectures (MDA)?
  • How to use ontologies to derive value sets
  • How to relate ontologies to user interfaces
  • Detailed analysis in Berardi et al. 2005
    (http://www.inf.unibz.it/calvanese/papers-html/AIJ-2005.html)
  • Described how to use DLs to describe data
    structures
  • Not the relation between Ontologies and Data
    structures
  • Numerous tools and schemes since - but the same
    fault
  • E.g. OWL2XMI
  • OMG Profile is largely syntactic and results in
    translations with unforeseen consequences
  • and none deal with the difference between logical
    axioms and specification templates, or with both
    OWL and UML semantics

53
Fitting with Software Engineering Practice e.g.
Ontology Programming Interface
The only things the application needs to know
about the ontology
54
Evidence: Evaluation of proposals and options and
for Quality Assurance
  • What counts as evidence wrt ontologies for
    information systems?
  • The consequences for the information systems,
    e.g.
  • Expressivity, inferences
  • Correspondence to users' intuitions and usability
  • Relation to standards and common practices
  • Transformations between alternatives (morphisms)
  • Flexibility, effort required for maintenance
  • Complexity of queries
  • Scaling
  • Consistency with the world as we best understand
    it
  • Are any of the consequences refuted when tested
    or unacceptable to users?
  • Implementations
  • Until implemented, it isn't yet an ontology for
    information systems
  • Where are the papers and theses comparing options?
  • Analogous to those in the software, logic and
    other academic communities

55
Summary: Dual use of ontologies in Information
Systems
  • As the definitions and universal knowledge of the
    content
  • And a framework for the other background
    knowledge required by the system
  • As the index of the information system itself for
    Fractal Tailoring
  • Scalable software that beats the combinatorial
    explosion

56
Challenges
  • Relation of ontologies to software engineering
  • Relationship of ontologies to data structures
  • Effective interaction with UML family of
    representations
  • Ontology-software interfaces, versioning, and
    co-evolution
  • Reconciling ontologists' and clinicians'
    viewpoints
  • Ontology views, intermediate representations and
    transformations
  • Relation to SKOS, thesauri and other browsing /
    searching
  • Relation to language: both understanding and
    generation
  • Focusing on problems clinicians see as important
  • Capturing the epistemic essence of clinical
    entities
  • Relation of protocols, patients' courses, data,
    and analyses (prototypes/patterns)
  • Equivalences in electronic health records
  • Heuristic hybrid architectures for broader
    knowledge representation
  • "may", "typical", probabilities, ...
  • Reviving heuristic approaches to defaults and
    exceptions
  • Interaction with rules, procedures,
  • Establishing standards of evidence and
    scholarship
  • Tooling, tooling, and tooling

57
End
58
Other Open Ontological Questions: What are the
alternatives?
  • Normal, abnormal, pathological, conformant,
    present, absent
  • Handling of collective and mass phenomena
  • Cells, tissues, substances, mixtures, etc.
  • Homogeneity vs Heterogeneity
  • How to represent "all of the same kind" vs
    "of two different kinds"
  • What does a code really stand for?
  • DisorderX
  • having DisorderX
  • Analysis and specification of may, typically,
    etc.