Clinical Trial Ontology Achieving Consensus - PowerPoint PPT Presentation


About This Presentation


Slides: 85
Provided by: Barry137


Transcript and Presenter's Notes

Title: Clinical Trial Ontology Achieving Consensus

Clinical Trial Ontology: Achieving Consensus
  • Barry Smith

Clinical Trial Ontology and Clinical Trial
Information Models: Achieving Consensus
  • Barry Smith

Two types of ontology
  • Natural-science ontologies
  • capture terminology-level knowledge used by best
    current science
  • vs.
  • Administrative ontologies
  • for billing, bloodbank, lab workflow
  • research informatics / healthcare informatics

Scientific ontologies have special features
  • each term must be such that the developers of
    the ontology believe it to refer to some entity
    on the basis of the best current evidence

For scientific ontologies
  • reusability by different groups of users and for
    different purposes is crucial
  • compatibility with neighboring scientific
    ontologies is crucial

For scientific ontologies
  • the issue of how the ontology will be used is
    not a factor relevant for determining which
    entities will be acknowledged by the ontology
  • If ontologies are built to address specific
    logical practical purposes, this will thwart
    their purpose to foster global communication

Hence scientific ontologies
  • are relatively unsophisticated from a
    computational point of view

For administrative ontologies
  • entities may be brought into existence by the
    ontology itself, or by administrative policy
    (surgical procedure not performed because of
    patient request)
  • The ontology is task-dependent; reusability and
    compatibility with other ontologies not (always) ...

Scientific ontologies
  • utilize the scientific method
  • are incremental and cumulative
  • rest on empirical testing, openness and public
    criticism
  • should be modular, but interdependent
  • ----- descriptive
Administrative ontologies
  • employ the methods of administrators
  • can rest on commanding and cajoling
  • can create law
  • -----

maps have legends
scientific ontologies are legends for data
legends for images
ontology terms: natural language labels
  • organized in a graph-theoretic structure,
    designed to make data
  • cognitively accessible to human beings
  • algorithmically accessible to machines
  • and linked up to other data resources because the
    same labels have been used

Knowledge Environments for Biomedical Research
  • 1. Sustainability
  • 2. Adaptability
  • 3. Interoperability
  • 4. Evolvability

through annotations in science-based ontologies
e.g. of phenotypic qualities
  • the results of clinical trials can be integrated
    together and made available for querying on
    disease terms
  • logical structure of ontology should help to
    ensure quality of annotations and enhance
    effectiveness of queries (head ? cranium)
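How an ontology's logical structure can enhance queries over annotated results may be easier to see in a small sketch. Everything below is illustrative: the toy term hierarchy, the trial result identifiers, and the assumption that cranium is linked to head by a parent relation are hypothetical and not taken from any real ontology.

```python
# Hypothetical toy ontology: child term -> set of parent terms.
# The links stand in for is_a / part_of relations; they are assumptions.
PARENTS = {
    "cranium": {"head"},
    "head": {"body"},
    "femur": {"leg"},
    "leg": {"body"},
}

# Hypothetical annotations: result id -> ontology term used to annotate it.
ANNOTATIONS = {
    "trial-A/result-1": "cranium",
    "trial-B/result-7": "head",
    "trial-C/result-2": "femur",
}


def ancestors(term):
    """Return the term itself plus every term reachable via parent links."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS.get(current, ()))
    return seen


def query(term):
    """Return ids of results annotated with `term` or any more specific term."""
    return sorted(rid for rid, t in ANNOTATIONS.items() if term in ancestors(t))
```

In this sketch a query on "head" also returns the result annotated with "cranium", which a plain string match on labels would miss.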

The methodology of annotations
  • is proving very successful in domains such as
    molecular biology
  • is empirical, modular, incremental
  • truly works with very many kinds of heterogeneous
    data
  • supports data integration without the need for
    expensive data standardization
  • Can it be extended to clinical research, which
    spans the boundary between biological science and

There can be scientific ontologies
  • of administrative domains
  • Reflecting the ways administrative domains
    actually are, rather than imposing rules as to
    how they should be
  • There can be scientific ontologies of data and
    information objects.

  • that part of the OBO Foundry which covers the
    broadly administrative/informational aspects of
    clinical research
  • works with interoperable Foundry ontologies to
    cover the biological, clinical parts
  • (need additional ontologies for clinical
    procedures, drugs, drug effects ...)

Scientific ontology approach to a scientific
  • Clinical trials are scientific experiments
  • cohort, randomization, placebo, response,
    efficacy, control, protocol, null hypothesis,
    confidence interval, finding, biomarker, primary
    outcome, secondary outcome, intervention group,

  • The Ontology of Experiments
  • L. Soldatova, R. King

EXPO: Formalisation of Science
  • The goal of science is to increase our knowledge
    of the natural world through the performance of
  • This knowledge should, ideally, be expressed in a
    formal logical language, to promote the free
    exchange of scientific knowledge and simplify
    scientific reasoning.

EXPO Experiment Ontology
equipment part_of experimental design (confuses
object with specification)
  • Clinical trials are in part administrative
    entities (analogous to bank accounts)

There can be a scientific ontology approach to an
administrative domain
  • cohort, randomization, protocol, group,
    intervention, placebo, response, efficacy,
    control, null hypothesis, confidence interval,
    finding, biomarker, primary outcome, secondary
    outcome, confounder

CTO, EXPO, OBI work like the Gene Ontology
  • all are built to reflect a pre-existing domain
  • we have data
  • we need to make it available for semantic search
    and algorithmic processing
  • we create a consensus-based ontology for
    annotating the data
  • and make it work with ontologies for neighboring
  • No role for administrative fiat in transforming
    the object-domains of these ontologies

General recommendation
  • rely as far as possible on the methodology of
    scientific ontologies
  • The greater the degree to which your data
    regimentation relies on good science (including a
    good scientific understanding of scientific
    experimentation itself), the more likely is this
    regimentation to support good science in the
    future and to survive the changes brought by
    scientific advance

(Modest) Goals of the Clinical Trial Ontology
  • To fully and faithfully capture the types of
    entities and relationships involved in clinical
    trials of any experimental design comprehending
    terms drawn from resources such as the CDISC
    glossary and all terms needed for the task of
    meta-analysis of clinical trials
  • To organize these terms in a structured way,
    providing definitions and logical relations
    designed to enhance retrieval of, reasoning with,
    and integration of the data annotated in its
    terms, in ways which will support trial bank

In the domain of clinical research, scientific
ontology is not enough
  • 1. privacy, security, liability
  • 2. incentive
  • 3. regulatory issues
  • 4. trial management

Why (Data) Standards?
  • Improves time to market for safe and effective
    treatments (increased patient safety and reduced
  • Improves efficiency of evaluation of safety and
    efficacy of investigational treatments
  • Facilitates communication between regulatory
    authority and applicant
  • Facilitates development of efficient review
    environment (e.g., training, analysis tools)

Efficient Review Environment
  • Standards provide
  • common structure and terminology
  • single data source for review (less redundant
  • Standards allow
  • use of common tools and techniques
  • common training
  • single validation of data

Problems with standards
  • standards have costs
  • people will need to transform their data
    management habits in a costly process

Problems with standards
  • Not all ISO standards are of high quality
  • ISO 15926 Oil and gas ontology
  • ISO 18308 Health informatics
  • Once a bad standard is set in stone you are
    creating problems for your children and for all
    your children's children

The administrative approach
  • a standard is built to change the domain, e.g. in
    order to make data available for a clinical data
    repository such as Janus
  • the standard here performs double duty: it needs
    both to reflect existing work processes, and to
    transform them into something more
  • much more difficult to achieve, especially on a
    large scale

Patient Record
Given the complexity of the clinical trial domain
  • and given features such as security of patient
    data, role of regulatory bodies, payers, funders,
  • both approaches are needed
  • scientific approach e.g. for clinical terminology
    (SNOMED, ...)
  • administrative approach e.g. for billing,
    ordering ...

Clinical trial research
  • will need to have a clinical trial ontology that
    will work together with ontologies for other
    domains within biomedicine

Alan Rector: Medical IT's odd organisational
Separate development of medical ontologies /
terminologies such as SNOMED and medical
information models such as HL7 RIM
Rector, et al., Binding Ontologies and Coding Systems to
Electronic Health Records and Messages
  • Statements about the world: "All diabetes are
    metabolic diseases"; "John has diabetes and it is
    brittle and long standing"
  • Valid specifications for data structures: "Valid
    diabetic data structures have a diagnosis code
    that is diabetes or one of its subcodes"
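The data-structure rule quoted above (a diagnosis code that is diabetes or one of its subcodes) can be read as a subsumption check over a code hierarchy. The sketch below is only an illustration of that reading: the code names, the `PARENT` table, and the record layout are hypothetical and do not come from SNOMED, HL7, or Rector's paper.

```python
# Hypothetical code hierarchy: code -> parent code (None marks a root).
PARENT = {
    "diabetes": None,
    "type-1-diabetes": "diabetes",
    "type-2-diabetes": "diabetes",
    "brittle-type-1-diabetes": "type-1-diabetes",
    "asthma": None,
}


def is_code_or_subcode(code, target):
    """Walk up the hierarchy from `code`; True if `target` is reached."""
    while code is not None:
        if code == target:
            return True
        code = PARENT.get(code)  # unknown codes have no parent
    return False


def is_valid_diabetic_record(record):
    """A record is valid if its diagnosis code is diabetes or a subcode."""
    return is_code_or_subcode(record.get("diagnosis_code"), "diabetes")
```

Note that the check says something about which records are well formed, not about what is true of any patient; that is the distinction the slide draws.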
Ontologies vs. data structures
  • Ontology: all persons have a sex
  • Data structure: may have a header for people but
    no field for sex
  • The data structure determines a world in which
    people have no sex.
  • Valid data structures can be exhaustively and
    completely described (closed world assumption)
  • Ontologies are intrinsically open: they must be
    able to change with new scientific discoveries
  • We can never describe the world completely
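The contrast between the closed world of a data structure and the open world of an ontology can be sketched as two ways of answering the same question. This is a minimal illustration under stated assumptions: the triple representation and both function names are hypothetical, and real reasoners are far more involved.

```python
from typing import Optional, Set, Tuple

Fact = Tuple[str, str, str]


def closed_world_holds(facts: Set[Fact], fact: Fact) -> bool:
    """Closed world: whatever is not recorded is taken to be false."""
    return fact in facts


def open_world_holds(
    facts: Set[Fact], negated: Set[Fact], fact: Fact
) -> Optional[bool]:
    """Open world: a fact is true if asserted, false if its negation is
    asserted, and otherwise unknown (None)."""
    if fact in facts:
        return True
    if fact in negated:
        return False
    return None
```

Under the closed reading, a record with no sex field yields "this person has no sex"; under the open reading the answer is simply unknown.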

Purpose of an Information Model like HL7
  • to specify valid data structures to carry
    information valid, for example, for billing or
    regulatory purposes
  • to constrain the data structures to just those
    which a given software system can process

The Term Binding Problem
closed world open world

What happens when scientific methods change, for
example new paradigms for testing drug efficacy
are developed?
Administrative ontologies vs. scientific
  • CDISC / BRIDG vs. CTO / OBO Foundry
  • Both share at least one goal in common: to
    support high-level information-based scientific
    research and clinical care
  • Can we make the two support each other?
  • Do we really need both?
  • Does CDISC / BRIDG already include an ontology of
    clinical trials?

  • Does the BRIDG/HL7-style methodology work, given
    the problems of achieving compromise and of
    maintaining consistency? (... sort of fits
    together ...)
  • Does the OBO / CTO-style methodology not have
    analogous problems if all those separate modules
    and separate groups are truly to work together?

CDISC: Clinical Data Interchange Standards Consortium
  • creating worldwide industry standards to support
    the electronic acquisition, exchange, submission
    and archiving of clinical trials data and
    metadata for medical and biopharmaceutical
    product development

CDISC, HL7, and caBIG collaboration
The Cancer Biomedical Informatics Grid Structured Protocol

The Clinical Data Interchange Standards Consortium (CDISC)
Domain Space Analysis Modeling (DSAM) effort is a strategic
initiative between CDISC, Health Level Seven (HL7) and caBIG
to develop, support, influence and harmonize standards for
data representation and exchange in clinical research. A
requirement for seamless syntactic and semantic
interoperability in clinical research is a common, shared
data representation. Such an exchange standard will
facilitate data sharing and help to speed the delivery of
innovative solutions to the cancer patient based on
biological research, improve the efficiency and timeliness
of data reporting, enhance patient safety during clinical
trials, and improve the shared care of oncology patients.

Currently, there are two major standards for data exchange
in healthcare. The CDISC group has focused on data exchange
and reporting among pharmaceutical companies engaged in
clinical research, while HL7 has developed healthcare
messaging standards for all aspects of patient care. The
caBIG community is playing an aggressive part in the
activities associated with harmonizing these two models, and
constructing a common model that can be shared among CDISC,
HL7, and caBIG members. Specifically, caBIG participants
have:
  • 1. Participated in development of the CDISC Clinical
    Trials DSAM, a UML class model representing the common
    shared concepts of clinical research across the key
    communities: pharmaceutical companies, federal agencies
    and academic medical centers.
  • 2. Supported the mapping of the CDISC DSAM to the HL7 v3
    Reference Information Model (RIM) at the detail
    attribute level. This will ensure that all caBIG and
    CDISC data exchange requirements are addressed by the
    HL7 v3 messaging standards. Concepts identified in CDISC
    modeling may uncover new additions for the RIM, in turn
    making the RIM more robust.
  • 3. Participated in the CDISC Protocol Representation
    Group to develop a common terminology and vocabulary in
    support of clinical research data exchange standards.
    The CDISC vocabulary is being maintained in the NCI's
    Cancer Data Standards Repository (caDSR) and will
    harmonize with the NCICB's Common Data Elements (CDEs).

This presentation focuses on the first of these three
activities: the development of the UML DSAM, representing
many hours of intensive interactive modeling by domain
experts and modeling experts from pharmaceutical companies,
HL7, and caBIG. The goal of the model is not only to
facilitate seamless data exchange, but also to explore a
computable representation of a clinical trial for planning,
execution and analysis of clinical trials. In addition to
data exchange, it is hoped this shared model will form the
backbone for the applications developed in caBIG's Clinical
Trials Management Systems (CTMS) domain-specific workspace.
Such applications could include advanced protocol authoring
tools, tools for data reporting and submissions, clinical
trials management, and interfaces to clinical systems.

This initiative furthers the caBIG vision to provide a
common informatics platform to exchange standardized data
between disparate systems to support the cancer research
community. The focus to promote data standardization and
ability to exchange data seamlessly among the various
stakeholders will put critical information in the hands of
researchers, thus enabling them to broaden the scope of
their research and allow hitherto unaddressable research
directions to be pursued.

Increasing specificity and computability

Participants: Christo Andonyadis, Greg Anglin, Lisa
Chatterjee, Julie Evans, Douglas B Fridsma, Smita Hastak,
Ray Heimbuch, Charlie Mead, Joyce Niland, John Speakman,
Cara Willoughby, Diane Wold

A gigantic integration challenge
  • what shall serve as benchmark for integration?
  • for some things (drive on right!) no benchmark is
    needed; the choice can be arbitrary
  • for most things we need rules which must be
    somehow grounded in a larger, intuitively
    intelligible framework
  • OBO Foundry: the real world (our best current
    scientific knowledge) (is already integrated)

  • UML primarily designed as a language to be used
    by humans to document and communicate software
  • Thus the main notation of UML is a graphical model
    rather than a formal language.
  • UML is the union of different preexisting
    proposals aiming at maximal expressivity. Thus it
    uses different interacting model types each
    covering a specific aspect of the overall system.

  • Used as a documentation, design, analysis,
    communication, clarification, collaboration tool.
    Typically for complicated interactions between
    components, or when a method/object is too big to
    hold in your mind.

Advantages of UML
  • single standard framework including facilities to
    represent both static and dynamic features of
  • well-connected to software development
  • can do justice to the way in which in a domain
    like clinical trials dynamic aspects are
    increasingly both inside and outside the computer
  • advances software coupling

UML consistency/validation
  • UML provides several kinds of diagrams to model
    the behavior and structure of a system under
    development. A consistency problem can arise due
    to the fact that some aspects of the model may be
    described by more than one diagram.

Add to UML the dimension of HL7
  • ANSI / ISO standard
  • provides a common, highly general benchmark
  • relatively stable, hard to change, large body of
  • provides the needed pathway for integration
  • has created incentives for use (NCI)

Good parts of HL7
  • are based in reality: Clinical Document
    Architecture (CDA)

Problems of consistency of not so good parts of
  • Sometimes Act means real-world action
  • Sometimes Act means information object act
    qua documented, either
  • in an actual document (reflecting the subjective
    view of some author), or
  • in some postulated HL7-conformant absolute
  • Recall double role of ontology and prescriptive

HL7 Axiom
  • ... there is no distinction between an activity
    and its documentation. Every Act includes both to
    varying degrees.
  • (RIM 3.1.1)

For the clinical research domain
  • Reports
  • Observations
  • Entities observed
  • Adverse event vs. observation of adverse event
  • events and observations may take place on
    different continents
  • use-case for CTO (what CTO can add to RCT Schema)

Problems of coordination of large standards
development efforts
  • As different people edit parts of the HL7
    specification, inconsistencies in form and
    quality may emerge; as some ambiguities are
    clarified, other previous systematic ideas may be
    corrupted and well-meant glossary entities may
    cause confusion.
  • Gunther Schadow, MIE 2006, p. 154

  • On the one hand HL7 is to facilitate agreement on
    consistent meanings across the entire range of
    clinical domains.
  • On the other hand HL7's own collaborating authors
    cannot reach agreement even amongst themselves.
  • Yet the resultant inconsistent normative
    recommendations are approved by ISO as an
    international standard

Everything in HL7 is an act, an entity, or a
  • Definition of Entity
  • A physical thing, group of physical things or an
    organization capable of participating in Acts,
    while in a role. (HL7V3 Ballot 2006-9 RIM)
  • Entity: persons, places, organizations, material

Problems of scope
  • No processes (verb-like items) outside Act
  • How can HL7 deal with disease processes, drug
    interactions, traffic accidents, snake bites,
    other adverse events?
  • Answer: it identifies them with acts of

Problems of scope
  • No things (noun-like items) outside Entity
  • How can HL7 deal with wounds, fractures?

Diseases in HL7
  • ... are not Acts
  • ... are not Entities
  • ... are not Roles, Participations, Role-Links
  • So what are they?
  • Answer: they are Acts of Observation
  • A case of pneumonia is an Act of Observation of
    a case of pneumonia

To solve these problems of scope
  • add to HL7 new upper-level types
  • process (with act as sub-type)
  • condition (with disease ... as subtype)
  • drug interactions are processes, not acts
  • diseases are conditions on the side of patients,
    not acts of observation by clinicians
  • accepted by some inside HL7
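The proposed upper-level additions can be pictured as a small type hierarchy. The sketch below is illustrative only: the class names follow the slide, but the Python rendering, the constructor fields, and the `ObservationAct` class are assumptions and are not part of HL7 or the RIM.

```python
class Process:
    """Anything that unfolds in time (proposed upper-level type)."""


class Act(Process):
    """A process performed by an agent; a sub-type of Process."""


class DrugInteraction(Process):
    """A process on the side of the patient, not an act."""


class Condition:
    """A state on the side of the patient (proposed upper-level type)."""


class Disease(Condition):
    def __init__(self, name):
        self.name = name


class ObservationAct(Act):
    """A clinician's act of observing; it refers to what is observed
    instead of being identified with it."""

    def __init__(self, observer, observed):
        self.observer = observer
        self.observed = observed
```

With these types, a case of pneumonia and the act of observing it are two distinct objects linked by a reference, so each can be given its own place and time.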

The HL7 balloting methodology
  • vs. scientific method
  • welcoming of open criticism
  • secondary literature
  • incremental testing
  • cumulation of results

BRIDG: Biomedical Research Integrated Domain Group
  • a domain analysis model representing
    protocol-driven biomedical/clinical research. It
    was developed to provide an overarching model
    that could readily be understood by domain
    experts and would provide the basis for
    harmonization among standards within the clinical
    research domain and between biomedical/clinical
    research and healthcare.

  • The BRIDG model defines standard entities,
    roles, attributes, and activities for the
    business processes in standard clinical trials.
    It could be used as a core data standard for
    managing the workflow in clinical trials and for
    generating clinical trial software applications
    that share the same semantics and thus can
    exchange data more readily.

  • How is BRIDG different from previous structured
    clinical trial protocol models?
  • most previous protocol models are
  • document-oriented
  • focused on protocol knowledge structure
  • built through a specific knowledge representation
  • In contrast, BRIDG is independent from any
    implementation software and defines only
    semantics for the clinical trial research domain.
    Therefore, BRIDG has the potential to facilitate
    clinical trials across organizations and software

the BRIDG model
  • The development of BRIDG was initiated by the
    CDISC Consortium and guided by the HL7
    Development Framework (HDF) methodology.
  • Purpose: to create computable interoperability
  • support software authoring and integration

Business case
  • the FDA and the entire pharma industry see that
    they can benefit from some standard
  • industry will be motivated to create software to
    make the standard useable even by those who do
    not understand it

What happens when scientific methods change, for
example new paradigms for testing drug efficacy
are developed?
what is the incentive for all this modeling?
  • Bob: There needs to be a business case to justify
    doing the mapping. The most powerful business
    case would be that the NCI required cancer
    centers to do it.

BRIDG p. 670
  • DCI Definition v 2.4.6 ExplicitItemRefRepetition
  • Type: Class
  • Status: Proposed. Version: 1.0. Phase: 1.0.
  • Package: DCI Definition v 2.4.6
  • Details: Created on 7/8/2005 5:09:15 PM.
    Modified on 7/13/2006 9:01:27 AM. Author: Don
  • This class represents a repeat of an ItemRef
    within a GroupRef. If the GroupRef owning an
    ItemRef is non-repeating (i.e., if its
    MaximumItemRefRepeats is zero), no
    ExplicitItemRefRepetitions are allowed. If
    repeating is enabled for the GroupRef, an
    ExplicitItemRefRepetition is still not mandatory
    for every repetition of every ItemRef in the
    GroupRef. An ExplicitItemRefRepetition is
    necessary only if either (or both) of these
    reasons obtain:
    o you want to specify a default value for the
      repeat,
    o you want to specify a Triggered Action for the
      repeat.
    Repeats of its ItermRefs without
    ExplictItemRefRepetitions are called implicit
    repeats. If you supply
    ExplicitItemRefRepetitions with
    RepeatSequenceNumber greater than the GroupRef's
    MaximumItemRefRepeats, the excess repetitions
    will be ignored.
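Two of the rules in the excerpt above are concrete enough to restate as a short sketch: explicit repetitions are disallowed when MaximumItemRefRepeats is zero, and repetitions numbered above the maximum are ignored. The function name, the argument shapes, and the choice to raise an error for the disallowed case are assumptions made for illustration; the BRIDG text does not say how a violation is handled.

```python
def effective_explicit_repetitions(maximum_item_ref_repeats, repeat_sequence_numbers):
    """Return the RepeatSequenceNumbers that remain in effect.

    Assumption: supplying explicit repetitions for a non-repeating
    GroupRef is treated as an error; the source only says they are
    "not allowed".
    """
    if maximum_item_ref_repeats == 0:
        if repeat_sequence_numbers:
            raise ValueError(
                "no ExplicitItemRefRepetitions allowed for a non-repeating GroupRef"
            )
        return []
    # Excess repetitions (numbered above the maximum) are ignored.
    return [n for n in repeat_sequence_numbers if n <= maximum_item_ref_repeats]
```

Repeats that never appear in the input list correspond to what the excerpt calls implicit repeats.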

Achieving consensus
  • OBO Foundry / OBI / CTO modular, incremental,
    modest, connected to reality
  • terms are included only if they have instances
  • Identify corresponding modules within BRIDG, and
    test for alignment with CTO
  • non-performed performed observation
  • disease response is_a assessment
  • Are there corresponding items within CTO? If so,
    they will be fixed immediately.

  • the balance of opposing powers often keeps each
    within reason and sanity
  • (John Stuart Mill)