An Ontology-based Metadata Management System for Heterogeneous Clinical Databases

1
An Ontology-based Metadata Management System for
Heterogeneous Clinical Databases
  • A CS590L Project Proposal
  • By Quddus Chong
  • (qkc509_at_umkc.edu)
  • UMKC-SICE
  • January 21, 2002

2
Outline
  • Towards a clinical data warehouse
  • Integrating heterogeneous data sources
  • Clinical abstractions as Ontologies
  • Managing database metadata
  • The data mediator approach
  • Using Protégé-2000

3
Towards a Clinical Data Warehouse
  • Clinical Data Warehousing is the application of
    Data Warehousing concepts to allow clinical data
    about a large patient population to be analyzed
    to perform clinical quality management and
    medical research.
  • In a data warehouse environment, data has the
    following properties:
  • Data is organized by subject, or domain-level
    concepts, rather than by function.
  • Data from various operational systems is
    integrated, by definition or by content.
  • Data is archived in non-volatile storage to allow
    temporal analysis.
  • Data is recorded with a temporal dimension (e.g.
    timestamp)
  • Data is optimized for decision making (DSS) or
    analysis (OLAP).

4
Integrating Heterogeneous Data Sources
  • The main challenge in integrating data from
    heterogeneous sources is in resolving schema and
    data conflicts.
  • Approaches to this problem include using a
    federated database architecture, or providing a
    multi-database interface. These approaches are
    geared more towards providing query access to the
    data sources than towards supporting analysis.
  • Types of data integration:
  • Physical integration: convert records from
    heterogeneous data sources into a common format
    (e.g. XML).
  • Logical integration: relate all data to a common
    process model (e.g. a medical service like
    "diagnose patient" or "analyze outcomes").
  • Semantic integration: allow cross-referencing of,
    and possibly inferencing over, data with regard
    to a common metadata standard or ontology (e.g.
    HL7 RIM, DAML+OIL).
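
As an illustration of physical integration, the sketch below converts records from two sources into one common XML format. It is a minimal example under stated assumptions: the source names, field names, and element names (`patientId`, `systolicBP`) are hypothetical and do not come from the presentation.

```python
import xml.etree.ElementTree as ET

# Hypothetical records from two sources that name the same fields differently.
source_a = {"pat_id": "A-17", "sys_bp": "128"}
source_b = {"PatientNo": "B-03", "Systolic": "141"}

# Hypothetical per-source field maps onto one shared element vocabulary.
FIELD_MAPS = {
    "source_a": {"pat_id": "patientId", "sys_bp": "systolicBP"},
    "source_b": {"PatientNo": "patientId", "Systolic": "systolicBP"},
}


def to_common_xml(source_name, record):
    """Convert one source record into the common XML element layout."""
    field_map = FIELD_MAPS[source_name]
    elem = ET.Element("observation", {"source": source_name})
    for field, value in record.items():
        # Each source field becomes a child element named by the shared vocabulary.
        child = ET.SubElement(elem, field_map[field])
        child.text = str(value)
    return elem


xml_a = ET.tostring(to_common_xml("source_a", source_a), encoding="unicode")
xml_b = ET.tostring(to_common_xml("source_b", source_b), encoding="unicode")
print(xml_a)
print(xml_b)
```

Both records end up with the same element names, so downstream tools only need to understand one format. This resolves naming conflicts only; it does not by itself address differences in units or coding schemes.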

5
Clinical abstractions as Ontologies
  • An ontology is an explicit specification of the
    conceptualization of a domain. Information
    models (such as the HL7 RIM) and standardized
    vocabularies (such as UMLS) can be part of an
    ontology. An ontology provides a core component
    in a Knowledge-Based System.
  • In the clinical research field, ontologies have
    been used in computerized guideline modeling.
    This allows the development of applications that
    provide recommendations (e.g. indications for
    the use of surgical procedures), identify
    deviations in practice, and perform screening
    (e.g. evaluating patient eligibility).
  • Benefits of using ontologies include:
  • Facilitating sharing between systems and reuse of
    knowledge
  • Aiding new knowledge acquisition
  • Improving the verification and validation of
    knowledge-based systems.

6
Managing database metadata
  • Metadata is the detailed description of the
    instance data: the format and characteristics of
    the populated instance data, with instances and
    values dependent on the requirements/role of the
    metadata recipient.
  • Metadata is used in locating information,
    interpreting information, and
    integrating/transforming data.
  • Being able to maintain a well-organized and
    up-to-date collection of the organization's
    metadata is a great step towards improving
    overall data quality and usage. However, this
    task is complicated by the varying quality and
    formats of metadata available (or not) from the
    heterogeneous data sources, and by the need for
    consistency when updating existing metadata.
  • A common metadata architecture is essential to
    keeping data manageable.

7
The Data Mediator approach
  • In this project, we will attempt to develop an
    extensible and adaptable architecture to perform
    integration of heterogeneous data sources into a
    data warehouse environment using an
    ontology-based data mediator approach.
  • The components of this architecture include:
  • Knowledge base: stores the ontology, which
    consists of:
  • The abstraction model: domain-level concepts
  • The database description model: metadata record
    of data sources
  • The mappings model: how data elements relate to
    attributes in the abstraction model
  • The transformations model: metadata of available
    methods to transform data elements from one data
    source to another
  • Data mediators: provide each data source with an
    interface to the warehouse and resolve data
    conflicts between differing representations; the
    necessary classes are generated from the ontology.
  • Data warehouse: provides access to integrated
    data for analysis and decision-making.
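
The four knowledge-base models and a data mediator can be sketched as follows. This is a minimal illustration, not the proposal's design: the source names (`clinic_db`, `lab_db`), the field names, the `Mapping` class, and the `mediate` function are all hypothetical, and plain Python dictionaries stand in for an ontology.

```python
from dataclasses import dataclass

# Abstraction model: domain-level concepts and their attributes (hypothetical).
ABSTRACTION_MODEL = {"Patient": ["id", "weight_kg"]}

# Database description model: metadata record of each data source (hypothetical).
DATABASE_DESCRIPTIONS = {
    "clinic_db": {"pid": "TEXT", "wt_lb": "TEXT"},
    "lab_db": {"patient_no": "TEXT", "mass": "TEXT"},
}

# Transformations model: available methods for converting data elements.
TRANSFORMATIONS = {
    "identity": lambda value: value,
    "to_float": float,
    "lb_to_kg": lambda value: round(float(value) * 0.45359237, 1),
}


@dataclass(frozen=True)
class Mapping:
    """Mappings model entry: relates a source field to an abstraction attribute."""
    source_field: str
    concept: str
    attribute: str
    transformation: str


MAPPINGS = {
    "clinic_db": [
        Mapping("pid", "Patient", "id", "identity"),
        Mapping("wt_lb", "Patient", "weight_kg", "lb_to_kg"),
    ],
    "lab_db": [
        Mapping("patient_no", "Patient", "id", "identity"),
        Mapping("mass", "Patient", "weight_kg", "to_float"),
    ],
}


def mediate(source, record):
    """Translate one source record into abstraction-model terms."""
    result = {}
    for m in MAPPINGS[source]:
        # Check the mapping against both the source description and the abstraction model.
        if m.source_field not in DATABASE_DESCRIPTIONS[source]:
            raise ValueError(f"{source} has no field {m.source_field!r}")
        if m.attribute not in ABSTRACTION_MODEL[m.concept]:
            raise ValueError(f"{m.concept} has no attribute {m.attribute!r}")
        convert = TRANSFORMATIONS[m.transformation]
        result.setdefault(m.concept, {})[m.attribute] = convert(record[m.source_field])
    return result


from_clinic = mediate("clinic_db", {"pid": "17", "wt_lb": "150"})
from_lab = mediate("lab_db", {"patient_no": "17", "mass": "68.0"})
print(from_clinic)
print(from_lab)
```

The two sources record weight in different units and under different field names, yet both mediated records have the same shape. Because the mediator is driven by the models rather than by hard-coded logic, supporting a new source in this sketch means adding entries to the models, not writing a new mediator.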

8
A prototype architecture
  • Diagram components: Source db 1 and Source db 2
    (Relational DBMS, e.g. MySQL; Object-Relational
    DBMS, e.g. PostgreSQL), Mediator Interface 1 and
    Mediator Interface 2, Ontology Server, Warehouse
    Mediator, and Target db (Data Warehouse
    environment, e.g. SQL Server).
  • Ontology Server contents: Abstractions, Data
    Descriptions, Data Mappings, Transformation
    Descriptions.
  • Ontologies can be created and modified via the
    Protégé-2000 tool; the underlying format is RDF.
  • Possible use of JDBC metadata to obtain db
    descriptions; alternatively, a common metadata
    exchange standard such as XMI could be used.
  • The abstraction model in the ontology is
    extensible to any domain.
  • Possible use of XSLT to perform data
    transformations.
  • XML data binding could be used to generate APIs
    for data validation or transformation.
  • Key goal: develop the ontology server as a
    component, using EJB or .NET.

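
The idea of obtaining database descriptions from driver-level metadata can be illustrated with a short sketch. The slide suggests JDBC metadata; the example below swaps in Python's built-in `sqlite3` module and SQLite's `PRAGMA table_info` as a stand-in so that it runs without a database server. The `patient` table and the `describe_database` function are hypothetical.

```python
import sqlite3


def describe_database(conn):
    """Return a {table: [column descriptions]} record for a SQLite connection."""
    description = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        description[table] = [
            {"name": col[1], "type": col[2], "primary_key": bool(col[5])}
            for col in columns
        ]
    return description


# Hypothetical source database, created in memory for the example.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE patient (id INTEGER PRIMARY KEY, name TEXT, admitted TEXT)"
)
description = describe_database(conn)
print(description)
```

A record like this could populate the Data Descriptions part of the ontology server. With a different DBMS, the catalog queries would differ, but the resulting description could keep the same shape.
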
9
Using Protégé-2000
  • Protégé-2000 is an experimental
    knowledge-acquisition tool, written in Java, that
    allows users to import, export and create their
    own ontologies.
  • The tool itself is extensible; a programming
    development kit is available with instructions on
    creating plug-ins:
  • Tabs: user interface between an ontology model
    in Protégé and another knowledge-based
    application.
  • Slot widgets: user interface for viewing and
    acquiring slot values for new instances.
  • Backend plug-ins: specify the mechanism that
    Protégé-2000 will use to store the ontology.

10
Screenshot: Creating the classes and slots of an
ontology

11
Screenshot: Viewing the newly created ontology
model

12
References
  • Pedersen T. B., Jensen C. S., Research Issues in
    Clinical Data Warehousing. In Proceedings of the
    10th International Conference on Scientific and
    Statistical Database Management, pp. 43-52, July
    1998 (available online:
    http://citeseer.nj.nec.com/pedersen98research.html)
  • Critchlow T., Ganesh M., Musick R., Meta-Data
    Based Mediator Generation. In Proceedings of the
    3rd IFCIS Conference on Cooperative Information
    Systems, August 1998 (available online:
    http://citeseer.nj.nec.com/critchlow98metadata.html)
  • Tu S. et al., A Flexible Approach to Guideline
    Modeling. AMIA Annual Symposium, 1999 (available
    online:
    http://smi-web.stanford.edu/pubs/SMI_Abstracts/SMI-1999-0793.html)