Transitioning Relational Databases to Ontologies Farid Cerbah Dassault Aviation farid.cerbahdassault - PowerPoint PPT Presentation

1
Transitioning Relational Databases to Ontologies
Farid Cerbah
Dassault Aviation
farid.cerbah_at_dassault-aviation.fr
2
Outline
  • Problem statement
  • Previous work
  • The RDBToOnto tool and the RTAXON method
  • Improving the process through database
    optimisation
  • A case study in aircraft maintenance
  • Extending RDBToOnto
  • Conclusion

3
Problem statement
  • Relational databases are valuable heterogeneous
    sources for ontology learning
  • Better accuracy can be expected than from text
    corpora
  • Ontology learning from relational databases is
    not a new research issue
  • Limitations of existing support
  • Problem often restricted to finding automated
    ways to import tables into ontologies
  • Derivation of ontologies with flat structure that
    look like the source databases
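
To make the "flat structure" limitation concrete, here is a minimal sketch of a purely schema-driven import: each table becomes a class, each row an instance, each column a literal-valued property. The table, the column names and the tuple-based triples are assumptions made for illustration; no real RDF library or existing tool is being reproduced.

```python
def flat_import(table_name, columns, rows, key):
    """Schema-driven import: table -> class, row -> instance, column -> property."""
    triples = [(table_name, "rdf:type", "owl:Class")]
    for row in rows:
        record = dict(zip(columns, row))
        instance = f"{table_name}_{record[key]}"
        triples.append((instance, "rdf:type", table_name))
        for column, value in record.items():
            # Every non-key column ends up as a literal-valued property.
            if column != key and value is not None:
                triples.append((instance, column, value))
    return triples


# Hypothetical table of aircraft access zones.
triples = flat_import(
    "AccessZone",
    ["id", "name", "kind"],
    [(1, "342EZ", "Panel"), (2, "rear under pylon", "Fairing")],
    key="id",
)
classes = [s for s, p, o in triples if o == "owl:Class"]
```

The result contains a single class: the `kind` column, which could distinguish panels from fairings, stays a plain literal, so the ontology mirrors the table layout.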

4
Our contribution
  • RDBToOnto Platform
  • Comprehensive software support for learning
    fine-tuned ontologies
  • A framework that eases the development of, and
    experimentation with, transitioning methods
  • RTAXON Method
  • To uncover taxonomies hidden in the data

5
A motivating example
Typical mappings covered by several methods
6
Previous work (1)
  • RDB -> Ontology Transformation
  • Database Reverse Engineering
  • Many transformation rules from this domain are
    reused for ontology learning
  • [Behm et al. 1997], [Ramanathan & Hodges 1997]
  • Approaches mostly based on an analysis of the RDB
    schema
  • Data correlations are considered but
  • with the restriction "Data Key Values"
  • Key inclusion may express inheritance
  • Exploiting null value semantics [Lammari et al.
    2007]
  • Partitioning of a table on the basis of null
    values may reveal concept hierarchies
  • Involves data from non-key attributes
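
The two data-driven clues listed above can be sketched as follows. This is only an illustration of the ideas (key inclusion as an inheritance hint, and partitioning a table by which attributes are null); it is not the algorithm of any of the cited works, and the table contents are hypothetical.

```python
from collections import defaultdict


def key_inclusion(child_keys, parent_keys):
    """True when every key of the child table also appears in the parent table."""
    return set(child_keys) <= set(parent_keys)


def partition_by_null_pattern(columns, rows):
    """Group rows by the set of columns that actually hold a value."""
    groups = defaultdict(list)
    for row in rows:
        filled = frozenset(c for c, v in zip(columns, row) if v is not None)
        groups[filled].append(row)
    return dict(groups)


# Hypothetical data: every key of a "Door" table is also a key of "Part".
is_subclass_hint = key_inclusion([2, 3], [1, 2, 3, 4])

# Hypothetical table where some attributes only apply to some rows.
columns = ["id", "hinge", "thickness"]
rows = [(1, "left", None), (2, "right", None), (3, None, 2.5)]
groups = partition_by_null_pattern(columns, rows)
```

Each distinct null pattern in `groups` is a candidate sub-concept: here, rows with a `hinge` value and rows with a `thickness` value fall into separate partitions.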

7
Previous work (2)
  • Mapping languages and tools
  • D2RQ
  • RDB to OWL/RDF mapping
  • Ontology-based access to relational databases
  • Rewriting SPARQL queries into SQL
  • Relational.OWL
  • A minimal ontology of tables and columns, and
    a processor to populate this ontology with data
    from relational databases
  • Can be used to exchange data between databases
  • Triplify
  • Plugin for web applications
  • Converts the result of SQL queries into RDF
  • KAON Reverse
  • Software support to interactively map an RDB
    schema to a predefined ontology
  • DataMaster
  • Protégé Plugin to import table data into
    ontologies

8
RDBToOnto
  • A user-oriented tool with a full-fledged user
    interface
  • Supports an extensive process from the access to
    the data to ontology generation
  • Includes the RTAXON converter
  • Though automated to a large extent, local
    constraints can be interactively included to
    progressively refine the ontologies
  • Types of local constraints
  • Table and column exclusion
  • Naming patterns for classes and instances
  • Categorisation patterns
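
As a rough illustration of the first two constraint types, the sketch below drops an excluded column and builds instance names from a naming pattern. The `{column}` placeholder syntax, the function name and the sample row are all assumptions; the slides do not show how RDBToOnto actually expresses these constraints.

```python
def apply_local_constraints(columns, rows, excluded, naming_pattern):
    """Drop excluded columns and build an instance name for each row."""
    kept = [c for c in columns if c not in excluded]
    result = []
    for row in rows:
        record = dict(zip(columns, row))
        # The pattern may refer to any column, including excluded ones.
        name = naming_pattern.format(**record)
        result.append((name, {c: record[c] for c in kept}))
    return result


# Hypothetical table row and constraints.
instances = apply_local_constraints(
    columns=["id", "code", "label", "last_update"],
    rows=[(7, "342EZ", "Panel", "2008-01-15")],
    excluded={"last_update"},
    naming_pattern="{label}_{code}",
)
```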

9
The RTAXON method
  • Major improvement over existing methods
  • Further refine the classes derived from the
    schema with subclasses found in the content of
    the relations
  • Focus on reliable categorisation patterns

[Figure: categorising attribute example. Labels: Access Zone, Door, Panel, Fairing, Floor]
  • Two sources are involved in the identification of
    categorising attributes
  • Attribute names
  • Revealed by lexical clues
  • Redundancy in attribute extensions
  • Entropy-based approach to find good profiles
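
The two clues can be combined in a small sketch: a lexical test on the attribute name and a Shannon-entropy test on the attribute's values. This is an assumption-laden illustration rather than the formal RTAXON definition; the clue list, the entropy band (`low`, `high`) and the sample table are invented for the example.

```python
import math
from collections import Counter

LEXICAL_CLUES = ("type", "category", "kind")  # assumed clue list


def entropy(values):
    """Shannon entropy (in bits) of the value distribution."""
    counts = Counter(values)
    total = len(values)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def normalised_entropy(values):
    """Entropy divided by its maximum, log2(len(values)): 0 = constant, 1 = all distinct."""
    if len(values) < 2:
        return 0.0
    return entropy(values) / math.log2(len(values))


def candidate_categorising_attributes(table, low=0.05, high=0.8):
    """Keep columns with a lexical clue or with redundant (but not constant) values."""
    candidates = []
    for column, values in table.items():
        has_clue = any(clue in column.lower() for clue in LEXICAL_CLUES)
        h = normalised_entropy(values)
        if has_clue or low < h < high:
            candidates.append((column, round(h, 3), has_clue))
    return candidates


# Hypothetical access-zone table, stored column by column.
table = {
    "id": [1, 2, 3, 4, 5, 6, 7, 8],
    "zone_type": ["Door", "Door", "Panel", "Panel", "Panel",
                  "Fairing", "Floor", "Floor"],
    "status": ["ok"] * 8,
}
candidates = candidate_categorising_attributes(table)
subclasses = sorted(set(table["zone_type"]))
```

A key column has maximal entropy and a constant column has none, so neither is retained; the distinct values of the retained attribute become candidate subclasses.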


Formal definition of RTAXON
Demo
10
Optimising the source databases
  • Another key improvement is the inclusion of a
    database optimisation step
  • Many input databases suffer from data duplication
    problems
  • Optimisation -> eliminate data duplication
    through the processing of inclusion dependencies

11
Effect of inclusion dependency processing
  • Inclusion dependencies -> more inter-class
    relations (i.e. object properties).

Without ID identification
With ID identification
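
A minimal sketch of this effect, under the assumption that a declared inclusion dependency is treated like a foreign key: without it, a column yields literal values; with it, the same column yields links to instances of the referenced class. Table names, the `has<Table>` property naming and the row-number-based instance names are hypothetical.

```python
def convert_column(table, column, values, inclusion_deps):
    """Turn one column into triples, linking to instances when a dependency is known."""
    target = inclusion_deps.get((table, column))
    triples = []
    for row_number, value in enumerate(values, start=1):
        subject = f"{table}_{row_number}"
        if target is None:
            # No dependency known: the value stays a literal (datatype property).
            triples.append((subject, column, value))
        else:
            # Dependency known: the value points to an instance (object property).
            target_table, _target_column = target
            triples.append((subject, f"has{target_table}", f"{target_table}_{value}"))
    return triples


part_ids = [10, 10, 12]
without_id = convert_column("Task", "part_id", part_ids, {})
with_id = convert_column(
    "Task", "part_id", part_ids, {("Task", "part_id"): ("Part", "id")}
)
```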
12
Identification of inclusion dependencies
  • RDBToOnto includes an editor to interactively
    define inclusion dependencies
  • Automated identification of inclusion
    dependencies
  • A data mining approach based on LATINO
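
For intuition only, unary inclusion dependencies can be found by brute force: test, for every pair of columns from different tables, whether one value set is contained in the other. This naive sketch is not the LATINO-based approach, whose details are not given here, and the sample columns are hypothetical.

```python
from itertools import permutations


def find_inclusion_dependencies(columns):
    """columns maps (table, column) -> list of values.

    Returns pairs (dependent, referenced) such that every non-null value of the
    dependent column also occurs in the referenced column.
    """
    value_sets = {
        name: {v for v in values if v is not None}
        for name, values in columns.items()
    }
    found = []
    for dependent, referenced in permutations(value_sets, 2):
        if dependent[0] == referenced[0]:
            continue  # only look across tables in this sketch
        if value_sets[dependent] and value_sets[dependent] <= value_sets[referenced]:
            found.append((dependent, referenced))
    return found


dependencies = find_inclusion_dependencies({
    ("Task", "part_id"): [1, 2, 2, 3],
    ("Part", "id"): [1, 2, 3, 4],
    ("Part", "name"): ["door", "panel", "fairing", "floor"],
})
```

Pairwise comparison is quadratic in the number of columns, and value containment can hold by coincidence, which is one reason the detected dependencies can still be reviewed in an editor.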

13
Mining inclusion dependencies with LATINO
14
A case study in aircraft maintenance
[Figure: tools involved in the case study. Labels: KCIT (GATE-based annotator), RDBToOnto, LATINO, Radiant, OWLIM]
15
The ontology acquisition process
  • The legacy data
  • LSA database: a heterogeneous relational
    database that gathers all information related to
    maintenance activity
  • Required logistic resources
  • Aircraft parts (Product tree)
  • Scheduling data
  • Standards Documents including widely shared
    conceptual models
  • The ontology acquisition process
  • A multi-step transitioning process that favours
    modular design

16
Model Bootstrapping, Ontology Normalisation
[Figure labels: MSG-3, SNS/ATA, FOAF, Reusable Ontologies, Model Bootstrapping, Legacy Data, Ontology Learning Tools]
17
The defined RDBToOnto conversion project
  • 75 constraints
  • Mostly naming patterns and inclusion dependencies
  • Resulting ontology
  • Ontology model
  • 115 classes, 334 datatype properties, 54 object properties
  • Population
  • 49617 class instances, 51449 object property
    instances
  • No constraints for categorisation
  • The ten hierarchies discovered by RTAXON are
    relevant
  • Good behaviour when faced with categorisation
    conflicts

18
The generated class hierarchy
19
Identified object properties
20
RDBToOnto extension capabilities
  • RDBToOnto is a user-oriented tool but it is also
    a framework
  • Written in Java
  • OWL as target language (exploiting Jena 2.5 API)
  • Two types of components can be added
  • Database readers to cover more database formats
  • Converters to implement new learning methods
  • New converters can have their specific global
    options, local constraints and GUI

21
Structure of RDBToOnto
[Diagram labels]
  • Database
  • DBReader: Database getDatabase(), Table ReadData(String name)
  • RDBToOntoConverter: OntModel Convert(Database db), OntClass CreateClass(TableDef)
  • RTAXON, BasicConverter
  • MSAccessReader, DB2Reader
  • Can be extended by the users
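
The two extension points can be pictured with the sketch below. RDBToOnto itself is written in Java; this Python transposition only mirrors the roles named on the slide (a reader and a converter) and simplifies their signatures. `InMemoryReader`, `ToyConverter` and the dict-based model are hypothetical and are not part of the tool.

```python
from abc import ABC, abstractmethod


class DBReader(ABC):
    """Reader side: exposes a schema and the rows of each table."""

    @abstractmethod
    def get_database(self):
        """Return {table name: [column names]}."""

    @abstractmethod
    def read_data(self, name):
        """Return the rows of the named table."""


class Converter(ABC):
    """Converter side: turns what a reader exposes into an ontology model."""

    @abstractmethod
    def convert(self, reader):
        """Return the ontology model."""


class InMemoryReader(DBReader):
    """Hypothetical reader backed by a dict, standing in for a new database format."""

    def __init__(self, tables):
        self._tables = tables  # {name: (columns, rows)}

    def get_database(self):
        return {name: list(cols) for name, (cols, _rows) in self._tables.items()}

    def read_data(self, name):
        return list(self._tables[name][1])


class ToyConverter(Converter):
    """Hypothetical converter: one class per table, with its instance count."""

    def convert(self, reader):
        schema = reader.get_database()
        return {
            table: {"properties": columns, "instances": len(reader.read_data(table))}
            for table, columns in schema.items()
        }


reader = InMemoryReader({"Part": (["id", "name"], [(1, "door"), (2, "panel")])})
model = ToyConverter().convert(reader)
```

Keeping readers and converters behind separate abstract types means a new database format and a new learning method can be added independently of each other.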
22
The neutral database model


[Diagram labels: Database, DBSchema, Table, Column, Attribute, TableDef, friendlyNames, Values, String, Key, PrimaryKey, ForeignKey]
Input to any converter
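
One possible reading of such a neutral model is sketched below with dataclasses. The relationships between the elements are guesses, since only the element names survive in the transcript; field names such as `friendly_name`, `primary_key` and `foreign_keys` are assumptions, and the actual Java classes may be organised differently.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Column:
    name: str
    friendly_name: Optional[str] = None  # assumed: human-readable label


@dataclass
class ForeignKey:
    columns: List[str]
    target_table: str


@dataclass
class TableDef:
    name: str
    columns: List[Column]
    primary_key: List[str] = field(default_factory=list)
    foreign_keys: List[ForeignKey] = field(default_factory=list)


@dataclass
class Table:
    definition: TableDef
    rows: List[tuple] = field(default_factory=list)

    def values(self, column_name):
        """Return all values of one column, in row order."""
        index = [c.name for c in self.definition.columns].index(column_name)
        return [row[index] for row in self.rows]


@dataclass
class Database:
    tables: List[Table] = field(default_factory=list)


# Hypothetical content.
zone_def = TableDef(
    name="AccessZone",
    columns=[Column("id"), Column("kind", friendly_name="Zone kind")],
    primary_key=["id"],
)
db = Database(tables=[Table(zone_def, rows=[("1", "Door"), ("2", "Panel")])])
kinds = db.tables[0].values("kind")
```

Because converters only see this model, they do not need to know which database format the data came from.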
23
Conclusion
  • We presented comprehensive support for
    transitioning relational databases to ontologies
  • RDBToOnto and the RTAXON method have been evaluated
    on significant databases
  • RTAXON is just a first step as many extensions
    can be studied
  • Learning two-level hierarchies
  • Automatically generating local constraints (e.g.
    naming patterns)
  • More resources are available on the TAO project web
    site, including
  • User Guide and demos
  • Development Guide
  • A fully implemented sample showing how to extend
    the tool