Title: OWL-based Semantic Conflicts Detection and Resolution for Data Interoperability
1 OWL-based Semantic Conflicts Detection
and Resolution for Data Interoperability
- Changqing Li, Tok Wang Ling
- Department of Computer Science
- School of Computing
- National University of Singapore
2 Outline
- Introduction
- Preliminary and motivation
- OWL-based Semantic Conflicts Detection and Resolution
- Conclusion
- Q & A
3 Introduction
- Data interoperability and integration are long-standing challenges for the database research community.
- Ontologies provide shared knowledge among different data sources.
- They clarify the semantics of information.
- They provide a way to solve the interoperability problem in database integration.
4 Introduction (Cont.)
- OWL is being promoted as a standard web ontology language.
- In the future, a considerable number of ontologies will be created based on OWL.
- Therefore, automatically detecting semantic conflicts based on OWL will greatly expedite achieving semantic interoperability and greatly reduce the manual work needed to detect semantic conflicts.
5 Ontology Definition
- An ontology defines the basic terms and relations comprising the vocabulary of a topic area, as well as the rules for combining terms and relations to define extensions to the vocabulary [1].
1. Robert Neches, Richard Fikes, Timothy W. Finin, Thomas R. Gruber, Ramesh Patil, Ted E. Senator, William R. Swartout: Enabling Technology for Knowledge Sharing. AI Magazine 12(3), pp. 36-56 (1991)
6 Ontology Languages
- SHOE
- RDF
- RDFS
- DAML+OIL
- OWL
7 SHOE
- The Simple HTML Ontology Extensions (SHOE) [2] extend HTML with machine-readable knowledge annotations.
2. Sean Luke and Jeff Heflin: SHOE Specification 1.01. http://www.cs.umd.edu/projects/plus/SHOE/spec.html
8 RDF
- Resource Description Framework (RDF) [3] is a W3C recommendation for the Semantic Web [4].
- It defines a simple model to describe relationships among resources in terms of properties and values.
- SVO form (Subject-Verb-Object)
- Resource-Property-Value
3. Ora Lassila and Ralph R. Swick: Resource Description Framework (RDF). http://www.w3c.org/TR/WD-rdf-syntax
4. The SemanticWeb Homepage. http://www.semanticweb.org
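The resource-property-value model can be illustrated with plain triples. The following is a minimal sketch that uses no RDF library; the `Triple` type, the `values_of` helper and the sample statements are hypothetical and serve only as an illustration.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF-style statement: resource (subject), property (verb), value (object)."""
    subject: str
    predicate: str
    obj: str


# Hypothetical sample statements, not taken from the slides.
triples = [
    Triple("Course_A", "title", "Databases"),
    Triple("Course_A", "taughtBy", "Lecturer_1"),
    Triple("Lecturer_1", "name", "Alice"),
]


def values_of(subject: str, predicate: str, store: list[Triple]) -> list[str]:
    """Return every value attached to `subject` through `predicate`."""
    return [t.obj for t in store if t.subject == subject and t.predicate == predicate]
```

Note that the value of one statement (`Lecturer_1`) can itself be the resource of another statement, which is how triples form a graph.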
9 RDF (Cont.)
10 RDFS
- RDF Schema (RDFS) [5] is the primitive description language of RDF.
- It provides some basic primitives:
- subClassOf
- subPropertyOf
5. Dan Brickley and R.V. Guha: Resource Description Framework (RDF) Schema Specification 1.0, W3C Candidate Recommendation 27 March 2000. http://www.w3.org/TR/rdf-schema/
11 DAML+OIL
- DARPA Agent Markup Language (DAML) [6]
- Helps machines understand semantic concepts and relationships
- Ontology Inference Layer (OIL) [7]
- Extends RDFS with additional language primitives not yet present in RDFS
- DAML+OIL [8] is the successor of RDFS
- A combination of DAML and OIL
- Defines more semantically rich primitives
6. The DARPA Agent Markup Language Homepage. http://daml.semanticweb.org/
7. The Ontology Inference Layer OIL Homepage. http://www.ontoknowledge.org/oil/TR/oil.long.html
8. DAML+OIL Definition. http://www.daml.org/2001/03/daml+oil
12 OWL
- DAML+OIL is evolving into OWL (Web Ontology Language) [9].
- OWL is almost the same as DAML+OIL.
- Some primitives of DAML+OIL are renamed in OWL for easier understanding.
- e.g., sameClassAs is changed to equivalentClass
9. Frank van Harmelen, Jim Hendler, Ian Horrocks, Deborah L. McGuinness, Peter F. Patel-Schneider and Lynn Andrea Stein: OWL Web Ontology Language Reference. http://www.w3.org/TR/owl-ref/
13 Primitives of OWL
- The prefix owl: denotes the OWL namespace
- owl:equivalentClass
- owl:equivalentProperty
- owl:sameIndividualAs
- owl:disjointWith
- owl:differentFrom
14 Our Extension of OWL (EOWL)
- We extend OWL with the following primitives:
- eowl:orderingProperty
- eowl:overlap
- eowl:properSubClassOf
- eowl:properSubPropertyOf
15 OWL-based Semantic Conflicts Cases
- A. Name conflicts
- B. Order sensitive conflicts
- C. Scaling conflicts
- D. Whole and part conflicts
- E. Partial similarity conflicts
- F. Swap conflicts
16 A. Name conflicts
- Example A. Two distributed data warehouses:
- one is used to analyze the United States market
- country, state, city and district
- and the other is used to analyze the China market
- country, province, city and county
- Based on the context:
- Province is defined as equivalent to State using the OWL primitive owl:equivalentClass.
- To resolve this conflict, one name needs to be changed to the referenced name (here, State).
17 A. Name conflicts (Cont.)
<owl:Class rdf:ID="Province">
  <rdfs:label>Province</rdfs:label>
  <owl:equivalentClass rdf:resource="#State"/>
</owl:Class>
Fig. A. Detection of synonym conflicts
- owl:equivalentClass is the indicator to detect synonym conflicts.
- Change Province to State, which is the name referenced in the ontology definition.
18 A. Name conflicts (Cont.)
- Case A. Synonyms. The OWL primitives owl:equivalentClass, owl:equivalentProperty and owl:sameIndividualAs are indicators to detect this case.
- Conflict Resolution Rule A. If synonym conflicts are detected, different attribute names with the same semantics need to be translated to the same name (the referenced name) for smooth data interoperability.
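Case A and Rule A can be sketched in a few lines of Python using only the standard library. This is an illustration under stated assumptions, not the authors' implementation: the wrapping `rdf:RDF` element, the helper names `find_synonyms` and `rename_attributes`, and the capitalized attribute names are hypothetical, and only owl:equivalentClass is handled.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"

# The class definition of Fig. A, wrapped in an rdf:RDF element (an assumption)
# so that the namespace prefixes are declared and the fragment parses as XML.
ONTOLOGY = f"""
<rdf:RDF xmlns:rdf="{RDF}" xmlns:rdfs="{RDFS}" xmlns:owl="{OWL}">
  <owl:Class rdf:ID="Province">
    <rdfs:label>Province</rdfs:label>
    <owl:equivalentClass rdf:resource="#State"/>
  </owl:Class>
</rdf:RDF>
""".strip()


def find_synonyms(rdf_xml: str) -> dict[str, str]:
    """Map each class name to the referenced name it is declared equivalent to."""
    synonyms = {}
    root = ET.fromstring(rdf_xml)
    for cls in root.iter(f"{{{OWL}}}Class"):
        name = cls.get(f"{{{RDF}}}ID")
        for eq in cls.findall(f"{{{OWL}}}equivalentClass"):
            target = eq.get(f"{{{RDF}}}resource", "").lstrip("#")
            if name and target:
                synonyms[name] = target
    return synonyms


def rename_attributes(attributes: list[str], synonyms: dict[str, str]) -> list[str]:
    """Rule A: replace each synonym by its referenced name, leave other names alone."""
    return [synonyms.get(a, a) for a in attributes]
```

Applied to the China warehouse dimensions `["Country", "Province", "City", "County"]`, the sketch renames only Province, because it is the only name with an equivalence declared in the ontology.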
19 B. Order sensitive conflicts
- Example B. Consider the highest three scores of a course.
- The highest three scores of course A are listed as 90, 95, 100 in ascending order.
- The highest three scores of course B are listed as 98, 95, 93 in descending order.
- highestThreeScores is defined as an eowl:orderingProperty in the ontology.
- The sequences of the highest three scores for courses A and B should both be adjusted to ascending order or both to descending order.
- By default, adjust to the sequence of the first one, e.g. the sequence of course A.
20 B. Order sensitive conflicts (Cont.)
<eowl:orderingProperty rdf:ID="highestThreeScores">
  <rdfs:label>highest three scores of a course</rdfs:label>
  <rdfs:domain rdf:resource="#Course"/>
  <rdfs:range rdf:resource="&xsd;integer"/>
</eowl:orderingProperty>
Fig. B. Detection of order sensitive conflicts
- We can further define the ascending or descending order for more precise semantics.
21 B. Order sensitive conflicts (Cont.)
- Case B. Order sensitive. The EOWL primitive eowl:orderingProperty and the RDF primitive rdf:Seq are indicators to detect this case.
- Conflict Resolution Rule B. If order sensitive conflicts are detected, we need to adjust the member sequences according to the same criterion for smooth data interoperability; by default, the sequence of the first one is used.
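Rule B with its default can be sketched as follows. The helper names `is_ascending` and `align_order` are hypothetical, and the sketch assumes the reference sequence is already sorted in one direction.

```python
def is_ascending(seq: list[int]) -> bool:
    """True if every member is less than or equal to its successor."""
    return all(a <= b for a, b in zip(seq, seq[1:]))


def align_order(reference: list[int], other: list[int]) -> list[int]:
    """Rule B default: re-sort `other` to follow the direction of `reference`.

    Assumes `reference` is monotonic; a non-monotonic reference is treated
    as descending.
    """
    return sorted(other, reverse=not is_ascending(reference))


course_a = [90, 95, 100]   # ascending, taken as the reference by default
course_b = [98, 95, 93]    # descending
aligned_b = align_order(course_a, course_b)
```

With course A as the first source, the scores of course B are reordered to ascending order; swapping the arguments would instead reorder course A to descending order.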
22 C. Scaling conflicts
- Example C. Consider two database schemas:
- Product(ID, Price)
- Product(ID, Price)
- One price may refer to US dollars, while the other may refer to Singapore dollars. Fig. C shows part of a currency ontology in which price is defined.
- Translate the prices to refer to the same currency unit; by default, the unit of the first one.
23 C. Scaling conflicts (Cont.)
<owl:DatatypeProperty rdf:ID="price">
  <rdfs:domain rdf:resource="#Product"/>
  <rdfs:range rdf:parseType="Resource">
    <rdf:value/>
    <currency:CurrencyUnit/>
  </rdfs:range>
</owl:DatatypeProperty>
Fig. C. Detection of scaling conflicts
24 C. Scaling conflicts (Cont.)
- Case C. Semantic conflicts may exist if the value of a datatype property comprises both a value and a unit (Scaling). The RDF primitive rdf:parseType="Resource" and the OWL primitive owl:DatatypeProperty are indicators for this case.
- Conflict Resolution Rule C. If scaling conflicts are detected, the values should be translated to refer to the same unit for smooth data interoperability; by default, the first unit is used.
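Rule C can be sketched as a unit conversion over value-unit pairs. The `Price` type, the helper names and, in particular, the exchange rates below are hypothetical placeholders chosen for easy arithmetic; they are not real rates and are not part of the original slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Price:
    value: float
    unit: str  # currency code, e.g. "USD" or "SGD"


# ASSUMPTION: made-up rates, (source, target) -> target units per source unit.
RATES = {("SGD", "USD"): 0.5, ("USD", "SGD"): 2.0}


def convert(price: Price, target_unit: str, rates: dict) -> Price:
    """Express `price` in `target_unit` using the given rate table."""
    if price.unit == target_unit:
        return price
    return Price(price.value * rates[(price.unit, target_unit)], target_unit)


def unify_units(prices: list[Price], rates: dict) -> list[Price]:
    """Rule C default: express every price in the unit of the first price."""
    if not prices:
        return []
    target = prices[0].unit
    return [convert(p, target, rates) for p in prices]


unified = unify_units([Price(10.0, "USD"), Price(8.0, "SGD")], RATES)
```

Keeping the unit next to the value mirrors the ontology definition in which the range of price carries both an rdf:value and a currency unit.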
25 D. Whole and part conflicts
- Example D. Consider the schemas:
- Person(ID, name)
- Person(ID, surname, givenName)
- surname and givenName are both defined as proper sub-properties of name using eowl:properSubPropertyOf.
- eowl:properSubClassOf has clearer semantics than rdfs:subClassOf, because rdfs:subClassOf is ambiguous between two meanings: eowl:properSubClassOf and owl:equivalentClass.
- Divide the whole attribute name into the part attributes surname and givenName.
- Or combine the part attributes surname and givenName in the correct sequence to form the whole attribute name.
26 D. Whole and part conflicts (Cont.)
<rdf:Property rdf:ID="surname">
  <eowl:properSubPropertyOf rdf:resource="#name"/>
</rdf:Property>
Fig. D1. Detection of whole and part conflicts
<rdf:Property rdf:ID="givenName">
  <eowl:properSubPropertyOf rdf:resource="#name"/>
</rdf:Property>
Fig. D2. Detection of whole and part conflicts
27 D. Whole and part conflicts (Cont.)
- Case D. Semantic conflicts may exist if one concept is completely contained in another concept (Whole and part). The EOWL primitives eowl:properSubClassOf and eowl:properSubPropertyOf are indicators to detect this case.
- Conflict Resolution Rule D. If whole and part conflicts are detected, the whole attributes should be divided into part attributes, or the part attributes should be combined into whole attributes, for smooth data interoperability.
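Both directions of Rule D can be sketched for the Person schemas of Example D. The slides do not specify how a name is laid out, so the sketch assumes the hypothetical convention "surname first, then givenName, separated by one space"; the function names are likewise hypothetical.

```python
def split_name(name: str) -> dict[str, str]:
    """Divide the whole attribute into its parts.

    ASSUMPTION: the name is written as 'surname givenName'.
    """
    surname, _, given_name = name.partition(" ")
    return {"surname": surname, "givenName": given_name}


def combine_name(surname: str, given_name: str) -> str:
    """Combine the parts in the assumed sequence to form the whole attribute."""
    return f"{surname} {given_name}".strip()


def to_part_schema(person: dict) -> dict:
    """Person(ID, name) -> Person(ID, surname, givenName)."""
    return {"ID": person["ID"], **split_name(person["name"])}


def to_whole_schema(person: dict) -> dict:
    """Person(ID, surname, givenName) -> Person(ID, name)."""
    return {"ID": person["ID"], "name": combine_name(person["surname"], person["givenName"])}
```

Combining is lossless under the assumed convention, whereas dividing depends on knowing the sequence and separator, which is why the correct sequence matters when forming the whole attribute.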
28 E. Partial similarity conflicts
- Example E. Integration of ResearchAssistant and GraduateStudent
- The relationship between research assistant and graduate student is overlap, because some research assistants are also graduate students,
- but not all research assistants are graduate students,
- and not all graduate students are research assistants.
- After integration, there should be three schemas:
- Research Assistant but not Graduate Student: RNotG
- Graduate Student but not Research Assistant: GNotR
- Both Research Assistant and Graduate Student: RAndG
29 E. Partial similarity conflicts (Cont.)
<owl:Class rdf:ID="ResearchAssistant">
  <eowl:overlap rdf:resource="#GraduateStudent"/>
</owl:Class>
Fig. E. Detection of partial similarity conflicts
30 E. Partial similarity conflicts (Cont.)
- Case E. Semantic conflicts may exist if two concepts overlap (Partial similarity). The EOWL primitive eowl:overlap is the indicator to detect this case.
- Conflict Resolution Rule E. If partial similarity conflicts are detected, the overlapping part should be separated before integration.
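Separating the overlapping part amounts to three set operations. The sketch below identifies persons by hypothetical string IDs and uses the schema names RNotG, GNotR and RAndG from Example E; the function name is an assumption.

```python
def separate_overlap(research_assistants: set[str], graduate_students: set[str]) -> dict[str, set[str]]:
    """Rule E: split two overlapping concepts into three disjoint parts."""
    return {
        "RNotG": research_assistants - graduate_students,  # research assistant only
        "GNotR": graduate_students - research_assistants,  # graduate student only
        "RAndG": research_assistants & graduate_students,  # both
    }


# Hypothetical member IDs for illustration.
parts = separate_overlap({"p1", "p2"}, {"p2", "p3"})
```

The three resulting sets are pairwise disjoint and together cover every member of the two original concepts, so no person is stored twice after integration.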
31 F. Swap conflicts
- Example F. Continued from Example A
- In China, a county is contained in a city (the city has the larger area).
- In the US, a city is contained in a county (the county has the larger area).
- The domain (County) of the property region:containedIn in the China ontology is exactly the range of the same property region:containedIn in the US ontology.
- The range (City) of the property region:containedIn in the China ontology is exactly the domain of the same property region:containedIn in the US ontology.
- We can add the prefix China. or US. before City and County for smooth data interoperability.
32 F. Swap conflicts (Cont.)
<owl:Class rdf:ID="County">
  <region:containedIn rdf:resource="#City"/>
</owl:Class>
Fig. F1. Detection of swap conflicts (the relationship between city and county in the China ontology)
<owl:Class rdf:ID="City">
  <region:containedIn rdf:resource="#County"/>
</owl:Class>
Fig. F2. Detection of swap conflicts (the relationship between city and county in the US ontology)
33 F. Swap conflicts (Cont.)
- Case F. Semantic conflicts may exist if the domain of a property in the first ontology is the range of the same property in the second ontology, and the range of the property in the first ontology is the domain of the same property in the second ontology (Swap).
- Conflict Resolution Rule F. If swap conflicts are detected, context restrictions (see Example F) should be added to the schema for smooth data interoperability.
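Case F and Rule F can be sketched by comparing (domain, range) pairs across two ontologies. Representing an ontology as a dictionary from property name to such a pair is a simplifying assumption, and the function names `find_swaps` and `add_context` are hypothetical.

```python
# ASSUMPTION: each ontology is reduced to {property: (domain, range)}.
china = {"containedIn": ("County", "City")}   # Fig. F1: county is contained in city
us = {"containedIn": ("City", "County")}      # Fig. F2: city is contained in county


def find_swaps(first: dict, second: dict) -> list[str]:
    """Return properties whose domain and range are exchanged between the ontologies."""
    swaps = []
    for prop, (domain, range_) in first.items():
        if prop in second and domain != range_ and second[prop] == (range_, domain):
            swaps.append(prop)
    return swaps


def add_context(ontology: dict, context: str, props: list[str]) -> dict:
    """Rule F: prefix the domain and range of swapped properties with a context name."""
    return {
        prop: (f"{context}.{d}", f"{context}.{r}") if prop in props else (d, r)
        for prop, (d, r) in ontology.items()
    }


swapped = find_swaps(china, us)
china_ctx = add_context(china, "China", swapped)
us_ctx = add_context(us, "US", swapped)
```

After the prefixes are added, China.County and US.County are distinct names, so the two containment relationships no longer contradict each other.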
34 Conclusion
- We extend OWL with several primitives that have clearer semantics.
- We summarize several cases, based on OWL, in which semantic conflicts are easily encountered.
- Conflict resolution rules for each case are presented.
- In the future, OWL will be frequently used to build ontologies, and this paper provides a computer-aided approach to detect and resolve semantic conflicts for smooth data interoperability.
35 References
- 1. Robert Neches, Richard Fikes, Timothy W. Finin, Thomas R. Gruber, Ramesh Patil, Ted E. Senator, William R. Swartout: Enabling Technology for Knowledge Sharing. AI Magazine 12(3), pp. 36-56 (1991)
- 2. Sean Luke and Jeff Heflin: SHOE Specification 1.01. http://www.cs.umd.edu/projects/plus/SHOE/spec.html
- 3. Ora Lassila and Ralph R. Swick: Resource Description Framework (RDF). http://www.w3c.org/TR/WD-rdf-syntax
- 4. The SemanticWeb Homepage. http://www.semanticweb.org
- 5. Dan Brickley and R.V. Guha: Resource Description Framework (RDF) Schema Specification 1.0, W3C Candidate Recommendation 27 March 2000. http://www.w3.org/TR/rdf-schema/
- 6. The DARPA Agent Markup Language Homepage. http://daml.semanticweb.org/
- 7. The Ontology Inference Layer OIL Homepage. http://www.ontoknowledge.org/oil/TR/oil.long.html
- 8. DAML+OIL Definition. http://www.daml.org/2001/03/daml+oil
- 9. Frank van Harmelen, Jim Hendler, Ian Horrocks, Deborah L. McGuinness, Peter F. Patel-Schneider and Lynn Andrea Stein: OWL Web Ontology Language Reference. http://www.w3.org/TR/owl-ref/
36 Thank you