Title: Ontology-Based Semantic Integration Method for Domain Specific Scientific Data
1 Ontology-Based Semantic Integration Method for
Domain Specific Scientific Data
- AUTHORS
- Hu Changjun
- Zhang Xiaoming
- Zhao Qian
- Zhao Chongchong
- Presented By Mayank Gupta
2 Outline
- Introduction
- Heterogeneity
- Need for Ontology based solution
- Ontology based solution
- Ontology based semantic integration method
- Semantic integration method steps
- Semantic integration method architecture
- Future work
- References
3 Introduction
- What is the motive of this paper?
- Sharing huge amounts of domain specific data, which is of great significance for research.
- What is the problem in achieving this motive?
- Semantic heterogeneity between the various sources of data in a specific area of science (research area).
- What solution do the authors propose?
- The authors propose an ontology based solution which uses TBox reasoning (i.e., reasoning about the concepts in an ontology) to resolve this semantic heterogeneity.
4 Heterogeneity
- What do you mean by heterogeneity?
- Data heterogeneity can be divided into the following levels:
- System
- Syntactic
- Structural
- Semantic
5 Need for Ontology based solution
- Domain specific markup languages such as MatML, CML and MathML are used to exchange data among different sources in a specific domain.
- They solve the syntactic and structural problems and, to a small extent, the semantic problem too.
- The data still lacks a high level abstraction of concept semantics.
6 Ontology based solution
- Why only Ontology based solution?
- An ontology is a formal, explicit specification of a shared conceptualization.
- It can be used to capture shared domain knowledge.
- It also supports logical reasoning over the information content of a specific domain (TBox reasoning).
7 Ontology based solution
- What are the disadvantages of a non-ontology based solution compared with an ontology based solution?
- A non-ontology solution has no means of checking consistency or discovering domain terminology conflicts.
- It can't implement an inheritance mechanism.
- Implicit knowledge can't be discovered, since no reasoning mechanism is supported.
8 Ontology based solution
- In the ontology based solution
- We use OWL ontologies to represent the global semantics of the domain and the local semantics of the heterogeneous sources.
- A mapping ontology maps between the global and local ontologies.
- Semantic heterogeneity is resolved using TBox reasoning, and there is no need to migrate data from the sources into ontology instances.
- Domain specific markup languages are used as a uniform format for query results in data exchange.
9 Ontology based semantic integration method
- An ontology is defined as a 4-tuple
- O = <C, R, I, Ao>
- Here
- C is a finite set of concepts
- R is a finite set of relations
- I is a set of instances
- Ao is a set of axioms, expressed in an appropriate logical language. Given Σ = C ∪ R ∪ I, Ao is a set of logical axioms over Σ.
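As an illustration only, the 4-tuple can be held in a small Python structure. The class name `Ontology`, its field names, the `signature` helper and the sample material-science terms are all assumptions made for this sketch; none of them come from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ontology:
    """Hypothetical container for the 4-tuple <C, R, I, Ao>."""
    concepts: frozenset   # C: finite set of concept names
    relations: frozenset  # R: finite set of relation names
    instances: frozenset  # I: set of instance names
    axioms: tuple = ()    # Ao: axioms, kept here as plain strings

    def signature(self) -> frozenset:
        """Union of C, R and I: the symbols the axioms may mention."""
        return self.concepts | self.relations | self.instances


# Made-up example content, for illustration only.
onto = Ontology(
    concepts=frozenset({"Material", "Metal"}),
    relations=frozenset({"hasProperty"}),
    instances=frozenset({"steel_304"}),
    axioms=("Metal subClassOf Material",),
)
```

A real system would store the axioms in an ontology language such as OWL rather than as strings; the strings here only mark where Ao sits in the tuple.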
10 Ontology based semantic integration method
- A semantic data integration system can be defined as a 5-tuple SI = <G, S, D, MGS, MSD>
- Here
- G is the global conceptual schema. It gives a global view for users.
- S is the local conceptual schema. It gives the local semantics of a data source.
- D is the data source.
- MGS is the mapping between G and S.
- MSD is the mapping between S and D.
11 Ontology based semantic integration method steps
- Step 1
- First analyze the data requirements in depth together with the domain experts, and construct the global ontology OG to provide formal and explicit shared knowledge of the domain. OG is considered the global conceptual schema of SI, i.e. G = OG and AG = ΣOG.
- Step 2
- Assuming there are p data sources D1, D2, ..., Dp, we construct a local ontology OSi for each Di, where i = 1, ..., p. OS is viewed as the local conceptual schema, namely S = OS and AS = ΣOS.
12 Ontology based semantic integration method steps
13 Ontology based semantic integration method steps
- Step 3
- Build MSD, an axiom set defined over ΣD ∪ ΣOS. In this step, we build a set of mapping axioms between each Di and OSi, where i = 1, ..., p.
- Step 4
- Build MGS, an axiom set defined over ΣOS ∪ ΣOG. In this step, we build a set of mapping axioms between each OSi and OG, where i = 1, ..., p. A mapping ontology OM is built to express these axioms in an ontology language.
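One way to picture the mapping axioms of Step 4 is as a list of triples linking a local concept to a global one. The sketch below is an assumption-laden illustration: the triple layout, the prefixes `src1:`, `src2:` and `g:`, the concept names and the helper `index_by_global` are all hypothetical, and the paper expresses these mappings in an ontology language rather than as Python data.

```python
from collections import defaultdict

# Hypothetical mapping axioms: (local concept, relation, global concept).
MAPPING_AXIOMS = [
    ("src1:Alloy", "equivalentClass", "g:Alloy"),
    ("src2:SteelRecord", "subClassOf", "g:Alloy"),
    ("src2:Sample", "equivalentClass", "g:Specimen"),
]


def index_by_global(axioms):
    """Group local concepts by the global concept they are mapped to."""
    index = defaultdict(set)
    for local, _relation, global_concept in axioms:
        index[global_concept].add(local)
    return dict(index)


mapping_index = index_by_global(MAPPING_AXIOMS)
```

Indexing by global concept is a design choice for this sketch: it makes the later lookup "which local concepts answer a query on this global concept" a single dictionary access.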
14 Ontology based semantic integration method steps
- Case study in material science domain
15 Ontology based semantic integration method steps
- Step 5
- The data user submits a query qG over G.
- MGS and the imported ontology are loaded into a DL (Description Logic) reasoner.
- By TBox reasoning, the global concepts in qG and their subconcepts are converted into the corresponding local concepts in the local ontology.
- Then qG is decomposed into subqueries qS1, qS2, ..., qSn, where n is at most p.
- Both qG and its subqueries are expressed in SPARQL.
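The decomposition in Step 5 can be sketched roughly as follows. This is not a DL reasoner: a plain transitive closure over subclass links stands in for TBox reasoning, and the function names `subconcepts` and `decompose`, the dictionaries `SUBCLASS_OF` and `LOCAL_OF`, and all concept and source names are hypothetical.

```python
def subconcepts(concept, subclass_of):
    """Return `concept` plus all its direct and indirect subconcepts.

    `subclass_of` maps each concept to the set of its direct parents.
    """
    found = {concept}
    changed = True
    while changed:
        changed = False
        for child, parents in subclass_of.items():
            if child not in found and parents & found:
                found.add(child)
                changed = True
    return found


def decompose(global_concept, subclass_of, local_of):
    """Split a query on one global concept into per-source concept sets.

    `local_of` maps a global concept to (source, local concept) pairs.
    """
    per_source = {}
    for g in subconcepts(global_concept, subclass_of):
        for source, local in local_of.get(g, ()):
            per_source.setdefault(source, set()).add(local)
    return per_source


# Made-up TBox fragment and mappings, for illustration only.
SUBCLASS_OF = {
    "Steel": {"Alloy"},
    "Alloy": {"Material"},
    "Polymer": {"Material"},
}
LOCAL_OF = {
    "Alloy": [("D1", "AlloyTable")],
    "Steel": [("D2", "SteelRecord")],
    "Polymer": [("D2", "PlasticRecord")],
}

plan = decompose("Alloy", SUBCLASS_OF, LOCAL_OF)
```

Here a query on `Alloy` also reaches `Steel` through the subclass link, so source D2 receives a subquery even though it has no concept mapped to `Alloy` directly.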
16 Ontology based semantic integration method steps
- Step 6
- Using the mappings in MSD, the local concepts in qSj can be translated into symbols from ΣDj, where j = 1, ..., n.
- qSj is rewritten into the native query qDj.
- After the data sources have been accessed, the query results aD1, aD2, ..., aDn are returned in a uniform way.
- In this method, a domain markup language is used to express the query results.
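To make the rewriting in Step 6 concrete, the sketch below assumes that one data source is a relational database, so that its native query language is SQL. That is an assumption for illustration; the slides do not say what the native query languages are. The mapping `MSD_D1`, the table and column names, and `rewrite_to_native` are hypothetical.

```python
# Hypothetical MSD for one source: local concept -> (table, columns).
MSD_D1 = {
    "AlloyTable": ("alloys", ("id", "name", "density")),
}


def rewrite_to_native(local_concept, msd):
    """Translate a local concept into a SQL SELECT over the mapped table."""
    table, columns = msd[local_concept]
    return f"SELECT {', '.join(columns)} FROM {table}"


native_query = rewrite_to_native("AlloyTable", MSD_D1)
```

A fuller rewriter would also translate the filters and joins of the SPARQL subquery; this sketch covers only the concept-to-table lookup that MSD provides.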
17 Ontology based semantic integration method steps
- Step 7
- aD1, aD2, ..., aDn are composed into the final result aG, which is returned to the data user.
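Since the per-source results arrive in a markup format, composing them can be as simple as collecting the fragments under one root. The sketch below uses Python's standard `xml.etree.ElementTree`; the element names `Results`, `Material` and `Name` and the `source` attribute are placeholders chosen for this example and are not taken from MatML or from the paper.

```python
import xml.etree.ElementTree as ET


def compose_results(fragments):
    """Wrap per-source XML result fragments in one root element."""
    root = ET.Element("Results")  # hypothetical root element name
    for source, xml_text in fragments.items():
        part = ET.fromstring(xml_text)
        part.set("source", source)  # record which source produced it
        root.append(part)
    return root


# Made-up partial results from two sources.
final = compose_results({
    "D1": "<Material><Name>Alloy A</Name></Material>",
    "D2": "<Material><Name>Steel B</Name></Material>",
})
```

Tagging each fragment with its source keeps provenance visible in the composed result; whether the original system does this is not stated in the slides.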
18 Ontology based semantic integration method steps
- How did the ontology help in the above process?
- It provides a formal description of the specific domain. Domain concepts and knowledge structure are formally defined by OG.
- It provides semantic extension for data sources. Data sources are enhanced semantically by OS, which makes the inner structure of the data sources explicit.
- It provides semantic interoperability. OM, considered the foundation for query reformulation, contains the mappings between global and local semantics.
- It provides reasoning and deduction ability via TBox reasoning.
19 Ontology based semantic integration software architecture
20 Future work
- The system still relies on a global ontology which, as discussed in earlier papers, is hard to come up with because it constrains the distributed resources.
- The process needs to be automated so that the local ontology can be learned from the data source.
- Query optimization.
21 References
- http://en.wikipedia.org/wiki/SPARQL
- http://en.wikipedia.org/wiki/Axiom
- http://en.wikipedia.org/wiki/TBox
- http://ieeexplore.ieee.org/Xplore/login.jsp?url=/iel5/4287452/4287802/04287953.pdf?tp=&arnumber=4287953&isnumber=4287802
22 Questions?