Leveraging Data and Structure in Ontology Integration - PowerPoint PPT Presentation

1
Leveraging Data and Structure in Ontology
Integration
  • Octavian Udrea1
  • Lise Getoor1
  • Renée J. Miller2
  • 1University of Maryland College Park
  • 2University of Toronto

2
Contents
  • Motivation and goals
  • Short overview of OWL Lite
  • The ILIADS method
  • Experimental evaluation

3
Motivation and goals
  • There is no silver bullet for representing a domain
  • To use knowledge effectively, we need to
    integrate multiple ontologies
  • Our goals:
  • Improve the quality of computed alignments
  • In a way flexible enough to adapt to a wide
    variety of inputs
  • Find correlations between the features of the
    input and the criteria for good-quality alignments

4
The method at a glance
  • Produce better-quality alignments by:
  • using data (instances) effectively, and
  • using logical inference (e.g., in OWL) to
    estimate how good an alignment is
  • Parameterize the method such that:
  • it can be adapted for a wide variety of inputs
  • the parameters can be adjusted with minimal
    effort based on the input ontologies

5
Defining the terms
  • Entity: anything that has a URI identifier
    (plus literals)
  • Ontology: a software artifact consisting of
    classes, instances, facts, and axioms
  • Alignment: given two ontologies, find
    relationships between their respective entities
  • Integration: merge two ontologies under a set of
    alignments to obtain a consistent result

6
Contents
  • Motivation and goals
  • Short overview of OWL Lite
  • The ILIADS method
  • Experimental evaluation

7
Example OWL Lite ontologies
(discoveredBy, owl:inverseOf, discoverer)
(discoveredBy, rdf:type, owl:FunctionalProperty)
(discoveredBy, owl:inverseOf, discoverer)
(associatedWith, rdf:type, owl:TransitiveProperty)
(resultsFrom, rdfs:subPropertyOf, associatedWith)
12
Inference in OWL (Lite)
  • A tableau-based method
  • Example tableau rule:
  • (p, owl:inverseOf, p′) ∧ (o1, p, o2) ⊢ (o2, p′, o1)
  • Example inconsistency:
  • (o1, owl:sameAs, o2) ∧ (o2, owl:differentFrom, o1) ⊢ ⊥

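The inverse-of rule and the sameAs/differentFrom clash above can be sketched in Python. This is a minimal sketch: the triple representation (plain 3-tuples of strings) and the instance names in `kb` are illustrative assumptions, not the presentation's actual code.

```python
# Minimal sketch of one tableau rule and one inconsistency check.
# Triples are plain 3-tuples of strings; instance names are illustrative.

def apply_inverse_rule(triples):
    """(p owl:inverseOf q) and (o1 p o2) derive (o2 q o1)."""
    inverses = [(s, o) for s, pred, o in triples if pred == "owl:inverseOf"]
    derived = set()
    for s, pred, o in triples:
        for p, q in inverses:
            if pred == p:
                derived.add((o, q, s))
    return derived

def is_consistent(triples):
    """Detect the (sameAs, differentFrom) clash shown on the slide."""
    same = {frozenset((s, o)) for s, p, o in triples if p == "owl:sameAs"}
    diff = {frozenset((s, o)) for s, p, o in triples if p == "owl:differentFrom"}
    return same.isdisjoint(diff)

kb = {
    ("discoveredBy", "owl:inverseOf", "discoverer"),
    ("e_coli", "discoveredBy", "escherich"),   # illustrative instance fact
}
print(apply_inverse_rule(kb))  # {('escherich', 'discoverer', 'e_coli')}
```

Note that the clash check compares unordered pairs, so (o1 sameAs o2) conflicts with (o2 differentFrom o1) as in the slide's example.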
13
Example inference
(discoveredBy, owl:inverseOf, discoverer)
(discoveredBy, rdf:type, owl:FunctionalProperty)
(discoveredBy, owl:inverseOf, discoverer)
(associatedWith, rdf:type, owl:TransitiveProperty)
(resultsFrom, rdfs:subPropertyOf, associatedWith)
14
Example inference
(discoveredBy, owl:inverseOf, discoverer)
15
Example inference
(discoveredBy, rdf:type, owl:FunctionalProperty)
16
The alignment problem
  • Find a set of triples (entity1, relation, entity2)
    where
  • entity1, entity2 are entities from the two
    ontologies
  • relation is one of:
  • subClassOf, equivalentClass, subPropertyOf,
  • equivalentProperty, sameAs
  • For integration, the union of the ontologies and
    the alignment must be consistent.

17
Contents
  • Motivation and goals
  • Short overview of OWL Lite
  • The ILIADS method
  • Experimental evaluation

18
State of the art
  • Ideally, alignment should be treated as an
    optimization problem:
  • choose candidate pairs to maximize an
    ontology-level similarity measure
  • Infeasible in practice
  • To approximate, existing tools use locally
    computed similarity measures
  • Often, this means the "big picture" of the search
    space is ignored

19
A simplified view of local methods
20
A simplified view of local methods
This score is high enough, so we commit to the
owl:sameAs relation
21
A simplified view of incremental methods
This changes the scores of the neighbors
22
A simplified view of incremental methods
This is again high enough, so we have found
another alignment
23
The core of ILIADS
  • Compute alignment candidates based on
    well-established methods:
  • Lexical, structural, and extensional similarity
  • In addition, evaluate how good a candidate pair
    is based on the logical consequences of asserting
    the alignment
  • We call this inference similarity
  • Essentially a look-ahead that estimates the
    impact of the alignment on the global similarity
    score

24
The ILIADS algorithm
  • repeat until no more candidates
  • Compute local similarities
  • Select promising candidates
  • For each candidate
  • Perform N inference steps
  • Update score with the inference similarity
  • Select the candidate with the best score
  • end

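The loop above can be sketched as follows. This is a structural sketch only: `alignment_loop`, `local_sim`, and `inference_sim` are placeholder names and stubbed callables, not ILIADS internals.

```python
# Skeleton of the ILIADS-style loop: score locally, refine promising
# candidates with a bounded inference look-ahead, commit to the best, repeat.

def alignment_loop(candidates, local_sim, inference_sim, threshold=0.5, n_steps=5):
    alignments = []
    candidates = list(candidates)
    while True:
        # 1. compute local similarities and keep promising candidates
        scored = {c: local_sim(c) for c in candidates}
        promising = [c for c in candidates if scored[c] >= threshold]
        if not promising:
            break
        # 2. refine each promising candidate with N inference steps
        refined = {c: scored[c] * inference_sim(c, n_steps) for c in promising}
        # 3. commit to the candidate with the best score
        best = max(refined, key=refined.get)
        alignments.append(best)
        candidates.remove(best)
    return alignments

# toy run with stubbed similarity functions
local = {"(c1,c1')": 0.9, "(c2,c2')": 0.6, "(c3,c3')": 0.2}
print(alignment_loop(local, lambda c: local[c], lambda c, n: 1.0))
```

With these stubs the loop commits to the two candidates above the threshold, one per iteration, and stops when none remain.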
25
Computing similarity
  • sim(e, e′) = λ_l · sim_lexical(e, e′) +
    λ_s · sim_structural(e, e′) + λ_e · sim_extensional(e, e′)
  • Lexical similarity: Jaro-Winkler and WordNet
  • Structural similarity: Jaccard for various
    neighborhoods
  • Extensional similarity: Jaccard on extensions
  • Select candidates with sim(e, e′) above a
    threshold

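A hedged sketch of the weighted combination: the λ weights, the example neighborhoods, and the helper names here are assumptions for illustration (the presentation tunes the coefficients per input pair), with Jaccard used for the structural and extensional components as on the slide.

```python
# Weighted combination sim = λ_l·lex + λ_s·struct + λ_e·ext.

def jaccard(a, b):
    """Jaccard similarity of two sets: |A ∩ B| / |A ∪ B|."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def combined_sim(lex, struct, ext, weights=(0.4, 0.3, 0.3)):  # assumed weights
    l, s, e = weights
    return l * lex + s * struct + e * ext

# illustrative inputs: class neighborhoods and instance extensions
struct = jaccard({"Disease", "Bacterium"}, {"Disease", "Virus"})  # 1/3
ext = jaccard({"i1", "i2"}, {"i1", "i2", "i3"})                   # 2/3
score = combined_sim(0.9, struct, ext)   # ≈ 0.66
```

Candidates whose combined score exceeds the threshold move on to the inference step.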
26
Performing inference
  • For the candidate pair (e, e′):
  • Select an axiom and apply the corresponding rule
  • The logical consequences are the pairs of
    entities (e_i, e_j) that have just become
    equivalent
  • Repeat a small number of times (5)

27
Updated score
  • For the candidate pair (e, e′):
  • Compute the product P of sim(e_i, e_j) / (1 −
    sim(e_i, e_j)) over all logical consequences
  • sim_updated(e, e′) = sim(e, e′) · P

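The product update can be written directly. The numbers below reproduce the worked example from a later slide: one logical consequence with similarity .6 on a candidate with local score .5.

```python
# P = ∏ sim(e_i, e_j) / (1 - sim(e_i, e_j)) over all logical consequences,
# then sim_updated = sim · P.

def inference_similarity(consequence_sims):
    p = 1.0
    for s in consequence_sims:
        p *= s / (1.0 - s)   # odds of each derived equivalence
    return p

def updated_score(local_sim, consequence_sims):
    return local_sim * inference_similarity(consequence_sims)

score = updated_score(0.5, [0.6])   # .5 * (.6/.4) = .75
```

High-similarity consequences (s > .5) boost the candidate's score, while low-similarity ones penalize it, which is what makes this a look-ahead estimate of global impact.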
28
Example inference similarity
29
Example inference similarity
We assume this candidate pair is in an owl:sameAs
relation before starting inference
30
Example inference similarity
(discoveredBy, owl:inverseOf, discoverer)
31
Example inference similarity
Remember that during inference, (E-Coli
Poisoning, owl:sameAs, E-Coli)
(discoveredBy, rdf:type, owl:FunctionalProperty)
32
Example inference similarity
Updated score: .5 × 1.5 = .75
This is the only logical consequence: P = .6 / .4
= 1.5
33
The ILIADS algorithm
  • It is still a local method
  • Ultimately, it selects the best alignment after
    each step
  • But it estimates the global impact of each
    alignment better
  • The inference similarity is a look-ahead measure
    of how good the candidate alignment is

34
Other issues
  • ILIADS may not produce a consistent result
  • Inconsistent ontologies occur in less than 0.5% of runs
  • Pellet is used to check consistency after ILIADS
  • How do we decide between subsumption and
    equivalence for a pair of entities?
  • How do we select the promising candidates?
  • How do we choose the axioms to apply in the five
    inference steps?

35
Subsumption vs. equivalence
  • Deciding whether two entities should subsume each
    other or be equivalent is not clear-cut
  • We use a simple extensional technique to distinguish
    between the two cases
  • E.g., if the instances of class c
    are almost the same as those of class c′ ⇒
    owl:equivalentClass
  • If they are a subset ⇒ rdfs:subClassOf

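One way to sketch the extensional test above: compare how much of each class's extension falls in the overlap. The coverage measure and the 0.7 threshold here are simplifying assumptions for illustration, not ILIADS's exact ratio.

```python
# Compare class extensions: near-identical -> equivalentClass,
# near-subset -> subClassOf, otherwise no relation.

def relation_type(ext_c, ext_c2, tau=0.7):   # tau is an assumed threshold
    c, c2 = set(ext_c), set(ext_c2)
    if not c or not c2:
        return None
    overlap = len(c & c2)
    cov_c, cov_c2 = overlap / len(c), overlap / len(c2)
    if cov_c >= tau and cov_c2 >= tau:
        return "owl:equivalentClass"         # extensions almost the same
    if cov_c >= tau:
        return "rdfs:subClassOf"             # c's instances (almost) a subset of c2's
    return None

print(relation_type({"i1", "i2"}, {"i1", "i2", "i3", "i4"}))  # rdfs:subClassOf
```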
36
Deciding the type of relationship
Some instances are present in the extensions of both FoodPoisoning
and FoodBorneDisease.
To measure how much the two classes have in
common, we divide the size of the unique part by
the size of the common part, obtaining 1/3 and
2/4 respectively.
37
Deciding the type of relationship
We decide based on the threshold τ_r. If τ_r = .49, then we
choose rdfs:subClassOf
38
Deciding the type of relationship
If τ_r = .7, then we choose owl:equivalentClass
39
Cluster type selection
  • Existing tools use various strategies to generate
    candidates from classes, individuals or
    properties
  • ILIADS supports
  • Randomly select from the three types
  • Weighted random (more classes than individuals
    means classes will be selected more often)
  • Classes first / Individuals first
  • Alternate at each step

40
Axiom selection policies
  • The number of inference steps is small
  • The axioms applied must make a difference
  • ILIADS always selects from relevant axioms
    according to a policy:
  • Random
  • Property axioms first (e.g., owl:TransitiveProperty)
  • Class axioms first (e.g., rdfs:subClassOf)
  • Transitive/Inverse/Functional first (since they
    tend to generate sameAs relationships)

41
Contents
  • Motivation and goals
  • Short overview of OWL Lite
  • The ILIADS method
  • Experimental evaluation

42
Experimental framework
  • 30 pairs of ontologies
  • Ontologies from 194 to over 20,000 triples
  • Ground truth provided by human reviewers
  • Comparison in terms of recall and precision with
    FCA-merge and COMA
  • Two versions of the algorithm:
  • Best overall average quality: ILIADS-FP
  • Best parameters for each pair: ILIADS-BP

43
ILIADS-BP parameter setting
44
Precision/recall
45
Precision/recall comparison
46
Precision/recall for ontologies with substantial
instance data
47
False negative analysis
48
Number of inference steps
  • The number of inference steps (5) was chosen as the
    best compromise between

49
Cluster type/axiom selection policies
50
And the result is...
(discoveredBy, owl:inverseOf, discoverer)
(discoveredBy, rdf:type, owl:FunctionalProperty)
(discoveredBy, owl:inverseOf, discoverer)
(associatedWith, rdf:type, owl:TransitiveProperty)
(resultsFrom, rdfs:subPropertyOf, associatedWith)
51
Choosing the parameters
  • The structural similarity coefficients strongly
    correlate with the average node degree
  • The structural coefficient for classes correlates
    with the number of rdfs:subClassOf relationships
  • The extensional coefficients correlate with the
    ratio of instances to classes

52
Parameter sensitivity
  • Structural coefficients are stable around the
    ILIADS-FP setting for 25 out of 30 pairs
  • The remaining 5 pairs have large differences
    between their average node degrees
  • Extensional coefficients are stable around the
    ILIADS-FP setting for 21 pairs
  • The remaining 9 pairs have a low ratio of
    instances to classes (< 1.9)

53
Experimental results summary
  • ILIADS has better quality than COMA and
    FCA-merge, with a significant difference for all
    pairs with substantial instance data
  • Matching properties is the major cause of false
    negatives for all three systems, but ILIADS does
    better at matching instances
  • Structural and extensional coefficients correlate
    with structural properties and are stable for
    ontologies with similar structure

54
Conclusions
  • New algorithm that tightly integrates statistical
    matching and logical inference to produce better
    quality alignments
  • Found intriguing correlations between structure
    and matching strategies
  • Improvement over existing systems
  • 25% higher quality than FCA-merge
  • 11% higher recall than COMA at comparable
    precision

55
  • Thank you!
  • Questions & comments

56
Related work
  • [Aumueller et al., 2005] D. Aumueller, H. H. Do,
    S. Massmann, and E. Rahm. Schema and ontology
    matching with COMA.
  • [Bao et al., 2006] Jie Bao, Doina Caragea, and
    Vasant Honavar. Modular ontologies - a formal
    investigation of semantics and expressivity.
  • [Calvanese et al., 2001] Diego Calvanese,
    Giuseppe De Giacomo, and Maurizio Lenzerini. A
    framework for ontology integration.
  • [Euzenat and Valtchev, 2003] J. Euzenat and P.
    Valtchev. An integrative proximity measure for
    ontology alignment.
  • [Euzenat et al., 2004] Jerome Euzenat, David
    Loup, Mohamed Touzani, and Petko Valtchev.
    Ontology alignment with OLA.
  • [Giunchiglia et al., 2005] Fausto Giunchiglia,
    Pavel Shvaiko, and Mikalai Yatskevich. S-Match:
    an algorithm and an implementation of semantic
    matching.
  • [McGuinness et al., 2000] Deborah L. McGuinness,
    Richard Fikes, James Rice, and Steve Wilder. An
    environment for merging and testing large
    ontologies.
  • [Noy and Musen, 2003] Natalya F. Noy and Mark A.
    Musen. The PROMPT suite: interactive tools for
    ontology merging and mapping.
  • [Stumme and Maedche, 2001] Gerd Stumme and
    Alexander Maedche. FCA-MERGE: bottom-up merging
    of ontologies.