Evaluating semantic similarity using GML in Geographic Information Systems

1
Evaluating semantic similarity using GML in
Geographic Information Systems
  • Fernando Ferri 1, Anna Formica 2, Patrizia
    Grifoni 1, and Maurizio Rafanelli 2
  • 1 IRPPS-CNR, via Nizza 128, 00198 Roma, Italy
  • fernando.ferri_at_irpps.cnr.it,
    patrizia.grifoni_at_irpps.cnr.it
  • 2 IASI-CNR, viale Manzoni 30, 00185 Roma, Italy
  • formica_at_iasi.cnr.it, rafanelli_at_iasi.cnr.

2
Summary
  • Motivation
  • Related works
  • Coding a Part-of Hierarchy using GML
  • Similarity evaluation
  • Conclusion

3
Motivation (1)
  • In Geographic Information Systems (GISs) semantic
    similarity plays an important role, as it
    supports the identification of objects that are
    conceptually close, but not identical.
  • GML (Geography Markup Language) is emerging as
    the dominant standard for exchanging geographic
    data across the Internet.
  • A semantic similarity model facilitates the
    comparison of entities and allows information
    retrieval and integration to handle semantically
    similar concepts. The goal of a similarity model
    is to obtain more flexible and better matches
    between user-expected and system-retrieved
    information.

4
Motivation (2)
  • Given the relevance of the Is-in relationship in
    the geographic context, we focus on GML elements
    organized according to Part-of (meronymic)
    hierarchies.
  • The semantics essentially concerns parts which
    are similar to and inseparable from the whole.

5
Related works (1)
  • Similarity of hierarchically related concepts has
    been widely investigated in the literature
    (Resnik; Rodriguez and Egenhofer).
  • From the various proposals, we followed the
    probabilistic approach of Lin, which is based on
    the notion of information content and overcomes
    the drawbacks of the traditional edge-counting
    approach.

6
Related Works (2)
  • Resnik proposes algorithms that take advantage of
    taxonomic similarity in resolving syntactic and
    semantic ambiguities.
  • Lin builds on Resnik's work and also takes into
    account the information content of the concepts
    being compared.

7
Coding a Part-of Hierarchy with GML (1)
  • The real world in the geographic domain can be
    represented as a set of features, and
    AbstractFeatureType codifies a geographic feature
    in GML.
  • Its geometry type is an important property: it is
    given in the reference coordinate system and
    describes the extent, position or relative
    location of the represented concept.

8
Coding a Part-of Hierarchy with GML (2)
  • The geometric types defined in GML provide the
    framework for modelling all the geographical
    concepts.
  • By means of this framework it is possible to
    model, for example, the concepts composing a
    communication ways network, such as roads,
    rivers, canals and other communication
    infrastructures.

9
Coding a Part-of Hierarchy with GML (3)
  • This figure shows an example of a type hierarchy
    that introduces concepts concerning communication
    infrastructures starting from the GML geometric
    types.

10
Coding a Part-of Hierarchy with GML (4)
  • As mentioned in the motivation, due to the
    relevance of the Is-in relationship in the
    geographic context, the paper focuses on GML
    elements organized according to Part-of
    (meronymic) hierarchies.
  • For instance, in our example a Part-of
    relationship exists between communication ways
    (ComWay) and roads, rivers and canals.

11
Coding a Part-of Hierarchy with GML (5)
  • In the literature, Part-of hierarchies are
    usually modelled in XML using sequences of
    elements, and a similar approach could be
    followed in GML.
  • However, this approach does not make it possible
    to distinguish between elements of the Part-of
    hierarchy and other elements defined outside the
    Part-of hierarchy, such as Kind and Country.

12
Coding a Part-of Hierarchy with GML (6)
  • To make meronymic relationships explicit within
    the GML element hierarchy, a Part-of hierarchy
    can be modelled by introducing special
    geographic types such as PartOfWayType,
    PartOfRivType and PartOfCanType.
  • Each special type models a Part-of relationship
    between a geographic concept and its component
    concepts.

13
Coding a Part-of Hierarchy with GML (7)
    <element name="ComWay" type="ComWayType"/>
    <element name="Road" type="RoadType"/>
    <element name="River" type="RiverType"/>
    <element name="Canal" type="CanalType"/>
    <element name="NavRiver" type="NavSegmentType"/>
    <element name="NNavRiver" type="NNavSegmentType"/>
    <element name="NavCanal" type="NavSegmentType"/>
    <element name="NNavCanal" type="NNavSegmentType"/>

    <complexType name="ComWayType">
      <sequence>
        <element name="kind" type="string"/>
        <element name="country" type="string"/>
        <element name="PartOfWay" type="PartOfWayType"/>
      </sequence>
      <attribute name="label" type="string"/>
      <attribute name="length" type="integer"/>
    </complexType>

  • This GML code shows how a meronymic relationship
    can be made explicit within the GML element
    hierarchy by introducing a special geographic
    type such as PartOfWayType.

14
Evaluating similarity (1)
  • To evaluate concept similarity, this paper
    combines and revisits:
  • the information content approach [Lin98],
  • a proposal inspired by the maximum weighted
    matching problem in bipartite graphs [FM02].

15
Evaluating similarity (2)
  • The starting assumption is that associating
    probabilities with the Part-of taxonomy allows
    us to introduce the notion of a weighted element
    hierarchy. In our example the probabilities have
    been estimated in line with WordNet 2.0.
  • For instance, the concepts Road and River are
    defined below, with their frequencies (the
    numbers in parentheses).

    (95) Road: an open way (generally public)
    for travel and transportation

    (55) River: a large natural stream of water
    (larger than a creek)

16
Evaluating similarity (3)
  • The probability of a concept
  • The probability of a concept c is defined as
  • p(c) = freq(c) / N
  • where freq(c) is the frequency of the concept c
    in the taxonomy, and N is the total number of
    concepts.
  • In the example probabilities have been assigned
    according to WordNet.
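
The definition above can be sketched in a few lines of Python. The counts for Road (95) and River (55) are the ones quoted earlier in the slides; the Canal count and the total N are placeholders chosen only for illustration, since the transcript does not give them.

```python
# Frequencies per concept. Road and River come from the slides;
# the Canal count is a hypothetical placeholder.
freq = {"Road": 95, "River": 55, "Canal": 20}

# Assumed normalising total N; the real value is not stated in the transcript.
N = 1000


def probability(concept: str) -> float:
    """Return p(c) = freq(c) / N for a concept in the taxonomy."""
    return freq[concept] / N


if __name__ == "__main__":
    for concept in freq:
        print(concept, probability(concept))
```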

17
Evaluating similarity (4)
  • Example Weighted Concept Hierarchy

18
Evaluating similarity (5)
  • Following the standard approach of information
    theory [Ross76], the information content of a
    concept c can be quantified as
  • -log p(c)
  • that is, as the probability increases, the
    informativeness decreases.
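
As a minimal sketch of this definition, the function below computes -log p(c) with the natural logarithm. The choice of logarithm base is an assumption; it does not affect the similarity ratio used later, because the base cancels out there.

```python
import math


def information_content(p: float) -> float:
    """Return -log p for a probability p in (0, 1]."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in the interval (0, 1]")
    return -math.log(p)


if __name__ == "__main__":
    # A rarer concept (smaller p) carries more information.
    for p in (0.5, 0.1, 0.01):
        print(p, information_content(p))
```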

19
Evaluating similarity (6)
  • The information content similarity (ics) of
    two concepts such as River and Canal is defined
    as
  • ics(River, Canal) = 2 log p(ComWay) / (log
    p(River) + log p(Canal)) = 0.72
  • where ComWay is the concept representing the
    maximum information content shared by River and
    Canal. According to Lin's approach, the more
    information two concepts share, the more similar
    they are.
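
The ratio above can be sketched as follows. The probabilities passed in the usage example are hypothetical (the weighted hierarchy figure is not part of this transcript), so the printed value is not expected to reproduce the 0.72 reported in the slides.

```python
import math


def ics(p_shared: float, p_a: float, p_b: float) -> float:
    """Lin-style similarity: 2 log p(shared) / (log p(a) + log p(b)).

    p_shared is the probability of the concept that subsumes both a and b
    (ComWay in the River/Canal example).
    """
    denominator = math.log(p_a) + math.log(p_b)
    if denominator == 0.0:
        # Both concepts have probability 1 and carry no information.
        raise ValueError("similarity is undefined when p(a) = p(b) = 1")
    return 2.0 * math.log(p_shared) / denominator


if __name__ == "__main__":
    # Hypothetical probabilities, for illustration only.
    print(ics(p_shared=0.2, p_a=0.1, p_b=0.1))
```

A concept compared with itself (with itself as the shared concept) gets similarity 1, and the value shrinks as the shared ancestor becomes more generic.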

20
Evaluating similarity (7)
  • Structural similarity (asim)
  • Inspired by the maximum weighted matching
    problem in bipartite graphs, we have to identify
    the set of pairs of typed attributes that
    maximizes the sum of the products of the
    information content similarity of the attributes
    and that of their types.

21
Evaluating similarity (8)
  • Example

    RiverType              CanalType
    label: string          label: string
    length: integer        profundity: integer
    flow: integer          capacity: integer
    deepness: integer      length: integer

22
Evaluating similarity (9)
  • In the previous example the set of pairs of
    attributes that maximizes the sum of the related
    information content similarity is the following:
  • (label, label), (length, length),
    (flow, capacity), (deepness, profundity)

23
Evaluating similarity (10)
  • In fact, by assuming that deepness and
    profundity are synonyms, we have
  • ics(label, label) = ics(length, length) =
    ics(deepness, profundity) = 1
  • and ics(flow, capacity) = 0.

24
Evaluating similarity (11)
  • The similarity of the sets of attributes of
    complexTypes (asim) is therefore defined by the
    above maximum sum divided by the greatest of the
    cardinalities of the sets of attributes of the
    types compared.
  • In the case of RiverType and CanalType we
    have
  • asim(RiverType, CanalType) = 3/4 = 0.75
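
The matching step can be illustrated with a small brute-force sketch over the RiverType and CanalType attributes from the example. It makes two simplifying assumptions that are not in the source: attribute and type similarities are reduced to 1 (identical or listed as synonyms) or 0, and the best matching is found by trying every assignment, which is only practical for small attribute sets.

```python
from itertools import permutations

# Typed attributes (name, type) from the RiverType / CanalType example.
RIVER_TYPE = [("label", "string"), ("length", "integer"),
              ("flow", "integer"), ("deepness", "integer")]
CANAL_TYPE = [("label", "string"), ("profundity", "integer"),
              ("capacity", "integer"), ("length", "integer")]

# Assumption taken from the example: deepness and profundity are synonyms.
SYNONYMS = {frozenset({"deepness", "profundity"})}


def name_similarity(a: str, b: str) -> float:
    """Simplified attribute similarity: 1 if equal or synonyms, else 0."""
    return 1.0 if a == b or frozenset({a, b}) in SYNONYMS else 0.0


def type_similarity(s: str, t: str) -> float:
    """Simplified type similarity: 1 if the types are identical, else 0."""
    return 1.0 if s == t else 0.0


def asim(xs, ys) -> float:
    """Best matching score divided by the larger attribute-set size."""
    if len(xs) > len(ys):
        xs, ys = ys, xs  # make xs the smaller set
    if not ys:
        return 0.0
    best = max(
        sum(name_similarity(a[0], b[0]) * type_similarity(a[1], b[1])
            for a, b in zip(xs, candidate))
        for candidate in permutations(ys, len(xs))
    )
    return best / len(ys)


if __name__ == "__main__":
    print(asim(RIVER_TYPE, CANAL_TYPE))
```

For larger attribute sets, an assignment solver such as the Hungarian algorithm would be a more suitable replacement for the exhaustive search.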

25
Evaluating similarity (12)
  • Concept Similarity (Gsim)
  • The similarity (Gsim) of the concepts River and
    Canal is defined as
  • Gsim(River, Canal) = (ics(River, Canal) * w +
    asim(River, Canal) * (1 - w)) *
    Bt(RiverType, CanalType)
  • where
  • ics(River, Canal) is the information content
    similarity
  • asim(River, Canal) is the structural similarity
  • w is a weight, s.t. 0 < w < 1.
  • Bt is a Boolean function that, given two
    complexTypes, returns 0 if their least upper
    bound in the type hierarchy is
    AbstractFeatureType, otherwise it returns 1.

26
Evaluating similarity (13)
  • In particular, if we assume w = 0.5
  • Gsim(River, Canal) = (ics(River, Canal) * w +
    asim(River, Canal) * (1 - w)) *
    Bt(RiverType, CanalType)
  • Gsim(River, Canal) = 0.5 * (0.72 + 0.75) * 1 =
    0.735 ≈ 0.74
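
The arithmetic on this slide can be checked with a short sketch. The function name and signature are illustrative only; the inputs 0.72, 0.75, w = 0.5 and Bt = 1 are the values given in the slides.

```python
def gsim(ics_value: float, asim_value: float, w: float, bt: int) -> float:
    """Weighted combination of ics and asim, gated by the Boolean factor Bt."""
    if not 0.0 < w < 1.0:
        raise ValueError("w must satisfy 0 < w < 1")
    if bt not in (0, 1):
        raise ValueError("bt must be 0 or 1")
    return (ics_value * w + asim_value * (1.0 - w)) * bt


if __name__ == "__main__":
    # Values from the River / Canal example in the slides.
    value = gsim(ics_value=0.72, asim_value=0.75, w=0.5, bt=1)
    print(round(value, 3))
```

When Bt is 0, that is, when the two types only meet at AbstractFeatureType, the result is 0 regardless of the other two scores.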

27
Conclusion
  • Thank you