Title: A Unified Schema Matching Framework


1
A Unified Schema Matching Framework
  • Alsayed Algergawy, Eike Schallehn, and Gunter Saake
  • Institut für Technische und Betriebliche
    Informationssysteme
  • Otto-von-Guericke-Universität Magdeburg
  • Postfach 4120, D-39016 Magdeburg, Germany
  • alshaht/ eike / saake_at_iti.cs.uni-magdeburg.de

2
Outline
  • What is schema matching?
  • Where is schema matching used?
  • Schema Matching Challenges
  • The Proposed Framework
  • Summary and Future Work

19. Workshop über Grundlagen von Datenbanken
3
Schema Matching
Area A
Area B
<Schema name="Schema T">
  <ElementType name="Customer">
    <element type="FName"/>
    <element type="LName"/>
    <element type="CAddress"/>
  </ElementType>
  <ElementType name="CAddress">
    <element type="street"/>
    <element type="city"/>
    <element type="provine"/>
    <element type="code"/>
  </ElementType>
</Schema>
<Schema name="Schema S">
  <ElementType name="AccountOwner">
    <element type="Name"/>
    <element type="Address"/>
    <element type="BirthDate"/>
  </ElementType>
  <ElementType name="Address">
    <element type="street"/>
    <element type="city"/>
    <element type="state"/>
    <element type="ZIP"/>
  </ElementType>
</Schema>
4
Schema Matching: Definition
  • Schema matching is defined as the task of
    finding the semantic correspondences between
    elements of two schemas.

[Figure: the Match operation takes schemas S1 and S2 plus auxiliary information (user feedback, dictionaries, previous mappings) as input and produces the match result.]
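
The Match operation can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: the function names, the use of difflib.SequenceMatcher for name similarity, the synonym dictionary standing in for auxiliary information, and the 0.8 threshold are all assumptions. The element names come from the example schemas S and T, flattened to plain lists.

```python
from difflib import SequenceMatcher


def name_similarity(a, b):
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def match(s1, s2, synonyms=None, threshold=0.8):
    """Return (element of s1, element of s2, similarity) triples.

    s1, s2   -- lists of element names (a deliberately flat schema model)
    synonyms -- auxiliary information: a set of frozenset name pairs
                (hypothetical dictionary format)
    """
    synonyms = synonyms or set()
    result = []
    for e1 in s1:
        for e2 in s2:
            if frozenset((e1.lower(), e2.lower())) in synonyms:
                sim = 1.0  # the dictionary overrides the string measure
            else:
                sim = name_similarity(e1, e2)
            if sim >= threshold:
                result.append((e1, e2, sim))
    return result


schema_s = ["Name", "Address", "street", "city", "state", "ZIP"]
schema_t = ["FName", "LName", "CAddress", "street", "city", "provine", "code"]
aux = {frozenset(("zip", "code")), frozenset(("state", "provine"))}

for e1, e2, sim in match(schema_s, schema_t, synonyms=aux):
    print(f"{e1} <-> {e2}: {sim:.2f}")
```

Without the dictionary, ZIP/code and state/provine would not be found by string similarity alone, which is the role the slide assigns to auxiliary information.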
5
Where is schema matching used?
  • To motivate the importance of schema
    matching, we summarize its use in several
    application domains:
  • Databases: data integration, data warehouses,
    e-commerce, query processing, peer data
    management, model management
  • Artificial intelligence: knowledge bases,
    ontology merging
  • Web: semantic web services

6
Data Integration
  • Problem: Construct a global view from a set of
    independently constructed schemas that differ
    in structure and terminology.
  • Solution: Schema matching is performed to find
    relationships between concepts in each schema.
    The matching elements can then be unified.

7
Query Processing
  • Problem: The terms used in the user's query may
    differ from those in the database.
  • Solution: Matching is used to map the
    user-specified concepts in the query to schema
    elements.

8
Challenges of Schema Matching
  • Despite its pervasiveness and importance,
    schema matching remains an extremely difficult
    problem:
  • Representation problems: different representation
    models, different names and structures
  • Semantic problems: clues in schema and data are
    incomplete and unreliable
  • Computational cost problems
  • Subjectivity: the result is subjective and
    depends on the application

9
  • So we need a unified schema matching system
    that
  • is independent of the schema models,
  • is independent of the application domains,
  • accurately identifies mapping elements, and
  • addresses both match effectiveness and match
    efficiency.

[Figure: the example schemas S and T shown on slide 3, annotated with "Representation P." (representation problems) and "Semantic P." (semantic problems).]
  • tedious
  • time consuming
  • error prone, and
  • expensive

10
General Schema Matching Procedure (Proposed Framework)
  • The schema matching process requires the
    following main phases:
  • 1. Importing the schemas to be matched:
    TransMat phase
  • 2. Identifying the elements to be matched:
    Pre-Match phase
  • 3. Applying the matching algorithm: Matching
    phase
  • 4. Exporting the match result: MapTrans phase
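
The four phases above can be read as a simple pipeline. The skeleton below is a sketch under stated assumptions, not the framework itself: every function name, the dict-of-lists schema model, and the exact-name similarity function are placeholders chosen for the example.

```python
def trans_mat(schema):
    """Phase 1 (TransMat): import a schema into a common internal model.
    Assumed model: dict mapping a node name to its list of child names."""
    return {node: list(children) for node, children in schema.items()}


def pre_match(graph):
    """Phase 2 (Pre-Match): identify the elements to be matched.
    Here: every node of the graph, in sorted order."""
    nodes = set(graph)
    for children in graph.values():
        nodes.update(children)
    return sorted(nodes)


def matching(elems1, elems2, sim, threshold=0.5):
    """Phase 3 (Matching): apply a matcher to every element pair."""
    return [(a, b) for a in elems1 for b in elems2 if sim(a, b) >= threshold]


def map_trans(pairs):
    """Phase 4 (MapTrans): export the match result, here as a plain dict."""
    mapping = {}
    for a, b in pairs:
        mapping.setdefault(a, []).append(b)
    return mapping


def run_pipeline(schema1, schema2, sim):
    g1, g2 = trans_mat(schema1), trans_mat(schema2)
    return map_trans(matching(pre_match(g1), pre_match(g2), sim))


def exact(a, b):
    """Placeholder matcher: 1.0 for equal names (ignoring case), else 0.0."""
    return 1.0 if a.lower() == b.lower() else 0.0


s = {"Address": ["street", "city", "state", "ZIP"]}
t = {"CAddress": ["street", "city", "provine", "code"]}
print(run_pipeline(s, t, exact))
```

Keeping each phase behind its own function is what lets a different importer, matcher, or exporter be swapped in without touching the others.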

12
TransMat Phase
  • TransMat: transformation for the matching process.
  • Its goal is to make the matching process generic.
  • A common model is chosen to represent the
    schemas to be matched.
  • A graph data structure is used for the internal
    representation.
  • Graphs are a well-known data structure.
  • This transforms the schema matching problem
    into a well-known problem: graph matching.

13
Pre-Match Phase
  • It is a critical phase.
  • Its output affects the input of the matching phase.
  • It depends on the type of matching algorithm used.
  • In rule-based systems:
  • COMA → graph traversal to identify the
    elements to be matched → nodes, paths,
    fragments (COMA)
  • In learner-based systems:
  • It is called the training phase.
  • It uses AI techniques:
  • Neural networks → SemInt
  • Machine learning → LSD, iMAP
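
For the rule-based case, identifying paths by graph traversal might look like the sketch below. It is a generic depth-first enumeration written for this example and is not taken from COMA; the function name and the dict-based graph are assumptions.

```python
def enumerate_paths(graph, root):
    """Depth-first traversal returning every root-to-node path as a tuple."""
    stack = [(root,)]
    paths = []
    while stack:
        path = stack.pop()
        paths.append(path)
        # Push children in reverse so they are visited in their listed order.
        for child in reversed(graph.get(path[-1], [])):
            if child not in path:  # guard against cycles
                stack.append(path + (child,))
    return paths


graph_s = {
    "AccountOwner": ["Name", "Address", "BirthDate"],
    "Address": ["street", "city", "state", "ZIP"],
}

for p in enumerate_paths(graph_s, "AccountOwner"):
    print(".".join(p))
```

Matching on paths rather than bare node names lets the same name be treated differently depending on its context in the schema.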

14
Matching Phase
  • It is the most important phase
    → identification of corresponding elements

15
Element Matcher
  • Element property: atomic / structure,
    schema-based / instance-based
  • Matcher algorithm: rule-based, learner-based
  • Auxiliary information
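
Two small element matchers illustrate these dimensions: a rule-based, schema-based matcher on atomic element names, and a matcher driven by auxiliary information. Both are sketches; the trigram Dice measure and the synonym dictionary are assumptions picked for the example, not matchers named in the slides.

```python
def ngrams(name, n=3):
    """Set of character n-grams of a lower-cased name
    (the whole name if it is shorter than n)."""
    s = name.lower()
    return {s[i:i + n] for i in range(len(s) - n + 1)} or {s}


def ngram_matcher(a, b, n=3):
    """Rule-based name matcher: Dice coefficient over character n-grams."""
    ga, gb = ngrams(a, n), ngrams(b, n)
    return 2 * len(ga & gb) / (len(ga) + len(gb))


def dictionary_matcher(a, b, synonyms):
    """Matcher using auxiliary information: 1.0 for equal names or known
    synonyms, otherwise 0.0."""
    a, b = a.lower(), b.lower()
    return 1.0 if a == b or frozenset((a, b)) in synonyms else 0.0


SYNONYMS = {frozenset(("zip", "code"))}  # hypothetical dictionary entry

print(ngram_matcher("Address", "CAddress"))
print(ngram_matcher("ZIP", "code"))
print(dictionary_matcher("ZIP", "code", SYNONYMS))
```

The two matchers disagree on ZIP/code, which is the situation the similarity combiner has to resolve.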
16
Similarity Combiner
  • Semantics of schema elements → the available
    information may vary.
  • The relationships between them are fuzzy.
  • A unified schema matching framework should
    implement multiple matchers.
  • For every element pair, several similarity values
    are computed.
  • To combine these values, a similarity combiner is
    used.
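
A similarity combiner can be as simple as an aggregation function over the values that the individual matchers return for one element pair. The sketch below shows a few common aggregation strategies; the function name, the strategy names, and the sample values are assumptions, and the slides do not say which strategy the framework uses.

```python
def combine(similarities, strategy="average", weights=None):
    """Combine the values several matchers produced for one element pair."""
    if not similarities:
        raise ValueError("need at least one similarity value")
    if strategy == "max":
        return max(similarities)
    if strategy == "min":
        return min(similarities)
    if strategy == "average":
        return sum(similarities) / len(similarities)
    if strategy == "weighted":
        if weights is None or len(weights) != len(similarities):
            raise ValueError("weighted strategy needs one weight per value")
        total = sum(w * s for w, s in zip(weights, similarities))
        return total / sum(weights)
    raise ValueError(f"unknown strategy: {strategy}")


# Hypothetical values from three matchers for the pair (state, provine)
values = [0.2, 1.0, 0.6]
print(combine(values, "average"))
print(combine(values, "max"))
print(combine(values, "weighted", weights=[1, 2, 1]))
```

The choice matters: "max" trusts the most optimistic matcher, "min" the most pessimistic, and the weighted form lets a more reliable matcher count for more.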

17
Similarity Selector
  • Not all the identified corresponding elements are
    correct mappings.
  • Some selection criteria should be used to select
    the most suitable mappings.
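
One possible selection criterion is a threshold combined with keeping only the best candidates per source element. The sketch below assumes that criterion; the function name, the parameters threshold and top_k, and the sample similarity values are illustrative and not taken from the slides.

```python
def select(similarity, threshold=0.5, top_k=1):
    """Keep, for each source element, the top_k candidates whose combined
    similarity is at or above the threshold.

    similarity -- dict mapping (source, target) pairs to similarity values
    """
    by_source = {}
    for (src, tgt), sim in similarity.items():
        if sim >= threshold:
            by_source.setdefault(src, []).append((sim, tgt))
    selected = []
    for src, candidates in by_source.items():
        candidates.sort(key=lambda c: c[0], reverse=True)
        selected.extend((src, tgt, sim) for sim, tgt in candidates[:top_k])
    return selected


# Hypothetical combined similarities
sims = {
    ("state", "provine"): 0.7,
    ("state", "street"): 0.55,
    ("ZIP", "code"): 0.9,
    ("ZIP", "city"): 0.1,
}
print(select(sims))
```

Here state/street passes the threshold but loses to state/provine, and ZIP/city is dropped by the threshold alone.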

18
MapTrans Phase
  • MapTrans: mapping transformation.
  • The match result should be exported to the
    application domain.
  • The export inherently depends on
  • the matching cardinality and
  • the mapping representation.
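
The sketch below shows how both factors could enter an export step: it classifies the cardinality of a match result and serializes it as JSON. JSON is only one possible mapping representation, and the function names and field names are assumptions made for the example.

```python
import json
from collections import Counter


def cardinality(pairs):
    """Classify (source, target) correspondences as 1:1, 1:n, n:1 or n:m."""
    src_counts = Counter(s for s, _ in pairs)
    tgt_counts = Counter(t for _, t in pairs)
    one_to_many = any(c > 1 for c in src_counts.values())
    many_to_one = any(c > 1 for c in tgt_counts.values())
    if one_to_many and many_to_one:
        return "n:m"
    if one_to_many:
        return "1:n"
    if many_to_one:
        return "n:1"
    return "1:1"


def export_mapping(pairs, source="Schema S", target="Schema T"):
    """Serialize the match result as JSON (one possible representation)."""
    return json.dumps({
        "source": source,
        "target": target,
        "cardinality": cardinality(pairs),
        "correspondences": [{"from": s, "to": t} for s, t in pairs],
    }, indent=2)


pairs = [("Name", "FName"), ("Name", "LName"), ("ZIP", "code")]
print(export_mapping(pairs))
```

In this example Name corresponds to both FName and LName, so the result is a 1:n mapping that the consuming application has to handle, for instance by concatenation.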

22
Summary and Future Work
  • Schema matching is a pervasive, important,
    and extremely difficult problem.
  • A unified schema matching framework is proposed
    to cope with the schema matching problem.
  • Many research points remain open:
  • Problem formulation
  • The internal representation
  • Pre-matching, especially for rule-based systems
  • Choice of matcher algorithm with respect to
    performance

23
References
  • Aumüller, D., H.H. Do, S. Massmann, E. Rahm:
    Schema and Ontology Matching with COMA++
    (Software Demonstration). Proc. 24. ACM SIGMOD
    Intl. Conf. Management of Data, 2005
  • Berlin, J., A. Motro: Database Schema Matching
    Using Machine Learning with Feature Selection.
    Proc. 14. Intl. Conf. Advanced Information
    Systems Engineering (CAiSE), 2002
  • Clifton, C., E. Housman, A. Rosenthal: Experience
    with a Combined Approach to Attribute-Matching
    Across Heterogeneous Databases. Proc. IFIP 2.6
    Working Conf. Database Semantics, 1996
  • Do, H.H., E. Rahm: COMA - A System for Flexible
    Combination of Schema Matching Approaches. Proc.
    Intl. Conf. Very Large Databases (VLDB), 2002
  • Doan, A.H., A. Halevy: Semantic Integration
    Research in the Database Community: A Brief
    Survey. AI Magazine, Special Issue on Semantic
    Integration, 2005
  • Rahm, E., P.A. Bernstein: A Survey of Approaches
    to Automatic Schema Matching. VLDB Journal,
    10(4), 2001

24
Thank you