Peer-to-Peer Data Integration Using Distributed Bridges

Slides: 14
Provided by: joepu8

1
Peer-to-Peer Data Integration Using Distributed Bridges
Candidate Thesis for M.A.Sc. in Electrical Engineering
  • Neal Arthorne
  • B. Eng. Computer Systems (2002)
  • Supervisor: Babak Esfandiari
  • April 12, 2005

2
Introduction
  • Multiple autonomous heterogeneous data sources
  • E.g. chemistry and genetics databases, digital
    repositories, astronomy databases
  • Data is distributed and network-accessible
  • Each data source may use a different syntax or
    query language (SQL, Web Services etc.)

3
Related Work
  • Federated database systems (Sheth, 1990)
  • Global federated schema
  • Mediator approach (Wiederhold, 1992)
  • Databases wrapped in a software layer that
    translates to a common information model
  • Middleware lies between user applications and
    data sources
  • Theoretical description of data integration
  • Schemas mapped with first-order logic (FOL)
    statements: LAV (local-as-view) or GAV
    (global-as-view) approach
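
To make the LAV/GAV distinction concrete, here is a rough illustration. The relation names (s_1, s_2, Book, InPrint) are hypothetical and do not come from the thesis. In GAV each global relation is defined by a query over the sources; in LAV each source relation is described as a view over the global schema.

```latex
% GAV: the global relation Book is populated from source relations s_1 and s_2
\forall t\,\forall a\; \bigl( s_1(t, a) \lor s_2(t, a) \rightarrow \mathit{Book}(t, a) \bigr)

% LAV: the source relation s_1 is described in terms of the global schema
\forall t\,\forall a\; \bigl( s_1(t, a) \rightarrow \mathit{Book}(t, a) \land \mathit{InPrint}(t) \bigr)
```

In both cases the mapping refers to a single global schema, which is the dependency the later slides argue against.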

4
Related Work (Cont'd)
  • OWL/RDF/RDFS used to describe semantic
    relationships: ontologies
  • Peer-to-Peer Data Integration: WWW approach to
    integrating data
  • PIAZZA (Halevy et al.), Lenzerini, Franconi
  • Focused on query optimization and decidability in
    FOL systems

5
Limitations of Related Work
  • Global shared schemas are fragile and not
    scalable
  • Centrally located and administered
  • Changes affect all component databases or
    middleware
  • P2P data integration is limited
  • Semantic differences not addressed
  • Centrally stored mappings
  • Large databases not compatible with centralized
    metadata

6
Proposed Solutions
  • User-contributed mappings between schemas
    (bridges)
  • Fully decentralized distribution of mappings
  • Anyone can publish a new mapping
  • No global schema means improved scalability
  • Provide semantic mappings for data
  • Distributed searching compatible with large
    databases
  • Use existing Universal Peer-to-Peer (U-P2P)
    framework

7
Universal Peer-to-Peer (U-P2P)
  • Peers share XML metadata with binary attachments
  • Communities formed around a shared XML Schema
  • Community itself is published: anyone can create
    a community
  • Flexible deployment: pluggable Network Adapters
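
The community and resource idea can be sketched as follows. This is only an illustration, not U-P2P's actual data model: the class names, the `belongs_to` helper, and the use of a required-field list as a stand-in for a real XML Schema are all assumptions.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class Community:
    name: str
    root_tag: str
    required_fields: list  # stand-in for a real XML Schema (assumption)


@dataclass
class Resource:
    xml_text: str
    attachments: list = field(default_factory=list)  # names of binary attachments


def belongs_to(resource, community):
    """Return True if the resource's XML has the community's root tag and fields."""
    root = ET.fromstring(resource.xml_text)
    if root.tag != community.root_tag:
        return False
    child_tags = {child.tag for child in root}
    return all(name in child_tags for name in community.required_fields)


# Hypothetical example data modelled on the book community in the slide.
books = Community("Book Community", "book", ["title", "e-text"])
novel = Resource(
    "<book><title>War and Peace</title><e-text>file://...</e-text></book>",
    attachments=["war-and-peace.txt"],
)
print(belongs_to(novel, books))
```

A real deployment would validate against the community's published XML Schema rather than a field list.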

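[Figure: a Book Community and its Resources, with multiplicity labels 1 and 0..*; each Resource is XML metadata with Attachments, for example:]
  <?xml version="1.0"?>
  <book>
    <title>War & Peace</title>
    <e-text>file://...</e-text>
  </book>

8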
P2P Data Integration with U-P2P
  • Proposed Bridge Community and bridge schema
  • Anyone can publish a bridge
  • Includes simple semantic relation
  • Attached mappings and/or transforms
  • U-P2P modularized for database proxies
  • Distributed Network Adapter (Gnutella)
  • Compatible with large databases
  • No central indexing servers
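
Searching without a central index can be pictured as a Gnutella-style flood: a query is passed from neighbour to neighbour until its time-to-live (TTL) runs out. The sketch below simulates this in a single process. The `network` dictionary, peer names, and `flood_search` function are hypothetical and are not part of U-P2P or the Gnutella protocol itself.

```python
from collections import deque


def flood_search(network, start, matches, ttl=3):
    """Flood a query outward from `start`; return peers holding a matching resource."""
    hits = []
    seen = {start}  # peers that have already received the query
    queue = deque([(start, ttl)])
    while queue:
        peer, hops_left = queue.popleft()
        if any(matches(r) for r in network[peer]["resources"]):
            hits.append(peer)
        if hops_left == 0:
            continue  # TTL exhausted: do not forward further
        for neighbour in network[peer]["neighbours"]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, hops_left - 1))
    return hits


# Hypothetical four-peer chain: A - B - C - D
network = {
    "A": {"neighbours": ["B"], "resources": ["War and Peace"]},
    "B": {"neighbours": ["A", "C"], "resources": []},
    "C": {"neighbours": ["B", "D"], "resources": ["Peace Studies"]},
    "D": {"neighbours": ["C"], "resources": ["Peace Treaties"]},
}
print(flood_search(network, "A", lambda r: "Peace" in r, ttl=2))
```

With a TTL of 2 the query never reaches peer D. This shows the trade-off of flooding: no index server is needed, but coverage depends on the TTL.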

9
Bridges in U-P2P
[Figure: diagram relating Resource, Community, and Bridge]

10
Example Bridge
<bridge>
  <title>DSpace to Fedora bridge</title>
  <description></description>
  <bridgeMapping>
    <source>
      <community>d2a9d6f78dcf91828f68a52f78260e05</community>
      <resource>134d8f8ecd57acb35206b4cd13e38622</resource>
    </source>
    <relation>owl:sameAs</relation>
    <target>
      <community>d2a9d6f78dcf91828f68a52f78260e05</community>
      <resource>da1058314b7d8890fc7df7f879a0a7db</resource>
    </target>
  </bridgeMapping>
  <transformList>
    <transform>
      <file>file://</file>
    </transform>
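
A peer that downloads such a bridge would need to read the source, relation, and target out of it. The sketch below does this with Python's standard `xml.etree.ElementTree`. The `parse_bridge` helper is hypothetical, and the closing `</transformList>` and `</bridge>` tags are added here as an assumption, since the slide text stops before them.

```python
import xml.etree.ElementTree as ET

# Bridge document from the slide; the two closing tags at the end are assumed.
BRIDGE_XML = """
<bridge>
  <title>DSpace to Fedora bridge</title>
  <description></description>
  <bridgeMapping>
    <source>
      <community>d2a9d6f78dcf91828f68a52f78260e05</community>
      <resource>134d8f8ecd57acb35206b4cd13e38622</resource>
    </source>
    <relation>owl:sameAs</relation>
    <target>
      <community>d2a9d6f78dcf91828f68a52f78260e05</community>
      <resource>da1058314b7d8890fc7df7f879a0a7db</resource>
    </target>
  </bridgeMapping>
  <transformList>
    <transform>
      <file>file://</file>
    </transform>
  </transformList>
</bridge>
"""


def parse_bridge(xml_text):
    """Extract the mapping endpoints, relation, and transform files from a bridge."""
    root = ET.fromstring(xml_text.strip())
    mapping = root.find("bridgeMapping")
    return {
        "title": root.findtext("title"),
        "source": (
            mapping.findtext("source/community"),
            mapping.findtext("source/resource"),
        ),
        "relation": mapping.findtext("relation"),
        "target": (
            mapping.findtext("target/community"),
            mapping.findtext("target/resource"),
        ),
        "transforms": [t.findtext("file") for t in root.findall("transformList/transform")],
    }


bridge = parse_bridge(BRIDGE_XML)
print(bridge["relation"], bridge["source"][1], bridge["target"][1])
```

In this example the source and target share the same community ID and differ only in the resource ID.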

11
Case Study: Digital Repositories
[Figure: network diagram with the labels Peer A, Peer B, Peer C, DSpace Community, Fedora Community, Fedora Database, Proxy, Generic Central Server, Gnutella Protocol, and Centralized P2P]

12
Conclusion
  • P2P approach to integration: anyone can create a
    bridge
  • Fully-distributed network adapter brings in large
    data sources via proxies
  • Demonstrated integration with digital
    repositories
  • Simple semantic relationship (OWL)
  • Query translation
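
Query translation can be pictured as renaming the steps of a path expression according to a bridge's mapping. The sketch below is a simplification, not the thesis implementation. The `translate_xpath` function and the name map (item to object, title to label) are hypothetical, and only plain slash-separated paths are handled.

```python
def translate_xpath(xpath, name_map):
    """Rename element steps in a simple slash-separated XPath expression.

    Steps missing from `name_map` are passed through unchanged. Predicates,
    axes, and functions are not interpreted.
    """
    steps = xpath.split("/")
    return "/".join(name_map.get(step, step) for step in steps)


# Hypothetical mapping from a source schema's element names to a target's.
name_map = {"item": "object", "title": "label"}

print(translate_xpath("/item/title", name_map))
print(translate_xpath("//item/creator", name_map))
```

A step that carries a predicate, such as `item[title='x']`, is left unchanged because the whole step is looked up as one name.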

13
Future Work
  • Manual navigation between schemas
  • Need to automate retrieving bridges
  • XPath query translation is limited
  • Need to provide robust query translation modules
  • Translate instance data
  • Semantic relationships not exploited
  • Use OWL ontologies to give bridges a context
  • Software agents can be introduced to discover and
    use bridges
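
One way to automate bridge retrieval is to treat communities as nodes and bridges as directed edges, then search for a chain of bridges from one schema to another. The sketch below uses breadth-first search. The `find_bridge_path` function, the dictionary layout, and the "Library" community are hypothetical additions for illustration only.

```python
from collections import deque


def find_bridge_path(bridges, source, target):
    """Return the titles of the bridges linking source to target, or None."""
    queue = deque([(source, [])])
    seen = {source}
    while queue:
        community, path = queue.popleft()
        if community == target:
            return path
        for bridge in bridges:
            if bridge["source"] == community and bridge["target"] not in seen:
                seen.add(bridge["target"])
                queue.append((bridge["target"], path + [bridge["title"]]))
    return None  # no chain of bridges connects the two communities


bridges = [
    {"title": "DSpace to Fedora bridge", "source": "DSpace", "target": "Fedora"},
    # Hypothetical second bridge, not from the presentation.
    {"title": "Fedora to Library bridge", "source": "Fedora", "target": "Library"},
]

print(find_bridge_path(bridges, "DSpace", "Library"))
```

Because bridges are treated as directed here, no path is found from "Library" back to "DSpace". Whether a relation such as owl:sameAs should make a bridge usable in both directions is a design choice left open.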