Piazza: Data Management Infrastructure for Semantic Web Applications

09/12/2003. Peer-to-Peer Information Systems WS 03/04. 1 ... Zachary Ives, Jayant Madhavan, Alon Halevy, Dan Suciu, Nilesh Dalvi, Xin (Luna) ...

1
Piazza: Data Management Infrastructure for
Semantic Web Applications
  • Alon Y. Halevy, Zachary G. Ives, Peter Mork, Igor
    Tatarinov.

Speaker: Sergey Chernov. Tutor: Jens Graupmann.
2
Outline
  • 1. INTRODUCTION. SEMANTIC WEB.
  • 2. PIAZZA SYSTEM OVERVIEW
  • 3. IMPLEMENTATION DETAILS
  • 3.1 MAPPING LANGUAGE
  • 3.2 QUERY ANSWERING ALGORITHM
  • 4. CONCLUSIONS.

3
Introduction
  • Goal: data integration and knowledge management
  • Problem: Web data lacks machine-understandable semantics
  • Solution: Semantic Web?

4
The Semantic Web
  • Web sites include structural annotations
      • You can pose meaningful queries on them.
      • Ontologies provide the semantic glue.
      • Internal implementation of web sites left open.
  • Agents perform tasks
      • Query one or more web sites
      • Perform updates (e.g., set schedules)
      • Coordinate actions
      • Trust each other (or not).
  • I.e., agents operating on a gigantic
    heterogeneous distributed database.

(View by A. Halevy)
5
General requirements
  • Robust infrastructure for querying
  • Peer data management systems.
  • Facilitate mapping between different structures.
    Need tools for
  • Locating relevant structures
  • Easily joining the semantic web.
  • Get data into structured form
  • Should we worry about the legacy web?

6
Using views for specifying mappings
  • Local-As-View (LAV).
  • Data sources can be described as views over the
    mediated schema.
  • Global-As-View (GAV).
  • Mediated schema can be described as a set of
    views over the data sources.

[Figure: two diagrams, one per approach, each showing a mediated schema above Site A, Site B, and Site C]
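
The difference between the two styles can be sketched in a few lines of Python. This is only an illustration, not Piazza code: the relation names `site_a`, `site_b`, and `person`, and all sample tuples, are hypothetical.

```python
# Hypothetical source relations held at two sites: (name, area) tuples.
site_a = {("alice", "DB"), ("bob", "AI")}
site_b = {("carol", "DB")}

# GAV: the mediated relation `person` is defined as a view over the sources,
# so a query on the mediated schema is answered by evaluating that view.
def mediated_person_gav():
    return site_a | site_b

# LAV: a source is described as a view over the mediated relation.
# Here site_b is declared to hold only DB-area rows of `person`.
def site_b_description(person):
    return {(name, area) for (name, area) in person if area == "DB"}

def db_people_gav():
    # Query over the mediated schema, answered through the GAV definition.
    return sorted(name for (name, area) in mediated_person_gav() if area == "DB")

def site_b_is_consistent_with_lav():
    # Every tuple stored at site_b must satisfy its LAV description.
    return site_b <= site_b_description(mediated_person_gav())
```

With GAV, adding a new site means editing the definition of the mediated relation; with LAV, a new site only needs its own description, at the cost of harder query answering.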
7
Mapping
  • A mapping AB specifies how structured data in the
    schema of node A is represented in the schema of node B

[Figure: Site A, Site B, and Site C. Mappings A-MS, MS-A, C-MS, and MS-C connect sites to the mediated schema; mappings AB, BA, BC, and CB connect sites pairwise]
8
Piazza Peer Data-Management System
  • Goal: large-scale autonomous sharing of structured data
  • Peer data management system (PDMS)
  • Autonomous Peers export data in their own schemas
  • Pair-wise mappings between peers
  • Generalization of a Data Integration system
  • NOT a P2P file sharing system

9
Relationship of PDMS to
  • P2P overlay networks (the Structured World)
  • Data integration systems (no central logical
    mediated schema)
  • Federated databases (scale, ad-hoc nature)
  • Distributed databases (no central administration)

10
Representing Data
  • A spectrum of possibilities
  • Relational tables, some integrity constraints
  • XML: can encode relational and hierarchical data
  • XQuery: emerging standard query language (SQL
    for XML)
  • RDF: XML on drugs.
  • Sees only the logic; ignores other aspects.
  • DAML+OIL
  • Full-blown knowledge representation language.
  • They all have semantics, just different
    expressive powers.
  • We keep the data simple. Mappings between data at
    different peers are more complex.

11
Peer Data Management
[Figure: network of peers: DB Projects, MIT, UW, UCB, Stanford]
  • Mappings are query expressions
  • DbResearcher(x) ? Researcher(x),Area(x,DB)
  • DbResearcher(x), Office(x,DBLab) DbLabMember(x)

12
Piazza mapping language (1)
  • XML/XML Example

<pubs>
  <book>
    {: $a IN document("source.xml")/authors/author,
       $t IN $a/publication/title,
       $typ IN $a/publication/pub-type
       WHERE $typ = "book" :}
    <title>{ $t }</title>
    <author>
      <name>{: $a/full-name :}</name>
    </author>
  </book>
</pubs>

Target schema: pubs > book > (title, author > name, publisher > name)
Source schema: authors > author > (full-name, publication > (title, pub-type))
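
To make the effect of such a mapping concrete, here is a minimal Python sketch that performs the same source-to-target translation by hand with the standard `xml.etree.ElementTree` module. It is not the Piazza engine; the sample document and the function name `map_authors_to_pubs` are hypothetical. The title and pub-type bindings are iterated independently, as the mapping on the slide is written.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document following the source schema on the slide.
SOURCE_XML = """
<authors>
  <author>
    <full-name>Jane Doe</full-name>
    <publication><title>Data Sharing</title><pub-type>book</pub-type></publication>
  </author>
  <author>
    <full-name>John Roe</full-name>
    <publication><title>A Short Note</title><pub-type>article</pub-type></publication>
  </author>
</authors>
"""

def map_authors_to_pubs(source_xml):
    """Build a <pubs> tree with one <book> per qualifying variable binding."""
    source = ET.fromstring(source_xml)                 # root element is <authors>
    pubs = ET.Element("pubs")
    for a in source.findall("author"):                 # binds a
        for t in a.findall("publication/title"):       # binds t
            for typ in a.findall("publication/pub-type"):  # binds typ
                if typ.text != "book":                 # WHERE typ = "book"
                    continue
                book = ET.SubElement(pubs, "book")
                ET.SubElement(book, "title").text = t.text
                author = ET.SubElement(book, "author")
                ET.SubElement(author, "name").text = a.findtext("full-name")
    return pubs

pubs = map_authors_to_pubs(SOURCE_XML)
print(ET.tostring(pubs, encoding="unicode"))
```

Note that the target's publisher element is never produced, because the source carries no publisher data.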
13
Piazza mapping language (2)
  • piazza:id attribute

<pubs>
  <book piazza:id={$t}>
    {: $a IN document("source.xml")/authors/author,
       $t IN $a/publication/title,
       $typ IN $a/publication/pub-type
       WHERE $typ = "book" :}
    <title piazza:id={$t}>{ $t }</title>
    <author piazza:id={$t}>
      <name>{: $a/full-name :}</name>
    </author>
  </book>
</pubs>

Target schema: pubs > book > (title, author > name, publisher > name)
Source schema: authors > author > (full-name, publication > (title, pub-type))
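
The slide does not spell out what the attribute does. The sketch below assumes that target elements created with the same tag and the same piazza:id value are merged into a single element, so that a title shared by several authors yields one book rather than several. The function `merge_by_id` and the sample bindings are hypothetical.

```python
def merge_by_id(bindings):
    """Group (title, author_name) bindings into one book per title.

    Assumption: the title plays the role of the piazza:id value, and
    elements with the same id are merged instead of duplicated.
    """
    books = {}  # insertion-ordered: id value -> merged book
    for title, name in bindings:
        book = books.setdefault(title, {"title": title, "authors": []})
        if name not in book["authors"]:
            book["authors"].append(name)
    return list(books.values())

# Hypothetical bindings produced by the mapping's variable assignments.
bindings = [
    ("Data Sharing", "Jane Doe"),
    ("Data Sharing", "John Roe"),
    ("Query Basics", "Jane Doe"),
]
merged = merge_by_id(bindings)
```

Without the id, the same bindings would produce three separate book elements, two of them with identical titles.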
14
Piazza mapping language (3)
  • Partial mapping

<pubs>
  <book piazza:id={$t}>
    {: $a IN document("source.xml")/authors/author,
       $t IN $a/publication/title,
       $typ IN $a/publication/pub-type
       WHERE $typ = "book"
       PROPERTY $t >= 'A' AND $t < 'B' :}
    [: <publisher>
         <name>
           {: PROPERTY $this IN {"PrintersInc", "PubsInc"} :}
         </name>
       </publisher> :]
  </book>
</pubs>

Target schema: pubs > book > (title, author > name, publisher > name)
Source schema: authors > author > (full-name, publication > (title, pub-type))
15
Query Answering Algorithm
  • Problem: evaluate query Q at P1 given a network of
    mappings
  • Reformulate the query over all relevant peers
  • Chaining of mappings using a combination of query
    composition and query rewriting
  • QP1(x) :- DbResearcher(x)
  • Query composition
  • M: DbResearcher(x) ? Researcher(x), Area(x, DB)
  • ? QP2(x) ?
    Researcher(x), Area(x, DB)
  • Query rewriting
  • M: DbResearcher(x), Office(x, DBLab)
    DbLabMember(x)
  • ? QP3(x) ?
    DbLabMember(x)
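
The composition step can be sketched as rule unfolding. The operator in mapping M is not legible in this transcript, so the sketch assumes M can be read as a definition of DbResearcher in terms of Researcher and Area. The data structures and the function `unfold` are hypothetical, and the rewriting step (answering queries using views) is not covered.

```python
def substitute(atom, binding):
    """Apply a variable binding to one atom (predicate, argument tuple)."""
    pred, args = atom
    return (pred, tuple(binding.get(a, a) for a in args))

def unfold(query_body, rules):
    """Replace each atom that has a defining rule with that rule's body.

    Simplification: variables that occur only in a rule body are not
    renamed, so this sketch is safe only for rules without such variables.
    """
    result = []
    for pred, args in query_body:
        if pred in rules:
            head_args, body = rules[pred]
            binding = dict(zip(head_args, args))  # rule variable -> query term
            result.extend(substitute(b, binding) for b in body)
        else:
            result.append((pred, args))
    return result

# Assumption: M defines DbResearcher(x) as Researcher(x), Area(x, DB).
rules = {
    "DbResearcher": (("x",), [("Researcher", ("x",)), ("Area", ("x", "DB"))]),
}

# Query at the first peer: Q(y) :- DbResearcher(y)
q_p1 = [("DbResearcher", ("y",))]
q_p2 = unfold(q_p1, rules)
```

After unfolding, the query body mentions only the second peer's relations, which is what allows it to be evaluated there.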

16
Query Reformulation (1)
Mapping:
<S2>
  <people>
    {: $people = /S1/people :}
    <faculty>
      {: $name = $people/faculty/name/text() :}
      {$name}
    </faculty>
    <student>
      {: $student = $people/student/text() :}
      <name> {$student} </name>
      <advisor>
        {: $faculty = $people/faculty,
           $name = $faculty/name/text(),
           $advisee = $faculty/advisee/text()
           WHERE $advisee = $student :}
        {$name}
      </advisor>
    </student>
  </people>
</S2>

Query:
<result> {
  for $faculty in /S1/people/faculty,
      $name in $faculty/name/text(),
      $advisee in $faculty/advisee/text()
  where $name = "Ullman"
  return <student> {$advisee} </student>
} </result>
17
Query Reformulation (2)
Query:
<result> {
  for $faculty in /S1/people/faculty,
      $name in $faculty/name/text(),
      $advisee in $faculty/advisee/text()
  where $name = "Ullman"
  return <student> {$advisee} </student>
} </result>

[Figure: query tree pattern matched against the mapping tree pattern. Node labels: <S2>, S1, <people>, people, faculty, name, advisee, advisee = student, <advisor>, name]
18
Query Reformulation (3)
Query (reformulated over S2):
<result> {
  for $student in /S2/people/student,
      $advisor in $student/advisor/text(),
      $name in $student/name/text()
  where $advisor = "Ullman"
  return <student> {$name} </student>
} </result>

[Figure: query tree pattern matched against the mapping tree pattern. Node labels: <S2>, S1, <people>, people, faculty, name, advisee, advisee = student, <advisor>, name]
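
As a sanity check, the original query over S1 and the reformulated query over S2 can be compared on a small instance. The Python sketch below materializes S2 from S1 by following the mapping by hand and then runs both queries. The sample data and all function names are hypothetical, and agreement on one instance is an illustration, not a proof of equivalence.

```python
import xml.etree.ElementTree as ET

# Hypothetical S1 instance.
S1_XML = """
<S1><people>
  <faculty><name>Ullman</name><advisee>Ann</advisee><advisee>Bob</advisee></faculty>
  <faculty><name>Widom</name><advisee>Cid</advisee></faculty>
  <student>Ann</student><student>Bob</student><student>Cid</student>
</people></S1>
"""

def map_s1_to_s2(s1):
    """Hand-written equivalent of the S1-to-S2 mapping."""
    s2 = ET.Element("S2")
    for people in s1.findall("people"):
        out = ET.SubElement(s2, "people")
        for fac in people.findall("faculty"):
            ET.SubElement(out, "faculty").text = fac.findtext("name")
        for stu in people.findall("student"):
            student = ET.SubElement(out, "student")
            ET.SubElement(student, "name").text = stu.text
            for fac in people.findall("faculty"):
                # WHERE advisee = student
                if any(adv.text == stu.text for adv in fac.findall("advisee")):
                    ET.SubElement(student, "advisor").text = fac.findtext("name")
    return s2

def query_s1(s1):
    """Original query: advisees of the faculty member named Ullman."""
    return [adv.text
            for fac in s1.findall("people/faculty")
            if fac.findtext("name") == "Ullman"
            for adv in fac.findall("advisee")]

def query_s2(s2):
    """Reformulated query: names of students whose advisor is Ullman."""
    return [stu.findtext("name")
            for stu in s2.findall("people/student")
            if any(adv.text == "Ullman" for adv in stu.findall("advisor"))]

s1 = ET.fromstring(S1_XML)
s2 = map_s1_to_s2(s1)
```

The two results coincide here because every advisee also appears as a student in S1. An advisee with no matching student element would be lost by the mapping's join and so would be missing from the S2 answer.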
19
Reformulation times
  • Table 1: The test queries and their respective
    running times.

20
Current and the Future
  • Current status
  • Demo scenario using XML
  • Looking at real domains (Bio dbs, NASA dbs)
  • Future Work
  • More efficient reformulation algorithm
  • Semantic network analysis: eliminate redundant
    and inconsistent mappings
  • Query caching to speed up query evaluation

21
Conclusions
  • Mapping language for mapping between sets of XML
    source nodes with different document structures
  • Architecture that uses the transitive closure of
    mappings to answer queries
  • Algorithm for query answering over this
    transitive closure of mappings, which is able to
    follow mappings in both forward and reverse
    directions
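
The idea of following mappings in both directions over their transitive closure can be illustrated as graph reachability. This is a minimal sketch, not the Piazza algorithm: it only finds which peers a query could reach, treating every mapping as usable forward and in reverse. The peer names and the function `reachable_peers` are hypothetical.

```python
from collections import deque

def reachable_peers(start, mappings):
    """Return peers reachable from `start`, following mappings either way."""
    neighbors = {}
    for src, dst in mappings:
        neighbors.setdefault(src, set()).add(dst)  # forward use of the mapping
        neighbors.setdefault(dst, set()).add(src)  # reverse use of the mapping
    seen, queue = {start}, deque([start])
    while queue:                                   # breadth-first traversal
        peer = queue.popleft()
        for nxt in sorted(neighbors.get(peer, ())):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen - {start}

# Hypothetical mapping network: A -> B, C -> B, and an unconnected pair D -> E.
mappings = [("A", "B"), ("C", "B"), ("D", "E")]
peers = reachable_peers("A", mappings)
```

Peer C is reached from A only because the mapping from C to B is followed in reverse, which is the case the last bullet describes.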

22
Thank You!
23
Further literature
  • Alon Y. Halevy, Zachary G. Ives, Dan Suciu, Igor
    Tatarinov: Schema Mediation for Large-Scale
    Semantic Data Sharing.
  • Igor Tatarinov, Zachary Ives, Jayant Madhavan,
    Alon Halevy, Dan Suciu, Nilesh Dalvi, Xin (Luna)
    Dong, Yana Kadiyska, Gerome Miklau, Peter Mork:
    The Piazza Peer Data Management Project.
  • Alon Y. Halevy, Zachary G. Ives, Dan Suciu, Igor
    Tatarinov: Schema Mediation in Peer Data
    Management Systems.
  • Alon Halevy, Oren Etzioni, AnHai Doan, Zachary
    Ives, Jayant Madhavan, Luke McDowell, Igor
    Tatarinov: Crossing the Structure Chasm.
  • Madhan Arumugam, Amit Sheth, I. Budak
    Arpinar: Towards Peer-to-Peer Semantic Web: A
    Distributed Environment for Sharing Semantic
    Knowledge on the Web.
  • J. Hendler, T. Berners-Lee, E. Miller:
    Integrating Applications on the Semantic Web.