Schema Mediation in Peer Data Management Systems (Alon Y. Halevy, Zachary G. Ives, Dan Suciu and Igor Tatarinov)


1
Schema Mediation in Peer Data Management Systems
Alon Y. Halevy, Zachary G. Ives, Dan Suciu and Igor Tatarinov
University of Washington, Seattle, WA, USA
  • Presented by Ilya Zaihrayeu for the "Logics for
    knowledge representation and reasoning" course,
    lectured by Luciano Serafini
  • July 1st, 2003

2
Mission of the paper
  • The paper addresses decentralized schema
    mediation, specifically how to express mappings
    between schemas in a P2P data sharing system and
    how to answer queries over multiple schemas

3
Index
  • PPL: a mapping language for PDMSs (peer data
    management systems)
  • Semantics of PPL
  • Query answering problem with PPL
  • Conclusions

4
Main definitions
  • Peer schema: a schema, defined at each peer,
    over whose relations (the peer relations)
    queries are posed
  • Stored relations: peers may contribute data to
    the system in the form of stored relations. All
    queries are reformulated strictly in terms of
    stored relations.

5
PPL storage descriptions
  • Relate stored relations to peer relations
  • Defined in the form A:R = Q
  • Q is a query over the schema of peer A, and R is
    a stored relation of A
  • Under the open-world assumption (OWA; data at a
    peer may be incomplete): A:R ⊆ Q
6
PPL peer mappings
  • Inclusion and equality mappings:
  • Q1(A1) = Q2(A2) or Q1(A1) ⊆ Q2(A2)
  • Q1 and Q2 are conjunctive queries with the same
    arity
  • A1 and A2 are sets of peers
  • Definitional mappings:
  • Datalog rules whose relations (both head and
    body) are peer relations

7
PDMS N
  • A set of peers P1, ..., Pn
  • A set of peer schemas S1, ..., Sm
  • A mapping function from peers to schemas
  • A set of stored relations Ri at each peer Pi
  • A set of peer mappings L_N
  • A set of storage descriptions D_N
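
The components listed on this slide can be pictured as a small record type. The following is only a minimal sketch: the class and field names (`PDMS`, `StorageDescription`, `PeerMapping`, `problems`) are hypothetical and do not come from the paper, and queries are kept as opaque strings rather than parsed.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class StorageDescription:
    peer: str            # A, the peer that owns the stored relation
    stored: str          # R, a stored relation of A
    query: str           # Q, kept as opaque text in this sketch
    exact: bool = True   # True for A:R = Q, False for A:R ⊆ Q

@dataclass(frozen=True)
class PeerMapping:
    kind: str            # "equality", "inclusion" or "definitional"
    left: str            # Q1 (or a rule head), kept as opaque text
    right: str           # Q2 (or a rule body), kept as opaque text

@dataclass
class PDMS:
    peers: set                      # P1, ..., Pn
    schemas: dict                   # schema name -> set of peer relation names
    schema_of: dict                 # mapping function: peer -> schema name
    stored: dict                    # peer -> set of stored relation names
    peer_mappings: list = field(default_factory=list)         # L_N
    storage_descriptions: list = field(default_factory=list)  # D_N

    def problems(self):
        """Return a list of structural problems (empty if none were found)."""
        found = []
        for peer in sorted(self.peers):
            if self.schema_of.get(peer) not in self.schemas:
                found.append(f"peer {peer} has no known schema")
        for d in self.storage_descriptions:
            if d.stored not in self.stored.get(d.peer, set()):
                found.append(f"{d.stored} is not a stored relation of {d.peer}")
        return found

# Hypothetical two-peer system; relation names are invented for illustration.
example = PDMS(
    peers={"P1", "P2"},
    schemas={"S1": {"course"}, "S2": {"class"}},
    schema_of={"P1": "S1", "P2": "S2"},
    stored={"P1": {"R1"}, "P2": set()},   # a peer may store nothing
    peer_mappings=[PeerMapping("inclusion", "S1.course(t)", "S2.class(t)")],
    storage_descriptions=[StorageDescription("P1", "R1", "S1.course(t)", exact=False)],
)
print(example.problems())
```

The `problems` method only checks that the pieces refer to each other consistently; it says nothing about the semantics of the mappings.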

[Figure: example PDMS with peers P1, P2, P3, ..., Pn, peer schemas S1, ..., Sm, and stored relations R1, R2, R3 = Ø, R4]

8
Semantics of PPL
  • We are given:
  • A PDMS N
  • An instance D for the stored relations, i.e. a
    set of tuples D(R) for each stored relation
    R ∈ (R1 ∪ ... ∪ Rn)
  • A data instance I for a PDMS N is an assignment
    of tuples to each relation in each peer
  • I(R): the set of tuples assigned to the relation
    R by I
  • Q(I): the result of computing the query Q over
    the extensional data in I

9
Consistent data instance
  • A data instance I is said to be consistent with a
    PDMS N and an instance D for N's stored
    relations if:
  • For each storage description in D_N, A:R = Q1
    (A:R ⊆ Q1) implies D(R) = Q1(I) (D(R) ⊆ Q1(I))
  • For each peer description in L_N:
  • If it is of the form Q1(A1) = Q2(A2), then
    Q1(I) = Q2(I)
  • If it is of the form Q1(A1) ⊆ Q2(A2), then
    Q1(I) ⊆ Q2(I)
  • If it is a definitional description whose head
    predicate is p, then let r1, ..., rm be all
    the definitional mappings with p in the head, and
    let I(ri) be the result of evaluating the body
    of ri on the instance I. Then
    I(p) = I(r1) ∪ ... ∪ I(rm).
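
The three conditions above can be checked mechanically for a concrete instance. The sketch below is an illustration under stated assumptions, not the paper's algorithm: an instance is a dict from relation names to sets of tuples, each query is modelled as a Python function from an instance to a set of tuples, and the function name `is_consistent` and all relation names are hypothetical.

```python
def is_consistent(instance, stored_data, storage_descs, peer_mappings, definitional):
    """Check the slide's three conditions for one data instance I.

    storage_descs: list of (R, Q, exact)   exact=True means A:R = Q, else A:R ⊆ Q
    peer_mappings: list of (Q1, Q2, exact) exact=True means Q1 = Q2, else Q1 ⊆ Q2
    definitional:  dict head -> list of rule bodies (each a query function)
    """
    for rel, query, exact in storage_descs:
        data, answer = stored_data[rel], query(instance)
        if (data != answer) if exact else not (data <= answer):
            return False
    for q1, q2, exact in peer_mappings:
        a1, a2 = q1(instance), q2(instance)
        if (a1 != a2) if exact else not (a1 <= a2):
            return False
    for head, bodies in definitional.items():
        union = set().union(*(body(instance) for body in bodies))
        if instance.get(head, set()) != union:   # I(p) = I(r1) ∪ ... ∪ I(rm)
            return False
    return True

# Hypothetical example: stored relation r1 at peer A, peer relations
# A.course and B.class, and a definitional relation B.all.
stored_data = {"r1": {("db",)}}
storage_descs = [("r1", lambda I: I["A.course"], False)]                   # A:r1 ⊆ A.course
peer_mappings = [(lambda I: I["A.course"], lambda I: I["B.class"], False)] # A.course ⊆ B.class
definitional = {"B.all": [lambda I: I["B.class"]]}                         # B.all(t) :- B.class(t)

good = {"A.course": {("db",)}, "B.class": {("db",), ("ai",)},
        "B.all": {("db",), ("ai",)}}
bad = {"A.course": {("db",)}, "B.class": set(), "B.all": set()}

print(is_consistent(good, stored_data, storage_descs, peer_mappings, definitional))
print(is_consistent(bad, stored_data, storage_descs, peer_mappings, definitional))
```

In the `bad` instance the inclusion A.course ⊆ B.class fails, so the instance is rejected even though the storage description and the definitional rule are satisfied.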

10
Certain answers
  • Let Q be a query over the schema of a peer A in a
    PDMS N, and let D be an instance of the stored
    relations of N
  • A tuple ā is a certain answer to Q if ā is in
    Q(I) for every data instance I that is consistent
    with N and D
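
To make the definition concrete, the sketch below computes certain answers by brute force on a toy example. It rests on a strong assumption that the definition itself does not make: instances are drawn only from a small finite set of candidate tuples (`universe`), so that all of them can be enumerated. The function names and relation names are hypothetical, and this is not how a PDMS would answer queries in practice.

```python
from itertools import chain, combinations, product

def powerset(items):
    """All subsets of `items`, each as a tuple."""
    items = list(items)
    return chain.from_iterable(combinations(items, k) for k in range(len(items) + 1))

def certain_answers(query, relations, universe, consistent):
    """Intersect query(I) over every consistent instance I whose relations
    are subsets of the finite `universe` (an assumption of this sketch).
    Returns None when no enumerated instance is consistent."""
    result = None
    subsets = list(powerset(universe))
    for choice in product(subsets, repeat=len(relations)):
        instance = {rel: set(tuples) for rel, tuples in zip(relations, choice)}
        if not consistent(instance):
            continue
        answers = query(instance)
        result = answers if result is None else result & answers
    return result

# Hypothetical setting: D(r1) = {("db",)}, A:r1 ⊆ A.course, A.course ⊆ B.class.
stored_r1 = {("db",)}

def consistent(I):
    return stored_r1 <= I["A.course"] and I["A.course"] <= I["B.class"]

universe = [("db",), ("ai",)]
answers = certain_answers(lambda I: I["B.class"], ["A.course", "B.class"],
                          universe, consistent)
print(answers)
```

Every consistent instance must place ("db",) in B.class, while ("ai",) is absent from some consistent instance, so only ("db",) survives the intersection.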

11
Query answering problem
  • Given a PDMS N, an instance D of the stored
    relations and a query Q, find all certain
    answers to Q

12
Cyclicity
  • A set L of inclusion peer mappings in PPL is
    said to be acyclic if the following directed
    graph is acyclic:
  • The graph contains a node for every peer
    relation. There is an arc from node R to node S
    if there is a peer description in L of the form
    Q1(A1) ⊆ Q2(A2) where R appears in Q1 and S
    appears in Q2
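
The acyclicity test described above is an ordinary cycle check on a directed graph. A minimal sketch follows, assuming each inclusion mapping has already been reduced to the pair (relations appearing in Q1, relations appearing in Q2); the function names and relation names are hypothetical.

```python
def mapping_graph(inclusions):
    """Arc R -> S whenever R appears in Q1 and S in Q2 of some Q1 ⊆ Q2.
    Each inclusion is given as (relations_in_Q1, relations_in_Q2)."""
    graph = {}
    for lhs, rhs in inclusions:
        for s in rhs:
            graph.setdefault(s, set())        # every relation is a node
        for r in lhs:
            graph.setdefault(r, set()).update(rhs)
    return graph

def is_acyclic(graph):
    """Depth-first search; reaching a node that is still on the stack means a cycle."""
    WHITE, GREY, BLACK = 0, 1, 2              # unvisited, on stack, finished
    colour = {node: WHITE for node in graph}

    def visit(node):
        colour[node] = GREY
        for succ in graph[node]:
            if colour[succ] == GREY:
                return False
            if colour[succ] == WHITE and not visit(succ):
                return False
        colour[node] = BLACK
        return True

    return all(colour[n] != WHITE or visit(n) for n in graph)

chain_mappings = [({"A.r"}, {"B.s"}), ({"B.s"}, {"C.t"})]
cyclic_mappings = chain_mappings + [({"C.t"}, {"A.r"})]
print(is_acyclic(mapping_graph(chain_mappings)))
print(is_acyclic(mapping_graph(cyclic_mappings)))
```

Note that a relation appearing on both sides of the same inclusion produces a self-loop, which this check also reports as a cycle.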

13
Theorem 1
  • In general, the problem of finding all certain
    answers to a conjunctive query Q, for a given
    PDMS N, is undecidable
  • If a PDMS N includes only inclusion descriptions
    (both peer and storage) and the peer mappings are
    acyclic, then a conjunctive query can be answered
    in polynomial time data complexity.

14
Theorem 2
  • Let N be a PDMS for which all inclusion peer
    mappings are acyclic, but which may also contain
    equality peer mappings
  • If (1) whenever a storage or peer description in
    N is an equality description, it does not contain
    projections, and (2) a peer relation that appears
    in the head of a definitional description does
    not appear on the right-hand side of any other
    description, then the query answering problem is
    in polynomial time

15
Theorem 2, cont'd
  • If conditions (1) and (2) of Theorem 2 hold,
    except that some equality storage descriptions
    contain projections, then the data complexity of
    the query answering problem is co-NP-complete
  • If conditions (1) and (2) of Theorem 2 hold,
    except that some of the queries on the right-hand
    side of the peer mappings may be unions of
    conjunctive queries, then the data complexity of
    query answering is co-NP-complete.

16
Theorem 3 (comparison predicates)
  • Let N be a PDMS satisfying conditions (1) and (2)
    of Theorem 2, and let Q be a conjunctive query
  • If comparison predicates appear only in storage
    descriptions or in the bodies of definitional
    mappings, but not in Q, then query answering is
    in polynomial time
  • Otherwise, if either the query contains
    comparison predicates or comparison predicates
    appear in non-definitional peer mappings, then
    the query answering problem is co-NP-complete.

17
Conclusions
  • With arbitrary use of the data integration
    formalisms in a PDMS, query answering is
    undecidable
  • However, it has been shown that there is a
    powerful subset of PPL in which query answering
    is tractable
  • The subset supports a limited form of cycles in
    the peer mappings, as well as limited use of
    comparison predicates