CS 245: Database System Principles Notes 14: Coping with Limited Capabilities of Sources


1
CS 245: Database System Principles, Notes 14
Coping with Limited Capabilities of Sources
  • Hector Garcia-Molina

2
Heterogeneous Databases
[Figure: a distributed database system over four heterogeneous sources (DBMS1, DBMS2, legacy, web site), each with its own data]
3
Limited Capabilities
4
Example: Amazon.com
  • author, title: must specify at least one of these
  • subject: this attribute not returned
  • format: menu of choices
  • price: cannot query on this attribute
5
Example: BarnesAndNoble.com
  • author, title: must specify at least one of these
  • subject, format: menu of choices
  • price: can query if one of the other attributes is specified
6
Why Limited Capabilities?
  • Search forms
  • Security
  • Indexes
  • Legacy

7
Capability vs. Content
  • Capability description
    • Can only search for subject = 'art', 'history', 'science'
  • Content description
    • Source only contains subject = 'art', 'history', 'science'

8
Outline
  • Describing source capabilities
  • Extending source capabilities
  • How mediators cope with limited capabilities
  • Mediator capabilities
  • Other topics

[Figure: one mediator connected to three sources]
9
Describing Query Capabilities
R(X, Y, ..., Z)
  • Adornments
    • f: may or may not be specified
    • u: cannot be specified
    • b: must be specified
    • c[S]: must be specified, chosen from list S
    • o[S]: optional; if specified, chosen from list S

10
Describing Query Capabilities
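R(X, Y, ..., Z)
  • With output restriction (a primed adornment means the attribute is not returned in the answer)
    • f', u', b', c'[S], o'[S]
  • Adornments
    • f: may or may not be specified
    • u: cannot be specified
    • b: must be specified
    • c[S]: must be specified, chosen from list S
    • o[S]: optional; if specified, chosen from list S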

11
Example
  • Relation: R(X, Y, Z)
  • Description templates: buf, ufc[z1, z2]
  • Answerable queries: R(x1, Y, Z), R(X, Y, z1)
  • Unanswerable queries: R(X, y1, Z), R(X, Y, z3)
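
The template matching in this example can be sketched in a few lines of Python. This is a minimal illustration, not code from the course: the helper names `attribute_ok`, `matches` and `answerable`, and the encoding of a query as a tuple with `None` for an unspecified attribute, are assumptions made here. Output restrictions are not modeled.

```python
from typing import Optional, Sequence, Tuple, Union

# An adornment is "f", "u", "b", or a pair ("c", allowed) / ("o", allowed).
Adornment = Union[str, Tuple[str, frozenset]]
Query = Sequence[Optional[str]]  # None means the attribute is left unspecified


def attribute_ok(adornment: Adornment, value: Optional[str]) -> bool:
    """Check one query attribute against one adornment."""
    if adornment == "f":      # may or may not be specified
        return True
    if adornment == "u":      # cannot be specified
        return value is None
    if adornment == "b":      # must be specified
        return value is not None
    kind, allowed = adornment
    if kind == "c":           # must be specified, chosen from the list
        return value in allowed
    if kind == "o":           # optional, but chosen from the list if given
        return value is None or value in allowed
    raise ValueError(f"unknown adornment: {adornment!r}")


def matches(template: Sequence[Adornment], query: Query) -> bool:
    """A query matches a template if every attribute satisfies its adornment."""
    return len(template) == len(query) and all(
        attribute_ok(a, v) for a, v in zip(template, query)
    )


def answerable(templates: Sequence[Sequence[Adornment]], query: Query) -> bool:
    """A query is answerable if at least one template accepts it."""
    return any(matches(t, query) for t in templates)


# The two templates from the slide: buf and ufc[z1, z2].
TEMPLATES = [
    ("b", "u", "f"),
    ("u", "f", ("c", frozenset({"z1", "z2"}))),
]

print(answerable(TEMPLATES, ("x1", None, None)))  # R(x1, Y, Z) -> True
print(answerable(TEMPLATES, (None, None, "z1")))  # R(X, Y, z1) -> True
print(answerable(TEMPLATES, (None, "y1", None)))  # R(X, y1, Z) -> False
print(answerable(TEMPLATES, (None, None, "z3")))  # R(X, Y, z3) -> False
```

R(X, y1, Z) fails because buf needs X bound and ufc[z1, z2] needs Z specified; R(X, Y, z3) fails because z3 is not in the list.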

12
Other Description Mechanisms
  • Tsimmis
  • query templates
  • Information Manifold
  • capability records (bound attrs, conditions ok, ...)
  • Disco
  • Garlic
  • black box
  • Context-free grammars

13
Extending Source Capabilities
Query: author = 'Freud' AND price > 10
wrapper
amazon
Source: R(author, price, ...); Template: b, u, ...
14
Extending Source Capabilities
Query: author = 'Freud' AND price > 10
Wrapper filter: price > 10
wrapper
Source query: author = 'Freud'
amazon
Source: R(author, price, ...); Template: b, u, ...
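
A minimal Python sketch of this wrapper: it sends the source only the condition the source supports (author) and applies the price filter itself. The classes `LimitedSource` and `Wrapper` and the sample catalog are hypothetical, made up for illustration; they do not describe any real bookstore interface.

```python
from typing import Any, Dict, List, Optional

Book = Dict[str, Any]

# Made-up sample data standing in for the source's catalog.
CATALOG: List[Book] = [
    {"author": "Freud", "title": "Book A", "price": 8},
    {"author": "Freud", "title": "Book B", "price": 15},
    {"author": "Jung", "title": "Book C", "price": 20},
]


class LimitedSource:
    """Source with template (b, u, ...): author is required, price cannot be queried."""

    def query(self, author: Optional[str] = None, **conditions: Any) -> List[Book]:
        if author is None:
            raise ValueError("author must be specified")
        if "price" in conditions:
            raise ValueError("price cannot be specified")
        return [book for book in CATALOG if book["author"] == author]


class Wrapper:
    """Extends the source: accepts author = ... AND price > ..."""

    def __init__(self, source: LimitedSource) -> None:
        self.source = source

    def query(self, author: str, min_price: float) -> List[Book]:
        # Source query: only the supported condition is sent to the source.
        candidates = self.source.query(author=author)
        # Wrapper filter: the unsupported condition is applied locally.
        return [book for book in candidates if book["price"] > min_price]


wrapper = Wrapper(LimitedSource())
print(wrapper.query(author="Freud", min_price=10))
# [{'author': 'Freud', 'title': 'Book B', 'price': 15}]
```

The price of this approach is that the source may return many tuples that the wrapper then discards.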
15
Another Example
Query: (author = 'Freud' OR author = 'Jung') AND price < 10
wrapper
BarnesNoble
R(author, price, ...)
  • No disjunctive conditions
  • Price can only be specified with author
16
Another Example
Query: (author = 'Freud' OR author = 'Jung') AND price < 10
Union operation
wrapper
Q1: author = 'Freud' AND price < 10
Q2: author = 'Jung' AND price < 10
BarnesNoble
R(author, price, ...)
  • No disjunctive conditions
  • Price can only be specified with author
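
The query splitting on this slide can be sketched as follows. The source class, its method signature and the sample catalog are hypothetical, invented for illustration; the sketch assumes a source that accepts only conjunctive conditions, needs author or title, and accepts a price bound only together with an author.

```python
from typing import Any, Dict, List, Optional, Sequence

Book = Dict[str, Any]

# Made-up sample data standing in for the source's catalog.
CATALOG: List[Book] = [
    {"author": "Freud", "title": "Book A", "price": 8},
    {"author": "Freud", "title": "Book B", "price": 15},
    {"author": "Jung", "title": "Book C", "price": 9},
    {"author": "Adler", "title": "Book D", "price": 5},
]


class ConjunctiveSource:
    """Accepts one conjunctive query at a time; no OR between conditions."""

    def query(self, author: Optional[str] = None, title: Optional[str] = None,
              max_price: Optional[float] = None) -> List[Book]:
        if author is None and title is None:
            raise ValueError("must specify at least one of author, title")
        if max_price is not None and author is None:
            raise ValueError("price can only be specified with author")
        rows = CATALOG
        if author is not None:
            rows = [b for b in rows if b["author"] == author]
        if title is not None:
            rows = [b for b in rows if b["title"] == title]
        if max_price is not None:
            rows = [b for b in rows if b["price"] < max_price]
        return rows


def wrapper_query(source: ConjunctiveSource, authors: Sequence[str],
                  max_price: float) -> List[Book]:
    """Answer (author = a1 OR author = a2 OR ...) AND price < max_price."""
    seen = set()
    result: List[Book] = []
    for author in authors:
        # One conjunctive subquery per disjunct (Q1, Q2, ... on the slide).
        for book in source.query(author=author, max_price=max_price):
            key = (book["author"], book["title"])
            if key not in seen:  # union operation: drop duplicates
                seen.add(key)
                result.append(book)
    return result


books = wrapper_query(ConjunctiveSource(), ["Freud", "Jung"], max_price=10)
print([b["title"] for b in books])  # ['Book A', 'Book C']
```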
17
Extending Source Capabilities
  • General scheme
    • try many query rewritings
    • check if query fragments are supported by source
    • check if wrapper can combine answer fragments
    • do all this very efficiently! See ICDE99 paper
  • Tsimmis, Info Manifold: no disjunctive queries
  • DISCO: no query splitting
  • Garlic: only CNF queries

18
Mediator Processing
Query: M(5, Y, Z, W, 3)
M(X, Y, Z, W, U) = Join(R, T)
[Figure: one mediator connected to two sources]
R(X, Y, Z): f, f, b
T(Z, W, U): f, u, b
19
Plan 1
Query: M(5, Y, Z, W, 3)
M(X, Y, Z, W, U) = Join(R, T)
[Figure: one mediator connected to two sources]
R(X, Y, Z): f, f, b
T(Z, W, U): f, u, b
(1) R(5, Y, Z)
(2) T(Z, W, 3)
(3) Join answers
20
Plan 2
Query: M(5, Y, Z, W, 3)
M(X, Y, Z, W, U) = Join(R, T)
[Figure: one mediator connected to two sources]
R(X, Y, Z): f, f, b
T(Z, W, U): f, u, b
(1) P = T(Z, W, 3)
(2) for each (z, w, u) ∈ P: R(5, Y, z)
(3) Join answers
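
The two plans can be compared with a small Python sketch in which each source enforces its own template. The functions `source_r`, `source_t` and `plan2` and the sample tuples are hypothetical, made up for illustration. Under the adornments given, step (1) of Plan 1 leaves Z unbound although the template f, f, b requires it, so the source rejects it. Plan 2 first queries T and then calls R once per returned tuple, binding R's third attribute to the join value z.

```python
from typing import List, Optional, Tuple

# Made-up sample tuples for the two sources.
R_DATA = [(5, "y1", "z1"), (5, "y2", "z2"), (7, "y3", "z1")]
T_DATA = [("z1", "w1", 3), ("z2", "w2", 4), ("z3", "w3", 3)]


def source_r(x: Optional[int] = None, y: Optional[str] = None,
             z: Optional[str] = None) -> List[Tuple[int, str, str]]:
    """R(X, Y, Z) with template f, f, b: Z must be bound."""
    if z is None:
        raise ValueError("R requires Z to be bound")
    return [t for t in R_DATA
            if t[2] == z and (x is None or t[0] == x) and (y is None or t[1] == y)]


def source_t(z: Optional[str] = None, w: Optional[str] = None,
             u: Optional[int] = None) -> List[Tuple[str, str, int]]:
    """T(Z, W, U) with template f, u, b: W cannot be specified, U must be bound."""
    if w is not None:
        raise ValueError("T does not accept a condition on W")
    if u is None:
        raise ValueError("T requires U to be bound")
    return [t for t in T_DATA if t[2] == u and (z is None or t[0] == z)]


def plan2(x: int, u: int) -> List[Tuple[int, str, str, str, int]]:
    """Answer M(x, Y, Z, W, u) where M = Join(R, T)."""
    p = source_t(u=u)                              # (1) P = T(Z, W, u)
    answers = []
    for z, w, u_val in p:                          # (2) one R query per tuple of P
        for x_val, y, _ in source_r(x=x, z=z):
            answers.append((x_val, y, z, w, u_val))  # (3) join answers
    return answers


try:
    source_r(x=5)                                  # Plan 1, step (1): R(5, Y, Z)
except ValueError as err:
    print("Plan 1 is not feasible:", err)

print(plan2(5, 3))  # [(5, 'y1', 'z1', 'w1', 3)]
```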
21
Mediator Plan Generation
  • Need feasible and efficient plan
  • Search space is huge
  • Tsimmis, Info Manifold, Garlic: exponential algorithms
  • Polynomial algorithms
    • often find optimal or near-optimal plan
    • bounded performance
  • See ICDT99 paper
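
As a rough illustration of a polynomial-time search for a feasible plan, the sketch below orders sources greedily: it repeatedly picks any source whose b-adorned attributes are already bound, then treats all of that source's attributes as bound. This is a simple heuristic chosen here for illustration, not the algorithm of the cited paper. It checks feasibility only, ignores cost, and assumes every attribute of a source is returned (no output restrictions). The function name `greedy_order` and the input encoding are assumptions.

```python
from typing import Dict, Iterable, List, Optional, Sequence, Set, Tuple

# name -> (attribute names, adornments), e.g. "R" -> (("X", "Y", "Z"), ("f", "f", "b"))
Sources = Dict[str, Tuple[Sequence[str], Sequence[str]]]


def required(attrs: Sequence[str], adornments: Sequence[str]) -> Set[str]:
    """Attributes that must be bound before the source can be queried."""
    return {a for a, ad in zip(attrs, adornments) if ad == "b"}


def greedy_order(sources: Sources, initially_bound: Iterable[str]) -> Optional[List[str]]:
    """Return a feasible order in which to query the sources, or None if stuck."""
    bound = set(initially_bound)
    remaining = dict(sources)
    order: List[str] = []
    while remaining:
        ready = next(
            (name for name, (attrs, ad) in remaining.items()
             if required(attrs, ad) <= bound),
            None,
        )
        if ready is None:
            return None               # no remaining source can be queried yet
        attrs, _ = remaining.pop(ready)
        bound |= set(attrs)           # its answers bind all of its attributes
        order.append(ready)
    return order


SOURCES: Sources = {
    "R": (("X", "Y", "Z"), ("f", "f", "b")),
    "T": (("Z", "W", "U"), ("f", "u", "b")),
}

# Query M(5, Y, Z, W, 3) binds X and U.
print(greedy_order(SOURCES, {"X", "U"}))  # ['T', 'R']
print(greedy_order(SOURCES, {"X"}))       # None
```

For the query M(5, Y, Z, W, 3) the order found is T first, then R, which is the order used in Plan 2.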

22
Conclusion
  • Not all sources are created equal!
  • Need to:
    • describe what sources can do
    • efficiently process queries with limited sources
    • describe what mediators can do
    • exploit content information
    • deal with unavailable sources