A Scalable Algorithm for Answering Queries Using Views

1
A Scalable Algorithm for Answering Queries Using Views
  • Rachel Pottinger
  • Qualifying Exam
  • October 29, 1999
  • Advisor: Alon Levy

2
Answering Queries Using Views
  • Problem: access views instead of the original
    relations
  • Useful in data integration and query optimization
  • NP-Complete
  • Many papers on the subject
  • No empirical testing of algorithms

3
Data Integration: Query Reformulation
  • Data sources are pre-calculated views
  • Views are not complete
  • Get the most answers possible given the views
  • Many data sources

Car sale information (example data sources)
  • Ford cars: dealer prices, sticker prices,
    inventory
  • Cheap cars: prices, manufacturer
  • Used cars: prices, dealer, year
4
Data Integration Example
Query: find the prices of cars that we can buy at
cost
Query
  • Q(cost) :- dealercost(car, cost),
    stickerprice(car, cost)

Views
  • V1(price1, price2) :- dealercost(car, price1),
    stickerprice(car, price2), maker(car, Ford)
  • V2(cost) :- dealercost(car, cost),
    stickerprice(car, cost), cheap(car)

Maximally contained rewriting (a union of
conjunctive rewritings)
  • Q1(cost) :- Ford(cost, cost) ∪
    Q2(cost) :- BMW(cost)

Labels on the slide
  • Database relations: dealercost, stickerprice,
    maker, cheap
  • Distinguished variables appear in a rule's head
    (e.g. cost)
  • Existential variables appear only in a rule's
    body (e.g. car)
5
Outline
  • Previous algorithms
  • Bucket Algorithm [Levy, Rajaraman, Ordille, 1996]
  • Inverse rules [Duschka, Genesereth, 1997]
  • Minimum Necessary Connections (MiniCon) Algorithm
  • Experimental evaluation
  • Extension to arithmetic comparisons
  • Conclusions and future work

6
The Bucket Algorithm
  • Introduced as part of Information Manifold
  • Treats subgoals individually

7
Bucket Algorithm: Populating Buckets
  • For each subgoal in the query, place relevant
    views in the subgoal's bucket
  • Inputs
  • Q(x) :- r1(x,y), r2(y,x)
  • V1(a) :- r1(a,b)
  • V2(d) :- r2(c,d)
  • V3(f) :- r1(f,g), r2(g,f)

Buckets: one per query subgoal, r1(x,y) and r2(y,x)
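
The bucket-population step can be sketched in Python as follows. This is a minimal illustration, not the Information Manifold implementation: the tuple encoding of rules and the names `QUERY`, `VIEWS` and `populate_buckets` are assumptions, and the relevance test is simplified to "same predicate, and every distinguished query variable lands on a distinguished view variable".

```python
# Atoms are (predicate, args) tuples; a rule is (head_vars, body).
# This encoding is an assumption made for the sketch.
QUERY = (("x",), [("r1", ("x", "y")), ("r2", ("y", "x"))])
VIEWS = {
    "V1": (("a",), [("r1", ("a", "b"))]),
    "V2": (("d",), [("r2", ("c", "d"))]),
    "V3": (("f",), [("r1", ("f", "g")), ("r2", ("g", "f"))]),
}


def populate_buckets(query, views):
    """Return {query subgoal: [names of views relevant to it]}."""
    q_head, q_body = query
    buckets = {}
    for subgoal in q_body:
        pred, q_args = subgoal
        bucket = []
        for name, (v_head, v_body) in views.items():
            for v_pred, v_args in v_body:
                if v_pred != pred or len(v_args) != len(q_args):
                    continue
                # A distinguished query variable must map to a
                # distinguished view variable, or its value is lost.
                usable = all(v in v_head
                             for q, v in zip(q_args, v_args)
                             if q in q_head)
                if usable and name not in bucket:
                    bucket.append(name)
        buckets[subgoal] = bucket
    return buckets


buckets = populate_buckets(QUERY, VIEWS)
for subgoal, bucket in buckets.items():
    print(subgoal, bucket)
```

Each subgoal is examined in isolation here, which is why V1 and V2 enter the buckets even though neither can take part in a valid rewriting of this query on its own terms.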
8
Combining Buckets
  • For every combination in the Cartesian product
    of the buckets, check containment in the query
  • Candidate rewritings
  • Q1(x) :- V1(x), V2(x) ?
  • Q2(x) :- V1(x), V3(x) ?
  • Q3(x) :- V3(x), V2(x) ?
  • Q4(x) :- V3(x), V3(x) ?

The Bucket Algorithm checks every possible
combination
Buckets
  • r1(x,y): V1(x), V3(x)
  • r2(y,x): V2(x), V3(x)
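
Enumerating the candidates amounts to a Cartesian product over the buckets. The sketch below assumes the bucket contents implied by the four candidate rewritings listed on the slide. The function name `candidate_rewritings` is hypothetical, and the containment check that the Bucket Algorithm runs on each candidate is deliberately left out.

```python
from itertools import product

# Bucket contents as implied by the candidate list on the slide
# (an assumption of this sketch).
BUCKETS = {
    "r1(x,y)": ["V1(x)", "V3(x)"],
    "r2(y,x)": ["V2(x)", "V3(x)"],
}


def candidate_rewritings(buckets):
    """One candidate per element of the Cartesian product of the buckets."""
    return [list(combo) for combo in product(*buckets.values())]


candidates = candidate_rewritings(BUCKETS)
for number, body in enumerate(candidates, start=1):
    print(f"Q{number}(x) :- {', '.join(body)}")
```

The number of candidates is the product of the bucket sizes, so it grows quickly with the number of query subgoals, and every candidate still needs a containment check.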
9
Inverse Rules
  • Part of the Infomaster system
  • Inverse rules show how to get database tuples
    from the views
  • Cannot be extended to interpreted predicates
  • Stops earlier than the Bucket Algorithm

10
Creating Inverse Rules
For each view V(X) :- r1(X1), ..., rn(Xn), and for
each j = 1, ..., n, form an inverse rule
rj(Xj) :- V(X)
  • Inputs
  • V1(a) :- r1(a,b)
  • V2(d) :- r2(c,d)
  • V3(f) :- r1(f,g), r2(g,f)
  • Inverse Rules
  • IR1: r1(a, sfV1(a)) :- V1(a)
  • IR2: r2(sfV2(d), d) :- V2(d)
  • IR3: r1(f, sfV3(f)) :- V3(f)
  • IR4: r2(sfV3(f), f) :- V3(f)

Skolem function: each sfV(...) term stands for the
unknown value of an existential view variable
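
Rule inversion can be sketched as below. The function `inverse_rules` and the tuple encoding are assumptions for illustration. Skolem terms are rendered as strings in the slide's `sfV1(a)` style, and a variable-name suffix is added only when a view has more than one existential variable, a case the slide does not show.

```python
def inverse_rules(name, head, body):
    """Return one inverse rule (head_atom, body_atom) per view subgoal.

    name: view name, head: tuple of distinguished variables,
    body: list of (predicate, args) atoms.
    """
    existential = sorted({a for _, args in body for a in args
                          if a not in head})

    def skolem(var):
        # One Skolem function per existential variable of the view.
        suffix = "" if len(existential) == 1 else f"_{var}"
        return f"sf{name}{suffix}({', '.join(head)})"

    rules = []
    for pred, args in body:
        new_args = tuple(a if a in head else skolem(a) for a in args)
        rules.append(((pred, new_args), (name, tuple(head))))
    return rules


v3_rules = inverse_rules("V3", ("f",), [("r1", ("f", "g")),
                                        ("r2", ("g", "f"))])
for (pred, args), (view, v_args) in v3_rules:
    print(f"{pred}({', '.join(args)}) :- {view}({', '.join(v_args)})")
```

Both rules produced for V3 share the same Skolem term, which is what later lets r1 and r2 facts derived from one V3 tuple join with each other.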
11
Combining Inverse Rules
At query time, query over rules
  • Inverse Rules
  • IR1: r1(a, sfV1(a)) :- V1(a)
  • IR2: r2(sfV2(d), d) :- V2(d)
  • IR3: r1(f, sfV3(f)) :- V3(f)
  • IR4: r2(sfV3(f), f) :- V3(f)
  • Tuples
  • V1(g)
  • V2(h)
  • V3(j)
  • V3(m)

Query
Q(x) :- r1(x,y), r2(y,x)
Expansion: r1(g, sfV1(g)), r2(sfV2(h), h),
r1(j, sfV3(j)), r2(sfV3(j), j), r1(m, sfV3(m)),
r2(sfV3(m), m)
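
The expansion above can be reproduced with a small sketch that applies the four inverse rules to the view tuples and then evaluates the query over the derived facts. The encoding is an assumption: each inverse rule is a lambda, Skolem terms are nested tuples such as `("sfV3", "j")`, and `derive_facts` and `answer_query` are hypothetical names. Answers that would contain a Skolem term are dropped, since they do not denote real database values.

```python
# One lambda per inverse rule; Skolem terms are ("sfV", value) tuples.
INVERSE_RULES = {
    "V1": [lambda a: ("r1", a, ("sfV1", a))],        # IR1
    "V2": [lambda d: ("r2", ("sfV2", d), d)],        # IR2
    "V3": [lambda f: ("r1", f, ("sfV3", f)),         # IR3
           lambda f: ("r2", ("sfV3", f), f)],        # IR4
}
VIEW_TUPLES = [("V1", "g"), ("V2", "h"), ("V3", "j"), ("V3", "m")]


def derive_facts(view_tuples):
    """Apply every inverse rule of a view to each of its tuples."""
    return {rule(value)
            for view, value in view_tuples
            for rule in INVERSE_RULES[view]}


def answer_query(facts):
    """Evaluate Q(x) :- r1(x, y), r2(y, x) over the derived facts."""
    r1 = [(f[1], f[2]) for f in facts if f[0] == "r1"]
    r2 = {(f[1], f[2]) for f in facts if f[0] == "r2"}
    return sorted(x for x, y in r1
                  if (y, x) in r2 and not isinstance(x, tuple))


facts = derive_facts(VIEW_TUPLES)
answers = answer_query(facts)
print(answers)  # ['j', 'm']
```

Only the V3 tuples contribute answers: the facts derived from V1(g) and V2(h) carry different Skolem terms, so they never join.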
12
Unfolding rules before tuples
  • Q(x) :- r1(x,y), r2(y,x)

Rules matching r1(x,y): IR1, IR3
Rules matching r2(y,x): IR2, IR4
Use unification to see if a rewriting is contained
in the query. No containment check is necessary.
13
The MiniCon Algorithm
  • Concentrate on variables rather than subgoals to
    create MiniCon Descriptions (MCDs)
  • Combine MCDs that only overlap on distinguished
    view variables
  • No containment check!

14
MiniCon Description Formation
  • Form all MiniCon Descriptions (MCDs) that map all
    query variables that have to be mapped together
  • Inputs
  • Q(x) :- r1(x,y), r2(y,x)
  • V1(a) :- r1(a,b)
  • V2(d) :- r2(c,d)
  • V3(f) :- r1(f,g), r2(g,f)

MCDs
15
MiniCon Combination
  • Take all combinations of MCDs that
  • map disjoint sets of subgoals
  • map all subgoals of the query
  • MCDs

Rewriting: Q(x) :- V3(x)
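
The two MiniCon phases can be sketched for the running example as below. This is a simplified illustration under stated assumptions, not the published algorithm: it ignores constants and head homomorphisms on the views, and the names `extend`, `form_mcds` and `combine` are hypothetical. It does keep the central property: when a query variable maps to an existential view variable, every query subgoal mentioning that variable must be covered by the same MCD.

```python
QUERY = (("x",), [("r1", ("x", "y")), ("r2", ("y", "x"))])
VIEWS = {
    "V1": (("a",), [("r1", ("a", "b"))]),
    "V2": (("d",), [("r2", ("c", "d"))]),
    "V3": (("f",), [("r1", ("f", "g")), ("r2", ("g", "f"))]),
}


def extend(mapping, covered, pending, query, view):
    """Yield (mapping, covered) pairs that close off all forced subgoals."""
    q_head, q_body = query
    v_head, v_body = view
    pending = [i for i in pending if i not in covered]
    if not pending:
        yield mapping, covered
        return
    i, rest = pending[0], pending[1:]
    pred, q_args = q_body[i]
    for v_pred, v_args in v_body:
        if v_pred != pred or len(v_args) != len(q_args):
            continue
        new_map, ok = dict(mapping), True
        for q, v in zip(q_args, v_args):
            if new_map.setdefault(q, v) != v:
                ok = False      # variable already mapped elsewhere
            if q in q_head and v not in v_head:
                ok = False      # head variable would be projected away
        if not ok:
            continue
        # Subgoals sharing a variable that landed on an existential
        # view variable must be covered by this same view.
        forced = [k for k, (_, args) in enumerate(q_body)
                  if any(a in args and new_map[a] not in v_head
                         for a in q_args)]
        yield from extend(new_map, covered | {i}, rest + forced, query, view)


def form_mcds(query, views):
    """Return de-duplicated (view name, mapping, covered subgoals) triples."""
    seen, mcds = set(), []
    for name, view in views.items():
        for start in range(len(query[1])):
            for mapping, covered in extend({}, frozenset(), [start],
                                           query, view):
                key = (name, covered, tuple(sorted(mapping.items())))
                if key not in seen:
                    seen.add(key)
                    mcds.append((name, mapping, covered))
    return mcds


def combine(mcds, n_subgoals):
    """Yield MCD sets that cover every subgoal exactly once."""
    def search(remaining, chosen, start):
        if not remaining:
            yield list(chosen)
            return
        for k in range(start, len(mcds)):
            if mcds[k][2] <= remaining:
                yield from search(remaining - mcds[k][2],
                                  chosen + [mcds[k]], k + 1)
    yield from search(frozenset(range(n_subgoals)), [], 0)


mcds = form_mcds(QUERY, VIEWS)
rewritings = [[name for name, _, _ in combo]
              for combo in combine(mcds, len(QUERY[1]))]
print(rewritings)  # [['V3']]
```

In this sketch V1 and V2 produce no MCD at all, because y would map to an existential variable and the other subgoal cannot be covered by the same view. The combination phase therefore has a single MCD to consider, with no containment check.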
16
Experimental Evaluation
  • Tested performance and scale-up of:
  • Bucket Algorithm
  • Inverse Rules extended with unification
  • MiniCon Algorithm
  • MiniCon at least as good in all cases, much
    better in some
  • Show results for chain queries
  • Q(a) :- r1(a,b), r2(b,c), r3(c,d), r4(d,e)

17
Many Rewritings
18
Few rewritings, very structured query and views
19
Few rewritings, less structured views
20
Extension: Interpreted Predicates
  • Problem is in general undecidable
  • We looked at subgoals of the form
  • var < constant or var > constant
  • If a query variable maps to an existential view
    variable, require that the view's interpreted
    predicates imply the query's
  • Ex.: Q(x) :- r1(x,y), y > 17
  • V1(a) :- r1(a,b), b > 18
  • Guaranteed to be sound
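
For comparisons of the restricted form var < constant or var > constant, the implication test reduces to comparing the two constants. The helper below is a minimal sketch. The name `implies` and the `(operator, constant)` encoding are assumptions, and only same-direction comparisons on the same mapped variable are handled.

```python
def implies(view_cmp, query_cmp):
    """Does the view's comparison imply the query's comparison?

    Each comparison is (op, constant) with op in {"<", ">"} and is
    understood to constrain the same (mapped) variable.
    """
    v_op, v_const = view_cmp
    q_op, q_const = query_cmp
    if v_op != q_op:
        return False            # opposite directions never imply
    if v_op == ">":
        return v_const >= q_const
    return v_const <= q_const


# Slide example: y maps to b; b > 18 in V1 implies y > 17 in Q.
print(implies((">", 18), (">", 17)))  # True
print(implies((">", 16), (">", 17)))  # False
```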
21
Interpreted Predicate Results
22
Future Work
  • Query Optimization
  • Look for the fastest answer to query
  • Assume that all views are complete
  • Require equivalent rewritings
  • Need to allow overlap on subgoals mapped
  • A fuller comparison of interpreted predicates

23
Conclusions
  • Scalability of previous algorithms understood
  • MiniCon Algorithm invented
  • First experimental comparison of algorithms for
    answering queries using views
  • Extensions to binding patterns, interpreted
    predicates
  • New maximally contained rewriting form

24
Maximally contained Rewritings
  • Q' is a maximally contained rewriting of a query
    Q using the views V = V1, ..., Vn if
  • For any database D and extensions v1, ..., vn of
    the views such that vi ⊆ Vi(D), 1 ≤ i ≤ n,
    Q'(v1, ..., vn) ⊆ Q(D)
  • There is no other query Q1 such that
  • (1) Q'(v1, ..., vn) ⊆ Q1(v1, ..., vn)
  • (2) Q1(v1, ..., vn) ⊆ Q(D), and there exists at
    least one database for which (1) is a strict subset

25
Containment Checks
  • Q1 ⊆ Q2 if, on every database, the answer to Q1
    is a subset of the answer to Q2
  • m is a containment mapping from Vars(Q2) to
    Vars(Q1) if
  • m maps every subgoal in the body of Q2 to a
    subgoal in the body of Q1
  • m maps the head of Q2 to the head of Q1
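
A brute-force search for a containment mapping can be sketched as follows. The function name and rule encoding are assumptions, and every term is treated as a variable (no constants). The two checks use the expansions of the first two candidate rewritings from the Bucket Algorithm example, obtained by replacing each view atom with its definition.

```python
def containment_mapping(q2, q1):
    """Return a mapping from Vars(q2) to Vars(q1), or None.

    A rule is (head_vars, body) with body atoms (predicate, args).
    A mapping exists exactly when q1 is contained in q2.
    """
    (h2, b2), (h1, b1) = q2, q1
    if len(h1) != len(h2):
        return None
    start = {}
    for var, target in zip(h2, h1):      # head must map to head
        if start.setdefault(var, target) != target:
            return None

    def search(i, mapping):
        if i == len(b2):
            return mapping
        pred, args = b2[i]
        for p1, a1 in b1:                # try every subgoal of q1
            if p1 != pred or len(a1) != len(args):
                continue
            trial = dict(mapping)
            if all(trial.setdefault(v, t) == t for v, t in zip(args, a1)):
                found = search(i + 1, trial)
                if found is not None:
                    return found
        return None

    return search(0, start)


Q = (("x",), [("r1", ("x", "y")), ("r2", ("y", "x"))])
# Expansion of V1(x), V2(x)
EXP1 = (("x",), [("r1", ("x", "b")), ("r2", ("c", "x"))])
# Expansion of V1(x), V3(x)
EXP2 = (("x",), [("r1", ("x", "b")), ("r1", ("x", "g")),
                 ("r2", ("g", "x"))])

print(containment_mapping(Q, EXP1))  # None
print(containment_mapping(Q, EXP2))  # {'x': 'x', 'y': 'g'}
```

The first candidate fails because y would have to map to both b and c. The second succeeds by sending y to g, so only the V3 atom does real work in that rewriting.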

26
Inverse Rules With Unification
  • Find all inverse rules that match each query
    subgoal; place them in the bucket for that subgoal
  • For each rule in the first bucket
  • For each other subgoal i, attempt to unify the
    rules so far with all elements in the bucket for
    i
  • If we cannot unify with anything in that bucket,
    break out of the loop; otherwise, recurse

27
Correctness requirements
  • We need both soundness and completeness
  • A sound rewriting has a valid containment mapping
    from the variables of the query to the variables
    of the view
  • For completeness we need only to check rewritings
    of length less than or equal to that of the query

28
Extensions to XML
  • Need to choose a query language
  • Containment checks should still hold
  • Need to check to make sure that restructured
    elements are distinguished
  • May be even more scalable than Inverse Rules and
    the Bucket Algorithm