# On the Inverse rules algorithm - PowerPoint PPT Presentation

## On the Inverse rules algorithm

Slides: 32
Provided by: off9
Transcript and Presenter's Notes

1
On the Inverse rules algorithm
• It is guaranteed to compute the certain answers
• But, what about its efficiency?
• As presented, it computes tuples using views that cannot contribute to the rewriting, and then discards them
• We show examples, and then how to address the problems

2
• Example
• A db: parenthood relation par(c, p)
• A view: v(C, G) :- par(C, P), par(P, G) // only grandchildren
• A query Q: q(X, Y) :- par(X, Z), par(Z, Y) // find grandchildren
• The algorithm inverts the view: par(C, f(C, G)), par(f(C, G), G) :- v(C, G)
• Given n tuples in the view, it produces 2n tuples, then joins, then discards the results that contain f(-,-)
• The bucket algorithm will spend more time on rewriting, find Q(X, Y) :- v(X, Y), and then output the n results
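
The cost of the naive evaluation can be seen in a small sketch. The Python below is illustrative only: the view contents (`ann`, `carl`, and so on) and the helper names `invert_view`, `is_skolem` and `grandchildren` are assumptions, and the Skolem term f(C, G) is modelled as a tuple.

```python
from itertools import product

def invert_view(view_tuples):
    """Apply the inverse rules par(C, f(C,G)) and par(f(C,G), G) to each v(C, G)."""
    par = set()
    for c, g in view_tuples:
        skolem = ("f", c, g)      # stands for the unknown parent between c and g
        par.add((c, skolem))
        par.add((skolem, g))
    return par

def is_skolem(term):
    """Skolem terms are tuples; ordinary constants are plain strings."""
    return isinstance(term, tuple) and term[0] == "f"

def grandchildren(par):
    """Evaluate q(X, Y) :- par(X, Z), par(Z, Y); drop answers with function symbols."""
    answers = set()
    for (x, z1), (z2, y) in product(par, repeat=2):
        if z1 == z2 and not is_skolem(x) and not is_skolem(y):
            answers.add((x, y))
    return answers

# Hypothetical view contents: two grandchild/grandparent pairs.
view = {("ann", "carl"), ("bob", "dora")}
par_facts = invert_view(view)          # 2n derived facts for n view tuples
answers = grandchildren(par_facts)     # the join recovers exactly the view tuples
```

The join runs over all derived facts only to return the view itself, which is what the rewriting Q(X, Y) :- v(X, Y) reads off directly.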
3
• Example (university db)
• Views:
• v1(s, c, q, t) :- registered(s, c, q), course(c, t), c500, qa98
• v2(s, p, c, q) :- registered(s, c, q), teaches(p, c, q)
• v3(s, c) :- registered(s, c, q), q …
• v4(p, c, t, q) :- registered(s, c, q), teaches(p, c, q), course(c, t), q …
• Query: q(s, p, c) :- registered(s, c, q), teaches(p, c, q), course(c, t), c300, qa95
• Inverting v3: registered(s, c, f(s, c)) :- v3(s, c)
• This may produce any number of facts for registered, but for this query none can be used. Why?
4
• v3(s, c) :- registered(s, c, q), q …
• q(s, p, c) :- registered(s, c, q), teaches(p, c, q), course(c, t), c300, qa95
• How should the constraint on q in v3 be represented?
• Could export it as a constraint on f(s, c), which conflicts with the query's constraint on f(s, c) and a95 (how is q in the query transformed to f(s, c)?)
• But what if the view contained no constraint?
• The view must export variables constrained in the query
• The query has a join on q with teaches; teaches facts are derived only from other views, so q will be exported as a different function symbol, or as q (which of these here?)
• ⇒ a join will fail (cannot join f1(-,-) with f2(-,-) or a regular variable)
• ⇒ The view must export join variables of the query

5
• The factors that determine usability of a view are the same as in the bucket algorithm, but the inverse rules algorithm tries to use all views anyway
• Solution: compose the query with the inverse rules, to obtain a new query that uses the views directly
• Composition:
• Consider the heads of inverse rules as a db (collection of facts)
• Look for valuations (mappings of query variables) that map query atoms to this db
• Then replace query goals by views
6
• Example
• A db: parenthood relation par(c, p)
• A view: v(C, G) :- par(C, P), par(P, G) // only grandchildren
• A query Q: q(X, Y) :- par(X, Z), par(Z, Y) // find grandchildren
• The algorithm inverts the view: par(C, f(C, G)), par(f(C, G), G) :- v(C, G)
• Two candidate valuation mappings:
• X → C, Z → f(C, G), Y → G ⇒ q(C, G) :- v(C, G), v(C, G)
• X → f(C, G), Z → G, Y → f(C, G) ⇒ (assu…) q(f(G, G), f(G, G)) :- v(G, G), v(G, G)
• 2nd is discarded: no function symbols in result
• Minimization of 1st gives q(C, G) :- v(C, G), same as bucket
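
One possible way to mechanise the composition step is plain unification of query atoms against renamed inverse-rule heads. The sketch below is an assumption-laden illustration, not the presenter's procedure: every string is treated as a variable (no constants), atoms and Skolem terms are tuples, and the names `walk`, `unify`, `resolve`, `inverse_heads` and `compose` are invented here. It tries every assignment of inverse-rule heads to query atoms, so it may enumerate more candidates than the slide lists.

```python
from itertools import product

def walk(term, subst):
    """Follow variable bindings until an unbound variable or a compound term."""
    while isinstance(term, str) and term in subst:
        term = subst[term]
    return term

def unify(a, b, subst):
    """Unify two terms (strings are variables, tuples are compound); None on failure."""
    a, b = walk(a, subst), walk(b, subst)
    if a == b:
        return subst
    if isinstance(a, str):
        return {**subst, a: b}
    if isinstance(b, str):
        return {**subst, b: a}
    if a[0] != b[0] or len(a) != len(b):
        return None
    for x, y in zip(a[1:], b[1:]):
        subst = unify(x, y, subst)
        if subst is None:
            return None
    return subst

def resolve(term, subst):
    """Apply the substitution all the way down."""
    term = walk(term, subst)
    if isinstance(term, tuple):
        return (term[0],) + tuple(resolve(t, subst) for t in term[1:])
    return term

def inverse_heads(i):
    """Heads of the two inverse rules of v(C, G), variables renamed with suffix i."""
    c, g = f"C{i}", f"G{i}"
    skolem, view_atom = ("f", c, g), ("v", c, g)
    return [(("par", c, skolem), view_atom), (("par", skolem, g), view_atom)]

def compose(head, body):
    """Replace each query atom by a view atom via an inverse-rule head."""
    rewritings = set()
    for combo in product(*(inverse_heads(i) for i in range(len(body)))):
        subst = {}
        for atom, (rule_head, _) in zip(body, combo):
            subst = unify(atom, rule_head, subst)
            if subst is None:
                break
        if subst is None:
            continue
        new_head = resolve(head, subst)
        new_body = frozenset(resolve(view_atom, subst) for _, view_atom in combo)
        atoms = [new_head, *new_body]
        if any(isinstance(arg, tuple) for atom in atoms for arg in atom[1:]):
            continue                      # function symbols survive: discard
        rewritings.add((new_head, new_body))
    return rewritings

result = compose(("q", "X", "Y"), [("par", "X", "Z"), ("par", "Z", "Y")])
```

Storing the body as a set plays the role of the minimization step for this example: the two identical view atoms collapse into one.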
7
• q(s, p, c) :- registered(s, c, q), teaches(p, c, q), course(c, t), c300, qa95
• registered(s, c, f(s, c)), f(s, c) … :- v3(s, c)
• Any valuation that uses this fact must map q → f(s, c)
• The constraint on f(s, c) conflicts with the query's constraint on f(s, c) and a95,
• but what if there is no constraint to export?
• The mapping q → f(s, c) cannot be used to map teaches to any fact derived from other views
• ⇒ v3 cannot be used

8
• A mapping will fail to define a valuation if
• a view does not export a join variable, and does not contain the join (why?)
• a view does not export a variable that is constrained in the query (cannot check the constraint in the db)
• Thus, the results (for a CQ query, possibly with constraints) will be the same as for bucket (assuming it is correct and complete)
• The amount of work invested will probably be similar
• Composition can also be performed for Datalog queries, but weeding out useless mappings is more difficult

9
The MiniCon algorithm --- the final one?
• Motivation
• Preliminaries
• The MiniCon algorithm

10
Motivation
• Previous algorithms (bucket, inverse rules) may be quite expensive to use, especially for systems with many views.
• The bucket algorithm has a narrow peephole in the 1st stage: each bucket is for a single atom
• ⇒ global constraints are treated only in the 2nd stage
• ⇒ Many useless combinations may be examined
• The inverse rules algorithm, improved by composition, seems to perform similar work
• The motivation: find an algorithm that will do more work in preliminary filtering, and will scale up to hundreds of views

11
Preliminaries
• The idea
• Once a view is put in a bucket of a query atom,
switch to considering join variables and find
which other atoms are necessarily covered by the
view
• Along the way, find out also which view head
variables need to be equated
• Given coverage by views, combine views with
disjoint covers
• Expected gain
• more filtering in the 1st stage,
• better representation of information
• ⇒ A smaller number of combinations, reduced number of containment checks in the 2nd stage

12
• Example
• A db: parenthood relation par(c, p)
• A view: v(C, G) :- par(C, P), par(P, G) // only grandchildren
• A query Q: q(X, Y) :- par(X, Z), par(Z, Y)
• Bucket: one view in each bucket
• par(X, Z): v(X, G)
• par(Z, Y): v(P, Y)
• When the two view atoms are combined, a containment check discovers that G = Y ⇒ containment, redundancy of 2nd atom
• Alternative: given par(X, Z): v(X, G), since Z (join var) occurs in 2nd atom of query, add par(Z, Y) to coverage of v(X, G), with G = Y
• In 2nd stage, just use v(X, Y)
13
• Assumptions, terminology
• CQ queries and views; for now, no constants / constraints in query/views
• View definitions use variables different from those in the query or other views (disjoint sets of variables)
• b(Q): body atoms of Q; b(V): body atoms of view V
• A mapping from vars(Q) to vars(V) is interesting only if it maps a non-empty subset of b(Q) to b(V)
• Considered mappings always map Q head vars to V head vars (the head variable property, hvp)
• If h maps x in vars(Q) to an existential var in some V, then all atoms of b(Q) that contain x must be mapped to the same V
• join variable condition --- (jvc)
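
The join variable condition can be phrased as a small check. This is a sketch under assumptions: atoms are tuples such as `("par", "X", "Z")`, the covered atoms are given as a set of indices into the query body, and the function name `satisfies_jvc` is made up for illustration.

```python
def satisfies_jvc(query_body, h, covered, existential_vars):
    """Check (jvc): if h sends x to an existential view variable,
    every query atom that contains x must be among the covered atoms."""
    for x, image in h.items():
        if image not in existential_vars:
            continue
        for index, atom in enumerate(query_body):
            if x in atom[1:] and index not in covered:
                return False
    return True

# Query q(X, Y) :- par(X, Z), par(Z, Y); view v(C, G) :- par(C, P), par(P, G).
query_body = [("par", "X", "Z"), ("par", "Z", "Y")]
existential = {"P"}                      # P does not appear in the view head

# Covering only the first atom leaves par(Z, Y) outside, although Z maps to P.
partial = satisfies_jvc(query_body, {"X": "C", "Z": "P"}, {0}, existential)

# Covering both atoms with the same view occurrence satisfies the condition.
full = satisfies_jvc(query_body, {"X": "C", "Z": "P", "Y": "G"}, {0, 1}, existential)
```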

14
• Given Q(X), assume Q′ is a rewriting in terms of views
• Q′: q(X) :- v1(X1), …, vn(Xn)
• (some vi, vj may be occurrences of the same view v)
• There exists a containment mapping h from Q to exp(Q′) (satisfies hvp)
• Let
• Gi be the set of atoms of b(Q) mapped to b(exp(vi))
• h/i be h restricted to vars(Gi)
• Then …
• And Gi satisfies (jvc):
• if h/i maps x of vars(Gi) to an existential variable of vi,
• then every atom g in b(Q) that contains x is in Gi

15
• The occurrence of vi in the rewriting may have some head variables equated
• Example: the original head might be vi(A, B, C), the head in the rewriting vi(X, X, Z)
• These equalities are given by a unique least set of equality constraints Ei
• (v/E -- the view v, with head variables equated as specified by E)
• Summary (so far): the containment mapping can be decomposed into disjoint components (vi, Ei, h/i, Gi)
• All we need to do is find such components, then combine them
• What is the condition for successful combination?
• Does a combination (s.t. …) ever fail?
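
As a small illustration of the equality set Ei, the sketch below reads the equalities off a view occurrence by comparing it with the original head. The function name `head_equalities` and the pair representation are assumptions; each repeated variable is tied to the first position where it occurred, which gives one generating set for the equalities.

```python
def head_equalities(original_head, occurrence):
    """Return pairs of original head variables that the occurrence equates."""
    first_position = {}      # variable used in the rewriting -> original head variable
    equalities = set()
    for view_var, used_var in zip(original_head, occurrence):
        if used_var in first_position:
            equalities.add((first_position[used_var], view_var))
        else:
            first_position[used_var] = view_var
    return equalities

# Original head vi(A, B, C) used as vi(X, X, Z): A and B are equated.
example = head_equalities(("A", "B", "C"), ("X", "X", "Z"))
```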
16
• To find such components, we must use the given view definitions (variables different from those of Q or of the expanded rewriting).
• Answer: a component and its mapping can be expressed as …
• Here
• hi is a mapping from Q to the given view definition for vi
• Ei is the least set of equalities that make hi a good mapping
• h′i is a variable renaming
• Ei and hi depend only on Q and the definition of vi
• We can find components (mappings from Q to the view defs), then combine and rename, possibly …

[Diagram labels: h/i, Gi, exp(vi(Xi)), hi, hi, vi/Ei]
17
• One more step
• A component (vi, Ei, hi , Gi) may be further
decomposed into smaller components (vi, Ei1, hi1
, Gi1), (vi, Ei2, hi2 , Gi2) provided
• each of Gi1, Gi2 satisfies (jvc), and they are
disjoint
• Each of Ei1, Ei2 is a subset of Ei, least sets
for the mappings hi1, hi2 to be ok
• When these are combined, Ei1 union Ei2 is
augmented with the remaining equalities of Ei
• Minimal such components
• Easier to find
• Can be re-used for different combinations.

18
• What is a minimal component?
• C = (vi, Ei, hi, Gi) is minimal if
• hi satisfies (hvp) and (jvc) (assuming the equalities in Ei)
• There is no component C1 whose last three components are contained in C's last three components (at least one is proper containment)
• A component is a MiniCon (mini containment) description -- MCD
• The algorithm constructs and combines minimal MCDs

19
The MiniCon Algorithm
• Minimal MCD Construction Algorithm
• For each g in b(Q), each k in each b(vi)
• Let E(g,k) be the least set of equalities s.t. a mapping h(g,k) from g to k that satisfies (hvp) exists
• // E(g,k) and h(g,k), if they exist, are uniquely determined by g, k
• If E(g,k) and h(g,k) exist
• find all minimal MCDs that extend them
• (vi, Ei, hi, Gi) extends them if Ei contains E(g,k), hi contains h(g,k), Gi contains g
• For the final set of MCDs: remove duplicates

20
• How do we find minimal MCDs that extend a given
mapping?
• I. Extension to one more query atom, one view atom
• extend(vi, E, h, g, k)
• // E: equalities on head vars of vi
• // h: vars(Q) → vars(vi), partial, hvp with E
• // g in b(Q), k in b(vi)
• try to extend h to map g to k, with hvp, by …
• return fail, or the (uniquely determined) E, h
• (The first step in the algorithm on the previous page is this one, given empty E and h)
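
The extension step might look roughly as follows in Python. This is a sketch, not the presenter's code: the signature passes the head-variable sets explicitly instead of the view vi, equalities are stored as two-element frozensets, and conflicts are resolved by equating head variables only, which is one reading of the truncated description above.

```python
def extend(view_head_vars, query_head_vars, E, h, g, k):
    """Try to extend the partial mapping h so that query atom g maps to view atom k.

    Returns (E, h) including any new head-variable equalities, or None on failure.
    """
    if g[0] != k[0] or len(g) != len(k):
        return None                          # different predicate or arity
    E, h = set(E), dict(h)                   # work on copies
    for x, y in zip(g[1:], k[1:]):
        if x in query_head_vars and y not in view_head_vars:
            return None                      # (hvp) would be violated
        if x not in h:
            h[x] = y
        elif h[x] != y:
            if h[x] in view_head_vars and y in view_head_vars:
                E.add(frozenset((h[x], y)))  # equate two head variables
            else:
                return None                  # cannot equate an existential variable
    return E, h

# Query q(X, X) :- par(X, Z), par(Z, X); view v(C, G) :- par(C, P), par(P, G).
view_head_vars, query_head_vars = {"C", "G"}, {"X"}
step1 = extend(view_head_vars, query_head_vars, set(), {},
               ("par", "X", "Z"), ("par", "C", "P"))
step2 = extend(view_head_vars, query_head_vars, *step1,
               ("par", "Z", "X"), ("par", "P", "G"))
blocked = extend(view_head_vars, query_head_vars, set(), {},
                 ("par", "X", "Z"), ("par", "P", "G"))
```

The first call starts from empty E and h; the second call is forced to add C = G; the third fails because the head variable X would map to the existential P.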

21
• How do we find minimal MCDs that extend a given
mapping?
• II. Extend repeatedly, as long as needed and
successful
• Given vi, g, k, E(g,k) and h(g,k)
• Let C = {(vi, E(g,k), h(g,k), {g})}, MC = {}
• // C: initial component, (jvc) possibly not satisfied
• While C not empty
• remove some c = (vi, E, h, G) from C
• if (jvc) is satisfied, put c in MC
• if not, there exist x in vars(Q) s.t. h(x) is existential, and g that contains x, g not in G
• for each k in b(vi)
• if extend(vi, E, h, g, k) succeeds, put the extension in C
• Remove duplicates from MC
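
Putting the worklist loop together for a single view gives something like the sketch below. It is an illustration under assumptions: atoms are tuples, covers are sets of query-atom indices, the helpers `extend`, `representative` and `form_mcds` are names chosen here, and duplicates are removed by normalising images of h to one representative per equality class.

```python
def extend(view_head_vars, query_head_vars, E, h, g, k):
    """Extend h to map atom g to atom k; return (E, h) or None."""
    if g[0] != k[0] or len(g) != len(k):
        return None
    E, h = set(E), dict(h)
    for x, y in zip(g[1:], k[1:]):
        if x in query_head_vars and y not in view_head_vars:
            return None                              # (hvp) violated
        if x not in h:
            h[x] = y
        elif h[x] != y:
            if h[x] in view_head_vars and y in view_head_vars:
                E.add(frozenset((h[x], y)))
            else:
                return None
    return frozenset(E), h

def representative(var, E):
    """Smallest variable name in var's equality class under E."""
    cls, grew = {var}, True
    while grew:
        grew = False
        for pair in E:
            if pair & cls and not pair <= cls:
                cls |= pair
                grew = True
    return min(cls)

def form_mcds(query_head, query_body, view_head, view_body):
    """Return the set of components (E, h, G) found for one view."""
    view_head_vars = set(view_head[1:])
    query_head_vars = set(query_head[1:])
    found = set()
    for start, g in enumerate(query_body):
        for k in view_body:
            first = extend(view_head_vars, query_head_vars, frozenset(), {}, g, k)
            if first is None:
                continue
            pending = [(first[0], first[1], frozenset([start]))]
            while pending:
                E, h, G = pending.pop()
                # Atoms outside G that share a variable mapped to an existential.
                forced = [i for i, atom in enumerate(query_body)
                          if i not in G and any(
                              x in h and h[x] not in view_head_vars for x in atom[1:])]
                if not forced:                       # (jvc) holds
                    canonical = frozenset((x, representative(y, E)) for x, y in h.items())
                    found.add((E, canonical, G))
                    continue
                j = forced[0]
                for k2 in view_body:
                    step = extend(view_head_vars, query_head_vars, E, h, query_body[j], k2)
                    if step is not None:
                        pending.append((step[0], step[1], G | {j}))
    return found

query_body = [("par", "X", "Z"), ("par", "Z", "Y")]
grandchild_view = form_mcds(("q", "X", "Y"), query_body,
                            ("v", "C", "G"), [("par", "C", "P"), ("par", "P", "G")])
parent_view = form_mcds(("q", "X", "Y"), query_body,
                        ("v", "C", "P"), [("par", "C", "P"), ("par", "P", "G")])
```

With the grandchild view v(C, G) a single component covering both query atoms comes out; with v(C, P), where P is exported, two components covering one atom each come out.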

22
• Example
• A db: parenthood relation par(c, p)
• A view: v(C, G) :- par(C, P), par(P, G) // only grandchildren
• A query Q: q(X, Y) :- par(X, Z), par(Z, Y)
• MCDs
• 1st query atom, 1st view atom: h(1,1) = {X → C, Z → P}, E(1,1) = {}
• need to extend to par(Z, Y), can only map to 2nd view atom
• MCD: (v, E = {}, h = {X → C, Z → P, Y → G}, b(Q))
• 1st query atom, 2nd view atom: no mapping
• …
• The only MCD is the above

23
Comment: In the paper, if (vi, Ei1, hi1, Gi1) and (vi, Ei2, hi2, Gi2) are both minimal extensions, and Gi1 is contained in Gi2, then the 2nd is thrown away (another minimization). I do not know how to explain this optimization, or prove that with it the algorithm is still complete.
24
• 2nd phase: MCD combination, and variable renaming
• A set of MCDs (vi, Ei, hi, Gi) is a candidate if …
• For each candidate set:
• Rename variables: for each view variable y
• If hi(x) = y (y a view variable), rename y to x
• else rename y to a fresh distinct variable
• Note: if x is in the domain of both hi, hj, then hi(x), hj(x) are head variables of vi, vj (by def of MCD) ⇒ renaming makes them equal
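
A rough sketch of the second phase follows. The candidate test is an assumption, since the transcript does not spell it out: here a set of MCDs is taken to be a candidate when the covers Gi are pairwise disjoint and together cover b(Q). The function name `combine` and the tuple layout are also invented, head equalities E are left out, and when several query variables map to the same view variable only the first is used for renaming.

```python
from itertools import combinations

def combine(query_head, n_atoms, mcds):
    """Combine MCDs whose covers partition the query body, then rename view variables.

    Each MCD is (view_head, h, G): h maps query variables to view variables and
    G is the set of covered query-atom indices.
    """
    rewritings = []
    for size in range(1, len(mcds) + 1):
        for combo in combinations(mcds, size):
            covers = [G for _, _, G in combo]
            sizes_add_up = sum(len(G) for G in covers) == n_atoms
            if not sizes_add_up or set().union(*covers) != set(range(n_atoms)):
                continue                              # overlap or a gap: not a candidate
            body = []
            for position, (view_head, h, _) in enumerate(combo):
                to_query_var = {}
                for x, y in h.items():
                    to_query_var.setdefault(y, x)     # simplification: first preimage wins
                # Unmapped view variables get a fresh name per occurrence.
                args = tuple(to_query_var.get(y, f"{y}_{position}") for y in view_head[1:])
                body.append((view_head[0],) + args)
            rewritings.append((query_head, body))
    return rewritings

# Two MCDs over v(C, P), covering the first and the second query atom.
a1 = (("v", "C", "P"), {"X": "C", "Z": "P"}, {0})
a2 = (("v", "C", "P"), {"Z": "C", "Y": "P"}, {1})
result = combine(("q", "X", "Y"), 2, [a1, a2])
```

Because Z is in the domain of both mappings, renaming view variables to query variables makes the two occurrences share Z without an explicit equality.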

25
• Example (contd)
• A db: parenthood relation par(c, p)
• A view: v(C, G) :- par(C, P), par(P, G) // only grandchildren
• A query Q: q(X, Y) :- par(X, Z), par(Z, Y)
• MCD: (v, E = {}, h = {X → C, Z → P, Y → G}, b(Q))
• Rename in v: C to X, G to Y
• Rewriting: q(X, Y) :- v(X, Y)
26
• Example
• A db: parenthood relation par(c, p)
• A view: v(C, G) :- par(C, P), par(P, G) // only grandchildren
• A query Q: q(X, X) :- par(X, Z), par(Z, X) // I am my own grandpa
• MCDs
• 1st query atom, 1st view atom: h(1,1) = {X → C, Z → P}, E(1,1) = {}
• need to extend to par(Z, X), can only map to 2nd view atom
• MCD: (v, {C = G}, {X → C, Z → P}, b(Q))
• 1st query atom, 2nd view atom: no mapping
• …
• The only MCD is the above

27
• Example
• A db: parenthood relation par(c, p)
• A view: v(C, P) :- par(C, P), par(P, G) // parents where grandparents exist
• A query Q: q(X, Y) :- par(X, Z), par(Z, Y)
• MCDs
• h(1,1) = {X → C, Z → P}, E(1,1) = {}
• ⇒ MCD A1 = (v(C, P), {}, h(1,1), {par(X, Z)})
• h(1,2) = {X → P, Z → G}, E(1,2) = {}, fails (why?)
• h(2,1) = {Z → C, Y → P}, E(2,1) = {}
• ⇒ MCD A2 = (v(C, P), {}, h(2,1), {par(Z, Y)})
• h(2,2) = {Z → P, Y → G}, fails (why?)

28
• A view: v(C, P) :- par(C, P), par(P, G)
• A query Q: q(X, Y) :- par(X, Z), par(Z, Y)
• MCDs:
• A1 = (v(C, P), {}, h(1,1), {par(X, Z)})
• A2 = (v(C, P), {}, h(2,1), {par(Z, Y)})
• Rewritings (rename views to have distinct vars)
• A1 and A2: X → C1, Z → P1, Z → C2, Y → P2
• add P1 (in 1st v) = C2 (in 2nd v)
• rewriting: v(C1, P1), v(P1, P2)
• renaming: v(X, Z), v(Z, Y), a correct rewriting
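
One way to sanity-check a rewriting such as v(X, Z), v(Z, Y) is to evaluate it on a small database and compare with the query. The data below is invented for illustration, and the helper names `query`, `view` and `rewriting` are assumptions; a single test database can only show that the answers are contained in the query's answers here, not prove containment in general.

```python
def query(par):
    """q(X, Y) :- par(X, Z), par(Z, Y)"""
    return {(x, y) for (x, z) in par for (z2, y) in par if z == z2}

def view(par):
    """v(C, P) :- par(C, P), par(P, G): parents where grandparents exist."""
    return {(c, p) for (c, p) in par for (p2, g) in par if p == p2}

def rewriting(v):
    """q(X, Y) :- v(X, Z), v(Z, Y)"""
    return {(x, y) for (x, z) in v for (z2, y) in v if z == z2}

# Hypothetical chain ann -> bea -> cy -> dee, plus an unrelated pair.
par = {("ann", "bea"), ("bea", "cy"), ("cy", "dee"), ("eve", "fay")}

from_views = rewriting(view(par))
from_db = query(par)
```

On this data the rewriting returns a strict subset of the query's answers: the pair (bea, dee) is lost because the view does not contain (cy, dee), so the rewriting is contained in the query without being equivalent to it.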
29
• When Q or views contain constants
• MCD formation
• A constant a of Q must be mapped to a head variable of vi, or to itself
• If x is in headvar(Q), it can be mapped to …
• Whenever x is mapped to a, hi records this fact
• MCD combination
• If A1, A2 are both defined on x, then allow also:
• Both map x to a
• One maps x to a, the other to a head var of the view
• In either case, rename x to a in the rewriting

30
• When Q or views contain comparisons
• If views contain comparisons, no change to the algorithm (it finds contained rewritings anyway)
• If Q contains comparisons, then there may be no Datalog program that computes the certain answers (can express x != y)
• But we can expect that extending the algorithm for comparisons will be a good heuristic, and will find certain answers in many cases

31
• When Q or views contain comparisons
• C(Q) constraints of Q (closed under inference)
• MCD formation (vi, Ei, hi, Gi) (extend the
join variable condition)
• If hi(x) is existential of vi, and c(x, y) in
C(Q), then hi(y) is defined
• C(vi) must imply all constraints in hi(C(Q))
that involve at least one existential of vi
• MCD combination
• Add all constraints of C(Q) not covered by those
of the views