Web data management and distribution. XQuery processing, Part 4: A logical algebra for XQuery

1
Web data management and distribution
XQuery processing, Part 4: A logical algebra for XQuery
S. Abiteboul, I. Manolescu, P. Rigaux, P. Senellart
INRIA Saclay Île-de-France
This course is based on I. Manolescu, Y. Papakonstantinou, "XQuery Midflight: Emerging Database-Oriented Paradigms and a Classification of Research Advances", ICDE 2005
2

XQDMA: an abstraction of the XQuery Data Model
  • Labeled (ordered) trees
  • Nodes
    • Four kinds: document, element, attribute, text
    • Single document node
    • Element/attribute nodes labeled with XQDM element/attribute names
      (by convention, attribute names start with @)
    • Text nodes labeled with XQDM values
    • Nodes have unique identities
    • Type function T labels every node with an XQDM type
  • Edges
    • Document node is the root and has exactly one child, which is an element
    • Attribute nodes may appear only as children of element nodes
    • Attribute nodes may only have text node children
    • Text nodes may only be leaves
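
The node kinds and edge constraints listed above can be sketched in Python. This is only an illustration: the `Node` class, its field names and the `check` function are assumptions made for this example and are not part of XQDMA or of any library.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import List, Optional

_ids = count(1)  # source of unique node identities (assumption of this sketch)

@dataclass(eq=False)  # eq=False: two nodes are equal only if they are the same object
class Node:
    kind: str    # "document", "element", "attribute" or "text"
    label: str   # name for element/attribute nodes, value for text nodes
    children: List["Node"] = field(default_factory=list)
    nid: int = field(default_factory=lambda: next(_ids))

def check(node: Node, parent: Optional[Node] = None) -> bool:
    """Return True if the subtree rooted at node obeys the edge constraints."""
    if node.kind == "document":
        # the document node is the root and has exactly one element child
        ok = (parent is None and len(node.children) == 1
              and node.children[0].kind == "element")
    elif node.kind == "element":
        ok = parent is not None
    elif node.kind == "attribute":
        # attributes hang below elements and have only text children
        ok = (parent is not None and parent.kind == "element"
              and all(c.kind == "text" for c in node.children))
    elif node.kind == "text":
        ok = parent is not None and not node.children  # text nodes are leaves
    else:
        ok = False
    return ok and all(check(c, node) for c in node.children)

name = Node("attribute", "@name", [Node("text", "KadoP")])
doc = Node("document", "group.xml", [Node("element", "project", [name])])
print(check(doc))
```

The type function T is left out of the sketch; it could be added as one more field on `Node`.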

3
Equality Relationships
  • Node ID-based equality =id (XQuery: is)
    • Two nodes are id-equal if they are the same node
  • Value-based equality =v (XQuery: eq)
    • Two values are equal if the results of casting one or both into a common domain are equal. This depends on the values' types.
    • 24 atomic types (XQDM, XSch); casting rules in XQFO
  • Limited value-based equality
    • Text nodes are value-based equal if their labels are equal
    • Attribute/element nodes n1 and n2 are value-based equal if their labels are equal and
      (unordered) for every child of n1 there is an equal child of n2 and vice versa
      (obvious generalization to ordered)
  • Limited value-based equality implies value-based equality
    • Not vice versa (e.g., a text node with "005" is equal to a text node with "05", considering typing and coercion)
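
The "005" versus "05" example can be reproduced with a small Python sketch. It assumes that the only cast ever attempted is to integer, which is far simpler than the XQFO casting rules; both function names are made up for this illustration.

```python
def value_equal(a: str, b: str) -> bool:
    """Value-based equality: compare after casting both sides to a common domain.

    Assumption of this sketch: the only cast attempted is to integer.
    """
    try:
        return int(a) == int(b)
    except ValueError:
        # at least one side is not an integer: fall back to comparing the strings
        return a == b

def limited_value_equal(a: str, b: str) -> bool:
    """Limited value-based equality on text nodes: the labels must be identical."""
    return a == b

print(value_equal("005", "05"), limited_value_equal("005", "05"))
```

Whenever `limited_value_equal` holds, `value_equal` holds too, but not the other way around, which mirrors the implication stated on the slide.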

4
Order Relationship
  • Node order relationship << (XQuery: before) in ordered trees
    • Parent before children
    • Attribute nodes directly follow their parent (and precede non-attribute nodes)
    • Undefined order between attributes
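
A minimal Python sketch of this order, assuming nodes are written as hypothetical `(name, kind, children)` triples. The order between sibling attributes is undefined in the model; the sketch simply keeps their input order.

```python
def document_order(node):
    """Yield node names in document order: the parent first, then its
    attributes, then its remaining children, recursively."""
    name, _kind, children = node
    yield name
    attrs = [c for c in children if c[1] == "attribute"]
    others = [c for c in children if c[1] != "attribute"]
    for child in attrs + others:
        yield from document_order(child)

def before(tree, a, b):
    """True if node a precedes node b (the << relationship) within tree."""
    order = list(document_order(tree))
    return order.index(a) < order.index(b)

# Hypothetical fragment: element j1 with a text child and an attribute n1.
tree = ("j1", "element", [
    ("t1", "text", []),
    ("n1", "attribute", [("t2", "text", [])]),
])
print(list(document_order(tree)))
```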

5
Node and value comparisons
[Figure: the "group.xml" tree, with document node r1, element nodes (group g1, faculty f1, persons p1 to p3, inproject i1 to i3, URL u1 and u2, ...), attribute nodes (@name) and text nodes (e.g., "KadoP", "NexT", "Mary", "Lily", "m@u.edu", "j@u.edu", "m@acm.org", "kadop.net", "next.org"). Legend: document node, element node, attribute node, text node.]
6
Equality comparisons
t1 ≠id t5    t1 =v t5    t5 =v t14
i1 ≠id i3    i1 =v i3    i1 ≠v n1
[Figure: the "group.xml" tree]
7
Order comparisons
j1 << n1, j1 << u1
p1 << i1 << t5 << n3 << f1
[Figure: the "group.xml" tree]
8
Equalities on XQDMA lists
  • Deep-equal
    • Two lists are deep-equal if they have the same length and their items at corresponding positions are value-based equal
  • The "=" of XQuery translates to existential equality comparison
  • Existential list equality (ele) comparison =∃
    • l1 =∃ l2 iff ∃ o1 ∈ l1, o2 ∈ l2 such that o1 =v o2
    • ele is not transitive
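
Both list comparisons, and the failure of transitivity, can be checked with a short Python sketch. Items are plain strings here, and `value_equal` is a stand-in for =v that only tries an integer cast; all function names are assumptions of this example.

```python
def value_equal(a: str, b: str) -> bool:
    """Stand-in for =v: compare as integers when possible, else as strings."""
    try:
        return int(a) == int(b)
    except ValueError:
        return a == b

def deep_equal(l1, l2) -> bool:
    """Same length, and items at corresponding positions are value-equal."""
    return len(l1) == len(l2) and all(value_equal(x, y) for x, y in zip(l1, l2))

def exists_equal(l1, l2) -> bool:
    """Existential list equality: some item of l1 is value-equal to some item of l2."""
    return any(value_equal(x, y) for x in l1 for y in l2)

a, b, c = ["1"], ["1", "2"], ["2"]
# a and b share "1", b and c share "2", yet a and c share nothing.
print(exists_equal(a, b), exists_equal(b, c), exists_equal(a, c))
```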

9
Unified data model (UDM) extending XQDMA
  • (XQDMA) lists l = [o1, o2, ..., on]
    • the oi are XQDMA nodes
  • Tuples t = (v1 = a1, v2 = a2, ..., vn = an)
    • "(in t) the variable vi binds to variable binding ai"
    • t.vi may be an XQDMA list, or a set/bag/list (collection) of homogeneous tuples
    • (v1, v2, ..., vn) is the tuple schema
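
One way to picture UDM tuples is as Python dictionaries whose values are either lists of node ids or nested collections of tuples. The representation and the helper names `schema` and `is_homogeneous` are assumptions of this sketch, not definitions from the course.

```python
def schema(t: dict) -> tuple:
    """The tuple schema: the variable names, in order."""
    return tuple(t.keys())

def is_homogeneous(collection) -> bool:
    """A collection of tuples is homogeneous if all tuples share one schema."""
    return len({schema(t) for t in collection}) <= 1

# A tuple binding P to a node id and G to a nested collection of tuples.
t = {
    "P": "p2",
    "G": [{"M": "m2", "N": "n4"}, {"M": "m3", "N": "n4"}],
}
print(schema(t), is_homogeneous(t["G"]))
```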

10
UDM tuple equality
b1 = (v1 = p1, v2 = t5, v3 = [i1, t5])
b2 = (v1 = p1, v2 = t11, v3 = [i3, t14])
b1 = b2
[Figure: the "group.xml" tree]
11
  • Unified Tuple-Based Algebra Operators

12
Unified tuple-based algebra operators
  • Navigation
  • XML construction
  • Nested plans
  • Relational-style operators
  • Other operators

13
XPath navigation
  • Based on tree patterns
  • Navigation operator: nav(Collection of Tuples)

Example paths:
$R//faculty/person
$R//person
$R//person[email]/name
[Figure: the corresponding tree patterns, with pattern nodes faculty, person ($P), email and name ($N)]
14
XPath navigation
$R//person[mail]/name
Input: $R = r1
Output:
$R  $N
r1  n3
r1  n4
[Figure: excerpt of the "group.xml" tree next to the tree pattern: person with a mail child and a name child bound to $N]
15
Tree patterns capture navigation of "for"
for $P in $R//person, $M in $P/mail, $N in $P/name
return ($P, $M, $N)
Input: $R = r1
Tree pattern: $P: person, with children $M: mail and $N: name
Output:
$R  $P  $M  $N
r1  p1  m1  n3
r1  p2  m2  n4
r1  p2  m3  n4
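
The binding table produced by navigation for this "for" clause can be reproduced with a Python sketch. The document below is a hypothetical, heavily simplified version of "group.xml" holding only the person, mail and name nodes; the `(id, tag, children)` encoding and the function names are assumptions of this example.

```python
# Hypothetical fragment of "group.xml": each node is (id, tag, children).
doc = ("r1", "document", [
    ("g1", "group", [
        ("p1", "person", [("m1", "mail", []), ("n3", "name", [])]),
        ("p2", "person", [("m2", "mail", []), ("m3", "mail", []),
                          ("n4", "name", [])]),
        ("p3", "person", [("n5", "name", [])]),
    ]),
])

def descendants(node, tag):
    """All descendants of node with the given tag, in document order."""
    for child in node[2]:
        if child[1] == tag:
            yield child
        yield from descendants(child, tag)

def children(node, tag):
    """Direct children of node with the given tag."""
    return [c for c in node[2] if c[1] == tag]

def nav(root):
    """Bindings ($R, $P, $M, $N) for the pattern person{mail, name} below root."""
    return [(root[0], p[0], m[0], n[0])
            for p in descendants(root, "person")
            for m in children(p, "mail")
            for n in children(p, "name")]

for row in nav(doc):
    print(row)
```

In this sketch p3 has no mail child, so it contributes no tuple, exactly as a "for" clause over $P/mail would behave.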
16
Generalized "for" navigation
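for $P in $R//person, $M in $P/mail, $N in $P/name
return ($P, $M, $N)
Plan: π[$P, $M, $N] over nav (input $R = r1; tree pattern $P: person, with children $M: mail and $N: name)
Output of nav:
$R  $P  $M  $N
r1  p1  m1  n3
r1  p2  m2  n4
r1  p2  m3  n4
Output of π:
$P  $M  $N
p1  m1  n3
p2  m2  n4
p2  m3  n4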
17
Generalized "for" navigation
for $P in $R//person, $M in $P/mail, $N in $P/name
return ($P, $M, $N)
Input: $R = r1
Tree pattern: $P: person, with children $M: mail and $N: name
Output:
$P  $M  $N
p1  m1  n3
p2  m2  n4
p2  m3  n4
18
Generalized "for" and "where" navigation
for $P in $R//person, $N in $P/name
where $P/mail
return ($P, $N)
Tree pattern: $P: person, with children mail (not bound to a variable) and $N: name
Output:
$P  $N
p1  n3
p2  n4
p2  n4
19
XML result construction
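for $P in //person, $N in $P/name, $M in $P/mail
return <person>{ $M, $N }</person>
Tree pattern: $P: person, with children $M: mail and $N: name
Constructed result (primed identifiers denote copies of the source nodes):
  x1: person with children m1' (mail, "j@u.edu") and n3' (name, "John")
  x2: person with children m2' (mail, "m@u.edu") and n4' (name, "Mary")
  x3: person with children m3' (mail, "m@acm.org") and n4' (name, "Mary")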
20
XML result construction
for $P in //person, $N in $P/name, $M in $P/mail
return <person>{ $M, $N }</person>
[Figure: the three constructed person elements: x1 (m1', n3'), x2 (m2', n4'), x3 (m3', n4')]
21
Nested plans and the apply operator (1)
for $P in //person
return <person>{ for $N in $P/name return $N }</person>
22
Nested plans and apply (2)
for $P in //person
return <person>{ for $N in $P/name return ($N, $P/mail) }</person>
Nested plan: P1
Bindings of $P: p1, p2, p3
23
Nested plans and apply (3)
for $P in //person
return <person>{ for $N in $P/name return ($N, $P/mail) }</person>
Constructed result:
  x1: person with children n3', m1'
  x2: person with children n4', m2', m3'
  x3: person with child n5'
24
Nested plans and let clauses
Previous query:
for $P in //person
return <person>{ for $N in $P/name return ($N, $P/mail) }</person>
25
Nested plans and let clauses
Same query with let:
for $P in //person
let $L1 := for $N in $P/name
           let $L2 := ($N, $P/mail)
           return $L2
return <person>{ $L1 }</person>
26
Nested plans and let clauses
Same query with let:
for $P in //person
let $L1 := for $N in $P/name
           let $L2 := $P/mail
           return ($N, $L2)
return <person>{ $L1 }</person>
[Figure: nested plan P2, built from nav (pattern $N: name) and crList over $N, $L2]
27
Nested plans and let clauses
Same query with let:
for $P in //person
let $L1 := for $N in $P/name
           let $L2 := ($N, $P/mail)
           return $L2
return <person>{ $L1 }</person>
Nested plan: P1
28
Nested plans and let clauses
  • Nested queries can be equivalently rewritten using let clauses until return clauses reach the form
    • ($V1, $V2, ..., $Vk) or <tag>{ $V1, $V2, ..., $Vk }</tag>
  • Nested queries with let can be "automatically" translated
    • for → nav
    • let → apply
    • return → crList
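
The for/let/return translation can be mimicked in Python on the running query (each person, with its names and mails). The operator names nav, apply and crList come from the slides; their Python signatures, the `persons` dictionary and the `nested_plan` helper are assumptions of this sketch.

```python
# Hypothetical data: child node ids of each person.
persons = {
    "p1": {"name": ["n3"], "mail": ["m1"]},
    "p2": {"name": ["n4"], "mail": ["m2", "m3"]},
    "p3": {"name": ["n5"], "mail": []},
}

def nav(persons):
    """for $P in //person: one tuple per binding of $P."""
    return [{"P": p} for p in persons]

def nested_plan(t):
    """for $N in $P/name return ($N, $P/mail), evaluated for one binding of $P."""
    kids = persons[t["P"]]
    return [x for n in kids["name"] for x in [n] + kids["mail"]]

def apply(tuples, plan, var):
    """let: extend every tuple with the result of the nested plan."""
    return [{**t, var: plan(t)} for t in tuples]

def cr_list(tuples, tag, var):
    """return: build one (tag, content) element per tuple."""
    return [(tag, t[var]) for t in tuples]

result = cr_list(apply(nav(persons), nested_plan, "L1"), "person", "L1")
for element in result:
    print(element)
```

The nested plan is evaluated once per outer tuple, which is the iteration-based (nested loops) reading of apply.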

29
Nested plans and optional navigation
  • Capture all navigation with a single pattern
    • optional edges
    • null (⊥) variable values
    • groupBy on variable values

for $P in //person
return <person>{ for $N in $P/name return ($N, $P/mail) }</person>
30
Nested plans and optional navigation
  • Capture all navigation with a single pattern
    • optional edges
    • null (⊥) variable values
    • groupBy on variable values
    • apply on set: apps P → V

for $P in //person
return <person>{ for $N in $P/name return ($N, $P/mail) }</person>
[Figure: plan with apps P1 → V over navigation from $R; tree pattern $P: person with $N: name and $M: mail, using optional edges]
31
Nested plans and optional navigation
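for $P in //person
return <person>{ for $N in $P/name return ($N, $P/mail) }</person>
Grouped bindings (G is a nested collection with schema $P, $M, $N; V holds the constructed element):
$P   G ($P, $M, $N)                  V
p1   {(p1, m1, n3)}                  x1
p2   {(p2, m2, n4), (p2, m3, n4)}    x2
p3   {(p3, ⊥, n5)}                   x3
Constructed result:
  x1: person with children n3', m1'
  x2: person with children n4', m2', m3'
  x3: person with child n5'
[Figure: plan with apps P1 → V over navigation from $R; tree pattern $P: person with $N: name and $M: mail]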
32
Nested plans and optional navigation
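for $P in //person
return <person>{ for $N in $P/name return ($N, $P/mail) }</person>
Grouped bindings:
$P   G ($P, $M, $N)
p1   {(p1, m1, n3)}
p2   {(p2, m2, n4), (p2, m3, n4)}
p3   {(p3, ⊥, n5)}
[Figure: plan with apps P1 → V over navigation from $R; tree pattern $P: person with $M: mail and $N: name]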
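
The single-pattern alternative (optional edges, null values, then groupBy and construction per group) can be sketched in Python. `None` plays the role of the null value; the `persons` data and all function names are assumptions of this example.

```python
from itertools import groupby

# Hypothetical data: child node ids of each person.
persons = {
    "p1": {"name": ["n3"], "mail": ["m1"]},
    "p2": {"name": ["n4"], "mail": ["m2", "m3"]},
    "p3": {"name": ["n5"], "mail": []},
}

def nav_optional(persons):
    """One navigation with optional edges: a missing child binds to None."""
    rows = []
    for p, kids in persons.items():
        for m in kids["mail"] or [None]:
            for n in kids["name"] or [None]:
                rows.append((p, m, n))
    return rows

def group_by_person(rows):
    """groupBy on $P; the rows of one person are already adjacent."""
    return [(p, list(g)) for p, g in groupby(rows, key=lambda r: r[0])]

def construct(group):
    """Rebuild the content of <person> from one group, dropping nulls
    and the duplicates introduced by the flat navigation."""
    names = list(dict.fromkeys(n for _, _, n in group if n is not None))
    mails = list(dict.fromkeys(m for _, m, _ in group if m is not None))
    return ("person", [x for n in names for x in [n] + mails])

rows = nav_optional(persons)
result = [construct(g) for _, g in group_by_person(rows)]
print(rows)
print(result)
```

Under these assumptions the result matches what the nested (apply-based) plan builds for the same query: one person element per binding of $P, including p3, which has no mail.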
33
Selection predicates
Predicate   Meaning                          XQuery notation / fn or op in XQFO
=id         same node                        is / op:is-same-node
=v          same value                       eq / fn:compare, op:numeric-equal, ...
<v          smaller value                    lt / fn:compare, op:numeric-less-than, ...
<<          node before                      << / op:node-before
=           list equality, tuple equality    eq / fn:deep-equal
=∃          exist. equality                  = / functions backing eq, tuple equality

34
Selection plan (1)
  • σ_θ(p)

for $P in //person, $N in $P/name
where $P/mail = "m@acm.org"
return $N
[Figure: plan with σ[$M =v∃ "m@acm.org"] and apps P1 → V; result: n4]
35
Selection plan (2)
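for $P in //person, $N in $P/name
where $P/mail = "m@acm.org"
return $N
Output of nav (input $R; tree pattern $P: person, with children $M: mail and $N: name):
$P  $M  $N
p1  m1  n3
p2  m2  n4
p2  m3  n4
After grouping on $P and apps P2 → M2 (which adds the column M2):
$P   G ($P, $M, $N)
p1   {(p1, m1, n3)}
p2   {(p2, m2, n4), (p2, m3, n4)}
$N values: n3, n4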
36
Selection plan (2)
for $P in //person, $N in $P/name
where $P/mail = "m@acm.org"
return $N
After σ[M2 =v∃ "m@acm.org"] over apps P2 → M2:
$P   G ($P, $M, $N)
p2   {(p2, m2, n4), (p2, m3, n4)}
Result: n4
37
Selection plan (3)
  • σ_θ(p)

for $P in //person, $N in $P/name
where $P/mail = "m@acm.org"
return $N
[Figure: plan σ[$M =v "m@acm.org"] over navigation from $R; tree pattern $P: person, with children $M: mail and $N: name]
38
Joins
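for $p in //person, $j in //projects
where $p/inproject = $j/@name
return ($p, $j)
[Figure: join plan with predicate x =v∃ y over two navigations from $R; tree patterns $P: person with child x: inproject, and $J: project with child y: @name]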
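
A join whose predicate is an existential equality between two list-valued columns can be sketched as a plain nested-loops join in Python. The person and project bindings below are invented for illustration, and `exists_equal` compares strings directly instead of applying the full =v rules.

```python
# Hypothetical bindings: x holds the inproject values of a person,
# y holds the @name values of a project.
persons = [
    {"P": "p1", "x": ["KadoP"]},
    {"P": "p2", "x": ["KadoP", "NexT"]},
    {"P": "p3", "x": []},
]
projects = [
    {"J": "j1", "y": ["KadoP"]},
    {"J": "j2", "y": ["NexT"]},
]

def exists_equal(l1, l2):
    """Existential equality: some item of l1 equals some item of l2."""
    return any(a == b for a in l1 for b in l2)

def join(left, right, predicate):
    """Nested-loops join: keep every pair of tuples satisfying the predicate."""
    return [{**l, **r} for l in left for r in right if predicate(l, r)]

pairs = [(t["P"], t["J"])
         for t in join(persons, projects,
                       lambda l, r: exists_equal(l["x"], r["y"]))]
print(pairs)
```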
39
Generalized navigation
for $P in //person, $M in $P/mail, $N in $P/name
return <pers>{ $M, $N }</pers>
[Figure: excerpt of the "group.xml" tree (r1, group g1, faculty f1, ..., p1, p2), showing p2 with mail nodes m2 ("m@u.edu", t12) and m3 ("m@acm.org", t13) and name node n4 ("Mary", t10)]
40
Other algebras
  • Tsimmis and YAT algebras for semistructured data
    • Introduced navigation and construction patterns
  • SAL from U. Tel Aviv
    • Close relative of OQL
  • TAX, Generalized Tree Patterns from Michigan
    • Navigation extracts bindings to hidden tuples (packaged as trees)
    • Grouping tracking navigation
  • Enosys algebra
    • Collect bindings (nav and join), do nested plans, create XML
    • Also present in NEXT system
  • Xstasy from U. Pisa
  • Context-based algebra from U. Oregon
  • Rainbow algebra from Worcester Polytechnic Inst.

41
  • Putting it all together: processing a sample query

42
Sample query
for $P in //person, $N in $P/name
where $P/mail = "m@acm.org"
return $N
43
Sample query and logical plan 1
for $P in //person, $N in $P/name
where $P/mail = "m@acm.org"
return $N
[Figure: logical plan 1: σ[$M =v "m@acm.org"] over navigation from $R; tree pattern $P: person, with children $M: mail and $N: name]
45
Sample query and logical plan 1
How to implement nav_TN?
  • If a (person, mail, name) table exists, scan it, then apply the selection
  • If a (person, mail) and a (person, name) table exist, join them
  • If the document still exists as a whole, evaluate in streaming

for $P in //person, $N in $P/name
where $P/mail = "m@acm.org"
return $N
[Figure: logical plan 1: σ[$M =v "m@acm.org"] over navigation from $R; tree pattern $P: person, with children $M: mail and $N: name]
46
Sample query and logical plan 2
How to implement nav_TN2?
  • If a (person, mail, name) table exists, scan it, then apply the selection
  • If a (person, mail, name) table exists, indexed on mail, perform an index access
  • If the document still exists as a whole, evaluate in streaming

for $P in //person, $N in $P/name
where $P/mail = "m@acm.org"
return $N
[Figure: logical plan 2: navigation from $R with the selection pushed into the tree pattern: $P: person, with children $M: mail = "m@acm.org" and $N: name]
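
The first two options (scan then select, versus index access on mail) can be compared on a toy table in Python. The table contents are hypothetical; for brevity the mail column stores the text value rather than a node id, and the function names are invented for this sketch.

```python
from collections import defaultdict

# Hypothetical (person, mail, name) table.
table = [
    ("p1", "j@u.edu", "n3"),
    ("p2", "m@u.edu", "n4"),
    ("p2", "m@acm.org", "n4"),
]

def scan_then_select(table, mail):
    """Option 1: read every row and keep those with the requested mail."""
    return [name for (_person, m, name) in table if m == mail]

def build_mail_index(table):
    """Built once, ahead of queries: mail value -> rows with that value."""
    index = defaultdict(list)
    for row in table:
        index[row[1]].append(row)
    return index

def index_access(index, mail):
    """Option 2: look the mail up in the index, touching only matching rows."""
    return [name for (_person, _mail, name) in index.get(mail, [])]

mail_index = build_mail_index(table)
print(scan_then_select(table, "m@acm.org"), index_access(mail_index, "m@acm.org"))
```

Both functions return the same bindings for $N; they differ only in how many rows they have to read.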
47
  • XQuery processing: conclusion

48
Conclusions
  • XML query processing takes place in a huge variety of settings
    • Many small documents
    • Few large documents
    • Varied structure vs. uniform structure
    • Retrieve vs. construct queries
  • Different implementation options
    • Persistent store
    • Streaming
    • Both
  • Choice of storage model = choice of materialized views
  • Access path selection = view-based query rewriting

49
Conclusions
  • Logical algebra for XQuery allows making sense of a query (decomposing it into logical operators)
  • Each logical operator may have different implementations (physical operators)
    • Navigation
      • Table access
      • Index access
      • Stream-based evaluation
    • Join
      • Hash-based
      • Nested loops
    • Apply
      • Iteration-based (nested loops)
      • If there are duplicates, use group-by
    • List constructor: typically easy
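
The point that one logical operator admits several physical operators can be illustrated in Python with the join of a (person, mail) table and a (person, name) table. The two tables are hypothetical, and both functions are minimal sketches of the two join strategies named above.

```python
from collections import defaultdict

# Hypothetical tables: (person, mail) and (person, name).
person_mail = [("p1", "m1"), ("p2", "m2"), ("p2", "m3")]
person_name = [("p1", "n3"), ("p2", "n4"), ("p3", "n5")]

def nested_loops_join(left, right):
    """Compare every left row with every right row on the person column."""
    return [(l, r) for l in left for r in right if l[0] == r[0]]

def hash_join(left, right):
    """Build a hash table on the right input, then probe it with each left row."""
    buckets = defaultdict(list)
    for r in right:
        buckets[r[0]].append(r)
    return [(l, r) for l in left for r in buckets.get(l[0], [])]

print(hash_join(person_mail, person_name))
```

Both functions implement the same logical join and, on these inputs, return the same pairs in the same order; choosing between them is a physical-level decision.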