Evaluation of Partial Path Queries on XML Data

1
Evaluation of Partial Path Queries on XML Data
Stefanos Souldatos (NTUA, Greece)
Xiaoying Wu (NJIT, USA)
Dimitri Theodoratos (NJIT, USA)
Theodore Dalamagas (NTUA, Greece)
Timos Sellis (NTUA, Greece)
2
Evaluation of Partial Path Queries on XML Data
  • Partial path queries
  • Query processing
  • Query evaluation
  • Experiments
  • Conclusion

3
Difficulties on Querying XML Data
Creta
4
Difficulties on Querying XML Data
Search problem
Name: Xiaoying Wu
Place: Athens Center, Heraklio
Purpose: Sightseeing
Problem: structural difference
Parthenon (438 BC)
Phaistos Disk (1700 BC)
Creta
5
Difficulties on Querying XML Data
Search problem
Name: Theodore Dalamagas
Place: Islands
Purpose: Sea sports
Problem: structural inconsistency
Windsurf
Jet ski
Creta
6
Difficulties on Querying XML Data
Search problem
Name: Dimitri Theodoratos
Place: Heraklio
Purpose: HDMS Conference
Problem: unknown structure
HDMS 2008
Creta
7
Difficulties on Querying XML Data
Search problem
Name: Stefanos Souldatos
Place: Any island
Purpose: Escape from PhD!
Problem: multiple sources
Creta
theHotel.gr
1400 islands
hotels.gr
holidays.gr
8
Difficulties on Querying XML Data
Can we use existing query languages (XPath,
XQuery) to express our queries?
Can we use existing techniques to evaluate our
queries?
Creta
9
Path Queries in XPath
no structure (keywords)
//theHotel.gr[descendant-or-self::*[ancestor-or-self::City][ancestor-or-self::Island]]
partial path queries
//theHotel.gr//City[descendant-or-self::*[ancestor-or-self::Island]]
full structure (path patterns)
/theHotel.gr/City//Island
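The difference between a fully specified and a partially specified path can be sketched in a few lines of Python. This is only an illustration: the two documents, the function names, and the idea of testing root-to-leaf tag sequences are assumptions made here, not part of the talk. The partial check accepts Island above or below City, while the full check accepts only one nesting.

```python
import xml.etree.ElementTree as ET

# Hypothetical documents: the same data, nested in opposite orders.
DOC_A = "<theHotel.gr><City><Island/></City></theHotel.gr>"
DOC_B = "<theHotel.gr><Island><City/></Island></theHotel.gr>"

def root_to_leaf_paths(elem, prefix=()):
    """Yield the tag sequence of every root-to-leaf path."""
    path = prefix + (elem.tag,)
    children = list(elem)
    if not children:
        yield path
    for child in children:
        yield from root_to_leaf_paths(child, path)

def full_structure(path):
    # Mirrors /theHotel.gr/City//Island: City is a child of the root
    # element theHotel.gr, and Island occurs somewhere below City.
    return (len(path) > 2 and path[0] == "theHotel.gr"
            and path[1] == "City" and "Island" in path[2:])

def partial_structure(path):
    # City occurs below theHotel.gr; Island lies on the same path,
    # either above or below City.
    if "theHotel.gr" not in path:
        return False
    start = path.index("theHotel.gr")
    return "City" in path[start + 1:] and "Island" in path

def matches(doc, predicate):
    """True if some root-to-leaf path of the document satisfies predicate."""
    return any(predicate(p) for p in root_to_leaf_paths(ET.fromstring(doc)))
```

With these two documents, the full-structure check accepts only DOC_A, while the partial check accepts both.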
10
Partial Path Queries

r: root node (optional)
a: query node labelled by a
edges: child relationship, descendant relationship
[Figure: partial path query]
11
Partial Path Queries
partial path query
QUERY PROCESSING
partial path query in canonical form
QUERY EVALUATION
12
Evaluation of Partial Path Queries on XML Data
  • Partial path queries
  • Query processing
  • Query evaluation
  • Experiments
  • Conclusion

13
Query Processing
  • Full form
  • Satisfiability
  • Redundant nodes
  • Canonical form

14
Query Processing
INFERENCE RULES
(IR1) ⊢ r//ai
(IR2) x/y ⊢ x//y
(IR3) x//y, y//z ⊢ x//z
(IR4) x/ai, x//bj ⊢ ai//bj
(IR5) ai/x, bj//x ⊢ bj//ai
(IR6) x/y, y/w, x//z, z//w ⊢ x/z
(IR7) x/y, x//z, w/z, w//y ⊢ x/z
(IR8) x/y, y/w, x/z ⊢ z/w
(IR9) x//y, y//w, x/z ⊢ z//w
(IR10) x/y, w/y, w/z ⊢ x/z
(IR11) x//y, w/y, w//z ⊢ x//z
(IR12) x/y, y/w, z/w ⊢ x/z
(IR13) x//y, y//w, z/w ⊢ x//z
x, y, z, w: query nodes; ai, bj: nodes labelled by a and b
  • Full form
  • Satisfiability
  • Redundant nodes
  • Canonical form
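As a rough illustration of how such rules are applied until nothing new can be derived, the sketch below closes a set of edges under IR2 and IR3 only. The function name and the edge representation (pairs of node names) are assumptions made here, and the remaining eleven rules are deliberately left out, so the result is not the complete full form.

```python
def close_ir2_ir3(child, desc):
    """Return the descendant edges closed under IR2 and IR3.

    child, desc: sets of (x, y) pairs for x/y and x//y respectively.
    Only IR2 and IR3 are applied; the other rules are not modelled.
    """
    closed = set(desc) | set(child)      # IR2: x/y gives x//y
    changed = True
    while changed:                       # repeat until a fixpoint is reached
        changed = False
        for (x, y) in list(closed):
            for (y2, z) in list(closed):
                if y == y2 and (x, z) not in closed:
                    closed.add((x, z))   # IR3: x//y, y//z gives x//z
                    changed = True
    return closed
```

For a query with r/a and a//b, the closure adds r//a (by IR2) and r//b (by IR3).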

18
Query Processing
  • Full form
  • Satisfiability
  • Redundant nodes
  • Canonical form

A query is unsatisfiable if its full form
contains a trivial cycle
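One way to picture this check, reading "trivial cycle" as a node that ends up being its own descendant (an assumption, since the slide does not define the term): treat every child or descendant edge as a strict "is above" edge and look for a directed cycle. The sketch below uses a depth-first search; the function name is hypothetical, and a cycle found this way is only a sufficient reason for unsatisfiability.

```python
def has_cycle(edges):
    """Return True if the directed graph given by (x, y) pairs has a cycle."""
    graph = {}
    for x, y in edges:
        graph.setdefault(x, []).append(y)
        graph.setdefault(y, [])
    WHITE, GREY, BLACK = 0, 1, 2         # unvisited, on the DFS stack, done
    colour = dict.fromkeys(graph, WHITE)

    def visit(node):
        colour[node] = GREY
        for nxt in graph[node]:
            if colour[nxt] == GREY:      # back edge: node reaches itself
                return True
            if colour[nxt] == WHITE and visit(nxt):
                return True
        colour[node] = BLACK
        return False

    return any(colour[n] == WHITE and visit(n) for n in graph)
```

A query containing both a//b and b//a is rejected, because a would have to lie strictly above itself.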
19
Query Processing
  • Full form
  • Satisfiability
  • Redundant nodes
  • Canonical form

A node y is redundant if one of the following patterns occurs:
[Figure: patterns a) to d)]
20
Query Processing
  • Full form
  • Satisfiability
  • Redundant nodes
  • Canonical form

canonical form of satisfiable query = full form − IR2 − IR3 − redundant nodes
The canonical form of a query is a directed acyclic graph (dag).
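Assuming the canonical form drops whatever IR2 and IR3 could re-derive, the edge part of that step resembles a transitive reduction. The sketch below is one possible reading, with a hypothetical function name; it expects descendant edges that are already closed under IR2 and IR3, and it does not remove redundant nodes.

```python
def drop_derivable(child, desc):
    """Keep only descendant edges that IR2 and IR3 cannot re-derive.

    child, desc: sets of (x, y) pairs; desc is assumed to be acyclic
    and already closed under IR2 and IR3.
    """
    nodes = {n for edge in desc for n in edge}
    kept = set()
    for (x, y) in desc:
        if (x, y) in child:
            continue                     # follows from x/y by IR2
        if any((x, z) in desc and (z, y) in desc for z in nodes):
            continue                     # follows from x//z, z//y by IR3
        kept.add((x, y))
    return kept
```

For child edge r/a and descendant edges r//a, a//b, r//b, only a//b survives: r//a follows from the child edge and r//b from the chain through a.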
21
Evaluation of Partial Path Queries on XML Data
  • Partial path queries
  • Query processing
  • Query evaluation
  • Experiments
  • Conclusion

22
Evaluation Algorithms
  • Based on PathStack [Bruno et al. 02]
    1. Produce all possible path queries
    2. Decompose into root-to-leaf paths
    PartialMJ: decompose a spanning tree into paths
  • Extending PathStack [Bruno et al. 02]
    PartialPathStack: produce a topological order of
    the query nodes and extend PathStack to handle
    it

23
Based on PathStack
1. Producing all possible path queries
[Figure: partial path query with nodes r, a, b, c, d, e, f, g]
24
Based on PathStack
1. Producing all possible path queries
[Figure: four path queries over the nodes r, a, b, c, d, e, f, g]
25
Based on PathStack
1. Producing all possible path queries
26
Based on PathStack
1. Producing all possible path queries
Problems:
  • too many queries to evaluate
  • multiple traversals of the XML tree
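To get a feel for why the number of path queries grows quickly, the sketch below enumerates every total order of the query nodes that respects the query's edges. The query graph used here is hypothetical (the edges of the slide's figure are not recoverable from the transcript), the function name is an assumption, and the sketch ignores the child/descendant distinction as well as the possibility that two query nodes map to the same tree node.

```python
# Hypothetical query dag over the node names used on the slides.
NODES = ["r", "a", "b", "c", "d", "e", "f", "g"]
EDGES = [("r", "a"), ("a", "b"), ("a", "c"), ("b", "d"), ("c", "d"),
         ("d", "e"), ("d", "f"), ("e", "g"), ("f", "g")]

def linear_extensions(nodes, edges):
    """Yield every ordering of nodes in which x precedes y for each edge (x, y)."""
    preds = {n: set() for n in nodes}
    for x, y in edges:
        preds[y].add(x)

    def extend(order, placed):
        if len(order) == len(nodes):
            yield tuple(order)
            return
        for n in nodes:
            # n may come next once all of its predecessors are placed
            if n not in placed and preds[n] <= placed:
                yield from extend(order + [n], placed | {n})

    yield from extend([], frozenset())
```

This graph has two unordered pairs (b, c and e, f), so it yields 2 × 2 = 4 orders; each additional unordered pair multiplies the count again.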
27
Based on PathStack
2. Decomposing into root-to-leaf paths
28
Based on PathStack
2. Decomposing into root-to-leaf paths
PathStack
29
Based on PathStack
2. Decomposing into root-to-leaf paths
Problems:
  • path overlaps
  • more than one component to evaluate
  • intermediate results
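The decomposition itself is easy to sketch: follow every edge of the query graph from the root until a node without outgoing edges is reached. The graph below is hypothetical and the function name is an assumption; the point is only to show how much the resulting paths overlap.

```python
# Hypothetical query dag over the node names used on the slides.
EDGES = [("r", "a"), ("a", "b"), ("a", "c"), ("b", "d"), ("c", "d"),
         ("d", "e"), ("d", "f"), ("e", "g"), ("f", "g")]

def root_to_sink_paths(edges, root):
    """Yield every path from root to a node without outgoing edges."""
    succ = {}
    for x, y in edges:
        succ.setdefault(x, []).append(y)

    def walk(node, prefix):
        path = prefix + [node]
        if node not in succ:             # no outgoing edge: the path ends here
            yield path
        for nxt in succ.get(node, []):
            yield from walk(nxt, path)

    yield from walk(root, [])
```

For this graph the four resulting paths all share r, a, d and g, so the same query nodes would be matched repeatedly and the partial answers would have to be joined afterwards.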
30
Based on PathStack
PartialMJ. Using a spanning tree
Remove edges to create a spanning tree
31
Based on PathStack
PartialMJ. Using a spanning tree
32
Based on PathStack
PartialMJ. Using a spanning tree
PathStack
33
Based on PathStack
PartialMJ. Using a spanning tree
Join conditions (identity, structural, path)
37
Based on PathStack
PartialMJ. Using a spanning tree
Problems:
  • path overlaps
  • more than one component to evaluate
  • intermediate results
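The edge-removal step can be pictured as follows. This is a minimal sketch that assumes a breadth-first choice of tree edges, which may differ from how PartialMJ actually selects its spanning tree; the graph and the function name are hypothetical. Edges that are left out of the tree are the ones that would have to be re-checked as join conditions.

```python
from collections import deque

# Hypothetical query dag over the node names used on the slides.
EDGES = [("r", "a"), ("a", "b"), ("a", "c"), ("b", "d"), ("c", "d"),
         ("d", "e"), ("d", "f"), ("e", "g"), ("f", "g")]

def split_spanning_tree(edges, root):
    """Split edges into a spanning tree rooted at root and the removed rest.

    Assumes every node is reachable from root.
    """
    succ = {}
    for x, y in edges:
        succ.setdefault(x, []).append(y)
    tree, seen, queue = [], {root}, deque([root])
    while queue:
        x = queue.popleft()
        for y in succ.get(x, []):
            if y not in seen:            # first edge that reaches y is kept
                seen.add(y)
                tree.append((x, y))
                queue.append(y)
    removed = [e for e in edges if e not in tree]
    return tree, removed
```

On this graph the removed edges are (c, d) and (f, g): each of the nodes d and g keeps exactly one incoming edge.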
38
Extending PathStack
PartialPathStack. Employ a topological order
[Figure: partial path query with nodes r, a, b, c, d, e, f, g]
39
Extending PathStack
PartialPathStack. Employ a topological order
PartialPathStack
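A topological order of a query dag can be produced with Kahn's algorithm, sketched below. The slides do not say which method the authors use, so this is only one standard option; the graph and the function name are hypothetical.

```python
from collections import deque

# Hypothetical query dag over the node names used on the slides.
NODES = ["r", "a", "b", "c", "d", "e", "f", "g"]
EDGES = [("r", "a"), ("a", "b"), ("a", "c"), ("b", "d"), ("c", "d"),
         ("d", "e"), ("d", "f"), ("e", "g"), ("f", "g")]

def topological_order(nodes, edges):
    """Return the nodes ordered so that x precedes y for every edge (x, y)."""
    indegree = {n: 0 for n in nodes}
    succ = {n: [] for n in nodes}
    for x, y in edges:
        succ[x].append(y)
        indegree[y] += 1
    ready = deque(n for n in nodes if indegree[n] == 0)
    order = []
    while ready:
        n = ready.popleft()
        order.append(n)
        for m in succ[n]:
            indegree[m] -= 1
            if indegree[m] == 0:         # all predecessors of m are placed
                ready.append(m)
    if len(order) != len(nodes):
        raise ValueError("query graph has a cycle")
    return order
```

Any order produced this way lists a query node only after all of its predecessors, which is the property a single stack-based pass over the query nodes would rely on.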
40
PartialPathStack Example
[Animation, slides 40-66: each frame shows the query with its sink nodes, the XML tree (nodes r, a1, b1, c1, c2, d1, d2, e1, e2), and the stacks as they fill. OUTPUT is flagged three times, and each time one result is added.]
results
ra1b1d1c1e1
ra1b1d1c2e1
ra1b1d1c1e2
  • only one component to evaluate
  • no intermediate results
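For checking the answers of a small example by hand, a brute-force evaluator is often enough. The sketch below is not PartialPathStack: it simply tries every placement of the query nodes on every root-to-leaf path. It assumes that all query nodes must be matched on a single path of the tree; the tree, the query, and all names are hypothetical and unrelated to the example in the slides.

```python
from itertools import product

# Hypothetical data tree: (node name, label, children).
TREE = ("r1", "r", [
    ("a1", "a", [
        ("b1", "b", [("c1", "c", [])]),
        ("c2", "c", [("b2", "b", [])]),
    ]),
])

# Hypothetical partial path query: b and c both lie below a, in either order.
LABELS = {"r": "r", "a": "a", "b": "b", "c": "c"}
QUERY_EDGES = [("r", "a", "//"), ("a", "b", "//"), ("a", "c", "//")]

def paths(tree, prefix=()):
    """Yield every root-to-leaf path as a tuple of (name, label) pairs."""
    name, label, children = tree
    path = prefix + ((name, label),)
    if not children:
        yield path
    for child in children:
        yield from paths(child, path)

def evaluate(labels, edges, tree):
    """Return all matches, one tuple of node names per match."""
    qnodes = list(labels)
    results = set()
    for path in paths(tree):
        # positions on the path whose label fits each query node
        candidates = [[i for i, (_, lab) in enumerate(path) if lab == labels[q]]
                      for q in qnodes]
        for combo in product(*candidates):
            pos = dict(zip(qnodes, combo))
            ok = all((pos[y] == pos[x] + 1) if kind == "/" else (pos[y] > pos[x])
                     for x, y, kind in edges)
            if ok:
                results.add(tuple(path[pos[q]][0] for q in qnodes))
    return results
```

On this tree the query matches twice, once with b above c and once with c above b, which a single fully specified path query could not express.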
67
Evaluation Algorithms
68
PartialPathStack vs PathStack
  • PathStack
    • Path queries
    • Indegree 1
    • Outdegree 1
    • O(input + output)
  • PartialPathStack
    • Partial path queries
    • Indegree > 1
    • Outdegree > 1
    • O(input · indegree + output · outdegree)

69
Evaluation of Partial Path Queries on XML Data
  • Partial path queries
  • Query processing
  • Query evaluation
  • Experiments
  • Conclusion

70
Queries Used in the Experiments
Q1/Q5
Q2/Q6
Q3/Q7
Q4/Q8
71
Experiment 1
Execution time on Treebank
2.5 million nodes
72
Experiment 1
Execution time on Treebank
2.5 million nodes
path queries
73
Experiment 1
Execution time on Treebank
2.5 million nodes
too many results
74
Experiment 1
Execution time on Synthetic data
2.5 million nodes (IBM AlphaWorks XML generator)
75
Experiment 2
Execution time varying the size of the XML
tree (1 - 3 million nodes)
[Plots: Q2, Q3 and Q7, each comparing PartialMJ and PartialPathStack]
76
Evaluation of Partial Path Queries on XML Data
  • Partial path queries
  • Query processing
  • Query evaluation
  • Experiments
  • Conclusion

77
Conclusion
78
Questions?
  • Partial path queries
  • Query processing
  • Query evaluation
  • Experiments
  • Conclusion
