Instance-Independent Concurrency Control for Semistructured Databases

Seminarie Informatica UA, April 2004.

1
Instance-Independent Concurrency Control for
Semistructured Databases
  • Jan Paredaens, Jan Hidders and Stijn Dekeyser
  • ADReM research group, Universiteit Antwerpen

2
Problem Statement (1/4)
Concurrency Control for Semistructured Data?
  • Tree-shaped data
  • Access: additions, deletions, path expressions
  • Use the tree shape of the data; the tree shape is the data
  • Path locks on instance nodes

Instance-independent locking?
  • Instance-dependent locking leads to many locks
  • Instances are big, transactions small

3
Problem Statement (2/4)
  • Example: instance-dependent locking

[Figure: document tree with nodes labeled Doc. root, document, person, child, age, name, addr and hobby; the nodes are annotated with path locks such as //child//hobby, child//hobby, //hobby and hobby.]
4
Problem Statement (3/4)
[Figure: document tree with nodes Group, Person1 and Cycling and edge labels member and hobby.]
Transaction T3: Add(Group,member,Person2)
Transaction T4: Add(Person2,hobby,Cycling)
Schedule 4: T3: Add(Group,member,Person2), T4: Add(Person2,hobby,Cycling). Serial, defined.
Schedule 5: T4: Add(Person2,hobby,Cycling), T3: Add(Group,member,Person2). Not defined (for any document).
Schedule 6: T4: Add(Person2,hobby,Cycling). Defined (not defined for documents without Person2).
5
Problem Statement (4/4)
  • Some schedules are defined for some input documents, not for others
  • Some schedules are serializable for some input documents, not for others
  • Characterize the schedules for which there is at least one input document for which they are defined and that are serializable for all input documents for which they are defined.
  • Input documents have no DTD nor XML Schema
  • Schedules are given completely, not incrementally

6
Path expressions and the paths they represent
Let a, b, c be labels of edges.
  • a: L(a) = {a}
  • a/b: L(a/b) = {a/b}
  • a//b: L(a//b) = {a/b, a/a/b, a/b/b, a/c/b, a/c/a/b/c/b, …}
  • .: L(.) = {ε}
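
The membership test behind L(pe) can be sketched in Python as follows. This is only an illustration under assumptions: `/` concatenates labels, `//` between two labels allows zero or more arbitrary labels (which is what the listed examples suggest), and `.` is the empty path. The names `tokenize`, `matches` and `in_language` are hypothetical and do not come from the presentation.

```python
import re

GAP = object()  # marker for "//": zero or more arbitrary labels (assumed semantics)

def tokenize(pe):
    """Turn a path expression such as 'a//b' into ['a', GAP, 'b']."""
    if pe == ".":
        return []  # '.' stands for the empty path
    tokens = []
    for part in re.findall(r"//|/|[^/]+", pe):
        if part == "//":
            tokens.append(GAP)
        elif part != "/":
            tokens.append(part)
    return tokens

def matches(tokens, labels):
    """True if the label sequence is generated by the token sequence."""
    if not tokens:
        return not labels
    head, rest = tokens[0], tokens[1:]
    if head is GAP:
        # let the gap swallow 0, 1, 2, ... labels and try the rest
        return any(matches(rest, labels[i:]) for i in range(len(labels) + 1))
    return bool(labels) and labels[0] == head and matches(rest, labels[1:])

def in_language(pe, path):
    """path is a '/'-separated label string, '' for the empty path."""
    labels = path.split("/") if path else []
    return matches(tokenize(pe), labels)
```

For instance, `in_language("a//b", "a/c/a/b/c/b")` holds under these assumptions, while `in_language("a/b", "a/c/b")` does not.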
7
Queries, Additions, Deletions
  • XQuery
  • XUpdate
  • Query(n, pe)[DT] = {m | there is a path in the document tree DT from n to m that is labeled with a string of L(pe)}
  • Add(n, l, n')[DT] = DT ∪ {(n, l, n')}, only defined if the result is a document tree
  • Del(n, l, n')[DT] = DT − {(n, l, n')}, only defined if (n, l, n') is in DT and the result is a document tree
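
A minimal Python sketch of these three operations, assuming a document tree is stored as a root node plus a set of edges (n, l, n'). The function names, the `None` return value for "not defined", and the restriction of path expressions to `/` and infix `//` are illustrative assumptions. Treating the re-insertion of an existing edge as undefined is also an assumption.

```python
import re

def is_tree(root, edges):
    """Root has no parent, every other node has exactly one, all nodes reachable."""
    parent, children = {}, {}
    for n, _label, m in edges:
        if m == root or m in parent:
            return False
        parent[m] = n
        children.setdefault(n, []).append(m)
    seen, stack = {root}, [root]
    while stack:
        for m in children.get(stack.pop(), []):
            seen.add(m)
            stack.append(m)
    return seen == {root} | set(parent) | set(children)

def add(root, edges, edge):
    """Add(n, l, n'): returns the new edge set, or None if not defined."""
    if edge in edges:  # assumption: adding an existing edge is undefined
        return None
    new = set(edges) | {edge}
    return new if is_tree(root, new) else None

def delete(root, edges, edge):
    """Del(n, l, n'): returns the new edge set, or None if not defined."""
    if edge not in edges:
        return None
    new = set(edges) - {edge}
    return new if is_tree(root, new) else None

def query(edges, n, pe):
    """Nodes m reachable from n along a path whose labels spell a string of L(pe).

    Assumed semantics: 'x//y' is x, then zero or more labels, then y.
    """
    rx = re.compile("/(?:[^/]+/)*".join(re.escape(s) for s in pe.split("//")))
    result, stack = set(), [(n, "")]
    while stack:
        node, path = stack.pop()
        for src, label, dst in edges:
            if src == node:
                new_path = f"{path}/{label}" if path else label
                if rx.fullmatch(new_path):
                    result.add(dst)
                stack.append((dst, new_path))
    return result
```

With `edges = {(1,"a",2), (1,"b",3), (1,"a",4), (3,"a",5), (2,"b",6), (6,"c",7)}` and root 1, `add(1, edges, (2,"b",5))` and `delete(1, edges, (2,"b",6))` both come out as not defined, because the result would not be a tree.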

8
Action, Transaction, Schedule
An action is a pair (o, t) with o ∈ {Add, Del, Query} and t a transaction identifier.
A transaction is a sequence of actions with the same transaction identifier.
A schedule over a set of transactions is an interleaving of these transactions.
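
One way to make these notions concrete is the following Python sketch. The `Action` record and the helper names are hypothetical. The interleaving test relies on the observation that a sequence is an interleaving of a set of transactions exactly when its projection onto each transaction identifier gives back that transaction.

```python
from collections import namedtuple

# op is "Add", "Del" or "Query"; args are the operation's arguments; tid is the transaction id
Action = namedtuple("Action", "op args tid")

def transactions_of(schedule):
    """Project a schedule onto its transactions, keeping the order of actions."""
    txs = {}
    for act in schedule:
        txs.setdefault(act.tid, []).append(act)
    return txs

def is_schedule_over(schedule, transactions):
    """True if `schedule` is an interleaving of `transactions` (dict: tid -> list of actions)."""
    expected = {tid: list(acts) for tid, acts in transactions.items() if acts}
    return transactions_of(schedule) == expected

def is_serial(schedule):
    """True if the actions of each transaction are contiguous."""
    finished, current = set(), None
    for act in schedule:
        if act.tid != current:
            if act.tid in finished:
                return False
            finished.add(act.tid)
            current = act.tid
    return True
```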
9
Example
[Figure: document tree with the single edge 1 -a-> 2]
10
Example
(Add(1,b,3), t1)
[Figure: document tree with edges 1 -a-> 2, 1 -b-> 3]
11
Example
(Add(1,b,3), t1) (Add(1,a,4), t1)
[Figure: document tree with edges 1 -a-> 2, 1 -b-> 3, 1 -a-> 4]
12
Example
(Add(1,b,3), t1) (Add(1,a,4), t1) (Add(3,a,5), t2)
[Figure: document tree with edges 1 -a-> 2, 1 -b-> 3, 1 -a-> 4, 3 -a-> 5]
13
Example
(Add(1,b,3), t1) (Add(1,a,4), t1) (Add(3,a,5), t2)
(Query(1,a//), t1) → ∅
[Figure: document tree with edges 1 -a-> 2, 1 -b-> 3, 1 -a-> 4, 3 -a-> 5]
14
Example
(Add(1,b,3), t1) (Add(1,a,4), t1) (Add(3,a,5), t2)
(Query(1,a//), t1) (Add(2,b,5), t2)
[Figure: document tree with edges 1 -a-> 2, 1 -b-> 3, 1 -a-> 4, 3 -a-> 5]
15
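Example
(Add(1,b,3), t1) (Add(1,a,4), t1) (Add(3,a,5), t2)
(Query(1,a//), t1) (Add(2,b,5), t2)
Add(2,b,5) is not defined here: node 5 is already in the tree as a child of 3, so the result would not be a document tree.
[Figure: document tree with edges 1 -a-> 2, 1 -b-> 3, 1 -a-> 4, 3 -a-> 5]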
16
Example
(Add(1,b,3), t1) (Add(1,a,4), t1) (Add(3,a,5), t2)
(Query(1,a//), t1) (Add(2,b,6), t1)
[Figure: document tree with edges 1 -a-> 2, 1 -b-> 3, 1 -a-> 4, 3 -a-> 5, 2 -b-> 6]
17
Example
(Add(1,b,3), t1) (Add(1,a,4), t1) (Add(3,a,5), t2)
(Query(1,a//), t1) (Add(2,b,6), t1) (Add(6,c,7), t3)
[Figure: document tree with edges 1 -a-> 2, 1 -b-> 3, 1 -a-> 4, 3 -a-> 5, 2 -b-> 6, 6 -c-> 7]
18
Example
(Add(1,b,3), t1) (Add(1,a,4), t1) (Add(3,a,5), t2)
(Query(1,a//), t1) (Add(2,b,6), t1) (Add(6,c,7), t3)
(Query(1,a//), t1) → {6, 7}
[Figure: document tree with edges 1 -a-> 2, 1 -b-> 3, 1 -a-> 4, 3 -a-> 5, 2 -b-> 6, 6 -c-> 7]
19
Example
(Add(1,b,3), t1) (Add(1,a,4), t1) (Add(3,a,5), t2)
(Query(1,a//), t1) (Add(2,b,6), t1) (Add(6,c,7), t3)
(Query(1,a//), t1) (Del(1,b,4), t3)
[Figure: document tree with edges 1 -a-> 2, 1 -b-> 3, 1 -a-> 4, 3 -a-> 5, 2 -b-> 6, 6 -c-> 7]
20
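Example
(Add(1,b,3), t1) (Add(1,a,4), t1) (Add(3,a,5), t2)
(Query(1,a//), t1) (Add(2,b,6), t1) (Add(6,c,7), t3)
(Query(1,a//), t1) (Del(1,b,4), t3)
Del(1,b,4) is not defined here: the edge (1,b,4) is not in the tree, since the edge from 1 to 4 is labeled a.
[Figure: document tree with edges 1 -a-> 2, 1 -b-> 3, 1 -a-> 4, 3 -a-> 5, 2 -b-> 6, 6 -c-> 7]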
21
Example
(Add(1,b,3), t1) (Add(1,a,4), t1) (Add(3,a,5), t2)
(Query(1,a//), t1) (Add(2,b,6), t1) (Add(6,c,7), t3)
(Query(1,a//), t1) (Del(2,b,6), t1)
[Figure: document tree with edges 1 -a-> 2, 1 -b-> 3, 1 -a-> 4, 3 -a-> 5, 2 -b-> 6, 6 -c-> 7]
22
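Example
(Add(1,b,3), t1) (Add(1,a,4), t1) (Add(3,a,5), t2)
(Query(1,a//), t1) (Add(2,b,6), t1) (Add(6,c,7), t3)
(Query(1,a//), t1) (Del(2,b,6), t1)
Del(2,b,6) is not defined here: removing the edge would disconnect nodes 6 and 7, so the result would not be a document tree.
[Figure: document tree with edges 1 -a-> 2, 1 -b-> 3, 1 -a-> 4, 3 -a-> 5, 2 -b-> 6, 6 -c-> 7]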
23
Example
(Add(1,b,3), t1) (Add(1,a,4), t1) (Add(3,a,5), t2)
(Query(1,a//), t1) (Add(2,b,6), t1) (Add(6,c,7), t3)
(Query(1,a//), t1) (Del(6,c,7), t1)
[Figure: document tree with edges 1 -a-> 2, 1 -b-> 3, 1 -a-> 4, 3 -a-> 5, 2 -b-> 6]
24
Example
(Add(1,b,3), t1) (Add(1,a,4), t1) (Add(3,a,5), t2)
(Query(1,a//), t1) (Add(2,b,6), t1) (Add(6,c,7), t3)
(Query(1,a//), t1) (Del(6,c,7), t1)
(Query(1,a//), t2) → {6}
[Figure: document tree with edges 1 -a-> 2, 1 -b-> 3, 1 -a-> 4, 3 -a-> 5, 2 -b-> 6]
25
Defined, correct, equivalence (1/?)
A schedule S is called defined on a document tree DT iff the sequence of actions (Adds and Dels) of S is defined on DT.
A schedule S is called correct if there is at least one DT on which S is defined.
Two correct schedules S1 and S2 over the same set of transactions are called equivalent on DT if they are both defined on DT, S1[DT] = S2[DT] and the corresponding queries give the same result.
26
Defined, correct, equivalence (2/?)
Two correct schedules over the same set of transactions are called equivalent if they are defined on the same set of DTs and they are equivalent on these DTs.
A schedule is called serializable if it is equivalent with a serial schedule.
S1 = (Add(1,a,2), t1) (Del(1,a,2), t2) (Add(1,a,2), t1)
S1 is correct.
S1 is not serializable since t1 is not correct.
27
Example
[Figure: three document trees. DT1: 1 -a-> 2. DT2: 1 -b-> 2. DT3: the single node 1.]
S1 = (Add(2,b,3), t1) (Query(1,a/b), t2)
S2 = (Query(1,a/b), t2) (Add(2,b,3), t1)
S1[DT1] ≢ S2[DT1]
S1[DT2] ≡ S2[DT2]
S1[DT3] and S2[DT3] are not defined
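
The three cases can be replayed with a small Python sketch. It assumes a particular reading of the figure that the text does not spell out: DT1 is the tree 1 -a-> 2, DT2 is 1 -b-> 2 and DT3 is the single node 1. The helpers `run` and `equivalent_on` are hypothetical names. Add is treated as defined when the parent exists and the child is new, Del when the edge exists and the child is a leaf, which for trees coincides with "the result is a document tree". Queries are limited to plain label paths such as a/b.

```python
def nodes(root, edges):
    return {root} | {m for _, _, m in edges}

def run(root, edges, schedule):
    """Apply a schedule of (op, args, tid) triples to the tree (root, edges).

    Returns (final_edges, query_results) or None if some action is undefined.
    Query results are keyed by (tid, index of the action inside its transaction).
    """
    edges, results, counts = set(edges), {}, {}
    for op, args, tid in schedule:
        k = counts.get(tid, 0)
        counts[tid] = k + 1
        if op == "Add":
            n, _label, m = args
            present = nodes(root, edges)
            if n not in present or m in present:
                return None
            edges.add(args)
        elif op == "Del":
            _n, _label, m = args
            if args not in edges or any(src == m for src, _, _ in edges):
                return None
            edges.remove(args)
        else:  # Query(start, pe) with pe a plain label path
            start, pe = args
            frontier = {start}
            for label in pe.split("/"):
                frontier = {m for (n, l, m) in edges if n in frontier and l == label}
            results[(tid, k)] = frozenset(frontier)
    return edges, results

def equivalent_on(root, edges, s1, s2):
    """Both defined, same resulting tree, same answers to corresponding queries."""
    r1, r2 = run(root, edges, s1), run(root, edges, s2)
    return r1 is not None and r2 is not None and r1 == r2

S1 = [("Add", (2, "b", 3), "t1"), ("Query", (1, "a/b"), "t2")]
S2 = [("Query", (1, "a/b"), "t2"), ("Add", (2, "b", 3), "t1")]
```

Under these assumptions `equivalent_on(1, {(1, "a", 2)}, S1, S2)` is false, because the query sees node 3 only in S1, while on `{(1, "b", 2)}` both schedules agree, and on the empty edge set neither is defined.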
28
Example
S1 = (Add(2,b,3), t1) (Query(1,a/b), t2)
S2 = (Query(1,a/b), t2) (Add(2,b,3), t1)
S1 and S2 are defined on the same set of DTs and are not (necessarily) equivalent on these DTs.
S3 = (Add(2,b,3), t1) (Add(2,b,4), t2)
S4 = (Add(2,b,4), t2) (Add(2,b,3), t1)
S3 and S4 are defined on the same set of DTs and are equivalent on these DTs.
S5 = (Add(2,b,3), t1) (Del(2,b,3), t2)
S6 = empty
S5 and S6 are not defined on the same set of DTs.
29
Example
S1                    S2
(Add(1, a, 2), t1)    (Add(1, b, 3), t2)
NOT EQUIVALENT
30
Example
S1                    S2
(Add(1, a, 2), t1)    (Add(1, b, 3), t2)
(Add(1, b, 3), t2)    (Add(1, a, 2), t1)
EQUIVALENT
31
Example
S1                    S2
(Add(1, a, 2), t1)    (Add(1, b, 3), t2)
(Add(1, b, 3), t2)    (Add(1, a, 2), t1)
(Del(4, c, 5), t1)    (Del(4, c, 5), t1)
EQUIVALENT
32
Example
S1                    S2
(Add(1, a, 2), t1)    (Add(1, b, 3), t2)
(Add(1, b, 3), t2)    (Add(1, a, 2), t1)
(Del(4, c, 5), t1)    (Del(4, c, 5), t1)
(Del(4, c, 6), t2)    (Del(4, c, 7), t1)
NOT EQUIVALENT
33
Example
S1                    S2
(Add(1, a, 2), t1)    (Add(1, b, 3), t2)
(Add(1, b, 3), t2)    (Add(1, a, 2), t1)
(Del(4, c, 5), t1)    (Del(4, c, 5), t1)
(Del(4, c, 6), t2)    (Del(4, c, 7), t1)
(Del(4, c, 7), t1)    (Del(4, c, 6), t2)
EQUIVALENT
34
Example
S1                    S2
(Add(1, a, 2), t1)    (Add(1, b, 3), t2)
(Add(1, b, 3), t2)    (Add(1, a, 2), t1)
(Del(4, c, 5), t1)    (Del(4, c, 5), t1)
(Del(4, c, 6), t2)    (Del(4, c, 7), t1)
(Del(4, c, 7), t1)    (Del(4, c, 6), t2)
(Add(2, c, 8), t1)
NOT EQUIVALENT
35
Example
S1                    S2
(Add(1, a, 2), t1)    (Add(1, b, 3), t2)
(Add(1, b, 3), t2)    (Add(1, a, 2), t1)
(Del(4, c, 5), t1)    (Del(4, c, 5), t1)
(Del(4, c, 6), t2)    (Del(4, c, 7), t1)
(Del(4, c, 7), t1)    (Del(4, c, 6), t2)
(Add(2, c, 8), t1)    (Query(1, b), t2)
(Query(1, b), t2)
NOT EQUIVALENT
36
Example
S1                    S2
(Add(1, a, 2), t1)    (Add(1, b, 3), t2)
(Add(1, b, 3), t2)    (Add(1, a, 2), t1)
(Del(4, c, 5), t1)    (Del(4, c, 5), t1)
(Del(4, c, 6), t2)    (Del(4, c, 7), t1)
(Del(4, c, 7), t1)    (Del(4, c, 6), t2)
(Add(2, c, 8), t1)    (Query(1, b), t2)
(Query(1, b), t2)     (Add(2, c, 8), t1)
EQUIVALENT
37
Example
S1                    S2
(Add(1, a, 2), t1)    (Add(1, b, 3), t2)
(Add(1, b, 3), t2)    (Add(1, a, 2), t1)
(Del(4, c, 5), t1)    (Del(4, c, 5), t1)
(Del(4, c, 6), t2)    (Del(4, c, 7), t1)
(Del(4, c, 7), t1)    (Del(4, c, 6), t2)
(Add(2, c, 8), t1)    (Query(1, b), t2)
(Query(1, b), t2)     (Add(2, c, 8), t1)
(Query(1, a), t1)     (Query(1, a), t1)
EQUIVALENT
38
Results (1/2)
Is it decidable whether a given transaction is correct?
Is it decidable whether a given schedule is correct?
Is it decidable whether two given transactions are equivalent?
Is it decidable whether two given schedules are equivalent?
Is it decidable whether a given schedule is serializable?
39
Results (2/2)
Is it decidable whether a given transaction is correct? YES!
Is it decidable whether a given schedule is correct? YES!
Is it decidable whether two given transactions are equivalent? YES!
Is it decidable whether two given schedules are equivalent? YES!
Is it decidable whether a given schedule is serializable? YES!
40
Correctness of queryless schedules (1/2)
  • Correctness has nothing to do with queries
  • Consider queryless schedules (QL schedules).
  • The following conditions are necessary and sufficient for correct QL schedules:
  • Between (Add(n,a,n1),t1) and (Add(n2,b,n),t2) there is (Del(n,a,n1),t3)
  • Between (Add(n1,a,n),t1) and (Add(n2,b,n),t2) there is (Del(n1,a,n),t3)
  • Between (Add(n,a,n1),t1) and (Del(n2,b,n),t2) there is (Del(n,a,n1),t3)
  • Between (Add(n1,a,n),t1) and (Del(n,b,n2),t2) there is (Add(n,b,n2),t3)
  • Between (Add(n1,a,n),t1) and (Del(n2,b,n),t2) there is (Del(n1,a,n),t3), (n1,a) ≠ (n2,b)
  • Between (Del(n,a,n1),t1) and (Add(n2,b,n),t2) there is (Del(n3,c,n),t3)
  • Between (Del(n1,a,n),t1) and (Add(n,b,n2),t2) there is (Add(n3,c,n),t3)
  • Between (Del(n1,a,n),t1) and (Del(n,b,n2),t2) there is (Add(n3,c,n),t3)
  • Between (Del(n1,a,n),t1) and (Del(n2,b,n),t2) there is (Add(n2,b,n),t3)

41
Correctness of queryless schedules (2/2)
  • It is decidable whether a schedule (a transaction) is correct in O(n^3) time, n being the length of the schedule (transaction), and constant space.
  • S[DT] = (DT ∪ ADD(S)) − DEL(S) if S is defined on DT
  • ADD(S) = the edges e whose last occurrence in S is Add(e)
  • DEL(S) = the edges e whose last occurrence in S is Del(e)
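
The sets ADD(S) and DEL(S) and the resulting tree can be computed in one pass, as in this Python sketch. It assumes a queryless schedule given as a list of (op, edge) pairs. The function names are hypothetical, and definedness of S on DT is taken for granted rather than checked.

```python
def add_del_sets(schedule):
    """Return (ADD(S), DEL(S)) for a list of (op, edge) with op in {"Add", "Del"}."""
    last_op = {}
    for op, edge in schedule:
        last_op[edge] = op  # later occurrences overwrite earlier ones
    added = {e for e, op in last_op.items() if op == "Add"}
    deleted = {e for e, op in last_op.items() if op == "Del"}
    return added, deleted

def result_tree(dt, schedule):
    """Edge set of S[DT], assuming S is defined on DT (not verified here)."""
    added, deleted = add_del_sets(schedule)
    return (set(dt) | added) - deleted
```

Because an edge has only one last occurrence, ADD(S) and DEL(S) are disjoint, so the order of the union and the difference does not matter.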

42
Equivalence of correct QL schedules (1/5)
Let S1 be correct and equivalent with the serial S2. We cannot necessarily go from S1 to S2 by swapping actions.
S1                  S2
(Add(1,a,2),t1)     (Add(1,a,2),t1)
(Del(1,a,2),t2)     (Add(1,b,3),t1)
(Add(1,b,3),t2)     (Del(1,b,3),t1)
(Del(1,b,3),t2)     (Del(1,a,2),t2)
(Add(1,b,3),t1)     (Add(1,b,3),t2)
(Del(1,b,3),t1)     (Del(1,b,3),t2)
43
Equivalence of correct QL schedules (2/5)
Let S1 be correct and equivalent with the serial S2. We cannot necessarily go from S1 to S2 by swapping actions.
S1                  S2
(Add(1,a,2),t1)     (Add(1,a,2),t1)
(Del(1,a,2),t2)     (Add(1,b,3),t1)
(Add(1,b,3),t2)     (Del(1,b,3),t1)
(Del(1,b,3),t2)     (Del(1,a,2),t2)
(Add(1,b,3),t1)     (Add(1,b,3),t2)
(Del(1,b,3),t1)     (Del(1,b,3),t2)
44
Equivalence of correct QL schedules (3/5)
Let S1 be correct and equivalent with the serial S2. We cannot necessarily go from S1 to S2 by swapping actions.
S1                  S2
(Add(1,a,2),t1)     (Add(1,a,2),t1)
(Del(1,a,2),t2)     (Add(1,b,3),t1)
(Add(1,b,3),t2)     (Del(1,b,3),t1)
(Del(1,b,3),t2)     (Del(1,a,2),t2)
(Add(1,b,3),t1)     (Add(1,b,3),t2)
(Del(1,b,3),t1)     (Del(1,b,3),t2)
Remark that S1 is not equivalent with the other serial schedule S3.
45
Equivalence of correct QL schedules (4/5)
  • N_I(S): the nodes that must belong to DTs on which S is defined:
    nodes m whose first occurrence has the form Add(m,l,n), Del(m,l,n) or Del(n,l,m)
  • N^-_I(S): the nodes that may not belong to DTs on which S is defined:
    nodes m whose first occurrence has the form Add(n,l,m)
  • E_I(S): the edges that must belong to DTs on which S is defined:
    edges e whose first occurrence has the form Del(e)
  • E^-_I(S): the edges that may not belong to DTs on which S is defined:
    see paper
  • N_I(S), N^-_I(S), E_I(S) and E^-_I(S) are correct
  • N_I(S), N^-_I(S), E_I(S) and E^-_I(S) can be calculated in O(n^2) time and O(n) space
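
The three input sets that are defined on this slide can be read off from first occurrences, as in the following Python sketch. It assumes a queryless schedule given as a list of (op, (n, l, m)) pairs and does not attempt the fourth set (the forbidden edges), whose definition is only referenced, not given. The function name and the return order are hypothetical.

```python
def input_sets(schedule):
    """Return (required nodes, forbidden nodes, required edges) of the input tree.

    schedule: list of (op, (n, l, m)) with op in {"Add", "Del"}.
    """
    must_nodes, must_not_nodes, must_edges = set(), set(), set()
    seen_nodes, seen_edges = set(), set()
    for op, (n, l, m) in schedule:
        edge = (n, l, m)
        if n not in seen_nodes:
            # first seen as the parent of an Add or Del: must already exist
            must_nodes.add(n)
        if m not in seen_nodes and m != n:
            if op == "Del":
                must_nodes.add(m)        # first seen as the child in a Del
            else:
                must_not_nodes.add(m)    # first seen as the child in an Add
        seen_nodes.update((n, m))
        if edge not in seen_edges and op == "Del":
            must_edges.add(edge)         # first occurrence of the edge is Del(e)
        seen_edges.add(edge)
    return must_nodes, must_not_nodes, must_edges
```

For the schedule `[("Del", (1, "a", 2)), ("Add", (1, "b", 3))]` this gives required nodes {1, 2}, forbidden nodes {3} and required edges {(1, "a", 2)}.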

46
Equivalence of correct QL schedules (5/5)
  • S1 and S2, QL transactions or schedules over the same set of transactions, are equivalent iff
  • - N_I(S1) = N_I(S2)
  • - N^-_I(S1) = N^-_I(S2)
  • - E_I(S1) = E_I(S2)
  • - E^-_I(S1) = E^-_I(S2)
  • The equivalence of two QL transactions or schedules over the same set of transactions can be decided in O(n^2) time and O(n) space.

47
Output Sets vs. Input Sets (1/2)
  • N_O(S): the nodes that must belong to S[DT]:
    nodes m whose last occurrence has the form Add(m,l,n), Del(m,l,n) or Add(n,l,m)
  • N^-_O(S): the nodes that may not belong to S[DT]:
    nodes m whose last occurrence has the form Del(n,l,m)
  • E_O(S): the edges that must belong to S[DT]:
    edges e whose last occurrence has the form Add(e)
  • E^-_O(S): the edges that may not belong to S[DT]:
    see paper
  • N_O(S), N^-_O(S), E_O(S) and E^-_O(S) are correct
  • N_O(S), N^-_O(S), E_O(S) and E^-_O(S) can be calculated in O(n^2) time and O(n) space
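
The output sets mirror the input sets, with last occurrences in place of first ones. The Python sketch below covers the three sets defined on this slide, again for a queryless schedule given as (op, (n, l, m)) pairs. The forbidden output edges are left out because their definition is only referenced, and the function name is hypothetical.

```python
def output_sets(schedule):
    """Return (nodes present in the result, nodes absent from it, edges present in it).

    schedule: list of (op, (n, l, m)) with op in {"Add", "Del"}.
    """
    node_present = {}  # node -> True/False according to its last occurrence
    last_edge_op = {}
    for op, (n, l, m) in schedule:
        node_present[n] = True              # parent of an Add or Del survives
        if m != n:
            node_present[m] = (op == "Add")  # child of a Del is removed
        last_edge_op[(n, l, m)] = op
    present = {x for x, ok in node_present.items() if ok}
    absent = {x for x, ok in node_present.items() if not ok}
    edges = {e for e, op in last_edge_op.items() if op == "Add"}
    return present, absent, edges
```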

48
Output Sets vs. Input Sets (2/2)
  • If S1 and S2 are correct transactions or schedules then S1.S2 is correct iff
  • N^-_O(S1) ∩ N_I(S2) = ∅, E^-_O(S1) ∩ E_I(S2) = ∅,
  • N_O(S1) ∩ N^-_I(S2) = ∅, E_O(S1) ∩ E^-_I(S2) = ∅
  • If S1, S2, …, Sk and S1.S2…Sk are k+1 correct schedules then
  • N_I(S1…Sk) = ⋃_{i=1..k} (N_I(Si) − ⋃_{j<i} N^-_I(Sj))
  • N^-_I(S1…Sk) = ⋃_{i=1..k} (N^-_I(Si) − ⋃_{j<i} N_I(Sj))
  • E_I(S1…Sk) = ⋃_{i=1..k} (E_I(Si) − ⋃_{j<i} E^-_I(Sj))
  • E^-_I(S1…Sk) = ⋃_{i=1..k} (E^-_I(Si) − ⋃_{j<i} E_I(Sj))
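
Both rules are plain set manipulations once the input and output sets are available. The Python sketch below takes those sets as given (it does not compute them). It reads the four conditions as empty intersections and the combination rule as a running union with earlier sets subtracted, which is one reading of the formulas on this slide. The dictionary keys "N", "N-", "E", "E-" and the function names are illustrative choices.

```python
def concatenation_is_correct(out1, in2):
    """Check the four disjointness conditions for S1.S2.

    out1: output sets of S1, in2: input sets of S2, each a dict with the keys
    "N" (required nodes), "N-" (forbidden nodes), "E" and "E-" (same for edges).
    """
    return (not (out1["N-"] & in2["N"]) and not (out1["E-"] & in2["E"])
            and not (out1["N"] & in2["N-"]) and not (out1["E"] & in2["E-"]))

def combined_input_set(wanted, opposite):
    """Union over i of (wanted[i] minus the union of opposite[j] for j < i).

    For the required input nodes, wanted[i] is the required-node set of S_i and
    opposite[i] its forbidden-node set; swap the roles for the forbidden nodes.
    """
    result, earlier = set(), set()
    for w, o in zip(wanted, opposite):
        result |= w - earlier
        earlier |= o
    return result
```

As a usage example, if S1 creates node 3 (so 3 is forbidden in its input) and S2 requires node 3, `combined_input_set([{1}, {1, 3}], [{3}, set()])` yields {1}: node 3 is no longer required of the input of S1.S2.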

49
Main Results
  • Given a QL schedule S of k transactions and n actions. It is decidable whether S is serializable in time O(f(k)·n^3), where f(k) can be exponential in k, and in space O(k·n).
  • Given a correct schedule S of k transactions and n actions. It is decidable whether S is serializable in time O(f(k)·n^6), where f(k) can be exponential in k, and in space O(n^2).

to be continued