ICS 214B: Transaction Processing and Distributed Data Management - PowerPoint PPT Presentation

1 / 67
About This Presentation
Title:

ICS 214B: Transaction Processing and Distributed Data Management

Description:

Three choices: (1) Pr = { } F1 ={ E } (2) Pr = {LOC=Sa, LOC=Sb} F2={ loc ... Query processing in distributed databases. ICS214B. Notes 09. 33. Query. Query Plan ... – PowerPoint PPT presentation

Number of Views:37
Avg rating:3.0/5.0
Slides: 68
Provided by: che7
Category:

less

Transcript and Presenter's Notes

Title: ICS 214B: Transaction Processing and Distributed Data Management


1
ICS 214B Transaction Processing and Distributed
Data Management
  • Lecture 9 Fragmentation and Distributed Query
    Processing
  • Professor Chen Li

2
Which simple predicates should we use in Pr?
  • Desired property of Pr
  • - minimality
  • - uniformity

3
Return to example
  • E(, NM, LOC, SAL,)
  • Common queries
  • Qa select Qb select
  • from E from E
  • where LOCSa where LOCSb
  • and and ...

4
Three choices
  • (1) Pr F1 E
  • (2) Pr LOCSa, LOCSb
  • F2 ? locSa E, ? locSb E
  • (3) Pr LOCSa, LOCSb, Sallt10
  • F3 ?locSa ? sallt10 E, ?locSa ? sal?10 E,
  • ?locSb ? sallt10E, ?locSb ? sal?10 E

5
In other words
Qa Select loc Sa ...
LocSa ? sal lt 10
Qb Select loc Sb ...
LocSa ? sal ? 10
F3
F1
F2
LocSb ? sal lt 10
LocSb ? sal ? 10
6
Informal definition
  • Set of predicates Pr is uniform if
  • for every Fi ? FPr, every t ? Fi has equal
    probability of access by every major application.
  • Note FPr is fragmentation defined by minterm
    predicates generated by Pr.

7
Back to example
Qa Select loc Sa ...
LocSa ? sal lt 10
tuples here have higher probability of access
Qb Select loc Sb ...
LocSa ? sal ? 10
F1
LocSb ? sal lt 10
tuples here have lower probability of access
LocSb ? sal ? 10
so F1 is not good...
8
Back to example
Qa Select loc Sa ...
LocSa ? sal lt 10
tuples here have same probability of access
Qb Select loc Sb ...
LocSa ? sal ? 10
F2
LocSb ? sal lt 10
so F2 is good...
so is F3 ...
LocSb ? sal ? 10
9
Informal definition
  • Set of predicates Pr is minimal if no
  • Pr ? Pr is uniform

10
Back to example
uniform?
  • (1) Pr N
  • (2) Pr LOCSa, LOCSb Y
  • (3) Pr LOCSa, LOCSb, Sallt10 N

Pr(2) is a subset of Pr(3), so Pr(3) is not
minimal...
11
Is Pr uniform and minimal a good thing?
  • Not necessarily! But it does simplify allocation
    problem...

12
Derived horizontal fragmentation
  • E(ENO, NAME, SAL, LOC)
  • J(ENO, DESCRIPTION,)

E ? F E1, E2 by LOC
Common query Given an employee name, list
projects (s)he works in
13
E2
E1
(at Sa)
(at Sb)
J
14
E2
E1
(at Sa)
(at Sb)
J2
J1
15
Derived horizontal fragmentation
  • R, F F1, F2, ... Fn
  • ?
  • S, D D1, D2, Dn where Di S Fi
  • Convention R is owner
  • S is member

F could be primary or derived
16
Checking completeness and disjointness of
derived fragmentation
Example Say J is
  • ? But no 33 in E1 nor in E2!

This J tuple will not be in J1 nor
J2 ?Fragmentation not complete
17
  • To get completeness Need to enforce
  • referential integrity constraint
  • join attr() of member relation
  • ?
  • join attr() of owner relation

18
Example
E2
E1
Fragmentation is not disjoint!
J
J1
J2
19
  • To get disjointness Join attribute() should be
    key of owner relation

20
Summary horizontal fragmentation
  • Type primary, derived
  • Properties completeness, disjointness
  • Predicates minimal, uniform

21
Vertical fragmentation
Example
E
E2
E1
22
  • RT ? R1T1 Ti ? T
  • RnTn
  • Just like normalization of relations

...
23
Properties RT ? RiTi
  • (1) Completeness
  • U Ti T

all i
24
  • (2) Disjointness
  • Ti ? Tj ? for all i,j i?j
  • E(,LOC,SAL)

E1(,LOC)
E2(SAL)
25
  • (3) Reconstruction Lossless join
  • Ri R

all i
? One way to achieve lossless join Repeat
key in all fragments, i.e., Key ? Ti for all i
26
Hybrid Fragmentation
R
Horizontal
R2
R1
Vertical
R22
R12
R21
R11
27
Hybrid Fragmentation -- Reconstruction
U
Horizontal
Vertical
R22
R12
R21
R11
28
Allocation
  • Example E(,NM,LOC,SAL) ?
  • F1 ?locSa E F2 ?locSb E
  • Qa select where locSa...
  • Qb select where locSb

Where do F1,F2 go?
Site b
Site a
?
29
Issues
  • Where do queries originate?
  • What is communication cost? and size of
    answers, relations,
  • What is storage capacity, cost at sites? and size
    of fragments?
  • What is processing power at sites?
  • What is query processing strategy?
  • How are joins done?
  • Where are answers collected?

30
Do we replicate fragments?
  • Cost of updating copies?
  • Writes and concurrency control?
  • ...

31
Optimization problem
  • What is best placement of fragments and/or best
    number of copies to
  • minimize query response time
  • maximize throughput
  • minimize some cost
  • ...
  • Subject to constraints?
  • Available storage
  • Available bandwidth, power,
  • Keep 90 of response time below X
  • ...
  • Often, can use common sense
  • Place fragments where they are most heavily
    accessed

32
Summary
  • Horizontal and vertical fragmentation
  • Designing good fragmentations and allocation
  • Next
  • Query processing in distributed databases

33
Query
Query Plan
34
Decomposition
  • Same as in centralized system
  • Normalization
  • Eliminating redundancy
  • Algebraic rewriting

35
Normalization
  • Convert from query language to relational algebra

36
Example
  • SELECT R.A, S.D
  • FROM R, S
  • WHERE (R.B1 and S.C2) and (R.A S.A)

37
Eliminate redundancy
  • E.g. in conditions
  • (S.A1) ? (S.Agt5) ? False
  • (S.Alt10) ? (S.Alt5) ? S.Alt5

38
E.g. Common sub-expressions
  • U U
  • S ?cond ?cond T S ?cond T
  • R R R

39
Algebraic rewriting
  • E.g. Push conditions down
  • ?cond3
  • ?cond
  • ?cond1 ?cond2
  • R S R S

40
Query
41
Localization steps
  • (1) Start with query tree
  • (2) Replace relations by fragments
  • (3) Push ? up
  • ?,? down
  • (4) Simplify eliminate unnecessary operations

42
Notation for fragment
  • R cond
  • fragment conditions its tuples
    satisfy

43
Example A
  • (1) ?E3
  • R

44
  • (2) ?E3
  • ?
  • R1 E lt 10 R2 E ? 10

45
  • (3) ?
  • ?E3 ?E3
  • R1 E lt 10 R2 E ? 10

46
  • (4) ?E3
  • R1 E lt 10

47
Rule 1
  • ?C1R c2 ? ?C1R c1 ? c2
  • R False ? Ø

A
B
48
In example A
  • ?E3R2 E?10 ? ?E3 R2 E3 ? E?10
  • ? ?E3 R2 False
  • ? Ø

49
Example B
  • (1) Acommon
  • attribute
  • R S

A
50
  • (2)
  • ? ?

R1 Alt5 R2 5 ? A ? 10 R3 Agt10
S1 Alt5 S2 A ? 5
51
  • (3) ?
  • R1Alt5S1Alt5 R1Alt5S2A?5
    R25?A?10S1Alt5
  • R25?A?10S2A?5 R3Agt10S1Alt5
    R3Agt10S2A?5

52
  • (4) ?
  • R1Alt5S1Alt5 R25?A?10S2A?5

R3Agt10S2A?5
53
Rule 2
  • R C1 S C2 ?
  • R S C1 ? C2 ? R.A S.A

54
In step 4 of Example B
  • R1 Alt5 S2 A ? 5
  • ?R1 S2 R1.A lt 5 ? S2.A ? 5 ?
    R1.A S2.A
  • ?R1 S2 False ? Ø

55
Localization with derived fragmentation
  • Example C
  • (2)
  • ? ?
  • R1Alt10R2?10 S1KR.K S2KR.K
  • ?R.Alt10 ?R.A?10

56
  • (3) ?
  • R1S1 R1S2 R2S1 R2S2

57
  • (4) ?
  • R1S1
    R2S2

58
In step 3 of Example C
  • R1Alt10 S2KR.K ? R.A?10
  • ? R1 S2 R1.Alt10 ? S2.KR.K ?
    R.A?10 ? R1.K S2.K
  • ? R1 S2False (K is key of R, R1)
  • ? Ø

59
Localization with vertical fragmentation
  • Example D
  • (1) ?A R1(K, A, B)
  • R R2(K, C, D)

60
  • (2) ?A
  • R1 R2
  • (K, A, B) (K, C, D)

61
  • (3) ?A
  • ?K,A ?K,A
  • R1 R2
  • (K, A, B) (K, C, D)

not really needed
62
  • (4) ?A
  • R1
  • (K, A, B)

63
Rule 3
  • Given vertical fragmentation of R
  • Ri ?Ai (R), Ai ? A
  • Then for any B ? A
  • ?B (R) ?B Ri B ? Ai ? Ø

64
Localization with hybrid fragmentation
  • Example E
  • R1 ?klt5 ?k,A R
  • R2 ?k?5 ?k,A R
  • R3 ?k,B R

65
  • Query ?A
  • ?k3
  • R

66
  • Reduced
  • Query ?A
  • ?k3
  • R1

67
Distributed Query Processing
  • Decomposition ?
  • Localization ?
  • Optimization ? next
Write a Comment
User Comments (0)
About PowerShow.com