CS 245 Database System Principles - PowerPoint PPT Presentation

1 / 39
About This Presentation
Title:

CS 245 Database System Principles

Description:

(b) Coalesce with neighbor (sibling) (c) Re-distribute keys (d) ... Often, coalescing is not implemented. Too hard and not worth it! 30. Interesting problem: ... – PowerPoint PPT presentation

Number of Views:22
Avg rating:3.0/5.0
Slides: 40
Provided by: ZoltanG2
Category:

less

Transcript and Presenter's Notes

Title: CS 245 Database System Principles


1
CPSC-608 Database Systems
Fall 2008
Instructor Jianer Chen Office HRBB 309B Phone
845-4259 Email chen_at_cs.tamu.edu
Notes 7
2
Terms
  • Sequential file (data file)
  • Index file (key-pointer pairs)
  • Search key
  • Primary index
  • Secondary index
  • Dense index
  • Sparse index
  • Multi-level index

3
BTrees
  • Support fast search
  • Support range search
  • Support dynamic changes
  • Could be either dense or sparse

4
BTree Example n3
  • Root

100
120 150 180
30
120 130
3 5 11
180 200
100 101 110
150 156 179
30 35
5
Sample non-leaf
57 81 95
  • to keys to keys to keys to keys
  • lt 57 57? klt81 81?klt95 ?95

6
Sample leaf node
  • From non-leaf node
  • to next leaf
  • in sequence

57 81 95
To record with key 57 To record with key
81 To record with key 85
7
  • Size of nodes n1 pointers
  • n keys

FIXED
8
Dont want nodes to be too empty
  • Use at least
  • Non-leaf ?(n1)/2? pointers
  • Leaf ?(n1)/2? pointers to data

9
n3
  • Full node Min. node
  • Non-leaf
  • Leaf

120 150 180
30
3 5 11
30 35
10
Btree rules tree of order n
  • (1) All leaves at same lowest level (balanced
    tree)
  • (2) Pointers in leaves point to records except
    for sequence pointer

11
  • (3) Number of pointers/keys for Btree

Max Max Min Min ptrs keys
ptrs?data keys
Non-leaf (non-root)
n1
n
?(n1)/2?
?(n1)/2?1
Leaf (non-root)
n1
n
?(n1)/2?
?(n1)/2?
Root
n1
n
2
1
could be 1
12
Search in a Btree
  • Start from the root
  • Search in a leaf block
  • May not have to go to the data file

13
Pseudo Code for Search in a Btree
  • Search(ptr, k)
  • \\ search a record of key value k in the subtree
    rooted at ptr
  • Case 1. ptr is a leaf
  • IF (k ki) for a key ki in ptr THEN
    return(pi)
  • ELSE return(Null)
  • Case 2. ptr is not a leaf
  • find a key ki in ptr such that ki lt k lt
    k(i1)
  • return(Search(pi, k)

14
Insert into Btree
  • (a) simple case
  • space available in leaf
  • (b) leaf overflow
  • (c) non-leaf overflow
  • (d) new root

15
Pseudo Code for Insertion in a Btree
  • Insert(ptr, (k,p), (k',p'))
  • \\ p is a pointer to a data record with key value
    k, which are inserted into the subtree rooted at
  • \\ ptr p' is a pointer to a new brother of
    ptr, if created, in which k' is the smallest key
    value
  • Case 1. ptr is a leaf
  • IF there is room in ptr, THEN insert (k,p) into
    ptr return(k'0, p'Null)
  • ELSE re-arrange the content in ptr and (k,p)
    into (p0, k1, p1, ..., k(n1), p(n1))
  • leave (p0, k1, ..., k(r-1), p(r-1)) in
    ptr
  • create a new leaf q put (pr, k(r1),
    p(r1), ..., k(n1), p(n1)) in q
  • return( k' k_r, p' q )
  • Case 2. ptr is not a leaf
  • find a key ki in ptr such that ki lt k lt
    k(i1)
  • Insert(pi, (k,p), (k",p"))
  • IF (p" Null) THEN return(k'0, p'Null)
  • ELSE IF there is room in ptr, THEN insert
    (k",p") into ptr return(k'0, p'Null)
  • ELSE re-arrange the content in ptr and
    (k",p") into (p0, k1, p1, ..., k(n1), p(n1))
  • leave (p0, k1, ..., k(r-1), p(r-1)) in
    ptr
  • create a new leaf q put (pr, k(r1),
    p(r1), ..., k(n1), p(n1)) in q

16
  • (a) Insert key 32

n3
100
30
3 5 11
30 31
17
  • (a) Insert key 32

n3
100
30
3 5 11
30 31
32
18
  • (b) Insert key 7

n3
100
30
3 5 11
30 31
19
  • (b) Insert key 7

n3
100
30
7
3 5
3 5 11
30 31
7
20
  • (c) Insert key 160

n3
100
120 150 180
180 200
150 156 179
21
  • (c) Insert key 160

n3
100
120 150 180
180 200
150 156 179
160 179
22
  • (d) New root, insert 45

n3
10 20 30
1 2 3
10 12
20 25
30 32 40
23
  • (d) New root, insert 45

n3
10 20 30
1 2 3
10 12
20 25
30 32 40
24
Deletion from Btree
  • (a) Simple case (no example)
  • (b) Coalesce with neighbor (sibling)
  • (c) Re-distribute keys
  • (d) Cases (b) or (c) at non-leaf

25
  • (b) Coalesce with sibling
  • Delete 50

n4
10 40 100
10 20 30
40 50
26
  • (b) Coalesce with sibling
  • Delete 50

n4
10 40 100
10 20 30
40 50
40
27
  • (c) Redistribute keys
  • Delete 50

n4
10 40 100
10 20 30 35
40 50
28
  • (c) Redistribute keys
  • Delete 50

n4
10 40 100
35
10 20 30 35
40 50
35
29
  • (d) Non-leaf coalesce
  • Delete 37

n4
25
10 20
30 40
25 26
30 37
40 45
20 22
10 14
1 3
30
Btree deletions in practice
  • Often, coalescing is not implemented
  • Too hard and not worth it!

31
Interesting problem
  • For a Btree, how large should n be?


n is number of keys per node
32
Sample assumptions
  1. Time to read node from disk is (STn) ms
    (S, T constants, e.g. S70, T0.5)

(2) Once block in memory, use binary search (a
b log2n) ms (a, b constants, a S)
(3) Assume Btree is full, i.e., nodes to examine
is logn N, where N records
(4) Total search time f(n)
(lognN1)(STn) (a b log2n)
33
Can get f(n) time to find a record
  • f(n)
  • nopt n

34
? FIND nopt by f(n) 0
  • Answer is nopt few hundred

35
Variation on Btree B-tree (no )
  • Idea
  • Avoid duplicate keys
  • Have record pointers in non-leaf nodes

36
  • to record to record to record
  • with K1 with K2 with K3
  • to keys to keys to keys to
    keys
  • lt K1 K1ltxltK2 K2ltxltk3 gtk3

37
B-tree example n2
65 125

145 165
85 105
25 45
10 20
30 40
110 120
90 100
70 80
170 180
50 60
130 140
150 160
38
B-tree example n2
  • sequence pointers
  • not useful now!

65 125

145 165
85 105
25 45
10 20
30 40
110 120
90 100
70 80
170 180
50 60
130 140
150 160
39
Tradeoffs
  • ? B-trees have faster lookup than Btrees
  • ? in B-tree, non-leaf leaf different sizes
  • ? in B-tree, insertion and deletion more
    complicated

? Btrees preferred!
Write a Comment
User Comments (0)
About PowerShow.com