Principles of Database Management Systems 4.1: B-Trees - PowerPoint PPT Presentation

1 / 23
About This Presentation
Title:

Principles of Database Management Systems 4.1: B-Trees

Description:

Based on Stanford University CS 245 notes originally authored by Garcia-Molina, Ullman & Widom, and typed by Siroker – PowerPoint PPT presentation

Number of Views:27
Avg rating:3.0/5.0
Slides: 24
Provided by: csUkuFik
Category:

less

Transcript and Presenter's Notes

Title: Principles of Database Management Systems 4.1: B-Trees


1
Principles of Database Management Systems4.1
B-Trees
  • Pekka Kilpeläinen
  • (after Stanford CS245 slide originals by Hector
    Garcia-Molina, Jeff Ullman and Jennifer Widom)

2
  • B-Trees
  • a commonly used index structure
  • nonsequential, balanced (acces paths to
    different records of equal length)
  • adapts well to insertions deletions
  • consists of blocks holding at most n keys and n1
    pointers, and at least half of this
  • (We consider a variation actually called a B
    tree)

3
BTree Example n3
  • Root

100
120 150 180
30
3 5 11
120 130
180 200
100 101 110
150 156 179
30 35
4
Sample non-leaf
120 150 180
  • to keys to keys to keys to keys
  • lt 120 120? klt150 150?klt180 ?180

5
Sample leaf node
  • From non-leaf node
  • to next leaf
  • in sequence

120 130 unused
To record with key 120 To record with key 130
6
Dont want nodes to be too empty
  • Number of pointers in use
  • at internal nodes at least ?(n1)/2? (to child
    nodes)
  • at leaves at least ?(n1)/2? (to data
    records/blocks)

7
n3
  • Full node min. node
  • Non-leaf
  • Leaf

120 150 180
30
3 5 11
30 35
8
Btree rules
  • (1) All leaves at the same lowest
    level (balanced tree)
  • (2) Pointers in leaves point to records except
    for sequence pointer

9
  • (3) Number of pointers/keys for Btree

Max Max Min Min ptrs keys
ptrs?data keys
Non-leaf (non-root)
n1
n
?(n1)/2?
?(n1)/2?- 1
Leaf (non-root)
n1
n
?(n1)/2?
?(n1)/2?
Root
n1
n
2()
1
() 1, if only one record in the file
10
Insert into Btree
  • First lookup the proper leaf
  • (a) simple case
  • leaf not full just insert (key,
    pointer-to-record)
  • (b) leaf overflow
  • (c) non-leaf overflow
  • (d) new root

11
  • (a) Insert key 32

n3
100
30
3 5 11
30 31
12
  • (b) Insert key 7

n3
100
30
3 5 11
30 31
13
  • (c) Insert key 160

n3
100
120 150 180
180 200
150 156 179
14
  • (d) New root, insert 45

n3
Height grows at root gt balance maintained
10 20 30
1 2 3
10 12
20 25
30 32 40
15
Deletion from Btree
  • Again, first lookup the proper leaf
  • (a) Simple case no underflow Otherwise ...
  • (b) Borrow keys from an adjacent sibling (if it
    doesn't become too empty) Else ...
  • (c) Coalesce with a sibling node
  • -gt (d) Cases (a), (b) or (c) at non-leaf

16
  • (b) Borrow keys
  • Delete 50

n4
gt min of keys in a leaf ?5/2? 2
10 40 100
10 20 30 35
40 50
17
  • (c) Coalesce with a sibling
  • Delete 50

n4
20 40 100
20 30
40 50
18
gt min of keys in a non-leaf ?(n1)/2? -
13-1 2
  • (d) Non-leaf coalesce
  • Delete 37

n4
25
10 20
30 40
25 26
30 37
40 45
20 22
10 14
1 3
19
Btree deletions in practice
  • Often, coalescing is not implemented
  • Too hard and not worth it!
  • later insertions may return the node back to its
    required minimum size
  • Compromise Try redistributing keys with a
    siblingIf not possible, leave it there
  • if all accesses to the records go through the
    B-tree, can place a "tombstone" for the deleted
    record at the leaf

20
Why B-trees Are Good?
  • B-tree adapts well to insertions and deletions,
    maintaining balance
  • DBA does not need to care about reorganizing
  • split/merge operations rather rare (How often
    would nodes with 200 keys be split?)
  • -gt Access times dominated by key-lookup (i.e.,
    traversal from root to a leaf)

21
Efficiency of B-trees
  • For example, assume 4 KB blocks, 4 byte keys and
    8 byte pointers
  • How many keys and pointers fit in a node (
    index block)?
  • Max n s.t. (4n 8(n1)) B ? 4096 B ?-gt
    n340 340 keys and 341 pointers fit in a
    node-gt 171 341 pointers in a non-leaf node

22
Efficiency of B-trees (cont.)
  • Assume an average node has 255 pointers-gt a
    three-level B-tree has 2552 65025 leaves with
    total of 2553 or about 16.6 million pointers to
    records -gt if root block kept in main memory,
    each record can be accessed with 21 disk I/Os
    If all 256 internal nodes are in main memory,
    record access requires 11 disk I/Os (256 x 4 KB
    1 MB quite feasible!)

23
Outline/summary
  • B trees
  • popular index structures with graceful growth
    properties
  • support range queries like WHERE 100 lt Key
    lt 200 (-gt Exercises)
  • Next Hashing schemes
Write a Comment
User Comments (0)
About PowerShow.com