Monkey Business: In The Trees - PowerPoint PPT Presentation

Slides: 19
Provided by: rober52

Transcript and Presenter's Notes

1
Monkey Business: In The Trees
  • Helpful Reading: CLR Ch. 13, 14, 19
  • (MUCH more detail than this lecture will contain)

2
Why Trees? We're Not Monkeys
  • If a tree of n elements is balanced and complete,
    then we can expect its height to be O(log n) --
    this often leads to efficient operations on trees
  • Search trees are good priority queues and
    dictionaries!
  • We've already seen binary heaps, which are
    complete trees with minimal ordering discipline
  • Not good for searches or arbitrary deletes -- O(n)
  • We will study searchable trees:
  • Binary search trees
  • Red-black trees (balanced binary search trees)
  • B-trees (generalized balanced trees)

3
Foundation: Binary Search Trees
  • Data in the tree is attached to a key for
    ordering purposes (keys must be comparable)
  • Binary search tree property: for any node, every
    key in its left subtree (if any) is less than or
    equal to that node's key, and every key in its
    right subtree (if any) is greater than or equal
    to that node's key

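[Figure: two seven-node example trees rooted at 12,
one labeled LEGAL and one labeled ILLEGAL; the keys
shown are 3, 7, 9, 12, 14, 17, plus 13 in one tree
and 15 in the other.]
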
4
BST SEARCH(key) -> data
  • To find a particular key, start at root
  • If the root has the key, return its attached data
  • Otherwise, recursively search the subtree in
    which the key should be found
  • If the key you're looking for is less than the
    root's key, search the left subtree
  • If the key you're looking for is bigger than the
    root's key, search the right subtree
  • Expect O(log n), but the worst case is really
    O(n) -- WHY?
  • Let's look at inserts
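
The search procedure on this slide can be sketched in a few lines of Python. This is only an illustration: the `Node` class and the name `bst_search` are assumptions made for the example, not code from the lecture, and the recursion is written as a loop.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Node:
    """Hypothetical BST node: a key, its attached data, and two child links."""
    key: int
    data: Any
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def bst_search(root: Optional[Node], key: int) -> Any:
    """Return the data attached to key, or None if the key is absent."""
    node = root
    while node is not None:
        if key == node.key:
            return node.data
        # Descend into the only subtree that could contain the key.
        node = node.left if key < node.key else node.right
    return None


# A small hand-built tree: 12 at the root, 7 on the left, 15 on the right.
tree = Node(12, "twelve", Node(7, "seven"), Node(15, "fifteen"))
```

Each loop iteration moves one level down, so the cost is proportional to the height of the tree rather than to log n.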

5
BST INSERT(key, data)
  • Insertion always occurs at a spot with an empty
    pointer (no tree rearrangement)
  • Starting at the root, search for the leaf to
    which to append the new key
  • If the new key is less than the current node's
    key, move left to access the lesser subtree;
    otherwise move right
  • Repeat the process until the subtree you wish to
    move to is null, then attach the new key
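
A minimal sketch of the insert steps above, assuming a simple `Node` class and helper names (`bst_insert`, `inorder`) that are made up for this example. Equal keys are sent to the right, which is one of the choices the BST property allows.

```python
class Node:
    """Hypothetical BST node used only for this sketch."""

    def __init__(self, key, data=None):
        self.key, self.data = key, data
        self.left = self.right = None


def bst_insert(root, key, data=None):
    """Attach a new leaf holding key; return the root of the tree."""
    new = Node(key, data)
    if root is None:
        return new
    node = root
    while True:
        if key < node.key:
            if node.left is None:      # empty pointer found: attach here
                node.left = new
                return root
            node = node.left
        else:
            if node.right is None:
                node.right = new
                return root
            node = node.right


def inorder(node):
    """Yield the keys in sorted order (left subtree, node, right subtree)."""
    if node is not None:
        yield from inorder(node.left)
        yield node.key
        yield from inorder(node.right)


root = None
for k in [12, 7, 15, 3, 9, 14, 17]:
    root = bst_insert(root, k)
```

No existing node is moved; the only pointer that changes is the empty one that receives the new leaf.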

6
The Problem with BSTs
  • Insertion always attaches a new leaf
  • Height of a BST may be O(n) because there is no
    requirement that the tree be balanced
  • Consider inserting the following keys in this
    order 1, 2, 3, 4, 5, 6, 7, 8, 9
  • In this and similar worst cases, BST degenerates
    into a linked list
  • Thus, insertion is O(n) in the worst case
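
To make the degenerate case concrete, the sketch below builds one tree from the sorted keys 1 through 9 and another from the same keys in a mixed order, then compares heights. The helper names (`build`, `height`) and the mixed order are assumptions chosen for the example.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = self.right = None


def bst_insert(root, key):
    """Plain BST insert with no rebalancing; returns the subtree root."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = bst_insert(root.left, key)
    else:
        root.right = bst_insert(root.right, key)
    return root


def height(node):
    """Number of nodes on the longest path from node down to a leaf."""
    if node is None:
        return 0
    return 1 + max(height(node.left), height(node.right))


def build(keys):
    root = None
    for k in keys:
        root = bst_insert(root, k)
    return root


sorted_tree = build([1, 2, 3, 4, 5, 6, 7, 8, 9])   # every key goes right
mixed_tree = build([5, 3, 8, 2, 4, 7, 9, 1, 6])    # same keys, mixed order
```

With sorted input every node has only a right child, so the tree has one node per level, exactly like a linked list.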

7
BST DELETE(key)
  • To delete a node N from a BST
  • if N has no children, just pluck it from tree
  • if N has one child, splice out N
  • if N has two children, find the successor node to
    N (i.e. the node with the next largest key after
    N's key), swap the contents of N with the
    successor node's contents, and then delete the
    successor node from the right subtree
  • successor node is leftmost node in right subtree
  • Deletion is O(n) in the worst case
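
The three deletion cases can be sketched as follows. This is an illustrative version that assumes distinct keys; it copies the successor's contents into N instead of swapping them, which has the same effect once the successor node is removed. The `Node` class and function names are hypothetical.

```python
class Node:
    def __init__(self, key, data=None, left=None, right=None):
        self.key, self.data = key, data
        self.left, self.right = left, right


def bst_delete(root, key):
    """Delete key from the subtree rooted at root; return the new subtree root."""
    if root is None:
        return None                      # key not present
    if key < root.key:
        root.left = bst_delete(root.left, key)
    elif key > root.key:
        root.right = bst_delete(root.right, key)
    else:
        # No children or one child: splice the node out.
        if root.left is None:
            return root.right
        if root.right is None:
            return root.left
        # Two children: the successor is the leftmost node of the right subtree.
        succ = root.right
        while succ.left is not None:
            succ = succ.left
        root.key, root.data = succ.key, succ.data
        root.right = bst_delete(root.right, succ.key)
    return root


def inorder(node):
    if node is None:
        return []
    return inorder(node.left) + [node.key] + inorder(node.right)


tree = Node(12, left=Node(7, left=Node(3), right=Node(9)),
            right=Node(15, left=Node(14), right=Node(17)))
```

The successor never has a left child, so removing it always falls into one of the two easy cases.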

8
BST PQ operations
  • MAXIMUM returns the rightmost node in the tree --
    O(n)
  • MINIMUM returns the leftmost node in the tree --
    O(n)
  • EXTRACT-MIN / EXTRACT-MAX are simple deletions of
    nodes with at most one child -- O(n)
  • ALL operations would be O(log n) if we could
    guarantee a balanced, complete tree
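
A short sketch of the priority-queue operations, assuming a non-empty tree and a hypothetical `Node` class. `extract_min` returns both the removed key and the new root, an interface chosen only for this example.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right


def minimum(node):
    """Leftmost node of a non-empty tree."""
    while node.left is not None:
        node = node.left
    return node


def maximum(node):
    """Rightmost node of a non-empty tree."""
    while node.right is not None:
        node = node.right
    return node


def extract_min(root):
    """Remove the leftmost node; return (its key, root of the remaining tree)."""
    if root.left is None:
        return root.key, root.right      # the root itself is the minimum
    parent = root
    while parent.left.left is not None:
        parent = parent.left
    smallest = parent.left
    parent.left = smallest.right         # the minimum has at most one child
    return smallest.key, root


tree = Node(12, Node(7, Node(3), Node(9)), Node(15, Node(14), Node(17)))
```

Each function only walks a single root-to-leaf path, so its cost is bounded by the height of the tree.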

9
Red-Black Trees
  • Binary search trees with four important
    additional red-black properties
  • 1. Every node is either red or black
  • 2. Every leaf is black
  • For purposes of RB trees, assume any pointer to
    NULL actually points to an empty black node
  • 3. If a node is red, it has two black children
  • 4. Every simple (non-retracing) path from a node
    to a descendant leaf contains the same number of
    black nodes
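
One way to make the four properties concrete is a small checker. The sketch below uses `None` for the empty black leaves and assumes a hypothetical `RBNode` class; it checks only the color rules, not the BST ordering.

```python
RED, BLACK = "red", "black"


class RBNode:
    def __init__(self, key, color, left=None, right=None):
        self.key, self.color = key, color
        self.left, self.right = left, right


def black_count(node):
    """Black nodes on every path from node down to an empty leaf.

    Counts node itself (if black) and the empty leaf. Raises ValueError
    if one of the properties fails somewhere below node.
    """
    if node is None:
        return 1                                   # property 2: leaves are black
    if node.color not in (RED, BLACK):
        raise ValueError("property 1: node is neither red nor black")
    if node.color == RED:
        for child in (node.left, node.right):
            if child is not None and child.color == RED:
                raise ValueError("property 3: red node with a red child")
    left, right = black_count(node.left), black_count(node.right)
    if left != right:
        raise ValueError("property 4: paths with different black counts")
    return left + (1 if node.color == BLACK else 0)


def is_red_black(root):
    try:
        black_count(root)
        return True
    except ValueError:
        return False


good = RBNode(12, BLACK, RBNode(7, RED), RBNode(15, RED))
red_red = RBNode(12, BLACK, RBNode(7, RED, RBNode(3, RED)), RBNode(15, RED))
lopsided = RBNode(12, BLACK, RBNode(7, BLACK), None)
```

In `lopsided`, the path through 7 meets more black nodes than the path through the empty right child, so property 4 fails even though no red node is involved.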

10
Why is an RB-tree balanced?
  • Let bh(x) represent the black height of node x
    (the number of black nodes any simple path from x
    to the bottom of the tree encounters, not
    including x itself)
  • Because null pointers are replaced by empty black
    nodes, any node with a NIL child limits its other
    child to be a subtree of height at most two (one
    red node with two black empty children), or else
    Property 4 is violated
  • The end result is that the tree is as bushy as
    possible, and thus any subtree rooted at x of an
    RB tree contains at least 2^bh(x) - 1 non-empty,
    key-bearing nodes (proof in CLR)
  • By property 3, the black height of the root of an
    RB tree of height h is at least h/2, and so for a
    tree with n keyed nodes, it follows that
    n >= 2^(h/2) - 1, which gives h <= 2 lg(n + 1),
    i.e. h = O(lg n)
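
The last step of the argument above can be written out as a short derivation. Here h is the height of the tree, n the number of keyed nodes, and bh(root) the black height of the root, as on the slide.

```latex
\begin{align*}
n &\ge 2^{\,bh(\mathrm{root})} - 1
   && \text{(bushiness bound applied at the root)} \\
  &\ge 2^{\,h/2} - 1
   && \text{(since } bh(\mathrm{root}) \ge h/2 \text{ by property 3)} \\
\Rightarrow\quad n + 1 &\ge 2^{\,h/2} \\
\Rightarrow\quad \lg(n + 1) &\ge h/2 \\
\Rightarrow\quad h &\le 2\lg(n + 1) = O(\lg n)
\end{align*}
```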

11
RB Trees are Complex
  • See CLR chapter 14 for complete pseudocode and
    explanations
  • I will briefly explain RB insert to show what's
    involved
  • The minutiae of managing RB trees are not material
    for this class (but the use of an RB tree and why
    it is balanced are!)
  • The only difference between managing an RB tree and
    a BST is that inserts and deletes have to preserve
    the red-black properties of the tree
  • Example insert in CLR p. 269
  • Example delete in CLR p. 276

12
RB INSERT(key, data)
  • Begins with insertion of new key done as if the
    RB tree were a normal BST
  • In insert, the inserted node (call it x) is
    colored red (and black empty children added)
  • This can ONLY cause a violation of property 3 (a
    red node might have been attached to a red
    parent)
  • Swap colors of parent of x and the grandparent of
    x (the grandparent must be black, since by
    property 3 it could not otherwise have a red child)
  • This could violate property 3 further up the tree
  • Move the violation up to where the immediate
    ancestor of the two red nodes has a black child
    as its other child, then perform rotations to
    remove the violation (or until you recolor root)
  • After any insert, color the root of the tree black
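
For readers who want to see the whole procedure, here is a compact sketch of RB insert. It follows the textbook-style fixup (recolor when the uncle is red, rotate when it is black), which may differ in detail from the wording on the slide. `None` stands in for the empty black children, and all class and method names are assumptions made for this example.

```python
RED, BLACK = "red", "black"


class Node:
    def __init__(self, key, color=RED, parent=None):
        self.key, self.color, self.parent = key, color, parent
        self.left = self.right = None        # None plays the empty black leaf


class RBTree:
    def __init__(self):
        self.root = None

    def _rotate(self, x, left):
        """Rotate x down (to the left if left is True); its child y moves up."""
        y = x.right if left else x.left
        inner = y.left if left else y.right
        if left:
            x.right, y.left = inner, x
        else:
            x.left, y.right = inner, x
        if inner is not None:
            inner.parent = x
        y.parent = x.parent
        if x.parent is None:
            self.root = y
        elif x is x.parent.left:
            x.parent.left = y
        else:
            x.parent.right = y
        x.parent = y

    def insert(self, key):
        # Step 1: ordinary BST insert of a red node.
        parent, node = None, self.root
        while node is not None:
            parent, node = node, (node.left if key < node.key else node.right)
        x = Node(key, RED, parent)
        if parent is None:
            self.root = x
        elif key < parent.key:
            parent.left = x
        else:
            parent.right = x
        # Step 2: repair a red node with a red parent (property 3).
        while x.parent is not None and x.parent.color == RED:
            p, g = x.parent, x.parent.parent  # g exists: a red p is never the root
            p_is_left = p is g.left
            uncle = g.right if p_is_left else g.left
            if uncle is not None and uncle.color == RED:
                p.color = uncle.color = BLACK  # recolor, move violation up
                g.color = RED
                x = g
            else:
                if x is (p.right if p_is_left else p.left):
                    self._rotate(p, left=p_is_left)   # straighten the zig-zag
                    x, p = p, x
                p.color, g.color = BLACK, RED
                self._rotate(g, left=not p_is_left)
        self.root.color = BLACK


def height(node):
    return 0 if node is None else 1 + max(height(node.left), height(node.right))


def black_count(node):
    """Black nodes per path to an empty leaf; raises if property 3 or 4 fails."""
    if node is None:
        return 1
    kids = (node.left, node.right)
    if node.color == RED and any(c is not None and c.color == RED for c in kids):
        raise ValueError("red node with a red child")
    left, right = black_count(node.left), black_count(node.right)
    if left != right:
        raise ValueError("unequal black counts")
    return left + (node.color == BLACK)


tree = RBTree()
for k in range(1, 10):          # the sorted keys that ruined the plain BST
    tree.insert(k)
```

Inserting the sorted keys 1 through 9, which turned the plain BST into a linked list, here produces a tree that is only four nodes tall.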

13
RB Rotations
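[Figure: RightRotate(T,y) turns the subtree rooted
at y (left child x with subtrees A and B, right
subtree C) into the subtree rooted at x (left
subtree A, right child y with subtrees B and C);
LeftRotate(T,x) is the inverse.]
x and y are (red) nodes; A, B, and C are
subtrees. These O(1) rotations do not violate any
RB properties.
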
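
The two rotations can be sketched on their own. Unlike the RightRotate(T,y) and LeftRotate(T,x) signatures on the slide, these hypothetical versions take no tree argument and instead return the new subtree root, which keeps the example free of parent pointers.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right


def right_rotate(y):
    """Lift y's left child x above y; return x, the new subtree root."""
    x = y.left
    y.left = x.right        # subtree B moves from x over to y
    x.right = y
    return x


def left_rotate(x):
    """Lift x's right child y above x; return y, the new subtree root."""
    y = x.right
    x.right = y.left        # subtree B moves from y back to x
    y.left = x
    return y


def inorder(node):
    if node is None:
        return []
    return inorder(node.left) + [node.key] + inorder(node.right)


# y = 4 with left child x = 2; the subtrees A, B, C are the single nodes 1, 3, 5.
y = Node(4, Node(2, Node(1), Node(3)), Node(5))
```

Only a constant number of pointers change, and the in-order sequence of keys is the same before and after, so the BST ordering is preserved.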
14
RB Insert: Solving Violations
[Figure: three arrangements of nodes x, y, and z
with subtrees A, B, C, and D.]
A, B, C, and D are subtrees with black roots.

15
B-Trees
  • Generalized trees (see CLR Ch. 19)
  • B-tree nodes may have many thousands of keys,
    compared to RB trees, which have one
  • If a node has N keys in it, we have to make a big
    (N+1)-way decision on which node to visit next
  • A B-tree is a rooted tree with root R and
    branching factor t such that
  • Every node other than the root must have at least
    t-1 keys (and thus at least t children, if the
    node is not a leaf) and at most 2t-1 keys (thus
    at most 2t children)
  • If tree not empty, root must have at least one
    key
  • Every leaf must be at same depth
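
The structural rules above can be expressed as a small checker. This sketch assumes a hypothetical `BTreeNode` class with a sorted `keys` list and a `children` list (empty for a leaf); it checks key counts, child counts, and leaf depths, but not the ordering of the keys.

```python
class BTreeNode:
    def __init__(self, keys, children=None):
        self.keys = keys                    # sorted list of keys
        self.children = children or []      # empty list means this is a leaf


def is_valid_btree(root, t):
    """Check the key-count and equal-leaf-depth rules for branching factor t."""
    if root is None:
        return True                         # an empty tree satisfies the rules
    leaf_depths = set()

    def visit(node, depth, is_root):
        n = len(node.keys)
        min_keys = 1 if is_root else t - 1
        if not (min_keys <= n <= 2 * t - 1):
            return False
        if not node.children:
            leaf_depths.add(depth)
            return True
        if len(node.children) != n + 1:     # n keys separate n + 1 children
            return False
        return all(visit(c, depth + 1, False) for c in node.children)

    return visit(root, 0, True) and len(leaf_depths) == 1


ok = BTreeNode([10], [BTreeNode([3, 7]), BTreeNode([15])])
too_few = BTreeNode([10], [BTreeNode([]), BTreeNode([15])])
uneven = BTreeNode([10], [BTreeNode([5]),
                          BTreeNode([20], [BTreeNode([15]), BTreeNode([25])])])
```

With t = 2 every non-root node must hold between 1 and 3 keys, so `ok` passes, `too_few` fails on an empty child, and `uneven` fails because its leaves sit at different depths.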

16
Why B-Trees?
  • We fix t as constant for any particular run
  • Height is O(log_t n)
  • Branching decision at each node is O(t), which is
    O(1) because t is constant
  • Usually use B-trees to maximize disk efficiency
    (a disk seek is mechanical, and thus it takes a
    long time)
  • Can store one node per disk page, and thus it
    takes at worst O(log_t n) disk accesses
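
A B-tree search can be sketched as below, assuming a hypothetical `BTreeNode` class with a sorted `keys` list and a `children` list. The branching decision uses `bisect_left` from the standard library, a binary search within the node rather than the O(t) scan mentioned above; either choice visits one node per level.

```python
from bisect import bisect_left


class BTreeNode:
    def __init__(self, keys, children=None):
        self.keys = keys                    # sorted list of keys
        self.children = children or []      # empty list means this is a leaf


def btree_search(node, key):
    """Return (node, index) locating key, or None if it is absent."""
    while node is not None:
        i = bisect_left(node.keys, key)     # first position with keys[i] >= key
        if i < len(node.keys) and node.keys[i] == key:
            return node, i
        if not node.children:               # reached a leaf without a match
            return None
        node = node.children[i]             # one node (one disk page) per level
    return None


root = BTreeNode([10, 20], [BTreeNode([3, 7]),
                            BTreeNode([12, 15]),
                            BTreeNode([25, 30])])
```

In the disk setting each step to `node.children[i]` would correspond to reading one page, which is where the O(log_t n) bound on disk accesses comes from.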

17
What, no implementation details?
  • B-trees are also very complex
  • Insert a key into some sorted list within a node
  • If a node is full (it already has 2t-1 keys, so
    one more would exceed the maximum), on next
    insert it is split into two nodes and the median
    key is inserted into the parent node with
    pointers to the new node
  • Deleting a key from a node may require parents or
    siblings to contribute keys to it if the node has
    only t children (the minimum of t-1 keys)
  • Insertion, deletion take O(t log_t n) (each level
    visited O(1) times, O(t) work done)

18
B-Trees: For More Examples
  • See CLR p. 393 for a B-tree insert example
  • See CLR pp. 396-397 for a B-tree delete example
  • We will NOT cover examples of B-tree management
    on homework, exams, or quizzes -- however, you
    will be expected to judge when a B-tree or
    RB-tree is an appropriate data structure to use
    based on the time complexities of their operations