4. Search Trees - PowerPoint PPT Presentation

Provided by: kareny
Transcript and Presenter's Notes

1
4. Search Trees
  • Balanced Binary Search Trees
  • Self-Adjusting Binary Search Trees

Read Tarjan 45-70, CLR 244-277
2
Sorted Sets
  • A collection of sorted sets is an abstract data
    type representing a collection of items each
    having a key and belonging to one of several
    sets. Sets are identified by one of their items.
    Initially, each item belongs to a singleton set.
    The operations are
  • setkey(i,k): Initialize the key of item i to k. i
    is assumed to belong to a singleton set.
  • access(k,s): Return the item in set s having key
    k, or null if s has no such item.
  • insert(i,s): Insert item i into s. i is assumed
    to be in a singleton set initially.
  • delete(i,s): Remove item i from set s. This
    leaves i in a singleton set.
  • join(s1,i,s2): Return the set formed by combining
    s1, i and s2, where every item in s1 is assumed
    to have key less than key(i) and every item in s2
    is assumed to have key greater than key(i). This
    operation destroys s1 and s2.
  • split(i,s): Split the sorted set s containing i
    into three sets: s1, containing all items with key
    less than key(i); the singleton i; and s2,
    containing all items with key larger than key(i).
    Return the pair s1,s2. This operation destroys s.
  • The keys of the items in each set must be
    distinct.

3
Sorted Sets and Binary Search Trees
  • Symmetric ordering: items are placed in trees so
    that for every node v
  • keys of nodes in v's left subtree are smaller
    than key(v)
  • keys of nodes in v's right subtree are larger
    than key(v)
  • To insert a new key value, search for the key and
    insert at the place where the search falls out of
    the tree.

4
Implementing Sorted Sets with BSTs
  • typedef int bst, item, keytyp; struct tpair { bst t1, t2; };
  • class bsts {
  •     int n;
  •     struct {
  •         int lchild, rchild, parent; keytyp keyfield;
  •     } vec[SPSIZ+1];
  • public:
  •     bsts(int);
  •     keytyp key(item);
  •     void setkey(item, keytyp);
  •     . . .
  • };
  • #define left(x) vec[x].lchild
  • item bsts::access(keytyp k, bst t) {
  •     while (t != Null && k != key(t))
  •         if (k < key(t)) t = left(t);
  •         else t = right(t);
  •     return t;
  • }

5
  • item function access(keytype k, bst t);
  •   do t ≠ null and k < key(t) → t := left(t)
  •    | t ≠ null and k > key(t) → t := right(t)
  •   od;
  •   return t
  • end;
  • procedure insert(item i, bst t);
  •   item x; x := t;
  •   do key(i) < key(x) and left(x) ≠ null → x := left(x)
  •    | key(i) > key(x) and right(x) ≠ null → x := right(x)
  •   od;
  •   if key(i) < key(x) → left(x) := i
  •    | key(i) > key(x) → right(x) := i
  •   fi;
  •   p(i) := x
  • end;

Invariant of the access loop: if any node in the tree has key k, then the subtree rooted at t does.
Invariant of the insert loop: the proper insertion location for i is in the subtree with root x.
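
The access and insert procedures above can be sketched in C++ roughly as follows. This is a minimal sketch, not the course's code: the struct name Bsts, the use of std::vector, and the choice to ignore an item whose key is already present are assumptions made for illustration. Items are indices 1..n and index 0 stands for null, following the array layout of the earlier slide.

```cpp
#include <cassert>
#include <vector>

// Hypothetical array-backed collection of BSTs: items are indices 1..n and
// index 0 plays the role of null, as in the slides' vec[] layout.
struct Bsts {
    struct Node { int left, right, parent, key; };
    std::vector<Node> vec;                    // value-initialized to all zeros

    explicit Bsts(int n) : vec(n + 1) {}

    // Return the item with key k in the tree rooted at t, or 0 if none.
    int access(int k, int t) const {
        while (t != 0 && k != vec[t].key)
            t = (k < vec[t].key) ? vec[t].left : vec[t].right;
        return t;
    }

    // Insert item i (which must not yet have a parent) into the non-empty
    // tree rooted at t.
    void insert(int i, int t) {
        assert(t != 0 && i != 0 && vec[i].parent == 0);
        int x = t;
        // Invariant: the proper location for i is in the subtree rooted at x.
        for (;;) {
            if (vec[i].key < vec[x].key && vec[x].left != 0) x = vec[x].left;
            else if (vec[i].key > vec[x].key && vec[x].right != 0) x = vec[x].right;
            else break;
        }
        if (vec[i].key < vec[x].key) vec[x].left = i;
        else if (vec[i].key > vec[x].key) vec[x].right = i;
        else return;                          // equal key: assumed not to occur
        vec[i].parent = x;
    }
};
```

For example, with item 1 (key 50) as the root, inserting items with keys 30, 70, 20, 40 in that order places 20 and 40 as the left and right children of 30.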
6
  • procedure delete(item i, bst t);
  •   item j;
  •   if left(i) ≠ null and right(i) ≠ null →
  •     j := left(i);
  •     do right(j) ≠ null → j := right(j) od;
  •     swapplaces(i,j)
  •   fi;
  •   if left(i) = null → left(i) ↔ right(i) fi;
  •   p(left(i)) := p(i);
  •   if i = left(p(i)) → left(p(i)) := left(i)
  •    | i = right(p(i)) → right(p(i)) := left(i)
  •   fi;
  •   left(i), right(i), p(i) := null
  • end;

Inner do loop: find the node j with the next smaller key.
After the first if statement: i has < 2 children.
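
A possible C++ rendering of the deletion procedure is sketched below. The slides do not define swapplaces, so the swapPlaces helper here is only one way it might be written; replaceChild is likewise a hypothetical helper. The function is named erase because delete is a C++ keyword, and, unlike the pseudocode, it returns the new root so that a caller can tell when the root changed. Index 0 stands for null.

```cpp
#include <cassert>
#include <vector>

struct Node { int left, right, parent, key; };
typedef std::vector<Node> Tree;               // index 0 plays the role of null

// Make nu (possibly 0) the child of par in place of old.
static void replaceChild(Tree& v, int par, int old, int nu) {
    if (par != 0) {
        if (v[par].left == old) v[par].left = nu; else v[par].right = nu;
    }
    if (nu != 0) v[nu].parent = par;
}

// Exchange the positions of i and j, where i has two children and j is the
// rightmost node of i's left subtree (so j has no right child).
static void swapPlaces(Tree& v, int i, int j) {
    int ip = v[i].parent, il = v[i].left, ir = v[i].right;
    int jp = v[j].parent, jl = v[j].left;
    replaceChild(v, ip, i, j);                // j moves into i's old position
    v[j].right = ir; v[ir].parent = j;
    if (jp == i) {                            // j was i's left child
        v[j].left = i; v[i].parent = j;
    } else {
        v[j].left = il; v[il].parent = j;
        v[jp].right = i; v[i].parent = jp;
    }
    v[i].left = jl;                           // i inherits j's old left child
    if (jl != 0) v[jl].parent = i;
    v[i].right = 0;
}

// Remove item i from the tree rooted at t; returns the new root (0 if empty).
int erase(Tree& v, int i, int t) {
    assert(i != 0 && t != 0);
    if (v[i].left != 0 && v[i].right != 0) {
        int j = v[i].left;                    // find node with next smaller key
        while (v[j].right != 0) j = v[j].right;
        swapPlaces(v, i, j);
        if (t == i) t = j;
    }
    // i now has fewer than 2 children; splice it out.
    int child = (v[i].left != 0) ? v[i].left : v[i].right;
    replaceChild(v, v[i].parent, i, child);
    if (t == i) t = child;
    v[i].left = v[i].right = v[i].parent = 0; // i becomes a singleton
    return t;
}
```

The work is proportional to the depth of i, or of its predecessor j when i has two children, since the only loop is the walk down to j.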
7
  • sorted set function join(bst t1, item i, bst t2);
  •   left(i) := t1; right(i) := t2;
  •   p(t1), p(t2) := i;
  •   return i
  • end;
  • bst, bst function split(item i, bst t);
  •   bst x, y, t1, t2;
  •   x, y := p(i), i; t1, t2 := left(y), right(y);
  •   do y ≠ t and y = left(x) → x, y, t2 := p(x), x, join(t2, x, right(x))
  •    | y ≠ t and y = right(x) → x, y, t1 := p(x), x, join(left(x), x, t1)
  •   od;
  •   left(i), right(i), p(i) := null;
  •   p(t1), p(t2) := null;
  •   return t1, t2
  • end;

t1 (t2) includes all nodes at or below y that
belong in left (right) tree after split.
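
The join and split procedures above might look like this in C++. It is a sketch under the same assumptions as the array layout of the earlier slide (integer items, index 0 as null); the Node and Tree names and the std::pair return type are illustrative choices, not part of the slides.

```cpp
#include <cassert>
#include <utility>
#include <vector>

struct Node { int left, right, parent, key; };
typedef std::vector<Node> Tree;               // index 0 plays the role of null

// Combine t1, i and t2 into one tree rooted at i (constant time).
int join(Tree& v, int t1, int i, int t2) {
    assert(i != 0);
    v[i].left = t1; v[i].right = t2;
    if (t1 != 0) v[t1].parent = i;
    if (t2 != 0) v[t2].parent = i;
    return i;
}

// Split the tree rooted at t around item i; returns (t1, t2).
std::pair<int, int> split(Tree& v, int i, int t) {
    int x = v[i].parent, y = i;
    int t1 = v[y].left, t2 = v[y].right;
    // Invariant: t1 (t2) holds all nodes at or below y that belong in the
    // left (right) tree after the split.
    while (y != t) {
        int up = v[x].parent;                 // next ancestor to visit
        if (y == v[x].left) t2 = join(v, t2, x, v[x].right);
        else                t1 = join(v, v[x].left, x, t1);
        y = x; x = up;
    }
    v[i].left = v[i].right = v[i].parent = 0; // i becomes a singleton
    if (t1 != 0) v[t1].parent = 0;
    if (t2 != 0) v[t2].parent = 0;
    return std::make_pair(t1, t2);
}
```

Each iteration climbs one level and performs one constant-time join, so the loop runs once per ancestor of i.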
8
Analysis of Binary Search Trees
  • Access takes time proportional to the depth of
    the accessed item.
  • Insert takes time proportional to the depth of
    the item after insertion.
  • Delete takes time proportional to the depth of
    the deleted item, if it has a null child and time
    proportional to the depth of its symmetric order
    predecessor if it has no null child.
  • Join takes constant time.
  • Split takes time proportional to the depth of the
    item on which the split is taking place.
  • The depth of a binary search tree on n nodes can
    be n-1 in the worst case, so most operations have
    worst-case running time Θ(n).
  • We can improve the running time of BST operations
    to O(log n) by balancing subtrees.
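
To make the worst case concrete, the sketch below builds an unbalanced BST by repeated insertion and reports the largest depth reached. The function name maxDepthAfterInserts and the vector-based layout are assumptions for illustration only.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct Node { int left, right, key; };

// Build an unbalanced BST by inserting keys in the given order (keys assumed
// distinct) and return the largest node depth; index 0 plays the role of null.
int maxDepthAfterInserts(const std::vector<int>& keys) {
    assert(!keys.empty());
    std::vector<Node> v(1);                   // v[0] is the null sentinel
    int deepest = 0;
    for (int key : keys) {
        Node fresh = {0, 0, key};
        v.push_back(fresh);
        int i = static_cast<int>(v.size()) - 1;
        if (i == 1) continue;                 // first key becomes the root
        int x = 1, depth = 1;
        for (;;) {
            int& child = (key < v[x].key) ? v[x].left : v[x].right;
            if (child == 0) { child = i; break; }
            x = child; ++depth;
        }
        deepest = std::max(deepest, depth);
    }
    return deepest;
}
```

Inserting the keys 1..7 in increasing order yields a path of depth 6 = n-1, while the order 4, 2, 6, 1, 3, 5, 7 yields depth 2.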

9
Balanced Binary Trees
  • A balanced binary tree is a full binary tree each
    of whose nodes x has an integer rank, denoted
    rank(x), such that the ranks satisfy the following
    properties.
  • if x is a node with a parent, rank(x) ≤
    rank(p(x)) ≤ rank(x) + 1
  • if x is a node with a grandparent, rank(x) <
    rank(p(p(x)))
  • if x is an external node, rank(x) = 0; if x
    also has a parent, rank(p(x)) = 1
  • Also called red-black trees.
  • It is sufficient to store 1 bit of balance
    information per node.
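
The three rank properties can be checked mechanically. The following is a hedged sketch of such a checker: the function name isBalanced and the representation (internal nodes as indices, index 0 standing for every external node with rank 0) are assumptions, not something the slides specify.

```cpp
#include <cassert>
#include <vector>

struct Node { int left, right, rank; };       // index 0 = external node, rank 0

// Check the three rank properties for every internal node at or below x.
bool isBalanced(const std::vector<Node>& v, int x) {
    assert(x >= 0 && x < static_cast<int>(v.size()));
    if (x == 0) return true;                  // external node: nothing to check
    const int kids[2] = { v[x].left, v[x].right };
    for (int c : kids) {
        int rc = (c == 0) ? 0 : v[c].rank;
        // External child: its parent must have rank 1.
        if (c == 0 && v[x].rank != 1) return false;
        // rank(c) <= rank(x) <= rank(c) + 1.
        if (rc > v[x].rank || v[x].rank > rc + 1) return false;
        if (c != 0) {
            // Every grandchild must have rank strictly below rank(x).
            const int grand[2] = { v[c].left, v[c].right };
            for (int g : grand) {
                int rg = (g == 0) ? 0 : v[g].rank;
                if (rg >= v[x].rank) return false;
            }
        }
    }
    return isBalanced(v, v[x].left) && isBalanced(v, v[x].right);
}
```

In red-black terms, a node whose rank equals its parent's rank corresponds to a red node, which is why one bit per node is enough to recover the ranks.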

10
Depth of Balanced Binary Trees
  • Lemma 4.1. A node of rank k in a balanced binary
    tree has height at most 2k and at least 2^(k+1) - 1
    descendants. Therefore, a balanced binary tree
    with n internal nodes has depth at most
    2 lg(n+1).
  • Proof. The proof of the first part is by
    induction on k. The basis (k = 0) is obvious since
    by definition of the ranks, any node of rank 0
    must be an external node, hence its height is 0
    and it has 1 descendant. Assuming the lemma is
    true for nodes of rank k, let x be a node of rank
    k+1. By the definition of ranks and the induction
    hypothesis, the grandchildren of x have height at
    most 2k, so x can have height at most 2(k+1).
    Similarly, its two subtrees must each contain at
    least 2^(k+1) - 1 nodes, so x has a total of at
    least 2(2^(k+1) - 1) + 1 = 2^(k+2) - 1 descendants.
    A full binary tree with n internal nodes contains
    a total of 2n+1 nodes. By the first part of the
    lemma, the rank of the root is at most lg(n+1) and
    the height of the root is at most twice its rank. □
  • By Lemma 4.1, the access time in a balanced
    binary tree is O(log n).
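
The last step of the proof can be written out as a short chain of inequalities. Here r denotes the rank of the root, a symbol introduced only for this worked step.

```latex
\begin{align*}
2n + 1 &\ge 2^{r+1} - 1
  && \text{(the root has $2n+1$ descendants, and at least $2^{r+1}-1$ by the lemma)} \\
n + 1 &\ge 2^{r} \\
r &\le \lg(n+1) \\
\text{depth} \;\le\; \text{height(root)} &\le 2r \;\le\; 2\lg(n+1)
\end{align*}
```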

11
Rotation Operations
[Figure: a single rotation, rrotate(x), with inverse lrotate(y); a double rotation, rrotate(y) followed by lrotate(x).]
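
The two single rotations might be coded as below. This is a sketch under assumptions: integer items with index 0 as null, and each rotation returning the node that becomes the new subtree root, which is what the later use x := rrotate(gpx) in the insertion slide suggests.

```cpp
#include <cassert>
#include <vector>

struct Node { int left, right, parent; };
typedef std::vector<Node> Tree;               // index 0 plays the role of null

// Rotate right at x: x's left child y moves up and x becomes y's right
// child. Returns y, the new root of the subtree.
int rrotate(Tree& v, int x) {
    int y = v[x].left, p = v[x].parent;
    assert(y != 0);
    v[x].left = v[y].right;                   // y's right subtree moves under x
    if (v[y].right != 0) v[v[y].right].parent = x;
    v[y].right = x; v[x].parent = y; v[y].parent = p;
    if (p != 0) { if (v[p].left == x) v[p].left = y; else v[p].right = y; }
    return y;
}

// Rotate left at y: y's right child x moves up and y becomes x's left
// child. Returns x, the new root of the subtree.
int lrotate(Tree& v, int y) {
    int x = v[y].right, p = v[y].parent;
    assert(x != 0);
    v[y].right = v[x].left;                   // x's left subtree moves under y
    if (v[x].left != 0) v[v[x].left].parent = y;
    v[x].left = y; v[y].parent = x; v[x].parent = p;
    if (p != 0) { if (v[p].left == y) v[p].left = x; else v[p].right = x; }
    return x;
}
```

Both rotations preserve symmetric order and touch a constant number of links; applying lrotate to the result of rrotate restores the original shape.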
12
Insertion in a Balanced Binary Tree
[Figure: an insertion followed by the rebalancing steps promote(m) and rrotate(n).]
13
Implementation of Insertion Operation
  • procedure insert(item i, bst t);
  •   item x, gpx;
  •   left(i), right(i) := null; x := t;
  •   do key(i) < key(x) and left(x) ≠ null → x := left(x)
  •    | key(i) > key(x) and right(x) ≠ null → x := right(x)
  •   od;
  •   if key(i) < key(x) → left(x) := i
  •    | key(i) > key(x) → right(x) := i
  •   fi;
  •   p(i) := x; x := i;
  •   do p(x) ≠ null and p(p(x)) ≠ null and rank(x) = rank(p(p(x))) →
  •     gpx := p(p(x));
  •     if rank(left(gpx)) = rank(right(gpx)) →
  •         rank(gpx) := rank(gpx) + 1; x := gpx
  •      | rank(left(gpx)) ≠ rank(right(gpx)) →
  •         if x = left(left(gpx)) → x := rrotate(gpx)
  •          | x = right(right(gpx)) → x := lrotate(gpx)
  •          | x = left(right(gpx)) → x := rrotate(p(x)); x := lrotate(p(x))
  •          | x = right(left(gpx)) → x := lrotate(p(x)); x := rrotate(p(x))

14
Self-Adjusting Binary Trees
  • By Theorem 5.1, a sequence of m dynamic tree
    operations requires O(m log n) path set
    operations. If path sets are implemented with
    balanced binary search trees, each operation
    takes O(log n) time, giving O(m (log n)^2) time
    for m dynamic tree operations. This can be
    improved with self-adjusting binary search trees.
  • By restructuring a binary search tree after each
    operation we can get an O(log n) running time per
    operation in an amortized sense, without the need
    for an explicit balance condition.
  • The restructuring operation is the splay, which
    moves one vertex x to the root of the tree by a
    sequence of rotations; this restructuring also
    moves other vertices closer to the root.
  • a descendant z of x moves at least ⌊depth(x)/2⌋
    steps closer to the root
  • an ancestor z of x moves at least ⌊depth(z)/2⌋ - 2
    steps closer to the root
  • a vertex z unrelated to x moves at least
    ⌊depth(y)/2⌋ - 2 steps closer to the root, where y
    is the nearest common ancestor of x and z (before
    the splay)

15
Illustration of Splay Steps
[Figure: splaystep(x) in three cases: x has no grandparent and is a left child; x has a grandparent and is a left-left grandchild, handled by rrotate(z), rrotate(y); x has a grandparent and is a right-left grandchild, handled by rrotate(y), lrotate(z).]
16
Implementation of Splay
  • sorted set function splay(item x);
  •   if x = null → return null fi;
  •   do p(x) ≠ null → splaystep(x) od;
  •   return x
  • end;
  • procedure splaystep(item x);
  •   item y, z;
  •   if p(x) = null → return fi;
  •   y := p(x);
  •   if p(y) = null and x = left(y) → rrotate(y); return
  •    | p(y) = null and x = right(y) → lrotate(y); return
  •   fi;
  •   z := p(y);
  •   if x = left(left(z)) → rrotate(z); rrotate(y)
  •    | x = right(right(z)) → lrotate(z); lrotate(y)
  •    | x = left(right(z)) → rrotate(y); lrotate(z)
  •    | x = right(left(z)) → lrotate(y); rrotate(z)
  •   fi
  • end;

last step of splay
each moves descendants of x up 1
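
The splay and splaystep procedures translate fairly directly into C++. The sketch below assumes integer items with index 0 as null and rotations that return the new subtree root; the Node and Tree names are illustrative, and the rotations are repeated here only so the block stands alone.

```cpp
#include <cassert>
#include <vector>

struct Node { int left, right, parent; };
typedef std::vector<Node> Tree;               // index 0 plays the role of null

// Rotate right at x: x's left child moves up. Returns the new subtree root.
int rrotate(Tree& v, int x) {
    int y = v[x].left, p = v[x].parent;
    assert(y != 0);
    v[x].left = v[y].right;
    if (v[y].right != 0) v[v[y].right].parent = x;
    v[y].right = x; v[x].parent = y; v[y].parent = p;
    if (p != 0) { if (v[p].left == x) v[p].left = y; else v[p].right = y; }
    return y;
}

// Rotate left at y: y's right child moves up. Returns the new subtree root.
int lrotate(Tree& v, int y) {
    int x = v[y].right, p = v[y].parent;
    assert(x != 0);
    v[y].right = v[x].left;
    if (v[x].left != 0) v[v[x].left].parent = y;
    v[x].left = y; v[y].parent = x; v[x].parent = p;
    if (p != 0) { if (v[p].left == y) v[p].left = x; else v[p].right = x; }
    return x;
}

// One splay step: moves x up one level (no grandparent) or two levels.
void splaystep(Tree& v, int x) {
    int y = v[x].parent;
    if (y == 0) return;
    int z = v[y].parent;
    if (z == 0) {                             // last step of a splay
        if (x == v[y].left) rrotate(v, y); else lrotate(v, y);
    } else if (y == v[z].left) {
        if (x == v[y].left) { rrotate(v, z); rrotate(v, y); }  // left(left(z))
        else                { lrotate(v, y); rrotate(v, z); }  // right(left(z))
    } else {
        if (x == v[y].right) { lrotate(v, z); lrotate(v, y); } // right(right(z))
        else                 { rrotate(v, y); lrotate(v, z); } // left(right(z))
    }
}

// Move x to the root of its tree; returns x (or 0 if x is null).
int splay(Tree& v, int x) {
    if (x == 0) return 0;
    while (v[x].parent != 0) splaystep(v, x);
    return x;
}
```

Splaying the deepest node of a four-node left path, for instance, takes one double step and one single step and leaves that node at the root with the rest of the path hanging from its right child.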
17
Implementing Self-Adjusting BSTs
  • item function access(keytype k, bst t);
  •   if t = null → return null fi;
  •   do k < key(t) and left(t) ≠ null → t := left(t)
  •    | k > key(t) and right(t) ≠ null → t := right(t)
  •   od;
  •   t := splay(t);
  •   if k = key(t) → return t
  •    | k ≠ key(t) → return null
  •   fi
  • end;
  • bst, bst function split(item i, bst t);
  •   bst t1, t2;
  •   splay(i);
  •   t1, t2 := left(i), right(i); p(t1), p(t2) := null;
  •   left(i), right(i) := null;
  •   return t1, t2
  • end;

time bounded by number of splay steps
ditto
18
  • procedure insert(item i, bst t);
  •   item x; x := t;
  •   do key(i) < key(x) and left(x) ≠ null → x := left(x)
  •    | key(i) > key(x) and right(x) ≠ null → x := right(x)
  •   od;
  •   if key(i) < key(x) → left(x) := i
  •    | key(i) > key(x) → right(x) := i
  •   fi;
  •   p(i) := x;
  •   splay(i)
  • end;

time bounded by number of splay steps
19
  • procedure delete(item i, bst t);
  •   item j;
  •   if left(i) ≠ null and right(i) ≠ null →
  •     j := left(i);
  •     do right(j) ≠ null → j := right(j) od;
  •     swapplaces(i,j)
  •   fi;
  •   if left(i) = null → left(i) ↔ right(i) fi;
  •   p(left(i)) := p(i);
  •   if i = left(p(i)) → left(p(i)) := left(i)
  •    | i = right(p(i)) → right(p(i)) := left(i)
  •   fi;
  •   splay(p(i));
  •   left(i), right(i), p(i) := null
  • end;

time bounded by number of splay steps
20
Analysis of Self-Adjusting BSTs
  • Objective is to show that a sequence of m
    operations, on a collection of trees with a total
    of n vertices takes O(m log n) time.
  • We use a credit scheme to account for running
    time.
  • all operations but join include a splay, so we
    can account for their running time by bounding
    the time for all the splays
  • we allocate up to C lg n credits for each splay
    and each join (C to be determined)
  • time for splay is proportional to number of splay
    steps, so we can account for running time of
    splay by spending one credit for each splay
    step
  • credits not needed to pay for performing an
    operation are retained for use in later steps
  • To ensure there are enough credits on hand to pay
    for later operations, we maintain the following
    credit invariant.
  • for a vertex x, keep rank(x) credits where
    rank(x) = ⌊lg(number of descendants of x)⌋
  • Note that balanced trees need fewer credits than
    unbalanced trees, so splay operations release
    credits that can be used to pay for splay
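
To see how many credits the invariant demands for different shapes, the sketch below sums rank(x) over all vertices of a tree. The names floorLg, subtreeSize and totalCredits are hypothetical helpers, and a vertex is counted among its own descendants, as in the basis of Lemma 4.1.

```cpp
#include <cassert>
#include <vector>

struct Node { int left, right; };             // index 0 plays the role of null

// Floor of the base-2 logarithm of n, for n >= 1.
int floorLg(int n) {
    assert(n >= 1);
    int r = 0;
    while (n > 1) { n /= 2; ++r; }
    return r;
}

// Return the number of descendants of x (counting x itself) and add
// rank(x) = floorLg(descendants) for every vertex in the subtree to credits.
int subtreeSize(const std::vector<Node>& v, int x, int& credits) {
    if (x == 0) return 0;
    int size = 1 + subtreeSize(v, v[x].left, credits)
                 + subtreeSize(v, v[x].right, credits);
    credits += floorLg(size);
    return size;
}

// Total credits the invariant requires for the tree rooted at root.
int totalCredits(const std::vector<Node>& v, int root) {
    int credits = 0;
    subtreeSize(v, root, credits);
    return credits;
}
```

For seven vertices, a path needs 2+2+2+2+1+1+0 = 10 credits while a perfectly balanced tree needs 2+1+1 = 4, so restructuring a path into a more balanced shape frees credits.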

21
  • Lemma 4.2. Splaying a tree with root v at a node
    u while maintaining the credit invariant requires
    at most 3(rank(v) - rank(u)) + 1 new credits.
  • Proof. The credits are divided among the
    different splay steps. A splay step at node x
    with parent y and grandparent z is allocated
    3(rank(z) - rank(x)) credits. A splay step at a
    node x with a parent y but no grandparent is
    allocated 3(rank(y) - rank(x)) + 1 credits. Let
    rank and rank' be the rank functions before and
    after the step.
  • Case 1. x has no grandparent. This is the last
    step, and the extra credit pays for it. The
    number of additional credits needed to maintain
    the invariant is
  • (rank'(x) - rank(x)) + (rank'(y) - rank(y))
    = rank'(y) - rank(x) ≤ rank(y) - rank(x)
  • which is one third of the remaining
    3(rank(y) - rank(x)) credits.

22
  • Case 2. x = left(left(z)) or x = right(right(z)).
    If rank(z) = rank(x) = k we get no new credits
    for this step, but rank'(z) < k, so maintaining
    the invariant frees up at least one credit, which
    pays for the step. If rank(z) > rank(x), the
    number of credits needed to maintain the
    invariant is
  • (rank'(x) - rank(x)) + (rank'(y) - rank(y))
    + (rank'(z) - rank(z)) = rank'(y) + rank'(z)
    - rank(x) - rank(y) ≤ 2(rank(z) - rank(x)) <
    3(rank(z) - rank(x)), releasing at least one extra
    credit to pay for the step.
  • Case 3. x = left(right(z)) or x = right(left(z)).
    If rank(z) = rank(x) = k we get no new credits
    for this step, but either rank'(z) < k or
    rank'(y) < k, so maintaining the invariant frees
    up at least one credit, which pays for the step.
    If rank(z) > rank(x), the number of credits
    needed to maintain the invariant is
  • (rank'(x) - rank(x)) + (rank'(y) - rank(y))
    + (rank'(z) - rank(z)) = rank'(y) + rank'(z) -
    rank(x) - rank(y)
  • ≤ 2(rank(z) - rank(x)) < 3(rank(z) - rank(x)),
  • releasing at least one extra credit to pay for
    the step. □
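
The per-step allocations add up to the bound in the lemma because they telescope. As a worked step, under notation introduced only here: let the splay at u consist of s steps, let r_0 be the rank of u before the splay and r_j its rank after step j. After a step, x has exactly the descendants that z (or y, in a step with no grandparent) had before it, so step j is allocated 3(r_j - r_{j-1}) credits, plus one if it is a final step with no grandparent.

```latex
\begin{align*}
\sum_{j=1}^{s} 3\,(r_j - r_{j-1}) + 1
  &= 3\,(r_s - r_0) + 1 \\
  &= 3\,(\operatorname{rank}(v) - \operatorname{rank}(u)) + 1
\end{align*}
```

The last equality holds because after the splay u is the root and has the same descendants that v had before it.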

23
  • By the lemma, each splay takes at most 3⌊lg n⌋ + 1
    credits. The number of credits needed for an
    insert is this number plus the number of new
    credits needed to maintain the credit invariant,
    after the new item is inserted but before the
    splay is done. The only nodes whose ranks can
    increase are those on the path from the root to
    the inserted node that have exactly 2^k - 1
    descendants before the operation (where k is in
    0..⌊lg n⌋). There can be at most ⌊lg n⌋ + 1 of
    these, so the total number of credits required
    for an insert is at most 4⌊lg n⌋ + 2.
  • The join operation requires at most ⌊lg n⌋
    credits.
  • All other operations require no credits beyond
    those used by the splay.
  • Theorem 4.1. The total time required for a
    sequence of m sorted set operations on n
    vertices, using self-adjusting binary search
    trees, is O(m log n), where n is the number of
    insert and join operations.