Heapsort Algorithm - PowerPoint PPT Presentation

Provided by: ValuedGate2244
Learn more at: http://cs.gmu.edu

1
Heapsort Algorithm
  • CS 583
  • Analysis of Algorithms

2
Outline
  • Sorting Problem
  • Heaps
  • Definition
  • Maintaining heap property
  • Building a heap
  • Heapsort Algorithm

3
Sorting Problem
  • Sorting is usually performed not on isolated
    values, but on records.
  • Each record contains a key, which is the value to
    be sorted.
  • The remainder is called satellite data.
  • When a sorting algorithm permutes the keys, it
    must permute the satellite data as well.
  • If the satellite data is large for each record,
    we often permute pointers to records.
  • This level of detail is usually irrelevant in the
    study of algorithms, but is important when
    converting an algorithm to a program.
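
As an illustration of permuting pointers rather than records, here is a minimal Python sketch. The record layout (a key followed by a satellite string) and the sample values are assumptions made up for this example; list positions stand in for pointers.

```python
# Hypothetical records: (key, satellite data). The values are made up.
records = [
    (42, "report-C"),
    (7, "report-A"),
    (19, "report-B"),
]

# Sort the positions by key instead of moving the records themselves.
order = sorted(range(len(records)), key=lambda idx: records[idx][0])

# Read the records back in key order through the index permutation.
in_key_order = [records[idx] for idx in order]
```

The `records` list is left untouched; only the small `order` list is permuted, which is the point when the satellite data is large.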

4
Sorting Problem Importance
  • Sorting is arguably the most fundamental problem
    in the study of algorithms, for the following
    reasons:
  • The need to sort information is often a key part
    of an application. For example, sorting
    financial reports by security ID.
  • Algorithms often use sorting as a key subroutine.
    For example, in order to match a security against
    benchmarks, the latter set needs to be sorted by
    some key elements.
  • There is a wide variety of sorting algorithms,
    and they use a rich set of techniques.

5
Heaps
  • The heapsort algorithm sorts in place, and its running
    time is O(n lg n).
  • It combines the better attributes of insertion
    sort (sorting in place) and merge sort (O(n lg n) running time).
  • It is based on a data structure called a heap.
  • The (binary) heap data structure is an array
    object that can be viewed as a nearly complete
    binary tree.
  • An array A that represents a heap is an object
    with two attributes:
  • length[A], which is the number of elements in the array, and
  • heap-size[A], the number of elements in the heap
    stored within the array A.

6
Heaps Example
  • A = [10, 8, 6, 5, 7, 3, 2]
  • Viewed as a binary tree, level by level:
  • 10
  • 8 6
  • 5 7 3 2
  • The root of the tree is A[1]. The children of a node
    i are determined as follows:
  • Left(i)
  •     return 2i
  • Right(i)
  •     return 2i + 1

7
Heaps Example (cont.)
  • The above is proven by induction on the node index.
  • Base case: the root's left child is 2 = 2·1.
  • Inductive step: assume Left(n) = 2n, so that Right(n) = 2n + 1.
  • The left child of node n + 1 immediately follows the
    right child of node n: Left(n + 1) = (2n + 1) + 1
    = 2(n + 1).
  • The parent of a node i is calculated from i = 2p
    or i = 2p + 1, where p is the parent node. Hence
  • Parent(i)
  •     return floor(i/2)
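
The index arithmetic can be sketched in Python as follows. To keep the 1-based numbering used on the slides, this sketch assumes that `A[0]` is an unused placeholder; the lowercase function names are illustrative.

```python
def left(i):
    """Index of the left child of node i (1-based)."""
    return 2 * i


def right(i):
    """Index of the right child of node i (1-based)."""
    return 2 * i + 1


def parent(i):
    """Index of the parent of node i (1-based); integer division is the floor."""
    return i // 2


# A[0] is an unused placeholder so that the root sits at index 1.
A = [None, 10, 8, 6, 5, 7, 3, 2]
children_of_root = (A[left(1)], A[right(1)])   # (8, 6)
children_of_8 = (A[left(2)], A[right(2)])      # (5, 7)
```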

8
Max-Heaps
  • In a max-heap, for every node i other than the
    root:
  • A[Parent(i)] ≥ A[i]
  • For the heapsort algorithm, we use max-heaps.
  • The height of the heap is the number of edges on the
    longest path from the root to a leaf, and it is
    Θ(lg n) since the heap is a nearly complete binary tree.
  • We will consider the following basic procedures
    on the heap 
  • Max-Heapify to maintain the max-heap property.
  • Build-Max-Heap to produce a max-heap from an
    unordered input array.
  • Heapsort to sort an array in place.
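
A small Python sketch of the max-heap property and the heap height may help here. It assumes 1-based indexing with `A[0]` unused; the helper names `is_max_heap` and `heap_height` are illustrative.

```python
def is_max_heap(A, heap_size):
    """Check A[Parent(i)] >= A[i] for every non-root node i (1-based, A[0] unused)."""
    return all(A[i // 2] >= A[i] for i in range(2, heap_size + 1))


def heap_height(n):
    """Height floor(lg n) of an n-element heap, for n >= 1."""
    return n.bit_length() - 1


example = [None, 10, 8, 6, 5, 7, 3, 2]
ok = is_max_heap(example, 7)              # True
broken = is_max_heap([None, 1, 2, 3], 3)  # False: the root is smaller than its children
```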

9
Maintaining the Heap Property
  • The Max-Heapify procedure takes an array A and
    an index i into the array.
  • It is assumed that the subtrees rooted at Left(i) and
    Right(i) are already max-heaps, but A[i] may be smaller
    than its children.
  • The procedure lets the value of A[i] "float down"
    in the max-heap so that the subtree rooted at
    index i becomes a max-heap.

10
Max-Heapify Algorithm
  • Max-Heapify(A, i)
  • l = Left(i)
  • r = Right(i)
  • if l ≤ heap-size[A] and A[l] > A[i]
  •     largest = l
  • else
  •     largest = i
  • if r ≤ heap-size[A] and A[r] > A[largest]
  •     largest = r
  • if largest ≠ i
  •     exchange A[i] with A[largest]
  •     Max-Heapify(A, largest)
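
The procedure can be sketched in Python as follows. Two choices here are assumptions of this sketch, not part of the slides: the heap size is passed as a parameter instead of being stored as an attribute of A, and `A[0]` is an unused placeholder to keep the indexing 1-based.

```python
def max_heapify(A, i, heap_size):
    """Let A[i] float down until the subtree rooted at i is a max-heap.

    Assumes 1-based indexing (A[0] unused) and that the subtrees rooted
    at 2i and 2i + 1 are already max-heaps.
    """
    l, r = 2 * i, 2 * i + 1
    largest = l if l <= heap_size and A[l] > A[i] else i
    if r <= heap_size and A[r] > A[largest]:
        largest = r
    if largest != i:
        A[i], A[largest] = A[largest], A[i]  # exchange A[i] with A[largest]
        max_heapify(A, largest, heap_size)


# Only the root violates the max-heap property in this example.
A = [None, 4, 10, 6, 5, 7, 3, 2]
max_heapify(A, 1, 7)
# A is now [None, 10, 7, 6, 5, 4, 3, 2]: the 4 moved down two levels.
```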

11
Max-Heapify Analysis
  • It takes Θ(1) to find the largest of A[i], A[Left(i)]
    and A[Right(i)], plus the time to run the procedure
    recursively on a subtree of at most 2n/3 elements.
    (This is the maximum size of a child subtree. It occurs
    when the last row of the tree is exactly half full.)
  • Assume there are n nodes in a tree whose levels
    0, ..., x-1 are full and whose last level x is half
    full. This means
  • $n = 1 + 2 + \dots + 2^{x-1} + 2^x/2$
  • $2^x - 1 + 2^x/2 = n$
  • $2^{x-1} = a \Rightarrow 2a + a = n + 1 \Rightarrow$
  • $2^{x-1} = (n+1)/3$

12
Max-Heapify Analysis (cont.)
  • Max subtree size = (half of the elements through level
    x-1, excluding the root) + (elements at the last level)
  • $= (2^x - 2)/2 + 2^x/2 = 2^{x-1} - 1 + 2^{x-1} = 2(n+1)/3 - 1 = 2n/3 - 1/3 < 2n/3$,
    using $2^{x-1} = (n+1)/3$.
  • Therefore the running time of Max-Heapify is
    described by the following recurrence:
  • $T(n) \le T(2n/3) + \Theta(1)$
  • According to the master theorem with a = 1, b = 3/2 and
    f(n) = Θ(1), the recurrence $T(n) = T(2n/3) + \Theta(1)$
    has the solution $T(n) = \Theta(\lg n)$.
  • Since T(n) describes the worst case, the running
    time of Max-Heapify is O(lg n).
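
The 2n/3 bound on the size of a child subtree can be checked numerically. The following Python sketch counts the nodes of a subtree level by level; the helper name `subtree_size` is illustrative and 1-based indices are assumed.

```python
def subtree_size(root, n):
    """Number of nodes in the subtree rooted at `root` of an n-node heap (1-based)."""
    size, first, width = 0, root, 1
    while first <= n:
        # At this level the subtree occupies indices first .. first + width - 1,
        # cut off at n.
        size += min(width, n - first + 1)
        first, width = 2 * first, 2 * width
    return size


# Half-full last row: n = 11 = 1 + 2 + 4 + 8/2, so the left subtree has 7 nodes.
left_size = subtree_size(2, 11)   # 7, and 7 <= 2 * 11 / 3
```

For n = 11 the left subtree holds 7 of the 11 nodes, just under 2n/3 ≈ 7.33.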
13
Building a Heap
  • We can use the procedure Max-Heapify in a
    bottom-up manner to convert the whole array
    A[1..n] into a max-heap.
  • Note that the elements A[floor(n/2)+1 .. n] are
    leaves. The last element that is not a leaf is the
    parent of the last node, at index floor(n/2).
  • The procedure Build-Max-Heap goes through all
    non-leaf nodes, from floor(n/2) down to 1, and runs
    Max-Heapify on each of them.

14
Build-Max-Heap Algorithm
  • Build-Max-Heap(A, n)
  • 1 heap-size[A] = n
  • 2 for i = floor(n/2) downto 1
  • 3     Max-Heapify(A, i)
  • Invariant:
  • At the start of each iteration of lines 2-3, each node
    i+1, ..., n is the root of a max-heap.
  • Proof.
  • Initialization: i = floor(n/2). Each of the nodes
    floor(n/2)+1, ..., n is a leaf and hence the root
    of a trivial max-heap.

15
Build-Max-Heap Correctness
  • Maintenance: the children of node i are numbered
    higher than i, and by the loop invariant are
    assumed to be roots of max-heaps.
  • This is the condition required by Max-Heapify.
  • Moreover, Max-Heapify preserves the property
    that i+1, ..., n are roots of max-heaps, and makes
    node i the root of a max-heap as well.
  • Decrementing i by 1 re-establishes the loop invariant
    for the next iteration.
  • Termination: i = 0, hence each node 1, 2, ..., n is
    the root of a max-heap.
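
A minimal Python sketch of Build-Max-Heap follows. It assumes 1-based indexing with `A[0]` unused and passes the heap size as a parameter rather than storing it on A; the sample array is made up for illustration.

```python
def max_heapify(A, i, heap_size):
    """Float A[i] down so that the subtree rooted at i becomes a max-heap."""
    l, r = 2 * i, 2 * i + 1
    largest = l if l <= heap_size and A[l] > A[i] else i
    if r <= heap_size and A[r] > A[largest]:
        largest = r
    if largest != i:
        A[i], A[largest] = A[largest], A[i]
        max_heapify(A, largest, heap_size)


def build_max_heap(A, n):
    """Turn A[1..n] into a max-heap, bottom-up."""
    # Nodes n//2 + 1 .. n are leaves, so start at the last non-leaf node.
    for i in range(n // 2, 0, -1):
        max_heapify(A, i, n)


A = [None, 4, 1, 3, 2, 16, 9, 10, 14, 8, 7]
build_max_heap(A, 10)
# A is now [None, 16, 14, 10, 8, 7, 9, 3, 2, 4, 1]
```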

16
Build-Max-Heap Performance
  • Each call to Max-Heapify takes O(lg n) time and
    there are O(n) such calls.
  • Therefore the running time of Build-Max-Heap is
    O(n lg n).
  • To derive a tighter bound, we observe that the
    running time of Max-Heapify depends on the node's
    height.
  • An n-element heap has height floor(lg n). There
    are at most ceil(n/2^(h+1)) nodes of any height
    h. To see this for a tree whose levels are all full,
    assume the nodes of height h are at depth x of the
    tree, so that the tree has height x + h. Then we have

17
Build-Max-Heap Performance (cont.)
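  • $1 + 2 + \dots + 2^x + \dots + 2^{x+h} = n \Rightarrow 2^{x+h+1} = n + 1 \Rightarrow 2^x = (n+1)/2^{h+1} = \lceil n/2^{h+1} \rceil$
  • The time required by Max-Heapify when called on a
    node of height h is O(h). Hence the total cost is
  • $\sum_{h=0}^{\lfloor \lg n \rfloor} \lceil n/2^{h+1} \rceil \, O(h) = O\left(n \sum_{h=0}^{\lfloor \lg n \rfloor} h/2^h\right)$
  • By formula (A.8), for |x| < 1:
    $\sum_{k=0}^{\infty} k x^k = x/(1-x)^2$
  • With x = 1/2:
    $\sum_{h=0}^{\infty} h/2^h = \frac{1/2}{(1 - 1/2)^2} = 2$
  • Thus, the running time of Build-Max-Heap can be bounded:
  • $O\left(n \sum_{h=0}^{\lfloor \lg n \rfloor} h/2^h\right) = O\left(n \sum_{h=0}^{\infty} h/2^h\right) = O(n)$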
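
The linear bound can be sanity-checked in Python. This sketch sums the heights of all nodes in an n-element heap, which bounds the number of exchanges Build-Max-Heap can perform; the helper names are illustrative and 1-based indices are assumed.

```python
def node_height(i, n):
    """Height of node i in an n-element heap: follow left children down to a leaf."""
    h = 0
    while 2 * i <= n:
        i, h = 2 * i, h + 1
    return h


def total_height(n):
    """Sum of the heights of all nodes in an n-element heap."""
    return sum(node_height(i, n) for i in range(1, n + 1))


# Partial sums of h / 2^h approach the limit 2 used in the derivation.
partial = sum(h / 2**h for h in range(60))
```

For every n tried in the accompanying check, `total_height(n)` stays at or below n, consistent with the O(n) bound.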
18
The Heapsort Algorithm
  • The heapsort algorithm first uses Build-Max-Heap on
    A[1..n]. Since the maximum element of the array
    is at A[1], it can be put into its correct position
    by exchanging it with A[n]. Now A[1..(n-1)] can be
    made a max-heap again.
  • Heapsort(A, n)
  • 1 Build-Max-Heap(A, n)
  • 2 for i = n downto 2
  • 3     exchange A[1] with A[i]
  • 4     heap-size[A] = heap-size[A] - 1
  • 5     Max-Heapify(A, 1)
  • Step 1 takes O(n) time. Loop 2 is repeated (n-1)
    times, with step 5 taking the most time, O(lg n).
    Hence the running time of heapsort is
    O(n) + O(n lg n) = O(n lg n).
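
Putting the pieces together, heapsort can be sketched in Python as below. This is an illustration under stated assumptions rather than a definitive implementation: it copies the input into a 1-based list (`A[0]` unused) and returns a new list, so unlike the slide version it is not in place from the caller's point of view, and Max-Heapify is written as a loop instead of a recursion.

```python
def max_heapify(A, i, heap_size):
    """Iterative Max-Heapify on a 1-based list (A[0] unused)."""
    while True:
        l, r, largest = 2 * i, 2 * i + 1, i
        if l <= heap_size and A[l] > A[largest]:
            largest = l
        if r <= heap_size and A[r] > A[largest]:
            largest = r
        if largest == i:
            return
        A[i], A[largest] = A[largest], A[i]
        i = largest


def heapsort(items):
    """Return the elements of `items` in ascending order."""
    A = [None] + list(items)          # shift to 1-based indexing
    n = len(items)
    for i in range(n // 2, 0, -1):    # step 1: Build-Max-Heap
        max_heapify(A, i, n)
    heap_size = n
    for i in range(n, 1, -1):         # step 2: for i = n downto 2
        A[1], A[i] = A[i], A[1]       # step 3: move the maximum to its final slot
        heap_size -= 1                # step 4: shrink the heap
        max_heapify(A, 1, heap_size)  # step 5: restore the max-heap property
    return A[1:]


result = heapsort([4, 1, 3, 2, 16, 9, 10, 14, 8, 7])
# result == [1, 2, 3, 4, 7, 8, 9, 10, 14, 16]
```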