When is O(n lg n) Really O(n lg n)? A Comparison of the Quicksort and Heapsort Algorithms



1
When is O(n lg n) Really O(n lg n)? A
Comparison of the Quicksort and Heapsort
Algorithms
  • Gerald Kruse
  • Juniata College
  • kruse_at_juniata.edu
  • Huntingdon, PA

2
Outline
  • Analyzing Sorting Algorithms
  • Quicksort
  • Heapsort
  • Experimental Results
  • Observations (this is a fun, open-ended student
    project)

3
How Fast is my Sorting Algorithm? A nice blend
of Math and CS
  • The Sorting Problem, from Cormen et al.
    Input: A sequence of n numbers, $(a_1, a_2, \ldots, a_n)$
  • Output: A permutation (reordering)
    $(a'_1, a'_2, \ldots, a'_n)$ of the input sequence
    such that $a'_1 \le a'_2 \le \cdots \le a'_n$.
    Note: This definition can be expanded to
    include sorting primitive data such as characters
    or strings, alpha-numeric data, and data records
    with key values.
  • Sorting algorithms are analyzed using many
    different metrics: expected run-time, memory
    usage, communication bandwidth, implementation
    complexity, ...
  • Expected running time is given using Big-O
    notation
  • $O(g(n)) = \{\, f(n) : \exists \text{ pos. constants } c \text{ and } n_0 \text{ s.t. } 0 \le f(n) \le c\,g(n) \ \forall\, n \ge n_0 \,\}$
  • While O-notation describes an asymptotic upper
    bound on a function, it is frequently used to
    describe asymptotically tight bounds.
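As a worked illustration of the definition, consider a cost function of the kind a sorting analysis might produce. The function f below is an assumed example, not one taken from the slides; the point is only to show how the constants c and n_0 are chosen.

```latex
\begin{align*}
f(n) &= 3n \lg n + 5n \\
     &\le 3n \lg n + 5n \lg n && \text{since } \lg n \ge 1 \text{ for } n \ge 2 \\
     &= 8\, n \lg n ,
\end{align*}
% so 0 \le f(n) \le c\, g(n) holds with g(n) = n \lg n, c = 8, n_0 = 2,
% and therefore f(n) \in O(n \lg n).
```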

4
Algorithm analysis also requires a model of the
implementation technology to be used
  • The most commonly used model is RAM, the
    Random-Access Machine.
  • This should NOT be confused with Random-Access
    Memory.
  • Each instruction requires an equal amount of
    processing time
  • Memory hierarchy (cache, virtual memory) is NOT
    modeled
  • The RAM model is relatively straightforward and
    usually an excellent predictor of performance on
    actual machines.

5
Quicksort
  • Good partitioning means the partitions are
    usually equally sized
  • After a partition, the element partitioned around
    will be in the correct position
  • There are n compares per level, and lg n
    levels, resulting in an algorithm that should run
    proportionally to n lg n under the
    assumptions of the RAM model
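The points above can be sketched in Python. This is a minimal sketch that assumes a Lomuto-style partition around the last element; the slides do not say which partition scheme was used. The `counter` argument is a hypothetical helper added here to tally key comparisons.

```python
def partition(a, lo, hi, counter):
    """Partition a[lo..hi] around a[hi]; return the pivot's final index."""
    pivot = a[hi]
    i = lo - 1
    for j in range(lo, hi):
        counter[0] += 1              # one key comparison per element scanned
        if a[j] <= pivot:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[i + 1], a[hi] = a[hi], a[i + 1]
    return i + 1                     # the pivot is now in its sorted position


def quicksort(a, lo=0, hi=None, counter=None):
    """Sort a in place and return the number of key comparisons made."""
    if hi is None:
        hi = len(a) - 1
    if counter is None:
        counter = [0]
    if lo < hi:
        p = partition(a, lo, hi, counter)
        quicksort(a, lo, p - 1, counter)   # elements left of the pivot
        quicksort(a, p + 1, hi, counter)   # elements right of the pivot
    return counter[0]
```

Calling `quicksort(data)` on a list sorts it in place; the returned count can be compared against n lg n for randomly ordered input.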

6
Quicksort
  • Pathological data leads to bad or unbalanced
    partitions and the worst-case for Quicksort
  • The element partitioned around will be in sorted
    position
  • This data will be sorted in O(n^2) time, since
    there are still n compares per level, but now
    there are n - 1 levels.
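The worst-case count can be written as a recurrence. This is a sketch under the assumption that partitioning m elements costs m - 1 comparisons (the slides round this to n per level) and that every partition leaves one side empty; C(n) is a name introduced here for the comparison count.

```latex
\begin{align*}
C(1) &= 0, \\
C(n) &= C(n-1) + (n-1) \\
     &= \sum_{k=1}^{n-1} k \;=\; \frac{n(n-1)}{2} \;=\; O(n^2).
\end{align*}
```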

7
Heaps
  • A heap can be seen as a complete binary tree

In practice, heaps are usually implemented as
arrays.
[Figure: a heap drawn as a binary tree and stored as the array A = (16, 14, 10, 8, 7, 9, 3, 2, 4, 1)]
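The array layout can be sketched as index arithmetic. The helper names `parent`, `left`, and `right` follow the usual textbook convention and are assumptions here; a placeholder at index 0 keeps the indices 1-based, matching the slides.

```python
# 1-based heap stored in a Python list; index 0 is an unused placeholder.
A = [None, 16, 14, 10, 8, 7, 9, 3, 2, 4, 1]


def parent(i):
    return i // 2


def left(i):
    return 2 * i


def right(i):
    return 2 * i + 1


# Node 2 holds 14; its children sit at indices 4 and 5.
i = 2
children = (A[left(i)], A[right(i)])   # (8, 7)
```

No pointers are stored: the tree shape is implied entirely by the positions in the array.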
8
Heaps, continued
Heaps satisfy the heap property: $A[\mathrm{Parent}(i)] \ge A[i]$
for all nodes $i > 1$. In other words, the
value of a node is at most the value of its
parent. By the way, eBay uses a heap-like
data structure to track bids.
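The heap property can be checked directly on an array. The function name `is_max_heap` is hypothetical, and the sketch assumes Python's 0-based indexing, so the parent of index i is (i - 1) // 2 rather than i // 2 as in the 1-based notation of the slides.

```python
def is_max_heap(values):
    """Return True if every element is at most the value of its parent."""
    # Index 0 is the root and has no parent, so the check starts at index 1.
    return all(values[(i - 1) // 2] >= values[i]
               for i in range(1, len(values)))
```

For example, `is_max_heap([16, 14, 10, 8, 7, 9, 3, 2, 4, 1])` holds, while swapping the first two entries breaks the property at the root.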
9
Heapsort
Heapsort(A)
    BuildHeap(A)
    for i ← length(A) downto 2
        Swap(A[1], A[i])
        heap_size(A) ← heap_size(A) - 1
        Heapify(A, 1)

When the heap property is violated at just one
node (which has sub-trees which are valid heaps),
Heapify floats down the parent node to fix the
heap. Remembering the tree structure of the
heap, each Heapify call takes O(lg n) time.
Since there are n - 1 calls to Heapify,
Heapsort's expected execution time is O(n lg n),
just like Quicksort.
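The pseudocode can be sketched in runnable Python. Two choices here are assumptions rather than anything stated on the slides: indices are 0-based (children of i are 2i + 1 and 2i + 2), and `heapify` is written as a loop instead of a recursive call.

```python
def heapify(a, i, heap_size):
    """Float a[i] down until the subtree rooted at i is a max-heap."""
    while True:
        left, right = 2 * i + 1, 2 * i + 2
        largest = i
        if left < heap_size and a[left] > a[largest]:
            largest = left
        if right < heap_size and a[right] > a[largest]:
            largest = right
        if largest == i:
            return                      # heap property holds at i
        a[i], a[largest] = a[largest], a[i]
        i = largest                     # continue one level further down


def build_heap(a):
    # Leaves are already heaps, so start at the last internal node.
    for i in range(len(a) // 2 - 1, -1, -1):
        heapify(a, i, len(a))


def heapsort(a):
    """Sort a in place in ascending order and return it."""
    build_heap(a)
    for end in range(len(a) - 1, 0, -1):
        a[0], a[end] = a[end], a[0]     # move the current maximum to its final slot
        heapify(a, 0, end)              # the heap now occupies a[0:end]
    return a
```

Each pass of the loop in `heapify` descends one level of the tree, which is where the O(lg n) bound per call comes from.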
10
Counting Comparisons
11
Timing Results
12
  • Observations
  • Implementation
  • Run on Windows and Unix-based machines,
    implemented in C, C++, and Java, and based on
    pseudo-code from Cormen et al., Sedgewick, and
    Joyce et al.
  • Heapsort does not run in O(n lg n) time, even for
    the relatively small values of n tested
  • Quicksort does exhibit O(n lg n) behavior
  • Consider the memory access patterns:
    For very large n, we would expect a slowdown for ANY
    algorithm as the data no longer fits in
    memory.
    For the sizes n run here, the partitions
    in Quicksort consist of elements which are
    contiguous in memory, while floating down a
    Heap requires accessing elements which are not
    close in memory.
  • This is a fun exploration for students, appealing
    to those with an interest in the mathematics or
    computer science

13
  • Bibliography

T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein, Introduction to Algorithms, Second Edition, Cambridge, MA/London, England: The MIT Press/McGraw-Hill, 2003.
N. Dale, C. Weems, and D. T. Joyce, Object-Oriented Data Structures Using Java, Boston, MA: Jones and Bartlett, 2002.
M. T. Goodrich and R. Tamassia, Algorithm Design: Foundations, Analysis, and Internet Examples, New York: Wiley, 2001.
D. E. Knuth, The Art of Computer Programming, Volume 3 (Second Edition): Sorting and Searching, Redwood City, CA: Addison-Wesley-Longman, 1998.
C. C. McGeoch, "Analyzing algorithms by simulation: Variance reduction techniques and simulation speedups," ACM Computing Surveys, vol. 24, no. 2, pp. 195-212, 1992.
C. C. McGeoch, D. Precup, and P. R. Cohen, "How to find the Big-Oh of your data set (and how not to)," Advances in Intelligent Data Analysis, vol. 1280 of Lecture Notes in Computer Science, pp. 41-52, Springer-Verlag, 1997.
R. Sedgewick, Algorithms in C, Parts 1-4: Fundamentals, Data Structures, Sorting, Searching, Third Edition, Boston, MA: Addison-Wesley, 1997.