Title: When is O(n lg n) Really O(n lg n)? A Comparison of the Quicksort and Heapsort Algorithms
1. When is O(n lg n) Really O(n lg n)? A Comparison of the Quicksort and Heapsort Algorithms
- Gerald Kruse
- Juniata College
- kruse_at_juniata.edu
- Huntingdon, PA
2. Outline
- Analyzing Sorting Algorithms
- Quicksort
- Heapsort
- Experimental Results
- Observations (this is a fun, open-ended student project)
3. How Fast is my Sorting Algorithm? A nice blend of Math and CS
- The Sorting Problem, from Cormen et al.:
  Input: A sequence of n numbers $(a_1, a_2, \ldots, a_n)$.
  Output: A permutation (reordering) $(a'_1, a'_2, \ldots, a'_n)$ of the input sequence such that $a'_1 \le a'_2 \le \cdots \le a'_n$.
  Note: This definition can be expanded to include sorting primitive data such as characters or strings, alpha-numeric data, and data records with key values.
- Sorting algorithms are analyzed using many different metrics: expected run-time, memory usage, communication bandwidth, implementation complexity, ...
- Expected running time is given using Big-O notation:
  $O(g(n)) = \{\, f(n) : \exists \text{ positive constants } c \text{ and } n_0 \text{ such that } 0 \le f(n) \le c\,g(n) \ \ \forall\, n \ge n_0 \,\}$.
- While O-notation describes an asymptotic upper bound on a function, it is frequently used to describe asymptotically tight bounds.
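As a small illustration of the Big-O definition, the sketch below checks numerically that one cost function satisfies 0 ≤ f(n) ≤ c g(n) for all tested n ≥ n0. The function f(n) = 3 n lg n + 5n and the witnesses c = 8, n0 = 2 are assumptions chosen for this example; they do not come from the slides. A finite check like this is a sanity test, not a proof.

```python
import math

def f(n):
    # Hypothetical running-time function, chosen only for illustration.
    return 3 * n * math.log2(n) + 5 * n

def g(n):
    # The candidate bounding function n lg n.
    return n * math.log2(n)

C, N0 = 8, 2  # hand-picked witnesses for the constants c and n0

def bound_holds(limit=10_000):
    """Check 0 <= f(n) <= C * g(n) for every n in [N0, limit]."""
    return all(0 <= f(n) <= C * g(n) for n in range(N0, limit + 1))

print(bound_holds())  # True
```

The bound holds because lg n ≥ 1 once n ≥ 2, so 5n ≤ 5 n lg n. At n = 1 it fails, since g(1) = 0, which is why the definition allows a threshold n0.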
4. Algorithm analysis also requires a model of the implementation technology to be used
- The most commonly used model is RAM, the Random-Access Machine.
- This should NOT be confused with Random-Access Memory.
- Each instruction requires an equal amount of processing time
- Memory hierarchy (cache, virtual memory) is NOT modeled
- The RAM model is relatively straightforward and usually an excellent predictor of performance on actual machines.
5. Quicksort
- Good partitioning means the two partitions are usually about equal in size
- After a partition, the element partitioned around will be in its correct (sorted) position
- There are at most n compares per level and about lg n levels, so the algorithm should run in time proportional to n lg n under the assumptions of the RAM model
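The counting argument above can be tried out with a short sketch. It assumes a Lomuto-style partition around the last element of each range; the slides do not say which partition scheme was used, and the names `partition`, `quicksort`, and `counter` are introduced here for illustration only.

```python
import math
import random

def partition(a, lo, hi, counter):
    """Partition a[lo..hi] around a[hi]; return the pivot's final index."""
    pivot = a[hi]
    i = lo
    for j in range(lo, hi):
        counter[0] += 1              # one key comparison
        if a[j] <= pivot:
            a[i], a[j] = a[j], a[i]
            i += 1
    a[i], a[hi] = a[hi], a[i]        # pivot lands in its sorted position
    return i

def quicksort(a, lo=0, hi=None, counter=None):
    """Sort a in place and return the number of key comparisons made."""
    if counter is None:
        counter = [0]
    if hi is None:
        hi = len(a) - 1
    if lo < hi:
        p = partition(a, lo, hi, counter)
        quicksort(a, lo, p - 1, counter)
        quicksort(a, p + 1, hi, counter)
    return counter[0]

random.seed(0)                       # fixed seed so the run is repeatable
n = 4096
data = random.sample(range(n), n)    # a random permutation of 0..n-1
compares = quicksort(data)
print(compares / (n * math.log2(n)))
```

If the comparison count grows like n lg n, the printed ratio should stay near a small constant as n is varied.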
6. Quicksort
- Pathological data leads to bad, unbalanced partitions and the worst case for Quicksort
- The element partitioned around will still be in its sorted position
- This data will be sorted in O(n^2) time, since there are still at most n compares per level, but now there are n - 1 levels
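A minimal sketch of the worst case, assuming the pivot is always the last element of the range (an assumption; other pivot rules have other pathological inputs). Already-sorted input then produces one empty partition at every level. The function name `quicksort_compares` is hypothetical, and an explicit stack replaces recursion so that the n - 1 levels do not hit Python's recursion limit.

```python
def quicksort_compares(values):
    """Sort a copy of values with last-element pivots; return the compare count."""
    a = list(values)
    compares = 0
    stack = [(0, len(a) - 1)]        # ranges still waiting to be partitioned
    while stack:
        lo, hi = stack.pop()
        if lo >= hi:
            continue
        pivot, i = a[hi], lo
        for j in range(lo, hi):
            compares += 1            # one key comparison
            if a[j] <= pivot:
                a[i], a[j] = a[j], a[i]
                i += 1
        a[i], a[hi] = a[hi], a[i]    # pivot is now in sorted position
        stack.append((lo, i - 1))
        stack.append((i + 1, hi))
    return compares

n = 1000
print(quicksort_compares(range(n)), n * (n - 1) // 2)  # 499500 499500
```

On sorted input a range of m elements costs m - 1 compares and shrinks by only one element, so the total is (n - 1) + (n - 2) + ... + 1 = n(n - 1)/2.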
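7. Heaps
- A heap can be seen as a complete binary tree. In practice, heaps are usually implemented as arrays.
[Figure: a heap drawn as a binary tree with node values 16, 14, 10, 8, 7, 9, 3, 2, 4, 1, together with its array representation A]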
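The tree-to-array mapping can be sketched with simple index arithmetic. The sketch assumes 1-based indexing as in Cormen et al., and assumes the values listed on this slide are the array contents in index order; the helper names `parent`, `left`, `right`, and `children` are chosen for this example.

```python
# Index 0 is an unused placeholder so that positions are 1-based.
A = [None, 16, 14, 10, 8, 7, 9, 3, 2, 4, 1]
heap_size = len(A) - 1

def parent(i):
    return i // 2

def left(i):
    return 2 * i

def right(i):
    return 2 * i + 1

def children(i):
    """Values stored in the children of node i (zero, one, or two of them)."""
    return [A[c] for c in (left(i), right(i)) if c <= heap_size]

for i in range(1, heap_size + 1):
    print(i, A[i], children(i))
```

No pointers are stored: the tree shape is implied entirely by the positions in the array.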
8. Heaps, continued
Heaps satisfy the heap property: $A[\mathrm{Parent}(i)] \ge A[i]$ for all nodes $i > 1$. In other words, the value of a node is at most the value of its parent. By the way, eBay uses a heap-like data structure to track bids.
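The heap property can be checked directly from its definition. This is a minimal sketch assuming a 1-based array with an unused slot at index 0; the function name `is_max_heap` and the sample arrays are illustrative.

```python
def is_max_heap(a):
    """True if a[i // 2] >= a[i] for every node i > 1 (a[0] is unused)."""
    return all(a[i // 2] >= a[i] for i in range(2, len(a)))

heap = [None, 16, 14, 10, 8, 7, 9, 3, 2, 4, 1]
not_heap = [None, 16, 4, 10, 14, 7, 9, 3, 2, 8, 1]   # node 4 exceeds its parent

print(is_max_heap(heap))      # True
print(is_max_heap(not_heap))  # False
```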
9. Heapsort
Heapsort(A)
{
    BuildHeap(A)
    for (i = length(A) downto 2)
    {
        Swap(A[1], A[i])
        heap_size(A) -= 1
        Heapify(A, 1)
    }
}
When the heap property is violated at just one node (whose sub-trees are themselves valid heaps), Heapify floats the parent node down to fix the heap. Remembering the tree structure of the heap, each Heapify call takes O(lg n) time. Since there are n - 1 calls to Heapify in the loop (and BuildHeap adds only O(n)), Heapsort's expected execution time is O(n lg n), just like Quicksort.
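The pseudo-code and the description of Heapify can be sketched as runnable Python. This follows the Cormen et al. formulation with 1-based indices; it is an illustration, not the code used for the experiments, and passing `heap_size` as an explicit argument is a choice made here.

```python
def heapify(a, i, heap_size):
    """Float a[i] down until the subtree rooted at i is a max-heap (1-based)."""
    while True:
        l, r, largest = 2 * i, 2 * i + 1, i
        if l <= heap_size and a[l] > a[largest]:
            largest = l
        if r <= heap_size and a[r] > a[largest]:
            largest = r
        if largest == i:
            return                   # heap property holds at i
        a[i], a[largest] = a[largest], a[i]
        i = largest                  # continue one level down

def build_heap(a, heap_size):
    """Heapify every internal node, from the last one up to the root."""
    for i in range(heap_size // 2, 0, -1):
        heapify(a, i, heap_size)

def heapsort(items):
    """Return a sorted copy of items."""
    a = [None] + list(items)         # a[0] is an unused placeholder
    heap_size = len(items)
    build_heap(a, heap_size)
    for i in range(heap_size, 1, -1):
        a[1], a[i] = a[i], a[1]      # move the current maximum to the end
        heap_size -= 1
        heapify(a, 1, heap_size)     # repair the heap at the root
    return a[1:]

print(heapsort([4, 1, 3, 2, 16, 9, 10, 14, 8, 7]))
# [1, 2, 3, 4, 7, 8, 9, 10, 14, 16]
```

Each pass of the loop in `heapify` moves one level down the tree, which is where the O(lg n) cost per call comes from.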
10. Counting Comparisons
11. Timing Results
12.
- Implementation: run on Windows and Unix based machines, implemented in C, C++, and Java, and based on pseudo-code from Cormen et al., Sedgewick, and Joyce et al.
- Heapsort does not run in O(n lg n) time, even for the relatively small values of n tested
- Quicksort does exhibit O(n lg n) behavior
- Consider the memory access patterns: For very large n, we would expect a slowdown for ANY algorithm as the data no longer fits in memory. For the size n run here, the partitions in Quicksort consist of elements which are contiguous in memory, while floating down a Heap requires accessing elements which are not close in memory
- This is a fun exploration for students, appealing to those with an interest in mathematics or computer science
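The difference in memory access patterns can be made concrete by looking at the distance between consecutive array indices. The sketch below only does index arithmetic; it does not measure cache behavior. It assumes a float-down that always follows the left child (a real Heapify may go left or right), and the function names are hypothetical.

```python
def sift_down_path(n):
    """1-based indices visited when floating from the root to a leaf via left children."""
    i, path = 1, []
    while i <= n:
        path.append(i)
        i *= 2                       # left child of i
    return path

def strides(indices):
    """Distance between each pair of consecutive indices."""
    return [b - a for a, b in zip(indices, indices[1:])]

n = 1 << 20                          # about a million elements
heap_strides = strides(sift_down_path(n))
scan_strides = strides(range(8))     # a partition scan visits indices in order

print(heap_strides[-3:], scan_strides[:3])
# [131072, 262144, 524288] [1, 1, 1]
```

The jump doubles at every level of the heap, so the last few steps of a float-down land far apart in the array, while a partition scan moves to the adjacent element each time.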
13. References
- T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein, Introduction to Algorithms, Second Edition, Cambridge, MA/London, England: The MIT Press/McGraw-Hill, 2003.
- N. Dale, C. Weems, and D. T. Joyce, Object-Oriented Data Structures Using Java, Boston, MA: Jones and Bartlett, 2002.
- M. T. Goodrich and R. Tamassia, Algorithm Design: Foundations, Analysis, and Internet Examples, New York: Wiley, 2001.
- D. E. Knuth, The Art of Computer Programming, Volume 3 (Second Edition): Sorting and Searching, Redwood City, CA: Addison-Wesley-Longman, 1998.
- C. C. McGeoch, "Analyzing algorithms by simulation: Variance reduction techniques and simulation speedups," ACM Computing Surveys, vol. 24, no. 2, pp. 195-212, 1992.
- C. C. McGeoch, D. Precup, and P. R. Cohen, "How to find the Big-Oh of your data set (and how not to)," Advances in Intelligent Data Analysis, vol. 1280 of Lecture Notes in Computer Science, pp. 41-52, Springer-Verlag, 1997.
- R. Sedgewick, Algorithms in C, Parts 1-4: Fundamentals, Data Structures, Sorting, Searching, Third Edition, Boston, MA: Addison-Wesley, 1997.