Sorting Part II - PowerPoint PPT Presentation

Slides: 26
Provided by: Kristofer6

Transcript and Presenter's Notes

Title: Sorting Part II


1
Sorting Part II
  • CS 367 Introduction to Data Structures

2
Better Sorting
  • The problem with all previous examples is the
    O(n^2) performance
  • this may be acceptable for small data sets, but
    not large ones
  • Theoretically, O(n log n) is possible
  • see proof in Section 9.2 of the book

3
Heap Sort
  • Major problem with selection sort
  • it has to search entire back end of array on
    every search for next smallest item
  • what if we could make this search faster?
  • A heap always keeps the largest element at the
    top
  • it only takes O(log n) to remove the top
  • O(log n) is much better than O(n) search time of
    selection sort
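As a rough illustration of why a heap speeds up the "find the next item" step, the sketch below uses Java's standard `java.util.PriorityQueue` with a reversed comparator as a stand-in max-heap. This is an assumption made for the example: the slides build their own heap inside the array rather than using a library class. The names `HeapTopDemo` and `drainLargestFirst` are made up here.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

final class HeapTopDemo {
    // Returns the items in the order a max-heap hands them out: largest first.
    static List<Character> drainLargestFirst(List<Character> items) {
        // reverseOrder() turns Java's default min-heap into a max-heap
        PriorityQueue<Character> heap = new PriorityQueue<>(Collections.reverseOrder());
        heap.addAll(items);
        List<Character> out = new ArrayList<>();
        while (!heap.isEmpty()) {
            out.add(heap.poll());   // removing the top costs O(log n)
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [Z, X, T, N, M, L, J]
        System.out.println(drainLargestFirst(List.of('M', 'J', 'Z', 'T', 'X', 'N', 'L')));
    }
}
```

Each `poll()` replaces the O(n) scan that selection sort performs to find its next item.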

4
Heap Sort
  • Basic procedure
  • build a heap
  • swap the root with the last element
  • rebuild the heap excluding the last element
  • the last element is where it is supposed to be
  • repeat until only one item left in the heap

5
Heap Sort - Conceptually
  • [Figure: the heap drawn as a tree, shown three times. Each time the
    largest element is removed from the top and placed at the front of
    a sorted queue, which grows from Z to X Z to T X Z.]
6
Heap Sort - Implementation
  • Array contents (indices 0 to 6) after each step
  • start (valid heap): Z X M T N J L
  • swap 0 and 6, rebuild: X T M L N J Z
  • swap 0 and 5, rebuild: T N M L J X Z
  • swap 0 and 4, rebuild: N L M J T X Z
  • swap 0 and 3, rebuild: M L J N T X Z
  • swap 0 and 2, rebuild: L J M N T X Z
  • swap 0 and 1, done: J L M N T X Z
7
Building the Heap
  • The heap will be built within the array
  • no extra data structures will be needed
  • Basic idea
  • start at the last non-terminal node
  • restore heap for tree rooted at this node
  • simply swap this node with its largest child if
    the child is larger
  • repeat this process for all non-terminal nodes
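The steps above can be sketched in Java roughly as follows. This is a minimal sketch, not the course's code: it assumes a `char[]` of letters as input, the class name `BuildHeapSketch` and the checker `isMaxHeap` are made up for the example, and the sift-down keeps following the node down the tree until the heap property holds.

```java
import java.util.Arrays;

final class BuildHeapSketch {
    // Sift data[first] down until the subtree rooted there is a max-heap.
    // Only indices up to and including last belong to the heap.
    static void moveDown(char[] data, int first, int last) {
        int child = 2 * first + 1;                 // left child
        while (child <= last) {
            if (child < last && data[child] < data[child + 1]) {
                child++;                           // right child is the larger one
            }
            if (data[first] >= data[child]) {
                break;                             // heap property already holds
            }
            char tmp = data[first];
            data[first] = data[child];
            data[child] = tmp;
            first = child;
            child = 2 * first + 1;
        }
    }

    // Restore the heap at every non-terminal node, starting with the last one.
    static void buildHeap(char[] data) {
        for (int i = data.length / 2 - 1; i >= 0; i--) {
            moveDown(data, i, data.length - 1);
        }
    }

    // True when every parent is at least as large as its children.
    static boolean isMaxHeap(char[] data) {
        for (int i = 1; i < data.length; i++) {
            if (data[(i - 1) / 2] < data[i]) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        char[] data = {'M', 'J', 'Z', 'T', 'X', 'N', 'L'};   // assumed example input
        buildHeap(data);
        // prints [Z, X, N, T, J, M, L] heap? true
        System.out.println(Arrays.toString(data) + " heap? " + isMaxHeap(data));
    }
}
```

With this input, Z stays put, J is swapped with X, and M is swapped first with Z and then with N.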

8
Building the Heap
  • [Figure: the heap drawn as a tree (indices 0 to 6) after each step]
  • compare Z with its children (no move made)
  • compare J with its children (swap it with X)
  • compare M with its children (swap it with Z and
    then N)
  • result: a valid heap with Z at the top
9
Building the Heap
  • Code to re-build the heap
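    void moveDown(Object[] data, int first, int last) {
        // "<" on elements is shorthand for a compareTo() test, and
        // swap(i, j) for exchanging data[i] with data[j]
        int child = 2 * first + 1;                 // left child of first
        while (child <= last) {
            if ((child < last) && (data[child] < data[child + 1]))
                child++;                           // right child is larger
            if (data[first] < data[child]) {
                swap(first, child);
                first = child;
                child = 2 * child + 1;
            }
            else break;                            // heap property holds
        }
    }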

10
Heap Sort
  • Code to build the heap and sort it
    void heapSort(Object[] data) {
        // build the heap out of the data
        for (int i = data.length / 2 - 1; i >= 0; i--)
            moveDown(data, i, data.length - 1);
        // now sort it
        for (int i = data.length - 1; i > 0; i--) {
            swap(0, i);
            moveDown(data, 0, i - 1);
        }
    }
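For readers who want something they can compile, here is one possible runnable version of the two routines. It is a sketch under assumptions: it uses a generic `T extends Comparable<T>` array in place of the slides' `Object` array, spells out the comparisons with `compareTo`, and passes the array to a hypothetical `swap` helper. The class name `HeapSortSketch` is also made up.

```java
import java.util.Arrays;

final class HeapSortSketch {
    static <T> void swap(T[] data, int i, int j) {
        T tmp = data[i];
        data[i] = data[j];
        data[j] = tmp;
    }

    // Sift data[first] down within data[first..last] to restore the max-heap.
    static <T extends Comparable<T>> void moveDown(T[] data, int first, int last) {
        int child = 2 * first + 1;
        while (child <= last) {
            if (child < last && data[child].compareTo(data[child + 1]) < 0) {
                child++;                           // pick the larger child
            }
            if (data[first].compareTo(data[child]) >= 0) {
                break;
            }
            swap(data, first, child);
            first = child;
            child = 2 * first + 1;
        }
    }

    static <T extends Comparable<T>> void heapSort(T[] data) {
        // build the heap, starting at the last non-terminal node
        for (int i = data.length / 2 - 1; i >= 0; i--) {
            moveDown(data, i, data.length - 1);
        }
        // move the current maximum to the end, then rebuild the smaller heap
        for (int i = data.length - 1; i > 0; i--) {
            swap(data, 0, i);
            moveDown(data, 0, i - 1);
        }
    }

    public static void main(String[] args) {
        Character[] data = {'Z', 'X', 'M', 'T', 'N', 'J', 'L'};
        heapSort(data);
        System.out.println(Arrays.toString(data));   // prints [J, L, M, N, T, X, Z]
    }
}
```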

11
Heap Sort
  • Time to build the heap in worst case
  • O(n)
  • proof can be found in Section 6.9.2 of book
  • Number of swaps to perform
  • always (n - 1)
  • Total time to rebuild the heap after the swaps
  • O(n log n), since each of the (n - 1) rebuilds is O(log n)
  • Overall performance
  • O(n) + (n - 1) + O(n log n) = O(n log n)

12
Quicksort
  • Basic procedure
  • divide the initial array into two parts
  • all of the elements in the left side must be
    smaller than all of the elements in the right
    side
  • sort the two arrays separately and put them back
    together
  • we now have a completely sorted array
  • however, before sorting the two arrays, divide
    each of them into two more arrays
  • we now have a total of 4 arrays
  • smallest elements in far left and largest in far
    right
  • repeat this process until only 1 element arrays
    remain
  • put them all together and the overall array is
    sorted

13
Quicksort
  • [Figure: a seven-element array of the letters J, L, M, N, T, X, Z,
    shown at each stage of the division]
  • break into two parts: L M N J and T Z X
  • break into four parts
  • break into 7 parts: single-element arrays, now in sorted order
14
Quicksort - Implementing
  • Steps
  • move the largest value to the highest spot
  • this prevents some array overflow problems
  • pick an upper bound for the left sub-array
  • pick the value in the center of the array
  • move this to first element so it doesn't get
    moved
  • move all elements less than this to left side
  • move all elements greater to the right side
  • bound will now be in its final position
  • repeat with the two new arrays
  • from 0 to index(bound) - 1
  • from index(bound) + 1 to array.length - 1

15
Quicksort - Implementing
    void quickSort(Object[] data) {
        if (data.length < 2) return;
        int max = 0;
        // find the highest value and put it in top spot
        for (int i = 1; i < data.length; i++)
            if (data[i] > data[max]) max = i;
        swap(max, data.length - 1);
        // start the real algorithm
        quickSort(data, 0, data.length - 2);
    }

16
Quicksort - Implementing
    void quickSort(Object[] data, int first, int last) {
        int lower = first + 1, upper = last;
        swap(first, (first + last) / 2);           // find the bound
        Comparable bound = data[first];
        while (lower <= upper) {                   // divides the array in half
            while (data[lower] < bound) lower++;   // lowers that are right
            while (data[upper] > bound) upper--;   // uppers that are right
            if (lower < upper) swap(lower++, upper--);
            else lower++;                          // arrays are already split
        }
        swap(upper, first);                        // puts bound in its final location
        if (first < upper - 1) quickSort(data, first, upper - 1);
        if (upper + 1 < last) quickSort(data, upper + 1, last);
    }
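A compilable rendering of the two quicksort routines might look like the sketch below. It assumes an `int[]` instead of the slides' `Object` array so that `<` and `>` work directly, and it passes the array to a hypothetical `swap` helper; the class name `QuickSortSketch` is made up for the example.

```java
import java.util.Arrays;

final class QuickSortSketch {
    static void swap(int[] data, int i, int j) {
        int tmp = data[i];
        data[i] = data[j];
        data[j] = tmp;
    }

    static void quickSort(int[] data) {
        if (data.length < 2) return;
        int max = 0;
        // put the largest value in the top spot; it stops the inner
        // "lower" loop below from running past the end of the array
        for (int i = 1; i < data.length; i++) {
            if (data[i] > data[max]) max = i;
        }
        swap(data, max, data.length - 1);
        quickSort(data, 0, data.length - 2);
    }

    static void quickSort(int[] data, int first, int last) {
        int lower = first + 1, upper = last;
        swap(data, first, (first + last) / 2);     // center element becomes the bound
        int bound = data[first];
        while (lower <= upper) {
            while (data[lower] < bound) lower++;   // already on the correct side
            while (data[upper] > bound) upper--;   // already on the correct side
            if (lower < upper) swap(data, lower++, upper--);
            else lower++;
        }
        swap(data, upper, first);                  // bound lands in its final position
        if (first < upper - 1) quickSort(data, first, upper - 1);
        if (upper + 1 < last) quickSort(data, upper + 1, last);
    }

    public static void main(String[] args) {
        int[] data = {5, 3, 2, 1, 4, 6, 8};
        quickSort(data);
        System.out.println(Arrays.toString(data));   // prints [1, 2, 3, 4, 5, 6, 8]
    }
}
```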

17
Quicksort Performance
  • Worst case
  • consider selecting the smallest (or largest)
    number as the bound
  • then all of the numbers end up on one side
  • consider sorting the following array
  • 5 3 2 1 4 6 8
  • 1 (the center element) will be the first bound and
    end up in its proper location
  • however, there will still be n - 1 elements to
    sort
  • in the worst case this will happen on each iteration
  • the result is an O(n^2) algorithm
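To make the quadratic behaviour concrete, the sketch below counts comparisons. Note the assumption: it swaps in a simpler quicksort that always takes the first element as the bound, which is not the center-element choice used in these slides, because that variant hits the worst case on any already-sorted input. All names here (`QuickSortWorstCase`, `countComparisons`) are made up for the example.

```java
final class QuickSortWorstCase {
    static long comparisons = 0;

    static void swap(int[] data, int i, int j) {
        int tmp = data[i];
        data[i] = data[j];
        data[j] = tmp;
    }

    // Quicksort that always uses data[first] as the bound.
    static void sort(int[] data, int first, int last) {
        if (first >= last) return;
        int bound = data[first];
        int boundary = first;                 // last index holding a value < bound
        for (int i = first + 1; i <= last; i++) {
            comparisons++;
            if (data[i] < bound) {
                swap(data, ++boundary, i);
            }
        }
        swap(data, first, boundary);          // bound moves to its final position
        sort(data, first, boundary - 1);
        sort(data, boundary + 1, last);
    }

    static long countComparisons(int[] data) {
        comparisons = 0;
        sort(data, 0, data.length - 1);
        return comparisons;
    }

    public static void main(String[] args) {
        // sorted input: the bound is always the smallest remaining element,
        // so the work is 7 + 6 + ... + 1 comparisons
        System.out.println(countComparisons(new int[] {1, 2, 3, 4, 5, 6, 7, 8}));   // prints 28
    }
}
```

For sorted input of length n this variant performs n(n - 1)/2 comparisons, which is the O(n^2) growth described above.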

18
Quicksort Performance
  • So what's the average case?
  • the answer is O(n log n)
  • In practice, quicksort is usually the best
    sorting algorithm
  • the closer the bound is to the median, the better
    it is
  • beware: for small arrays (roughly under 30 elements),
    insertion sort is more efficient
  • can you think how quicksort and insertion sort
    could be combined?
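One common way to combine the two, offered here only as a sketch, is to let quicksort recurse until a sub-array drops below a cutoff and hand that piece to insertion sort. The cutoff of 30 is taken from the slide; the class name `HybridQuickSort`, the constant `CUTOFF`, and the particular center-element partition loop are assumptions of this example rather than the course's code.

```java
final class HybridQuickSort {
    static final int CUTOFF = 30;   // from the slide's "under 30 elements"; tune as needed

    static void insertionSort(int[] data, int first, int last) {
        for (int i = first + 1; i <= last; i++) {
            int value = data[i];
            int j = i - 1;
            while (j >= first && data[j] > value) {
                data[j + 1] = data[j];        // shift larger elements right
                j--;
            }
            data[j + 1] = value;
        }
    }

    static void sort(int[] data, int first, int last) {
        if (last - first + 1 < CUTOFF) {
            insertionSort(data, first, last); // small piece: insertion sort finishes it
            return;
        }
        int bound = data[first + (last - first) / 2];   // center element as the bound
        int lower = first, upper = last;
        while (lower <= upper) {
            while (data[lower] < bound) lower++;
            while (data[upper] > bound) upper--;
            if (lower <= upper) {
                int tmp = data[lower];
                data[lower] = data[upper];
                data[upper] = tmp;
                lower++;
                upper--;
            }
        }
        sort(data, first, upper);             // everything here is <= bound
        sort(data, lower, last);              // everything here is >= bound
    }

    static void sort(int[] data) {
        sort(data, 0, data.length - 1);
    }
}
```

Calling `HybridQuickSort.sort(array)` sorts the array in place; arrays shorter than the cutoff never reach the partitioning code at all.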

19
Mergesort
  • One of the first ever sorting algorithms used on
    a computer
  • It works on a principle similar to quicksort
  • each array is broken into two parts and then
    sorted separately
  • this partition and sort method continues until
    only single element arrays exist
  • then all of the arrays are put back together to
    form a sorted array

20
Mergesort
  • Big difference from quicksort is that the arrays
    are always broken into equal partitions
  • or in the case of an odd sized array, as close as
    possible to even
  • There is no bound selected
  • To put the arrays back together, simply select
    the smallest element from either array and make
    it next

21
Mergesort
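  • [Figure: a seven-element array of the letters J, M, R, T, V, X, Z,
    shown at each stage]
  • break into 7 parts (single-element arrays)
  • merge into sorted pairs: M Z, J R, T X (V is left on its own)
  • merge again into two sorted arrays of four and three elements
  • merge once more into the final sorted array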
22
Merging
  • The most sophisticated part of mergesort is
    recombining (or merging) two separate arrays
  • Repeatedly compare the smallest remaining element
    of each array and take the smaller of the two
  • add it to the new array

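The merging step can be sketched in Java as follows, assuming two separate, already sorted `int[]` inputs and a freshly allocated result array. The class and method names are made up for this example.

```java
import java.util.Arrays;

final class MergeSketch {
    // Merge two sorted arrays into a new sorted array.
    static int[] merge(int[] left, int[] right) {
        int[] merged = new int[left.length + right.length];
        int i = 0, j = 0, k = 0;
        // both arrays still contain elements: take the smaller front element
        while (i < left.length && j < right.length) {
            merged[k++] = (left[i] <= right[j]) ? left[i++] : right[j++];
        }
        // at most one of these loops runs: copy whatever is left over
        while (i < left.length) merged[k++] = left[i++];
        while (j < right.length) merged[k++] = right[j++];
        return merged;
    }

    public static void main(String[] args) {
        // prints [1, 2, 3, 4, 9, 10, 11]
        System.out.println(Arrays.toString(merge(new int[] {1, 4, 9}, new int[] {2, 3, 10, 11})));
    }
}
```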
23
Merging
  • Pseudo-code
    merge(array, first, last) {
        mid = (first + last) / 2
        i1 = 0                 // next free slot in tmp
        i2 = first             // next element of the left sub-array
        i3 = mid + 1           // next element of the right sub-array
        while ( /* both left and right sub-arrays contain elements */ ) {
            if (array[i2] < array[i3]) tmp[i1++] = array[i2++]
            else tmp[i1++] = array[i3++]
        }
        // load into temp array remaining elements of array
        // copy elements in temp back into array
    }

24
Mergesort
  • Once the merge code is done, the code for
    mergesort is easy
  • Pseudo-code
    mergeSort(data, first, last) {
        if (first < last) {
            mid = (first + last) / 2
            mergeSort(data, first, mid)
            mergeSort(data, mid + 1, last)
            merge(data, first, last)
        }
    }
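Put together, the merge and mergesort pseudo-code might be written in Java as below. This is a sketch with assumptions: it sorts an `int[]`, allocates the temporary array inside `merge` on every call, and uses `<=` when comparing so that equal elements keep their original order. The class name `MergeSortSketch` is made up.

```java
import java.util.Arrays;

final class MergeSortSketch {
    // Merge the sorted halves array[first..mid] and array[mid+1..last].
    static void merge(int[] array, int first, int last) {
        int mid = (first + last) / 2;
        int[] tmp = new int[last - first + 1];
        int i1 = 0, i2 = first, i3 = mid + 1;
        while (i2 <= mid && i3 <= last) {          // both halves contain elements
            if (array[i2] <= array[i3]) tmp[i1++] = array[i2++];
            else tmp[i1++] = array[i3++];
        }
        while (i2 <= mid) tmp[i1++] = array[i2++]; // remaining left elements
        while (i3 <= last) tmp[i1++] = array[i3++]; // remaining right elements
        System.arraycopy(tmp, 0, array, first, tmp.length);   // copy back
    }

    static void mergeSort(int[] data, int first, int last) {
        if (first < last) {
            int mid = (first + last) / 2;
            mergeSort(data, first, mid);
            mergeSort(data, mid + 1, last);
            merge(data, first, last);
        }
    }

    public static void main(String[] args) {
        int[] data = {7, 2, 9, 4, 3, 8, 1};
        mergeSort(data, 0, data.length - 1);
        System.out.println(Arrays.toString(data));   // prints [1, 2, 3, 4, 7, 8, 9]
    }
}
```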

25
Mergesort Performance
  • Mergesort produces a lot of copying in memory
  • It also requires extra storage space for the
    temporary array
  • this can be prohibitive for very large data sets