Chapter 9 Sorting - PowerPoint PPT Presentation

Provided by: PeterM6
Learn more at: http://www.cs.gsu.edu

Transcript and Presenter's Notes
1
Chapter 9: Sorting
2
Repeated Minimum
  • Search the list for the minimum element.
  • Place the minimum element in the first position.
  • Repeat for the remaining n-1 keys.
  • Use the current position to hold the current
    minimum to avoid large-scale movement of keys.

3
Repeated Minimum Code
Fixed n-1 iterations (outer loop); fixed n-i iterations (inner loop)

for i = 1 to n-1 do
    for j = i+1 to n do
        if L[i] > L[j] then
            Temp = L[i]
            L[i] = L[j]
            L[j] = Temp
        endif
    endfor
endfor
4
Repeated Minimum Analysis
Doing it the dumb way: scan the whole list on every one
of the n-1 passes, for roughly (n-1)² comparisons.
The smart way: do one comparison when i = n-1,
two when i = n-2, ..., n-1 when i = 1, for a total of
1 + 2 + ... + (n-1) = n(n-1)/2 = Θ(n²) comparisons.
5
Bubble Sort
  • Search for adjacent pairs that are out of order.
  • Switch the out-of-order keys.
  • Repeat this n-1 times.
  • After the first iteration, the last key is
    guaranteed to be the largest.
  • If no switches are done in an iteration, we can
    stop.

6
Bubble Sort Code
Worst case n-1 iterations (outer loop); fixed n-i iterations (inner loop)

for i = 1 to n-1 do
    Switch = False
    for j = 1 to n-i do
        if L[j] > L[j+1] then
            Temp = L[j]
            L[j] = L[j+1]
            L[j+1] = Temp
            Switch = True
        endif
    endfor
    if Not Switch then
        break
    endif
endfor
7
Bubble Sort Analysis
Being smart right from the beginning: the inner loop does
n-i comparisons on pass i, so the worst case is
(n-1) + (n-2) + ... + 1 = n(n-1)/2 = Θ(n²) comparisons.
With the early-exit flag, the best case (an already sorted
list) is a single pass of n-1 comparisons.
8
Insertion Sort I
  • The list is assumed to be broken into a sorted
    portion and an unsorted portion
  • Keys will be inserted from the unsorted portion
    into the sorted portion.

(diagram: sorted portion on the left, unsorted portion on the right)
9
Insertion Sort II
  • For each new key, search backward through sorted
    keys
  • Move keys until proper position is found
  • Place key in proper position

10
Insertion Sort Code
template <class Comparable>
void insertionSort( vector<Comparable> & a )
{
    // Fixed n-1 iterations
    for( int p = 1; p < a.size( ); p++ )
    {
        Comparable tmp = a[ p ];
        int j;
        // Search backward for the proper position of the
        // new key: worst case p-1 comparisons
        for( j = p; j > 0 && tmp < a[ j - 1 ]; j-- )
            a[ j ] = a[ j - 1 ];   // move current key to the right
        a[ j ] = tmp;              // insert the new key in its proper position
    }
}
11
Insertion Sort Analysis
  • Worst case: keys are in reverse order
  • Do i-1 comparisons for each new key, where i runs
    from 2 to n.
  • Total comparisons: 1 + 2 + 3 + ... + (n-1)
    = n(n-1)/2 = Θ(n²)
12
Insertion Sort Average I
  • Assume: when a key is moved by the for loop, all
    positions are equally likely.
  • There are i positions (i is the loop variable of
    the for loop), so the probability of each is 1/i.
  • One comparison is needed to leave the key in its
    present position.
  • Two comparisons are needed to move the key over one
    position.

13
Insertion Sort Average II
  • In general, k comparisons are required to move
    the key over k-1 positions.
  • Exception: both the first and second positions
    require i-1 comparisons.

Position:     1     2     3    ...   i-1   i
Comparisons:  i-1   i-1   i-2  ...   2     1

(Comparisons necessary to place the key in each position.)
14
Insertion Sort Average III
Average comparisons to place one key: averaging over the
i equally likely positions gives (i+1)/2 - 1/i.
15
Insertion Sort Average IV
For all keys: summing over i = 2, ..., n gives
approximately n²/4 = Θ(n²) comparisons on average.
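The equations on these two slides did not survive transcription; under the stated assumption (all i positions equally likely), and using the comparison counts from the position table on slide 13, a standard reconstruction is (the symbol C̄ᵢ for the per-key average is ours):

```latex
\bar{C}_i \;=\; \frac{1}{i}\Bigl[(i-1) + \sum_{k=1}^{i-1} k\Bigr]
          \;=\; \frac{i-1}{i} + \frac{i-1}{2}
          \;=\; \frac{i+1}{2} - \frac{1}{i}
```

Summing over all keys, with H_n the n-th harmonic number:

```latex
\sum_{i=2}^{n}\left(\frac{i+1}{2} - \frac{1}{i}\right)
  \;=\; \frac{n^2 + 3n}{4} - H_n
  \;\approx\; \frac{n^2}{4} \;=\; \Theta(n^2)
```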
16
Optimality Analysis I
  • To discover an optimal algorithm we need to find
    an upper and lower asymptotic bound for a
    problem.
  • An algorithm gives us an upper bound. The worst
    case for sorting cannot exceed O(n²) because we
    have Insertion Sort that runs that fast.
  • Lower bounds require mathematical arguments.

17
Optimality Analysis II
  • Making mathematical arguments usually involves
    assumptions about how the problem will be solved.
  • Invalidating the assumptions invalidates the
    lower bound.
  • Sorting an array of numbers requires at least
    Ω(n) time, because it would take that much time
    to rearrange a list that was rotated one element
    out of position.

18
Rotating One Element
Assumptions: keys must be moved one at a time; all key
movements take the same amount of time; the amount of
time needed to move one key does not depend on n.

(diagram: every key moves one position — the key in the
2nd slot moves to the 1st, the 3rd to the 2nd, ..., the
nth to the n-1st, and the 1st to the nth; all n keys must
be moved, so Ω(n) time.)
19
Other Assumptions
  • The only operation used for sorting the list is
    swapping two keys.
  • Only adjacent keys can be swapped.
  • This is true for Insertion Sort and Bubble Sort.

20
Inversions
  • Suppose we are given a list of elements L, of
    size n.
  • Let i and j be chosen so that 1 ≤ i < j ≤ n.
  • If L[i] > L[j], then the pair (i, j) is an inversion.

21
Maximum Inversions
  • The total number of pairs is n(n-1)/2.
  • This is the maximum number of inversions in any
    list.
  • Exchanging adjacent pairs of keys removes at most
    one inversion.

22
Swapping Adjacent Pairs
The only inversion that could be removed is the
(possible) one between the red and green keys.
The relative position of the red and blue areas
has not changed. No inversions between the red
key and the blue area have been removed. The same
is true for the red key and the orange area. The
same analysis can be done for the green key.
23
Lower Bound Argument
  • A sorted list has no inversions.
  • A reverse-order list has the maximum number of
    inversions, Θ(n²) inversions.
  • A sorting algorithm must exchange Ω(n²) adjacent
    pairs to sort such a list.
  • A sort algorithm that operates by exchanging
    adjacent pairs of keys must have a time bound of
    at least Ω(n²).

24
Lower Bound For Average I
  • There are n! ways to rearrange a list of n
    elements.
  • Recall that a rearrangement is called a
    permutation.
  • If we reverse a rearranged list, every pair that
    used to be an inversion will no longer be an
    inversion.
  • By the same token, all non-inversions become
    inversions.

25
Lower Bound For Average II
  • A permutation and its reverse together contain
    n(n-1)/2 inversions.
  • Assuming that all n! permutations are equally
    likely, there are n(n-1)/4 inversions in a
    permutation, on average.
  • The average performance of a swap-adjacent-pairs
    sorting algorithm will therefore be Ω(n²).

26
Shell Sort
  • With insertion sort, each time we insert an
    element, other elements get nudged one step
    closer to where they ought to be
  • What if we could move elements a much longer
    distance each time?
  • We could move each element
  • A long distance
  • A somewhat shorter distance
  • A shorter distance still
  • This approach is what makes shellsort so much
    faster than insertion sort

27
Sorting nonconsecutive subarrays
Here is an array to be sorted (the actual numbers aren't
important)
  • Consider just the red locations
  • Suppose we do an insertion sort on just these
    numbers, as if they were the only ones in the
    array?
  • Now consider just the yellow locations
  • We do an insertion sort on just these numbers
  • Now do the same for each additional group of
    numbers
  • The resultant array is sorted within groups, but
    not overall

28
Doing the 1-sort
  • In the previous slide, we compared numbers that
    were spaced every 5 locations
  • This is a 5-sort
  • Ordinary insertion sort is just like this, only
    the numbers are spaced 1 apart
  • We can think of this as a 1-sort
  • Suppose, after doing the 5-sort, we do a 1-sort?
  • In general, we would expect that each insertion
    would involve moving fewer numbers out of the way
  • The array would end up completely sorted

29
Diminishing gaps
  • For a large array, we don't want to do a 5-sort;
    we want to do an N-sort, where N depends on the
    size of the array
  • N is called the gap size, or interval size
  • We may want to do several stages, reducing the
    gap size each time
  • For example, on a 1000-element array, we may want
    to do a 364-sort, then a 121-sort, then a
    40-sort, then a 13-sort, then a 4-sort, then a
    1-sort
  • Why these numbers?

30
Increment sequence
  • No one knows the optimal sequence of diminishing
    gaps
  • This sequence is attributed to Donald E. Knuth
  • Start with h = 1
  • Repeatedly compute h = 3h + 1
  • 1, 4, 13, 40, 121, 364, 1093, ...
  • This sequence seems to work very well
  • Another increment sequence mentioned in the
    textbook is based on the following formula:
  • start with h equal to half the container's size
  • h_i = floor(h_{i-1} / 2.2)
  • It turns out that just cutting the gap size in
    half each time does not work out as well

31
Analysis
  • What is the real running time of shellsort?
  • Nobody knows!
  • Experiments suggest something like O(n^(3/2)) or
    O(n^(7/6))
  • Analysis isn't always easy!

32
Merge Sort
  • If List has only one Element, do nothing
  • Otherwise, Split List in Half
  • Recursively Sort Both Lists
  • Merge Sorted Lists

Mergesort(A, l, r)
    if l < r then
        q = floor((l+r)/2)
        mergesort(A, l, q)
        mergesort(A, q+1, r)
        merge(A, l, q, r)
33
The Merge Algorithm
Assume we are merging lists A and B into list C.
Ax = 1; Bx = 1; Cx = 1
while Ax ≤ n and Bx ≤ n do
    if A[Ax] < B[Bx] then
        C[Cx] = A[Ax]
        Ax = Ax + 1
    else
        C[Cx] = B[Bx]
        Bx = Bx + 1
    endif
    Cx = Cx + 1
endwhile
while Ax ≤ n do
    C[Cx] = A[Ax]
    Ax = Ax + 1
    Cx = Cx + 1
endwhile
while Bx ≤ n do
    C[Cx] = B[Bx]
    Bx = Bx + 1
    Cx = Cx + 1
endwhile
34
Merging
  • Merge.
  • Keep track of smallest element in each sorted
    half.
  • Insert smallest of two elements into auxiliary
    array.
  • Repeat until done.

(diagram: slides 34-44 animate the merge step. At each
step the smaller of the two front elements is copied into
the auxiliary array, producing A, G, H, I, L, M, O, R, S,
T in order; once the first half is exhausted, the
remainder of the second half is copied straight over.)
45
Merge Sort Analysis
  • Splitting the list requires no comparisons
  • Merging requires n-1 comparisons in the worst
    case, where n is the total size of both lists (n
    key movements are required in all cases)
  • Recurrence relation: T(n) = 2T(n/2) + n - 1,
    which solves to Θ(n log n)

46
Merge Sort Space
  • Merging cannot be done in place
  • In the simplest case, a separate list of size n
    is required for merging
  • It is possible to reduce the size of the extra
    space, but it will still be Θ(n)

47
Quick Sort I
  • Split List into Big and Little keys
  • Put the Little keys first, Big keys second
  • Recursively sort the Big and Little keys

Quicksort(A, l, r)
    if l < r then
        q = partition(A, l, r)
        quicksort(A, l, q-1)
        quicksort(A, q+1, r)
48
Quicksort II
  • Big is defined as bigger than the pivot point
  • Little is defined as smaller than the pivot
    point
  • The pivot point is chosen at random. In the
    following example, we pick the middle element
    as the pivot.

49
Partitioning
2 97 17 39 12 37 10 55 80 42 46

Pick pivot: 37
50
Partitioning
2 97 17 39 12 46 10 55 80 42 37

Step 1: move pivot to end of array
51
Partitioning
2 97 17 39 12 46 10 55 80 42 37

Step 2: set i = 0 and j = array.length - 1
52
Partitioning
2 97 17 39 12 46 10 55 80 42 37

Step 3: move i right until a value larger than the
pivot is found
53
Partitioning
2 97 17 39 12 46 10 55 80 42 37

Step 4: move j left until a value less than the
pivot is found
54
Partitioning
2 10 17 39 12 46 97 55 80 42 37

Step 5: swap the elements at positions i and j
55
Partitioning
2 10 17 39 12 46 97 55 80 42 37

Step 6: move i right until a value larger than the
pivot is found
56
Partitioning
2 10 17 39 12 46 97 55 80 42 37

Step 7: move j left until a value less than the
pivot is found
57
Partitioning
2 10 17 12 39 46 97 55 80 42 37

Step 8: swap the elements at positions i and j
58
Partitioning
2 10 17 12 39 46 97 55 80 42 37

Step 9: continue scanning until i and j cross
59
Partitioning
2 10 17 12 37 46 97 55 80 42 39

Step 10: put the pivot in its correct spot (swap it
with the element at position i)
60
Quicksort III
  • The pivot point may not be the exact median
  • Finding the precise median is hard
  • If we get lucky, the following recurrence
    applies (n/2 is approximate):
    T(n) = 2T(n/2) + O(n), which solves to O(n log n)
61
Quicksort IV
  • If the keys are in order and the pivot happens to
    be the smallest element, the Big portion will have
    n-1 keys and the Small portion will be empty.
  • T(N) = T(N-1) + O(N) = O(N²)
  • N-1 comparisons are done for the first key
  • N-2 comparisons for the second key, etc.
  • Result: (N-1) + (N-2) + ... + 1 = N(N-1)/2 = O(N²)

62
A Better Lower Bound
  • The Ω(n²) time bound does not apply to
    Quicksort or Mergesort.
  • A better assumption is that keys can be moved an
    arbitrary distance.
  • However, we can still assume that the number of
    key-to-key comparisons is proportional to the run
    time of the algorithm.

63
Lower Bound Assumptions
  • Algorithms sort by performing key comparisons.
  • The contents of the list are arbitrary, so tricks
    based on the value of a key won't work.
  • The only basis for making a decision in the
    algorithm is by analyzing the result of a
    comparison.

64
Lower Bound Assumptions II
  • Assume that all keys are distinct, since all sort
    algorithms must handle this case.
  • Because there are no tricks that work, the only
    information we can get from a key comparison is
    which key is larger

65
Lower Bound Assumptions III
  • The choice of which key is larger is the only
    point at which two runs of an algorithm can
    exhibit divergent behavior.
  • Divergent behavior includes rearranging the keys
    in two different ways.

66
Lower Bound Analysis
  • We can analyze the behavior of a particular
    algorithm on an arbitrary list by using a tree.

                         1:2
                1<2 /         \ 1>2
              2:3               1:3
          2<3 /  \ 2>3      1<3 /  \ 1>3
      (1,2,3)    1:3      (2,1,3)    2:3
           1<3 /  \ 1>3        2<3 /  \ 2>3
       (1,3,2)  (3,1,2)    (2,3,1)  (3,2,1)

(Each internal node compares two keys; each leaf is the
permutation produced for the corresponding input.)
67
The Leaf Nodes II
  • Each leaf node represents a permutation of the
    list.
  • Since there are n! initial configurations, and
    one final configuration, there must be n! ways to
    reconfigure the input.
  • There must be at least n! leaf nodes.

68
Lower Bound More Analysis
  • Since we are working on a lower bound, in any
    tree, we must find the longest path from root to
    leaf. This is the worst case.
  • The most efficient algorithm would minimize the
    length of the longest path.
  • This happens when the tree is as close as
    possible to a complete binary tree.

69
Lower Bound Final
  • A Binary Tree with k leaves must have height at
    least log k.
  • The height of the tree is the length of the
    longest path from root to leaf.
  • A binary tree with n! leaves must have height at
    least log n!

70
Lower Bound Final cont.
Any comparison sort algorithm requires Ω(n lg n)
comparisons in the worst case. Proof: From the
preceding discussion, it suffices to determine
the height of a decision tree in which each
permutation appears as a reachable leaf. Consider
a decision tree of height h with l reachable
leaves corresponding to a comparison sort on n
elements. Because each of the n! permutations of
the input appears as some leaf, we have n! ≤ l.
Since a binary tree of height h has no more than
2^h leaves, we have n! ≤ l ≤ 2^h, which, by taking
logarithms, implies
h ≥ log(n!) = Ω(n log n)
71
Lower Bound Algebra
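The algebra behind the last step of the proof did not survive transcription; the standard derivation, keeping only the upper half of the terms in the sum, is:

```latex
\log(n!) \;=\; \sum_{i=1}^{n} \log i
  \;\ge\; \sum_{i=\lceil n/2\rceil}^{n} \log i
  \;\ge\; \frac{n}{2}\log\frac{n}{2}
  \;=\; \frac{n}{2}\log n - \frac{n}{2}
  \;=\; \Omega(n \log n)
```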