Sorting - PowerPoint PPT Presentation

About This Presentation
Title:

Sorting

Slides: 58
Provided by: scie230
Learn more at: https://public.csusm.edu

Transcript and Presenter's Notes


1
Sorting
2
  • The efficiency of data handling can often be
    substantially increased if the data are sorted
  • For example, it is practically impossible to find
    a name in the telephone directory if the items
    are not sorted
  • In order to sort a set of items such as numbers or
    words, two properties must be considered
  • The number of comparisons required to arrange the
    data
  • The number of data movements

3
  • Depending on the sorting algorithm, the exact
    number of comparisons or exact number of
    movements may not always be easy to determine
  • Therefore, the number of comparisons and
    movements is approximated with big-O notation
  • Some sorting algorithms may do more movement of
    data than comparison of data
  • It is up to the programmer to decide which
    algorithm is more appropriate for a specific set of
    data
  • For example, if only small keys such as integers
    or characters are compared, then comparisons are
    relatively fast and inexpensive
  • But if large, complex objects must be
    compared, then comparison can be quite costly

4
  • If, on the other hand, the data items moved are
    large and movements are relatively frequent,
    then movement rather than comparison stands out
    as the determining factor
  • Further, a simple method may be only 20% less
    efficient than a more elaborate algorithm
  • If sorting is used in a program once in a while
    and only for small sets of data, then using a more
    complicated algorithm may not be desirable
  • However, if the data set is large, 20% can
    make a significant difference and should not be
    ignored
  • Let's look at different sorting algorithms now

5
  • Insertion Sort
  • Start with the first two elements of the array,
    data[0] and data[1]
  • If they are out of order, then an interchange
    takes place
  • Next, data[2] is considered and placed into its
    proper position
  • If data[2] is smaller than data[0], it is placed
    before data[0] by shifting data[0] and
    data[1] down by one position
  • Otherwise, if data[2] is between data[0] and
    data[1], we just need to shift data[1] down and
    place data[2] in the second position
  • Otherwise, data[2] remains where it is in the
    array
  • Next, data[3] is considered and the same process
    repeats
  • And so on

6
Algorithm and code for insertion sort
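InsertionSort(data[], n)
    for (i = 1; i < n; i++)
        move all elements data[j] greater than data[i] by one position;
        place data[i] in its proper position;

template <class T>
void InsertionSort(T data[], int n) {
    for (int i = 1, j; i < n; i++) {
        T tmp = data[i];                  // element to insert
        // shift larger elements of the sorted prefix one position down
        for (j = i; j > 0 && tmp < data[j-1]; j--)
            data[j] = data[j-1];
        data[j] = tmp;                    // place it in its proper position
    }
}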
7
Example of Insertion Sort
tmp = 2: move 5 down, then put tmp = 2 in position 1
tmp = 3: move 5 down, then put tmp = 3 in position 2
8
tmp = 8: since 5 is less than 8, no shifting is required
tmp = 1: move 8, 5, 3 and 2 down, then put tmp = 1 in position 1
9
  • Advantage of insertion sort
  • If the data are already sorted, they remain
    sorted and basically no movement is necessary
  • Disadvantage of insertion sort
  • An item that is already in its right place may
    have to be moved temporarily in one iteration and
    be moved back into its original place later
  • Complexity of Insertion Sort
  • Best case: This happens when the data are already
    sorted. It takes O(n) to go through the elements
  • Worst case: This happens when the data are in
    reverse order; then for the ith item, (i-1)
    movements are necessary
  • Total movements: 1 + 2 + ... + (n-1) = n(n-1)/2,
    which is O(n^2)
  • The average case is approximately half of the
    worst case, which is still O(n^2)
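
The best-case and worst-case counts above can be checked with a small sketch. It is the same insertion sort with a shift counter added; the function name and the use of std::vector are assumptions made for this illustration, not part of the slides.

```cpp
#include <cstddef>
#include <vector>

// Insertion sort that also returns how many elements were shifted one
// position. The counter exists only for illustration.
template <class T>
std::size_t insertionSortCountingShifts(std::vector<T>& data) {
    std::size_t shifts = 0;
    for (std::size_t i = 1; i < data.size(); ++i) {
        T tmp = data[i];
        std::size_t j = i;
        // shift every larger element of the sorted prefix one position down
        for (; j > 0 && tmp < data[j - 1]; --j) {
            data[j] = data[j - 1];
            ++shifts;
        }
        data[j] = tmp;
    }
    return shifts;
}
```

For already sorted input the function reports 0 shifts; for n items in reverse order it reports n(n-1)/2.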

10
  • Selection Sort
  • Select the minimum in the array and swap it with
    the first element
  • Then select the second minimum in the array and
    swap it with the second element
  • And so on until everything is sorted

11
Algorithm and code for selection sort
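SelectionSort(data[], n)
    for (i = 0; i < n-1; i++)
        select the smallest element among data[i], ..., data[n-1];
        swap it with data[i];

#include <utility>   // std::swap

template <class T>
void SelectionSort(T data[], int n) {
    int i, j, least;
    for (i = 0; i < n-1; i++) {
        // find the index of the smallest element in data[i..n-1]
        for (j = i+1, least = i; j < n; j++)
            if (data[j] < data[least])
                least = j;
        std::swap(data[least], data[i]);
    }
}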
12
Example of Selection Sort
The first minimum, 1, is found by searching the entire array. Swap it with the
first position
The second minimum is 2. Swap it with the second position
13
The third minimum is 3. Swap it with the third position
The fourth minimum is 5. Swap it with the fourth position
14
  • Complexity of Selection Sort
  • The number of comparisons and/or movements is the
    same in each case (best case, average case and
    worst case)
  • The number of comparisons is equal to
  • Total = (n-1) + (n-2) + (n-3) + ... + 1
  • = n(n-1)/2
  • which is O(n^2)
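
A minimal sketch to check that the comparison count does not depend on the input order. The counter, the function name and the std::vector interface are assumptions added for illustration.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Selection sort that returns the number of element comparisons it made.
template <class T>
std::size_t selectionSortCountingComparisons(std::vector<T>& data) {
    std::size_t comparisons = 0;
    const std::size_t n = data.size();
    for (std::size_t i = 0; i + 1 < n; ++i) {
        std::size_t least = i;
        // scan data[i+1..n-1] for the smallest remaining element
        for (std::size_t j = i + 1; j < n; ++j) {
            ++comparisons;
            if (data[j] < data[least])
                least = j;
        }
        std::swap(data[least], data[i]);
    }
    return comparisons;
}
```

With 5 elements the function returns 10 = 5(5-1)/2 whether the input is sorted, reversed or shuffled.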

15
  • Bubble Sort
  • Start from the bottom and move the required
    elements up (i.e. bubble the elements up)
  • Two adjacent elements are interchanged if they
    are found to be out of order with respect to each
    other
  • First data[n-1] and data[n-2] are compared and
    swapped if they are not in order
  • Then data[n-2] and data[n-3] are compared and
    swapped if they are not in order
  • And so on

16
Algorithm and code for bubble sort
BubbleSort(data[], n)
    for (i = 0; i < n-1; i++)
        for (j = n-1; j > i; --j)
            swap elements in positions j and j-1 if they are out of order;

#include <utility>   // std::swap

template <class T>
void BubbleSort(T data[], int n) {
    for (int i = 0; i < n-1; i++)
        for (int j = n-1; j > i; --j)     // bubble the smallest item up to i
            if (data[j] < data[j-1])
                std::swap(data[j], data[j-1]);
}
17
Example of Bubble Sort
Iteration 1: Start from the last element up to the first element and bubble
the smaller elements up
Iteration 2: Start from the last element up to the second element and bubble
the smaller elements up
18
Example of Bubble Sort
Iteration 3: Start from the last element up to the third element and bubble
the smaller elements up
Iteration 4: Start from the last element up to the fourth element and bubble
the smaller elements up
19
  • Complexity of Bubble Sort
  • The number of comparisons is the same in each
    case (best case, average case and worst case);
    the number of swaps depends on how far the data
    are from sorted order
  • The number of comparisons is equal to
  • Total = (n-1) + (n-2) + (n-3) + ... + 1
  • = n(n-1)/2
  • which is O(n^2)
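
The sketch below instruments the bubble sort loops to report comparisons and swaps separately. The BubbleCounts struct, the function name and the std::vector interface are hypothetical names introduced for this illustration. In this sketch the comparison count comes out the same for sorted and reversed input, while the swap count follows the input order.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Counters returned by the instrumented sort (illustration only).
struct BubbleCounts {
    std::size_t comparisons;
    std::size_t swaps;
};

template <class T>
BubbleCounts bubbleSortCounting(std::vector<T>& data) {
    BubbleCounts counts{0, 0};
    const std::size_t n = data.size();
    for (std::size_t i = 0; i + 1 < n; ++i)
        for (std::size_t j = n - 1; j > i; --j) {
            ++counts.comparisons;
            if (data[j] < data[j - 1]) {      // adjacent pair out of order
                std::swap(data[j], data[j - 1]);
                ++counts.swaps;
            }
        }
    return counts;
}
```

For 5 elements both a sorted and a reversed array cost 10 comparisons, but the sorted array needs 0 swaps and the reversed one needs 10.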

20
  • Comparing the bubble sort with insertion and
    selection sorts, we can say that
  • For the average case, bubble sort makes
    approximately twice as many comparisons and the
    same number of moves as insertion sort
  • Bubble sort also, on average, makes as many
    comparisons as selection sort and n times more
    moves than selection sort
  • Among these three sorts, insertion sort is
    generally the better algorithm because, if the
    array is already sorted, its running time is only
    O(n), which is faster than the other two
    algorithms

21
  • Shell Sort
  • Shell sort works on the idea that it is easier
    and faster to sort many short lists than it is to
    sort one large list
  • Select an increment value k (the best value for k
    is not necessarily clear)
  • Sort the sequence consisting of every kth element
    (use some simple sorting technique)
  • Decrement k and repeat the above step until k = 1
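
The steps above can be sketched as follows. The caller supplies the list of increments (for example 4, 2, 1), and insertion sort is used as the simple sorting technique; the function name and the std::vector interface are assumptions for this illustration. For brevity the loop interleaves the insertion sorts of the k subsequences instead of finishing one subsequence before starting the next, which gives the same result after each pass.

```cpp
#include <cstddef>
#include <vector>

// For each increment k, insertion-sort the subsequences made of every
// k-th element. The last increment is expected to be 1.
template <class T>
void shellSortWithIncrements(std::vector<T>& data,
                             const std::vector<std::size_t>& increments) {
    for (std::size_t k : increments) {
        if (k == 0)
            continue;                     // an increment of 0 is meaningless
        for (std::size_t i = k; i < data.size(); ++i) {
            T tmp = data[i];
            std::size_t j = i;
            // shift larger elements of the same subsequence k places down
            for (; j >= k && tmp < data[j - k]; j -= k)
                data[j] = data[j - k];
            data[j] = tmp;
        }
    }
}
```

Calling it with increments {4, 2, 1} sorts the whole array; calling it with {4} alone leaves every 4th-element subsequence sorted but not the array as a whole.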

22
Example of Shell Sort
Choose k = 4 first
23
Example of Shell Sort
Now choose k = 2, and then k = 1, applying the insertion sort each time
24
Algorithm of shell sort
ShellSort(data[], n)
    determine numbers h_t, h_(t-1), ..., h_1 of ways of dividing array data
    into subarrays;
    for (h = h_t; t > 1; t--, h = h_t)
        divide data into h sub-arrays;
        for (i = 1; i <= h; i++)
            sort sub-array data_i;
    sort array data;
  • Complexity of shell sort
  • Shell sort works well on data that are almost
    sorted: O (n log2 n)
  • Deeper analysis of Shell sort is quite difficult
  • It can be shown in practice that it is O(n^(3/2))

25
Code for shell sort
template <class T>
void ShellSort(T data[], int arrsize) {
    int i, j, hCnt, h, k;
    int increments[20];
    // create an appropriate number of increments h
    for (h = 1, i = 0; h < arrsize; i++) {
        increments[i] = h;
        h = 3*h + 1;
    }
    // loop on the number of different increments h
    for (i = i-1; i >= 0; i--) {
        h = increments[i];
        // loop on the number of sub-arrays h-sorted in the ith pass
        for (hCnt = h; hCnt < 2*h; hCnt++) {
            // insertion sort for the sub-array containing every hth
            // element of array data
            for (j = hCnt; j < arrsize; ) {
                T tmp = data[j];
                k = j;
                while (k-h >= 0 && tmp < data[k-h]) {
                    data[k] = data[k-h];
                    k = k - h;
                }
                data[k] = tmp;
                j = j + h;
            }
        }
    }
}
26
  • Heap Sort
  • Heap sort uses a heap as described in the earlier
    lectures
  • As we said before, a heap is a binary tree with
    the following two properties
  • The value of each node is not less than the values
    stored in each of its children
  • The tree is perfectly balanced and the leaves in
    the last level are all in the leftmost positions
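
A minimal sketch of the first property, assuming the usual array layout in which the children of the node at index i sit at indexes 2i+1 and 2i+2. With that layout the second property (the shape of the tree) holds automatically. The function name is hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Returns true when no node is smaller than either of its children.
// Assumed layout: the parent of index child is (child - 1) / 2.
template <class T>
bool isMaxHeap(const std::vector<T>& data) {
    for (std::size_t child = 1; child < data.size(); ++child) {
        const std::size_t parent = (child - 1) / 2;
        if (data[parent] < data[child])
            return false;                 // a child is larger than its parent
    }
    return true;
}
```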

27
  • The procedure is
  • The data are transformed into a heap first
  • After this, the data are not necessarily sorted;
    however, we know that the largest element is at
    the root
  • Thus, starting with a heap,
  • Swap the root with the last element
  • Restore the heap property for all elements except
    the last one
  • Repeat the process for all elements until you are
    done

28
Algorithm and Code for Heap sort
HeapSort(data[], n)
    transform data into a heap;
    for (i = n-1; i >= 1; i--)
        swap the root with the element in position i;
        restore the heap property for the tree data[0], ..., data[i-1];

#include <utility>   // std::swap

// MoveDown is not defined in these slides (presumably it comes from the
// earlier heap lectures)
template <class T>
void HeapSort(T data[], int size) {
    for (int i = size/2 - 1; i >= 0; i--)
        MoveDown(data, i, size-1);        // creates the heap
    for (int i = size-1; i >= 1; --i) {
        std::swap(data[0], data[i]);      // move the largest item to data[i]
        MoveDown(data, 0, i-1);           // restores the heap
    }
}
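
The heap sort code relies on a MoveDown routine that this transcript does not show. The sketch below fills that gap with a hypothetical moveDown written for this illustration, assuming the children of index i are stored at 2i+1 and 2i+2; the routine from the earlier lectures may differ in detail.

```cpp
#include <utility>

// Hypothetical stand-in for MoveDown: sift data[first] down within
// data[first..last] until neither child is larger than it.
template <class T>
void moveDown(T data[], int first, int last) {
    int largest = 2 * first + 1;                  // left child
    while (largest <= last) {
        if (largest < last && data[largest] < data[largest + 1])
            ++largest;                            // right child is larger
        if (!(data[first] < data[largest]))
            break;                                // heap property holds
        std::swap(data[first], data[largest]);
        first = largest;
        largest = 2 * first + 1;
    }
}

template <class T>
void heapSort(T data[], int size) {
    for (int i = size / 2 - 1; i >= 0; --i)       // build the heap
        moveDown(data, i, size - 1);
    for (int i = size - 1; i >= 1; --i) {
        std::swap(data[0], data[i]);              // largest item goes to data[i]
        moveDown(data, 0, i - 1);                 // restore heap on data[0..i-1]
    }
}
```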
29
Example of Heap Sort
We first transform the data into heap
The initial tree is formed as follows
30
We turn the array into a heap first
33
Now we start to sort the elements
Swap the root with the last element
Restore the heap
34-40
The same two steps are repeated on the remaining heap: swap the root with the
last element, then restore the heap
41
Place the elements into array using breadth first
traversal
42
  • Complexity of heap sort
  • The heap sort requires a lot of movement, which
    can be inefficient for large objects
  • The first phase, where we turn the array into a
    heap, requires O(n) steps
  • In the second phase, when we sort the elements
    while keeping the heap, we exchange the root
    with the element in position i n-1 times and
    also restore the heap n-1 times, which takes
    O(n log n)
  • So the second phase requires O(n-1) swaps plus
    O(n log n) operations to restore the heap
  • Total: O(n) + O(n log n) + O(n-1) = O(n log n)

43
  • Quick Sort
  • Quick sort is generally regarded as one of the
    fastest sorting methods in practice
  • In this scheme
  • One of the elements in the array is chosen as
    the pivot
  • Then the array is divided into sub-arrays
  • The elements smaller than the pivot go into one
    sub-array
  • The elements bigger than the pivot go into
    another sub-array
  • The pivot goes in the middle of these two
    sub-arrays
  • Then each sub-array is partitioned the same way
    as the original array and the process repeats
    recursively
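
The scheme can be sketched quite literally with separate sub-arrays. This is an out-of-place illustration, not an in-place quick sort; the function name, the choice of the middle element as pivot and the extra sub-array for elements equal to the pivot are assumptions of the sketch.

```cpp
#include <vector>

// Returns a sorted copy of data by splitting it around a pivot.
template <class T>
std::vector<T> quickSortCopy(const std::vector<T>& data) {
    if (data.size() < 2)
        return data;
    const T pivot = data[data.size() / 2];    // middle element (assumption)
    std::vector<T> smaller, equal, larger;
    for (const T& item : data) {
        if (item < pivot)
            smaller.push_back(item);
        else if (pivot < item)
            larger.push_back(item);
        else
            equal.push_back(item);            // the pivot and its duplicates
    }
    // sort each sub-array the same way, then put the pivot in the middle
    std::vector<T> result = quickSortCopy(smaller);
    result.insert(result.end(), equal.begin(), equal.end());
    const std::vector<T> right = quickSortCopy(larger);
    result.insert(result.end(), right.begin(), right.end());
    return result;
}
```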

44
Algorithm of quick sort
QuickSort(array[])
    if length(array) > 1
        choose a pivot;
        // partition array into array1 and array2
        while there are elements left in array
            include each element either in array1   // if element < pivot
            or in array2;                           // if element > pivot
        QuickSort(array1);
        QuickSort(array2);
  • Complexity of quick sort
  • The best case is when the arrays are always
    partitioned equally
  • For the best case, the running time is O(n log n)
  • The running time for the average case is also
    O(n log n)
  • The worst case happens if the pivot is always
    either the smallest or the largest element in the
    array
  • In the worst case, the running time is
    O(n^2)
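
The worst case can be provoked with a sketch that always takes the first element of the range as the pivot and counts comparisons. This variant, its name and its counter are assumptions made for illustration; it is not the quick sort code of these slides. On already sorted input the pivot is always the smallest element, so n elements cost n(n-1)/2 comparisons.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Quick sort with the first element as pivot; returns the number of
// element-to-pivot comparisons performed.
template <class T>
std::size_t quickSortFirstPivot(std::vector<T>& data, int first, int last) {
    if (first >= last)
        return 0;
    const T pivot = data[first];
    int boundary = first;                 // data[first+1..boundary] < pivot
    std::size_t comparisons = 0;
    for (int i = first + 1; i <= last; ++i) {
        ++comparisons;
        if (data[i] < pivot)
            std::swap(data[++boundary], data[i]);
    }
    std::swap(data[first], data[boundary]);       // pivot to its final place
    comparisons += quickSortFirstPivot(data, first, boundary - 1);
    comparisons += quickSortFirstPivot(data, boundary + 1, last);
    return comparisons;
}
```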

45
Code for quick sort
template <class T>
void quicksort(T data[], int first, int last) {
    int lower = first + 1, upper = last;
    // move the middle element to the front and use it as the pivot
    std::swap(data[first], data[(first+last)/2]);   // std::swap: <utility>
    T pivot = data[first];
    while (lower <= upper) {
        while (data[lower] < pivot)
            lower++;
        while (pivot < data[upper])
            upper--;
        if (lower < upper)
            std::swap(data[lower++], data[upper--]);
        else
            lower++;
    }
    std::swap(data[upper], data[first]);            // pivot to its final place
    if (first < upper-1)
        quicksort(data, first, upper-1);
    if (upper+1 < last)
        quicksort(data, upper+1, last);
}

template <class T>
void quicksort(T data[], int n) {
    if (n < 2)
        return;
    int i, max;
    // put the largest element last so that it stops the "lower" loop
    for (i = 1, max = 0; i < n; i++)
        if (data[max] < data[i])
            max = i;
    std::swap(data[n-1], data[max]);
    quicksort(data, 0, n-2);
}
46
Example of Quick Sort
  • By example
  • Select pivot
  • Partition

Pivot: 65
47
  • Recursively apply quicksort to both partitions
  • Result will ultimately be a sorted array

0 13 26 31 43 57 65 75 81 92
48
  • Radix Sort
  • Radix refers to the base of the number. For
    example, the radix for decimal numbers is 10, for
    hex numbers it is 16, and for the English alphabet
    it is 26
  • Radix sort has been called the bin sort in the
    past
  • The name bin sort comes from mechanical devices
    that were used to sort keypunched cards
  • Cards would be directed into bins and returned to
    the deck in a new order and then redirected into
    bins again
  • For integer data, the repeated passes of a radix
    sort focus on the ones place value, then on the
    tens place value, then on the hundreds place
    value, etc
  • For character-based data, the focus is placed
    on the right-most character, then the second
    right-most character, etc

49
Algorithm and Code for Radix Sort
Assuming the numbers to be sorted are all decimal integers

RadixSort(array[])
    for (d = 1; d <= the position of the leftmost digit of the longest number; d++)
        distribute all numbers among piles 0 through 9 according to the dth digit;
        put all integers on one list;

// Queue is a queue class with enqueue(), dequeue() and empty()
// (not defined in these slides)
void radixsort(long data[], int n) {
    int i, j, k;
    long factor;
    const int radix = 10;    // because digits go from 0 to 9
    const int digits = 10;
    Queue<long> queues[radix];
    for (i = 0, factor = 1; i < digits; factor = factor*radix, i++) {
        for (j = 0; j < n; j++)
            queues[(data[j] / factor) % radix].enqueue(data[j]);
        for (j = k = 0; j < radix; j++)
            while (!queues[j].empty())
                data[k++] = queues[j].dequeue();
    }
}
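
The slide code depends on a Queue class that is not shown here. A self-contained sketch of the same idea with std::queue from the standard library follows; the function name is hypothetical, and the sketch assumes non-negative values and makes one pass per digit of the largest value instead of a fixed 10 passes.

```cpp
#include <cstddef>
#include <queue>
#include <vector>

// Least-significant-digit radix sort for non-negative decimal integers,
// with one std::queue per digit acting as the pile.
void radixSortDecimal(std::vector<long>& data) {
    const int radix = 10;
    long largest = 0;
    for (long value : data)
        if (value > largest)
            largest = value;
    std::vector<std::queue<long>> queues(radix);
    long factor = 1;                          // 1, 10, 100, ...
    for (long rest = largest; rest > 0; rest /= radix) {
        // distribute all numbers among the piles by the current digit
        for (long value : data)
            queues[(value / factor) % radix].push(value);
        // collect the piles back into one list, pile 0 first
        std::size_t k = 0;
        for (int j = 0; j < radix; ++j)
            while (!queues[j].empty()) {
                data[k++] = queues[j].front();
                queues[j].pop();
            }
        if (rest >= radix)
            factor *= radix;                  // only when another pass follows
    }
}
```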
50
  • Example of Radix Sort
  • Assume the data are
  • 459 254 472 534 649 239 432 654 477
  • Radix sort will arrange the values into 10 bins
    based upon the ones place value

Bin 2: 472 432
Bin 4: 254 534 654
Bin 7: 477
Bin 9: 459 649 239
(bins 0, 1, 3, 5, 6 and 8 are empty)
51
  • The sublists are collected and made into one
    large bin (in order given)
  • 472 432 254 534 654 477 459 649 239
  • Then Radix sort will arrange the values into 10
    bins based upon the tens place value

Bin 3: 432 534 239
Bin 4: 649
Bin 5: 254 654 459
Bin 7: 472 477
(bins 0, 1, 2, 6, 8 and 9 are empty)
52
  • The sublists are collected and made into one
    large bin (in order given)
  • 432 534 239 649 254 654 459 472 477
  • Radix sort will arrange the values into 10 bins
    based upon the hundreds place value (done!)

Bin 2: 239 254
Bin 4: 432 459 472 477
Bin 5: 534
Bin 6: 649 654
(bins 0, 1, 3, 7, 8 and 9 are empty)
  • The sublists are collected and the numbers are
    sorted
  • 239 254 432 459 472 477 534 649 654

53
  • Another Example of Radix Sort
  • Assume the data are
  • 9 54 472 534 39 43 654 77
  • To make it simple, rewrite the numbers to make
    them all three digits like
  • 009 054 472 534 039 043 654 077
  • Radix sort will arrange the values into 10 bins
    based upon the ones place value

Bin 2: 472
Bin 3: 043
Bin 4: 054 534 654
Bin 7: 077
Bin 9: 009 039
(bins 0, 1, 5, 6 and 8 are empty)
54
  • The sublists are collected and made into one
    large bin (in order given)
  • 472 043 054 534 654 077 009 039
  • Then Radix sort will arrange the values into 10
    bins based upon the tens place value

Bin 0: 009
Bin 3: 534 039
Bin 4: 043
Bin 5: 054 654
Bin 7: 472 077
(bins 1, 2, 6, 8 and 9 are empty)
55
  • The sublists are collected and made into one
    large bin (in order given)
  • 009 534 039 043 054 654 472 077
  • Radix sort will arrange the values into 10 bins
    based upon the hundreds place value (done!)

Bin 0: 009 039 043 054 077
Bin 4: 472
Bin 5: 534
Bin 6: 654
(bins 1, 2, 3, 7, 8 and 9 are empty)
  • The sublists are collected and the numbers are
    sorted
  • 009 039 043 054 077 472 534 654

56
  • Assume the data are
  • area book close team new place print
  • To sort the above elements using the radix sort
    you need 26 buckets, one for each
    letter.
  • You also need one more character, with its own
    bucket, to represent space, which has the lowest
    value. Suppose that character is the question
    mark ? and it is used to represent space
  • You can rewrite the data as follows
  • area? book? close team? new?? place print
  • Now all words have 5 characters and it is easy
    to compare them with each other
  • To do the sorting, you can start from the
    right-most character, place the data into the
    appropriate buckets and collect them. Then place
    them into buckets based on the second right-most
    character and collect them again, and so on.
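
A sketch of this procedure for lowercase words, under a few assumptions: the function name is hypothetical, the words contain only the letters a to z, and instead of physically appending a padding character, a missing position is sent to bucket 0, which plays the role of the lowest-valued space character. That gives 27 buckets in total.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Radix sort for lowercase words, right-most character first.
// Bucket 0 stands for "no character here" (the padding); buckets 1..26
// stand for the letters a..z.
void radixSortWords(std::vector<std::string>& words) {
    std::size_t width = 0;                    // length of the longest word
    for (const std::string& w : words)
        if (w.size() > width)
            width = w.size();
    for (std::size_t pos = width; pos-- > 0; ) {
        std::vector<std::vector<std::string>> buckets(27);
        for (const std::string& w : words) {
            const std::size_t b =
                pos < w.size() ? static_cast<std::size_t>(w[pos] - 'a') + 1 : 0;
            buckets[b].push_back(w);
        }
        // collect the buckets in order, keeping the order inside each one
        words.clear();
        for (const auto& bucket : buckets)
            words.insert(words.end(), bucket.begin(), bucket.end());
    }
}
```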

57
  • Complexity of Radix Sort
  • The complexity is O(n)
  • The key size (for example, the maximum number
    of digits) is a factor, but the relationship is
    still linear because, for example, for at
    most 3 digits, 3n is still O(n)
  • Although theoretically O(n) is an impressive
    running time for a sort, it does not include the
    cost of the queue implementation
  • Further, if the radix r (the base) is a large number
    and a large amount of data has to be sorted, then
    the radix sort algorithm requires r queues of at most
    size n, and the number rn is O(rn), which can be
    substantially large depending on the size of r.