Chapter 07 Internal Sorting

1
Chapter 07 Internal Sorting
CS350-Data Structures
Text by Clifford Shaffer, modified by Dr. Martins
  • Chaminade University of Honolulu
  • Department of Computer Science

2
Sorting
  • Each record contains a field called the key.
  • Linear order comparison.
  • Measures of cost: comparisons and swaps.
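The two cost measures can be made concrete by instrumenting a sort. The following is a minimal sketch, assuming integer keys; the names SortCost and countedInsertionSort are illustrative and do not come from the slides.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper type (not from the slides): tallies of the two cost measures.
struct SortCost {
    long comparisons = 0;
    long swaps = 0;
};

// Insertion sort on integer keys that counts key comparisons and record swaps.
SortCost countedInsertionSort(std::vector<int>& a) {
    SortCost cost;
    for (std::size_t i = 1; i < a.size(); ++i) {
        for (std::size_t j = i; j > 0; --j) {
            ++cost.comparisons;              // one key comparison
            if (!(a[j] < a[j - 1])) break;   // already in order: stop this pass
            std::swap(a[j], a[j - 1]);
            ++cost.swaps;                    // one record swap
        }
    }
    return cost;
}
```

On an already sorted input of five keys this reports 4 comparisons and 0 swaps; on the reversed input it reports 10 of each.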

3
Insertion Sort (1)
4
Insertion Sort (2)
    template <class Elem, class Comp>
    void inssort(Elem A[], int n) {
      for (int i=1; i<n; i++)
        for (int j=i; (j>0) &&
             (Comp::lt(A[j], A[j-1])); j--)
          swap(A, j, j-1);
    }
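  • Best Case: 0 swaps, n - 1 comparisons (input already sorted).
  • Worst Case: n(n-1)/2 swaps and comparisons (input in reverse
    order).
  • Average Case: about n^2/4 swaps and comparisons.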

5
Bubble Sort (1)
6
Bubble Sort (2)
    template <class Elem, class Comp>
    void bubsort(Elem A[], int n) {
      for (int i=0; i<n-1; i++)
        for (int j=n-1; j>i; j--)
          if (Comp::lt(A[j], A[j-1]))
            swap(A, j, j-1);
    }
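  • Best Case: 0 swaps, n(n-1)/2 comparisons.
  • Worst Case: n(n-1)/2 swaps and comparisons.
  • Average Case: about n^2/4 swaps, n(n-1)/2 comparisons.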

7
Selection Sort (1)
8
Selection Sort (2)
    template <class Elem, class Comp>
    void selsort(Elem A[], int n) {
      for (int i=0; i<n-1; i++) {
        int lowindex = i;           // Remember its index
        for (int j=n-1; j>i; j--)   // Find least
          if (Comp::lt(A[j], A[lowindex]))
            lowindex = j;
        swap(A, i, lowindex);       // Put it in place
      }
    }
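  • Best Case: n(n-1)/2 comparisons; the n - 1 calls to swap are
    all self-swaps.
  • Worst Case: n - 1 swaps, n(n-1)/2 comparisons.
  • Average Case: Θ(n) swaps, n(n-1)/2 comparisons.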

9
Pointer Swapping
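Pointer swapping means sorting an array of pointers to the records rather than the records themselves, so each exchange moves only a pointer. A minimal sketch, assuming a hypothetical Record type with an integer key (neither Record nor sortedView appears in the slides):

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical large record; only the key takes part in comparisons.
struct Record {
    int key;
    std::string payload;
};

// Returns pointers to the records in ascending key order.
// The records themselves are never moved; only pointers are exchanged.
std::vector<const Record*> sortedView(const std::vector<Record>& records) {
    std::vector<const Record*> view;
    view.reserve(records.size());
    for (const Record& r : records) view.push_back(&r);
    std::sort(view.begin(), view.end(),
              [](const Record* a, const Record* b) { return a->key < b->key; });
    return view;
}
```

The returned pointers stay valid only while the original vector is neither resized nor destroyed.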
10
Summary
11
Exchange Sorting
  • All of the sorts so far rely on exchanges of
    adjacent records.
  • What is the average number of exchanges required?
  • There are n! permutations
  • Consider permutation X and its reverse, X'.
  • Together, every such pair requires n(n-1)/2 exchanges, so the
    average over all permutations is n(n-1)/4.
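The pairing argument can be checked numerically by counting inversions, since each adjacent exchange removes at most one inversion. A minimal sketch, assuming distinct integer keys; the function names are illustrative only.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Counts inversions: pairs (i, j) with i < j and a[i] > a[j].
long countInversions(const std::vector<int>& a) {
    long inversions = 0;
    for (std::size_t i = 0; i < a.size(); ++i)
        for (std::size_t j = i + 1; j < a.size(); ++j)
            if (a[i] > a[j]) ++inversions;
    return inversions;
}

// Inversions of a permutation plus inversions of its reverse.
// For n distinct keys this is always n(n-1)/2.
long inversionsWithReverse(std::vector<int> x) {
    long forward = countInversions(x);
    std::reverse(x.begin(), x.end());
    return forward + countInversions(x);
}
```

For the permutation 3 1 4 2 5 the forward count is 3 and the reverse contributes 7, giving 10 = 5(4)/2.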

12
Shellsort
13
Shellsort
    // Modified version of Insertion Sort
    template <class Elem, class Comp>
    void inssort2(Elem A[], int n, int incr) {
      for (int i=incr; i<n; i+=incr)
        for (int j=i;
             (j>=incr) &&
             (Comp::lt(A[j], A[j-incr])); j-=incr)
          swap(A, j, j-incr);
    }

    template <class Elem, class Comp>
    void shellsort(Elem A[], int n) { // Shellsort
      for (int i=n/2; i>2; i/=2)      // For each incr
        for (int j=0; j<i; j++)       // Sort sublists
          inssort2<Elem,Comp>(&A[j], n-j, i);
      inssort2<Elem,Comp>(A, n, 1);
    }

14
Quicksort
    template <class Elem, class Comp>
    void qsort(Elem A[], int i, int j) {
      if (j <= i) return; // List too small
      int pivotindex = findpivot(A, i, j);
      swap(A, pivotindex, j); // Put pivot at end
      // k will be first position on right side
      int k =
        partition<Elem,Comp>(A, i-1, j, A[j]);
      swap(A, k, j); // Put pivot in place
      qsort<Elem,Comp>(A, i, k-1);
      qsort<Elem,Comp>(A, k+1, j);
    }

    template <class Elem>
    int findpivot(Elem A[], int i, int j)
      { return (i+j)/2; }

15
Quicksort Partition
    template <class Elem, class Comp>
    int partition(Elem A[], int l, int r,
                  Elem pivot) {
      do { // Move the bounds inward until they meet
        while (Comp::lt(A[++l], pivot)); // Move l right and
        while ((r != 0) && Comp::gt(A[--r], pivot)); // r left
        swap(A, l, r); // Swap out-of-place values
      } while (l < r); // Stop when they cross
      swap(A, l, r); // Reverse last swap
      return l; // Return first pos on right
    }
  • The cost for partition is Θ(n).

16
Partition Example
17
Quicksort Example
18
Cost of Quicksort
  • Best case: Always partition in half.
  • Worst case: Bad partition.
  • Average case:
    $T(n) = n + 1 + \frac{1}{n-1} \sum_{k=1}^{n-1} \bigl(T(k) + T(n-k)\bigr)$
  • Optimizations for Quicksort:
  • Better Pivot
  • Better algorithm for small sublists
  • Eliminate recursion
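The three optimizations listed above can be combined in one routine. The following is a sketch under stated assumptions, not the slides' implementation: it sorts integer keys, picks the pivot by median of three, hands sublists below an assumed cutoff (kSmallSublist, chosen arbitrarily here) to insertion sort, and replaces recursion with an explicit stack. The partition step is a simple single-scan variant rather than the two-index partition shown earlier.

```cpp
#include <utility>
#include <vector>

// Assumed cutoff below which insertion sort takes over; tune empirically.
const int kSmallSublist = 9;

// Insertion sort on a[lo..hi], inclusive. Does nothing for empty ranges.
void insertionSortRange(std::vector<int>& a, int lo, int hi) {
    for (int i = lo + 1; i <= hi; ++i)
        for (int j = i; j > lo && a[j] < a[j - 1]; --j)
            std::swap(a[j], a[j - 1]);
}

// Index of the median of a[lo], a[mid], a[hi].
// Only the local index variables are reordered; the array is untouched.
int medianOfThree(const std::vector<int>& a, int lo, int hi) {
    int mid = lo + (hi - lo) / 2;
    if (a[mid] < a[lo]) std::swap(lo, mid);
    if (a[hi] < a[lo]) std::swap(lo, hi);
    if (a[hi] < a[mid]) std::swap(mid, hi);
    return mid;
}

void quicksortOptimized(std::vector<int>& a) {
    if (a.size() < 2) return;
    // Explicit stack of pending (lo, hi) sublists replaces recursion.
    std::vector<std::pair<int, int>> pending;
    pending.push_back({0, static_cast<int>(a.size()) - 1});
    while (!pending.empty()) {
        int lo = pending.back().first;
        int hi = pending.back().second;
        pending.pop_back();
        if (hi - lo < kSmallSublist) {       // small sublist: insertion sort
            insertionSortRange(a, lo, hi);
            continue;
        }
        std::swap(a[medianOfThree(a, lo, hi)], a[hi]);  // pivot to the end
        int pivot = a[hi];
        int k = lo;                          // first position of the right side
        for (int i = lo; i < hi; ++i)
            if (a[i] < pivot) std::swap(a[i], a[k++]);
        std::swap(a[k], a[hi]);              // put pivot in place
        pending.push_back({lo, k - 1});
        pending.push_back({k + 1, hi});
    }
}
```

A further refinement, omitted here for brevity, is to push the larger sublist first so that the stack depth stays logarithmic.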
19
Mergesort
    List mergesort(List inlist) {
      if (inlist.length() <= 1) return inlist;
      List l1 = half of the items from inlist;
      List l2 = other half of items from inlist;
      return merge(mergesort(l1),
                   mergesort(l2));
    }

20
Mergesort Implementation
    template <class Elem, class Comp>
    void mergesort(Elem A[], Elem temp[],
                   int left, int right) {
      int mid = (left+right)/2;
      if (left == right) return;
      mergesort<Elem,Comp>(A, temp, left, mid);
      mergesort<Elem,Comp>(A, temp, mid+1, right);
      for (int i=left; i<=right; i++) // Copy
        temp[i] = A[i];
      int i1 = left; int i2 = mid + 1;
      for (int curr=left; curr<=right; curr++) {
        if (i1 == mid+1) // Left exhausted
          A[curr] = temp[i2++];
        else if (i2 > right) // Right exhausted
          A[curr] = temp[i1++];
        else if (Comp::lt(temp[i1], temp[i2]))
          A[curr] = temp[i1++];
        else A[curr] = temp[i2++];
      }
    }

21
Optimized Mergesort
    template <class Elem, class Comp>
    void mergesort(Elem A[], Elem temp[],
                   int left, int right) {
      if ((right-left) <= THRESHOLD) {
        inssort<Elem,Comp>(&A[left], right-left+1);
        return;
      }
      int i, j, k, mid = (left+right)/2;
      if (left == right) return;
      mergesort<Elem,Comp>(A, temp, left, mid);
      mergesort<Elem,Comp>(A, temp, mid+1, right);
      for (i=mid; i>=left; i--) temp[i] = A[i];
      for (j=1; j<=right-mid; j++)
        temp[right-j+1] = A[j+mid];
      for (i=left,j=right,k=left; k<=right; k++)
        if (temp[i] < temp[j]) A[k] = temp[i++];
        else A[k] = temp[j--];
    }

22
Mergesort Cost
  • Mergesort cost: Θ(n log n) in the best, average, and worst
    cases.
  • Mergesort is also good for sorting linked lists.
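Mergesort suits linked lists because merging only relinks nodes and needs no random access or auxiliary array. A minimal sketch, assuming a hypothetical singly linked Node with an integer key (the slides' List class is not used here):

```cpp
// Hypothetical singly linked node with an integer key.
struct Node {
    int key;
    Node* next;
};

// Merges two sorted lists by relinking nodes; no records are copied.
Node* mergeLists(Node* a, Node* b) {
    Node head{0, nullptr};            // dummy node simplifies appending
    Node* tail = &head;
    while (a && b) {
        // Reference to whichever list head is smaller; ties take from a,
        // which keeps the sort stable.
        Node*& smaller = (b->key < a->key) ? b : a;
        tail->next = smaller;
        tail = smaller;
        smaller = smaller->next;      // advance that list
    }
    tail->next = a ? a : b;           // append the remainder
    return head.next;
}

Node* mergesortList(Node* list) {
    if (!list || !list->next) return list;   // 0 or 1 node: already sorted
    // Find the middle with slow/fast pointers, then cut the list in two.
    Node* slow = list;
    Node* fast = list->next;
    while (fast && fast->next) {
        slow = slow->next;
        fast = fast->next->next;
    }
    Node* second = slow->next;
    slow->next = nullptr;
    return mergeLists(mergesortList(list), mergesortList(second));
}
```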

23
Heapsort
    template <class Elem, class Comp>
    void heapsort(Elem A[], int n) { // Heapsort
      Elem mval;
      maxheap<Elem,Comp> H(A, n, n);
      for (int i=0; i<n; i++)        // Now sort
        H.removemax(mval);           // Put max at end
    }
  • Use a max-heap, so that elements end up sorted
    within the array.
  • Cost of heapsort: Θ(n log n).
  • Cost of finding K largest elements: Θ(n + k log n).
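Finding only the K largest elements is cheaper than a full sort: build the heap once, then remove the maximum K times. A minimal sketch using the standard library heap functions in place of the slides' maxheap class; kLargest is an illustrative name.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Returns the k largest values in descending order.
// Building the heap is linear; each removal costs O(log n).
std::vector<int> kLargest(std::vector<int> values, std::size_t k) {
    if (k > values.size()) k = values.size();
    std::make_heap(values.begin(), values.end());     // max-heap
    std::vector<int> result;
    result.reserve(k);
    for (std::size_t i = 0; i < k; ++i) {
        std::pop_heap(values.begin(), values.end());  // max moves to the back
        result.push_back(values.back());
        values.pop_back();
    }
    return result;
}
```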

24
Heapsort Example (1)
25
Heapsort Example (2)
26
Binsort (1)
  • A simple, efficient sort
    for (i=0; i<n; i++)
      B[A[i]] = A[i];
  • Ways to generalize
  • Make each bin the head of a list.
  • Allow more keys than records.

27
Binsort (2)
    template <class Elem>
    void binsort(Elem A[], int n) {
      List<Elem> B[MaxKeyValue];
      Elem item;
      for (int i=0; i<n; i++) B[A[i]].append(A[i]);
      for (int i=0; i<MaxKeyValue; i++)
        for (B[i].setStart();
             B[i].getValue(item); B[i].next())
          output(item);
    }
  • Cost: Θ(n + MaxKeyValue).

28
Radix Sort (1)
29
Radix Sort (2)
    template <class Elem, class Comp>
    void radix(Elem A[], Elem B[],
               int n, int k, int r, int cnt[]) {
      // cnt[i] stores # of records in bin[i]
      int j;
      for (int i=0, rtok=1; i<k; i++, rtok*=r) {
        for (j=0; j<r; j++) cnt[j] = 0;
        // Count # of records for each bin
        for (j=0; j<n; j++) cnt[(A[j]/rtok)%r]++;
        // cnt[j] will be last slot of bin j.
        for (j=1; j<r; j++)
          cnt[j] = cnt[j-1] + cnt[j];
        for (j=n-1; j>=0; j--)
          B[--cnt[(A[j]/rtok)%r]] = A[j];
        for (j=0; j<n; j++) A[j] = B[j];
      }
    }

30
Radix Sort Example
31
Radix Sort Cost
  • Cost: Θ(nk + rk)
  • How do n, k, and r relate?
  • If key range is small, then this can be Θ(n).
  • If there are n distinct keys, then the length of
    a key must be at least log n.
  • Thus, Radix Sort is Θ(n log n) in the general case.
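The step from key length to running time can be written out as follows, taking n as the number of records, k as the number of digits per key, and r as the radix, and assuming r is a constant:

```latex
\[
n \text{ distinct keys in base } r
\;\Longrightarrow\;
r^{k} \ge n
\;\Longrightarrow\;
k \ge \lceil \log_r n \rceil ,
\]
\[
nk + rk \;\ge\; n \lceil \log_r n \rceil \;=\; \Omega(n \log n)
\quad \text{for constant } r .
\]
```

When the key range is small, k is a constant independent of n and the same expression collapses to Θ(n).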

32
Empirical Comparison (1)
33
Empirical Comparison (2)
34
Sorting Lower Bound
  • We would like to know a lower bound for all
    possible sorting algorithms.
  • Sorting is O(n log n) (average, worst cases)
    because we know of algorithms with this upper
    bound.
  • Sorting I/O takes Ω(n) time.
  • We will now prove an Ω(n log n) lower bound for
    sorting.

35
Decision Trees
36
Lower Bound Proof
  • There are n! permutations.
  • A sorting algorithm can be viewed as determining
    which permutation has been input.
  • Each leaf node of the decision tree corresponds
    to one permutation.
  • A tree with n nodes has Ω(log n) levels, so the
    tree with n! leaves has Ω(log n!) = Ω(n log n)
    levels.
  • Which node in the decision tree corresponds to
    the worst case?
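The bound can be made concrete for small n: a binary decision tree with n! leaves needs at least ceil(log2(n!)) levels, so any comparison sort makes at least that many comparisons in the worst case. A minimal sketch; the function name and the rounding tolerance are choices made here, not part of the slides.

```cpp
#include <cmath>

// Information-theoretic lower bound on worst-case comparisons: ceil(log2(n!)).
// Summing logarithms avoids overflowing n!.
int minWorstCaseComparisons(int n) {
    double bits = 0.0;
    for (int i = 2; i <= n; ++i) bits += std::log2(static_cast<double>(i));
    // Small tolerance so that floating-point noise does not round up.
    return static_cast<int>(std::ceil(bits - 1e-9));
}
```

For example, sorting 5 keys requires at least 7 comparisons in the worst case, since log2(120) is about 6.9.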

37
Primary vs. Secondary Storage
  • Primary storage: Main memory (RAM)
  • Secondary storage: Peripheral devices
  • Disk drives
  • Tape drives

38
Comparisons
  • RAM is usually volatile.
  • RAM is about 1/4 million times faster than disk.

39
Golden Rule of File Processing
  • Minimize the number of disk accesses!
  • 1. Arrange information so that you get what you
    want with few disk accesses.
  • 2. Arrange information to minimize future disk
    accesses.
  • An organization for data on disk is often called
    a file structure.
  • Disk-based space/time tradeoff: Compress
    information to save processing time by reducing
    disk accesses.

40
Alternate Implementation
  • As with Dijkstra's algorithm, the key issue is
    determining which vertex is next closest.
  • As with Dijkstra's algorithm, the alternative is
    to use a priority queue.
  • Running times for the two implementations are
    identical to the corresponding Dijkstra's
    algorithm implementations.
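The slide does not name the algorithm it describes. Assuming it refers to Prim's MST algorithm, which the neighboring MST slides suggest, the priority-queue alternative might look like the sketch below. The Graph alias, the (weight, vertex) queue entries, and the function name are assumptions made for illustration, and the graph is assumed to be undirected and connected.

```cpp
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Adjacency list: graph[v] holds (neighbor, weight) pairs.
using Graph = std::vector<std::vector<std::pair<int, int>>>;
using Entry = std::pair<int, int>;   // (edge weight, vertex)

// Total weight of a minimum spanning tree, grown from vertex 0.
long primMstWeight(const Graph& graph) {
    if (graph.empty()) return 0;
    std::vector<bool> inTree(graph.size(), false);
    // Min-heap: the top entry is the next closest vertex to the tree.
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> frontier;
    frontier.push({0, 0});
    long total = 0;
    while (!frontier.empty()) {
        Entry top = frontier.top();
        frontier.pop();
        int v = top.second;
        if (inTree[v]) continue;      // stale entry: vertex already added
        inTree[v] = true;
        total += top.first;
        for (const auto& edge : graph[v])
            if (!inTree[edge.first]) frontier.push({edge.second, edge.first});
    }
    return total;
}
```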

41
Kruskal's MST Algorithm (1)
  • Initially, each vertex is in its own MST.
  • Merge two MSTs that have the shortest edge
    between them.
  • Use a priority queue to order the unprocessed
    edges. Grab next one at each step.
  • How to tell if an edge connects two vertices
    already in the same MST?
  • Use the UNION/FIND algorithm with parent-pointer
    representation.
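The steps above can be sketched with a parent-pointer forest for UNION/FIND. This is a minimal illustration, not the textbook's code: the Edge struct and function names are assumptions, and sorting the edge list stands in for the priority queue mentioned on the slide.

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

// Hypothetical edge record for an undirected weighted graph.
struct Edge {
    int from, to, weight;
};

// FIND on a parent-pointer forest, shortening paths as it climbs
// (path halving, a form of path compression).
int findRoot(std::vector<int>& parent, int v) {
    while (parent[v] != v) {
        parent[v] = parent[parent[v]];
        v = parent[v];
    }
    return v;
}

// Returns the MST edges for vertices 0..vertexCount-1.
std::vector<Edge> kruskal(int vertexCount, std::vector<Edge> edges) {
    std::vector<int> parent(vertexCount);
    std::iota(parent.begin(), parent.end(), 0);   // each vertex is its own tree
    std::sort(edges.begin(), edges.end(),
              [](const Edge& a, const Edge& b) { return a.weight < b.weight; });
    std::vector<Edge> mst;
    for (const Edge& e : edges) {
        int a = findRoot(parent, e.from);
        int b = findRoot(parent, e.to);
        if (a == b) continue;          // endpoints already in the same tree
        parent[a] = b;                 // UNION the two trees
        mst.push_back(e);
        if (static_cast<int>(mst.size()) == vertexCount - 1) break;  // all joined
    }
    return mst;
}
```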

42
Kruskal's MST Algorithm (2)
43
Kruskal's MST Algorithm (3)
  • Cost is dominated by the time to remove edges
    from the heap.
  • Can stop processing edges once all vertices are
    in the same MST.
  • Total cost: Θ(|V| + |E| log |E|).