CS 3343: Analysis of Algorithms - PowerPoint PPT Presentation

About This Presentation
Title:

CS 3343: Analysis of Algorithms

Description:

The ith order statistic in a set of n elements is the ith ... Yes, due to Blum, Floyd, Pratt, Rivest, and Tarjan [1973]. Worst-case linear-time selection ...

Slides: 26
Provided by: jianhu
Transcript and Presenter's Notes

Title: CS 3343: Analysis of Algorithms


1
CS 3343: Analysis of Algorithms
  • Lecture 14: Order Statistics

2
Order statistics
  • The ith order statistic in a set of n elements is
    the ith smallest element
  • The minimum is thus the 1st order statistic
  • The maximum is the nth order statistic
  • The median is the ⌈n/2⌉th order statistic when n is odd
  • If n is even, there are 2 medians: the (n/2)th and
    the (n/2 + 1)th order statistics
  • How can we calculate order statistics?
  • What is the running time?

3
Order statistics: the selection problem
  • Select the ith smallest of n elements
  • Naive algorithm: sort, then return the ith element.
  • Worst-case running time: Θ(n log n)
  • using merge sort or heapsort (not quicksort).
  • We will show:
  • A practical randomized algorithm with Θ(n)
    expected running time
  • A cool algorithm of theoretical interest only
    with Θ(n) worst-case running time
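The naive approach can be sketched in a few lines of Python. The function name `select_by_sorting` and the 1-indexed convention for i are assumptions made for illustration; the sample array is the one used in the example later in this lecture.

```python
def select_by_sorting(items, i):
    """Return the ith smallest element (1-indexed) by sorting a copy.

    Python's built-in sort runs in O(n log n) time in the worst case,
    so this matches the naive bound on the slide.
    """
    if not 1 <= i <= len(items):
        raise IndexError("i must be between 1 and len(items)")
    return sorted(items)[i - 1]


data = [7, 10, 5, 8, 11, 3, 2, 13]
minimum = select_by_sorting(data, 1)                    # 1st order statistic
maximum = select_by_sorting(data, len(data))            # nth order statistic
lower_median = select_by_sorting(data, len(data) // 2)  # (n/2)th, n even
```

Sorting does more work than selection needs, since it ranks every element when only one rank was asked for. That is the slack the two algorithms below exploit.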

4
Recall: Quicksort
  • The function Partition gives us the rank k of the
    pivot
  • If we are lucky, k = i: done!
  • If not, we at least get a smaller subarray to work
    with
  • k > i: the ith smallest is in the left subarray
  • k < i: the ith smallest is in the right subarray
  • Divide and conquer
  • If we are lucky, k is close to n/2, or the desired
    element is in the smaller subarray
  • If unlucky, the desired element is in the larger
    subarray (possible size n - 1)

5
Randomized divide-and-conquer algorithm
RAND-SELECT(A, p, q, i)    // ith smallest of A[p . . q]
    if p = q and i > 1 then error!
    r ← RAND-PARTITION(A, p, q)
    k ← r - p + 1    // k = rank(A[r])
    if i = k then return A[r]
    if i < k
        then return RAND-SELECT(A, p, r - 1, i)
        else return RAND-SELECT(A, r + 1, q, i - k)
6
Randomized Partition
  • Randomly choose an element as pivot
  • Each time a partition is needed, roll a die to
    decide which element to use as the pivot
  • Each element is selected with probability 1/n

Rand-Partition(A, p, q)
    d = random()    // draw a random number between 0 and 1
    index = p + floor((q - p + 1) · d)    // p ≤ index ≤ q
    swap(A[p], A[index])
    Partition(A, p, q)    // now use A[p] as pivot
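A minimal Python sketch of RAND-SELECT together with Rand-Partition follows. The slides do not spell out Partition itself, so the Lomuto-style `partition` below is an assumption, as are the snake_case names and the 0-based array bounds (i stays 1-indexed, as in the pseudocode).

```python
import random


def partition(a, p, q):
    """Partition a[p..q] around the pivot a[p]; return the pivot's final index."""
    pivot = a[p]
    i = p
    for j in range(p + 1, q + 1):
        if a[j] <= pivot:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[p], a[i] = a[i], a[p]
    return i


def rand_partition(a, p, q):
    """Move a uniformly chosen element of a[p..q] to the front, then partition."""
    d = random.random()              # random number in [0, 1)
    index = p + int((q - p + 1) * d)  # p <= index <= q
    a[p], a[index] = a[index], a[p]
    return partition(a, p, q)


def rand_select(a, p, q, i):
    """Return the ith smallest (1-indexed) element of a[p..q]; rearranges a."""
    if not 1 <= i <= q - p + 1:
        raise IndexError("i is out of range for a[p..q]")
    if p == q:
        return a[p]
    r = rand_partition(a, p, q)
    k = r - p + 1                    # rank of the pivot within a[p..q]
    if i == k:
        return a[r]
    if i < k:
        return rand_select(a, p, r - 1, i)
    return rand_select(a, r + 1, q, i - k)
```

A call such as `rand_select(list(data), 0, len(data) - 1, 6)` passes a copy because the function rearranges its input in place.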
7
Example
Select the i = 6th smallest:
7 10 5 8 11 3 2 13
[Figure: the array above with the pivot marked.]
8
Complete example: select the 6th smallest element (i = 6).
7 10 5 8 11 3 2 13
Note: here we always used the first element as the pivot
for the partition (instead of Rand-Partition).
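The example can be replayed with a short Python sketch that always takes the first element of the current subarray as the pivot. The partition scheme (Lomuto-style) is an assumption, so the intermediate arrangements may differ from the ones drawn on the slides; the function name `select_first_pivot` and the recorded `(pivot, rank)` steps are illustrative only.

```python
def partition(a, p, q):
    """Partition a[p..q] around the pivot a[p]; return the pivot's final index."""
    pivot = a[p]
    i = p
    for j in range(p + 1, q + 1):
        if a[j] <= pivot:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[p], a[i] = a[i], a[p]
    return i


def select_first_pivot(items, i):
    """Return (ith smallest, list of (pivot, rank-in-subarray) steps)."""
    a = list(items)
    if not 1 <= i <= len(a):
        raise IndexError("i must be between 1 and len(items)")
    p, q = 0, len(a) - 1
    steps = []
    while p < q:
        r = partition(a, p, q)
        k = r - p + 1                # rank of the pivot within a[p..q]
        steps.append((a[r], k))
        if i == k:
            return a[r], steps
        if i < k:
            q = r - 1                # continue in the left subarray
        else:
            p, i = r + 1, i - k      # continue in the right subarray
    return a[p], steps


answer, steps = select_first_pivot([7, 10, 5, 8, 11, 3, 2, 13], 6)
```

The first pivot, 7, has rank 4, so the search continues on the right with i = 6 - 4 = 2.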
9
Intuition for analysis
(All our analyses today assume that all elements
are distinct.)
Lucky:
T(n) = T(9n/10) + Θ(n) = Θ(n)
(CASE 3 of the master theorem)
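The lucky case can also be checked by unrolling the recurrence rather than appealing to the master theorem. The sketch below assumes the Θ(n) term is exactly cn for some constant c > 0, which is a simplification for illustration.

```latex
\begin{align*}
T(n) &= T\!\left(\tfrac{9n}{10}\right) + cn \\
     &\le cn \sum_{j=0}^{\infty} \left(\tfrac{9}{10}\right)^{j}
      = cn \cdot \frac{1}{1 - 9/10}
      = 10\,cn .
\end{align*}
```

Together with the trivial lower bound T(n) ≥ cn, this gives T(n) = Θ(n): even a 9 : 1 split at every level costs only a constant factor.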
10
Running time of randomized selection
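\[
T(n) \le
\begin{cases}
T(\max(0,\, n-1)) + n & \text{if } 0 : n-1 \text{ split},\\
T(\max(1,\, n-2)) + n & \text{if } 1 : n-2 \text{ split},\\
\quad\vdots \\
T(\max(n-1,\, 0)) + n & \text{if } n-1 : 0 \text{ split}.
\end{cases}
\]
  • For an upper bound, assume the ith element always
    falls in the larger side of the partition
  • The expected running time is an average over all
    cases

Expectation:
\[
T(n) \le \frac{1}{n} \sum_{k=0}^{n-1} T(\max(k,\, n-k-1)) + n
\]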
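As a numerical sanity check, the averaged recurrence can be evaluated directly. This sketch assumes each of the n splits (k : n - k - 1 for k = 0, ..., n - 1) is equally likely, charges exactly n for the partition, takes the recurrence with equality, and sets T(0) = 0; the function name `expected_cost_bound` is made up for illustration.

```python
def expected_cost_bound(n_max):
    """Tabulate T(n) = (1/n) * sum_k T(max(k, n-k-1)) + n with T(0) = 0."""
    t = [0.0] * (n_max + 1)
    for n in range(1, n_max + 1):
        # Average over the n equally likely splits k : n-k-1,
        # always recursing into the larger side.
        total = sum(t[max(k, n - k - 1)] for k in range(n))
        t[n] = total / n + n
    return t


table = expected_cost_bound(500)
within_linear_bound = all(table[n] <= 4 * n for n in range(len(table)))
```

Under these assumptions every tabulated value stays at or below 4n, which is consistent with a linear bound.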
11
Substitution method
Want to show T(n) = O(n). So we need to prove T(n) ≤
cn for n > n0.
Assume T(k) ≤ ck for all k < n.
The inductive step holds if c ≥ 4.
Therefore, T(n) = O(n).
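One way to carry out the inductive step is sketched below. It assumes two things that are not stated on the slide: that the averaged recurrence is bounded by folding the sum onto its upper half, giving T(n) ≤ (2/n) Σ_{k=⌊n/2⌋}^{n-1} T(k) + n, and the arithmetic fact Σ_{k=⌊n/2⌋}^{n-1} k ≤ 3n²/8.

```latex
\begin{align*}
T(n) &\le \frac{2}{n} \sum_{k=\lfloor n/2 \rfloor}^{n-1} T(k) + n \\
     &\le \frac{2}{n} \sum_{k=\lfloor n/2 \rfloor}^{n-1} ck + n
        && \text{(inductive hypothesis)} \\
     &\le \frac{2c}{n} \cdot \frac{3n^{2}}{8} + n \\
     &= cn - \left(\frac{cn}{4} - n\right) \\
     &\le cn && \text{if } c \ge 4 .
\end{align*}
```

The last line needs the residual cn/4 - n to be nonnegative, which is exactly the condition c ≥ 4.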
12
Summary of randomized selection
  • Works fast: linear expected time.
  • Excellent algorithm in practice.
  • But the worst case is very bad: Θ(n²).

Q. Is there an algorithm that runs in linear time
in the worst case?
13
Worst-case linear-time selection
The partition-and-recurse steps are the same as in RAND-SELECT;
only the choice of pivot changes.
17
Choosing the pivot
  • Divide the n elements into groups of 5. Find the
    median of each 5-element group by rote.
  • Recursively SELECT the median x of the ⌊n/5⌋
    group medians to be the pivot.
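The pivot rule can be sketched as a complete selection routine in Python. This is a simplified, out-of-place version under stated assumptions: all elements are distinct (as assumed throughout the lecture), only the ⌊n/5⌋ full groups contribute medians, inputs of at most 5 elements are handled by sorting, and new lists are built instead of partitioning in place. The function name `select` and these choices are illustrative rather than taken from the slides.

```python
def select(items, i):
    """Return the ith smallest (1-indexed) element; assumes distinct elements."""
    a = list(items)
    if not 1 <= i <= len(a):
        raise IndexError("i must be between 1 and len(items)")
    if len(a) <= 5:
        return sorted(a)[i - 1]      # small inputs: constant work

    # Median of each full group of 5 (a leftover partial group is ignored).
    medians = [sorted(a[j:j + 5])[2] for j in range(0, len(a) - 4, 5)]

    # Recursively select the (lower) median of the group medians as the pivot.
    x = select(medians, (len(medians) + 1) // 2)

    # Partition around x, then recurse on the side that holds the answer.
    lower = [v for v in a if v < x]
    upper = [v for v in a if v > x]
    k = len(lower) + 1               # rank of x
    if i == k:
        return x
    if i < k:
        return select(lower, i)
    return select(upper, i - k)
```

Unlike the randomized version, no call here depends on luck: the pivot is always guaranteed to discard a constant fraction of the elements, which is what the analysis that follows quantifies.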

20
Analysis
  • At least half the group medians are ≤ x, which is
    at least ⌊⌊n/5⌋/2⌋ = ⌊n/10⌋ group medians.
  • Therefore, at least 3⌊n/10⌋ elements are ≤ x.
  • Similarly, at least 3⌊n/10⌋ elements are ≥ x.

21
Analysis
We need an "at most" bound for the worst-case running time:
  • At least 3⌊n/10⌋ elements are ≤ x ⇒ at most
    n - 3⌊n/10⌋ elements are > x
  • At least 3⌊n/10⌋ elements are ≥ x ⇒ at most
    n - 3⌊n/10⌋ elements are < x
  • The recursive call to SELECT in Step 4 is
    executed on at most n - 3⌊n/10⌋
    elements.
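The counting argument can be checked empirically with a small Python sketch. For brevity it finds the median of the group medians by sorting rather than by a recursive call, and it uses the lower median when the number of groups is even; both choices, and the function names, are assumptions made for illustration. It requires n ≥ 5 so that at least one full group exists.

```python
def median_of_medians_pivot(a):
    """Pivot x: the lower median of the medians of the full groups of 5."""
    medians = [sorted(a[j:j + 5])[2] for j in range(0, len(a) - 4, 5)]
    return sorted(medians)[(len(medians) - 1) // 2]


def check_pivot_guarantee(a):
    """True if at least 3*floor(n/10) elements lie on each side of the pivot."""
    n = len(a)
    x = median_of_medians_pivot(a)
    at_most_x = sum(1 for v in a if v <= x)
    at_least_x = sum(1 for v in a if v >= x)
    bound = 3 * (n // 10)
    return at_most_x >= bound and at_least_x >= bound


# Distinct test values: 37 is coprime to 101, so (37 * j) % 101 never repeats.
results = [
    check_pivot_guarantee([(37 * j) % 101 for j in range(n)])
    for n in range(5, 101)
]
```

Each entry of `results` reports whether the guarantee held for one input size. When it holds, neither side of the partition can contain more than n - 3⌊n/10⌋ elements.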

22
Analysis
  • Use the fact that ⌊a/b⌋ > a/b - 1
  • n - 3⌊n/10⌋ < n - 3(n/10 - 1) = 7n/10 + 3
  • 7n/10 + 3 ≤ 3n/4 if n ≥ 60
  • The recursive call to SELECT in Step 4 is
    executed on at most 7n/10 + 3 elements.

23
Developing the recurrence
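Cost of SELECT on n elements, T(n):
  • Θ(n) to find the group medians
  • T(n/5) to recursively select the pivot
  • Θ(n) to partition around the pivot
  • T(7n/10 + 3) for the recursive call in Step 4
T(n) = T(n/5) + T(7n/10 + 3) + Θ(n)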
24
Solving the recurrence
Assumption: T(k) ≤ ck for all k < n.
Since 7n/10 + 3 ≤ 3n/4 if n ≥ 60, substituting the assumption
gives T(n) ≤ cn if c ≥ 20 and n ≥ 60.
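The substitution can be written out as follows. This sketch assumes the Θ(n) term is exactly n (any constant multiple only changes the threshold on c) and uses the bound 7n/10 + 3 ≤ 3n/4 for n ≥ 60 from the previous slide.

```latex
\begin{align*}
T(n) &= T\!\left(\tfrac{n}{5}\right) + T\!\left(\tfrac{7n}{10} + 3\right) + n \\
     &\le \frac{cn}{5} + c\left(\frac{7n}{10} + 3\right) + n
        && \text{(inductive hypothesis)} \\
     &\le \frac{cn}{5} + \frac{3cn}{4} + n
        && \text{if } n \ge 60 \\
     &= \frac{19\,cn}{20} + n
      = cn - \left(\frac{cn}{20} - n\right) \\
     &\le cn && \text{if } c \ge 20 \text{ and } n \ge 60 .
\end{align*}
```

The two recursive calls together cover at most 19/20 of the input, and that missing 1/20 is what leaves room to absorb the linear term.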
25
Conclusions
  • Since the work at each level of recursion is
    basically a constant fraction (19/20) smaller,
    the work per level is a geometric series
    dominated by the linear work at the root.
  • In practice, this algorithm runs slowly, because
    the constant in front of n is large.
  • The randomized algorithm is far more practical.

Exercise: Try dividing into groups of 3 or 7.
Exercise: Think about an application in sorting.