Title: Searching
1Searching
- The truth is out there ...
2Serial Search
- Brute force algorithm: examine each array item sequentially until either
  - the item is found, or
  - all items have been examined
- Algorithm is easy to code and works OK for small data sets
3Code example for serial search
// precondition: none
// postcondition: searches an array of N items for target value;
// returns true if target found, false if not
template <class item>
bool SerialSearch (item array[], size_t N, item target)
{
    bool found = false;
    for (size_t x = 0; (x < N) && (!found); x++)
        if (array[x] == target)
            found = true;
    return found;
}
4Time analysis of serial search
- Worst case: serial search is O(N) -- if the item is not found, the whole array has to be examined before this can be verified
- Best case: O(1) -- target value found at array[0]
- Average case: about (N+1)/2 items examined for a successful search -- still O(N), but about 1/2 the time required for the worst case
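The three cases above can be checked by counting how many items a serial search examines. The following is a minimal sketch, not code from the slides: `CountingSerialSearch` and `AverageComparisons` are hypothetical helper names, and the average assumes a non-empty array of distinct values with each stored value searched for once.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: serial search that also reports how many
// array items were examined before the search stopped.
template <class item>
bool CountingSerialSearch(const std::vector<item>& data, const item& target,
                          std::size_t& comparisons)
{
    comparisons = 0;
    for (std::size_t x = 0; x < data.size(); ++x)
    {
        ++comparisons;               // one more item examined
        if (data[x] == target)
            return true;             // stop as soon as the target is seen
    }
    return false;                    // every item was examined
}

// Average number of items examined over N successful searches,
// one search per stored value (assumes distinct values).
double AverageComparisons(const std::vector<int>& data)
{
    assert(!data.empty());
    std::size_t total = 0;
    for (int value : data)
    {
        std::size_t comparisons = 0;
        CountingSerialSearch(data, value, comparisons);
        total += comparisons;
    }
    return static_cast<double>(total) / data.size();
}
```

For an array of 9 distinct values, the target at index 0 costs 1 comparison, a missing target costs 9, and the average over all stored values is (9+1)/2 = 5.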
5Binary search
- Much faster than serial search
- Works only if data are sorted
- Uses divide and conquer approach with recursive calls
  - check value at midpoint; if it is not the target, then
  - if the target is greater than the midpoint value, make recursive call to search upper half of structure
  - if the target is less than the midpoint value, recursively search lower half
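To illustrate the sorted-data requirement, here is a small sketch (not from the slides) that picks between the two strategies using C++ standard library calls: `std::is_sorted`, `std::binary_search`, and `std::find` from `<algorithm>`. The function name `Contains` is hypothetical. Note that the sortedness check is itself O(N), so real code would normally know in advance whether its data are sorted; the check is here only to make the precondition visible.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: use binary search only when its precondition
// (ascending order) holds, otherwise fall back to a serial search.
bool Contains(const std::vector<int>& data, int target)
{
    if (std::is_sorted(data.begin(), data.end()))
        return std::binary_search(data.begin(), data.end(), target); // O(log N) comparisons
    return std::find(data.begin(), data.end(), target) != data.end(); // O(N) serial search
}
```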
6Implementation of binary search
// precondition: array[first] through array[first + size - 1]
// are sorted in ascending order
// postcondition: searches the size items starting at array[first]
// for target value; found is true if target found, false if not
template <class item>
void BinarySearch(item array[], size_t first, size_t size,
                  item target, bool& found, size_t& location)
// parameters: array is the array to be searched,
// first is the first index to be considered,
// size is the number of items in search group,
// target is the value being sought,
// found is the success/failure flag,
// location is the index of the entry containing the
// target value, if found
7Binary search code continued
{   // start of function
    size_t middle;   // index of midpoint of current search area
    if (size == 0)
        found = false;   // base case
    else
    {
        middle = first + size / 2;
        if (target == array[middle])
        {
            location = middle;
            found = true;
        }
8Binary search code continued
        // target not found at current midpoint -- search appropriate half
        else if (target < array[middle])
            BinarySearch (array, first, size/2, target, found, location);
            // searches from start of array to index before midpoint
        else
            BinarySearch (array, middle+1, (size-1)/2, target, found, location);
            // searches from index after midpoint to end of array
    }   // ends outer else
}   // ends function
9Binary search in action
Suppose you have a 13-member array of sorted numbers:

value:   5  14  23  47  59  71  82  99 108 113 130 151 172
index:   0   1   2   3   4   5   6   7   8   9  10  11  12

Searching for value 113
Initial function call: first = 0, size = 13, middle = 6
10Binary search in action
value:   5  14  23  47  59  71  82  99 108 113 130 151 172
index:   0   1   2   3   4   5   6   7   8   9  10  11  12

Searching for value 113
Initial function call: first = 0, size = 13, middle = 6
Since 113 != 82 (and 113 > 82), make recursive call to search the upper half:
BinarySearch (array, middle+1, (size-1)/2, target, found, location)
11Binary search in action
value:   5  14  23  47  59  71  82  99 108 113 130 151 172
index:   0   1   2   3   4   5   6   7   8   9  10  11  12

Searching for value 113
Recursive call (1): first = 7, size = 6, middle = 10
12Binary search in action
value:   5  14  23  47  59  71  82  99 108 113 130 151 172
index:   0   1   2   3   4   5   6   7   8   9  10  11  12

Searching for value 113
Recursive call (1): first = 7, size = 6, middle = 10
Since 113 != 130 (and 113 < 130), make recursive call to search the lower half:
BinarySearch (array, first, size/2, target, found, location)
13Binary search in action
value:   5  14  23  47  59  71  82  99 108 113 130 151 172
index:   0   1   2   3   4   5   6   7   8   9  10  11  12

Searching for value 113
Recursive call (2): first = 7, size = 3, middle = 8
14Binary search in action
value:   5  14  23  47  59  71  82  99 108 113 130 151 172
index:   0   1   2   3   4   5   6   7   8   9  10  11  12

Searching for value 113
Recursive call (2): first = 7, size = 3, middle = 8
Since 113 != 108 (and 113 > 108), make recursive call to search the upper half:
BinarySearch (array, middle+1, (size-1)/2, target, found, location)
15Binary search in action
value:   5  14  23  47  59  71  82  99 108 113 130 151 172
index:   0   1   2   3   4   5   6   7   8   9  10  11  12

Searching for value 113
Recursive call (3): first = 9, size = 1, middle = 9
16Binary search in action
value:   5  14  23  47  59  71  82  99 108 113 130 151 172
index:   0   1   2   3   4   5   6   7   8   9  10  11  12

Searching for value 113
Recursive call (3): first = 9, size = 1, middle = 9
Since 113 == 113, target is found: found = true, location = 9
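The walkthrough above can be reproduced in code. This is a sketch that follows the recursive structure from the slides; the `Step` struct, the extra `trace` parameter, and the name `TracedBinarySearch` are hypothetical additions used only to record the (first, size, middle) values of each call on a non-empty range.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One entry per call: the values that call worked with.
struct Step
{
    std::size_t first;
    std::size_t size;
    std::size_t middle;
};

// Recursive binary search as in the slides, with a hypothetical
// extra parameter "trace" that records each call on a non-empty range.
template <class item>
void TracedBinarySearch(const item array[], std::size_t first, std::size_t size,
                        const item& target, bool& found, std::size_t& location,
                        std::vector<Step>& trace)
{
    if (size == 0)
    {
        found = false;                      // base case: empty search area
        return;
    }
    std::size_t middle = first + size / 2;  // midpoint of current search area
    trace.push_back({first, size, middle});
    if (target == array[middle])
    {
        location = middle;
        found = true;
    }
    else if (target < array[middle])        // search index first .. middle-1
        TracedBinarySearch(array, first, size / 2, target, found, location, trace);
    else                                    // search index middle+1 .. end
        TracedBinarySearch(array, middle + 1, (size - 1) / 2, target, found, location, trace);
}
```

Called on the 13-item array with target 113, the recorded steps are (0, 13, 6), (7, 6, 10), (7, 3, 8) and (9, 1, 9), and the search ends with found set to true and location set to 9.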
17Binary Search Analysis
- Worst-case scenario: item is not in the array
  - algorithm keeps searching smaller subarrays
  - eventually, array size will be 0, and the search will stop
- Analysis requires computing time needed for operations in function as well as amount of time for recursive calls
- We will analyze the algorithm's performance in the worst case
18Step 1 count operations
- Test base case: if (size == 0) -- 1 operation
- Compute midpoint:
  - middle = first + size/2 -- 3 operations
- Test for target at midpoint:
  - if (target == array[middle]) -- 2 operations
- Test for which recursive call to make:
  - if (target < array[middle]) -- 2 operations
- Recursive call: requires some arithmetic and argument passing -- estimate 10 operations
19Step 2 analyze cost of recursion
- Each recursive call is preceded by 18 (or fewer) operations
- Multiply this number by the depth of recursive calls and add the number of operations performed in the stopping case to determine worst-case running time (T(n))
- T(n) = 18 * (depth of recursion) + 3
20Step 3 estimate depth of recursion
- Calculate upper bound approximation for depth of recursion -- may slightly overestimate, but will not underestimate actual value
- Each recursive call is made on an array segment that contains, at most, N/2 elements
- Subsequent calls are always made on at most size/2 items
- Thus, depth of recursion is, at most, the number of times N can be divided by 2 with a result >= 1
21Estimating depth of recursion
- Referring to the number of times N is divisible by 2 with result >= 1 as H(n), or the halving function, the time expression becomes:
  - T(n) = 18 * H(n) + 3
- H(n) turns out to be almost exactly equal to log2 n:
  - H(n) = ⌊log2 n⌋, meaning that fractional results are rounded down to the nearest whole number (e.g. ⌊3.7⌋ = 3) -- this notation is called the floor function
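The halving function can be computed directly from its definition and compared with the floor of the base-2 logarithm. This is a minimal sketch; `Halvings` and `FloorLog2` are hypothetical names, and both assume n >= 1.

```cpp
#include <cassert>
#include <cstddef>

// H(n) by definition: how many times n can be divided by 2
// while the (real-valued) result is still >= 1.
std::size_t Halvings(std::size_t n)
{
    assert(n >= 1);
    std::size_t count = 0;
    double value = static_cast<double>(n);
    while (value / 2.0 >= 1.0)
    {
        value /= 2.0;
        ++count;
    }
    return count;
}

// floor(log2 n) using integer arithmetic only: the number of
// integer halvings needed to bring n down to 1.
std::size_t FloorLog2(std::size_t n)
{
    assert(n >= 1);
    std::size_t result = 0;
    while (n > 1)
    {
        n /= 2;
        ++result;
    }
    return result;
}
```

For example, 13 halves to 6.5, 3.25 and 1.625 before dropping below 1, so `Halvings(13)` is 3, the same as ⌊log2 13⌋.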
22Worst-case time for binary search
- Substituting the floor function of the logarithm for H(n), the time expression becomes:
  - T(n) = 18 * ⌊log2 n⌋ + 3
- Throwing out the constants, the worst-case running time (big O) function is O(log n)
23Significance of logarithms (again)
- Logarithmic algorithms are very fast because log n is much smaller than n
- The larger the data set, the more dramatic the difference becomes:
  - log2 8 = 3
  - log2 64 = 6
  - log2 1000 < 10
  - log2 1,000,000 < 20
24For binary search algorithm...
- To search a 1000-element array will require no more than 183 operations in the worst case
- To search a 1,000,000-element array will require less than 400 operations in the worst case
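These figures can be checked with a sketch that counts recursive calls during an unsuccessful search and plugs the count into the estimate of 18 operations per recursive call plus 3 for the stopping case. The names `CountingBinarySearch`, `EstimatedOperations` and `WorstCaseCalls` are hypothetical, as is the choice of test input: the values 0 to n-1 searched for -1, a target below every item, so every step takes the lower-half branch. In this sketch the count comes out as ⌊log2 n⌋ + 1 for these inputs, because the last call, on an empty range, is counted too.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Binary search as in the slides, plus a hypothetical counter
// "calls" that is incremented once per recursive call.
template <class item>
void CountingBinarySearch(const std::vector<item>& array, std::size_t first,
                          std::size_t size, const item& target, bool& found,
                          std::size_t& location, std::size_t& calls)
{
    if (size == 0) { found = false; return; }   // stopping case
    std::size_t middle = first + size / 2;
    if (target == array[middle]) { location = middle; found = true; }
    else if (target < array[middle])
    {
        ++calls;
        CountingBinarySearch(array, first, size / 2, target, found, location, calls);
    }
    else
    {
        ++calls;
        CountingBinarySearch(array, middle + 1, (size - 1) / 2, target, found, location, calls);
    }
}

// Estimate from the slides: 18 operations per recursive call,
// plus 3 operations for the stopping case.
std::size_t EstimatedOperations(std::size_t recursiveCalls)
{
    return 18 * recursiveCalls + 3;
}

// Recursive calls made when searching the values 0..n-1 for -1.
std::size_t WorstCaseCalls(std::size_t n)
{
    std::vector<long> data(n);
    for (std::size_t i = 0; i < n; ++i)
        data[i] = static_cast<long>(i);
    bool found = true;
    std::size_t location = 0;
    std::size_t calls = 0;
    CountingBinarySearch(data, 0, n, -1L, found, location, calls);
    assert(!found);                              // -1 is never in the array
    return calls;
}
```

With n = 1000 the search makes 10 recursive calls, giving 18 * 10 + 3 = 183 estimated operations; with n = 1,000,000 it makes 20, giving 18 * 20 + 3 = 363, which is under 400.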