# Algorithmics and Complexity - PowerPoint PPT Presentation

Title:

## Algorithmics and Complexity

Description:

### Algorithmics and Complexity In this lecture: The limits of algorithms: some problems are unsolvable. How do we measure the efficiency of an algorithm? – PowerPoint PPT presentation

Slides: 71
Provided by: JOHNLE163
Transcript and Presenter's Notes

Title: Algorithmics and Complexity

1
Algorithmics and Complexity
• In this lecture
• The limits of algorithms: some problems are
unsolvable.
• How do we measure the efficiency of an algorithm?
• Improvement by factor and by order of magnitude
• Some examples of complexity analysis
• Intractable problems

2
Can Computers Solve Every Problem?
• It seems that computers are powerful enough to
let us solve any problem by writing the
appropriate program.
• It may (or may not) be surprising to learn that
there are problems which cannot be
solved by any computer!
• Such problems were discovered and studied by the
mathematician Alan Turing; the most famous of
them is the Halting Problem (1937).

3
The Halting Problem
• Given a program P and an input x, does the
program P halt on the input x?
• One can imagine a method
• boolean doesHalt(P, x)
• that does the following
• reads the program P (which is just a text file)
• runs an algorithm which determines if the program
P halts on the input x
• returns true if P halts on the specified input x
• returns false if P does not halt on the
specified input x

4
The Halting Problem
• Define a program
• testHalt(P)
• if (doesHalt(P,P))
• loop forever
• else
• print halt
• What happens if we run testHalt, and give it as
input testHalt itself?
• testHalt(testHalt)

5
• Assume testHalt(testHalt) terminates and prints
halt.
• This means doesHalt(testHalt, testHalt) returned
false, which in turn means that testHalt does not
terminate on the input testHalt. A contradiction!
• Assume testHalt(testHalt) loops forever.
• This means doesHalt(testHalt, testHalt) returned
true, which in turn means that testHalt does
terminate on the input testHalt. Another
contradiction!
• Conclusion: our assumption that there exists a
method doesHalt, which determines whether a program
halts on a specific input, is wrong!
• In computer science terms, we say that the
Halting Problem is undecidable.
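The argument above can be sketched in Python. This is only an illustration: `make_test_halt` and `refute` are hypothetical helper names, and the sketch assumes the claimed decider is a deterministic function `does_halt(program, data)` returning a boolean. Whatever the decider answers about `testHalt(testHalt)`, the actual behaviour is the opposite.

```python
def make_test_halt(does_halt):
    """Build testHalt from a claimed halting decider does_halt(program, data)."""
    def test_halt(program):
        if does_halt(program, program):
            while True:  # loop forever
                pass
        return "halt"
    return test_halt


def refute(does_halt):
    """Return (predicted, actual) halting behaviour of testHalt(testHalt)."""
    test_halt = make_test_halt(does_halt)
    predicted = does_halt(test_halt, test_halt)
    if predicted:
        # testHalt(testHalt) would enter the infinite loop, so we do not run it.
        actual = False
    else:
        # Safe to run: the decider answered "does not halt", so testHalt returns.
        actual = test_halt(test_halt) == "halt"
    return predicted, actual
```

For any candidate decider the two returned values differ, which is exactly the contradiction derived on the slide.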

6
Halting Problem - The Bright Side
• We have proved that no algorithm can solve the
halting problem.
• In contrast to the halting problem,
there are problems which can be solved
algorithmically.
• There may be more than one way to solve a
particular problem (sorting).

7
Algorithmic Questions
• When we are given a specific problem, there are
several questions we can ask:
• Are there algorithms which solve this problem?
• Given an algorithm which solves the problem, how
can we be convinced that the algorithm is
correct?
• How good is an algorithm which solves the
problem?
• Is it efficient in terms of processing steps
(time)?
• Is it efficient in terms of storage space
(memory)?
• How do we measure efficiency?

8
More Algorithmic Questions
• More questions we can ask
• Is there something we can say about every
algorithm which solves the problem?
• For example, every algorithm must take at least x
processing steps, etc
• When we implement the algorithm on a computer,
will the problem be solved within a reasonable
time? What is reasonable, anyway?
• Phone lookup (144) - few seconds
• Weather forecast - max. one day
• Cruise missiles - real time (a late answer is
useless)
• Physical simulations - few days? Few weeks?
Perhaps more?

9
Time Efficiency
• How do we measure time efficiency?
• Assume we have a problem P to solve, with two
algorithms A1 and A2 that solve it.
• We wish to compare the efficiency of A1 and A2.
• What do you think about the following efficiency
test?

The algorithms were implemented on a computer,
and their running time was measured:
Algorithm A1 - 1.25 seconds
Algorithm A2 - 0.34 seconds
Conclusion: Algorithm A2 is better!
10
Time Efficiency Questions We Must Ask
• Were the algorithms tested on the same computer?
• Is there a benchmark computer on which we test
algorithms?
• What were the inputs given to the algorithm? Were
the inputs equal? Of equal size?
• Is there a better way for measuring time
efficiency, independent of a particular computer?

11
Input Size
• The running time of an algorithm depends
on the amount of work it has to perform,
which in turn is a function of the size of the input
given to the algorithm:
• In an array sorting algorithm - the number of cells
to sort
• In an algorithm for finding a word in a text -
the number of characters, or the number of words
• In an algorithm that tests if a number is prime -
the size of the number (the number of bits which represent
the number, or the number of digits)

12
Efficiency Measure - First Attempt
• A reasonable way to measure the time efficiency
of an algorithm could be
• find out how many steps the algorithm performs
for every input size (as a function of the
input size).
• What could those steps be?
• Anything we find reasonable, as long as we know
those steps take approximately constant time
to run, that is, their running time is not a
function of the input size

13
Algorithmic Steps Examples
• In the bubble sort algorithm: switch two adjacent
cells.
• In a generic algorithm for finding an element
in an array:
• Do until stop:
• Find out what is the next cell to look at (or
stop)
• Find out if the element we're looking for is in
this cell
• In an algorithm for testing if a number x is
prime: find out if y divides x.
• In a classic algorithm for multiplying two
numbers: multiply digits / add digits.
• Note that all these steps take constant time to
perform, which does not depend on the size of the
input.

14
• This measure does not depend on a particular computer.
• If we wish to figure out the running
time of the algorithm on a particular computer,
we'll just have to:
• estimate how long it takes to perform the
basic steps we've defined on the particular
computer
• multiply this measurement by the number of steps
we've calculated for a specific input size.

15
Example: Character Search
• Problem: Find out if the character c occurs in a
given text.
• Solution 1:

found ← false
while (more characters to read and found = false)
    read the next character in the text
    if this character is c, found ← true
print(found)
16
Solution 1: Time Analysis
• Input size?
• Number of characters in the text
• What is the basic step?
• Find out if the end of the text has been reached
• Read the next character in the text
• Test if the character is c
• What is the running time as a function of the input
length n?
• It depends on the particular text. But in the worst
case, no more than n basic steps + a constant
(operations before and after the loop).
• $T(n) \le c_1 n + d_1$
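A minimal Python sketch of Solution 1 that also counts basic steps may make the analysis concrete. The function name `serial_search` and the choice to count one step per loop iteration are assumptions made for illustration.

```python
def serial_search(text, c):
    """Return (found, steps): whether c occurs in text, and loop iterations used."""
    found = False
    steps = 0
    i = 0
    # One basic step = end-of-text test + read next character + compare with c.
    while i < len(text) and not found:
        steps += 1
        if text[i] == c:
            found = True
        i += 1
    return found, steps
```

In the worst case (c is absent) the loop runs exactly n times for a text of length n, matching the linear bound above.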

17
Character Search: Simple Optimization
• Solution 2:

found ← false
add c to end of text
while (found = false)
    read the next character in the text
    if this character is c, found ← true
if (end of text was reached), found ← false
print(found)
remove c from end of text
18
Solution 2 Time Analysis
• Solution 2 analysis is more or less the same,
however the basic step is different
• read next character in text
• Test if character is c
• In the worst case, the running time of Solution 2
as a function of n is
• $T(n) \le c_2 n + d_2$
• This time, $c_2$ and $d_2$ are different ($c_1 > c_2$,
$d_2 > d_1$).
• In solution 2, we have
• shortened the time it takes to perform the basic
step, but
• added a constant to the overall running time
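The sentinel idea of Solution 2 can be sketched as follows. The name `sentinel_search` is hypothetical, and copying the text into a list is an assumption made only because Python strings are immutable; the point is that the loop no longer tests for the end of the text.

```python
def sentinel_search(text, c):
    """Return True if c occurs in text, using a sentinel to drop the end-of-text test."""
    chars = list(text)      # mutable copy so the sentinel can be appended
    n = len(chars)
    chars.append(c)         # add c to the end of the text
    i = 0
    while chars[i] != c:    # basic step: read and compare only
        i += 1
    chars.pop()             # remove c from the end of the text
    return i < n            # stopping at index n means only the sentinel matched
```

The extra work before and after the loop is the larger constant $d_2$, while the cheaper loop body is the smaller constant $c_2$.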

19
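Running Time Tables

| Input size n | 1 | 3 | 5 | 10 | 100 | 1000 | 30000 | 3000000 |
|---|---|---|---|---|---|---|---|---|
| 3n + 2 | 5 | 11 | 17 | 32 | 302 | 3002 | 90002 | 9000002 |
| 2n + 4 | 6 | 10 | 14 | 24 | 204 | 2004 | 60004 | 6000004 |
| ratio | 0.83 | 1.1 | 1.21 | 1.33 | 1.48 | 1.5 | 1.5 | 1.5 |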
20
Improvement by Factor
• In short texts, Solution 1 is better than
Solution 2 (the improved solution), however
• As the text length grows, the constants d1 and d2
become less and less important, and the ratio
converges to 1.5.
• Such improvement is called an improvement by
factor, since the ratio between the running times
of both solutions, as n grows, converges to a
constant.
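As a worked check, taking the step counts $3n + 2$ and $2n + 4$ from the running time table as the two running time functions (an assumption about which row belongs to which solution), the limit of the ratio is:

```latex
\frac{T_1(n)}{T_2(n)} = \frac{3n + 2}{2n + 4}
  = \frac{3 + 2/n}{2 + 4/n}
  \;\xrightarrow[n \to \infty]{}\; \frac{3}{2} = 1.5
```

The constants 2 and 4 vanish in the limit, and only the ratio of the leading coefficients remains.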

21
A Word about Best, Average and Worst Cases
• Note that when we counted the number of
steps, we analyzed the worst case, in which
the character c is not in the text.
• Another measurement: the average case.
• What is the advantage of measuring the worst
case?
• The average case is a good measurement; however,
for a specific input of length n, we have no idea
what the running time will be.
• Computing the average case is quite complex.
• What information does best case analysis give us?
22
Improvement by Factor Is it Important?
• We shall soon see that many times we can do
better than improving the running time by a
factor
• However, improvement by factor is still
important
• If we make an effort at optimizing specific
bottleneck areas in a program, we may gain a
lot
• Special programs called profilers help us in
pinpointing the hot areas in a program.

The 80/20 rule (or 90/10 rule): a program spends
80% of its time executing 20% of its code.
23
Finding a Phone Number in a Phonebook
• Problem Find if a number x appears in a sorted
array of numbers (e.g., a phonebook).
• This problem is similar to the character search
problem.
• The algorithms we have already seen can be used
to solve this problem; both algorithms are quite
similar, and are variants of the serial search
method.
• Other possible optimizations?
• However, the assumption that the array is sorted
can be used in a clever way.

24
How Does One Find a Lion in the Desert?
25
Binary Search
• Cut out half of the search space in every step.
• The basic step in binary search:
1. Find out if the current cell contains the number
we're looking for
2. Termination condition: find out if the range is
of size 1
3. If 1 and 2 are false, calculate the next cell to
look at (index = middle cell in current range)
• The basic step in serial search:
1. Find out if the current cell contains the number
we're looking for
2. Termination condition: find out if we have
reached the end of the array
3. If 1 and 2 are false, calculate the next cell to
look at (index = index + 1)

26
Binary Search - Example
• If the array is of size 1000, in the worst case
we will be looking at ranges of size
1000, 500, 250, 125, 63, 32, 16, 8, 4, 2, 1: a total of 10
steps.
• Compare to serial search: 1000 steps!
• With a million cells, we will be looking at 20
cells in the worst case.
• How many cells in the general case?

27
Binary vs. Serial - Number of Steps

| Input size | 10 | 100 | 1000 | 10000 | 100000 | 1000000 |
|---|---|---|---|---|---|---|
| serial | 10 | 100 | 1000 | 10000 | 100000 | 1000000 |
| binary | 4 | 7 | 10 | 14 | 17 | 20 |
| ratio | 2.5 | 14 | 100 | 714 | 5882 | 50000 |
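The step counts for binary search can be reproduced with a short sketch. The function name `binary_search` and the convention of counting one step per examined cell are assumptions for illustration; the input may be any sorted, indexable sequence.

```python
def binary_search(cells, x):
    """Return (found, steps) for x in the sorted sequence cells."""
    low, high = 0, len(cells) - 1
    steps = 0
    while low <= high:
        steps += 1                  # one basic step: examine the middle cell
        mid = (low + high) // 2
        if cells[mid] == x:
            return True, steps
        if cells[mid] < x:
            low = mid + 1           # discard the lower half
        else:
            high = mid - 1          # discard the upper half
    return False, steps
```

For a sequence of n cells the loop runs at most about $\log_2 n + 1$ times, which gives the 10 and 20 steps quoted for a thousand and a million cells.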
28
Improvement by Order of Magnitude
• Recall that when we dealt with improvement
by factor, the ratio between running times was
constant.
• This time, we can clearly see that the ratio
between the numbers of steps grows as the
input size grows.
• This kind of improvement is called improvement by
order of magnitude.

29
What About the Duration of Basic Step?
• When we dealt with improvement by factor,
the duration of a basic step was very
interesting.
• Is it of importance now?
• Or, put in other words: assume that the duration
of a single step in serial search is 1 and that
a single step in binary search takes 100; would
there still be an improvement?

30
Binary vs. Serial - Different Duration of Steps

| Input size | 10 | 100 | 1000 | 10000 | 100000 | 1000000 |
|---|---|---|---|---|---|---|
| serial | 10 | 100 | 1000 | 10000 | 100000 | 1000000 |
| binary | 400 | 700 | 1000 | 1400 | 1700 | 2000 |
| ratio | 0.025 | 0.14 | 1 | 7.14 | 58.8 | 500 |
31
Duration of Basic Step is Negligible
• As we can see from the table, for small input
sizes (< 1000), serial search is indeed better.
• However, for larger input sizes, binary search
still wins.
• The reason is very simple: the ratio between
duration of basic steps is constant, while the
ratio between the number of basic steps grows as
the input size grows.
• Note that in practice, the ratio between the
basic steps in binary/serial search will be much
smaller.

32
Order of Magnitude
• We have seen two basic kinds of improvement in
the running time of an algorithm: by factor, and by
order of magnitude.
• The latter improvement is much more meaningful.
• This is why many times we want to neglect the
small differences between two running time
functions and get an impression of what is the
dominant element in the functions.

33
Linear Order
• For example, in serial search, any running time
function will be of the form $f(n) = an + b$, which
is called a linear function.
• The ratio between any two linear functions
approaches a constant for large enough n.
• This is why we say that the running time
functions are of linear order, or that the
complexity of the algorithms is linear.
• Linear order can be symbolized by O(n). We say
that $f(n) = O(n)$. This is called the Big-O
notation.

34
Order of Magnitude
• In general, we say that two functions are of the
same order if the ratio between their values
approaches a constant for large enough n.
• Example: $f(n) = n^2$ is not of linear order! It is of
quadratic order.
• All these functions are of quadratic order:
• $n^2$, $5n^2 + 6$, $5n^2 + 100n - 90$, $5000n^2$,
$n^2/6$
• Other orders of magnitude: $O(\log n)$ -
logarithmic, $O(n^k)$ ($k \ge 2$) - polynomial, $O(2^n)$ -
exponential.
• Polynomial and exponential are very important
orders of magnitude, and we shall see why later.

35
Order of Magnitude - Neglecting Minor Elements
• When we compare functions of different orders of
magnitude, what is beyond the order of
magnitude is negligible.
• Example: $100n$ and $n^2/100$. For $n > 10000$, $n^2/100 > 100n$.
• If we had two algorithms A1 and A2 whose running
times are $100n$ and $n^2/100$, we would prefer A2 if
we knew our input size is less than 10000 (most
of the time), but prefer A1 if the opposite were
true.

36
Example: Prime Test
• Problem: Determine if a number n is prime.
• First attempt: check if any of 2..n/2 is a divisor of n.
• Second attempt: if n is odd, we only have to
check odd divisors.
• Third attempt: we only have to check 2..sqrt(n),
since if n is not prime, then $n = pq$, and one of
the numbers p or q is no greater than sqrt(n).
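The three attempts combine into a short trial-division test. This is a sketch, not the lecture's code: the name `is_prime` is hypothetical, and it relies on `math.isqrt` (Python 3.8 or later) for the integer square root.

```python
import math


def is_prime(n):
    """Trial division: test 2, then odd candidates up to sqrt(n)."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2               # the only even prime is 2
    for d in range(3, math.isqrt(n) + 1, 2):
        if n % d == 0:
            return False            # found a divisor no greater than sqrt(n)
    return True
```

Skipping even candidates is an improvement by factor, while stopping at sqrt(n) changes the number of divisions from about n/2 to about sqrt(n)/2.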

37
Example: Frequent Two-Letter Occurrences
• Problem: For a given text input, find the most
frequent adjacent two-letter
pair that appears in the text.
• First attempt:
• For every pair that appears in the text, count
how many times this pair appears in the text, and
find the maximum.
• Complexity: $(n-1)(n-1) = n^2 - 2n + 1 = O(n^2)$
• Second attempt:
• Use a two-dimensional 26x26 array.
• Complexity: $(n - 1) + 2 \cdot 26 \cdot 26 = O(n)$
complexity!
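The second attempt can be sketched as below. The function name `most_frequent_pair`, the restriction to the letters a-z after lowercasing, and the alphabetical tie-breaking are all assumptions made for illustration.

```python
def most_frequent_pair(text):
    """Return (pair, count) for the most frequent adjacent pair of letters a-z."""
    text = text.lower()
    counts = [[0] * 26 for _ in range(26)]      # the 26x26 table
    for a, b in zip(text, text[1:]):            # n - 1 adjacent pairs
        if "a" <= a <= "z" and "a" <= b <= "z":
            counts[ord(a) - ord("a")][ord(b) - ord("a")] += 1
    best_pair, best = None, 0
    for i in range(26):                         # constant-size scan of the table
        for j in range(26):
            if counts[i][j] > best:
                best = counts[i][j]
                best_pair = chr(ord("a") + i) + chr(ord("a") + j)
    return best_pair, best
```

The text is scanned once and the table scan costs a constant 26 x 26 operations, so the total work is linear in the text length.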

38
Other Examples Ternary Search
• Split the search space to three parts.
• Is it an improvement in order of magnitude? In
factor?

39
Other Examples Wasteful Sort
• Find x, the maximum element in the array a to be
sorted
• Create a new integer array c of size x
• Zero c
• Count number of occurrences of each element in a,
store in c
• Generate elements according to c in temporary
array
• Copy temporary array back to a
• What is the memory/time complexity?
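A minimal sketch of the steps above, assuming the array holds non-negative integers. The name `wasteful_sort` follows the slide title; allocating `x + 1` counters (so that index `x` itself is valid) is a choice of this sketch.

```python
def wasteful_sort(a):
    """Sort a list of non-negative integers in place by counting occurrences."""
    if not a:
        return a
    x = max(a)                      # maximum element
    c = [0] * (x + 1)               # zeroed counter array, indices 0..x
    for value in a:
        c[value] += 1               # count occurrences of each element
    temp = []
    for value, count in enumerate(c):
        temp.extend([value] * count)    # generate elements according to c
    a[:] = temp                     # copy the temporary array back to a
    return a
```

With n elements and maximum x, this sketch does about n + x work and uses about n + x extra memory, which is why it is wasteful when x is much larger than n.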

40
Why Bother?
• Computers today are very fast, and perform
millions of operations in seconds.
• Nevertheless, improvement in order of magnitude
can reduce computation duration by seconds, hours
and even days.
• Moreover, the following fact is true: for some
problems, the only known algorithms take so many
steps that even the fastest computers today, and
any that will ever exist, are unable to solve the
problem!
• Example: the travelling salesperson problem (TSP).

41
The Travelling Salesperson Problem
• The story: find the shortest path which starts at
a city and traverses all cities.

[Figure: graph of cities with edge lengths]
42
Solution to TSP
• Brute force:
• For each possible path, find its length
• Choose the path with minimum length
• Number of possible paths:
• At most $(n-1)(n-2)\cdots 1 = (n-1)!$ (factorial)
• Complexity of the algorithm: $n \cdot (n-1)! = O(n!)$
• How long will it take to go over O(n!) paths for
growing n?
• Assume we have a computer which can compute
a million paths per second
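The brute-force approach can be sketched with `itertools.permutations`. The function name `tsp_brute_force`, the distance-matrix input format, and the fixed start city are assumptions for illustration; following the slide, the path does not return to the start.

```python
from itertools import permutations


def tsp_brute_force(dist, start=0):
    """Return (length, path) of the shortest path from start through all cities.

    dist is a square matrix: dist[i][j] is the distance from city i to city j.
    """
    n = len(dist)
    others = [city for city in range(n) if city != start]
    best_length, best_path = float("inf"), None
    for order in permutations(others):          # (n-1)! candidate paths
        path = (start,) + order
        # Summing the n - 1 legs costs O(n) per path, hence O(n!) in total.
        length = sum(dist[path[i]][path[i + 1]] for i in range(n - 1))
        if length < best_length:
            best_length, best_path = length, path
    return best_length, best_path
```

Usage: with four cities and `dist = [[0, 1, 4, 3], [1, 0, 2, 5], [4, 2, 0, 1], [3, 5, 1, 0]]`, only 3! = 6 paths are examined, but the count grows factorially with the number of cities.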

48
TSP Computing Times for Different Input Sizes

| # of cities | 6 | 11 | 13 | 16 | 18 | 21 |
|---|---|---|---|---|---|---|
| # of paths | 120 | 3,628,800 | 479,001,600 | 1,307,674,368,000 | 356,000,000,000,000 | 2,430,000,000,000,000,000 |
| computing time | 8 milliseconds | 3.5 seconds | 8 minutes | 15 days | 11 years | 77,000 years! |
49
TSP - an Intractable Problem
• TSP evidently cannot be solved for reasonable
input sizes
• The complexity of TSP, $O(n!) > O(2^n)$, is
at least exponential.
• Any exponential running time function is
intractable.
• What is the input size we can solve under the
following conditions?
• A parallel computer with as many processors as the
number of atoms in the universe
• Time: the number of years since the big bang

50
According to the CIA (or why exponential is bad)
• The land area on earth is about 150 million
square kilometers
• The population on Earth is about 6000 million,
thus the average population density is about 40
people per square kilometer.
• The current population growth is about 1.5% per
year.
• 1.5% may not sound like much growth; however,
$1.015^{1000} \approx 2.9$ million.
• Thus by the year 3000, if the population growth
continues at 1.5% per year, the average
population density will be 120 people per square
meter.
• By the year 4000 there will be 350 million people
per square meter...

51
Effect of Improved Technology
Size of Largest Problem Instance Solvable in 1 hour

| Complexity | $n$ | $n^2$ | $n^3$ | $n^5$ | $2^n$ | $3^n$ |
|---|---|---|---|---|---|---|
| With present computer | N1 | N2 | N3 | N4 | N5 | N6 |
| With computer 100 times faster | 100 N1 | 10 N2 | 4.64 N3 | 2.5 N4 | N5 + 6.64 | N6 + 4.19 |
| With computer 1000 times faster | 1000 N1 | 31.6 N2 | 10 N3 | 3.98 N4 | N5 + 9.97 | N6 + 6.29 |
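The entries of this table follow from a short calculation. In the sketch below, $s$ denotes the speed-up factor and $N$, $N'$ the largest solvable sizes before and after the speed-up; these symbols are introduced here for illustration only.

```latex
% Polynomial complexity n^k: the faster machine performs s times as many steps.
(N')^k = s \cdot N^k \;\Longrightarrow\; N' = s^{1/k} \, N
% Exponential complexity b^n:
b^{N'} = s \cdot b^{N} \;\Longrightarrow\; N' = N + \log_b s
% Example: b = 2, s = 100 gives N' = N + \log_2 100 \approx N + 6.64
```

A faster computer multiplies the solvable size for polynomial complexities, but only adds a small constant to it for exponential ones.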
52
TSP - A Member of a Large Family
• It may seem that TSP is just one problem
• However, there is a whole set of problems
(1000), called NP problems, from a large variety
of areas, which are very similar to TSP
• Those problems are the focus of CS research, and
yet, no efficient (polynomial) algorithm has been
found
• Although it has not been proven, it is strongly
believed that there is no efficient algorithm for
NP problems (this is the famous P vs. NP problem)

53
The NP Complete Class
• Many of the NP problems are complete, in the
sense that if an efficient solution to one of them is
found, then all NP problems can be solved
efficiently
• This is true since all the problems in this class
were reduced to a single problem which is known
to be NPC
• A reduction from A to B means that given an
algorithm that solves B, we can find an algorithm
that solves A.
• "I don't know how to solve A, but if you show me
how to solve B, I can solve A." So now the problem
is B.
• Example: the well-known mathematician and physicist
joke.

54
Example of a Reduction Tree
If we find a solution to any of the red
problems, then we can find a solution to SAT
(path), and all NP problems are solvable
SAT is reduced to another problem
Special Problem (SAT) If it is solvable then any
NP problem is solvable
55
Example of a Simple Reduction
• Q1: What is the minimal solution to the TSP?
• Q2: Is there a solution to TSP with length $\le k$?

56
Examples of NP Complete Problems
• Knapsack
• Input: Set of elements U with weights, number B
• Problem: Find a subset of U with max. weight s.t.
the sum of weights $\le B$
• Minimum Set Cover
• Input: Set of tasks to perform, group of people
who are able to perform subsets of the set of
tasks
• Problem: Find a minimal sized subgroup of the
people who can perform all the tasks.

57
More NPC Problems
• Max SAT
• Input: Set of logical clauses C1 and C2 and C3
and … and Ck
• Each clause is of the form Ci = P or (not Q) or R
or … or S
• Problem: Find an assignment in which the max no.
of clauses can be satisfied
• Input: Network graph
• Problem: Find the minimal time for broadcast from
a node t.

58
Yet More NPC Problems
• Graph Coloring
• For a long time map makers believed that if you
planned carefully you could color any map with
maximum of four colors. Many mathematicians tried
to prove this, but only recently with the aid of
a computer was it shown to be true.
• There is also no known polynomial time algorithm
to color a graph with the minimum number of
colors.
• Minimum Bin Packing (disk storage)
• Input: k files of sizes $s_1, \dots, s_k$, disk capacity M
• Problem: Find a partition of the files to disks
such that each disk stores at most M bytes,
where a minimal number of disks is required

59
The Good News About NPC Problems
• Although there is no efficient algorithm known
that can solve NP problems, there are other
approaches
• Approximation Some problems have efficient
algorithms which approximate the solution, i.e.,
find a solution which is optimal within a factor.
• Randomization Some problems have efficient
algorithms, which use coins, and find a good
solution with high probability.

60
Example of Approximation
• Minimum Processor Scheduling
• Input: Set of n tasks with running times $t_1, \dots, t_n$,
set of processors $P_1, \dots, P_m$
• Problem: Find a schedule with minimal finish time
• This problem is known to be NPC ($m^n$ options).
• But: there is a greedy approximation!
• Greedy algorithm:
• Go over the tasks serially, and at each stage assign the task to the
processor with the minimal sum of jobs
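The greedy rule can be sketched with a min-heap of processor loads. The name `greedy_schedule` and the tie-breaking by lowest processor index are assumptions of this sketch, not something the slides specify.

```python
import heapq


def greedy_schedule(times, m):
    """Assign each task, in order, to the currently least loaded of m processors.

    Returns (finish_time, assignment) where assignment[p] lists the task
    indices given to processor p.
    """
    loads = [(0, p) for p in range(m)]      # (current load, processor index)
    heapq.heapify(loads)
    assignment = [[] for _ in range(m)]
    for task, t in enumerate(times):
        load, p = heapq.heappop(loads)      # least loaded processor
        assignment[p].append(task)
        heapq.heappush(loads, (load + t, p))
    finish_time = max(load for load, _ in loads)
    return finish_time, assignment
```

Usage: `greedy_schedule([1, 1, 2], 2)` finishes at time 3, while putting the long task alone on one processor finishes at time 2, so greedy is not always optimal.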

61
How Good is the Greedy Algorithm?
• Let greedy(x) be the finish time of the Greedy schedule on input
x
• Let opt(x) be the finish time of the optimal schedule on input x
• Theorem: $\mathrm{greedy}(x)/\mathrm{opt}(x) \le 2 - 1/m$
• So greedy is not that bad! In fact, even for large
m, it is at most about 2 times the optimal!
• The next few slides prove this

62
Step 1
• Let k be the processor whose load (finish time)
is greedy(x)
• Let $t_j$ be the last job assigned to it.
• Observation: The load on any other processor is
at least $\mathrm{greedy}(x) - t_j$.
• This is true since, at the time $t_j$ was assigned,
the load on every processor was at
least $\mathrm{greedy}(x) - t_j$ (k was the least loaded
processor then). Other jobs may have been added
later.

63
Step 2
• It follows that $\sum_i t_i \ge t_j + m(\mathrm{greedy}(x) - t_j)$
• Why is this true? Put all jobs in sequential
rather than parallel order. Then the time it
takes is at least $t_j + m(\mathrm{greedy}(x) - t_j)$. However,
this can't be more than $\sum_i t_i$.
• Rearranging terms we get $\mathrm{greedy}(x) \le \frac{1}{m}\sum_i t_i + \left(1 - \frac{1}{m}\right) t_j$

64
Step 3
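• Observation: $\mathrm{opt}(x) \ge t_j$ (clear)
• Observation: $\mathrm{opt}(x) \ge \frac{1}{m}\sum_i t_i$
• If the last were not true, then we would get
• $\mathrm{opt}(x) < \frac{1}{m}\sum_i t_i$
• and $\sum_i t_i \le m \cdot \mathrm{opt}(x) < m \cdot \frac{1}{m}\sum_i t_i = \sum_i t_i$, a contradiction
• Step 4
• From steps 2 and 3 we get $\mathrm{greedy}(x) \le (2 - 1/m)\,\mathrm{opt}(x)$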

65
Example of Randomization - SAT
• Sometimes choosing random values for the
• For each C = C1 or … or Ck, the probability to
get FALSE is $(1/2)^k$. So the probability of
getting true is $1 - (1/2)^k$.
• So in general, at least half of the Cs will get
true (or: the expected # of Cs which get
true is at least half of the Cs).
• So, this is a 2-approximation.
• If each C contains at least 2 variables, we get a
4/3-approximation (opt/approximation $\le 4/3$).
• If we want to get a tight approximation - we can
run the algorithm many times.
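The randomized approach, including the idea of running it many times and keeping the best result, can be sketched as follows. The function name `random_assignment_sat`, the clause encoding as lists of `(variable, is_positive)` pairs, and the `tries` and `seed` parameters are all assumptions made for illustration.

```python
import random


def count_satisfied(clauses, assignment):
    """Count clauses with at least one literal made true by the assignment."""
    return sum(
        any(assignment[var] == positive for var, positive in clause)
        for clause in clauses
    )


def random_assignment_sat(clauses, variables, tries=100, seed=0):
    """Try random truth assignments; return (best_count, best_assignment)."""
    rng = random.Random(seed)       # fixed seed keeps the sketch reproducible
    best_count, best = -1, None
    for _ in range(tries):
        assignment = {var: rng.random() < 0.5 for var in variables}
        count = count_satisfied(clauses, assignment)
        if count > best_count:
            best_count, best = count, assignment
    return best_count, best
```

Each literal is true with probability 1/2, so a clause with k literals fails only with probability $(1/2)^k$; repeating the experiment can only improve the best count seen.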

66
Example: The Sorted Array Sum Problem
• Input: Sorted array A of n numbers, and a number
S
• Output: Are there two numbers in the array whose
sum is S?
• Algorithm 1: For each pair of numbers, check if
their sum is S.
• Complexity 1: n(n-1)/2 pairs, quadratic
complexity.
• Algorithm 2: For each A[i], binary search for S - A[i].
• Complexity 2: n log n.
• Algorithm 3: left and right pointers.
• If A[left] + A[right] = S, found.
• If A[left] + A[right] < S, left++
• If A[left] + A[right] > S, right--
• Complexity 3: linear!
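Algorithm 3 can be sketched in a few lines. The function name `has_pair_with_sum` is hypothetical, and the sketch assumes the two numbers must come from different positions in the array.

```python
def has_pair_with_sum(a, s):
    """Return True if two elements at different positions of sorted a sum to s."""
    left, right = 0, len(a) - 1
    while left < right:
        total = a[left] + a[right]
        if total == s:
            return True
        if total < s:
            left += 1       # sum too small: move to a larger left element
        else:
            right -= 1      # sum too large: move to a smaller right element
    return False
```

Each iteration moves one pointer inward, so the loop runs at most n - 1 times.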

67
The Sorted Array Sum Revisited
• Input: Sorted array A of n numbers, and a number
S
• Output: Is there a group of numbers in the array
whose sum is S?
• Possible solution: for each possible group of
numbers, find out if its sum is S.
• Complexity: the number of groups is $2^n$, therefore
the complexity is exponential.
• This problem is known to be NP-Complete!
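The exhaustive solution can be sketched with `itertools.combinations`. The name `has_subset_with_sum` is hypothetical, and the sketch assumes only non-empty groups count.

```python
from itertools import combinations


def has_subset_with_sum(a, s):
    """Return True if some non-empty group of elements of a sums to s."""
    for size in range(1, len(a) + 1):
        for group in combinations(a, size):     # all groups of this size
            if sum(group) == s:
                return True
    return False
```

Over all sizes this examines up to $2^n - 1$ groups, so adding one element to the array roughly doubles the worst-case work.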

68
Euler Paths and Circuits
• Given an undirected graph G = (V, E), an Euler Path is a path
that includes every edge in E exactly once.
• An Euler Circuit is an Euler Path that starts and
ends at the same vertex.
• The circuits get their name from Leonhard Euler's
famous Königsberg bridges problem: traverse each
one of the seven bridges exactly once on your Sunday
stroll.
• There exists an algorithm that finds an Euler
circuit in a graph (provided there is one) in
O(|E|) time.

70
Hamiltonian Circuits
• A Hamiltonian Cycle is a cycle which traverses
each vertex in a graph exactly once without
traversing an edge twice
• The problem of deciding whether a graph has a Hamiltonian cycle is NP-Complete