Algorithmics and Complexity - PowerPoint PPT Presentation

About This Presentation
Title:

Algorithmics and Complexity

Description:

In this lecture: The limits of algorithms: some problems are unsolvable. How do we measure the efficiency of an algorithm?

Slides: 71
Provided by: JOHNLE163
Learn more at: http://www1.idc.ac.il
Transcript and Presenter's Notes

Title: Algorithmics and Complexity


1
Algorithmics and Complexity
  • In this lecture
  • The limits of algorithms: some problems are
    unsolvable.
  • How do we measure the efficiency of an algorithm?
  • Improvement by factor and by order of magnitude
  • Some examples of complexity analysis
  • Intractable problems

2
Can Computers Solve Every Problem?
  • It may seem that computers are powerful enough
    to solve any problem, provided we write the
    appropriate program.
  • It may (or may not) be surprising to learn that
    there are problems which cannot be solved by
    any computer!
  • Such problems were discovered and studied by the
    mathematician Alan Turing. The most famous of
    them is the Halting Problem (1937).

3
The Halting Problem
  • Given a program P and an input x, does the
    program P halt on the input x?
  • One can imagine a method
  • boolean doesHalt(P, x)
  • that does the following:
  • reads the program P (which is just a text file)
  • runs an algorithm which determines if the program
    P halts on the input x
  • returns true if P halts on the specified input x
  • returns false if P does not halt on the
    specified input x

4
The Halting Problem
  • Define a program:

    testHalt(P)
        if (doesHalt(P, P))
            loop forever
        else
            print "halt"

  • What happens if we run testHalt, and give it as
    input testHalt itself?

    testHalt(testHalt)

5
The Halting Problem Paradox
  • Assume testHalt(testHalt) terminates and prints
    "halt":
  • This means doesHalt(testHalt, testHalt) returned
    false, which in turn means that testHalt does not
    terminate on the input testHalt. A contradiction!
  • Assume testHalt(testHalt) loops forever:
  • This means doesHalt(testHalt, testHalt) returned
    true, which in turn means that testHalt does
    terminate on the input testHalt. Another
    contradiction!
  • Conclusion: Our assumption that there exists a
    method doesHalt, which determines if a program
    halts on a specific input, is wrong!
  • In computer science terms, we say that the
    Halting Problem is undecidable.

6
Halting Problem - The Bright Side
  • We have proved that no algorithm can solve the
    halting problem.
  • In contrast to the halting problem, we have
    already seen that
  • there are problems which can be solved
    algorithmically
  • There may be more than one way to solve a
    particular problem (sorting).

7
Algorithmic Questions
  • When we are given a specific problem, there are
    many questions we can ask about it
  • Are there algorithms which solve this problem?
  • Given an algorithm which solves the problem, how
    can we be convinced that the algorithm is
    correct?
  • How good is an algorithm which solves the
    problem?
  • Is it efficient in terms of processing steps
    (time)?
  • Is it efficient in terms of storage space
    (memory)?
  • How do we measure efficiency?

8
More Algorithmic Questions
  • More questions we can ask
  • Is there something we can say about every
    algorithm which solves the problem?
  • For example, every algorithm must take at least x
    processing steps, etc
  • When we implement the algorithm on a computer,
    will the problem be solved within a reasonable
    time? What is reasonable, anyway?
  • Phone lookup (144) - a few seconds
  • Weather forecast - one day at most
  • Cruise missiles - real time (a late answer is
    useless)
  • Physical simulations - a few days? A few weeks?
    Perhaps more?

9
Time Efficiency
  • How do we measure time efficiency?
  • Assume we have a problem P to solve, with two
    algorithms A1 and A2 that solve it.
  • We wish to compare the efficiency of A1 and A2.
  • What do you think about the following efficiency
    test?

The algorithms were implemented on a computer,
and their running times were measured:
    Algorithm A1 - 1.25 seconds
    Algorithm A2 - 0.34 seconds
Conclusion: Algorithm A2 is better!

10
Time Efficiency Questions We Must Ask
  • Were the algorithms tested on the same computer?
  • Is there a benchmark computer on which we test
    algorithms?
  • What were the inputs given to the algorithm? Were
    the inputs equal? Of equal size?
  • Is there a better way for measuring time
    efficiency, independent of a particular computer?

11
Input Size
  • The running time of an algorithm depends on the
    amount of work it has to perform, which in turn
    is a function of the size of the input given to
    the algorithm:
  • In an array sorting algorithm - number of cells
    to sort
  • In an algorithm for finding a word in a text -
    number of characters, or number of words
  • In an algorithm that tests if a number is prime -
    size of the number (number of bits which
    represent the number, or number of digits)

12
Efficiency Measure - First Attempt
  • A reasonable way to measure the time efficiency
    of an algorithm could be:
  • find out how many steps the algorithm performs
    for every input size (as a function of the
    input size).
  • What could those steps be?
  • Anything we find reasonable, as long as we know
    those steps take approximately constant time
    to run, that is, their running time is not a
    function of the input size

13
Algorithmic Steps Examples
  • In the bubble sort algorithm: switch two adjacent
    cells.
  • In a generic algorithm for finding an element
    in an array:
  • Do until stop:
  • Find out which cell to look at next (or stop)
  • Find out if the element we're looking for is in
    this cell
  • In an algorithm for testing if a number x is
    prime: find out if y divides x.
  • In a classic algorithm for multiplying two
    numbers: multiply digits / add digits.
  • Note that all these steps take constant time to
    perform, which does not depend on the size of
    the input.

14
Advantages of the Suggested Measurement
  • It is not dependent on a particular computer.
  • If we wish to figure out the running time of the
    algorithm on a particular computer, we'll just
    have to:
  • estimate how long it takes to perform the basic
    steps we've defined on the particular computer
  • multiply this measurement by the number of steps
    we've calculated for a specific input size.

15
Example Character Search
  • Problem: Find out if the character c occurs in a
    given text.
  • Solution 1:

    found ← false
    while (more characters to read and found = false)
        read the next character in the text
        if this character is c, found ← true
    if (found = false) print("not found")
    else print("found")

16
Solution 1 Time Analysis
  • Input size?
  • Number of characters in text
  • What is the basic step?
  • Find out if end of text has been reached
  • read next character in text
  • Test if character is c
  • What is the running time as a function of the
    input length n?
  • Depends on the particular text. But in the worst
    case, no more than n basic steps + a constant
    (operations before and after the loop).
  • $T(n) \le c_1 n + d_1$
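
Solution 1 can be sketched in Python with a counter for the basic steps. This is a minimal illustration, not the lecture's own code: the function name `contains_char` and the choice to count one basic step per loop iteration are assumptions made here.

```python
def contains_char(text, c):
    """Serial search for c; returns (found, number of basic steps)."""
    steps = 0
    found = False
    i = 0
    # One basic step = end-of-text test + read next character + compare.
    while i < len(text) and not found:
        steps += 1
        if text[i] == c:
            found = True
        i += 1
    return found, steps
```

In the worst case (c is absent) the loop runs once per character, so the step count equals the input length n.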

17
Character Search Simple Optimization
  • Solution 2:

    found ← false
    add c to end of text
    while (found = false)
        read the next character in the text
        if this character is c, found ← true
    if (end of text reached, i.e. the c we found is
        the one we added) print("not found")
    else print("found")
    remove c from end of text

18
Solution 2 Time Analysis
  • The analysis of Solution 2 is more or less the
    same; however, the basic step is different:
  • read next character in text
  • Test if character is c
  • In the worst case, the running time of Solution 2
    as a function of n is
  • $T(n) \le c_2 n + d_2$
  • This time, the constants are different
    ($c_1 > c_2$, $d_2 > d_1$).
  • In Solution 2, we have
  • shortened the time it takes to perform the basic
    step, but
  • added a constant to the overall running time
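
The sentinel idea of Solution 2 might look like this in Python. It is a sketch under assumptions: the name `contains_char_sentinel` is hypothetical, and because Python strings are immutable the text is first copied into a list, a linear-time cost the slide's in-place version does not pay.

```python
def contains_char_sentinel(text, c):
    """Serial search that appends c as a sentinel, so the loop needs no end test."""
    chars = list(text)       # copy only because Python strings are immutable
    n = len(chars)
    chars.append(c)          # sentinel: guarantees the loop terminates
    i = 0
    while chars[i] != c:     # basic step: read next character + compare
        i += 1
    chars.pop()              # remove the sentinel again
    return i < n             # i == n means only the sentinel matched
```

The loop body no longer checks for the end of the text, which is the shortened basic step; appending and removing the sentinel is the added constant.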

19
Running Time Tables
Input size    1      3      5      10     100    1000    30000    3000000
3n + 2        5      11     17     32     302    3002    90002    9000002
2n + 4        6      10     14     24     204    2004    60004    6000004
ratio         0.83   1.1    1.21   1.33   1.48   1.5     1.5      1.5

20
Improvement by Factor
  • In short texts, Solution 1 is better than
    Solution 2 (the improved solution).
  • However, as the text length grows, the constants
    d1 and d2 become less and less important, and
    the ratio converges to 1.5.
  • Such improvement is called an improvement by
    factor, since the ratio between the running times
    of both solutions, as n grows, converges to a
    constant.
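
The convergence of the ratio can be checked numerically. The sketch below assumes the two running-time functions from the table, 3n + 2 and 2n + 4; the function name `ratio` is introduced here for illustration only.

```python
def ratio(n):
    """Ratio between the step counts 3n + 2 and 2n + 4."""
    return (3 * n + 2) / (2 * n + 4)


# The ratio starts below 1 and approaches 3/2 as n grows.
samples = [round(ratio(n), 2) for n in (1, 3, 5, 10, 100, 1000)]
```

Dividing numerator and denominator by n shows why: (3 + 2/n) / (2 + 4/n) tends to 3/2 as the constant terms fade.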

21
A Word about Best, Average and Worst Cases
  • Note that when we counted the number of steps,
    we analyzed the worst case, in which the
    character c is not in the text.
  • Another measurement: the average case.
  • What is the advantage of measuring the worst
    case?
  • The average case is a good measurement; however,
    for a specific input of length n, it tells us
    nothing about what the running time will be.
  • Computing the average case is quite complex.
  • What information does best-case analysis give us?

22
Improvement by Factor Is it Important?
  • We shall soon see that many times we can do
    better than improving the running time by a
    factor
  • However, improvement by factor is still
    important:
  • If we make an effort at optimizing specific
    bottleneck areas in a program, we may gain a
    lot
  • Special programs called profilers help us
    pinpoint the hot areas in a program.

The 80/20 rule (or 90/10 rule): A program spends
80% of its time executing 20% of its code.

23
Finding a Phone Number in a Phonebook
  • Problem: Find if a number x appears in a sorted
    array of numbers (e.g., a phonebook).
  • This problem is similar to the character search
    problem.
  • The algorithms we have already seen can be used
    to solve this problem; both algorithms are quite
    similar, and are variants of the serial search
    method.
  • Other possible optimizations?
  • However, the assumption that the array is sorted
    can be used in a clever way.

24
How Does One Find a Lion in the Desert?
25
Binary Search
  • Cut the search space in half at every step.
  • The basic step in binary search:
  • 1. Find out if the current cell contains the
    number we're looking for
  • 2. Termination condition: find out if the range
    is of size 1
  • 3. If 1 and 2 are both false, calculate the next
    cell to look at (index = middle cell of the
    current range)
  • The basic step in serial search:
  • 1. Find out if the current cell contains the
    number we're looking for
  • 2. Termination condition: find out if we have
    reached the end of the array
  • 3. If 1 and 2 are both false, calculate the next
    cell to look at (index = index + 1)

26
Binary Search - Example
  • If the array is of size 1000, in the worst case
    we will be looking at ranges of size
    1000, 500, 250, 125, 63, 32, 16, 8, 4, 2, 1: a
    total of 10 halving steps.
  • Compare to serial search: 1000 steps!
  • With a million cells, we will be looking at 20
    cells in the worst case
  • How many cells in the general case?
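
A minimal Python sketch of binary search with a step counter, assuming the input is a sorted list; the name `binary_search` and the convention of counting one step per examined cell are choices made here, not taken from the slides.

```python
def binary_search(a, x):
    """Search sorted list a for x; returns (found, number of cells examined)."""
    lo, hi = 0, len(a) - 1
    steps = 0
    while lo <= hi:
        steps += 1
        mid = (lo + hi) // 2      # middle cell of the current range
        if a[mid] == x:
            return True, steps
        if a[mid] < x:
            lo = mid + 1          # discard the lower half
        else:
            hi = mid - 1          # discard the upper half
    return False, steps
```

For example, searching a sorted list of 1000 elements examines at most 10 cells, and a list of a million elements at most 20.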

27
Binary vs. Serial - Number of Steps
Input size    10     100    1000    10000    100000    1000000
serial        10     100    1000    10000    100000    1000000
binary        4      7      10      14       17        20
ratio         2.5    14     100     714      5883      50000

28
Improvement by Order of Magnitude
  • Recall that when we dealt with improvement by
    factor, the ratio between running times
    converged to a constant.
  • This time, we can see that the ratio between the
    numbers of steps keeps growing as the input
    size grows.
  • This kind of improvement is called improvement by
    order of magnitude.
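
The step counts in the binary row of the table are consistent with the base-2 logarithm of the input size, rounded up. The short sketch below, with the hypothetical helper name `binary_steps`, recomputes them and the growing ratio.

```python
import math


def binary_steps(n):
    """Number of halvings needed to shrink a range of n cells down to 1."""
    return math.ceil(math.log2(n))


sizes = (10, 100, 1000, 10000, 100000, 1000000)
steps = [binary_steps(n) for n in sizes]
# Serial search needs n steps, so the ratio n / steps grows without bound.
ratios = [n / s for n, s in zip(sizes, steps)]
```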

29
What About the Duration of Basic Step?
  • When we dealt with improvement by factor, the
    duration of a basic step was of great interest.
  • Is it of importance now?
  • Or, put in other words: Assume that the duration
    of a single step in serial search is 1 and that
    a single step in binary search takes 100. Would
    there still be an improvement?

30
Binary vs. Serial - Different Duration of Steps
Input size    10      100    1000    10000    100000    1000000
serial        10      100    1000    10000    100000    1000000
binary        400     700    1000    1400     1700      2000
ratio         0.025   0.14   1       7.14     58.8      500

31
Duration of Basic Step is Negligible
  • As we can see from the table, for small input
    sizes (< 1000), serial search is indeed better
  • However, for larger input sizes, binary search
    still wins.
  • The reason is very simple: the ratio between the
    durations of the basic steps is constant, while
    the ratio between the numbers of basic steps
    grows as the input size grows.
  • Note that in practice, the ratio between the
    basic steps in binary/serial search will be much
    smaller.

32
Order of Magnitude
  • We have seen two basic kinds of improvement in
    the running time of an algorithm: by factor, and
    by order of magnitude.
  • The latter improvement is much more meaningful.
  • This is why many times we want to neglect the
    small differences between two running time
    functions and get an impression of what is the
    dominant element in the functions.

33
Linear Order
  • For example, in serial search, any running time
    function will be of the form $f(n) = an + b$,
    which is called a linear function.
  • The ratio between any two linear functions
    approaches a constant as n grows.
  • This is why we say that the running time
    functions are of linear order, or that the
    complexity of the algorithms is linear.
  • Linear order is denoted by $O(n)$. We say
    that $f(n) = O(n)$. This is called the Big-O
    notation.

34
Order of Magnitude
  • In general, we say that two functions are of the
    same order if the ratio between their values
    approaches a nonzero constant as n grows.
  • Example: $f(n) = n^2$ is not of linear order! It
    is of quadratic order, or $O(n^2)$.
  • All these functions are of quadratic order:
  • $n^2$, $5n^2 + 6$, $5n^2 + 100n - 90$, $5000n^2$,
    $n^2/6$
  • Other orders of magnitude: $O(\log n)$ -
    logarithmic, $O(n^k)$ ($k \ge 2$) - polynomial,
    $O(2^n)$ - exponential.
  • Polynomial and exponential are very important
    orders of magnitude, and we shall see why later.

35
Order of Magnitude - Neglecting Minor Elements
  • When we compare functions of different orders of
    magnitude, what is beyond the order of
    magnitude is negligible.
  • Example: $100n$ and $n^2/100$. For $n > 10000$,
    $n^2/100 > 100n$.
  • If we had two algorithms A1 and A2 whose running
    times are $100n$ and $n^2/100$, we would prefer
    A2 if we knew our input size is less than 10000
    (most of the time), but prefer A1 if the
    opposite were true.

36
Example Prime Test
  • Problem: Determine if a number n is prime.
  • First attempt: check if any of 2..n/2 is a
    divisor of n.
  • Second attempt: if n is odd, we only have to
    check odd divisors.
  • Third attempt: we only have to check 2..sqrt(n),
    since if n is not prime, then $n = pq$, and one
    of the numbers p or q is no greater than sqrt(n).
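
The second and third attempts combine into the following Python sketch. It is an illustration only; the function name `is_prime` is an assumption, and `math.isqrt` is used so that the square-root bound is computed exactly on integers.

```python
import math


def is_prime(n):
    """Trial division: test 2, then odd divisors up to sqrt(n)."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2                      # 2 is the only even prime
    for d in range(3, math.isqrt(n) + 1, 2):
        if n % d == 0:                     # basic step: does d divide n?
            return False
    return True
```

The loop performs about sqrt(n)/2 basic steps in the worst case, compared with about n/2 for the first attempt.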

37
Example Frequent Two Letter Occurrences
  • Problem: For a given text input, find the most
    frequent adjacent two-letter pair that appears
    in the text.
  • First attempt:
  • For every pair that appears in the text, count
    how many times this pair appears in the text, and
    find the maximum.
  • Complexity: $(n-1)(n-1) = n^2 - 2n + 1 = O(n^2)$
  • Second attempt:
  • Use a two-dimensional 26x26 array.
  • Complexity: $(n-1) + 2 \cdot 26 \cdot 26 = O(n)$
  • Tradeoff: added storage complexity, reduced time
    complexity!
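
The second attempt can be sketched as follows. The name `most_frequent_pair` is hypothetical, and the sketch assumes only the 26 English letters count (case-insensitively); pairs containing any other character are skipped.

```python
def most_frequent_pair(text):
    """Return (pair, count) for the most frequent adjacent two-letter pair."""
    counts = [[0] * 26 for _ in range(26)]        # the 26x26 counting array
    letters = text.lower()
    for a, b in zip(letters, letters[1:]):        # one pass: n - 1 pairs
        if "a" <= a <= "z" and "a" <= b <= "z":
            counts[ord(a) - ord("a")][ord(b) - ord("a")] += 1
    best_pair, best_count = None, 0
    for i in range(26):                           # constant-size scan: 26 * 26
        for j in range(26):
            if counts[i][j] > best_count:
                best_count = counts[i][j]
                best_pair = chr(ord("a") + i) + chr(ord("a") + j)
    return best_pair, best_count
```

Ties are resolved in favor of the alphabetically first pair, a consequence of the scan order rather than anything the slide specifies.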

38
Other Examples Ternary Search
  • Split the search space into three parts.
  • Is it an improvement in order of magnitude? In
    factor?

39
Other Examples Wasteful Sort
  • Find x, the maximum element in the array a to be
    sorted
  • Create a new integer array c of size x
  • Zero c
  • Count number of occurrences of each element in a,
    store in c
  • Generate elements according to c in temporary
    array
  • Copy temporary array back to a
  • What is the memory/time complexity?
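
The steps above translate to Python roughly as follows. This is a sketch assuming the array holds non-negative integers; it allocates x + 1 counters so that the value x itself has a cell, and the name `wasteful_sort` simply follows the slide title.

```python
def wasteful_sort(a):
    """Sort non-negative integers by counting occurrences of each value."""
    if not a:
        return []
    x = max(a)                      # maximum element
    c = [0] * (x + 1)               # zeroed counting array, indices 0..x
    for value in a:
        c[value] += 1               # count occurrences of each element
    result = []
    for value, count in enumerate(c):
        result.extend([value] * count)   # emit each value count times
    return result
```

Both time and memory are proportional to n + x, which is why the method is wasteful when the maximum element is much larger than the number of elements.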

40
Why Bother?
  • Computers today are very fast, and perform
    millions of operations in seconds.
  • Nevertheless, improvement in order of magnitude
    can reduce computation duration by seconds, hours
    and even days.
  • Moreover, the following fact is true: for some
    problems, the only known algorithms take so many
    steps that even the fastest computers today, and
    any that will ever exist, are unable to solve
    the problem!
  • Example: The travelling salesperson problem (TSP).

41
The Travelling Salesperson Problem
  • The story: find the shortest path which starts at
    a city and traverses all cities.

[Figure: graph of cities; edge lengths 6, 8, 11, 5, 13, 8, 6, 3, 7, 4, 11]

42
Solution to TSP
  • Brute force:
  • For each possible path, find its length
  • Choose the path with minimum length
  • Number of possible paths:
  • At most $(n-1)(n-2) \cdots 1 = (n-1)!$
    (n - 1 factorial)
  • Complexity of the algorithm: $n \cdot (n-1)! = O(n!)$
  • How long will it take to go over $O(n!)$ paths
    for growing n?
  • Assume we have a computer which can compute a
    million paths per second
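
A brute-force search over all paths might be sketched as below. The assumptions are made here, not in the slides: distances are given as a full matrix `dist`, the path starts at city `start` and does not return to it, and the function name `shortest_path_brute_force` is hypothetical.

```python
from itertools import permutations


def shortest_path_brute_force(dist, start=0):
    """Try every ordering of the other cities; return (best_path, best_length)."""
    n = len(dist)
    others = [city for city in range(n) if city != start]
    best_path, best_length = None, float("inf")
    for order in permutations(others):            # (n - 1)! candidate paths
        path = (start,) + order
        length = sum(dist[path[i]][path[i + 1]] for i in range(n - 1))
        if length < best_length:
            best_path, best_length = path, length
    return best_path, best_length
```

Each candidate costs about n additions, which gives the n * (n - 1)! total on the slide.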

48
TSP Computing Times for Different Input Sizes
# of cities    # of paths                     computing time
6              120                            8 milliseconds
11             3,628,800                      3.5 seconds
13             479,001,600                    8 minutes
16             1,307,674,368,000              15 days
18             335,000,000,000,000            11 years
21             2,430,000,000,000,000,000      77,000 years!

49
TSP - an Intractable Problem
  • TSP evidently cannot be solved for reasonable
    input sizes
  • The complexity of TSP, $O(n!) > O(2^n)$, is
    exponential.
  • Any exponential running time function is
    intractable.
  • What is the input size we can solve under the
    following conditions?
  • Parallel computer with # of processors equal to
    the number of atoms in the universe
  • Time: number of years since the Big Bang

50
According to the CIA (or why exponential is bad)
  • The land area on Earth is about 150 million
    square kilometers
  • The population on Earth is about 6000 million,
    thus the average population density is about 40
    people / square kilometer.
  • The current population growth is about 1.5% per
    year.
  • 1.5% may not sound like much growth, however
    $1.015^{1000} \approx 2.9$ million.
  • Thus by the year 3000, if the population growth
    continues at 1.5% per year, the average
    population density will be 120 people per square
    meter.
  • By the year 4000 there will be 350 million people
    per square meter...
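
The arithmetic can be replayed in a few lines of Python. The sketch assumes the slide's baseline figures refer to roughly the year 2000, so that the year 3000 is 1000 years of compound growth away; the variable names are chosen here for illustration.

```python
growth_per_millennium = 1.015 ** 1000          # 1.5% per year for 1000 years

density_per_km2_now = 6000e6 / 150e6           # 40 people per square kilometer
# 1 square kilometer = 1e6 square meters
density_per_m2_in_3000 = density_per_km2_now * growth_per_millennium / 1e6
density_per_m2_in_4000 = density_per_m2_in_3000 * growth_per_millennium
```

The slide's figures of 120 and 350 million people per square meter are roundings of these two values.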

51
Effect of Improved Technology
Size of Largest Problem Instance Solvable in 1 Hour
Complexity                         $n$       $n^2$     $n^3$     $n^5$     $2^n$        $3^n$
With present computer              N1        N2        N3        N4        N5           N6
With computer 100 times faster     100N1     10N2      4.64N3    2.5N4     N5 + 6.64    N6 + 4.19
With computer 1000 times faster    1000N1    31.6N2    10N3      3.98N4    N5 + 9.97    N6 + 6.29

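
The entries of such a table follow from one observation: a computer that is s times faster performs s times as many steps in the hour. For a polynomial running time n^k the solvable size is multiplied by the k-th root of s, while for an exponential running time b^n it only grows by the additive amount log_b(s). The helper names below are hypothetical.

```python
import math


def polynomial_size_factor(speedup, k):
    """Factor by which the solvable size grows when the running time is n**k."""
    return speedup ** (1.0 / k)


def exponential_size_gain(speedup, base):
    """Amount added to the solvable size when the running time is base**n."""
    return math.log(speedup) / math.log(base)
```

This is the quantitative reason faster hardware helps so little with exponential algorithms: a thousandfold speedup adds fewer than 10 to the solvable input size for a running time of 2^n.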
52
TSP - A Member of a Large Family
  • It may seem that TSP is just one problem
  • However, there is a whole set of problems
    (1000), called NP problems, from a large variety
    of areas, which are very similar to TSP
  • Those problems are the focus of CS research, and
    yet, no efficient (polynomial) algorithm has been
    found
  • Although it has not been proven, it is strongly
    believed that there is no efficient algorithm for
    these problems (this is the famous P = NP
    problem)

53
The NP Complete Class
  • Many of the NP problems are complete, in the
    sense that if an efficient solution to one of
    them is found, then all NP problems can be
    solved efficiently
  • This is true since all the problems in this class
    were reduced to a single problem which is known
    to be NPC
  • A reduction from A to B means that given an
    algorithm that solves B, we can find an algorithm
    that solves A.
  • "I don't know how to solve A, but if you show me
    how to solve B, I can solve A. So now the problem
    is B."
  • Example: the well-known mathematician and
    physicist joke.

54
Example of a Reduction Tree
[Figure: reduction tree, with these labels]
If we find a solution to any of the red
problems, then we can find a solution to SAT
(path), and all NP problems are solvable.
SAT is reduced to another problem.
Special problem (SAT): if it is solvable, then any
NP problem is solvable.

55
Example of a Simple Reduction
  • Q1: What is the minimal solution to the TSP?
  • Q2: Is there a solution to TSP with length < k?

56
Examples of NP Complete Problems
  • Knapsack
  • Input: Set of elements U with weights, number B
  • Problem: Find a subset of U with max. weight
    s.t. sum of weights < B
  • Minimum Set Cover
  • Input: Set of tasks to perform, group of people
    who are able to perform subsets of the set of
    tasks.
  • Problem: Find a minimal-sized subgroup of the
    people who can perform all the tasks.

57
More NPC Problems
  • Max SAT
  • Input: Set of logical clauses C1 and C2 and C3
    and ... and Ck
  • Each clause is of the form Ci = P or (not Q) or
    R or ... or S
  • Problem: Find an assignment in which the max no.
    of clauses can be satisfied
  • Minimum Broadcast
  • Input: Network graph
  • Problem: Find the minimal time for broadcast
    from a node t.

58
Yet More NPC Problems
  • Graph Coloring
  • For a long time map makers believed that if you
    planned carefully you could color any map with
    maximum of four colors. Many mathematicians tried
    to prove this, but only recently with the aid of
    a computer was it shown to be true.
  • There is also no known polynomial time algorithm
    to color a graph with the minimum number of
    colors.
  • Minimum Bin Packing (disk storage)
  • Input: k files of sizes s1, ..., sk, disk
    capacity M
  • Problem: Find a partition of the files to disks
    such that each disk stores at most M bytes and
    the minimal number of disks is required

59
The Good News About NPC Problems
  • Although there is no efficient algorithm known
    that can solve NPC problems, there are other
    approaches:
  • Approximation: Some problems have efficient
    algorithms which approximate the solution, i.e.,
    find a solution which is optimal within a factor.
  • Randomization: Some problems have efficient
    algorithms which use coin flips, and find a good
    solution with high probability.

60
Example of Approximation
  • Minimum Processor Scheduling
  • Input: Set of n tasks with running times
    t1, ..., tn, set of processors P1, ..., Pm
  • Problem: Find a schedule with minimal finish time
  • This problem is known to be NPC. ($m^n$ options)
  • But: There is a greedy approximation!
  • Greedy Algorithm:
  • Go over the tasks serially, and at each stage
    assign a task to the least loaded processor
    (i.e., the processor with minimal sum of jobs)
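
The greedy algorithm can be sketched in Python with a min-heap of processor loads. The function name `greedy_schedule` and the return convention (finish time plus the processor index chosen for each task) are assumptions made for this illustration.

```python
import heapq


def greedy_schedule(times, m):
    """Assign each task, in order, to the currently least loaded of m processors."""
    loads = [(0, p) for p in range(m)]      # (current load, processor index)
    heapq.heapify(loads)
    assignment = []
    for t in times:
        load, p = heapq.heappop(loads)      # least loaded processor
        assignment.append(p)
        heapq.heappush(loads, (load + t, p))
    finish_time = max(load for load, _ in loads)
    return finish_time, assignment
```

For tasks [1, 1, 2] on 2 processors the greedy finish time is 3, while the optimal schedule (the task of length 2 alone on one processor) finishes at 2, so greedy is not always optimal.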

61
How Good is the Greedy Algorithm?
  • Let greedy(x) be the finish time of the schedule
    produced by Greedy on input x
  • Let opt(x) be the finish time of the optimal
    schedule on input x
  • Theorem: $\text{greedy}(x)/\text{opt}(x) \le 2 - 1/m$
  • So the greedy algorithm is not that bad! In
    fact, even for large m, it is at most about 2
    times the optimal!
  • The next few slides prove this

62
Step 1
  • The load on the most loaded processor (call it k)
    is greedy(x)
  • Let $t_j$ be the last job assigned to it.
  • Observation: The load on any other processor is
    at least $\text{greedy}(x) - t_j$.
  • This is true since at the time $t_j$ was
    assigned to k, all the other processors had
    loads of at least $\text{greedy}(x) - t_j$ (k was
    the least loaded processor then). Other jobs may
    have been added later.

63
Step 2
  • It follows that
    $\sum_i t_i \ge t_j + m(\text{greedy}(x) - t_j)$
  • Why is this true? Put all jobs in sequential
    rather than parallel order. Then the time it
    takes is at least $t_j + m(\text{greedy}(x) - t_j)$.
    However, this can't be more than $\sum_i t_i$.
  • Rearranging terms we get
    $\text{greedy}(x) \le \frac{1}{m}\sum_i t_i + (1 - \frac{1}{m})\,t_j$

64
Step 3
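  • Step 3
  • Observation: $\text{opt}(x) \ge t_j$ (clear)
  • Observation: $\text{opt}(x) \ge \frac{1}{m}\sum_i t_i$
  • If the latter were not true, then we would get
  • $\text{opt}(x) < \frac{1}{m}\sum_i t_i$
  • and $\sum_i t_i \le m \cdot \text{opt}(x) < m \cdot \frac{1}{m}\sum_i t_i = \sum_i t_i$
  • This is a contradiction
  • Step 4
  • From steps 2 and 3 we get
    $\text{greedy}(x) \le (2 - 1/m)\,\text{opt}(x)$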
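
Written out as one chain, the final step substitutes the two observations of Step 3 into the inequality of Step 2 (same symbols as above, with $t_j$ the last job on the most loaded processor):

```latex
\begin{align*}
\text{greedy}(x)
  &\le \frac{1}{m}\sum_i t_i + \Bigl(1 - \frac{1}{m}\Bigr) t_j
     && \text{(Step 2)} \\
  &\le \text{opt}(x) + \Bigl(1 - \frac{1}{m}\Bigr)\,\text{opt}(x)
     && \text{(Step 3: } \tfrac{1}{m}\textstyle\sum_i t_i \le \text{opt}(x),\ t_j \le \text{opt}(x)\text{)} \\
  &= \Bigl(2 - \frac{1}{m}\Bigr)\,\text{opt}(x).
\end{align*}
```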

65
Example of Randomization - SAT
  • Sometimes choosing random values for the
    variables is not that bad!
  • For each clause C = C1 or ... or Ck, the
    probability of getting FALSE is $(1/2)^k$. So the
    probability of getting TRUE is $1 - (1/2)^k$.
  • So in general, at least half of the Cs will get
    true. (More precisely: the expected # of Cs
    which get true is at least half of the Cs.)
  • So, this is a 2-approximation.
  • If each C contains at least 2 variables, we get a
    4/3-approximation
    ($\text{opt}/\text{approximation} \le 4/3$).
  • If we want to get a tight approximation, we can
    run the algorithm many times.
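
A minimal sketch of the randomized approach, repeated over several trials. The encoding is an assumption made here: each clause is a list of nonzero integers, where `3` stands for variable 3 and `-3` for its negation; the names `count_satisfied` and `random_max_sat` are hypothetical.

```python
import random


def count_satisfied(clauses, assignment):
    """Number of clauses with at least one true literal under the assignment."""
    return sum(
        any(assignment[abs(lit) - 1] == (lit > 0) for lit in clause)
        for clause in clauses
    )


def random_max_sat(clauses, num_vars, trials=100, seed=0):
    """Try `trials` uniformly random assignments and keep the best one."""
    rng = random.Random(seed)
    best_count, best_assignment = -1, None
    for _ in range(trials):
        assignment = [rng.random() < 0.5 for _ in range(num_vars)]
        satisfied = count_satisfied(clauses, assignment)
        if satisfied > best_count:
            best_count, best_assignment = satisfied, assignment
    return best_count, best_assignment
```

Repeating the experiment and keeping the best result is the "run the algorithm many times" step; a single trial already satisfies at least half of the clauses in expectation.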

66
Example The Sorted Array Sum Problem
  • Input: Sorted array A of n numbers, and a number
    S
  • Output: Are there two numbers in the array whose
    sum is S?
  • Algorithm 1: For each pair of numbers, check if
    their sum is S.
  • Complexity 1: $n(n-1)/2$ pairs, quadratic
    complexity.
  • Algorithm 2: For each A[i], binary search for
    S - A[i].
  • Complexity 2: $n \log n$.
  • Algorithm 3: left and right pointers.
  • If A[left] + A[right] = S, found.
  • If A[left] + A[right] < S, left++
  • If A[left] + A[right] > S, right--
  • Complexity 3: linear!
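
Algorithm 3 in Python, as a sketch: the function name `has_pair_with_sum` is an assumption, and the two numbers are taken to be at different positions in the array.

```python
def has_pair_with_sum(a, s):
    """Two-pointer scan over a sorted list; True if two entries sum to s."""
    left, right = 0, len(a) - 1
    while left < right:
        total = a[left] + a[right]
        if total == s:
            return True
        if total < s:
            left += 1       # need a larger sum: advance the left pointer
        else:
            right -= 1      # need a smaller sum: retreat the right pointer
    return False
```

Each iteration moves one pointer inward, so the loop runs at most n - 1 times.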

67
The Sorted Array Sum Revisited
  • Input: Sorted array A of n numbers, and a number
    S
  • Output: Is there a group of numbers in the array
    whose sum is S?
  • Possible solution: for each possible group of
    numbers, find out if its sum is S.
  • Complexity: the number of groups is $2^n$,
    therefore the complexity is exponential.
  • This problem is known to be NP-Complete!
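
The exhaustive solution can be sketched with `itertools.combinations`. The name `has_subset_with_sum` is hypothetical, and the sketch considers only non-empty groups.

```python
from itertools import combinations


def has_subset_with_sum(a, s):
    """Check every non-empty group of entries of a; exponential in len(a)."""
    for size in range(1, len(a) + 1):
        for group in combinations(a, size):   # 2**n - 1 groups in total
            if sum(group) == s:
                return True
    return False
```

Unlike the pair version, sortedness of the array gives no known way to avoid examining exponentially many groups in the worst case.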

68
Euler Paths and Circuits
  • Given an undirected graph G = (V, E), an Euler
    Path is a path that includes every edge in E
    exactly once.
  • An Euler Circuit is an Euler Path that starts and
    ends at the same vertex.
  • The circuits get their name from Leonhard Euler's
    famous Königsberg bridges problem: Traverse each
    one of the seven bridges exactly once on your
    Sunday stroll.
  • There exists an algorithm that finds an Euler
    circuit in a graph (provided there is one) in
    O(E) time.
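
One linear-time method is Hierholzer's algorithm; the slides do not say which algorithm they have in mind, so this choice is an assumption. The sketch below takes an undirected graph as a list of edges (a representation chosen here) and returns the circuit as a list of vertices, or None if no Euler circuit exists.

```python
from collections import defaultdict


def euler_circuit(edges):
    """Hierholzer's algorithm on an undirected multigraph given as (u, v) pairs."""
    if not edges:
        return []
    adj = defaultdict(list)
    for idx, (u, v) in enumerate(edges):
        adj[u].append((v, idx))
        adj[v].append((u, idx))
    if any(len(neighbors) % 2 for neighbors in adj.values()):
        return None                          # a vertex of odd degree rules it out
    used = [False] * len(edges)
    stack, circuit = [edges[0][0]], []
    while stack:
        v = stack[-1]
        while adj[v] and used[adj[v][-1][1]]:
            adj[v].pop()                     # drop edges already traversed
        if adj[v]:
            w, idx = adj[v].pop()
            used[idx] = True
            stack.append(w)                  # extend the current trail
        else:
            circuit.append(stack.pop())      # dead end: emit vertex, backtrack
    if len(circuit) != len(edges) + 1:
        return None                          # edges left over: graph not connected
    return circuit
```

On the Königsberg graph (four land masses with degrees 5, 3, 3 and 3) the function returns None, which matches Euler's answer that the stroll is impossible.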

70
Hamiltonian Circuits
  • A Hamiltonian Cycle is a cycle which traverses
    each vertex in a graph exactly once without
    traversing an edge twice
  • Deciding whether a graph has a Hamiltonian Cycle
    is NP-Complete