Explorations in Artificial Intelligence

- Prof. Carla P. Gomes
- gomes_at_cs.cornell.edu
- Module 6
- Intro to Complexity

The algorithmic problem

An algorithmic problem is given by a specification of all legal inputs and a specification of the desired output as a function of the input. The algorithm takes any legal input and produces the desired output.

Examples of algorithmic problems


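Variants of algorithmic problems: Decision Problem

Problem: Knapsack (decision)
Input: profits $p_0, p_1, \ldots, p_{n-1}$; weights $w_0, w_1, \ldots, w_{n-1}$; capacity $M$; target profit $P$
Output: YES or NO to the question: does there exist an n-tuple $(x_0, \ldots, x_{n-1}) \in \{0,1\}^n$ such that

$$\sum_{i=0}^{n-1} p_i x_i \ge P \quad \text{and} \quad \sum_{i=0}^{n-1} w_i x_i \le M \; ?$$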

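Variants of algorithmic problems: Search Problem

Problem: Knapsack (search)
Input: profits $p_0, p_1, \ldots, p_{n-1}$; weights $w_0, w_1, \ldots, w_{n-1}$; capacity $M$; target profit $P$
Output: an n-tuple $(x_0, \ldots, x_{n-1}) \in \{0,1\}^n$ such that

$$\sum_{i=0}^{n-1} p_i x_i \ge P \quad \text{and} \quad \sum_{i=0}^{n-1} w_i x_i \le M$$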

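Variants of algorithmic problems: Optimal Value

Problem: Knapsack (optimal value)
Input: profits $p_0, p_1, \ldots, p_{n-1}$; weights $w_0, w_1, \ldots, w_{n-1}$; capacity $M$
Output: the maximum value of

$$\sum_{i=0}^{n-1} p_i x_i \quad \text{subject to} \quad \sum_{i=0}^{n-1} w_i x_i \le M \quad \text{and} \quad (x_0, \ldots, x_{n-1}) \in \{0,1\}^n$$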

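Variants of algorithmic problems: Optimization

Problem: Knapsack (optimization)
Input: profits $p_0, p_1, \ldots, p_{n-1}$; weights $w_0, w_1, \ldots, w_{n-1}$; capacity $M$
Output: an n-tuple $(x_0, \ldots, x_{n-1}) \in \{0,1\}^n$ such that

$$\sum_{i=0}^{n-1} p_i x_i \quad \text{is maximized subject to} \quad \sum_{i=0}^{n-1} w_i x_i \le M$$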
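The four Knapsack variants differ only in what they return. The following is a minimal brute-force sketch, assuming the usual constraints (total weight at most M, total profit at least P). The function names are hypothetical and the enumeration takes exponential time, so it only illustrates the definitions.

```python
from itertools import product


def feasible_tuples(profits, weights, capacity):
    """Yield (total_profit, x) for every 0/1 tuple x whose weight fits."""
    for x in product((0, 1), repeat=len(profits)):
        total_weight = sum(w * xi for w, xi in zip(weights, x))
        if total_weight <= capacity:
            yield sum(p * xi for p, xi in zip(profits, x)), x


def knapsack_decision(profits, weights, capacity, target):
    """Decision variant: YES/NO, is a profit of at least `target` reachable?"""
    return any(p >= target for p, _ in feasible_tuples(profits, weights, capacity))


def knapsack_search(profits, weights, capacity, target):
    """Search variant: return one tuple reaching `target`, or None."""
    for p, x in feasible_tuples(profits, weights, capacity):
        if p >= target:
            return x
    return None


def knapsack_optimal_value(profits, weights, capacity):
    """Optimal value variant: the best achievable profit."""
    return max(p for p, _ in feasible_tuples(profits, weights, capacity))


def knapsack_optimization(profits, weights, capacity):
    """Optimization variant: a tuple that achieves the best profit."""
    return max(feasible_tuples(profits, weights, capacity), key=lambda t: t[0])[1]
```

With profits (10, 40, 30, 50), weights (5, 4, 6, 3) and capacity 10, the optimization variant picks items 1 and 3 for a profit of 90.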

Instance of an algorithmic problem: Size of an instance

- An instance of an algorithmic problem is a concrete case of such a problem with specific input. The size of an instance is given by the size of its input.
- Examples of instances
- An instance of problem 1

Size of instance: length of list

Size of instance: L = 7

Examples of instances

Size of instance: number of cities and roads

A particular instance

Size of instance: 6 nodes, 9 edges

Size of instance: number of variables and constraints (n, m)

An instance of problem 5:

$$\max\ 3x + 5y \quad \text{s.t.} \quad x \le 4,\ \ y \le 12,\ \ 3x + 2y \le 18,\ \ x, y \ge 0$$

Size of instance: 2 variables, 3 functional constraints

Complexity of Algorithms

- The complexity of an algorithm is the number of steps that it takes to transform the input data into the desired output.
- Each simple operation (+, -, *, /, =, if, etc.) and each memory access corresponds to a step. (*)
- The complexity of an algorithm is a function of the size of the input (or size of the instance). We'll denote the complexity of algorithm A by $C_A(n)$, where n is the size of the input.

(*) This model is a simplification but still valid to give us a good idea of the complexity of algorithms.

Example: Insertion Sort

From Introduction to Algorithms, Cormen et al.
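The slide's listing did not survive in this transcript. As a stand-in, here is a minimal sketch of insertion sort instrumented to count steps in the simplified model above. It is not the textbook's exact pseudocode, and the choice to count only key comparisons and element moves is an assumption made for illustration.

```python
def insertion_sort(values):
    """Sort a copy of `values`; return (sorted_list, comparisons, moves)."""
    a = list(values)
    comparisons = 0
    moves = 0
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        # Shift larger elements one slot to the right to make room for key.
        while i >= 0:
            comparisons += 1
            if a[i] <= key:
                break
            a[i + 1] = a[i]
            moves += 1
            i -= 1
        a[i + 1] = key
    return a, comparisons, moves
```

On an already sorted list of n elements this performs n - 1 comparisons and no moves; on a reversed list it performs n(n - 1)/2 of each.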

Different notions of complexity

Worst case complexity of an algorithm A: the maximum number of computational steps required for the execution of Algorithm A, over all the inputs of the same size, s. It provides an upper bound for an algorithm: the worst that can happen given the most difficult instance (the pessimistic view).

Best case complexity of an algorithm A: the minimum number of computational steps required for the execution of Algorithm A, over all the inputs of the same size, s. This is the most optimistic view of an algorithm: it tells us the least work a particular algorithm could possibly get away with for some one input of a fixed size, when we have the chance to pick the easiest input of a given size.

- Average case complexity of an algorithm A: the average amount of resources the algorithm consumes, assuming some plausible frequency of occurrence of each input.
- Figuring out the average cost is much more difficult than figuring out either the worst-case or the best-case cost; e.g., we have to assume a given probability distribution for the types of inputs we get.
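To make the three notions concrete, the sketch below measures them empirically for insertion sort by enumerating every input of a fixed size. It assumes that the inputs are all permutations of n distinct keys, each equally likely, and that cost means the number of key comparisons; both are illustrative assumptions, and the function names are hypothetical.

```python
from itertools import permutations
from statistics import mean


def count_comparisons(values):
    """Number of key comparisons insertion sort makes on `values`."""
    a = list(values)
    count = 0
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        while i >= 0:
            count += 1
            if a[i] <= key:
                break
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = key
    return count


def case_analysis(n):
    """Return (best, average, worst) cost over all permutations of size n."""
    costs = [count_comparisons(p) for p in permutations(range(n))]
    return min(costs), mean(costs), max(costs)
```

The enumeration itself takes n! steps, so this only works for very small n; it is a way to check an analysis, not a substitute for one.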

Different notions of complexity

We perform upper bound analysis on algorithms.

Growth Rates

- In general we only worry about growth rates because:
- Our main objective is to analyze the cost performance of algorithms.
- Another obstacle to having the exact cost of algorithms is that sometimes the algorithms are quite complicated to analyze.
- When analyzing an algorithm we are not that interested in the exact time the algorithm takes to run. Often we only want to compare two algorithms for the same problem; the thing that makes one algorithm more desirable than another is its growth rate relative to the other algorithm's growth rate.

Growth Rates

- Two functions of n have different growth rates if, as n goes to infinity, their ratio either goes to infinity or goes to zero.
- If their ratio stays near a non-zero constant, then they are asymptotically the same function.

Big Oh Notation

- Given two functions F and G, whose domain is the natural numbers, we say that the order of F is lower than or equal to the order of G if
- $F(n) \le c \cdot G(n)$ for all $n > n_0$ (c and $n_0$ are constants).
- We say F is O(G) (F is oh of G).

Example: $3n^3 + n^2 + n$ is $O(n^3)$.

In practice we just look at the fastest growing term of the expression.
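As a worked check of the example, and assuming the operators lost from the expression were plus signs, the constants c = 5 and n_0 = 0 witness the bound, since each lower-order term is at most n^3 once n is at least 1:

```latex
3n^3 + n^2 + n \;\le\; 3n^3 + n^3 + n^3 \;=\; 5n^3
\qquad \text{for all } n \ge 1,
\qquad\text{and}\qquad
\lim_{n \to \infty} \frac{3n^3 + n^2 + n}{n^3} = 3 .
```

The limit shows the ratio settling at a non-zero constant, which is why only the fastest growing term matters.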

Typical Growth Rates

Roughly speaking (plot of cost against size, slowest to fastest growth): constant, logarithmic, linear, quadratic, exponential.

Good vs. Bad Algorithms

- How do computer scientists differentiate between good (efficient) and bad (not efficient) algorithms?

The yardstick is that any algorithm that runs in no more than polynomial time is an efficient algorithm; everything else is not.

Polynomial vs. exponential growth (Harel 2000)
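The table on this slide did not survive the transcript. As a rough stand-in, the sketch below tabulates a polynomial cost (n squared) next to an exponential one (2 to the n); the instance sizes are arbitrary choices, not the values from the original figure.

```python
def growth_table(sizes=(10, 20, 50, 100)):
    """Return rows (n, n^2, 2^n) for each instance size n."""
    return [(n, n ** 2, 2 ** n) for n in sizes]


if __name__ == "__main__":
    for n, quadratic, exponential in growth_table():
        print(f"n={n:>4}  n^2={quadratic:>6}  2^n={exponential}")
```

Already at n = 100 the exponential column has 31 digits while the quadratic one is 10,000.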

Problem Complexity

- Theory of NP-completeness or NP-hardness
- Easy vs. hard problems

Overview of complexity

- How can we show a problem is efficiently solvable?
- We can show it constructively: we provide an algorithm and show that it solves the problem efficiently. E.g.:
- Shortest path problem: Dijkstra's algorithm runs in polynomial time, $O(n^2)$. Therefore the shortest path problem can be solved efficiently.
- Linear Programming: the Interior Point method has polynomial worst-case complexity. Therefore Linear Programming can be solved efficiently.

(*) The simplex method has exponential worst-case complexity. However, in practice the simplex algorithm seems to scale as $m^3$, where m is the number of functional constraints.
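To see where the quadratic bound for Dijkstra's algorithm comes from, here is a minimal sketch of the array-based version. It assumes non-negative arc lengths stored in an n by n matrix with `math.inf` for missing arcs; the name `dijkstra_dense` and this representation are illustrative choices.

```python
import math


def dijkstra_dense(weights, source):
    """Shortest distances from `source`; weights[u][v] is the arc length."""
    n = len(weights)
    dist = [math.inf] * n
    dist[source] = 0
    done = [False] * n
    for _ in range(n):                      # n rounds ...
        u, best = -1, math.inf
        for v in range(n):                  # ... each scanning n nodes
            if not done[v] and dist[v] < best:
                u, best = v, dist[v]
        if u == -1:                         # remaining nodes are unreachable
            break
        done[u] = True
        for v in range(n):                  # relax every arc leaving u
            if dist[u] + weights[u][v] < dist[v]:
                dist[v] = dist[u] + weights[u][v]
    return dist
```

The outer loop runs at most n times and each pass does two linear scans, which gives the quadratic step count.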

Overview of complexity

- How can we show a problem is not efficiently solvable?
- How do you prove a negative? Much harder!!!
- This is the aim of complexity theory.

Easy (efficiently solvable) problems vs. Hard Problems

- Easy problems: we consider a problem X to be easy, or efficiently solvable, if there is a polynomial time algorithm A for solving X. We denote by P the class of problems solvable in polynomial time.
- Hard problems: everything else. Any problem for which there is no polynomial time algorithm is an intractable problem.

Class P: Class of Problems solvable in Polynomial Time

Other examples of problems in the class P include the assignment problem, the transportation problem, the minimum cost flow problem, and the max cost flow problem in a directed graph.

Two problems

I'd like you to develop an efficient algorithm to find the longest path between two points in a graph.

Your Longest Path Algorithm between two nodes, u and v

G = (N, E)

Initialization:
    MaxPath ← none
    MaxPathLength ← 0
For each path P starting at u:
    if P is a simple path from u to v and length(P) > MaxPathLength:
        MaxPath ← P
        MaxPathLength ← length(P)
Return MaxPath, MaxPathLength
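The enumeration above can be sketched in Python as a depth-first search over simple paths. The graph representation (a dict mapping each node to a dict of neighbour lengths) and the function name are assumptions made for illustration. The number of simple paths can grow exponentially with the graph size, which is what makes this a bad algorithm.

```python
def longest_simple_path(adjacency, u, v):
    """Return (path, length) of a longest simple path from u to v.

    Returns (None, 0) if v cannot be reached from u.
    """
    best_path, best_length = None, 0

    def extend(path, length, visited):
        nonlocal best_path, best_length
        node = path[-1]
        if node == v:
            # A simple path from u to v ends the first time it reaches v.
            if best_path is None or length > best_length:
                best_path, best_length = list(path), length
            return
        for nxt, edge_length in adjacency[node].items():
            if nxt not in visited:
                visited.add(nxt)
                path.append(nxt)
                extend(path, length + edge_length, visited)
                path.pop()
                visited.remove(nxt)

    extend([u], 0, {u})
    return best_path, best_length
```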

Is that the best you can do? That seems to be a bad algorithm!!!

I can't find an efficient algorithm. I guess I'm too dumb.

I can't find an efficient algorithm, but neither can these famous researchers.

In 1936, Alan Turing, a British mathematician, showed that there exists a relatively simple universal computing device that can perform any computational process. Computers use such a universal model.

Turing Machine (abstraction)

Turing also showed the limits of computation: some problems cannot be computed even with the most powerful computer and even with an unlimited amount of time, e.g., the Halting problem.

Brilliant mathematician, synthesizer, and promoter of the stored program concept, whose logical design of the Institute for Advanced Study (Princeton) computer became the prototype of today's computer (*): the von Neumann Architecture.

(*) Sequential, i.e., non-parallel, computers.

Invented the theory of NP-Completeness; proved that a simple problem, Satisfiability, is NP-Complete.

Given a propositional formula, is there an assignment to its variables (a, b, and c: True or False) making the formula true?
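The slide's formula is not preserved in this transcript, so the sketch below uses a made-up one over a, b and c to show what the Satisfiability question asks. The clause encoding (a list of clauses, each a list of (variable, wanted_value) pairs) is an assumption for illustration, and the search tries all 2^n assignments.

```python
from itertools import product


def satisfying_assignment(variables, clauses):
    """Return an assignment satisfying every clause, or None if none exists."""
    for values in product((False, True), repeat=len(variables)):
        assignment = dict(zip(variables, values))
        # A clause holds if at least one of its literals has the wanted value.
        if all(
            any(assignment[name] == wanted for name, wanted in clause)
            for clause in clauses
        ):
            return assignment
    return None


# Hypothetical formula: (a or not b) and (b or c) and (not a or not c)
EXAMPLE = [
    [("a", True), ("b", False)],
    [("b", True), ("c", True)],
    [("a", False), ("c", False)],
]
```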

Showed that several important problems and applications are NP-Complete and NP-hard, including Integer Programming.

Invented Linear Programming formulations:

$$\max\ 3x + 5y \quad \text{s.t.} \quad x \le 4,\ \ y \le 12,\ \ 3x + 2y \le 18,\ \ x, y \ge 0$$

Invented the Simplex Algorithm.

Theory of NP-completeness and NP-hardness: Easy vs. hard problems

Can satisfiability or integer programming be solved in polynomial time?

- FACT: every algorithm that has ever been developed for satisfiability or integer programming takes exponential time in the worst case.
- Hundreds of very smart researchers have tried to come up with polynomial time algorithms for satisfiability or integer programming, and failed.
- It is generally believed that there is no polynomial time algorithm for satisfiability or integer programming.
- Complexity theory deals with proving that satisfiability, integer programming, and many other problems are computationally hard.

Decision Problems

NP-Completeness theory deals with decision problems. What is a decision problem? A problem for which there is a yes/no answer. Examples:

- Is there a path between two nodes in a graph shorter than k? (decision version of the shortest path problem)
- Is there a path between two nodes in a graph longer than k? (decision version of the longest path problem)
- Most optimization problems can be formulated as a decision problem.

Class NP: Class of Problems solvable in Nondeterministic-Polynomial Time

- We say that a decision problem is solvable in non-deterministic polynomial time if:
- The solution can be verified in polynomial time. (E.g., verifying that a path has length greater than K.)
- If we imagine that we have an exponential number of processors, we can check all possible solutions simultaneously and therefore answer in polynomial time.
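Verification is the cheap half of the definition. As a minimal sketch, the checker below takes a proposed path as the certificate and confirms, in time linear in the path length, that it is a simple path in the graph with total length greater than K. The graph representation (a dict of dicts of edge lengths) and the function name are assumptions made for illustration.

```python
def verify_long_path(adjacency, path, k):
    """Return True if `path` is a simple path of total length greater than k."""
    if not path or len(set(path)) != len(path):
        return False                      # empty, or a node is repeated
    total = 0
    for a, b in zip(path, path[1:]):
        if b not in adjacency.get(a, {}):
            return False                  # consecutive nodes are not adjacent
        total += adjacency[a][b]
    return total > k
```

Finding such a path may require searching exponentially many candidates, but checking one candidate does not.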

Class NP-Complete

The first problem to be shown to be NP-Complete was Satisfiability: Cook showed that all the problems in NP could be translated (in polynomial time) as Satisfiability problems.

The word complete means that every problem in the class NP-complete can be reduced (in polynomial time) into another problem of the class NP-complete. For example, all the problems in the class NP-complete can be written as Satisfiability problems.

The class of NP-Complete problems is the class of the hardest computational problems in the class NP: every NP problem can be transformed into an NP-complete problem (the reverse is not known to be true).

Is P not equal to NP? The $1,000,000 question

- P not equal to NP?
- Is it true that not all problems in NP can be solved in polynomial time?
- Class of NP-Complete Problems
- They all admit exponential time solutions.
- Nobody has ever been able to find a polynomial time solution for any single problem in the class.
- Nobody has ever been able to prove an exponential lower bound for any single problem in the class.

P not equal to NP? The $1,000,000 question

- Pictorial interpretation of this question

Is this the right picture? There are problems in NP that are inherently intractable and cannot be solved in polynomial time.

Or is this the right picture? All the problems in NP can be solved in polynomial time. Even though at this point we don't know of polynomial time algorithms to solve some problems in NP, they exist.

P = NP

P vs. NP: One Million Dollar Prize

http://www.claymath.org/Millennium_Prize_Problems/P_vs_NP/

Class of NP-Complete Problems: One Million Dollar Prize

- Completeness
- If someone were to find a polynomial time solution for a single problem in the class NP-complete, then all the problems could be solved in polynomial time!
- If someone were to prove an exponential lower bound for a single problem in the class NP-complete, then all the problems in the class would be intractable!


- Conjecture
- NP-Complete problems are inherently hard!
- They are intractable ?!

On Proving NP-Completeness results

- Suppose that we want to prove that a problem Π is NP-Complete. How do we do it?
- Show that Π is in NP, i.e., that a proposed solution can be verified in polynomial time.
- Find a known NP-complete problem, Π_NPC.
- Show that Π_NPC can be transformed (in polynomial time) into Π.
- Show that there is a solution to Π_NPC if and only if there is a solution to Π.

Proof: Hamiltonian Path is NP-complete

- A hamiltonian cycle is a cycle that passes through each node exactly once.
- A hamiltonian path is a path that includes every node of G.
- Suppose that we know that the problem of deciding if there is a hamiltonian cycle in a graph is NP-Complete.
- We will show that the problem of deciding if there is a hamiltonian path is also NP-Complete.

Proof Technique

- Start with any instance of the hamiltonian cycle problem (Π_NPC).
- We denote this instance as G = (N, A).
- Transformation proofs (these are standard).

Create an instance G' = (N', A') for the hamiltonian path problem from G with the following property: there is a hamiltonian path in G' if and only if there is a hamiltonian cycle in G.

The transformed network: node 1 of the original network was split into nodes 1 and 21, and nodes 0 and 22 were connected to the split nodes.

The original network

From J. Orlin

Claim: If there is a hamiltonian cycle in the original graph, then there is a hamiltonian path in the transformed graph.

A Hamiltonian Cycle.

Take the two arcs of the cycle in G incident to node 1. Connect one to node 1, and the other to node 21. Add in arcs (0, 1) and (21, 22).

Claim: If there is a hamiltonian path in the transformed graph, then there is a hamiltonian cycle in the original graph.

A Hamiltonian Path

Delete the two arcs (0, 1) and (21, 22). Then take the other arcs of the path in G' incident to 1 and 21, and make them incident to node 1 in G.
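The node-splitting transformation can be sketched and checked on small graphs. The code below assumes an undirected graph given as a list of node pairs, gives the copy (node 21 on the slides) the same neighbours as the split node, and uses brute-force checkers that are only practical for a handful of nodes. All function names are hypothetical.

```python
from itertools import permutations


def normalise(edges):
    """Represent undirected edges as a set of frozensets."""
    return {frozenset(e) for e in edges}


def split_node(nodes, edges, v, v_copy, s, t):
    """Build G' from G: copy v's edges onto v_copy, attach s to v, t to v_copy."""
    new_edges = normalise(edges)
    for a, b in edges:
        if a == v:
            new_edges.add(frozenset((v_copy, b)))
        if b == v:
            new_edges.add(frozenset((a, v_copy)))
    new_edges.add(frozenset((s, v)))
    new_edges.add(frozenset((v_copy, t)))
    return list(nodes) + [v_copy, s, t], new_edges


def has_hamiltonian_path(nodes, edges):
    edge_set = normalise(edges)
    return any(
        all(frozenset(pair) in edge_set for pair in zip(order, order[1:]))
        for order in permutations(nodes)
    )


def has_hamiltonian_cycle(nodes, edges):
    edge_set = normalise(edges)
    first, *rest = list(nodes)
    if len(rest) < 2:
        return False  # a cycle needs at least three nodes
    return any(
        all(
            frozenset(pair) in edge_set
            for pair in zip((first, *order), (*order, first))
        )
        for order in permutations(rest)
    )
```

Because the two added end nodes have only one neighbour each, any hamiltonian path in G' must run from one of them to the other, which is what ties it back to a cycle through the split node.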

NP-Completeness: Longest Path

Problem: Longest Path
Input: Graph G = (V, E), length l for each edge, positive integer K ≤ |V|.
Output: Is there a simple path (i.e., a path visiting each node at most once) with K or more edges?

Problem: HamiltonianPath
Input: Graph G = (V, E).
Output: Does it have a Hamiltonian path (i.e., a path visiting each node exactly once)?

Reduction NP-Completeness: From Hamiltonian Path into Longest Path

HamiltonianPath is NP-Complete.

Given an instance G = (V, E) of HamiltonianPath, ask the Longest Path question on the same graph with K = |V| - 1, since a Hamiltonian path visits all |V| nodes and therefore has exactly |V| - 1 edges.

- If there is a simple path with K = |V| - 1 edges (Yes), then there is a Hamiltonian Path in G (YES).
- If there is not a simple path with K = |V| - 1 edges (No), then there is NOT a Hamiltonian Path in G (NO).
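The reduction itself is almost a one-liner, which the sketch below tries to make visible: the graph is passed through unchanged and only the threshold K is computed. It assumes K counts edges, so a path through all |V| nodes corresponds to K = |V| - 1. The brute-force Longest Path decider and all names are illustrative.

```python
def to_longest_path_instance(nodes, edges):
    """Map a HamiltonianPath instance to a Longest Path instance (G, K)."""
    return nodes, edges, len(nodes) - 1


def has_simple_path_with_k_edges(nodes, edges, k):
    """Brute-force decider: is there a simple path with k or more edges?"""
    adjacency = {n: set() for n in nodes}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    def extend(node, visited, used):
        if used >= k:
            return True
        return any(
            extend(nxt, visited | {nxt}, used + 1)
            for nxt in adjacency[node]
            if nxt not in visited
        )

    return any(extend(start, {start}, 0) for start in nodes)


def has_hamiltonian_path(nodes, edges):
    """Answer HamiltonianPath by asking the Longest Path question."""
    g_nodes, g_edges, k = to_longest_path_instance(nodes, edges)
    return has_simple_path_with_k_edges(g_nodes, g_edges, k)
```

The transformation takes polynomial (here constant extra) time, so an efficient Longest Path algorithm would give an efficient HamiltonianPath algorithm.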

Proof of NP-Completeness of Sudoku

- Suppose that we know that the problem of deciding if we can complete a Latin Square is NP-Complete, i.e., the Latin Square Completion problem is NP-Complete.
- We will show that the problem of deciding if we can complete a partial Sudoku instance is also NP-Complete.

Sudoku

Can we complete this matrix using numbers from 1 to 9, without repeating a number in a row, column, or block?

$9^{55} \approx 3 \times 10^{52}$ possible completions

Reduction To Sudoku

Latin Square Completion Problem (figure: grid contents not recoverable)

If there is a Latin Square completion, there is also a way of completing the Sudoku matrix.

(figure: Latin Square Completion Problem and Sudoku grids, contents not recoverable)

If there is a way of completing the Sudoku matrix, there is also a way of completing the Latin Square.

(figure: Latin Square Completion Problem and Sudoku grids, contents not recoverable)

Class NP vs. Class Co-NP

Class NP: class of decision problems whose solution can be verified in polynomial time. (E.g., satisfiability)

Class Co-NP: co-NP is the complexity class that contains the complements of decision problems in the complexity class NP.

Co-NP and the asymmetry of NP

Unsatisfiability (Complement of Satisfiability): Is it true that for all values of a, b, and c this formula is not satisfied?

YES instances don't have short answers: if indeed the answer to this question is YES (i.e., we have an unsatisfiable formula), we do not know of a short proof for it.

Class P vs. NP ∩ Co-NP

If a problem belongs to P, then it belongs to both NP and co-NP, so P ⊆ NP ∩ Co-NP.

P = NP ∩ Co-NP? (i.e., are there problems with good characterization but with no polynomial time algorithm?)

Class PSPACE

- What if we worry about space, i.e., memory requirements?
- Class PSPACE: the set of decision problems that can be solved using a polynomial amount of memory, and unlimited time.
- Clearly P ⊆ PSPACE

Class PSPACE

- Class PSPACE: class of problems that appears to be harder than NP and co-NP.
- Why? Space can be re-used while time cannot.
- Examples
- Consider an algorithm that counts from 0 to $2^n - 1$. While this algorithm can be implemented with a simple n-bit counter, it runs for an exponential time!
- We can also solve the satisfiability problem using only a polynomial amount of space, for example by trying all possible assignments, also using an n-bit counter.

PSPACE-Complete

- A decision problem is PSPACE-complete if it is in PSPACE, and every problem in PSPACE can be reduced to it in polynomial time.
- The problems in PSPACE-complete can be thought of as the hardest problems in PSPACE.
- These problems are widely suspected to be outside of P and NP, but that is not known.

Satisfiability (NP-complete)

Quantified Satisfiability (PSPACE-complete) (The

most basic PSPACE-complete problem is identical

to satisfiability, except it alternates

existential and universal quantifiers)

The PSPACE-complete problem resembles a game: is there some move I can make, such that for all moves my opponent might make, there will then be some move I can make to win? The question alternates existential and universal quantifiers. Not surprisingly, many puzzles turn out to be NP-complete, and many games turn out to be PSPACE-complete.
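The game reading of quantified satisfiability can be sketched as a recursive evaluator: an existential quantifier is a move I get to choose, a universal one is a move my opponent chooses. The encoding below (a prefix of (quantifier, variable) pairs plus a Python predicate for the quantifier-free part) is an assumption for illustration. The recursion depth equals the number of variables, so the memory used is polynomial even though the running time is exponential.

```python
def evaluate_qbf(prefix, matrix, assignment=None):
    """Evaluate a quantified Boolean formula.

    prefix: list of ("exists" | "forall", variable_name) pairs, outermost first.
    matrix: function from a {variable_name: bool} dict to bool.
    """
    assignment = dict(assignment or {})
    if not prefix:
        return matrix(assignment)
    (quantifier, variable), rest = prefix[0], prefix[1:]
    if quantifier not in ("exists", "forall"):
        raise ValueError(f"unknown quantifier: {quantifier}")
    # Try both values of the variable, reusing the same space for each branch.
    outcomes = (
        evaluate_qbf(rest, matrix, {**assignment, variable: value})
        for value in (False, True)
    )
    return any(outcomes) if quantifier == "exists" else all(outcomes)
```

For example, "for all x there exists y with x different from y" is true, while swapping the quantifier order to "there exists y such that for all x, x is different from y" makes it false.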

- Checkers is PSPACE-complete when generalized so that it can be played on an n × n board.
- So are generalized versions of the games Hex and Reversi, and solitaire games such as Rush Hour, Mahjong, Atomix and Sokoban.
- Note that the definition of PSPACE-complete is based on asymptotic complexity: the time it takes to solve a problem of size n, in the limit as n grows without bound.
- That means a game like checkers (which is played on an 8 × 8 board) could never be PSPACE-complete. That is why all the games were modified by playing them on an n × n board instead.

(Diagram of the classes P, NP, Co-NP, and PSPACE)

Some examples of NP-hard problems

- Longest path
- Traveling Salesman Problem
- Capital Budgeting Problem (knapsack problem)
- Independent Set Problem
- Fire Station Problem (set covering)
- 0-1 Integer programming
- Integer Programming
- Project management with resource constraints
- and thousands more

Okay: Should we give up?

NO WAY!!!

- Here is why:
- The theory of NP-completeness is only a worst case result. Not all problem instances are as hard as the worst case.
- Real problems tend to have sub-problems that are tractable, and by exploiting the structure of such sub-problems using efficient algorithms such as Unit Propagation, LP, Min Cost Flow, Transportation, Assignment and shortest path methods, we can solve much larger problem instances.
- In the 1970s we could only solve Binary Integer Programming instances with ≤ 100 variables. By exploiting the structure we can now solve real world instances with over 120,000 variables and 4000 functional constraints.
- In the 1990s we could only solve Satisfiability instances with ≤ 50 variables and 200 clauses. By using randomization and learning to exploit the structure we can now solve real world instances with over 1,000,000 variables and 5,000,000 clauses.

NP-Complete and NP-Hard Problems

- Planning and Scheduling and Supply Chain Management
- Data Analysis & Data Mining
- Protein Folding and Medical Applications
- Capital Budgeting and Financial Applications
- Information Retrieval
- Combinatorial Auctions
- Software & Hardware Verification
- Many more applications!!!