
In The Name Of God
Algorithms Design: Greedy Algorithms

- Dr. Shahriar Bijani
- Shahed University
- Feb. 2014

Slides References

- Kleinberg and Tardos, Algorithm Design, CSE 5311, M. Kumar, Spring 2007.
- Chapter 4, Computer Algorithms, Horowitz et al., 2001.
- Chapter 16, Introduction to Algorithms, CLRS, 2001.

The Greedy Principle

- The problem: we are required to find a feasible solution that either maximizes or minimizes a given objective function.
- It is easy to determine a feasible solution, but not necessarily an optimal solution.
- The greedy method solves this problem in stages; at each stage, a decision is made considering inputs in an order determined by a selection procedure, which may be based on an optimization measure.

Optimization problems

- An optimization problem is one in which you want to find, not just a solution, but the best solution.
- A greedy algorithm sometimes works well for optimization problems.
- A greedy algorithm works in phases. At each phase:
- You take the best you can get right now, without regard for future consequences.
- You hope that by choosing a local optimum at each step, you will end up at a global optimum.

Example Counting money

- Suppose you want to count out a certain amount of money, using the fewest possible bills and coins.
- A greedy algorithm would do this: at each step, take the largest possible bill or coin that does not overshoot.
- Example: to make $6.39, you can choose:
- a $5 bill
- a $1 bill, to make $6
- a 25¢ coin, to make $6.25
- a 10¢ coin, to make $6.35
- four 1¢ coins, to make $6.39
- For US money, the greedy algorithm always gives the optimum solution.

A failure of the greedy algorithm

- In some (fictional) monetary system, Rials come in 1-Rial, 7-Rial, and 10-Rial coins.
- Using a greedy algorithm to count out 15 Rials, you would get:
- a 10-Rial piece
- five 1-Rial pieces, for a total of 15 Rials
- This requires six coins.
- A better solution would be to use two 7-Rial pieces and one 1-Rial piece.
- This only requires three coins.
- The greedy algorithm results in a solution, but not in an optimal solution.
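Both behaviors can be reproduced with a short sketch (illustrative code, not from the slides) that transcribes the greedy rule "take the largest coin that does not overshoot":

```python
def greedy_change(amount, denominations):
    """Greedily take the largest coin/bill that does not overshoot."""
    coins = []
    for d in sorted(denominations, reverse=True):
        while amount >= d:
            coins.append(d)
            amount -= d
    return coins

# US money in cents: greedy is optimal for $6.39
print(greedy_change(639, [500, 100, 25, 10, 5, 1]))
# [500, 100, 25, 10, 1, 1, 1, 1]

# Fictional Rial system: greedy uses 6 coins, but 7 + 7 + 1 uses only 3
print(greedy_change(15, [1, 7, 10]))
# [10, 1, 1, 1, 1, 1]
```

Note that the function always terminates with a correct total only because a 1-unit coin exists; greedy feasibility, like greedy optimality, depends on the coin system.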

A scheduling problem

- You have to run nine jobs, with running times of 3, 5, 6, 10, 11, 14, 15, 18, and 20 minutes.
- You have three processors on which you can run these jobs.
- You decide to do the longest-running jobs first, on whatever processor is available.

P1 P2 P3

- Time to completion: 18 + 11 + 6 = 35 minutes
- This solution isn't bad, but we might be able to do better.

Another approach

- What would be the result if you ran the shortest job first?
- Again, the running times are 3, 5, 6, 10, 11, 14, 15, 18, and 20 minutes.

P1 P2 P3

- That wasn't such a good idea; time to completion is now 6 + 14 + 20 = 40 minutes.
- Note, however, that the greedy algorithm itself is fast.
- All we had to do at each stage was pick the minimum or maximum.
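Both orderings can be sketched with a min-heap of processor loads, so each greedy step is just "give the next job to the least-loaded processor" (illustrative code, not from the slides):

```python
import heapq

def greedy_makespan(jobs, processors, longest_first=True):
    """Greedy scheduling: give each job, in the chosen order, to the
    processor that is least loaded right now; return time to completion."""
    loads = [0] * processors                    # a list of zeros is already a heap
    for job in sorted(jobs, reverse=longest_first):
        least = heapq.heappop(loads)            # least-loaded processor
        heapq.heappush(loads, least + job)
    return max(loads)

jobs = [3, 5, 6, 10, 11, 14, 15, 18, 20]
print(greedy_makespan(jobs, 3, longest_first=True))   # 35
print(greedy_makespan(jobs, 3, longest_first=False))  # 40
```

This reproduces the 35-minute and 40-minute completion times from the two slides.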

An optimum solution

- Better solutions do exist.
- This solution is clearly optimal (why?)
- Clearly, there are other optimal solutions (why?)
- How do we find such a solution?
- One way: try all possible assignments of jobs to processors.
- Unfortunately, this approach can take exponential time.

Huffman encoding

- The Huffman encoding algorithm is a greedy algorithm.
- You always pick the two smallest numbers to combine.
- Frequencies: A: 22, B: 12, C: 24, D: 6, E: 27, F: 9
- Resulting codes: A = 00, B = 100, C = 01, D = 1010, E = 11, F = 1011
- Average bits/char = 0.22·2 + 0.12·3 + 0.24·2 + 0.06·4 + 0.27·2 + 0.09·4 = 2.42
- The Huffman algorithm finds an optimal solution.

Procedure Huffman_Encoding(S, f)
Input: S (a string of characters) and f (an array of frequencies).
Output: T (the Huffman tree for S)

1. insert all characters into a heap H according to their frequencies
2. while H is not empty do
3.   if H contains only one character x then
4.     x ← root(T)
5.   else
6.     z ← ALLOCATE_NODE()
7.     x ← left[T, z] ← EXTRACT_MIN(H)
8.     y ← right[T, z] ← EXTRACT_MIN(H)
9.     f[z] ← f[x] + f[y]
10.    INSERT(H, z)

Complexity of the Huffman algorithm: building a heap in step 1 takes O(n) time. Insertions (steps 7 and 8) and deletions (step 10) on H take O(log n) time each. Therefore steps 2 through 10 take O(n log n) time, and the overall complexity of the algorithm is O(n log n).
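A runnable sketch of the same procedure, with Python's heapq playing the role of H (the node bookkeeping is simplified: each heap entry carries the codes of the leaves beneath it instead of explicit tree nodes):

```python
import heapq
from itertools import count

FREQS = {"A": 22, "B": 12, "C": 24, "D": 6, "E": 27, "F": 9}

def huffman_codes(freqs):
    """Greedy Huffman coding: repeatedly EXTRACT_MIN twice and merge."""
    tiebreak = count()                      # avoids comparing dicts on frequency ties
    heap = [(f, next(tiebreak), {s: ""}) for s, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        fx, _, x = heapq.heappop(heap)      # two lowest-frequency subtrees
        fy, _, y = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in x.items()}
        merged.update({s: "1" + c for s, c in y.items()})
        heapq.heappush(heap, (fx + fy, next(tiebreak), merged))
    return heap[0][2]

codes = huffman_codes(FREQS)
avg = sum(FREQS[s] * len(c) for s, c in codes.items()) / sum(FREQS.values())
print(avg)  # 2.42
```

On the slide's frequencies this reproduces the optimal average of 2.42 bits/char (the exact codewords may differ from the slide on ties, but the code lengths are the same).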

Knapsack Problem

- Assume a thief is robbing a store with various valued and weighted objects. Her bag can hold only W pounds.
- The 0-1 knapsack problem: must take all or nothing.
- The fractional knapsack problem: can take all or part of the item.

Where Greedy Fails

[Figure: knapsack of capacity W = 50 with three items: 10 lb worth $60, 20 lb worth $100, 30 lb worth $120. For the 0-1 problem, greedy by value density takes the 10-lb and 20-lb items for $160; taking the 10-lb and 30-lb items gives $180; the optimum takes the 20-lb and 30-lb items for $220. The fractional optimum takes the 10-lb and 20-lb items plus 20 lb of the 30-lb item, for $240.]

Fractional Knapsack Problem

- Given n items of sizes w1, w2, ..., wn, values p1, p2, ..., pn, and a knapsack of capacity C, the problem is to find x1, x2, ..., xn ∈ [0, 1] that maximize Σi pi·xi
- subject to Σi wi·xi ≤ C

Solution to Fractional Knapsack Problem

- Consider yi = pi / wi
- What is yi?
- What is the solution?
- The fractional problem has the greedy-choice property; 0-1 does not.
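The greedy rule (take items in decreasing order of density yi = pi/wi, splitting only the last item taken) can be sketched as follows; the item values are the standard 60/100/120 example with C = 50, an assumption chosen to match the "Where Greedy Fails" figure:

```python
def fractional_knapsack(items, capacity):
    """Greedy fractional knapsack: sort by value density p/w, take whole
    items while they fit, then a fraction of the next one."""
    total = 0.0
    for value, weight in sorted(items, key=lambda it: it[0] / it[1],
                                reverse=True):
        take = min(weight, capacity)    # whole item, or whatever still fits
        total += value * take / weight
        capacity -= take
        if capacity == 0:
            break
    return total

items = [(60, 10), (100, 20), (120, 30)]    # (value, weight) pairs
print(fractional_knapsack(items, 50))  # 240.0
```

The same greedy rule applied to the 0-1 version of this instance yields only 160, which is why the greedy-choice property fails there.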

Critical Path Method (CPM)

- Activity On Edge (AOE) networks
- Tasks (activities): a0, a1, ...
- Events: v0, v1, ...

[Figure: an AOE network with events v0, ..., v8 (start = v0, finish = v8) and activities a0 = 6, a1 = 4, a2 = 5, a3 = 1, a4 = 1, a5 = 2, a6 = 9, a7 = 7, a8 = 4, a9 = 2, a10 = 4 on its edges.]

- Some definitions:
- Predecessor
- Successor
- Immediate predecessor
- Immediate successor

CPM critical path

- A critical path is a path that has the longest length: (v0, v1, v4, v7, v8).
- Its length is 6 + 1 + 7 + 4 = 18 (the maximum over all paths).

[Figure: the same AOE network with the critical path v0 → v1 → v4 → v7 → v8 highlighted.]

The earliest time

- The earliest time an activity ai can occur is the length of the longest path from the start vertex v0 to ai's start vertex. (Ex: the earliest time activity a7 can occur is 7.)
- We denote this time as early(i) for activity ai. ⇒ early(6) = early(7) = 7.

[Figure: the AOE network annotated with earliest/latest event times (early/?): v0: 0/?, v1: 6/?, v2: 4/?, v3: 5/?, v4: 7/?, v5: 7/?, v6: 16/?, v7: 14/?, v8: 18.]

The latest time

- The latest time, late(i), of activity ai is defined to be the latest time the activity may start without increasing the project duration.
- Ex: early(5) = 5, late(5) = 8; early(7) = 7, late(7) = 7
- late(5) = 18 − 4 − 4 − 2 = 8; late(7) = 18 − 4 − 7 = 7

[Figure: the AOE network annotated with early/late values; along the critical path the two values are equal (e.g. 0/0, 6/6, 7/7, 16/16, 14/14), while off the critical path they differ (e.g. 4/5, 5/8, 7/10).]

Critical activity

- A critical activity is an activity for which early(i) = late(i).
- The difference between late(i) and early(i) is a measure of how critical an activity is.

To solve the AOE problem: calculate the earliest times, calculate the latest times, then find the critical path(s).

Calculation of Earliest Times

- Let activity ai be represented by edge (u, v).
- early(i) = earliest[u]
- late(i) = latest[v] − duration of activity ai
- We compute the times in two stages: a forward stage and a backward stage.
- The forward stage:
- Step 1: earliest[0] = 0
- Step 2: earliest[j] = max { earliest[i] + duration of (i, j) : i ∈ P(j) }
- P(j) is the set of immediate predecessors of j.

Calculation of Latest Times

- The backward stage:
- Step 1: latest[n−1] = earliest[n−1]
- Step 2: latest[j] = min { latest[i] − duration of (j, i) : i ∈ S(j) }
- S(j) is the set of vertices adjacent from vertex j.
- latest[8] = earliest[8] = 18
- latest[6] = min { latest[8] − 2 } = 16
- latest[7] = min { latest[8] − 4 } = 14
- latest[4] = min { latest[6] − 9, latest[7] − 7 } = 7
- latest[1] = min { latest[4] − 1 } = 6
- latest[2] = min { latest[4] − 1 } = 6
- latest[5] = min { latest[7] − 4 } = 10
- latest[3] = min { latest[5] − 2 } = 8
- latest[0] = min { latest[1] − 6, latest[2] − 4, latest[3] − 5 } = 0
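The forward and backward passes can be sketched in a few lines (the edge list below encodes the slides' AOE network; the (i, j) event pairs are inferred from the figure, so treat them as an assumption):

```python
def cpm_times(n, activities):
    """CPM forward and backward passes over an AOE network.
    activities maps event pair (i, j) -> duration, with i < j
    (so sorting the pairs gives a topological order)."""
    earliest = [0] * n
    for (i, j), d in sorted(activities.items()):                # forward stage
        earliest[j] = max(earliest[j], earliest[i] + d)
    latest = [earliest[n - 1]] * n
    for (i, j), d in sorted(activities.items(), reverse=True):  # backward stage
        latest[i] = min(latest[i], latest[j] - d)
    return earliest, latest

# a0..a10 with durations 6, 4, 5, 1, 1, 2, 9, 7, 4, 2, 4
acts = {(0, 1): 6, (0, 2): 4, (0, 3): 5, (1, 4): 1, (2, 4): 1, (3, 5): 2,
        (4, 6): 9, (4, 7): 7, (5, 7): 4, (6, 8): 2, (7, 8): 4}
earliest, latest = cpm_times(9, acts)
print(earliest)  # [0, 6, 4, 5, 7, 7, 16, 14, 18]
print(latest)    # [0, 6, 6, 8, 7, 10, 16, 14, 18]
```

From these event times, early(i) = earliest[u] and late(i) = latest[v] − duration for activity (u, v), which reproduces the early/late table on the next slide.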

Graph with non-critical activities deleted


Activity Early Late L - E Critical

a0 0 0 0 Yes

a1 0 2 2 No

a2 0 3 3 No

a3 6 6 0 Yes

a4 4 6 2 No

a5 5 8 3 No

a6 7 7 0 Yes

a7 7 7 0 Yes

a8 7 10 3 No

a9 16 16 0 Yes

a10 14 14 0 Yes

[Figure: the AOE network, and the same network with non-critical activities deleted, leaving only the critical activities a0, a3, a6, a7, a9, and a10 from start to finish.]

CPM for the longest path problem

- The longest-path (critical-path) problem can be solved by the critical path method (CPM):
- Step 1: find a topological ordering.
- Step 2: find the critical path.

Traveling salesman

- A salesman must visit every city (starting from city A), and wants to cover the least possible distance.
- He can revisit a city (and reuse a road) if necessary.
- He does this by using a greedy algorithm: he goes to the next nearest city from wherever he is.
- From A he goes to B; from B he goes to D.
- This is not going to result in a shortest path!
- The best result he can get now will be ABDBCE, at a cost of 16.
- An actual least-cost path from A is ADBCE, at a cost of 14.

Analysis

- A greedy algorithm typically makes (approximately) n choices for a problem of size n.
- (The first or last choice may be forced.)
- Hence the expected running time is O(n · O(choice(n))), where choice(n) is the cost of making a choice among n objects.
- Counting money: must find the largest usable coin from among k sizes of coin (k is a constant), an O(k) = O(1) operation.
- Therefore, coin counting is Θ(n).
- Huffman: must sort n values before making n choices.
- Therefore, Huffman is O(n log n) + O(n) = O(n log n).

Some Other greedy algorithms

- Dijkstra's algorithm for finding the shortest path in a graph (single source):
- Always takes the shortest edge connecting a known node to an unknown node.
- Prim's algorithm for finding a minimum-cost spanning tree:
- Always takes the lowest-cost edge between nodes in the spanning tree and nodes not yet in the spanning tree.
- Kruskal's algorithm for finding a minimum-cost spanning tree:
- Always tries the lowest-cost remaining edge.
Single Source Shortest Path

- In a shortest-paths problem, we are given a weighted, directed graph G = (V, E), with weights assigned to each edge in the graph. The weight of the path p = (v0, v1, v2, ..., vk) is the sum of the weights of its edges:
- v0 → v1 → v2 → ... → vk−1 → vk
- The shortest-path weight from u to v is given by:
- d(u, v) = min { weight(p) } if there are one or more paths from u to v
- ∞ otherwise

Dijkstra Single-Source Shortest Paths

Given G = (V, E), find the shortest path from a given vertex u ∈ V to every vertex v ∈ V (u ≠ v). For each vertex v ∈ V in the weighted directed graph, d[v] represents the distance from u to v. Initially, d[v] = 0 when u = v, d[v] = ∞ if (u, v) is not an edge, and d[v] = weight of edge (u, v) if (u, v) exists.

Dijkstra's algorithm: at every step of the algorithm, we compute d[y] = min { d[y], d[x] + w(x, y) }, where x, y ∈ V. Dijkstra's algorithm is based on the greedy principle because at every step we pick the path of least weight.



Step Vertex to be marked u a b c d e f g h Unmarked vertices

0 u 0 1 5 ? 9 ? ? ? ? a,b,c,d,e,f,g,h

1 a 0 1 5 3 9 ? ? ? ? b,c,d,e,f,g,h

2 c 0 1 5 3 7 ? 12 ? ? b,d,e,f,g,h

3 b 0 1 5 3 7 8 12 ? ? d,e,f,g,h

4 d 0 1 5 3 7 8 12 11 ? e,f,g,h

5 e 0 1 5 3 7 8 12 11 9 f,g,h

6 h 0 1 5 3 7 8 12 11 9 f,g

7 g 0 1 5 3 7 8 12 11 9 f

8 f 0 1 5 3 7 8 12 11 9 --

Dijkstra's Single-source shortest path

- Procedure Dijkstra_Single_Source_Shortest_Path(G(V,E), u)
- Input: G = (V, E), the weighted directed graph, and u, the source vertex
- Output: for each vertex v, d[v] is the length of the shortest path from u to v
- mark vertex u
- d[u] ← 0
- for each unmarked vertex v ∈ V do
- if edge (u, v) exists then d[v] ← weight(u, v)
- else d[v] ← ∞
- while there exists an unmarked vertex do
- let v be an unmarked vertex such that d[v] is minimal
- mark vertex v
- for all edges (v, x) such that x is unmarked do
- if d[x] > d[v] + weight(v, x) then
- d[x] ← d[v] + weight(v, x)
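A runnable sketch of the procedure, with a min-heap playing the role of "let v be an unmarked vertex such that d[v] is minimal" (the small test graph is made up for illustration, not the slides' example):

```python
import heapq

def dijkstra(graph, u):
    """Single-source shortest paths.  graph maps each vertex to a dict
    {neighbor: weight}; returns d[v] for every vertex v."""
    d = {v: float("inf") for v in graph}
    d[u] = 0
    heap = [(0, u)]
    marked = set()
    while heap:
        dv, v = heapq.heappop(heap)     # unmarked vertex with minimal d[v]
        if v in marked:
            continue                    # stale heap entry; skip it
        marked.add(v)
        for x, w in graph[v].items():
            if dv + w < d[x]:           # relax edge (v, x)
                d[x] = dv + w
                heapq.heappush(heap, (d[x], x))
    return d

graph = {"u": {"a": 1, "b": 5, "d": 9}, "a": {"c": 2},
         "b": {}, "c": {"b": 2, "d": 4}, "d": {}}
print(dijkstra(graph, "u"))  # {'u': 0, 'a': 1, 'b': 5, 'c': 3, 'd': 7}
```

Leaving stale entries in the heap and skipping them on extraction is the standard way to implement "update the length of a path" without a decrease-key operation.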

Dijkstra's Single-source shortest path

- Complexity of Dijkstra's algorithm:
- Steps 1 and 2 take Θ(1) time.
- Steps 3 to 5 take O(|V|) time.
- The vertices are arranged in a heap in order of their paths from u.
- Updating the length of a path takes O(log V) time.
- There are |V| iterations, and at most |E| updates.
- Therefore the algorithm takes O((|E| + |V|) log |V|) time.

Minimum Spanning Tree (MST)

- Spanning tree of a connected graph G: a connected acyclic subgraph of G that includes all of G's vertices.
- Minimum spanning tree of a weighted, connected graph G: a spanning tree of G of minimum total weight.
- A complete graph on n nodes has n^(n−2) spanning trees (n = number of nodes), so trying them all is hopeless.
- Example:

Prims MST algorithm

- Start with tree T1 consisting of one (any) vertex and grow the tree one vertex at a time to produce an MST through a series of expanding subtrees T1, T2, ..., Tn.
- On each iteration, construct Ti+1 from Ti by adding the vertex not in Ti that is closest to those already in Ti (this is a greedy step!).
- Stop when all vertices are included.

Example

Notes about Prims algorithm

- Proof by induction that this construction actually yields an MST (CLRS, Ch. 23.1).
- Efficiency:
- O(n²) for weight-matrix representation of the graph and array implementation of the priority queue.
- O(m log n) for adjacency-list representation of a graph with n vertices and m edges and min-heap implementation of the priority queue.
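A min-heap sketch of Prim's algorithm (the O(m log n) variant); the example graph reuses the edge costs from the Kruskal example later in these slides:

```python
import heapq

def prim_mst(graph, start):
    """Prim's algorithm: grow one tree, always adding the lowest-cost
    edge between the tree and a vertex not yet in it."""
    in_tree = {start}
    frontier = [(w, start, v) for v, w in graph[start].items()]
    heapq.heapify(frontier)
    mst, total = [], 0
    while frontier and len(in_tree) < len(graph):
        w, u, v = heapq.heappop(frontier)   # cheapest crossing edge
        if v in in_tree:
            continue                        # stale entry; skip it
        in_tree.add(v)
        mst.append((u, v))
        total += w
        for x, wx in graph[v].items():
            if x not in in_tree:
                heapq.heappush(frontier, (wx, v, x))
    return mst, total

graph = {"A": {"B": 3, "D": 1, "F": 3}, "B": {"A": 3, "C": 6, "E": 4},
         "C": {"B": 6, "D": 1, "F": 2}, "D": {"A": 1, "C": 1, "E": 5},
         "E": {"B": 4, "D": 5, "F": 2}, "F": {"A": 3, "C": 2, "E": 2}}
mst, total = prim_mst(graph, "A")
print(total)  # 9
```

Prim and Kruskal find trees of the same total weight on this graph, as any two MST algorithms must.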

Kruskal's MST

- The algorithm maintains a collection VS of disjoint sets of vertices.
- Each set W in VS represents a connected set of vertices forming a spanning tree.
- Initially, each vertex is in a set by itself in VS.
- Edges are chosen from E in order of increasing cost; we consider each edge (v, w) in turn, v, w ∈ V.
- If v and w are already in the same set (say W) of VS, we discard the edge.
- If v and w are in distinct sets W1 and W2 (meaning v and/or w is not in T), we merge W1 with W2 and add (v, w) to T.

Kruskal's Algorithm

- Procedure MCST_G(V, E)
- (Kruskal's Algorithm)
- Input: an undirected graph G(V, E) with a cost function c on the edges
- Output: T, the minimum-cost spanning tree for G
- T ← ∅
- VS ← ∅
- for each vertex v ∈ V do
- VS ← VS ∪ {{v}}
- sort the edges of E in nondecreasing order of weight
- while |VS| > 1 do
- choose (v, w), an edge in E of lowest cost
- delete (v, w) from E
- if v and w are in different sets W1 and W2 in VS then
- W1 ← W1 ∪ W2
- VS ← VS − {W2}
- T ← T ∪ {(v, w)}
- return T
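A compact Python version of the same steps, with a disjoint-set forest standing in for VS (the edge list is the example graph from these slides):

```python
def kruskal_mst(vertices, edges):
    """Kruskal's algorithm: scan edges in nondecreasing order of cost,
    keeping an edge only if it joins two different sets."""
    parent = {v: v for v in vertices}       # each vertex in a set by itself

    def find(v):                            # root of the set containing v
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path halving
            v = parent[v]
        return v

    tree = []
    for cost, v, w in sorted(edges):        # nondecreasing order of cost
        rv, rw = find(v), find(w)
        if rv != rw:                        # different sets: merge, keep edge
            parent[rw] = rv
            tree.append((v, w))
    return tree

edges = [(1, "A", "D"), (1, "C", "D"), (2, "C", "F"), (2, "E", "F"),
         (3, "A", "F"), (3, "A", "B"), (4, "B", "E"), (5, "D", "E"),
         (6, "B", "C")]
print(kruskal_mst("ABCDEF", edges))
# [('A', 'D'), ('C', 'D'), ('C', 'F'), ('E', 'F'), ('A', 'B')]
```

This yields the same spanning tree as the trace on the next slide; only the order in which the two cost-3 edges are considered differs, and both are resolved the same way.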

Kruskal's MST: Example

- Consider the example graph shown earlier. The edges in nondecreasing order of cost:
- (A,D):1, (C,D):1, (C,F):2, (E,F):2, (A,F):3, (A,B):3, (B,E):4, (D,E):5, (B,C):6

Edge | Action | Sets in VS | Spanning tree T
(start) | | {A},{B},{C},{D},{E},{F} | ∅
(A,D) | merge | {A,D},{B},{C},{E},{F} | {(A,D)}
(C,D) | merge | {A,C,D},{B},{E},{F} | {(A,D),(C,D)}
(C,F) | merge | {A,C,D,F},{B},{E} | {(A,D),(C,D),(C,F)}
(E,F) | merge | {A,C,D,E,F},{B} | {(A,D),(C,D),(C,F),(E,F)}
(A,F) | reject | {A,C,D,E,F},{B} | {(A,D),(C,D),(C,F),(E,F)}
(A,B) | merge | {A,B,C,D,E,F} | {(A,D),(A,B),(C,D),(C,F),(E,F)}
(B,E) | reject; (D,E) | reject; (B,C) | reject

Kruskal's MST: Complexity

- Steps 1 through 4 take O(|V|) time.
- Step 5 sorts the edges in nondecreasing order in O(|E| log |E|) time.
- Steps 6 through 13 take O(|E|) time.
- The total time for the algorithm is therefore O(|E| log |E|).
- Alternatively, the edges can be maintained in a heap data structure with the property E[PARENT(i)] ≤ E[i].
- This property can be used to extract the edges in nondecreasing order.
- Construct a heap of the edge weights; the edge with lowest cost is at the root. During each step of edge removal, delete the root (minimum element) from the heap and rearrange the heap.
- The use of a heap reduces the time taken because at every step we only pick the minimum (root) element rather than sorting all the edge weights.

Minimum-Cost Spanning Trees (Application)

- Consider a network of computers connected through bidirectional links. Each link is associated with a positive cost: the cost of sending a message on that link.
- This network can be represented by an undirected graph with positive costs on each edge.
- In bidirectional networks we can assume that the cost of sending a message on a link does not depend on the direction.
- Suppose we want to broadcast a message to all the computers from an arbitrary computer.
- The cost of the broadcast is the sum of the costs of the links used to forward the message.

MST (MCST) Application

Activity Selection Problem

- Problem formulation:
- Given a set of n activities, S = {a1, a2, ..., an}, that require exclusive use of a common resource, find the largest possible set of nonoverlapping (also called mutually compatible) activities.
- For example, scheduling the use of a classroom.
- Assume that ai needs the resource during the half-open interval [si, fi), where
- si = start time of the activity, and
- fi = finish time of the activity.
- Note: we could have many other objectives:
- Schedule the room for the longest time.
- Maximize income from rental fees.

Activity Selection Problem Example

- Assume the following set S of activities, sorted by their finish times; find a maximum-size mutually compatible set.

i  1 2 3 4 5 6  7  8  9
si 1 2 4 1 5 8  9  11 13
fi 3 5 7 8 9 10 11 14 16

Solving the Activity Selection Problem

- Define Si,j = { ak ∈ S : fi ≤ sk < fk ≤ sj }
- activities that start after ai finishes and finish before aj starts
- Activities in Si,j are compatible with:
- ai and aj
- any aw that finishes not later than ai, and any aw that starts not earlier than aj
- Add the following imaginary activities:
- a0 = [−∞, 0) and an+1 = [∞, ∞ + 1)
- Hence, S = S0,n+1 and the range of Si,j is 0 ≤ i, j ≤ n + 1

Solving the Activity Selection Problem

- Assume that activities are sorted by monotonically increasing finish time:
- i.e., f0 ≤ f1 ≤ f2 ≤ ... ≤ fn < fn+1
- Then, Si,j = ∅ for i ≥ j.
- Proof: if some ak ∈ Si,j with i ≥ j, then fi ≤ sk < fk ≤ sj < fj ≤ fi, a contradiction.
- Therefore, we only need to worry about Si,j where 0 ≤ i < j ≤ n + 1.

Solving the Activity Selection Problem

- Suppose that a solution to Si,j includes ak. We have 2 subproblems:
- Si,k (start after ai finishes, finish before ak starts)
- Sk,j (start after ak finishes, finish before aj starts)
- The solution to Si,j is:
- (solution to Si,k) ∪ {ak} ∪ (solution to Sk,j)
- Since ak is in neither subproblem, and the subproblems are disjoint,
- |solution to S| = |solution to Si,k| + 1 + |solution to Sk,j|

Recursive Solution

- Let Ai,j = an optimal solution to Si,j.
- So Ai,j = Ai,k ∪ {ak} ∪ Ak,j, assuming:
- Si,j is nonempty, and
- we know ak.
- Hence, let c[i,j] = the number of activities in a maximum subset of mutually compatible activities of Si,j:
- c[i,j] = 0 if Si,j = ∅, and c[i,j] = max { c[i,k] + c[k,j] + 1 : ak ∈ Si,j } otherwise.

Finding the Greedy Algorithm

- Theorem: Let Si,j ≠ ∅, and let am be the activity in Si,j with the earliest finish time: fm = min { fk : ak ∈ Si,j }. Then:
- am is used in some maximum-size subset of mutually compatible activities of Si,j.
- Si,m = ∅, so choosing am leaves Sm,j as the only nonempty subproblem.

Recursive Greedy Algorithm

Example

Iterative Greedy Algorithm

- Running time: Θ(sorting) + Θ(n) = Θ(n log n)
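The iterative greedy algorithm is short enough to write out in full; after sorting by finish time it is a single Θ(n) scan (indices are 1-based to match the slides' table):

```python
def greedy_activity_selector(s, f):
    """Iterative greedy activity selection.  s and f are start/finish
    times already sorted by finish time; returns 1-based indices."""
    selected = [1]                  # a1 (earliest finish) is always safe
    last_finish = f[0]
    for k in range(1, len(s)):
        if s[k] >= last_finish:     # compatible with the last pick
            selected.append(k + 1)
            last_finish = f[k]
    return selected

s = [1, 2, 4, 1, 5, 8, 9, 11, 13]   # the slides' example
f = [3, 5, 7, 8, 9, 10, 11, 14, 16]
print(greedy_activity_selector(s, f))  # [1, 3, 6, 8]
```

On the slides' example this selects activities a1, a3, a6, and a8, a maximum-size mutually compatible set of four activities.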

Collecting coins

- A board has a certain number of coins on it.
- A robot starts in the upper-left corner, and walks to the bottom right-hand corner.
- The robot can only move in two directions: right and down.
- The robot collects coins as it goes.
- You want to collect all the coins using the minimum number of robots.
- Example:

- Do you see a greedy algorithm for doing this?
- Does the algorithm guarantee an optimal solution?
- Can you prove it?
- Can you find a counterexample?