Approximation Algorithms

- Load Balancing
- k-center selection
- Pricing Method
- Vertex Cover
- Set Cover
- Bin Packing
- TSP

Approximation Algorithms

- Q. Suppose I need to solve an NP-hard problem. What should I do?
- A. Theory says you're unlikely to find a poly-time algorithm.
- Must sacrifice one of three desired features.
  - Solve problem to optimality.
  - Solve problem in poly-time.
  - Solve arbitrary instances of the problem.
- ρ-approximation algorithm.
  - Guaranteed to run in poly-time.
  - Guaranteed to solve arbitrary instance of the problem.
  - Guaranteed to find solution within ratio ρ of true optimum.
- Challenge. Need to prove a solution's value is close to optimum, without even knowing what the optimum value is!

11.1 Load Balancing

Load Balancing

- Input. m identical machines; n jobs, job j has processing time $t_j$.
  - Job j must run contiguously on one machine.
  - A machine can process at most one job at a time.
- Def. Let J(i) be the subset of jobs assigned to machine i. The load of machine i is $L_i = \sum_{j \in J(i)} t_j$.
- Def. The makespan is the maximum load on any machine, $L = \max_i L_i$.
- Load balancing. Assign each job to a machine to minimize makespan.

Load Balancing: List Scheduling

- List-scheduling algorithm.
  - Consider n jobs in some fixed order.
  - Assign job j to machine whose load is smallest so far.
- Implementation. O(n log n) using a priority queue.

    List-Scheduling(m, n, t1, t2, ..., tn)
      for i = 1 to m
        Li ← 0                  (load on machine i)
        J(i) ← ∅                (jobs assigned to machine i)
      for j = 1 to n
        i = argmin_k Lk         (machine i has smallest load)
        J(i) ← J(i) ∪ {j}       (assign job j to machine i)
        Li ← Li + tj            (update load of machine i)
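The List-Scheduling procedure can be sketched in Python roughly as follows. This is a minimal illustration, not the slides' own code: the function name `list_scheduling`, the use of `heapq` as the priority queue, and the sample job list are all assumptions made for the example.

```python
import heapq

def list_scheduling(m, times):
    """Greedy list scheduling: give each job to the currently least-loaded machine."""
    heap = [(0, i) for i in range(m)]      # (load, machine index); already a valid min-heap
    assignment = [[] for _ in range(m)]    # J(i): jobs assigned to machine i
    for j, t in enumerate(times):
        load, i = heapq.heappop(heap)      # machine i has the smallest load
        assignment[i].append(j)            # assign job j to machine i
        heapq.heappush(heap, (load + t, i))  # update load of machine i
    loads = [0] * m
    for load, i in heap:
        loads[i] = load
    return loads, assignment

# Hypothetical instance: 3 machines, 6 jobs.
loads, jobs = list_scheduling(3, [2, 3, 4, 6, 2, 2])
print("makespan:", max(loads))
```

Ties between equally loaded machines are broken by machine index here; any tie-breaking rule is compatible with the analysis.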

Load Balancing: List Scheduling Analysis

- Theorem. [Graham, 1966] Greedy algorithm is a 2-approximation.
  - First worst-case analysis of an approximation algorithm.
  - Need to compare resulting solution with optimal makespan $L^*$.
- Lemma 1. The optimal makespan $L^* \ge \max_j t_j$.
- Pf. Some machine must process the most time-consuming job. ∎
- Lemma 2. The optimal makespan $L^* \ge \frac{1}{m} \sum_j t_j$.
- Pf.
  - The total processing time is $\sum_j t_j$.
  - One of m machines must do at least a 1/m fraction of total work. ∎

Load Balancing: List Scheduling Analysis

- Theorem. Greedy algorithm is a 2-approximation.
- Pf. Consider load $L_i$ of bottleneck machine i.
  - Let j be last job scheduled on machine i.
  - When job j assigned to machine i, i had smallest load. Its load before assignment is $L_i - t_j$, so $L_i - t_j \le L_k$ for all $1 \le k \le m$.
  - Sum inequalities over all k and divide by m, with $j'$ ranging over all jobs: $L_i - t_j \le \frac{1}{m} \sum_k L_k = \frac{1}{m} \sum_{j'} t_{j'} \le L^*$ (the last step is Lemma 2).
  - Now $L_i = (L_i - t_j) + t_j \le L^* + L^* = 2L^*$, using $t_j \le L^*$ (Lemma 1). ∎

[Figure: timeline of machine i from 0 to $L = L_i$; the jobs scheduled before j fill the interval up to $L_i - t_j$, and job j occupies the rest.]

Load Balancing: List Scheduling Analysis

- Q. Is our analysis tight?
- A. Essentially yes.
- Ex: m machines, m(m-1) jobs of length 1, followed by one job of length m.
  - With m = 10, list scheduling spreads the 90 unit jobs evenly (load 9 on every machine) and then adds the long job to one of them, so machines 2 through 10 sit idle while machine 1 finishes: list scheduling makespan = 19.
  - The optimal schedule puts the long job alone on one machine and 10 unit jobs on each of the other 9: optimal makespan = 10.

Load Balancing: LPT Rule

- Longest processing time (LPT). Sort n jobs in descending order of processing time, and then run list scheduling algorithm.

    LPT-List-Scheduling(m, n, t1, t2, ..., tn)
      Sort jobs so that t1 ≥ t2 ≥ ... ≥ tn
      for i = 1 to m
        Li ← 0                  (load on machine i)
        J(i) ← ∅                (jobs assigned to machine i)
      for j = 1 to n
        i = argmin_k Lk         (machine i has smallest load)
        J(i) ← J(i) ∪ {j}       (assign job j to machine i)
        Li ← Li + tj            (update load of machine i)
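A compact Python sketch of the LPT rule, tracking only the machine loads, might look like this. The name `lpt_makespan` and the test instance (the earlier bad case for plain list scheduling with m = 10) are assumptions for illustration.

```python
import heapq

def lpt_makespan(m, times):
    """LPT rule: sort jobs by decreasing processing time, then list-schedule them."""
    loads = [0] * m                          # a list of zeros is a valid min-heap
    for t in sorted(times, reverse=True):
        smallest = heapq.heappop(loads)      # smallest current load
        heapq.heappush(loads, smallest + t)  # that machine takes the job
    return max(loads)

# 90 unit jobs plus one job of length 10, on 10 machines.
print(lpt_makespan(10, [1] * 90 + [10]))
```

Because the long job is placed first, the unit jobs fill the other nine machines evenly, so on this instance sorting removes the gap that list scheduling suffers in the given order.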

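Load Balancing: LPT Rule

- Observation. If at most m jobs, then list-scheduling is optimal.
- Pf. Each job put on its own machine. ∎
- Lemma 3. If there are more than m jobs, $L^* \ge 2 t_{m+1}$.
- Pf.
  - Consider first m+1 jobs $t_1, \ldots, t_{m+1}$.
  - Since the $t_i$'s are in descending order, each takes at least $t_{m+1}$ time.
  - There are m+1 jobs and m machines, so by pigeonhole principle, at least one machine gets two jobs. ∎
- Theorem. LPT rule is a 3/2-approximation algorithm.
- Pf. Same basic approach as for list scheduling. By the observation, we can assume the number of jobs is greater than m. If the bottleneck machine i holds a single job, then $L_i \le L^*$ by Lemma 1. Otherwise its last job j has index $j \ge m+1$, since the first m jobs go to m distinct machines, so $t_j \le t_{m+1} \le \frac{1}{2} L^*$ (Lemma 3) and
  $L_i = (L_i - t_j) + t_j \le L^* + \frac{1}{2} L^* = \frac{3}{2} L^*$. ∎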

Load Balancing: LPT Rule

- Q. Is our 3/2 analysis tight?
- A. No.
- Theorem. [Graham, 1969] LPT rule is a 4/3-approximation.
- Pf. More sophisticated analysis of same algorithm.
- Q. Is Graham's 4/3 analysis tight?
- A. Essentially yes.
- Ex: m machines, n = 2m+1 jobs: 2 jobs each of length m, m+1, ..., 2m-1 and one more job of length m.

11.2 Center Selection

Center Selection Problem

- Input. Set of n sites $s_1, \ldots, s_n$.
- Center selection problem. Select k centers C so that maximum distance from a site to nearest center is minimized.
- Notation.
  - dist(x, y) = distance between x and y.
  - $\mathrm{dist}(s_i, C) = \min_{c \in C} \mathrm{dist}(s_i, c)$ = distance from $s_i$ to closest center.
  - $r(C) = \max_i \mathrm{dist}(s_i, C)$ = smallest covering radius.
- Goal. Find set of centers C that minimizes r(C), subject to |C| = k.
- Distance function properties.
  - dist(x, x) = 0 (identity)
  - dist(x, y) = dist(y, x) (symmetry)
  - dist(x, y) ≤ dist(x, z) + dist(z, y) (triangle inequality)

[Figure: sites in the plane covered by k = 4 centers.]

Center Selection: Example

- Ex: each site is a point in the plane, a center can be any point in the plane, dist(x, y) = Euclidean distance.
- Remark: the search space can be infinite!

[Figure: sites and centers in the plane, with the covering radius r(C) marked.]

Greedy Algorithm: A False Start

- Greedy algorithm. Put the first center at the best possible location for a single center, and then keep adding centers so as to reduce the covering radius each time by as much as possible.
- Remark: arbitrarily bad!

[Figure: sites, k = 2 centers, and the position of greedy center 1.]

Center Selection: Greedy Algorithm

- Greedy algorithm. Repeatedly choose the next center to be the site farthest from any existing center.
- Observation. Upon termination all centers in C are pairwise at least r(C) apart.
- Pf. By construction of algorithm.

    Greedy-Center-Selection(k, n, s1, s2, ..., sn)
      C ← ∅
      repeat k times
        Select a site si with maximum dist(si, C)     (site farthest from any center)
        Add si to C
      return C
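As a rough Python sketch of Greedy-Center-Selection for points in the plane: the function name `greedy_centers`, the choice of the first site as the starting center (the pseudocode leaves the first pick arbitrary), and the sample coordinates are assumptions, and the sketch assumes k is at most the number of sites.

```python
import math

def greedy_centers(sites, k):
    """Farthest-first greedy: returns k centers chosen among the sites."""
    centers = [sites[0]]  # assumption: start from the first site (any site works)
    # dist_to_c[i] = distance from sites[i] to its nearest chosen center
    dist_to_c = [math.dist(s, centers[0]) for s in sites]
    while len(centers) < k:
        far = max(range(len(sites)), key=lambda i: dist_to_c[i])  # farthest site
        centers.append(sites[far])
        dist_to_c = [min(d, math.dist(s, sites[far]))
                     for s, d in zip(sites, dist_to_c)]
    return centers, max(dist_to_c)  # chosen centers and covering radius r(C)

# Hypothetical sites: two pairs of nearby points and one outlier.
sites = [(0, 0), (1, 0), (10, 0), (11, 0), (5, 8)]
centers, radius = greedy_centers(sites, 3)
print(centers, radius)
```

Keeping the running array of nearest-center distances makes each round linear in the number of sites, so the whole selection takes O(nk) distance evaluations.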

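Center Selection: Analysis of Greedy Algorithm

- Theorem. Let C* be an optimal set of centers. Then $r(C) \le 2r(C^*)$.
- Pf. (by contradiction) Assume $r(C^*) < \frac{1}{2} r(C)$.
  - For each site $c_i$ in C, consider ball of radius $\frac{1}{2} r(C)$ around it.
  - Exactly one $c_i^*$ in each ball; let $c_i$ be the site paired with $c_i^*$.
  - Consider any site s and its closest center $c_i^*$ in C*.
  - $\mathrm{dist}(s, C) \le \mathrm{dist}(s, c_i) \le \mathrm{dist}(s, c_i^*) + \mathrm{dist}(c_i^*, c_i) \le 2r(C^*)$. The middle step is the triangle inequality; each of the two terms is at most $r(C^*)$ since $c_i^*$ is the closest center.
  - Thus $r(C) \le 2r(C^*)$, a contradiction. ∎

[Figure: balls of radius $\frac{1}{2} r(C)$ around the centers $c_i$ of C, each containing one optimal center $c_i^*$, with a site s.]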

Center Selection

- Theorem. Let C* be an optimal set of centers. Then $r(C) \le 2r(C^*)$.
- Theorem. Greedy algorithm is a 2-approximation for center selection problem.
- Remark. Greedy algorithm always places centers at sites, but is still within a factor of 2 of best solution that is allowed to place centers anywhere (e.g., points in the plane).
- Question. Is there hope of a 3/2-approximation? 4/3?
- Theorem. Unless P = NP, there is no ρ-approximation for center-selection problem for any ρ < 2.

11.4 The Pricing Method: Vertex Cover

Weighted Vertex Cover

- Weighted vertex cover. Given a graph G with vertex weights, find a vertex cover of minimum weight.

[Figure: a graph with vertex weights 2, 4, 2, 9, shown with two vertex covers: one of weight 2 + 2 + 4 and one of weight 9.]

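Weighted Vertex Cover

- Pricing method. Each edge must be covered by some vertex i. Edge e pays price $p_e \ge 0$ to use vertex i.
- Fairness. Edges incident to vertex i should pay at most $w_i$ in total: for each vertex i, $\sum_{e = (i, j)} p_e \le w_i$.
- Claim. For any vertex cover S and any fair prices $p_e$: $\sum_e p_e \le w(S)$.
- Proof. $\sum_{e \in E} p_e \le \sum_{i \in S} \sum_{e = (i, j)} p_e \le \sum_{i \in S} w_i = w(S)$. The first inequality holds because each edge e is covered by at least one node in S; the second sums the fairness inequalities for each node in S. ∎

[Figure: example graph with vertex weights 4, 2, 9, 2.]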

Pricing Method

- Pricing method. Set prices and find vertex cover simultaneously.
- A node i is tight when the edges incident to it pay exactly its weight: $\sum_{e = (i, j)} p_e = w_i$.

    Weighted-Vertex-Cover-Approx(G, w)
      foreach e in E
        pe = 0
      while (∃ edge i-j such that neither i nor j are tight)
        select such an edge e
        increase pe as much as possible without violating fairness
      S ← set of all tight nodes
      return S
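One way to sketch the pricing method in Python is a single pass over the edges, raising each price until an endpoint becomes tight. The function name `pricing_vertex_cover`, the fixed edge order, the integer weights (so that tightness can be tested with `==`), and the small example graph are assumptions for illustration, not taken from Figure 11.8.

```python
def pricing_vertex_cover(weights, edges):
    """Pricing method: raise edge prices until every edge has a tight endpoint."""
    paid = {v: 0 for v in weights}   # total price paid by edges incident to v
    price = {}
    for u, v in edges:               # assumption: edges are processed in the given order
        # raise p_e as far as fairness allows; this is 0 if an endpoint is already tight
        delta = min(weights[u] - paid[u], weights[v] - paid[v])
        price[(u, v)] = delta
        paid[u] += delta
        paid[v] += delta
    cover = {v for v in weights if paid[v] == weights[v]}  # tight nodes
    return cover, price

# Hypothetical graph with integer vertex weights.
weights = {"a": 4, "b": 3, "c": 3, "d": 5}
edges = [("a", "b"), ("a", "d"), ("c", "d")]
cover, price = pricing_vertex_cover(weights, edges)
print(sorted(cover), sum(price.values()))
```

A single pass suffices because once an edge has been processed one of its endpoints is tight, and a tight node stays tight since prices never decrease.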

Pricing Method

[Figure 11.8: example run of the pricing method on a small graph, labeled with the price of each edge (e.g. edge a-b) and the vertex weights.]

Pricing Method: Analysis

- Theorem. Pricing method is a 2-approximation.
- Pf.
  - Algorithm terminates since at least one new node becomes tight after each iteration of while loop.
  - Let S = set of all tight nodes upon termination of algorithm. S is a vertex cover: if some edge i-j is uncovered, then neither i nor j is tight. But then while loop would not terminate.
  - Let S* be optimal vertex cover. We show $w(S) \le 2w(S^*)$:
    $w(S) = \sum_{i \in S} w_i = \sum_{i \in S} \sum_{e = (i, j)} p_e \le \sum_{i \in V} \sum_{e = (i, j)} p_e = 2 \sum_{e \in E} p_e \le 2w(S^*)$.
    The steps use, in order: all nodes in S are tight; $S \subseteq V$ and prices ≥ 0; each edge is counted twice; the fairness lemma. ∎

Extra Slides

Load Balancing on 2 Machines

- Claim. Load balancing is hard even if only 2 machines.
- Pf. NUMBER-PARTITIONING $\le_P$ LOAD-BALANCE. (NUMBER-PARTITIONING is NP-complete by Exercise 8.26.)

[Figure: jobs a through g drawn with lengths equal to their processing times; a yes-instance in which machine 1 runs a, d, f and machine 2 runs b, c, e, g over the time interval from 0 to L.]

Center Selection: Hardness of Approximation

- Theorem. Unless P = NP, there is no ρ-approximation algorithm for metric k-center problem for any ρ < 2.
- Pf. We show how we could use a (2 - ε)-approximation algorithm for k-center to solve DOMINATING-SET (see Exercise 8.29) in poly-time.
  - Let G = (V, E), k be an instance of DOMINATING-SET.
  - Construct instance G' of k-center with sites V and distances
    - d(u, v) = 1 if (u, v) ∈ E
    - d(u, v) = 2 if (u, v) ∉ E
  - Note that G' satisfies the triangle inequality.
  - Claim: G has dominating set of size k iff there exist k centers C* with r(C*) = 1.
  - Thus, if G has a dominating set of size k, a (2 - ε)-approximation algorithm on G' must find a solution C* with r(C*) = 1 since it cannot use any edge of distance 2.

Vertex Cover Approximation

- A vertex cover is a subset of vertices such that every edge in the graph is incident to at least one of these vertices.
- The vertex cover optimization problem is to find a vertex cover of minimum size.
- No efficient exact algorithm is known, so we look for a good heuristic.

Vertex Cover

- Consider an arbitrary edge (u, v) in the graph. One of its two vertices must be in the cover, but we do not know which one.
- The idea of this heuristic is to simply put both vertices into the vertex cover.
- Then we remove all edges that are incident to u and v (since they are now all covered), and recurse on the remaining edges.
- For every one vertex that must be in the cover, we put two into our cover, so it is easy to see that the cover we generate is at most twice the size of the optimum cover.

Proof of approximation ratio

- Claim: ApproxVC yields a factor-2 approximation.
- Proof: Consider the set C output by ApproxVC. Let C* be the optimum VC. Let A be the set of edges selected by the marked line of ApproxVC (the line that picks an edge (u, v)). Observe that the size of C is exactly 2|A| because we add two vertices for each such edge. However, note that in the optimum VC one of these two vertices must have been added to the VC, and since no two edges of A share a vertex, the size of C* is at least |A|. Thus we have
  $\frac{|C|}{2} = |A| \le |C^*|$.
- Therefore $\frac{|C|}{|C^*|} \le 2$.

Example

Approximate VC Algorithm: Naive Approach

    ApproxVC
      C = empty-set
      while (E is nonempty) do
    (*)   let (u,v) be any edge of E
        add both u and v to C
        remove from E all edges incident to either u or v
      return C

- Can we improve on it?
- Why not consider vertices with higher degrees first (Greedy Strategy)?
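For reference, ApproxVC can be sketched in a few lines of Python. The name `approx_vc` and the path-graph example are assumptions; scanning the edge list once and skipping edges that already touch the cover plays the role of removing covered edges from E.

```python
def approx_vc(edges):
    """2-for-1 heuristic: take both endpoints of any edge that is still uncovered."""
    cover = set()
    for u, v in edges:
        if u not in cover and v not in cover:  # edge (u, v) is still in E
            cover.update((u, v))               # add both u and v to C
    return cover

# Path 1-2-3-4: the heuristic returns all four vertices, while {2, 3} is optimal.
print(sorted(approx_vc([(1, 2), (2, 3), (3, 4)])))
```

On this path the output is exactly twice the optimum, so the factor of 2 cannot be improved for this heuristic.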

Greedy VC

- Greedy approximation for VC:

    GreedyVC(G = (V, E))
      C = empty-set
      while (E is nonempty) do
        let u be the vertex of maximum degree in G
        add u to C
        remove from E all edges incident to u
      return C

- For the example, it yields the optimum solution.
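A straightforward (unoptimized) Python sketch of GreedyVC follows. The name `greedy_vc` and the star-graph example are assumptions; the sketch recomputes degrees in every round and assumes a simple graph with no duplicate edges.

```python
def greedy_vc(edges):
    """Repeatedly add a maximum-degree vertex and delete its incident edges."""
    remaining = set(edges)
    cover = set()
    while remaining:
        degree = {}
        for u, v in remaining:
            degree[u] = degree.get(u, 0) + 1
            degree[v] = degree.get(v, 0) + 1
        best = max(degree, key=degree.get)  # ties broken arbitrarily
        cover.add(best)
        remaining = {e for e in remaining if best not in e}
    return cover

# Star graph: center 0 joined to leaves 1..4; greedy picks just the center.
print(greedy_vc([(0, 1), (0, 2), (0, 3), (0, 4)]))
```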

Greedy VC: Example

- Can we prove Greedy VC outperforms the other one?
- No!
- It can even perform worse than the 2-for-1 heuristic.
- However, it should also be pointed out that the vertex cover constructed by the greedy heuristic is (for typical graphs) smaller than the one computed by the 2-for-1 heuristic, so it would probably be wise to run both algorithms and take the better of the two.

Third Attempt: Use Matching

- A matching is a subset of edges that have no vertices in common.
- A matching is maximal if no more edges can be added to it.
- Maximal matchings will help us find good vertex covers, and moreover, they are easy to generate: repeatedly pick edges that are disjoint from the ones chosen already, until this is no longer possible.
- Any vertex cover of a graph G must be at least as large as the number of edges in any matching in G; that is, any matching provides a lower bound on OPT. This is simply because each edge of the matching must be covered by one of its endpoints in any vertex cover!

Example

- The figure shows how to convert a maximal matching into a vertex cover: (a) a matching, (b) its completion to a maximal matching, (c) the resulting vertex cover.

Vertex Cover from Matching

- Let S be a set that contains both endpoints of each edge in a maximal matching M.
- Then S must be a vertex cover: if it isn't, that is, if it doesn't touch some edge e ∈ E, then M could not possibly be maximal since we could still add e to it. Our cover S has 2|M| vertices.
- We know that any vertex cover must have size at least |M|.
- Algorithm
  - Find a maximal matching M ⊆ E
  - Return S = all endpoints of edges in M

Vertex Cover from Matching

- This simple procedure always returns a vertex cover whose size is at most twice optimal!
- In summary, even though we have no way of finding the best vertex cover, we can easily find another structure, a maximal matching, with two key properties:
  - 1. Its size gives us a lower bound on the optimal vertex cover.
  - 2. It can be used to build a vertex cover, whose size can be related to that of the optimal cover using property 1.
- Approximation ratio α ≤ 2.

Set Cover Problem Revisited

- Given a pair (X, F) where X = {x1, x2, ..., xm} is a finite set (a domain of elements) and F = {S1, S2, ..., Sn} is a family of subsets of X, such that every element of X belongs to at least one set of F.
- Consider C ⊆ F. (This is a collection of sets over X.) We say that C covers the domain if every element of X is in some set of C.
- The problem is to find the minimum-sized subset C of F that covers X.

Set Cover

- Vertex Cover is a type of set cover problem. The domain to be covered is the set of edges, and each vertex covers the subset of incident edges.
- Decision-problem formulation of set cover (does there exist a set cover of size at most k?) is NP-complete.
- There is a factor-2 approximation for the vertex cover problem, but it cannot be applied to generate a factor-2 approximation for set cover.

Set Cover

- It is known that there is no constant factor approximation to the set cover problem, unless P = NP.
- There is however the greedy heuristic, which achieves an approximation bound of ln m, where m = |X| is the size of the underlying domain. We omit the proof.
- A simple greedy approach to set cover works by at each stage selecting the set that covers the greatest number of uncovered elements.

Set Cover: The Approx. Algorithm

    Greedy-Set-Cover(X, F)
      U = X                      // U are the items to be covered
      C = empty                  // C will be the sets in the cover
      while (U is nonempty)      // there is someone left to cover
        select S in F that covers the most elements of U
        add S to C
        U = U - S
      return C
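Greedy-Set-Cover translates almost line by line into Python sets. In this sketch the name `greedy_set_cover`, the convention of returning indices into the family, the first-index tie-breaking, and the small instance are assumptions for illustration.

```python
def greedy_set_cover(universe, family):
    """Greedy set cover: repeatedly pick the set covering the most uncovered elements."""
    uncovered = set(universe)   # U: items still to be covered
    chosen = []                 # C: indices of the chosen sets
    while uncovered:
        # set covering the most uncovered elements (first such set on ties)
        best = max(range(len(family)), key=lambda i: len(family[i] & uncovered))
        if not family[best] & uncovered:
            raise ValueError("family does not cover the universe")
        chosen.append(best)
        uncovered -= family[best]
    return chosen

# Hypothetical instance over X = {1, ..., 6}.
family = [{1, 2, 3, 4}, {1, 2, 5}, {3, 4, 6}, {5, 6}]
print(greedy_set_cover(range(1, 7), family))
```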

Set Cover: Bad Example

- The optimal set cover consists of sets S5 and S6, each of size 16. Initially all three sets S1, S5, and S6 have 16 elements. If ties are broken in the worst possible way, the greedy algorithm will first select set S1. We remove all the covered elements. Now S2, S5 and S6 all cover 8 of the remaining elements. Again, if we choose poorly, S2 is chosen. The pattern repeats, choosing S3 (size 4), S4 (size 2) and finally S5 and S6 (each now covering 1 remaining element). Greedy thus uses all six sets where two suffice.

Bin Packing

- Bin packing is another well-known NP-complete problem, which is a variant of the knapsack problem.
- Given a set of n objects, where $s_i$ denotes the size of the ith object ($0 < s_i < 1$, for simplification), put the objects into bins.
  - Each bin holds a total size of at most 1.
  - Use as few bins as possible.
- Ex: fitting objects into trucks, etc.

Bin Packing Example

Bin Packing: Approximation Factor

- Theorem: The first-fit heuristic (put each object into the first bin that has room for it, opening a new bin when none does) achieves a ratio bound of 2.
- Proof: Consider an instance $s_1, \ldots, s_n$ of the bin packing problem. Let $S = \sum_i s_i$ denote the sum of all the object sizes. Let $b^*$ denote the optimal number of bins, and $b_{ff}$ denote the number of bins used by first-fit.
- $b^* \ge S$, since no bin can hold a total capacity of more than 1 unit, and even if we were to fill each bin exactly to its capacity, we would need at least S bins.

Bin Packing: Analysis

- We claim that $b_{ff} \le 2S$.
- To see this, let $t_i$ denote the total size of the objects that first-fit puts into bin i.
- Consider bins i and i+1 filled by first-fit. Assume that indexing is cyclical, so if i is the last index ($i = b_{ff}$) then i+1 = 1.
- We claim that $t_i + t_{i+1} \ge 1$. If not, then the contents of bins i and i+1 could both be put into the same bin, and hence first-fit would never have started to fill the second bin, preferring to keep everything in the first bin. Thus we have
  $\sum_{i=1}^{b_{ff}} (t_i + t_{i+1}) \ge b_{ff}$.

Bin Packing: Analysis

- But this sum adds up all the elements twice, so it has a total value of 2S. Thus we have $2S \ge b_{ff}$.
- Combining this with the fact that $b^* \ge S$ we have $b_{ff} \le 2S \le 2b^*$, showing $b_{ff} / b^* \le 2$ as required.
- Other heuristics:
  - best fit, which attempts to put the object into the bin in which it fits most closely with the available space (approx. ratio 17/10)
  - first fit decreasing, in which the objects are first sorted in decreasing order of size (approx. ratio 11/9)
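The first-fit heuristic analyzed above can be sketched in Python as follows. The name `first_fit`, the `capacity` parameter, and the sample sizes (chosen as exact binary fractions to avoid floating-point surprises) are assumptions for illustration.

```python
def first_fit(sizes, capacity=1.0):
    """First-fit: put each object into the first open bin with enough room."""
    bins = []  # bins[i] = list of sizes placed in bin i
    for s in sizes:
        for b in bins:
            if sum(b) + s <= capacity:
                b.append(s)
                break
        else:                      # no open bin has room: start a new one
            bins.append([s])
    return bins

# Hypothetical instance with total size S = 2.25, so at least 3 bins are needed.
bins = first_fit([0.5, 0.75, 0.25, 0.5, 0.25])
print(len(bins), bins)
```

Sorting the sizes in decreasing order before calling the same function gives first fit decreasing.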

Traveling Salesman Problem (TSP)

- In the TSP, given a complete undirected graph with nonnegative edge weights, find a cycle that visits all vertices and is of minimum cost. (NP-complete)
- Distances should satisfy the triangle inequality: for all u, v, w ∈ V, $c(u, w) \le c(u, v) + c(v, w)$, where c(u, v) is the cost of edge uv or the cost of a shortest path.
- There is an approx. algorithm for TSP with a ratio of 2 (the tour that it produces cannot be worse than twice the cost of the optimal tour).

TSP: Observations

- A TSP tour with one edge removed is a spanning tree (not necessarily a minimum spanning tree).
- Therefore, the cost of the minimum TSP tour is at least as large as the cost of the MST.
- The MST can be computed efficiently, using, for example, either Kruskal's or Prim's algorithm.
- If we can find some way to convert the MST into a TSP tour while increasing its cost by at most a constant factor, then we will have an approximation for TSP.
- We will see that if the edge weights satisfy the triangle inequality, then this is possible.

TSP ? MST

- Given any free tree there is a tour of the tree called a twice-around tour that traverses the edges of the tree twice, once in each direction.

[Figure: four panels: MST, twice-around tour, tour with short-cuts, optimal TSP tour.]

TSP

- This path is not simple because it revisits vertices, but we can make it simple by short-cutting, that is, we skip over previously visited vertices.
- The final order in which vertices are visited using the short-cuts is exactly the same as a preorder traversal of the MST.
- The triangle inequality assures us that the path length will not increase when we take short-cuts.

Approximate Algorithm for TSP

    ApproxTSP(G = (V, E))
      T = minimum spanning tree for G
      r = any vertex
      L = list of vertices visited by a preorder walk of T starting with r
      return L
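A self-contained Python sketch of ApproxTSP for points in the plane is given below. It is an illustration under stated assumptions: the name `approx_tsp` is hypothetical, Euclidean distance stands in for the metric c, the MST is built with a simple O(n²) version of Prim's algorithm rooted at vertex 0, and the preorder walk uses an explicit stack.

```python
import math

def approx_tsp(points):
    """MST-based 2-approximation for metric TSP on points in the plane."""
    n = len(points)
    # Prim's algorithm from vertex 0; parent[v] is v's neighbour in the MST
    in_tree = [False] * n
    best = [math.inf] * n
    parent = [None] * n
    best[0] = 0.0
    children = [[] for _ in range(n)]
    for _ in range(n):
        u = min((v for v in range(n) if not in_tree[v]), key=lambda v: best[v])
        in_tree[u] = True
        if parent[u] is not None:
            children[parent[u]].append(u)
        for v in range(n):
            d = math.dist(points[u], points[v])
            if not in_tree[v] and d < best[v]:
                best[v], parent[v] = d, u
    # Preorder walk of the MST from the root gives the short-cut tour
    tour, stack = [], [0]
    while stack:
        u = stack.pop()
        tour.append(u)
        stack.extend(reversed(children[u]))
    return tour

# Corners of a unit square; the tour closes by returning to the first vertex.
print(approx_tsp([(0, 0), (0, 1), (1, 1), (1, 0)]))
```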

Approx-TSP: Analysis

- Claim: Approx-TSP has a ratio bound of 2.
- Proof: Let H denote the tour produced by this algorithm and let H* be the optimum tour. Let T be the minimum spanning tree.
- We can remove any edge of H*, resulting in a spanning tree, and since T is the minimum cost spanning tree we have $c(T) \le c(H^*)$.
- The twice-around tour of T has cost 2c(T), since every edge in T is hit twice. By the triangle inequality, when we short-cut an edge of T to form H we do not increase the cost of the tour, and so we have $c(H) \le 2c(T)$.
- Combining these we have $c(H)/2 \le c(T) \le c(H^*)$, therefore $c(H)/c(H^*) \le 2$.

Graph Partitioning

- Input: An undirected graph G = (V, E) with nonnegative edge weights; a real number a ∈ (0, 1/2].
- Output: A partition of the vertices into two groups A and B, each of size at least a|V|.
- Goal: Minimize the capacity of the cut (A, B).
- Applications: from circuit layout to program analysis to image segmentation.
- Graph Partitioning is NP-hard.
- Removing the restriction on the sizes of A and B would give the MINIMUM CUT problem, which we know to be efficiently solvable using flow techniques.

Acknowledgements

- The last few algorithms are based on David Mount's 451 course, University of Waterloo.