Title: Approximation Algorithm
1 Approximation Algorithm
- Instructor: YE, Deshi
- yedeshi_at_zju.edu.cn
2 Dealing with Hard Problems
- What to do if none of the standard techniques
- Divide and conquer
- Dynamic programming
- Greedy
- Linear Programming/Network Flows
- ...
- gives a polynomial-time algorithm?
3 Dealing with Hard Problems
- Solution I: Ignore the problem
- Can't do it! There are thousands of problems for which we do not know polynomial-time algorithms
- For example
- Traveling Salesman Problem (TSP)
- Set Cover
4 Traveling Salesman Problem
- Traveling Salesman Problem (TSP)
- Input: undirected graph with lengths on edges
- Output: shortest cycle that visits each vertex exactly once
- Best known algorithm: $O(n 2^n)$ time.
5 The vertex-cover problem
- A vertex cover of an undirected graph $G = (V, E)$ is a subset $V' \subseteq V$ such that if $(u, v) \in E$, then $u \in V'$ or $v \in V'$ (or both).
- A vertex cover for G is a set of vertices that covers all the edges in E.
- As a decision problem, we define
- VERTEX-COVER = $\{\langle G, k \rangle$ : graph G has a vertex cover of size k$\}$.
- Best known algorithm: $O(kn + 1.274^k)$
6 Dealing with Hard Problems
- Exponential time algorithms for small inputs. E.g., $(100/99)^n$ time is not bad for $n < 1000$.
- Polynomial time algorithms for some (e.g., average-case) inputs
- Polynomial time algorithms for all inputs, but which return approximate solutions
7 Approximation Algorithms
- An algorithm A is $\rho$-approximate if, on any input of size n:
- The cost $C_A$ of the solution produced by the algorithm, and
- The cost $C_{OPT}$ of the optimal solution are such that $C_A \le \rho \cdot C_{OPT}$
- We will see
- 2-approximation algorithm for TSP in the plane
- 2-approximation algorithm for Vertex Cover
8 Comments on Approximation
- $C_A \le \rho \cdot C_{OPT}$ makes sense only for minimization problems
- For maximization problems, replace by
- $C_{OPT} \le \rho \cdot C_A$
- Additive approximation $C_A \le \rho + C_{OPT}$ also makes sense, although difficult to achieve
10 The vertex-cover problem
- A vertex cover of an undirected graph $G = (V, E)$ is a subset $V' \subseteq V$ such that if $(u, v) \in E$, then $u \in V'$ or $v \in V'$ (or both).
- A vertex cover for G is a set of vertices that covers all the edges in E.
- The goal is to find a vertex cover of minimum size in a given undirected graph G.
11 Naive Algorithm
- APPROX-VERTEX-COVER(G)
- 1 $C \leftarrow \emptyset$
- 2 $E' \leftarrow E[G]$
- 3 while $E' \ne \emptyset$
- 4 do let (u, v) be an arbitrary edge of E'
- 5 $C \leftarrow C \cup \{u, v\}$
- 6 remove from E' every edge incident on either u or v
- 7 return C
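The APPROX-VERTEX-COVER procedure above can be sketched in Python as follows. This is a minimal illustration, not the instructor's code: the function name `approx_vertex_cover`, the edge-list representation, and the small example graph are all assumptions made here, and "arbitrary edge" is taken to mean the first remaining edge.

```python
def approx_vertex_cover(edges):
    """Greedy 2-approximation: repeatedly take both endpoints of a remaining edge."""
    cover = set()
    remaining = list(edges)
    while remaining:
        u, v = remaining[0]          # an arbitrary edge of E' (here: the first one)
        cover.update((u, v))         # C <- C U {u, v}
        # Remove from E' every edge incident on either u or v.
        remaining = [(a, b) for (a, b) in remaining
                     if a not in (u, v) and b not in (u, v)]
    return cover


# Hypothetical example graph: path a-b-c-d plus the edge b-d.
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("b", "d")]
cover = approx_vertex_cover(edges)
```

On this graph the sketch picks edges (a, b) and (c, d) and returns all four vertices, while {b, c} is a minimum cover of size 2, so the factor of 2 is reached.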
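12 Illustration of Naive algorithm
[Figure: run of APPROX-VERTEX-COVER on the input graph. Edge bc is chosen first, giving C = {b, c}; edge ef is chosen in a later step.]
- Naive algorithm: C = {b, c, d, e, f, g}
- Optimal solution: {b, d, e}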
13 Approximation 2
- Theorem. APPROX-VERTEX-COVER is a 2-approximation algorithm.
- Pf. Let A denote the set of edges that were picked by APPROX-VERTEX-COVER, let C be the cover it returns, and let C* be an optimal cover.
- To cover the edges in A, any vertex cover, in particular an optimal cover C*, must include at least one endpoint of each edge in A.
- No two edges in A share an endpoint.
- Thus no two edges in A are covered by the same vertex from C*, and we have the lower bound
- $|C^*| \ge |A|$
- On the other hand, the algorithm picks an edge for which neither of its endpoints is already in C.
- $|C| = 2|A|$
- Hence, $|C| = 2|A| \le 2|C^*|$.
14 Vertex cover summary
- No better constant-factor approximation is known!
- More precisely, minimum vertex cover is known to be approximable within $2 - \frac{\log \log |V|}{2 \log |V|}$ (for a given $|V| \ge 2$) (ADM85)
- but cannot be approximated within 7/6 for any sufficiently large vertex degree (Håstad, STOC97)
- Dinur and Safra (STOC02): cannot be approximated within 1.36067
15 Vertex cover summary
- Eran Halperin, Improved Approximation Algorithms for the Vertex Cover Problem in Graphs and Hypergraphs, SIAM Journal on Computing, 31(5), 2002, 1608-1623.
- Tomokazu Imamura, Kazuo Iwama, Approximating vertex cover on dense graphs, Proceedings of the Sixteenth Annual ACM-SIAM Symposium on Discrete Algorithms, 2005.
16 The Traveling Salesman Problem
- Traveling Salesman Problem (TSP)
- Input: undirected graph $G = (V, E)$ with a cost c(u, v) associated with each edge $(u, v) \in E$
- Output: shortest cycle that visits each vertex exactly once
- Triangle inequality: for all vertices $u, v, w \in V$,
- $c(u, w) \le c(u, v) + c(v, w)$.
[Figure: triangle on the vertices u, v, w]
17 2-approximation for TSP with triangle inequality
- Compute MST T
- An edge between any pair of points
- Weight = distance between endpoints
- Compute a tree-walk W of T
- Each edge visited twice
- Convert W into a cycle H using shortcuts
18 Algorithm
- APPROX-TSP-TOUR(G, c)
- 1 select a vertex $r \in V[G]$ to be a "root" vertex
- 2 compute a minimum spanning tree T for G from root r using MST-PRIM(G, c, r)
- 3 let L be the list of vertices visited in a preorder tree walk of T
- 4 return the Hamiltonian cycle H that visits the vertices in the order L
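A possible Python rendering of APPROX-TSP-TOUR for points in the plane is sketched below. It assumes Euclidean distances (which satisfy the triangle inequality) and uses a simple $O(n^2)$ version of Prim's algorithm. The names `approx_tsp_tour` and `tour_cost`, the sample points, and the brute-force comparison are illustrative assumptions, not part of the slides.

```python
import itertools
import math


def approx_tsp_tour(points, root=0):
    """Return a visiting order: preorder walk of an MST rooted at `root`."""
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n        # cheapest known edge linking each vertex to the tree
    parent = [None] * n
    children = [[] for _ in range(n)]
    best[root] = 0.0

    # Prim's algorithm on the complete Euclidean graph, O(n^2).
    for _ in range(n):
        u = min((v for v in range(n) if not in_tree[v]), key=lambda v: best[v])
        in_tree[u] = True
        if parent[u] is not None:
            children[parent[u]].append(u)
        for v in range(n):
            d = math.dist(points[u], points[v])
            if not in_tree[v] and d < best[v]:
                best[v], parent[v] = d, u

    # Preorder walk of the MST: visit a vertex, then each of its subtrees.
    order, stack = [], [root]
    while stack:
        u = stack.pop()
        order.append(u)
        stack.extend(reversed(children[u]))
    return order


def tour_cost(points, order):
    """Length of the closed tour visiting the points in the given order."""
    return sum(
        math.dist(points[order[k]], points[order[(k + 1) % len(order)]])
        for k in range(len(order))
    )


points = [(0, 0), (0, 1), (1, 1), (1, 0), (2, 0)]   # hypothetical instance
tour = approx_tsp_tour(points)
approx = tour_cost(points, tour)
# Brute-force optimum for comparison; feasible only for tiny inputs.
optimal = min(
    tour_cost(points, (0,) + rest)
    for rest in itertools.permutations(range(1, len(points)))
)
```

Listing each vertex once in preorder is what implements the "shortcuts": repeated visits in the full tree walk are simply skipped.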
19 Preorder Traversal
- Preorder (root-left-right)
- Visit the root first and then
- traverse the left subtree and then
- traverse the right subtree.
- Example
[Figure: example tree whose preorder traversal visits the nodes in the order A, B, C, D, E, F, G, H, I]
20 Illustration
[Figure panels: MST; tree walk W; preorder walk (final solution H); OPT solution]
- A full walk of the tree visits the vertices in the order a, b, c, b, h, b, a, d, e, f, e, g, e, d, a.
21 2-approximation
- Theorem. APPROX-TSP-TOUR is a polynomial-time 2-approximation algorithm for the traveling-salesman problem with the triangle inequality.
- Pf. Let $C_{OPT}$ be the optimal cycle
- Cost(T) $\le$ Cost($C_{OPT}$)
- Removing an edge from $C_{OPT}$ gives a spanning tree, and T is a spanning tree of minimum cost
- Cost(W) = 2 Cost(T)
- Each edge visited twice
- Cost(H) $\le$ Cost(W)
- Triangle inequality
- Cost(H) $\le$ 2 Cost($C_{OPT}$)
22 Load Balancing
- Input. m identical machines; n jobs, job j has processing time $t_j$.
- Job j must run contiguously on one machine.
- A machine can process at most one job at a time.
- Def. Let J(i) be the subset of jobs assigned to machine i. The load of machine i is $L_i = \sum_{j \in J(i)} t_j$.
- Def. The makespan is the maximum load on any machine, $L = \max_i L_i$.
- Load balancing. Assign each job to a machine to minimize makespan.
23 Load Balancing: List Scheduling
- List-scheduling algorithm.
- Consider n jobs in some fixed order.
- Assign job j to the machine whose load is smallest so far.
- Implementation. O(n log m) using a priority queue keyed on machine load.
List-Scheduling(m, n, t_1, t_2, ..., t_n)
    for i = 1 to m
        L_i ← 0                  (load on machine i)
        J(i) ← ∅                 (jobs assigned to machine i)
    for j = 1 to n
        i = argmin_k L_k         (machine i has smallest load)
        J(i) ← J(i) ∪ {j}        (assign job j to machine i)
        L_i ← L_i + t_j          (update load of machine i)
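The List-Scheduling pseudocode above maps directly onto a binary heap of (load, machine) pairs. The following is a minimal sketch using Python's standard `heapq` module; the function name `list_scheduling` and the sample job lengths are assumptions chosen for illustration.

```python
import heapq


def list_scheduling(m, times):
    """Assign jobs in the given order to the currently least-loaded machine.

    Returns (makespan, assignment), where assignment[i] lists the job
    indices placed on machine i.
    """
    heap = [(0, i) for i in range(m)]        # (load L_i, machine i); already a valid heap
    assignment = [[] for _ in range(m)]      # J(i)
    for j, t in enumerate(times):
        load, i = heapq.heappop(heap)        # machine i has the smallest load
        assignment[i].append(j)              # assign job j to machine i
        heapq.heappush(heap, (load + t, i))  # update load of machine i
    makespan = max(load for load, _ in heap)
    return makespan, assignment


# Hypothetical instance: 3 machines, 6 jobs.
makespan, assignment = list_scheduling(3, [2, 3, 4, 6, 2, 2])
```

Each of the n jobs costs one pop and one push on a heap of m entries. On this instance the sketch yields makespan 8, with the long job of length 6 landing on top of an already loaded machine.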
24 Load Balancing: List Scheduling Analysis
- Theorem. [Graham, 1966] Greedy algorithm is a (2 - 1/m)-approximation.
- First worst-case analysis of an approximation algorithm.
- Need to compare resulting solution with optimal makespan L*.
- Lemma 1. The optimal makespan $L^* \ge \max_j t_j$.
- Pf. Some machine must process the most time-consuming job. ∎
- Lemma 2. The optimal makespan $L^* \ge \frac{1}{m} \sum_j t_j$.
- Pf.
- The total processing time is $\sum_j t_j$.
- One of m machines must do at least a 1/m fraction of total work. ∎
25 Load Balancing: List Scheduling Analysis
- Theorem. Greedy algorithm is a (2 - 1/m)-approximation.
- Pf. Consider load $L_i$ of bottleneck machine i.
- Let j be the last job scheduled on machine i.
- When job j was assigned to machine i, i had the smallest load. Its load before assignment is $L_i - t_j$, so $L_i - t_j \le L_k$ for all $1 \le k \le m$.
[Figure: timeline of machine i from 0 to $L = L_i$; the jobs scheduled before j fill the interval up to $L_i - t_j$, and job j occupies the rest]
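26 Load Balancing: List Scheduling Analysis
- Theorem. Greedy algorithm is a (2 - 1/m)-approximation.
- Pf. Consider load $L_i$ of bottleneck machine i.
- Let j be the last job scheduled on machine i.
- When job j was assigned to machine i, i had the smallest load. Its load before assignment is $L_i - t_j$, so $L_i - t_j \le L_k$ for all $1 \le k \le m$, where $L_k$ is the load of machine k at that moment.
- Sum inequalities over all k and divide by m. The loads at that moment come only from jobs other than j, so
- $L_i - t_j \le \frac{1}{m} \sum_k L_k \le \frac{1}{m} \left( \sum_l t_l - t_j \right) \le L^* - \frac{t_j}{m}$ (Lemma 2)
- Now $L_i = (L_i - t_j) + t_j \le L^* + \left(1 - \frac{1}{m}\right) t_j \le \left(2 - \frac{1}{m}\right) L^*$ (Lemma 1) ∎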
27 Load Balancing: List Scheduling Analysis
- Q. Is our analysis tight?
- A. Essentially yes. Indeed, the LS algorithm has tight bound 2 - 1/m.
- Ex: m machines, m(m-1) jobs of length 1, one job of length m
[Figure: list schedule for m = 10; machines 2 to 10 are idle while the job of length 10 runs; list scheduling makespan = 19]
28 Load Balancing: List Scheduling Analysis
- Q. Is our analysis tight?
- A. Essentially yes. Indeed, the LS algorithm has tight bound 2 - 1/m.
- Ex: m machines, m(m-1) jobs of length 1, one job of length m
[Figure: optimal schedule for m = 10; optimal makespan = 10]
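The worst-case instance above is easy to check numerically. The sketch below re-implements list scheduling (the helper name `list_scheduling_makespan` is an assumption) and feeds it the m(m-1) unit jobs followed by the single job of length m, for m = 10.

```python
import heapq


def list_scheduling_makespan(m, times):
    """Makespan of greedy list scheduling on m machines, jobs taken in list order."""
    loads = [0] * m                      # a list of zeros is already a valid min-heap
    for t in times:
        smallest = heapq.heappop(loads)  # least-loaded machine
        heapq.heappush(loads, smallest + t)
    return max(loads)


m = 10
jobs = [1] * (m * (m - 1)) + [m]         # unit jobs first, the long job last
greedy = list_scheduling_makespan(m, jobs)
optimal = m                              # from the slide: long job alone, unit jobs on the other m-1 machines
ratio = greedy / optimal
```

The unit jobs fill every machine to load m - 1 before the long job arrives, so the greedy makespan is 2m - 1 = 19 against an optimum of 10, matching the ratio 2 - 1/m.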
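29 Load Balancing on 2 Machines
- Claim. Load balancing is hard even if only 2 machines.
- Pf. NUMBER-PARTITIONING $\le_P$ LOAD-BALANCE.
[Figure: jobs a to g drawn as bars whose lengths are the numbers to partition (e.g. the length of job f). A yes-instance corresponds to a schedule with jobs a, d, f on machine 1 and jobs b, c, e, g on machine 2, both running from time 0 to L.]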
30 Load Balancing: LPT Rule
- Longest processing time (LPT). Sort n jobs in descending order of processing time, and then run the list scheduling algorithm.
LPT-List-Scheduling(m, n, t_1, t_2, ..., t_n)
    Sort jobs so that t_1 ≥ t_2 ≥ ... ≥ t_n
    for i = 1 to m
        L_i ← 0                  (load on machine i)
        J(i) ← ∅                 (jobs assigned to machine i)
    for j = 1 to n
        i = argmin_k L_k         (machine i has smallest load)
        J(i) ← J(i) ∪ {j}        (assign job j to machine i)
        L_i ← L_i + t_j          (update load of machine i)
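As a rough sketch, the LPT rule only adds a descending sort in front of the greedy loop. The function name `lpt_scheduling` and the sample instance are illustrative assumptions.

```python
import heapq


def lpt_scheduling(m, times):
    """Longest-processing-time rule: sort descending, then greedy list scheduling.

    Returns (makespan, loads), where loads[i] is the final load of machine i.
    """
    heap = [(0, i) for i in range(m)]          # (load, machine)
    loads = [0] * m
    for t in sorted(times, reverse=True):      # t_1 >= t_2 >= ... >= t_n
        load, i = heapq.heappop(heap)          # machine with smallest load
        loads[i] = load + t
        heapq.heappush(heap, (loads[i], i))
    return max(loads), loads


# Hypothetical instance: 3 machines, 6 jobs with total work 19.
makespan, loads = lpt_scheduling(3, [2, 3, 4, 6, 2, 2])
```

Here the makespan is 7, which equals the lower bound obtained by rounding up 19/3, so LPT happens to be optimal on this instance. In general it is only an approximation.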
31 Load Balancing: LPT Rule
- Observation. If at most m jobs, then list-scheduling is optimal.
- Pf. Each job put on its own machine. ∎
- Lemma 3. If there are more than m jobs, $L^* \ge 2 t_{m+1}$.
- Pf.
- Consider first m+1 jobs $t_1, \ldots, t_{m+1}$.
- Since the $t_i$'s are in descending order, each takes at least $t_{m+1}$ time.
- There are m+1 jobs and m machines, so by pigeonhole principle, at least one machine gets two jobs. ∎
- Theorem. LPT rule is a 3/2-approximation algorithm.
- Pf. Same basic approach as for list scheduling.
- $L_i = (L_i - t_j) + t_j \le L^* + \frac{1}{2} L^* = \frac{3}{2} L^*$ ∎
- Here $L_i - t_j \le L^*$ as before, and $t_j \le \frac{1}{2} L^*$ by Lemma 3 (by the observation, can assume number of jobs > m)
32 Load Balancing: LPT Rule
- Q. Is our 3/2 analysis tight?
- A. No.
- Theorem. [Graham, 1969] LPT rule is a (4/3 - 1/(3m))-approximation.
- Pf. More sophisticated analysis of same algorithm.
- Q. Is Graham's (4/3 - 1/(3m)) analysis tight?
- A. Essentially yes.
- Ex: m machines, n = 2m+1 jobs: 2 jobs each of length m+1, m+2, ..., 2m-1 and three jobs of length m.
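The tight instance above can be checked with a short script. The helper names `lpt_makespan` and `graham_instance` are assumptions introduced here. The optimal value 3m follows from pairing a job of length k with one of length 3m - k and putting the three jobs of length m on one machine, which matches the average load of exactly 3m.

```python
import heapq
from fractions import Fraction


def lpt_makespan(m, times):
    """Makespan of the LPT rule on m machines."""
    loads = [0] * m                          # list of zeros is a valid min-heap
    for t in sorted(times, reverse=True):
        smallest = heapq.heappop(loads)
        heapq.heappush(loads, smallest + t)
    return max(loads)


def graham_instance(m):
    """Two jobs of each length m+1 .. 2m-1, plus three jobs of length m."""
    jobs = [length for length in range(m + 1, 2 * m) for _ in range(2)]
    return jobs + [m] * 3


m = 3
jobs = graham_instance(m)                    # 2m + 1 = 7 jobs
lpt = lpt_makespan(m, jobs)                  # LPT leaves the last job of length m on a full machine
optimal = 3 * m                              # total work is 3m^2, and a perfectly balanced schedule exists
ratio = Fraction(lpt, optimal)
```

For m = 3 the LPT makespan is 11 against an optimum of 9, and 11/9 equals 4/3 - 1/(3m) exactly.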
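33 LPT
- Proof. Jobs are indexed so that $t_1 \ge t_2 \ge \cdots \ge t_n$.
- If $n \le m$, already optimal (one machine processes one job).
- If $n > 2m$, then $t_n \le L^*/3$. Similar to the analysis of the LS algorithm.
- Suppose there are 2m - h jobs in total, $0 \le h < m$
- Check that LPT is already an optimal solution
[Figure: LPT schedule over time; jobs 1 to h each run alone on a machine, job h+1 shares a machine with job n, job h+2 with job n-1, and so on]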
34 Approximation Scheme
- Some NP-complete problems allow polynomial-time approximation algorithms that can achieve increasingly smaller approximation ratios by using more and more computation time
- Tradeoff between computation time and the quality of the approximation
- An approximation scheme for an optimization problem is an approximation algorithm that takes as input an instance and a value $\varepsilon > 0$ such that, for any fixed $\varepsilon$, it is a $(1 + \varepsilon)$-approximation algorithm.
35 PTAS and FPTAS
- We say that an approximation scheme is a polynomial-time approximation scheme (PTAS) if for any fixed $\varepsilon > 0$, the scheme runs in time polynomial in the size n of its input instance.
- Example: $O(n^{2/\varepsilon})$.
- An approximation scheme is a fully polynomial-time approximation scheme (FPTAS) if it is an approximation scheme and its running time is polynomial both in $1/\varepsilon$ and in the size n of the input instance
- Example: $O((1/\varepsilon)^2 n^3)$.
36 The Subset Sum
- Input. A pair (S, t), where S is a set $\{x_1, x_2, \ldots, x_n\}$ of positive integers and t is a positive integer
- Output. A subset S' of S
- Goal. Maximize the sum of the elements of S', subject to the sum being at most t.
37 An exponential-time exact algorithm
- If L is a list of positive integers and x is another positive integer,
- then we let L + x denote the list of integers derived from L by increasing each element of L by x.
- For example, if $L = \langle 1, 2, 3, 5, 9 \rangle$, then $L + 2 = \langle 3, 4, 5, 7, 11 \rangle$.
- We also use this notation for sets, so that
- $S + x = \{s + x : s \in S\}$.
38 Exact algorithm
- MERGE-LISTS(L, L') returns the sorted list that is the merge of its two sorted input lists L and L' with duplicate values removed.
- EXACT-SUBSET-SUM(S, t)
- 1 $n \leftarrow |S|$
- 2 $L_0 \leftarrow \langle 0 \rangle$
- 3 for $i \leftarrow 1$ to n
- 4 do $L_i \leftarrow$ MERGE-LISTS($L_{i-1}$, $L_{i-1} + x_i$)
- 5 remove from $L_i$ every element that is greater than t
- 6 return the largest element in $L_n$
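EXACT-SUBSET-SUM can be sketched in a few lines of Python. For brevity this version implements MERGE-LISTS with a set union followed by a sort rather than a linear-time merge, which is a simplification made here. The function names and the sample inputs are illustrative assumptions.

```python
def merge_lists(a, b):
    """Sorted merge of two lists with duplicates removed (via set union, then sort)."""
    return sorted(set(a) | set(b))


def exact_subset_sum(S, t):
    """Largest subset sum of S that does not exceed t (exponential time in general)."""
    L = [0]                                         # L_0 = <0>
    for x in S:
        L = merge_lists(L, [y + x for y in L])      # L_i = MERGE-LISTS(L_{i-1}, L_{i-1} + x_i)
        L = [y for y in L if y <= t]                # drop every element greater than t
    return max(L)


best_small = exact_subset_sum([1, 4, 5], 8)
best_large = exact_subset_sum([104, 102, 201, 101], 308)
```

With S = {1, 4, 5} and t = 8 the reachable sums are 0, 1, 4, 5, 6, so the answer is 6. For the second instance the best sum is 307 = 104 + 102 + 101.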
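39 Example
- Let $P_i$ denote the set of all values obtained by summing a (possibly empty) subset of $\{x_1, \ldots, x_i\}$.
- For example, if S = {1, 4, 5}, then
- $P_1 = \{0, 1\}$,
- $P_2 = \{0, 1, 4, 5\}$,
- $P_3 = \{0, 1, 4, 5, 6, 9, 10\}$.
- Given the identity $P_i = P_{i-1} \cup (P_{i-1} + x_i)$, the list $L_i$ is a sorted list containing every element of $P_i$ whose value is at most t.
- Since the length of $L_i$ can be as much as $2^i$, it is an exponential-time algorithm.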
40 The Subset-sum problem: FPTAS
- Trimming or rounding: if two values in L are close to each other, then for the purpose of finding an approximate solution there is no reason to maintain both of them explicitly.
- Let $\delta$ be such that $0 < \delta < 1$.
- L' is the result of trimming L if, for every element y that was removed from L, there is an element z still in L' that approximates y, that is
- $\frac{y}{1 + \delta} \le z \le y$
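41 Example
- For example, if $\delta = 0.1$ and
- $L = \langle 10, 11, 12, 15, 20, 21, 22, 23, 24, 29 \rangle$,
- then we can trim L to obtain
- $L' = \langle 10, 12, 15, 20, 23, 29 \rangle$,
- TRIM(L, $\delta$), where $L = \langle y_1, y_2, \ldots, y_m \rangle$ is sorted
- 1 $m \leftarrow |L|$
- 2 $L' \leftarrow \langle y_1 \rangle$
- 3 last $\leftarrow y_1$
- 4 for $i \leftarrow 2$ to m
- 5 do if $y_i >$ last $\cdot (1 + \delta)$ ($y_i \ge$ last because L is sorted)
- 6 then append $y_i$ onto the end of L'
- 7 last $\leftarrow y_i$
- 8 return L'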
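The TRIM procedure translates almost line for line into Python. The sketch below assumes a non-empty sorted input list and uses floating-point arithmetic for the threshold. The function name `trim` is an assumption, and the input is the example list from this slide.

```python
def trim(L, delta):
    """Drop every element within a factor (1 + delta) of the last element kept.

    L must be sorted in increasing order and non-empty.
    """
    trimmed = [L[0]]
    last = L[0]
    for y in L[1:]:
        if y > last * (1 + delta):   # y >= last because L is sorted
            trimmed.append(y)
            last = y
    return trimmed


L = [10, 11, 12, 15, 20, 21, 22, 23, 24, 29]
L_trimmed = trim(L, 0.1)
```

Running it on the example list with delta = 0.1 reproduces the trimmed list 10, 12, 15, 20, 23, 29: the value 11 is represented by 10, the values 21 and 22 by 20, and 24 by 23.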
42 $(1 + \varepsilon)$-Approximation algorithm
- APPROX-SUBSET-SUM(S, t, $\varepsilon$)
- 1 $n \leftarrow |S|$
- 2 $L_0 \leftarrow \langle 0 \rangle$
- 3 for $i \leftarrow 1$ to n
- 4 do $L_i \leftarrow$ MERGE-LISTS($L_{i-1}$, $L_{i-1} + x_i$)
- 5 $L_i \leftarrow$ TRIM($L_i$, $\varepsilon/2n$)
- 6 remove from $L_i$ every element that is greater than t
- 7 let z* be the largest value in $L_n$
- 8 return z*
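Putting the pieces together, APPROX-SUBSET-SUM might look as follows in Python. This is a sketch under simplifying assumptions: the merge uses a set union plus sort rather than a linear merge, trimming uses floating-point thresholds, and the instance S, t with $\varepsilon = 0.40$ is an illustrative choice that does not come from the slides.

```python
def merge_lists(a, b):
    """Sorted merge of two lists with duplicates removed."""
    return sorted(set(a) | set(b))


def trim(L, delta):
    """Keep an element only if it exceeds the last kept one by a factor > 1 + delta."""
    trimmed = [L[0]]
    last = L[0]
    for y in L[1:]:
        if y > last * (1 + delta):
            trimmed.append(y)
            last = y
    return trimmed


def approx_subset_sum(S, t, eps):
    """Return a subset sum z <= t with optimum / z <= 1 + eps."""
    n = len(S)
    L = [0]                                       # L_0 = <0>
    for x in S:
        L = merge_lists(L, [y + x for y in L])    # line 4
        L = trim(L, eps / (2 * n))                # line 5: delta = eps / 2n
        L = [y for y in L if y <= t]              # line 6
    return max(L)                                 # lines 7-8


S, t, eps = [104, 102, 201, 101], 308, 0.40
z = approx_subset_sum(S, t, eps)
optimum = 307                                     # 104 + 102 + 101, the exact optimum for this instance
```

With these inputs the trimming parameter is 0.05 and the procedure returns 302 = 201 + 101, within about 2% of the optimum 307 and well inside the guaranteed factor of 1.4.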
43 FPTAS
- Theorem. APPROX-SUBSET-SUM is a fully polynomial-time approximation scheme for the subset-sum problem.
- Pf. The operations of trimming $L_i$ in line 5 and removing from $L_i$ every element that is greater than t maintain the property that every element of $L_i$ is also a member of $P_i$. Therefore, the value z* returned in line 8 is indeed the sum of some subset of S.
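44 Pf. (cont.)
- Let $y^* \in P_n$ denote an optimal solution to the subset-sum problem. We know that $z^* \le y^*$. We need to show that $y^*/z^* \le 1 + \varepsilon$.
- By induction on i, it can be shown that for every element y in $P_i$ that is at most t, there is a $z \in L_i$ such that
- $\frac{y}{(1 + \varepsilon/2n)^i} \le z \le y$
- Thus, there is a $z \in L_n$ such that
- $\frac{y^*}{(1 + \varepsilon/2n)^n} \le z \le y^*$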
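45 Pf. (cont.)
- And thus,
- $\frac{y^*}{z} \le \left(1 + \frac{\varepsilon}{2n}\right)^n$
- Since there is a $z \in L_n$ satisfying this bound, it also holds for z*, the largest value in $L_n$:
- $\frac{y^*}{z^*} \le \left(1 + \frac{\varepsilon}{2n}\right)^n$
- Hence,
- $\frac{y^*}{z^*} \le \left(1 + \frac{\varepsilon}{2n}\right)^n \le e^{\varepsilon/2} \le 1 + \frac{\varepsilon}{2} + \left(\frac{\varepsilon}{2}\right)^2 \le 1 + \varepsilon$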
46 Pf. (cont.)
- To show FPTAS, we need to bound $|L_i|$.
- After trimming, successive elements z and z' of $L_i$ must have the relationship $z'/z > 1 + \varepsilon/2n$
- Each list, therefore, contains the value 0, possibly the value 1, and up to $\lfloor \log_{1 + \varepsilon/2n} t \rfloor$ additional values