Approximation Algorithms for Representative Points Problem of Clusters

Provided by: sara120

1
Approximation Algorithms for Representative Points Problem of Clusters
  • Sanpawat Kantabutra
  • The Theory of Computation Group
  • Computer Science Dept.
  • Chiang Mai University

2
Outline
  • Definitions
  • Non-Existence of Absolute Performance Guarantee
  • MST-Based Approximation Algorithm
  • Greedy Approximation Algorithm
  • Performance Guarantees
  • Open Problems

3
Dissimilarity
  • Let d(x, y) denote the Euclidean distance between
    points x and y.

4
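$\delta$-Clustering Problem
  • Let S be a set of n d-dimensional points and $\delta$ a
    real number.
  • We want to partition S into l clusters $C_1, C_2, \ldots, C_l$
    such that $\forall x, y \in C_i$, $x \ne y$, $\exists z_1, z_2, \ldots, z_m \in C_i$ such that
  • $d(x, z_1) < \delta$, $d(z_t, z_{t+1}) < \delta$ for $1 \le t < m$,
    $d(z_m, y) < \delta$,
  • where $C_i$ is a maximal cluster having this
    property, $\bigcup_i C_i = S$, and $C_i \cap C_j = \emptyset$ when $i \ne j$.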
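
One way to read the definition above: two points share a cluster exactly when a chain of points links them with every step shorter than the threshold, so the clusters are the connected components of the "closer than the threshold" graph. The following is a minimal Python sketch under that reading. The function name `delta_clusters`, the parameter `delta`, and the union-find approach are illustrative assumptions, not taken from the slides.

```python
import math
from itertools import combinations

def delta_clusters(points, delta):
    """Group points into maximal sets linked by chains of steps shorter than delta."""
    parent = list(range(len(points)))  # union-find forest over point indices

    def find(i):
        # Follow parent links to the component root, halving the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Join every pair of points whose Euclidean distance is below the threshold.
    for i, j in combinations(range(len(points)), 2):
        if math.dist(points[i], points[j]) < delta:
            parent[find(i)] = find(j)

    groups = {}
    for i, p in enumerate(points):
        groups.setdefault(find(i), []).append(p)
    return list(groups.values())

points = [(0, 0), (1, 0), (2, 0), (10, 0), (11, 0)]
clusters = delta_clusters(points, delta=1.5)
# Two clusters result: the three points near the origin and the two points near x = 10.
```

The pairwise loop is quadratic in the number of points, which is acceptable for a sketch but not tuned for large inputs.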

5
Example of $\delta$-Clusters
6
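$\delta$-Clustering Problem of Order k
  • Let S be a set of n d-dimensional points and $\delta$ a
    real number.
  • We want to partition S into l clusters $C_1, C_2, \ldots, C_l$,
    $\bigcup_i C_i = S$, such that for each $C_i = \bigcup_j G_j$, $1 \le j \le m$,
    there exists a path of subsets $G_1, G_2, \ldots, G_m$, where $1, \ldots, m$
    covers all the $G_j$, so that $\forall x \in G_j$, $\exists$ $k-1$
    distinct $y \in G_j$ with $d(x, y) < \delta$, where $G_j \cap G_{j+1} \ne \emptyset$ and
    $k > 1$, and $\forall x \in C_i$, $\forall y \in C_j$, $i \ne j$, $d(x, y) \ge \delta$.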

7
Example of $\delta$-Clusters of Order 3
8
Representative Points
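  • Let $S = \bigcup_i C_i$ be a set of d-dimensional points. Given
    disjoint clusters $C_1, C_2, \ldots, C_l$ and sets $R_1, R_2, \ldots, R_l$,
    $R_i \subseteq C_i$, we say that $R_i$ represents $C_i$ iff
  • $\forall x \in C_i$, $\exists r_i \in R_i$, $\forall r_j \in R_j$, $i \ne j$: $d(r_i, x) < d(r_j, x)$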
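
The condition above can be checked directly: for each point, compare its nearest representative in its own cluster against every representative of every other cluster. The following is a minimal Python sketch of such a check. The function name `represents` and the list-of-lists input layout are illustrative assumptions.

```python
import math

def represents(clusters, reps):
    """Return True if every point has a representative from its own cluster
    that is strictly closer than all representatives of the other clusters."""
    for i, cluster in enumerate(clusters):
        if cluster and not reps[i]:
            return False  # a non-empty cluster needs at least one representative
        # All representatives belonging to clusters other than cluster i.
        others = [r for j, rs in enumerate(reps) if j != i for r in rs]
        for x in cluster:
            nearest_own = min(math.dist(r, x) for r in reps[i])
            if any(math.dist(r, x) <= nearest_own for r in others):
                return False
    return True

clusters = [[(0, 0), (1, 0)], [(5, 0), (6, 0)]]
ok = represents(clusters, [[(1, 0)], [(5, 0)]])

# Here (4, 0) is nearer to the other cluster's representative, so the check fails.
bad = represents([[(0, 0), (4, 0)], [(5, 0)]], [[(0, 0)], [(5, 0)]])
```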

9
Example of Representative Points
10
No Absolute Performance Guarantee
  • Theorem 1. If P $\ne$ NP, then for any fixed $r \in \mathbb{N}$ no
    polynomial-time approximation algorithm A can
    solve $\delta$-REP with $|A(I) - OPT(I)| \le r$ for every
    instance I, where OPT(I) is the value of an optimum solution
    and $A(I), OPT(I) \in \mathbb{N}$.

11
MST-Based Approximation Algorithm (I)
  • Modified Depth-First Search(v, R)
  • 1. If (($Pred(v) \notin R$) AND (v is not the root))
  • 2.   $R = R \cup \{v\}$
  • 3. Mark v as visited
  • 4. For all nodes i adjacent to v not visited
  • 5.   Modified Depth-First Search(i, R)
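
The procedure above can be sketched in Python as follows, assuming the tree is given as an adjacency dictionary and that the condition in step 1 reads "the predecessor of v is not in R". The function name `modified_dfs`, the returned `pred` map, and the example tree are illustrative assumptions.

```python
def modified_dfs(tree, root):
    """Traverse `tree` from `root`, adding a non-root node to R whenever
    its predecessor is not already in R. Returns (R, pred)."""
    R, pred, visited = set(), {root: None}, set()

    def visit(v):
        if v != root and pred[v] not in R:  # step 1
            R.add(v)                        # step 2
        visited.add(v)                      # step 3
        for i in tree[v]:                   # step 4
            if i not in visited:
                pred[i] = v
                visit(i)                    # step 5

    visit(root)
    return R, pred

# Path a - b - c - d - e, searched from the leaf "a".
tree = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c", "e"], "e": ["d"]}
R, pred = modified_dfs(tree, "a")
# R == {"b", "d"}: every second node along the path is selected.
```

On a tree, the effect is to select the nodes at odd depth from the root. The recursion mirrors the slide and would hit Python's recursion limit on very deep trees.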

12
MST-Based Approximation Algorithm (II)
  • INPUT: l $\delta$-clusters (of order k) $C_1, C_2, \ldots, C_l$
  • OUTPUT: A set $R = \bigcup_i R_i$ of representative points
  • MST-Based Approximation Algorithm
  • 1. Let $R_i = \emptyset$ for all $i = 1, \ldots, l$
  • 2. For each $C_i$
  • 3.   Compute distances of all pairs of points
    in $C_i$
  • 4.   Find a minimum-cost spanning tree $MST_i$
    from $C_i$ where points and distances become
    nodes and edges respectively

13
MST-Based Approximation Algorithm (III)
  • 5. For each $MST_i$
  • 6.   Let v be a leaf node in $MST_i$ and the root
    of DFS
  • 7.   Modified Depth-First Search(v, $R_i$)
  • 8.   Replace every leaf node $x \in R_i$ with
    Pred(x)
  • 9. Output $R = \bigcup_i R_i$
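
Steps 1 to 9 can be sketched for a single cluster as below. This is a minimal Python sketch, not the author's implementation: Prim's algorithm is an assumed choice for building the spanning tree, the helper names `mst_adjacency` and `mst_representatives` are hypothetical, and the handling of a one-point cluster is an assumption because the slides do not cover it.

```python
import math

def mst_adjacency(points):
    """Prim's algorithm on the complete Euclidean graph (steps 3 and 4)."""
    n = len(points)
    adj = {i: [] for i in range(n)}
    # best[i] = (distance to the tree, nearest tree node) for nodes outside the tree
    best = {i: (math.dist(points[0], points[i]), 0) for i in range(1, n)}
    while best:
        v = min(best, key=lambda i: best[i][0])
        _, u = best.pop(v)
        adj[u].append(v)
        adj[v].append(u)
        for w in best:
            d = math.dist(points[v], points[w])
            if d < best[w][0]:
                best[w] = (d, v)
    return adj

def mst_representatives(cluster):
    """Steps 5 to 8 for one cluster; returns the selected points."""
    if len(cluster) == 1:
        return list(cluster)  # assumption: a lone point represents itself
    adj = mst_adjacency(cluster)
    root = next(v for v in adj if len(adj[v]) == 1)  # step 6: start at a leaf
    R, pred, stack = set(), {root: None}, [root]
    while stack:  # step 7: modified depth-first search
        v = stack.pop()
        if v != root and pred[v] not in R:
            R.add(v)
        for u in adj[v]:
            if u != pred[v]:  # in a tree, only the predecessor is already visited
                pred[u] = v
                stack.append(u)
    # Step 8: swap each selected leaf for its predecessor.
    R = {pred[x] if len(adj[x]) == 1 else x for x in R}
    return [cluster[i] for i in sorted(R)]

reps = mst_representatives([(0, 0), (1, 0), (2, 0), (3, 0), (4, 0)])
# reps == [(1, 0), (3, 0)]
```

Running `mst_representatives` on each cluster and collecting the results corresponds to step 9. The sketch only follows the listed steps and does not verify that the output satisfies the representative-points condition.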

14
Algorithm Correctness
  • Proposition 1. The MST-based approximation
    algorithm produces a set $R = \bigcup_i R_i$ of
    representative points of size [expression missing in the transcript] in the worst
    case, where n is the number of input points.

15
Single Cluster Representation Problem
  • Given a set $S = \bigcup_i C_i$ of n points in
    d-dimensional space, where each $C_i$ is a $\delta$-cluster, and a
    cluster number h, the single cluster
    representation problem is to find a
    representative set $R_h \subseteq C_h$ such that
  • $\forall y \in C_h$ $\exists r \in R_h$ $\forall x \in S - C_h$: $d(r, y) < d(x, y)$

16
Heuristic Representative Algorithm (I)
  • INPUT: $\delta$-clusters $C_i$ and h where $1 \le h \le l$
  • OUTPUT: Rep. set $R_h$
  • $R_h = \emptyset$
  • While ($C_h \ne \emptyset$)
  •   $P_{max} = \emptyset$
  •   For all $r \in C_h$
  •     $P = \emptyset$

17
Heuristic Representative Algorithm (II)
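  •     $P = \emptyset$
  •     For all $y \in C_h$
  •       If (CheckCond(r, y, $S - C_h$))
  •         $P = P \cup \{y\}$
  •     If ($|P| > |P_{max}|$)
  •       $P_{max} = P$
  •       $r_{max} = r$
  •   $C_h = C_h - P_{max}$, $S = S - P_{max}$, $R_h = R_h \cup \{r_{max}\}$
  • return $R_h$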
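
The loop above can be sketched in Python as follows. The slides do not define CheckCond, so this sketch assumes it tests the single cluster representation condition: r is strictly closer to y than every point outside the cluster. The names `check_cond` and `heuristic_representatives`, the 0-based cluster index `h`, and the guard against a candidate that covers nothing are illustrative assumptions.

```python
import math

def check_cond(r, y, outside):
    """Assumed meaning of CheckCond: r is strictly closer to y than any outside point."""
    return all(math.dist(r, y) < math.dist(x, y) for x in outside)

def heuristic_representatives(clusters, h):
    """Greedily pick representatives for clusters[h] (h is 0-based here)."""
    remaining = list(clusters[h])
    # The points outside the cluster stay fixed while the cluster shrinks.
    outside = [x for j, c in enumerate(clusters) if j != h for x in c]
    reps = []
    while remaining:
        best_cover, best_r = [], None
        for r in remaining:
            cover = [y for y in remaining if check_cond(r, y, outside)]
            if len(cover) > len(best_cover):
                best_cover, best_r = cover, r
        if best_r is None:
            # Happens only if a remaining point coincides with an outside point.
            raise ValueError("no candidate covers any remaining point")
        reps.append(best_r)
        remaining = [y for y in remaining if y not in best_cover]
    return reps

clusters = [[(0, 0), (1, 0), (2, 0), (3, 0), (4, 0)], [(6, 0)]]
reps = heuristic_representatives(clusters, 0)
# reps == [(3, 0)]: the point (3, 0) is closer to every cluster member than (6, 0) is.
```

Each round picks the candidate that covers the most remaining points, which is the familiar greedy set-cover pattern.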

18
Algorithm Correctness
  • Theorem 2. The heuristic representative
    algorithm finds a representative set $R_h \subseteq C_h$ for
    a single cluster representation problem according
    to Definition 5.

19
Greedy Approximation Algorithm
  • INPUT: $\delta$-clusters $C_1, C_2, \ldots, C_l$, $S = \bigcup_i C_i$
  • OUTPUT: A set T of representative sets $R_i$
  • $T = \emptyset$
  • For h = 1 to l
  •   Heuristic Representative Algorithm($C_i$, h, $R_h$)
  •   $T = T \cup \{R_h\}$
  • Return T

20
Algorithm Correctness
  • Theorem 3. This algorithm finds a set of
    representatives $R_i$ for all $\delta$-clusters $C_i$, $1 \le
    i \le l$, according to Definition 4.

21
Performance Guarantee
  • Theorem 4. Let $A_{greedy}$ denote the greedy
    approximation algorithm and $R_{greedy}$ its
    performance ratio. Then, for all input instances
    I,
  • $R_{greedy}(I) \le$ [bound missing in the transcript]
  • where $C_{max}$ is the largest $\delta$-cluster and k is
    the order of the $\delta$-clusters.

22
Tight Instance
23
Open Problems
  • Is the MST-based approximation algorithm in NC?
  • Is the greedy-based approximation scheme the best
    one? What if we change the analysis of the lower
    bound or the lower bound itself?
  • Does a constant relative performance guarantee
    exist for this problem?
  • Is the greedy approximation problem
    P-complete?

24
Questions and Answers