On the Lower Bound of Local Optimum in KMeans Algorithm

About This Presentation

Title:

On the Lower Bound of Local Optimum in KMeans Algorithm

Description:

Lower bound is derived and used to guess the potential of the current clustering ... How to lower bound the cost of any solution in the maximal region. 5/3/09 ... – PowerPoint PPT presentation

Slides: 33

Provided by: dcs3

Transcript and Presenter's Notes

1
On the Lower Bound of Local Optimum in K-Means Algorithm

Zhang Zhenjie, Dai Bing Tian, and Anthony K.H. Tung

2
Outline

Introduction
Maximal Region
Algorithms
Experiments
Conclusion and Future Work

3
Introduction

K-Means Algorithm
Pick k centers randomly
K-Means Iterations
Assign every point to the closest center
Compute the center of every cluster and use it to replace the old one
Stop the algorithm when the centers are stable
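
The steps above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function name `kmeans`, the `max_iter` and `seed` parameters, the choice of data points as initial centers, and the handling of empty clusters (the old center is kept) are all assumptions made here.

```python
import numpy as np

def kmeans(points, k, max_iter=100, seed=0):
    """Plain k-means: random initial centers, then assign/update until stable."""
    rng = np.random.default_rng(seed)
    # Pick k distinct data points as the initial centers (an assumption).
    centers = points[rng.choice(len(points), size=k, replace=False)].astype(float)
    labels = np.zeros(len(points), dtype=int)
    for _ in range(max_iter):
        # Assignment step: index of the closest center for every point.
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: replace each center by the mean of its cluster.
        new_centers = centers.copy()
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:  # keep the old center if a cluster is empty
                new_centers[j] = members.mean(axis=0)
        # Stop once the centers no longer move.
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels
```

Calling `kmeans(data, k=2)` on an `(n, d)` array returns the final centers and the cluster index of every point.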

4
Introduction (cont.)

Cost
Sum of the squared distances from every point to its closest center
Cost never increases after a k-means iteration
Global Optimum
Centers minimizing the cost
Local Optimum
Centers output by k-means from a given set of initial centers
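
In symbols, the cost can be written as below. The notation is an assumption made here: X is the data set, M = {m_1, ..., m_k} is the center set, and C(M) is the cost of M.

```latex
C(M) = \sum_{x \in X} \min_{1 \le j \le k} \lVert x - m_j \rVert^2
```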

5
Introduction (cont.)

Disadvantages of Local Optimum
Can be much worse than the global optimum
Requires re-running the algorithm with different initial centers
Wastes computational resources
Solution?
Find the center set leading to the global optimum?
Detect local optima with large cost as early as possible? (the target of our paper)

6
Introduction (cont.)

A simple solution for early detection

Stop when the decrease in cost after one iteration is small
(Figure: cost plotted against iteration)
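
A sketch of this simple stopping rule, under assumptions: the function name `naive_stop_iteration`, the threshold `tol`, and the cost values are made up for illustration. The example curve has a plateau followed by a further drop, so the rule stops before the cheaper solution is reached.

```python
def naive_stop_iteration(costs, tol):
    """Return the first iteration at which the cost decrease falls below tol.

    costs[i] is the cost after iteration i; returns None if the rule never fires.
    """
    for i in range(1, len(costs)):
        if costs[i - 1] - costs[i] < tol:
            return i
    return None

# Made-up cost curve: a plateau around iterations 2-3, then a large drop.
costs = [100.0, 60.0, 59.5, 59.4, 40.0, 25.0, 24.9]
stop_at = naive_stop_iteration(costs, tol=1.0)
```

Here the rule fires on the plateau, while the cost would have kept falling had the iterations continued.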
7
Introduction (cont.)

A simple solution for early detection

A much better local optimum is missed
(Figure: cost plotted against iteration)

8
Introduction (cont.)

Our solution

A lower bound is derived and used to guess the potential of the current clustering
(Figure: cost plotted against iteration)

9
Introduction (cont.)

Our solution

If the yellow curve represents the current best solution, we can stop the computation here
(Figure: cost plotted against iteration)

10
Outline

Introduction
Maximal Region
Algorithms
Experiments
Conclusion and Future Work

11
Solution Space

Given a d-dimensional problem space, we define the solution space as a kd-dimensional space
(Figure: labels c1, c2 and M1)
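
One way to picture the kd-dimensional solution space is to flatten a center set into a single vector. The sizes `k` and `d` and the coordinates below are illustrative assumptions, not values from the slides.

```python
import numpy as np

k, d = 2, 3  # illustrative sizes: 2 centers in a 3-dimensional problem space

# A center set is k points in d dimensions ...
center_set = np.array([[0.0, 1.0, 2.0],
                       [3.0, 4.0, 5.0]])

# ... and equivalently a single point in the kd-dimensional solution space.
solution_point = center_set.reshape(k * d)

# The mapping is invertible, so no information is lost.
recovered = solution_point.reshape(k, d)
```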
12
Solution Space

As the iterations proceed, the center set jumps around the solution space
(Figure: labels c1, c2, M1, M2 and M3)

13
Definition of Maximal Region

A maximal region is a region in the solution space that covers the local optimum reached by future iterations
Two problems
How to find such a maximal region
How to lower bound the cost of any solution in the maximal region

14
Maximal Region
The cost of center sets in the solution space is represented by contour lines, with lighter color meaning smaller cost
Any solution between M1 and M2 must have smaller cost than M1
(Figure: cost contours with labels c1, c2, M1 and M2)

15
Maximal Region
Maximal region of the local optimum: the region in which the local optimum must lie
(Figure: labels c1, c2, M1 and M2)

16
Maximal Region

A region is a maximal region for center set M if
It contains M
Any solution on the boundary of the region has cost equal to that of M

17
A Special Maximal Region
Every center moves no more than Delta
(Figure: labels c1, c2, M1 and M2)

18
Maximal Region
(Figure: labels M1, m1 and m2)

19
Costs in Maximal Region

Bounding Theorem
Any solution in the maximal region must have cost no less than C(M1)-DeltaN, where C(M1) is the cost of M1 and N is the size of the data set
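
The sketch below does not reproduce the theorem on this slide. It shows a different, elementary bound that follows from the triangle inequality, to illustrate what lower bounding the cost over a region means: if every center moves by at most Delta, each point's distance to its closest center shrinks by at most Delta. The function names are hypothetical.

```python
import numpy as np

def closest_center_dists(points, centers):
    """Distance from every point to its closest center."""
    d = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
    return d.min(axis=1)

def cost(points, centers):
    """Sum of squared distances from every point to its closest center."""
    return float(np.sum(closest_center_dists(points, centers) ** 2))

def triangle_lower_bound(points, centers, delta):
    """Lower bound on the cost of any center set whose centers each lie
    within delta of the corresponding center in `centers`.

    Elementary triangle-inequality bound, NOT the bound from the slides.
    """
    d = closest_center_dists(points, centers)
    return float(np.sum(np.maximum(d - delta, 0.0) ** 2))
```

For any perturbed center set with per-center displacement at most `delta`, `cost(points, perturbed)` is at least `triangle_lower_bound(points, centers, delta)`.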

20
Outline

Introduction
Maximal Region
Algorithms
Experiments
Conclusion and Future Work

21
Algorithm

New Algorithm
Same Initial Centers Selection
New Iteration
Reassignment
Computing new centers, M
Finding the smallest R(M,Delta)
Computing the lower bound in maximal region
Check the stopping criterion, or prune the current run

22
Finding Smallest Delta

Delta can take any floating-point value
Divide the search range into N+1 segments, [0,a(1)), [a(1),a(2)), ..., [a(N),infinity)
Search the segments in order, starting from [0,a(1))
On every segment, solve a quadratic equation
If any plausible quadratic root is found, return it as the smallest Delta
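
The scan can be sketched as below. The slides do not give the breakpoints a(i) or the quadratic on each segment, so both are passed in as hypothetical inputs, and "plausible" is assumed to mean that the root lies inside its own segment.

```python
import math

def smallest_root_by_segments(breakpoints, coeffs):
    """Scan segments [0,a1), [a1,a2), ..., [aN,inf) in order and return the
    smallest Delta that is a root of the quadratic attached to its own segment.

    coeffs[i] = (A, B, C) for A*x**2 + B*x + C on the i-th segment
    (hypothetical inputs; the slides do not specify them).
    """
    edges = [0.0] + list(breakpoints) + [math.inf]
    for (lo, hi), (A, B, C) in zip(zip(edges[:-1], edges[1:]), coeffs):
        if A == 0:
            # Degenerate case: the equation is linear (or constant).
            roots = [-C / B] if B != 0 else []
        else:
            disc = B * B - 4 * A * C
            if disc < 0:
                continue  # no real root on this segment
            s = math.sqrt(disc)
            roots = sorted([(-B - s) / (2 * A), (-B + s) / (2 * A)])
        for r in roots:
            if lo <= r < hi:  # accept a root only inside its own segment
                return r
    return None

# Illustrative input: roots of x^2 - 9 fall outside [0, 1); x^2 - 2.25 has 1.5 in [1, 2).
result = smallest_root_by_segments([1.0, 2.0], [(1, 0, -9), (1, 0, -2.25), (1, 0, -16)])
```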

23
Algorithm

Finding the smallest Delta to bound the local optimum in the Maximal Region
Sorting and Scan Algorithm
Complexity is O(N log N), where N is the size of the data
Lower bounding the cost of the local optimum
Simple computation
Done in O(1) time

24
Outline

Introduction
Maximal Region
Algorithms
Experiments
Conclusion and Future Work

25
Experiments

Data Set
Synthetic data sets and KDD99 data set
Original K-Means Algorithm (OKM)
Accelerated K-means Algorithm (AKM)
Run k-means clustering several times
The best result of the previous runs is used to prune the following runs
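
The pruning across runs might look like the sketch below. It is an outline under assumptions, not the AKM implementation: `lower_bound_fn` is a hypothetical callback that must return a valid lower bound on the cost of the local optimum the current run will reach, and how the slides compute that bound is not reproduced here. A callback that always returns 0.0 is valid but never prunes.

```python
import numpy as np

def assign_cost(points, centers):
    """Closest-center labels and the total squared-distance cost."""
    d = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
    return d.argmin(axis=1), float(np.sum(d.min(axis=1) ** 2))

def restarts_with_pruning(points, k, n_runs, lower_bound_fn, max_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    best_cost, best_centers, pruned = np.inf, None, 0
    for run in range(n_runs):
        centers = points[rng.choice(len(points), size=k, replace=False)].astype(float)
        abandoned = False
        for it in range(max_iter):
            labels, _cost = assign_cost(points, centers)
            new_centers = centers.copy()
            for j in range(k):
                members = points[labels == j]
                if len(members) > 0:  # keep the old center if a cluster is empty
                    new_centers[j] = members.mean(axis=0)
            # Prune: this run cannot beat the best result found so far.
            if lower_bound_fn(points, new_centers) >= best_cost:
                abandoned = True
                break
            converged = np.allclose(new_centers, centers)
            centers = new_centers
            if converged:
                break
        if abandoned:
            pruned += 1
            continue
        _labels, run_cost = assign_cost(points, centers)
        if run_cost < best_cost:
            best_cost, best_centers = run_cost, centers
    return best_centers, best_cost, pruned
```

The tighter the bound returned by `lower_bound_fn`, the earlier a hopeless run is abandoned.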

26
Experiments

Measurement
Iterations (I/O cost) and computation time (CPU cost)
We use the same random seeds for OKM and AKM

27
Experiments (cont.)

Varying dimensionality on synthetic data sets

28
Experiments (cont.)

Varying k on synthetic data sets

29
Experiments (cont.)

Varying k on KDD99 data set

30
Conclusion and Future Work

Contribution
Lower bound of Local Optimum in K-Means Algorithm
The concept of Maximal Region
Algorithm for finding Maximal Region
Accelerated K-Means Algorithm

31
Conclusion and Future Work

Additional Applications
Data stream clustering
Real time cluster analysis over moving objects
Improvement
Some tighter bound
Extension to general clustering algorithms

32
Q & A
