# Clustering Algorithms - PowerPoint PPT Presentation

Title:

## Clustering Algorithms

Description:

### Introduction to Hierarchical Clustering Analysis, by Dinh Dong Luong. Data clustering concerns how to group a set of objects based on their similarity of ...

Slides: 25
Provided by: Johan171
Transcript and Presenter's Notes


1
Introduction to Hierarchical Clustering Analysis
Dinh Dong Luong
2
Introduction
• Data clustering concerns how to group a set of
objects based on the similarity of their attributes
and/or their proximity in the vector space.
• Main methods:
• Partitioning: K-Means
• Hierarchical: BIRCH, ROCK, ...
• Density-based: DBSCAN, ...
• A good clustering method will produce
high-quality clusters with
• high intra-class similarity: cohesive within
clusters
• low inter-class similarity: distinctive between
clusters
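The two quality criteria above can be checked numerically. The following is a minimal sketch, assuming Euclidean distance as the dissimilarity and using made-up 2-D points; the function names (`mean_intra_distance`, `mean_inter_distance`) are hypothetical and not taken from the slides.

```python
import math
from itertools import combinations, product

def euclidean(p, q):
    """Straight-line distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def mean_intra_distance(cluster):
    """Average distance over all pairs inside one cluster (low = cohesive)."""
    pairs = list(combinations(cluster, 2))
    if not pairs:
        return 0.0
    return sum(euclidean(p, q) for p, q in pairs) / len(pairs)

def mean_inter_distance(c1, c2):
    """Average distance over all cross-cluster pairs (high = distinctive)."""
    pairs = list(product(c1, c2))
    return sum(euclidean(p, q) for p, q in pairs) / len(pairs)

# Illustrative data (an assumption, not from the presentation).
cluster_a = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0)]
cluster_b = [(10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]

print("intra A:", mean_intra_distance(cluster_a))
print("intra B:", mean_intra_distance(cluster_b))
print("inter A-B:", mean_inter_distance(cluster_a, cluster_b))
```

For a good clustering, the intra-cluster averages should be small relative to the inter-cluster average.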

3
Stages in clustering
4
Clustering Algorithms
• A. Distance and Similarity Measures
• B. Hierarchical Clustering
• Agglomerative
• balanced iterative reducing and clustering using
hierarchies (BIRCH), clustering using
representatives (CURE), robust clustering using ...
• Divisive
• divisive analysis (DIANA), monothetic analysis
(MONA)

5
Distance and Similarity Measures
6
Similarity Measurements
• Pearson Correlation

Two profiles (vectors)
and
−1 ≤ Pearson Correlation ≤ 1
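The formula itself did not survive in this transcript, so the following is only a sketch of the standard Pearson correlation coefficient between two equal-length profiles. The function name `pearson` and the sample vectors are assumptions for illustration.

```python
import math

def pearson(x, y):
    """Standard Pearson correlation of two equal-length profiles.

    Undefined (division by zero) if either profile is constant.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Illustrative profiles (not from the slides).
x = [1.0, 2.0, 3.0, 4.0]
same_trend = [10.0, 20.0, 30.0, 40.0]   # rises with x, different scale
opposite_trend = [4.0, 3.0, 2.0, 1.0]   # falls as x rises

print(pearson(x, same_trend))
print(pearson(x, opposite_trend))
```

Because each profile is centered and scaled before comparison, the result depends on the shape of the profiles rather than on their absolute values.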
7
Similarity Measurements
• Pearson Correlation: Trend Similarity

8
Similarity Measurements
• Euclidean Distance

9
Similarity Measurements
• Euclidean Distance: Absolute Difference
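As a minimal sketch of why Euclidean distance reflects absolute difference, the example below compares a profile with a copy shifted by a constant. The helper name `euclidean_distance` and the sample values are assumptions.

```python
import math

def euclidean_distance(x, y):
    """Square root of the summed squared component differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

# Illustrative profiles: identical shape, but `shifted` is offset by 3.
profile = [1.0, 2.0, 3.0]
shifted = [4.0, 5.0, 6.0]

print(euclidean_distance(profile, profile))
print(euclidean_distance(profile, shifted))
```

The two profiles follow the same trend, yet their distance is nonzero because every component differs by the offset.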

10
Similarity Measurements
• Cosine Correlation

−1 ≤ Cosine Correlation ≤ 1
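The slide's formula is missing here, so this is a sketch of the usual cosine measure (dot product divided by the product of vector norms). The name `cosine_similarity` and the sample vectors are assumptions.

```python
import math

def cosine_similarity(x, y):
    """Cosine of the angle between two vectors.

    Undefined (division by zero) if either vector is all zeros.
    """
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)

# Illustrative vectors (not from the slides).
v = [1.0, 2.0, 3.0]
scaled = [2.0, 4.0, 6.0]         # same direction
flipped = [-1.0, -2.0, -3.0]     # opposite direction
offset = [101.0, 102.0, 103.0]   # same trend, large mean shift

print(cosine_similarity(v, scaled))
print(cosine_similarity(v, flipped))
print(cosine_similarity(v, offset))
```

Unlike Pearson correlation, the vectors are not centered first, so adding a constant offset to one profile changes the result.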
11
Similarity Measurements
• Cosine Correlation: Trend + Mean Distance

12
Similarity Measurements
13
Similarity Measurements
Similar?
14
Taxonomy of Clustering Approaches
15
Hierarchical Clustering
• Agglomerative clustering treats each data point
as a singleton cluster, and then successively
merges clusters until all points have been merged
into a single remaining cluster. Divisive
clustering works the other way around.
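The agglomerative procedure described above can be sketched as a naive loop. This is an illustration only: it assumes Euclidean distance and a minimum-distance merge rule by default, runs in roughly cubic time, and the names `agglomerate` and `single_link` are hypothetical.

```python
import math
from itertools import combinations

def euclidean(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def single_link(c1, c2):
    """Smallest distance between any member of c1 and any member of c2."""
    return min(euclidean(p, q) for p in c1 for q in c2)

def agglomerate(points, linkage=single_link):
    """Start from singleton clusters and merge until one cluster remains.

    Returns the merge history as (cluster_a, cluster_b, dissimilarity).
    """
    clusters = [[p] for p in points]
    merges = []
    while len(clusters) > 1:
        # Pick the pair of clusters with the smallest dissimilarity.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        d = linkage(clusters[i], clusters[j])
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges

# Illustrative 1-D data (an assumption, not from the slides).
points = [(0.0,), (1.0,), (5.0,), (6.0,), (20.0,)]
history = agglomerate(points)
for a, b, d in history:
    print(a, "+", b, "at dissimilarity", d)
```

Cutting the merge history at a chosen dissimilarity threshold yields a flat clustering; a divisive method would build the same kind of hierarchy from the top down.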

16
Hierarchical Clustering
Calculate the similarity between all possible
combinations of two profiles
• Keys
• Similarity
• Clustering

The two most similar clusters are grouped together
to form a new cluster.
Calculate the similarity between the new cluster
and all remaining clusters.
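The first two steps on this slide can be sketched as follows: compute the similarity for every pair of profiles, then pick the most similar pair to merge. Pearson correlation is used as the similarity here by assumption, and the profile names `g1` to `g3` and their values are made up.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical profiles for illustration.
profiles = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.1, 5.9, 8.0],
    "g3": [4.0, 3.0, 2.0, 1.0],
}

# Similarity for all possible combinations of two profiles.
similarity = {
    (a, b): pearson(profiles[a], profiles[b])
    for a, b in combinations(profiles, 2)
}

# The most similar pair is the one grouped into a new cluster first.
most_similar = max(similarity, key=similarity.get)
print(most_similar, similarity[most_similar])
```

After the merge, only the similarities involving the new cluster need to be recalculated; how that is done depends on the chosen cluster-to-cluster dissimilarity.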
17
General agglomerative clustering
18
Clustering
Merge which pair of clusters?
(Figure: three clusters C1, C2, and C3.)
19
Clustering
Dissimilarity between two clusters: minimum
dissimilarity between the members of the two
clusters (single linkage).

(Figure: clusters C1 and C2.)
20
Clustering
Dissimilarity between two clusters: maximum
dissimilarity between the members of the two
clusters (complete linkage).

(Figure: clusters C1 and C2.)
21
Clustering
Dissimilarity between two clusters: average
distance over all pairs of objects, one from each
cluster (average linkage).

(Figure: clusters C1 and C2.)
22
Clustering
Dissimilarity between two clusters: distance
between the two cluster means (centroid linkage).

(Figure: clusters C1 and C2.)
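The four cluster-to-cluster dissimilarities from slides 19 to 22 can be compared side by side. This is a minimal sketch assuming Euclidean distance between members; the function names and the two sample clusters are hypothetical.

```python
import math

def euclidean(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def single_linkage(c1, c2):
    """Minimum dissimilarity between members of the two clusters."""
    return min(euclidean(p, q) for p in c1 for q in c2)

def complete_linkage(c1, c2):
    """Maximum dissimilarity between members of the two clusters."""
    return max(euclidean(p, q) for p in c1 for q in c2)

def average_linkage(c1, c2):
    """Average distance over all pairs, one object from each cluster."""
    total = sum(euclidean(p, q) for p in c1 for q in c2)
    return total / (len(c1) * len(c2))

def centroid_linkage(c1, c2):
    """Distance between the two cluster means."""
    def mean(cluster):
        return [sum(col) / len(cluster) for col in zip(*cluster)]
    return euclidean(mean(c1), mean(c2))

# Illustrative 2-D clusters (an assumption, not from the slides).
c1 = [(0.0, 0.0), (0.0, 4.0)]
c2 = [(3.0, 0.0), (6.0, 4.0)]

for name, fn in [("single", single_linkage), ("complete", complete_linkage),
                 ("average", average_linkage), ("centroid", centroid_linkage)]:
    print(name, fn(c1, c2))
```

On the same pair of clusters the four criteria generally give different values, which is why the choice of criterion changes which pair is merged next.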
23
My Idea Presentation
24
Future Work
• Step 1: Use a simple hierarchical algorithm with
moment features to run and evaluate clustering
results.
• Step 2: Find good features for clustering on
our dataset by trying some feature variants
(Haar-like, shape quantization, ...).
• Step 3: Choose an optimal hierarchical clustering
algorithm.