Clustering and Self-Organizing Feature Map (transcript of a 49-slide PowerPoint presentation, provided by aiKai)
1
Clustering and Self-Organizing Feature Map
  • KAIST

2
Clustering
3
Introduction
  • Cluster
  • A group of similar objects
  • Clustering
  • A special method of classification
  • Unsupervised learning: no predefined classes

4
What is Good Clustering?
  • High intra-cluster similarity
  • Objects within the same cluster are similar to one another
  • Low inter-cluster similarity
  • Objects are dissimilar to those in other clusters
  • → Depends on the similarity measure

5
The problem of unsupervised clustering
  • Nearly identical to that of distribution
    estimation for classes with multi-modal features

Example of 4 data sets with the same mean and
covariance
6
Similarity Measures
  • The distance between two samples
  • If the Euclidean distance between them is less
    than some threshold distance d0, they belong to
    the same cluster

7
Scaling Axes Effect
8
Normalization
  • To achieve invariance, normalize the data
  • Subtract the mean and divide by the standard
    deviation
  • Inappropriate if the spread is due to the
    presence of subclasses
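A minimal sketch of this normalization (z-scoring a single 1-D feature; the function name `zscore` is illustrative):

```python
import statistics

def zscore(values):
    """Normalize values to zero mean and unit standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)  # population standard deviation
    return [(v - mean) / sd for v in values]

normalized = zscore([2.0, 4.0, 6.0])  # zero mean, unit spread
```

For multi-dimensional data, the same operation is applied per feature.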

9
Similarity function
  • Similarity function between vectors: s(x, x′)
  • Using the angle between two vectors, the normalized
    inner product s(x, x′) = xᵀx′ / (‖x‖ ‖x′‖) may be an
    appropriate similarity function.

10
Tanimoto coefficient
  • Uses binary-valued attributes
  • The ratio of the number of attributes shared by x
    and x′ to the number possessed by x or x′
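A sketch of the coefficient for binary attribute vectors:

```python
def tanimoto(a, b):
    """Tanimoto coefficient for binary attribute vectors a, b:
    (attributes shared by both) / (attributes possessed by either)."""
    shared = sum(1 for x, y in zip(a, b) if x and y)
    total = sum(1 for x, y in zip(a, b) if x or y)
    return shared / total if total else 1.0

print(tanimoto([1, 1, 0, 1], [1, 0, 0, 1]))  # 2 shared of 3 possessed
```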

11
Distance between Sets
12
Criterion function for clustering
  • Sum-of-squared-error criterion
  • Also called the minimum-variance partition
  • Problematic when natural groupings have very
    different numbers of points
  • General form: Je = Σi Σ_{x in Di} ‖x - mi‖²,
    where mi is the mean of cluster Di
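A sketch of the sum-of-squared-error criterion (1-D points for simplicity, with each cluster represented by its mean):

```python
def sse(clusters):
    """Sum-of-squared-error: total squared distance of every point
    to the mean of its own cluster (1-D points)."""
    total = 0.0
    for points in clusters:
        m = sum(points) / len(points)           # cluster mean
        total += sum((p - m) ** 2 for p in points)
    return total

print(sse([[1.0, 3.0], [10.0, 12.0]]))  # 2.0 + 2.0 = 4.0
```

A partition with lower Je has tighter (lower-variance) clusters.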

13
Categories of Clustering Methods
  • Iterative optimization
  • Move a randomly selected point to another cluster
    if the move improves the criterion
  • Hierarchical clustering
  • Group objects into a tree of clusters
  • AGNES (Agglomerative Nesting)
  • DIANA (Divisive Analysis)
  • Partitioning clustering
  • Construct a partition of object set V into k
    clusters (k is a user input parameter)
  • K-means
  • K-medoids

14
Hierarchical Clustering
  • Sequence of partitions of N samples into C
    clusters
  • 1. Start with N clusters (one per sample)
  • 2. Merge the nearest two clusters, leaving N-1
    clusters in total
  • 3. Repeat until C clusters remain
  • Dendrogram
  • Agglomerative: bottom-up
  • Divisive: top-down

15
Hierarchical Method
agglomerative (AGNES)
divisive (DIANA)
16
Hierarchical Method
  • Algorithm for agglomerative clustering
  • Input: set V of objects
  • Put each object in its own cluster
  • Loop until the number of clusters is one:
  • Calculate the set of inter-cluster similarities
  • Merge the most similar pair of current clusters
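The agglomerative loop above can be sketched in a few lines (single-linkage distance on 1-D points; stopping at k clusters rather than one, for illustration):

```python
def agglomerative(points, k):
    """Agglomerative clustering: start with singletons, repeatedly
    fuse the closest pair (single linkage) until k clusters remain."""
    clusters = [[p] for p in points]          # one cluster per object
    def dist(a, b):                           # single-linkage distance
        return min(abs(x - y) for x in a for y in b)
    while len(clusters) > k:
        # find the most similar (closest) pair of current clusters
        i, j = min(((i, j) for i in range(len(clusters))
                           for j in range(i + 1, len(clusters))),
                   key=lambda ij: dist(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] += clusters.pop(j)        # fuse the pair
    return clusters

print(agglomerative([1.0, 1.2, 5.0, 5.1, 9.0], 3))
```

Recording each fusion and the distance at which it occurs yields the dendrogram of the previous slide.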

17
Hierarchical Method
  • Linkage (inter-cluster similarity) methods
  • Single-Linkage
  • Complete-Linkage
  • Average-Linkage

18
Nearest-neighbor algorithm
  • When dmin is used, the procedure is called the
    nearest-neighbor algorithm
  • If it terminates when the distance between the
    nearest clusters exceeds an arbitrary threshold,
    it is called the single-linkage algorithm
  • Generates a minimal spanning tree
  • Chaining effect: a defect of this distance measure
    (right)

19
K-means
  • Uses the center of gravity (mean) of the objects
    in a cluster
  • Algorithm
  • Input: k (the number of clusters), set V of n
    objects
  • Output: a set of k clusters that minimizes the
    sum-of-squared-error criterion
  • Method
  • Choose k objects as the initial cluster centers;
    set i = 0
  • Loop:
  • For each object v, find the nearest center and
    assign v to it
  • Compute the mean of each cluster as its new center
  • Pro: quick convergence
  • Con: sensitive to noise, outliers, and initial
    seed selection
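A minimal sketch of the method (1-D objects, the first k objects as initial centers, and a fixed iteration cap standing in for a convergence test):

```python
def kmeans(points, k, iters=20):
    """Minimal 1-D k-means: assign each point to its nearest center,
    then recompute each center as the mean of its cluster."""
    centers = points[:k]                      # initial cluster centers
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:                      # assignment step
            nearest = min(range(k), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]   # update step
    return centers, clusters

centers, clusters = kmeans([1.0, 1.2, 5.0, 5.1, 9.0], 2)
```

Rerunning with different initial centers can give a different final partition, which is the sensitivity to seed selection noted above.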

20
K-means is sensitive to the initial points
21
K-means clustering
  • Choose k for V

22
K-means clustering
  • Assign each object to the cluster to which it is
    the closest
  • Compute the center of each cluster

23
K-means clustering
  • Reassign objects to the cluster whose centroid
    is nearest

24
Graph Theoretic Approach
  • Removal of inconsistent edges

25
Self-Organizing Feature Maps
  • Clustering method based on competitive learning
  • only one neuron per group fires at any one time
  • winner-takes-all: the winning neuron
  • winner-takes-all is realized by lateral inhibitory
    connections

26
Self-Organizing Feature Map
  • Neurons are placed at the nodes of a lattice
  • one or two dimensions
  • Neurons become selectively tuned to input
    patterns (stimuli)
  • by a competitive learning process
  • The locations of the tuned neurons become ordered
  • formation of a topographic map of the input patterns
  • Spatial locations of neurons in the lattice →
    intrinsic statistical features contained in the
    input patterns
  • SOM is a non-linear generalization of PCA

27
SOFM motivated by the human brain
  • The brain is organized in such a way that
    different sensory data are represented by
    topologically ordered computational maps
  • tactile, visual, and acoustic sensory inputs are
    mapped onto areas of the cerebral cortex in a
    topologically ordered manner
  • building block of the information-processing
    infrastructure of the nervous system

28
SOFM motivated by the human brain
  • Neurons transform input signals into a
    place-coded probability distribution
  • sites of maximum relative activity within the
    map
  • accessed by higher-order processors via simple
    connections
  • each piece of incoming information is kept in its
    proper context
  • Neurons dealing with closely related information
    are close together, so they can connect via short
    connections

29
Kohonen Model
  • Captures essential features of computational maps
    in the brain
  • capable of dimensionality reduction

30
Kohonen Model
  • Transforms an incoming signal pattern into a
    discrete map
  • of 1-D or 2-D
  • in an adaptive, topologically ordered fashion
  • Topology-preserving transformation
  • a class of vector-coding algorithms
  • optimally maps the input onto a fixed number of
    code words
  • an input pattern is represented as a localized
    region or spot of activity in the network
  • After initialization, three essential processes:
  • competition
  • cooperation
  • synaptic adaptation

31
Competitive Process
  • Find the best match of the input vector with the
    synaptic weight vectors
  • x = [x1, x2, ..., xm]^T
  • wj = [wj1, wj2, ..., wjm]^T, j = 1, 2, ..., l
  • Best-matching (winning) neuron:
  • i(x) = arg min_j ‖x - wj‖, j = 1, 2, ..., l
  • Determines the location where the topological
    neighborhood of excited neurons is to be centered
  • the continuous input space is mapped onto a
    discrete output space of neurons by the
    competitive process
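The competition step i(x) = arg min_j ‖x - wj‖ can be written directly (squared Euclidean distance gives the same winner, so the square root is omitted):

```python
def winner(x, w):
    """Competition step: index of the neuron whose weight vector
    is closest to input x (squared Euclidean distance)."""
    return min(range(len(w)),
               key=lambda j: sum((xi - wji) ** 2
                                 for xi, wji in zip(x, w[j])))

print(winner([0.0, 1.0], [[0.0, 0.0], [0.1, 0.9], [1.0, 1.0]]))  # prints 1
```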

32
Cooperative Process
  • For a winning neuron, the neurons in its
    immediate neighborhood are excited more than those
    farther away
  • the topological neighborhood decays smoothly with
    lateral distance
  • Symmetric about the maximum point defined by
    d_{j,i} = 0
  • Monotonically decreasing to zero as d_{j,i} → ∞
  • Neighborhood function, Gaussian case:
    h_{j,i(x)} = exp(-d_{j,i}² / (2σ²))
  • The size of the neighborhood shrinks with time

33
Typical window function
34
Adaptive process
  • The synaptic weight vector is changed in relation
    to the input vector
  • wj(n+1) = wj(n) + η(n) h_{j,i(x)}(n) (x - wj(n))
  • applied to all neurons inside the neighborhood of
    the winning neuron i
  • has the effect of moving the weight vector wj
    toward the input vector x
  • upon repeated presentation of the training data,
    the weights tend to follow the input distribution
  • The learning rate η(n) may decay with time

35
SOFM algorithm
  • 1. Initialize the weights wj with random numbers
  • 2. For input x(n), find the nearest cell:
    i(x) = arg min_j ‖x(n) - wj(n)‖
  • 3. Update the weights of the neighbors:
    wj(n+1) = wj(n) + η(n) h_{j,i(x)}(n) [x(n) - wj(n)]
  • 4. Reduce the neighborhood size and η
  • 5. Go to 2
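Putting steps 1-5 together, a minimal sketch (1-D lattice of neurons, 2-D inputs, Gaussian neighborhood; the linear decay schedules for the learning rate and neighborhood width are illustrative assumptions, not prescribed by the slides):

```python
import math
import random

def train_som(data, n_neurons=10, epochs=200):
    """SOFM loop: competition (nearest weight), cooperation (Gaussian
    neighborhood on a 1-D lattice), adaptation (move toward input)."""
    random.seed(0)
    # step 1: random initial weights in the unit square
    w = [[random.random(), random.random()] for _ in range(n_neurons)]
    for n in range(epochs):
        eta = 0.5 * (1.0 - n / epochs)                        # decaying rate
        sigma = max(1.0, (n_neurons / 2) * (1.0 - n / epochs))  # shrinking width
        x = random.choice(data)
        # step 2: competition -- best-matching (winning) neuron
        win = min(range(n_neurons),
                  key=lambda j: sum((x[d] - w[j][d]) ** 2 for d in range(2)))
        # steps 3-4: cooperation + adaptation over the neighborhood
        for j in range(n_neurons):
            h = math.exp(-((j - win) ** 2) / (2 * sigma ** 2))
            for d in range(2):
                w[j][d] += eta * h * (x[d] - w[j][d])
    return w

data = [[random.random(), random.random()] for _ in range(500)]
weights = train_som(data)
```

After training, neighboring neurons on the lattice hold nearby weight vectors, which is the topological ordering shown in the simulation slides that follow.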

36
Computer Simulation
  • Input samples: random numbers within the 2-D unit
    square
  • 100 neurons (10 x 10)
  • Initial weights: random assignment (0.0-1.0)
  • Display
  • each neuron is positioned at (w1, w2)
  • neighbors are connected by lines
  • next slide
  • 2nd example: Figures 9.8 and 9.9

37
SOFM Example(1) 2-D Lattice by 2-D distribution
38
39
Topologically ordered map development (2)
40
Topologically ordered map development (3)
41
Topologically ordered map development (5)
42
Topologically ordered map development (1D array
of Neurons)
43
SOFM Example (2): Phoneme Recognition
  • Phonotopic maps
  • Recognition result for "humppila"

44
SOFM Example (3)
  • http://www-ti.informatik.uni-tuebingen.de/goeppert/KohonenApp/KohonenApp.html
  • http://davis.wpi.edu/matt/courses/soms/applet.html

45
Summary of SOM
  • Continuous input space of activation patterns,
    generated in accordance with a certain
    probability distribution
  • Topology of the network in the form of a lattice
    of neurons, which defines a discrete output space
  • Time-varying neighborhood function defined around
    the winning neuron
  • Learning rate decreases gradually with time, but
    never goes to zero

46
Vector Quantization
  • VQ: a data-compression technique
  • the input space is divided into distinct regions
  • reproduction vector, representative vector
  • code words, code book
  • Voronoi quantizer
  • nearest-neighbor rule on the Euclidean metric
  • Learning Vector Quantization (LVQ)
  • a supervised learning technique
  • moves the Voronoi vectors slightly in order to
    improve the quality of classification decisions

47
Voronoi Tessellation
48
Learning Vector Quantization
  • Suppose wc is the Voronoi vector closest to the
    input xi
  • Let Cwc be the class of wc
  • Let Cxi be the class label of xi
  • If Cwc = Cxi, then
    wc(n+1) = wc(n) + αn [xi - wc(n)]
  • otherwise
    wc(n+1) = wc(n) - αn [xi - wc(n)]
  • The other Voronoi vectors are not modified
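A sketch of this update rule (scalar codebook vectors for simplicity; the function name `lvq_step` and the learning rate value are illustrative):

```python
def lvq_step(vectors, labels, x, x_label, lr=0.1):
    """One LVQ update: move the closest Voronoi vector toward x when
    its class label matches x's label, away from x otherwise.
    Returns the index of the updated vector."""
    c = min(range(len(vectors)), key=lambda i: abs(x - vectors[i]))
    sign = 1.0 if labels[c] == x_label else -1.0
    vectors[c] += sign * lr * (x - vectors[c])  # others stay unchanged
    return c

codebook = [1.0, 5.0]
classes = ["a", "b"]
lvq_step(codebook, classes, 2.0, "a")  # closest is 1.0; same class, so it moves toward 2.0
```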

49
Adaptive Pattern Classification
  • Combination of feature extraction and
    classification
  • Feature extraction
  • unsupervised, by SOFM
  • captures the essential information content of the
    input data
  • data-reduction / dimension-reduction effect
  • Classification
  • supervised scheme such as an MLP