# Clustering and Self-Organizing Feature Map

1
Clustering and Self-Organizing Feature Map
• KAIST

2
Clustering
3
Introduction
• Cluster
• A group of similar objects
• Clustering
• A special method of classification
• Unsupervised learning: no predefined classes

4
What is Good Clustering?
• High intra-cluster similarity
• Objects are similar to one another within the same cluster
• Low inter-cluster similarity
• Objects are dissimilar to the objects in other clusters
• → Both depend on the similarity measure used

5
The problem of unsupervised clustering
• Nearly identical to that of distribution
estimation for classes with multi-modal features

Example of 4 data sets with the same mean and
covariance
6
Similarity Measures
• The distance between samples
• If the Euclidean distance between two samples is less
than some threshold distance d0, they are placed in the
same cluster (see below)
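
Stated as a rule, this threshold test is simply:

```latex
\|x - x'\| = \sqrt{(x - x')^{T}(x - x')} < d_0
\;\Rightarrow\; x \text{ and } x' \text{ belong to the same cluster}
```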

7
Scaling Axes Effect
8
Normalization
• To achieve invariance, normalize the data
• Subtract the mean and divide by the standard
deviation
• Inappropriate if the spread is due to the
presence of subclasses (see the sketch below)
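
A minimal sketch of this normalization in Python (NumPy assumed; feature-wise z-scoring):

```python
import numpy as np

def standardize(X):
    """Z-score normalization: subtract the per-feature mean,
    divide by the per-feature standard deviation."""
    return (X - X.mean(axis=0)) / X.std(axis=0)
```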

9
Similarity function
• Similarity function between vectors: s(x, x′)
• Based on the angle between two vectors, the normalized
inner product may be an appropriate similarity
function (shown below)
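
The normalized inner product is the cosine of the angle between the two vectors:

```latex
s(x, x') = \frac{x^{T} x'}{\|x\| \, \|x'\|}
```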

10
Tanimoto coefficient
• For binary-valued features
• The ratio of the number of shared attributes to the
number of attributes possessed by x or x′ (formula below)
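
For binary vectors, with x^T x′ counting the shared attributes, this ratio is:

```latex
s(x, x') = \frac{x^{T} x'}{x^{T} x + x'^{T} x' - x^{T} x'}
```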

11
Distance between Sets
12
Criterion function for clustering
• Sum-of-squared-error criterion
• Also called the minimum-variance partition
• Problematic when natural groupings have very
different numbers of points
• General form (see below)
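
The criterion the slide refers to, with m_i the mean of cluster D_i containing n_i points:

```latex
J_e = \sum_{i=1}^{k} \sum_{x \in D_i} \|x - m_i\|^{2},
\qquad m_i = \frac{1}{n_i} \sum_{x \in D_i} x
```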

13
Category of Clustering Method
• Iterative optimization
• Move a randomly selected point to another cluster if
doing so improves the criterion
• Hierarchical clustering
• Group objects into a tree of clusters
• AGNES (Agglomerative Nesting)
• DIANA (Divisive Analysis)
• Partitioning clustering
• Construct a partition of the object set V into k
clusters (k is a user input parameter)
• K-means
• K-medoids

14
Hierarchical Clustering
• Sequence of partitions of N samples into C
clusters
• 1. Start with N clusters (one per sample)
• 2. Merge the nearest two clusters to make N−1
clusters in total
• 3. Repeat until the number of clusters is C
• Dendrogram
• Agglomerative: bottom-up
• Divisive: top-down

15
Hierarchical Method
agglomerative (AGNES)
divisive (DIANA)
16
Hierarchical Method
• Algorithm for agglomerative clustering (see the sketch
after this list)
• Input: set V of objects
• Put each object in its own cluster
• Loop until the number of clusters is one
• Calculate the set of inter-cluster similarities
• Merge the most similar pair of current clusters
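
A minimal Python sketch of the loop above, assuming the single-link (dmin) merge rule defined on the next slides; names are illustrative:

```python
import numpy as np

def agglomerative(V, num_clusters=1):
    """Single-link (dmin) agglomerative clustering -- a sketch.
    V: (n, d) array of objects; returns clusters as lists of indices."""
    clusters = [[i] for i in range(len(V))]            # one cluster per object
    while len(clusters) > num_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # dmin: distance between the closest members of the two clusters
                d = min(np.linalg.norm(V[a] - V[b])
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i].extend(clusters.pop(j))            # fuse most similar pair
    return clusters
```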

17
Hierarchical Method
• Similarity measures between clusters (reconstructed below)
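
The slide's formulas did not survive the transcript; the standard inter-cluster distances it refers to (dmin is used on the next slide) are:

```latex
\begin{aligned}
d_{\min}(D_i, D_j)  &= \min_{x \in D_i,\, x' \in D_j} \|x - x'\| \\
d_{\max}(D_i, D_j)  &= \max_{x \in D_i,\, x' \in D_j} \|x - x'\| \\
d_{\mathrm{avg}}(D_i, D_j) &= \frac{1}{n_i n_j} \sum_{x \in D_i} \sum_{x' \in D_j} \|x - x'\| \\
d_{\mathrm{mean}}(D_i, D_j) &= \|m_i - m_j\|
\end{aligned}
```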

18
Nearest-neighbor algorithm
• When dmin is used, the algorithm is called the
nearest-neighbor algorithm
• If it is terminated when the distance between the
nearest clusters exceeds an arbitrary threshold,
• the algorithm generates a minimal spanning tree
• Chaining effect: a defect of this distance measure
(figure, right)

19
K-means
• Use the gravity center (mean) of the objects
• Algorithm (see the sketch after this list)
• Input: k (the number of clusters), set V of n
objects
• Output: a set of k clusters that minimizes the
sum-of-squared-error criterion
• Method
• Choose k objects as the initial cluster centers
• Loop until assignments no longer change
• For each object p, find the nearest center and
assign p to that cluster
• Compute the mean of each cluster as its new center
• Pro: quick convergence
• Con: sensitive to noise, outliers, and the initial
seed selection
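
A minimal NumPy sketch of the algorithm above (random initial seeds; function and parameter names are illustrative):

```python
import numpy as np

def kmeans(V, k, n_iter=100, seed=0):
    """Plain k-means -- a sketch. V: (n, d) array; returns (centers, labels)."""
    rng = np.random.default_rng(seed)
    centers = V[rng.choice(len(V), size=k, replace=False)]   # initial seeds
    for _ in range(n_iter):
        # assign every object to its nearest center
        labels = np.argmin(np.linalg.norm(
            V[:, None, :] - centers[None, :, :], axis=2), axis=1)
        # recompute each center as the mean of its cluster
        new_centers = np.array([
            V[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)])
        if np.allclose(new_centers, centers):                # converged
            break
        centers = new_centers
    return centers, labels
```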

20
K-means is sensitive to the initial points
21
K-means clustering
• Choose k objects of V as initial centers

22
K-means clustering
• Assign each object to the cluster whose center is
closest
• Compute the center of each cluster

23
K-means clustering
• Reassign objects to the cluster whose centroid
is nearest

24
Graph Theoretic Approach
• Build the minimal spanning tree of the data, then remove
inconsistent edges (edges much longer than their neighbors)
to split the graph into clusters

25
Self-Organizing Feature Maps
• Clustering method based on competitive learning
• Only one neuron per group fires at any one time
• winner-takes-all; the firing cell is the winning neuron
• winner-takes-all is realized by lateral inhibitory
connections

26
Self-Organizing Feature Map
• Neurons are placed at the nodes of a lattice
• usually one- or two-dimensional
• Neurons become selectively tuned to input
patterns (stimuli)
• by a competitive learning process
• The locations of the tuned neurons become ordered
• formation of a topographic map of the input patterns
• Spatial locations of neurons in the lattice reflect
intrinsic statistical features contained in the input
patterns
• SOM is a non-linear generalization of PCA

27
SOFM motivated by human brain
• The brain is organized in such a way that different
sensory data are represented by topologically
ordered computational maps
• tactile, visual, and acoustic sensory inputs are
mapped onto areas of the cerebral cortex in a
topologically ordered manner
• such maps are a building block of the information
processing infrastructure of the nervous system

28
SOFM motivated by human brain
• Neurons transform input signals into a
place-coded probability distribution
• sites of maximum relative activity within the
map
• accessed by higher-order processors with simple
connections
• each piece of incoming information is kept in its
proper context
• Neurons dealing with closely related information are
close together, so that they can be connected via short
connections

29
Kohonen Model
• Captures the essential features of computational maps
in the brain
• capable of dimensionality reduction

30
Kohonen Model
• Transform an incoming signal pattern into a discrete
map
• of one or two dimensions
• Topology-preserving transformation
• a class of vector coding algorithms
• optimally maps the input onto a fixed number of code
words
• an input pattern is represented as a localized
region or spot of activity in the network
• After initialization, three essential processes
• competition
• cooperation
• adaptation (the synaptic weight update)

31
Competitive Process
• Find the best match of the input vector with the
synaptic weight vectors
• x = [x1, x2, ..., xm]^T
• wj = [wj1, wj2, ..., wjm]^T, j = 1, 2, ..., l
• Best matching (winning) neuron:
• i(x) = arg min_j ||x − wj||, j = 1, 2, ..., l
• Determines the location where the topological
neighborhood of excited neurons is to be centered
• The continuous input space is mapped onto the discrete
output space of neurons by the competitive process

32
Cooperative Process
• For a winning neuron, the neurons in its
immediate neighborhood are excited more than those
farther away
• the topological neighborhood decays smoothly with
lateral distance
• symmetric about the maximum point defined by d_{j,i} = 0
• monotonically decreasing to zero as d_{j,i} → ∞
• Neighborhood function: typically Gaussian (see below)
• The size of the neighborhood shrinks with time
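
The Gaussian neighborhood the slide refers to, in its standard form, with a width σ(n) that shrinks over time:

```latex
h_{j,i(x)}(n) = \exp\!\left(-\frac{d_{j,i}^{2}}{2\sigma^{2}(n)}\right),
\qquad \sigma(n) = \sigma_0 \exp\!\left(-\frac{n}{\tau_1}\right)
```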

33
Typical window function
34
• The synaptic weight vector is changed in relation
to the input vector:
• w_j(n+1) = w_j(n) + η(n) h_{j,i(x)}(n) [x − w_j(n)]
• applied to all neurons inside the neighborhood of the
winning neuron i
• has the effect of moving the weight w_j toward the input
vector x
• upon repeated presentation of the training data, the
weights tend to follow the input distribution
• The learning rate η(n) may decay with time

35
SOFM algorithm
• 1. Initialize the weights w_j with random numbers
• 2. For input x(n), find the nearest cell:
• i(x) = arg min_j ||x(n) − w_j(n)||
• 3. Update the weights of the neighbors:
• w_j(n+1) = w_j(n) + η(n) h_{j,i(x)}(n) [x(n) −
w_j(n)]
• 4. Reduce the neighborhood size and η
• 5. Go to step 2 (a sketch follows below)
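
A compact Python sketch of this loop (NumPy; a 2-D lattice, Gaussian neighborhood, and exponentially decaying rate and radius are assumed, all names illustrative):

```python
import numpy as np

def train_som(X, rows=10, cols=10, n_steps=5000, eta0=0.1, sigma0=5.0, seed=0):
    """Self-organizing feature map -- a sketch of the 5-step algorithm above."""
    rng = np.random.default_rng(seed)
    W = rng.random((rows * cols, X.shape[1]))          # 1. random initial weights
    # lattice coordinates of each neuron, used for lateral distances d_{j,i}
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], float)
    for n in range(n_steps):
        x = X[rng.integers(len(X))]
        i = np.argmin(np.linalg.norm(x - W, axis=1))   # 2. winning neuron i(x)
        eta = eta0 * np.exp(-n / n_steps)              # 4. decaying learning rate
        sigma = sigma0 * np.exp(-n / n_steps)          # 4. shrinking neighborhood
        d2 = np.sum((grid - grid[i]) ** 2, axis=1)     # squared lattice distances
        h = np.exp(-d2 / (2 * sigma ** 2))             # Gaussian neighborhood
        W += eta * h[:, None] * (x - W)                # 3. move weights toward x
    return W
```

With inputs drawn uniformly from the unit square, this should qualitatively reproduce the 10×10-lattice simulation described on the next slide.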

36
Computer Simulation
• Input samples: random numbers within the 2-D unit
square
• 100 neurons (10×10)
• Initial weights: random assignment in (0.0, 1.0)
• Display
• each neuron is positioned at (w1, w2)
• neighbors are connected by lines
• next slide
• 2nd example: Figures 9.8 and 9.9

37
SOFM Example (1): 2-D lattice trained on a 2-D distribution
39
Topologically ordered map development (2)
40
Topologically ordered map development (3)
41
Topologically ordered map development (5)
42
Topologically ordered map development (1D array
of Neurons)
43
SOFM Example (2): Phoneme Recognition
• Phonotopic maps
• Recognition result for the Finnish word "humppila"

44
SOFM Example (3)
• http://www-ti.informatik.uni-tuebingen.de/goeppert/KohonenApp/KohonenApp.html
• http://davis.wpi.edu/matt/courses/soms/applet.html

45
Summary of SOM
• Continuous input space of activation patterns,
generated in accordance with a certain
probability distribution
• Topology of the network in the form of a lattice
of neurons, which defines a discrete output space
• Time-varying neighborhood function defined around the
winning neuron
• Learning rate decreases gradually with time, but
never goes to zero

46
Vector Quantization
• VQ is a data compression technique
• the input space is divided into distinct regions
• each region is represented by a reproduction
(representative) vector
• code words, code book
• Voronoi quantizer
• nearest-neighbor rule on the Euclidean metric
• Learning Vector Quantization (LVQ)
• a supervised learning technique
• moves the Voronoi vectors slightly in order to improve
the quality of the classification decision

47
Voronoi Tessellation
48
Learning Vector Quantization
• Suppose w_c is the Voronoi vector closest to the input x_i
• Let C_{w_c} be the class of w_c
• Let C_{x_i} be the class label of x_i
• If C_{w_c} = C_{x_i}, then
• w_c(n+1) = w_c(n) + α_n [x_i − w_c(n)]
• otherwise
• w_c(n+1) = w_c(n) − α_n [x_i − w_c(n)]
• The other Voronoi vectors are not modified
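
A minimal sketch of this update rule in Python (a codebook W with per-vector class labels is assumed; names are illustrative):

```python
import numpy as np

def lvq_step(W, labels, x, y, alpha):
    """One LVQ update: attract the nearest code vector if its class
    matches the label y of input x, otherwise repel it."""
    c = np.argmin(np.linalg.norm(W - x, axis=1))   # closest Voronoi vector w_c
    sign = 1.0 if labels[c] == y else -1.0
    W[c] += sign * alpha * (x - W[c])              # other vectors unchanged
    return W
```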
