Transcript and Presenter's Notes

Title: Clustering and Self-Organizing Feature Map


1
Clustering and Self-Organizing Feature Map
  • KAIST

2
Clustering
3
Introduction
  • Cluster
  • A group of similar objects
  • Clustering
  • A special method of classification
  • Unsupervised learning: no predefined classes

4
What is Good Clustering?
  • High intra-cluster similarity
  • Objects within the same cluster are similar to one
    another
  • Low inter-cluster similarity
  • Objects are dissimilar to those in other clusters
  • → Depends on the similarity measure used

5
The problem of unsupervised clustering
  • Nearly identical to that of distribution
    estimation for classes with multi-modal features

Example of 4 data sets with the same mean and
covariance
6
Similarity Measures
  • Most obvious measure: the distance between samples
  • If the Euclidean distance between two samples is less
    than some threshold distance d0, they are placed in the
    same cluster
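In symbols, the rule above places two samples x and x' in the
same cluster whenever

    ||x − x'|| < d0

so the resulting grouping depends directly on the choice of d0.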

7
Scaling Axes Effect
8
Normalization
  • To achieve invariance to scaling, normalize the data
  • Subtract the mean and divide by the standard deviation
    (a sketch is given below)
  • Inappropriate if the spread is due to the presence of
    subclasses
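A minimal NumPy sketch of this per-feature standardization (the
function name standardize and the toy data are illustrative, not
from the slides):

    import numpy as np

    def standardize(X):
        """Z-score each feature: subtract the mean, divide by the std."""
        mean = X.mean(axis=0)
        std = X.std(axis=0)
        return (X - mean) / std

    # Example: features with very different scales become comparable.
    X = np.array([[1.0, 100.0], [2.0, 200.0], [3.0, 300.0]])
    print(standardize(X))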

9
Similarity function
  • Similarity function between vectors s(x, x')
  • When the angle between two vectors is a meaningful
    measure of their similarity, the normalized inner
    product is an appropriate similarity function
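Written out, this normalized inner product is the cosine of the
angle between x and x':

    s(x, x') = xT x' / (||x|| ||x'||)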

10
Tanimoto coefficient
  • For binary-valued features
  • The ratio of the number of attributes shared by x and
    x' to the number possessed by x or x'
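For binary-valued vectors this ratio is commonly written as

    s(x, x') = xT x' / (xT x + x'T x' − xT x')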

11
Distance between Sets
12
Criterion function for clustering
  • Sum-of-squared-error criterion
  • Also called the minimum variance partition
  • Problematic when the natural groupings have very
    different numbers of points
  • General form
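Written out, the sum-of-squared-error criterion is (m_i is the
mean of the n_i samples in cluster D_i):

    J_e = Σ_{i=1..k} Σ_{x ∈ D_i} ||x − m_i||²,   m_i = (1/n_i) Σ_{x ∈ D_i} x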

13
Category of Clustering Method
  • Iterative optimization
  • Move a randomly selected point to another cluster if
    doing so improves the criterion
  • Hierarchical clustering
  • Group objects into a tree of clusters
  • AGNES (Agglomerative Nesting)
  • DIANA (Divisive Analysis)
  • Partitioning clustering
  • Construct a partition of the object set V into k
    clusters (k is a user input parameter)
  • K-means
  • K-medoids

14
Hierarchical Clustering
  • A sequence of partitions of N samples into C clusters
  • 1. Start with N clusters
  • 2. Merge the nearest two clusters to obtain N-1
    clusters in total
  • 3. Repeat until the desired number of clusters C is
    reached
  • Dendrogram
  • Agglomerative: bottom up
  • Divisive: top down

15
Hierarchical Method
agglomerative (AGNES)
divisive (DIANA)
16
Hierarchical Method
  • Algorithm for the agglomerative method (a sketch in
    code follows this list)
  • Input: set V of objects
  • Put each object in its own cluster
  • Loop until the number of clusters is one
  • Calculate the set of inter-cluster similarities
  • Merge the most similar pair of current clusters
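A minimal Python sketch of this agglomerative loop, using the
single-linkage distance of the next slides as the inter-cluster
similarity (agnes, single_linkage, and the toy data are
illustrative names and values, not from the slides):

    import numpy as np

    def single_linkage(ci, cj, X):
        """Distance between two clusters = minimum pairwise distance."""
        return min(np.linalg.norm(X[a] - X[b]) for a in ci for b in cj)

    def agnes(X, target_clusters=1):
        """Agglomerative nesting: start with singletons, merge the closest pair."""
        clusters = [[i] for i in range(len(X))]
        history = [list(map(tuple, clusters))]
        while len(clusters) > target_clusters:
            # Find the most similar (closest) pair of current clusters.
            i, j = min(
                ((a, b) for a in range(len(clusters)) for b in range(a + 1, len(clusters))),
                key=lambda p: single_linkage(clusters[p[0]], clusters[p[1]], X),
            )
            clusters[i] = clusters[i] + clusters[j]
            del clusters[j]
            history.append(list(map(tuple, clusters)))
        return history

    X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
    for step in agnes(X, target_clusters=2):
        print(step)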

17
Hierarchical Method
  • Inter-cluster similarity measures (written out below)
  • Single linkage
  • Complete linkage
  • Average linkage
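For two clusters Ci and Cj these linkages correspond to the
following inter-cluster distances (standard definitions, matching
the dmin used on the next slide):

    dmin(Ci, Cj) = min ||x − x'||   over x ∈ Ci, x' ∈ Cj
    dmax(Ci, Cj) = max ||x − x'||   over x ∈ Ci, x' ∈ Cj
    davg(Ci, Cj) = (1 / (ni nj)) Σ_{x ∈ Ci} Σ_{x' ∈ Cj} ||x − x'||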

18
Nearest-neighbor algorithm
  • When dmin is used, the procedure is called the
    nearest-neighbor algorithm
  • If it is terminated when the distance between the
    nearest clusters exceeds an arbitrary threshold, it is
    called the single-linkage algorithm
  • Generates a minimal spanning tree
  • Chaining effect: a defect of this distance measure
    (right figure)

19
K-means
  • Uses the center of gravity (mean) of the objects in a
    cluster
  • Algorithm (a sketch in code follows this list)
  • Input: k (the number of clusters), set V of n objects
  • Output: a set of k clusters that minimizes the
    sum-of-squared-error criterion
  • Method
  • Choose k objects as the initial cluster centers,
    set i = 0
  • Loop
  • For each object p
  • Find the NearestCenter(i)(p) and assign p to it
  • Compute the mean of each cluster as its new center
  • Pro: quick convergence
  • Con: sensitive to noise, outliers, and initial seed
    selection
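A minimal NumPy sketch of the k-means loop above (kmeans,
max_iter, and the toy data are illustrative, not fixed by the
slides):

    import numpy as np

    def kmeans(X, k, max_iter=100, seed=0):
        """Lloyd-style k-means: assign to the nearest center, recompute means."""
        rng = np.random.default_rng(seed)
        centers = X[rng.choice(len(X), size=k, replace=False)]
        for _ in range(max_iter):
            # Assign each object to the cluster with the nearest center.
            dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
            labels = dists.argmin(axis=1)
            # Recompute each center as the mean of its assigned objects.
            new_centers = np.array([
                X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
                for j in range(k)
            ])
            if np.allclose(new_centers, centers):
                break
            centers = new_centers
        return labels, centers

    X = np.vstack([np.random.randn(50, 2), np.random.randn(50, 2) + 5])
    labels, centers = kmeans(X, k=2)
    print(centers)

Because the initial centers are chosen at random, different seeds
can converge to different partitions; this is the sensitivity
illustrated on the next slide.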

20
K-means is sensitive to the initial points
21
K-means clustering
  • Choose k objects from V as the initial centers

22
K-means clustering
  • Assign each object to the cluster to which it is
    closest
  • Compute the center of each cluster

23
K-means clustering
  • Reassign each object to the cluster whose centroid is
    nearest

24
Graph Theoretic Approach
  • Removal of inconsistent edges

25
Self-Organizing Feature Maps
  • A clustering method based on competitive learning
  • Only one neuron per group fires at any one time
  • Winner-takes-all: the firing neuron is the winning
    neuron
  • Winner-takes-all behavior is enforced by lateral
    inhibitory connections

26
Self-Organizing Feature Map
  • Neurons are placed at the nodes of a lattice
  • usually one- or two-dimensional
  • Neurons become selectively tuned to input patterns
    (stimuli)
  • by a competitive learning process
  • The locations of the tuned neurons become ordered
  • Formation of a topographic map of the input patterns
  • Spatial locations of neurons in the lattice reflect
    intrinsic statistical features contained in the input
    patterns
  • SOM is a non-linear generalization of PCA

27
SOFM motivated by human brain
  • The brain is organized in such a way that different
    sensory data are represented by topologically ordered
    computational maps
  • Tactile, visual, and acoustic sensory inputs are mapped
    onto areas of the cerebral cortex in a topologically
    ordered manner
  • A building block of the information-processing
    infrastructure of the nervous system

28
SOFM motivated by human brain
  • Neurons transform input signals into a place-coded
    probability distribution
  • sites of maximum relative activity within the map
  • accessed by higher-order processors through simple
    connections
  • each piece of incoming information is kept in its
    proper context
  • Neurons dealing with closely related information are
    kept close together, so they can be connected via short
    connections

29
Kohonen Model
  • Captures the essential features of computational maps
    in the brain
  • Capable of dimensionality reduction

30
Kohonen Model
  • Transforms an incoming signal pattern into a discrete
    map
  • of one or two dimensions
  • in an adaptively, topologically ordered fashion
  • A topology-preserving transformation
  • A class of vector-coding algorithms
  • optimally maps the input onto a fixed number of code
    words
  • An input pattern is represented as a localized region
    or spot of activity in the network
  • After initialization, three essential processes
  • competition
  • cooperation
  • synaptic adaptation

31
Competitive Process
  • Find the best match of the input vector with the
    synaptic weight vectors
  • x = [x1, x2, ..., xm]T
  • wj = [wj1, wj2, ..., wjm]T,  j = 1, 2, ..., l
  • Best-matching (winning) neuron
  • i(x) = arg minj ||x − wj||,  j = 1, 2, ..., l
  • Determines the location where the topological
    neighborhood of excited neurons is to be centered
  • The continuous input space is mapped onto a discrete
    output space of neurons by the competitive process

32
Cooperative Process
  • For a winning neuron, the neurons in its immediate
    neighborhood are excited more than those farther away
  • The topological neighborhood decays smoothly with
    lateral distance
  • Symmetric about the maximum point defined by dj,i = 0
  • Monotonically decreasing to zero as dj,i → ∞
  • Neighborhood function: Gaussian case (see the form
    below)
  • The size of the neighborhood shrinks with time
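For the Gaussian case named above, the neighborhood function is
commonly written as (dj,i is the lateral distance between neuron
j and the winning neuron i, and σ(n) is the neighborhood width,
which shrinks with time, e.g. σ(n) = σ0 exp(−n / τ)):

    hj,i(x)(n) = exp( − d²j,i / (2 σ²(n)) )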

33
Typical window function
34
Adaptive process
  • The synaptic weight vector is changed in relation to
    the input vector
  • wj(n+1) = wj(n) + η(n) hj,i(x)(n) (x − wj(n))
  • applied to all neurons inside the neighborhood of the
    winning neuron i
  • has the effect of moving the weight wj toward the input
    vector x
  • upon repeated presentation of the training data, the
    weights tend to follow the input distribution
  • The learning rate η(n) may decay with time

35
SOFM algorithm
  • 1. Initialize the weights wj with random numbers
  • 2. For input x(n), find the nearest cell
  • i(x) = arg minj ||x(n) − wj(n)||
  • 3. Update the weights of the neighbors
  • wj(n+1) = wj(n) + η(n) hj,i(x)(n) (x(n) − wj(n))
  • 4. Reduce the neighborhood size and η
  • 5. Go to 2 (a sketch in code is given below)
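A minimal NumPy sketch of steps 1-5 for the 10 x 10 lattice
simulation of the next slide (the grid size, the decay schedules,
and the name train_som are illustrative assumptions, not fixed by
the slides):

    import numpy as np

    def train_som(X, grid=(10, 10), n_iter=5000, eta0=0.1, sigma0=3.0, seed=0):
        """Train a 2-D SOM: find the winner, then pull its neighbors toward x."""
        rng = np.random.default_rng(seed)
        rows, cols = grid
        # Step 1: random initial weights, one m-dimensional vector per lattice node.
        W = rng.random((rows, cols, X.shape[1]))
        # Lattice coordinates used for the neighborhood distance d_{j,i}.
        coords = np.stack(np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1)
        for n in range(n_iter):
            x = X[rng.integers(len(X))]
            # Step 2: competitive process - nearest (winning) neuron i(x).
            dists = np.linalg.norm(W - x, axis=2)
            winner = np.unravel_index(dists.argmin(), dists.shape)
            # Step 4: learning rate and neighborhood width decay with time.
            eta = eta0 * np.exp(-n / n_iter)
            sigma = sigma0 * np.exp(-n / n_iter)
            # Step 3: cooperative + adaptive process - Gaussian neighborhood update.
            d2 = ((coords - np.array(winner)) ** 2).sum(axis=-1)
            h = np.exp(-d2 / (2 * sigma ** 2))
            W += eta * h[..., None] * (x - W)
        return W

    # Input samples: random points in the 2-D unit square, as on the simulation slide.
    X = np.random.rand(1000, 2)
    W = train_som(X)
    print(W.shape)  # (10, 10, 2): each neuron's weight is its position in input space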

36
Computer Simulation
  • Input samples: random numbers within the 2-D unit
    square
  • 100 neurons (10 x 10)
  • Initial weights: random assignment in (0.0, 1.0)
  • Display
  • each neuron is positioned at (w1, w2)
  • neighbors are connected by lines
  • next slide
  • 2nd example: Figures 9.8 and 9.9

37
SOFM Example (1): 2-D lattice by a 2-D distribution
38
(No Transcript)
39
Topologically ordered map development (2)
40
Topologically ordered map development (3)
41
Topologically ordered map development (5)
42
Topologically ordered map development (1D array
of Neurons)
43
SOFM Example (2): Phoneme Recognition
  • Phonotopic maps
  • Recognition result for the word "humppila"

44
SOFM Example (3)
  • http://www-ti.informatik.uni-tuebingen.de/goeppert/KohonenApp/KohonenApp.html
  • http://davis.wpi.edu/matt/courses/soms/applet.html

45
Summary of SOM
  • A continuous input space of activation patterns that
    are generated in accordance with a certain probability
    distribution
  • A topology of the network in the form of a lattice of
    neurons, which defines a discrete output space
  • A time-varying neighborhood function defined around the
    winning neuron
  • A learning rate that decreases gradually with time, but
    never goes to zero

46
Vector Quantization
  • VQ: a data compression technique
  • The input space is divided into distinct regions
  • reproduction vector, representative vector
  • code words, code book
  • Voronoi quantizer
  • nearest-neighbor rule based on the Euclidean metric
  • Learning Vector Quantization (LVQ)
  • a supervised learning technique
  • moves the Voronoi vectors slightly in order to improve
    the quality of the classification decision

47
Voronoi Tessellation
48
Learning Vector Quantization
  • Suppose wc is the Voronoi vector closest to the input
    vector xi
  • Let Cwc be the class of wc
  • Let Cxi be the class label of xi
  • If Cwc = Cxi, then
  • wc(n+1) = wc(n) + αn [xi − wc(n)]
  • otherwise
  • wc(n+1) = wc(n) − αn [xi − wc(n)]
  • The other Voronoi vectors are not modified
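A minimal sketch of one LVQ update under the rule above (lvq_step
and the toy vectors are illustrative, not from the slides):

    import numpy as np

    def lvq_step(W, W_classes, x, x_class, alpha):
        """One LVQ update: move the closest Voronoi vector toward x if the
        classes agree, away from x otherwise; all other vectors are unchanged."""
        c = np.argmin(np.linalg.norm(W - x, axis=1))  # closest Voronoi vector
        sign = 1.0 if W_classes[c] == x_class else -1.0
        W[c] += sign * alpha * (x - W[c])
        return W

    # Toy usage: two code vectors with known class labels.
    W = np.array([[0.0, 0.0], [1.0, 1.0]])
    W_classes = np.array([0, 1])
    W = lvq_step(W, W_classes, x=np.array([0.2, 0.1]), x_class=0, alpha=0.1)
    print(W)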

49
Adaptive Pattern Classification
  • A combination of feature extraction and classification
  • Feature extraction
  • unsupervised, by SOFM
  • captures the essential information content of the input
    data
  • data reduction / dimension reduction effect
  • Classification
  • a supervised scheme such as an MLP