Study of Topographic and Equiprobable Mapping with Clustering for Fault Classification (PowerPoint transcript)
1
Study of Topographic and Equiprobable Mapping
with Clustering for Fault Classification
  • Ashish Babbar
  • EE645 Final Project

2
Introduction
  • A typical control system consists of four basic
    elements:
  • Dynamic plant
  • Controllers
  • Actuators
  • Sensors
  • Any malfunction in these components can result in
    an unacceptable anomaly in overall system
    performance.
  • Such malfunctions are referred to as faults in a
    control system. The objective of fault detection
    and identification is to detect, isolate, and
    identify these faults so that system performance
    can be recovered.
  • Condition Based Maintenance (CBM) is the process
    of executing repairs when objective evidence
    indicates the need for such actions, i.e., when
    anomalies or faults are detected in a control
    system.

3
Motivation
  • Model-based CBM can be applied when a
    mathematical model of the system to be monitored
    is available.
  • When CBM must be performed using only the data
    available from sensors, data-driven methodologies
    are utilized instead.
  • The SOM is widely used in data mining as a tool
    for exploration and analysis of large amounts of
    data.
  • It can be used for data reduction or vector
    quantization, so that the system data can be
    analyzed for anomalies using only the data
    clusters formed from the trained map instead of
    the large initial data sets.

4
Competitive Learning
  • Assume a sequence of input samples v(t) in a
    d-dimensional input space and a lattice of N
    neurons, labeled i = 1, 2, ..., N, with
    corresponding weight vectors w_i(t) = [w_ij(t)].
  • If v(t) can be compared simultaneously with each
    weight vector of the lattice, then the best
    matching weight, say w_i*, can be determined and
    updated to match the current input even better.
  • As a result of competitive learning, different
    weights become tuned to different regions of the
    input space.

(Figure: input vector v in the input layer compared against the weight vectors of the output-layer lattice; the winning neuron i is highlighted.)
5
Self Organizing Maps
  • The SOM is an unsupervised neural network
    technique which finds application in
  • Density estimation, e.g. for clustering or
    classification purposes
  • Blind source separation
  • Visualization of data sets
  • It projects the input space onto the prototypes
    of a low-dimensional regular grid, which can be
    effectively utilized to visualize and explore
    properties of the data.
  • The SOM consists of a regular, two-dimensional
    grid of map units (neurons).
  • Each unit i is represented by a prototype vector
    w_i(t) = [w_i1(t), ..., w_id(t)], where d is the
    input vector dimension.

6
Self Organizing Maps (Algorithm)
  • Given a data set, the number of map units
    (neurons) is chosen first.
  • The number of map units can be selected to be
    approximately √N to 5√N, where N is the number of
    data samples in the given data set.
  • The SOM is trained iteratively. At each training
    step, a sample vector v is randomly chosen from
    the input data set.
  • Distances between v and all prototype vectors are
    computed. The Best Matching Unit (BMU) or winner,
    denoted here by b, is the map unit whose
    prototype is closest to v:
  • ||v - w_b|| = min_i ||v - w_i||

7
Self Organizing Maps
  • The BMU or winner and its topological neighbors
    are moved closer to the input vector in the input
    space (see the sketch below):
  • w_i(t+1) = w_i(t) + a(t) h_bi(t) [v - w_i(t)]
  • t = time
  • a(t) = adaptation coefficient (learning rate)
  • h_bi(t) = neighborhood kernel centered on the
    winner unit, e.g. a Gaussian
    h_bi(t) = exp(-||r_b - r_i||² / (2σ²(t))),
    where r_b and r_i are the positions of neurons b
    and i on the SOM grid. Both a(t) and σ(t)
    decrease monotonically with time.
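A minimal NumPy sketch of the sequential SOM training loop on slides 6 and 7. The grid size, step count, and the linearly decaying a(t) and σ(t) schedules are illustrative assumptions, not the settings used in the project.

```python
import numpy as np

def train_som(data, grid_shape=(10, 10), n_steps=5000,
              a0=0.5, sigma0=3.0, seed=0):
    """Train a 2-D SOM on `data` (N x d) with the sequential rule
    w_i(t+1) = w_i(t) + a(t) * h_bi(t) * (v - w_i(t))."""
    rng = np.random.default_rng(seed)
    n_rows, n_cols = grid_shape
    n_units, d = n_rows * n_cols, data.shape[1]

    # Grid positions r_i of the map units and random initial prototypes w_i.
    rows, cols = np.unravel_index(np.arange(n_units), grid_shape)
    positions = np.column_stack([rows, cols]).astype(float)
    weights = data[rng.integers(0, len(data), n_units)].astype(float)

    for t in range(n_steps):
        v = data[rng.integers(len(data))]                    # random sample v
        b = np.argmin(np.linalg.norm(weights - v, axis=1))   # BMU index b

        # Both a(t) and sigma(t) decrease monotonically with time.
        a = a0 * (1.0 - t / n_steps)
        sigma = sigma0 * (1.0 - t / n_steps) + 0.5

        # Gaussian neighborhood kernel h_bi(t) centered on the winner b.
        grid_dist2 = np.sum((positions - positions[b]) ** 2, axis=1)
        h = np.exp(-grid_dist2 / (2.0 * sigma ** 2))

        # Move the winner and its topological neighbors toward v.
        weights += a * h[:, None] * (v - weights)
    return weights
```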

8
Clustering- The Two Level Approach
  • Group the input data into clusters, where data
    points are placed in the same cluster if they are
    similar to one another.
  • A widely adopted definition of clustering is a
    partitioning that minimizes the distances within
    clusters and maximizes the distances between
    clusters.
  • Once the neurons are trained, the next step is
    clustering of the SOM. For this, the two-level
    approach is followed (see the sketch below).
  • First a large set of neurons, much larger than
    the expected number of clusters, is formed using
    the SOM.
  • In the next step the neurons are combined to form
    the actual clusters using the k-means clustering
    technique.
  • The number of clusters is K and the number of
    neurons is M, with
  • K << M << N
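The level-2 step can be sketched with scikit-learn's KMeans; the function name and the commented pipeline (reusing train_som from the earlier sketch) are assumptions for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_prototypes(prototypes, n_clusters, seed=0):
    """Level 2 of the two-level approach: group the M SOM prototypes
    (the level-1 output) into K clusters with k-means, K << M << N."""
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    labels = km.fit_predict(prototypes)        # cluster index of each prototype
    return labels, km.cluster_centers_

# Example pipeline (level 1 + level 2), assuming train_som from the earlier sketch:
#   prototypes = train_som(data, grid_shape=(10, 10))              # N samples -> M prototypes
#   labels, centers = cluster_prototypes(prototypes, n_clusters=5) # M prototypes -> K clusters
```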

9
Two-Level Approach
Level 1: N data samples → M prototypes of the SOM
Level 2: M prototypes of the SOM → K clusters
K < M < N
10
Advantage of using the two level approach
  • The primary benefit is the reduction of
    computational cost. Even with a relatively small
    number of data samples, many clustering
    algorithms become intractably heavy.
  • For example, with the two-level approach the
    computational load is reduced by a factor of
    about √N/15, or about six-fold for N = 10,000,
    compared with clustering the data directly with
    k-means.
  • Another benefit is noise reduction. The
    prototypes are local averages of the data and
    therefore less sensitive to random variations
    than the original data.

11
K-means decision on number of clusters
  • The k-means algorithm was used at level 2 for
    clustering of the trained SOM neurons.
  • The k-means algorithm clusters the given data
    into k clusters, where k is defined by the user.
  • One method to decide the value of k is to run the
    algorithm from k = 2 to k = √N, where N is the
    number of data samples.
  • The k-means algorithm minimizes the error
    function
  • E = Σ_{k=1..C} Σ_{v ∈ Q_k} ||v - c_k||²
  • where C is the number of clusters, Q_k is the set
    of samples in cluster k, and c_k is the center of
    cluster k.
  • The approach followed in this project was to pick
    the number of clusters as the value of k that
    makes the error E ≈ 0.10·E₂ to 0.15·E₂ (10 to 15%
    of E₂), where E₂ is the error when k = 2 (see the
    sketch below).
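A sketch of this selection rule, assuming scikit-learn's KMeans and its inertia_ attribute as the error E; the helper name and the 15% threshold default are illustrative.

```python
import numpy as np
from sklearn.cluster import KMeans

def select_k(prototypes, frac=0.15, seed=0):
    """Run k-means for k = 2 .. sqrt(N) and return the smallest k whose
    error E(k) drops to `frac` (10-15%) of the error E(2) at k = 2."""
    k_max = max(2, int(np.sqrt(len(prototypes))))
    e2 = None
    for k in range(2, k_max + 1):
        km = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(prototypes)
        e = km.inertia_                # sum of squared distances to cluster centers
        if k == 2:
            e2 = e                     # reference error E(2)
        if e <= frac * e2:
            return k
    return k_max
```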

12
Selection of number of clusters
13
Reference Distance Calculation
  • The clusters formed using the training/nominal
    data sets are used to calculate the reference
    distance (dRef) for each cluster.
  • Knowing the cluster centers calculated by the
    k-means algorithm and the prototypes/neurons that
    formed a particular cluster, we calculate the
    reference distance for each cluster.
  • The reference distance specific to a particular
    cluster is the distance between the cluster
    center and the prototype/neuron belonging to that
    cluster that lies at the maximum distance from
    the cluster center (see the sketch below).
  • The reference distance for each of the clusters
    formed from the nominal data set is calculated in
    this way and serves as a baseline for fault
    detection.
  • This underlying structure of the initial, known
    nominal data set is used to classify a given data
    cluster as nominal or faulty.
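A sketch of the per-cluster reference distance computation, assuming the prototypes, their k-means labels, and the cluster centers are available from the two-level clustering sketched earlier; the names are illustrative.

```python
import numpy as np

def reference_distances(prototypes, labels, centers):
    """For each cluster k, d_ref[k] is the maximum distance between the
    cluster center and the prototypes/neurons assigned to that cluster."""
    d_ref = np.zeros(len(centers))
    for k in range(len(centers)):
        members = prototypes[labels == k]
        if len(members):
            d_ref[k] = np.max(np.linalg.norm(members - centers[k], axis=1))
    return d_ref
```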

14
Fault Identification
  • The assumption made here is that the available
    nominal data sets used to form the underlying
    cluster structure span the space of all
    information that is not faulty.
  • The same procedure is then repeated for the
    unknown data sets (with no knowledge of whether
    they are nominal or faulty): first the N data
    points are reduced to a map of M neurons, and
    these are then clustered using the k-means
    algorithm.
  • Now, taking the training data cluster centers and
    the reference distance known for each cluster, we
    check whether the clusters from the unknown data
    set fall inside the region spanned by a radius
    equal to the reference distance of that training
    cluster (see the sketch below).
  • Any unknown data set cluster that is not part of
    the region spanned by taking a training data
    cluster as center and the corresponding reference
    distance as radius is labeled faulty.
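A sketch of the membership test, under the assumption that each unknown-data cluster is represented by its center; a cluster is flagged as faulty when it lies outside every training cluster's reference-distance sphere.

```python
import numpy as np

def classify_clusters(unknown_centers, train_centers, d_ref):
    """Flag an unknown-data cluster as faulty if it lies outside the sphere
    of radius d_ref[k] around every training cluster center k."""
    faulty = []
    for c in unknown_centers:
        dists = np.linalg.norm(train_centers - c, axis=1)
        # Nominal if it falls within the reference distance of some training cluster.
        faulty.append(bool(np.all(dists > d_ref)))
    return np.array(faulty)
```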

15
Block Diagram
Training Data (N samples) → Mapping Algorithm (M neurons) → Clustering (K clusters) → Reference Distance
Unknown Data (N samples) → Mapping Algorithm (M neurons) → Clustering (K clusters) → Distance Deviation → Fault Identification
K << M << N
16
SOM Using Nominal data
17
Clustering of nominal data
18
SOM using Unknown data
19
Clustering of unknown data
20
Fault Identification
21
SOM Performance
22
Equiprobable Maps
  • For self-organizing feature maps the weight
    density at convergence is not a linear function
    of the input density p(v), and hence the neurons
    of the map will not be active with equal
    probability (i.e. the map is not
    equiprobabilistic).
  • For a discrete lattice of neurons, the weight
    density is proportional to a power p(v)^α with
    α < 1 (the classical one-dimensional result is
    α = 2/3).
  • Regardless of the type of neighborhood function
    used, the SOM tends to undersample the
    high-probability regions and oversample the
    low-probability regions.

23
Avoiding Dead Units
  • The SOM algorithm can converge to a mapping that
    contains neurons that are never active (dead
    units).
  • These units do not contribute to the minimization
    of the overall (MSE) distortion of the map.
  • To produce maps in which the neurons have an
    equal probability of being active
    (equiprobabilistic maps), the idea of adding a
    conscience to the winning neuron was introduced.
  • The techniques for generating equiprobabilistic
    maps discussed here are
  • Conscience Learning
  • Frequency Sensitive Competitive Learning (FSCL)

24
Conscience Learning
  • When a neural network is trained with
    unsupervised competitive learning on a set of
    input vectors that are clustered into K
    groups/clusters, a given input vector v will
    activate the neuron i that has been sensitized to
    the cluster containing that input vector.
  • However, if some region of the input space is
    sampled more frequently than the others, a single
    unit begins to win all the competitions for this
    region.
  • To counter this defect, one records for each
    neuron i the frequency c_i with which it has won
    the competition in the past, and adds this
    quantity to the Euclidean distance between the
    weight vector w_i and the current input v.


25
Conscience Learning
  • In conscience learning two stages are
    distinguished.
  • First, the winning neuron is determined out of
    the N units:
  • i* = arg min_i ||v - w_i||
  • Second, the winning neuron i* is not necessarily
    the one that will have its weight vector updated.
  • Which neuron is updated depends on an additional
    term for each unit, related to the number of
    times the unit has won the competition in the
    recent past.
  • The rule is that for each neuron the number of
    times it has won the competition is recorded, and
    a scaled version of this quantity is added to the
    distance metric used in the minimum Euclidean
    distance rule.

26
Conscience Learning
  • Update rule (see the sketch below):
  • i* = arg min_i ( ||v - w_i|| + C·c_i )
  • with c_i the number of times neuron i has won the
    competition, and C the scaling factor (conscience
    factor).
  • After determining the winning neuron i*, its
    conscience is incremented:
  • c_i* ← c_i* + 1
  • The weight of the winning neuron is then updated
    using
  • w_i*(t+1) = w_i*(t) + η [v - w_i*(t)]
  • where η is the learning rate, a small positive
    constant.
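A sketch of conscience learning as described on slides 24 to 26, with the simplification that the conscience-biased distance directly selects the unit to update; the learning rate, conscience factor, and number of units are assumed values.

```python
import numpy as np

def train_conscience(data, n_units=100, n_steps=5000, eta=0.05, C=0.1, seed=0):
    """Conscience learning: the win count c_i, scaled by the conscience
    factor C, is added to the Euclidean distance when selecting the unit
    to update, so frequent winners are penalized."""
    rng = np.random.default_rng(seed)
    weights = data[rng.integers(0, len(data), n_units)].astype(float)
    wins = np.zeros(n_units)                      # c_i: times unit i has won

    for _ in range(n_steps):
        v = data[rng.integers(len(data))]
        dist = np.linalg.norm(weights - v, axis=1)
        i = np.argmin(dist + C * wins)            # distance biased by conscience
        wins[i] += 1                              # increment the winner's conscience
        weights[i] += eta * (v - weights[i])      # standard UCL weight update
    return weights
```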

27
Conscience learning using nominal data
28
Clusters shown with neurons
29
Clustering of nominal data
30
Conscience learning on new data set
31
Clustering of unknown data set
32
Clusters represented on data set
33
Fault Identification
34
Conscience learning Performance
35
Frequency Sensitive Competitive Learning
  • Another competitive learning scheme used here is
    Frequency Sensitive Competitive Learning (FSCL).
  • This learning scheme keeps a record of the total
    number of times each neuron has won the
    competition during training.
  • The distance metric in the Euclidean distance
    rule is then scaled by this win count c_i:
  • i* = arg min_i ( c_i · ||v - w_i|| )
  • After selection of the winning neuron, its
    conscience is incremented and its weight vector
    is updated using the UCL rule (see the sketch
    below).
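A sketch of FSCL under the same assumptions as the conscience learning sketch; win counts start at 1 so the frequency-scaled distance is well defined on the first step.

```python
import numpy as np

def train_fscl(data, n_units=100, n_steps=5000, eta=0.05, seed=0):
    """Frequency Sensitive Competitive Learning: the Euclidean distance is
    scaled by the unit's win count, then the winner is updated with the
    usual UCL rule."""
    rng = np.random.default_rng(seed)
    weights = data[rng.integers(0, len(data), n_units)].astype(float)
    wins = np.ones(n_units)                       # win counts, initialized to 1

    for _ in range(n_steps):
        v = data[rng.integers(len(data))]
        dist = np.linalg.norm(weights - v, axis=1)
        i = np.argmin(wins * dist)                # frequency-scaled distance
        wins[i] += 1                              # record the win
        weights[i] += eta * (v - weights[i])      # UCL weight update
    return weights
```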

36
FSCL using Nominal data
37
Clustering of nominal data
38
Clusters shown with neurons
39
FSCL using the unknown data set
40
Clustering using unknown data
41
Clusters of unknown data
42
Fault Identification
43
FSCL performance
44
Conclusions
  • As shown in the results, the performance of the
    SOM algorithm was not as good as that of the CLT
    and FSCL approaches.
  • Because the SOM produces dead units even when the
    neighborhood function shrinks slowly, it was not
    able to train the neurons well on the available
    data sets.
  • Due to undersampling of the high-probability
    regions, the SOM was able to detect only two of
    the four faulty clusters, and thus its
    performance was not good.
  • Using the CLT and FSCL approaches, all four
    faulty clusters were detected using the reference
    distance as the distance measure.
  • Thus the equiprobable maps perform much better
    than the SOM, by avoiding dead units and by
    training the neurons with a conscience assigned
    to the winning neuron.

45
References
  • Marc M. Van Hulle, Faithful Representations and
    Topographic Maps: From Distortion- to
    Information-Based Self-Organization, John Wiley &
    Sons, 2000
  • T. Kohonen, Self-Organizing Maps, Springer, 1997
  • Anil K. Jain and Richard C. Dubes, Algorithms for
    Clustering Data, Prentice Hall, 1988
  • Juha Vesanto and Esa Alhoniemi, "Clustering of
    the Self-Organizing Map," IEEE Trans. Neural
    Networks, Vol. 11, No. 3, May 2000

46
Questions/Comments ?