Title: Advantages of Using Class Memberships in Self-Organizing Map and Support Vector Machines
Slide 1: Advantages of Using Class Memberships in Self-Organizing Map and Support Vector Machines
- Sunghwan Sohn, Cihan H. Dagli
Slide 2: Overview
- Basic idea
- Part I: SOM using class memberships
  - Self-Organizing Map (SOM)
  - Fuzzy class membership assignment
  - Experimental results
- Part II: Sample selection using class memberships in SVM
  - Support Vector Machines (SVM)
  - Probabilistic class membership assignment
  - Possibilistic class membership assignment
  - Experimental results
- Conclusions
Slide 3: Basic Idea
- Self-Organizing Map
  - Objective: to provide another perspective of the feature map, representing class typicalness
  - Method: the fuzzy class membership is accumulated instead of the crisp class frequency
- Support Vector Machines
  - Objective: to adaptively select proper training samples
  - Method: samples having the proper probabilistic or possibilistic class membership are selected as the training set
Slide 4: Part I
- SOM using class memberships
Slide 5: Self-Organizing Map
- Extensively used in data mining
- Topological mapping
- Naturally unsupervised learning, but can be used as a classifier
- In classification (general strategy):
  - Each neuron is assigned a class label based on the maximum class frequency
  - Each pattern is classified by a nearest-neighbor method
Slide 6: Drawbacks of Class Labeling
- Each labeled pattern is treated as equally important regardless of its typicalness
- In classification, it is difficult to judge the test pattern's typicalness
- Fuzzy theory can be used to give class memberships to the neurons in the SOM network
Slide 7: Fuzzy Memberships
Slide 8: Initial Class Memberships of Samples
- Memberships for the labeled data can be assigned by a K-nearest-neighbor rule
- The K nearest neighbors are found, and the membership in each class is assigned as the ratio of the number of neighbors belonging to that class to K, with the constraint that the memberships sum to one
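As a concrete illustration, the K-nearest-neighbor membership rule above can be sketched as follows. This is a minimal sketch assuming the plain ratio n_i/K described on the slide; the function name and the use of Euclidean distance are my own choices, not from the original.

```python
import numpy as np

def knn_fuzzy_memberships(X, y, n_classes, K=5):
    """Assign each labeled sample a fuzzy membership in every class as the
    fraction of its K nearest neighbors belonging to that class.
    Memberships sum to one by construction."""
    n = len(X)
    U = np.zeros((n, n_classes))
    # pairwise Euclidean distances between samples
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(D, np.inf)  # exclude the sample itself
    for j in range(n):
        neighbors = np.argsort(D[j])[:K]
        for c in range(n_classes):
            U[j, c] = np.sum(y[neighbors] == c) / K
    return U
```

A sample deep inside a class cluster gets membership 1 in that class and 0 elsewhere; a boundary sample gets a non-crisp mixture.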
Slide 9: Fuzzy Class Memberships to the Neuron
Slide 10: The SOM Algorithm
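The body of this slide is lost, so here is a minimal sketch of a standard SOM training loop, extended with the accumulation of fuzzy class memberships onto each neuron's best-matching unit as described on the Basic Idea slide. The schedule and neighborhood choices are assumptions, not the authors' exact algorithm.

```python
import numpy as np

def train_som(X, U, grid=(3, 3), epochs=20, lr0=0.5, sigma0=1.0, seed=0):
    """Minimal SOM sketch: competitive weight update with a Gaussian
    lattice neighborhood, then accumulation of each sample's fuzzy class
    memberships U onto its best-matching unit (instead of crisp counts)."""
    rng = np.random.default_rng(seed)
    rows, cols = grid
    W = rng.normal(size=(rows * cols, X.shape[1]))  # neuron weight vectors
    coords = np.array([(r, c) for r in range(rows) for c in range(cols)])
    for t in range(epochs):
        lr = lr0 * (1 - t / epochs)                  # decaying learning rate
        sigma = sigma0 * (1 - t / epochs) + 1e-3     # shrinking neighborhood
        for x in X:
            bmu = np.argmin(np.linalg.norm(W - x, axis=1))  # best-matching unit
            h = np.exp(-np.sum((coords - coords[bmu]) ** 2, axis=1)
                       / (2 * sigma ** 2))
            W += lr * h[:, None] * (x - W)
    # accumulate fuzzy memberships per neuron over the trained map
    M = np.zeros((rows * cols, U.shape[1]))
    for x, u in zip(X, U):
        bmu = np.argmin(np.linalg.norm(W - x, axis=1))
        M[bmu] += u
    return W, M
```

The per-neuron membership totals M then play the role that crisp class-frequency counts play in the classical labeling scheme.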
Slide 11: Experimental Design
- Two methods were compared:
  - Classical SOM
  - SOM with fuzzy class memberships
- Data used:
  - Iris plant data: 3 classes with 4 features
  - Credit approval data: 2 classes with 15 features
Slide 12: Iris Plant Data
Slide 13: Experiment with Iris Data
- Classical SOM
- Percent correct: 97.78
- 3×3 lattice feature map
Slide 14: Experiment with Iris Data
- SOM with fuzzy memberships
- Percent correct: 97.78
- 3×3 lattice feature map
(Figure: membership maps for Setosa (1), Versicolor (2), and Virginica (3))
Slide 15: Credit Approval Data
Slide 16: Results of 4-by-4 Feature Map
- SOM with fuzzy memberships: percent correct 84.69
- Classical SOM: percent correct 83.67
- SOM with fuzzy memberships produced a different class label in the neuron at the 2nd row and 2nd column
- This neuron seems to have almost equal characteristics of both classes, but it has more bias toward one of them based on the fuzzy class memberships
Slide 17: Part II
- Sample selection using class memberships in SVM
Slide 18: Support Vector Machines
- SVM finds the particular hyperplane that maximizes the margin of separation
- Builds the decision hyperplane from support vectors
- Attractive potential and promising performance in classification
- Limited in speed and size when training on large data sets
Slide 19: SVM Algorithm
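The body of this slide is lost. For reference, a standard soft-margin SVM formulation consistent with the surrounding slides (not necessarily the exact notation of the original) is:

```latex
\min_{\mathbf{w},\,b,\,\boldsymbol{\xi}} \quad
  \frac{1}{2}\lVert\mathbf{w}\rVert^{2} + C\sum_{j=1}^{\ell}\xi_{j}
\qquad \text{subject to} \quad
  y_{j}\bigl(\mathbf{w}^{\top}\mathbf{x}_{j} + b\bigr) \ge 1 - \xi_{j},
  \quad \xi_{j} \ge 0,\; j = 1,\dots,\ell
```

The dual solution yields the decision function f(x) = sign(Σ_j α_j y_j K(x_j, x) + b), and only samples with nonzero α_j (the support vectors) enter this sum, which is what motivates the sample-selection idea on the following slides.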
Slide 20: Motive of Sample Selection I
- SVM builds the decision function from only part of the training samples, namely the support vectors
- Removing training samples that are not relevant to the support vectors might have no effect on building the proper decision function
- The probabilistic class membership of each sample can be used to select support-vector-like samples
Slide 21: Sample Selection Using Probabilistic Class Membership
- Assign the class membership to each sample using its K nearest neighbors: u_ij = n_i / K, where n_i is the number of neighbors of x_j belonging to the i-th class
- Samples having a non-crisp class membership (0 < u_ij < 1) are selected as the training set
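The non-crisp selection rule above can be sketched as follows. A sample whose K nearest neighbors are not all of its own class has a non-crisp membership, so it lies near the class boundary and is likely to act like a support vector. The function name and Euclidean distance are my own choices.

```python
import numpy as np

def select_noncrisp(X, y, K=5):
    """Return indices of samples whose K nearest neighbors are NOT all
    of one class (non-crisp probabilistic membership), i.e. samples near
    the class boundary."""
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(D, np.inf)  # exclude the sample itself
    keep = []
    for j in range(len(X)):
        neighbors = np.argsort(D[j])[:K]
        p_same = np.mean(y[neighbors] == y[j])  # membership in own class
        if 0.0 < p_same < 1.0:                  # non-crisp -> keep
            keep.append(j)
    return np.array(keep, dtype=int)
```

The selected subset would then be used to train the SVM in place of the full training set.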
Slide 22: Experimental Design
- Data used: an artificially generated 2-class data set (500 training / 500 test samples)
- Samples having a non-crisp class membership were selected as the training set
Slide 23: Data
(Figure: original data, and the selected data having non-crisp memberships)
Slide 24: Experiment
- Classification results using class-probability sampling (K = 5)
- The results are averaged over 5 random trials
Slide 25: Motive of Sample Selection II
- In implementing SVM, we assume that all information necessary for classification can be represented by the support vectors
- However, if the training set contains outliers, the support vectors might not be chosen properly, degrading classification performance on unseen samples
- A possibilistic measure can be applied to separate outliers from typical training samples
Slide 26: Sample Selection Using Possibilistic Class Membership
- Vapnik provided a bound on the actual risk of the support vector machine:

  E[P(error)] ≤ E[number of support vectors] / (number of training samples)

  where P(error) is the actual risk, E[P(error)] is the expectation of the actual risk, and E[number of support vectors] is the expectation of the number of support vectors.
Slide 27: Sample Selection Using Possibilistic Class Membership
- If outliers exist in the training samples, they tend to be chosen as support vectors, disturb the construction of a proper decision function, and consequently increase the risk of the support vector machine
- A measure is needed to separate noisy samples from typical samples
Slide 28: Sample Selection Using Possibilistic Class Membership
- Assign the class membership to each sample using the possibilistic formulation of Krishnapuram and Keller (1993):

  u_ij = 1 / (1 + (d_ij² / η_i)^(1/(m-1)))

  where d_ij is the distance of sample x_j to the prototype of the i-th class
Slide 29: Sample Selection Using Possibilistic Class Membership
- The value of η_i is determined from the data, e.g. as the membership-weighted mean of the squared distances: η_i = (Σ_j u_ij^m d_ij²) / (Σ_j u_ij^m)
- A typical value of m is 2
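The possibilistic membership of the two slides above can be sketched as follows, with m = 2. As a simplification, η_i is taken here as the mean squared distance of class-i samples to their class prototype (one common initialization); the slides' exact prototype and η computation may differ.

```python
import numpy as np

def possibilistic_memberships(X, y, m=2.0):
    """Possibilistic class memberships in the style of Krishnapuram &
    Keller (1993): u_ij = 1 / (1 + (d_ij^2 / eta_i)^(1/(m-1))).
    eta_i is the mean squared distance of class-i samples to their
    class-mean prototype (assumes classes are not degenerate points)."""
    classes = np.unique(y)
    U = np.zeros((len(X), len(classes)))
    for i, c in enumerate(classes):
        proto = X[y == c].mean(axis=0)          # class prototype
        d2 = np.sum((X - proto) ** 2, axis=1)   # squared distances to it
        eta = np.mean(d2[y == c])               # scale parameter eta_i
        U[:, i] = 1.0 / (1.0 + (d2 / eta) ** (1.0 / (m - 1.0)))
    return U
```

Outliers can then be dropped by keeping only samples whose membership in their own class exceeds a threshold (the experiments on the later slides use 0.6 and 0.7).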
Slide 30: Experimental Design
- Data used:
  - 1. A 2-class data set: mean (0, 0) and variance 1 for class 1; mean (2, 0) and variance 1 for class 2
  - 2. Noisy data
Slide 31: 2-Class Data
(Figure: original data, and the selected data with class possibility > 0.6)
Slide 32: Experiment
- Classification results using class-possibility sampling
- The results are averaged over 5 random trials
Slide 33: Noisy Data
(Figure: noisy data; selected data having complete class probability; selected data with class possibility > 0.7)
Slide 34: Experiment
- Classification results using two sampling methods:
  - 1. Select training samples having complete class probability
  - 2. Select training samples having class possibility > 0.7
Slide 35: Conclusion
- Self-Organizing Map
  - Fuzzy memberships provide another perspective for viewing the SOM's output topology
  - The network can further distinguish each neuron within a class cluster based on its typicalness
  - In the credit approval data, we could not only classify each pattern but also check the degree of goodness of credit
Slide 36: Conclusion
- Support Vector Machines
  - The class membership allowed us to select training samples properly as well as to reduce the number of support vectors
  - This method of sample selection is relatively simple and can speed up SVM training on large training sets