1
Advantages of Using Class Memberships in
Self-Organizing Map and Support Vector Machines
  • Sunghwan Sohn, Cihan H. Dagli

2
Overview
  • Basic idea
  • Part I: SOM using class memberships
  • Self-Organizing Map (SOM)
  • Fuzzy class membership assignment
  • Experimental results
  • Part II: Sample selection using class memberships in SVM
  • Support Vector Machines (SVM)
  • Probabilistic class membership assignment
  • Possibilistic class membership assignment
  • Experimental results
  • Conclusions

3
Basic Idea
  • Self-Organizing Map
  • Objective: to provide another perspective of the feature map, representing class typicalness
  • Method: fuzzy class memberships are accumulated instead of crisp class frequencies
  • Support Vector Machines
  • Objective: to adaptively select proper training samples
  • Method: samples having the proper probabilistic or possibilistic class membership are selected as the training set

4
Part I
  • SOM using class memberships

5
Self-Organizing Map
  • Extensively used in data mining
  • Topological mapping
  • Inherently unsupervised learning, but can be used as a classifier
  • In classification (general strategy):
  • Each neuron is assigned a class label based on the maximum class frequency
  • Each pattern is classified by a nearest-neighbor method, as sketched below
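A minimal sketch of this labeling-and-classification strategy in Python (the function names and the assumption of a trained codebook array `weights` are illustrative, not from the paper; `y_train` holds integer class indices):

```python
import numpy as np

def label_neurons(weights, X_train, y_train, n_classes):
    # Count, per neuron, how many training patterns of each class
    # map to it, then keep the majority class as the neuron's label.
    counts = np.zeros((len(weights), n_classes))
    for x, y in zip(X_train, y_train):
        bmu = np.argmin(np.linalg.norm(weights - x, axis=1))  # best-matching unit
        counts[bmu, y] += 1
    return counts.argmax(axis=1)

def classify(weights, neuron_labels, X_test):
    # Each test pattern inherits the label of its nearest neuron.
    d = np.linalg.norm(X_test[:, None, :] - weights[None, :, :], axis=2)
    return neuron_labels[d.argmin(axis=1)]
```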

6
Drawbacks of Class Labeling
  • Each labeled pattern is treated as equally important regardless of its typicalness
  • In classification, it is difficult to judge the test pattern's typicalness
  • Fuzzy theory can be used to give class memberships to the neurons in the SOM network

7
Fuzzy Memberships
8
Initial Class Memberships of Samples
  • Memberships for the labeled data can be assigned by a K-nearest-neighbor rule
  • The K nearest neighbors are found, and the membership in each class is assigned as the ratio of the number of neighbors in that class to K, with the constraint that the memberships sum to one (see the sketch below)
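A sketch of this K-nearest-neighbor initialization, assuming the memberships are the plain neighbor-count ratios n_i / K (the authors' exact weighting scheme may differ):

```python
import numpy as np

def knn_memberships(X, y, K, n_classes):
    # For each labeled sample, membership in class i is n_i / K,
    # the fraction of its K nearest neighbors belonging to class i;
    # the memberships therefore sum to one by construction.
    U = np.zeros((len(X), n_classes))
    for j in range(len(X)):
        d = np.linalg.norm(X - X[j], axis=1)
        nbrs = np.argsort(d)[1:K + 1]   # skip the sample itself
        U[j] = np.bincount(y[nbrs], minlength=n_classes) / K
    return U
```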

9
Fuzzy Class Memberships to the Neuron
10
The SOM Algorithm
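The algorithm slide itself did not survive the transcript; below is a minimal sketch of the standard Kohonen update it refers to, with winner search plus a Gaussian neighborhood and linearly decaying rate and radius (parameter choices are illustrative, not the authors'):

```python
import numpy as np

def train_som(X, grid=(3, 3), epochs=100, lr0=0.5, sigma0=1.0, seed=0):
    # Minimal Kohonen SOM: move the best-matching unit and its grid
    # neighbors toward each input, with decaying rate and radius.
    rng = np.random.default_rng(seed)
    rows, cols = grid
    W = rng.normal(size=(rows * cols, X.shape[1]))
    coords = np.array([(r, c) for r in range(rows) for c in range(cols)])
    T = epochs * len(X)
    t = 0
    for _ in range(epochs):
        for x in rng.permutation(X):
            lr = lr0 * (1 - t / T)
            sigma = sigma0 * (1 - t / T) + 1e-3
            bmu = np.argmin(np.linalg.norm(W - x, axis=1))
            g = np.linalg.norm(coords - coords[bmu], axis=1)
            h = np.exp(-(g ** 2) / (2 * sigma ** 2))   # neighborhood kernel
            W += lr * h[:, None] * (x - W)
            t += 1
    return W
```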
11
Experimental Design
  • Two methods were compared
  • Classical SOM
  • SOM with fuzzy class memberships
  • Data used
  • Iris plant data: 3 classes with 4 features
  • Credit approval data: 2 classes with 15 features

12
Iris Plant Data
13
Experiment with Iris Data
  • Classical SOM
  • Percent correct: 97.78%

3×3 lattice feature map
14
Experiment with Iris Data
SOM with fuzzy memberships: Percent correct: 97.78%
3×3 lattice feature map
[Feature maps showing membership of Setosa (1), Versicolor (2), and Virginica (3)]
15
Credit Approval Data
16
Results of 4-by-4 Feature Map
SOM with fuzzy memberships: Percent correct: 84.69%
Classical SOM: Percent correct: 83.67%
  • SOM with fuzzy memberships produced a different class label in the neuron at the 2nd row and 2nd column
  • This neuron has almost equal characteristics of both classes, but it is biased toward one of them based on the fuzzy class memberships

17
Part II
  • Sample selection using class memberships in SVM

18
Support Vector Machines
  • SVM finds the particular hyperplane that maximizes the margin of separation
  • Builds the decision hyperplane from support vectors
  • Attractive potential and promising performance in classification
  • Limited in speed and size when training on large data sets
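As a concrete illustration (not the authors' setup), a linear maximum-margin classifier trained with scikit-learn on two Gaussian classes; the fitted model exposes the support vectors that define the hyperplane:

```python
import numpy as np
from sklearn.svm import SVC

# Two Gaussian classes (mirroring the 2-class data used later:
# means (0,0) and (2,0), unit variance).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal((0.0, 0.0), 1.0, size=(50, 2)),
               rng.normal((2.0, 0.0), 1.0, size=(50, 2))])
y = np.repeat([0, 1], 50)

clf = SVC(kernel="linear", C=1.0).fit(X, y)
# The maximum-margin hyperplane is built only from the support vectors:
print("support vectors:", len(clf.support_vectors_), "out of", len(X))
```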

19
SVM Algorithm
20
Motive of Sample Selection I
  • SVM builds the decision function from only part of the training samples, namely the support vectors
  • Removing training samples that are not relevant to the support vectors might have no effect on building the proper decision function
  • The probabilistic class membership of each sample can be used to select the support-vector-like samples

21
Sample Selection Using Probabilistic Class Membership
  • Assign the class membership of each sample using its K nearest neighbors: u_i(x_j) = n_i / K, where n_i is the number of the K neighbors of x_j belonging to the ith class
  • Samples having a non-crisp class membership are selected as the training set (see the sketch below)
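A sketch of this selection rule, reusing the K-NN memberships u_i = n_i / K and keeping only samples whose membership vector is non-crisp (the function name is illustrative):

```python
import numpy as np

def select_noncrisp(X, y, K, n_classes):
    # Keep a sample only if its K nearest neighbors are NOT all from
    # one class, i.e. its membership vector u (with u_i = n_i / K) is
    # non-crisp; such border samples are support-vector-like.
    keep = []
    for j in range(len(X)):
        d = np.linalg.norm(X - X[j], axis=1)
        nbrs = np.argsort(d)[1:K + 1]
        u = np.bincount(y[nbrs], minlength=n_classes) / K
        if u.max() < 1.0:
            keep.append(j)
    return X[keep], y[keep]
```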

22
Experimental Design
  • Data used
  • Artificially generated 2-class data set (500 training / 500 test samples)
  • Samples having a non-crisp class membership were selected as the training set

23
Data
[Scatter plots: original data vs. selected data having non-crisp memberships]
24
Experiment
Classification result using class-probability sampling (K = 5)
Results are averaged over 5 random trials
25
Motive of Sample Selection II
  • In implementing SVM, we assume that all information necessary for classification can be represented by the support vectors
  • However, if the training set contains outliers, the support vectors might not be properly chosen, degrading classification performance on unseen samples
  • A possibilistic measure can be applied to separate outliers from typical training samples

26
Sample Selection Using Possibilistic Class Membership
  • Vapnik provided a bound on the actual risk of the support vector machine:

E[P(error)] ≤ E[number of support vectors] / ℓ

where P(error) is the actual risk, E[P(error)] is the expectation of the actual risk, E[number of support vectors] is the expectation of the number of support vectors, and ℓ is the number of training samples.
27
Sample Selection Using Possibilistic Class Membership
  • If there exist outliers in the training samples, they would be chosen as support vectors, disturb the construction of a proper decision function, and consequently increase the risk of the support vector machine
  • A measure is needed to separate noisy samples from typical samples

28
Sample Selection Using Possibilistic Class Membership
  • Assign the class membership to each sample using vague concepts [Krishnapuram and Keller, 1993], where d_ij denotes the distance of sample x_j to the prototype of the ith class
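The membership formula on this slide is missing from the transcript; in Krishnapuram and Keller's possibilistic formulation it takes the standard form (a reconstruction, not copied from the slide):

```latex
u_{ij} = \frac{1}{1 + \left( d_{ij}^{2} / \eta_i \right)^{1/(m-1)}}
```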
29
Sample Selection Using Possibilistic Class Membership
  • The value of η_i is determined as sketched below
  • A typical value of m is 2
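The formula for η_i is likewise missing; in the cited paper it is typically the weighted mean of the squared within-class distances, scaled by a constant K (usually 1, and unrelated to the K of K-nearest neighbors):

```latex
\eta_i = K \,\frac{\sum_{j} u_{ij}^{m}\, d_{ij}^{2}}{\sum_{j} u_{ij}^{m}}, \qquad K \approx 1
```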
30
Experimental Design
  • Data used
  • 1. 2-class data set: mean (0,0) and variance 1 for class 1; mean (2,0) and variance 1 for class 2
  • 2. Noisy data

31
2-Class Data
Original data
Selected data (class possibility > 0.6)
32
Experiment
Classification result using class-possibility sampling
Results are averaged over 5 random trials
33
Noisy Data
Noisy data
Selected data (complete class probability)
Selected data (class possibility > 0.7)
34
Experiment
Classification results using two sampling methods:
1. Select training samples having complete class probability
2. Select training samples having class possibility > 0.7
35
Conclusion
  • Self-Organizing Map
  • Fuzzy memberships provide another perspective from which to view the SOM's output topology
  • The network can further distinguish each neuron within a class cluster based on its typicalness
  • In the credit approval data, we could not only classify each pattern but also check the degree of goodness of credit

36
Conclusion
  • Support Vector Machines
  • The class memberships allowed us to properly select training samples as well as to reduce the number of support vectors
  • This method of sample selection is relatively simple and can speed up SVM training on a large training set