1
LEARNING VECTOR QUANTIZATION
Presentation By: Mihajlo Grbovic
2
Learning Vector Quantization
INTRODUCTION
Learning Vector Quantization (LVQ) was introduced by Kohonen as a simple, universal, and efficient learning classifier. LVQ represents a family of algorithms that are widely used in the classification of potentially high-dimensional data. Their popularity and success in numerous applications are closely related to their easy implementation and their intuitively clear approach.
3
Learning Vector Quantization
INTRODUCTION
TRAINING DATA SET
Class 1 - green, Class 2 - blue, Class 3 - red, Class 4 - yellow
4
Learning Vector Quantization
INTRODUCTION
LVQ'S TASK IS TO BUILD A MODEL USING A TRAINING DATA SET
Each test point is labeled based on the label of the closest LVQ prototype.
(Figure: LVQ prototypes in the feature space, labeling test points by the closest prototype.)
5
Learning Vector Quantization
INTRODUCTION
LVQ classification is based on the Euclidean distance as a measure of how similar the given data is to the so-called prototypes. The prototypes are determined during the training procedure using a labeled dataset. The idea is to start with some initial positions of the prototypes in the feature space, and then improve them in such a way that in the end they represent the labeled data in the best possible way. An attractive feature of LVQ is that it can be easily applied to multi-class problems. Depending on the complexity of the labeled data, we choose the number of prototypes involved in representing each class. This number can vary from a single prototype per class (if class separations are simple) to a large number of prototypes per class (if class separations are complex). Also, different classes can involve different numbers of prototypes, depending on their distribution in space.
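As an illustration, here is a minimal Python sketch of the nearest-prototype classification rule described above (the names classify, prototypes, and proto_labels are our own, not from the presentation):

```python
import numpy as np

def classify(x, prototypes, proto_labels):
    """Label a test point with the class of its nearest prototype,
    using Euclidean distance (prototypes stacked as rows)."""
    d = np.linalg.norm(prototypes - x, axis=1)  # distance to each prototype
    return proto_labels[np.argmin(d)]           # label of the closest one
```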
6
Learning Vector Quantization
INTRODUCTION
During the training procedure, positions of the prototypes are updated based on the distance from the points in the given dataset. Basically, we scan through the dataset and, for every point, determine the closest prototype. Once the closest prototype is found, it is moved towards the point if their classes match, or away from it if they differ. LVQ is an on-line learning algorithm; its computational effort scales linearly with the size of the dataset. Once one scan through the data is finished, the prototypes should be in their optimal positions. However, there are some applications where multiple scans are needed.
7
Learning Vector Quantization
INTRODUCTION
There are several different LVQ algorithms that handle the prototype updates in different ways. The three main variants are LVQ1, LVQ2, and LVQ3. There are also LVQ2.1, LFM, LFMW, weighted LVQ, etc.
8
Learning Vector Quantization
LVQ 1
For each training point x(t), all of the reference vectors (prototypes) are searched and the reference vector closest to the point is found, using the Euclidean distance measure. If this reference vector (prototype) m_i belongs to the same class as the training point x(t), it is moved closer to the point, in proportion to the distance between the two vectors. If the closest reference vector (prototype) m_i belongs to a class other than that of the point x(t), it is moved away, again in proportion to the distance between the two vectors:
Same class: m_i(t+1) = m_i(t) + a(t) (x(t) - m_i(t))
Different class: m_i(t+1) = m_i(t) - a(t) (x(t) - m_i(t))
where a(t) is a monotonically decreasing function of time.
(Figure: a training point of class 1, with a prototype of class 1 and a prototype of class 2.)
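A minimal sketch of the LVQ1 training loop, assuming NumPy arrays for the data and prototypes; the linear decay schedule for a(t) is our illustrative choice, since the slides only require a(t) to decrease monotonically:

```python
import numpy as np

def lvq1_train(X, y, prototypes, proto_labels, a0=0.03, epochs=30):
    """LVQ1 sketch: for each training point, move the closest prototype
    towards it if the classes match, away from it if they differ."""
    prototypes = prototypes.astype(float).copy()
    n_steps = epochs * len(X)
    t = 0
    for _ in range(epochs):
        for x, label in zip(X, y):
            a = a0 * (1 - t / n_steps)  # monotonically decreasing a(t)
            i = np.argmin(np.linalg.norm(prototypes - x, axis=1))
            if proto_labels[i] == label:
                prototypes[i] += a * (x - prototypes[i])  # same class: closer
            else:
                prototypes[i] -= a * (x - prototypes[i])  # wrong class: away
            t += 1
    return prototypes
```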
9
Learning Vector Quantization
LVQ2
For a certain training point x(t), three conditions must be met for LVQ2 learning to occur:
1) The closest prototype to x(t) has to be of the wrong class - m_i.
2) The next closest prototype to x(t) has to be of the correct class - m_j.
3) The training point x(t) must fall inside a small symmetric window defined around the midpoint of m_i and m_j.
UPDATE STEP
m_i(t+1) = m_i(t) - a(t) (x(t) - m_i(t))
m_j(t+1) = m_j(t) + a(t) (x(t) - m_j(t))
where x(t) is a training vector belonging to class j, m_i is the reference vector of the incorrect category, m_j is the reference vector of the correct category, and a(t) is a monotonically decreasing function of time. A common initial value is a(0) = 0.03.
It can be seen that this scheme assures that the decision line between the two vectors will eventually attain a near-optimal position given the probability distributions of the categories, namely, the place where the distributions cross.
Let d_i and d_j be the distances from a training point x(t) to the prototypes m_i and m_j. Then x(t) falls inside the window if
min(d_i / d_j, d_j / d_i) > s
where s is a constant factor, commonly chosen between 0.4 and 0.8.
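A possible sketch of a single LVQ2 update in Python, using the window test and the parameter values quoted above (the helper's shape and names are ours):

```python
import numpy as np

def lvq2_update(x, label, prototypes, proto_labels, a=0.03, s=0.6):
    """Sketch of one LVQ2 step: update the two nearest prototypes only
    when the closest is of the wrong class, the runner-up is of the
    correct class, and x(t) falls inside the symmetric window."""
    d = np.linalg.norm(prototypes - x, axis=1)
    i, j = np.argsort(d)[:2]                       # two closest prototypes
    in_window = min(d[i] / d[j], d[j] / d[i]) > s  # symmetric window test
    if proto_labels[i] != label and proto_labels[j] == label and in_window:
        prototypes[i] -= a * (x - prototypes[i])   # wrong class: move away
        prototypes[j] += a * (x - prototypes[j])   # correct class: move closer
```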
10
Learning Vector Quantization
LFM
For each training point x(t), all of the reference vectors (prototypes) are searched and the reference vector closest to the point is found, using the Euclidean distance measure. If this reference vector (prototype) belongs to the same class as the training point, do NOTHING! If the closest reference vector (prototype) m_i belongs to a class other than that of the training point, it is moved away, in proportion to the distance between the two vectors. After that, find the closest prototype m_j of the same class as the training point. This prototype is then moved closer to the training point, again in proportion to the distance between the two vectors:
m_i(t+1) = m_i(t) - a(t) (x(t) - m_i(t))
m_j(t+1) = m_j(t) + a(t) (x(t) - m_j(t))
(Figure: a training point of class 4, with prototypes of classes 1, 2, and 4.)
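A corresponding sketch of one LFM update (again assuming NumPy arrays; proto_labels must be an array so the class mask works):

```python
import numpy as np

def lfm_update(x, label, prototypes, proto_labels, a=0.03):
    """Sketch of one LFM step: do nothing on a correct classification;
    on a mistake, push the wrong winner away and pull the closest
    correct-class prototype towards the point."""
    d = np.linalg.norm(prototypes - x, axis=1)
    i = np.argmin(d)                              # closest prototype overall
    if proto_labels[i] == label:
        return                                    # correct class: do NOTHING
    prototypes[i] -= a * (x - prototypes[i])      # move wrong winner away
    same = np.flatnonzero(proto_labels == label)  # correct-class prototypes
    j = same[np.argmin(d[same])]                  # closest correct-class one
    prototypes[j] += a * (x - prototypes[j])      # move it closer
```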
11
PROBLEMS WITH LVQ
12
Problems with LVQ
Some Issues
  • How to initialize positions of the prototypes?
  • How many prototypes per class to choose? 10, 20, 30? It depends on the situation.
  • Some classes have more complicated distributions in the feature space than others, so they need more prototypes. How to detect this?
  • If the data set is unbalanced (90% of the training data is of class 1 and 10% of class 2), how many prototypes to assign to each class? More of them to class 1, or more of them to class 2?
  • As a result of noise, some prototypes end up in positions where they are increasing classification error instead of decreasing it. They are doing more harm than good. Example: 2 Gaussians in 2D.
  • If we are working on a budget (100 prototypes), do we use them all right away, or do we start with a small number of prototypes and smartly increase their number during classification?

(Figure: 2 Gaussians in 2D - a prototype will initially be chosen here, where it will remain trapped.)
13
Problems with LVQ
Complicated Data Sets
  • It can be shown that regular LVQ doesn't cope well with complicated distributions in feature space, even in the 2D case.
  • Example:

After 0 LVQ iterations (based on initial prototype positions): Accuracy 0.6808, number of misclassified points: 3192
After 30 LVQ iterations: Accuracy 0.88, number of misclassified points: 1173
After 60 LVQ iterations: Accuracy 0.87, number of misclassified points: 1207
Training Data Set: 10,000 points, 4 classes
Initially, choose 100 points of each class as prototypes.
14
Problems with LVQ
Complicated Data Sets
  • Why are these points misclassified after so many iterations?
  • There must be learning going on. But nevertheless, these points remain misclassified.
  • No matter how much these points pull the prototypes of the correct class towards them, the prototypes never seem to arrive.
  • There is a simple explanation for this: some other points are dragging the prototypes back, so that they themselves remain correctly classified. This means we don't have enough prototypes.
  • So we come to the conclusion that we have to add some more prototypes at certain places.

15
Adaptive LVQ
LVQ add / LVQ remove
  • We introduced a novel modification of LVQ called Adaptive LVQ.
  • The idea is to start with an equal initial number of prototypes per class.
  • Then, add prototypes to better describe more complicated class regions, and remove prototypes that are increasing classification error instead of decreasing it.
  • We add two steps at the end of every LVQ iteration: LVQremove and LVQadd.

16
Adaptive LVQ
LVQ ADD
  • LVQadd concentrates on the points of each class that are misclassified during LVQ training.
  • Using hierarchical clustering, we find whole clusters of such points that are misclassified due to an insufficient number of prototypes of that class.
  • Then, we add prototypes at the positions of the cluster centroids to improve classification accuracy.
  • We can control the size of the clusters we want to take into consideration and the number of prototypes we are allowed to add.

17
Adaptive LVQ
LVQ ADD
  • First, we isolate training points that are misclassified by the existing prototypes.
  • Then we concentrate on each class separately to find clusters of misclassified points and determine their centroids.

(Figure: per-class clusters of misclassified points - large "interesting" clusters versus small "not interesting" ones - for CLASS 1, CLASS 2, etc.)
18
Adaptive LVQ
LVQ ADD
  • We are not interested in small clusters of data. We can control the sensitivity of our algorithm (for example, consider only clusters with 4 or more points).
  • After LVQadd, the new prototypes are added to the existing ones.
  • There is usually some budget involved. Let's say we start with 10 prototypes altogether; we can set a limiting budget of 50 prototypes.
  • So if LVQadd has already added 40 prototypes during the first 30 iterations, then in order to add more it has to wait for LVQremove to remove some of them. (A sketch of the LVQadd step is given below.)
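A possible sketch of LVQadd using SciPy's hierarchical clustering; the min_cluster_size and dist_threshold knobs are illustrative stand-ins for the sensitivity controls described above:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def lvq_add(X_mis, y_mis, prototypes, proto_labels, budget,
            min_cluster_size=4, dist_threshold=1.0):
    """Sketch of LVQadd: hierarchically cluster the misclassified points
    of each class and add a prototype at each large cluster's centroid,
    without exceeding the overall prototype budget."""
    for c in np.unique(y_mis):
        pts = X_mis[y_mis == c]
        if len(pts) < min_cluster_size or len(prototypes) >= budget:
            continue
        # cut the dendrogram at a distance threshold to get flat clusters
        clusters = fcluster(linkage(pts, method='ward'),
                            t=dist_threshold, criterion='distance')
        for k in np.unique(clusters):
            members = pts[clusters == k]
            if len(members) >= min_cluster_size and len(prototypes) < budget:
                prototypes = np.vstack([prototypes, members.mean(axis=0)])
                proto_labels = np.append(proto_labels, c)
    return prototypes, proto_labels
```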

19
Adaptive LVQ
LVQ REMOVE
  • LVQremove is introduced to deal with possible prototype outcasts, trapped prototypes, and prototypes that are stuck in positions where they classify more training points incorrectly than correctly.
  • This can also happen to the prototypes added as a result of LVQadd.
  • We gather statistics about each prototype during LVQ training and combine these statistics into a unique prototype score.
  • For each prototype i:
Score_i = A_i - B_i + C_i
A_i counts how many times prototype i classified correctly (and hasn't been moved). B_i counts how many times prototype i has been moved away as a prototype of the wrong class. C_i counts how many times prototype i has been moved towards as a prototype of the correct class.
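A minimal sketch of the score computation and removal rule (the counters A, B, C are assumed to be accumulated during training, one entry per prototype):

```python
import numpy as np

def lvq_remove(prototypes, proto_labels, A, B, C):
    """Sketch of LVQremove: compute Score_i = A_i - B_i + C_i and drop
    prototypes whose score is negative."""
    score = A - B + C
    keep = score >= 0  # negative score: prototype does more harm than good
    return prototypes[keep], proto_labels[keep]
```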
20
Adaptive LVQ
LVQ REMOVE
  • Prototypes with a negative score are increasing classification error instead of decreasing it, and as a result they are removed.
  • Based on the SCORE, a prototype is a good prototype if it has to be moved a small number of times AND it classifies correctly a large number of times. It is STABLE!
  • The purpose of LVQremove is to detect bad prototypes and remove them:

1. Outcast prototypes - never, or almost never, selected as the closest prototype. They have small A_i and small C_i; they are not influencing any point. These prototypes can be removed simply, without using the SCORE.
2. Prototypes that are too close to one another - we merge them.
3. Trapped prototypes - selected as the closest prototype a large number of times, but they usually misclassify. They have large B_i and small A_i. They can never escape their destiny and will always be moved around (the 2D Gauss case).
21
Adaptive LVQ
IMPLEMENTATION
  • LVQadd and LVQremove together form Adaptive LVQ, which can be applied to any algorithm in the LVQ family (with slight adjustments).
  • For example: LVQ2 + LVQadd + LVQremove = Adaptive LVQ2 (sketched below).
  • LVQremove and LVQadd are performed after each LVQ iteration.
  • Adaptive LVQ has many interesting applications. We can use it to:
    - form a multi-class BUDGET classification algorithm
    - determine which classes need more prototypes and which need fewer
    - determine how many prototypes are enough for good classification
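A possible outer loop tying the earlier sketches together (lvq2_update, lvq_remove, lvq_add, and classify are the illustrative helpers defined above, not the authors' exact code; a full implementation would also update the A, B, C counters inside the inner loop):

```python
import numpy as np

def adaptive_lvq2(X, y, prototypes, proto_labels, budget, epochs=30):
    """Sketch of Adaptive LVQ2: one LVQ2 pass over the data, followed by
    an LVQremove step and an LVQadd step, repeated for several epochs."""
    for _ in range(epochs):
        # per-prototype counters for Score_i = A_i - B_i + C_i
        A = np.zeros(len(prototypes))
        B = np.zeros(len(prototypes))
        C = np.zeros(len(prototypes))
        for x, label in zip(X, y):
            lvq2_update(x, label, prototypes, proto_labels)
            # (update A, B, C here based on what the step did)
        prototypes, proto_labels = lvq_remove(prototypes, proto_labels, A, B, C)
        mis = np.array([classify(x, prototypes, proto_labels) != c
                        for x, c in zip(X, y)])
        if mis.any():
            prototypes, proto_labels = lvq_add(X[mis], y[mis], prototypes,
                                               proto_labels, budget)
    return prototypes, proto_labels
```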

22
EXPERIMENTS AND RESULTS
23
RESULTS
COMPLICATED 2D CASE
  • We use the same data set as before. This time we start with 20 training points (5 of each class) as initial prototypes and use Adaptive LVQ to build a model.
  • Our limit is 100 prototypes, since we did the previous experiments with this number of prototypes.

After 30 LVQ iterations: Accuracy 0.982, number of misclassified points: 174
Training Data Set: 10,000 points, 4 classes
24
RESULTS
MAJOR DATA SETS
  • We compared Adaptive LVQ to regular LVQ in classification results on 10 major data sets.
  • Adaptive LVQ brings a 6.4% accuracy improvement on average.

DATA SET   | classes | prototypes | LVQ2 only | LVQ2 + LVQ add/remove
Adult      |  2      | 100        | 0.8017    | 0.8136
Letter     | 26      | 858        | 0.7750    | 0.8968
Usps       | 10      | 280        | 0.8854    | 0.9253
Shuttle    |  7      | 105        | 0.8974    | 0.9954
Ijcnn      |  2      | 242        | 0.7633    | 0.9339
Pendigits  |  2      |  84        | 0.9703    | 0.9834
Gauss      | 56      |  52        | 0.8842    | 0.9152
Ionosphere |  2      |  14        | 0.9060    | 0.9602
Iris       |  3      |   6        | 0.9133    | 0.9666
Vowel      | 11      |  82        | 0.4480    | 0.4913
AVERAGE    |         |            | 0.82446   | 0.88817