Title: On Predicting Rare Classes with SVM Ensembles in Scene Classification


1
On Predicting Rare Classes with SVM Ensembles in
Scene Classification
  • Rong Yan, Yan Liu, Rong Jin, Alex Hauptmann

2
Outline
  • Motivation
  • SVM ensembles to solve the rare class problem
  • SVM ensembles
  • Combination
  • Held-out training set
  • Experimental Results
  • Conclusion

3
Scene Classification
  • Task: classify images into meaningful semantic
    scenes
  • Indoor, Outdoor, Cityscape, Landscape, Sunrise
  • Applications
  • Descriptor in the Multimedia Content Description
    Interface (MPEG-7)
  • Provide high-level features for retrieval systems

4
Scene Classification
  • Problem formulation: binary classification based
    on low-level visual features
  • Image color, texture, and shape features
  • Support Vector Machines (SVMs)
  • Maximize the margin between classes

5
Rare Class Problem
  • Small number of positive examples in real-world
    data
  • Positive examples form a coherent subset (e.g.,
    Cityscape, Landscape, Sunrise, Sunset)
  • Negative examples are less well-defined, amounting
    to "everything else"
  • The rare class problem is very common
  • Scene classification, fraud detection, network
    intrusion, text categorization, and web mining
  • However, most studies are based on balanced data
    sets
  • The number of positive examples is comparable to
    the number of negative examples

6
Why study the rare class problem?
  • Imbalance dramatically degrades classification
    performance
  • All images may be predicted as negative
  • Reason: most classifiers minimize overall
    classification error

7
Solutions
  • Better performance measures
  • Precision (P) and recall (R)
  • F1 measure: 2PR / (P + R)
  • Better classification schemes
  • Under-sampling: throw away negative data
  • Over-sampling: replicate the positive data
  • Boosting: increase the weight of positive data
    iteratively

8
Under-Sampling
  • Throw away negative data to balance the training
    data distribution

9
Over-Sampling
  • Replicate the positive data to balance the
    training data distribution
  • Equivalent to changing the cost function of the
    positive data (see the sketch below)
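
This equivalence is easy to see in code. Below is a minimal sketch, assuming scikit-learn's SVC (the talk itself used SVMLight): duplicating each positive example k times has the same effect as making positive-class errors k times more expensive.

```python
# Sketch: over-sampling vs. cost weighting (scikit-learn assumed;
# the talk itself used SVMLight).
from sklearn.svm import SVC

k = 10  # replication factor for the positive class

# Over-sampling view: duplicate every positive example k times
# before training. Cost-function view: leave the data unchanged
# and make positive-class errors k times more expensive.
clf = SVC(kernel="rbf", gamma=0.05, class_weight={1: k, 0: 1})
# clf.fit(X_train, y_train)  # no data replication needed
```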

10
Motivation for our approach
  • Drawbacks of existing approaches
  • Under-sampling: loses most of the potentially
    useful data
  • Over-sampling: much more training time
  • Critical for SVMs: time complexity O(N_SV²), where
    N_SV is the number of support vectors
  • Varying the cost function: more training time
  • Boosting: much more training time, and performance
    depends on the weak classifiers
  • Our approach: SVM ensembles

11
Outline
  • Motivation
  • SVM ensembles to solve the rare class problem
  • SVM ensembles
  • Combination
  • Held-out training set
  • Experimental Results
  • Conclusion

12
SVM ensembles
  • Training process (see the sketch below)
  • Decompose the negative examples into groups
  • Combine the positive examples with each group of
    negative examples
  • Train base classifiers (SVMs) individually
  • Combine all base classifiers
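
A minimal sketch of this training process, assuming scikit-learn; the name train_ensemble and the random grouping are illustrative choices, not taken from the paper.

```python
import numpy as np
from sklearn.svm import SVC

def train_ensemble(X_pos, X_neg, n_groups):
    """Train one base SVM per group of negative examples."""
    # Decompose the negative examples into n_groups random groups.
    groups = np.array_split(np.random.permutation(len(X_neg)), n_groups)
    models = []
    for idx in groups:
        # Combine ALL positives with this one group of negatives.
        X = np.vstack([X_pos, X_neg[idx]])
        y = np.concatenate([np.ones(len(X_pos)), np.zeros(len(idx))])
        # probability=True so later combination can use posteriors.
        m = SVC(kernel="rbf", gamma=0.05, probability=True)
        models.append(m.fit(X, y))
    return models
```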

13
Why SVM ensembles?
  • Advantages over over-sampling and under-sampling
  • Keeps all of the useful information
  • Less training time than over-sampling (2-3x
    speedup)
  • Reduces the variance of classifier predictions
  • Overcomes limitations of SVM software
  • Most software produces sub-optimal results
  • Combination might provide better performance

14
Classifier Combination
  • How to combine the base classifiers?
  • Determine the aggregation function F(x)

15
Fixed combining rules
  • Fixed combining rules (see the sketch below)
  • Majority voting: each classifier votes for a class
  • Sum rule: sum the posterior probabilities
  • Averages out the mistakes of individual
    classifiers
  • They are sub-optimal
  • Base classifiers are never equally imperfectly
    trained
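
A sketch of the two fixed rules, applied to the base SVMs trained above (posterior outputs assumed enabled via probability=True):

```python
import numpy as np

def majority_vote(models, X):
    # Each base SVM casts a 0/1 vote; predict the majority class
    # (ties broken toward the positive class).
    votes = np.stack([m.predict(X) for m in models])  # shape (K, n)
    return (votes.mean(axis=0) >= 0.5).astype(int)

def sum_rule(models, X):
    # Sum (equivalently, average) the positive-class posterior
    # probabilities of all base SVMs, then threshold.
    probs = np.stack([m.predict_proba(X)[:, 1] for m in models])
    return (probs.mean(axis=0) >= 0.5).astype(int)
```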

16
Meta-classifiers
  • Treat classifier outputs as features: y = (y1,
    ..., yK)
  • Stacking: another classifier on top (see the
    sketch below)
  • Multi-layer perceptrons (MLPs)
  • SVMs
  • Advantages
  • Learn the combination weights automatically
  • Less sensitive to poorly performing classifiers
  • SVMs require less tuning effort than MLPs
  • Require a held-out set to train
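
A sketch of stacking an SVM meta-classifier on the base outputs, again assuming scikit-learn; base_outputs and train_meta_svm are illustrative names.

```python
import numpy as np
from sklearn.svm import SVC

def base_outputs(models, X):
    # Feature vector y = (y1, ..., yK): one posterior probability
    # per base SVM, treated as a K-dimensional meta-feature.
    return np.column_stack([m.predict_proba(X)[:, 1] for m in models])

def train_meta_svm(models, X_held, y_held):
    # Stack an SVM on top of the base outputs; its weights are the
    # learned combination weights. Trained on a held-out set.
    meta = SVC(kernel="linear")
    return meta.fit(base_outputs(models, X_held), y_held)

# Prediction: meta.predict(base_outputs(models, X_test))
```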

17
Held-out training set
  • Purpose: train the top-level classifier
  • Select training examples within the local region
    of the test set (see the sketch below)
  • The training set distribution is likely to differ
    from the test set distribution
  • A local held-out set gives a better estimate of
    the test set distribution
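
The slides do not give the exact selection rule; one plausible reading, sketched here under a nearest-neighbor assumption, keeps only training examples that lie close to some test point.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def local_heldout(X_train, y_train, X_test, k=5):
    # Keep only training examples that fall among the k nearest
    # neighbours of some test point, so the held-out set better
    # reflects the test set distribution.
    nn = NearestNeighbors(n_neighbors=k).fit(X_train)
    _, idx = nn.kneighbors(X_test)   # (n_test, k) neighbour indices
    sel = np.unique(idx.ravel())
    return X_train[sel], y_train[sel]
```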

18
Outline
  • Motivation
  • SVM ensembles to solve the rare class problem
  • SVM ensembles
  • Combination
  • Held-out training set
  • Experimental Results
  • Conclusion

19
Experimental Setting
  • TREC-2002 Video Track, Feature Extraction Task
  • Text REtrieval Conference (TREC) Video Track
  • http://www-nlpir.nist.gov/projects/t01v/
  • Feature Extraction Task
  • 23 hours of digital video
  • Manually labeled and sampled to more than 2,000
    images
  • Our task: detect cityscape and landscape in images

20
Experimental Setting
  • Classification setting
  • SVMLight, RBF kernel, gamma = 0.05
  • 10-fold cross-validation
  • Performance measure: F1
  • Image features (see the sketch below)
  • Color features: mean and variance in HSV space
  • Texture features: Gabor filters with 6 angles
  • Total feature length: 144
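
A rough sketch of this feature set using scikit-image; the exact 144-dimensional layout (any spatial grid, filter scales, or normalization) is not specified in the slides and is assumed here.

```python
import numpy as np
from skimage.color import rgb2hsv
from skimage.filters import gabor

def image_features(rgb):
    """Color (HSV mean/variance) + texture (Gabor, 6 angles).
    The exact 144-dim layout is an assumption, not from the slides."""
    hsv = rgb2hsv(rgb)
    color = [hsv[..., c].mean() for c in range(3)] + \
            [hsv[..., c].var() for c in range(3)]
    gray = rgb.mean(axis=2)              # simple grayscale
    texture = []
    for i in range(6):                   # 6 Gabor orientations
        real, _ = gabor(gray, frequency=0.3, theta=i * np.pi / 6)
        texture += [real.mean(), real.var()]
    return np.array(color + texture)
```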

21
Results
  • Results for cityscape

22
Results
  • Results for landscape

23
Conclusion
  • SVM ensembles are effective at addressing the rare
    class problem
  • Hierarchical SVMs are preferred
  • High effectiveness
  • High efficiency
  • Future work
  • Extend our discussion to other types of ensembles
    and base classifiers