Feature Selection and Weighting using Genetic Algorithm for Off-line Character Recognition Systems


1
Feature Selection and Weighting using Genetic
Algorithm for Off-line Character Recognition
Systems
The University of British Columbia
Department of Electrical & Computer Engineering
Presented by
Faten Hussein
2
Outline
  • Introduction & Problem Definition
  • Motivation & Objectives
  • System Overview
  • Results
  • Conclusions

3
Introduction
Off-line Character Recognition System
Pipeline: Text document → Scanning → Pre-Processing →
Feature Extraction → Classification → Post-Processing →
Classified text
Applications:
  • Address readers
  • Bank cheque readers
  • Reading data entered in forms (e.g., tax forms)
  • Detecting forged signatures
4
Introduction
For a typical handwritten recognition task:
  • Characters (symbols) vary widely in shape and size.
  • Different writers have different writing styles.
  • The same person can write in different styles at
    different times.
  • Thus, an unlimited number of variations exists for a
    single character.

5
Introduction
[Figure: variations in handwritten digits extracted from
zip codes, annotated with loop (L) and end-point (E)
counts, e.g. L0/E3, L1/E1, L2/E0]
To overcome this diversity, a large number of features
must be added.
Examples of the features we used are moment invariants,
number of loops, number of end points, centroid, area,
circularity, and so on.
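For concreteness, a minimal sketch of how a few of these shape
features could be computed from a binarized digit image, assuming
OpenCV is available; the loop-count heuristic (counting enclosed
background regions) is an illustration, not necessarily the
authors' exact method:

```python
import cv2
import numpy as np

def shape_features(binary_img):
    """A few of the shape features named above, from a binarized digit.

    binary_img: 2-D uint8 array with foreground (ink) pixels = 255.
    """
    m = cv2.moments(binary_img, binaryImage=True)
    hu = cv2.HuMoments(m).flatten()              # the 7 Hu moment invariants
    area = m["m00"]
    centroid = (m["m10"] / area, m["m01"] / area)

    contours, _ = cv2.findContours(binary_img, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    perimeter = cv2.arcLength(max(contours, key=cv2.contourArea), True)
    circularity = 4 * np.pi * area / perimeter ** 2  # 1.0 for a perfect circle

    # Illustrative loop count: background components fully enclosed by ink.
    n_labels, _ = cv2.connectedComponents((binary_img == 0).astype(np.uint8))
    loops = n_labels - 2   # drop the outer background and the ink region itself

    return {"hu": hu, "area": area, "centroid": centroid,
            "circularity": circularity, "loops": loops}
```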
6
Problem
Dilemma
To accommodate variations in symbols, a character
recognition system must add more features, in the hope of
increasing classification accuracy. But adding more
features:
  • Increases the problem size
  • Increases run time and memory for classification
  • Might add redundant/irrelevant features, which decrease
    the accuracy
  • Is an ad-hoc process that depends on experience and
    trial and error
7
Feature Selection
Solution: Feature Selection
Definition: select a relevant subset of features from a
larger set of features while maintaining or enhancing
accuracy.
Advantages
  • Removes irrelevant and redundant features
  • Total of 40 features → reduced to 16
  • 7 Hu moments → only the first three kept
  • Area removed → redundant with Circularity
  • Maintains or enhances the classification accuracy
  • 70% recognition rate using 40 features → 75% after FS
    using only 16 features
  • Faster classification and lower memory requirements
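To illustrate the mechanics of such a reduction, a sketch using
hypothetical random data, with scikit-learn's KNeighborsClassifier
standing in for the 1-NN classifier used in the slides (with real
features, dropping irrelevant ones can raise accuracy as reported
above; random data only demonstrates the masking):

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 40))      # hypothetical: 500 digits, 40 features
y = rng.integers(0, 10, size=500)   # hypothetical digit labels 0-9

mask = np.zeros(40, dtype=bool)
mask[:16] = True                    # a binary chromosome keeping 16 of 40 features

knn = KNeighborsClassifier(n_neighbors=1)
acc_all = cross_val_score(knn, X, y, cv=5).mean()
acc_sel = cross_val_score(knn, X[:, mask], y, cv=5).mean()
print(f"40 features: {acc_all:.3f}   16 selected: {acc_sel:.3f}")
```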

8
Feature Selection/Weighting
Feature Selection (FS) vs. Feature Weighting (FW)
  • FS is the special case: binary weights (0 for
    irrelevant/redundant, 1 for relevant). Number of feature
    subset combinations: 2^N for N features.
  • FW is the general case: real-valued weights (variable
    weights depending on the feature's relevance). Number of
    weight combinations: w^N for w possible weight values.
  • Assigning weights (binary or real-valued) to features
    needs a search algorithm to find the set of weights that
    gives the best classification accuracy (an optimization
    problem).
  • A genetic algorithm is a good search method for such
    optimization problems.
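A small sketch of the two chromosome encodings and their
search-space sizes, assuming FW weights are quantized to w
discrete values (the w = 4 grid below is illustrative):

```python
import numpy as np

N = 40                                  # number of extracted features
rng = np.random.default_rng(1)

# FS chromosome: one bit per feature (0 = drop, 1 = keep) -> 2**N candidates
fs_chromosome = rng.integers(0, 2, size=N)

# FW chromosome: one of w discrete weight values per feature -> w**N candidates
w = 4
weight_grid = np.linspace(0.0, 1.0, w)  # illustrative grid {0, 1/3, 2/3, 1}
fw_chromosome = rng.choice(weight_grid, size=N)

print(f"FS search space: 2^{N} = {2 ** N:.2e}")
print(f"FW search space: {w}^{N} = {float(w) ** N:.2e}")
```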

9
Genetic Feature Selection/Weighting
Why use a GA for FS/FW?
  • It has been proven to be a powerful search method for
    the FS problem
  • It requires no derivative information or extra
    knowledge; only the objective function (the classifier's
    error rate) is needed to evaluate the quality of a
    feature subset
  • It searches a population of solutions in parallel, so it
    can provide a number of potential solutions, not only
    one
  • It is resistant to becoming trapped in local minima

10
Objectives & Motivations
Build a genetic feature selection/weighting system, apply
it to the character recognition problem, and investigate
the following issues:
  • Study the effect of varying weight values on the number
    of selected features (FS often eliminates more features
    than FW, but by how much?)
  • Compare the performance of genetic feature selection and
    weighting in the presence of irrelevant and redundant
    features (not studied before)
  • Compare the performance of genetic feature selection and
    weighting for regular cases (test the hypothesis that FW
    should have better, or at least the same, results as FS)
  • Evaluate the performance of the better method (GFS or
    GFW) in terms of optimality and time complexity (study
    the feasibility of genetic search with respect to
    optimality and time)

11
Methodology
  • The recognition problem is to classify isolated
    handwritten digits
  • Used a k-nearest-neighbor classifier (k = 1)
  • Used a genetic algorithm as the search method
  • Applied genetic feature selection and weighting in the
    wrapper approach (i.e., the fitness function is the
    classifier's error rate); a sketch of this loop follows
    below
  • Used two phases during the program run: a
    training/testing phase and a validation phase
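A condensed sketch of this wrapper loop on hypothetical data; the
GA operators used here (tournament selection, uniform crossover,
bit-flip mutation) and all parameter values are illustrative
stand-ins, not the presentation's actual configuration:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(2)
X = rng.normal(size=(300, 20))        # hypothetical feature matrix
y = rng.integers(0, 10, size=300)     # hypothetical digit labels

def fitness(mask):
    """Wrapper fitness: cross-validated 1-NN accuracy on the selected features."""
    if not mask.any():
        return 0.0                    # an empty feature subset is invalid
    clf = KNeighborsClassifier(n_neighbors=1)
    return cross_val_score(clf, X[:, mask.astype(bool)], y, cv=3).mean()

pop = rng.integers(0, 2, size=(30, X.shape[1]))  # 30 random binary chromosomes
for generation in range(25):
    scores = np.array([fitness(ind) for ind in pop])
    # Tournament selection: each parent is the fitter of two random individuals.
    pairs = rng.integers(0, len(pop), size=(2 * len(pop), 2))
    winners = np.where(scores[pairs[:, 0]] >= scores[pairs[:, 1]],
                       pairs[:, 0], pairs[:, 1])
    parents = pop[winners]
    # Uniform crossover followed by bit-flip mutation (no elitism, for brevity).
    a, b = parents[0::2], parents[1::2]
    children = np.where(rng.random(a.shape) < 0.5, a, b)
    children ^= (rng.random(children.shape) < 0.02).astype(children.dtype)
    pop = children

best = pop[np.argmax([fitness(ind) for ind in pop])]
print("selected features:", np.flatnonzero(best))
```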

12
System Overview
Input (isolated handwritten digit images)
  → Pre-Processing Module → clean images
  → Feature Extraction Module → all extracted features (N)
  → Feature Selection/Weighting Module (GA), which proposes
    a feature subset to the Evaluation Module (KNN
    classifier) and receives back an assessment of that
    subset
  → Best feature subset (M < N)
The run is split into a training/testing (evaluation) phase
and a validation phase.
13
Results (Comparison 1)
Effect of varying weight values on the number of
selected features
  • As the number of weight values increases, the
    probability of a feature having weight value 0 (POZ)
    decreases, so the number of eliminated features
    decreases
  • GFS eliminates more features (thus selects fewer) than
    GFW because of its smaller number of weight values
    (0/1), and it does so without compromising
    classification accuracy
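One way to see the first point: if each of the w weight values
were equally likely in a random chromosome, then POZ = 1/w, so the
expected number of zero-weighted (eliminated) features falls as w
grows. The uniform-choice assumption is illustrative; the slides
do not state the weight distribution:

```python
N = 40                                  # total number of features
for w in (2, 4, 8, 16):                 # number of possible weight values
    poz = 1 / w                         # P(weight == 0) under a uniform choice
    print(f"w={w:2d}: POZ={poz:.3f}, expected eliminated features = {N * poz:.1f}")
```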

14
Results (Comparison 2)
Performance of genetic feature selection/weighting
in the presence of irrelevant features
  • The performance of the 1-NN classifier degrades rapidly
    as the number of irrelevant features increases
  • As the number of irrelevant features increases, FS
    outperforms all FW settings in both classification
    accuracy and elimination of features

15
Results (Comparison 3)
Performance of genetic feature selection/weighting
in the presence of redundant features
  • The classification accuracy of 1-NN does not suffer much
    from added redundant features, but they do increase the
    problem size
  • As the number of redundant features increases, FS has
    slightly better classification accuracy than all FW
    settings, and significantly outperforms FW in
    elimination of features

16
Results (Comparison 4)
Performance of genetic feature selection/weighting
for regular cases (not necessarily containing
irrelevant/redundant features)
  • FW achieves better training accuracies than FS, but FS
    generalizes better (it has better accuracies on unseen
    validation samples)
  • FW over-fits the training samples

17
Results (Evaluation 1)
Convergence of GFS to an Optimal or Near-Optimal
Set of Features
Number of features | Best exhaustive (class. rate %) | Best GA (class. rate %) | Average GA, 5 runs (%)
8  | 74   | 74   | 74
10 | 75.2 | 75.2 | 75.2
12 | 77.2 | 77.2 | 77.04
14 | 79   | 79   | 78.56
16 | 79.2 | 79   | 78.28
18 | 79.4 | 79.4 | 78.92
  • GFS was able to return optimal or near-optimal values
    (those reached by exhaustive search)
  • The worst average value obtained by GFS is less than 1%
    away from the optimal value

18
Results (Evaluation 2)
Convergence of GFS to an Optimal or Near-Optimal
Set of Features within an Acceptable Number of
Generations
The time needed for GFS is bounded below by a linear-fit
curve and above by an exponential-fit curve (a fitting
sketch follows below).
Using GFS on high-dimensional problems needs parallel
processing.
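A sketch of how such bounding curves can be fitted with SciPy; the
feature counts are those from the previous slide's table, but the
generation counts are hypothetical placeholders, not the
presentation's measurements:

```python
import numpy as np
from scipy.optimize import curve_fit

n_features  = np.array([8, 10, 12, 14, 16, 18])      # from the table above
generations = np.array([12, 18, 27, 35, 52, 80.0])   # hypothetical run lengths

def linear(n, a, b):
    return a * n + b

def exponential(n, a, b):
    return a * np.exp(b * n)

(la, lb), _ = curve_fit(linear, n_features, generations)
(ea, eb), _ = curve_fit(exponential, n_features, generations, p0=(1.0, 0.2))

print(f"lower, linear fit:      {la:.2f} * n + {lb:.2f}")
print(f"upper, exponential fit: {ea:.3f} * exp({eb:.3f} * n)")
```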
19
Conclusions
  • GFS is superior to GFW in feature reduction, without
    compromising classification accuracy
  • In the presence of irrelevant features, GFS is better
    than GFW in both feature reduction and classification
    accuracy
  • In the presence of redundant features, GFS is also
    preferred over GFW due to its greater ability to reduce
    features
  • For regular databases, it is advisable to use 2 or 3
    weight values at most, to avoid over-fitting
  • GFS is a reliable method for finding optimal or
    near-optimal solutions, but needs parallel processing
    for large problem sizes

20
Questions?