Dr. Claude C. Chibelushi - PowerPoint PPT Presentation
Transcript and Presenter's Notes

Title: Dr. Claude C. Chibelushi


1
Faculty of Computing, Engineering and Technology, Staffordshire University
Image Processing, Computer Vision, and Pattern Recognition
Statistical Pattern Recognition - Part II: Classical Model
Dr. Claude C. Chibelushi
2
Outline
  • Introduction
  • Classical Pattern Recognition Model
  • Feature Extraction
  • Classification
  • Applications
    • Optical Character Recognition
    • Others
  • Summary

3
Introduction
  • Pattern recognition often consists of a sequence of processes
  • often configured according to the classical pattern recognition model

4
Classical Recognition Model
Simplified block diagram (example: facial image)
5
Classical Recognition Model
  • Performance issues
  • all stages of the recognition pipeline, and their connections, affect performance
  • typical performance measures: recognition accuracy, speed, storage requirements
  • optimisation of components/connections is often required
  • careful selection / design / implementation of:
  • data capture equipment / environment
  • processing techniques

6
Feature Extraction
  • Aim: to capture discriminant characteristics of the pattern
  • Extracts pattern descriptors from raw data
  • descriptors should contain the information most relevant to the recognition task
  • descriptors may be numerical (quantitative) or linguistic
  • a group of numerical descriptors is often known as a feature vector

7
Feature Extraction
  • Common features for computer vision
  • Shape descriptors
  • external (e.g. boundary) or internal (e.g. holes)
  • Surface descriptors
  • texture, brightness, colour, ...
  • Spatial configuration descriptors
  • arrangement of basic elements
  • Temporal configuration descriptors
  • deformation or motion of basic elements

8
Feature Extraction
  • Example
  • Pattern recognition application: gender detection
  • Classes: male, female

9
Feature Extraction
  • Example (ctd.)
  • Chosen features: height, silhouette area

10
Feature Extraction
  • Example (ctd.): pseudo code

    frontEnd(image)
        // foreground-background image segmentation
        // (e.g. thresholding, possibly after noise removal)
        prpImage = preprocess(image)
        // calculate height and width of image segments
        features = featExtr(prpImage)
        return features
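The frontEnd() pseudocode above can be sketched in Python as follows. This is a minimal illustration, assuming the image is a 2D list of grey levels, the foreground is darker than a fixed threshold, and the chosen features are silhouette height and area; the threshold value and function names are ours, not from the slides.

```python
# Sketch of frontEnd(): segment the image, then extract height and area.

def preprocess(image, threshold=128):
    """Foreground-background segmentation by thresholding: 1 = foreground."""
    return [[1 if pixel < threshold else 0 for pixel in row] for row in image]

def feat_extr(binary):
    """Extract the chosen features: silhouette height and silhouette area."""
    rows_with_fg = [r for r, row in enumerate(binary) if any(row)]
    height = (rows_with_fg[-1] - rows_with_fg[0] + 1) if rows_with_fg else 0
    area = sum(sum(row) for row in binary)
    return [height, area]

def front_end(image):
    return feat_extr(preprocess(image))
```

For example, a 3x2 image whose dark pixels form an L-shaped silhouette of three pixels spanning two rows yields the feature vector [2, 3].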

11
Feature Extraction
Graphical representation of feature distribution
Example (ctd.): data set of 5 male and 5 female subjects
12
Feature Extraction
  • Graphical representation of feature distribution
  • Example (ctd.): feature plot in a 2D feature space

13
Classification
  • Aim: to identify the class (category) to which an unknown pattern belongs
  • Wide variety of classifiers
  • Classifier selection is problem-dependent
  • use a simple classifier if it is effective

14
Classification
  • Some classifiers
  • Minimum-distance classifier
  • classification based on distance from
    class-prototype (e.g. average) pattern
  • closest prototype determines class
  • k-nearest neighbour classifier
  • classification based on distance from class
    patterns (or clusters)
  • closest k patterns (or clusters) determine class

15
Classification
  • Some classifiers
  • Bayesian classifier
  • classification based on probability of belonging to a class
  • pattern assigned to the most likely class
  • Artificial neural network classifier
  • classification based on neuron activations (shown to relate to class probability)
  • pattern assigned to the most likely class

16
Classification
Minimum-distance classifier (2D feature space)
17
Classification
  • Minimum-distance classifier: pseudo code

    minDistClassifier(unknownFeatVect, prototypes)
        minDist = MAX
        class = UNKNOWN
        // assign unknown sample to class of nearest prototype
        for each protoVect in prototypes
            dist = distance(unknownFeatVect, protoVect)
            if (dist < minDist)
                minDist = dist
                class = class of protoVect
        return class

18
Classification
  • k-nearest neighbour classifier (2D feature space)
19
Classification
  • k-nearest neighbour classifier: pseudo code

    kNNClassifier(unknownFeatVect, dataSamples, k)
        size = k
        initialise(minDist, minDistClasses, size)
        class = UNKNOWN
        // find the k samples nearest to the unknown sample
        for each sampleVect in dataSamples
            dist = distance(unknownFeatVect, sampleVect)
            updateKNearestDist(dist, minDist, minDistClasses)
        class = majorityClass(minDistClasses)
        return class
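The kNN pseudocode above can be sketched compactly in Python. This is an illustrative implementation, assuming `data_samples` is a list of (feature vector, class label) pairs; it sorts all distances rather than maintaining a running k-nearest list as the pseudocode's `updateKNearestDist` does, which gives the same result.

```python
import math
from collections import Counter

def knn_classifier(unknown, data_samples, k):
    """Assign the unknown sample to the majority class of its k nearest samples."""
    # distance of the unknown sample to every labelled sample
    dists = [(math.dist(unknown, vect), label) for vect, label in data_samples]
    # keep the k nearest and take a majority vote over their class labels
    k_nearest = sorted(dists)[:k]
    votes = Counter(label for _, label in k_nearest)
    return votes.most_common(1)[0][0]
```

With k = 1 this reduces to a nearest-neighbour classifier; larger k smooths the decision boundary at the cost of more distance computations.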

20
Classification
  • Some distance metrics (for distance-based classifiers)
  • Measure similarity between an unknown pattern and a prototype pattern
  • based on differences between corresponding features in both patterns, e.g.
  • Euclidean distance: square root of the sum of squared differences
  • City-block (Manhattan or taxi-cab) distance: sum of absolute values of differences
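The two metrics above can be written out directly (a small sketch; the vectors in the example are illustrative):

```python
def euclidean(a, b):
    """Square root of the sum of squared feature differences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def city_block(a, b):
    """Sum of absolute feature differences (Manhattan distance)."""
    return sum(abs(x - y) for x, y in zip(a, b))

# e.g. for the vectors (0, 0) and (3, 4):
# euclidean([0, 0], [3, 4]) == 5.0
# city_block([0, 0], [3, 4]) == 7
```

Note that for ranking nearest prototypes the square root in the Euclidean distance can be omitted, since it preserves the ordering of distances.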

21
Classification
  • Decision boundary for minimum-distance classifier (2D feature space)
22
Classification
  • Limitations of the minimum-distance classifier
  • Prone to misclassification for:
  • high feature correlation
  • problems requiring a non-linear decision boundary, e.g.
  • curved decision boundary
  • data with subclasses (i.e. clusters)
  • intricate decision boundary

23
Classification
Minimum-distance classifier: feature correlation (2D feature space)
24
Classification
Minimum-distance classifier: curved decision boundary (2D feature space)
25
Classification
Minimum-distance classifier: distinct subclasses (2D feature space)
26
Classification
Minimum-distance classifier: complex decision boundary (2D feature space)
27
Classification
  • Bayesian classifier
  • Bayes rule: P(C|F) = ( P(F|C) × P(C) ) / P(F)
  • P(C|F): a posteriori probability that the observed feature F is from a pattern that belongs to class C
  • P(F|C): conditional probability of observing feature F given class C
  • P(C): a priori probability that a randomly-drawn feature is from a pattern that belongs to class C
  • P(F): total probability of observing feature F
  • for mutually exclusive classes: P(F) = Σc ( P(F|c) × P(c) )
  • P(F) is class-independent, and can be seen as a normalisation so that ΣC P(C|F) = 1
  • hence P(F) can be omitted from classification calculations (since the decision criterion is maximum likelihood)
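As a quick numeric illustration of the Bayes rule above (the probabilities here are made up, not from the slides):

```python
# P(C|F) = P(F|C) * P(C) / P(F), with illustrative values
p_f_given_c = 0.5   # P(F|C): likelihood of feature F given class C
p_c = 0.2           # P(C): prior probability of class C
p_f = 0.25          # P(F): total probability of observing F
p_c_given_f = p_f_given_c * p_c / p_f
# p_c_given_f == 0.4
```

Since P(F) is the same for every class, comparing the products P(F|C) × P(C) across classes already identifies the most likely class.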

28
Classification
  • Bayesian classifier
  • Extension to multiple features
  • for simplicity, assume features are statistically
    independent, for each class (i.e. conditional
    independence)
  • naïve Bayesian classifier
  • compute class conditional probability, for each
    class, as product of class conditional
    probability for each feature
  • Bayes rule P(CF1F2...Fn) ( P(F1F2...FnC) x
    P(C) ) / P(F1F2...Fn)
  • ( P(F1C) x P(F2C) x ... x P(FnC) x P(C)
    ) / P(F1F2...Fn)
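Under the conditional-independence assumption above, the per-class score is simply the prior times the product of per-feature likelihoods. A minimal sketch (all class names and numbers here are illustrative, not from the slides):

```python
priors = {"A": 0.6, "B": 0.4}          # P(C)
likelihoods = {                        # P(Fi|C) for two observed features
    "A": [0.2, 0.7],
    "B": [0.5, 0.1],
}

def naive_bayes_score(cls):
    """Scaled posterior: P(C) * product of P(Fi|C); P(F1,F2) is omitted."""
    p = priors[cls]
    for p_f in likelihoods[cls]:
        p *= p_f
    return p

best = max(priors, key=naive_bayes_score)
# best == "A"  (0.6 * 0.2 * 0.7 = 0.084  vs  0.4 * 0.5 * 0.1 = 0.02)
```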

29
Classification
  • Bayesian classifier: numerical example
  • Image segmentation
  • image: grey-level image of a scene showing road, pavement and grass (e.g. for an autonomous unmanned vehicle)
  • known proportions of pixels in common scenes: Pavement:Grass:Road = 3:2:5
  • problem: classify each pixel in an image captured by the vehicle camera as road, pavement or grass, using Bayesian maximum likelihood classification
  • solution:
  • classifier: Bayesian classifier
  • feature: F is the pixel grey level
  • for a colour image, each colour component can be a feature (see naïve Bayesian classifier)

30
Classification
  • Bayesian classifier: numerical example (ctd.)
  • Classifier training
  • collect many typical images
  • training set that contains representative data
  • for each known class, select image areas containing pixels from the class
  • compute a grey-level histogram for the class
  • normalise the histogram so that the sum of grey-level frequencies is 1
  • i.e. estimation of the class conditional probabilities, P(F|C)
  • estimate the a priori probability for the class, P(C)

31
Classification
  • Bayesian classifier: numerical example (ctd.)

Training: grey-level histogram matrix

  Label     | Grey level 0 | 1  | 2  | 3  | Total
  Pavement  | 15           | 12 | 26 | 28 | 81
  Grass     | 30           | 20 |  6 |  4 | 60
  Road      |  4           |  0 | 20 | 16 | 40
32
Classification
  • Bayesian classifier: numerical example (ctd.)

Training: class conditional probability matrix, P(F|C)

  Label     | Grey level 0 | 1            | 2            | 3
  Pavement  | 15/81 ≈ 0.19 | 12/81 ≈ 0.15 | 26/81 ≈ 0.32 | 28/81 ≈ 0.35
  Grass     | 30/60 = 0.5  | 20/60 ≈ 0.33 |  6/60 = 0.1  |  4/60 ≈ 0.067
  Road      |  4/40 = 0.1  |  0/40 = 0    | 20/40 = 0.5  | 16/40 = 0.4
33
Classification
  • Bayesian classifier: numerical example (ctd.)
  • Training
  • Pavement:Grass:Road = 3:2:5
  • hence, the a priori class probabilities are:
  • P(Pavement) = 3 / (3 + 2 + 5) = 0.3
  • P(Grass) = 2 / (3 + 2 + 5) = 0.2
  • P(Road) = 5 / (3 + 2 + 5) = 0.5

34
Classification
  • Bayesian classifier: numerical example (ctd.)

Classification: scaled a posteriori class probability matrix, P(F|C) × P(C)

  Label     | Grey level 0       | 1                  | 2                  | 3
  Pavement  | 0.19 × 0.3 = 0.057 | 0.15 × 0.3 = 0.045 | 0.32 × 0.3 = 0.096 | 0.35 × 0.3 = 0.105
  Grass     | 0.5 × 0.2 = 0.1    | 0.33 × 0.2 = 0.066 | 0.1 × 0.2 = 0.02   | 0.067 × 0.2 ≈ 0.013
  Road      | 0.1 × 0.5 = 0.05   | 0 × 0.5 = 0        | 0.5 × 0.5 = 0.25   | 0.4 × 0.5 = 0.2
35
Classification
  • Bayesian classifier: numerical example (ctd.)

Classification: class label matrix (most probable class for each grey level)

  Grey level | 0     | 1     | 2    | 3
  Label      | Grass | Grass | Road | Road
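The whole worked example can be reproduced in a few lines of Python. The histograms and priors are taken from the tables above; everything else (function and variable names) is our sketch, not from the slides.

```python
histograms = {                       # training grey-level counts per class
    "Pavement": [15, 12, 26, 28],
    "Grass":    [30, 20,  6,  4],
    "Road":     [ 4,  0, 20, 16],
}
priors = {"Pavement": 0.3, "Grass": 0.2, "Road": 0.5}   # P(C), from 3:2:5

# class conditional probabilities P(F|C): normalise each histogram to sum to 1
cond = {c: [n / sum(h) for n in h] for c, h in histograms.items()}

def classify(grey_level):
    """Pick the class maximising the scaled posterior P(F|C) * P(C)."""
    return max(priors, key=lambda c: cond[c][grey_level] * priors[c])

labels = [classify(g) for g in range(4)]
# labels == ["Grass", "Grass", "Road", "Road"]
```

This matches the class label matrix above: grey levels 0 and 1 map to Grass, grey levels 2 and 3 to Road.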
36
Classification
  • Classifier training
  • Data-driven capture of parameters representing
    statistical distribution or syntactic
    configuration of salient class characteristics
  • supervised training
  • class labels used during training
  • unsupervised training
  • class labels not used during training (e.g.
    clustering)

37
Classification
  • Classifier testing
  • Testing: estimation of recognition accuracy
  • often uses real data; simulation may be used (Monte Carlo)
  • Accuracy measures
  • error rate (often expressed as a percentage)
  • e.g. correct recognition rate, insertion rate, false acceptance rate, false rejection rate, ...
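The error rate mentioned above can be sketched as the percentage of misclassified test samples (a minimal illustration; the function name and data are ours):

```python
def error_rate(predicted, actual):
    """Percentage of test samples whose predicted label differs from the true one."""
    errors = sum(p != a for p, a in zip(predicted, actual))
    return 100.0 * errors / len(actual)

# e.g. one mistake in four test samples:
# error_rate(["A", "B", "A", "A"], ["A", "B", "B", "A"]) == 25.0
```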

38
Optical Character Recognition
39
Optical Character Recognition
Generic OCR system
40
Optical Character Recognition
  • Feature extraction methods
  • Spatial-domain to frequency-domain transform
  • Hartley, Fourier, or other transform
  • Statistics
  • mean, variance; projection histograms; orientation histograms
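Projection histograms, listed above as statistical OCR features, sum the ink pixels along each row and each column of a character image. A minimal sketch, assuming a binary character image given as a 2D list (1 = ink, 0 = background):

```python
def projection_histograms(image):
    """Row and column ink counts of a binary character image."""
    rows = [sum(row) for row in image]          # horizontal projection
    cols = [sum(col) for col in zip(*image)]    # vertical projection
    return rows, cols

# e.g. a 3x3 "L"-shaped glyph:
glyph = [
    [1, 0, 0],
    [1, 0, 0],
    [1, 1, 1],
]
# projection_histograms(glyph) == ([1, 1, 3], [3, 1, 1])
```

The two histograms form a compact feature vector that distinguishes many character shapes, though they are sensitive to rotation and slant.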

41
Optical Character Recognition
  • Feature extraction methods
  • Miscellaneous
  • geometric measures
  • ratio of width to height of the bounding box, ...
  • description of skeletonised characters
  • graph description comprising line segments (e.g. strokes of Chinese characters)
  • number of L, T, or X junctions, ...

42
Optical Character Recognition
  • Feature extraction methods

Projection histograms
43
Optical Character Recognition
  • OCR examples: artificial neural network (1)

44
Optical Character Recognition
  • OCR examples: artificial neural network (2)

45
Optical Character Recognition
  • OCR examples
  • (see AALs book)

46
Other Recognition Applications
  • Sample applications
  • recognition of faces or facial expressions
  • recognition of body movement (gestures, gait)
  • recognition of handwriting (text, signature)
  • industrial inspection
  • autonomous vehicles, traffic monitoring
  • ...
  • (Exercise: identify the architectural components for these applications, and discuss factors affecting performance)

47
Summary
  • Classical pattern recognition model
  • pre/post-processing
  • feature extraction
  • classification
  • Feature extraction: representation of discriminant pattern characteristics
  • Classification
  • wide variety of classifiers
  • supervised or unsupervised classifier training

48
Summary
  • Components of generic OCR system
  • image capture, image pre-processing, feature extraction, classification, post-processing
  • Wide variety of features for OCR, e.g.
  • frequency-domain representation
  • statistical or geometric measurements
  • skeleton descriptors
  • ...