Title: Face Recognition in the Presence of Expression Variations
1 Face Recognition in the Presence of Expression Variations
- Brian J. Gestner
- Adam C. Zelinski
- bgestner,acz_at_ece.cmu.edu
2 Face Expression Database
- Database contains Expression Variations
3 Face Expression Database
- 13 People (13 Classes)
- 75 Images/Person
- Variety of Expressions (angry, smiling, etc.)
- Training Set consisted of 10 images/person
- Image Information
- 64x64
- Grayscaled
- Cropped
4 Our Pattern Recognition Module
- Image Pre-Processing
- Normalize Images
- Feature Extraction
- Explored 2 methods
- Principal Component Analysis
- Direct Linear Discriminant Analysis
- Feature Classification
- Neural Network
5 Image Pre-Processing
- Normalized the energy of each image to equal one.
- Much of the pre-processing was already done.
- Images were already cropped and grayscaled
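The slides say the normalization was done in MATLAB; a minimal NumPy sketch of the energy-normalization step (assuming images stored as 64x64 float arrays) might look like:

```python
import numpy as np

def normalize_energy(image):
    """Scale an image so its energy (sum of squared pixel values) equals one."""
    image = image.astype(np.float64)
    energy = np.sum(image ** 2)
    return image / np.sqrt(energy)

# Example: a random 64x64 grayscale image
img = np.random.rand(64, 64)
normalized = normalize_energy(img)
# np.sum(normalized ** 2) is now 1.0 (up to floating-point error)
```

Energy normalization removes overall brightness differences between images, so the classifier sees expression changes rather than lighting changes.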
6 Feature Extraction - PCA
- Principal Component Analysis (PCA)
- A technique to find the directions in which a cloud of data points is stretched most.
- Wrote a MATLAB script
A simple illustration of PCA: the first principal component of a two-dimensional data set is shown.
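The authors wrote their PCA script in MATLAB; an equivalent NumPy sketch (using SVD of the centered data for numerical stability) could be:

```python
import numpy as np

def pca(data, num_components):
    """PCA: project data onto the directions of greatest variance.

    data: (n_samples, n_features) array; rows are vectorized images.
    Returns (projected data, principal components as rows).
    """
    mean = data.mean(axis=0)
    centered = data - mean
    # Right singular vectors of the centered data are the principal directions.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:num_components]
    return centered @ components.T, components

# 2-D illustration, as in the slide's figure: points stretched along y = x
np.random.seed(0)
pts = np.random.randn(200, 1) @ np.array([[1.0, 1.0]]) + 0.1 * np.random.randn(200, 2)
proj, comps = pca(pts, 1)
# comps[0] points roughly along (1, 1) / sqrt(2), the first principal component
```

For the face data, each 64x64 image is flattened to a 4096-element row before PCA.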
7 Feature Extraction - DLDA
- Direct Linear Discriminant Analysis (Direct LDA, or DLDA)
- Seeks to find the principal components that are also the most discriminating.
- Performs PCA and LDA simultaneously
- Algorithm designed by CMU CS researchers
- Projects to c - 1 = 13 - 1 = 12 dimensions (one fewer than the number of classes)
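This is not the authors' MATLAB code, but a NumPy sketch of one common formulation of Direct LDA (diagonalize the between-class scatter first, then the within-class scatter in that whitened space):

```python
import numpy as np

def direct_lda(X, y, eps=1e-8):
    """Direct LDA sketch: keep the discriminative null-space information
    that classic PCA-then-LDA pipelines can discard.

    X: (n_samples, n_features) array, y: integer class labels.
    Returns a projection matrix of shape (n_features, m), m <= c - 1.
    """
    classes = np.unique(y)
    mean = X.mean(axis=0)
    d = X.shape[1]
    Sb = np.zeros((d, d))  # between-class scatter
    Sw = np.zeros((d, d))  # within-class scatter
    for c in classes:
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        diff = (mc - mean)[:, None]
        Sb += len(Xc) * (diff @ diff.T)
        Sw += (Xc - mc).T @ (Xc - mc)
    # Step 1: keep eigenvectors of Sb with non-negligible eigenvalues
    evals, evecs = np.linalg.eigh(Sb)
    keep = evals > eps * evals.max()
    Y, Db = evecs[:, keep], evals[keep]
    Z = Y / np.sqrt(Db)            # whitens Sb: Z.T @ Sb @ Z = I
    # Step 2: diagonalize the within-class scatter in the whitened space
    _, U = np.linalg.eigh(Z.T @ Sw @ Z)
    return Z @ U
```

Because the rank of the between-class scatter is at most c - 1, the projection has at most 12 columns for the 13-class face database.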
8 Eigenfaces and DLDA-Faces
The three most principal Eigenfaces (12 used in total).
The first three DLDA-Faces (12 used in total).
9 PCA versus Direct LDA
- PCA
- Does not necessarily discriminate data
- Uses global mean, global scatter matrix
- Direct LDA
- Reduces dimensionality and discriminates.
- Uses within-class means, within-class scatter matrices
- We suspected Direct LDA would do a better job than PCA.
10 Feature Classification
- Neural Network
- Wrote our own in MATLAB
- 3 layers: Input, Hidden, Output
- Used the Duda, Hart, Stork heuristic to choose the number of hidden nodes
- Feed-Forward Operation
- Adjust weights via Back-Propagation (Delta Rule)
- Constant learning rate, η = 0.0001
- Tweaked manually.
- Weights are initialized to random values.
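The network itself was written in MATLAB; a NumPy sketch of a 3-layer feed-forward net with delta-rule back-propagation and a constant learning rate (sigmoid activations assumed, since the slides don't specify) might be:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class ThreeLayerNet:
    """Sketch: input, hidden, and output layers, trained with the delta
    rule (back-propagation of squared error, constant learning rate)."""

    def __init__(self, n_in, n_hidden, n_out, lr=0.5, seed=0):
        rng = np.random.default_rng(seed)
        # Weights are initialized to small random values, as in the slides.
        self.W1 = rng.normal(0.0, 0.5, (n_in, n_hidden))
        self.W2 = rng.normal(0.0, 0.5, (n_hidden, n_out))
        self.lr = lr

    def forward(self, X):
        self.h = sigmoid(X @ self.W1)       # hidden activations
        self.o = sigmoid(self.h @ self.W2)  # output activations
        return self.o

    def backprop(self, X, T):
        """One batch update; returns the batch squared error before the update."""
        O = self.forward(X)
        delta_o = (O - T) * O * (1 - O)                 # output-layer delta
        delta_h = (delta_o @ self.W2.T) * self.h * (1 - self.h)
        self.W2 -= self.lr * (self.h.T @ delta_o)
        self.W1 -= self.lr * (X.T @ delta_h)
        return 0.5 * np.sum((O - T) ** 2)
```

The learning rate here is illustrative; the slides report η = 0.0001 after manual tweaking on the real face features.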
11 Our Neural Network System
- 13 NNs
- 1 per class
- 1 output node each
- NN with the max output node value is the selected class
- Complexity O(N)
- N = number of images being classified
- 0.0031 seconds/classification (322.5 images/second)
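The one-network-per-class decision rule above can be sketched as follows (the `networks` here are hypothetical stand-ins for the 13 trained MATLAB networks; any callables returning a scalar score work):

```python
import numpy as np

def classify(feature_vec, networks):
    """One network per class, one output node each: the class whose
    network produces the largest output value wins."""
    scores = [net(feature_vec) for net in networks]
    return int(np.argmax(scores))

# Toy stand-in: each "network" scores similarity to a class prototype.
prototypes = [np.eye(3)[i] for i in range(3)]
networks = [lambda x, p=p: float(x @ p) for p in prototypes]
print(classify(np.array([0.1, 0.9, 0.2]), networks))  # -> 1
```

Classifying N images means N passes through this rule, hence the O(N) complexity noted above.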
12 Neural Networks - Training
- Stochastic Training failed.
- Batch Training
- "Error gradient → 0" halting criterion performed poorly.
- Our Modification
- Halt batch training when error increases
- With proper η, guarantees reaching at least a local minimum as permitted iterations → ∞
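The modified halting rule is simple to state in code. A sketch, where `step` is a hypothetical callable that performs one batch update and returns the batch error:

```python
def train_until_error_rises(step, max_iters=100000):
    """Run batch updates and halt as soon as the batch error increases,
    returning the best (previous) error and the iteration count."""
    prev = step()
    for i in range(1, max_iters):
        err = step()
        if err > prev:          # error went up: halt, keep previous state
            return prev, i
        prev = err
    return prev, max_iters

# Simulated error trace: decreases, then ticks up at iteration 4
errors = iter([5.0, 3.0, 2.0, 1.5, 1.6, 1.0])
best, n_iters = train_until_error_rises(lambda: next(errors))
# -> best == 1.5, n_iters == 4
```

With a small enough learning rate the error of full-batch gradient descent is non-increasing until a minimum is reached, which is why this stop condition finds at least a local minimum given unlimited iterations.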
13 PCA vs. DLDA
- Trained many Neural Network Systems (NNSs)
- Used a variable number of training images from the 10 provided.
- Used the remaining training images as a Validation Set
- Clearly, DLDA > PCA
[Plots: validation accuracy for the DLDA and PCA feature extractors]
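The hold-out scheme described above (train on the first n of the 10 provided images per person, validate on the rest) can be sketched as (the dict layout is an assumption, not the authors' data format):

```python
def split_per_class(images_per_class, n_train):
    """Hold-out split: the first n_train images of each class go to
    training, the remaining provided images go to validation.

    images_per_class: hypothetical dict mapping class id -> list of images.
    """
    train, val = {}, {}
    for cls, imgs in images_per_class.items():
        train[cls] = imgs[:n_train]
        val[cls] = imgs[n_train:]
    return train, val

# Example with 10 placeholder images per class, 6 used for training
data = {0: list(range(10)), 1: list(range(10))}
train_set, val_set = split_per_class(data, 6)
# -> 6 training and 4 validation images per class
```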
14 Testing
- Used DLDA
- Used a variety of NNSs trained on a varying number of training images
- (similar to how we compared PCA and DLDA)
- Results varied based on the number of training images
15 Testing Results
DLDA NNS, using a variable number of the 10 provided training images per class. Notice 100% accuracy when training on 6-8 images per class.
16 Testing Results
Confusion matrices for 3 (DLDA, NNS) combinations.
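For reference, a confusion matrix tabulates true class against predicted class; a sketch of how the matrices shown on this slide would be computed:

```python
import numpy as np

def confusion_matrix(true_labels, predicted_labels, num_classes):
    """Entry (i, j) counts test images of class i classified as class j;
    a perfect classifier yields a purely diagonal matrix."""
    cm = np.zeros((num_classes, num_classes), dtype=int)
    for t, p in zip(true_labels, predicted_labels):
        cm[t, p] += 1
    return cm

# 3-class example: one image of class 2 is confused with class 0
cm = confusion_matrix([0, 1, 2, 2], [0, 1, 2, 0], 3)
```

Off-diagonal mass shows which people the system confuses with each other, which is more informative than a single accuracy number.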
17 Conclusions based on Results
- DLDA > PCA for classification
- Testing performance does not consistently increase as the number of training images per class increases.
- Too few training images → classifier can't discriminate
- Too many training images → classifier can't generalize
- A moderate number of training images → 100% accuracy
- Use enough training data, but not too much!
- There is no formula to find the number of images to use
18 Q & A