Title: Optimal Component Analysis: Optimal Linear Representations of Images for Object Recognition
1 Optimal Component Analysis: Optimal Linear Representations of Images for Object Recognition
X. Liu, A. Srivastava, and K. Gallivan, "Optimal linear representations of images for object recognition," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 26, no. 5, pp. 662-666, 2004.
2 Outline
- Motivations
- Optimal Component Analysis
- Performance measure
- MCMC stochastic algorithm
- Experimental Results
- Fast Implementation through K-means
- Some applications
- Conclusion
3 Motivations
- Linear representations are widely used in appearance-based object recognition applications
  - Simple to implement and analyze
  - Efficient to compute
  - Effective for many applications
4 Standard Linear Representations
- Principal Component Analysis
  - Designed to minimize the reconstruction error on the training set
  - Obtained by computing eigenvectors of the covariance matrix (sketched below)
- Fisher Discriminant Analysis
  - Designed to maximize the separation between the class means
  - Obtained by solving a generalized eigenproblem (also sketched below)
- Independent Component Analysis
  - Designed to maximize the statistical independence among coefficients along different directions
  - Obtained by solving an optimization problem with an objective function such as mutual information or negentropy
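For concreteness, here is a minimal numpy/scipy sketch of how the first two bases are computed. The toy data and all variable names are ours, not the paper's:

```python
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 30))        # 100 toy "images" with 30 pixels each
y = rng.integers(0, 3, size=100)      # labels for C = 3 classes
n, d = X.shape[1], 5                  # ambient and subspace dimensions

# PCA: top-d eigenvectors of the covariance matrix of the training set
Xc = X - X.mean(axis=0)
evals, evecs = np.linalg.eigh(Xc.T @ Xc / (len(X) - 1))
U_pca = evecs[:, np.argsort(evals)[::-1][:d]]          # n x d basis

# FDA: generalized eigenproblem  S_b u = lambda S_w u
mu = X.mean(axis=0)
S_w, S_b = np.zeros((n, n)), np.zeros((n, n))
for c in np.unique(y):
    Xk = X[y == c]
    mk = Xk.mean(axis=0)
    S_w += (Xk - mk).T @ (Xk - mk)                     # within-class scatter
    S_b += len(Xk) * np.outer(mk - mu, mk - mu)        # between-class scatter
evals, evecs = eigh(S_b, S_w + 1e-6 * np.eye(n))       # small ridge keeps S_w invertible
U_fda = evecs[:, np.argsort(evals)[::-1][:min(d, 2)]]  # at most C - 1 = 2 useful directions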
5 Standard Linear Representations - continued
- Standard linear representations are suboptimal for recognition applications
  - Evidence in the literature [1, 2]
  - A toy example
    - Standard representations give the worst recognition performance
6 Proposed Approach
- Optimal Component Analysis (OCA)
  - Derive a performance function that is related to the recognition performance
  - Formulate the search for optimal representations as an optimization problem on the Grassmann manifold
  - Use an MCMC stochastic gradient algorithm for the optimization
7 Performance Measure
- It must have continuous directional derivatives
- It must be related to the recognition performance
- It must be efficient to compute
- It is based on the nearest neighbor classifier
  - However, it can be applied to other classifiers, as it forms clusters of images from the same class that are far from the clusters of other classes
  - See an example for support vector machines
8 Performance Measure - continued
- Suppose there are C classes to be recognized
- Each class has k_train training images and k_cross cross-validation images
9 Performance Measure - continued
- h is a monotonically increasing and bounded function
  - We used h(x) = 1 / (1 + exp(-2βx))
- Note that as β → ∞, F(U) is exactly the recognition performance of the nearest neighbor classifier (the full form of F is reconstructed below)
- Some examples of F(U) along some directions
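The slide states only the properties of h; the full measure, as we reconstruct it from the paper (notation ours; I_{c,i} is the i-th cross-validation image of class c, the minima run over training images, and a(I; U) = U^T I are the projected coefficients):

```latex
\rho(c,i;U) =
  \frac{\min_{c' \neq c,\; j} \,\| a(I_{c,i};U) - a(I_{c',j};U) \|}
       {\min_{j} \,\| a(I_{c,i};U) - a(I_{c,j};U) \|},
\qquad
F(U) = \frac{1}{C\, k_{\mathrm{cross}}}
  \sum_{c=1}^{C} \sum_{i=1}^{k_{\mathrm{cross}}} h\!\left(\rho(c,i;U) - 1\right),
\qquad
h(x) = \frac{1}{1 + e^{-2\beta x}} .
```

Here ρ > 1 exactly when the nearest training image is from the correct class, so as β → ∞ each term h(ρ - 1) tends to 1 for a correctly classified image and 0 otherwise, which gives the claim on this slide.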
10 Performance Measure - continued
- F(U) depends on the span of U but is invariant to a change of basis
  - In other words, F(U) = F(UO) for any orthonormal matrix O (a one-line check follows below)
- The search space of F(U) is the set of all d-dimensional subspaces, which is known as the Grassmann manifold
- It is not a flat vector space, so the gradient flow must take the underlying geometry of the manifold into account; see [3, 4, 5] for related work
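The invariance follows in one line from the orthonormality of O, since the measure depends only on distances between projected coefficients:

```latex
\| (UO)^{T} I_1 - (UO)^{T} I_2 \| = \| O^{T} U^{T} (I_1 - I_2) \| = \| U^{T} (I_1 - I_2) \| ,
```

so F(UO) = F(U), and F is a well-defined function on the Grassmann manifold G(n, d) of d-dimensional subspaces of R^n.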
11 Deterministic Gradient Flow
- Gradient at J (the first d columns of the n x n identity matrix)
12 Deterministic Gradient Flow - continued
- Gradient at U: compute Q such that QU = J
- Deterministic gradient flow on the Grassmann manifold (a sketch follows below)
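The gradient and flow formulas on these two slides were figures in the original deck. The following is a sketch, under our own assumptions, of the standard construction: estimate the d x (n-d) directional derivatives of F by finite differences along the tangent basis directions E_ij, embed them in a skew-symmetric matrix, and move along the geodesic via the matrix exponential. All names and the sign/step conventions are ours, not the paper's:

```python
import numpy as np
from scipy.linalg import expm, qr

def grad_step(U, A, eps):
    """Move the subspace spanned by U along the tangent coefficients A.

    U   : n x d orthonormal basis of the current subspace
    A   : d x (n-d) matrix of directional derivatives of F at U
    eps : step size
    """
    n, d = U.shape
    Q, _ = qr(U, mode='full')       # Q ~ [U | U_perp], so Q.T @ U = J up to column signs
    B = np.zeros((n, n))            # skew-symmetric embedding of A ...
    B[:d, d:] = A
    B[d:, :d] = -A.T                # ... so expm(eps * B) is a rotation
    J = np.eye(n)[:, :d]            # first d columns of the n x n identity
    return Q @ expm(eps * B) @ J    # new n x d orthonormal basis

def numerical_gradient(F, U, h=1e-4):
    """Finite-difference estimate of the directional derivatives of F at U."""
    n, d = U.shape
    A = np.zeros((d, n - d))
    f0 = F(U)
    for i in range(d):
        for j in range(n - d):
            E = np.zeros((d, n - d))
            E[i, j] = 1.0           # tangent basis direction E_ij
            A[i, j] = (F(grad_step(U, E, h)) - f0) / h
    return A
```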
13 Stochastic Gradient and Updating Rules
- The stochastic gradient is obtained by adding a stochastic component to the deterministic gradient
- Discrete updating rules
14 MCMC Simulated Annealing Optimization Algorithm
- Initialize: let X(0) be any initial condition and set t = 0
- Step 1: calculate the gradient matrix A(X_t)
- Step 2: generate d(n-d) independent realizations of w_ij
- Step 3: compute the candidate Y (= X_{t+1}) according to the updating rules
- Step 4: compute F(Y) and F(X_t), and set dF = F(Y) - F(X_t)
- Step 5: set X_{t+1} = Y with probability min{exp(dF/D_t), 1}
- Step 6: set D_{t+1} = D_t / γ, set t = t + 1, and go to step 1
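A compact rendering of this loop, reusing grad_step and numerical_gradient from the previous sketch. The sqrt(2D/eps) scaling of the noise and all parameter defaults are our assumptions, not the paper's settings:

```python
import numpy as np

def oca_anneal(F, X0, eps=0.1, D0=1.0, gamma=1.01, iters=500, seed=0):
    """MCMC simulated annealing to maximize F over the Grassmann manifold."""
    rng = np.random.default_rng(seed)
    n, d = X0.shape
    X, D = X0, D0
    for t in range(iters):
        A = numerical_gradient(F, X)              # step 1: gradient matrix A(X_t)
        W = rng.normal(size=(d, n - d))           # step 2: d(n-d) realizations of w_ij
        Y = grad_step(X, A + np.sqrt(2 * D / eps) * W, eps)  # step 3: candidate X_{t+1}
        dF = F(Y) - F(X)                          # step 4
        if dF >= 0 or rng.random() < np.exp(dF / D):  # step 5: accept w.p. min{exp(dF/D_t), 1}
            X = Y
        D /= gamma                                # step 6: cool the temperature, then repeat
    return X
```

Since F is being maximized, an uphill candidate (dF >= 0) is always accepted, while downhill moves survive with a probability that shrinks as the temperature D_t cools.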
15 The Toy Example
- The following result on the toy example shows the effectiveness of the algorithm
- The following figure shows the recognition performance of X_t and F(X_t)
16 ORL Face Dataset
17 Experimental Results on the ORL Dataset
- Here the image size is 92 x 112 and d = 5 (the subspace dimension)
- Comparison using the gradient, the stochastic gradient, and the proposed technique with different initial conditions
18 Results on ORL Dataset - continued
- Performance with respect to d and k_train
- (Figure panels: d = 20, k_train = 5; d = 3, k_train = 5; d = 10, k_train = 5; d = 5, k_train = 2; d = 5, k_train = 8; d = 5, k_train = 1)
19 Results on CMU PIE Dataset
- Here we used part of the CMU PIE dataset
- There are 66 subjects
- Each subject has 21 pictures under different
lighting conditions
20 Some Comparative Results on ORL
- Comparison where the performance on the cross-validation images is maximized
  - In other words, the comparison shows the best performance that linear representations can achieve
- Legend: PCA black dotted; ICA red dash-dotted; FDA green dashed; OCA blue solid
21 Some Comparative Results on ORL - continued
- Comparison where the performance on the training set is optimized
  - In other words, it is a fair comparison
- Legend: PCA black dotted; ICA red dash-dotted; FDA green dashed; OCA blue solid
22 Sparse Filters for Recognition
- The learning algorithm can be generalized to other manifolds using a multi-flow technique (Amit, 1991)
- Here we use a generalized version to learn linear filters that are sparse and effective for recognition
23 Sparse Filters for Recognition - continued
- Sparseness has been recognized as an important coding principle
- However, our results show that sparse filters alone are not effective for recognition
- Proposed technique: learn filters that are both sparse and effective for recognition (a plausible form of the combined objective is sketched below)
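The deck does not spell out how the two goals are combined. Given the λ1/λ2 weightings on the next four slides, a plausible reading (our reconstruction, with S standing in for some sparseness measure of the filter entries, e.g. a negative l1 norm or a kurtosis score) is the weighted objective

```latex
J(U) = \lambda_1\, F(U) + \lambda_2\, S(U),
```

maximized over the manifold with the same MCMC simulated annealing algorithm; λ1 = 1, λ2 = 0 recovers plain OCA, and the following slides vary the trade-off between recognition and sparseness.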
24 Results for Sparse Filters
- λ1 = 1.0 and λ2 = -1.0
25 Results for Sparse Filters - continued
- λ1 = 1.0 and λ2 = 0.0
26 Results for Sparse Filters - continued
- λ1 = 0.0 and λ2 = 1.0
27 Results for Sparse Filters - continued
- λ1 = 0.2 and λ2 = 0.8
28 Comparison of Commonly Used Linear Representations