Optimal Component Analysis: Optimal Linear Representations of Images for Object Recognition - PowerPoint PPT Presentation

1
Optimal Component Analysis: Optimal Linear
Representations of Images for Object Recognition
X. Liu, A. Srivastava, and K. Gallivan, "Optimal
linear representations of images for object
recognition," IEEE Transactions on Pattern
Analysis and Machine Intelligence, vol. 26,
no. 5, pp. 662-666, 2004.
2
Outline
  • Motivations
  • Optimal Component Analysis
  • Performance measure
  • MCMC stochastic algorithm
  • Experimental Results
  • Fast Implementation through K-means
  • Some applications
  • Conclusion

3
Motivations
  • Linear representations are widely used in
    appearance-based object recognition applications
  • Simple to implement and analyze
  • Efficient to compute
  • Effective for many applications

4
Standard Linear Representations
  • Principal Component Analysis
  • Designed to minimize the reconstruction error on the training set
  • Obtained by calculating the eigenvectors of the covariance matrix (see the sketch after this list)
  • Fisher Discriminant Analysis
  • Designed to maximize the separation between the class means
  • Obtained by solving a generalized eigenvalue problem
  • Independent Component Analysis
  • Designed to maximize the statistical independence among the coefficients along different directions
  • Obtained by solving an optimization problem with an objective function such as mutual information, negentropy, etc.
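
For concreteness, a minimal sketch of the PCA case, assuming the images are flattened into the rows of a data matrix (the function and variable names are illustrative, not from the slides):

    import numpy as np

    def pca_basis(images, d):
        """Return the mean image and the d leading principal directions.

        images: (N, n) array, each row a flattened training image.
        For high-dimensional images an SVD of the centered data is usually
        preferred to forming the n x n covariance matrix explicitly.
        """
        mean = images.mean(axis=0)
        centered = images - mean
        cov = centered.T @ centered / (len(images) - 1)   # sample covariance
        evals, evecs = np.linalg.eigh(cov)                # ascending eigenvalues
        U = evecs[:, ::-1][:, :d]                         # d leading eigenvectors as columns
        return mean, U

FDA and ICA bases would be obtained analogously from their respective eigenproblem and optimization formulations.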

5
Standard Linear Representations - continued
  • Standard linear representations are suboptimal for recognition applications
  • There is evidence in the literature [1, 2]
  • A toy example
  • The standard representations give the worst recognition performance

6
Proposed Approach
  • Optimal Component Analysis (OCA)
  • Derive a performance function that is related to the recognition performance
  • Formulate the search for optimal representations as an optimization problem on the Grassmann manifold
  • Use an MCMC stochastic gradient algorithm for the optimization

7
Performance Measure
  • It must have continuous directional derivatives
  • It must be related to the recognition performance
  • It must be efficient to compute
  • Based on the nearest neighbor classifier
  • However, it can be applied to other classifiers, as it forms clusters of images from the same class that are far from the clusters of other classes
  • See an example for support vector machines

8
Performance Measure - continued
  • Suppose there are C classes to be recognized
  • Each class has k_train training images
  • Each class also has k_cross cross-validation images (the performance measure built from these is reconstructed below)
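
The formula for the performance measure appeared only as an image on the slide; the following is a reconstruction based on the cited PAMI paper and should be read as an assumption about the exact form. With I_{c,k} the training images, I'_{i,j} the cross-validation images, and U the n x d basis:

    \rho(i,j;U) \;=\; \frac{\min_{c \neq i,\,k} \,\| U^{T} I_{c,k} - U^{T} I'_{i,j} \|}
                           {\min_{k} \,\| U^{T} I_{i,k} - U^{T} I'_{i,j} \|},
    \qquad
    F(U) \;=\; \frac{1}{C\, k_{\mathrm{cross}}} \sum_{i=1}^{C} \sum_{j=1}^{k_{\mathrm{cross}}} h\bigl(\rho(i,j;U) - 1\bigr)

Here rho > 1 means the nearest training image in the projected space belongs to the correct class, so F(U) is a smoothed nearest-neighbor recognition rate.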

9
Performance Measure - continued
  • h is a monotonically increasing and bounded function
  • We used h(x) = 1 / (1 + exp(-2βx))
  • Note that as β → ∞, F(U) becomes exactly the recognition performance of the nearest neighbor classifier
  • Some examples of F(U) along some directions (a small code sketch of F(U) follows)
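
A small, runnable sketch of this measure, assuming the reconstruction above and flattened image vectors (names such as train_by_class are hypothetical):

    import numpy as np

    def h(x, beta=5.0):
        # Monotone, bounded sigmoid: h(x) = 1 / (1 + exp(-2*beta*x))
        return 1.0 / (1.0 + np.exp(-2.0 * beta * x))

    def performance(U, train_by_class, cross_by_class, beta=5.0):
        """Smoothed nearest-neighbor recognition rate F(U) for the subspace U.

        train_by_class[c], cross_by_class[c]: (k, n) arrays of flattened images
        of class c; U: (n, d) orthonormal basis. The data layout is illustrative.
        """
        p_train = [imgs @ U for imgs in train_by_class]   # project once
        p_cross = [imgs @ U for imgs in cross_by_class]
        total, count = 0.0, 0
        for i, probes in enumerate(p_cross):
            for a in probes:
                d_same = min(np.linalg.norm(b - a) for b in p_train[i])
                d_other = min(np.linalg.norm(b - a)
                              for c, imgs in enumerate(p_train) if c != i
                              for b in imgs)
                rho = d_other / (d_same + 1e-12)   # > 1 means the nearest neighbor is correct
                total += h(rho - 1.0, beta)
                count += 1
        return total / count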

10
Performance Measure - continued
  • F(U) depends on the span of U but is invariant to a change of basis
  • In other words, F(U) = F(UO) for any d x d orthonormal matrix O
  • The search space of F(U) is therefore the set of all d-dimensional subspaces, which is known as the Grassmann manifold (see the note below)
  • It is not a flat vector space, so the gradient flow must take the underlying geometry of the manifold into account; see [3], [4], [5] for related work
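
A standard fact, not stated on the slide but useful later: the Grassmann manifold G_{n,d} of d-dimensional subspaces of R^n has dimension

    \dim \mathcal{G}_{n,d} \;=\; d\,(n-d),

which is why the stochastic algorithm below draws exactly d(n-d) random components per step.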

11
Deterministic Gradient Flow
  • Gradient at J (the first d columns of the n x n identity matrix)

12
Deterministic Gradient Flow - continued
  • Gradient at U: compute an orthogonal Q such that QU = J
  • Deterministic gradient flow on Grassmann manifold

13
Stochastic Gradient and Updating Rules
  • The stochastic gradient is obtained by adding a stochastic component to the deterministic gradient
  • Discrete updating rules (a plausible reconstruction of the missing formulas follows)
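
The updating formulas on slides 12 and 13 appear only as images. A plausible reconstruction, following the cited paper, is given below; treat the exact form and scaling as an assumption. With A(X_t) the (n-d) x d gradient matrix at J and Q_t an orthogonal matrix such that Q_t X_t = J,

    X_{t+1} \;=\; Q_t^{T} \,\exp\!\bigl(\delta\, \hat{A}(X_t)\bigr)\, J,
    \qquad
    \hat{A} \;=\; \begin{pmatrix} 0 & -A^{T} \\ A & 0 \end{pmatrix},

and the stochastic version perturbs A by the d(n-d) random entries w_ij (scaled by a factor on the order of sqrt(δ)) before taking the same exponential-map step.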

14
MCMC Simulated Annealing Optimization Algorithm
  • Let X(0) be any initial condition and set t = 0
  • Calculate the gradient matrix A(X_t)
  • Generate d(n-d) independent realizations of the w_ij's
  • Compute the candidate Y (= X_{t+1}) according to the updating rules
  • Compute F(Y) and F(X_t) and set dF = F(Y) - F(X_t)
  • Set X_{t+1} = Y with probability min{exp(dF / D_t), 1}
  • Set D_{t+1} = D_t / γ and set t = t + 1
  • Go to step 1 (a code sketch of this loop is given below)
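
A runnable sketch of this annealing loop, assuming a generic performance function F defined on orthonormal bases. The gradient is approximated by finite differences (only practical for toy-sized problems) and the exponential-map update is replaced by a simpler QR retraction, so this illustrates the accept/reject and cooling logic rather than the paper's exact geometry; all names are illustrative.

    import numpy as np

    def orthonormalize(X):
        # QR retraction: return an orthonormal basis for (approximately) the same subspace.
        Q, _ = np.linalg.qr(X)
        return Q

    def numerical_gradient(F, X, eps=1e-4):
        # Finite-difference estimate of the gradient of F at the orthonormal basis X.
        # Illustrative stand-in for the paper's directional derivatives along the
        # d(n-d) tangent directions of the Grassmann manifold.
        G = np.zeros_like(X)
        f0 = F(X)
        for i in range(X.shape[0]):
            for j in range(X.shape[1]):
                Xp = X.copy()
                Xp[i, j] += eps
                G[i, j] = (F(orthonormalize(Xp)) - f0) / eps
        return G

    def oca_anneal(F, X0, steps=200, delta=0.1, D=1.0, gamma=1.01, seed=0):
        # Simulated-annealing maximization of F, following the loop on the slide:
        # propose a gradient-plus-noise move, accept with probability
        # min{exp(dF / D_t), 1}, then cool the temperature D_{t+1} = D_t / gamma.
        rng = np.random.default_rng(seed)
        X = orthonormalize(X0)
        fX = F(X)
        for _ in range(steps):
            G = numerical_gradient(F, X)
            W = rng.standard_normal(X.shape)                        # stochastic component (w_ij's)
            Y = orthonormalize(X + delta * G + np.sqrt(delta) * W)  # candidate X_{t+1}
            fY = F(Y)
            dF = fY - fX
            if dF >= 0 or rng.random() < np.exp(dF / D):            # min{exp(dF/D_t), 1}
                X, fX = Y, fY
            D /= gamma                                              # cooling schedule
        return X, fX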

15
The Toy Example
  • The following result on the toy example shows the effectiveness of the algorithm
  • The following figure shows the recognition performance of X_t and F(X_t)

16
ORL Face Dataset
17
Experimental Results on ORL Dataset
  • Here the image size is 92 x 112 and d = 5 (subspace dimension)
  • Comparison using the gradient, the stochastic gradient, and the proposed technique with different initial conditions

18
Results on ORL Dataset - continued
  • Results with respect to d and k_train

[Figure: panels for d = 20, k_train = 5; d = 3, k_train = 5; d = 10, k_train = 5; d = 5, k_train = 2; d = 5, k_train = 8; d = 5, k_train = 1]
19
Results on CMU PIE Dataset
  • Here we used part of the CMU PIE dataset
  • There are 66 subjects
  • Each subject has 21 pictures under different
    lighting conditions
  • X_0 = PCA, d = 10
  • X_0 = ICA, d = 10
  • X_0 = FDA, d = 5

20
Some Comparative Results on ORL
  • Comparison where the performance on the cross-validation images is maximized
  • In other words, the comparison shows the best performance the linear representations can achieve
  • Legend: PCA (black dotted), ICA (red dash-dotted), FDA (green dashed), OCA (blue solid)

21
Some Comparative Results on ORL - continued
  • Comparison where the performance on the training set is optimized
  • In other words, it is a fair comparison
  • Legend: PCA (black dotted), ICA (red dash-dotted), FDA (green dashed), OCA (blue solid)

22
Sparse Filters for Recognition
  • The learning algorithm can be generalized to
    other manifolds using a multi-flow technique
    (Amit, 1991)
  • Here we use a generalized version to learn linear
    filters that are sparse and effective for
    recognition

23
Sparse Filters for Recognition - continued
  • Sparseness has been recognized as an important coding principle
  • However, our results show that sparse filters are not effective for recognition
  • Proposed technique
  • Learn filters that are both sparse and effective for recognition (see the note after this list)
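
The λ1 and λ2 values on the following slides are weights. Presumably they trade off recognition performance against sparseness in a combined objective of roughly the form below; the exact definition is not shown in this transcript, so this should be read as an inference:

    \max_{U} \;\; \lambda_1\, F(U) \;+\; \lambda_2\, S(U),

where S(U) measures the sparseness of the learned filters (the columns of U).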

24
Results for Sparse Filters
λ1 = 1.0 and λ2 = -1.0
25
Results for Sparse Filters - continued
λ1 = 1.0 and λ2 = 0.0
26
Results for Sparse Filters - continued
λ1 = 0.0 and λ2 = 1.0
27
Results for Sparse Filters - continued
λ1 = 0.2 and λ2 = 0.8
28
Comparison of Commonly Used Linear Representations