Title: Transformed Component Analysis: Joint Estimation of Image Components and Transformations
1. Transformed Component Analysis: Joint Estimation of Image Components and Transformations
- Brendan J. Frey, Computer Science, University of Waterloo, Canada; Beckman Institute ECE, University of Illinois at Urbana
- Nebojsa Jojic, Beckman Institute, University of Illinois at Urbana
2. Subspace models of images
- Example: an image is a point in the space of pixel intensities, R^1200
- Images lie near a low-dimensional manifold f(y), y in R^2
- Subspace axes in this example: "shut eyes" and "frown"
3. Generative density modeling
- Find a probability model that
  - reflects desired structure
  - randomly generates plausible images
  - represents the data by parameters
- ML estimation
- p(image|class) used for recognition, detection, ...
4. Factor analysis (generative PCA)
- The density of the subspace point y is p(y) = N(y; 0, I)
5. Factor analysis (generative PCA)
- p(y) = N(y; 0, I)
- The density of pixel intensities z given subspace point y is p(z|y) = N(z; μ + Λy, Φ)
- Manifold: f(y) = μ + Λy, linear
6. Factor analysis (generative PCA)
- p(y) = N(y; 0, I)
- p(z|y) = N(z; μ + Λy, Φ)
- Parameters μ, Λ represent the manifold
- Observing z induces a Gaussian posterior p(y|z):
  - Cov[y|z] = (Λ^T Φ^-1 Λ + I)^-1
  - E[y|z] = Cov[y|z] Λ^T Φ^-1 (z - μ)
- (This inference is sketched in code below)
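Since both the prior and the likelihood are Gaussian, the posterior is a few matrix operations. A minimal NumPy sketch (the deck points to MATLAB scripts; this Python version and the names `mu`, `Lambda`, `Phi_diag` are illustrative, assuming a diagonal Φ as usual in factor analysis):

```python
import numpy as np

def fa_posterior(z, mu, Lambda, Phi_diag):
    """Posterior p(y | z) for factor analysis:
    p(y) = N(y; 0, I), p(z | y) = N(z; mu + Lambda y, Phi).
    Returns the posterior mean and covariance of y."""
    # Phi is diagonal, so Lambda^T Phi^-1 is a cheap column scaling.
    LtPhiInv = Lambda.T / Phi_diag                      # shape (K, N)
    cov = np.linalg.inv(LtPhiInv @ Lambda + np.eye(Lambda.shape[1]))
    mean = cov @ LtPhiInv @ (z - mu)
    return mean, cov
```

With Φ diagonal, the only matrix inverted is K x K (K = number of components), so inference stays cheap even for high-dimensional images.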
7. Example: hand-crafted model
- Graphical model: y -> z, with p(y) = N(y; 0, I) and p(z|y) = N(z; μ + Λy, Φ)
- Subspace axes: shut eyes (SE) and frown (Frn)
8-15. Example: Simulation (animation over eight slides)
- Repeatedly sample from the model: draw y ~ N(0, I), then draw pixel intensities z ~ N(μ + Λy, Φ)
- [Figure: sampled points in the (SE, Frn) subspace and the corresponding generated face images]
16-21. Example: Inference (animation over six slides)
- For images z from the data set, infer the Gaussian posterior p(y|z) over the subspace
- [Figure: each data image and its posterior p(y|z) in the (SE, Frn) subspace]
22. EM algorithm for ML learning
- Initialize μ, Λ and Φ to small, random values
- E step:
  - For each training case z^(t), infer q^(t)(y) = p(y|z^(t))
- M step:
  - Compute μ_new, Φ_new and Λ_new that maximize Σ_t E[log p(y) p(z^(t)|y)], where E is w.r.t. q^(t)(y)
- Each iteration increases log p(Data)
- (One full iteration is sketched in code below)
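For concreteness, here is one iteration of this EM procedure as a NumPy sketch, using the standard closed-form factor analysis updates. Holding μ at the data mean is a simplification of this sketch, not the deck's prescription:

```python
import numpy as np

def fa_em_step(Z, mu, Lambda, Phi_diag):
    """One EM iteration for factor analysis on data Z (T x N),
    under p(y) = N(0, I), p(z|y) = N(mu + Lambda y, Phi), Phi diagonal."""
    T, K = Z.shape[0], Lambda.shape[1]
    # E step: every posterior q(t)(y) = p(y|z(t)) shares one covariance.
    LtPhiInv = Lambda.T / Phi_diag                  # Lambda^T Phi^-1 (K x N)
    cov = np.linalg.inv(LtPhiInv @ Lambda + np.eye(K))
    Zc = Z - mu
    Ey = Zc @ LtPhiInv.T @ cov                      # T x K posterior means
    Eyy = T * cov + Ey.T @ Ey                       # sum_t E[y y^T]
    # M step: closed-form maximizers of sum_t E[log p(y) p(z(t)|y)].
    mu_new = Z.mean(axis=0)                         # held at the data mean
    Lambda_new = Zc.T @ Ey @ np.linalg.inv(Eyy)
    resid = Zc - Ey @ Lambda_new.T
    Phi_new = (resid ** 2).mean(axis=0) + np.einsum(
        'nk,kj,nj->n', Lambda_new, cov, Lambda_new)
    return mu_new, Lambda_new, Phi_new
```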
23. Kind of data we're interested in
- Even after tracking, the features still have unknown positions, rotations, scales, levels of shearing, ...
24. Problem: factor analysis and PCA are sensitive to spatial transformations, e.g., translation
- Example: swap pixels 1 and 2 in half of the cases
- [Figure: scatter plot of the data over axes Pix 1 and Pix 2]
25. One approach
- Images -> Normalization (manual labor) -> Normalized images -> Pattern analysis
26. Another approach
27. Yet another approach
- Images -> Extract transformation-invariant features -> Transformation-invariant data -> Pattern analysis
- Problems: difficult to work with; may hide useful features
28. Our approach
- Images -> Joint normalization and pattern analysis
29. What transforming an image does in the vector space of pixel intensities
- A continuous transformation moves an image along a continuous curve
- Our subspace model should assign images near this nonlinear manifold to the same point in the subspace
30. Tractable approaches to modeling the transformation manifold
- Linear approximation: good locally
- Discrete approximation: good globally
31. Related work
- Generative models
  - Local invariance: PCA, Turk, Moghaddam, Pentland (96); factor analysis, Hinton, Revow, Dayan, Ghahramani (96); Frey, Colmenarez, Huang (98)
  - Layered motion: Adelson, Black, Blake, Jepson, Wang, Weiss
  - Learning discrete representations of generative manifolds: generative topographic maps, Bishop, Svensen, Williams (98)
- Discriminative models
  - Local invariance: tangent distance, tangent prop, Simard, Le Cun, Denker, Victorri (92-93)
  - Global invariance: convolutional neural networks, Le Cun et al. (98); multiresolution tangent distance, Vasconcelos et al. (98)
32. Adding transformation as a discrete latent variable
- Say there are N pixels
- We assume we are given a set of sparse N x N transformation-generating matrices G1, ..., Gl, ..., GL
- These generate points Gl z from a point z
- (Building such matrices is sketched in code below)
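For pure translations, each Gl just moves pixels to new positions. A sketch of how such sparse matrices might be built with SciPy (`shift_matrix` is a hypothetical helper, not from the paper):

```python
import numpy as np
from scipy.sparse import lil_matrix

def shift_matrix(rows, cols, dr, dc):
    """Sparse N x N matrix (N = rows*cols) that shifts a row-major
    flattened image by (dr, dc) pixels, zero-filling at the borders."""
    G = lil_matrix((rows * cols, rows * cols))
    for r in range(rows):
        for c in range(cols):
            # Output pixel (r, c) copies input pixel (r - dr, c - dc).
            r_src, c_src = r - dr, c - dc
            if 0 <= r_src < rows and 0 <= c_src < cols:
                G[r * cols + c, r_src * cols + c_src] = 1.0
    return G.tocsr()

# E.g., 81 transformations for a 9 x 9 image: 9 vertical x 9 horizontal shifts
Gs = [shift_matrix(9, 9, dr, dc) for dr in range(-4, 5) for dc in range(-4, 5)]
```

This version zero-fills pixels shifted in from outside the frame; wrap-around shifts are another common choice.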
33. Transformed Component Analysis
- The density of the subspace point y is p(y) = N(y; 0, I)
34. Transformed Component Analysis
- p(y) = N(y; 0, I)
- The density of the latent image z given subspace point y is p(z|y) = N(z; μ + Λy, Φ)
35. Transformed Component Analysis
- p(y) = N(y; 0, I)
- p(z|y) = N(z; μ + Λy, Φ)
- The probability of transformation l = 1, 2, ..., L is P(l) = ρl
36. Transformed Component Analysis
- p(y) = N(y; 0, I)
- p(z|y) = N(z; μ + Λy, Φ)
- P(l) = ρl
- The density of the observed image x is p(x|z, l) = N(x; Gl z, Ψ)
- (Sampling from this generative model is sketched in code below)
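Ancestral sampling from the full model chains the three conditionals above. A minimal sketch, assuming diagonal Φ and Ψ and a list `Gs` of (possibly sparse) transformation matrices:

```python
import numpy as np

rng = np.random.default_rng(0)

def tca_sample(mu, Lambda, Phi_diag, Gs, rho, Psi_diag):
    """Ancestral sample from TCA:
    y ~ N(0, I);  z ~ N(mu + Lambda y, Phi);  l ~ rho;  x ~ N(G_l z, Psi)."""
    y = rng.standard_normal(Lambda.shape[1])
    z = mu + Lambda @ y + np.sqrt(Phi_diag) * rng.standard_normal(mu.size)
    l = rng.choice(len(Gs), p=rho)
    x = Gs[l] @ z + np.sqrt(Psi_diag) * rng.standard_normal(mu.size)
    return x, z, y, l
```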
37. Example: hand-crafted model
- G1 = shift left and up, G2 = I, G3 = shift right and up
- l = 1, 2, 3, with ρ1 = ρ2 = ρ3 = 0.33
- [Figure: graphical model y -> z -> x with transformation l; subspace axes: shut eyes (SE), frown (Frn)]
38-47. Example: Simulation (animation over ten slides)
- G1 = shift left and up, G2 = I, G3 = shift right and up
- Repeatedly sample: draw y and z as before, pick a transformation (e.g., l = 1, then l = 3), and render the observed image x ~ N(Gl z, Ψ)
- [Figure: sampled subspace points, latent images z, and transformed observed images x]
48-49. Example: Inference (animation over two slides)
- G1 = shift left and up, G2 = I, G3 = shift right and up
- Training data: observed images x with unknown transformation l and latent image z
50. Example: Inference
- For an observed image x, infer the posterior over transformations: e.g., P(l=1|x) = .01, P(l=2|x) = .01, P(l=3|x) = .98
- [Figure: the three candidate explanations of x, one per transformation, with their posterior probabilities]
51. EM algorithm for TCA
- Initialize μ, Λ, Φ, ρ, Ψ to random values
- E step:
  - For each training case x^(t), infer q^(t)(l, z, y) = p(l, z, y | x^(t))
- M step:
  - Compute μ_new, Λ_new, Φ_new, ρ_new, Ψ_new to maximize Σ_t E[log p(y) p(z|y) P(l) p(x^(t)|z, l)], where E is w.r.t. q^(t)(l, z, y)
- Each iteration increases log p(Data)
- (The E step's posterior over transformations is sketched in code below)
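The posterior over the discrete transformation is available in closed form: marginalizing y and z leaves x Gaussian given l, with mean Gl μ and covariance Gl (Λ Λ^T + Φ) Gl^T + Ψ. A sketch of that part of the E step (the function name is illustrative):

```python
import numpy as np
from scipy.stats import multivariate_normal

def transformation_posterior(x, mu, Lambda, Phi_diag, Gs, rho, Psi_diag):
    """P(l|x) for TCA: with y and z marginalized, x given l is Gaussian
    with mean G_l mu and covariance G_l (Lambda Lambda^T + Phi) G_l^T + Psi."""
    Cz = Lambda @ Lambda.T + np.diag(Phi_diag)      # marginal covariance of z
    logp = np.empty(len(Gs))
    for l, G in enumerate(Gs):
        Gd = G.toarray() if hasattr(G, "toarray") else np.asarray(G)
        cov = Gd @ Cz @ Gd.T + np.diag(Psi_diag)
        logp[l] = np.log(rho[l]) + multivariate_normal.logpdf(x, Gd @ mu, cov)
    logp -= logp.max()                              # stabilize before normalizing
    post = np.exp(logp)
    return post / post.sum()
```

The cost is one N x N Gaussian evaluation per transformation, which is why sparse, small transformation sets keep the E step tractable.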
52. A tough toy problem
- 144 9 x 9 images
- 1 shape (pyramid)
- 3-D lighting
- cluttered background
- 25 possible locations
53. 1st 8 principal components vs. TCA
- TCA: 3 components, 81 transformations (9 horizontal shifts x 9 vertical shifts), 10 iterations of EM
- The model generates realistic examples
- [Figure: learned Φ, Ψ, μ, and components Λ1, Λ2, Λ3]
54. Expression modeling
- 100 16 x 24 training images
- variation in expression
- imperfect alignment
55. PCA: mean and 1st 10 principal components
56. Fantasies from the FA model vs. fantasies from the TCA model
57. Modeling handwritten digits
- 200 8 x 8 images of each digit
- preprocessing normalizes vertical/horizontal translation and scale
- different writing angles (shearing) remain; see the digit 7
58. TCA: 29 shearing/translation combinations
- 10 components per digit; 30 iterations of EM per digit
- [Figure: transformed means; mean of each digit]
59. FA: mean and 10 components per digit vs. TCA: mean and 10 components per digit
60. Classification performance
- Training: 200 cases/digit, 20 components, 50 EM iterations
- Testing: 1000 cases, p(x|class) used for classification
- Results (method: error rate):
  - k-nearest neighbors (optimized k): 7.6%
  - Factor analysis: 3.2%
  - Transformed component analysis: 2.7%
- Bonus: P(l|x) infers the writing angle!
- (Likelihood-based classification is sketched in code below)
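Likelihood-based classification evaluates each digit's model on a test image, marginalizing over transformations, and picks the best. A sketch, with the per-digit parameter tuple being this sketch's own packaging:

```python
import numpy as np
from scipy.special import logsumexp
from scipy.stats import multivariate_normal

def tca_log_lik(x, mu, Lambda, Phi_diag, Gs, rho, Psi_diag):
    """log p(x) under one per-digit TCA model, marginalizing over l."""
    Cz = Lambda @ Lambda.T + np.diag(Phi_diag)
    terms = []
    for G, r in zip(Gs, rho):
        Gd = G.toarray() if hasattr(G, "toarray") else np.asarray(G)
        cov = Gd @ Cz @ Gd.T + np.diag(Psi_diag)
        terms.append(np.log(r) + multivariate_normal.logpdf(x, Gd @ mu, cov))
    return logsumexp(terms)

def classify(x, models):
    """models: dict mapping digit -> (mu, Lambda, Phi_diag, Gs, rho, Psi_diag)."""
    return max(models, key=lambda d: tca_log_lik(x, *models[d]))
```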
61. Wrap-up
- MATLAB scripts: www.cs.uwaterloo.ca/frey
- Other domains: audio, bioinformatics, ...
- Other latent image models p(z):
  - clustering (mixture of Gaussians) (CVPR99)
  - mixtures of factor analyzers (NIPS99)
  - time series (CVPR00)
62. Wrap-up (continued)
- Discrete/linear combination: set some components equal to derivatives of μ w.r.t. transformations
- Multiresolution approach
- Fast variational methods, belief propagation, ...