Title: Variational Message Passing for Learning Object Categories
1. Variational Message Passing for Learning Object Categories
Li Fei-Fei, Caltech
John Winn, Microsoft Research Cambridge
2. Overview
- The problem: Object Categorization
  - The model and the goal
- The tutorial: Learning / Inference
  - Bayesian inference
  - Variational inference
  - Variational Message Passing
- The reformulation and experiments
  - The graphical model and VMP
  - Experiments with the Caltech-101 Object Categories
  - (bonus) Extensions to the model
3. Variability within a category
(example images: intrinsic variability and deformation)
4. Constellation model of object categories
Burl, Leung, Weber, Welling, Fergus, Fei-Fei, Perona, et al.
5. Goal
6. Goal
Burl, Leung, et al. '96, '98; Weber, Welling, et al. '98, '00; Fergus, et al. '03
7. Goal
- Estimate uncertainties in models
- Do full Bayesian learning
- Reduce the number of training examples
8. Now John will tell us...
- The problem: Object Categorization
  - The model and the goal
  - Learning the model is difficult
- The tutorial: Solution to Learning / Inference
  - Bayesian inference
  - Variational inference
  - Variational Message Passing (VMP)
- The reformulation and experiments
  - The graphical model and VMP
  - Experiments with the Caltech-101 Object Categories
  - Extensions to the model
9. Modelling the vision problem
- Nodes represent variables
- Conditional distributions at each node
- Defines a joint distribution
  P(C, L, S, I) = P(L) P(C) P(S|C) P(I|L,S)
10. Bayesian inference
(graphical model: hidden nodes are the object class C, lighting colour L and surface colour S; the observed node is the image colour I)
Observed variables V and hidden variables H.
Inference involves finding P(H1, H2 | V).
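Concretely (standard Bayes' rule, not spelled out on the slide), the posterior over the hidden variables comes from conditioning the joint of the previous slide on the observed image colour:

P(C, L, S | I) = P(L) P(C) P(S|C) P(I|L,S) / P(I)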
11. Bayesian inference vs. ML/MAP
- Consider learning a parameter θ ∈ H.
(plot: the likelihood P(V|θ) against θ, with its maximum marked as θML)
12. Bayesian inference vs. ML/MAP
- Consider learning a parameter θ ∈ H.
(plot: P(V|θ) against θ, with a region of high probability density marked alongside θML)
13. Bayesian inference vs. ML/MAP
- Consider learning a parameter θ ∈ H.
(plot: P(V|θ) against θ, with samples from the distribution shown alongside θML)
14. Bayesian inference vs. ML/MAP
- Consider learning a parameter θ ∈ H.
(plot: P(V|θ) against θ, with a variational approximation to the distribution shown alongside θML)
15. Variational Inference (in three easy steps)
- Choose a family of variational distributions Q(H).
- Use KL divergence as a measure of distance between P and Q.
- Find the Q which minimises KL(Q||P).
16. KL Divergence
Minimising KL(Q||P) is what Variational Inference does.
(plot: an approximating distribution Q alongside the true distribution P)
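For reference, the divergence being minimised is the standard KL divergence (this definition is background, not transcribed from the slide):

KL(Q||P) = Σ_H Q(H) ln [ Q(H) / P(H|V) ]

It is non-negative and equals zero exactly when Q matches the true posterior.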
17. Minimising the KL divergence
ln P(V) = L(Q) + KL(Q||P)
The left-hand side is fixed, so maximising the bound L(Q) is the same as minimising KL(Q||P),
where L(Q) = Σ_H Q(H) ln [ P(H,V) / Q(H) ].
- We choose a family of Q distributions where L(Q) is tractable to compute.
18. Minimising the KL divergence
ln P(V) is fixed, so in practice we simply maximise the lower bound L(Q).
19. Factorised Approximation
No further assumptions are required!
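The factorisation referred to here is the standard mean-field form used by VMP (Winn & Bishop 2004); written out as background, the approximation and the resulting optimal update for one factor are:

Q(H) = Π_i Q_i(H_i)
ln Q_j*(H_j) = ⟨ ln P(H,V) ⟩_{Q_i, i≠j} + const.

No assumption about the functional form of each Q_i is needed; it falls out of the update.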
20. Applying by hand
Fei-Fei et al. 2003
21. Variational Message Passing
- VMP makes it easier and quicker to apply factorised variational inference.
- VMP carries out variational inference using local computations and message passing on the graphical model.
- The modular algorithm allows modifying, extending or combining models.
22. VMP I: The Exponential Family
- Conditional distributions expressed in exponential family form:
  ln P(X|θ) = φ(θ)ᵀ u(X) + f(X) + g(θ)
  where u(X) is the sufficient statistics vector and φ(θ) is the natural parameter vector.
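As a concrete instance (worked out here, not on the slide), a univariate Gaussian N(x | µ, γ⁻¹) with mean µ and precision γ fits this form with

u(x) = [x, x²]ᵀ,  φ(µ, γ) = [γµ, −γ/2]ᵀ,  f(x) = 0,  g(µ, γ) = ½(ln γ − γµ² − ln 2π).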
23. VMP II: Conjugacy
- Parents and children are chosen to be conjugate, i.e. to have the same functional form.
(diagram: node X with parent Y and child Z; the distributions P(X|Y) and P(Z|X) have the same functional form in X)
- Examples
  - Gaussian for the mean of a Gaussian
  - Gamma for the precision of a Gaussian
  - Dirichlet for the parameters of a discrete distribution
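Illustrating the first example (this working is mine, not from the slide): a Gaussian prior on the mean µ of a Gaussian is conjugate because both the likelihood and the prior are linear in the same sufficient statistics u(µ) = [µ, µ²]ᵀ:

ln N(x | µ, γ⁻¹) = [γx, −γ/2] · [µ, µ²]ᵀ + const   (viewed as a function of µ)
ln N(µ | m, β⁻¹) = [βm, −β/2] · [µ, µ²]ᵀ + const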
24. VMP III: Messages
(diagram: node X exchanges messages with its parent Y and its child Z)
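The slide itself is a diagram; paraphrasing the message definitions from Winn & Bishop (2004) rather than transcribing the slide:
- the message a parent Y sends to its child X is the expected sufficient statistics ⟨u_Y(Y)⟩ under Q(Y);
- the message a child Z sends to its parent X is the expected natural-parameter contribution of the factor P(Z | X, ...) with respect to X, computed from ⟨u_Z(Z)⟩ and the messages from Z's other parents (the co-parents).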
25. VMP IV: Update
- The optimal Q(X) has the same form as P(X|θ), but with an updated parameter vector φ* computed from the incoming messages.
- These messages can also be used to calculate the bound on the evidence L(Q); see Winn & Bishop, 2004.
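Written out (following Winn & Bishop 2004; the equation is not transcribed on the slide), the updated natural parameter vector for a node X with parents pa(X) and children ch(X) is

φ*_X = φ_X({⟨u_Y(Y)⟩ : Y ∈ pa(X)}) + Σ_{Z ∈ ch(X)} m_{Z→X}

i.e. the prior natural parameters evaluated at the parent messages, plus the sum of the child messages.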
26. VMP Example
- Learning the parameters of a Gaussian from N data points.
(graphical model: mean µ and precision γ (inverse variance) are the parents of the data x, inside a plate of size N)
27. VMP Example
Message from γ to all x_n; this requires an initial Q(γ).
28. VMP Example
Messages from each x_n to µ.
Update the Q(µ) parameter vector.
29. VMP Example
Message from the updated µ to all x_n.
30. VMP Example
Messages from each x_n to γ.
Update the Q(γ) parameter vector.
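A minimal runnable sketch of this schedule (slides 27-30) in Python, assuming a Gaussian prior on µ and a Gamma prior on the precision γ; the prior settings, function name and simulated data below are illustrative, not taken from the slides:

import numpy as np

def vmp_gaussian(x, iters=20, m0=0.0, beta0=1e-3, a0=1e-3, b0=1e-3):
    """Variational updates for x_n ~ N(mu, 1/gamma), with priors
    mu ~ N(m0, 1/beta0) and gamma ~ Gamma(a0, b0) (rate b0).
    Factorised posterior: Q(mu) = N(m, 1/beta), Q(gamma) = Gamma(a, b)."""
    x = np.asarray(x, dtype=float)
    N = len(x)
    a, b = a0, b0          # initial Q(gamma), needed for the first message
    m, beta = m0, beta0
    for _ in range(iters):
        # Message from gamma to the x_n is its expectation <gamma> = a/b;
        # the x_n then send messages to mu, and Q(mu) is updated.
        E_gamma = a / b
        beta = beta0 + N * E_gamma
        m = (beta0 * m0 + E_gamma * x.sum()) / beta
        # Message from the updated mu to the x_n uses <mu> and <mu^2>;
        # the x_n then send messages to gamma, and Q(gamma) is updated.
        E_mu, E_mu2 = m, m**2 + 1.0 / beta
        a = a0 + 0.5 * N
        b = b0 + 0.5 * np.sum(x**2 - 2.0 * x * E_mu + E_mu2)
    return (m, beta), (a, b)

# Toy run: data drawn from N(5, 0.5^2), so the recovered <gamma> should be near 4.
data = np.random.default_rng(0).normal(5.0, 0.5, size=100)
(m, beta), (a, b) = vmp_gaussian(data)
print("Q(mu): mean %.3f, precision %.1f" % (m, beta))
print("Q(gamma): <gamma> = %.2f" % (a / b))

Each loop iteration mirrors slides 27-30: send ⟨γ⟩ to the data, update Q(µ), send the updated ⟨µ⟩ and ⟨µ²⟩ back, update Q(γ); slides 31-34 show the corresponding trajectory of Q in the (µ, γ) plane.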
31. Initial Configuration
(plot of Q over the (µ, γ) plane)
32. After Updating Q(µ)
(plot of Q over the (µ, γ) plane)
33. After Updating Q(γ)
(plot of Q over the (µ, γ) plane)
34. Converged Solution
(plot of Q over the (µ, γ) plane)
35. VMP Software: VIBES
- Free download from vibes.sourceforge.net
36. Now back to Fei-Fei...
- The problem: Object Categorization
  - The model and the goal
  - Learning the model is difficult
- The tutorial: Solution to Learning / Inference
  - Bayesian inference
  - Variational inference
  - Variational Message Passing
- The reformulation and experiments
  - The graphical model and VMP
  - Experiments with the Caltech-101 Object Categories
  - Extensions to the model
38. The Generative Model
39. The hypothesis (h) node
h is a mapping from interest points to parts
40. The hypothesis (h) node
(figure: an image with interest points labelled 1 to 10)
e.g. h_j = (2, 4, 8)
h is a mapping from interest points to parts
41. The spatial node
42. The spatial parameters node
43. The appearance node
PCA coefficients on a fixed basis:
Pt 1: (c1, c2, c3, ...)
Pt 2: (c1, c2, c3, ...)
Pt 3: (c1, c2, c3, ...)
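A small sketch of this step, assuming flattened image patches around each interest point and a pre-computed fixed PCA basis; the patch size, number of components and helper name below are illustrative, not from the slides:

import numpy as np

def appearance_coefficients(patches, basis, mean_patch):
    """patches: (n_points, d) flattened patches around interest points;
    basis: (d, k) fixed PCA basis; returns (n_points, k) coefficient
    vectors (c1, ..., ck), one per interest point."""
    return (patches - mean_patch) @ basis

# Toy example with random stand-ins for the basis and detected patches.
rng = np.random.default_rng(1)
basis = np.linalg.qr(rng.normal(size=(121, 10)))[0]   # 11x11 patches, 10 components
mean_patch = rng.normal(size=121)
patches = rng.normal(size=(3, 121))
print(appearance_coefficients(patches, basis, mean_patch).shape)  # (3, 10)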
44. The appearance parameter node
46. Goal
θ1, θ2, ..., θn
where θ = {µX, ΓX, µA, ΓA} (means and precisions of the shape X and appearance A densities)
47. Goal
θ1, θ2, ..., θn
where θ = {µX, ΓX, µA, ΓA}
Fei-Fei et al. '03, '04
48. Inference: Variational Message Passing
49. Inference: Variational Message Passing
(graphical model of the object model: parameter nodes µX, ΓX, µA, ΓA, the hypothesis h, the shape X and appearance A descriptions, and the image plate I)
50. Inference: Variational Message Passing
Node µ: the mean
(same graphical model, highlighting the messages received by the mean node)
51. Inference: Variational Message Passing
Node h: each hypothesis
(same graphical model, highlighting the messages received by the hypothesis node)
52. Experiments
Training: 1-6 images (randomly drawn)
Detection test:
- 50 foreground / 50 background images
- object present/absent
Datasets (foreground and background): the Caltech-101 Object Categories
www.vision.caltech.edu/feifeili/Datasets.htm
53. No Manual Preprocessing
No labeling
No segmentation
No alignment
54. Faces, Motorbikes, Airplanes, Spotted cats
Fergus et al. 2003; Weber, Fergus, et al.
61. Number of training examples
62. Number of training examples
63. Number of training examples
64. Number of training examples
66. ML vs. MAP vs. Bayes (Variational)
67. Conclusions
- Bayesian inference is very useful and doesn't have to be scary.
- VMP has been successfully applied to several vision problems.
- Bayesian inference gives improved results over ML/MAP for real systems with real data.
- Experiments on a large dataset (the Caltech-101 Object Categories).
68. Extensions with Probabilistic PCA
(figure: each patch is projected onto a pre-fixed PCA basis, giving coefficients c1, c2, ..., c10)
69. Extensions with Probabilistic PCA
(graphical model: the fixed PCA basis is replaced by Bayesian Probabilistic PCA, with factor loadings W and latent coordinates Y generating the appearance A, alongside the parameter nodes µX, ΓX, µA, ΓA, the hypothesis h, the shape X and the image plate I)
Bayesian Probabilistic PCA: Tipping & Bishop '98, '99
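For background, the probabilistic PCA model of Tipping & Bishop (a standard formulation, stated here rather than transcribed from the slide) generates an observed appearance vector a from a low-dimensional latent vector y as

a = W y + µ + ε,   y ~ N(0, I),   ε ~ N(0, σ²I)

so learning W within the Bayesian model replaces the pre-fixed PCA basis of the previous slide.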
73. Probabilistic PCA vs. Normal PCA
75. Acknowledgments
Caltech Vision Lab, Pietro Perona, Rob Fergus, Andrew Zisserman, Christopher Bishop, Tom Minka