Some Surprises in the Theory of Generalized Belief Propagation

About This Presentation
Slides: 47
Provided by: yed3
Learn more at: https://cnls.lanl.gov
Transcript and Presenter's Notes

Title: Some Surprises in the Theory of Generalized Belief Propagation


1
Some Surprises in the Theory of Generalized
Belief Propagation
  • Jonathan Yedidia
  • Mitsubishi Electric Research Labs (MERL)
  • Collaborators
  • Bill Freeman (MIT)
  • Yair Weiss (Hebrew University)

See "Constructing Free Energy Approximations and
Generalized Belief Propagation Algorithms," MERL
TR2004-040, to be published in IEEE Trans. Info.
Theory.
2
Outline
  • Introduction to GBP
  • Quick Review of Standard BP
  • Free Energies
  • Region Graphs and Valid Approximations
  • Some Surprises
  • Maxent-Normal Approximations
  • A Useful Heuristic

3
Factor Graphs
(Kschischang et al., 2001)
4
Computing Marginal Probabilities
Fundamental for
  • Decoding error-correcting codes
  • Inference in Bayesian networks
  • Computer vision
  • Statistical physics of magnets

Non-trivial because of the huge number of terms
in the sum.
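To make the size of that sum concrete, here is a minimal brute-force sketch. The three-variable chain, its factor functions `bias` and `pair`, and the helper name `brute_force_marginals` are all hypothetical choices for illustration; they do not come from the slides. The loop visits every one of the 2^N joint states, which is what becomes infeasible as N grows.

```python
import itertools

def brute_force_marginals(n_vars, factors):
    """Exact marginals of binary variables by summing over all 2**n_vars states.

    factors: list of (scope, function) pairs, where scope is a tuple of
    variable indices and function returns a non-negative weight.
    """
    marginals = [[0.0, 0.0] for _ in range(n_vars)]
    z = 0.0
    for x in itertools.product((0, 1), repeat=n_vars):  # 2**n_vars terms
        weight = 1.0
        for scope, f in factors:
            weight *= f(*(x[i] for i in scope))
        z += weight
        for i, x_i in enumerate(x):
            marginals[i][x_i] += weight
    return [[m / z for m in row] for row in marginals], z

# Hypothetical example: a chain 0 - 1 - 2 with a bias on variable 0.
def bias(u):
    return 3.0 if u == 1 else 1.0

def pair(u, v):
    return 2.0 if u == v else 1.0

chain_factors = [((0,), bias), ((0, 1), pair), ((1, 2), pair)]
chain_marginals, chain_z = brute_force_marginals(3, chain_factors)
print(chain_z, chain_marginals)
```

With only three variables the sum has 8 terms; each additional binary variable doubles it.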
5
Error-correcting Codes
(Tanner, 1981; Gallager, 1963)
Marginal probabilities = a posteriori bit
probabilities
6
Statistical Physics
Marginal probabilities = local magnetization
7
Standard Belief Propagation
beliefs messages
The belief is the BP approximation of the
marginal probability.
8
BP Message-update Rules
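Using the marginalization constraint
b_i(x_i) = \sum_{X_a \setminus x_i} b_a(X_a)
we get
m_{a \to i}(x_i) = \sum_{X_a \setminus x_i} f_a(X_a) \prod_{j \in N(a) \setminus i} \prod_{b \in N(j) \setminus a} m_{b \to j}(x_j)
where m_{a \to i} is the message from factor node a to variable node i, and N(.) denotes the neighbors of a node.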
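A minimal sketch of standard BP with factor-to-variable messages, under stated assumptions: binary variables, parallel updates, and a hypothetical three-variable chain (the functions `bias` and `pair` and the name `belief_propagation` are illustrative, not from the talk). Because the chain is a tree, the resulting beliefs should agree with the exact marginals.

```python
import itertools

def belief_propagation(n_vars, factors, n_iters=10):
    """Sum-product BP on a factor graph of binary variables.

    factors: list of (scope, function); scope is a tuple of variable indices.
    Returns one normalized belief [b(0), b(1)] per variable.
    """
    # messages[(a, i)] holds the message from factor a to variable i.
    messages = {(a, i): [1.0, 1.0]
                for a, (scope, _) in enumerate(factors) for i in scope}

    def incoming_except(j, a, x_j):
        # Product of messages into variable j from every factor other than a.
        out = 1.0
        for b, (scope, _) in enumerate(factors):
            if b != a and j in scope:
                out *= messages[(b, j)][x_j]
        return out

    for _ in range(n_iters):
        new_messages = {}
        for a, (scope, f) in enumerate(factors):
            for i in scope:
                msg = [0.0, 0.0]
                # Sum over all states of the factor's variables.
                for x in itertools.product((0, 1), repeat=len(scope)):
                    term = f(*x)
                    for j, x_j in zip(scope, x):
                        if j != i:
                            term *= incoming_except(j, a, x_j)
                    msg[x[scope.index(i)]] += term
                total = sum(msg)
                new_messages[(a, i)] = [m / total for m in msg]
        messages = new_messages

    beliefs = []
    for i in range(n_vars):
        b = [1.0, 1.0]
        for (a, j), m in messages.items():
            if j == i:
                b = [b[0] * m[0], b[1] * m[1]]
        total = sum(b)
        beliefs.append([v / total for v in b])
    return beliefs

# Hypothetical example: a chain 0 - 1 - 2 with a bias on variable 0.
def bias(u):
    return 3.0 if u == 1 else 1.0

def pair(u, v):
    return 2.0 if u == v else 1.0

bp_beliefs = belief_propagation(3, [((0,), bias), ((0, 1), pair), ((1, 2), pair)])
print(bp_beliefs)
```

Messages are normalized after each update purely for numerical convenience; the beliefs are unaffected by that choice.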

9
Variational (Gibbs) Free Energy
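Kullback-Leibler Distance: D(b || p) = \sum_x b(x) \ln [ b(x) / p(x) ]
Boltzmann's Law (serves to define the energy): p(x) = e^{-E(x)} / Z
Variational Free Energy F(b) = \sum_x b(x) E(x) + \sum_x b(x) \ln b(x) = -\ln Z + D(b || p) is minimized when b(x) = p(x)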
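The relationship between the variational free energy, the partition function, and the Kullback-Leibler distance can be checked numerically. The sketch below assumes a tiny, hypothetical two-variable model (the weights are made up) and defines the energy through Boltzmann's law as minus the log of the unnormalized weight. It verifies that the free energy of the exact distribution is -ln Z and that any other belief exceeds it by exactly the KL distance.

```python
import itertools
import math

def gibbs_free_energy(belief, energy):
    """F(b) = average energy minus entropy, for a belief over joint states."""
    average_energy = sum(b * energy[x] for x, b in belief.items())
    entropy = -sum(b * math.log(b) for b in belief.values() if b > 0)
    return average_energy - entropy

# Hypothetical two-variable model with made-up unnormalized weights.
states = list(itertools.product((0, 1), repeat=2))
weight = {x: (2.0 if x[0] == x[1] else 1.0) * (3.0 if x[0] == 1 else 1.0)
          for x in states}
energy = {x: -math.log(w) for x, w in weight.items()}  # Boltzmann's law
z = sum(weight.values())

exact = {x: w / z for x, w in weight.items()}
uniform = {x: 0.25 for x in states}

f_exact = gibbs_free_energy(exact, energy)
f_uniform = gibbs_free_energy(uniform, energy)
kl = sum(b * math.log(b / exact[x]) for x, b in uniform.items())
print(f_exact, -math.log(z), f_uniform - f_exact, kl)
```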
10
So we've replaced the intractable problem of
computing marginal probabilities with the even
more intractable problem of minimizing a
variational free energy over the space of
possible beliefs. But the point is that now we
can introduce interesting approximations.
11
Region-based Approximations to the Variational
Free Energy
(Kikuchi, 1951)
Exact Regions
(intractable)
12
Defining a Region
  • A region r is a set of variable nodes V_r and
    factor nodes F_r such that if a factor node a
    belongs to F_r, all variable nodes neighboring a
    must belong to V_r.

Regions
Not a Region
13
Region Definitions
Region states
Region beliefs
Region energy
Region average energy
Region entropy
Region free energy
14
Valid Approximations
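Introduce a set of regions R, and a counting
number c_r for each region r in R, such that c_r = 1
for the largest regions, and for every factor
node a and variable node i,
\sum_{r \in R} c_r I_{F_r}(a) = \sum_{r \in R} c_r I_{V_r}(i) = 1,
where I_{F_r}(a) and I_{V_r}(i) are indicator functions, equal to 1 if the node belongs to region r and 0 otherwise.
Count every node once!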
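The "count every node once" condition is easy to check mechanically. The sketch below is an illustration under assumptions: regions are represented as pairs of Python sets (factor nodes, variable nodes), and the example is a hypothetical chain with factors a and b, not a graph from the slides.

```python
def is_valid(regions, counting_numbers):
    """Check that every factor node and every variable node is counted once.

    regions: list of (factor_set, variable_set) pairs.
    counting_numbers: one integer per region, in the same order.
    """
    all_factors = set().union(*(f for f, _ in regions))
    all_variables = set().union(*(v for _, v in regions))
    factors_ok = all(
        sum(c for (f, _), c in zip(regions, counting_numbers) if a in f) == 1
        for a in all_factors)
    variables_ok = all(
        sum(c for (_, v), c in zip(regions, counting_numbers) if i in v) == 1
        for i in all_variables)
    return factors_ok and variables_ok

# Hypothetical chain 1 - a - 2 - b - 3: two large regions, three small ones.
chain_regions = [({"a"}, {1, 2}), ({"b"}, {2, 3}),
                 (set(), {1}), (set(), {2}), (set(), {3})]
valid_counts = [1, 1, 0, -1, 0]    # variable 2 is shared, so it is subtracted once
invalid_counts = [1, 1, 0, 0, 0]   # variable 2 would be counted twice
print(is_valid(chain_regions, valid_counts), is_valid(chain_regions, invalid_counts))
```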
15
Entropy and Energy
  • Counting each factor
    node once makes the approximate energy
    exact (if the beliefs are).
  • Counting each variable
    node once makes the approximate entropy
    reasonable (at least the entropy is correct when
    all states are equiprobable).

16
Comments
  • We could actually use different counting numbers
    for energy and entropy, which leads to fractional BP
    (Wiegerinck and Heskes, 2002) or convexified free
    energy (Wainwright et al., 2002) algorithms.
  • Of course, we also need to impose normalization
    and consistency constraints on our free energy.

17
Methods to Generate Valid Region-based
Approximations
Region Graphs
Junction graphs
Cluster Variational Method (Kikuchi)
Aji-McEliece
Junction trees
Bethe
(Bethe is an example of Kikuchi for factor graphs
with no 4-cycles. Bethe is an example of
Aji-McEliece for normal factor graphs.)
18
Example of a Region Graph
Largest regions: {A,C,1,2,4,5}, {B,D,2,3,5,6}, {C,E,4,5,7,8}, {D,F,5,6,8,9}
Intermediate regions: {2,5}, {5,8}, {D,5,6}, {C,4,5}
Smallest region: {5}
({5} is a child of {2,5}.)
19
Definition of a Region Graph
  • Labeled, directed graph of regions.
  • Arc may exist from region A to region B if B is a
    subset of A.
  • Sub-graphs formed from regions containing a
    given node are connected.
  • c_r = 1 - \sum_{s \in A(r)} c_s, where A(r) is the set of
    ancestors of region r. (Möbius function)
  • We insist that
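As an illustration of computing counting numbers from ancestors (each region gets one minus the sum of its ancestors' counting numbers), here is a minimal sketch. The region names are strings, and the parent arcs are an assumption: the transcript only states that {5} is a child of {2,5}, so the remaining arcs are inferred from the subset relations among the regions of the example region graph.

```python
def counting_numbers(parents):
    """Counting number of each region: 1 minus the sum over its ancestors.

    parents: dict mapping a region name to the list of its parent regions.
    """
    def ancestors(region):
        found = set()
        stack = list(parents[region])
        while stack:
            p = stack.pop()
            if p not in found:
                found.add(p)
                stack.extend(parents[p])
        return found

    c = {}

    def resolve(region):
        if region not in c:
            c[region] = 1 - sum(resolve(a) for a in ancestors(region))
        return c[region]

    for region in parents:
        resolve(region)
    return c

# Assumed arcs for the example region graph (inferred from subset relations).
example_parents = {
    "AC1245": [], "BD2356": [], "CE4578": [], "DF5689": [],
    "25": ["AC1245", "BD2356"],
    "C45": ["AC1245", "CE4578"],
    "D56": ["BD2356", "DF5689"],
    "58": ["CE4578", "DF5689"],
    "5": ["25", "C45", "D56", "58"],
}
example_c = counting_numbers(example_parents)
print(example_c)
```

Under these assumed arcs, variable node 5 appears in all nine regions and is still counted exactly once.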

20
Bethe Method
(after Bethe, 1935)
Two sets of regions: large regions, each containing
a single factor node a and all attached variable
nodes; and small regions, each containing a single
variable node i.
Large regions: {A,1,2,4,5}, {B,2,3,5,6}, {C,4,5}, {D,5,6}, {E,4,5,7,8}, {F,5,6,8,9}
Small regions: {1}, {2}, {3}, {4}, {5}, {6}, {7}, {8}, {9}
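For the Bethe method, validity forces each small region's counting number to be one minus the number of factor nodes attached to that variable, since every large region has counting number 1. The sketch below applies this to the six factor nodes A through F listed for this example; the dictionary layout and function name are illustrative assumptions.

```python
def bethe_counting_numbers(factor_scopes):
    """Counting numbers for the Bethe method.

    factor_scopes: dict mapping a factor node to the set of its variable nodes.
    Returns (large, small): large regions get 1, small regions get 1 - degree.
    """
    large = {a: 1 for a in factor_scopes}
    degree = {}
    for scope in factor_scopes.values():
        for i in scope:
            degree[i] = degree.get(i, 0) + 1
    small = {i: 1 - d for i, d in degree.items()}
    return large, small

bethe_scopes = {
    "A": {1, 2, 4, 5}, "B": {2, 3, 5, 6}, "C": {4, 5},
    "D": {5, 6}, "E": {4, 5, 7, 8}, "F": {5, 6, 8, 9},
}
bethe_large, bethe_small = bethe_counting_numbers(bethe_scopes)
print(bethe_small)
```

Variable node 5 neighbors all six factor nodes, so its small region receives counting number -5.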
21
Bethe Approximation to Gibbs Free Energy
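Equal to the exact Gibbs free energy when the
factor graph is a tree, because in that case
p(x) = \prod_a p_a(X_a) \prod_i p_i(x_i)^{1 - d_i},
where d_i is the number of factor nodes neighboring variable node i.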
22
Minimizing the Bethe Free Energy
23
Bethe BP
Identify
to obtain BP equations
24
Cluster Variation Method
(Kikuchi, 1951)
Form a region graph with an arbitrary number of
different sized regions. Start with largest
regions. Then find intersection regions of
the largest regions, discarding any regions that
are sub-regions of other intersection regions.
Continue finding intersections of those
intersection regions, etc. All intersection
regions obey c_r = 1 - \sum_{s \in S(r)} c_s, where S(r)
is the set of super-regions of region r.
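The construction can be sketched in a few lines. This is an illustration under assumptions: regions are frozensets of string labels mixing factor nodes and variable nodes, the largest regions are not nested in one another, and the function names are hypothetical. The starting regions are the four largest regions of the example used in this talk.

```python
def cvm_regions(largest):
    """Build CVM region levels by repeated pairwise intersection."""
    levels = [list(dict.fromkeys(largest))]
    while True:
        current = levels[-1]
        candidates = {a & b for i, a in enumerate(current) for b in current[i + 1:]}
        candidates.discard(frozenset())
        # Discard intersections that are sub-regions of other intersections.
        maximal = [r for r in candidates if not any(r < other for other in candidates)]
        if not maximal:
            break
        levels.append(sorted(maximal, key=sorted))
    return levels

def cvm_counting_numbers(levels):
    """c_r = 1 - sum of c_s over all super-regions s of r, largest regions first."""
    c = {}
    for level in levels:
        for r in level:
            c[r] = 1 - sum(c_s for s, c_s in c.items() if r < s)
    return c

largest_regions = [
    frozenset({"A", "C", "1", "2", "4", "5"}),
    frozenset({"B", "D", "2", "3", "5", "6"}),
    frozenset({"C", "E", "4", "5", "7", "8"}),
    frozenset({"D", "F", "5", "6", "8", "9"}),
]
cvm_levels = cvm_regions(largest_regions)
cvm_c = cvm_counting_numbers(cvm_levels)
for level in cvm_levels:
    print([(sorted(r), cvm_c[r]) for r in level])
```

The region {5} first shows up as an intersection of two largest regions, but it is discarded at that level because it is a sub-region of {2,5}; it reappears one level down.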
25
Region Graph Created Using CVM
Largest regions: {A,C,1,2,4,5}, {B,D,2,3,5,6}, {C,E,4,5,7,8}, {D,F,5,6,8,9}
Intermediate regions: {2,5}, {5,8}, {D,5,6}, {C,4,5}
Smallest region: {5}
26
Minimizing a Region Graph Free Energy
  • Minimization is possible, but it may be awkward
    because of all the constraints that must be
    satisfied.
  • We introduce generalized belief propagation
    algorithms whose fixed points are provably
    identical to the stationary points of the region
    graph free energy.

27
Generalized Belief Propagation
  • Belief in a region is the product of
  • Local information (factors in region)
  • Messages from parent regions
  • Messages into descendant regions from parents who
    are not descendants.
  • Message-update rules obtained by enforcing
    marginalization constraints.

28
Generalized Belief Propagation
[Figure: variable nodes 1 through 9.]
29
Generalized Belief Propagation
30
Generalized Belief Propagation
31
Generalized Belief Propagation
[Figure: variable nodes 1 through 9.]
32
Generalized Belief Propagation
Use Marginalization Constraints to Derive
Message-Update Rules
[Figure: variable nodes 1 through 9.]
33
Generalized Belief Propagation
Use Marginalization Constraints to Derive
Message-Update Rules
[Figure: two diagrams of variable nodes 1 through 9.]
34
Generalized Belief Propagation
Use Marginalization Constraints to Derive
Message-Update Rules
[Figure: two diagrams of variable nodes 1 through 9.]
35
Generalized Belief Propagation
Use Marginalization Constraints to Derive
Message-Update Rules
[Figure: two diagrams of variable nodes 1 through 9.]
36
(Mild) Surprise 1
  • Region beliefs (even those given by Bethe/BP) may
    not be realizable as the marginals of any global
    belief.

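[Figure: factor graph with variable nodes 1, 2, 3 and factor nodes a, b, c.]
[Beliefs not transcribed] is a perfectly acceptable solution to BP, but
cannot arise from any global belief.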
37
(Minor) Surprise 2
  • For some sets of beliefs (sometimes even when the
    marginal beliefs are the exactly correct ones),
    the Bethe entropy is negative!

Each large region has counting number 1, each
small region has counting number -2, so if the
beliefs are
then the entropy is negative
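The beliefs for this example did not survive in the transcript, so the following sketch makes its own assumptions: a fully connected pairwise model on 4 binary variables (each variable then sits in 3 pair regions, giving a small-region counting number of -2), with beliefs that put half the weight on "all equal to 0" and half on "all equal to 1". These beliefs are the exact marginals of a global distribution, yet the Bethe entropy comes out negative.

```python
import itertools
import math

def entropy(probabilities):
    """Shannon entropy in nats, skipping zero-probability states."""
    return -sum(p * math.log(p) for p in probabilities if p > 0)

n = 4                                          # assumed number of variables
pairs = list(itertools.combinations(range(n), 2))
pair_belief = [0.5, 0.0, 0.0, 0.5]             # states 00, 01, 10, 11
single_belief = [0.5, 0.5]

c_large = 1
c_small = 1 - (n - 1)                          # each variable is in n - 1 pairs

bethe_entropy = (sum(c_large * entropy(pair_belief) for _ in pairs)
                 + sum(c_small * entropy(single_belief) for _ in range(n)))
true_entropy = math.log(2)                     # two equally likely global states
print(bethe_entropy, true_entropy)
```

Six pair regions contribute ln 2 each and four single-variable regions contribute -2 ln 2 each, for a total of -2 ln 2.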
38
(Serious) Surprise 3
  • When there are no interactions, the minimum of
    the CVM free energy should correspond to the
    equiprobable global distribution, but sometimes
    it doesn't!

39
Example
Fully connected pairwise model, using CVM and all
triplets as the largest regions, with pairs and
singlets as the intersection regions.
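For this example the counting numbers follow directly from the super-region rule, because every pair lies in N - 2 triplets and every single variable lies in C(N-1, 2) triplets and N - 1 pairs. The sketch below is an illustration; the function name is hypothetical and N is passed as `n`.

```python
from math import comb

def triplet_cvm_counting_numbers(n):
    """Counting numbers (triplet, pair, single) for CVM with all triplets
    as largest regions on a fully connected model of n variables."""
    c_triplet = 1
    c_pair = 1 - (n - 2) * c_triplet                       # n - 2 super-regions
    c_single = 1 - comb(n - 1, 2) * c_triplet - (n - 1) * c_pair
    return c_triplet, c_pair, c_single

for n in range(3, 8):
    c3, c2, c1 = triplet_cvm_counting_numbers(n)
    total = comb(n, 3) * c3 + comb(n, 2) * c2 + n * c1     # sum of all counting numbers
    print(n, (c3, c2, c1), total)
```

The sum of all counting numbers equals N at N = 5 and exceeds N from N = 6 onward.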
40
Two simple distributions
  • Equiprobable distribution: each state is equally
    probable for all beliefs.
  • e.g.
  • Equipolarized distribution: obtained by marginalizing the global
    distribution that only allows all-zeros or
    all-ones.
  • e.g.

Each region gives
Each region gives
41
Surprise 3 (cont.)
  • Surprisingly, the equipolarized distribution
    can have a greater entropy than the equiprobable
    distribution for some valid CVM approximations.
  • This is a serious problem, because if the model
    gets the wrong answer without any interactions,
    it can't be expected to be correct with
    interactions.

42
The Fix: Consider Only Maxent-Normal
Approximations
  • A maxent-normal approximation is one which gives
    a maximum of the entropy when the beliefs
    correspond to the equiprobable distribution.
  • Bethe approximations are provably maxent-normal.
  • Some other CVM and other region-graph
    approximations are also provably maxent-normal.
  • Using quartets on square lattices
  • Empirically, these approximations give good
    results
  • Other valid CVM or region-graph approximations
    are not maxent-normal.
  • Empirically, these approximations give poor
    results

43
A Very Simple Heuristic
  • Sum all the counting numbers.
  • If the sum is greater than N, the equipolarized
    distribution will have a greater entropy than the
    equiprobable distribution. (Too Much: Very Bad)
  • If the sum is less than 0, the equipolarized
    distribution will have a negative entropy. (Too
    Little)
  • If the sum equals 1, the equipolarized
    distribution will have the correct entropy. (Just
    Right)
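The reasoning behind the heuristic can be sketched numerically. For binary variables, a region holding k variable nodes has entropy k ln 2 under the equiprobable beliefs and ln 2 under the equipolarized beliefs, so the two region-based entropies are N ln 2 (for a valid approximation) and ln 2 times the sum of the counting numbers. Everything named below is an illustrative assumption, including the "inconclusive" label for sums the slide does not cover and the three hard-coded examples.

```python
import math

LN2 = math.log(2)

def region_entropies(region_sizes, counting_numbers):
    """Region-based entropy of the equiprobable and equipolarized beliefs.

    region_sizes: number of (binary) variable nodes in each region.
    """
    equiprobable = sum(c * size * LN2 for size, c in zip(region_sizes, counting_numbers))
    equipolarized = sum(c * LN2 for c in counting_numbers)
    return equiprobable, equipolarized

def heuristic(counting_numbers, n_vars):
    total = sum(counting_numbers)
    if total > n_vars:
        return "too much"
    if total < 0:
        return "too little"
    if total == 1:
        return "just right"
    return "inconclusive"  # sums not covered by the heuristic as stated

# CVM with all triplets on 6 fully connected variables: 20 triplets, 15 pairs, 6 singles.
cvm_sizes = [3] * 20 + [2] * 15 + [1] * 6
cvm_counts = [1] * 20 + [-3] * 15 + [6] * 6
# Bethe on 4 fully connected variables: 6 pairs, 4 singles.
bethe_counts = [1] * 6 + [-2] * 4
# Bethe on a 3-variable chain (a tree): 2 pairs, 3 singles.
chain_counts = [1, 1, 0, -1, 0]

cvm_equiprobable, cvm_equipolarized = region_entropies(cvm_sizes, cvm_counts)
print(heuristic(cvm_counts, 6), heuristic(bethe_counts, 4), heuristic(chain_counts, 3))
```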

44
10x10 Ising Spin Glass
Random fields
Random interactions
46
Conclusions
  • Standard BP is essentially equivalent to minimizing
    the Bethe free energy.
  • The Bethe method and the cluster variation method are
    special cases of the more general region graph
    method for generating valid free energy
    approximations.
  • GBP is essentially equivalent to minimizing
    the region graph free energy.
  • One should be careful to use maxent-normal
    approximations.
  • Sometimes you can prove that your approximation
    is maxent-normal, and simple heuristics can be
    used to prove when it isn't.