1
Markov Chain Monte Carlo and Simulated Annealing
  • Li Gang
  • March 1, 2007

2
Outline
  • Optimization
  • Markov Chain
  • MCMC
  • Simulated Annealing

3
Motivation
Optimization Problem
  • Genetic Algorithm (a minimal sketch of this loop appears after this list)
  • Randomly sample points
  • Evaluate their fitness
  • Apply genetic operators to the existing points
  • Go back to step 2
  • The third step is also a kind of sampling, but it
    is not purely random because it samples based on
    knowledge gained from the previous points.
  • GA makes use of this knowledge through crossover
    and mutation.
  • EDA makes use of this knowledge by estimating the
    density of the distribution.
  • Many other approaches exist.
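A minimal, hypothetical sketch of the loop above (the fitness function, population size, and mutation scale are illustrative assumptions, not from the slides):

import random

def ga_step(population, fitness, mutation_scale=0.1):
    # Step 2: evaluate fitness of the existing points.
    scored = sorted(population, key=fitness, reverse=True)
    parents = scored[:len(scored) // 2]                      # keep the fitter half
    # Step 3: apply genetic operators (crossover, then mutation).
    children = []
    while len(children) < len(population):
        a, b = random.sample(parents, 2)
        child = [(u + v) / 2 for u, v in zip(a, b)]          # crossover
        child = [u + random.gauss(0, mutation_scale) for u in child]  # mutation
        children.append(child)
    return children

# Step 1: randomly sample points; step 4: go back to step 2.
fitness = lambda p: -(p[0] ** 2 + p[1] ** 2)                 # toy fitness, peak at the origin
population = [[random.uniform(-5, 5) for _ in range(2)] for _ in range(20)]
for _ in range(50):
    population = ga_step(population, fitness)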

4
Markov Chain Monte Carlo
  • Problem 1: to generate samples x from a given
    probability distribution P(x)
  • Problem 2: to estimate expectations of functions
    under this distribution
  • Problem 2 is solved easily once Problem 1 is
    solved (the standard estimator is written out below)
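A standard way to see this is the Monte Carlo estimator: given samples x^(1), ..., x^(N) drawn from P(x),

\mathbb{E}_{P}[\phi(x)] = \int \phi(x)\,P(x)\,dx \;\approx\; \frac{1}{N}\sum_{n=1}^{N} \phi\bigl(x^{(n)}\bigr),

so once we can draw the samples (Problem 1), estimating any expectation (Problem 2) reduces to averaging.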

5
Difficulty
  • We usually know P(x) only up to the normalizing
    constant Z, which is hard to compute
  • Drawing samples from P(x) is still challenging,
    especially in high-dimensional spaces
  • We know direct sampling methods for only a few
    distribution families

6
An Example
7
Sampling Methods
  • Uniform Sampling
  • Importance Sampling
  • Rejection Sampling
  • Metropolis-Hastings (MCMC)

8
Markov Chain
9
An Example
  • Transition probability matrix P (a numerical check
    appears after this list)
  • p^(0) = (0, 1, 0)
  • p^(2) = p^(0) P^2 = (0.375, 0.25, 0.375)
  • p^(7) = p^(0) P^7 ≈ (0.4, 0.2, 0.4)
  • p^(8) = p^(7) P ≈ (0.4, 0.2, 0.4)
  • Starting instead from p^(0) = (1, 0, 0)
  • p^(2) = (0.4375, 0.1875, 0.375)
  • p^(7) ≈ (0.4, 0.2, 0.4)
  • (0.4, 0.2, 0.4) is a stationary distribution
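The matrix on this slide did not survive into the transcript; the 3x3 matrix below is an assumed reconstruction that reproduces the numbers quoted above, used only to illustrate how p^(n) = p^(0) P^n converges to the stationary distribution:

import numpy as np

# Assumed transition matrix (rows sum to 1), consistent with the slide's numbers.
P = np.array([[0.50, 0.25, 0.25],
              [0.50, 0.00, 0.50],
              [0.25, 0.25, 0.50]])

for p0 in (np.array([0.0, 1.0, 0.0]), np.array([1.0, 0.0, 0.0])):
    p = p0.copy()
    for n in range(1, 9):
        p = p @ P                      # p^(n) = p^(n-1) P
        if n in (2, 7, 8):
            print(n, p.round(4))       # approaches (0.4, 0.2, 0.4)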

10
Stationary Distribution
  • A Markov chain may reach a stationary
    distribution π, where the vector of
    probabilities of being in any particular
    state is independent of the initial condition.
    The stationary distribution satisfies
  • π = πP
  • A sufficient condition is that the detailed
    balance equation π(x) P(x → x') = π(x') P(x' → x)
    holds for all pairs of states, i.e. the Markov
    chain is reversible (see the derivation below)
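Why detailed balance is sufficient (a standard one-line argument): summing the detailed balance equation over x gives

\sum_x \pi(x)\,P(x \to x') = \sum_x \pi(x')\,P(x' \to x) = \pi(x') \sum_x P(x' \to x) = \pi(x'),

which is exactly π = πP written component-wise.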

11
Metropolis-Hastings
  • Intuitively, it samples points by following a
    carefully constructed Markov chain. Once the chain
    reaches its stationary distribution, the points it
    visits are samples from the target distribution.

12
Algorithm
  • It uses a simple proposal distribution q(x'|x)
    as the transition probability
  • It generates a new point x' from the current x,
    and accepts it with probability
    α = min(1, P(x') q(x|x') / (P(x) q(x'|x)));
    otherwise the chain stays at x (a minimal sketch
    follows below)
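A minimal Metropolis-Hastings sketch, assuming a 1-D target known only up to its normalizing constant and a Gaussian random-walk proposal (the target, step size, and sample count are illustrative assumptions):

import math
import random

def p_star(x):
    # Unnormalized target P*(x) = Z * P(x); a toy two-mode density (assumed).
    return math.exp(-0.5 * (x - 2) ** 2) + math.exp(-0.5 * (x + 2) ** 2)

def metropolis_hastings(n_samples=10000, step=1.0):
    x = 0.0
    samples = []
    for _ in range(n_samples):
        x_new = x + random.gauss(0, step)            # symmetric proposal q(x'|x)
        # Acceptance probability; q cancels because the proposal is symmetric.
        a = min(1.0, p_star(x_new) / p_star(x))
        if random.random() < a:
            x = x_new                                # accept x'
        samples.append(x)                            # otherwise keep the current x
    return samples

samples = metropolis_hastings()
print(sum(samples) / len(samples))                   # close to 0 by symmetry of the target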

13
An Example
14
Proof
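A standard sketch of the argument (the slide's own equations are not in the transcript): for x ≠ x', the Metropolis-Hastings kernel T(x → x') = q(x'|x) α(x, x') satisfies detailed balance with respect to the target P, since

P(x)\,T(x \to x') = P(x)\,q(x'\mid x)\,\min\!\left(1,\; \frac{P(x')\,q(x\mid x')}{P(x)\,q(x'\mid x)}\right) = \min\bigl(P(x)\,q(x'\mid x),\; P(x')\,q(x\mid x')\bigr),

and the right-hand side is symmetric in x and x', so P(x) T(x → x') = P(x') T(x' → x). By the detailed-balance argument on the Stationary Distribution slide, P is therefore the stationary distribution of the chain.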
15
MCMC for optimization?
  • To find the peak of a distribution, we could use
    MCMC to sample a series of points, evaluate their
    densities, and take the highest one as the peak
  • This is inefficient, because many points are
    sampled in regions of low fitness. Intuitively, we
    would rather exponentially enlarge the difference
    between high and low fitness, for example by
    sampling from P(x)^(1/T) with a small temperature T

16
Simulated Annealing
  • Further suppose we use a symmetric proposal
    distribution, such as a Gaussian distribution;
    the acceptance probability then becomes
    α = min(1, exp(-(E(x') - E(x)) / T)),
    where E(x) = -log P(x) plays the role of an energy
  • This is exactly the simulated annealing
    algorithm, where T is the cooling temperature
    (see the sketch after this list)
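A minimal simulated-annealing sketch, assuming a toy 1-D energy, a Gaussian proposal, and a geometric cooling schedule (all illustrative assumptions, not from the slides):

import math
import random

def energy(x):
    # Toy energy to minimize (assumed); global minimum near x = 3.
    return (x - 3) ** 2 + math.sin(5 * x)

def simulated_annealing(t_start=5.0, t_end=1e-3, cooling=0.995, step=0.5):
    x = random.uniform(-10, 10)
    t = t_start
    while t > t_end:
        x_new = x + random.gauss(0, step)            # symmetric Gaussian proposal
        delta = energy(x_new) - energy(x)
        # Accept with probability min(1, exp(-(E(x') - E(x)) / T)).
        if delta <= 0 or random.random() < math.exp(-delta / t):
            x = x_new
        t *= cooling                                 # lower the temperature
    return x

print(simulated_annealing())                         # near the global minimum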

17
The End
  • Thank You