# Markov Chain Monte Carlo and Gibbs Sampling

Slides: 18
Provided by: VasileiosH2
Transcript and Presenter's Notes

1
Markov Chain Monte Carlo and Gibbs Sampling
• Vasileios Hatzivassiloglou
• University of Texas at Dallas

2
Markov chains
• A subtype of random walks (not necessarily
uniform) where the entire memory of the system is
contained in the current state
• Described by a transition matrix P, where p_ij is
the probability of going from state i to state j
• Very useful for describing stochastic discrete
systems
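A minimal sketch of what "the entire memory is in the current state" means in practice: each step draws the next state using only the row of P for the current state. The transition values below are made up for illustration and are not taken from the slides.

```python
import random

# Hypothetical transition matrix over the four nucleotides.
# P[i][j] is the probability of moving from state i to state j;
# every row sums to 1.
P = {
    "A": {"A": 0.4, "C": 0.2, "G": 0.3, "T": 0.1},
    "C": {"A": 0.1, "C": 0.5, "G": 0.2, "T": 0.2},
    "G": {"A": 0.3, "C": 0.2, "G": 0.3, "T": 0.2},
    "T": {"A": 0.2, "C": 0.3, "G": 0.1, "T": 0.4},
}


def simulate(P, start, steps, rng):
    """Walk the chain for `steps` transitions, starting from `start`."""
    state = start
    path = [state]
    for _ in range(steps):
        next_states = list(P[state])
        weights = [P[state][s] for s in next_states]
        # The draw depends only on the current state, not on the path so far.
        state = rng.choices(next_states, weights=weights)[0]
        path.append(state)
    return "".join(path)


rng = random.Random(0)
sequence = simulate(P, "A", 20, rng)
print(sequence)
```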

3
Markov chain example
[Figure: Markov chain with four states A, C, G, T; each of the 16 edges, self-loops included, is labeled with its transition probability p_XY, from p_AA to p_TT]

4
Example application of Markov chains
• Markov chains model dependencies across time or
position
• The model assigns a probability to each sequence
of observed data, so it can be used to measure how
likely an observed sequence is under the model
• The GeneMark algorithm uses 5th-order Markov
chains (why?) to find genes (distinguishing them
from other regions in the DNA)
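As an illustration of scoring a sequence under a model, the sketch below computes the log-probability of a DNA string under a first-order Markov chain and compares two models. It is deliberately first-order for brevity (not the 5th-order chains GeneMark uses), and both models (`gc_rich` and `uniform`) are hypothetical.

```python
import math

BASES = "ACGT"

# Two hypothetical first-order models, for illustration only.
uniform = {a: {b: 0.25 for b in BASES} for a in BASES}
gc_rich = {a: {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1} for a in BASES}
initial = {b: 0.25 for b in BASES}  # assumed start distribution


def log_prob(seq, initial, P):
    """Log-probability of `seq` under a first-order Markov chain."""
    lp = math.log(initial[seq[0]])
    for prev, cur in zip(seq, seq[1:]):
        lp += math.log(P[prev][cur])
    return lp


def log_odds(seq):
    """Positive values mean `seq` fits the GC-rich model better."""
    return log_prob(seq, initial, gc_rich) - log_prob(seq, initial, uniform)


print(log_odds("CGCGCG"))
print(log_odds("ATATAT"))
```

A log-odds score of this kind is one common way to decide which of two models a region more plausibly came from.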

5
Markov chains: stationary distribution
• Under general assumptions (irreducibility and
aperiodicity), Markov chains with finitely many
states have a unique stationary distribution π;
every row of P^k converges to π as k goes to
infinity
• An irreducible MC can reach any state from any
other state
• An aperiodic MC is not restricted to returning to
a state only at multiples of some period greater
than 1
• Such a Markov chain is called ergodic
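The convergence of P^k can be checked numerically. The sketch below uses a small two-state transition matrix chosen purely for illustration and raises it to a high power with plain Python lists.

```python
def mat_mul(X, Y):
    """Multiply two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_power(P, k):
    """Compute P^k by repeated multiplication, starting from the identity."""
    n = len(P)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = mat_mul(result, P)
    return result


# Hypothetical irreducible, aperiodic chain on two states.
P = [[0.9, 0.1],
     [0.5, 0.5]]

Pk = mat_power(P, 50)
pi = Pk[0]  # after many steps every row is (approximately) the same

# Stationarity check: pi * P should give pi back.
pi_next = [sum(pi[i] * P[i][j] for i in range(2)) for j in range(2)]
print(pi, pi_next)
```

For this matrix the rows approach (5/6, 1/6), which can also be obtained by solving πP = π by hand.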

6
Markov Chain Monte Carlo
• Used as a means to guide the selection of samples
• We want the stationary distribution of the Markov
chain to be the distribution we are sampling from
• In many cases, this is relatively easy to do

7
Gibbs sampling
• A special case of MCMC for problems where the
conditional distribution of each variable given
the others can be calculated easily, but the joint
distribution is hard to sample from directly
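A standard textbook illustration (not the problem from these slides) is a bivariate normal with correlation rho: each conditional is a one-dimensional normal, so alternating draws from x given y and y given x yields samples from the joint distribution. The function name and parameters below are assumptions made for this sketch.

```python
import math
import random


def gibbs_bivariate_normal(rho, n_samples, burn_in, rng):
    """Gibbs sampler for a standard bivariate normal with correlation rho."""
    x, y = 0.0, 0.0
    sd = math.sqrt(1.0 - rho * rho)  # std. dev. of each conditional
    samples = []
    for step in range(burn_in + n_samples):
        x = rng.gauss(rho * y, sd)  # draw x | y ~ N(rho * y, 1 - rho^2)
        y = rng.gauss(rho * x, sd)  # draw y | x ~ N(rho * x, 1 - rho^2)
        if step >= burn_in:
            samples.append((x, y))
    return samples


rng = random.Random(42)
samples = gibbs_bivariate_normal(rho=0.8, n_samples=20000, burn_in=500, rng=rng)

mean_x = sum(x for x, _ in samples) / len(samples)
mean_xy = sum(x * y for x, y in samples) / len(samples)
print(mean_x, mean_xy)  # should be close to 0 and to rho, respectively
```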

8
Gibbs sampling in our problem
• Start with a set S of strings, each S_i chosen
randomly and uniformly from input sequence i
• Calculate A and D(A||B) for S
• Choose one member of S randomly to remove
• Choose an alternative (from the corresponding
sequence) with probability proportional to the
corresponding D(A||B)
• Repeat until D(A||B) converges
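The slides do not spell out how D(A||B) is computed. A minimal sketch, assuming A is the column-wise letter frequency profile of the strings in S, B is a fixed background distribution, and D(A||B) is the relative entropy summed over columns:

```python
import math
from collections import Counter


def relative_entropy(strings, background):
    """Sum over alignment columns of sum_c A(c) * log2(A(c) / B(c)).

    `strings` are equal-length strings; `background` maps each letter
    to its background probability (an assumed input for this sketch).
    """
    k = len(strings)
    total = 0.0
    for column in zip(*strings):
        for base, count in Counter(column).items():
            freq = count / k  # profile entry A(c) for this column
            total += freq * math.log2(freq / background[base])
    return total


background = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}

conserved = relative_entropy(["ACGT", "ACGT", "ACGT"], background)
scattered = relative_entropy(["AAAA", "CCCC", "GGGG", "TTTT"], background)
print(conserved, scattered)
```

Perfectly conserved columns score highest against a uniform background, while columns that match the background contribute nothing.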

9
Exploring alternative strings
• When we replace a string from sequence i
• We examine in turn each of the m-n+1 strings that
that sequence could offer
• For each such string, we add it temporarily to S
and calculate the new A and D(A||B)
• Then we assign to each string S_ij (j varies
across these strings) probability
$P(S_{ij}) = D_{ij} / \sum_{j'} D_{ij'}$, where
D_ij is the D(A||B) obtained with S_ij in S
• May pick a worse string, or the same string we
just removed
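The replacement step can be sketched as follows. The scoring function assumes the column-wise relative entropy against a fixed background, and all names (`resample_string`, `background`, the toy inputs) are hypothetical choices for this example.

```python
import math
import random
from collections import Counter


def relative_entropy(strings, background):
    """Column-wise relative entropy of the profile of `strings` (assumed form)."""
    k = len(strings)
    total = 0.0
    for column in zip(*strings):
        for base, count in Counter(column).items():
            freq = count / k
            total += freq * math.log2(freq / background[base])
    return total


def resample_string(S, i, sequence, n, background, rng):
    """Replace S[i] by a length-n window of `sequence`, chosen with
    probability proportional to the D(A||B) it would produce."""
    rest = S[:i] + S[i + 1:]  # S with string i removed
    # All m-n+1 windows, including the one that was just removed.
    candidates = [sequence[j:j + n] for j in range(len(sequence) - n + 1)]
    # Clamp at 0 to guard against tiny negative rounding errors.
    weights = [max(relative_entropy(rest + [c], background), 0.0)
               for c in candidates]
    if sum(weights) == 0:
        choice = rng.choice(candidates)  # all windows equally (un)informative
    else:
        choice = rng.choices(candidates, weights=weights)[0]
    return S[:i] + [choice] + S[i + 1:]


background = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
S = ["ACGT", "ACGT", "TTTT"]
new_S = resample_string(S, 2, "GGACGTGG", 4, background, random.Random(1))
print(new_S)
```

Because the draw is weighted rather than greedy, a lower-scoring window can still be selected, which is the behavior the last bullet describes.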

10
Gibbs sampler convergence
• Return best S seen across all iterations (may not
be the last one)
• Stop after a fixed number of iterations, or when
D(A||B) does not change very much
• Solution is sensitive to the starting S, so we
typically run the algorithm several (thousand)
times from different starting points
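The restart strategy can be sketched independently of the sampler itself. Here `run_once` is a hypothetical callable standing in for one full sampler run that returns the best (solution, score) pair it saw, and `toy_run` is a made-up stand-in used only to exercise the wrapper.

```python
import random


def best_of_restarts(run_once, n_restarts, rng):
    """Run `run_once(rng)` several times and keep the highest-scoring result."""
    best_solution, best_score = None, float("-inf")
    for _ in range(n_restarts):
        solution, score = run_once(rng)
        if score > best_score:
            best_solution, best_score = solution, score
    return best_solution, best_score


def toy_run(rng, iterations=200):
    """Stand-in for one run: a random walk over 0..99 that remembers
    the best-scoring position visited, not the last one."""
    def score(x):
        return -(x - 70) ** 2

    x = rng.randrange(100)  # random starting point
    best_x, best = x, score(x)
    for _ in range(iterations):
        x = min(99, max(0, x + rng.choice((-1, 1))))
        if score(x) > best:
            best_x, best = x, score(x)
    return best_x, best


rng = random.Random(7)
solution, score = best_of_restarts(toy_run, 50, rng)
print(solution, score)
```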

11
Complexity of Gibbs sampler
• Construct initial S and calculate A and D(A||B)
in O(kn) time
• Each iterative step takes O(n) time to remove a
string and recalculate D(A||B), and O(mn) time to
calculate the probabilities of the m-n+1
alternatives
• Total time is O(mnd), where d is the number of
iteration steps (dm >> k), multiplied by the number
of random restarts

12
Why Gibbs sampling works
• Retains elements of the greedy approach
• Weighting by relative entropy makes it likely to
move towards locally better solutions
• Allows for locally bad moves with a small
probability, to escape local maxima

13
Variations in Gibbs sampling
• Discard substrings non-uniformly (weighted by
relative entropy, analogous to subsequent
selection of new string)
• Use simulated annealing to reduce the chance of
converging to a local maximum

14
Annealing
• Annealing is a process in metallurgy for
improving metals by increasing crystal size and
reducing defects
• The process works by heating the metal and
controlled cooling which lets the atoms go
through a series of states with gradually lower
internal energy
• The goal is to have the metal settle in a
configuration with lower internal energy than
the original

15
Simulated Annealing
• Simulated annealing (SA) adopts an energy
function equal to the function we want to
minimize
• Transitions between neighboring states are
adopted with probability specified by a function
f of the energy gain ΔE = E_new - E_old and the
temperature T
• f(ΔE,T) > 0 even if ΔE > 0, but as T → 0, f(ΔE,T) → 1 if
ΔE < 0 (or to cΔE for some constant c) and
f(ΔE,T) → 0 for ΔE > 0

16
Simulated Annealing
• The original f comes from the Metropolis-Hastings
algorithm
• T controls acceptance of locally bad solutions
• The annealing schedule is a process for gradually
reducing T so that eventually only good moves are
accepted
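A minimal sketch of simulated annealing, assuming the standard Metropolis acceptance rule (accept with probability 1 if ΔE <= 0, otherwise exp(-ΔE/T)) and a geometric cooling schedule. The slides do not give the schedule, so `t_start`, `cooling`, and the toy energy function are assumptions made for illustration.

```python
import math
import random


def accept_probability(delta_e, temperature):
    """Metropolis rule: improvements are always accepted; worse moves
    are accepted with probability exp(-delta_e / temperature)."""
    if delta_e <= 0:
        return 1.0
    if temperature <= 0:
        return 0.0  # at T = 0 only non-worsening moves are accepted
    return math.exp(-delta_e / temperature)


def simulated_annealing(energy, neighbor, start, t_start, cooling, steps, rng):
    """Minimize `energy`, multiplying the temperature by `cooling` each step."""
    state, e = start, energy(start)
    best_state, best_e = state, e
    t = t_start
    for _ in range(steps):
        candidate = neighbor(state, rng)
        e_candidate = energy(candidate)
        if rng.random() < accept_probability(e_candidate - e, t):
            state, e = candidate, e_candidate
            if e < best_e:
                best_state, best_e = state, e
        t *= cooling  # annealing schedule: gradually reduce T
    return best_state, best_e


def energy(x):
    """Toy energy with its minimum at x = 42."""
    return (x - 42) ** 2


def neighbor(x, rng):
    """Move one step left or right, staying within 0..99."""
    return min(99, max(0, x + rng.choice((-1, 1))))


rng = random.Random(3)
best_state, best_e = simulated_annealing(
    energy, neighbor, start=0, t_start=10.0, cooling=0.999, steps=5000, rng=rng)
print(best_state, best_e)
```

Early on, the high temperature lets the walk accept some uphill moves; as T shrinks, `accept_probability` for any worsening move falls towards zero.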

17
Special cases
• If T is always zero,
• simulated annealing reduces to greedy local
optimization
• If T is constant but non-zero,
• simulated annealing reduces to the process we
described for Gibbs sampling (choose solutions
randomly with probability proportional to their
improvement)