1
Markov Chain Monte Carlo and Gibbs Sampling
  • Vasileios Hatzivassiloglou
  • University of Texas at Dallas

2
Markov chains
  • A subtype of random walks (not necessarily
    uniform) where the entire memory of the system is
    contained in the current state
  • Described by a transition matrix P, where pij is
    the probability of going from state i to state j
  • Very useful for describing stochastic discrete
    systems
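To make the transition-matrix view concrete, here is a minimal Python sketch (not from the slides; the toy two-state matrix is an assumption for illustration) that simulates a chain using only the current state:

```python
import numpy as np

# Toy 2-state transition matrix (an assumption, not from the slides);
# row i holds the probabilities p_ij of moving from state i to state j.
P = np.array([[0.9, 0.1],
              [0.4, 0.6]])

def simulate(P, start, steps, rng=np.random.default_rng(0)):
    # The Markov property: the next state is drawn using only the
    # current state's row of P, never the earlier history.
    state, path = start, [start]
    for _ in range(steps):
        state = rng.choice(len(P), p=P[state])
        path.append(int(state))
    return path

print(simulate(P, start=0, steps=10))
```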

3
Markov chain example
[Diagram: a four-state Markov chain over the nucleotides A, C, G, and T; each ordered pair of states (X, Y) is connected by an arrow labeled with its transition probability pXY, e.g. pAC, pCG, pTT.]
4
Example application of Markov chains
  • Markov chains model dependencies across time or
    position
  • The model assigns a probability to each sequence
    of observed data, so it can measure how likely an
    observed sequence is under the model (sketched
    below)
  • The GeneMark algorithm uses 5th-order Markov
    chains (why?) to find genes (distinguishing them
    from other regions in the DNA)
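As an illustration of "assigning a probability to each sequence", here is a hedged Python sketch (not from the slides): a k-th order Markov model trained by counting contexts and used to score a sequence by log-likelihood. GeneMark compares such scores under models trained on coding and non-coding DNA; the toy training data below is an assumption.

```python
import math
from collections import defaultdict

def train_markov(seqs, k):
    """Estimate a k-th order Markov chain: for every length-k context,
    count which symbol follows it, then normalize to probabilities."""
    counts = defaultdict(lambda: defaultdict(int))
    for s in seqs:
        for i in range(k, len(s)):
            counts[s[i - k:i]][s[i]] += 1
    return {ctx: {sym: c / sum(nxt.values()) for sym, c in nxt.items()}
            for ctx, nxt in counts.items()}

def log_likelihood(model, s, k):
    """Log-probability of s given its first k symbols; unseen contexts
    would need smoothing, omitted here for brevity."""
    return sum(math.log(model[s[i - k:i]][s[i]]) for i in range(k, len(s)))

# Toy training data (an assumption); GeneMark uses k = 5 on real DNA.
coding = train_markov(["ATGGCGATGGCGATGGCG"], k=2)
print(log_likelihood(coding, "ATGGCGAT", k=2))
```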

5
Markov chains: stationary distribution
  • Under general assumptions (irreducibility and
    aperiodicity), a Markov chain has a stationary
    distribution π: every row of P^k converges to π
    as k goes to infinity (checked numerically below)
  • An irreducible MC can reach any state from any
    other state
  • An aperiodic MC does not revisit any state only
    at multiples of a fixed period
  • Such a Markov chain is called ergodic
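A quick numerical check of this claim, a sketch reusing the toy matrix from the earlier example (the matrix itself is an assumption):

```python
import numpy as np

P = np.array([[0.9, 0.1],
              [0.4, 0.6]])

# For an ergodic chain every row of P^k approaches the stationary
# distribution π; here both rows converge to ~[0.8, 0.2].
Pk = np.linalg.matrix_power(P, 50)
print(Pk)

# π is a fixed point of the chain: π P = π.
pi = Pk[0]
print(pi @ P)   # ~ unchanged
```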

6
Markov Chain Monte Carlo
  • Used as a means to guide the selection of samples
  • We want the stationary distribution of the Markov
    chain to be the distribution we are sampling from
  • In many cases, this is relatively easy to do

7
Gibbs sampling
  • A special case of MCMC for situations where the
    conditional distributions of individual variables
    are easy to compute, but the joint distribution
    is hard to sample from directly (illustrated
    below)
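A standard textbook illustration (not from the slides): a 2D Gaussian with correlation rho, where each conditional is a simple 1D normal, so the Gibbs sampler just alternates two easy draws.

```python
import numpy as np

def gibbs_2d_normal(rho, n, rng=np.random.default_rng(0)):
    """Gibbs sampling for a standard bivariate normal with
    correlation rho: x | y ~ N(rho*y, 1 - rho^2) and symmetrically
    for y | x, so each step only needs a 1D normal draw."""
    x = y = 0.0
    sd = np.sqrt(1.0 - rho ** 2)
    out = np.empty((n, 2))
    for t in range(n):
        x = rng.normal(rho * y, sd)   # draw from p(x | y)
        y = rng.normal(rho * x, sd)   # draw from p(y | x)
        out[t] = x, y
    return out

s = gibbs_2d_normal(rho=0.8, n=20000)
print(np.corrcoef(s.T)[0, 1])   # ~ 0.8, matching the target joint
```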

8
Gibbs sampling in our problem
  • Start with a single candidate S = S1,...,Sk where
    each Si is chosen randomly and uniformly from
    input sequence i
  • Calculate A and D(A||B) for S
  • Choose one member of S at random to remove
  • Choose an alternative (from the corresponding
    sequence) with probability proportional to the
    corresponding D(A||B)
  • Repeat until D(A||B) converges

9
Exploring alternative strings
  • When we replace a string from sequence i, we
    examine in turn each of the m - n + 1 strings
    that that sequence could offer
  • For each such string, we add it temporarily to S
    and calculate the new A and D(A||B)
  • Then we assign to each candidate string Sij (j
    ranges over these strings) probability
    proportional to its score:
    P(Sij) = Dj(A||B) / Σj' Dj'(A||B)
  • We may pick a worse string, or the same string we
    just removed (see the sketch below)
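Putting slides 8 and 9 together, a minimal Python sketch of the sampler. The function score, standing in for D(A||B) (the relative entropy of the profile A against the background B), is an assumed placeholder supplied by the caller.

```python
import numpy as np

def gibbs_motif_sampler(seqs, n, score, iters=1000,
                        rng=np.random.default_rng(0)):
    """seqs: the k input sequences; n: motif length; score(motifs)
    stands in for D(A||B) and must return a non-negative number."""
    # Initial S: one uniformly random length-n start per sequence.
    starts = [int(rng.integers(0, len(s) - n + 1)) for s in seqs]
    best, best_score = list(starts), float("-inf")
    for _ in range(iters):
        i = int(rng.integers(len(seqs)))        # member of S to replace
        scores = []
        for j in range(len(seqs[i]) - n + 1):   # all m - n + 1 options
            starts[i] = j                       # temporarily add option j
            motifs = [s[p:p + n] for s, p in zip(seqs, starts)]
            scores.append(score(motifs))
        probs = np.array(scores, dtype=float)
        probs /= probs.sum()                    # proportional to D(A||B)
        starts[i] = int(rng.choice(len(probs), p=probs))
        if scores[starts[i]] > best_score:      # keep the best S seen
            best, best_score = list(starts), scores[starts[i]]
    return best, best_score
```

A real implementation would update only the affected profile counts instead of rescoring every candidate from scratch, which is what gives the per-step costs quoted on slide 11.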

10
Gibbs sampler convergence
  • Return best S seen across all iterations (may not
    be the last one)
  • Stop after a fixed number of iterations, or when
    D(AB) does not change very much
  • Solution is sensitive to the starting S, so we
    typically run the algorithm several (thousand)
    times from different starting points

11
Complexity of Gibbs sampler
  • Construct the initial S and calculate A and
    D(A||B) in O(kn) time
  • Each iterative step takes O(n) time to remove a
    string and recalculate D(A||B), and O(mn) time to
    calculate the probabilities of the m - n + 1
    alternatives
  • Total time is O(mnd), where d is the number of
    iteration steps (dm >> k), multiplied by the
    number of random restarts

12
Why Gibbs sampling works
  • Retains elements of the greedy approach:
    weighting by relative entropy makes it likely to
    move towards locally better solutions
  • Allows locally bad moves with a small
    probability, to escape local maxima

13
Variations in Gibbs sampling
  • Discard substrings non-uniformly (weighted by
    relative entropy, analogous to the subsequent
    selection of the new string)
  • Use simulated annealing to reduce the chance of
    making a bad move (and gradually ensure
    convergence)

14
Annealing
  • Annealing is a process in metallurgy for
    improving metals by increasing crystal size and
    reducing defects
  • The process works by heating the metal followed
    by controlled cooling, which lets the atoms pass
    through a series of states with gradually lower
    internal energy
  • The goal is to have the metal settle in a
    configuration with lower internal energy than the
    original

15
Simulated Annealing
  • Simulated annealing (SA) adopts an energy
    function equal to the function we want to
    minimize
  • Transitions between neighboring states are
    accepted with probability given by a function f
    of the energy gain ΔE = Enew - Eold and the
    temperature T
  • f(ΔE, T) > 0 even if ΔE > 0; but as T → 0,
    f(ΔE, T) → 1 if ΔE < 0 (or to cΔE for some
    constant c) and f(ΔE, T) → 0 if ΔE > 0

16
Simulated Annealing
  • Original f (from the Metropolis-Hastings
    algorithm): f(ΔE, T) = min(1, e^(-ΔE/T))
  • T controls acceptance of locally bad solutions
  • The annealing schedule is a process for gradually
    reducing T so that eventually only good moves are
    accepted
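A hedged sketch of the acceptance rule just shown plus a cooling loop; the geometric schedule and its rate alpha are assumptions, since the slides do not fix a particular schedule.

```python
import math
import random

def metropolis_accept(dE, T):
    # Always accept improvements (dE < 0); accept a worsening move
    # with probability exp(-dE / T), which shrinks as T drops.
    return dE < 0 or random.random() < math.exp(-dE / T)

def simulated_annealing(x, energy, neighbor, T0=1.0, alpha=0.995,
                        steps=5000):
    """Generic SA loop; `energy` and `neighbor` are problem-specific
    callables supplied by the caller."""
    T = T0
    for _ in range(steps):
        x_new = neighbor(x)
        if metropolis_accept(energy(x_new) - energy(x), T):
            x = x_new
        T *= alpha   # annealing schedule: gradually reduce T
    return x
```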

17
Special cases
  • If T is always zero, simulated annealing reduces
    to greedy local optimization
  • If T is constant but non-zero, simulated
    annealing reduces to the process we described for
    Gibbs sampling (choose solutions randomly with
    probability proportional to their improvement)