Transcript and Presenter's Notes

Title: Retrospective Markov chain Monte Carlo methods for Dirichlet process hierarchical models


1
Retrospective Markov chain Monte Carlo methods
for Dirichlet process hierarchical models
  • Omiros Papaspiliopoulos and Gareth O. Roberts

Presented by Yuting Qi, ECE Dept., Duke Univ., 10/06/06
2
Overview
  • DP hierarchical models
  • Two Gibbs samplers
  • Pólya urn sampler (Escobar & West, 1994, 1995)
  • Blocked Gibbs sampler (Ishwaran, 2000)
  • Retrospective sampling
  • MCMC for DP with Retrospective sampling
  • Performance
  • Conclusions

3
DP Mixture Models (1)
  • DP mixture models (DMMs)
  • Assume Yi is drawn from a parametric distribution
    with parameters Xi and θ.
  • All Xi have one common prior P; some Xi may take
    the same value.
  • Prior distribution: P ~ DP(α, Hθ).
  • Property of the Pólya urn scheme:
  • Marginalizing out P gives the predictive
    X_{n+1} | X1, ..., Xn ~ (α Hθ + Σi δXi) / (α + n).
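A minimal Python sketch of the Pólya urn predictive above (not from the slides; the N(0, 1) base draw and α = 1 are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)

def polya_urn_draws(n, alpha, base_draw):
    """Draw X1..Xn from the Polya urn predictive obtained by
    marginalizing P ~ DP(alpha, H): X_{i+1} is a fresh draw from H
    with probability alpha/(alpha + i), and otherwise repeats a
    uniformly chosen earlier X (so ties occur with positive prob.)."""
    xs = []
    for i in range(n):
        if rng.random() < alpha / (alpha + i):
            xs.append(base_draw())            # new value from base measure H
        else:
            xs.append(xs[rng.integers(i)])    # copy an existing value
    return np.array(xs)

# Illustration: many Xi share the same value, as the slide notes.
x = polya_urn_draws(100, alpha=1.0, base_draw=rng.standard_normal)
print(len(np.unique(x)), "distinct values among", len(x), "draws")
```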

4
DP Mixture Models (2)
  • Explicit form of the DP (Sethuraman, 1994):
    P = Σj pj δZj, where pj = Vj Π_{l<j}(1 − Vl),
    Vj ~ Beta(1, α), Zj ~ Hθ.
  • Relationship between α and the stick lengths:
  • α is large ⇒ Vj ~ Beta(1, α) is typically small ⇒
    small pj, many sticks of short length ⇒ P
    consists of infinitely many Zj with small pj
    ⇒ P → Hθ.
  • α → 0 ⇒ Vj ~ Beta(1, α) is typically large ⇒ few
    large sticks ⇒ P has a large mass on a small
    subset of the Zj ⇒ most Xi share the same value.
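A minimal Python sketch of the stick-breaking construction above (the truncation level and the N(0, 1) base draw are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)

def stick_breaking(alpha, n_sticks, base_draw):
    """Truncated stick-breaking draw from DP(alpha, H):
    Vj ~ Beta(1, alpha), pj = Vj * prod_{l<j}(1 - Vl), Zj ~ H."""
    v = rng.beta(1.0, alpha, size=n_sticks)
    p = v * np.concatenate(([1.0], np.cumprod(1.0 - v)[:-1]))
    z = np.array([base_draw() for _ in range(n_sticks)])
    return p, z

# Large alpha -> many short sticks (P close to H);
# small alpha -> a few long sticks (most Xi share a value).
for alpha in (0.1, 10.0):
    p, _ = stick_breaking(alpha, 50, rng.standard_normal)
    print(f"alpha={alpha}: largest weight = {p.max():.3f}")
```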
5
West's Gibbs Sampler (1)
  • Estimation of the joint posterior
  • Sample from the full conditional distributions
  • Hθ,0: the posterior obtained by updating the prior Hθ
    via the likelihood function.

6
West's Gibbs Sampler (2)
  • Gibbs sampling scheme
  • Sampling Xi is equivalent to sampling the indicator Ki
    (Ki = k means Xi takes the value Xk): given the old
    {Ki}i=1..n and {Xk}k=1..K, generate a new Ki from its
    posterior (a sketch follows this list).
  • For Ki = 0, draw a new Xi from Hθ,0.
  • For Ki > 0, generate a new set of Xk according to their
    posteriors.
  • Drawbacks
  • Converges slowly.
  • Difficult to implement when Hθ and the likelihood
    are not conjugate.
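A minimal sketch of one indicator update in this scheme, assuming a conjugate model where the likelihood f(y | x) and its marginal under Hθ can be evaluated (`lik` and `marg_lik` are hypothetical model-supplied helpers):

```python
import numpy as np

rng = np.random.default_rng(2)

def sample_indicator(y_i, x_star, counts, alpha, lik, marg_lik):
    """One Polya-urn Gibbs update of Ki: P(Ki = k) is proportional to
    n_k * f(y_i | Xk) for an existing component k, and to
    alpha * (integral of f(y_i | x) dH(x)) for Ki = 0 (a new component).
    `lik(y, x)` and `marg_lik(y)` must be supplied by the model."""
    w = np.array([alpha * marg_lik(y_i)] +
                 [n * lik(y_i, x) for n, x in zip(counts, x_star)])
    k = rng.choice(len(w), p=w / w.sum())
    return k   # k = 0: open a new component; k >= 1: join component k
```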

7
Blocked Gibbs Sampler (1)
  • Stick-breaking representation
  • Estimation of joint posterior
  • Update P in each Gibbs iteration
  • No Xi involved
  • Must truncate at a finite level K

8
Blocked Gibbs Sampler (2)
  • Sampling scheme
  • Sample Zj
  • For those j occupied by some Xi, sample Zj from its
    conditional posterior.
  • For those j not occupied by any Xi, sample Zj from
    the base prior Hθ (a sketch follows the note below).
  • Sample K from its conditional posterior.
  • Sample p from its conditional posterior.

pk,j is the posterior of pk updated by the likelihood.
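A minimal sketch of the Zj update above (`post_draw`, the model-specific conditional-posterior sampler, is a hypothetical helper):

```python
import numpy as np

def update_atoms(z, labels, y, post_draw, base_draw):
    """Blocked-Gibbs update of the truncated atoms Z1..ZK:
    atoms occupied by some Xi are redrawn from their conditional
    posterior given the data assigned to them; unoccupied atoms
    are redrawn from the base prior H."""
    for j in range(len(z)):
        y_j = y[labels == j]                      # data with Ki = j
        z[j] = post_draw(y_j) if len(y_j) else base_draw()
    return z
```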
9
Retrospective Sampling (1)
  • Retrospective sampling
  • In the blocked Gibbs sampler, given the pj we sample Ki
    and set Xi = ZKi; sampling the infinitely many pairs
    (pj, Zj) up front is not feasible.
  • To sample Ki, first generate Ui from Uniform(0, 1), then
    set Ki = j iff Σ_{l<j} pl < Ui ≤ Σ_{l≤j} pl.
  • Retrospective sampling exchanges the order of sampling
    Ui and sampling the pairs (pj, Zj).
  • If, for a given Ui, more pj are needed than we currently
    have, simulate further pairs (pj, Zj) retrospectively
    until Σ_{l≤j} pl ≥ Ui is satisfied.

10
Retrospective Sampling (2)
  • Algorithm
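A minimal sketch of the retrospective step described on the previous slide, with the (Vj, Zj) pairs grown lazily (the Beta and base draws mirror the stick-breaking form; a 0-based index is an implementation choice):

```python
import numpy as np

rng = np.random.default_rng(3)

def sample_k_retrospective(u, v, z, alpha, base_draw):
    """Inverse-CDF draw of Ki given Ui = u: return the first j with
    sum_{l<=j} pl >= u, simulating new (Vj, Zj) pairs retrospectively
    whenever the sticks generated so far do not cover u."""
    j, cum, remaining = 0, 0.0, 1.0
    while True:
        if j == len(v):                        # not enough sticks yet:
            v.append(rng.beta(1.0, alpha))     # extend retrospectively
            z.append(base_draw())
        p_j = v[j] * remaining                 # pj = Vj * prod_{l<j}(1 - Vl)
        cum += p_j
        if u <= cum:
            return j                           # Ki = j
        remaining *= 1.0 - v[j]
        j += 1

v, z = [], []                                  # lazily grown pairs (Vj, Zj)
k = sample_k_retrospective(rng.random(), v, z, alpha=1.0,
                           base_draw=rng.standard_normal)
print("Ki =", k, "after generating", len(v), "sticks")
```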

11
MCMC for DMMs (1)
  • MCMC with retrospective sampling
  • Notation

12
MCMC for DMMs (2)
  • Sampling Scheme
  • Sample Zj
  • Sample p from its conditional posterior
  • Sample K via retrospective sampling.

13
MCMC for DMMs (3)
  • Sampling K
  • Using Metropolis-Hastings steps.
  • When updating Ki, the sampler proposes a move from k
    to k(i, j).
  • The proposed j is generated from a proposal distribution
    in which Mi is a constant that controls the probability
    of proposing a j greater than max k (a hedged sketch
    follows).
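The exact proposal distribution appears as a formula in the original slides and is not reproduced here; below is one illustrative stand-in, assuming existing labels are proposed in proportion to their weights while labels beyond max k receive total mass governed by Mi with a geometric tail (`geom_q` is an assumed tuning choice, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(4)

def propose_label(weights, m_i, geom_q=0.5):
    """Illustrative proposal for j (NOT the paper's exact form):
    propose an existing label 0..k_max-1 with probability
    proportional to its weight, or, with total mass governed by
    m_i, a fresh label >= k_max with geometrically decaying prob."""
    weights = np.asarray(weights, dtype=float)
    k_max = len(weights)
    if rng.random() < weights.sum() / (weights.sum() + m_i):
        return rng.choice(k_max, p=weights / weights.sum())
    return k_max + rng.geometric(geom_q) - 1   # k_max, k_max+1, ...
```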
14
MCMC for DMMs (4)
  • Algorithm

15
Performance
  • lepto data set (unimodal): 0.67 N(0, 1) + 0.33 N(0.3, 0.25²)
  • bimod data set (bimodal): 0.5 N(−1, 0.5²) + 0.5 N(1, 0.5²)
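The two test densities can be simulated directly (a small sketch; the sample size of 500 is illustrative):

```python
import numpy as np

rng = np.random.default_rng(5)

def sample_lepto(n):
    """lepto: 0.67 N(0, 1) + 0.33 N(0.3, 0.25^2), unimodal."""
    pick = rng.random(n) < 0.67
    return np.where(pick, rng.normal(0.0, 1.0, n), rng.normal(0.3, 0.25, n))

def sample_bimod(n):
    """bimod: 0.5 N(-1, 0.5^2) + 0.5 N(1, 0.5^2), bimodal."""
    pick = rng.random(n) < 0.5
    return np.where(pick, rng.normal(-1.0, 0.5, n), rng.normal(1.0, 0.5, n))

y_lepto, y_bimod = sample_lepto(500), sample_bimod(500)
```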

Autocorrelation time: a standard way to measure the speed
of convergence, i.e., how well the algorithm explores the
high-dimensional model space.
16
Performance
17
Conclusions & Comments
  • Conclusions
  • The retrospective methodology is applied to the DP,
    avoiding the truncation approximation.
  • Robust to large datasets.
  • Comments
  • One of the wordiest and worst-organized papers
    I've read.