# Introduction to Monte Carlo Markov chain (MCMC) methods

Slides: 28
Provided by: mat135

1
Lecture 3
• Introduction to Monte Carlo Markov chain (MCMC)
methods

2
Lecture Contents
• How does WinBUGS fit a regression?
• Gibbs Sampling
• Convergence and burn-in
• How many iterations?
• Logistic regression example

3
Linear regression
• To this model we added the following priors in
WinBUGS
• Ideally we would sample from the joint posterior
distribution

4
Linear Regression ctd.
• In this case we can sample from the joint
posterior as described in the last lecture
• However this is not the case for all models and
so we will now describe other simulation-based
methods that can be used.
• These methods come from a family of methods
called Markov chain Monte Carlo (MCMC) methods
and here we will focus on a method called Gibbs
Sampling.

5
MCMC Methods
• Goal: to sample from the joint posterior
distribution.
• Problem: for complex models this involves
multidimensional integration.
• Solution: it may be possible to sample from the
conditional posterior distributions instead.
• It can be shown that, after convergence, such a
sampling approach generates dependent samples
from the joint posterior distribution.

6
Gibbs Sampling
• When we can sample directly from the conditional
posterior distributions then such an algorithm is
known as Gibbs Sampling.
• This proceeds as follows for the linear
regression example
• First, give all unknown parameters starting
values.
• Next, loop through the following steps:

7
Gibbs Sampling ctd.
• Sample β0 from p(β0 | y, β1, σ²)
• Sample β1 from p(β1 | y, β0, σ²)
• Sample 1/σ² from p(1/σ² | y, β0, β1)

These steps are then repeated, with the
generated values from this loop replacing the
starting values. The chain of values produced by
this procedure is known as a Markov chain, and
it is hoped that this chain converges to its
equilibrium distribution, which is the joint
posterior distribution.
8
Calculating the conditional distributions
• In order for the algorithm to work we need to
sample from the conditional posterior
distributions.
• If these distributions have standard forms then
it is easy to draw random samples from them.
• Mathematically we write down the full posterior
and assume all parameters are constants apart
from the parameter of interest.
• We then try to match the resulting formulae to a
standard distribution.

9
Matching distributional forms
• If a parameter θ follows a Normal(µ, σ²)
distribution then we can write
p(θ) ∝ exp(-θ²/(2σ²) + θµ/σ²)
• Similarly, if θ follows a Gamma(α, β) distribution
(with β the rate) then we can write
p(θ) ∝ θ^(α-1) exp(-βθ)
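As a worked instance of this matching, consider the intercept β0 in the regression model y_i ~ Normal(β0 + β1 x_i, σ²). The prior β0 ~ Normal(0, s0²) is an assumption made purely for illustration (the priors shown on the original slide are not in this transcript), and s0², M and V are hypothetical symbols introduced here.

```latex
\begin{align*}
p(\beta_0 \mid y, \beta_1, \sigma^2)
 &\propto \exp\!\left(-\frac{\beta_0^2}{2 s_0^2}\right)
   \prod_{i=1}^{n} \exp\!\left(-\frac{(y_i - \beta_0 - \beta_1 x_i)^2}{2\sigma^2}\right) \\
 &\propto \exp\!\left(-\frac{1}{2}\left(\frac{n}{\sigma^2} + \frac{1}{s_0^2}\right)\beta_0^2
   + \frac{\sum_{i=1}^{n}(y_i - \beta_1 x_i)}{\sigma^2}\,\beta_0\right)
\end{align*}
% Match against the Normal kernel \exp(-\theta^2/(2V) + \theta M/V):
\begin{align*}
\frac{1}{V} &= \frac{n}{\sigma^2} + \frac{1}{s_0^2}, &
M &= V\,\frac{\sum_{i=1}^{n}(y_i - \beta_1 x_i)}{\sigma^2}
\end{align*}
% Hence \beta_0 \mid y, \beta_1, \sigma^2 \sim \mathrm{Normal}(M, V).
```

Collecting the coefficients of β0² and β0 is all that is needed; every term that does not involve β0 is absorbed into the proportionality constant.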

10
Step 1: β0

11
Step 2: β1

12
Step 3: 1/σ²

13
Algorithm Summary
• Repeat the following three steps
• 1. Generate β0 from its Normal conditional
distribution.
• 2. Generate β1 from its Normal conditional
distribution.
• 3. Generate 1/σ² from its Gamma conditional
distribution.
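The three steps can be sketched in Python as follows. This is a minimal illustration and not the WinBUGS implementation: the priors (Normal(0, prior_var) on both coefficients and Gamma(a, rate b) on the precision), the function name and all argument names are assumptions, since the priors on the original slides are not in this transcript.

```python
import numpy as np

def gibbs_linear_regression(x, y, n_iter=2000, prior_var=1e6, a=0.001, b=0.001, seed=0):
    """Gibbs sampler for y_i ~ Normal(beta0 + beta1 * x_i, 1/tau).

    Assumed priors (illustrative only): beta0, beta1 ~ Normal(0, prior_var)
    and tau = 1/sigma^2 ~ Gamma(shape=a, rate=b).
    Returns an (n_iter, 3) array with columns beta0, beta1, tau.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    beta0, beta1, tau = 0.0, 0.0, 1.0  # arbitrary starting values
    chain = np.empty((n_iter, 3))
    for t in range(n_iter):
        # Step 1: beta0 given everything else is Normal
        var0 = 1.0 / (n * tau + 1.0 / prior_var)
        mean0 = var0 * tau * np.sum(y - beta1 * x)
        beta0 = rng.normal(mean0, np.sqrt(var0))
        # Step 2: beta1 given everything else is Normal
        var1 = 1.0 / (tau * np.sum(x ** 2) + 1.0 / prior_var)
        mean1 = var1 * tau * np.sum(x * (y - beta0))
        beta1 = rng.normal(mean1, np.sqrt(var1))
        # Step 3: tau given everything else is Gamma(a + n/2, rate = b + SSE/2)
        resid = y - beta0 - beta1 * x
        rate = b + 0.5 * np.sum(resid ** 2)
        tau = rng.gamma(a + n / 2.0, 1.0 / rate)  # NumPy takes scale = 1/rate
        chain[t] = beta0, beta1, tau
    return chain

# Usage on simulated data (true values 1.0, 2.0 and sigma = 0.5 are made up)
data_rng = np.random.default_rng(123)
x_demo = data_rng.normal(size=100)
y_demo = 1.0 + 2.0 * x_demo + data_rng.normal(scale=0.5, size=100)
demo_chain = gibbs_linear_regression(x_demo, y_demo)
print(demo_chain[500:].mean(axis=0))  # posterior means after discarding 500 iterations
```

Each new value is drawn conditional on the most recent values of the other two parameters, which is what makes the output a Markov chain.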

14
Convergence and burn-in
• Two questions that immediately spring to mind
are
• We start from arbitrary starting values so when
can we safely say that our samples are from the
correct distribution?
• After this point how long should we run the chain
for and store values?

15
Checking Convergence
• This is the researcher's responsibility!
• Convergence is to a target distribution (the
required posterior), not to a single value as in
ML methods.
• Once convergence has been reached, samples should
look like a random scatter about a stable mean
value.

16
Convergence
• Convergence occurs here at around 100 iterations.

17
Checking convergence 2
• One approach (in WinBUGS) is to run many long
chains with widely differing starting values.
• WinBUGS also has the Brooks-Gelman-Rubin
diagnostic, which is based on the ratio of
between-chain to within-chain variances (as in
ANOVA). This ratio should approach 1.0 at
convergence.
• MLwiN has other diagnostics that we will cover on
Wednesday.
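A variance-ratio diagnostic of this kind can be sketched as below. This is the basic Gelman-Rubin form for a single parameter, written from the textbook definition; it is not claimed to match what WinBUGS computes internally, and the function name is hypothetical.

```python
import numpy as np

def gelman_rubin(chains):
    """Potential scale reduction factor for one parameter.

    `chains` is an array of shape (m, n): m chains of n iterations each.
    """
    chains = np.asarray(chains, dtype=float)
    m, n = chains.shape
    chain_means = chains.mean(axis=1)
    between = n * chain_means.var(ddof=1)        # between-chain variance B
    within = chains.var(axis=1, ddof=1).mean()   # average within-chain variance W
    pooled = (n - 1) / n * within + between / n  # pooled estimate of posterior variance
    return np.sqrt(pooled / within)

# Usage: four chains that agree, then four chains stuck in different places
rng = np.random.default_rng(0)
mixed = rng.normal(size=(4, 1000))
stuck = mixed + np.array([[0.0], [5.0], [10.0], [15.0]])
print(gelman_rubin(mixed), gelman_rubin(stuck))
```

Values close to 1 indicate that the chains are indistinguishable; values well above 1 indicate that the chains have not yet mixed.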

18
Demo of multiple chains in WinBUGS
• Here we transfer to the computer for a
demonstration with the regression example of
multiple chains (also mention node info)

19
Demo of multiple chains in WinBUGS
• Average 80% interval within chains (blue) and
pooled 80% interval between chains (green)
converge to stable values.
• Ratio pooled/average interval width (red)
converges to 1.
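The three plotted quantities can be sketched in Python for a single parameter. This computes them once over the whole run, whereas the plot tracks them as the run lengthens; the function name and its arguments are hypothetical.

```python
import numpy as np

def interval_ratio(chains, level=0.80):
    """Pooled width, average within-chain width and their ratio.

    `chains` has shape (m, n): m chains of n iterations each.
    """
    chains = np.asarray(chains, dtype=float)
    lo, hi = 50 * (1 - level), 50 * (1 + level)      # 10th and 90th percentiles for 80%
    q = np.percentile(chains, [lo, hi], axis=1)      # shape (2, m): one pair per chain
    within = np.mean(q[1] - q[0])                    # average within-chain interval width
    p_lo, p_hi = np.percentile(chains.ravel(), [lo, hi])
    pooled = p_hi - p_lo                             # width of the pooled interval
    return pooled, within, pooled / within

# Usage with three well-mixed stand-in chains
rng = np.random.default_rng(0)
print(interval_ratio(rng.normal(size=(3, 2000))))
```

When the chains overlap, pooling them does not widen the interval, so the ratio is close to 1.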

20
Convergence in more complex models
• Convergence in linear regression is (almost)
instantaneous.
• Here is an example of slower convergence

21
How many iterations after convergence?
• After convergence, further iterations are needed
to obtain samples for posterior inference.
• More iterations give more accurate posterior
estimates.
• MCMC chains are dependent samples and so the
dependence or autocorrelation in the chain will
influence how many iterations we need.
• Accuracy of the posterior estimates can be
assessed by the Monte Carlo standard error (MCSE)
for each parameter.
• Methods for calculating MCSE are given in later
lectures.
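One common way to estimate the MCSE of a posterior mean is the batch means method, sketched below. It is shown only as an illustration and is not necessarily the method used in the later lectures; the function name and the choice of 20 batches are assumptions.

```python
import numpy as np

def mcse_batch_means(samples, n_batches=20):
    """Batch-means estimate of the Monte Carlo standard error of the mean."""
    samples = np.asarray(samples, dtype=float)
    batch_size = len(samples) // n_batches
    used = samples[: batch_size * n_batches]          # drop any leftover iterations
    batch_means = used.reshape(n_batches, batch_size).mean(axis=1)
    # If batches are long enough to be nearly independent, the variance of the
    # overall mean is roughly var(batch means) / number of batches.
    return np.sqrt(batch_means.var(ddof=1) / n_batches)

# Usage with independent stand-in draws
rng = np.random.default_rng(0)
print(mcse_batch_means(rng.normal(size=10000)))
```

For an autocorrelated chain this estimate is larger than the naive standard deviation divided by the square root of the number of iterations, which reflects the extra iterations that dependence demands.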

22
Inference using posterior samples from MCMC runs
• A powerful feature of MCMC and the Bayesian
approach is that all inference is based on the
joint posterior distribution.
• We can therefore address a wide range of
substantive questions by appropriate summaries of
the posterior.
• Typically we report either the mean or median of the
posterior samples for each parameter of interest
as a point estimate.
• The 2.5% and 97.5% percentiles of the posterior
sample for each parameter give a 95% posterior
credible interval (an interval within which the
parameter lies with probability 0.95).
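These summaries are simple to compute from stored samples. A minimal sketch, with a hypothetical function name and stand-in draws in place of real MCMC output:

```python
import numpy as np

def summarise(samples):
    """Point estimates and a 95% credible interval from posterior samples."""
    samples = np.asarray(samples, dtype=float)
    lower, upper = np.percentile(samples, [2.5, 97.5])
    return {
        "mean": samples.mean(),
        "median": np.median(samples),
        "ci95": (lower, upper),
    }

# Usage: stand-in draws; in practice pass the post-burn-in chain for one parameter
rng = np.random.default_rng(0)
print(summarise(rng.normal(loc=2.0, scale=0.3, size=5000)))
```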

23
Derived Quantities
• Once we have a sample from the posterior we can
answer lots of questions simply by investigating
this sample.
• Examples:
• What is the probability that θ > 0?
• What is the probability that θ1 > θ2?
• What is a 95% interval for θ1/(θ1 + θ2)?
• See later for examples of these sorts of derived
quantities.
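A minimal sketch of how such questions are answered from samples. The draws below are stand-ins generated independently for illustration; in a real analysis theta1 and theta2 would be the stored values from the same MCMC iterations, so that their joint dependence is preserved. All variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
# Stand-in posterior draws (hypothetical; real ones come from the MCMC run)
theta = rng.normal(loc=0.5, scale=1.0, size=5000)
theta1 = rng.gamma(shape=3.0, scale=1.0, size=5000)
theta2 = rng.gamma(shape=2.0, scale=1.0, size=5000)

p_positive = np.mean(theta > 0)                # estimate of P(theta > 0)
p_greater = np.mean(theta1 > theta2)           # estimate of P(theta1 > theta2)
ratio = theta1 / (theta1 + theta2)             # derived quantity, one value per draw
ratio_ci = np.percentile(ratio, [2.5, 97.5])   # 95% interval for the ratio

print(p_positive, p_greater, ratio_ci)
```

Each probability is just the proportion of draws satisfying the condition, and the derived quantity is evaluated draw by draw before it is summarised.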

24
Logistic regression example
• In the practical that follows we will look at the
following dataset of rat tumours and fit a
logistic regression model to it

25
Logistic regression model
• A standard Bayesian logistic regression model for
this data can be written as follows
• WinBUGS can fit this model but can we write out
the conditional posterior distributions and use
Gibbs Sampling?

26
Conditional distribution for β0
This distribution is not a standard distribution
and so we cannot simply simulate from it using a
standard random number generator. However, both
WinBUGS and MLwiN can fit this model using
MCMC. We will not see how until day 5.
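To see why the form is non-standard, the conditional can be written out under an assumed model. The model formula from the slide is not in this transcript, so the following are assumptions for illustration: y_i tumours among n_i rats in group i with y_i ~ Binomial(n_i, p_i), logit(p_i) = β0 + β1 x_i, and a prior β0 ~ Normal(0, s0²), where s0² is a hypothetical symbol.

```latex
\begin{align*}
p(\beta_0 \mid y, \beta_1)
 &\propto \exp\!\left(-\frac{\beta_0^2}{2 s_0^2}\right)
   \prod_{i} p_i^{\,y_i} (1 - p_i)^{\,n_i - y_i},
 \qquad p_i = \frac{e^{\beta_0 + \beta_1 x_i}}{1 + e^{\beta_0 + \beta_1 x_i}} \\
 &\propto \exp\!\left(-\frac{\beta_0^2}{2 s_0^2} + \beta_0 \sum_i y_i\right)
   \prod_{i} \left(1 + e^{\beta_0 + \beta_1 x_i}\right)^{-n_i}
\end{align*}
```

The product of terms in 1 + exp(β0 + β1 x_i) cannot be rearranged into either the Normal or the Gamma kernel, so the matching approach used for linear regression does not apply here.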
27
Hints for the next practical
• In the next practical you will be creating
WinBUGS code for a logistic regression model.
• In this practical you will get less help, so I
suggest looking at the Seeds example in the
WinBUGS examples. The Seeds example is more
complicated than what you require, but it shows
the necessary WinBUGS statements.