Introduction to WinBUGS presentation

About This Presentation

Transcript and Presenter's Notes

Title: Introduction to WinBUGS

1
Introduction to WinBUGS

Olivier Gimenez
University of St Andrews, Scotland

2
A brief history

1989 project began with a Unix version called
BUGS
1998 first Windows version, WinBUGS was born
Initially developed by the MRC Biostatistics Unit
in Cambridge and now joint work with Imperial
College School of Medicine at St Mary's, London.
Windows Bayesian inference Using Gibbs Sampling
Software for the Bayesian analysis of complex
statistical models using Markov chain Monte Carlo
(MCMC) methods

3
Who?
Nicky Best Imperial College Faculty of Medicine,
London (UK)
Thomas Andrew University of Helsinki, Helsinki
(Finland)
Wally Gilks MRC Biostatistics Unit Institute of
Public Health Cambridge (UK)
David Spiegelhalter MRC Biostatistics Unit
Institute of Public Health Cambridge (UK)
Freely downloadable from http//www.mrc-bsu.cam.a
c.uk/bugs/winbugs/contents.shtml
4
Key principle

You specify the prior and build up the likelihood
WinBUGS computes the posterior by running a Gibbs
sampling algorithm, based on
?(?D) / L(D?) ?(?)
WinBUGS computes some convergence diagnostics
that you have to check

5
A biological example throughout White stork
(Ciconia ciconia) in Alsace 1948-1970
Demographic components (fecundity, breeding
success, survival, etc)
Climate (Temperature, rainfall, etc)
6
WinBUGS Linear Regression
15.1 67 13.3 52 15.3 88 13.3 61 14.6 32
15.6 36 13.1 72 13.1 43 15.0 92 11.7 32
15.3 86 14.4 28 14.4 57 12.7 55 11.7 66
11.9 26 15.9 28 13.4 96 14.0 48 13.9 90
12.9 86 15.1 78 13.0 87
2.55 1.85 2.05 2.88 3.13 2.21 2.43 2.69 2.55 2.84
2.47 2.69 2.52 2.31 2.07 2.35 2.98 1.98 2.53 2.21
2.62 1.78 2.30
Y Number of chicks per pairs
T Temp. May (C) R Rainf. May (mm)
7
WinBUGS Linear Regression
1. Do temperature and rainfall affect the number
of chicks?
2. Regression model
Yi ? ?r Ri ?t Ti ?i , i1,...,23 ?i
i.i.d. N(0,?2)
Yi i.i.d. N(?i,?2), i1,...,23 ?i ? ?r Ri
?t Ti
3. Estimation of parameters ?, ?r, ?t, ?
4. Frequentist inference uses t-tests
8
Linear Regression using Frequentist approach
15.1 67 13.3 52 15.3 88 13.3 61 14.6 32
15.6 36 ... ... 13.0 87
2.55 1.85 2.05 2.88 3.13 2.21 ... 2.30
Y Number of chicks per pairs
T Temp. May (C) R Rainf. May (mm)
Y 2.451 0.031 T - 0.007 R
Estimate Std. Error t value Pr(gtt)
temperature 0.031069 0.054690 0.568
0.57629 rainfall -0.007316 0.002897
-2.525 0.02011
9
Linear Regression using Frequentist approach
15.1 67 13.3 52 15.3 88 13.3 61 14.6 32
15.6 36 ... ... 13.0 87
2.55 1.85 2.05 2.88 3.13 2.21 ... 2.30
Y Number of chicks per pairs
T Temp. May (C) R Rainf. May (mm)
Y 2.451 0.031 T - 0.007 R
Estimate Std. Error t value Pr(gtt)
temperature 0.031069 0.054690 0.568
0.57629 rainfall -0.007316 0.002897
-2.525 0.02011
? Influence of Rainfall only
10
Running WinBUGS What do you need?
11
Running WinBUGS The model
12
Running WinBUGS Data and initial values
13
Running WinBUGS Overall
14
Running WinBUGS At last!!
1- check model 2- load data 3- compile model 4-
load initial values 5- generate burn-in values 6-
parameters to be monitored 7- perform the
sampling to generate posteriors 8- check
convergence and display results
15
Running WinBUGS 1. Check model
16
Running WinBUGS 1. Check model highlight 'model'
17
Running WinBUGS 1. Check model open the Model
Specification Tool
18
Running WinBUGS 1. Check model Now click 'check
model'
19
Running WinBUGS 1. Check model Watch out for the
confirmation at the foot of the screen
20
Running WinBUGS 2. Load data Now highlight the
'list' in the data window
21
Running WinBUGS 2. Load data then click 'load
data'
22
Running WinBUGS 2. Load data watch out for the
confirmation at the foot of the screen
23
Running WinBUGS 3. Compile model Next, click
'compile'
24
Running WinBUGS 3. Compile model watch out for
the confirmation at the foot of the screen
25
Running WinBUGS 4. Load initial values highlight
the 'list' in the data window
26
Running WinBUGS 4. Load initial values click
'load inits'
27
Running WinBUGS 4. Load initial values watch out
for the confirmation at the foot of the screen
28
Running WinBUGS 5. Generate Burn-in values Open
the Model Update Tool
29
Running WinBUGS 5. Generate Burn-in values Give
the number of burn-in iterations (1000)
30
Running WinBUGS 5. Generate Burn-in values click
'update' to do the sampling
31
Running WinBUGS 6. Monitor parameters open the
Inference Samples Tool
32
Running WinBUGS 6. Monitor parameters Enter
'intercept' in the node box and click 'set'
33
Running WinBUGS 6. Monitor parameters Enter
'slope_temperature' in the node box and click
'set'
34
Running WinBUGS 6. Monitor parameters Enter
'slope_rainfall' in the node box and click 'set'
35
Running WinBUGS 7. Generate posterior values
enter the number of samples you want to take
(10000)
36
Running WinBUGS 7. Generate posterior values
click 'update' to do the sampling
37
Running WinBUGS 8. Summarize posteriors Enter
'' in the node box and click 'stats'
38
Running WinBUGS 8. Summarize posteriors mean,
median and credible intervals
39
Running WinBUGS 8. Summarize posteriors 95
Credible intervals
tell us the same story
Estimate Std. Error t value Pr(gtt)
temperature 0.031069 0.054690 0.568
0.57629 rainfall -0.007316 0.002897
-2.525 0.02011
40
Running WinBUGS 8. Summarize posteriors 95
Credible intervals
tell us the same story
Estimate Std. Error t value Pr(gtt)
temperature 0.031069 0.054690 0.568
0.57629 rainfall -0.007316 0.002897
-2.525 0.02011
41
Running WinBUGS 8. Summarize posteriors click
'history'
42
Running WinBUGS 8. Summarize posteriors click
'auto cor'
? Problem of autocorrelation
43
Coping with autocorrelation use standardized
covariates
44
Coping with autocorrelation use standardized
covariates
45
Re-running WinBUGS 1,2,...7, and 8. Summarize
posteriors click 'auto cor'
? autocorrelation OK
46
Re-running WinBUGS 1,2,...7, and 8. Summarize
posteriors click 'density'
47
Re-running WinBUGS 1,2,...7, and 8. Summarize
posteriors click 'quantiles'
48
Running WinBUGS 8. Checking for convergence using
the Brooks-Gelman-Rubin criterion

A way to identify non-convergence is to simulate
multiple sequences for over-dispersed starting
points
Intuitively, the behaviour of all of the chains
should be basically the same.
In other words, the variance within the chains
should be the same as the variance across the
chains.
In WinBUGS, stipulate the number of chains after
'load data' and before 'compile' (obviously, as
many sets of initial values as chains have to be
loaded, or generated)

49
Running WinBUGS 8. Checking for convergence using
the Brooks-Gelman-Rubin criterion
The normalized width of the central 80 interval
of the pooled runs is green The normalized
average width of the 80 intervals within the
individual runs is blue
50
Re-running WinBUGS 1,2,...7, and 8. Summarize
posteriors others...

Click 'coda' to produce lists of data suitable
for external treatment via the Coda R package
Click 'trace' to produce dynamic history
changing in real time

51
Another example logistic regression
15.1 67 13.3 52 15.3 88 13.3 61 14.6 32
15.6 36 13.1 72 13.1 43 15.0 92 11.7 32
15.3 86 14.4 28 14.4 57 12.7 55 11.7 66
11.9 26 15.9 28 13.4 96 14.0 48 13.9 90
12.9 86 15.1 78 13.0 87
151 / 173 105 / 164 73 / 103 107 / 113 113
/ 122 87 / 112 77 / 98 108 / 121 118 /
132 122 / 136 112 / 133 120 / 137 122 / 145
89 / 117 69 / 90 71 / 80 53 / 67
41 / 54 53 / 58 31 / 39 35 /
42 14 / 23 18 / 23
Y Proportion of nests with success (gt0 youngs)
T Temp. May (C) R Rainf. May (mm)
52
Performing a logistic regression with WinBUGS
53
Performing a logistic regression with WinBUGS
54
Performing a logistic regression with WinBUGS
noninformative priors
55
Performing a logistic regression with WinBUGS
data initial values
56
Performing a logistic regression with WinBUGS the
results
lower
upper

influence of rainfall, but not temperature (see
credible intervals)

57
Performing a logistic regression with WinBUGS the
results

additional parameters as a by-product of the
MCMC samples just add them in the model as
parameters to be monitored
- geometric mean
geom lt- pow(prod(p),1/N)
- odds-ratio
odds.rainfall lt- exp(slope.rainfall)
odds.temperature lt- exp(slope.temperature)

58
Performing a logistic regression with WinBUGS the
results

additional parameters as a by-product of the
MCMC samples
- geom. mean probability of success around 82
8184
- odds-ratio -16 for an increase of rainfall of
1 unit

59
Running WinBUGS from R R2WinBUGS package The
logistic regression example revisited

It may be uneasy to read complex sets of data
and initial values
It is also quite boring to specify the
parameters to be monitored in each run
It might be interesting to save the output and
read it into R for further analyses
solution 1 WinBUGS can be used in batch mode
using scripts
solution 2 R2WinBUGS allows WinBUGS to be run
from R
We can work with the results after importing
them back into R again
? create graphical displays of data and posterior
simulations or use WinBUGS in Monte Carlo
simulation studies

60
Running WinBUGS from R R2WinBUGS package The
logistic regression example revisited

To call WinBUGS from R
1. Write a WinBUGS model in a ASCII file.
2. Go into R.
3. Prepare the inputs to the 'bugs' function and
run it
4. A WinBUGS window will pop up amd R will
freeze up. The
model will now run in WinBUGS. You will see
things happening
in the Log window within WinBUGS. When
WinBugs is done, its
window will close and R will work again.
If an error message appears, re-run with 'debug
TRUE'.

61
Running WinBUGS from R R2WinBUGS package 1 -
Write the WinBUGS code in a ASCII file
This covers logistic regression model using two
explicative variables. The White storks in
Baden-Wurttemberg (Germany) data set is
provided model for( i in 1 N) A binomial
distribution as a likelihood nbsuccessi
dbin(pi,nbpairsi) The probability of
success is a function of both rainfall and
temperature logit(pi) lt- intercept
slope.temperature (temperaturei -
mean(temperature))/(sd(temperature))
slope.rainfall (rainfalli -
mean(rainfall))/(sd(rainfall)) priors
for regression parameters intercept
dnorm(0,0.001) slope.temperature
dnorm(0,0.001) slope.rainfall dnorm(0,0.001)
62
Running WinBUGS from R R2WinBUGS package 3 -
Prepare the inputs to the 'bugs' function
Nbsuccess nbpairs temperature rainfall 151 173
15.1 67 105 164 13.3 52 73 103 15.3
88 107 113 13.3 61 113 122 14.6
32 87 112 15.6 36 53 58
14.0 48 31 39 13.9 90 35 42 12.9
86 14 23 15.1 78 18 23 13.0 87
63
Running WinBUGS from R R2WinBUGS package The
logistic regression example revisited
Load R2WinBUGS package library(R2WinBUGS)
Data (R 'list' format) N 23 data
read.table("datalogistic.dat",headerT) attach(dat
a) datax list("N","nbsuccess","nbpairs","tempera
ture","rainfall") MCMC details nb.iterations
10000 nb.burnin 1000
64
Running WinBUGS from R R2WinBUGS package The
logistic regression example revisited
Initial values init1 list(intercept-1,slope.t
emperature-1,slope.rainfall-1) init2
list(intercept0,slope.temperature0,slope.rainfal
l0) init3 list(intercept1,slope.temperature1,
slope.rainfall1) inits list(init1,init2,init3)
nb.chains length(inits) Parameters to be
monitored parameters lt- c("intercept","slope.tempe
rature","slope.rainfall") MCMC simulations
res.sim lt- bugs(datax, inits, parameters,
"logisticregressionstandardized.bug", n.chains
nb.chains, n.iter nb.iterations, n.burnin
nb.burnin) Save results save(res.sim, file
"logistic.Rdata")
65
Running WinBUGS from R R2WinBUGS package The
logistic regression example revisited
Summarize results res.simsummary use
print(res.sim) alternatively mean
sd 50 97.5
Rhat Intercept 1.55122555 0.05396574
1.55100 1.6589750 1.0020641 slope.temperatur
e 0.03030854 0.06128879 0.03148
0.1510975 0.9997848 slope.rainfall
-0.15302837 0.06206554 -0.15295
-0.0243025 1.0014894 deviance
204.60259481 2.48898337 203.90000 211.2000000
1.0002280
66
Running WinBUGS from R R2WinBUGS package The
logistic regression example revisited
Numerical summaries for slope.rainfall? quantile
(slope.rainfall,c(0.025,0.975)) 2.5
97.5 -0.2693375 -0.0243025 Calculate the
odds-ratio odds.rainfall lt- exp(slope.rainfall)
quantile(odds.rainfall,c(0.025,0.975))
2.5 97.5 0.7638855 0.9759904
67
Running WinBUGS from R R2WinBUGS package The
logistic regression example revisited
Graphical summaries for slope.rainfall? plot(den
sity(slope.rainfall),xlab"",ylab"",
main"slope.rainfall a posteriori density")
68
Recent developments in WinBUGS Model selection
using RJMCMC

We consider data relating to population of White
storks breeding in Baden Württemberg (Germany).
Interest lies in the impact of climate variation
(rainfall) in their wintering area on their
population dynamics (adult survival rates).
Mark-recapture data from 1956-71 are available.
The covariates relate to the amount of rainfall
between June-September each year from 10 weather
stations located around the Sahel region.
Interest lies in identifying the given rainfall
locations that explain the adult survival rates.

69
Bayesian Model Selection

Discriminating between different models can often
be of particular interest, since they represent
competing biological hypotheses.
How do we decide which covariates to use? often
there may be a large number of possible
covariates.

70
Example (cont)

We express the survival rate as a logistic
function of the covariates
logit ?t ? ?Txt ?t
where xt denotes the set of covariate values at
time t and ?t are random effects,
?t N(0,?2).
However, which rainfalls explain the survival
rates for the adults?
Alternatively, what values of ? are non-zero?

71
Model Selection

In the classical framework, likelihood ratio
tests or information criterion (e.g. AIC) are
often used.
There is a similar Bayesian statistic the
DIC.
This is programmed within WinBUGS however its
implementation is not suitable for hierarchical
models (e.g. random effect models).
In addition, the DIC is known to give fallacious
results in even simple problems.
Within the general Bayesian framework, there is a
more natural way of dealing with the issue of
model discrimination.

72
Bayesian Approach

We treat the model itself as an unknown parameter
to be estimated.
Then, applying Bayes Theorem we obtain the
posterior distribution over both parameter and
model space
?(?m, m data) Ç L(data ?m, m) p(?m) p(m).
Here ?m denotes the parameters in model m.

73
Posterior Model Probabilities

The Bayesian approach then allows us to
quantitatively discriminate between competing
models via posterior model probabilities
?(m data) s ?(?m, m data) d?m
Ç p(m) s L(data ?m, m) p(?m) d?m
Note that we need to specify priors on both the
parameters and now also on the models themselves.
Thus we need to specify a prior probability for
each model to be considered.

74
MCMC-based estimates

We have a posterior distribution (over parameter
and model space) defined up to proportionality
?(?m, m data) Ç L(data ?m, m) p(?m m) p(m)
If we can sample from this posterior distribution
then we are able to obtain posterior estimates of
summary statistics of interest.
In particular the posterior model probabilities
can be estimated as the proportion of time that
the chain is in each model.
So, all we need to do is define how we construct
such a Markov chain!!

75
Reversible Jump MCMC

The reversible jump MCMC algorithm allows us to
construct a Markov chain with stationary
distribution equal to the posterior distribution.
It is simply an extension of the
Metropolis-Hastings algorithm that allows moves
between different dimensions.
This algorithm is needed because the number of
parameters, ?m, in model m, may differ between
models.
Note that this algorithm needs only one Markov
chain irrespective of the number of models!!

76
Markov chain

Each iteration of the Markov chain essentially
involves two steps
Within model moves updating each parameter
using standard MCMC moves (Gibbs sampler,
Metropolis-Hastings)
Between model moves updating the model using a
reversible jump type move.
Then, standard MCMC-type processes apply, such as
using an appropriate burn-in, obtaining summary
statistics of interest etc.
To illustrate the RJ algorithm, we consider a
particular example relating to variable selection
(e.g. covariate analysis).

77
WinBUGS

General RJ updates cannot currently be programmed
into WinBUGS.
Bespoke code needs to be written instead.
However, the recent add-on called jump allows
two particular RJ updates to be performed in
WinBUGS
Variable selection
Splines
See http//www.winbugs-development.org.uk/rjmcmc.h
tml
So, in particular, WinBUGS can be used for model
selection in the White storks example.

78
Example White Storks

Recall our white storks example there are 10
possible covariates, hence a total of 210
possible models (1024!).
We specify a prior probability of 0.5 that each
covariate is in the model.
Then, conditional on the covariate being present,
we specify,
? N(0,10).
Finally, for the random effect error variance,
?2 ?-1(0.001,0.001).

79
Example White Storks

WinBUGS demonstration.
RJMCMC is performed on the betas only, neither
on the ? nor on the intercept ?

80
Results
Models with largest posterior support
0001000000 0.5178058297 0.5178058297 0000000000
0.07629791141 0.5941037411 0001010000 0.0474087675
0.6415125086 0001100000 0.03814092265 0.679653431
3 0001000001 0.03085379849 0.7105072297 1001000000
0.02549001607 0.7359972458 0001000100 0.023745696
58 0.7597429424 0001001000 0.02336699564 0.7831099
38 0001000010 0.0229240303 0.8060339683 0000000100
0.02123479458 0.8272687629 0101000000 0.018099609
82 0.8453683727 0011000000 0.01540739041 0.8607757
631 1000000000 0.01186825798 0.8726440211 00000100
00 0.01103282075 0.8836768419
81
Results
Additionally the (marginal) posterior probability
that each covariate influence the survival rates
node mean sd posterior marg
prob effect1 0.01 0.04 0.06 effect2 0.00 0.0
3 0.04 effect3 0.00 0.02 0.03 effect4 0.30 0
.17 0.83 effect5 -0.01 0.04 0.07 effect6 0.0
1 0.05 0.09 effect7 -0.00 0.03 0.04 effect8
0.01 0.06 0.07 effect9 0.01 0.04 0.05 effect10
-0.01 0.04 0.05 p 0.91 0.01 sdeps 0.20 0.14
82
Results survival rates

Write a Comment

User Comments (0)

About PowerShow.com

Introduction to WinBUGS PowerPoint PPT Presentation