Recent Advances in Statistical Ecology using

Computationally Intensive Methods

- Ruth King

Overview

- Introduction to wildlife populations and

identification of questions of interest. - Motivating example.
- Issues to be addressed
- Missing data
- Model discrimination.
- Summary.
- Future research.

Wildlife Populations

- In recent years there has been increasing

interest in wildlife populations. - Often we may be interested in population changes

over time, e.g. if there is a declining

population. (Steve Buckland) - Alternatively, we may be interested in the

underlying dynamics of the system, in order to

obtain a better understanding of the population. - We shall concentrate on this latter problem, with

particular focus on identifying factors that

affect demographic rates.

Data Collection

- Data are often collected via some form of

capture-recapture study. - Observers go out into the field and record all

animals that are seen at a series of capture

events. - Animals may be recorded via simply resightings or

recaptures (of live animals) and recoveries (of

dead animals). - At each capture event all unmarked animals are

uniquely marked all observed animals are

recorded and subsequently released back into the

population.

Data

- Each animal is uniquely identifiable so our data

consist of the capture histories for each

individual observed in the study. - A typical capture history may look like
- 0 1 1 0 0 1 2
- 0/1 corresponds to the individual being

unobserved/observed at that capture time and - 2 denotes an individual is recovered dead.
- We can then explicitly write down the

corresponding likelihood as a function of

survival (?), recapture (p) and recovery (?)

rates.

Likelihood

- The likelihood is the product over all

individuals of the probability of their

corresponding capture history, conditional on

their initial capture. - For example, for an individual with history
- 0 1 1 0 0 1 2
- the contribution to the likelihood is
- ?2 p3 ?3 (1-p4) ?4 (1-p5) ?5 p6 (1-?6) ?7.
- Then, we can use this likelihood to estimate the

parameter values (either MLEs or posterior

distributions).

Covariates

- Covariates are often used to explain temporal

heterogeneity within the parameters. - Typically these are environmental factors, such

as resource availability, weather conditions or

human intervention. - Alternatively, heterogeneity in a population can

often be explained via different (individual)

covariates. - For example, sex, condition or breeding status.
- We shall consider the survival rates to be

possibly dependent on the different covariates.

Case Study Soay sheep

- We consider mark-recapture-recovery (MRR) data

relating to Soay sheep on the island of Hirta. - The sheep are free from human activity and

external competition/predation. - Thus, this population is ideal for investigating

the impact of different environmental and/or

individual factors on the sheep. - We consider annual data from 1986-2000, collected

on female sheep (1079 individuals).

This is joint work with Steve Brooks and Tim

Coulson

Covariate Information

- Individual covariates
- Coat type (1dark, 2light)
- Horn type (1polled, 2scurred, 3classical)
- Birth weight (real normalised)
- Age (in years)
- Weight (real - normalised)
- Number of lambs born to the sheep in the spring

prior to summer census (0, 1, 2) - And in the spring following the census (0, 1, 2).

- Environmental covariates
- NAO index population size March rainfall

Autumn rainfall and March temperature.

Survival Rates - ?

- Let the set of environmental covariate values at

time t be denoted by xt. - The survival rate for animal i, of age a, at time

t, is given by, - logit ?i,a,t ?a ?aT xt ?aT yi ?aT zi,t

?a,t - Here, yi denotes the set of time-independent

covariates and zi,t the time varying covariates - ?a,t N(0,?a2) denotes a random effect.
- There arise two issues here missing covariate

values and model choice.

Issue 1 Missing Data

- The capture histories and time-invariant

covariate values data are presented in the form - Note that there are also many missing values for

the time-dependent weight covariate.

Problem

- Given the set of covariate values, the

corresponding survival rate can be obtained, and

hence the likelihood calculated. - However, a complexity arises when there are

unknown (i.e. missing) covariate values, removing

the simple and explicit expression for the

likelihood.

Missing Data Classical Approaches

- Typical classical approaches to missing data

problems include - Ignoring individuals with missing covariate

values - EM algorithm (can be difficult to implement and

computationally expensive) - Imputation of missing values using some

underlying model (e.g. Gompertz curve). - Conditional approach for time-varying covariates

(Catchpole, Morgan and Tavecchia, 2006 in

submission) - We consider a Bayesian approach, where we assume

an underlying model for the missing data which

allows us to account for the corresponding

uncertainty of the missing values.

Bayesian Approach

- Suppose that we wish to make inference on the

parameters ?, and the data observed corresponds

to capture histories, c, and covariate values,

vobs. - Then, Bayes theorem states
- ?(?c, vobs) Ç L(c, vobs ?) p(?)
- The posterior distribution is very complex and so

we use Markov chain Monte Carlo (MCMC) to obtain

estimates of the posterior statistics of

interest. - However, in our case the likelihood is

analytically intractable, due to the missing

covariate values.

Auxiliary Variables

- We treat the missing covariate values (vmis) as

parameters or auxiliary variables (AVs). - We then form the joint posterior distribution

over the parameters, ?, and AVs, given the

capture histories c and observed covariate values

yobs - ?(?, vmis c, vobs) Ç L(c ?, vmis , vobs)
- f(vmis, vobs ?) p(?)
- We can now sample from the joint posterior

distribution ?(?, vmis c, vobs). - We can integrate out over the missing covariate

values, vmis, within the MCMC algorithm to obtain

a sample from ?(? c, vobs).

f(vmis, vobs ?) Categorical Data

- For categorical data (coat type and horn type),

we assume the following model. - Let y1,i denote the horn type of individual i.

Then, y1,i 2 1,2,3, and we assume that, - y1,i Multinomial(1,q),
- where q q1, q2, q3.
- Thus, we have additional parameters, q, which can

be regarded as the underlying probability of each

horn type. - The qs are updated within the MCMC algorithm, as

well as the y1,is which are unknown. - We assume the analogous model for coat type.

f(vmis, vobs ?) Continuous Data

- Let y3,i denote the birth weight of individual i.
- Then, a possible model is to assume that,
- y3,i N(?, ?2),
- where ? and ?2 are to be estimated.
- For the weight of individual i, aged a at time t,

denoted by z1,i,t we set, - z1,i,t N(z1,i,t-1 ?a ?t, ?w2),
- where the parameters ?a, ?t and ?w2 are to be

estimated. - In general, the modelling assumptions will depend

on the system under study.

Practical Implications

- Within each step of the MCMC algorithm, we not

only need to update the parameter values ?, but

also impute the missing covariate values and

random effects (if present). - This can be computationally expensive for large

amounts of missing data. - The posterior results may depend on the

underlying model for the covariates a

sensitivity analysis can be performed using

different underlying models. - (State-space modelling can also be implemented

using similar ideas see Steves talk).

Issue 2 Model Selection

- For the sheep data set we can now deal with the

issue of missing covariate values. - But.. how do we decide which covariates to use

and/or the age structure? often there may be a

large number of possible covariates and/or age

structures. - Discriminating between different models can often

be of particular interest, since they represent

competing biological hypotheses. - Model choice can also be important for future

predictions of the system.

Possible Models

- We only want to consider biologically plausible

models. - We have uncertainty on the age structure of the

survival rates, and consider models of the form - ?1a ?a1b ?j
- k is unknown a priori and also the covariate

dependence for each age group. - We consider similar age-type models for p and ?

with possible arbitrary time dependence. - E.g. ?1(N,BW), ?27(W,L),?8(N,P)/p(t)/?1,?2(t)
- The number of possible models is immense!!

Bayesian Approach

- We treat the model itself to be an unknown

parameter to be estimated. - Then, applying Bayes Theorem we obtain the

posterior distribution over both parameter and

model space - ?(?m, m data) Ç L(data ?m, m) p(?m) p(m).
- Here ?m denotes the parameters in model m.

Reversible Jump MCMC

- The reversible jump (RJ)MCMC algorithm allows us

to construct a Markov chain with stationary

distribution equal to the posterior distribution. - It is simply an extension of the

Metropolis-Hastings algorithm that allows moves

between different dimensions. - This algorithm is needed because the number of

parameters, ?m, in model m, may differ between

models. - Note that this algorithm needs only one Markov

chain, regardless of the number of models.

Posterior Model Probabilities

- Posterior model probabilities are able to

quantitatively compare different models. - The posterior probability of model m is defined

as, - ?(m data) s ?(?m,m data) d?m.
- These are simply estimated within the RJMCMC

algorithm as the proportion of the time the chain

is in the given model. - We are also able to obtain model-averaged

estimates of parameters, which takes into account

both parameter and model uncertainty.

General Comments

- The RJMCMC algorithm is the most widely used

algorithm to explore and summarise a posterior

distribution defined jointly over parameter and

model space. - The posterior model probabilities can be

sensitive to the priors specified on the

parameters (p(?)). - The acceptance probabilities for reversible jump

moves are typically lower than MH updates. - Longer simulations are generally needed to

explore the posterior distribution. - Only a single Markov chain is necessary,

irrespective of the number of possible models!!

Problem 1

- Constructing efficient RJ moves can be difficult.
- This is particularly the case when updating the

number of age groups for the survival rates. - This step involves
- Proposing a new age structure (typically

add/remove a change-point) - Proposing a covariate dependence for the new age

structure - Proposing new parameter values for this new

model. - It can be very difficult to construct the Markov

chain with (reasonably) high acceptance rates.

Example Reversibility Problem

- One obvious (and tried!) method for adding a

change-point would be as follows. - Suppose that we propose to split age group ac.
- We propose new age groups ab and b1c.
- Consider a small perturbation for each (non-zero)

regression parameter, e.g., - ?ab ?ac ? and ?b1c ?ac - ?.
- where ? N(0,??2).
- However, to satisfy the reversibility

constraints, a change-point can only be removed

when the covariate dependence structure is the

same for two consecutive age groups

Example High Posterior Mass

- An alternative proposal would be to set
- ?ab ?ac (i.e. keep all parameters same for

ab). - Propose a model (in terms of covariates) for

?b1c - Choose each possible model with equal probability

(reverse move always possible) - Choose model that differs by at most one

individual and one environmental covariate. - The problem now lies in proposing sensible

parameter values for the new model. - One approach is to use posterior estimates of the

parameters from a saturated model (full

covariate dependence for some age structure) as

the mean of the proposal distribtion.

Problem 2

- Consider the missing covariates vmis.
- Then, if the covariate is present, we have,
- ?(vmis vobs, ?, data) / L(data ?, vobs,

vmis) - f(vmis, vobs ?).
- However, if the covariate is not present in the

model, we have, - ?(vmis vobs, ?, data) / f(vmis, vobs ?).
- Thus, adding (or removing) the covariate values

may be difficult, due to (potentially) fairly

different posterior conditional distributions. - One way to avoid this is to simultaneously update

the missing covariate values in the model move.

Soay Sheep Analysis

- We now use these techniques for analysing the

(complex) Soay sheep data. - Discussion with experts identified several points

of particular interest - the age-dependence of the parameters
- identification of the covariates influencing the

survival rates - whether random effects are present.
- We focus on these issues on the results presented.

Prior Specification

- We place vague priors on the parameters present

in each model. - Priors also need to be specified on the models.
- Placing an equal prior on each model places a

high prior mass on models with a large number of

age groups, since the number of models increases

with the number of age groups. - Thus, we specify an equal prior probability on

each marginal age structure and a flat prior over

the covariate dependence given the age structure

or on time-dependence.

Results Survival Rates

- The marginal models for the age groups with

largest posterior support are - Note that with probability 1, lambs have a

distinct survival rate. - Often the models with most posterior support are

close neighbours of each other.

Age-structure Posterior probability

?1 ?27 ?8 0.943

?1 ?27 ?89 ?10 0.052

Covariate Dependence

BF 3

BF 20

BF Bayes factor

Influence of Covariates

Weight

Population size

Age 1

Age 10

Marginalisations

- Presenting only marginal results can hide some of

the more intricate details. - This is most clearly seen from another MRR data

analysis relating to the UK lapwing population. - There are two covariates time and fdays a

measure of the harshness of the winter. - Without going into too many details, we had MRR

data and survey data (estimates of total

population size). - An integrated analysis was performed, jointly

analysing both data sets.

Marginal Models

- The marginal models with most posterior support

for the demographic parameters are - (a) ?1 1st year survival (b) ?a adult

survival - Model Post. prob. Model Post. prob.
- ?1(fdays) 0.636 ?a(fdays,t) 0.522
- ?1 0.331 ?a(fdays) 0.407
- (c) ? productivity (d) l recovery rate
- Model Post. prob. Model Post. prob.
- r 0.497 l(t) 0.730
- r(t) 0.393 l(fdays,t) 0.270

This is joint work with Steve Brooks, Chiara

Mazzetta and Steve Freeman

Warning!

- The previous (marginal) posterior estimates hides

some of the intricate details. - The marginal models of the adult survival rate

and productivity rate are highly correlated. - In particular, the joint posterior probabilities

for these parameters are - Model Post. prob.
- ?a(fdays,t), ? 0.45
- ?a(fdays), ?(t) 0.36
- Thus, there is strong evidence that either adult

survival rate OR productivity rate is time

dependent.

Summary

- Bayesian techniques are widely used, and allow

complex data analyses. - Covariates can be used to explain both temporal

and individual heterogeneity. - However, missing values can add another layer of

complexity and the need to make additional

assumptions. - The RJMCMC algorithm can explore the possible

models and discriminate between competing

biological hypotheses. - The analysis of the Soay sheep has stimulated new

discussion with biologists, in terms of the

factors that affect their survival rates.

Further Research

- The development of efficient and generic RJMCMC

algorithms. - Assessing the posterior sensitivity on the model

specification for the covariates. - Investigation of the missing at random assumption

for the covariates in both classical and Bayesian

frameworks. - Prediction in the presence of time-varying

covariate information.