Uncertainty in Ecological Risk Assessment using Bayesian Model Averaging
1
Uncertainty in Ecological Risk Assessment using Bayesian Model Averaging
Eric P. Smith, Keying Ye, Ilya A. Lipkovich
Department of Statistics, Virginia Polytechnic Institute and State University
May 14, 2002
2
Outline
  • Stressor-response modeling with species taxa data
  • Canonical Correspondence Analysis
  • Problems with basic regression approaches
  • Bayesian Model Averaging
  • General approach for using and combining
    important models
  • Example: Great Lakes data

3
The problem
  • Stressor-response modeling
  • Response: species abundance
  • Stressors: chemicals, habitat alteration, etc.
  • Approach: regression, CCA
  • Summarize the model using a graphical display that
    describes the relationship between taxa
    abundances and stressor variables
  • Generally bad to use all the variables

4
Current approach - graphical
  • Example: reference site analysis
  • Great Lakes sediment samples (1991-93)
  • 233 sites
  • 39 benthic invertebrate families
  • 38 chemical and physical variables
  • Common taxa: Chironomidae (midge larvae),
    Tubificidae (worms), Sphaeriidae (fingernail
    clams), Naididae (worms), Pontoporeiidae
    (shrimps) and Spongillidae (sponges)
  • Abundant taxa: Tubificidae and Pontoporeiidae
    (>20), Chironomidae, and recent invaders, the
    Dreissenidae (zebra and quagga mussels)

5
Location of sites
6
Canonical Correspondence Analysis (CCA)
CCA (Ter Braak, 1986) allows one to relate the
latent gradients associated with distribution of
species over sites to certain environmental
variables measured on each site.
Y: responses (species). X: explanatory variables.
7
Canonical Correspondence Analysis
  • Correspondence analysis (just analyze Y)
  • analysis of table of counts (species abundance by
    site)
  • Seek to reduce the data into gradient(s)
  • Plot the gradients, scores on gradients
  • Focus is on chi-square analysis (since we are
    dealing with counts)
  • Canonical correspondence (analyze Y and X)
  • Focus is on relationships between counts and
    descriptor variables
  • Correspondence analysis of predicted counts
    (predicted from multivariate regression)
  • Variable selection through testing

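The two-step view above — a chi-square analysis of the count table, with the counts first predicted from the environmental variables — can be sketched in a few lines of Python. This is a simplified illustration of the idea, not ter Braak's exact algorithm; the function name and return values are hypothetical.

```python
import numpy as np

def simple_cca(Y, X, n_axes=2):
    """Sketch of CCA as correspondence analysis of counts Y constrained
    by environmental variables X (two leading axes by default)."""
    Y = np.asarray(Y, dtype=float)
    P = Y / Y.sum()                       # relative abundances
    r = P.sum(axis=1)                     # site (row) weights
    c = P.sum(axis=0)                     # species (column) weights
    # chi-square standardized residuals from independence
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    # project onto the column space of the row-weighted predictors
    Xw = np.sqrt(r)[:, None] * np.column_stack([np.ones(len(r)), X])
    H = Xw @ np.linalg.pinv(Xw)           # hat (projection) matrix
    u, s, vt = np.linalg.svd(H @ S, full_matrices=False)
    # eigenvalues, site scores, species scores for the leading axes
    return s[:n_axes] ** 2, u[:, :n_axes], vt[:n_axes].T
```

The eigenvalues measure how much of the species-site association each constrained gradient explains; the site and species scores are what the biplots on the following slides display.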
8
CCA
  • What do we want to know?
  • Is there a stressor-response relationship?
  • What stressors are involved?
  • What species are affected?
  • Which locations are affected?
  • What are the sizes of the uncertainties in the
    modeling process?

9
Correspondence Analysis Biplot
Axes are interpreted as gradients. Sites close
together are similar. Species near the center are
at expectation.
10
Adding Explanatory Variables to the Plot
Ray length = importance. Projection onto a ray =
size on that variable.
11
What's wrong?
  • Method is graphical
  • No uncertainty
  • No test of variable importance
  • No measure of species importance
  • No measure of site score variability
  • No uncertainty due to model selection

12
Current approach - inferential
  • In CCA, use permutation testing to
  • Test whether the overall relationship is significant
  • Select a subset of variables
  • Analogous to stepwise (forward) selection methods
  • Species and site importance from scores and locations

13
Traditional Approaches to Model Selection (subset
selection)
  • Model selection criteria (typically in the form
    of a penalized likelihood): adjusted R2, AIC,
    BIC, Cp, cross-validation and the bootstrap
  • Incremental model building based on maximum
    contribution to the likelihood (stepwise,
    backward deletion, forward addition, leaps and
    bounds, etc.)
  • A combination of the above. Example: use forward
    selection to choose the best model for each
    number of variables k, then use Mallows' Cp to
    pick the best one

14
Criticism of approach
  • Assumes a fixed (true) model
  • All inference (predicted values, etc.) is based
    on the assumption that the estimated model is
    correct
  • Alternative approach: use several models
  • Can take a frequentist or Bayesian approach

15
Frequentist and Bayesian Views of Model Selection
Uncertainty
Frequentist view: first, some parameters are
estimated as zeroes (a model is selected); when
the parameters left in the model are estimated,
we also have to account for the uncertainty
associated with the first stage (via the bootstrap).
Bayesian view: the model is a parameter, a random
variable for which posterior probabilities can be
computed; the uncertainty associated with the
model parameter is accounted for by averaging it
out.
16
Bayesian Model Averaging - details for the
math-hungry
  • posterior model probabilities
  • integrated likelihood
  • predictive posterior distribution for a quantity
    of interest

D is the data, Mk is a model in some model space
M, K is the number of models, and Δ is a quantity
of interest (such as a regression coefficient)
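The formulas these bullets refer to were images on the original slide; in standard BMA notation they are:

```latex
% posterior model probability
P(M_k \mid D) = \frac{P(D \mid M_k)\,P(M_k)}{\sum_{l=1}^{K} P(D \mid M_l)\,P(M_l)}

% integrated likelihood
P(D \mid M_k) = \int P(D \mid \theta_k, M_k)\,P(\theta_k \mid M_k)\,d\theta_k

% predictive posterior distribution of a quantity of interest
P(\Delta \mid D) = \sum_{k=1}^{K} P(\Delta \mid M_k, D)\,P(M_k \mid D)
```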
17
Elements of BMA
  • Assigning prior distributions for models and
    model parameters
  • Searching through the model space for
    data-supported models
  • Computing posterior model probabilities
  • Obtaining estimates and confidence intervals for
    the parameters and observables of interest
  • Obtaining variance components associated with
    model selection uncertainty
  • Prediction of interesting quantities (new sites)

18
Searching the Model Space for Data-Supported
Models
Need to fit all the models and evaluate the fit.
Why? Because we want to weight by the probability
of each model given the data. Need to calculate
the probability (importance) of each model.
19
Problems to solve in BMA approach
  • Searching the model space for data-supported
    models
  • Stochastic model search via Markov chain Monte
    Carlo
  • Implementing BMA: Markov Chain Monte Carlo Model
    Composition, MC3 (Madigan and Raftery, 1994)
  • Computing acceptance probabilities via Bayes
    factors and the BIC approximation (how do I decide
    to move or not to move?)
  • How do I approximate the value of a model?
    BIC
  • How do I combine information to estimate
    uncertainty?
  • Need to align eigenvectors from different models

20
Skyline Metaphor of MC3 - walking through the
city to find the tallest building, checking the
height of each building
(Figure: buildings labeled M1-M9, with height
representing each model's fit.)
21
Uncertainty Estimation in BMA - a Semi-Bayesian
Approach
One can use classical procedures to obtain any
quantity of interest Δ and then use model weights
to obtain the BMA estimates of its unconditional
expectation and variance.
The model selection uncertainty component is added
to the within-model variance, which is estimated
via a classical procedure or the bootstrap.
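With weights w_k = P(M_k | D), the two variance components referred to here decompose in the standard way (a reconstruction of the slide's missing formulas):

```latex
\hat\Delta = \sum_k w_k \hat\Delta_k,
\qquad
\widehat{\mathrm{Var}}(\Delta)
  = \underbrace{\sum_k w_k\,\widehat{\mathrm{Var}}(\Delta \mid M_k)}_{\text{sampling variability}}
  + \underbrace{\sum_k w_k \bigl(\hat\Delta_k - \hat\Delta\bigr)^{2}}_{\text{model selection uncertainty}}
```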
22
Bayesian Variable Assessment
The posterior effect probabilities can be
computed from the standard BMA output;
each is the (weighted) proportion of models
containing the given variable.
One can plot the ordered values of these
probabilities (a scree plot) to figure out which
variables should be selected in the final model,
if we still want to build a single model.
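As a toy illustration (all numbers invented), the posterior effect (activation) probability of each variable is just the summed posterior weight of the models that include it:

```python
import numpy as np

# Hypothetical BMA output: rows = models visited by the search,
# columns = 0/1 variable-inclusion indicators.
models = np.array([[1, 1, 0],
                   [1, 0, 0],
                   [1, 1, 1],
                   [0, 1, 0]])
# Posterior model probabilities (sum to 1).
weights = np.array([0.4, 0.3, 0.2, 0.1])

# Activation probability of each variable: total weight of the
# models containing that variable.
activation = models.T @ weights
print(activation)  # -> approximately [0.9 0.7 0.2]
```

Sorting these values gives the scree plot mentioned above; a variable with activation near 1 appears in essentially every well-supported model.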
23
Is prediction better for a single model or
combination of models?
Prediction error is based on the differences
between actual and fitted values in the
prediction data, DP, when applying parameters
estimated from the training data, DT.
The relative efficiency of BMA with respect to the
best and full models is determined as follows:
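One plausible reconstruction of the missing formulas, in the slide's notation:

```latex
\mathrm{PE}(M) = \sum_{i \in D_P} \bigl(y_i - \hat y_i(M;\, D_T)\bigr)^{2},
\qquad
\mathrm{RE}_{\mathrm{best}} = \frac{\mathrm{PE}(\mathrm{BMA})}{\mathrm{PE}(M_{\mathrm{best}})},
\qquad
\mathrm{RE}_{\mathrm{full}} = \frac{\mathrm{PE}(\mathrm{BMA})}{\mathrm{PE}(M_{\mathrm{full}})}
```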
24
Summary of BMA Predictive Performance in a
Simulation Study
25
Interpreting BMA Output: A Simulated Example
  • Y = XAB + E, where AB is a rank-2 matrix of
    coefficients for the five variables x1-x5
  • E contains iid normal errors with unit variance
  • The data were augmented with 5 more variables,
    x6-x10, whose coefficient rows are zeroes
    (irrelevant variables)

26
Interpreting BMA Output: A Simulated Example (cont.)
27
Interpreting BMA Output: A Simulated Example (cont.)
28
Representing The Variance Components
  • the outer ellipses shown with dashed lines
    represent the total variability
  • the inner ellipses represent only sampling
    variability
  • the variable scores and sampling variances on
    the right display are rescaled by multiplying
    them by the activation probabilities.

29
Great Lakes Data: Biplot display of species and
variables
30
Great Lakes Data
31
Uncertainty Intervals for Species Scores on the
First Axis
Larger uncertainties tend to correspond to rarer
species.
(Figure annotations: "rarer"; "0 = taxa at
expectation (even distribution)".)
32
Sites, species and uncertainties
33
Biplot display showing model and sampling
uncertainty
(Figure labels: sampling uncertainty; model
uncertainty.)
34
Biplot of Species and Environmental Variables
after Scaling Variable Scores by Activation
Probabilities
  • Illustrates important species
  • Shows the Depth-Temperature gradient
  • Shows the importance of agricultural input

35
Another Application of the Methodology: BMA-based
Outlier and Cluster Analysis
The model space can be extended to incorporate both
variables and observations (Hoeting, Raftery, and
Madigan, 1996).
A model is then indexed by a combined 0/1 indicator
vector over variables and observations, e.g.
(1,1,0,1,0,1,0, ..., 1,1,1,1,1)
  • We can use this combined model space to
    add/delete both variables and observations.
  • The goal is to obtain clusters of observations
    with individual sets of predictors
  • Outliers can be isolated in a separate cluster

36
BMA based Outlier and Cluster Analysis
  • Example of use
  • Data collected in a watershed
  • Effect of stressor is over a spatial subset of
    the watershed
  • Fitting regression model to entire set of data
    may reduce or mask the relationship
  • Find the subset of the data that supports the
    stressor response relationship
  • Great Lakes
  • Look for sets of sampling sites that are
    geographically close and have biology that is
    related to certain environmental variables

37
Summary
  • BMA is a method for developing good inference
    without discarding variables
  • Components: model uncertainty, importance of
    species and variables
  • Further developments
  • Develop BMA cluster analysis for CCA
  • Investigate its properties
  • Implement missing-value imputation in CCA,
    possibly adding missing-value modeling along with
    the cluster process

38
References
Bensmail, H., Celeux, G., Raftery, A.E., and Robert, C. (1997). Inference in model-based cluster analysis. Statistics and Computing, 7, 1-10.
Eckart, C. and Young, G. (1936). The approximation of one matrix by another of lower rank. Psychometrika, 1, 211-218.
Gabriel, K.R. (1971). The biplot graphic display of matrices with application to principal component analysis. Biometrika, 58, 453-467.
George, E.I. and McCulloch, R.E. (1993). Variable selection via Gibbs sampling. JASA, 88, 881-889.
Hoeting, J.A., Raftery, A.E., and Madigan, D. (1996). A method for simultaneous variable selection and outlier identification in linear regression. Computational Statistics and Data Analysis, 22, 251-270.
Madigan, D. and Raftery, A.E. (1994). Model selection and accounting for model uncertainty in graphical models using Occam's window. JASA, 89(428), 1535-1546.
Madigan, D. and York, J. (1995). Bayesian graphical models for discrete data. International Statistical Review, 63, 215-232.
Raftery, A.E., Madigan, D., and Hoeting, J.A. (1997). Bayesian model averaging for linear regression models. JASA, 92(437), 179-191.
Ter Braak, C.J.F. (1994). Biplots in reduced-rank regression. Biometrical Journal, 36(8), 983-1003.
39
Problem Estimating Variance Components of
Eigenvectors
  • We estimate the sampling uncertainty of the
    species and site coordinates via the bootstrap,
    and model selection uncertainty via MC3
  • In either procedure, we may obtain eigenvectors
    whose signs are flipped or whose positions are
    changed
  • A general solution is the Procrustes transformation
  • The Procrustes transformation is an orthogonal
    transformation that makes the new coordinates as
    close as possible to some original coordinates
  • For the bootstrap, these original coordinates are
    the eigenvectors obtained from the original data
  • For BMA, the original coordinates are those
    obtained from the full model

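The Procrustes alignment step can be sketched with the standard SVD solution, assuming the coordinates are stored as sites-by-axes numpy arrays (the function name and data are hypothetical):

```python
import numpy as np

def procrustes_rotate(coords, target):
    """Orthogonal Procrustes: rotate/reflect `coords` to best match
    `target` in least squares, which fixes sign flips and axis
    reordering of bootstrap or MC3 eigenvectors."""
    u, _, vt = np.linalg.svd(target.T @ coords)
    R = vt.T @ u.T  # optimal orthogonal transformation
    return coords @ R

# Bootstrap eigenvectors often come back with flipped signs
# and/or swapped axes relative to the original solution:
target = np.array([[1.0, 0.0],
                   [0.0, 1.0],
                   [1.0, 1.0]])
flipped = target[:, ::-1] * np.array([-1.0, 1.0])  # swap axes, flip a sign
aligned = procrustes_rotate(flipped, target)
print(np.allclose(aligned, target))  # -> True
```

Because the transformation is restricted to be orthogonal, the alignment changes only the orientation of the axes, never the configuration of the points, so the variance estimates built from the aligned replicates remain valid.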
40
Searching the Model Space for Data-Supported
Models
Problem: large number of models in the regression
setting. In the subset selection problem, when the
number of explanatory variables exceeds 25-30,
evaluating this expression directly is infeasible:
the number of models is K > 2^30.
Solution:
stochastic model search via Markov chain Monte
Carlo.
41
Implementing BMA: Markov Chain Monte Carlo Model
Composition, MC3 (Madigan and Raftery, 1994)
Idea: define a rule for wandering through the
possible models. Walk in a way that finds a best
model by starting at a random model and
comparing it with neighboring models. Stop at
the best model, but record the other model fits.
Occasionally move to a worse model to avoid
getting stuck.
More formally: a Markov chain is constructed by
defining a neighborhood nbd(M) for each model.
In linear regression the neighborhood can be the
set of models with either one variable more or
one variable less than M, plus M itself. Proposed
models are accepted with a probability depending
on the fit of the model.
42
Computing Acceptance Probabilities via Bayes
Factors and BIC Approximation
The Bayes factor (BF) compares the integrated
likelihoods of two models. The acceptance
probability for a proposed model M' is
min(1, P(M'|D)/P(M|D)); the ratio exceeds 1 (and
the move is always accepted) if M' is the better
model.
Posterior odds ratio = (Bayes factor) x (prior
odds ratio)
43
Computing Bayes Factors via BIC
The integrated likelihood must be computed, but
this is not easy - approximate it.
The Bayesian Information Criterion (BIC; Schwarz,
1978; Kass and Wasserman, 1995) approximation is
the simplest.
Here χ2 is the likelihood ratio test statistic and
M0 is nested within M1. One advantage of the BIC
approximation is that it eliminates the need to
specify prior distributions for the model
parameters.
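The approximation referenced here is, in standard notation (a reconstruction; the original formulas were slide images):

```latex
\mathrm{BIC}(M) = -2 \log P(D \mid \hat\theta, M) + d \log n

2 \log B_{10} \approx \chi^{2} - (d_1 - d_0) \log n
```

where B_10 is the Bayes factor of M1 against the nested M0, χ² is the likelihood ratio test statistic comparing them, d_k is the number of parameters in M_k, and n is the sample size.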