1
Local Enhancement of Global Estimation
Molly Leecaster, Ph.D., and Kerry Ritter, Ph.D.
DAMARS and STARMAP 2nd Annual Conference
Oregon State University, Corvallis, OR
August 11, 2003
2
Acknowledgement
PROJECT FUNDING
  • The work reported here was developed under the
    STAR Research Assistance Agreement CR-829095
    awarded by the U.S. Environmental Protection
    Agency (EPA) to Colorado State University. This
    presentation has not been formally reviewed by
    EPA.  The views expressed here are solely those
    of the presenter and STARMAP, the Program they
    represent. EPA does not endorse any products or
    commercial services mentioned in this
    presentation.

3
Outline of Presentation
  • Introduction
  • Two-stage sample design
  • Spatial modeling of binary EMAP data
      • Indicator kriging
      • Conditional autoregressive model
  • Simulation Example
  • Future work

4
Introduction
  • EMAP developed for estimation of areal extent of
    resources
  • Sample locations are spatially separated
  • EMAP participants are interested in global
    estimation but also have local concerns
  • Spatial modeling
      • EMAP data do not provide information on the
        local spatial structure required for good
        spatial models
  • Therefore: augment EMAP design to improve spatial
    modeling

5
Goals
  • Present enhancement to EMAP design
  • Use of enhanced sample in spatial models of
    indicator data
      • Indicator kriging
      • Conditional autoregressive model

6
Outline of Presentation
  • Introduction
  • Two-stage sample design
  • Spatial modeling of EMAP data
  • Simulation Example
  • Future work

7
Two-stage Systematic Grid Plus Star Cluster
Sample Design
  • Two-stage because two goals
  • Systematic (EMAP) grid for global structure
  • Star cluster sample for variogram estimation
  • Enhance EMAP design with additional sample
    locations
  • Ideal for areal extent and prediction
  • Ideal for variogram estimation

8
Two-Stage Design
Legend: pink = absence; blue = presence; black = systematic sites; green = star clusters 1; orange = star clusters 2
9
Stage One: Systematic Component (EMAP)
  • Based on global estimation requirements
  • e.g. 30 spatially separated locations per stratum

10
Stage Two: Star Cluster Component
  • Star clusters of sample sites around stage-one
    locations
  • Star clusters provide estimate of small scale
    pair-wise variance
  • Star clusters also provide many added pairs of
    samples at various distance lags
  • Star clusters provide directional information at
    small scale
  • How to specify star clusters?

11
Stage Two: Star Cluster Component
  • Location of star clusters
      • Adaptive, locate at specified observed response
          • Does this bias the variogram estimation?
      • Random stage-one locations
      • Systematic subset of stage-one locations
  • Size of star clusters
      • Diameter of star = variogram range
      • Diameter of star > variogram range
  • Number of star clusters
      • At least two, but how many more?
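The star-cluster geometry is not spelled out in the slides. As a rough sketch, the function below assumes a star with 6 arms and 2 sites per arm (12 added sites per cluster, which would be consistent with the 24 sites added by two clusters in the simulation slides, n = 54 versus n = 30). The function name `star_cluster` and its parameters are hypothetical.

```python
import math

def star_cluster(center, spacing, n_arms=6, n_steps=2):
    """Return star-cluster sites around `center` (the center itself is excluded).

    ASSUMPTION: sites lie on `n_arms` equally spaced rays, at distances
    spacing, 2*spacing, ..., n_steps*spacing from the stage-one location.
    The slides do not specify the actual layout.
    """
    cx, cy = center
    sites = []
    for arm in range(n_arms):
        angle = 2.0 * math.pi * arm / n_arms
        for step in range(1, n_steps + 1):
            r = step * spacing
            sites.append((cx + r * math.cos(angle), cy + r * math.sin(angle)))
    return sites

# One cluster around a stage-one site at (10, 10); its diameter is 2 * n_steps * spacing.
cluster = star_cluster(center=(10.0, 10.0), spacing=1.0)
```

Under these assumptions the cluster diameter is `2 * n_steps * spacing`, so the "diameter = range" versus "diameter > range" question above becomes a choice of `spacing`.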

12
Outline of Presentation
  • Introduction
  • Two-stage sample design
  • Spatial modeling of EMAP data
  • Simulation Example
  • Future work

13
Spatial Models for Binary Data
  • Indicator kriging for geo-referenced data
  • Conditional autoregressive model for binary
    lattice data

14
Indicator Kriging
  • Binary geo-referenced data
  • Spatial correlation structure modeled from data
  • Precision of predictions depends on sample
    spacing and variogram parameters

15
Ordinary Indicator Kriging
  • Estimate local indicator mean at each location
  • Apply simple IK estimator using estimated mean
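A minimal sketch of ordinary indicator kriging at a single prediction location, assuming a spherical covariance. It solves the ordinary kriging system directly (weights constrained to sum to one), which is commonly treated as the one-step equivalent of the two-step description above. The function names and the convention that `sill` means the partial sill are assumptions, not taken from the slides.

```python
import numpy as np

def spherical_cov(h, sill, rng, nugget=0.0):
    """Covariance implied by a spherical variogram.

    ASSUMPTION: `sill` is the partial sill; the nugget is added at h == 0 only.
    """
    h = np.asarray(h, dtype=float)
    r = np.minimum(h / rng, 1.0)
    c = sill * (1.0 - 1.5 * r + 0.5 * r ** 3)  # zero at and beyond the range
    return c + np.where(h == 0.0, nugget, 0.0)

def ordinary_indicator_kriging(xy, ind, xy0, sill, rng, nugget=0.0):
    """Predict the probability of presence at xy0 from 0/1 indicators `ind`."""
    xy = np.asarray(xy, dtype=float)
    ind = np.asarray(ind, dtype=float)
    xy0 = np.asarray(xy0, dtype=float)
    n = len(xy)

    # Left-hand side: data covariances bordered by the sum-to-one constraint.
    d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=2)
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = spherical_cov(d, sill, rng, nugget)
    A[n, n] = 0.0

    # Right-hand side: covariances between the data and the prediction location.
    d0 = np.linalg.norm(xy - xy0, axis=1)
    b = np.append(spherical_cov(d0, sill, rng, nugget), 1.0)

    weights = np.linalg.solve(A, b)[:n]
    # Kriged indicators can fall slightly outside [0, 1]; clip them.
    return float(np.clip(weights @ ind, 0.0, 1.0))
```

Far from all samples (beyond the range) the prediction falls back to a weighted mean of the indicators, which is the smoothing behavior noted in the simulation conclusions.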

16
Conditional Autoregressive Model for Binary Data
  • Binary lattice data
  • Spatial correlation structure assumed locally
    (neighborhood) dependent Markov random field
  • Neighborhood defined as fixed pattern of
    surrounding grid points
  • Precision of predictions depends on neighborhood
    structure, grid size, and variance of response

17
Conditional Autoregressive Model for Binary Data
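The model formula for this slide did not survive in the transcript. One common formulation for binary lattice data, given here only as an assumption about the kind of model meant, is a logistic model with an intrinsic conditional autoregressive random effect. The symbols below ($\beta_0$, $\phi_i$, $\sigma^2$, $\partial i$, $n_i$) are not from the slides.

```latex
Y_i \mid p_i \sim \mathrm{Bernoulli}(p_i), \qquad
\operatorname{logit}(p_i) = \beta_0 + \phi_i,
\qquad
\phi_i \mid \phi_{j \neq i} \sim
\mathcal{N}\!\left( \frac{1}{n_i} \sum_{j \in \partial i} \phi_j ,\; \frac{\sigma^2}{n_i} \right)
```

Here $\partial i$ is the set of lattice neighbors of cell $i$ and $n_i$ is their number, so each cell's random effect is centered on the average of its neighbors.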
18
Comparison of Models
  • Ordinary Indicator Kriging
      • Advantages
          • Knowledge of spatial relationship improves
            prediction
          • Assumed spatial relationship based on data
      • Disadvantages
          • Not robust to variogram misspecification
          • Requires strong stationarity assumption
  • Conditional autoregressive
      • Advantages
          • No need to estimate or model variogram
          • Can be used without geo-referenced data
      • Disadvantages
          • Assumed spatial relationship based on a grid
            size that could be inaccurate

19
Outline of Presentation
  • From last year to now: progress & new
    directions
  • Two-stage sample design
  • Spatial modeling of EMAP data
  • Simulation Example
  • Future work

20
Simulation Example
  • Used simulation so spatial structure was known
  • Simulated response from specific variogram model
    onto a 50x50 hexagon grid of points
  • Specified presence/absence cutoff
  • Applied two-stage sample design (2 realizations)
  • Estimated and modeled variogram from sample data
      • For some samples, did two manual fits and one
        automatic fit
  • Predicted probability of presence using indicator
    kriging and conditional autoregressive model
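The variogram step in the list above can be sketched as follows: a binned empirical semivariogram of the presence/absence indicator, plus the spherical model that is later fitted to it. This is an illustration under assumptions (equal-width lag bins; `sill` taken as the partial sill), not the S-Plus routine used for the slides.

```python
import numpy as np

def empirical_variogram(xy, z, lag_width, n_lags):
    """Binned empirical semivariogram: mean of 0.5 * (z_i - z_j)^2 per lag bin."""
    xy = np.asarray(xy, dtype=float)
    z = np.asarray(z, dtype=float)
    i, j = np.triu_indices(len(xy), k=1)          # every pair of sites once
    h = np.linalg.norm(xy[i] - xy[j], axis=1)     # pair separation distances
    half_sq = 0.5 * (z[i] - z[j]) ** 2
    bins = np.floor(h / lag_width).astype(int)

    lags, gammas, counts = [], [], []
    for b in range(n_lags):
        mask = bins == b
        if mask.any():                            # skip empty lag bins
            lags.append(h[mask].mean())
            gammas.append(half_sq[mask].mean())
            counts.append(int(mask.sum()))
    return np.array(lags), np.array(gammas), np.array(counts)

def spherical_variogram(h, rng, sill, nugget=0.0):
    """Spherical model; ASSUMPTION: `sill` is the partial sill (total = nugget + sill)."""
    h = np.asarray(h, dtype=float)
    r = np.minimum(h / rng, 1.0)
    gamma = nugget + sill * (1.5 * r - 0.5 * r ** 3)
    return np.where(h == 0.0, 0.0, gamma)
```

The pair counts returned per bin show why star clusters help: a systematic grid alone contributes no pairs at lags shorter than the grid spacing.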

21
Simulation Methods
  • Simulated data from Gaussian random field
    (S-Plus)
      • Spherical variogram, range 22, sill 0.4,
        nugget 0
      • Simulated value > 2 => presence
  • Sample Designs
      • Systematic sample (n = 30)
      • Systematic sample plus 2 star clusters (n = 54)
      • Systematic sample plus 4 star clusters (n = 78)
  • Models
      • Indicator kriging
      • Conditional autoregressive model
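A sketch of how such a simulation could be reproduced outside S-Plus, using a Cholesky factor of the spherical covariance matrix. The field mean is not stated in the slides, so `mean` is a free parameter here; the function names and the small diagonal jitter are also assumptions.

```python
import numpy as np

def hex_grid(n_rows, n_cols, spacing=1.0):
    """Points of a hexagonal lattice; odd rows are shifted by half a step."""
    rows, cols = np.meshgrid(np.arange(n_rows), np.arange(n_cols), indexing="ij")
    x = (cols + 0.5 * (rows % 2)) * spacing
    y = rows * spacing * np.sqrt(3.0) / 2.0
    return np.column_stack([x.ravel(), y.ravel()])

def simulate_presence(xy, rng=22.0, sill=0.4, mean=0.0, cutoff=2.0, seed=0):
    """Gaussian field with spherical covariance (nugget 0), thresholded at `cutoff`.

    ASSUMPTION: `mean` is not given in the slides and must be chosen by the user.
    """
    d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=2)
    r = np.minimum(d / rng, 1.0)
    cov = sill * (1.0 - 1.5 * r + 0.5 * r ** 3)
    cov[np.diag_indices_from(cov)] += 1e-8        # tiny jitter for numerical stability
    chol = np.linalg.cholesky(cov)
    z = mean + chol @ np.random.default_rng(seed).standard_normal(len(xy))
    return z, (z > cutoff).astype(int)            # 1 = presence, 0 = absence

# Small example; a 50x50 grid works the same way but builds a 2500 x 2500 matrix.
xy = hex_grid(10, 10)
z, presence = simulate_presence(xy, rng=5.0, mean=1.7)
```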

22
Data Simulation with Sample Sites
Legend: pink = absence; blue = presence; black = systematic sites; green = star clusters 1; orange = star clusters 2
23
Variogram for Sample Designs
Panels: Systematic; Systematic + 2 Stars; Systematic + 4 Stars
Sample design         Range  Sill  Nugget
Systematic            17     0.17  0
Systematic + 2 Stars  20     0.4   0
Systematic + 4 Stars  14     0.4   0
24
Systematic Sample Results
25
Systematic Sample with 2 Stars
26
Systematic Sample with 4 Stars
27
Three Fits: Systematic + 2 Stars
Panels: Automatic Fit; Manual Fit 1; Manual Fit 2
  Range      Sill  Nugget
  17         0.3   0
  [missing]  0.4   0
  [missing]  0.27  0
  • All use correct model

28
Predictions from 3 Variogram Fits
Automatic Fit
Manual Fit 1
Manual Fit 2
29
Comparison of Prediction Errors
  • Sensitivity
      • Percentage of presence sites predicted to be
        present
  • Specificity
      • Percentage of absence sites predicted to be
        absent
  • True Positive Rate
      • Percentage of predicted presence sites that
        truly are present
  • True Negative Rate
      • Percentage of predicted absence sites that
        truly are absent
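The four measures on this slide can be computed from a confusion matrix as sketched below. Note that, as defined here, "True Positive Rate" and "True Negative Rate" condition on the prediction, so they correspond to what is elsewhere called positive and negative predictive value. The function name and dictionary keys are hypothetical.

```python
def prediction_summary(truth, prob, cutoff=0.5):
    """Percentages for the four measures, calling a site positive if prob > cutoff."""
    pred = [p > cutoff for p in prob]
    tp = sum(1 for t, p in zip(truth, pred) if t and p)
    fn = sum(1 for t, p in zip(truth, pred) if t and not p)
    fp = sum(1 for t, p in zip(truth, pred) if not t and p)
    tn = sum(1 for t, p in zip(truth, pred) if not t and not p)

    def pct(num, den):
        # Undefined when the denominator is empty (e.g. nothing predicted present).
        return 100.0 * num / den if den else float("nan")

    return {
        "sensitivity": pct(tp, tp + fn),          # of true presences, share predicted present
        "specificity": pct(tn, tn + fp),          # of true absences, share predicted absent
        "true_positive_rate": pct(tp, tp + fp),   # of predicted presences, share truly present
        "true_negative_rate": pct(tn, tn + fn),   # of predicted absences, share truly absent
    }

truth = [1, 1, 1, 0, 0, 0, 0, 0]
prob = [0.9, 0.6, 0.2, 0.7, 0.1, 0.1, 0.4, 0.3]
at_05 = prediction_summary(truth, prob, cutoff=0.5)
at_03 = prediction_summary(truth, prob, cutoff=0.3)
```

Lowering the cutoff from 0.5 to 0.3 can only add predicted presences, so sensitivity cannot fall and specificity cannot rise, which is the trade-off visible in the comparison tables.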

30
Comparison of Predictions (Data1F)
Positive if probability > 0.5; values in parentheses: (Auto, Manual 2)
Model              Sample                Sensitivity  Specificity  True Positive Rate  True Negative Rate
Indicator Kriging  Systematic            28           98           85                  74
Indicator Kriging  Systematic + 2 Stars  41 (36, 27)  94 (96, 99)  77 (80, 76)         77 (90, 74)
Indicator Kriging  Systematic + 4 Stars  32           97           85                  75
Conditional Auto.  Systematic            15           96           63                  70
Conditional Auto.  Systematic + 2 Stars  56           85           64                  80
Conditional Auto.  Systematic + 4 Stars  54           86           65                  80
31
Comparison of Predictions (Data1F)
Positive if probability > 0.3; values in parentheses: (Auto, Manual 2)
Model              Sample                Sensitivity  Specificity  True Positive Rate  True Negative Rate
Indicator Kriging  Systematic            48           91           71                  78
Indicator Kriging  Systematic + 2 Stars  59 (56, 44)  85 (87, 93)  65 (67, 76)         81 (80, 78)
Indicator Kriging  Systematic + 4 Stars  49           91           73                  79
Conditional Auto.  Systematic            48           80           53                  76
Conditional Auto.  Systematic + 2 Stars  80           46           42                  83
Conditional Auto.  Systematic + 4 Stars  80           49           43                  83
32
Data Simulation with Sample Sites
Legend: pink = absence; blue = presence; black = systematic sites; green = star clusters 1; orange = star clusters 2
33
Variograms for Sample Designs
Panels: Systematic; Systematic + 2 Stars; Systematic + 4 Stars
Sample design         Range  Sill  Nugget
Systematic            15     0.27  0
Systematic + 2 Stars  12     0.30  0.05
Systematic + 4 Stars  13     0.30  0.03
34
Systematic Sample Results
35
Systematic Sample with 2 Stars
36
Systematic Sample with 4 Stars
37
Three Fits: Systematic
Panels: Automatic Fit; Manual Fit 1; Manual Fit 2
  Range      Sill  Nugget
  30         0.25  0.21
  15         0.27  0
  [missing]  0.22  0
  • All use correct model

38
Predictions from 3 Variogram Fits
Automatic Fit
Manual Fit 1
Manual Fit 2
39
Comparison of Predictions (Data3F)
Positive if probability > 0.5; values in parentheses: (Auto, Manual 2)
Model              Sample                Sensitivity  Specificity  True Positive Rate  True Negative Rate
Indicator Kriging  Systematic            31 (1, 15)   92 (99, 97)  65 (88, 69)         73 (68, 70)
Indicator Kriging  Systematic + 2 Stars  21           96           75                  72
Indicator Kriging  Systematic + 4 Stars  24           97           81                  72
Conditional Auto.  Systematic            7            98           65                  69
Conditional Auto.  Systematic + 2 Stars  17           97           71                  71
Conditional Auto.  Systematic + 4 Stars  18           99           88                  71
40
Comparison of Predictions (Data3F)
Positive if probability > 0.3; values in parentheses: (Auto, Manual 2)
Model              Sample                Sensitivity  Specificity  True Positive Rate  True Negative Rate
Indicator Kriging  Systematic            62 (72, 37)  80 (69, 89)  60 (53, 63)         81 (84, 75)
Indicator Kriging  Systematic + 2 Stars  43           90           68                  77
Indicator Kriging  Systematic + 4 Stars  44           91           71                  77
Conditional Auto.  Systematic            68           57           41                  77
Conditional Auto.  Systematic + 2 Stars  78           58           47                  84
Conditional Auto.  Systematic + 4 Stars  80           56           47                  85
41
Simulation Conclusions - Design
  • Two star clusters improved small-scale features
    of variogram
  • Two star clusters improved prediction accuracy
  • Four star clusters offered little improvement
    over two stars

42
Simulation Conclusions - Models
  • Variogram model affects predictions
  • Kriging tends toward overall mean probability of
    presence, i.e. it smooths
  • Kriging builds patches whose diameter is
    approximately the range of the variogram
  • Conditional autoregressive model attempts to
    connect observed presence
  • Neither model had consistently higher sensitivity
    or specificity

43
Outline of Presentation
  • From last year to now: progress & new
    directions
  • Two-stage sample design
  • Spatial modeling of EMAP data
  • Simulation Example
  • Future work

44
Future Work
  • Further simulation studies on two-stage design
      • Effect of sample size
      • Number of star clusters necessary to improve
        variogram estimation
      • Effect of size of star clusters
      • Bias from adaptive second-stage sampling
  • Advantages of indicator kriging and conditional
    autoregressive model
      • Sensitivity of conditional autoregressive model
        to initial values, prior distributions, and
        grid size
      • Sensitivity of kriging to variogram model
        specification

45
Future Work
  • Apply two-stage sample design to real data
      • DDT data from Santa Monica Bay, CA
      • EMAP data and local monitoring data
  • Freely distribute functions for applying the
    conditional autoregressive model on a hexagon
    lattice
      • Functions in R to produce hexagon lattice input
        for WinBUGS
      • File in WinBUGS to apply model
  • Investigate optimal grid size to achieve EMAP and
    spatial modeling goals
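The authors' R functions are not shown in the slides. As an illustration of what "hexagon lattice input" might involve, the sketch below builds neighbor lists for a hexagonal lattice stored in offset rows, returning a flattened adjacency vector and per-cell neighbor counts. It assumes the GeoBUGS `car.normal` convention of `adj` and `num` vectors with 1-based cell indices; the function name and the row-offset layout are hypothetical.

```python
def hex_neighbors(n_rows, n_cols):
    """Neighbor lists for a hexagonal lattice with odd rows shifted half a step right.

    Returns (adj, num): `adj` is the flattened list of 1-based neighbor indices,
    `num[i]` is the number of neighbors of cell i + 1.
    """
    def index(r, c):
        return r * n_cols + c + 1                 # 1-based, row-major

    adj, num = [], []
    for r in range(n_rows):
        # Columns of the two touching cells in the rows above and below.
        diag_cols = (0, 1) if r % 2 else (-1, 0)
        for c in range(n_cols):
            candidates = [(r, c - 1), (r, c + 1)]
            candidates += [(r + dr, c + dc) for dr in (-1, 1) for dc in diag_cols]
            inside = [(rr, cc) for rr, cc in candidates
                      if 0 <= rr < n_rows and 0 <= cc < n_cols]
            adj.extend(index(rr, cc) for rr, cc in inside)
            num.append(len(inside))
    return adj, num

adj, num = hex_neighbors(3, 3)
```

Interior cells get six neighbors and edge cells fewer; equal weights of 1 for every entry of `adj` would be the usual companion vector under the same assumption.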

46
Systematic (EMAP) Grid Based on Variogram Model
  • Kriging variance
  • Analog for conditional autoregressive model

47
Systematic (EMAP) Grid Based on Variogram Model
  • Prediction variance is minimized by large
    covariance between prediction location and sample
    locations
      • For kriging, grid refers to sample locations
      • For conditional autoregressive, grid refers to
        sample locations and prediction locations
  • Want: sample locations close together
  • Samples too far apart =>
      • Kriging -> correctly uses no spatial
        relationship
      • Conditional autoregressive -> incorrectly uses
        assumed spatial relationship
  • Samples too close together => waste of resources
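To make the spacing trade-off concrete, the sketch below evaluates the simple kriging variance, C(0) - c0' C^-1 c0, at the center of a square cell with samples at its four corners. The square cell (rather than a hexagon), the function names, and the default spherical parameters (range 22, sill 0.4, borrowed from the simulation slides) are all assumptions made for illustration.

```python
import numpy as np

def spherical_cov(h, sill=0.4, rng=22.0):
    """Spherical covariance with zero nugget; zero at and beyond the range."""
    r = np.minimum(np.asarray(h, dtype=float) / rng, 1.0)
    return sill * (1.0 - 1.5 * r + 0.5 * r ** 3)

def center_kriging_variance(spacing, sill=0.4, rng=22.0):
    """Simple kriging variance at the center of a square cell of side `spacing`."""
    corners = spacing * np.array([[0, 0], [1, 0], [0, 1], [1, 1]], dtype=float)
    center = np.array([spacing / 2.0, spacing / 2.0])
    d = np.linalg.norm(corners[:, None, :] - corners[None, :, :], axis=2)
    C = spherical_cov(d, sill, rng)                                    # sample-to-sample
    c0 = spherical_cov(np.linalg.norm(corners - center, axis=1), sill, rng)  # sample-to-target
    return float(sill - c0 @ np.linalg.solve(C, c0))

variances = {s: center_kriging_variance(s) for s in (5.0, 15.0, 40.0)}
```

Once the corner-to-center distance exceeds the range, every covariance with the target is zero and the variance equals the sill, which is the "no spatial relationship" case in the list above.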