Title: Comparative Analysis of Statistical Tools To Identify Recruitment-Environment Relationships and Forecast Recruitment Strength
1Comparative Analysis of Statistical Tools To
Identify Recruitment-Environment Relationships
and Forecast Recruitment Strength
- Bernard A. Megrey
- Yong-Woo Lee
- S. Allen Macklin
National Oceanic and Atmospheric Administration
National Marine Fisheries Service
Alaska Fisheries Science Center
Seattle, WA 98115 USA
2Overview
- Background and motivation
- Mechanics of testing procedures
- Results of application of 3 statistical tools to 2 data sets
- Concluding remarks and observations
3Why Forecast Recruitment?
- Understand important bio-physical factors controlling the recruitment processes
- The ultimate test of a model is its ability to predict
- Project future stock dynamics
- Evaluate management scenarios
- Provide reference points for fishery management
- Assist commercial fisheries decision making
4- The data we collect as it relates to recruitment variability and the factors that influence it probably will not change dramatically in the near future.
- We should endeavor to treat the data differently in a statistical sense.
- R.J.H. Beverton 1989
5- What are the best statistical tools for
estimating environment-recruitment relationships
and forecasting future recruitment states?
6Problems in Forecasting
- The complexity of recruitment forecasting often seems beyond the capabilities of traditional statistical analysis paradigms because:
- Bio-physical relationships are inherently nonlinear
- Often there are limitations in theoretical development or standard models cannot deal with data pathologies
- Inability to meet required assumptions
- Time series of data are short
- Lack of degrees of freedom
- The need to partition already short time series into segments representing identified regimes
7Objectives
- Test and compare several statistical methods to evaluate their ability
- to identify recruitment-environment relationships
- to forecast future recruitment
- In a real-world setting we can never know the parameters and underlying relationships of actual data
- Simulate data with known properties and different levels of measurement error using Gulf of Alaska pollock
- Use methods on actual North Atlantic data
- Norwegian spring-spawning herring spawning biomass (SB) and recruitment (R), Kola Line SST, and index of the NAO (Toresen and Ostvedt 2000)
- Environmental effects occur in birth year (i.e., no lags)
8Simulated Data with Known Properties
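$R = aS\exp(-bS + cN + dT + \varepsilon)$
where $R$ = recruitment, $S$ = spawning biomass, $N$ = wind anomaly (no relationship to recruitment), $T$ = sea surface temperature, and $\varepsilon$ = measurement error, $\varepsilon \sim N(0, \sigma^2)$; $\sigma^2$ was estimated from a Ricker fit to actual data.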
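The simulation model above could be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function name `simulate_recruitment`, every parameter value, and the input distributions are hypothetical placeholders, and setting `c = 0.0` is one assumed way to encode "no relationship" for wind.

```python
import math
import random


def simulate_recruitment(spawners, wind, sst, a, b, c, d, sigma, rng):
    """Return R = a*S*exp(-b*S + c*N + d*T + e) per year, with e ~ N(0, sigma^2)."""
    recruits = []
    for s, n, t in zip(spawners, wind, sst):
        e = rng.gauss(0.0, sigma)  # measurement error; exactly 0.0 when sigma == 0
        recruits.append(a * s * math.exp(-b * s + c * n + d * t + e))
    return recruits


rng = random.Random(1)  # fixed seed so the sketch is reproducible
years = 42              # length of the simulated series in the talk

# Hypothetical inputs: distribution shapes and values are illustrative only.
spawners = [rng.gammavariate(4.0, 0.5) for _ in range(years)]
wind = [rng.lognormvariate(0.0, 0.5) for _ in range(years)]
sst = [rng.gauss(0.0, 1.0) for _ in range(years)]

# c = 0.0 (assumption) means wind has no effect on recruitment.
noise_free = simulate_recruitment(spawners, wind, sst, 2.0, 0.5, 0.0, 0.3, 0.0, rng)
noisy = simulate_recruitment(spawners, wind, sst, 2.0, 0.5, 0.0, 0.3, 0.5, rng)
```

Raising `sigma` gives series with progressively more measurement error, which is the kind of contrast the error levels in the talk are meant to provide.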
9Summary of Simulated Data
                               SB                      Wind        SST
Relationship to Recruitment    Nonlinear Ricker        none        Linear
Functional Relationship        Exponential Nonlinear   Mean        Log Linear
Probability Distribution       Gamma                   Lognormal   Normal
10Herring
Simulated
11Tested Statistical Tools
- Recruitment on the absolute scale (billion fish)
- Nonlinear Regression (NLR)
- Generalized Additive Models (GAM)
- Artificial Neural Network (ANN)
Fisheries applications
- GAM: Cury et al. 1995; Swartzman et al. 1995; Meyers et al. 1995; Jacobsen and MacCall 1995; Daskalov 1999
- ANN: Chen and Ware 1999
12Neural Networks
13Generalized Additive Models
14Comparisons
- Statistical Methods
- Parametric (NLR) vs. Non-parametric (GAM, ANN)
- Conventional (NLR) vs. Innovative (GAM, ANN)
- Model Free (GAM, ANN) vs. functional
relationships specified a priori (NLR)
15Time Series Partitioning
- 2 Data Segments
- Training segment used for parameter estimation
- Forecasting segment used to assess forecasting accuracy
- Simulated Data (n = 42)
- Training segment (n = 37)
- Forecasting segment (n = 5)
- Herring Data (n = 89)
- Training segment (n = 79)
- Forecasting segment (n = 10)
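The partitioning above might look like the following sketch. It assumes the forecasting segment is the final block of years in each series, which the slide does not state explicitly. The helper name `split_series` is hypothetical, and the integer lists stand in for the real yearly observations.

```python
def split_series(series, n_forecast):
    """Split a time-ordered series into (training, forecasting) segments.

    Assumption: the forecasting segment is the last n_forecast observations.
    """
    if not 0 < n_forecast < len(series):
        raise ValueError("n_forecast must be between 1 and len(series) - 1")
    return series[:-n_forecast], series[-n_forecast:]


# Placeholder series with the lengths given on the slide.
simulated = list(range(42))
herring = list(range(89))

sim_train, sim_forecast = split_series(simulated, 5)
her_train, her_forecast = split_series(herring, 10)
```

Keeping the held-out years contiguous and at the end of the record means the models are scored on years they never saw during parameter estimation.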
16Simulated vs Predicted, for Error level 0
17Simulated Data Testing and Forecast Segment Comparison using MSE
[Figure panels: Error 2, Error 1, Error 0]
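For reference, the mean squared error (MSE) used in this comparison can be computed as in the sketch below. The observed and predicted values are made up purely for illustration and are not results from the talk. The names `model_a` and `model_b` are hypothetical.

```python
def mean_squared_error(observed, predicted):
    """Average of squared differences between observed and predicted values."""
    if not observed or len(observed) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


# Illustrative numbers only (not data from the study).
observed = [2.0, 4.0, 6.0]
model_a = [2.5, 3.5, 6.5]   # misses every year by 0.5
model_b = [2.0, 4.0, 9.0]   # exact twice, then misses by 3.0

mse_a = mean_squared_error(observed, model_a)
mse_b = mean_squared_error(observed, model_b)
```

Because errors are squared, a single large miss (as in `model_b`) is penalized more heavily than several small ones, which matters when the forecasting segment has only 5 or 10 years.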
18Simulated Data ANN Relative Weight Comparison
3 variables, 2 hidden neurons, 10 parameters
2 variables, 2 hidden neurons, 8 parameters
19Spurious Correlations
- We did see evidence of spurious correlations when analyzing the simulated data.
- The GAM model, Err 2: R ~ SB + WIND + SST
- WIND was significant in the NLR model, Err 3.
When dealing with data with typical levels of variation, it is possible to conclude that unnecessary or irrelevant variables are significant.
Spurious correlations are the first enemy of recruitment biologists (Tyler 1992)
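How often an irrelevant variable looks significant by chance can be illustrated with a small Monte Carlo sketch. This deliberately swaps in a simple Pearson correlation t-test rather than the GAM or NLR fits used in the talk. The function names, the number of trials, and the approximate critical value 2.03 (two-sided 5% level, 35 degrees of freedom) are assumptions made for illustration.

```python
import math
import random


def pearson_r(x, y):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spurious_rate(n_years, n_trials, t_crit, rng):
    """Fraction of trials in which an unrelated 'wind' series tests as significant."""
    hits = 0
    for _ in range(n_trials):
        wind = [rng.gauss(0.0, 1.0) for _ in range(n_years)]
        # Log recruitment is generated independently of wind by construction.
        log_recruits = [rng.gauss(0.0, 1.0) for _ in range(n_years)]
        r = pearson_r(wind, log_recruits)
        t = r * math.sqrt((n_years - 2) / (1.0 - r * r))
        if abs(t) > t_crit:
            hits += 1
    return hits / n_trials


# 37 years matches the simulated training segment; 2.03 is an approximate
# two-sided 5% critical value for t with 35 degrees of freedom.
rate = spurious_rate(37, 2000, 2.03, random.Random(0))
```

Under these assumptions the rate should sit near the nominal 5%, so screening many candidate environmental variables on a short series will flag some irrelevant ones purely by chance.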
20Herring Data
Testing and Forecast Segment Comparison using MSE and R²
ANN Relative Weight Comparison
21GAM fit to Herring Data
22Summary
- Need to be cautious when dealing with noisy data, because a wrong model or variable could be identified as influential to recruitment.
- We did see evidence of spurious correlations under very controlled data situations.
- It appears that ANNs forecast better than conventional parametric methods when data are noisy.
- Non-parametric methods (GAMs and ANNs) work well for suggesting functional relationships and forecasting future recruitment states
- This is a desirable property because real systems are highly non-linear and include complex interactions among the variables.
23Summary (cont)
- There is no one best method to address the environment-recruitment problem.
- ANNs are highly flexible and show promise for forecasting, thus using GAMs and ANNs together with more traditional methods should enhance analysis and forecasting.
- When considering estimation in conjunction with forecasting it is better to consider a balance between best models.
- Results underscore the need to build good conceptual models first, then, guided by hypotheses regarding factors that control recruitment and their time and space scales of influence, judiciously apply a suite of statistical models to quality data sets.
- Data mining and kitchen stew correlation exercises are not appropriate.
24Simulated Data
25Norwegian Spring-Spawning Herring