Title: Lecture 4 Model Formulation and Choice of Functional Forms: Translating Your Ideas into Models
1 Lecture 4 Model Formulation and Choice of Functional Forms: Translating Your Ideas into Models
2 Topics
- Alternate models as multiple working hypotheses.
- Null models
- Choice of functional forms
3 The triangle of statistical inference
[Figure: triangle diagram with labels Data, Inference, Probability Model, and Scientific Model (hypothesis)]
All hypotheses can be expressed as models!
4 The Scientific Method
- Science is a process for learning about nature in which competing ideas are measured against observations (Feynman 1965).
5 Scientific Process
- Devise alternative hypotheses.
- Devise experiment(s) with alternative possible outcomes.
- Carry out experiments.
- Recycle procedure.
-- Platt 1964 (Strong inference)
But this is time consuming and not very useful for many questions.
6 The method of multiple working hypotheses
- It differs from the simple working hypothesis in that it distributes the effort and divides the affection.
- Bring up into review every rational explanation of the phenomenon in hand and develop every tenable hypothesis relative to its nature.
- Some of the hypotheses have already been proposed and used while others are the investigator's own creations.
- An adequate explanation often involves the coordination of several causes.
- When faithfully followed for a sufficient time it develops the habit of parallel or complex thought.
- The power of viewing phenomena analytically and synthetically at the same time appears to be gained.
--- T. C. Chamberlin, 1890. Science 15: 92.
7 What is the best model to use?
- This is the critical question in making valid inferences from data.
- Careful a priori consideration of alternative models will often require a major change in emphasis among scientists.
- Model specification is more difficult than the application of likelihood techniques.
8 Formulation of Candidate Models
Translating your qualitative ideas into a
quantitative, algebraic model that can be tested
against alternative models
- Conceptually difficult.
- Subjective.
- Original and innovative.
- Models represent a scientific hypothesis.
9 Where do models come from?
- Scientific literature.
- Results of manipulative experiments.
- Personal experience.
- Scientific debate.
- Natural resource management questions.
- Monitoring programs.
- Judicial hearings.
10 Are models truth?
- Truth has infinite dimensions.
- Sample data are finite.
- Models should provide a good approximation to the data.
- Larger data sets will support more complex approximations to reality.
11 "...empiricism, like theory, is based on a series of simplifying assumptions... By choosing what to measure and what to ignore, an empiricist is making as many assumptions as does any theoretician." --David Tilman
Model selection is implicit in science
12 Develop a set of a priori candidate models
- Include a global model that includes all potentially relevant effects.
- Test the global model (R-square, goodness of fit tests).
- Develop alternative simpler models.
13 Assessing alternative models
- How well does the model approximate truth relative to its competitors? (high accuracy or low bias).
- How repeatable is the prediction of a model relative to its competitors? (high precision or low variance).
14 Why do model selection at all? Principle of parsimony
[Figure: variance and squared bias (Bias^2) plotted against number of parameters, from few to many]
15 Principle of parsimony applied to model selection
- We typically penalize added complexity.
- A more complex model has to exceed a certain threshold of improvement over a simpler model.
- Added complexity usually makes a model more unstable.
- Complex models spread the data too thinly over parameters.
- Model selection is not about whether something is true or not but about whether we have enough information to characterize it properly.
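The "threshold of improvement" idea can be made concrete with Akaike's Information Criterion, one of the model selection methods listed later in these slides. The following is a minimal sketch, not the lecture's own procedure: the log-likelihood values are hypothetical, and the Gaussian helper assumes independent normal errors with the variance set to its maximum likelihood estimate. Because AIC adds 2 per parameter to -2 times the log-likelihood, one extra parameter only pays off if it raises the log-likelihood by more than 1.

```python
import math

def aic(log_likelihood, n_params):
    """Akaike's Information Criterion; lower values indicate a better trade-off."""
    return -2.0 * log_likelihood + 2.0 * n_params

def gaussian_log_likelihood(residuals):
    """Maximized log-likelihood assuming i.i.d. normal errors (variance at its MLE)."""
    n = len(residuals)
    sigma2 = sum(r * r for r in residuals) / n
    return -0.5 * n * (math.log(2.0 * math.pi * sigma2) + 1.0)

# Hypothetical log-likelihoods, chosen only to illustrate the penalty.
simple = aic(log_likelihood=-120.0, n_params=2)
complex_small_gain = aic(log_likelihood=-119.5, n_params=3)  # gain of 0.5: not enough
complex_large_gain = aic(log_likelihood=-117.0, n_params=3)  # gain of 3.0: enough

print(simple, complex_small_gain, complex_large_gain)
```

When counting parameters for a Gaussian model, the error variance is usually counted as one of them.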
16 Reality / Actual data
Example from pages 33-34 of Burnham and Anderson
17 A set of candidate models
18 Too simple: High bias (low accuracy)
UNDERFITTING!!
19 Too complicated: High variance (low precision)
OVERFITTING!!
20 The compromise: a parsimonious model
REASONABLE FIT
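The underfitting and overfitting pattern can be reproduced with a small simulation. This is a sketch on synthetic data, not the Burnham and Anderson example: the "true" curve, the noise level, and the polynomial degrees are all assumptions chosen for illustration. Each polynomial is fitted to one noisy sample and then scored on a second sample drawn from the same curve.

```python
import numpy as np

rng = np.random.default_rng(0)

def truth(x):
    # Hypothetical "reality": a smooth nonlinear curve.
    return np.sin(2.0 * x)

# One noisy sample to fit the models, one fresh sample to check them.
x_fit = np.linspace(-1.0, 1.0, 25)
y_fit = truth(x_fit) + rng.normal(0.0, 0.2, x_fit.size)
x_new = np.linspace(-0.95, 0.95, 200)
y_new = truth(x_new) + rng.normal(0.0, 0.2, x_new.size)

def mse(coeffs, x, y):
    """Mean squared error of a fitted polynomial on the data (x, y)."""
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))

results = {}
for degree in (0, 3, 10):  # too simple, intermediate, very flexible
    coeffs = np.polyfit(x_fit, y_fit, degree)
    results[degree] = (mse(coeffs, x_fit, y_fit), mse(coeffs, x_new, y_new))

for degree, (fit_err, new_err) in results.items():
    print(f"degree {degree:2d}: fit MSE {fit_err:.3f}, new-data MSE {new_err:.3f}")
```

The error on the fitting sample can only go down as the degree increases, because the models are nested. The error on the new sample is the quantity that reveals whether the extra flexibility is describing the curve or the noise.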
21 Null Models
- Parametric methods advocate testing hypotheses against a null expectation (H0).
- Often the null is probably false simply on a priori grounds (e.g., the parameter θ had no effect).
- In likelihood terms this usually means the null model is the one that sets the value of parameter θ equal to 0 or 1.
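In likelihood terms, a null model is simply a candidate model with one parameter fixed. The sketch below assumes a straight-line model with normal errors and synthetic data; the null fixes the slope at 0 and the alternative estimates it. The two are compared with a likelihood ratio statistic, which for one extra free parameter is approximately chi-square with 1 degree of freedom. All variable names and data values here are illustrative.

```python
import math
import numpy as np

def gaussian_max_loglik(residuals):
    """Maximized log-likelihood assuming i.i.d. normal errors (variance at its MLE)."""
    n = residuals.size
    sigma2 = float(np.mean(residuals ** 2))
    return -0.5 * n * (math.log(2.0 * math.pi * sigma2) + 1.0)

rng = np.random.default_rng(1)
x = np.linspace(0.0, 10.0, 40)
y = 2.0 + 0.3 * x + rng.normal(0.0, 1.0, x.size)  # synthetic data

# Null model: slope fixed at 0, so only the mean is estimated.
loglik_null = gaussian_max_loglik(y - y.mean())

# Alternative model: the slope is a free parameter.
slope, intercept = np.polyfit(x, y, 1)
loglik_alt = gaussian_max_loglik(y - (intercept + slope * x))

# Likelihood ratio statistic and its chi-square (1 df) tail probability.
lr_stat = max(2.0 * (loglik_alt - loglik_null), 0.0)
p_value = math.erfc(math.sqrt(lr_stat / 2.0))

print(f"LR statistic {lr_stat:.2f}, approximate p-value {p_value:.4g}")
```

The identity used for the tail probability, erfc(sqrt(x/2)), holds only for 1 degree of freedom; comparisons that free more than one parameter need the general chi-square distribution.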
22 States of mind of a null hypothesis tester
[Table: statistical significance of observed difference (Not significant, Significant) crossed with practical importance of observed difference (Not important, Important)]
23 Model Selection Methods
- Adjusted R-square.
- Likelihood Ratio Tests.
- Akaike's Information Criterion.
We will talk about these topics later.
24 Choice of Functional Forms
- Model formulation requires the specification of a functional form that formalizes the relationship between the predictive variables and the process we are trying to understand.
- The functional form should clarify the verbal description of the mechanisms driving the process under study.
- Choosing a functional form is a skill that needs to be developed over time.
25 Choice of Functional Forms: Mechanism vs. phenomenology
- Mechanistic: based on some biological or ecological model.
- Phenomenological: functions that fit the data well or are simple/convenient to use.
26 Choice of functional forms: What matters?
- Does it represent what happens in your model?
- Does the shape of the function resemble actual data?
- Does the function deliver the desired range of values?
- Does the function allow for ready variation of the aspects of the question that the researcher wants to explore?
- What happens at either end (as x → 0 and x → ∞)?
- What happens in the middle?
- Critical points (maxima, minima).
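Several of these questions can be checked numerically before any fitting is done. The helper below is a hypothetical convenience function, not part of any library: it evaluates a candidate function on a grid and reports its values at both ends, its range, where its maximum and minimum fall, and whether it is monotonic. The saturating curve used as the example, with asymptote a and initial slope s, is one assumed parameterization.

```python
import numpy as np

def describe_shape(f, x_low, x_high, n=2001):
    """Summarize a candidate function on a grid from x_low to x_high."""
    x = np.linspace(x_low, x_high, n)
    y = f(x)
    i_max, i_min = int(np.argmax(y)), int(np.argmin(y))
    steps = np.diff(y)
    return {
        "value_at_low_end": float(y[0]),
        "value_at_high_end": float(y[-1]),
        "range": (float(y[i_min]), float(y[i_max])),
        "x_at_max": float(x[i_max]),
        "x_at_min": float(x[i_min]),
        "monotonic": bool(np.all(steps >= 0) or np.all(steps <= 0)),
    }

def saturating(x, a=1.5, s=0.5):
    # Assumed form: a is the asymptote, s is the slope at x = 0.
    return a * x / (a / s + x)

summary = describe_shape(saturating, 0.0, 1000.0)
for key, value in summary.items():
    print(key, value)
```

A grid check like this only approximates limits and critical points, so it complements rather than replaces working through the algebra.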
27 Model Functions vs. Probability Density Functions
Properties of pdfs
[Figure: Prob(x) plotted against x]
28 Some useful functions (not necessarily pdfs!)
- Exponential.
- Weibull.
- Logistic.
- Lognormal.
- Power.
- Generalized Poisson.
- Logarithmic.
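Most of the functions in this list can be written in a line or two each. The parameterizations below are assumptions chosen for illustration, since each of these functions has several common forms and the ones used in the lecture may differ. The generalized Poisson is left out because its form is not given here. In this assumed parameterization, the Weibull-type curve reduces to the exponential when its shape parameter c equals 1.

```python
import numpy as np

def exponential(x, a, b):
    """Exponential decline from a at x = 0, with rate b."""
    return a * np.exp(-b * x)

def weibull_type(x, a, b, c):
    """Weibull-type decline; c controls the shape."""
    return a * np.exp(-b * np.power(x, c))

def logistic(x, a, b):
    """Logistic curve bounded between 0 and 1."""
    return 1.0 / (1.0 + np.exp(-(a + b * x)))

def lognormal_type(x, a, x0, xb):
    """Hump-shaped curve with peak height a at x = x0 and breadth xb (x > 0)."""
    return a * np.exp(-0.5 * (np.log(x / x0) / xb) ** 2)

def power(x, a, b):
    """Power function a * x**b."""
    return a * np.power(x, b)

def logarithmic(x, a, b):
    """Logarithmic function a + b * ln(x), defined for x > 0."""
    return a + b * np.log(x)

x = np.linspace(0.1, 10.0, 50)
same_curve = np.allclose(weibull_type(x, 1.0, 0.3, 1.0), exponential(x, 1.0, 0.3))
print("Weibull-type with c = 1 matches exponential:", same_curve)
```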
29 Exponential
30 Exponential: Decline in maximum potential growth as a function of crowding
[Figure: effect on growth (growth multiplier, 0 to 1) against NCI (Neighborhood Crowding Index) for Species A and Species B]
31 Michaelis-Menten function
[Figure: two curves, one with a = 1.43, s = 0.76 and one with a = 1.63, s = 0.31]
32 Weibull function
The exponential is a special case of the Weibull
function (ß0)
33 Weibull Example: Dispersal functions
34 Logistic
35 Logistic: Probability of mortality as a function of storm severity
Canham et al. 2001
36 Lognormal
37 Lognormal: Leaf litterfall as a function of distance to the parent tree
Data from GMF, CT
38 Lognormal: Growth as a function of DBH
[Figure: y-axis Max. Potential Growth (cm/yr)]
Data from LFDP, Puerto Rico
39 Power function: small mammal distribution as a function of canopy tree neighborhood
Schnurr et al. 2004.
40 Parameter trade-offs: More than one way to get there.
[Figure: x-axis NCI (Neighborhood Crowding Index)]
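A parameter trade-off is easiest to see when two parameters enter a function only through their product. The sketch below uses a hypothetical crowding function, not the one behind the figure, in which the growth multiplier depends on c and lam only as c * lam. Two quite different parameter pairs then give the same curve, so data on the curve alone cannot separate them.

```python
import numpy as np

def growth_multiplier(nci, c, lam):
    # Hypothetical form: the parameters c and lam only appear multiplied together.
    return np.exp(-c * lam * nci)

nci = np.linspace(0.0, 50.0, 101)

# Two parameter sets with the same product (0.04).
curve_a = growth_multiplier(nci, c=0.02, lam=2.0)
curve_b = growth_multiplier(nci, c=0.08, lam=0.5)

max_difference = float(np.max(np.abs(curve_a - curve_b)))
print("largest difference between the two curves:", max_difference)
```

In practice the trade-off is often partial rather than exact, and it shows up as strongly correlated parameter estimates. Fixing one parameter or reparameterizing in terms of the product are two common ways to deal with it.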
41 Things to keep in mind
- Scaling issues: Pay attention to units, scales, and conversions.
- Multiplicative functions and parameter tradeoff.
- Computational issues
  - Large exponent values
  - Division by zero
  - Logs of negative numbers
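The three computational issues can be guarded against with small wrapper functions. These helpers are hypothetical and the thresholds are assumptions: they keep an optimizer from crashing when it proposes extreme parameter values, at the cost of silently replacing invalid inputs, so raising an error instead may be preferable while a model is still being debugged.

```python
import numpy as np

def safe_exp(z, max_exponent=700.0):
    """Exponential with the argument clipped; float64 overflows near exp(709)."""
    return np.exp(np.clip(z, -max_exponent, max_exponent))

def safe_divide(numerator, denominator, eps=1e-12):
    """Division that replaces near-zero denominators with a small value of the same sign."""
    den = np.asarray(denominator, dtype=float)
    den = np.where(np.abs(den) < eps, np.copysign(eps, den), den)
    return np.asarray(numerator, dtype=float) / den

def safe_log(x, floor=1e-300):
    """Natural log with zero or negative inputs replaced by a tiny positive floor."""
    return np.log(np.maximum(x, floor))

print(safe_exp(1.0e6))         # finite instead of overflowing to inf
print(safe_divide(1.0, 0.0))   # finite instead of a division-by-zero warning
print(safe_log(-5.0))          # finite instead of nan
```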
42 Some useful references
- Catalog of curves for curve fitting. British Columbia Ministry of Forests.
- Abramowitz, M. and I. Stegun. 1965. Handbook of Mathematical Functions.
- McGill, B. 2003. Strong and weak tests of macroecological theory. Oikos.
- Vanclay, J. 1995. Growth models for tropical forests: a synthesis of models and methods. Forest Science.