Fitting models to data

Slides: 20
Provided by: PaulB136
Transcript and Presenter's Notes

1
Fitting models to data II (The Basics of Maximum Likelihood Estimation)
  • Fish 458, Lecture 9

2
The Principle of ML Estimation
  • We wish to select the values for the parameters
    so that the probability that the model generated
    (is responsible for) the data is as high as
    possible.
  • Taken another way, if we have two candidate sets
    of parameters and the probability that one
    generated the data is ten times that of the other,
    we would naturally prefer the former.
  • OK, so how do we define this probability?

3
The Likelihood Function
  • What we need to compute is the likelihood
    function
  • If we have a discrete set of hypotheses / set of
    parameter vectors, then
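
As a sketch of the standard definition (the symbols $\theta$ for a parameter vector and $D$ for the data are assumed notation, not necessarily the lecture's own), the likelihood of a parameter vector is the probability of the observed data under that vector. With a discrete set of candidates it is evaluated at each candidate, and two candidates are compared through their likelihood ratio:

```latex
L(\theta \mid D) = P(D \mid \theta), \qquad
\frac{L(\theta_1 \mid D)}{L(\theta_2 \mid D)}
  = \frac{P(D \mid \theta_1)}{P(D \mid \theta_2)}
```

A ratio of 10 corresponds to the "ten times" preference described under the principle of ML estimation.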

4
A First Example
  • We observe Y = 6 and know that the observation
    process is based on the equation
  • Given Y = 6, the likelihood function is normal

5
A First Example - II
[Figure: likelihood functions for Y = 4 and Y = 6]
Note: the likelihood is a function of the parameter, not of the data; we are given the data.
6
Multiple Data Sources
  • If we have multiple data sources (CPUE and survey
    data for Cape Hake), we can establish a
    likelihood for each data source. The likelihood
    for the two data sources combined is the product
    of the likelihoods for each data source
  • Note: We often work with the logarithm of the
    likelihood function, i.e. $\ln L = \ln L_1 + \ln L_2$
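
A minimal sketch of combining two data sources, assuming (purely for illustration) that both sources observe the same quantity with normal errors. The observation values, standard deviations and the function name `normal_log_likelihood` are hypothetical and are not taken from the Cape hake data.

```python
import math

def normal_log_likelihood(observations, predicted, sd):
    """Sum of normal log-densities of the observations around `predicted`."""
    return sum(
        -0.5 * math.log(2.0 * math.pi) - math.log(sd)
        - (y - predicted) ** 2 / (2.0 * sd ** 2)
        for y in observations
    )

# Hypothetical numbers, for illustration only.
cpue_obs, cpue_sd = [9.5, 10.4, 10.1], 0.5
survey_obs, survey_sd = [9.0, 11.2], 1.0
theta = 10.0  # candidate parameter value being evaluated

# One log-likelihood per data source.
ll_cpue = normal_log_likelihood(cpue_obs, theta, cpue_sd)
ll_survey = normal_log_likelihood(survey_obs, theta, survey_sd)

# Multiplying likelihoods is the same as adding log-likelihoods.
ll_total = ll_cpue + ll_survey
print(ll_cpue, ll_survey, ll_total)
```

Working on the log scale avoids multiplying many small numbers together, which is one practical reason the logarithm is usually preferred.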

7
Likelihood Estimation
  • Identify the questions.
  • Identify the data sources.
  • Select alternative models.
  • Select appropriate likelihood functions for each
    data source.
  • Find the values for the parameters that maximize
    the likelihood function (hence Maximum Likelihood
    Estimation).

8
Finding the Maximum Likelihood Estimates
The best estimate is 6, because this value of the parameter
leads to the maximum likelihood
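
The idea can be sketched with a simple grid search. This assumes a single observation Y = 6 with normally distributed error around the parameter; the standard deviation of 1 and the grid limits are assumptions made only for the illustration.

```python
import math

def normal_likelihood(y, theta, sd):
    """Normal density of observation y when the parameter (mean) is theta."""
    return math.exp(-(y - theta) ** 2 / (2.0 * sd ** 2)) / (math.sqrt(2.0 * math.pi) * sd)

y_obs = 6.0
sd = 1.0                                # assumed observation error
grid = [i / 10 for i in range(0, 121)]  # candidate parameter values 0.0 .. 12.0

# Evaluate the likelihood at every candidate and keep the largest.
likelihoods = [normal_likelihood(y_obs, theta, sd) for theta in grid]
best_theta = grid[likelihoods.index(max(likelihoods))]
print(best_theta)
```

The likelihood is evaluated as a function of the parameter with the data held fixed, and the maximum falls at the candidate equal to the observation.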
9
Therefore.
We need to know which probability density
functions to use for which data types.
  • The probability distributions encountered most
    commonly are:
    • Normal / multivariate normal
    • t
    • Log-normal
    • Poisson
    • Negative binomial
    • Beta
    • Binomial / multinomial

You need to know when to use each distribution
and its functional form (up to any normalizing
constants).
10
The Normal and t-distributions
  • The density functions for the normal and
    t-distributions are
  • μ is the mean
  • σ is the standard deviation (approximately, for the t)
  • k is the degrees of freedom.
  • We use these distributions when the data are the
    sum of terms. The t-distribution allows account
    to be taken of small sample sizes (n < 30).
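
For reference, the textbook forms of the two densities are given here, assuming the location-scale parameterisation of the t-distribution with location $\mu$, scale $\sigma$ and $k$ degrees of freedom. The notation is an assumption and may differ from the lecture's.

```latex
f_{\text{normal}}(x) = \frac{1}{\sqrt{2\pi}\,\sigma}
  \exp\!\left[-\frac{(x-\mu)^2}{2\sigma^2}\right]
\qquad
f_{t}(x) = \frac{\Gamma\!\left(\frac{k+1}{2}\right)}
                {\Gamma\!\left(\frac{k}{2}\right)\sqrt{k\pi}\,\sigma}
  \left[1 + \frac{(x-\mu)^2}{k\,\sigma^2}\right]^{-\frac{k+1}{2}}
```

As $k$ grows, the t density approaches the normal density, which is why the distinction matters mainly for small samples.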

11
The Normal and t-distributions
12
Key Point with Normal Likelihood
Let us say we wish to fit the model
assuming normally distributed errors, i.e.
The likelihood function is therefore
Taking logarithms and multiplying by -1 gives
This implies that if you assume
normally-distributed errors, the answers will be
identical to those from least squares.
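
The argument can be sketched for a generic model. The form $y_i = f(x_i;\theta) + \varepsilon_i$ is an assumed stand-in for the model on the slide, with $n$ observations and error standard deviation $\sigma$.

```latex
y_i = f(x_i;\theta) + \varepsilon_i, \qquad \varepsilon_i \sim N(0,\sigma^2)

L(\theta,\sigma) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}\,\sigma}
  \exp\!\left[-\frac{\bigl(y_i - f(x_i;\theta)\bigr)^2}{2\sigma^2}\right]

-\ln L(\theta,\sigma) = n\ln\sigma + \frac{n}{2}\ln(2\pi)
  + \frac{1}{2\sigma^2}\sum_{i=1}^{n}\bigl(y_i - f(x_i;\theta)\bigr)^2
```

For any fixed $\sigma$, only the last term depends on $\theta$, so minimizing $-\ln L$ over $\theta$ is the same as minimizing the sum of squared residuals.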
13
Time for an Example!
  • We wish to fit the Dynamic Schaefer model to the
    bowhead census data.
  • q is assumed to be 1 here because the surveys
    provide absolute indices of abundance.
  • We have information on the trend in abundance
    from 1978-93 (increase of 3.2% per annum (SD
    0.76%) based on 8 data points).
  • We have an estimate of abundance for 1993 of 7800
    (SD 564).

14
How to Deal with this Example!
  • The model
  • The likelihood function is the product of a
    normal likelihood (for the abundance estimate)
    and a t-likelihood (for the trend). Ignoring
    constants independent of the model parameters
  • We take logs, multiply by minus one and minimize
    to find the estimates for K and r.
  • Note that we can ignore any constants. Why?
  • The t-distribution is chosen for the slope. Why?
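
A rough sketch of this objective function follows, under several assumptions that are not in the slides: the catch series is hypothetical, the population is assumed to start at K in 1848, the slope is taken as the log-linear regression slope in percent, the t-likelihood uses 8 - 2 = 6 degrees of freedom, and a crude grid search stands in for a proper minimizer. Only the two data values (7800 with SD 564, and 3.2 with SD 0.76) come from the example.

```python
import math

ABUNDANCE_EST, ABUNDANCE_SD = 7800.0, 564.0  # 1993 abundance estimate
SLOPE_EST, SLOPE_SD = 3.2, 0.76              # 1978-93 trend, % per annum
SLOPE_DF = 8 - 2  # assumed: 8 data points minus 2 regression parameters

def schaefer_trajectory(r, K, catches):
    """B[t+1] = B[t] + r*B[t]*(1 - B[t]/K) - C[t], starting from B = K."""
    biomass = [K]
    for c in catches:
        b = biomass[-1]
        # Floor at 1 so the logarithm below is always defined.
        biomass.append(max(b + r * b * (1.0 - b / K) - c, 1.0))
    return biomass

def percent_slope(values):
    """Least-squares slope of ln(value) against time, in % per time step."""
    n = len(values)
    logs = [math.log(v) for v in values]
    t_mean, y_mean = (n - 1) / 2.0, sum(logs) / n
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(logs))
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    return 100.0 * sxy / sxx

def neg_log_likelihood(b1993, slope):
    """Normal term for abundance plus t term for slope, constants dropped."""
    normal_part = (b1993 - ABUNDANCE_EST) ** 2 / (2.0 * ABUNDANCE_SD ** 2)
    t_part = 0.5 * (SLOPE_DF + 1) * math.log(
        1.0 + (slope - SLOPE_EST) ** 2 / (SLOPE_DF * SLOPE_SD ** 2))
    return normal_part + t_part

def objective(r, K, catches, first_year=1848):
    biomass = schaefer_trajectory(r, K, catches)
    i78, i93 = 1978 - first_year, 1993 - first_year
    return neg_log_likelihood(biomass[i93], percent_slope(biomass[i78:i93 + 1]))

# Hypothetical catch history for 1848-1992 (NOT the real bowhead catches).
catches = [150.0] * (1915 - 1848) + [30.0] * (1993 - 1915)

# Crude grid search over r and K; a numerical optimizer would normally be used.
best_value, best_r, best_K = min(
    (objective(r, K, catches), r, K)
    for r in [0.01 + 0.005 * i for i in range(19)]
    for K in range(8000, 30001, 500)
)
print(best_value, best_r, best_K)
```

Because the catches here are invented, the resulting estimates of r and K are not meaningful; the sketch only shows how the two likelihood components are combined and minimized.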

15
The Outcome
B1993 = 7710; slope 1978-93 = 2.95%
16
The Lognormal distribution
  • The density function
  • μ is the median (not the mean)
  • σ is the standard deviation of the logarithm
    (approximately the coefficient of variation of
    x).
  • The lognormal distribution is used extensively in
    fisheries assessments because x is always larger
    than zero; this is true for most data sources
    (CPUE, survey indices, estimates of death rates,
    etc.)
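
For reference, a standard way to write the lognormal density with the median as a parameter is shown here. The symbols $\mu$ (median of $x$) and $\sigma$ (standard deviation of $\ln x$) are assumed notation.

```latex
f(x) = \frac{1}{x\,\sigma\sqrt{2\pi}}
  \exp\!\left[-\frac{(\ln x - \ln \mu)^2}{2\sigma^2}\right], \qquad x > 0

\mathrm{CV}(x) = \sqrt{e^{\sigma^2} - 1} \;\approx\; \sigma
  \quad \text{for small } \sigma
```

The second line is why $\sigma$ is only approximately the coefficient of variation: the approximation degrades as $\sigma$ grows.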

17
The Multivariate Normal-I
  • The density function
  • μ is the vector of means.
  • Σ is the variance-covariance matrix.
  • d is the length of the vector.
  • This isn't nearly as bad as it looks.
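
For reference, the textbook multivariate normal density for a data vector $\mathbf{x}$ of length $d$ is shown here. The symbols $\boldsymbol{\mu}$ for the mean vector and $\boldsymbol{\Sigma}$ for the variance-covariance matrix are assumed notation.

```latex
f(\mathbf{x}) = \frac{1}{(2\pi)^{d/2}\,\lvert\boldsymbol{\Sigma}\rvert^{1/2}}
  \exp\!\left[-\tfrac{1}{2}\,(\mathbf{x}-\boldsymbol{\mu})^{\mathsf{T}}
  \boldsymbol{\Sigma}^{-1}(\mathbf{x}-\boldsymbol{\mu})\right]
```

When $\boldsymbol{\Sigma}$ is diagonal, this reduces to a product of independent univariate normal densities, so the off-diagonal covariances are what carry the correlation between data points.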

18
The Multivariate Normal-II
  • We use the multivariate normal when the data
    points are correlated (e.g. surveys with common
    correction factors). For example for bowheads

19
Readings
  • Hilborn and Mangel (1997) Chapter 7
  • Haddon (2001), Chapter 4