Title: Parameter estimation, maximum likelihood and least squares techniques
1. Parameter estimation, maximum likelihood and least squares techniques
Third lecture
- Jorge Andre Swieca School
- Campos do Jordão, January 2003
2. References
- Statistics: A Guide to the Use of Statistical Methods in the Physical Sciences, R. Barlow, John Wiley & Sons, 1989
- Statistical Data Analysis, G. Cowan, Oxford University Press, 1998
- Particle Data Group (PDG), Review of Particle Physics, 2002 electronic edition
- Data Analysis: Statistical and Computational Methods for Scientists and Engineers, S. Brandt, Third Edition, Springer, 1999
3. Likelihood
"Likelihood ('verossimilhança') is very often the whole truth." (Conclusão de Bento, Machado de Assis)
"Whoever heard her would take it all for truth, such was the sincere tone, the sweetness of the words and the likelihood of the details." (Quincas Borba, Machado de Assis)
(In Portuguese, "verossimilhança" means both "verisimilitude" and the statistical term "likelihood".)
4. Parameter estimation
p.d.f. f(x); sample space: all possible values of x.
Sample of size n: independent observations $x_1, \dots, x_n$.
Joint p.d.f. (independent observations): $f_{\rm sample}(x_1,\dots,x_n) = \prod_{i=1}^{n} f(x_i)$.
To estimate properties of the p.d.f. (mean, variance, ...) we build an estimator $\hat{\theta}(x_1,\dots,x_n)$, a function of the sample.
An estimator is consistent if $\hat{\theta} \to \theta$ in probability as $n \to \infty$ (large sample or asymptotic limit).
5. Parameter estimation
$\hat{\theta}$ is itself a random variable, distributed as $g(\hat{\theta};\theta)$ (the sampling distribution): its distribution over an infinite number of similar experiments of size n. It depends on:
- the sample size
- the functional form of the estimator
- the true properties of the p.d.f.
Bias: $b = E[\hat{\theta}] - \theta$.
If b = 0 independently of n, $\hat{\theta}$ is unbiased.
This is important when combining the results of two or more experiments.
6. Parameter estimation
Mean square error: $\mathrm{MSE} = E[(\hat{\theta}-\theta)^2] = V[\hat{\theta}] + b^2$.
In classical statistics there is no unique method for building estimators; but given an estimator, one can evaluate its properties.
Sample mean: from a sample $x_1,\dots,x_n$, supposed to come from an unknown p.d.f., an estimator for $E[x] = \mu$ (the population mean) is, as one possibility, the sample mean
$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i.$
7. Parameter estimation
Important property (weak law of large numbers): if V[x] exists, $\bar{x}$ is a consistent estimator for µ.
$E[\bar{x}] = \frac{1}{n}\sum_{i=1}^{n} E[x_i] = \mu$, so $\bar{x}$ is an unbiased estimator for the population mean µ.
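As a quick illustration (not from the slides), a minimal Python sketch of the weak law of large numbers at work: the sample mean of draws from an exponential p.d.f. (an arbitrary choice) approaches the population mean as n grows.

```python
import numpy as np

rng = np.random.default_rng(1)

# Draw from an arbitrary p.d.f. with known mean (here exponential, mu = 2).
mu = 2.0
for n in (10, 100, 10_000, 1_000_000):
    x = rng.exponential(mu, size=n)
    print(n, x.mean())  # sample mean -> mu as n grows (consistency)
```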
8. Parameter estimation
Sample variance:
$s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})^2.$
s² is an unbiased estimator for V[x].
If µ is known, $S^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i-\mu)^2$ is an unbiased estimator for $\sigma^2$.
9. Maximum likelihood
Technique for estimating parameters given a finite sample of data.
Suppose the functional form of f(x; θ) is known. The probability that each $x_i$ lies in $[x_i, x_i + dx_i]$, for all i, is
$\prod_{i=1}^{n} f(x_i;\theta)\,dx_i.$
If the parameters are correct, we expect a high probability for the data actually observed.
In the joint probability:
- x: variables
- θ: parameters
With the $x_i$ fixed at the observed values, $L(\theta) = \prod_{i=1}^{n} f(x_i;\theta)$ is the likelihood function.
ML estimators for θ: the parameter values that maximize the likelihood function.
10. Maximum likelihood
$\hat{\theta}$ is found from $\partial L/\partial\theta_i = 0$, $i = 1, \dots, m$. In practice one maximizes $\ln L(\theta) = \sum_{i=1}^{n} \ln f(x_i;\theta)$, which has the same maxima (sums are easier to handle than products).
11. Maximum likelihood
Example: n decay times of unstable particles, $t_1,\dots,t_n$; hypothesis: the distribution is an exponential p.d.f. with mean τ,
$f(t;\tau) = \frac{1}{\tau}\,e^{-t/\tau}, \qquad \ln L(\tau) = \sum_{i=1}^{n}\left(-\ln\tau - \frac{t_i}{\tau}\right).$
12. Maximum likelihood
[Figure: 50 measured decay times with the fitted exponential p.d.f.]
13. Maximum likelihood
Setting $\partial \ln L/\partial\tau = 0$ gives
$\hat{\tau} = \frac{1}{n}\sum_{i=1}^{n} t_i.$
Given $E[t] = \tau$, $\hat{\tau}$ is an unbiased estimator for τ (for the rate $\lambda = 1/\tau$, the estimator $\hat{\lambda} = 1/\hat{\tau}$ becomes unbiased only when $n \to \infty$).
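A minimal Python sketch of this example with simulated (not real) decay times and an assumed τ = 1: the grid maximum of ln L agrees with the analytic solution, the sample mean.

```python
import numpy as np

rng = np.random.default_rng(0)
tau_true = 1.0
t = rng.exponential(tau_true, size=50)       # 50 simulated decay times

# log-likelihood of the exponential model f(t; tau) = (1/tau) exp(-t/tau)
def log_L(tau):
    return -len(t) * np.log(tau) - t.sum() / tau

taus = np.linspace(0.5, 2.0, 2001)
print("grid maximum :", taus[np.argmax(log_L(taus))])
print("sample mean  :", t.mean())            # analytic ML solution
```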
14. Maximum likelihood
n measurements of x assumed to come from a gaussian. The ML estimators are
$\hat{\mu} = \frac{1}{n}\sum_{i=1}^{n} x_i$ (unbiased),
$\widehat{\sigma^2} = \frac{1}{n}\sum_{i=1}^{n}(x_i-\hat{\mu})^2$ (unbiased only for large n, since $E[\widehat{\sigma^2}] = \frac{n-1}{n}\,\sigma^2$).
15. Maximum likelihood
We showed that s² is an unbiased estimator for the variance of any p.d.f., so
$s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i-\hat{\mu})^2 = \frac{n}{n-1}\,\widehat{\sigma^2}$
is an unbiased estimator for $\sigma^2$.
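A small MC check of these two statements, as a sketch with assumed values µ = 0, σ² = 4 and n = 5: the 1/n (ML) estimator averages to (n−1)/n times σ², while s² averages to σ².

```python
import numpy as np

rng = np.random.default_rng(2)
mu, sigma2, n, n_exp = 0.0, 4.0, 5, 200_000

x = rng.normal(mu, np.sqrt(sigma2), size=(n_exp, n))
xbar = x.mean(axis=1, keepdims=True)

ml_var = ((x - xbar) ** 2).sum(axis=1) / n        # ML estimator (1/n)
s2     = ((x - xbar) ** 2).sum(axis=1) / (n - 1)  # sample variance (1/(n-1))

print(ml_var.mean())  # ~ (n-1)/n * sigma2 = 3.2 -> biased for small n
print(s2.mean())      # ~ sigma2 = 4.0           -> unbiased
```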
16. Maximum likelihood
Variance of ML estimators: imagine many experiments (same n); what is the spread of $\hat{\theta}$?
Analytically (exponential example): $V[\hat{\tau}] = \tau^2/n$.
Transformation invariance of ML estimators: if $\hat{\tau}$ is the ML estimate of τ, then $\hat{\lambda} = 1/\hat{\tau}$ is the ML estimate of $\lambda = 1/\tau$.
17. Maximum likelihood
If the experiment were repeated many times (with the same n), the standard deviation of the estimate would be 0.43.
- This is one possible interpretation of the quoted error.
- It is not the standard one when the distribution is not gaussian (it is a 68% central confidence interval, equal to ± one standard deviation, only if the p.d.f. of the estimator is gaussian).
- In the large sample limit, ML estimates are distributed according to a gaussian p.d.f., and the two procedures lead to the same result.
18. Maximum likelihood
Variance by the MC method: for cases too difficult to solve analytically, use the MC method:
- simulate a large number of experiments;
- compute the ML estimate each time;
- look at the distribution of the resulting values.
S² is an unbiased estimator for the variance of a p.d.f., so s from the MC experiments estimates the statistical error of the parameter obtained from the real measurement.
Asymptotic normality is a general property of ML estimators for large samples.
19. Maximum likelihood
[Figure: distribution of $\hat{\tau}$ from 1000 MC experiments with 50 observations each; sample standard deviation s = 0.151.]
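A Python sketch of this MC procedure (the seed and simulated data are arbitrary, so the resulting s will differ slightly from the slide's 0.151):

```python
import numpy as np

rng = np.random.default_rng(3)
tau, n, n_exp = 1.0, 50, 1000

# 1000 simulated experiments of 50 decay times each; ML estimate = sample mean
tau_hat = rng.exponential(tau, size=(n_exp, n)).mean(axis=1)

print(tau_hat.std(ddof=1))   # sample std of tau_hat; expect ~ tau/sqrt(n) = 0.141
```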
20. RCF bound
A way to estimate the variance of any estimator without analytical calculations or MC.
Rao-Cramer-Frechet inequality:
$V[\hat{\theta}] \geq \left(1+\frac{\partial b}{\partial\theta}\right)^{2} \Big/ E\left[-\frac{\partial^2 \ln L}{\partial\theta^2}\right].$
Equality (minimum variance): the estimator is called efficient.
If efficient estimators exist for a problem, ML will find them; ML estimators are always efficient in the large sample limit.
Example (exponential): the bound gives $V[\hat{\tau}] \geq \tau^2/n$, equal to the exact result, so $\hat{\tau}$ is an efficient estimator.
21. RCF bound
Assuming efficiency and zero bias,
$V[\hat{\theta}] = \left(E\left[-\frac{\partial^2 \ln L}{\partial\theta^2}\right]\right)^{-1},$
and for several parameters $(V^{-1})_{ij} = E\left[-\frac{\partial^2 \ln L}{\partial\theta_i\,\partial\theta_j}\right]$: this is what is usually quoted as the statistical errors.
22. RCF bound
For a large data sample, evaluate the second derivatives with the measured data at the ML estimates:
$(\widehat{V^{-1}})_{ij} = -\frac{\partial^2 \ln L}{\partial\theta_i\,\partial\theta_j}\bigg|_{\theta=\hat{\theta}}.$
This is the usual method for estimating the covariance matrix when the likelihood function is maximized numerically:
- compute the second derivatives by finite differences;
- invert the matrix to get $V_{ij}$.
Example: MINUIT (CERN Library).
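A minimal one-parameter sketch of this numerical procedure for the exponential example (for a single parameter the "matrix inversion" is just a reciprocal); MINUIT performs the multi-parameter version of this internally.

```python
import numpy as np

rng = np.random.default_rng(4)
t = rng.exponential(1.0, size=50)
tau_hat = t.mean()                         # ML estimate

def log_L(tau):
    return -len(t) * np.log(tau) - t.sum() / tau

# second derivative of ln L at tau_hat by central finite differences
h = 1e-4
d2 = (log_L(tau_hat + h) - 2 * log_L(tau_hat) + log_L(tau_hat - h)) / h**2

var_hat = -1.0 / d2                        # invert -(d2 lnL) to get V
print(np.sqrt(var_hat))                    # numerical error estimate
print(tau_hat / np.sqrt(len(t)))           # analytic result tau_hat/sqrt(n)
```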
23. Graphical method
For a single parameter θ, expand ln L around its maximum; in the large sample (gaussian) limit
$\ln L(\hat{\theta} \pm \hat{\sigma}_{\hat{\theta}}) = \ln L_{\max} - \tfrac{1}{2},$
so the points where ln L drops by 1/2 give the standard deviation. This will be justified later as a 68.3% central confidence interval.
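A sketch of the graphical method for the exponential example: scan ln L and read off where it has dropped by 1/2 from its maximum (the interval is asymmetric in general).

```python
import numpy as np

rng = np.random.default_rng(5)
t = rng.exponential(1.0, size=50)
tau_hat = t.mean()

def log_L(tau):
    return -len(t) * np.log(tau) - t.sum() / tau

# scan ln L and find the region where it lies within 1/2 of its maximum
taus = np.linspace(0.5 * tau_hat, 2.0 * tau_hat, 100_001)
inside = log_L(taus) >= log_L(tau_hat) - 0.5
lo, hi = taus[inside][0], taus[inside][-1]
print(tau_hat, tau_hat - lo, hi - tau_hat)   # estimate with -/+ errors
```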
24. ML with two parameters
Angular distribution for the scattering angle θ (x = cos θ) in a particle reaction: f(x; α, β), normalized on −1 ≤ x ≤ 1.
Realistic measurements are possible only in a restricted range, $x_{\min} \leq x \leq x_{\max}$.
25. ML with two parameters
[Figure: 2000 simulated events with the ML fit superimposed.]
26. ML with two parameters
[Figure: distribution of $(\hat{\alpha}, \hat{\beta})$ from 500 experiments with 2000 events/experiment.]
Both marginal p.d.f.s are approximately gaussian.
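The functional form of f(x; α, β) did not survive extraction; the sketch below assumes the quadratic form used in Cowan's textbook version of this example, $f(x;\alpha,\beta) = (1+\alpha x+\beta x^2)/(2+2\beta/3)$, with hypothetical true values α = β = 0.5.

```python
import numpy as np
from scipy.optimize import minimize

# Assumed (hypothetical) angular distribution, normalized on -1 <= x <= 1
def f(x, a, b):
    return (1.0 + a * x + b * x * x) / (2.0 + 2.0 * b / 3.0)

rng = np.random.default_rng(6)
a_true, b_true = 0.5, 0.5

# accept-reject generation of 2000 events (f is maximal at x = 1 here)
x = np.empty(0)
while x.size < 2000:
    cand = rng.uniform(-1.0, 1.0, 10_000)
    keep = rng.uniform(0.0, f(1.0, a_true, b_true), 10_000) < f(cand, a_true, b_true)
    x = np.concatenate([x, cand[keep]])[:2000]

def neg_log_L(p):
    vals = f(x, p[0], p[1])
    if np.any(vals <= 0.0):    # outside the region where the p.d.f. is positive
        return 1e10
    return -np.log(vals).sum()

res = minimize(neg_log_L, x0=[0.0, 0.0])
print(res.x)          # (alpha_hat, beta_hat)
print(res.hess_inv)   # approximate covariance matrix of the estimates
```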
27. Least squares
Each measured value y is treated as a gaussian random variable centered about the quantity's true value λ(x; θ). Maximizing the corresponding log-likelihood is equivalent to minimizing
$\chi^2(\theta) = \sum_{i=1}^{n} \frac{(y_i - \lambda(x_i;\theta))^2}{\sigma_i^2}.$
28Least squares
used to define the procedure even if yi are not
gaussian
measurements not independent, described by a
n-dim Gaussian p.d.f. with nown cov. matrix but
unknown mean values
LS estimators
29. Least squares
Linear case: $\lambda(x;\theta) = \sum_{j=1}^{m} a_j(x)\,\theta_j$ with linearly independent functions $a_j(x)$. Then:
- the estimators and their variances can be found analytically;
- the estimators have zero bias and minimum variance;
- $\chi^2(\theta)$ has a unique minimum.
30. Least squares
Covariance matrix for the estimators:
$(V^{-1})_{ij} = \frac{1}{2}\,\frac{\partial^2 \chi^2}{\partial\theta_i\,\partial\theta_j}\bigg|_{\theta=\hat{\theta}}.$
It coincides with the RCF bound for the inverse covariance matrix if the $y_i$ are gaussian distributed.
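A sketch of the analytic linear LS solution for a straight-line fit with hypothetical data: $\hat{\theta} = (A^T V^{-1} A)^{-1} A^T V^{-1} \vec{y}$, with covariance matrix $(A^T V^{-1} A)^{-1}$.

```python
import numpy as np

# straight-line model lambda(x; theta) = theta_0 + theta_1 * x,
# i.e. basis functions a_0(x) = 1, a_1(x) = x (hypothetical example data)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.1, 1.9, 2.9, 4.2, 4.8])
sig = np.full_like(x, 0.2)                # known measurement errors

A = np.column_stack([np.ones_like(x), x]) # design matrix a_j(x_i)
Vinv = np.diag(1.0 / sig**2)              # inverse covariance (independent y_i)

# analytic LS solution and covariance of the estimators
cov = np.linalg.inv(A.T @ Vinv @ A)
theta = cov @ A.T @ Vinv @ y
print(theta)   # (theta_0_hat, theta_1_hat)
print(cov)     # coincides with the RCF bound for gaussian y_i
```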
31. Least squares
With λ linear in θ, χ² is quadratic in θ. To interpret this, take a single parameter θ:
$\chi^2(\theta) = \chi^2_{\min} + \frac{(\theta-\hat{\theta})^2}{\hat{\sigma}^2_{\hat{\theta}}},$
so $\chi^2(\hat{\theta} \pm \hat{\sigma}_{\hat{\theta}}) = \chi^2_{\min} + 1$.
32. Chi-squared distribution
$f(z;n) = \frac{z^{n/2-1}\,e^{-z/2}}{2^{n/2}\,\Gamma(n/2)}, \qquad 0 \leq z < \infty, \quad n = 1, 2, \dots \ \text{(degrees of freedom)};$
$E[z] = n, \qquad V[z] = 2n.$
For n independent gaussian random variables $x_i$ with known means $\mu_i$ and variances $\sigma_i^2$,
$z = \sum_{i=1}^{n} \frac{(x_i-\mu_i)^2}{\sigma_i^2}$
is distributed as a χ² for n degrees of freedom.
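A quick MC check of this statement, with arbitrary assumed means and variances: the mean and variance of z should come out close to n and 2n.

```python
import numpy as np

rng = np.random.default_rng(7)
n, n_exp = 5, 200_000
mu = np.array([0.0, 1.0, -2.0, 0.5, 3.0])   # known means (arbitrary)
sig = np.array([1.0, 0.5, 2.0, 1.5, 1.0])   # known standard deviations

x = rng.normal(mu, sig, size=(n_exp, n))
z = (((x - mu) / sig) ** 2).sum(axis=1)     # should follow chi2 with n dof

print(z.mean(), z.var())                    # expect E[z] = n = 5, V[z] = 2n = 10
```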
33-35. Chi-squared distribution
[Figures: plots of the chi-squared distribution.]