Monte Carlo Maximum Likelihood Methods for Estimating Uncertainty Arising from Shared Errors in Exposures in Epidemiological Studies

1
Monte Carlo Maximum Likelihood Methods for
Estimating Uncertainty Arising from Shared Errors
in Exposures in Epidemiological Studies
  • Daniel O. Stram
  • University of Southern California

2
Complex Dosimetry Systems: a Working Definition
(my definition)
  • A complex dosimetry system for the study of an
    environmental exposure is one in which no single
    best exposure estimate is provided
  • Instead a distribution of possible true exposures
    is developed, together with a computer program
    that generates exposure replications from this
    distribution
  • Generates doses conditional on input data
  • Both shared and unshared errors are incorporated
    into the dose replications
  • The statistician/epidemiologist treats this
    system as a black box, i.e. one that (s)he can
    manipulate, but doesn't know (or care?) about its
    inner workings

3
  • Some examples of epidemiological studies (of
    radiation) that use a complex dosimetry system to
    estimate doses
  • Utah Thyroid Disease Cohort Study
  • Hanford Thyroid Disease Study
  • Colorado Plateau Uranium Miners Study
  • In such studies limited or no direct measurements
    of individual dose exist. Instead a complex dose
    reconstruction (Utah, Hanford) or interpolation
    system (Colorado) is used to construct individual
    dose estimates or histories.

4
  • Even when all subjects in the study have
    (radiation) badge measurements these may need
    adjustments to reflect temporal or geographical
    differences in monitoring technology
  • Random errors and systematic biases exist for
    virtually any method
  • Information about the size of random and
    systematic biases for each dosimeter type comes
    from only a few experiments
  • Therefore there may be considerable uncertainty
    in the systematic biases for any single dosimeter
  • Systematic biases constitute shared error

5
Representation of uncertainty in complex
dosimetry systems
  • Uncertainty in the dose estimates produced by
    these systems is increasingly characterized using
    Monte-Carlo methods which yield many
    realizations of possible dose, rather than a
    single best estimate of dose for each subject.
  • Part of the uncertainty of these estimates may be
    due to lack of knowledge of factors that
    simultaneously influence some or all of the
    subjects' doses

6
Dose estimation in the Hanford Thyroid Disease
Study
  • Reconstruction based on physical modeling and
    some measurements of
  • Releases of I-131
  • Deposition and pasture retention of I-131
  • Pasture practices
  • Milk transfer coefficients
  • Individual consumption of milk
  • Note that errors in most of these will affect
    doses for all individuals simultaneously

7
Colorado Plateau Underground Miners Study
  • Dose estimates created using a complex exposure
    history / job history matrix
  • The PHS exposure history matrix consisted of
    temporal and geographical interpolations of
    limited WLM measurements
  • Stram et al. (1999) used a PHS-developed hierarchy
    of mines within localities within districts and
    used a multilevel model to mimic temporal and
    geographical variation in dose
  • The 1999 analysis was based upon the
    regression-substitution method, in which E(true
    dose | all measurements) was computed for each
    mine-year after fitting the lognormal multilevel
    model to the WLM measurements
  • Errors in mine-year measurements are correlated
    by the interpolation system used, and many miners
    worked in the same mines, leading to correlated
    errors in the exposure history of each miner

8
Pooled Nuclear Workers
  • Multi-facility, multi-year study
  • Each worker had badge measurements but the
    technologies changed through time and across
    facility.
  • The systematic errors in each badge type are
    shared by all subjects working at the time the
    badge was in use
  • For many but not all types of personal monitor
    some limited work (using phantoms, etc.) has been
    done to assess the relationship between true
    exposure and the badge measurement
  • One important issue is whether the low dose-rate
    exposures of the nuclear workers produce risks that
    are in line with those seen for the A-bomb survivors
  • Upper confidence limits that take account of
    shared dosimetry error are needed

9
Monte-Carlo Dosimetry
  • Adopts a Bayesian framework
  • Is Bayesian about sampling error in the
    experimental work (with badges), interpreted as
    giving posterior distributions
  • Prior distributions for uncertain parameters (for
    the nuclear workers, the likely size of biases for
    unmeasured badges) are specified using expert opinion
  • For each replication the uncertain parameters are
    sampled from their distribution and combined with
    samples of other random factors (e.g. local
    meteorology for the Hanford or Utah studies) and
    with all relevant individual data for each
    subject (location, milk consumption, age, etc)
  • Each set of random quantities is combined with
    individual data to form dose estimates for each
    individual

10
  • Let us assume that the dose replications really
    may be regarded as samples from the distribution
    of true dose given all the individual data
  • For retrospective dose-reconstruction systems
    this assumption may be a very large leap of faith
  • For other studies using badge calibration
    (Workers) or measurement interpolation, the
    assumption rests on considerably more solid ground
  • Consider the sampling characteristics of
    frequentist inference concerning risk estimation.
    We want to know the influence of uncertainty on
  • The power to detect an effect (of exposure on
    risk of disease) of a certain size
  • Confidence limits on estimated risk parameters

11
An idealized dosimetry system
  • Assume each replication of dose is a sample from
    the joint distribution
  • f(X_1, X_2, ..., X_N | W_1, W_2, ..., W_N)
  • of true dose given the input data W_i recorded
    for all subjects. Because many realizations
    from f(X | W) are available, we can calculate
  • Z_i = E(X_i | W)
  • as the average over a very large number of
    realizations X_i^r, where X^r ~ f(X | W), r = 1, 2, ...
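As a concrete illustration, Z_i can be approximated by averaging the output of the black box. The sketch below is hypothetical: `draw_dose_replication` stands in for a real dosimetry system, and its lognormal form is purely an assumption for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the dosimetry "black box": each call returns
# one joint replication (X_1, ..., X_N) of true dose given input data W.
# The lognormal form is purely illustrative.
def draw_dose_replication(medians, rng):
    return medians * rng.lognormal(mean=0.0, sigma=0.5, size=medians.size)

medians = np.array([1.0, 2.0, 4.0])   # hypothetical per-subject input summary
R = 20_000                            # number of Monte Carlo realizations

# Z_i = E(X_i | W): average each subject's dose over many realizations
Z = np.mean([draw_dose_replication(medians, rng) for _ in range(R)], axis=0)
```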

12
How should an epidemiologist deal with the
uncertainty in the random replications of dose?
  • We are interested in estimating parameters in the
    dose-response function for disease D_i given true
    dose X_i, specifically the relationship
  • E(D_i | X_i)
  • parameterized by b (the dose-response slope)

13
Simplifications of the disease model
  • Assume a linear relation between D and X
  • E(D_i | X_i) = a + b X_i     (1)
  • Linear models are of interest for at least two
    reasons
  • They may be important for radiobiological and
    radiation-protection reasons even for binary disease
    outcomes (where logistic regression models are
    the standard)
  • For small b it may be impossible to distinguish
    between linear and smooth nonlinear (e.g.
    logistic) dose response shapes
  • A study with good power to detect a dose-response
    relationship may have very poor power to fully
    define the shape of the response

14
Berkson error models
  • If the errors in the Z_i defined above are
    independent from one another, then model (1) can be
    fitted simply by replacing true X_i with Z_i
  • This is a Berkson error model in the sense that
    the truth is distributed around the measured
    value. Regression substitution yields unbiased
    estimates
  • The classical error model has the measurement
    distributed around the truth. This produces risk
    estimates that are biased towards the null

15
Impact of independent measurement error
  • For either Berkson or Classical error models the
    most important effect of random error is loss of
    power to detect nonzero risk estimates
  • If R² is the squared correlation between true
    exposure X and measured exposure Z, then it will
    take 1/R² times as many subjects to detect the
    same risk using Z as using true X
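The 1/R² rule can be checked by direct simulation; the lognormal exposure and error distributions below are arbitrary choices for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 100_000

# Toy Berkson-type setup: true exposure X scatters around assigned Z with
# a mean-1 multiplicative error (log-scale mean -sigma^2/2, sigma = 0.2)
Z = rng.lognormal(0.0, 0.5, size=N)
X = Z * rng.lognormal(-0.02, 0.2, size=N)

r2 = np.corrcoef(X, Z)[0, 1] ** 2   # squared correlation of truth and surrogate
inflation = 1.0 / r2                # factor more subjects needed using Z vs X
```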

16
Shared versus unshared dosimetry error
  • A key distinction between the effects of shared
    versus unshared dosimetry error is their effect
    on the validity of sample variance estimates used
    to characterize the variability of the estimates
  • Independent Berkson errors: the usual estimate
    of the standard error of the slope estimate remains
    valid despite the loss of power
  • Independent classical errors: again the usual
    estimate of the standard error of the slope
    estimate generally remains valid despite
  • The loss of power
  • The attenuation in the dose-response parameter
    estimate

17
Dosimetry simplifications
  • Adopt a generalization of the Berkson error model
    for the joint distribution of true X_i around its
    conditional mean Z_i which incorporates both
    shared and unshared errors
  • X_i = ε_SM ε_M,i Z_i + ε_SA + ε_A,i
  • ε_SM is shared multiplicative error with mean 1
  • ε_M,i is unshared multiplicative error with mean 1
  • ε_SA is shared additive error with mean 0
  • ε_A,i is unshared additive error with mean 0
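A SUMA-style replication generator might look like the sketch below. The explicit combination X_i = ε_SM·ε_M,i·Z_i + ε_SA + ε_A,i and the lognormal/normal error families are illustrative assumptions, not the talk's actual dosimetry code.

```python
import numpy as np

rng = np.random.default_rng(2)

def suma_doses(Z, sd_sm, sd_m, sd_sa, sd_a, rng):
    """One SUMA replication: X_i = eps_SM * eps_M_i * Z_i + eps_SA + eps_A_i.
    Shared errors are drawn once per replication and applied to everyone;
    unshared errors are drawn per subject.  Lognormal multiplicative errors
    with log-scale mean -sd**2/2 have arithmetic mean 1."""
    N = Z.size
    eps_sm = rng.lognormal(-sd_sm**2 / 2, sd_sm)        # shared multiplicative
    eps_m = rng.lognormal(-sd_m**2 / 2, sd_m, size=N)   # unshared multiplicative
    eps_sa = rng.normal(0.0, sd_sa)                     # shared additive
    eps_a = rng.normal(0.0, sd_a, size=N)               # unshared additive
    return eps_sm * eps_m * Z + eps_sa + eps_a
```

Averaging many such replications recovers Z, i.e. the Berkson property E(X | W) = Z.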

18
  • Under this shared and unshared multiplicative and
    additive (SUMA) error model we have E(X | W) = Z
    (the usual Berkson property) over the
    distribution of all four ε
  • What happens when we fit
  • E(D_i | Z_i) = a + b Z_i ?
  • If there are no measurement errors, Var(ε) = 0, we
    will have (for small values of b) the usual
    variance expression for the slope estimate
  • Var(b̂) = σ² / Σ_i (Z_i − Z̄)²     (1)

19
  • Effects of shared and unshared errors on
    estimation
  • We are interested in three questions regarding
    each error component in the SUMA model
  • What is its effect on study power?
  • What is its effect on the validity of expression
    (1) for the variance of the estimate of b?
  • How valid are the estimates of study power when
    they are based on expression (1)?

20
  • Shared additive error has little effect on
    either the estimation of b or on the variability
    of the estimate
  • Unshared additive or multiplicative errors
    reduce the correlation, R, between X_i and Z_i,
    thereby reducing study power; the reduction in
    study efficiency due to unshared measurement
    error is roughly proportional to R²
  • However, the validity of expression (1) for the
    variance of the estimator remains appropriate,
    and the estimate of study power using (1)
    remains appropriate

21
Effect of multiplicative shared error
  • Averaging over the distribution of the random ε_SM
    we retain the Berkson property that E(X | W) = Z
  • But with a true variance for the slope estimate
    that is inflated by the shared error, approximately
  • Var(b̂) ≈ (1) + b² Var(ε_SM)     (2)

22
  • Notice that if b = 0, the naïve estimate of the
    variance of b̂, ignoring the shared error, is equal
    to the true variance of this parameter
  • If b > 0, the naïve estimate of the variance is
    biased downward by approximately b² Var(ε_SM)

23
  • We conclude
  • Ignoring shared error does not affect the
    validity of the test of the null hypothesis that
    b = 0, because expression (2) = expression (1)
    when b = 0
  • More generally, non-differential measurement error
    weakens the power, but not the validity, of a
    test of association between disease and exposure
  • Ignoring shared error will overstate the power to
    detect b > 0, because (1) < (2) in this case
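A small simulation makes the variance point concrete: repeat a toy study many times, drawing one shared multiplicative error per study, and compare the between-study spread of the naive slope with the usual within-study OLS variance estimate. All distributions and parameter values here are arbitrary illustrations.

```python
import numpy as np

rng = np.random.default_rng(5)
N, b_true, sd_sm = 500, 2.0, 0.3
Z = rng.lognormal(0.0, 0.5, size=N)      # assigned doses, fixed across studies

slopes = []
for _ in range(2000):
    eps_sm = rng.lognormal(-sd_sm**2 / 2, sd_sm)   # one shared error per study
    X = eps_sm * Z                                  # truth scatters jointly around Z
    D = 1.0 + b_true * X + rng.normal(0.0, 1.0, size=N)
    slopes.append(np.polyfit(Z, D, 1)[0])           # naive fit on Z, ignoring eps_sm
slopes = np.asarray(slopes)

# With b > 0 the slope varies between studies roughly like b^2 * Var(eps_sm),
# far exceeding the usual within-study OLS variance estimate
naive_var = 1.0 / np.sum((Z - Z.mean()) ** 2)       # residual sigma^2 = 1 here
```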

24
  • Ignoring shared error will result in confidence
    limits that are too narrow
  • However it is the upper confidence limit that is
    most affected.
  • If the lower confidence limit ignoring shared
    error does not include zero, correcting for
    shared error will not cause it to include zero
    (because of conclusion 1)

25
How to incorporate shared ME directly into an
analysis
  • Multiple imputation
  • Full Parametric Bootstrap
  • Likelihood analysis with MCML

26
Multiple Imputation
  • It is tempting to try to quantify the uncertainty
    in b̂ by regressing D_i on each set of X^r and
    using the quantiles of the resulting slope
    estimates as confidence limits for b
  • This ignores the sampling variability of D
  • Moreover, the distribution of the slope estimates
    can be badly biased towards the null value;
    essentially there is a reintroduction of
    classical error into the problem
  • True multiple imputation requires sampling X^r
    from the distribution of X given both the input
    data W and the outcomes D_i (not just W) to
    remove these biases

27
Full Parametric Bootstrap
  • A simulation experiment in which b̂ is used as the
    true value of the risk parameter and both doses
    and outcomes D_i are simulated from a complete
    model

28
Monte-Carlo maximum likelihood
  • We can compute likelihood ratio tests as follows
  • For null values a_0 and b_0, generate n samples X^r
    from the distribution of X given W and D
  • For any test values a and b, compute the log
    likelihood ratio as
  • log { (1/n) Σ_r L(a, b | D, X^r) / L(a_0, b_0 | D, X^r) }     (5)
  • If we use b_0 = 0 then we don't have to condition
    on D (so that we can use the dosimetry system
    directly)
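The averaging step above can be sketched as follows. A normal linear outcome model with known σ is assumed purely for illustration (the actual application used partial likelihoods), and the average is taken on the log scale for numerical safety.

```python
import numpy as np

rng = np.random.default_rng(3)

def mcml_llr(a, b, a0, b0, D, X_reps, sigma=1.0):
    """Monte Carlo log likelihood ratio of (a, b) versus the null (a0, b0):
    log of the average, over dose replications X^r ~ f(X | W), of the
    per-replication likelihood ratios.  Assumes a normal linear model
    D_i = a + b * X_i + noise with known sigma (illustrative only)."""
    log_ratios = []
    for Xr in X_reps:
        ll_alt = -0.5 * np.sum(((D - a - b * Xr) / sigma) ** 2)
        ll_null = -0.5 * np.sum(((D - a0 - b0 * Xr) / sigma) ** 2)
        log_ratios.append(ll_alt - ll_null)
    log_ratios = np.asarray(log_ratios)
    m = log_ratios.max()                 # stabilize before exponentiating
    return m + np.log(np.mean(np.exp(log_ratios - m)))
```

Evaluating `mcml_llr` over a grid of b values gives the averaged-likelihood curve used later for confidence intervals.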

29
Once we compute the likelihood what do we do with
it?
  • We have a funny mishmash: we are being
  • Bayesian about the doses
  • Frequentist about the dose-response parameter
  • Moreover we can't really expect standard
    frequentist asymptotic likelihood theory to hold
  • Suppose the number of subjects → ∞; then the
    distribution of b̂ will be dominated by the
    distribution of the shared multiplicative errors
    in the dosimetry system, whose distribution
    is arbitrary
  • Is it still reasonable to use chi-square
    approximations to the distribution of changes in
    log likelihood?

30
Other problems
  • If shared multiplicative error is large, then as
    b − b_0 gets large the summands in (5)
    become extremely variable
  • Convergence of the average is incredibly slow
  • Round-off error dominates the performance of the
    algorithm
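The round-off problem (though not the slow convergence) can be mitigated by averaging on the log scale with the standard log-sum-exp trick, sketched here:

```python
import numpy as np

def log_mean_exp(log_terms):
    """Average wildly varying likelihood terms on the log scale.  Subtracting
    the maximum before exponentiating keeps every term in floating-point
    range; exponentiating directly would underflow or overflow."""
    log_terms = np.asarray(log_terms, dtype=float)
    m = log_terms.max()
    return m + np.log(np.mean(np.exp(log_terms - m)))

# exp(-1000) underflows to 0.0 in double precision, so a naive
# log(mean(exp(...))) of these terms would return -inf
stable = log_mean_exp([-1000.0, -1002.0])
```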

31
Application to the ORNL nuclear workers data (Stayner
et al., in review)
  • Estimate a single risk parameter using (time
    dependent) total dose in a partial likelihood
    analysis
  • Write a computer program that simulates the bias
    factors for the badges used in those facilities
    and re-links the risk sets

32
Three analyses
  • 1. Compute the MCML Likelihood
  • For each replication of doses compute the partial
    likelihoods over a 1-dimensional grid of risk
    parameters
  • Average the partial likelihoods over the
    replications
  • Pretend that the asymptotics still hold and
    compute a confidence interval

33
  • 2. Compute FPB estimates of b̂ and
    compare these to the MCML confidence intervals
  • 3. For each set of D computed in step 2, compute a
    separate MCML confidence interval (more
    simulations from the dose distribution)
  • Count the number of times that the standard
    frequentist confidence interval contains the true
    value of the risk parameter

34
(No Transcript)
35
FPB simulations
36
Some observations
  • The MCML widens the confidence interval on the
    high side more than the low side
  • The 90 percent asymptotic lower CI for the MCML
    does not include 0
  • This is good because (1) the uncorrected CI did
    not include 0 and (2) we claim that correcting
    for measurement error shouldn't affect the
    significance of a test of no association
  • Note that the two curves (uncorrected and MCML log
    likelihoods) are very close to parallel at b = 0
  • This implies that a score test of b = 0 will be
    (nearly) identical using the corrected and
    uncorrected likelihoods under any significance
    criterion
  • This observed result follows from Tosteson and
    Tsiatis (1988) on score tests for
    errors-in-variables problems

37
  • The FPB, on the other hand, puts significantly
    more than 5 percent of the estimates < 0 (68 of
    1,000) and significantly fewer of the estimates
    (33 of 1,000) above the MCML upper confidence limit
  • This may actually be a promising observation for
    the validity of the MCML confidence intervals
  • They tend to be skewed to the right (not
    symmetric around the MLE), so more (than 3.3
    percent) of the upper confidence limits and fewer
    (than 6.8 percent) of the lower confidence
    limits should fail to contain the true value
  • Simulations are now in progress

38
Validity of MCML CI
  • Consider the limiting case: when n → ∞ the slope
    estimate will be determined by the distribution
    of shared multiplicative errors
  • Worst case would be the SUMA model
  • Suppose that ε_SM is distributed as lognormal
    with arithmetic mean 1 (log-scale mean −σ²/2)
  • Then b̂ = b ε_SM is also distributed as lognormal
    with log-scale mean log(b) − σ²/2 and log-scale
    variance σ²
  • Consider twice the change in log likelihood from
    true b to the MLE
  • This will be (1/σ²)[log(b̂) − (log(b) − σ²/2)]²,
    which is exactly chi-square with 1 df
  • Consider next a normal distribution for the
    shared multiplicative error
  • This would make sense if ε_SM was itself a sum of
    many components
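The exact chi-square claim for the lognormal case is easy to verify numerically; the sketch below simulates the limiting statistic directly (σ = 0.5 is an arbitrary choice).

```python
import numpy as np

rng = np.random.default_rng(4)
sigma = 0.5

# Shared multiplicative error with arithmetic mean 1 (log-scale mean -sigma^2/2)
eps = rng.lognormal(-sigma**2 / 2, sigma, size=200_000)

# In the n -> infinity limit the MLE is b * eps, so twice the change in log
# likelihood from the true b reduces to (log(eps) + sigma^2/2)^2 / sigma^2.
# Since log(eps) ~ N(-sigma^2/2, sigma^2), this is exactly chi-square, 1 df.
stat = (np.log(eps) + sigma**2 / 2) ** 2 / sigma**2
```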

39
  • For this model, twice the change in log likelihood
    is of the form
  • −2 log(ε_SM) + (1/σ²)(ε_SM − 1)² + c
  • where c = 2 log(½ + ½√(1 + 4σ²))
    − [½√(1 + 4σ²) − ½]² / σ²
  • What is the distribution of this random variable?
  • How close is it to a chi-square with 1 df?

40
(No Transcript)
41
Conclusions
  • The MCML has promise, but it is complicated
  • But other methods (multiple imputation, etc.) have
    complications of their own
  • Score tests of b = 0 based on the average
    likelihood agree with analyses that ignore
    measurement errors
  • Our application of the MCML method for partial
    likelihoods ignores the dilution effects
    described by Prentice (Biometrika, 1982), but
    these are expected to be very small in most
    settings
  • In shared-error settings the asymptotics are not
    correct for ordinary frequentist calculations,
    but it seems to be hard to come up with
    situations where they fail drastically

42
Acknowledgements
  • Leslie Stayner, Stephen Gilbert UIC/NIOSH
  • Elisabeth Cardis, Martine Vrijheid, Isabelle
    Deltour, IARC
  • Geoffrey Howe, Columbia
  • Terri Kang, USC