Monte Carlo Maximum Likelihood Methods for Estimating Uncertainty Arising from Shared Errors in Exposures in Epidemiological Studies




Monte Carlo Maximum Likelihood Methods for
Estimating Uncertainty Arising from Shared Errors
in Exposures in Epidemiological Studies
  • Daniel O. Stram
  • University of Southern California

Complex Dosimetry Systems: a Working Definition
(my definition)
  • A complex dosimetry system for the study of an
    environmental exposure is one in which no single
    best exposure estimate is provided
  • Instead a distribution of possible true exposures
    is developed, together with a computer program
    that generates exposure replications from this
    distribution
  • Generates doses conditional on input data
  • Both shared and unshared errors are incorporated
    into the dose replications
  • The statistician/epidemiologist treats this
    system as a black box, i.e. one that (s)he can
    manipulate, but doesn't know (or care?) about its
    inner workings

  • Some examples of epidemiological studies (of
    radiation) that use a complex dosimetry system to
    estimate doses
  • Utah Thyroid Disease Cohort Study
  • Hanford Thyroid Disease Study
  • Colorado Plateau Uranium Miners Study
  • In such studies limited or no direct measurements
    of individual dose exist. Instead a complex dose
    reconstruction (Utah, Hanford) or interpolation
    system (Colorado) is used to construct individual
    dose estimates or histories.

  • Even when all subjects in the study have
    (radiation) badge measurements these may need
    adjustments to reflect temporal or geographical
    differences in monitoring technology
  • Random errors and systematic biases exist for
    virtually any method
  • Information about the size of random errors and
    systematic biases for each dosimeter type comes
    from only a few experiments
  • Therefore there may be considerable uncertainty in
    the systematic biases for any single dosimeter
  • Systematic biases constitute shared error

Representation of uncertainty in complex
dosimetry systems
  • Uncertainty in the dose estimates produced by
    these systems is increasingly characterized using
    Monte-Carlo methods which yield many
    realizations of possible dose, rather than a
    single best estimate of dose for each subject.
  • Part of the uncertainty of these estimates may be
    due to lack of knowledge of factors that
    simultaneously influence some or all of the
    subjects' doses

Dose estimation in the Hanford Thyroid Disease
Study
  • Reconstruction based on physical modeling and
    some measurements of:
  • Releases of I-131
  • Deposition and pasture retention of I-131
  • Pasture practices
  • Milk transfer coefficients
  • Individual consumption of milk
  • Note that errors in most of these will affect
    doses for all individuals simultaneously

Colorado Plateau Underground Miners Study
  • Dose estimates created using a complex exposure
    history / job history matrix
  • PHS exposure history matrix consisted of
    interpolations of limited WLM measurements
    temporally and geographically
  • Stram et al. (1999) used a PHS-developed hierarchy
    of mines within localities within districts and
    used a multilevel model to mimic temporal and
    geographical variation in dose.
  • The 1999 analysis was based upon the
    regression-substitution method, in which
    $E(\text{true dose} \mid \text{all measurements})$
    was computed for each mine-year after fitting the
    lognormal multilevel model to the WLM measurements
  • Errors in mine-year measurements are correlated
    by the interpolation system used, and many miners
    worked in the same mines, leading to correlated
    errors across miners' exposure histories.

Pooled Nuclear Workers
  • Multi-facility, multi-year study
  • Each worker had badge measurements, but the
    technologies changed through time and across
    facilities
  • The systematic errors in each badge type are
    shared by all subjects working at the time the
    badge was in use
  • For many but not all types of personal monitor
    some limited work (using phantoms, etc.) has been
    done to assess the relationship between true
    exposure and the badge measurement
  • One important issue is whether the low dose-rate
    exposures of the nuclear workers produce risks
    that are in line with those seen for the A-bomb
    survivors
  • Upper confidence limits that take account of
    shared dosimetry error are needed

Monte-Carlo Dosimetry
  • Adopts a Bayesian framework
  • Is Bayesian about sampling error in the
    experimental work (with badges), interpreted as
    giving posterior distributions
  • Prior distributions for uncertain parameters (for
    nuclear workers, the likely size of biases for
    unmeasured badges) are set using expert opinion
  • For each replication the uncertain parameters are
    sampled from their distribution and combined with
    samples of other random factors (e.g. local
    meteorology for the Hanford or Utah studies) and
    with all relevant individual data for each
    subject (location, milk consumption, age, etc.)
  • Each set of random quantities is combined with
    individual data to form dose estimates for each
    subject
  • Let us assume that the dose replications really
    may be regarded as samples from the distribution
    of true dose given all the individual data
  • For retrospective dose-reconstruction systems
    this assumption may be a very large leap of faith
  • For other studies using badge calibration
    (Workers) or measurement interpolation this may
    be considerably more solidly founded.
  • Consider the sampling characteristics of
    frequentist inference concerning risk estimation.
    We want to know the influence of uncertainty on:
  • The power to detect an effect (of exposure on
    risk of disease) of a certain size
  • Confidence limits on estimated risk parameters

An idealized dosimetry system
  • Assume each replication of dose is a sample from
    the joint distribution
  • $f(X_1, X_2, \ldots, X_N \mid W_1, W_2, \ldots, W_N)$
  • of true dose given the input data $W_i$ recorded
    for all subjects. Because many realizations
    from $f(X \mid W)$ are available, we can calculate
  • $Z_i = E(X_i \mid W)$
  • as the average over a very large number of
    realizations $X_i^r$, where $X^r \sim f(X \mid W)$,
    $r = 1, 2, \ldots$
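
A minimal sketch of how $Z_i$ could be computed from the replications. The function `toy_dosimetry_system` is a hypothetical stand-in for the black box, not any real dosimetry program; its lognormal error structure, standard deviations, and input doses are assumptions chosen only for illustration.

```python
import random

def toy_dosimetry_system(w, n_rep, rng, sd_shared=0.3, sd_unshared=0.3):
    """Hypothetical black box: returns n_rep dose replications given inputs w.

    Assumption: each replication multiplies the input-based dose by one shared
    lognormal factor and one unshared lognormal factor per subject, both mean 1.
    """
    reps = []
    for _ in range(n_rep):
        shared = rng.lognormvariate(-0.5 * sd_shared ** 2, sd_shared)
        reps.append([
            wi * shared * rng.lognormvariate(-0.5 * sd_unshared ** 2, sd_unshared)
            for wi in w
        ])
    return reps

def conditional_mean_dose(replications):
    """Estimate Z_i = E(X_i | W) as the per-subject average over replications."""
    n_rep = len(replications)
    n_subj = len(replications[0])
    return [sum(rep[i] for rep in replications) / n_rep for i in range(n_subj)]

rng = random.Random(0)
w = [1.0, 2.0, 5.0]  # made-up input-based doses for three subjects
z = conditional_mean_dose(toy_dosimetry_system(w, 20000, rng))
```

Because both error factors have mean 1 in this toy system, each entry of `z` should settle near the corresponding entry of `w` as the number of replications grows.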

How should an epidemiologist deal with the
uncertainty in the random replications of dose?
  • We are interested in estimating parameters in the
    dose-response function for disease $D_i$ given true
    dose $X_i$, specifically the relationship
  • $E(D_i \mid X_i)$
  • parameterized by $b$ (the dose-response slope)

Simplifications of the disease model
  • Assume a linear relation between D and X
  • $E(D_i \mid X_i) = a + b X_i$   (1)
  • Linear models are of interest for at least two
    reasons:
  • They may be important for radio-biological and
    radio-protection reasons even for binary disease
    outcomes (where logistic regression models are
    the standard)
  • For small b it may be impossible to distinguish
    between linear and smooth nonlinear (e.g.
    logistic) dose response shapes
  • A study with good power to detect a dose-response
    relationship may have very poor power to fully
    define the shape of the response

Berkson error models
  • If the errors in the $Z_i$ defined above are
    independent of one another, then model (1) is
    fitted by replacing the true $X_i$ with $Z_i$.
  • This is a Berkson error model in the sense that
    the truth is distributed around the measured
    value. Regression-substitution yields unbiased
    estimates.
  • The classical error model has the measurement
    distributed around the truth. This produces risk
    estimates that are biased towards the null.
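
The contrast between the two error models can be checked in a small simulation. This is a sketch under assumed conditions (normal errors with unit variance, a made-up slope $b = 2$, no shared error), not a reconstruction of any analysis in the talk.

```python
import random

def ols_slope(x, y):
    """Ordinary least squares slope of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx

def simulate_slopes(n=20000, b=2.0, seed=1):
    """Return (slope under Berkson error, slope under classical error)."""
    rng = random.Random(seed)
    # Berkson: the true dose X scatters around the assigned value Z.
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    x_berkson = [zi + rng.gauss(0.0, 1.0) for zi in z]
    d_berkson = [b * xi + rng.gauss(0.0, 1.0) for xi in x_berkson]
    # Classical: the measurement Z scatters around the true dose X.
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    z_classical = [xi + rng.gauss(0.0, 1.0) for xi in x]
    d_classical = [b * xi + rng.gauss(0.0, 1.0) for xi in x]
    return ols_slope(z, d_berkson), ols_slope(z_classical, d_classical)

berkson_slope, classical_slope = simulate_slopes()
```

With equal dose and error variances, the classical slope is attenuated by the factor Var(X) / (Var(X) + Var(error)) = 1/2, while the Berkson slope stays near the true $b$.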

Impact of independent measurement error
  • For either Berkson or Classical error models the
    most important effect of random error is loss of
    power to detect nonzero risk estimates
  • If $R^2$ is the squared correlation between true
    exposure X and measured exposure Z, then it will
    take $1/R^2$ times as many subjects to detect the
    same risk using Z as using true X.
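
As a worked example of the $1/R^2$ rule, the helper below converts a study size based on true dose into the size needed with error-prone dose. The function name and the numbers are illustrative assumptions.

```python
import math

def subjects_needed(n_with_true_dose, r_squared):
    """Subjects needed using Z to match a study of n subjects using true X."""
    if not 0.0 < r_squared <= 1.0:
        raise ValueError("r_squared must be in (0, 1]")
    return math.ceil(n_with_true_dose / r_squared)

# A study needing 1000 subjects with true dose needs twice as many if R^2 = 0.5.
n_required = subjects_needed(1000, 0.5)
```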

Shared versus unshared dosimetry error
  • A key distinction between the effects of shared
    versus unshared dosimetry error is their effect
    on the validity of sample variance estimates used
    to characterize the variability of the estimates
  • Independent Berkson errors: the usual estimate
    of the standard error of the slope estimate
    remains valid despite the loss of power
  • Independent classical errors: again the usual
    estimate of the standard error of the slope
    estimate generally remains valid despite
  • the loss of power
  • the attenuation in the dose-response parameter

Dosimetry simplifications
  • Adopt a generalization of the Berkson error model
    for the joint distribution of true $X_i$ around its
    conditional mean $Z_i$ which incorporates both
    shared and unshared errors
  • $\varepsilon_{SM}$ is shared multiplicative error with mean 1
  • $\varepsilon_{M,i}$ is unshared multiplicative error with mean 1
  • $\varepsilon_{SA}$ is shared additive error with mean 0
  • $\varepsilon_{A,i}$ is unshared additive error with mean 0
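
The model equation itself did not survive in the transcript. The sketch below assumes the natural form consistent with the four definitions, $X_i = \varepsilon_{SM}\,\varepsilon_{M,i}\,Z_i + \varepsilon_{SA} + \varepsilon_{A,i}$; that form, the lognormal and normal error distributions, and all standard deviations are assumptions for illustration only.

```python
import random

def suma_replication(z, rng, sd_sm=0.3, sd_m=0.3, sd_sa=0.1, sd_a=0.1):
    """One draw of true doses X around Z under the assumed form
    X_i = e_SM * e_M,i * Z_i + e_SA + e_A,i."""
    e_sm = rng.lognormvariate(-0.5 * sd_sm ** 2, sd_sm)  # shared, mean 1
    e_sa = rng.gauss(0.0, sd_sa)                         # shared, mean 0
    return [
        e_sm * rng.lognormvariate(-0.5 * sd_m ** 2, sd_m) * zi
        + e_sa + rng.gauss(0.0, sd_a)
        for zi in z
    ]

def pearson(u, v):
    """Pearson correlation of two equal-length lists."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / (suu * svv) ** 0.5

rng = random.Random(4)
z = [2.0, 2.0]  # two subjects with the same conditional mean dose
reps = [suma_replication(z, rng) for _ in range(20000)]
x1 = [r[0] for r in reps]
x2 = [r[1] for r in reps]
mean_x1 = sum(x1) / len(x1)    # Berkson property: close to z[0]
shared_corr = pearson(x1, x2)  # positive only because of the shared terms
```

Setting `sd_sm` and `sd_sa` to zero in this toy removes the correlation between the two subjects' doses, which is one way to see what "shared" means here.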

  • Under this shared and unshared multiplicative and
    additive (SUMA) error model we have $E(X \mid W) = Z$
    (the usual Berkson property) over the
    distribution of all four $\varepsilon$ terms
  • What happens when we fit
  • $E(D_i \mid Z_i) = a + b Z_i$?
  • If there are no measurement errors, $\mathrm{Var}(\varepsilon) = 0$, we
    will have (for small values of $b$)
  • (1)

  • Effects of shared and unshared errors on
  • We are interested in three questions regarding
    each error component in the SUMA model:
  • What is its effect on study power?
  • What is its effect on the validity of expression
    (1) for the variance of the estimate of b?
  • How valid are the estimates of study power when
    they are based on expression (1)?

  • Shared additive error has little effect on
    either the estimation of $b$ or the variability
    of the estimate
  • Unshared additive or multiplicative errors
    reduce the correlation, $R$, between $X_i$ and $Z_i$,
    thereby reducing study power. Study efficiency
    under unshared measurement error is roughly
    proportional to $R^2$
  • However, expression (1) for the variance of the
    estimator remains valid, and the estimate of
    study power using (1) remains appropriate

Effect of multiplicative shared error
  • Averaging over the distribution of random $\varepsilon_{SM}$ we
    retain the Berkson property that
  • But with

  • Notice that if $b = 0$, the naïve estimate of
    the variance of $\hat b$
  • ignoring the shared error is equal to
  • the true variance of this parameter estimate
  • If $b > 0$, the naïve estimate of the variance is
    biased downward by

  • We conclude
  • Ignoring shared error does not affect the
    validity of the test of the null hypothesis that
    $b = 0$, because expression (2) = expression (1)
    when $b = 0$
  • More generally, non-differential measurement error
    weakens the power, but does not affect the
    validity, of a test of association between
    disease and exposure
  • Ignoring shared error will overstate the power to
    detect $b > 0$, because (1) < (2) in this case

  • Ignoring shared error will result in confidence
    limits that are too narrow
  • However, it is the upper confidence limit that is
    most affected.
  • If the lower confidence limit ignoring shared
    error is above zero, correcting for shared error
    will not bring it down to zero
    (because of conclusion 1)
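
These conclusions can be illustrated with a toy simulation. The setup is an assumption made for this sketch only: a single shared multiplicative factor drawn once per simulated study from a normal distribution with mean 1, no unshared error, and a linear outcome with unit residual variance. In this toy model $\hat b \approx b\,\varepsilon_{SM}$ plus sampling noise, so the true variance exceeds the naïve one by about $b^2\,\mathrm{Var}(\varepsilon_{SM})$.

```python
import random

def fit_slope_and_naive_var(z, d):
    """OLS slope of d on z and the usual (naive) estimate of its variance."""
    n = len(z)
    mz, md = sum(z) / n, sum(d) / n
    szz = sum((zi - mz) ** 2 for zi in z)
    b_hat = sum((zi - mz) * (di - md) for zi, di in zip(z, d)) / szz
    a_hat = md - b_hat * mz
    rss = sum((di - a_hat - b_hat * zi) ** 2 for zi, di in zip(z, d))
    return b_hat, rss / (n - 2) / szz

def variance_ratio(b, n_subjects=200, n_studies=1000, sd_sm=0.3, seed=2):
    """Empirical variance of b_hat across studies / mean naive variance."""
    rng = random.Random(seed)
    z = [rng.uniform(0.0, 2.0) for _ in range(n_subjects)]
    slopes, naive = [], []
    for _ in range(n_studies):
        e_sm = rng.gauss(1.0, sd_sm)  # one shared factor per simulated study
        d = [b * e_sm * zi + rng.gauss(0.0, 1.0) for zi in z]
        b_hat, v = fit_slope_and_naive_var(z, d)
        slopes.append(b_hat)
        naive.append(v)
    mean_b = sum(slopes) / n_studies
    emp_var = sum((s - mean_b) ** 2 for s in slopes) / (n_studies - 1)
    return emp_var / (sum(naive) / n_studies)

ratio_null = variance_ratio(0.0)  # naive variance is about right when b = 0
ratio_pos = variance_ratio(1.0)   # naive variance is too small when b > 0
```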

How to incorporate shared ME directly into an
analysis
  • Multiple imputation
  • Full Parametric Bootstrap
  • Likelihood analysis with MCML

Multiple Imputation
  • It is tempting to try to quantify the uncertainty
    in $\hat b$ by regressing $D_i$ on each set of $X^r$ and
    using the quantiles of the resulting slope
    estimates as confidence limits for $b$
  • This ignores the sampling variability of D
  • Moreover the distribution of the slope estimates
    can be badly biased towards the null value.
    Essentially there is a reintroduction of
    classical error into the problem
  • True multiple imputation requires sampling $X^r$
    from the distribution of X given both the input
    data W and the outcomes $D_i$ (not just W) to
    remove these biases
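
The bias towards the null can be seen in a small simulation. This sketch assumes additive normal Berkson errors with unit variance and a made-up slope $b = 2$; each replication is drawn given $Z$ only, without conditioning on $D$, which is the shortcut the slide warns against.

```python
import random

def ols_slope(x, y):
    """Ordinary least squares slope of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx

def naive_imputation_slopes(n=5000, n_rep=20, b=2.0, seed=3):
    """Slope of D on Z, and slopes of D on replications drawn without D."""
    rng = random.Random(seed)
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    x_true = [zi + rng.gauss(0.0, 1.0) for zi in z]  # truth scatters around Z
    d = [b * xi + rng.gauss(0.0, 1.0) for xi in x_true]
    rep_slopes = []
    for _ in range(n_rep):
        # Replication from f(X | W): independent of the realized truth and of D.
        x_rep = [zi + rng.gauss(0.0, 1.0) for zi in z]
        rep_slopes.append(ols_slope(x_rep, d))
    return ols_slope(z, d), rep_slopes

slope_on_z, rep_slopes = naive_imputation_slopes()
```

In this toy setup the replication-based slopes cluster near $b/2$ rather than $b$: each $X^r$ behaves like a classical-error measurement of the dose that actually generated $D$.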

Full Parametric Bootstrap
  • A simulation experiment in which $\hat b$ is used as the
    true value of the risk parameter and both doses
    and outcomes Di are simulated from a complete

Monte-Carlo maximum likelihood
  • We can compute likelihood ratio tests as follows
  • For null values $a_0$ and $b_0$, generate $n$ samples of
    $X^r$ from the distribution of X given W and D
  • For any test values $a$ and $b$, compute the log
    likelihood ratio as
  • If we use $b_0 = 0$ then we don't have to condition
    on D (so that we can use the dosimetry system
    directly)
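
The likelihood ratio formula is missing from the transcript. The sketch below assumes the standard Monte Carlo form, in which the ratio is estimated by averaging $f(D \mid X^r; a, b) / f(D \mid X^r; a_0, b_0)$ over the replications, and takes $b_0 = 0$ so that replications can come from $f(X \mid W)$. The normal outcome model with known unit variance, the toy shared-error dose generator, and all numbers are assumptions for illustration.

```python
import math
import random

def log_mean_exp(values):
    """Log of the average of exp(values), computed stably."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values) / len(values))

def normal_loglik(d, x, a, b, sigma=1.0):
    """Log likelihood of D given doses x, up to a constant (sigma known)."""
    return sum(-0.5 * ((di - a - b * xi) / sigma) ** 2 for di, xi in zip(d, x))

def mcml_log_lr(d, dose_reps, a, b, a0):
    """Monte Carlo log likelihood ratio of (a, b) against (a0, b0 = 0)."""
    return log_mean_exp([
        normal_loglik(d, x, a, b) - normal_loglik(d, x, a0, 0.0)
        for x in dose_reps
    ])

rng = random.Random(5)
z = [rng.uniform(0.0, 2.0) for _ in range(200)]

def draw_doses():
    """Toy replication: one shared lognormal factor (mean 1) applied to z."""
    e_sm = rng.lognormvariate(-0.5 * 0.3 ** 2, 0.3)
    return [e_sm * zi for zi in z]

true_x = draw_doses()
d = [1.0 * xi + rng.gauss(0.0, 1.0) for xi in true_x]  # simulated with a = 0, b = 1
dose_reps = [draw_doses() for _ in range(500)]
grid = [0.0, 0.5, 1.0, 1.5]
log_lr = {b: mcml_log_lr(d, dose_reps, 0.0, b, 0.0) for b in grid}
```

Evaluating `log_lr` over a finer grid traces out a profile that can be compared with the likelihood from a fit that ignores dose uncertainty.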

Once we compute the likelihood, what do we do with
it?
  • We have a funny mishmash: we are being
  • Bayesian about the doses
  • Frequentist about the dose-response parameter
  • Moreover we can't really expect standard
    frequentist asymptotic likelihood theory to hold
  • Suppose the number of subjects $\to \infty$; then the
    distribution of $\hat b$ will be dominated by the
    distribution of the shared multiplicative errors
    in the dosimetry system, the distribution of which
    is arbitrary.
  • Is it still reasonable to use chi-square
    approximations to the distribution of changes in
    log likelihood?

Other problems
  • If shared multiplicative error is large then as
    $b - b_0$ gets large the summands in (5)
  • become extremely variable
  • Convergence of the average is incredibly slow
  • Round-off error dominates the performance of the
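
One way to see how variable the summands become is the effective sample size of the Monte Carlo average, $(\sum_r w_r)^2 / \sum_r w_r^2$, where $w_r$ is the $r$-th summand. The sketch below assumes that the summands are likelihood ratios against $b_0 = 0$ under a normal outcome model with $a = 0$, and uses a large lognormal shared error (log standard deviation 0.5); expression (5) itself is not in the transcript, so this is an illustration, not the talk's formula.

```python
import math
import random

def effective_sample_size(log_weights):
    """(sum w)^2 / sum w^2, computed after subtracting the max log weight."""
    m = max(log_weights)
    w = [math.exp(lw - m) for lw in log_weights]
    return sum(w) ** 2 / sum(wi * wi for wi in w)

def log_summands(d, z, shared_factors, b):
    """Log of f(D | X^r; b) / f(D | X^r; b0 = 0) for X^r = e_r * z."""
    return [
        sum(-0.5 * (di - b * e * zi) ** 2 + 0.5 * di ** 2
            for di, zi in zip(d, z))
        for e in shared_factors
    ]

rng = random.Random(6)
z = [rng.uniform(0.0, 2.0) for _ in range(200)]
d = [1.0 * zi + rng.gauss(0.0, 1.0) for zi in z]  # made-up data, b = 1
shared = [rng.lognormvariate(-0.125, 0.5) for _ in range(1000)]
ess = {b: effective_sample_size(log_summands(d, z, shared, b))
       for b in (0.0, 0.5, 1.0)}
```

At $b = b_0$ every summand equals 1 and all 1000 replications count equally; away from $b_0$ a small share of the replications carries most of the weight, which is why the average converges slowly. Working on the log scale, as here, also sidesteps the underflow that a direct average of likelihoods would hit.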

Application to the ORNL N-workers data (Stayner et
al., in review)
  • Estimate a single risk parameter using (time
    dependent) total dose in a partial likelihood
  • Write a computer program that simulates the bias
    factors for the badges used in those facilities
    and re-links the risk sets

Three analyses
  • 1. Compute the MCML likelihood
  • For each replication of doses compute the partial
    likelihoods over a 1-dimensional grid of risk
    parameter values
  • Average the partial likelihoods over the
    replications
  • Pretend that the asymptotics still hold and
    compute a confidence interval

  • 2. Compute FPB estimates of $b$
  • Compare these to the MCML confidence intervals
  • 3. For each set of D computed in 2 compute a
    separate MCML confidence interval (more
    simulations from the dose distribution)
  • Count the number of times that the standard
    frequentist confidence interval contains the true
    value of the risk parameter

FPB simulations
Some observations
  • The MCML widens the confidence interval on the
    high side more than the low side
  • The 90 percent asymptotic lower confidence limit
    for the MCML is above 0.
  • This is good because (1) the uncorrected CI did
    not include 0 and (2) we claim that correcting
    for measurement error shouldn't affect the
    significance of a test of no association.
  • Note that the two curves (uncorrected and MCML log
    likelihoods) are very close to parallel at $b = 0$
  • This implies that a score test of $b = 0$ will be
    (nearly) identical using the corrected and
    uncorrected likelihoods at any significance
    level
  • This observed result follows from Tosteson and
    Tsiatis (1988) on score tests for
    errors-in-variables (EIV) problems

  • The FPB, on the other hand, puts significantly
    more than 5 percent of the estimates < 0 (68 of
    1,000) and significantly fewer of the estimates
    (33 of 1,000) above the MCML UCI.
  • This may actually be a promising observation for
    the validity of the MCML confidence intervals
  • They tend to be skewed to the right (not
    symmetric around the MLE), so more (than 3.3
    percent) of the upper confidence limits and fewer
    (than 6.8 percent) of the lower confidence
    limits should fail to cover the true value
  • Simulations are now in progress

Validity of MCML CI
  • Consider the limiting case when $n \to \infty$: the slope
    estimate will be determined by the distribution
    of shared multiplicative errors
  • Worst case would be the SUMA model
  • Suppose that $\varepsilon_{SM}$ is distributed as lognormal
    with arithmetic mean 1 (log-mean $-\tfrac12\sigma^2$)
  • Then the limiting naive slope estimate
    $\hat b = b\,\varepsilon_{SM}$ is also lognormal, with log-mean
    $\log(b) - \tfrac12\sigma^2$ and log-variance $\sigma^2$
  • Consider twice the change in log likelihood from
    true $b$ to the MLE
  • This will be
    $\frac{1}{\sigma^2}\left[\log(\hat b) - \left(\log(b) - \tfrac12\sigma^2\right)\right]^2$,
    which is exactly $\chi^2$
  • Consider next a normal distribution for the
    shared multiplicative error
  • Would make sense if $\varepsilon_{SM}$ was itself a sum of
    many components

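  • For this model twice the change in log likelihood
    is of the form
  • $-2\log(\varepsilon_{SM}) + \frac{1}{\sigma^2}(\varepsilon_{SM} - 1)^2 + c$
  • where $c = 2\log\left(\tfrac12 + \tfrac12\sqrt{1 + 4\sigma^2}\right) - \left[\tfrac12\sqrt{1 + 4\sigma^2} - \tfrac12\right]^2 / \sigma^2$
  • What is the distribution of this random variable?
  • How close is it to a chi-square with 1 df?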
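
The last two questions can be explored numerically. The sketch below simulates the statistic $-2\log(\varepsilon_{SM}) + (\varepsilon_{SM} - 1)^2/\sigma^2 + c$ for a normal shared error with mean 1 and counts how often it exceeds 3.841, the 95 percent point of a chi-square with 1 df. The choice $\sigma = 0.2$ and the redrawing of non-positive factors (needed for the logarithm) are assumptions of this sketch.

```python
import math
import random

def lr_statistic_normal_shared(eps, sigma):
    """-2 log(eps) + (eps - 1)^2 / sigma^2 + c for a positive draw eps."""
    u = 0.5 + 0.5 * math.sqrt(1.0 + 4.0 * sigma ** 2)
    c = 2.0 * math.log(u) - (u - 1.0) ** 2 / sigma ** 2
    return -2.0 * math.log(eps) + (eps - 1.0) ** 2 / sigma ** 2 + c

def exceedance_fraction(sigma, n_draws=20000, cutoff=3.841, seed=7):
    """Share of simulated statistics above the chi-square (1 df) 95% point."""
    rng = random.Random(seed)
    count = 0
    for _ in range(n_draws):
        eps = rng.gauss(1.0, sigma)
        while eps <= 0.0:  # assumption: redraw non-positive shared factors
            eps = rng.gauss(1.0, sigma)
        if lr_statistic_normal_shared(eps, sigma) > cutoff:
            count += 1
    return count / n_draws

frac = exceedance_fraction(0.2)
```

A value of `frac` close to 0.05 would indicate that the chi-square approximation holds up for that $\sigma$; repeating the call for larger $\sigma$ shows how the agreement changes as the shared error grows.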

  • The MCML has promise but it is complicated
  • But other methods (multiple imputation, etc.) have
    complications of their own
  • Score tests of $b = 0$ based on the average
    likelihood agree with analyses that ignore
    measurement errors
  • Our application of the MCML method for partial
    likelihoods ignores the dilution effects
    described by Prentice (Biometrika 1982) but
    these are expected to be very small in most
  • In shared error settings the asymptotics are not
    correct for ordinary frequentist calculations,
    but it seems to be hard to come up with
    situations where they fail drastically

  • Leslie Stayner, Stephen Gilbert UIC/NIOSH
  • Elisabeth Cardis, Martine Vrijheid, Isabelle
    Deltour, IARC
  • Geoffrey Howe, Columbia
  • Terri Kang, USC