Title: Comparing two strategies for primary analysis of longitudinal trials with missing data
1Comparing two strategies for primary analysis of
longitudinal trials with missing data
- Peter Lane
- Research Statistics Unit
2Acknowledgements
- Missing data working group (2001 )
- Fiona Holland (Stats Prog, Harlow)
- Byron Jones (RSU Harlow)
- Mike Kenward (LSHTM)
- MNLM vs LOCF working group (2004 )
- Paul McSorley (Psychiatry area leader, RTP)
- Suzanne Edwards and Wen-Jene Ko (SP, RTP)
- Kath Davy, Claire Blackburn, Andrea Machin (SP,
Harlow)
3Contents
- Outline of the problem
- Methods of analysis
- Six clinical trials in GSK
- Simulation study
- parameters estimated from trials
- range of drop-out mechanisms
- comparison of two methods of analysis
- Conclusions
4Outline of the problem
- Missing values in longitudinal trials are a big issue
- First aim should be to reduce the proportion
- Ethics dictate that it can't be avoided
- Information lost can't be conjured up
- There is no magic method to fix it
- Magnitude of problem varies across areas
- 8-week depression trial: 25-50% may drop out by final visit
- 12-week asthma trial: maybe only 5-10%
- Most serious when efficacy evaluated at end
5Methods of analysis
- Ignore drop-out
- CC (complete-case analysis)
- Single imputation of missing values
- LOCF (last observation carried forward)
- Generate small samples from estimated distributions
- MI (multiple imputation)
- Fit model for response at all time-points
- GEE (generalized estimating equations)
- MNLM (multivariate normal linear model, also referred to as MMRM, or mixed-model repeated measures)
- Model drop-out as well as response
- SM (selection models)
- PMM (pattern-mixture models)
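The two simplest methods in the list above, CC and LOCF, can be sketched in a few lines. This is a minimal illustration, assuming responses are held in a patients-by-visits array with NaN marking a missed visit; the function names and the three-patient data set are hypothetical and are not taken from the trials.

```python
import numpy as np

def locf(responses):
    """Carry each patient's last observed value forward over later missing visits.

    responses: 2-D array (patients x visits), NaN marks a missing visit.
    A value missing at the first visit stays NaN.
    """
    filled = np.array(responses, dtype=float)
    for visit in range(1, filled.shape[1]):
        missing = np.isnan(filled[:, visit])
        filled[missing, visit] = filled[missing, visit - 1]
    return filled

def complete_cases(responses):
    """Keep only patients observed at every visit (complete-case analysis)."""
    responses = np.asarray(responses, dtype=float)
    return responses[~np.isnan(responses).any(axis=1)]

# Hypothetical data: three patients, four visits; two patients drop out.
data = np.array([
    [20.0, 18.0, 15.0, 12.0],
    [22.0, 19.0, np.nan, np.nan],
    [25.0, np.nan, np.nan, np.nan],
])
locf_data = locf(data)          # final-visit column now holds 12, 19 and 25
cc_data = complete_cases(data)  # only the first patient remains
```

LOCF keeps all three patients in the final-visit analysis but fills two of the three values with earlier observations; CC analyses only real final-visit data but from one patient.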
6Properties of methods
- MCAR: drop-out independent of response
- CC is valid, though it ignores information
- LOCF is valid if there are no trends with time
- MAR: drop-out depends only on observations
- CC, LOCF, GEE invalid
- MI, MNLM, weighted GEE valid
- MNAR: drop-out depends also on unobserved
- CC, LOCF, GEE, MI, MNLM invalid
- SM, PMM valid if (uncheckable) assumptions true
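The three drop-out classes can also be written as conditions on the drop-out distribution. The notation is the usual one from the missing-data literature and is an assumption here, not taken from the slides: $Y_{\mathrm{obs}}$ and $Y_{\mathrm{mis}}$ are a patient's observed and unobserved responses, and $R$ is the drop-out indicator.

```latex
\begin{align*}
\text{MCAR:}\quad & P(R \mid Y_{\mathrm{obs}}, Y_{\mathrm{mis}}) = P(R) \\
\text{MAR:}\quad  & P(R \mid Y_{\mathrm{obs}}, Y_{\mathrm{mis}}) = P(R \mid Y_{\mathrm{obs}}) \\
\text{MNAR:}\quad & P(R \mid Y_{\mathrm{obs}}, Y_{\mathrm{mis}}) \text{ depends on } Y_{\mathrm{mis}}
\end{align*}
```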
7Usage of methods
- In the past, LOCF has been used widely
- seen as conservative: not necessarily true
- gives envelope together with CC: not necessarily true
- conditional inference: not often interpretable
- MI was developed to improve imputation
- concern with repeatability assumptions
- MNLM is being increasingly used
- software available, but lack of understanding
- SM, PMM recommended for sensitivity analysis
- looks at some types of MNAR, requiring assumptions
8Compare LOCF and MNLM
- Simulation study, based on experience from trials
- Six trials from a range of psychiatry areas
- Pattern of treatment means over time
- Covariance matrix between repeated obs
- Drop-out rates
- Set up a range of drop-out mechanisms
- Generate many datasets and analyse both ways
- Look at bias of treatment diff. at final time-point
- Look at power to detect diff.
9Trial 2: pick two comparisons. Trials 3, 4, 6: pick one comparison. Gives seven two-arm scenarios.
10Covariance matrix from Trial 4
Week   Correlation with week              SD
        1    2    3    4    5    6    7
  1                                       4.6
  2    .68                                6.3
  3    .57  .72                           7.2
  4    .52  .64  .83                      7.3
  5    .43  .53  .70  .82                 7.2
  6    .39  .50  .64  .75  .85            7.4
  7    .33  .43  .60  .71  .78  .89       7.6
  8    .32  .44  .59  .67  .74  .84  .88  7.7
- Used estimates from each trial in simulation
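As a sketch of how such estimates could feed a simulation, the correlations and SDs shown for Trial 4 can be combined into a covariance matrix and used to draw multivariate normal responses. The mean profile and the seed below are hypothetical placeholders; only the correlations, the SDs and the 142 patients per arm come from the slides.

```python
import numpy as np

# SDs for weeks 1-8 and lower-triangle correlations, as shown for Trial 4.
sd = np.array([4.6, 6.3, 7.2, 7.3, 7.2, 7.4, 7.6, 7.7])
lower = [
    [],
    [.68],
    [.57, .72],
    [.52, .64, .83],
    [.43, .53, .70, .82],
    [.39, .50, .64, .75, .85],
    [.33, .43, .60, .71, .78, .89],
    [.32, .44, .59, .67, .74, .84, .88],
]

# Fill a symmetric correlation matrix with ones on the diagonal.
corr = np.eye(8)
for i, row in enumerate(lower):
    for j, r in enumerate(row):
        corr[i, j] = corr[j, i] = r

# Covariance = correlation scaled by the product of the two SDs.
cov = corr * np.outer(sd, sd)

rng = np.random.default_rng(2004)      # arbitrary seed, for repeatability
mean = np.linspace(25.0, 15.0, 8)      # hypothetical mean profile over the 8 weeks
one_arm = rng.multivariate_normal(mean, cov, size=142)  # 142 patients per arm
```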
11Drop-out rates from Trials 2 and 6
- Week 1 2 3 4 5 6 Total
- Treat 1 17 11 15 5 11 58
- Treat 2 10 13 14 10 1 49
- Treat 3 6 15 8 8 3 40
- Week 1 2 3 4 6 8 Total
- Treat 1 3 9 5 6 7 30
- Treat 2 7 7 5 7 9 36
- Treat 3 6 3 2 3 9 22
- Used average rate over times and treatments from
each trial
12Drop-out mechanisms
- MCAR: generate drop-out at random
- MAR: classify responses at Time k by size, and simulate drop-out at Time k+1 with varying probabilities for each class
- MNAR: as for MAR, but simulate drop-out at Time k, so actual response that influences drop-out is not observed
- Divide all responses at any visit into 9 quantile classes, and investigate 3 probability patterns (next slide) for drop-out
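The three mechanisms can be sketched as follows. This is an illustration under stated assumptions, not the code used in the study: the function names are hypothetical, classes are formed separately at each visit, the first visit is always observed, MCAR uses the average of the class probabilities, and the probabilities themselves are placeholders chosen to increase with response and average 4% per visit.

```python
import numpy as np

def quantile_class(values, n_classes=9):
    """Assign each value to one of n_classes equal-frequency classes (0 = lowest)."""
    cuts = np.quantile(values, np.linspace(0, 1, n_classes + 1)[1:-1])
    return np.searchsorted(cuts, values, side="right")

def simulate_dropout(complete, mechanism, class_probs, rng):
    """Return a copy of `complete` (patients x visits) with monotone drop-out set to NaN.

    mechanism: "MCAR", "MAR" or "MNAR".
    class_probs: per-visit drop-out probability for each response class.
    """
    class_probs = np.asarray(class_probs, dtype=float)
    n, n_visits = complete.shape
    observed = np.array(complete, dtype=float)
    active = np.ones(n, dtype=bool)            # patients still in the trial
    for k in range(1, n_visits):
        if mechanism == "MCAR":
            p = np.full(n, class_probs.mean())
        elif mechanism == "MAR":
            # driven by the previous visit's response, which has been observed
            p = class_probs[quantile_class(complete[:, k - 1], class_probs.size)]
        elif mechanism == "MNAR":
            # driven by this visit's response, which is then never observed
            p = class_probs[quantile_class(complete[:, k], class_probs.size)]
        else:
            raise ValueError(f"unknown mechanism: {mechanism}")
        drops = active & (rng.random(n) < p)
        active &= ~drops
        observed[~active, k] = np.nan          # missing from drop-out onwards
    return observed

rng = np.random.default_rng(1)
complete = rng.normal(20.0, 7.0, size=(500, 8))   # placeholder complete data
probs = np.linspace(0.01, 0.07, 9)                # hypothetical pattern, mean 4%
mar_data = simulate_dropout(complete, "MAR", probs, rng)
mnar_data = simulate_dropout(complete, "MNAR", probs, rng)
```

The only difference between the MAR and MNAR branches is which visit's response sets the probability, which mirrors the description above.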
13Drop-out probabilities
- Drop-out probability increases as response increases
- These patterns give an average 4% drop-out rate per visit
14Trial 1, simulation results
- Large treatment difference: 19
- average obs. SD: 19
- patients per arm: 93
- Example of simulation results
- MCAR drop-out
- 1000 simulations
- power_mnlm 99.90
- power_cc 99.90
- power_locf 99.90
- bias_mnlm 0.32
- bias_cc 0.29
- bias_locf 12.17
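Summaries of this kind could be computed from the simulated trials roughly as below. This is a sketch under assumptions: bias is taken as a percentage of the true treatment difference, power as the percentage of simulations with a p-value below 0.05, and the function name and the four example results are hypothetical.

```python
import numpy as np

def summarise_simulations(estimates, p_values, true_diff, alpha=0.05):
    """Return (percentage bias, percentage power) over a set of simulated trials."""
    estimates = np.asarray(estimates, dtype=float)
    p_values = np.asarray(p_values, dtype=float)
    # Bias of the mean estimate, relative to the true difference.
    bias_pct = 100.0 * (estimates.mean() - true_diff) / true_diff
    # Share of simulations that reject the null hypothesis at level alpha.
    power_pct = 100.0 * np.mean(p_values < alpha)
    return bias_pct, power_pct

# Hypothetical results from four simulated trials with a true difference of 19.
bias_pct, power_pct = summarise_simulations(
    estimates=[17.0, 16.0, 18.0, 17.0],
    p_values=[0.01, 0.20, 0.03, 0.04],
    true_diff=19.0,
)
```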
15Trial 1, summary
- Bias uniformly greater for LOCF
- average 18% vs 4% for MNLM
- all negative bias except one for LOCF (MAR extreme)
- e.g. MNAR linear: 13% bias for LOCF, i.e. treat diff 15 rather than 19; 2% bias for MNLM
- e.g. MNAR extreme: 24% for LOCF, 18% for MNLM
- Power nearly all 100%
16Trial 2, first comparison
- Medium treatment difference: 13
- average obs. SD: 19
- patients per arm: 75
- Bias greater for LOCF than MNLM except one (MNAR extreme) with 27% for LOCF, 28% for MNLM
- average 23% for LOCF, 7% for MNLM
- all negative bias except one for LOCF (39% for MAR extreme)
- Power uniformly higher for LOCF
- average 92% vs 67% for MNLM
17Trial 3
- Medium treatment difference: 3
- average obs. SD: 8.7
- patients per arm: 116
- Similar results to Trial 2 with first comparison, except
- smaller power difference: 76% for LOCF, 60% for MNLM
18Trial 4
- Small treatment difference: 2
- average obs. SD: 6.9
- patients per arm: 142
- Bias uniformly greater for LOCF (but small in magnitude as treatment difference is small)
- average 44% for LOCF vs 4% for MNLM
- all negative bias except three for MNLM (2%, 0%, 0% for MCAR, MAR light and MAR medium)
- Power uniformly lower for LOCF
- average 21% vs 36% for MNLM
19Trial 5
- Small treatment difference: 2
- average obs. SD: 8.9
- patients per arm: 121
- Similar results to Trial 4, except
- smaller bias difference: 12% for LOCF, 4% for MNLM
- little power difference: 26% for LOCF, 22% for MNLM
20Trial 6
- Almost no treatment difference: 1
- average obs. SD: 10.3
- patients per arm: 115
- Bias uniformly greater for LOCF
- average 28% vs 9% for MNLM
- negative bias except five for MNLM (12%, 9%, 5%, 2%, 4% for MCAR, MAR and MNAR light)
- Power virtually the same
- average 7% for LOCF vs 9% for MNLM
21Trial 2, second comparison
- Almost no treatment difference: 1
- average obs. SD: 19
- patients per arm: 75
- Similar results to Trial 6, except
- little bias difference: 23% for both
22Conclusions
- 1. MNLM is nearly always superior in terms of reduced bias
- LOCF is biased even for MCAR with these patterns
- MNLM has virtually no bias for MCAR and MAR
- MNLM has less bias than LOCF for moderate MNAR
- extreme MNAR gives problems for both
- 2. Bias is usually negative
- underestimates the effect of a drug
- is this contributing to the attrition rate of
late-phase drugs?
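One way to see why LOCF can be biased towards a smaller effect even under MCAR is to work out its expected final-visit estimate when the treatment difference grows over time. The sketch below assumes the same constant per-visit drop-out probability in both arms and a first visit that is always observed; the function name and the linear difference profile are hypothetical.

```python
import numpy as np

def expected_locf_difference(diff_by_visit, dropout_prob):
    """Expected LOCF estimate of the final-visit treatment difference under MCAR.

    Each patient contributes the difference at their last observed visit, so the
    expectation is the true per-visit difference weighted by P(last visit = k).
    """
    diff = np.asarray(diff_by_visit, dtype=float)
    n_visits = diff.size
    stay = 1.0 - dropout_prob
    # Last observed visit is k < T: stay for k - 1 visits, then drop out.
    weights = [stay ** (k - 1) * dropout_prob for k in range(1, n_visits)]
    # Last observed visit is T: never drop out.
    weights.append(stay ** (n_visits - 1))
    return float(np.dot(weights, diff))

# Hypothetical profile: difference grows linearly from 0 to 2 over 8 visits.
diff_by_visit = np.linspace(0.0, 2.0, 8)
expected = expected_locf_difference(diff_by_visit, dropout_prob=0.04)
relative_bias_pct = 100.0 * (expected - diff_by_visit[-1]) / diff_by_visit[-1]
```

With any positive drop-out probability and an increasing difference, the expected LOCF estimate falls short of the true final-visit difference, so the relative bias is negative.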
23Conclusions (continued)
- 3. LOCF sometimes has more power than MNLM, sometimes less
- reduced treatment effect can be more than counteracted by artificially increased sample-size
- against statistical and ethical principles to augment data with invented values
- 4. MNLM gives very similar results to CC
- MNLM adjusts CC for non-MCAR effects
- LOCF adjusts CC in unacceptable ways
- other methods must be used to investigate non-MAR effects: neither LOCF nor MNLM can address these problems
24Actions within GSK
- Continue to propose MNLM for primary analysis of longitudinal trials
- Prepare clear guides for statisticians, reviewers and clinicians about MNLM
- Continue to investigate methods for sensitivity analysis to handle MNAR drop-out
25Selected references
- Mallinckrodt et al. (2003). Assessing and interpreting treatment effects in longitudinal clinical trials with missing data. Biological Psychiatry 53, 754-760.
- Gueorguieva and Krystal (2004). Move over ANOVA. Archives of General Psychiatry 61, 310-317.
- Mallinckrodt et al. (2004). Choice of the primary analysis in longitudinal clinical trials. Pharmaceutical Statistics 3, 161-169.
- Molenberghs et al. (2004). Analyzing incomplete longitudinal clinical trial data (with discussion). Biostatistics 5, 445-464.
- Cook, Zeng and Yi (2004). Marginal analysis of incomplete longitudinal binary data: a cautionary note on LOCF imputation. Biometrics 60, 820-828.