Imputation Methods for Missing Quality of Life Data in the Adjuvant Breast Cancer Trials IBCSG Trial

1
Imputation Methods for Missing Quality of Life
Data in the Adjuvant Breast Cancer Trials IBCSG
Trial VI and VII
  • Marion Procter

2
Aim
  • Analysing a clinical trial to assess the effect
    of quality of life (QOL) on disease-free survival
    (DFS)
  • QOL is measured repeatedly but there are many
    missing values
  • Is the conclusion about the effect of QOL on DFS
    sensitive to the missing values?

3
Outline
  • IBCSG Trial VI and VII
  • Imputing Coping Scores
  • Time-dependent Cox models
  • Missing data pattern
  • Common imputation methods
  • Results from time-dependent Cox models
  • Comparing imputation methods
  • Estimated difference between imputed coping score
    and missing coping score
  • Simulated dataset with relationship between
    disease-free survival and coping score
  • Conclusions

4
IBCSG Trial VI/VII
  • Between July 1986 and April 1993, 1475 pre- and
    perimenopausal patients were randomized to Trial
    VI
  • Between March 1990 and April 1993, 1212
    postmenopausal patients were randomized to Trial
    VII
  • Patients followed up every 3 months for 2 years

5
Quality of Life Assessment
  • To assess coping / adjustment to disease, a
    simple self-assessment scale was used
  • Each patient was asked to rate the amount of
    adjustment needed to cope with her illness on a
    scale from 0 to 100 by marking a linear
    analogue line
  • High numbers reflect worse quality of life

6
Quality of Life Objectives
  • The objectives of assessing quality of life were:
  • to evaluate the hypothesis that the level of early
    coping / well-being of the patient can be used as
    a prognostic factor of outcome
  • to investigate if the level of early coping /
    well-being of the patient changes during the
    study (not discussed here)

7
Missing Data Pattern
                       Baseline   6 months   12 months   24 months
  Observed                 2231       1870        1812        1501
  Missing                   456        751         662         666
  Post-recurrence             0         57         173         340
  Lost to follow-up           0          1           3           5
  Dead                        0          8          37         175
  Total                    2687       2687        2687        2687
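
As a quick check on the table above, the share of missing assessments can be computed at each time point. This is a minimal sketch: the counts are copied from the slide, while the choice of denominator (observed plus missing, treating post-recurrence, lost to follow-up and dead patients as no longer expected to complete the form) is an assumption made here, not something the slides state.

```python
# Counts copied from the missing data pattern table (slide 7).
TIME_POINTS = ["Baseline", "6 months", "12 months", "24 months"]
COUNTS = {
    "Observed": [2231, 1870, 1812, 1501],
    "Missing": [456, 751, 662, 666],
    "Post-recurrence": [0, 57, 173, 340],
    "Lost to follow-up": [0, 1, 3, 5],
    "Dead": [0, 8, 37, 175],
}
TOTAL = 2687


def missing_share_among_expected(counts):
    """Fraction missing among assessments still expected (observed + missing).

    Assumption: patients who recurred, died or were lost are not 'expected'.
    """
    return [m / (o + m) for o, m in zip(counts["Observed"], counts["Missing"])]


# Each column of the table should add up to the number of randomized patients.
column_sums = [sum(column) for column in zip(*COUNTS.values())]
shares = missing_share_among_expected(COUNTS)

for label, share in zip(TIME_POINTS, shares):
    print(f"{label}: {share:.1%} of expected assessments missing")
```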

8
Imputing Coping Scores
  • The aim of the analysis is to investigate the
    relationship between quality of life and
    disease-free survival (DFS)
  • Considered coping score from quality of life
    assessments up to 2 years (24 months) after
    randomization and before recurrence
  • Only 585 out of 2687 patients have all 9 quality
    of life assessments up to 2 years before
    recurrence
  • Imputation of the quality of life scores is
    therefore considered

9
Time Dependent Cox Models
  • Coping score as covariate in a time-dependent Cox
    model for DFS
  • For a time-dependent Cox model for DFS, the length
    of time a patient spends in each time period, and
    whether a DFS event occurred in that time
    period, are calculated
  • The coping score changes with time and the coping
    score during each period is a time dependent
    covariate
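
The data layout this implies can be sketched as follows: each patient's follow-up is split into one row per time period, carrying the coping score that applies in that period and a flag for whether the DFS event fell in it. The function name, the tuple layout and the rule that the last available score applies until the end of follow-up are illustrative assumptions, not details given in the slides.

```python
def to_start_stop_rows(assessments, follow_up, event):
    """Split one patient's follow-up into (start, stop, event, score) rows.

    assessments: list of (month, coping_score), sorted by month, first at 0.
    follow_up:   months from randomization to DFS event or censoring.
    event:       True if follow-up ended with a DFS event.
    """
    rows = []
    for i, (start, score) in enumerate(assessments):
        if start >= follow_up:
            break  # assessments at or after the end of follow-up add no time
        has_next = i + 1 < len(assessments)
        next_start = assessments[i + 1][0] if has_next else follow_up
        stop = min(next_start, follow_up)
        # The event flag is set only on the period in which follow-up ends.
        rows.append((start, stop, bool(event and stop == follow_up), score))
    return rows


# Hypothetical patient: scores at 0, 3 and 6 months, DFS event at 8 months.
example_rows = to_start_stop_rows([(0, 40), (3, 55), (6, 30)], 8, True)
for row in example_rows:
    print(row)
```

Rows in this (start, stop, event, covariate) form are what time-dependent Cox routines in most statistical packages expect as input.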

10
Common Imputation Methods
  • Last Observation Carried Forward (LOCF)
  • Median imputation
  • Bootstrapping
  • Linear regression
  • Predicted mean matching
  • Pattern mixture models
  • Nearest neighbour imputation
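
The two simplest methods in this list can be sketched in a few lines. This is an illustrative sketch only: a patient's coping scores are assumed to be a list ordered by assessment, with `None` marking a missing value, and the function names are hypothetical.

```python
from statistics import median


def locf(scores):
    """Last observation carried forward; None marks a missing score."""
    filled, last = [], None
    for score in scores:
        if score is not None:
            last = score
        filled.append(last)  # stays None until a first observed score exists
    return filled


def patient_median(scores):
    """Replace each missing score with the median of the patient's own scores."""
    observed = [s for s in scores if s is not None]
    if not observed:
        return list(scores)  # nothing to impute from
    fill = median(observed)
    return [fill if s is None else s for s in scores]


print(locf([20, None, 35, None, None]))
print(patient_median([20, None, 40]))
```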

11
Bootstrapping
  • Replace a missing coping score with an observed
    coping score selected at random from the observed
    coping scores of patients in the same subgroup
  • Subgroups defined by baseline coping score and
    then previous observed or imputed coping score
  • Bootstrap procedure run 150 times
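
A minimal sketch of this random-draw procedure, under stated assumptions: the slides do not say how the subgroups were cut, so the 25-point bands in `subgroup` are hypothetical, as are the function and argument names. The grouping score can be the baseline score or the previous (observed or imputed) score, matching the two variants described above.

```python
import random


def subgroup(score, width=25):
    """Hypothetical subgroup rule: bands of `width` points on the 0-100 scale."""
    return min(int(score // width), 100 // width - 1)


def draw_imputations(grouping_scores, current_scores, n_runs=150, seed=0):
    """Impute missing current scores by random draws from same-subgroup donors."""
    rng = random.Random(seed)
    donors = {}
    for g, c in zip(grouping_scores, current_scores):
        if c is not None:
            donors.setdefault(subgroup(g), []).append(c)
    runs = []
    for _ in range(n_runs):
        run = []
        for g, c in zip(grouping_scores, current_scores):
            if c is None:
                pool = donors.get(subgroup(g))
                c = rng.choice(pool) if pool else None  # no donor: stay missing
            run.append(c)
        runs.append(run)
    return runs


# Hypothetical data: four patients, two with a missing current score.
runs = draw_imputations([10, 12, 80, 85], [30, None, 60, None])
print(len(runs), runs[0])
```

Each of the 150 completed datasets would then be analysed separately and the results averaged, as in the "Mean of 150 Simulations" tables.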

12
Linear Regression with Concurrent Variables
  • Concurrent variables considered in a linear
    regression model for coping score included:
  • UICC Performance Status
  • Menstrual status
  • Severity of adverse events (nausea and vomiting,
    diarrhoea, stomatitis / mucous membrane)

13
Hazard Ratios for Square Root of Coping Score
(S_Pacis)
  • The hazard ratios for the square root of the
    coping score (S_Pacis) from a time-dependent Cox
    model stratified by trial are presented
  • The hazard ratios for S_Pacis from a
    time-dependent Cox model stratified by trial with
    re-introduction of chemotherapy as an explanatory
    variable are similar and are not presented
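
One way to write down the model behind these hazard ratios, as a sketch only: the symbols below (k for the trial stratum, c_i(t) for patient i's coping score applying at time t, β for the log hazard ratio) are notation assumed here rather than taken from the slides, and any further covariates in the fitted model are omitted.

```latex
h_{k}\bigl(t \mid c_i(t)\bigr) = h_{0k}(t)\,\exp\!\bigl(\beta \sqrt{c_i(t)}\bigr),
\qquad k \in \{\text{VI}, \text{VII}\},
\qquad \mathrm{HR} = e^{\beta}
```

Under this form, HR is the multiplicative change in the hazard of a DFS event per one-unit increase in S_Pacis, so a value close to 1 with a confidence interval spanning 1 indicates no detectable association.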

14
Results from Time-Dependent Cox Model
(Stratified by Trial)
  Method                                      Hazard ratio   95% CI for HR
  LOCF                                        0.993          (0.972, 1.015)
  Median imputation
    Median of patient's observed scores       1.003          (0.983, 1.025)
    Median of time period                     0.994          (0.970, 1.018)
    Median of treatment arm by time period    0.991          (0.968, 1.015)
  Linear regression
    Using previous coping scores              0.993          (0.970, 1.018)
    Using concurrent variables                1.008          (0.986, 1.030)

15
Results from Time-Dependent Cox Model (Stratified by
Trial): Mean of 150 Simulations
  Method                                      Hazard ratio   95% CI for HR
  Bootstrapping, subgroups defined by
    Baseline coping score                     0.992          (0.970, 1.015)
    Previous coping score                     0.994          (0.972, 1.016)
  Predicted mean matching                     0.995          (0.972, 1.017)
  Pattern mixture models                      0.993          (0.971, 1.015)
  Nearest neighbour imputation                0.996          (0.974, 1.018)

16
Comparing Imputation Methods
  • Patients with a complete history of observed
    coping scores were identified and some values
    were removed to imitate the missing data pattern
    in the full data
  • 150 simulated datasets with artificially removed
    coping scores were generated
  • The differences between the imputed coping score
    and the real coping score artificially removed
    were calculated

17
Comparing Imputation Methods
  • From the differences between the imputed coping
    score and the real coping score artificially
    removed, the mean and standard deviation of the
    difference between the imputed coping score and
    the missing coping score were estimated for each
    imputation method
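
The evaluation described on these two slides can be sketched as below. Two simplifications are assumptions made here: scores are removed at random with a fixed probability rather than by imitating the observed missing data pattern, and LOCF stands in for whichever imputation method is being assessed. Function names are hypothetical.

```python
import random
from statistics import mean, stdev


def mask_scores(history, p_missing, rng):
    """Remove post-baseline scores at random; the baseline score is kept."""
    return [s if i == 0 or rng.random() >= p_missing else None
            for i, s in enumerate(history)]


def locf(scores):
    """Last observation carried forward; None marks a missing score."""
    filled, last = [], None
    for score in scores:
        if score is not None:
            last = score
        filled.append(last)
    return filled


def difference_summary(histories, p_missing=0.3, seed=0):
    """Mean and SD of (imputed - real) over the artificially removed scores."""
    rng = random.Random(seed)
    diffs = []
    for history in histories:
        masked = mask_scores(history, p_missing, rng)
        imputed = locf(masked)
        diffs += [imp - real
                  for real, m, imp in zip(history, masked, imputed)
                  if m is None]
    return mean(diffs), stdev(diffs)


# Hypothetical complete history; p_missing=1.0 removes every later score.
mean_diff, sd_diff = difference_summary([[10, 20, 30]], p_missing=1.0)
print(mean_diff, sd_diff)
```

A mean difference near zero indicates little systematic bias in the imputed values, while the standard deviation reflects how precisely individual missing scores are recovered.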

18
Estimated Difference Between Imputed Coping Score
and Missing Coping Score
  Method                                      Mean         Standard deviation
                                              difference   of difference
  LOCF                                        -0.72        20.64
  Median imputation
    Median of patient's observed scores        2.10        18.02
    Median of time period                     11.24        25.18
    Median of treatment arm by time period    10.21        25.01
  Linear regression
    Using previous coping scores               5.36        18.37
    Using concurrent variables                 9.40        25.62

19
Estimated Difference Between Imputed Coping Score
and Missing Coping Score: Mean of 150 Simulations,
First Artificial Dataset
  Method                                      Mean         Standard deviation
                                              difference   of difference
  Bootstrapping, subgroups defined by
    Baseline coping score                      3.17        30.58
    Previous coping score                      2.54        27.41
  Predicted mean matching                      2.07        25.36
  Pattern mixture models                      -0.76        20.88
  Nearest neighbour imputation                 2.94        30.25

20
Simulated Dataset with Relationship Between DFS
and Coping Score
  • A simulated dataset was created where good
    quality of life is associated with longer DFS
  • The simulated coping scores for the observed
    coping scores were selected at random from a
    range of possible values
  • For each coping score assessment expected, a
    simulated coping score was generated

21
Simulated Dataset with Relationship Between DFS
and Coping Score
  • The same missing data pattern was used as the
    original IBCSG data
  • There is no relationship between time period and
    coping score
  • As expected, the square root of the coping score
    (S_Pacis) was a significant parameter for all
    common imputation methods

22
Results from Time-Dependent Cox Model (Stratified
by Trial)
  Method                                      Hazard ratio   95% CI for HR
  Full simulated data                         3.377          (3.216, 3.546)
  LOCF                                        3.380          (3.205, 3.564)
  Median imputation
    Median of patient's observed scores       3.398          (3.232, 3.573)
    Median of time period                     2.388          (2.295, 2.485)
    Median of treatment arm by time period    2.395          (2.302, 2.493)
  Linear regression
    Using previous coping scores              3.462          (3.281, 3.654)
    Using concurrent variables                3.048          (2.903, 3.200)

23
Results from Time-Dependent Cox Model (Stratified by
Trial): Mean of 150 Iterations
  Method                                      Hazard ratio   95% CI for HR
  Bootstrapping, subgroups defined by
    Baseline coping score                     3.125          (2.970, 3.288)
    Previous coping score                     2.722          (2.602, 2.847)
  Predicted mean matching                     3.068          (2.920, 3.225)
  Pattern mixture models                      3.207          (3.048, 3.374)
  Nearest neighbour imputation                3.207          (3.046, 3.376)
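
To see how far each method moves the estimate in the simulated data, the hazard ratios reported on the two simulated-data results slides can be compared with the full-data value. The numbers below are copied from those slides. Expressing the gap as a relative difference on the hazard ratio scale is a choice made here for illustration, not a calculation from the presentation.

```python
FULL_DATA_HR = 3.377  # hazard ratio from the full simulated data

# Hazard ratios after imputation, copied from the simulated-data results.
IMPUTED_HR = {
    "LOCF": 3.380,
    "Median of patient's observed scores": 3.398,
    "Median of time period": 2.388,
    "Median of treatment arm by time period": 2.395,
    "Regression on previous coping scores": 3.462,
    "Regression on concurrent variables": 3.048,
    "Bootstrapping by baseline coping score": 3.125,
    "Bootstrapping by previous coping score": 2.722,
    "Predicted mean matching": 3.068,
    "Pattern mixture models": 3.207,
    "Nearest neighbour imputation": 3.207,
}


def relative_difference(hr, reference=FULL_DATA_HR):
    """Signed difference from the full-data hazard ratio, as a fraction."""
    return (hr - reference) / reference


# Order the methods from closest to furthest from the full-data estimate.
ranked = sorted(IMPUTED_HR.items(),
                key=lambda item: abs(relative_difference(item[1])))

for method, hr in ranked:
    print(f"{method:42s} {hr:.3f} {relative_difference(hr):+.1%}")
```

Negative values correspond to methods that underestimate the hazard ratio relative to the full simulated data.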

24
Conclusions: Comparing Imputed and Missing Scores
  • In the IBCSG data, there is large within-patient
    and between-patient variability
  • For all common methods the standard deviation of
    the difference between the imputed coping score
    and the missing coping score was high, indicating
    a lack of precision in predicting the missing
    coping score
  • The estimated standard deviations of the
    difference between the imputed coping score and
    the missing coping score were similar for the
    simple imputation methods and the multiple
    imputation methods

25
Conclusions: Hazard Ratios in Cox Model
  • No evidence of a relationship between quality of
    life and disease-free survival in the IBCSG data
  • The multiple imputation methods showed hazard
    ratios which were similar for each application
  • In the IBCSG data, when imputing the explanatory
    variable in a time-dependent Cox regression, the
    imputation method had no appreciable effect on
    the hazard ratio
  • In the simulated data set with a strong
    relationship between coping score and DFS but the
    same missing data pattern as in the IBCSG data,
    there is evidence that some imputation methods
    give biased estimates (underestimates) of the
    hazard ratio
  • Among common methods, LOCF worked well in
    imputing both the IBCSG data and the simulated
    data set with a strong relationship between
    coping score and DFS