Title: Evaluating the ROC performance of markers for future events
1Evaluating the ROC performance of markers for future events
- Yuying Jin, Yingye Zheng, Margaret Pepe
- Department of Biostatistics, University of Washington
- Fred Hutchinson Cancer Research Center
- Aug 2nd, 2007
2Application I
- Acute Kidney Injury (AKI)
- Event: Patients undergoing major cardiac surgery are at high risk of kidney damage. An AKI event occurs when the serum creatinine level increases by 25% and the increase is sustained for 24 hours.
- Outcome: time from surgery to AKI event.
- Marker: Two new markers measured in urine at baseline and at various intervals after surgery are under investigation.
- Goal: Determine how many AKI patients can be diagnosed in advance with the new markers, by how long, and at what cost in terms of false diagnoses.
- Issues: The study involves competing risks due to non-AKI deaths, but no censoring.
3Application II
- Seattle Heart Failure Study
- Event: More than 5 million people in the United States have heart failure. The event in this study is defined as death caused by heart failure.
- Outcome: time from entry to death.
- Marker: Linear combination of predictors derived from a cohort (SHF score).
- Goal: Evaluate how well the SHF score discriminates between people who die within the first 2 years and those who do not.
- Issues: This study includes censoring, but no competing risk events.
4Overview
- Definition of ROC for Event Time Outcomes
- Estimation Approaches
- Retrospective Methods
- Model marker distribution conditioning on outcome.
- Prospective Methods
- Model event time conditioning on marker.
- Data Analysis
- Seattle Heart Failure
- Kidney Biomarker Study
5(TPF, FPF) for Binary Marker
- Binary marker Y measured at baseline.
- $\mathrm{TPF}_t = \text{sensitivity}(t) = P(Y \text{ positive} \mid \text{event at } t)$
- $\mathrm{FPF} = 1 - \text{specificity} = P(Y \text{ positive} \mid \text{controls})$
- Natural controls exist for some studies, e.g. the AKI study.
- If there are no natural controls, controls can be defined by outcome time $T > t^*$.
- $t^*$ is a large landmark time point used to define controls.
- $\mathrm{FPF} = 1 - \text{specificity} = P(Y \text{ positive} \mid T > t^*)$
- Focus on the incident True Positive Fraction and the static False Positive Fraction defined in Heagerty and Zheng (2005).
6ROC curve for Continuous Marker
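- Continuous marker Y measured at baseline.
- With a specific threshold rule $Y > c$,
- $\mathrm{TPF}_t(c) = \text{sensitivity}(c, t) = P(Y > c \mid \text{event at } t)$,
- $\mathrm{FPF}(c) = 1 - \text{specificity}(c) = P(Y > c \mid \text{controls})$.
- $\mathrm{TPF}_t$ is a decreasing function of t.
- $\mathrm{ROC}_t$ is the plot of $\mathrm{TPF}_t(c)$ versus $\mathrm{FPF}(c)$ for $c \in (-\infty, \infty)$.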
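The definitions on this slide can be sketched empirically. The Python below is a minimal illustration, not the authors' code: it assumes no censoring and assumes the marker values have already been split into cases (events at or near t) and controls. The function name `empirical_roc` and the toy values are hypothetical.

```python
import numpy as np

def empirical_roc(y_cases, y_controls):
    """Empirical (FPF(c), TPF(c)) pairs over all observed thresholds c."""
    y_cases = np.asarray(y_cases, dtype=float)
    y_controls = np.asarray(y_controls, dtype=float)
    # Candidate thresholds: every observed marker value, plus -inf so the
    # curve includes the point (1, 1).
    observed = np.unique(np.concatenate((y_cases, y_controls)))
    thresholds = np.concatenate(([-np.inf], observed))
    # Rule "Y > c": fraction of cases / controls called positive at each c.
    tpf = np.array([(y_cases > c).mean() for c in thresholds])
    fpf = np.array([(y_controls > c).mean() for c in thresholds])
    return fpf, tpf

# Toy data (made up): marker values in cases and in controls.
fpf, tpf = empirical_roc([2.1, 3.0, 3.4, 4.2], [0.5, 1.2, 2.5, 1.9, 3.1])
```

Plotting `tpf` against `fpf` traces the empirical ROC curve from (1, 1) down to (0, 0) as the threshold increases.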
7Retrospective Estimation Methods
- Leisenring et al. (1997) proposed simple binary regression for a binary marker with no censoring.
- Etzioni et al. (1999) extended the binary regression approach to a continuous marker, again in the absence of censoring.
- Cai et al. (2006) offer a comprehensive approach, encompassing previous methods and extending them to censored failure time data.
8Leisenring and Etzioni's Methods
- For a binary marker without censoring, Leisenring et al. estimate FPF using controls and use binary regression to estimate $\mathrm{TPF}(t) = g\{\alpha^{\top}\eta(t)\}$ from cases, where $g^{-1}$ is a link function and $\eta(t)$ is a set of polynomial functions of t.
- For a continuous marker, in the absence of censoring, Etzioni et al. use ROC-GLM to model $\mathrm{ROC}_t(f) = g\{h(f) + \beta t\}$,
- where $g^{-1}$ is a link function, $g(h(f))$ is the baseline ROC curve at $t = 0$ and $f = \mathrm{FPF}$.
- ROC-GLM is a general regression method for modeling ROC curves directly (Cai and Pepe (2002), JASA; Pepe (1997), Biometrika).
9Cai's Method
- Extended to censored data.
- A censored subject is treated as
- a control when the censoring time $X > t^*$, where $t^*$ is the landmark time that defines controls;
- a weighted average of a case and a control when $X < t^*$.
- Weights are determined by estimating the distribution of T using standard failure time methods.
- For a continuous biomarker with censoring, Cai et al. replace each marker value with a series of binary records, corresponding to a series of thresholds $c_1, \ldots, c_p$, and adopt the ROC-GLM model.
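The step of replacing each continuous marker value with a series of binary records can be sketched as follows. This is only the data expansion, written under the assumption that the thresholds are supplied by the analyst; it does not fit the ROC-GLM model or compute the censoring weights. The function name `expand_to_binary_records` is hypothetical.

```python
import numpy as np

def expand_to_binary_records(y, thresholds):
    """Return an (n * p, 3) array with rows (subject index, c_k, I(Y > c_k))."""
    y = np.asarray(y, dtype=float)
    thresholds = np.asarray(thresholds, dtype=float)
    # One block of p rows per subject, one row per threshold within a block.
    subject = np.repeat(np.arange(len(y)), len(thresholds))
    c = np.tile(thresholds, len(y))
    indicator = (np.repeat(y, len(thresholds)) > c).astype(float)
    return np.column_stack((subject, c, indicator))

# Two subjects, three thresholds (toy values).
records = expand_to_binary_records([0.7, 2.4], thresholds=[0.5, 1.0, 2.0])
```

Each subject contributes one binary record per threshold, so the records within a subject are correlated and any downstream regression has to account for that clustering.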
10Prospective Estimation Methods
- Prospective methods combine risk regression techniques with observed predictor distributions to calculate TPF and FPF.
- Heagerty and Zheng (2005) and Song and Zhou (in press) employ a Cox model for a baseline marker.
- Censoring is naturally incorporated.
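11Heagerty and Zheng's Method
- Cox model for a baseline marker Y: $\lambda(t \mid Y) = \lambda_0(t)\exp(\beta Y)$.
- For a binary marker, denote the risk set at t by $R(t)$; then
- $\widehat{\mathrm{TPF}}(t) = \sum_{i \in R(t)} Y_i \exp(\hat\beta Y_i) \big/ \sum_{i \in R(t)} \exp(\hat\beta Y_i)$
- $\widehat{\mathrm{TPF}}(t)$ is a consistent estimate of $\mathrm{TPF}(t)$, which follows from Xu and O'Quigley (2000).
- $\widehat{\mathrm{FPF}}$ is the empirical estimate.
- With continuous biomarkers, $\hat F$ is the empirical distribution of Y in controls,
- $\widehat{\mathrm{ROC}}_t(f) = \sum_{i \in R(t)} I(Y_i > c) \exp(\hat\beta Y_i) \big/ \sum_{i \in R(t)} \exp(\hat\beta Y_i)$,
- where $c = \hat F^{-1}(1 - f)$, generalizes $\widehat{\mathrm{TPF}}(t)$.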
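The risk-set weighting behind this kind of estimator can be sketched numerically. The Python below is a hedged illustration in the spirit of the Xu and O'Quigley (2000) result, not the authors' implementation: it assumes a single Cox coefficient `beta` that has already been estimated elsewhere, and the function name `incident_tpf` is hypothetical.

```python
import numpy as np

def incident_tpf(y, x, t, c, beta):
    """Risk-set weighted estimate of P(Y > c | T = t) under a Cox model.

    y: marker values; x: observed follow-up times (event or censoring);
    beta: Cox regression coefficient for Y (assumed supplied).
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    at_risk = x >= t                   # risk set R(t)
    w = np.exp(beta * y[at_risk])      # relative hazards exp(beta * Y_i)
    return float(np.sum(w * (y[at_risk] > c)) / np.sum(w))

# Toy data: four subjects, three of them still at risk at t = 2.
y = [1.0, 2.0, 3.0, 4.0]
x = [1.0, 5.0, 6.0, 7.0]
tpf_null = incident_tpf(y, x, t=2.0, c=2.5, beta=0.0)
```

With `beta = 0` every subject at risk gets equal weight, so the estimate reduces to the plain proportion of the risk set with Y above the threshold.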
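12Song and Zhou's Method for binary marker
- Use Bayes' theorem to write TPF(t) and FPF, and estimate TPF(t) and FPF using estimates from the Cox model.
- For a binary marker, with $f(t \mid y)$ and $S(t \mid y)$ the conditional density and survival function of T given $Y = y$,
- $\mathrm{TPF}(t) = P(Y = 1 \mid T = t) = \dfrac{f(t \mid 1) P(Y = 1)}{f(t \mid 1) P(Y = 1) + f(t \mid 0) P(Y = 0)}$
- $\mathrm{FPF} = P(Y = 1 \mid T > t^*) = \dfrac{S(t^* \mid 1) P(Y = 1)}{S(t^* \mid 1) P(Y = 1) + S(t^* \mid 0) P(Y = 0)}$, where $t^*$ is the landmark time that defines controls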
- When
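13Song and Zhou's Method for continuous marker
- With a continuous marker, integrals over the distribution of Y, F, substitute for the sums over the two marker values, with the conditional quantities estimated from the Cox model and F estimated by the empirical distribution of Y.
- The ROC curve estimator is $\widehat{\mathrm{ROC}}_t(f) = \widehat{\mathrm{TPF}}_t\{\widehat{\mathrm{FPF}}^{-1}(f)\}$.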
14Heagerty and Zheng's versus Song and Zhou's
- Advantages of Song and Zhou's method:
- More efficient; uses maximum partial likelihood estimators.
- Allows censoring to depend on Y.
- Advantage of Heagerty and Zheng's method:
- Allows estimation under non-proportional hazards.
- The current Song and Zhou method is valid only when the proportional hazards assumption is satisfied, but it can be extended to the non-proportional hazards situation with further work.
15Comparisons of Retrospective and Prospective methods
- All are semiparametric methods.
- Censoring that depends on Y is only accommodated by Song and Zhou's method.
- Prospective methods can only accommodate study designs from which the hazard function and the population distribution of the predictor can be calculated; e.g. a simple case-control design cannot be accommodated by prospective methods.
- Retrospective methods can include disease-specific covariates, e.g. severity of kidney injury.
16Data Analysis I
- Seattle Heart Failure
- A random sample of 1000 observations from the Val-HeFT trial
- Controls are subjects alive at 2 years
- There are 165 deaths, 375 observations censored before 2 years and 460 subjects remaining alive at 2 years.
- Crude ROC curve
- Cases are the subjects who have events in the interval $(t - \delta, t + \delta)$.
- Controls are the subjects whose outcome times are after the landmark time $t^*$ (here 2 years).
- Censored observations are excluded.
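The crude ROC construction described on this slide can be sketched as below. This is a minimal illustration under stated assumptions, not the analysis code used for the study: the window half-width `delta`, the landmark `t_star`, the function names, and the toy data are all hypothetical, and controls are taken to be subjects whose follow-up extends beyond `t_star`.

```python
import numpy as np

def crude_case_control_split(y, x, event, t, delta, t_star):
    """Split marker values into crude cases and controls.

    Cases: observed events with time in (t - delta, t + delta).
    Controls: subjects still under follow-up (event-free) beyond t_star.
    Everyone else, including subjects censored before t_star, is dropped.
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    event = np.asarray(event, dtype=bool)
    is_case = event & (x > t - delta) & (x < t + delta)
    is_control = x > t_star
    return y[is_case], y[is_control]

def tpf_at_fpf(y_cases, y_controls, f):
    """Crude ROC_t(f): TPF at the control quantile giving FPF of about f."""
    c = np.quantile(y_controls, 1.0 - f)   # c = F^{-1}(1 - f) in controls
    return float(np.mean(y_cases > c))

# Toy data: marker, follow-up time in years, event indicator.
y = [5.0, 1.0, 4.0, 2.0, 3.0, 0.5]
x = [1.0, 3.0, 0.9, 2.5, 0.5, 1.1]
event = [1, 0, 1, 1, 0, 0]
cases, controls = crude_case_control_split(y, x, event, t=1.0, delta=0.25, t_star=2.0)
roc_02 = tpf_at_fpf(cases, controls, f=0.2)
```

Because censored subjects and events outside the window are discarded, this estimator uses only part of the data, which is consistent with it being the most variable of the methods compared.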
17Seattle Heart Failure Study
Figure 1: SHF scores measured at enrollment in cases. A box-plot of the SHF score distribution in known controls.
18Seattle Heart Failure Study cont.
Figure 2: ROC curves using 4 methods: (a) categorizing T and comparing with known controls only; (b) Cai's retrospective method; (c) Heagerty and Zheng with a proportional hazards model; and (d) Song and Zhou with a proportional hazards model.
19Seattle Heart Failure Study cont.
Table 1: Comparison of estimated ROC curves at f = 0.2 and f = 0.8. 95% confidence intervals in parentheses are based on the same 200 bootstrapped samples.
- Conclusion
- Crude ROC curves have the largest variance.
- Prospective methods have narrower confidence intervals than Cai's method.
- Among prospective methods, Song and Zhou's method is more efficient than Heagerty and Zheng's.
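Bootstrap intervals of the kind reported with Table 1 could be produced along the following lines. The slides do not say which bootstrap interval was used, so the percentile interval here is an assumption, as are the function name `bootstrap_ci` and the toy statistic. Fixing the seed is one simple way to evaluate several estimators on the same 200 resamples.

```python
import numpy as np

def bootstrap_ci(data, statistic, n_boot=200, level=0.95, seed=0):
    """Percentile bootstrap interval for statistic(data), resampling rows."""
    data = np.asarray(data)
    rng = np.random.default_rng(seed)
    n = len(data)
    stats = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)   # draw subjects with replacement
        stats[b] = statistic(data[idx])
    alpha = 1.0 - level
    lower, upper = np.quantile(stats, [alpha / 2, 1.0 - alpha / 2])
    return float(lower), float(upper)

# Toy example: interval for a sample mean. In practice `statistic` would
# return an ROC estimate at a fixed f computed from the resampled subjects.
data = np.arange(10.0)
interval = bootstrap_ci(data, np.mean)
```

Calling the function with the same `seed` for each estimation method reuses identical resamples, so differences in interval width reflect the methods rather than resampling noise.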
20Data Analysis II
- Kidney Biomarker Study
- Simulated data that approximate the study design
- There are 1800 subjects in the study. The 1440 patients who have no kidney injury are treated as controls. There are 136 severe AKI events, 206 mild AKI events, and 18 patients who died of non-kidney-related causes.
- True ROC curve
- This is a simulated study. With an extremely large data set, we can approximate the true ROC curve.
21Kidney Biomarkers Study
Figure 3: Baseline AKI (Acute Kidney Injury) biomarker distributions. Lowess curves for biomarkers in severe and mild AKI subgroups are shown.
22Kidney Biomarkers Study cont.
Figure 4a: True and crude ROC curves for the baseline AKI biomarker at T = 1 and 2 days after surgery. Solid line is T = 1 and dashed line is T = 2.
23Kidney Biomarkers Study cont.
Figure 4b: Estimated ROC curves using Cai's and Song and Zhou's methods for the baseline AKI biomarker at T = 1 and 2 days after surgery.
24Kidney Biomarkers Study cont.
- For the baseline AKI marker, the ROC curve estimated by Cai's method follows the true ROC curve well.
- Song and Zhou's estimates are not as close to the true ROC curves as those from Cai's method, possibly because the proportional hazards assumption is violated.
25Conclusion
- We summarize the current methods of ROC analysis for censored failure time data. In our opinion a retrospective analysis is more natural and direct.
- The focus of the paper is only ROC curves. Various methods exist for estimating AUC (area under the ROC curve).
- Slides available at http://www.fhcrc.org/science/labs/pepe/dabs/
26Discussion
- The ROC curve for event time outcomes can be extended to longitudinal markers.
- Other definitions of True Positive Fraction and False Positive Fraction are not covered (Heagerty and Zheng (2005), Biometrics).
- Questions?
27References
- Cai T, Pepe MS (2002) Semi-parametric ROC analysis to evaluate biomarkers for disease. Journal of the American Statistical Association 97:1099-1107.
- Etzioni R, Pepe M, Longton G, Hu C, Goodman G (1999) Incorporating the time dimension in receiver operating characteristic curves: a case study of prostate cancer. Medical Decision Making 19:242-251.
- Heagerty PJ, Zheng Y (2005) Survival model predictive accuracy and ROC curves. Biometrics 61:92-105.
- Heagerty PJ, Lumley T, Pepe MS (2000) Time-dependent ROC curves for censored survival data and a diagnostic marker. Biometrics 56:337-344.
28References
- Pepe MS (2003) The Statistical Evaluation of Medical Tests for Classification and Prediction. Oxford University Press, New York.
- Song X, Zhou XH (in press) A semiparametric approach for the covariate specific ROC curve with survival outcome. Statistica Sinica.
- Xu R, O'Quigley J (2000) Proportional hazards estimate of the conditional survival function. Journal of the Royal Statistical Society Series B 62:667-680.
- Zheng Y, Heagerty PJ (2004) Semiparametric estimation of time-dependent ROC curves for longitudinal marker data. Biostatistics 5:615-632.
29ROC curve for longitudinal marker
- Longitudinal marker Y is binary and measured at time s.
- Reset the clock to zero at measurement time s for the clustered observations Y(s).
- $\mathrm{TPF}(s, t) = \mathrm{Prob}(Y(s) = 1 \mid T = s + t)$
- Sensitivity of the marker depends on event time and measurement time.
- $\mathrm{FPF}(s) = \mathrm{Prob}(Y(s) = 1 \mid T > s + t^*)$, with $t^*$ a fixed landmark time
- Time-dependent ROC
- $\mathrm{ROC}_{t,s}(f) = \mathrm{Prob}(Y(s) > c(s) \mid T = s + t)$ where $c(s) = F^{-1}(1 - f)$
- F denotes the cdf of Y(s) in the control group.
- $\mathrm{ROC}_{t,s}(f)$ is the $\mathrm{TPF}(s, t)$ corresponding to $\mathrm{FPF}(s) = f$.
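The time-dependent ROC value at a given false positive fraction can be sketched empirically. The Python below is an illustration under simplifying assumptions: no censoring, a known control group, one marker measurement per subject at time s, and cases selected by an exact time lag T - s = t. The function name `longitudinal_roc_at` and the toy data are hypothetical.

```python
import numpy as np

def longitudinal_roc_at(y_s, lag, is_control, t, f, tol=1e-9):
    """Empirical ROC value at FPF f for marker values y_s measured at time s.

    lag: T - s for each subject (ignored for controls).
    is_control: boolean mask identifying the control group.
    """
    y_s = np.asarray(y_s, dtype=float)
    lag = np.asarray(lag, dtype=float)
    is_control = np.asarray(is_control, dtype=bool)
    # Threshold c(s) = F^{-1}(1 - f), with F the marker cdf in controls.
    c = np.quantile(y_s[is_control], 1.0 - f)
    # Cases whose event occurs exactly t after the measurement time.
    is_case = ~is_control & (np.abs(lag - t) <= tol)
    return float(np.mean(y_s[is_case] > c))

# Toy data: three cases (lags 1, 1, 2) and three controls.
y_s = [3.0, 0.5, 2.0, 1.0, 1.5, 0.2]
lag = [1.0, 1.0, 2.0, 5.0, 5.0, 5.0]
is_control = [False, False, False, True, True, True]
roc_value = longitudinal_roc_at(y_s, lag, is_control, t=1.0, f=0.5)
```

With repeated measurements per subject, the same calculation would be applied to each (s, t) combination, and the observations from one subject would be clustered.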
30Kidney Biomarkers Study (longitudinal)
Figure 5: Biomarker distributions in cases as a function of the time lag between marker measurement and event time, t = T - s, and in controls.
31Kidney Biomarkers Study (longitudinal) cont.
Figure 6a: True and crude ROC curves for the longitudinally measured AKI biomarker measured at 1 and 2 days prior to clinical diagnosis of AKI with serum creatinine.
32Kidney Biomarkers Study (longitudinal) cont.
Figure 6b: Estimated ROC curves using Cai's and Song and Zhou's methods for the longitudinally measured AKI biomarker measured at 1 and 2 days prior to clinical diagnosis of AKI with serum creatinine.
33Kidney Biomarkers Study (longitudinal) cont.
- For the longitudinal AKI marker, the ROC curve from Cai's method is close to the crude ROC curve.
- Song and Zhou's method appears to underestimate the ROC curve, especially at smaller FPFs. Presumably the proportional hazards assumption fails.