Visual displays for the comparison of two survival functions

1
Visual displays for the comparison of two
survival functions
  • John Reynolds
  • Centre for Biostatistics and Clinical Trials
  • Peter MacCallum Cancer Centre
  • East Melbourne

2
How to display statistical uncertainty in
survival plots?
  • Plots should include some measure of statistical
    uncertainty, otherwise any visual signs of
    treatment differences might look more convincing
    than they really are. Either SEs or CIs should
    be displayed at regular time points, or an
    overall estimate of treatment difference (e.g.
    relative risk) with its 95% CI should be given.
  • Whether SEs or 95% CIs should be plotted is
    open to debate
  • Pocock, S.J., Clayton, T.C. and Altman, D.G.
    (2002) Survival plots of time-to-event outcomes
    in clinical trials: good practice and pitfalls.
    The Lancet 359: 1686-1689.

3
Example: Gastric cancer
  • Chemotherapy alone versus chemotherapy plus
    radiation
  • 45 patients in each arm
  • Primary endpoint: overall survival time (from
    registration to death from any cause)
  • Therneau, T. and Grambsch, P. (2000) Modeling
    survival data: Extending the Cox model.
    Springer-Verlag, New York. Chapter 6.

4
Gastric cancer - overall survival
5
Gastric cancer: treatment difference?
  • The log-rank (Mantel-Haenszel) test for the
    total curve comparison has a p-value of 0.251
  • The log-rank test is less powerful when the
    hazards aren't proportional (it is powerful when
    survival is exponential)
  • The Peto-Peto modification to the Gehan-Wilcoxon
    test has a p-value of 0.030
  • The G-W test is more powerful when survival is
    logistic and gives more weight to earlier
    survival experience
  • All this is well known; see, for example,
    Friedman, L.M., Furberg, C.D. and DeMets, D.L.
    (1998) Fundamentals of clinical trials (3rd Ed).
    Springer-Verlag, New York.
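
The contrast between the two tests comes down to how each event time is weighted. A minimal sketch of a two-group weighted log-rank statistic, assuming weights of 1 for the log-rank test and weights equal to the total number at risk for the Gehan-Wilcoxon test. The Peto-Peto modification replaces the number at risk with a survival-function estimate and is not implemented here. The function names and the toy interface are hypothetical, and this is not the code behind the p-values quoted above.

```python
import math

def weighted_logrank(times1, events1, times2, events2, weight="logrank"):
    """Chi-square statistic (1 df) of a weighted log-rank test for two groups.

    times*  : follow-up times; events* : 1 = event observed, 0 = censored.
    weight  : "logrank" (w = 1) or "gehan" (w = total number at risk).
    """
    event_times = sorted({t for t, e in zip(times1, events1) if e} |
                         {t for t, e in zip(times2, events2) if e})
    u = v = 0.0
    for t in event_times:
        n1 = sum(1 for x in times1 if x >= t)   # at risk in group 1
        n2 = sum(1 for x in times2 if x >= t)   # at risk in group 2
        d1 = sum(1 for x, e in zip(times1, events1) if x == t and e)
        d2 = sum(1 for x, e in zip(times2, events2) if x == t and e)
        n, d = n1 + n2, d1 + d2
        if n < 2:
            continue  # variance term is undefined with one subject at risk
        w = 1.0 if weight == "logrank" else float(n)
        u += w * (d1 - n1 * d / n)              # observed minus expected
        v += w * w * n1 * n2 * d * (n - d) / (n * n * (n - 1))
    return u * u / v  # assumes both groups contribute (v > 0)

def chi2_1df_pvalue(chi2):
    """Upper tail of a chi-square distribution with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))
```

Because the Gehan weights shrink as the risk set shrinks, early differences between the curves count for more, which is the behaviour described for the G-W test above.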

6
Gastric cancer: treatment difference?
  • The survival curves look to be different in the
    first 2 years
  • In this case, reporting the p-value from the
    log-rank test along with a graph of the two
    curves is probably not a good summary of the data

7
Question
  • Should we help viewers of such graphs make
    unplanned comparisons of aspects of the curves,
    such as:
  • Is there a significant difference in one-year
    survival? (A vertical comparison)
  • Is there a significant difference in median
    survival time? (A horizontal comparison)

8
Answer
  • I think the answer is yes if the purpose of the
    graph is to summarise the data in the study
    rather than to lend support to an outcome of a
    hypothesis test specified in a protocol
  • Exploration and summary of data versus
    confirmation and summary of a planned hypothesis
    test

9
General Approach
  • Uncertainty envelopes around the curves
  • Overlap indicates no significant difference of a
    pointwise test
  • Underlap indicates a significant difference of
    a pointwise test
  • ± SE, too anti-conservative (α ≈ 0.16)
  • ± 95% CI, too conservative (α ≈ 0.006)
  • ± LSD/2, just right? (α ≈ 0.05)
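
The three levels quoted above (roughly 0.16, 0.006 and 0.05) can be checked with a short calculation. This is a sketch assuming two independent estimates with equal standard errors and normal sampling distributions; the function name is hypothetical.

```python
import math

def implied_alpha(half_width_in_se):
    """Two-sided level of the rule 'declare a difference when the intervals
    estimate ± half_width * SE fail to overlap', for two independent
    estimates with equal SEs and normal sampling distributions."""
    # No overlap  <=>  |e1 - e2| > 2 * half_width * SE.
    # The SE of the difference is sqrt(2) * SE, so the critical z value is:
    z = 2.0 * half_width_in_se / math.sqrt(2.0)
    # Two-sided normal tail probability: P(|Z| > z) = erfc(z / sqrt(2)).
    return math.erfc(z / math.sqrt(2.0))

alpha_se = implied_alpha(1.0)                          # ± SE bars, roughly 0.16
alpha_ci = implied_alpha(1.96)                         # ± 95% CI bars, roughly 0.006
alpha_lsd = implied_alpha(1.96 * math.sqrt(2.0) / 2.0) # ± LSD/2 bars, roughly 0.05
```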

10
General Approach (contd)
  • Similar to the LSIs of Andrews, H.P., Snee, R.D.
    and Sarner, M.H. (1980) Graphical display of
    means. American Statistician 34: 195-199, and
    Hannah, M. and Quigley, P. (1996) Presentation of
    ordinal regression analysis on the ordinal scale.
    Biometrics 52: 771-775.
  • Except we don't have to worry about the
    approximation of k(k-1)/2 square roots of sums by
    sums of k square roots
  • Only need to worry about unequal SEs at each
    point
  • We plot estimate ± 1.96 × delta, where delta is
    related to the standard errors (SE) of the
    estimates as follows

11
Derivation of deltas
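
The derivation itself did not survive in this transcript. The following is a sketch of one way to obtain such deltas, assuming two independent estimates with standard errors s_1 and s_2. The symbols, and in particular the proportional choice in the last line, are assumptions rather than the content of the original slide.

```latex
\begin{align*}
\text{z-test significant at the 5\% level:}\quad
  & |\hat\theta_1 - \hat\theta_2| > 1.96\,\sqrt{s_1^2 + s_2^2} \\
\text{intervals } \hat\theta_i \pm 1.96\,\delta_i \text{ fail to overlap:}\quad
  & |\hat\theta_1 - \hat\theta_2| > 1.96\,(\delta_1 + \delta_2) \\
\text{the two rules agree when:}\quad
  & \delta_1 + \delta_2 = \sqrt{s_1^2 + s_2^2} \\
\text{equal SEs } (s_1 = s_2 = s):\quad
  & \delta = s/\sqrt{2}, \text{ so } 1.96\,\delta = \mathrm{LSD}/2 \\
\text{unequal SEs, one possible choice:}\quad
  & \delta_i = s_i\,\frac{\sqrt{s_1^2 + s_2^2}}{s_1 + s_2}, \qquad i = 1, 2
\end{align*}
```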
12
General Approach (contd)
  • The pointwise comparison of estimates via the
    overlap (and underlap) of these uncertainty
    intervals will behave like pairwise t-tests (or
    z-tests)
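
A minimal sketch of this equivalence, assuming deltas chosen proportional to each SE and scaled so that they sum to the SE of the difference. That particular choice of delta is an assumption made for illustration, and all function names are hypothetical.

```python
import math

def deltas(se1, se2):
    """Hypothetical choice of deltas: proportional to each SE and scaled so
    that delta1 + delta2 == sqrt(se1**2 + se2**2)."""
    scale = math.hypot(se1, se2) / (se1 + se2)
    return se1 * scale, se2 * scale

def intervals_underlap(est1, se1, est2, se2, z=1.96):
    """True when the intervals est_i ± z * delta_i do not overlap."""
    d1, d2 = deltas(se1, se2)
    lo1, hi1 = est1 - z * d1, est1 + z * d1
    lo2, hi2 = est2 - z * d2, est2 + z * d2
    return hi1 < lo2 or hi2 < lo1

def z_test_significant(est1, se1, est2, se2, z=1.96):
    """Ordinary two-sample z-test on the difference of the estimates."""
    return abs(est1 - est2) > z * math.hypot(se1, se2)
```

For example, with estimates 0.0 and 0.5 and SEs 0.1 and 0.2 both functions report a difference, while with estimates 0.0 and 0.4 neither does.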

13
Vertical Comparisons
  • Comparing survival at a given time
  • Use the Kaplan-Meier approach to estimate the
    cumulative hazard
  • H(t) = −log(S(t))
  • at each event time for each group
  • Compute the SE of this estimate in the usual way
    (see, for example, Chapter 2 of Collett, D. (2003)
    Modelling survival data in medical research (2nd
    Ed). Chapman & Hall/CRC Press, Boca Raton)
  • Compute the uncertainty interval
  • estimate ± 1.96 × delta
  • then back-transform (i.e. exponentiate) and plot
14
Vertical Comparisons (contd)
  • We work on the scale of the cumulative hazard,
    H = −log(S), for ease and other reasons (Link, C.
    (1984) Confidence intervals for the survival
    function using Cox's proportional-hazard model
    with covariates. Biometrics 40: 601-610)
  • Easy to write a function to do this in
    S-PLUS 2000 (a plug for our conference sponsor)

15
Gastric cancer: vertical comparisons
16
Gastric cancer: vertical comparisons
  • Overlap of cross-hatched regions indicates no
    significant differences at the associated time
    points
  • Daylight between the curves (ie. underlap of
    cross-hatched regions) indicates significant
    differences at those time points (pointwise
    comparisons on the scale of the cumulative
    hazard)
  • The difference in the early survival experience
    of the two arms (from about day 144 to day 381)
    is readily apparent in the graph
  • Data-to-ink ratio uncomfortably low; see Tufte,
    E.R. (1983) The visual display of quantitative
    information. Graphics Press, Cheshire.

17
Horizontal Comparisons
  • Comparing percentiles of each group (e.g. median
    of group 1 with median of group 2)
  • Using the K-M estimated survivor function, the
    estimated pth percentile is the smallest
    observed event time t(p) for which
  • S(t(p)) < 1 − (p/100)
  • The SE of the estimated pth percentile can be
    found from the usual delta method formula (see
    Collett op. cit.)
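
The percentile definition above translates directly into a lookup over the K-M estimates. A minimal sketch, assuming the event times and the corresponding survival estimates are already available as two parallel lists (the function name and interface are hypothetical):

```python
def km_percentile(event_times, surv, p):
    """Estimated pth percentile: the smallest observed event time t with
    S(t) < 1 - p/100, or None if the curve never falls that far.

    event_times : distinct event times in increasing order
    surv        : K-M estimates S(t) at those times
    """
    threshold = 1.0 - p / 100.0
    for t, s in zip(event_times, surv):
        if s < threshold:
            return t
    return None
```

Returning None when the curve never reaches the threshold mirrors the practical limit on which percentiles can be compared at all.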

18
Horizontal Comparisons (contd)
  • SE of the estimated pth percentile
  • where the SE of the survival function estimate
    uses Greenwood's formula and where the estimate
    of the density function (a ratio of differences)
    can be very unstable!
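
The formula on this slide is missing from the transcript. The following sketch assumes the delta-method form usually given in textbooks; the symbols û(p), l̂(p) and the small constant ε are assumptions, so the details should be checked against Collett.

```latex
\begin{align*}
\operatorname{SE}\{\hat t(p)\}
  &= \frac{\operatorname{SE}\{\hat S(\hat t(p))\}}{\hat f(\hat t(p))}, \\
\hat f(\hat t(p))
  &= \frac{\hat S(\hat u(p)) - \hat S(\hat l(p))}{\hat l(p) - \hat u(p)}, \\
\hat u(p) &= \max\{\, t_{(j)} : \hat S(t_{(j)}) \ge 1 - p/100 + \epsilon \,\}, \\
\hat l(p) &= \min\{\, t_{(j)} : \hat S(t_{(j)}) \le 1 - p/100 - \epsilon \,\}.
\end{align*}
```

The density estimate in the second line is the ratio of differences referred to above: when the survival curve is nearly flat, or the two bracketing event times are close together, small changes in the data move it a great deal.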

19
Gastric cancer: horizontal comparisons
20
Gastric cancer: horizontal comparisons
  • Overlap of cross-hatched regions indicates no
    significant differences between treatment arms at
    those survival proportions or percentages
  • Evidently the times associated with the 55th
    through to the 85th percentiles of survival are
    significantly different between the two treatment
    arms (as judged by pointwise tests)
  • We have had to limit our comparisons to the 25th
    through to the 95th percentiles

21
Horizontal Comparisons - Issues
  • Which percentile test to use: what are the
    operating characteristics of various tests?
  • We've used a crude asymptotic z-test on the scale
    of the survival probability
  • Where and how to automatically restrict
    comparisons: estimation of the density function
    of the survival distribution (required for the
    variance estimate of the percentile) is the
    problem

22
Another example: Monoclonal gammopathy of
undetermined significance (MGUS)
  • 241 patients diagnosed at the Mayo Clinic with an
    apparently benign monoclonal gammopathy before
    January 1971 were followed forward to 1992.
  • 140 males, 101 females
  • Example 8.4.1 in Therneau and Grambsch, op. cit.
  • We investigate the gender difference

23
MGUS: Overall survival
24
MGUS: Vertical Comparisons
25
MGUS: Horizontal Comparisons
26
MGUS: Smoothed (lowess, f = 0.2) Horizontal
Comparisons?
27
Summary and Conclusions
  • Could be a useful exploratory tool?
  • But dangerous in some hands. How much daylight
    shining between the curtains, which shroud the
    curves, should cause us to take action?
  • Characterising the data by emphasizing the
    results of a collection of pointwise tests,
    rather than the actual data (!) (cf. Tufte op.
    cit.)
  • Re the horizontal comparison, a more stable
    estimation procedure for SEs of percentiles is
    required
  • Identifying and fitting models from a suitable
    parametric family neatly avoids this whole issue:
    the curves are everywhere different, except
    at points of intersection, when one or more
    parameters are significantly different