A Brief Introduction to Survival Analysis - PowerPoint PPT Presentation




Slides: 29
Provided by: UAMS


Transcript and Presenter's Notes

Title: A Brief Introduction to Survival Analysis

A Brief Introduction to Survival Analysis
Page C. Moore, Ph.D.
Department of Biostatistics
GCRC Clinical Research Course
October 12, 2006
Outline and Course Objectives
  • What are survival data?
  • Why do we need special methods?
  • Assumptions about censoring
  • Estimating survival curves
  • Comparing survival curves
  • Incorporating covariates and prognostic factors

Methods are called survival analysis for
historical reasons, but are useful for analyzing
time to events other than death, e.g.,
  • time to relapse (pediatric ALL studies)
  • time to pregnancy (infertility studies)
  • time to developmental milestones (infant studies
    related to size at birth)
  • time to divorce (marital studies)
  • time to drop-out (high school retention studies)
Why are standard methods of estimation (e.g.,
sample mean or median) and analysis (t-tests,
chi-square tests, linear regression) inadequate
for these situations?
Censoring occurs when a subject is observed for
some period of time without experiencing the event
of interest (death, relapse, bone marrow
engraftment, etc.).
  • Censoring may result from:
  • Loss to follow-up
  • Follow-up ending before the event occurs
  • Competing risks (e.g., a bone marrow transplant
    patient dies of an opportunistic infection before
    engraftment; an ALL patient dies in an automobile
    accident before relapsing)

Standard methods are adequate when prolonged
observation of an individual is not necessary to
assess occurrence of the event.
Example: surgical mortality.
Statistical analysis: a 2x2 contingency-table
chi-square analysis may be used to assess
differences in survival between groups of subjects.

            …     Non-Emergency   Total
Dead        24          9           33
Alive      289        100          389
Total      313        109          422

Chi-square = 0.04; degrees of freedom = (2-1)(2-1)
= 1; p = 0.84
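
The chi-square figure for the surgical mortality example can be checked with a short sketch. It assumes the 2x2 counts reported for this example (24 and 9 deaths, 289 and 100 survivors in the two groups) and uses the Pearson statistic without a continuity correction; the function name `chi_square_2x2` is illustrative only.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    observed = ((a, b), (c, d))
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed[i][j] - expected) ** 2 / expected
    # With 1 degree of freedom, P(X > stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Rows: dead, alive. Columns: the two surgery groups (counts from the example).
chi2, p = chi_square_2x2(24, 9, 289, 100)
print(f"chi-square = {chi2:.2f}, p = {p:.2f}")
```

A statistic this small on 1 degree of freedom gives no evidence of a difference in mortality between the two groups.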
Assumption:
  • The censoring process is independent of the event
    (failure) process.
  • Violations can be subtle; e.g., patients might
    drop out of a study because advanced disease
    makes them feel they are too weak or ill to
    continue.
  • How can we account for partial information
    provided by censored observations?
  • Time measured (approximately) continuously (e.g.,
    days or weeks):
  • Kaplan-Meier plots (a.k.a. actuarial curves,
    product-limit curves, survival curves)
  • Event times grouped into larger time
    intervals (e.g., years or decades):
  • use special but similar methods
Basis: Survival Rates
The probability of surviving 2 days is the
probability of surviving day 2 given survival of
day 1, multiplied by the probability of surviving
day 1. The probability of surviving 3 days is the
probability of surviving day 3 given survival of
day 2, multiplied by the probability of surviving
2 days (see above), and so on.
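Survival Rates in Notation
\[
\begin{aligned}
P(\text{surviving } t \text{ days}) ={}& P(\text{surviving day } t \mid \text{survived day } t-1) \\
&\times P(\text{surviving day } t-1 \mid \text{survived day } t-2) \\
&\times P(\text{surviving day } t-2 \mid \text{survived day } t-3) \\
&\times \cdots \\
&\times P(\text{surviving day } 3 \mid \text{survived day } 2) \\
&\times P(\text{surviving day } 2 \mid \text{survived day } 1) \\
&\times P(\text{surviving day } 1)
\end{aligned}
\]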
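
The product of conditional probabilities can be illustrated numerically. The daily probabilities below are hypothetical values chosen only to show the arithmetic; they do not come from the presentation.

```python
from itertools import accumulate
from operator import mul

# Hypothetical conditional probabilities of surviving each day,
# given survival through the previous day (day 1 is unconditional).
daily_conditional = [0.98, 0.95, 0.90, 0.96]

# Running product: entry k is the probability of surviving k + 1 days.
cumulative = list(accumulate(daily_conditional, mul))

for day, prob in enumerate(cumulative, start=1):
    print(f"P(surviving {day} days) = {prob:.4f}")
```

Each cumulative value is the previous one multiplied by the next conditional probability, which is the same step the Kaplan-Meier method repeats at every event time.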
Example: Remission time of acute leukemia
  • Purpose: evaluate the drug's ability to maintain
    remission
  • Patients randomly assigned
  • Study terminated after 1 year
  • Different follow-up times due to sequential
    enrollment
  • 6-MP (remission times in weeks):
  • 6, 6, 6, 7, 10, 22, 23, 6, 9, 10, 11, 17, 19, 20, 25, 32
  • Placebo (remission times in weeks):
  • 1, 1, 2, 2, 3, 4, 4, 5, 5, 8, 8, 8, 8, 11, 11, 12, 12, 15, 17, 22, 23
Example: Remission time of acute leukemia
  • Statistic of interest: t-week survival
    (remission) rate
  • = (number of individuals relapse-free longer
    than t weeks) / (total number of individuals in
    data set)
  • Without censoring (placebo group):
  • 10-week remission duration rate = 8/21 × 100 = 38%
  • What can we do about censoring?
  • Kaplan-Meier (product-limit) method for
    estimating survival rates
How can I calculate a survival rate?
Column 1 | Column 2 | Column 3 | Column 4 | Column 5
  • Ranks 1 to n
  • Ranks for uncensored observations only
  • Time t survival rate: multiply values in Col. 4
    up to and including t
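
The product-limit calculation can be sketched in a few lines of Python. This is a minimal illustration rather than the worksheet from the slides: the names `kaplan_meier` and `survival_at` and the 0/1 event coding are illustrative choices. The placebo times are those of the leukemia example. For 6-MP the transcript does not mark which times are censored, so the sketch assumes the first seven listed values are relapses and the remaining nine are censored, an assumption based only on the ordering of the list.

```python
def kaplan_meier(times, events):
    """Product-limit estimate: list of (event time, S(t)) steps.

    events[i] is 1 if times[i] is an observed event, 0 if censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = leaving = 0
        # Gather everyone whose time equals t (events and censorings).
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            surv *= (at_risk - deaths) / at_risk
            curve.append((t, surv))
        at_risk -= leaving
    return curve


def survival_at(curve, t):
    """Value of the step function at time t (1.0 before the first event)."""
    value = 1.0
    for step_time, s in curve:
        if step_time <= t:
            value = s
    return value


placebo = [1, 1, 2, 2, 3, 4, 4, 5, 5, 8, 8, 8, 8,
           11, 11, 12, 12, 15, 17, 22, 23]
placebo_curve = kaplan_meier(placebo, [1] * len(placebo))

six_mp_times = [6, 6, 6, 7, 10, 22, 23, 6, 9, 10, 11, 17, 19, 20, 25, 32]
# ASSUMPTION: first seven values are relapses, last nine are censored.
six_mp_events = [1] * 7 + [0] * 9
six_mp_curve = kaplan_meier(six_mp_times, six_mp_events)

print(f"Placebo 10-week rate: {survival_at(placebo_curve, 10):.3f}")
print(f"6-MP 10-week rate (under the assumption): {survival_at(six_mp_curve, 10):.3f}")
```

Because the placebo group has no censoring, its 10-week value should agree with the simple proportion 8/21. The 6-MP value depends entirely on the censoring assumption stated above.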
Comments and Observations
  • The Kaplan Meier curve is a step function (i.e.,
    it does not change on days when no events occur).
  • Step sizes are not all equal; they depend on
    changes in the denominator.
  • Even with heavy censoring, the Kaplan-Meier curve
    is an unbiased estimate of the true (population)
    survival curve. Censoring affects the precision
    but not the accuracy (bias).
  • Censoring must be independent of occurrence of
    endpoint for estimate to be unbiased.

Comments and Observations
  • If there is no censoring, the Kaplan-Meier
    estimate is the same as the simple observed
    proportion surviving.
  • For example, if there are 100 observations and no
    censoring the curve will have the value 0.99
    between the first and second failure times
    (assuming only one individual failed at time 1).
    If there are two failures at time 2 (3 failures
    now), the curve will be 0.97 between times 2 and
    3.
  • Don't over-interpret plateaus!
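
The no-censoring case can be verified with a small sketch. It builds a hypothetical data set of 100 uncensored observations (one failure at time 1, two at time 2, the rest later) and compares the product-limit value with the observed proportion still event-free. The function name `km_curve` is illustrative.

```python
def km_curve(times, events):
    """Product-limit estimate as {event time: S(t)}."""
    surv, curve = 1.0, {}
    for t in sorted(set(times)):
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e)
        if deaths:
            surv *= (at_risk - deaths) / at_risk
            curve[t] = surv
    return curve


# Hypothetical data: 100 observations, none censored.
times = [1, 2, 2] + list(range(3, 100))
events = [1] * len(times)
curve = km_curve(times, events)

for t in (1, 2):
    # Observed proportion with a failure time later than t.
    proportion = sum(1 for u in times if u > t) / len(times)
    print(f"t = {t}: Kaplan-Meier = {curve[t]:.2f}, observed proportion = {proportion:.2f}")
```

The two columns agree at every event time, which is the sense in which the Kaplan-Meier estimate reduces to the simple proportion when nothing is censored.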

Additional Comments
  • There are estimators of the variance of the
    Kaplan-Meier estimate at any time point t. These
    can be used to calculate a confidence interval
    for the proportion surviving at time t (e.g., a
    five-year survival rate for breast cancer).
  • There are statistical tests for comparing the
    survival durations in two or more groups. The
    most frequently used are the log-rank test and
    the Gehan test. Both have test statistics that
    are compared to critical values of the chi-square
    distribution.
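
A two-group log-rank test can be sketched as follows. This is a minimal version of the standard statistic (observed minus expected events in one group, squared, divided by the hypergeometric variance), not code from the presentation. The function name `logrank_test` and the small data set are hypothetical.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank chi-square statistic (1 df) and p-value."""
    pooled = ([(t, e, 0) for t, e in zip(times_a, events_a)] +
              [(t, e, 1) for t, e in zip(times_b, events_b)])
    event_times = sorted({t for t, e, _ in pooled if e})
    observed_a = expected_a = variance = 0.0
    for t in event_times:
        n_a = sum(1 for u, _, g in pooled if u >= t and g == 0)
        n_b = sum(1 for u, _, g in pooled if u >= t and g == 1)
        d_a = sum(1 for u, e, g in pooled if u == t and e and g == 0)
        d_b = sum(1 for u, e, g in pooled if u == t and e and g == 1)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:  # the variance term is undefined when one subject remains
            variance += n_a * n_b * d * (n - d) / (n ** 2 * (n - 1))
    stat = (observed_a - expected_a) ** 2 / variance
    # With 1 degree of freedom, P(X > stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical data: group A fails early; group B fails late, one censored.
stat, p_value = logrank_test([1, 2, 3], [1, 1, 1], [10, 11, 12], [1, 1, 0])
print(f"log-rank chi-square = {stat:.2f}, p = {p_value:.3f}")
```

The statistic is compared with the chi-square distribution on 1 degree of freedom, whose 5% critical value is 3.84.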

Covariates and Prognostic Factors
  • Regression models for survival data allow us to:
  • - Evaluate more than one risk factor at a time
  • - Evaluate relative treatment effects while
    controlling for potential confounding factors
  • - Investigate interactive effects among factors
  • They are not a panacea for flawed study designs!
  • The model most often used is the proportional
    hazards model developed by Cox in 1972,
  • - often referred to simply as the Cox model.

Covariates and Prognostic Factors
  • The hazard (instantaneous failure rate) function
    is more convenient mathematically than the
    survivor function.
  • - There is a one-one relationship between
    the functions, so identifying factors which
    affect the hazard identifies those factors which
    affect survival.
  • A strong advantage of the proportional hazards
    model is that we do not need to make assumptions
    about the form of the failure time distribution
    for a given set of covariate values.
  • - However, it is assumed that the covariate
    value has the same proportional effect on
    increasing or decreasing an individual's hazard
    relative to the baseline, regardless of time.
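
In commonly used notation, the proportional hazards model and the hazard-survivor relationship can be written as below. The symbols (baseline hazard \(h_0\), covariates \(x_1, \dots, x_p\), coefficients \(\beta_1, \dots, \beta_p\)) are assumptions introduced here for illustration; the slides do not give formulas.

```latex
\[
h(t \mid x) = h_0(t)\,\exp(\beta_1 x_1 + \cdots + \beta_p x_p),
\qquad
S(t \mid x) = \exp\!\left(-\int_0^t h(u \mid x)\,du\right)
\]
\[
\frac{h(t \mid x)}{h(t \mid x')} = \exp\!\left(\sum_{k=1}^{p} \beta_k (x_k - x'_k)\right)
\quad \text{(does not depend on } t\text{)}
\]
```

The baseline hazard \(h_0(t)\) is left unspecified, which is why no failure time distribution has to be assumed, while the second line states the proportionality assumption: the hazard ratio between two covariate patterns is constant over time.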

Covariates and Prognostic Factors
  • We estimate the regression parameters β using a
    principle called maximum likelihood.
  • We use what we know about the asymptotic
    (infinite sample size) behavior of these
    estimates to make inferences about our finite
    sample.
  • - Clearly, the smaller our sample size, the more
    questionable are our approximations.
  • In practice, we usually don't do too badly.
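
For the Cox model, the likelihood that is maximized is usually Cox's partial likelihood. A sketch in assumed notation (not taken from the slides), with no tied event times: \(\delta_i = 1\) marks an observed event, \(R(t_i)\) is the set of subjects still at risk at time \(t_i\), and \(x_i\) is the covariate vector of subject \(i\).

```latex
\[
L(\beta) = \prod_{i \,:\, \delta_i = 1}
\frac{\exp(\beta^{\top} x_i)}{\sum_{j \in R(t_i)} \exp(\beta^{\top} x_j)}
\]
\[
\hat{\beta}_k \pm 1.96\,\widehat{\mathrm{SE}}(\hat{\beta}_k)
\quad \text{(approximate 95\% interval from asymptotic normality)}
\]
```

The interval in the second line is one example of an inference that relies on large-sample behavior, so it becomes less trustworthy as the number of observed events shrinks.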

Rule of Thumb Sample Size
  • Need 10 times as many observed events as
    factors in the model (e.g., 3 factors require 30
    observed events, 10 events for each factor).
  • The distribution across categories is important
    as well as the total sample size.
  • For example, if failure to thrive (FTT) is a
    factor you wish to control for but you only have
    two patients out of your sample of 100 who have
    FTT, the estimated effect of FTT will be

Why Study Prognostic Factors?
1. To learn about natural history of disease
2. To adjust for imbalances in comparing treatments
3. To aid in designing future studies
4. To look for treatment-covariate interaction
5. To predict outcome for individual patients
6. To intervene in the course of disease
7. To explain variation and detect interaction
-- Byar (in Buyse et al., 1988)
  • How Do We Identify Prognostic Factors?
  • A. Initial screening
  • B. Developing multivariate models

Developing Multivariate Models
  • We draw conclusions about the importance of the
    factor in question by making inferences about the
    magnitude and sign (+/-) of the regression
    coefficient associated with that factor.
  • Because of inter-relationships among the
    prognostic factors, the values of the
    corresponding regression coefficients (and hence,
    their statistical significance) will depend on
    what other factors are in the model.
  • The purpose of the modeling determines the
    modeling strategy.

Additional Resources
  • Text
  • Kleinbaum, D.G. and Klein, M., Survival Analysis:
    A Self-Learning Text, Springer, New York, 2005.
  • Klein, J.P. and Moeschberger, M.L., Survival
    Analysis, Springer, New York, 2005.
  • Computer Software
  • SAS (http://www.sas.com/)
  • S-plus (http://www.splus.com/)
  • NCSS (http://www.ncss.com/download.html)