A Brief Introduction to Survival Analysis

Page C. Moore, Ph.D.

Department of Biostatistics

GCRC Clinical Research Course

October 12, 2006

Outline and Course Objectives

- What are survival data?
- Why do we need special methods?
- Assumptions about censoring
- Estimating survival curves
- Comparing survival curves
- Incorporating covariates and prognostic factors

Methods are called survival analysis for historical reasons, but are useful for analyzing time to events other than death, e.g.,

- time to relapse (pediatric ALL studies)
- time to pregnancy (infertility studies)
- time to developmental milestones (infant studies related to size at birth)
- time to divorce (marital studies)
- time to drop-out (high school retention studies)

Why are standard methods of estimation (i.e., sample mean/median) and analysis (t-tests, chi-square, linear regression) inadequate for these situations?

Censoring

Censoring occurs when a subject is observed for some period of time without the event of interest (death, relapse, bone marrow engraftment, etc.) occurring.

Censoring may result from:

- Loss to follow-up
- Follow-up ends before event occurs
- Competing risks (e.g., a bone marrow transplant patient dies of opportunistic infection before engraftment; an ALL patient dies in an automobile accident before relapsing)

Standard methods suffice when the prolonged observation of an individual is not necessary to assess occurrence of the event.

Example: Surgical mortality

Statistical Analysis: 2x2 contingency chi-square analysis may be used to assess differences in survival between groups of subjects.

Chi-Square = 0.04, Degrees of Freedom = (2-1)(2-1) = 1, p = 0.084
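
A 2x2 chi-square analysis of this kind can be sketched in a few lines. The counts below are hypothetical (the slide's table was not transcribed), and the function name `chi_square_2x2` is an illustrative choice; the statistic is the ordinary Pearson chi-square without continuity correction.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the 2x2 table
    [[a, b], [c, d]]. Returns (statistic, p_value) on 1 degree of freedom."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, P(X > stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Hypothetical counts: rows = treatment groups, columns = (died, survived).
stat, p = chi_square_2x2(10, 20, 20, 10)
print(f"chi-square = {stat:.2f}, df = 1, p = {p:.4f}")
```

This works only because every subject's outcome is known shortly after surgery; no observation is censored.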

Assumption

- The censoring process is independent of the event (failure) process.
- Violations can be subtle, e.g., patients might drop out of a study because advanced disease makes them feel they are too weak or ill to continue.

How can we account for partial information provided by censored observations?

- Time measured (approximately) continuously (e.g., days or weeks): Kaplan-Meier plots (a.k.a. actuarial curves, product limit curves, survival curves)
- Event times are grouped into larger time intervals (e.g., years or decades): use special but similar methods

Basis: Survival Rates

- Probability of surviving 2 days is the probability of surviving day 2 given survival of day 1, multiplied by the probability of surviving day 1.
- Probability of surviving 3 days is the probability of surviving day 3 given survival of day 2, multiplied by the probability of surviving day 2 (see above).
- Etc.

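Survival Rates: In Notation

$$
\begin{aligned}
P(\text{surviving } t \text{ days}) ={}& P(\text{surviving day } t \mid \text{survived day } t-1) \\
&\times P(\text{surviving day } t-1 \mid \text{survived day } t-2) \\
&\times P(\text{surviving day } t-2 \mid \text{survived day } t-3) \\
&\times \cdots \\
&\times P(\text{surviving day } 3 \mid \text{survived day } 2) \\
&\times P(\text{surviving day } 2 \mid \text{survived day } 1) \\
&\times P(\text{surviving day } 1)
\end{aligned}
$$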
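
The chain of conditional probabilities above is just a running product. A minimal sketch, using made-up daily probabilities (the values in `daily_conditional` are hypothetical, not from the slides):

```python
from itertools import accumulate
from operator import mul

# Hypothetical conditional probabilities of surviving each day, given
# survival through the previous day (the day-1 entry is unconditional).
daily_conditional = [0.98, 0.95, 0.90]

# P(surviving t days) is the product of the first t entries.
cumulative = list(accumulate(daily_conditional, mul))

for day, prob in enumerate(cumulative, start=1):
    print(f"P(surviving {day} days) = {prob:.4f}")
```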

Example: Remission time of acute leukemia

- Purpose: evaluate a drug's ability to maintain remissions
- Patients randomly assigned
- Study terminated after 1 year
- Different follow-up times due to sequential enrollment
- 6-MP: 6,6,6,7,10,22,23,6,9,10,11,17,19,20,25,32,32,34,35
- Placebo: 1,1,2,2,3,4,4,5,5,8,8,8,8,11,11,12,12,15,17,22,23

Example: Remission time of acute leukemia

- Statistic of interest: t-year survival rate (here t is in weeks) = (number of individuals relapse-free longer than t weeks) / (total number of individuals in data set)
- Without censoring (placebo group): 10-wk remission duration rate = 8/21 × 100 = 38.1%
- What can we do about censoring? Kaplan-Meier (product limit) method for estimating survival rates
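
The uncensored calculation for the placebo group can be reproduced directly from the listed remission times. This is a small sketch; the function name `naive_survival_rate` is an illustrative choice, and the approach is valid only when no observation is censored.

```python
# Placebo remission times in weeks, as listed in the example.
placebo_weeks = [1, 1, 2, 2, 3, 4, 4, 5, 5, 8, 8, 8, 8,
                 11, 11, 12, 12, 15, 17, 22, 23]

def naive_survival_rate(times, t):
    """Proportion of subjects event-free longer than t.
    Only meaningful when every time is an observed event (no censoring)."""
    return sum(1 for x in times if x > t) / len(times)

rate = naive_survival_rate(placebo_weeks, 10)
print(f"10-week remission duration rate: {100 * rate:.1f}%")
```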

How Can I Calculate a Survival Rate?

Worksheet with Columns 1 to 5, including:

- Ranks 1 to n
- Ranks for uncensored observations only
- Time t survival rate: multiply values in Col. 4 up to and including t
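
A rank-based worksheet of this kind might be coded as follows. This is a sketch under assumptions: the transcript does not say what Column 4 holds, so the factor (n - r) / (n - r + 1) for each uncensored rank r is assumed here (it is the usual rank form of the product-limit method). The data and the function name `product_limit_table` are hypothetical.

```python
def product_limit_table(observations):
    """observations: list of (time, event), event=True for an observed
    failure and False for a censored time. Returns one row
    (time, rank, factor, survival) per uncensored observation."""
    n = len(observations)
    # Rank by time; at tied times, failures are ranked before censored times.
    ordered = sorted(observations, key=lambda obs: (obs[0], not obs[1]))
    rows = []
    survival = 1.0
    for rank, (time, event) in enumerate(ordered, start=1):
        if not event:
            continue  # censored: occupies a rank but contributes no factor
        factor = (n - rank) / (n - rank + 1)  # assumed Column 4 entry
        survival *= factor                    # running product of Column 4
        rows.append((time, rank, factor, survival))
    return rows

# Hypothetical remission times in weeks (False marks a censored time).
data = [(3, True), (5, False), (6, True), (6, True), (8, False), (10, True)]
rows = product_limit_table(data)
for time, rank, factor, survival in rows:
    print(f"t={time:>2}  rank={rank}  factor={factor:.3f}  S(t)={survival:.3f}")
```

When several failures share a time, the last row for that time gives the survival rate at that time.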

Comments and Observations

- The Kaplan-Meier curve is a step function (i.e., it does not change on days when no events occur).
- Step sizes are not all equal; they depend on changes in the denominator.
- Even with heavy censoring, the Kaplan-Meier curve is an unbiased estimate of the true (population) survival curve. Censoring affects the precision but not the accuracy (bias).
- Censoring must be independent of occurrence of the endpoint for the estimate to be unbiased.

Comments and Observations

- If there is no censoring, the Kaplan-Meier estimate is the same as the simple observed proportion surviving.
- For example, if there are 100 observations and no censoring, the curve will have the value 0.99 between the first and second failure times (assuming only one individual failed at time 1). If there are two failures at time 2 (3 failures now), the curve will be 0.97 between times 2 and 3.
- Don't overinterpret plateaus!
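
The 100-observation example can be checked numerically. The sketch below uses the risk-set form of the Kaplan-Meier estimate (multiply 1 - deaths / number at risk at each failure time). The data are constructed to match the example, and the function name `kaplan_meier` is an illustrative choice.

```python
from collections import Counter

def kaplan_meier(times, events):
    """Return [(time, S(time))] at each distinct failure time.
    events[i] is True for a failure, False for a censored time."""
    deaths = Counter(t for t, e in zip(times, events) if e)
    curve, survival = [], 1.0
    for t in sorted(deaths):
        at_risk = sum(1 for x in times if x >= t)  # still under observation
        survival *= 1 - deaths[t] / at_risk
        curve.append((t, survival))
    return curve

# 100 subjects, no censoring: one failure at time 1, two at time 2,
# and the remaining 97 at distinct later times (hypothetical).
times = [1, 2, 2] + [3 + i for i in range(97)]
events = [True] * 100
curve = kaplan_meier(times, events)

# Simple observed proportion surviving beyond time 2, for comparison.
observed = sum(1 for x in times if x > 2) / len(times)
print(curve[0], curve[1], observed)
```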

Additional Comments

- There are estimators of the variance of the Kaplan-Meier estimate at any time point t. These can be used to calculate a confidence interval for the proportion surviving at time t (e.g., a five-year survival rate for breast cancer patients).
- There are statistical tests for comparing the survival durations in two or more groups. The most frequently used are the log-rank test and the Gehan test. Both have test statistics that are compared to critical values of the chi-square distribution.
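
As an illustration of the second point, a two-group log-rank test can be sketched with the standard library alone. This is one common formulation (observed minus expected failures in one group, with hypergeometric variance, referred to chi-square on 1 degree of freedom). The data and the function name `logrank_test` are hypothetical.

```python
import math

def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test. Returns (chi_square, p_value) on 1 df."""
    pooled = ([(t, e, 0) for t, e in zip(times_a, events_a)]
              + [(t, e, 1) for t, e in zip(times_b, events_b)])
    failure_times = sorted({t for t, e, _ in pooled if e})
    observed_a = expected_a = variance = 0.0
    for t in failure_times:
        n = sum(1 for x, _, _ in pooled if x >= t)             # at risk, both groups
        n_a = sum(1 for x, _, g in pooled if x >= t and g == 0)  # at risk, group A
        d = sum(1 for x, e, _ in pooled if x == t and e)         # failures at t
        d_a = sum(1 for x, e, g in pooled if x == t and e and g == 0)
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)
    chi_square = (observed_a - expected_a) ** 2 / variance
    # With 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    return chi_square, math.erfc(math.sqrt(chi_square / 2))

# Hypothetical remission times in weeks; False marks a censored time.
times_a = [6, 7, 10, 15, 19, 25]
events_a = [True, True, False, True, False, False]
times_b = [1, 2, 3, 4, 5, 8]
events_b = [True] * 6

chi_square, p_value = logrank_test(times_a, events_a, times_b, events_b)
print(f"log-rank chi-square = {chi_square:.2f}, p = {p_value:.4f}")
```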

Covariates and Prognostic Factors

- Regression models for survival data allow us to
  - Evaluate more than one risk factor at a time
  - Evaluate relative treatment effects while controlling for potential confounding factors
  - Investigate interactive effects among factors
- They are not a panacea for flawed study designs!
- The model most often used is the proportional hazards model developed by Cox in 1972, often referred to simply as the Cox model.

Covariates and Prognostic Factors

- The hazard (instantaneous failure rate) function is more convenient mathematically than the survivor function.
  - There is a one-to-one relationship between the functions, so identifying factors which affect the hazard identifies those factors which affect survival.
- A strong advantage of the proportional hazards model is that we do not need to make assumptions about the form of the failure time distribution for a given set of covariate values.
  - However, it is assumed that the covariate value has the same proportional effect on increasing or decreasing an individual's hazard relative to the baseline, regardless of time.
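
In standard notation, these statements can be written as follows. The symbols are assumptions introduced here, not taken from the slides: $h(t)$ is the hazard, $S(t)$ the survivor function, $h_0(t)$ the baseline hazard, $x_1, \dots, x_p$ the covariates, and $\beta_1, \dots, \beta_p$ the regression parameters.

```latex
% One-to-one relationship between hazard and survivor function
S(t) = \exp\!\left( -\int_0^t h(u)\,du \right),
\qquad
h(t) = -\frac{d}{dt} \log S(t)

% Proportional hazards model: baseline hazard left unspecified
h(t \mid x) = h_0(t)\, \exp(\beta_1 x_1 + \cdots + \beta_p x_p)

% Hazard ratio for two covariate vectors x_a and x_b does not depend on t
\frac{h(t \mid x_a)}{h(t \mid x_b)}
  = \exp\!\left( \sum_{k=1}^{p} \beta_k \,(x_{a,k} - x_{b,k}) \right)
```

The last line is the proportionality assumption: the baseline hazard cancels, so the relative effect of a covariate is the same at every time.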

Covariates and Prognostic Factors

- We estimate the regression parameters β using a principle called maximum likelihood.
- We use what we know about the asymptotic (infinite sample size) behavior of these estimates to make inferences about our finite samples.
  - Clearly, the smaller our sample size, the more questionable are our approximations.
- In practice, we usually don't do too badly.
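
To make the estimation step concrete, here is a minimal sketch for a single covariate. It assumes the likelihood being maximized is Cox's partial likelihood (the slides say only "maximum likelihood"), handles tied failure times in the Breslow fashion, and uses Newton-Raphson. The data and function names are hypothetical.

```python
import math

def score_and_information(beta, times, events, x):
    """First derivative (score) and negative second derivative (information)
    of the Cox partial log-likelihood for one covariate."""
    score = info = 0.0
    for t_i, e_i, x_i in zip(times, events, x):
        if not e_i:
            continue  # censored subjects enter only through risk sets
        # Risk set: everyone still under observation at time t_i.
        risk = [(x_j, math.exp(beta * x_j))
                for t_j, x_j in zip(times, x) if t_j >= t_i]
        total = sum(w for _, w in risk)
        mean = sum(x_j * w for x_j, w in risk) / total
        mean_sq = sum(x_j * x_j * w for x_j, w in risk) / total
        score += x_i - mean
        info += mean_sq - mean ** 2
    return score, info

def cox_fit_single(times, events, x, max_iter=50, tol=1e-10):
    """Newton-Raphson maximization; returns (beta_hat, standard_error)."""
    beta = 0.0
    for _ in range(max_iter):
        score, info = score_and_information(beta, times, events, x)
        step = score / info
        beta += step
        if abs(step) < tol:
            break
    _, info = score_and_information(beta, times, events, x)
    # Asymptotic standard error from the observed information.
    return beta, 1 / math.sqrt(info)

# Hypothetical data: x = 1 marks one treatment group, x = 0 the other.
times = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
events = [True, True, True, True, True, False, True, True, False, True]
x = [1, 0, 1, 1, 0, 1, 0, 1, 0, 0]

beta_hat, se = cox_fit_single(times, events, x)
print(f"beta = {beta_hat:.3f}, SE = {se:.3f}, hazard ratio = {math.exp(beta_hat):.2f}")
```

The standard error comes from the asymptotic theory, which is why it deserves caution in a sample this small.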

Rule of Thumb Sample Size

- Need 10 times as many observed events as factors in the model (e.g., 3 factors: 30 observed events, 10 events for each factor).
- The distribution across categories is important as well as the total sample size.
- For example, if failure to thrive (FTT) is a factor you wish to control for but you only have two patients out of your sample of 100 who have FTT, the estimated effect of FTT will be unreliable.
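
The arithmetic of the rule of thumb, as a small sketch (the function names are illustrative, and the ratio of 10 is the one stated above):

```python
def events_needed(n_factors, events_per_factor=10):
    """Observed events the rule of thumb asks for with n_factors in the model."""
    return n_factors * events_per_factor

def max_factors(observed_events, events_per_factor=10):
    """Largest number of factors the rule of thumb supports."""
    return observed_events // events_per_factor

print(events_needed(3))   # 3 factors call for 30 observed events
print(max_factors(45))    # 45 observed events support at most 4 factors
```

Note that the count is of observed events, not of subjects; censored subjects do not count toward it.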

Why Study Prognostic Factors?

1. To learn about natural history of disease
2. To adjust for imbalances in comparing treatments
3. To aid in designing future studies
4. To look for treatment-covariate interaction
5. To predict outcome for individual patients
6. To intervene in the course of disease
7. To explain variation and detect interaction

-- Byar (in Buyse et al., 1988)

How Do We Identify Prognostic Factors?

- A. Initial screening
- B. Developing multivariate models

Developing Multivariate Models

- We draw conclusions about the importance of the factor in question by making inferences about the magnitude and sign (+/-) of the regression coefficient associated with that factor.
- Because of inter-relationships among the prognostic factors, the values of the corresponding regression coefficients (and hence, their statistical significance) will depend on what other factors are in the model.
- The purpose of the modeling determines the modeling strategy.

Additional Resources

- Text
  - Kleinbaum, D.G. and Klein, M., Survival Analysis: A Self-Learning Text, Springer, New York, 2005.
  - Klein, J.P. and Moeschberger, M.L., Survival Analysis, Springer, New York, 2005.
- Computer Software
  - SAS (http://www.sas.com/)
  - S-plus (http://www.splus.com/)
  - NCSS (http://www.ncss.com/download.html)