Introduction to Biostatistics II - PowerPoint PPT Presentation

About This Presentation

Title:

Introduction to Biostatistics II

Slides: 29

Provided by: Constantin86

Transcript and Presenter's Notes

Title: Introduction to Biostatistics II

1
Introduction to Biostatistics II

Survival analysis

2
Survival analysis

The outcome is survival.
More generally, the outcome is the time to an event, e.g.,
response
failure
death
pregnancy
infection

3
Survival curves for three population groups
4
USA life table 1979-1981
5
Survival curve for the US population, 1979-1981
6
Hemophiliac data example
7
Estimates of the survival curve

Consider the probability that an individual younger than 40 years of age in the previous data set will die at time t = 6 months after initiation of observation.
This individual must have survived up to the six-month point and then expired a short time after that, at time point $t + \delta$, where $\delta$ symbolizes a very small unit of time. How will the probability of a death at six months be calculated?
Recall the definition of the conditional probability of an event B given an event A,
$$P(B \mid A) = \frac{P(A \cap B)}{P(A)},$$
which results in the multiplicative law of probability:
$$P(A \cap B) = P(A)\,P(B \mid A)$$

8
Estimates of the survival curve (contd)

Consider two events: A = Survived up to time $t$, and B = Failed at time $t + \delta$ (i.e., shortly after $t$).
The probability of the event Survived up to time $t + \delta$ is
$$S(t + \delta) = P(A)\,[1 - P(B \mid A)] = S(t)\,[1 - P(B \mid A)],$$
i.e., the probability of surviving up to time $t + \delta$ is equal to the probability of surviving up to $t$ times the probability of not failing at $t + \delta$ given survival up to $t$.

9
The life table method

The life-table method of estimation of the
survival curve works as follows
Split the time scale into J time intervals of the type $t_{j-1}$ to $t_j$, for $j = 1, \ldots, J$.
The number of people dying in each interval is $d_j$.
The number of people alive at the beginning of the interval (number at risk) is $r_j$.

10
Derivation of the life table method

To derive the life-table estimate of the survival distribution we need to estimate the following quantities:
The conditional probability of dying in interval j given survival up to j: $P(B_j \mid A_j) = q_j = d_j / r_j$
Thus, the conditional probability of surviving interval j given survival up to j: $p_j = 1 - q_j$
The life-table estimate of the survival distribution is constructed as follows:
$$S(t_j) = \prod_{i=1}^{j} (1 - q_i)$$

11
Life table of under-40 hemophiliac data

This subfile contains 12 observations.
Life Table. Survival Variable: survival

Interval  Number    Number     Exposed  Terminal  Proportion   Proportion  Cumulative  Probability  Hazard
start     entering  withdrawn  to risk  events    terminating  surviving   surviving   density      rate
--------  --------  ---------  -------  --------  -----------  ----------  ----------  -----------  ------
      .0      12.0         .0     12.0       2.0        .1667       .8333       .8333        .0333   .0364
     5.0      10.0         .0     10.0       3.0        .3000       .7000       .5833        .0500   .0706
    10.0       7.0         .0      7.0       1.0        .1429       .8571       .5000        .0167   .0308
    15.0       6.0         .0      6.0       3.0        .5000       .5000       .2500        .0500   .1333
    20.0       3.0         .0      3.0        .0        .0000      1.0000       .2500        .0000   .0000
    25.0       3.0         .0      3.0       1.0        .3333       .6667       .1667        .0167   .0800
    30.0       2.0         .0      2.0       2.0       1.0000       .0000       .0000

These calculations for the last interval are meaningless.
The median survival time for these data is 15.00
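
The proportion columns of this table can be recomputed from the counts alone. The following is a minimal sketch, assuming the deaths and numbers at risk per interval are read off the table above; the function name `life_table` is illustrative and is not part of SPSS.

```python
def life_table(deaths, at_risk):
    """Return (q_j, p_j, S_j) per interval for a life table without censoring.

    deaths[j]  -- number of terminal events d_j in interval j
    at_risk[j] -- number at risk at the start of interval j, r_j
    """
    rows = []
    cumulative = 1.0
    for d, r in zip(deaths, at_risk):
        q = d / r            # proportion terminating in the interval
        p = 1.0 - q          # proportion surviving the interval
        cumulative *= p      # cumulative proportion surviving at end
        rows.append((q, p, cumulative))
    return rows


# Counts taken from the under-40 hemophiliac life table (5-month intervals).
starts = [0, 5, 10, 15, 20, 25, 30]
deaths = [2, 3, 1, 3, 0, 1, 2]
at_risk = [12, 10, 7, 6, 3, 3, 2]

table = life_table(deaths, at_risk)
for start, (q, p, s) in zip(starts, table):
    print(f"{start:>4}  q={q:.4f}  p={p:.4f}  S={s:.4f}")
```

The sketch covers only the three proportion columns; the density and hazard columns of the SPSS output are not recomputed here.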
12
Life table estimate of the survival distribution
13
The Kaplan-Meier method

The K-M method differs from the life-table method
in that it separates the time spectrum according
to failure times (instead of fixed-width
intervals).
The first interval is (0, 2) (2 is the time of the first failure), when 1 of 12 individuals failed (died), so 11/12 survived. The survival estimate at t = 2 is $S(2) = 11/12 = 0.9167$.
The second interval is (2, 3) (the second failure happens at t = 3), when 1 of 11 individuals fails. The survival estimate at t = 3 is $S(3) = (11/12)(10/11) = 10/12 = 0.8333$, since to survive up to t = 3 you must survive up to t = 2 and (given that you survived up to t = 2) then survive beyond t = 3.
And so on
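
To make the continuation explicit, here is a worked sketch of the next two steps. The symbols $t_k$ (the k-th distinct failure time), $n_k$ (number at risk just before $t_k$) and $d_k$ (number failing at $t_k$) are notation assumed for this illustration; the counts are those of the 12 under-40 patients, where two patients fail at t = 6 and one at t = 7.

```latex
S(t_k) = S(t_{k-1}) \left( 1 - \frac{d_k}{n_k} \right)

S(6) = \frac{10}{12} \cdot \left( 1 - \frac{2}{10} \right) = \frac{8}{12} = 0.6667

S(7) = \frac{8}{12} \cdot \left( 1 - \frac{1}{8} \right) = \frac{7}{12} = 0.5833
```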

14
The product-limit Method

Nothing happens except at the time of failure.

Survival Analysis for survival

Time  Status    Cumulative  Standard  Cumulative  Number
                Survival    Error     Events      Remaining
----  --------  ----------  --------  ----------  ---------
   2  Selected       .9167     .0798           1         11
   3  Selected       .8333     .1076           2         10
   6  Selected                                 3          9
   6  Selected       .6667     .1361           4          8
   7  Selected       .5833     .1423           5          7
  10  Selected       .5000     .1443           6          6
  15  Selected                                 7          5
  15  Selected       .3333     .1361           8          4
  16  Selected       .2500     .1250           9          3
  27  Selected       .1667     .1076          10          2
  30  Selected       .0833     .0798          11          1
  32  Selected       .0000     .0000          12          0

Number of Cases: 12    Censored: 0 ( .00%)    Events: 12

         Survival Time  Standard Error  95% Confidence Interval
Mean:               14               3  ( 8, 20 )
Median:             10               5  ( 1, 19 )

Percentiles      25.00   50.00   75.00
Value            16.00   10.00    6.00
Standard Error    9.00    4.62    2.45
15
Kaplan-Meier estimate of the survival distribution

This plot is the Kaplan-Meier estimate of the
hemophiliac-patient survival distribution
corresponding to the previous output.

16
Censoring

When failure has not been observed, then the only
information from the data is that the failure
time is no less than the time of the last
available observation (e.g., clinical visit).
This is easily incorporated into the estimation
procedure.
For example, consider the following data, where subjects 2 and 6 completed observation without failure at months 3 and 10 (censor = 0 means censoring).

17
Life table method in the presence of censoring

To carry out the life-table estimate of the survival distribution when the data include censored observations, we take into account the number of censored observations in interval j.
$c_j$ is the number of censored observations in interval j.
Since we do not know exactly when the censoring occurred, we have the following options for calculating the number of individuals at risk in interval j, depending on when the censoring is assumed to happen:
at the beginning of the interval (so the number at risk at the beginning of interval j is $r'_j = r_j - c_j$)
at the end of the interval (so the number at risk is $r'_j = r_j$)
at the middle of the interval (assuming that censoring happens uniformly through the interval, so $r'_j = r_j - c_j/2$).
The latter case is called the actuarial estimator of survival.
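
The three options differ only in how many censored subjects are removed from the number at risk. The following is a small sketch of that choice, assuming an interval with 12 subjects entering, 1 censored and 1 failing; the function name `effective_at_risk` and the option labels are hypothetical names used only for illustration.

```python
def effective_at_risk(r, c, censoring="middle"):
    """Effective number at risk r'_j for an interval with r entering and c censored."""
    if censoring == "start":     # all censoring assumed at the start of the interval
        return r - c
    if censoring == "end":       # all censoring assumed at the end of the interval
        return float(r)
    if censoring == "middle":    # actuarial convention: censoring uniform over the interval
        return r - c / 2
    raise ValueError(f"unknown censoring convention: {censoring!r}")


# Illustrative interval: 12 subjects enter, 1 is censored, 1 fails.
r, c, d = 12, 1, 1
for option in ("start", "end", "middle"):
    n = effective_at_risk(r, c, option)
    print(f"{option:>6}: r' = {n:4.1f}, q = {d / n:.4f}")
```

With a single censored subject the three conventions give estimates of q that differ only in the third decimal place; the gap widens as the share of censored observations in an interval grows.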

18
Derivation of the life-table method

To calculate the life-table estimate for the first two intervals (0 to 5 and 5 to 10 months) in our example we proceed as follows:
There is one failure and one censored observation in the first interval (j = 1, i.e., between 0 and 5 months). Assuming that the censoring happened at the midpoint of the interval (actuarial survival), the (effective) number at risk is $r'_1 = r_1 - c_1/2 = 12 - 1/2 = 11.5$.
Thus, $q_1 = 1/11.5 = 0.0870$, so $S(1) = 1 - q_1 = 0.9130$.
For the second interval (j = 2, time between 5 and 10 months) three failures occurred with no censoring; thus, after removing the first failure and the censored observation, $r'_2 = r_2 = 10$ and $q_2 = 3/10 = 0.3000$, so
$$S(2) = S(1)\,(1 - q_2) = 0.9130 \times 0.7000 = 0.6391$$

19
Analysis via the life-table method

Life Table. Survival Variable: survival

Interval  Number    Number     Exposed  Terminal  Proportion   Proportion  Cumulative  Probability  Hazard
start     entering  withdrawn  to risk  events    terminating  surviving   surviving   density      rate
--------  --------  ---------  -------  --------  -----------  ----------  ----------  -----------  ------
      .0      12.0        1.0     11.5       1.0        .0870       .9130       .9130        .0174   .0182
     5.0      10.0         .0     10.0       3.0        .3000       .7000       .6391        .0548   .0706
    10.0       7.0        1.0      6.5        .0        .0000      1.0000       .6391        .0000   .0000
    15.0       6.0         .0      6.0       3.0        .5000       .5000       .3196        .0639   .1333
    20.0       3.0         .0      3.0        .0        .0000      1.0000       .3196        .0000   .0000
    25.0       3.0         .0      3.0       1.0        .3333       .6667       .2130        .0213   .0800
    30.0       2.0         .0      2.0       2.0       1.0000       .0000       .0000

These calculations for the last interval are meaningless.
The median survival time for these data is 17.18

20
Life table estimate of the survival distribution
in the presence of censoring
21
K-M estimate in the presence of censoring

Consider how censoring is handled in the K-M
procedure
In the first interval (time 0-2) one out of 12 individuals fails at 2 months, so that the estimate of survival at t = 2 is $S(2) = 11/12 = 0.9167$.
No one fails at t = 3 months (second interval).
At t = 6 months two of the remaining ten subjects fail (since one subject was censored at 3 months and is no longer part of the at-risk sample at six months), so $q_6 = 2/10 = 0.2000$ is the probability of failure at t = 6 months. The estimate of the survival distribution is
$$S(6) = S(2)\,(1 - q_6) = 0.9167\,(1 - 0.2000) = 0.7333$$
So, censored observations are present up to the
interval where they are censored and disappear
after that.
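
The rule just described can be sketched in a few lines. This is a minimal illustration, assuming the twelve under-40 observation times with the subjects at months 3 and 10 censored (status 0); the function name `kaplan_meier` is illustrative and the code is not the SPSS implementation.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct failure time.

    events[i] is 1 if subject i failed at times[i], 0 if censored there.
    """
    failures = Counter(t for t, e in zip(times, events) if e == 1)
    survival = 1.0
    curve = []
    for t in sorted(failures):
        # Censored subjects count as at risk up to their censoring time,
        # then drop out of the at-risk set.
        n_at_risk = sum(1 for u in times if u >= t)
        survival *= 1 - failures[t] / n_at_risk
        curve.append((t, survival))
    return curve


times = [2, 3, 6, 6, 7, 10, 15, 15, 16, 27, 30, 32]
events = [1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1]   # 0 marks censoring

curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"S({t}) = {s:.4f}")
```

The curve has no step at months 3 and 10, because a censored time changes only the size of the at-risk set at later failure times.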

22
Kaplan-Meier estimate with censored observations

Survival Analysis for survival

Time  Status  Cumulative  Standard  Cumulative  Number
              Survival    Error     Events      Remaining
----  ------  ----------  --------  ----------  ---------
   2    1.00       .9167     .0798           1         11
   3     .00                                 1         10
   6    1.00                                 2          9
   6    1.00       .7333     .1324           3          8
   7    1.00       .6417     .1441           4          7
  10     .00                                 4          6
  15    1.00                                 5          5
  15    1.00       .4278     .1565           6          4
  16    1.00       .3208     .1495           7          3
  27    1.00       .2139     .1325           8          2
  30    1.00       .1069     .1005           9          1
  32    1.00       .0000     .0000          10          0

23
The K-M plot with censoring

The K-M estimate of the survival distribution in
the presence of censoring is as shown in the
figure.

24
Testing

Compare the survival curves of hemophiliacs contracting AIDS above 40 years of age with those of hemophiliacs contracting AIDS before 40 years of age.

25
Survival distribution of >40 year-olds

Survival Analysis for survival
Factor age = >40

Time  Status  Cumulative  Standard  Cumulative  Number
              Survival    Error     Events      Remaining
----  ------  ----------  --------  ----------  ---------
   1    1.00                                 1          8
   1    1.00                                 2          7
   1    1.00                                 3          6
   1    1.00       .5556     .1656           4          5
   2    1.00       .4444     .1656           5          4
   3    1.00                                 6          3
   3    1.00       .2222     .1386           7          2
   9    1.00       .1111     .1048           8          1
  22    1.00       .0000     .0000           9          0

Number of Cases: 9    Censored: 0 ( .00%)    Events: 9

26
Comparing two survival distributions
27
The log-rank test

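The log-rank test evaluates the null hypothesis
$H_0: S_{<40}(t) = S_{>40}(t)$ versus the alternative
$H_a: S_{<40}(t) \neq S_{>40}(t)$.
The test is based on a statistic that compares the observed and expected numbers of deaths in each group, where, for each failure time $t_j$ and group $i = 1, 2$, the expected number of deaths is
$$E_{ij} = d_j \, \frac{Y_i(t_j)}{Y(t_j)},$$
where $d_j$ is the number of deaths at $t_j$, $Y(t)$ is the number at risk (alive) at time $t$, $Y_1(t)$ and $Y_2(t)$ are the numbers at risk in groups 1 and 2 respectively, and
$$E_i = \sum_j E_{ij} \quad \text{and} \quad O_i = \sum_j d_{ij}$$
(with $d_{ij}$ the number of deaths in group $i$ at $t_j$) are the total numbers of expected and observed deaths.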
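
The observed and expected counts can be accumulated over the failure times as sketched below. This is an illustration under stated assumptions: it uses the hypergeometric variance, a common form of the log-rank statistic that may differ from the formula on the original slide, and it takes as input the uncensored failure times printed in the two product-limit outputs. It is not claimed to reproduce the SPSS figure, since the exact input and variance convention behind that output are not shown. The function name `log_rank` is illustrative.

```python
def log_rank(times1, events1, times2, events2):
    """Two-group log-rank chi-square (1 df) using the hypergeometric variance.

    Returns (O1, E1, O2, E2, chi2); events are 1 for death, 0 for censored.
    """
    pooled = list(zip(times1, events1)) + list(zip(times2, events2))
    death_times = sorted({t for t, e in pooled if e == 1})
    o1 = e1 = o2 = e2 = var = 0.0
    for t in death_times:
        y1 = sum(1 for u in times1 if u >= t)   # at risk in group 1
        y2 = sum(1 for u in times2 if u >= t)   # at risk in group 2
        y = y1 + y2
        d1 = sum(1 for u, e in zip(times1, events1) if u == t and e == 1)
        d2 = sum(1 for u, e in zip(times2, events2) if u == t and e == 1)
        d = d1 + d2
        o1 += d1
        o2 += d2
        e1 += d * y1 / y                        # expected deaths, group 1
        e2 += d * y2 / y                        # expected deaths, group 2
        if y > 1:
            var += d * (y1 / y) * (y2 / y) * (y - d) / (y - 1)
    chi2 = (o1 - e1) ** 2 / var
    return o1, e1, o2, e2, chi2


under40 = [2, 3, 6, 6, 7, 10, 15, 15, 16, 27, 30, 32]
over40 = [1, 1, 1, 1, 2, 3, 3, 9, 22]
o1, e1, o2, e2, chi2 = log_rank(under40, [1] * 12, over40, [1] * 9)
print(f"<40: observed {o1:.0f}, expected {e1:.2f}")
print(f">40: observed {o2:.0f}, expected {e2:.2f}")
print(f"chi-square = {chi2:.2f}")
```

A group with fewer observed than expected deaths is the one with better survival, which gives the direction of the difference.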

28
The log-rank test with SPSS

The SPSS output for the log-rank test is as follows:

Test Statistics for Equality of Survival Distributions for age

           Statistic    df    Significance
Log Rank        7.61     1           .0058

Since p = 0.006 < 0.05, there is a statistically significant difference in survival between the two groups.
Since the log-rank test is two-sided, we must check the median survival time to see the direction of the difference (here it is the younger, <40 year-old patients who survive longer).
