Title: Nonparametric maximum likelihood estimation (MLE) for bivariate censored data
1Nonparametric maximum likelihood estimation (MLE) for bivariate censored data
- Marloes H. Maathuis
- Advisors: Piet Groeneboom and Jon A. Wellner
2Motivation
- Estimate the distribution function of the incubation period of HIV/AIDS
- Nonparametrically
- Based on censored data
- Time of HIV infection is interval censored
- Time of onset of AIDS is interval censored or right censored
3Approach
- Use MLE to estimate the bivariate distribution
- Integrate over diagonal strips: P(Y - X ≤ z)
[Figure: diagonal strip in the (X, Y) plane, with axes X (HIV) and Y (AIDS); z marked]
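A minimal sketch of the integration step, assuming the bivariate MLE is given as probability masses on rectangles (x1, x2, y1, y2). Because the MLE does not say how mass is spread inside a rectangle, the sketch only returns lower and upper bounds for P(Y - X ≤ z). The function name and the data layout are illustrative assumptions, not taken from the slides.

```python
def strip_probability_bounds(rects, masses, z):
    """Bounds for P(Y - X <= z) when mass sits on rectangles.

    rects  : list of (x1, x2, y1, y2) with x1 <= x2 and y1 <= y2 (assumed layout)
    masses : list of nonnegative masses, one per rectangle, summing to 1
    """
    lower = 0.0
    upper = 0.0
    for (x1, x2, y1, y2), p in zip(rects, masses):
        # Largest value of y - x on the rectangle is y2 - x1:
        # if even that is <= z, the whole rectangle lies in the region.
        if y2 - x1 <= z:
            lower += p
        # Smallest value of y - x on the rectangle is y1 - x2:
        # if that is <= z, the rectangle touches the region.
        if y1 - x2 <= z:
            upper += p
    return lower, upper


# Hypothetical example: two rectangles carrying mass 0.6 and 0.4.
bounds = strip_probability_bounds([(0, 1, 1, 2), (0, 1, 3, 4)], [0.6, 0.4], z=2)
```

The gap between the two bounds reflects the fact that the location of mass inside each rectangle is not determined by the likelihood.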
4Main focus of the project
- MLE for bivariate censored data
- Computational aspects
- (In)consistency and methods to repair the inconsistency
6[Figure: observation rectangle in the (X, Y) plane. X (HIV): interval of HIV infection, axis ticks 1980, 1983, 1986. Y (AIDS): interval of onset of AIDS, axis ticks 1980, 1992, 1996.]
15[Figure: maximal intersections with assigned masses; visible labels 3/5, 0, 0]
The a_i's are not always uniquely determined: mixture non-uniqueness
16Computation of the MLE
- Reduction step
- determine the maximal intersections
- Optimization step
- determine the amounts of mass assigned to the maximal intersections
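As an illustration of the optimization step, here is a minimal sketch using the EM (self-consistency) iteration. This is one standard way to compute the masses and is not necessarily the method used in the project. The input is assumed to be a 0/1 matrix `clique`, where `clique[i][j]` is 1 if maximal intersection j lies inside observation rectangle i; the names are hypothetical.

```python
def em_masses(clique, n_iter=200):
    """EM iteration for the masses on the maximal intersections.

    clique : list of rows, one per observation; clique[i][j] is 1 if
             maximal intersection j is contained in observation i.
             Every row is assumed to contain at least one 1.
    """
    n = len(clique)
    m = len(clique[0])
    alpha = [1.0 / m] * m  # start from the uniform distribution
    for _ in range(n_iter):
        new = [0.0] * m
        for row in clique:
            # Probability of observation i under the current masses.
            denom = sum(a for a, c in zip(alpha, row) if c)
            # Split one unit of weight over the intersections in this row.
            for j, c in enumerate(row):
                if c:
                    new[j] += alpha[j] / denom
        alpha = [v / n for v in new]
    return alpha


# Hypothetical example: the third observation contains both intersections.
masses = em_masses([[1, 0], [1, 0], [1, 1]])
```

EM is simple but can converge slowly, and when the masses are not uniquely determined it returns just one of the maximizers.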
18Existing reduction algorithms
- Betensky and Finkelstein (1999, Stat. in Medicine)
- Gentleman and Vandal (2001, JCGS)
- Song (2001, Ph.D. thesis)
- Bogaerts and Lesaffre (2003, Tech. report)
- The first three algorithms are very slow; the last algorithm is of complexity O(n^3).
19New algorithms
- Tree algorithm
- Height map algorithm
- based on the idea of a height map of the observation rectangles
- very simple
- very fast: O(n^2)
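20Height map algorithm O(n^2)
[Figure: height map of the observation rectangles on a 9 x 9 grid; each value is the number of observation rectangles covering that cell]
1 1 1 1 0 0 0 0 1
1 2 2 1 0 0 0 0 2
1 3 3 2 1 1 1 0 2
1 3 3 2 1 2 1 0 2
1 2 2 1 0 1 0 0 2
0 1 1 0 0 1 0 0 1
0 1 2 1 1 2 1 1 1
0 0 1 1 1 2 1 1 0
0 0 0 0 0 0 1 0 0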
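The height map idea can be sketched as follows. This is a simplified illustration, not the exact algorithm from the talk: it assumes all x endpoints are distinct and all y endpoints are distinct, builds the height map on the grid of sorted endpoints with a 2D difference array, and reports the local maxima (plateaus with no higher neighbor) as maximal intersections. Function names and the rectangle layout (x1, x2, y1, y2) are assumptions made for the example.

```python
from collections import deque


def height_map(rects):
    """Height of each grid cell = number of rectangles covering it."""
    xs = sorted(v for r in rects for v in (r[0], r[1]))
    ys = sorted(v for r in rects for v in (r[2], r[3]))
    xr = {v: i for i, v in enumerate(xs)}
    yr = {v: i for i, v in enumerate(ys)}
    m = len(xs) - 1  # cells per axis: 2n - 1
    diff = [[0] * (m + 1) for _ in range(m + 1)]
    for x1, x2, y1, y2 in rects:
        a, b, c, d = xr[x1], xr[x2], yr[y1], yr[y2]
        # The rectangle covers cells a..b-1 (x) and c..d-1 (y).
        diff[a][c] += 1
        diff[b][c] -= 1
        diff[a][d] -= 1
        diff[b][d] += 1
    h = [[0] * m for _ in range(m)]
    for i in range(m):
        for j in range(m):
            v = diff[i][j]
            if i > 0:
                v += h[i - 1][j]
            if j > 0:
                v += h[i][j - 1]
            if i > 0 and j > 0:
                v -= h[i - 1][j - 1]
            h[i][j] = v  # 2D prefix sum of the difference array
    return xs, ys, h


def maximal_intersections(rects):
    """Local maxima of the height map, returned as (x1, x2, y1, y2)."""
    xs, ys, h = height_map(rects)
    m = len(h)
    seen = [[False] * m for _ in range(m)]
    result = []
    for i in range(m):
        for j in range(m):
            if seen[i][j] or h[i][j] == 0:
                continue
            # Flood-fill the plateau of equal height containing (i, j).
            queue = deque([(i, j)])
            seen[i][j] = True
            cells, is_max = [], True
            while queue:
                p, q = queue.popleft()
                cells.append((p, q))
                for u, v in ((p + 1, q), (p - 1, q), (p, q + 1), (p, q - 1)):
                    if 0 <= u < m and 0 <= v < m:
                        if h[u][v] > h[i][j]:
                            is_max = False  # a higher neighbor exists
                        elif h[u][v] == h[i][j] and not seen[u][v]:
                            seen[u][v] = True
                            queue.append((u, v))
            if is_max:
                i0, i1 = min(p for p, _ in cells), max(p for p, _ in cells)
                j0, j1 = min(q for _, q in cells), max(q for _, q in cells)
                result.append((xs[i0], xs[i1 + 1], ys[j0], ys[j1 + 1]))
    return result


# Hypothetical example: two overlapping rectangles and one isolated rectangle.
found = maximal_intersections([(0, 2, 0, 2), (1, 3, 1, 3), (4, 5, 4, 5)])
```

Under the distinct-endpoint assumption both passes visit each of the O(n^2) grid cells a bounded number of times, which matches the O(n^2) order stated on the slide. Ties between endpoints and open versus closed boundaries would need extra care that this sketch omits.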
22Main focus of the project
- MLE of bivariate censored data
- Computational aspects
- (In)consistency and methods to repair the inconsistency
23Time of HIV infection is interval censored (case 2)
[Figure: axes HIV and AIDS]
26Time of onset of AIDS is right censored
[Figure: axes HIV and AIDS; the observed time is t = min(c, y)]
29[Figure: axes HIV and AIDS, with points u1 and u2 marked]
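To make the censoring scheme concrete, the following sketch builds the observation set for one subject. It assumes that u1 < u2 are the two inspection times of interval censoring case 2 for the HIV infection time X, that t = min(c, y) is the observed AIDS time, and that an indicator says whether the onset of AIDS was actually observed. The function name, argument names and the tuple layout are hypothetical.

```python
import math


def observation_set(u1, u2, region, t, aids_observed):
    """Return (x_lo, x_hi, y_lo, y_hi) for one subject.

    region        : 0 if X <= u1, 1 if u1 < X <= u2, 2 if X > u2
    t             : observed time min(c, y)
    aids_observed : True if y <= c, i.e. the onset of AIDS was seen
    """
    x_intervals = [(0.0, u1), (u1, u2), (u2, math.inf)]
    x_lo, x_hi = x_intervals[region]
    if aids_observed:
        # Y is known exactly, so the set is a line segment at height t.
        y_lo, y_hi = t, t
    else:
        # Y is right censored: all we know is Y > t.
        y_lo, y_hi = t, math.inf
    return (x_lo, x_hi, y_lo, y_hi)


def is_line_segment(obs):
    """True if the observation set has zero height in the Y direction."""
    return obs[2] == obs[3]


# Hypothetical subject: infected between the two inspections, AIDS observed at 9.5.
obs = observation_set(u1=2.0, u2=5.0, region=1, t=9.5, aids_observed=True)
```

Under these assumptions, subjects whose onset of AIDS is observed contribute line segments rather than rectangles with positive area.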
33Inconsistency of the naive MLE
37Methods to repair inconsistency
- Transform the lines into strips
- MLE on a sieve of piecewise constant densities
- Kullback-Leibler approach
38X = time of HIV infection, Y = time of onset of AIDS, Z = Y - X = incubation period
- [formula missing in transcript] cannot be estimated consistently
39X = time of HIV infection, Y = time of onset of AIDS, Z = Y - X = incubation period
- An example of a parameter we can estimate consistently is [formula missing in transcript]
40Conclusions (1)
- Our algorithms for the parameter reduction step are significantly faster than other existing algorithms.
- We proved that in general the naive MLE is an inconsistent estimator for our AIDS model.
41Conclusions (2)
- We explored several methods to repair the inconsistency of the naive MLE.
- [formula missing in transcript] cannot be estimated consistently without additional assumptions. An alternative parameter that we can estimate consistently is [formula missing in transcript].
42Acknowledgements
- Piet Groeneboom
- Jon Wellner