Comparison of Recalibration Techniques for Logistic Regression in Interventional Cardiology

1
Comparison of Recalibration Techniques for
Logistic Regression in Interventional Cardiology

Michael E. Matheny, MD
HST 951 Final Presentation

2
Background

Risk models are evaluated for accuracy in two
categories:
Discrimination: the ability of a model to separate
data with respect to values of an outcome
variable
Measured by the Area Under the Receiver
Operating Characteristic (ROC) Curve (AUC)
Calibration: the ability of a model to accurately
predict risk for individuals or small subgroups
of the population
Multiple measurements: Hosmer-Lemeshow
Goodness-of-Fit, Brier Score, Calibration Slope,
etc.
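
The two categories can be illustrated with a minimal Python sketch. The function names and the toy data are hypothetical and are not taken from the study; the AUC is computed as the share of (event, non-event) pairs ranked correctly, and the Brier score as the mean squared error of the predicted probabilities.

```python
def auc(probs, outcomes):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos = [p for p, y in zip(probs, outcomes) if y == 1]
    neg = [p for p, y in zip(probs, outcomes) if y == 0]
    wins = 0.0
    for p_event in pos:
        for p_nonevent in neg:
            if p_event > p_nonevent:
                wins += 1.0
            elif p_event == p_nonevent:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def brier_score(probs, outcomes):
    """Mean squared difference between predicted risk and outcome (0 or 1)."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


# Hypothetical toy data: four cases, two with the outcome.
toy_probs = [0.1, 0.2, 0.3, 0.8]
toy_outcomes = [0, 0, 1, 1]
toy_halved = [p / 2 for p in toy_probs]  # same ranking, different risk level

print(auc(toy_probs, toy_outcomes), brier_score(toy_probs, toy_outcomes))
print(auc(toy_halved, toy_outcomes), brier_score(toy_halved, toy_outcomes))
```

Halving every prediction leaves the ranking, and so the AUC, unchanged while the Brier score gets worse, which is why a model can keep its discrimination and still lose calibration.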

3
Background

Risk Modeling techniques are generally able to
perform well in terms of discrimination and
calibration on local development and test data
Performance Depends on
Data collection quality (data noise level)
Identification of relevant risk factors for an
outcome
Time delay to realization of an outcome

4
Background

A model is most useful when it can be
successfully applied to all patients in that
domain
External validation of these models in multiple
medical domains with various risk modeling
techniques has produced consistent results:
Discrimination is preserved
Calibration tends to fail

5
Background

Multiple Reasons for Calibration Failure
Problems related to location/medical center
Different patient demographics / case-mix
Different outcome event rates
Possibly different data element definitions
Problems related to time
Changes in the standard of medical care

6
Background

Various recalibration methods have been applied
to adapt a risk model to local conditions
Outcome Scaling: adjusting the model result by the
ratio of outcome event rates between the new data
and the original model
Model Refitting: applying a new model to the result
of the original model, or including the result of
the original model as a covariate in the new model
Remodeling: fitting a new model using the same
covariates

7
Background

These techniques have been variably successful in
improving calibration for local populations
Relative performance of these techniques has not
been well-described in the literature
Application of these techniques over multiple
consecutive time periods of data for a population
has not been reported

8
Background

Logistic Regression is the most common risk
modeling technique used in medicine
Interventional Cardiology
High Data Quality (National Data Element
Standard)
Many large published risk models
Risk factors for outcome are well-known
Access

9
Purpose

The purpose of this study was to evaluate
well-known recalibration methods for Logistic
Regression over multiple periods of time to
compare the relative performance of each method
in the domain of Interventional Cardiology.

10
Methods: Source Data

Brigham and Women's Hospital
720 Bed Academic Teaching Hospital
Interventional Cardiology Suites
Electronic Data Collection
Compliant with National Data Element Standard
State mandated data collection for every case

11
Methods: Source Data

All PCI cases performed from January 01, 2002 to
December 31, 2004 were included
The outcome of interest was post-procedural
in-hospital mortality
A separate data set was created for each year of cases

12
Methods: Source Data

Year  Cases  Mortality (%)
2002  1947   15 (0.8)
2003  1841   33 (1.8)
2004  1767   33 (1.9)

13
Methods: Data Collection

The most well-known LR risk models were utilized
for the evaluation (event rate)
American College of Cardiology (ACC): 707/50123 (1.4%)
Northern New England (NNE): 165/15331 (1.1%)
Cleveland Clinic (CCL): 169/12985 (1.3%)
University of Michigan (MIC): 169/10796 (1.6%)

14
Methods: Data Collection

All statistical evaluations were performed with SAS
9.1 (Cary, NC)
Discrimination was measured by the Area Under the
Receiver Operating Characteristic curve

15
Methods: Study Data

Three calibration evaluations
Hosmer-Lemeshow Goodness-of-Fit
Brier Score / Spiegelhalter Z Score
Calibration Plot (Intercept/Slope)
Graphical Only
Based on risk deciles in HL GOF algorithm
For each recalibration, the prior year was used
to recalibrate (2002 -> 2003, 2003 -> 2004)
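
As an illustration of the first two evaluations, the sketch below computes a Hosmer-Lemeshow chi-square over risk-ordered groups and the Spiegelhalter Z statistic in their usual textbook forms. It is not the study's SAS code: the function names, the equal-size grouping rule, and the toy data are assumptions made for the example.

```python
import math


def hosmer_lemeshow(probs, outcomes, groups=10):
    """Hosmer-Lemeshow chi-square over risk-ordered groups (deciles by default)."""
    pairs = sorted(zip(probs, outcomes), key=lambda t: t[0])
    n = len(pairs)
    chi2 = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        expected = sum(p for p, _ in chunk)
        observed = sum(y for _, y in chunk)
        # Binomial variance of the group: n_g * mean_p * (1 - mean_p).
        variance = expected * (1.0 - expected / len(chunk))
        if variance > 0:
            chi2 += (observed - expected) ** 2 / variance
    return chi2


def spiegelhalter_z(probs, outcomes):
    """Spiegelhalter Z; negative values indicate over-prediction of risk."""
    numerator = sum((y - p) * (1 - 2 * p) for p, y in zip(probs, outcomes))
    variance = sum((1 - 2 * p) ** 2 * p * (1 - p) for p in probs)
    return numerator / math.sqrt(variance)


# Hypothetical toy data: 10 low-risk cases (1 event), 10 high-risk cases (5 events).
toy_outcomes = [1] + [0] * 9 + [1] * 5 + [0] * 5
well_calibrated = [0.1] * 10 + [0.5] * 10
over_predicting = [0.3] * 10 + [0.7] * 10

print(hosmer_lemeshow(well_calibrated, toy_outcomes, groups=2))
print(hosmer_lemeshow(over_predicting, toy_outcomes, groups=2))
print(spiegelhalter_z(over_predicting, toy_outcomes))
```

With ten groups the Hosmer-Lemeshow statistic is conventionally compared against a chi-square distribution with 8 degrees of freedom, and the Z statistic against the standard normal.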

16
Methods: Post-Score Scaling (PSY)

At the case level, model results are scaled by
the following equation

P(PSY) can exceed 1 for some cases when
ObservedEventRate > ModelEventRate, and these
values are truncated to 1
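
The scaling equation itself is not preserved in this transcript. The sketch below assumes the form suggested by the Outcome Scaling description in the Background: each case's probability is multiplied by ObservedEventRate / ModelEventRate and truncated to 1. The function name is hypothetical, and reading ObservedEventRate as the prior year's local rate and ModelEventRate as the published model's development rate is an assumption.

```python
def post_score_scale(probs, observed_event_rate, model_event_rate):
    """Scale each predicted probability by the event-rate ratio, capped at 1.

    Assumed form: P(PSY) = min(1, P(model) * observed / model).
    """
    ratio = observed_event_rate / model_event_rate
    return [min(1.0, p * ratio) for p in probs]


# Event rates taken from the slides: 2002 local data and the ACC model.
observed_2002 = 15 / 1947
acc_model_rate = 707 / 50123
ratio = observed_2002 / acc_model_rate

# Scaling the ACC expected count for 2003 without recalibration (414)
# gives roughly the expected count reported after PSY.
print(round(414 * ratio))

# A high-risk case scaled by a ratio above 1 is truncated to 1.
print(post_score_scale([0.9], 0.018, 0.011))
```

Under these assumptions the ratio is about 0.55, so every ACC prediction for 2003 is roughly halved.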

17
Methods: LR Intercept Scaling (IntY)

In the general LR equation, B0 is the intercept
This term represents the baseline risk: the
log-odds of the outcome in the absence of all
other risk factors
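
For reference, the general LR equation in its standard form is P = 1 / (1 + exp(-(B0 + B1*X1 + ... + Bn*Xn))). The sketch below evaluates it; the function name and the coefficient values are hypothetical and do not come from any of the four published models.

```python
import math


def lr_probability(intercept, coefficients, features):
    """Standard logistic regression: P = 1 / (1 + exp(-(B0 + sum(Bi * Xi))))."""
    linear = intercept + sum(b * x for b, x in zip(coefficients, features))
    return 1.0 / (1.0 + math.exp(-linear))


# Hypothetical coefficients for a model with two binary risk factors.
b0 = -5.0
betas = [1.2, 0.8]

baseline_risk = lr_probability(b0, betas, [0, 0])  # no risk factors present
with_first_factor = lr_probability(b0, betas, [1, 0])

print(baseline_risk, with_first_factor)
```

With every risk factor set to zero, the predicted probability depends on B0 alone, which is the sense in which the intercept carries the baseline risk.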

18
Methods: LR Intercept Scaling (IntY)

The proportion of risk contributed by the
intercept (baseline) can be calculated for a data
set by

19
Methods: LR Intercept Scaling (IntY)

The proportion of risk (RiskInt(%)) is multiplied
by the observed event rate, and converted back to
a Beta Coefficient from a probability

If ObsEventRate(New) > ObsEventRate(Old) then the
probability can exceed 1, and is truncated to 1.
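
The conversion between a probability and a Beta coefficient is the logit transform and its inverse. The sketch below shows only that step; it does not reconstruct the RiskInt(%) formula, which is not preserved here. The function names and the clamping constant are assumptions; the clamp is needed because a probability of exactly 1 has no finite log-odds.

```python
import math


def probability_to_beta(p, eps=1e-9):
    """Convert a probability to a log-odds coefficient (logit).

    eps is a hypothetical clamp keeping p strictly inside (0, 1).
    """
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))


def beta_to_probability(beta):
    """Convert a log-odds coefficient back to a probability."""
    return 1.0 / (1.0 + math.exp(-beta))


# Round trip on a hypothetical baseline probability of 2%.
new_intercept = probability_to_beta(0.02)
print(new_intercept, beta_to_probability(new_intercept))
```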

20
Methods: Recalibration Methods

LR Model Refitting (SigY)
In this method, the output probability of the
original LR equation is used to model a new LR
equation with that output as the only covariate
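
A minimal sketch of this refitting step: a two-parameter logistic regression, logit(P_new) = a + b * P_orig, fitted by Newton-Raphson in plain Python. The study used SAS; the function names, iteration settings, and toy data here are hypothetical. In the study's design the fit would use the prior year's cases and be applied to the following year.

```python
import math


def fit_refit_model(orig_probs, outcomes, iterations=25):
    """Fit logit(P_new) = a + b * P_orig by Newton-Raphson; returns (a, b)."""
    a, b = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(orig_probs, outcomes):
            p = 1.0 / (1.0 + math.exp(-(a + b * x)))
            w = p * (1.0 - p)
            g0 += y - p          # gradient with respect to a
            g1 += (y - p) * x    # gradient with respect to b
            h00 += w             # information matrix entries
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        if abs(det) < 1e-12:
            break
        # Newton step: solve the 2x2 system H * delta = g.
        da = (h11 * g0 - h01 * g1) / det
        db = (h00 * g1 - h01 * g0) / det
        a += da
        b += db
        if abs(da) + abs(db) < 1e-10:
            break
    return a, b


def apply_refit_model(a, b, orig_probs):
    """Recalibrated probabilities from the original model outputs."""
    return [1.0 / (1.0 + math.exp(-(a + b * x))) for x in orig_probs]


# Hypothetical toy data: original model outputs and observed outcomes.
toy_probs = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
toy_outcomes = [0, 0, 1, 0, 0, 1, 0, 1]

a, b = fit_refit_model(toy_probs, toy_outcomes)
recalibrated = apply_refit_model(a, b, toy_probs)
print(a, b, sum(recalibrated))
```

Because the refit model includes an intercept, the recalibrated probabilities sum to the observed number of events on the data used for fitting.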

21
Results: ROC with 95% Confidence Intervals

22
Results: No Recalibration

Year  Model  Obs  Exp   HL Chi2  Spieg Z
2003  ACC    33   414   634      -11.4
2003  NNE    33   39.0  24.3     0.08
2003  MIC    33   27.2  6.6      1.51
2003  CCL    33   56.3  14.0     -3.49
2004  ACC    33   418   641      -11.8
2004  NNE    33   36.6  51.0     0.41
2004  MIC    33   23.3  22.9     1.99
2004  CCL    33   60.3  21.2     -3.78

23
Results: Post-Scale (PSY) Recalibration

Year  Model  Obs  Exp   HL Chi2  Spieg Z
2003  ACC    33   226   210      -13.6
2003  NNE    33   27.9  32.8     1.43
2003  MIC    33   13.4  40.4     5.63
2003  CCL    33   33.3  5.8      -0.74
2004  ACC    33   524   1233     -4.91
2004  NNE    33   58.9  44.7     -1.14
2004  MIC    33   26.7  18.0     1.26
2004  CCL    33   82.9  41.0     -4.79

24
Results: 2003 PSY vs None

25
Results: 2004 PSY vs None

26
Results: LR Intercept Scaling (IntY) Recalibration

Year  Model  Obs  Exp   HL Chi2  Spieg Z
2003  ACC    33   45.1  10.0     -2.20
2003  NNE    33   26.0  43.6     2.52
2003  MIC    33   22.1  12.7     2.78
2003  CCL    33   24.8  10.5     1.25
2004  ACC    33   34.1  14.6     -0.90
2004  NNE    33   28.9  69.8     1.82
2004  MIC    33   26.5  17.6     1.22
2004  CCL    33   33.5  14.2     -0.50

27
Results: 2003 IntY vs None

28
Results: 2004 IntY vs None

29
Results: LR Refitting (SigY) Recalibration

Year  Model  Obs  Exp   HL Chi2  Spieg Z
2003  ACC    33   24.0  12.7     1.16
2003  NNE    33   18.6  32.9     4.14
2003  MIC    33   20.1  24.0     4.56
2003  CCL    33   25.5  15.2     2.18
2004  ACC    33   32.0  35.7     -0.47
2004  NNE    33   31.2  21.7     1.00
2004  MIC    33   31.0  23.6     0.27
2004  CCL    33   31.6  13.2     0.84

30
Results: 2003 SigY vs None

31
Results: 2004 SigY vs None

32
Conclusions

All 4 models failed to maintain calibration on
the data without recalibration
The two calibration measures used (HL and Brier)
commonly disagreed
If a model is considered recalibrated only
if both measures showed calibration, then:
Best was Intercept adjustment (IntY): 3 / 8
2nd was LR Refitting (SigY): 2 / 8
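
The tallies can be checked against the results tables with a short sketch. The pass thresholds are assumptions, since the slides do not state them: an HL chi-square below 15.507 (8 degrees of freedom, alpha = 0.05) and |Spiegelhalter Z| below 1.96. With those thresholds the counts for IntY and SigY match the figures above.

```python
HL_CRITICAL = 15.507  # assumed: chi-square critical value, 8 df, alpha = 0.05
Z_CRITICAL = 1.96     # assumed: two-sided normal critical value, alpha = 0.05

# (HL chi-square, Spiegelhalter Z) for ACC, NNE, MIC, CCL in 2003, then 2004,
# copied from the results tables.
RESULTS = {
    "None": [(634, -11.4), (24.3, 0.08), (6.6, 1.51), (14.0, -3.49),
             (641, -11.8), (51.0, 0.41), (22.9, 1.99), (21.2, -3.78)],
    "PSY": [(210, -13.6), (32.8, 1.43), (40.4, 5.63), (5.8, -0.74),
            (1233, -4.91), (44.7, -1.14), (18.0, 1.26), (41.0, -4.79)],
    "IntY": [(10.0, -2.20), (43.6, 2.52), (12.7, 2.78), (10.5, 1.25),
             (14.6, -0.90), (69.8, 1.82), (17.6, 1.22), (14.2, -0.50)],
    "SigY": [(12.7, 1.16), (32.9, 4.14), (24.0, 4.56), (15.2, 2.18),
             (35.7, -0.47), (21.7, 1.00), (23.6, 0.27), (13.2, 0.84)],
}


def count_calibrated(rows):
    """Count model-years where both measures fail to reject calibration."""
    return sum(1 for hl, z in rows if hl < HL_CRITICAL and abs(z) < Z_CRITICAL)


counts = {method: count_calibrated(rows) for method, rows in RESULTS.items()}
print(counts)  # {'None': 1, 'PSY': 1, 'IntY': 3, 'SigY': 2}
```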

33
Limitations

A low event rate makes it more difficult to attain
statistical significance for results
The difference between the 2002 and 2003/2004 event
rates makes successful recalibration less likely in
2003 than in 2004.

34
Future Directions