1
Propensity Score Analyses: A Good-Looking Cousin of an RCT
KCASUG Q1, March 4, 2010

Kevin Kennedy, MS
Saint Luke's Hospital, Kansas City, MO
John House, MS
Saint Luke's Hospital, Kansas City, MO
Phil Jones, MS
Saint Luke's Hospital, Kansas City, MO

2
Motivation

Estimating treatment effect is important!
Is Drug A better than placebo?
Do same-sex classes increase academic
performance?
Do titanium golf clubs increase the distance of
drives?
Ways of answering these questions should be:
Ethical
Practical
Cost effective

3
The Gold Standard

Randomized Controlled Trials
Randomization of subjects to treatment groups
(essentially, a coin flip determines the group)
On average, all subject characteristics will be
balanced between groups

                            Treatment (n=100)   Control (n=100)   P-value
Age                         57 ± 3.2            57 ± 3.1          .78
Male (%)                    57                  58                .65
History Diabetes (%)        22                  22                .99
History Heart Failure (%)   8                   9                 .75
4
Benefits of an RCT

A pure link between Treatment and Outcome
Random allocation of subjects removes the
possibility of a third factor being associated
with treatment and outcome
Can blind subjects and researchers to treatment
allocation

5
Potential Caveats with an RCT

Ethical Issues
Not assigning subjects to a treatment generally
thought to improve outcomes is often thought
unethical
Practical Issues
Problems with recruitment of subjects
Consenting to alternatives, and substantial
drop out
Cost and Time Issues
Enrolling subjects, training staff, designing
trial, treatment
May be too controlled
Specific subject criteria and treatment use
Population may not represent the real world
experience

Spaar A, Frey M, Turk A, Karrer W, Puhan MA.
Recruitment barriers in a randomized controlled
trial from the physicians' perspective: a postal
survey. BMC Med Res Methodol. 2009 Mar 2;9:14.
6
So... what now?

Observational data is popular
Treatment is not assigned by randomization, only
observed
Unfortunately, subject characteristics will likely
not be balanced

                            Treatment (n=100)   Control (n=100)   P-value
Age                         57 ± 3.2            62 ± 5            .031
Male (%)                    57                  42                .047
History Diabetes (%)        22                  30                <.001
History Heart Failure (%)   8                   15                .035
7
So... what now?

Need to account for the differences between
treatment and control
Common in modeling to adjust away differences
between groups
However, sample size constraints restrict the
# of variables to adjust for
Solution: Propensity Scores

8
Propensity Score Outline

Introduction
How to use the score
Matching
Stratifying
Assessing Balance
Standardized Difference
Propensity Scores Using SAS
Concluding remarks
Other uses
Issues with publications

9
Introduction

Definition
Propensity score (PS): the conditional
probability of being treated given the
individual's covariates
Notation
e(x) = P(Z = 1 | X = x), where Z = 1 indicates
treatment and X is the vector of covariates
Estimating the propensity score can be done with the
common logistic regression model predicting
treatment from the selected covariates that need balancing
The PS will be used to balance characteristics between
groups
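To make the estimation step concrete, here is a minimal Python sketch (not part of the original SAS workflow) that fits a logistic regression by plain gradient ascent and turns the fit into propensity scores. The simulated data, the coefficients used to generate it, and the names `fit_logistic`, `sigmoid`, and `ps` are all assumptions made for illustration.

```python
import math
import random

def sigmoid(eta):
    return 1.0 / (1.0 + math.exp(-eta))

def fit_logistic(X, z, lr=0.5, n_iter=1000):
    """Fit P(Z=1 | x) by gradient ascent on the mean log-likelihood.
    Returns [intercept, coef_1, ..., coef_p]."""
    n, p = len(X), len(X[0])
    beta = [0.0] * (p + 1)
    for _ in range(n_iter):
        grad = [0.0] * (p + 1)
        for xi, zi in zip(X, z):
            eta = beta[0] + sum(b * v for b, v in zip(beta[1:], xi))
            resid = zi - sigmoid(eta)          # observed minus predicted
            grad[0] += resid
            for j, v in enumerate(xi):
                grad[j + 1] += resid * v
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta

# Hypothetical simulated data: standardized age and a diabetes flag,
# with older and diabetic subjects less likely to be treated.
rng = random.Random(2010)
X, z = [], []
for _ in range(300):
    age, diabetes = rng.gauss(0, 1), int(rng.random() < 0.3)
    X.append([age, diabetes])
    z.append(int(rng.random() < sigmoid(-0.5 - 0.8 * age - 0.5 * diabetes)))

beta = fit_logistic(X, z)
# The propensity score is the fitted probability of treatment for each subject.
ps = [sigmoid(beta[0] + beta[1] * a + beta[2] * d) for a, d in X]
print("coefficients:", [round(b, 2) for b in beta])
```

In practice a statistical package would do the fitting; the point here is only that the PS is the fitted probability from a model of treatment on the covariates.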

10
Introduction
                            Treatment (n=100)   Control (n=100)   P-value
Age                         57 ± 3.2            62 ± 5            .031
Male (%)                    57                  42                .047
History Diabetes (%)        22                  30                <.001
History Heart Failure (%)   8                   15                .035

Here we would develop a PS for being in the
treatment group conditioned on age, gender,
diabetes history, and heart failure
11
Introduction-why important?

Important: for a specific value of the PS, the
difference between treatment and control is an
unbiased estimate of the average treatment effect
at that PS (Rosenbaum & Rubin, 1983, Theorem 4)
Quasi-randomized experiment
Take 2 subjects (one treated, one control)
with the same PS; you could then
imagine these 2 subjects were randomly
assigned to each group (since they are equally
likely to be treated)

Rosenbaum PR, Rubin DB. The central role of the
propensity score in observational studies for
causal effects. Biometrika. 1983;70:41-55.
12
Introduction

What covariates should go into the PS model?
Don't use covariates that define the group
Don't use insulin to predict diabetes group
Since the goal is balancing groups, one can be more
liberal with the # of covariates in the model [1]
Austin [2] (2006) showed that more parsimonious
models resulted in greater precision

[1] D'Agostino R Jr. Propensity Scores in
Cardiovascular Research. Circulation.
2007;115:2340-2343.
[2] Austin P. A comparison of the ability of
different propensity score models to balance
measured variables
between treated and untreated subjects: a
Monte Carlo study. Statist. Med. 2007;26:734-753.

13
Introduction

It's not just a side analysis anymore

14
Ways to use the PS

Common strategies include
Matching
Match treatment and controls on PS
Stratification
Keep all subjects but analyze in Strata (usually
quintiles of PS)
Regression adjustment

15
Matching

Most common use of PS analyses
Since the PS is a single scalar quantity, matching
is comparatively easy (as opposed to matching
on age, gender, history, etc.)
Matching 1 control to 1 treated subject makes for an
easily understood analysis
Common to match on the logit of the PS since it
is approximately normal

16
Matching

Nearest neighbor matching (w/o replacement)
Randomly order treated and control subjects
Take the first treated subject and find the
control with the closest propensity score
Remove both from the list
Move to the second treated subject and find the
control with the closest PS... continue until you run
out of treated patients
This will create a 1:1 match of treated and
control patients
Note: methods exist for 1:many matches also
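The steps above can be sketched in a few lines of Python. This is an illustrative sketch only, not the gmatch macro used later in the deck; the function name `greedy_match` and the toy scores are hypothetical.

```python
import random

def greedy_match(treated, controls, seed=0):
    """Greedy 1:1 nearest neighbor matching without replacement.
    treated, controls: dicts mapping subject id -> propensity score.
    Returns a list of (treated_id, control_id) pairs."""
    rng = random.Random(seed)
    t_ids = list(treated)
    rng.shuffle(t_ids)                 # random order of treated subjects
    available = dict(controls)         # controls not yet matched
    pairs = []
    for t in t_ids:
        if not available:              # ran out of controls
            break
        # control whose PS is closest to this treated subject's PS
        c = min(available, key=lambda cid: abs(available[cid] - treated[t]))
        pairs.append((t, c))
        del available[c]               # remove the matched control from the list
    return pairs

# Toy example with made-up scores
treated = {"T1": 0.30, "T2": 0.55}
controls = {"C1": 0.28, "C2": 0.60, "C3": 0.90}
pairs = greedy_match(treated, controls)
print(sorted(pairs))
```

Because the algorithm is greedy, the result can depend on the random ordering when several treated subjects compete for the same control; fixing the seed keeps the match reproducible.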

17
Matching

Problem: the nearest neighbor may not be that
near
May want to enforce a caliper width for
acceptable matches
E.g., if there is no control within the caliper
of a case, then no match occurs and the case will be
removed
Common in the literature to use
0.2 × stddev[L(x)] as the caliper, where L(x) is
the logit of the PS
For a matching macro see
mayoresearch.mayo.edu/biostat/upload/gmatch.sas
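As a rough illustration of the caliper rule, the Python sketch below computes 0.2 times the standard deviation of the logit of the PS and checks whether a candidate pair falls within it. The helper names (`logit`, `caliper_width`, `within_caliper`) and the sample scores are assumptions for the example, not part of the gmatch macro.

```python
import math
import statistics

def logit(p):
    return math.log(p / (1.0 - p))

def caliper_width(ps_values, factor=0.2):
    """factor x sample standard deviation of the logit of the PS."""
    return factor * statistics.stdev([logit(p) for p in ps_values])

def within_caliper(ps_treated, ps_control, width):
    """True if the two subjects are close enough on the logit scale."""
    return abs(logit(ps_treated) - logit(ps_control)) <= width

# Made-up propensity scores for the whole sample
ps_all = [0.1, 0.2, 0.3, 0.5, 0.7, 0.9]
width = caliper_width(ps_all)

ok_pair = within_caliper(0.30, 0.31, width)    # very similar scores
bad_pair = within_caliper(0.30, 0.90, width)   # far apart: no match
print(round(width, 3), ok_pair, bad_pair)
```

A treated subject with no control inside the caliper is simply left unmatched and dropped from the matched analysis.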

18
Matching: Ideal Scenario

Before Match
                            Treatment (n=543)   Control (n=1598)   P-value
Age                         57 ± 3.2            62 ± 5             .031
Male (%)                    57                  42                 .047
History Diabetes (%)        22                  30                 <.001
History Heart Failure (%)   8                   15                 .035

After Match
                            Treatment (n=500)   Control (n=500)    P-value
Age                         57 ± 3.2            57.3 ± 3           .45
Male (%)                    57                  57                 .88
History Diabetes (%)        22                  23                 .48
History Heart Failure (%)   8                   7                  .77
19
Stratification

Matching will inevitably result in a smaller
dataset
Stratifying analyses on the PS will keep all data
Create the PS
Cut the PS into equal-sized groups (quartiles,
quintiles)
Rosenbaum & Rubin (1983) claim quintile strata
will remove 90% of bias
Conduct the analyses within these strata
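A small Python sketch of the stratified approach, under the assumption that the quantity of interest is a difference in mean outcome: subjects are ranked on the PS, cut into quintiles, and the within-stratum differences are averaged. The data are constructed by hand (true treatment effect of 2, outcome also rising with the PS), and all names are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)

def quintile_strata(ps):
    """Assign each subject to a PS quintile (0..4) by rank."""
    order = sorted(range(len(ps)), key=lambda i: ps[i])
    strata = [0] * len(ps)
    for rank, i in enumerate(order):
        strata[i] = rank * 5 // len(ps)
    return strata

def stratified_difference(ps, treated, outcome):
    """Average of (treated mean - control mean) over the PS quintiles."""
    strata = quintile_strata(ps)
    diffs = []
    for s in range(5):
        t = [y for y, z, k in zip(outcome, treated, strata) if k == s and z == 1]
        c = [y for y, z, k in zip(outcome, treated, strata) if k == s and z == 0]
        if t and c:                     # skip strata lacking one of the groups
            diffs.append(mean(t) - mean(c))
    return mean(diffs)

# Hand-built data: treatment is more common at high PS, and the outcome
# depends on the PS as well as on treatment (true effect = 2).
ps = [(i + 1) / 21 for i in range(20)]
treated = [0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1]
outcome = [10 * p + 2 * z for p, z in zip(ps, treated)]

crude = (mean([y for y, z in zip(outcome, treated) if z == 1])
         - mean([y for y, z in zip(outcome, treated) if z == 0]))
stratified = stratified_difference(ps, treated, outcome)
print(round(crude, 2), round(stratified, 2))
```

In this toy data the stratified estimate lands much closer to the true effect than the crude comparison, though some residual bias remains within strata.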

20
Example

Comparison of Angiography (vs not) in elderly
patients with Chronic Kidney Disease (CKD)
Propensity score for receiving an Angio
Based on Demographics, History, and Hospital
Characteristics

Propensity Quintile   Group      # of patients   1-year Mortality (%)   OR (95% CI)
1 (0-.06)             Angio      46              56.5                   1.02 (.56-1.84)
                      No Angio   1307            56.2
2 (.06-.16)           Angio      133             36.8                   .57 (.39-.82)
                      No Angio   1221            50.7
3 (.16-.30)           Angio      303             34.7                   .66 (.50-.86)
                      No Angio   1051            44.7
4 (.30-.54)           Angio      557             30.7                   .72 (.57-.90)
                      No Angio   797             38.3
5 (.54-1)             Angio      967             18.9                   .45 (.35-.59)
                      No Angio   387             34.1
Overall               Angio      2014            26.7                   .62 (.54-.70)
                      No Angio   4780            47.4

Chertow GM, Normand SL, McNeil BJ. "Renalism":
inappropriately low rates of coronary angiography
in elderly individuals with renal insufficiency.
J Am Soc Nephrol. 2004 Sep;15(9):2462-8.
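As a quick check on how the within-quintile odds ratios relate to the mortality percentages, the Python sketch below recomputes unadjusted odds ratios from the rounded percentages in the table. This is only an approximation of the published values, which may have been computed from exact counts; the function name `odds_ratio` is made up for the example.

```python
def odds_ratio(p_treated, p_control):
    """Unadjusted odds ratio from two event proportions."""
    return (p_treated / (1 - p_treated)) / (p_control / (1 - p_control))

# 1-year mortality (Angio, No Angio) by propensity quintile, from the table
mortality = {
    1: (0.565, 0.562),
    2: (0.368, 0.507),
    3: (0.347, 0.447),
    4: (0.307, 0.383),
    5: (0.189, 0.341),
}
quintile_or = {q: odds_ratio(a, n) for q, (a, n) in mortality.items()}
crude_overall = odds_ratio(0.267, 0.474)   # pooled percentages, no stratification

for q, value in quintile_or.items():
    print(q, round(value, 2))
print("crude overall", round(crude_overall, 2))
```

The recomputed quintile values agree with the table to within rounding. The crude pooled odds ratio comes out near 0.40, further from 1 than the reported overall .62, which is what one would expect if the reported overall estimate accounts for the propensity strata.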
21
Covariate Adjustment

This use would be the least recommended.
Do a model for PS, and then use that PS in a
model as an adjustment when evaluating
association between treatment and outcome
Advantage over normal covariate adjustment
Simpler final model
Can have many more covariates in the PS model

22
Assessing Balance

Remember, the main purpose of a PS is to balance
characteristics between treated and controls... so
how do we show success?
P-values
Function of sample size
May be misleading for stratification or 1:many
matches
Standardized differences
Not a function of sample size
Can be used for stratification and 1:many matches
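The contrast between the two measures can be shown with a short Python sketch: the same small age difference is tested at two sample sizes using a large-sample z test (an assumption made for simplicity), while the standardized difference is computed once because it does not involve n. The numbers and function names are hypothetical.

```python
import math

def std_diff(m1, s1, m2, s2):
    """Standardized difference (in %) for a continuous variable."""
    return 100 * abs(m1 - m2) / math.sqrt((s1 ** 2 + s2 ** 2) / 2)

def two_sample_p(m1, s1, m2, s2, n):
    """Two-sided p-value from a large-sample z test, n subjects per group."""
    z = abs(m1 - m2) / math.sqrt(s1 ** 2 / n + s2 ** 2 / n)
    return math.erfc(z / math.sqrt(2))

# Same means and SDs, two different sample sizes
d = std_diff(57.0, 5.0, 57.5, 5.0)
p_small = two_sample_p(57.0, 5.0, 57.5, 5.0, n=50)
p_large = two_sample_p(57.0, 5.0, 57.5, 5.0, n=5000)
print(round(d, 1), round(p_small, 3), p_large)
```

The imbalance is identical in both scenarios, yet the p-value flips from clearly non-significant to highly significant as n grows, which is why a p-value alone says little about balance.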

23
Standardized Differences

Formula Continuous Variables

Formula Dichotomous Variables

For stratified analyses, compute d in each stratum
and take the average
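The formulas on this slide did not survive in the transcript. The definitions commonly used in the propensity score literature (for example in the Austin papers cited in this deck) are given below as a reference; the subscripts T and C for treated and control are notation chosen here, and d is expressed as a percentage.

```latex
d_{\text{continuous}}
  = 100 \times \frac{\bar{x}_{T} - \bar{x}_{C}}
                    {\sqrt{\dfrac{s_{T}^{2} + s_{C}^{2}}{2}}}
\qquad
d_{\text{dichotomous}}
  = 100 \times \frac{\hat{p}_{T} - \hat{p}_{C}}
                    {\sqrt{\dfrac{\hat{p}_{T}(1-\hat{p}_{T}) + \hat{p}_{C}(1-\hat{p}_{C})}{2}}}
```

Here \bar{x} and s are the group mean and standard deviation, and \hat{p} is the group proportion; the absolute value of d is usually what gets reported.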

24
Standardized Differences

Sample calculations for a 1:1 match

Before Match
      Treatment (n=543)   Control (n=1598)   P-value
Age   57 ± 3.2            62 ± 5             .031

After Match
      Treatment (n=500)   Control (n=500)    P-value
Age   57 ± 3.2            57.3 ± 3           .45
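The calculation for the age rows can be reproduced with a few lines of Python. This sketch assumes the usual pooled-variance definition of the standardized difference and reads the garbled after-match control value as 57.3 ± 3; the binary example reuses the 57% vs 42% male split from the earlier unmatched table. Function names are hypothetical.

```python
import math

def std_diff_continuous(mean_t, sd_t, mean_c, sd_c):
    """Standardized difference (%) for a continuous covariate."""
    return 100 * abs(mean_t - mean_c) / math.sqrt((sd_t ** 2 + sd_c ** 2) / 2)

def std_diff_binary(p_t, p_c):
    """Standardized difference (%) for a dichotomous covariate."""
    pooled = (p_t * (1 - p_t) + p_c * (1 - p_c)) / 2
    return 100 * abs(p_t - p_c) / math.sqrt(pooled)

age_before = std_diff_continuous(57, 3.2, 62, 5)     # before match
age_after = std_diff_continuous(57, 3.2, 57.3, 3)    # after match
male_before = std_diff_binary(0.57, 0.42)            # 57% vs 42% male

print(round(age_before, 1), round(age_after, 1), round(male_before, 1))
```

Age goes from a standardized difference well above 100 before matching to just under 10 afterwards, which is the kind of drop the matching step is meant to produce.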
25
Standardized Differences

What value constitutes balance?
Peter Austin commonly states that values less than 10
(i.e., 10%) constitute balance between groups
The closer to 0, the more balanced

26
Propensity Analysis (Matching) Using SAS

Simulated Data
Data specifics
N=5000 (1000 Group1, 4000 Group2)

                      Group1 (N=1011)   Group2 (N=3989)   P-value
Age                   59.4 ± 4.0        63.5 ± 4.0        < 0.001
Male_Gender           560 (55.4%)       2009 (50.4%)      0.004
History of Diabetes   689 (16.9%)       516 (21.4%)       < 0.001
27
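Example: Create PS

proc logistic data=dataset descending;
  /* "others" stands for the remaining covariates */
  model group1 = age gender diabetes others;
  /* p=     predicted probabilities of being in group 1 */
  /* xbeta= the same prediction on the logit scale      */
  output out=pred p=pred xbeta=logit;
run;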
28
Example: Define Caliper

proc means data=pred stddev;
  var logit;
  output out=lstd;
run;
data _null_;
  set lstd;
  if _stat_='STD' then do;
    /* caliper = 0.2 * stddev(logit), stored in macro variable &std */
    call symputx('std', logit/5);
  end;
run;
29
Example: Perform Match

%gmatch(data=pred, group=group1, id=id,
        mvars=logit, wts=1, dmaxk=&std, ncontls=1,
        seedca=987896, seedco=425632, out=match);

                      Group1 (N=858)   Group2 (N=858)   P-value
Age                   60.1 ± 3.6       60.17 ± 3.62     .678
Male_Gender           469 (54.66%)     478 (55.71%)     .662
History of Diabetes   261 (30.42%)     256 (29.84%)     .792

Macro source: mayoresearch.mayo.edu/biostat/upload/gmatch.sas
30
Example: Assess Balance

/* Original data */
%std_diff(data=fulldata, group=group1,
          continuous=age others, binary=male diabetes others,
          out=before);
/* Matched data */
%std_diff(data=matched_data, group=group1,
          continuous=age others, binary=male diabetes others,
          out=after);
/* Combine */
data after;
  set after(rename=(stddiff=after_stddiff));
run;
proc sql;
  create table both as select *
  from before as a join after as b
  on a.variable=b.variable;
quit;

31
Example: Assess Balance

Variable   Label      STD DIFF Before   STD DIFF After
V1         Age        99.65             .3
V2         Gender     9.22              .45
V3         Diabetes   15.9              3.3

proc gplot data=both;
  title 'Standardized difference plot';
  plot label*StdDiff=1 label*after_stddiff=2 / overlay
       vaxis=axis1 haxis=axis2 href=10 legend=legend1
       autovref chref=black lhref=3;
run; quit;
32
Hmmm... a bit ugly
33
Format macro
Sort by stddiff before match

proc sort data=both; by stddiff; run;
/* attach formats to variables */
%macro doformat(data);
  data &data;
    set &data;
    count+1;                     /* counter variable */
  run;
  proc sql;
    /* read the label names into macro variable &label */
    select label into :label separated by '*' from &data;
  quit;
  /* count of variables (%words is a user-written utility macro) */
  %let numvar=%words(&label, delim=%str(*));
  proc format;
    value fmt
    %do i=1 %to &numvar;
      &i="%qscan(&label,&i,*)"   /* format counter i with label i */
    %end;
    ;
  run;
%mend doformat;
34
Assessing Balance
Variable   Label      STD DIFF Before   STD DIFF After   Count (formatted)
V1         Age        99.65             .3               Age
V3         Diabetes   15.9              3.3              Diabetes
V2         Gender     9.22              .45              Gender

proc gplot data=both;
  title 'Standardized difference plot';
  plot count*StdDiff=1 count*after_stddiff=2 / overlay
       vaxis=axis1 haxis=axis2 href=10 legend=legend1
       autovref chref=black lhref=3;
run; quit;
35
(No Transcript)
36
[Figure: Standardized difference plot. Series: Before Match, After Match. X axis: Standardized Difference (0 to 70). Covariates on the Y axis: stemi, emergency, elective, age, currentsmoke, nstemi, apr_mort, cardiogenic_shock, prior_PCI, self_pay, apr_sev, hypertension, hyperlipidemia, diabetes, race_white, male, chronic_kidney_dis, formersmoke, race_black, prior_MI, anemia, PVD, oth_aterialdisease, rheumatic_HD, CVD, heartfailure, stroke, renal_insufficiency, tia, COPD, obese, dialysis, otherheart_disease, renal_failure, underweight]
37
Now What?

Standardized differences for all variables are < 10,
indicating balance
Now we can see if group membership has an impact
on our outcome
Caution: this is matched data, so statistically we
need to account for the matching
Paired t-tests, McNemar's test, conditional
logistic regression, stratified proportional
hazards regression
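As one example of an analysis that respects the pairing, here is a minimal Python sketch of McNemar's test for a binary outcome in 1:1 matched data (without continuity correction). Only the discordant pairs enter the statistic. The counts are invented and the function name `mcnemar` is hypothetical.

```python
import math

def mcnemar(b, c):
    """McNemar's chi-square test (1 df, no continuity correction).
    b: pairs where only the treated subject had the event
    c: pairs where only the control subject had the event"""
    chi2 = (b - c) ** 2 / (b + c)
    # Upper tail of a chi-square with 1 df equals erfc(sqrt(chi2 / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Invented discordant-pair counts from a matched sample
chi2, p_value = mcnemar(b=15, c=30)
print(chi2, round(p_value, 4))
```

Pairs in which both or neither subject had the event carry no information about the treatment effect in this test, which is why they do not appear in the formula.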

38
Other Uses

A way to show just how different 2 groups are

[Figure: Distribution of Propensity Scores. Y axis: Probability of Group 2 (0 to 1.0); one distribution each for Group 1 and Group 2]
39
[Figure: Probability of Group 2, shown for Group 1 and Group 2]
40
Concluding Remarks

If you want more information, search for Ralph
D'Agostino Jr. (Wake Forest) and Peter Austin
(Univ. of Toronto)
Introductory read
D'Agostino R Jr. Tutorial in Biostatistics:
Propensity score methods for bias reduction in
the comparison of a treatment to a non-randomized
control group. Statist. Med. 1998;17:2265-2281.
1:Many matching
Austin P. Assessing balance in measured baseline
covariates when using many-to-one matching on the
propensity score. Pharmacoepidemiology and Drug
Safety. 2008;17:1218-1225.

41
Concluding Remarks: things to avoid

Austin (2008) performed a literature review and
found that many propensity score matching analyses were
done incorrectly
47 articles from the medical literature that used
propensity score matching were reviewed
Only 2 studies used standardized differences to
assess the match (most relied on p-values)
Only 13 used correct statistical methods for
matched data
See the paper for the common errors
Only 2 studies assessed balance correctly and
used correct statistical methods

Austin PC. A critical appraisal of
propensity-score matching in the
medical literature between 1996 and 2003. Stat
Med. 2008 May 30;27(12):2037-49.
42
Concluding Remarks: things to avoid

Austin's recommendations
The strategy for creating pairings should be
specifically stated, with an appropriate statistical
citation
The distribution of baseline characteristics
between treated and control should be described
Differences in distributions should be assessed
with methods not influenced by sample size
Use appropriate statistical methods to account
for the match
McNemar's test for binary data
Use of the STRATA statement in PROC LOGISTIC or PROC PHREG

43
What have we learned... if anything?

RCTs may be the gold standard, but propensity
scores are their attractive cousin
Using a PS can remove a lot of bias in estimating
the treatment effect
You can match, stratify, or adjust for the PS
Use the standardized difference to determine
balance (unaffected by sample size)

44
Contact Information

Name: Kevin Kennedy
Company: Mid America Heart Institute, St. Luke's
Hospital
Address: 4401 Wornall Rd, Kansas City, MO
Email: kfk3388_at_gmail.com or
kfkennedy_at_saint-lukes.org
SAS and all other SAS Institute Inc. product or
service names are registered trademarks or
trademarks of SAS Institute Inc. in the USA and
other countries. ® indicates USA registration.
Other brand and product names are trademarks of
their respective companies.