Title: Propensity Score Analyses: A good looking cousin of an RCT
1. Propensity Score Analyses: A good looking cousin of an RCT
KCASUG Q1, March 4, 2010
- Kevin Kennedy, MS
- Saint Luke's Hospital, Kansas City, MO
- John House, MS
- Saint Luke's Hospital, Kansas City, MO
- Phil Jones, MS
- Saint Luke's Hospital, Kansas City, MO
2. Motivation
- Estimating treatment effects is important!
  - Is Drug A more beneficial than placebo?
  - Do same-sex classes increase academic performance?
  - Do titanium golf clubs increase the distance of drives?
- Designing ways to answer these questions should be
  - Ethical
  - Practical
  - Cost effective
3. The Gold Standard
- Randomized Controlled Trials
  - Randomization of subjects to treatment groups (essentially, a coin flip determines the group)
  - On average, all subject characteristics will be balanced between groups

                             Treatment (n=100)   Control (n=100)   P-value
Age                          57 ± 3.2            57 ± 3.1          .78
Male (%)                     57                  58                .65
History Diabetes (%)         22                  22                .99
History Heart Failure (%)    8                   9                 .75
4. Benefits of an RCT
- A pure link between treatment and outcome
  - Random allocation of subjects removes the possibility of a third factor being associated with both treatment and outcome
- Can blind subjects and researchers to treatment allocation
5. Potential Caveats with an RCT
- Ethical issues
  - Withholding a treatment that is generally thought to improve outcomes is often considered unethical
- Practical issues
  - Problems with recruitment of subjects
  - Consenting to alternatives, and substantial dropout
- Cost and time issues
  - Enrolling subjects, training staff, designing the trial, treatment
- May be too controlled
  - Specific subject criteria and treatment use
  - Population may not represent the real-world experience

Spaar A, Frey M, Turk A, Karrer W, Puhan MA. Recruitment barriers in a randomized controlled trial from the physicians' perspective: a postal survey. BMC Med Res Methodol. 2009 Mar 2;9:14.
6. So... what now?
- Observational data is popular
  - Treatment is not assigned by randomization, only observed
- Unfortunately... subject characteristics will likely not be balanced

                             Treatment (n=100)   Control (n=100)   P-value
Age                          57 ± 3.2            62 ± 5            .031
Male (%)                     57                  42                .047
History Diabetes (%)         22                  30                <.001
History Heart Failure (%)    8                   15                .035
7. So... what now?
- Need to account for the differences between treatment and control
- Common in modeling to adjust away differences between groups
  - However, sample size constraints restrict the number of variables that can be adjusted for
- Solution: Propensity Scores
8. Propensity Score Outline
- Introduction
- How to use the score
  - Matching
  - Stratifying
- Assessing Balance
  - Standardized Difference
- Propensity Scores Using SAS
- Concluding remarks
  - Other uses
  - Issues with publications
9. Introduction
- Definition
  - Propensity score (PS): the conditional probability of being treated given the individual's covariates
- Notation
- Estimating the propensity score can be done with the common logistic regression model, predicting treatment from the selected covariates that need to be balanced
- Will be used to balance characteristics between groups
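
In the notation of Rosenbaum and Rubin (1983), the definition above is commonly written as follows, with Z the treatment indicator and X the vector of covariates. The second line is the usual logistic form of the estimating model; the coefficient symbols are introduced here only for illustration.

```latex
\[
e(x) = P(Z = 1 \mid X = x)
\]
\[
\operatorname{logit} e(x) = \log\frac{e(x)}{1 - e(x)} = \beta_0 + \beta_1 x_1 + \dots + \beta_p x_p
\]
```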
10. Introduction

                             Treatment (n=100)   Control (n=100)   P-value
Age                          57 ± 3.2            62 ± 5            .031
Male (%)                     57                  42                .047
History Diabetes (%)         22                  30                <.001
History Heart Failure (%)    8                   15                .035

Here we would develop a PS for being in the treatment group conditioned on age, gender, diabetes history, and heart failure history.
11. Introduction: why important?
- Important: for a specific value of the PS, the difference between treatment and control is an unbiased estimate of the average treatment effect at that PS (Rosenbaum & Rubin, 1983, Theorem 4)
- Quasi-randomized experiment
  - Take 2 subjects (one from treatment, the other from control) with the same PS; you could then imagine these 2 subjects were randomly assigned to each group, since they are equally likely to be treated.

Rosenbaum PR, Rubin DB. The central role of the propensity score in observational studies for causal effects. Biometrika. 1983;70:41-55.
12. Introduction
- What covariates should go into the PS model?
  - Don't use covariates that define the group
    - Don't use insulin to predict the diabetes group
  - Since the goal is balancing groups, one can be more liberal with the number of covariates in the model [1]
  - Austin [2] (2006) showed that more parsimonious models resulted in greater precision

[1] D'Agostino RB Jr. Propensity scores in cardiovascular research. Circulation. 2007;115:2340-2343.
[2] Austin P. A comparison of the ability of different propensity score models to balance measured variables between treated and untreated subjects: a Monte Carlo study. Statist. Med. 2007;26:734-753.
13. Introduction
- It's not just a side analysis anymore
14. Ways to use the PS
- Common strategies include
  - Matching
    - Match treatment and controls on PS
  - Stratification
    - Keep all subjects but analyze in strata (usually quintiles of PS)
  - Regression adjustment
15. Matching
- Most common use of PS analyses
- Since the PS is a single scalar quantity, matching is comparatively easy (as opposed to matching on age, gender, history, etc.)
- Matching 1 control to 1 treated subject makes for an easily understood analysis
- Common to match on the logit of the PS since it is approximately normal
16. Matching
- Nearest neighbor matching (w/o replacement)
  - Randomly order treated and control subjects
  - Take the first treated subject and find the control with the closest propensity score. Remove both from the list
  - Move to the second treated subject and find the control with the closest PS... continue until you run out of treated patients
  - This will create a 1:1 match of treated and control patients
- Note: methods exist for 1:many matches also
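
The nearest neighbor steps above can be sketched in a DATA step. This is only an illustration, not the gmatch macro used later in the talk: it assumes a data set named pred with a numeric id, a 0/1 indicator group1 (1 = treated), and the logit of the PS in logit; the output names and the array size of 5000 are hypothetical choices. No caliper is applied here.

```sas
/* Split subjects and give each a random sort key */
data treated controls;
  set pred;
  call streaminit(2010);              /* fixed seed so the order is reproducible */
  _order = rand('uniform');
  if group1 = 1 then output treated;
  else output controls;
run;

/* Random order of treated subjects; control order only matters for ties */
proc sort data=treated; by _order; run;

data matched(keep=pair_id treated_id control_id distance);
  /* Controls are held in memory; raise 5000 if there are more controls */
  array c_id[5000]    _temporary_;
  array c_logit[5000] _temporary_;
  array c_used[5000]  _temporary_;
  if _n_ = 1 then do i = 1 to nc;
    set controls(keep=id logit rename=(id=cid logit=clogit)) nobs=nc point=i;
    c_id[i] = cid;
    c_logit[i] = clogit;
    c_used[i] = 0;
  end;

  set treated;                        /* one treated subject per iteration */
  best = .;
  distance = .;
  do i = 1 to nc;                     /* find the closest unused control */
    if not c_used[i] then do;
      d = abs(logit - c_logit[i]);
      if distance = . or d < distance then do;
        distance = d;
        best = i;
      end;
    end;
  end;

  if best ne . then do;               /* remove the matched control from the pool */
    c_used[best] = 1;
    pair_id + 1;
    treated_id = id;
    control_id = c_id[best];
    output;
  end;
run;
```

Each row of matched is one treated/control pair. Once the controls run out, the remaining treated subjects are left unmatched.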
17. Matching
- Problem: the nearest neighbor may not be that near
- May want to enforce a caliper width for acceptable matches
  - E.g., if there is no control within the caliper of a case, then no match occurs and the case will be removed
- Common in the literature to use
  - 0.2 × stddev[L(x)] as the caliper, where L(x) is the logit of the PS
- For a matching macro see
  - mayoresearch.mayo.edu/biostat/upload/gmatch.sas
18. Matching: Ideal Scenario

                             Treatment (n=543)   Control (n=1598)   P-value
Age                          57 ± 3.2            62 ± 5             .031
Male (%)                     57                  42                 .047
History Diabetes (%)         22                  30                 <.001
History Heart Failure (%)    8                   15                 .035

                             Treatment (n=500)   Control (n=500)    P-value
Age                          57 ± 3.2            57.3 ± 3           .45
Male (%)                     57                  57                 .88
History Diabetes (%)         22                  23                 .48
History Heart Failure (%)    8                   7                  .77
19. Stratification
- Matching will inevitably result in a smaller dataset
- Stratifying analyses on the PS will keep all data
  - Create the PS
  - Cut the PS into equal-sized groups (quartiles, quintiles)
    - Rosenbaum & Rubin (1983) claim quintile strata will remove 90% of the bias
  - Conduct the analyses within these strata
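
A minimal sketch of these stratification steps in SAS, assuming the PS is stored in a variable pred on a data set named pred (as in the PROC LOGISTIC example later in the talk). The binary variable outcome and the output names are hypothetical.

```sas
/* Cut the PS into quintiles: ps_quintile takes values 0 (lowest) to 4 (highest) */
proc rank data=pred out=ps_strata groups=5;
  var pred;
  ranks ps_quintile;
run;

/* Treatment-by-outcome table within each quintile, plus the
   Cochran-Mantel-Haenszel estimate pooled across the strata */
proc freq data=ps_strata;
  tables ps_quintile*group1*outcome / cmh;
run;
```

PROC RANK with groups=5 is one convenient way to get equal-sized strata. Cut points taken from PROC UNIVARIATE percentiles would serve the same purpose.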
20. Example
- Comparison of angiography (vs. not) in elderly patients with Chronic Kidney Disease (CKD)
- Propensity score for receiving an angio
  - Based on demographics, history, and hospital characteristics

Propensity Quintile   Group      # of patients   1-year Mortality (%)   OR (95% CI)
1 (0-.06)             Angio      46              56.5                   1.02 (.56-1.84)
                      No Angio   1307            56.2
2 (.06-.16)           Angio      133             36.8                   .57 (.39-.82)
                      No Angio   1221            50.7
3 (.16-.30)           Angio      303             34.7                   .66 (.50-.86)
                      No Angio   1051            44.7
4 (.30-.54)           Angio      557             30.7                   .72 (.57-.90)
                      No Angio   797             38.3
5 (.54-1)             Angio      967             18.9                   .45 (.35-.59)
                      No Angio   387             34.1
Overall               Angio      2014            26.7                   .62 (.54-.70)
                      No Angio   4780            47.4

Chertow GM, Normand SL, McNeil BJ. "Renalism": inappropriately low rates of coronary angiography in elderly individuals with renal insufficiency. J Am Soc Nephrol. 2004 Sep;15(9):2462-8.
21. Covariate Adjustment
- This use would be the least recommended
- Fit a model for the PS, then use that PS as an adjustment covariate in a model evaluating the association between treatment and outcome
- Advantages over normal covariate adjustment
  - Simpler final model
  - Can have many more covariates in the PS model
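
As a sketch of what this adjustment looks like in SAS, assuming the estimated PS is stored as pred on a data set named pred and that outcome is a hypothetical 0/1 outcome variable:

```sas
/* Outcome model with only two terms: treatment group and the estimated PS */
proc logistic data=pred;
  model outcome(event='1') = group1 pred;
run;
```

The odds ratio reported for group1 is then the treatment effect adjusted for the PS.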
22. Assessing Balance
- Remember, the main purpose of a PS is to balance characteristics between treated and controls... so how do we show success?
- P-values
  - Function of sample size
  - May be misleading for stratification or a 1:many match
- Standardized differences
  - Not a function of sample size
  - Can be used for stratification and 1:many matches
23. Standardized Differences
- Formula: continuous variables
- Formula: dichotomous variables
- For stratified analyses, compute d in each stratum and take the average
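
The standardized difference is usually written as below, in the form used in Austin's papers. The subscripts T and C (treated, control) and the stratum index k are notation chosen here, and the factor of 100 expresses d as a percentage.

```latex
% Continuous variable: group means \bar{x} and standard deviations s
\[
d = \frac{100\,(\bar{x}_T - \bar{x}_C)}{\sqrt{\dfrac{s_T^2 + s_C^2}{2}}}
\]
% Dichotomous variable: group proportions p
\[
d = \frac{100\,(p_T - p_C)}{\sqrt{\dfrac{p_T(1 - p_T) + p_C(1 - p_C)}{2}}}
\]
% Stratified analysis with K strata: average the stratum-specific values d_k
\[
\bar{d} = \frac{1}{K}\sum_{k=1}^{K} d_k
\]
```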
24. Standardized Differences
- Sample calculations for a 1:1 match
- Before match

        Treatment (n=543)   Control (n=1598)   P-value
Age     57 ± 3.2            62 ± 5             .031

- After match

        Treatment (n=500)   Control (n=500)    P-value
Age     57 ± 3.2            57.3 ± 3           .45
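
A worked version of the calculation, using the usual standardized difference for a continuous variable and assuming the Age entries read as mean ± SD (57 ± 3.2 vs. 62 ± 5 before the match, 57 ± 3.2 vs. 57.3 ± 3 after). Only the magnitude matters when judging balance.

```latex
\[
d_{\text{before}} = \frac{100\,(57 - 62)}{\sqrt{(3.2^2 + 5^2)/2}}
                  = \frac{-500}{\sqrt{17.62}} \approx -119.1
\]
\[
d_{\text{after}} = \frac{100\,(57 - 57.3)}{\sqrt{(3.2^2 + 3^2)/2}}
                 = \frac{-30}{\sqrt{9.62}} \approx -9.7
\]
```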
25. Standardized Differences
- What value constitutes balance?
  - Peter Austin commonly states that values less than 10% constitute balance between groups
  - The closer to 0, the more balanced
26. Propensity Analysis (Matching) Using SAS
- Simulated data
- Data specifics
  - N=5000 (1000 Group1, 4000 Group2)

                      Group1 (N=1011)   Group2 (N=3989)   P-value
Age                   59.4 ± 4.0        63.5 ± 4.0        <0.001
Male_Gender           560 (55.4%)       2009 (50.4%)      0.004
History of Diabetes   689 (16.9%)       516 (21.4%)       <0.001
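27. Example: Create PS

    proc logistic data=dataset descending;
      /* "others" stands for the remaining covariates */
      model group1 = age gender diabetes others;
      output out=pred p=pred xbeta=logit;
    run;

- p=pred: predicted probabilities of being in group 1
- xbeta=logit: the same predictions on the logit scale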
28. Example: Define Caliper

    proc means data=pred stddev;
      var logit;
      output out=lstd;
    run;

    /* Creating caliper of 0.2 * stddev(logit), stored in macro variable &std */
    data _null_;
      set lstd;
      if _stat_='STD' then do;
        call symputx('std', logit/5);
      end;
    run;
29. Example: Perform Match

    %gmatch(data=pred, group=group1, id=id,
            mvars=logit, wts=1, dmaxk=&std, ncontls=1,
            seedca=987896, seedco=425632, out=match);

                      Group1 (N=858)    Group2 (N=858)    P-value
Age                   60.1 ± 3.6        60.17 ± 3.62      .678
Male_Gender           469 (54.66%)      478 (55.71%)      .662
History of Diabetes   261 (30.42%)      256 (29.84%)      .792

mayoresearch.mayo.edu/biostat/upload/gmatch.sas
30. Example: Assess Balance
- Original data

    %std_diff(data=fulldata, group=group1,
              continuous=age others, binary=male diabetes others, out=before);

- Matched data

    %std_diff(data=matched_data, group=group1,
              continuous=age others, binary=male diabetes others, out=after);

- Combine

    data after;
      set after(rename=(stddiff=after_stddiff));
    run;

    proc sql;
      create table both as select *
      from before as a join after as b on a.variable=b.variable;
    quit;
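
The std_diff macro itself is not shown in the talk. As a rough sketch of what it computes for a single continuous variable, the standardized difference for age can be obtained directly in PROC SQL. This assumes group1 is coded 1/0; the output table and column names are hypothetical.

```sas
proc sql;
  create table age_stddiff as
  select 100 * ( mean(case when group1 = 1 then age end)
               - mean(case when group1 = 0 then age end) )
         / sqrt( ( var(case when group1 = 1 then age end)
                 + var(case when group1 = 0 then age end) ) / 2 )
         as stddiff
  from fulldata;
quit;
```

A CASE expression with no ELSE returns missing for the other group, and the summary functions skip missing values, so each mean and variance is computed within one group only.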
31. Example: Assess Balance

Variable   Label      Std Diff Before   Std Diff After
V1         Age        99.65             .3
V2         Gender     9.22              .45
V3         Diabetes   15.9              3.3

    proc gplot data=both;
      title 'Standardized difference plot';
      plot label*StdDiff=1 label*after_stddiff=2 / overlay
           vaxis=axis1 haxis=axis2 href=10 legend=legend1
           autovref chref=black lhref=3;
    run; quit;
32. Hmmm... a bit ugly
33. Format macro

    /* Sort by stddiff before match */
    proc sort data=both; by stddiff; run;

    /* attach formats to variables */
    %macro doformat(data);
      data &data;
        set &data;
        count+1;                          /* counter variable */
      run;
      proc sql;
        /* read label names into &label; the delimiter is not legible in the
           transcript, so an asterisk is assumed here */
        select label into :label separated by '*' from &data;
      quit;
      /* count of variables; %words is a user-written macro, not part of Base SAS */
      %let numvar=%words(&label,delim=%str(*));
      proc format;
        value fmt
        %do i=1 %to &numvar;
          &i="%qscan(&label,&i,*)"        /* format (i) counter with (i) label */
        %end;
        ;
      run;
    %mend doformat;
34. Assessing Balance

Variable   Label      Std Diff Before   Std Diff After   Count
V1         Age        99.65             .3               Age
V3         Diabetes   15.9              3.3              Diabetes
V2         Gender     9.22              .45              Gender

    proc gplot data=both;
      title 'Standardized difference plot';
      plot count*StdDiff=1 count*after_stddiff=2 / overlay
           vaxis=axis1 haxis=axis2 href=10 legend=legend1
           autovref chref=black lhref=3;
    run; quit;
36. Standardized difference plot
[Figure: standardized difference (x-axis, 0 to 70) for each covariate, plotted before match and after match. Covariates on the y-axis: stemi, emergency, elective, age, currentsmoke, nstemi, apr_mort, cardiogenic_shock, prior_PCI, self_pay, apr_sev, hypertension, hyperlipidemia, diabetes, race_white, male, chronic_kidney_dis, formersmoke, race_black, prior_MI, anemia, PVD, oth_aterialdisease, rheumatic_HD, CVD, heartfailure, stroke, renal_insufficiency, tia, COPD, obese, dialysis, otherheart_disease, renal_failure, underweight.]
37. Now What?
- Standardized differences for all variables are <10%, indicating balance
- Now we can see if group membership has an impact on our outcome
- Caution: this is matched data, so statistically we need to account for this
  - Paired t-tests, McNemar's test, conditional logistic regression, stratified proportional hazards regression
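
One way to account for the matching is conditional logistic regression through the STRATA statement in PROC LOGISTIC. The sketch below assumes the matched data have been arranged with one row per subject and a pair identifier; the names matched_long, pair_id, and outcome are hypothetical and do not reflect the layout the matching macro produces.

```sas
/* Conditional logistic regression: each matched pair is its own stratum */
proc logistic data=matched_long;
  strata pair_id;
  model outcome(event='1') = group1;
run;
```

For a time-to-event outcome, the same idea carries over to PROC PHREG with a STRATA statement on the pair identifier.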
38. Other Uses
- A way to show just how different 2 groups are
[Figure: Distribution of Propensity Scores. Probability of Group 2 (axis from 0 to 1.0), shown separately for Group 1 and Group 2.]
39. [Figure: Probability of Group 2, shown for Group 1 and Group 2.]
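
A comparable picture of the two PS distributions can be drawn with PROC UNIVARIATE. This sketch assumes the predicted probability is stored in pred with the group indicator group1 on a data set named pred; the bin endpoints are an arbitrary choice.

```sas
/* One histogram of the propensity score per group, on a common 0-1 scale */
proc univariate data=pred noprint;
  class group1;
  histogram pred / endpoints=0 to 1 by 0.1;
run;
```

Little overlap between the two histograms signals that the groups differ strongly on the measured covariates.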
40. Concluding Remarks
- If you want more information, search for Ralph D'Agostino Jr. (Wake Forest) and Peter Austin (Univ. of Toronto)
- Introductory read
  - D'Agostino RB Jr. Tutorial in Biostatistics: Propensity score methods for bias reduction in the comparison of a treatment to a non-randomized control group. Statist. Med. 17 (1998), 2265-2281
- 1:many matching
  - Austin P. Assessing balance in measured baseline covariates when using many-to-one matching on the propensity score. Pharmacoepidemiology and Drug Safety (2008) 17: 1218-1225
41. Concluding Remarks: things to avoid
- Austin (2008) performed a literature review and found that many propensity score matching papers were done incorrectly
  - 47 articles from the medical literature that used propensity score matching were reviewed
  - Only 2 studies used standardized differences to assess the match (most relied on p-values)
  - Only 13 used correct statistical methods for matched data
    - See the paper for the common errors
  - Only 2 studies both assessed balance correctly and used correct statistical methods

Austin PC. A critical appraisal of propensity-score matching in the medical literature between 1996 and 2003. Stat Med. 2008 May 30;27(12):2037-49.
42. Concluding Remarks: things to avoid
- Austin's recommendations
  - The strategy for creating pairings should be specifically stated, with an appropriate statistical citation
  - The distribution of baseline characteristics between treated and control should be described
  - Differences in distributions should be assessed with methods not influenced by sample size
  - Use appropriate statistical methods to account for the match
    - McNemar's test for binary data
    - Use of the strata statement in proc logistic or phreg
43. What have we learned... if anything
- RCTs may be the gold standard, but propensity scores are their attractive cousin
- Using a PS can remove a lot of bias in determining treatment effect
- You can match, stratify, or adjust for the PS
- Use the standardized difference to determine balance (unaffected by sample size)
44. Contact Information
- Name: Kevin Kennedy
- Company: Mid America Heart Institute, St. Luke's Hospital
- Address: 4401 Wornall Rd, Kansas City, MO
- Email: kfk3388_at_gmail.com or kfkennedy_at_saint-lukes.org
- SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. Other brand and product names are trademarks of their respective companies.