Application of Propensity Score
Matching in Observational
Studies Using SAS
Yinghui (Delian) Duan, M.Sc., Ph.D candidate
Department of Community Medicine and Health Care,
University of Connecticut Health Center
Connecticut Institute for Clinical and Translational Science (CICATS)
Email: yduan@uchc.edu
RCTs & Observational
Studies
Randomized Controlled Trials (RCTs)
› Treatment assignment is randomized
› Pre-treatment characteristics are balanced on average,
so confounding is not expected
› Difference in post-treatment outcomes can
be attributed to treatment effects
› “Gold standard” to estimate the effects of
treatment, interventions, and exposures
Observational Studies
› Non-experimental
› Treatment assignment is not determined by
design
› Usually the “treated” and “untreated” are
systematically different in some
characteristics that can affect the outcome of
interest (i.e., confounders)
› Confounders make it difficult to draw
causal conclusions
Propensity Score
Method
A useful tool to control confounding
effects in observational studies
Propensity Score (PS)
› Defined by Rosenbaum & Rubin in 1983:
the probability of treatment assignment
conditional on observed baseline covariates
PS_i = Pr(Treatment_i = 1 | X_i)
› A useful tool to reduce confounding by observed
covariates and strengthen causal inference in
observational studies
Estimating PS
› PS is most often estimated by a logistic
regression model
› Can also be estimated using other methods,
e.g., bagging or boosting, recursive
partitioning or tree-based methods, random
forests, and neural networks.
› These alternatives have not been reported to offer
significant advantages over logistic regression
Estimating PS in SAS
Example dataset: 6-month Mortality after
Percutaneous Coronary Intervention (PCI)
› Study sample: patients who received PCI
› Treatments: usual care alone vs. usual care + a
blood thinner
› Baseline confounders: age, gender, height,
coronary stent placement, acute myocardial
infarction within 7 days, and diabetes
› Outcome: 6-month mortality (0 or 1)
Example dataset: 6-month Mortality after
Percutaneous Coronary Intervention (PCI)
› Table: Sample description
                      Usual care alone   Usual care + blood thinner
                      (N = 2,830)        (N = 2,332)                  p-value
Age (mean ± SD)       64.6 ± 4.2         62.0 ± 3.8                   <0.001
Female, n (%)         938 (33.1)         760 (32.6)                   0.673
Height (mean ± SD)    172.4 ± 10.2       171.6 ± 9.5                  0.002
Stent, n (%)          1,794 (63.4)       1,611 (69.1)                 <0.001
Diabetes, n (%)       659 (23.3)         438 (18.8)                   <0.001
Acute MI, n (%)       193 (6.8)          356 (15.3)                   <0.001
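In the OUTPUT statement of PROC LOGISTIC, PROB=, PREDICTED=, PRED=, and P= are interchangeable keywords that name the predicted-probability (PS) variable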
Snapshot of output dataset “new_ps”
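The estimation step can be illustrated outside SAS as well. The following is a minimal Python sketch, not the PROC LOGISTIC code shown on the slides: it fits a logistic regression by plain gradient descent and stores the predicted probability of treatment as the PS. The toy data, the covariate coding, and the names `fit_logistic` and `propensity` are all hypothetical.

```python
import math

def fit_logistic(X, y, lr=0.5, epochs=5000):
    """Fit a logistic regression by batch gradient descent.
    Returns [intercept, b1, b2, ...]."""
    n, k = len(X), len(X[0])
    beta = [0.0] * (k + 1)
    for _ in range(epochs):
        grad = [0.0] * (k + 1)
        for xi, yi in zip(X, y):
            z = beta[0] + sum(b * x for b, x in zip(beta[1:], xi))
            err = 1.0 / (1.0 + math.exp(-z)) - yi
            grad[0] += err
            for j, x in enumerate(xi):
                grad[j + 1] += err * x
        beta = [b - lr * g / n for b, g in zip(beta, grad)]
    return beta

def propensity(beta, xi):
    """Predicted probability of treatment for one subject."""
    z = beta[0] + sum(b * x for b, x in zip(beta[1:], xi))
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical toy data: (age centred and scaled, diabetes 0/1)
X = [(-0.5, 0), (-0.3, 0), (-0.1, 1), (0.0, 0), (0.0, 0),
     (0.2, 1), (0.4, 1), (0.5, 0), (-0.4, 1), (0.3, 0)]
treated = [1, 1, 0, 1, 0, 0, 0, 0, 1, 1]

beta = fit_logistic(X, treated)
ps = [propensity(beta, xi) for xi in X]  # one PS per subject
```

In practice a statistical package would be used for the fit; the point here is only that the PS is the model's predicted probability for each subject.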
Remove Confounding
Effects using PS
Two Important Assumptions
› The assignment of treatment is independent
of potential outcomes conditional on the
observed baseline covariates
› Every subject has a nonzero probability of
receiving either treatment
Four Methods
› PS matching – most widely used
› Stratification using PS
› Stratification using PS
[Figure: treated (T) and untreated (U) subjects ordered by PS from 0.3 to 0.9 and divided into five strata (1 to 5); two regions of the PS range are marked "Trimming".]
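A minimal Python sketch of stratification with trimming follows. It assumes strata are formed from quantiles of the pooled PS and that a stratum is trimmed when it lacks subjects from one of the two groups. The function names, the cut-point rule, and the toy data are illustrative assumptions, not taken from the slides.

```python
def stratify(ps_values, n_strata=5):
    """Assign each PS to a stratum 1..n_strata using quantile cut points."""
    ordered = sorted(ps_values)
    n = len(ordered)
    cuts = [ordered[(n * k) // n_strata] for k in range(1, n_strata)]
    return [1 + sum(p >= c for c in cuts) for p in ps_values]

def strata_to_keep(strata, treated, min_per_group=1):
    """Return the strata that contain enough treated AND untreated subjects."""
    keep = set()
    for s in set(strata):
        n_t = sum(1 for st, t in zip(strata, treated) if st == s and t == 1)
        n_u = sum(1 for st, t in zip(strata, treated) if st == s and t == 0)
        if n_t >= min_per_group and n_u >= min_per_group:
            keep.add(s)
    return keep

# Hypothetical toy data
ps = [0.30, 0.32, 0.35, 0.40, 0.45, 0.50, 0.55, 0.60, 0.80, 0.85]
treated = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]

strata = stratify(ps)
keep = strata_to_keep(strata, treated)
```

Here the lowest stratum holds only untreated subjects and the highest only treated subjects, so both are trimmed and the treatment effect would be estimated within the remaining strata.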
Four Methods
› PS matching – most widely used
› Stratification using PS
› Weighting adjustment, e.g., Inverse
probability of treatment weighting (IPTW)
using PS
› Covariate adjustment using PS – not
recommended
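As a small illustration of the weighting approach, the sketch below computes IPTW weights in Python using the usual definition: 1/PS for treated subjects and 1/(1 − PS) for untreated subjects. The function name and the example values are hypothetical.

```python
def iptw_weights(ps, treated):
    """Inverse probability of treatment weights:
    1/PS for treated subjects, 1/(1 - PS) for untreated subjects."""
    return [1.0 / p if t == 1 else 1.0 / (1.0 - p)
            for p, t in zip(ps, treated)]

# Hypothetical propensity scores and treatment indicators
ps = [0.8, 0.5, 0.25, 0.5]
treated = [1, 1, 0, 0]
weights = iptw_weights(ps, treated)
```

Subjects who received the treatment they were unlikely to receive get large weights, which is why extreme PS values near 0 or 1 deserve attention before weighting.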
Propensity Score Matching
To form matched sets of
treated and untreated
subjects who share a similar
value of PS
Common Support
[Figure: frequency distributions of the PS (0 to 1) for untreated and treated subjects; the range where the two distributions overlap is the region of common support.]
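One simple way to make the region of common support concrete is the minimum-maximum rule: keep the PS range that both groups cover. The Python sketch below assumes that rule; other definitions exist, and the function names and toy values are illustrative only.

```python
def common_support(ps_treated, ps_untreated):
    """Overlap of the two PS ranges (minimum-maximum rule)."""
    low = max(min(ps_treated), min(ps_untreated))
    high = min(max(ps_treated), max(ps_untreated))
    return low, high

def within_support(ps_values, low, high):
    """Keep only the scores that fall inside the common support region."""
    return [p for p in ps_values if low <= p <= high]

# Hypothetical propensity scores
ps_treated = [0.35, 0.50, 0.70, 0.90]
ps_untreated = [0.10, 0.30, 0.50, 0.75]

low, high = common_support(ps_treated, ps_untreated)
kept_treated = within_support(ps_treated, low, high)
kept_untreated = within_support(ps_untreated, low, high)
```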
Four Methods – Common Support
› PS matching –
› Stratification – only when used together with
trimming
› IPTW – does not explicitly examine common
support
› Covariate adjustment – does not explicitly
examine common support
PS Matching
› Some decisions to be made:
› 1:1 or N:1 matching
› N:1 can improve efficiency, reduce
variance, but increase bias
› With or without replacement
› With-replacement may yield less bias,
but higher variance
› Which algorithm?
PSM Algorithms: Nearest-Neighbor
[Figure: treated (T) and untreated (U) subjects arranged by PS from 0.3 to 0.7, illustrating nearest-neighbor matches.]
› Each treated subject gets a match,
even if it is not a very good one
› This is a problem when a
treated subject has no
controls with a similar PS
› If multiple untreated
subjects have the same PS value as
the treated subject, one is selected
at random
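Nearest-neighbor matching can be sketched in a few lines of Python. This version assumes 1:1 matching without replacement, processes treated subjects in the order given, and breaks ties at random. The IDs, scores, and function name are hypothetical.

```python
import random

def nearest_neighbor_match(treated, untreated, seed=0):
    """1:1 nearest-neighbor matching without replacement.
    treated / untreated: dicts mapping subject id -> PS.
    Returns a dict mapping treated id -> matched untreated id."""
    rng = random.Random(seed)
    available = dict(untreated)
    pairs = {}
    for t_id, t_ps in treated.items():
        if not available:
            break
        best = min(abs(t_ps - u_ps) for u_ps in available.values())
        ties = [u_id for u_id, u_ps in available.items()
                if abs(t_ps - u_ps) == best]
        u_id = rng.choice(ties)  # random pick among equally close controls
        pairs[t_id] = u_id
        del available[u_id]
    return pairs

# Hypothetical propensity scores
treated = {"T1": 0.35, "T2": 0.45, "T3": 0.90}
untreated = {"U1": 0.30, "U2": 0.36, "U3": 0.50, "U4": 0.62}
pairs = nearest_neighbor_match(treated, untreated)
worst_distance = max(abs(treated[t] - untreated[u]) for t, u in pairs.items())
```

In this toy example T3 (PS = 0.90) is still matched, to a control whose PS is 0.28 away, which is the poor-match situation described above.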
PSM Algorithms: Match within Caliper
› Caliper: limit matches to be within
some range of PS values
› 0.2 of the standard deviation of
the logit of the PS (Austin, 2011)
› 0.25 or 0.5 of the PS standard
deviation
[Figure: treated (T) and untreated (U) subjects arranged by PS from 0.3 to 0.7, illustrating matches restricted to a caliper.]
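A caliper changes nearest-neighbor matching in one way: a treated subject with no control inside the caliper stays unmatched. The Python sketch below assumes 1:1 matching without replacement on whatever score is supplied (PS or logit of PS); the caliper width of 0.1, the IDs, and the function name are illustrative.

```python
def caliper_match(treated, untreated, caliper):
    """1:1 nearest-neighbor matching without replacement within a caliper.
    Returns (pairs, unmatched_treated_ids)."""
    available = dict(untreated)
    pairs, unmatched = {}, []
    for t_id, t_score in treated.items():
        # Distances to the controls that lie inside the caliper
        candidates = {u_id: abs(t_score - u_score)
                      for u_id, u_score in available.items()
                      if abs(t_score - u_score) <= caliper}
        if candidates:
            u_id = min(candidates, key=candidates.get)
            pairs[t_id] = u_id
            del available[u_id]
        else:
            unmatched.append(t_id)
    return pairs, unmatched

# Hypothetical scores
treated = {"T1": 0.35, "T2": 0.45, "T3": 0.90}
untreated = {"U1": 0.30, "U2": 0.36, "U3": 0.50, "U4": 0.62}
pairs, unmatched = caliper_match(treated, untreated, caliper=0.1)
```

The price of better matches is a smaller matched sample: T3 is dropped because no control lies within the caliper.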
PSM Algorithms: Greedy vs. Optimal
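› Greedy: treated subjects are matched one at a time; 112 is matched to 212 (distance 0.01), which leaves 213 for 113 (distance 0.03)
› Overall absolute distance = 0.01 + 0.03 = 0.04
Treated          Untreated
ID     PS        ID     PS
…      …         …      …
112    0.43      210    0.40
                 211    0.41
113    0.45      212    0.44
                 213    0.48
…      …         …      …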
PSM Algorithms: Greedy vs. Optimal
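› Optimal: matches are chosen to minimize the overall distance; 112 is matched to 211 (distance 0.02) and 113 to 212 (distance 0.01)
› Overall absolute distance = 0.02 + 0.01 = 0.03
Treated          Untreated
ID     PS        ID     PS
…      …         …      …
112    0.43      210    0.40
                 211    0.41
113    0.45      212    0.44
                 213    0.48
…      …         …      …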
PSM Algorithms: Greedy vs. Optimal
› The choice often does not make a huge difference
› Both give the same results when matching with
replacement
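The two totals on the slides (0.04 and 0.03) can be reproduced with a short Python sketch. It uses the four visible rows of the example table; the greedy routine matches treated subjects in listed order, and the "optimal" routine simply tries every assignment, which is only feasible for toy data. Function names are hypothetical.

```python
from itertools import permutations

treated = {112: 0.43, 113: 0.45}
untreated = {210: 0.40, 211: 0.41, 212: 0.44, 213: 0.48}

def total_distance(pairs):
    """Sum of absolute PS differences over matched pairs."""
    return sum(abs(treated[t] - untreated[u]) for t, u in pairs)

def greedy_match():
    """Match each treated subject, in order, to its nearest free control."""
    available = dict(untreated)
    pairs = []
    for t_id, t_ps in treated.items():
        u_id = min(available, key=lambda k: abs(t_ps - available[k]))
        pairs.append((t_id, u_id))
        del available[u_id]
    return pairs

def optimal_match():
    """Brute force: the assignment with the smallest total distance."""
    t_ids = list(treated)
    best = min(permutations(untreated, len(t_ids)),
               key=lambda combo: total_distance(list(zip(t_ids, combo))))
    return list(zip(t_ids, best))

greedy_pairs = greedy_match()
optimal_pairs = optimal_match()
greedy_total = round(total_distance(greedy_pairs), 2)
optimal_total = round(total_distance(optimal_pairs), 2)
```

Greedy matching takes control 212 for subject 112 because it is closest, which forces a worse match for 113; optimal matching gives up a little on 112 to lower the overall distance.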
PSM Example
A macro performing N:1
match on propensity score
› N:1 match
› Matching iterations run from 8-digit to
1-digit
› E.g., in the 3rd iteration, 6-digit matching,
PS = 0.12345698 is matched with PS =
0.12345605
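The digit-by-digit idea can be illustrated in Python. This sketch assumes that "k-digit matching" means the two scores agree after truncation to k decimal places, which is consistent with the example above but is an assumption about the macro, not a description of its code. The helper names are hypothetical.

```python
from decimal import Decimal, ROUND_DOWN

def truncate(ps, digits):
    """Truncate a PS to the given number of decimal places."""
    step = Decimal(1).scaleb(-digits)  # e.g. 0.000001 for digits=6
    return Decimal(str(ps)).quantize(step, rounding=ROUND_DOWN)

def first_matching_digits(ps_a, ps_b, max_digits=8):
    """Largest number of decimal places at which the two scores agree,
    searching from max_digits down to 1 (None if they never agree)."""
    for digits in range(max_digits, 0, -1):
        if truncate(ps_a, digits) == truncate(ps_b, digits):
            return digits
    return None

digits_matched = first_matching_digits(0.12345698, 0.12345605)
```

The two scores differ at 8 and 7 digits and first agree at 6 digits, that is, in the third iteration.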
› All macro variables are required except and
SiteN
› Lib has to be specified even if it’s “work”
(otherwise an error will occur)
› If SiteN is specified, then subjects will be
matched within each site
› These statements can be modified or
removed to change matching precision
Run Matching for the Example
Dataset
Examine balance after PS Matching
› P-values can be misleading, especially in
large samples and with many confounders
› Standardized mean difference < 10%
                      Usual care alone   Usual care + blood thinner              Standardized mean
                      (N = 1,819)        (N = 1,819)                  p-value    difference (%)
Age (mean ± SD)       62.7 ± 3.6         62.8 ± 3.5                   0.818      0.76
Female, n (%)         599 (32.9)         615 (33.8)                   0.574      1.87
Height (mean ± SD)    171.9 ± 10.2       171.8 ± 9.5                  0.7511     1.02
Stent, n (%)          1,203 (66.1)       1,214 (66.7)                 0.699      1.28
Diabetes, n (%)       373 (20.5)         371 (20.4)                   0.935      0.27
Acute MI, n (%)       174 (9.6)          182 (10.0)                   0.655      1.48
Standardized Mean Difference
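› For continuous variables:
$$ d = 100 \times \frac{\bar{x}_{T} - \bar{x}_{U}}{\sqrt{(s_{T}^{2} + s_{U}^{2})/2}} $$
› For categorical (binary) variables:
$$ d = 100 \times \frac{\hat{p}_{T} - \hat{p}_{U}}{\sqrt{\left[\hat{p}_{T}(1 - \hat{p}_{T}) + \hat{p}_{U}(1 - \hat{p}_{U})\right]/2}} $$
where T and U denote the treated and untreated groups
› ± Sign does not matter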
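The standardized mean difference is easy to compute from summary statistics. The Python sketch below uses the common pooled-standard-deviation definition (difference in means or proportions divided by the square root of the average of the two group variances, times 100). Function names are hypothetical; the counts are those reported for "Female" in the matched sample.

```python
import math

def smd_continuous(mean_t, sd_t, mean_u, sd_u):
    """Standardized mean difference (in %) for a continuous variable."""
    pooled_sd = math.sqrt((sd_t ** 2 + sd_u ** 2) / 2)
    return 100 * abs(mean_t - mean_u) / pooled_sd

def smd_binary(p_t, p_u):
    """Standardized mean difference (in %) for a binary variable."""
    pooled_sd = math.sqrt((p_t * (1 - p_t) + p_u * (1 - p_u)) / 2)
    return 100 * abs(p_t - p_u) / pooled_sd

# Female: 615 of 1,819 (blood thinner) vs. 599 of 1,819 (usual care alone)
smd_female = smd_binary(615 / 1819, 599 / 1819)
print(round(smd_female, 2))
```

The result agrees with the 1.87 reported for "Female" in the balance table, well under the threshold of 10. Because the formula uses only group-level summaries, it does not grow with sample size the way a test statistic does.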
Another Example
Matching using specified
caliper = 0.2 of SD of logit
of PS
Calculate Logit of PS
Calculate SD of Logit of PS
0.2*SD = 0.158
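The caliper calculation has two steps: transform each PS to the logit scale, then take 0.2 times the standard deviation of the logits. A minimal Python sketch with hypothetical scores follows (so the resulting caliper is not the 0.158 obtained for the example dataset). It uses the sample standard deviation with an n − 1 denominator.

```python
import math
import statistics

def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1 - p))

# Hypothetical propensity scores
ps = [0.20, 0.35, 0.50, 0.65, 0.80]

logit_ps = [logit(p) for p in ps]
sd_logit = statistics.stdev(logit_ps)  # sample SD (n - 1 denominator)
caliper = 0.2 * sd_logit
```

Matching would then be done on `logit_ps`, accepting a control only if its logit lies within `caliper` of the treated subject's logit.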
Estimating Treatment
Effect in Matched Sample
Estimating Treatment Effects
› Run the same outcome analyses you would
have done on the original data
› Doubly robust: regression adjustment
for confounders in the matched sample can
reduce residual confounding and increase precision
› If matching was done with replacement, use
weights to reflect the fact that some
controls are used more than once
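To show what weighting reused controls means, the Python sketch below estimates a risk difference from 1:1 matched pairs in which one control appears twice. Each control is weighted by the number of times it was used. The IDs, outcomes, and function name are hypothetical, and variance estimation is deliberately left out.

```python
from collections import Counter

def matched_risk_difference(pairs, outcome):
    """Risk difference (treated minus control) in a 1:1 matched sample.
    pairs: list of (treated_id, control_id); a control may appear more
    than once when matching was done with replacement.
    outcome: dict mapping subject id -> 0/1."""
    n_pairs = len(pairs)
    risk_treated = sum(outcome[t] for t, _ in pairs) / n_pairs
    # Frequency weight = number of times each control was used
    control_weights = Counter(c for _, c in pairs)
    risk_control = sum(w * outcome[c]
                       for c, w in control_weights.items()) / n_pairs
    return risk_treated - risk_control, control_weights

# Hypothetical matched pairs and 6-month mortality outcomes
pairs = [("T1", "U1"), ("T2", "U1"), ("T3", "U2"), ("T4", "U3")]
outcome = {"T1": 0, "T2": 1, "T3": 0, "T4": 0, "U1": 1, "U2": 0, "U3": 1}

risk_diff, weights = matched_risk_difference(pairs, outcome)
```

Control U1 counts twice in the control risk, once for each treated subject it was matched to.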
Some Considerations
› PS model:
› Non-parsimonious model to estimate PS
› Include covariates that are associated
with outcome, or with both outcome and
treatment; do NOT include covariates
that are strongly correlated with
treatment, but not directly associated
with outcome
› Can include interaction terms and higher-order
terms to improve PS estimation and
matching
› Sample size
› At least 1,000 to 1,500 subjects (Shadish, 2013)
› Missing data
› List-wise deletion
Thanks!
Questions?
Comments?
Further questions: yduan@uchc.edu
References:
Overview/tutorial of Propensity Score method:
1. Rosenbaum, P. R., & Rubin, D. B. (1983). The central role of the
propensity score in observational studies for causal
effects. Biometrika, 70(1), 41-55.
2. Austin, P. C. (2011). An Introduction to Propensity Score Methods for
Reducing the Effects of Confounding in Observational Studies.
Multivariate Behavioral Research, 46(3), 399–424.
http://doi.org/10.1080/00273171.2011.568786
3. Stuart, E. A. (2010). Matching methods for causal inference: A review
and a look forward. Statistical Science : A Review Journal of the
Institute of Mathematical Statistics, 25(1), 1–21.
http://doi.org/10.1214/09-STS313
4. Caliendo, M., & Kopeinig, S. (2008). Some practical guidance for the
implementation of propensity score matching. Journal of Economic
Surveys, 22(1), 31-72.
References (cont.):
Others:
1. Brookhart, M. A., Schneeweiss, S., Rothman, K. J., Glynn, R. J., Avorn, J., &
Stürmer, T. (2006). Variable selection for propensity score
models. American Journal of Epidemiology, 163(12), 1149-1156.
2. Shadish, W. R. (2013). Propensity score analysis: promise, reality and
irrational exuberance. Journal of Experimental Criminology, 9(2), 129-144.
3. Ho, D. E., Imai, K., King, G., & Stuart, E. A. (2007). Matching as
nonparametric preprocessing for reducing model dependence in
parametric causal inference. Political analysis, 15(3), 199-236.
References (cont.):
Materials from Other Presentations:
1. Stuart, E. (2011). “The why, when, and how of propensity score methods
for estimating causal effects” at Society for Prevention Research, May
31, 2011. Slides:
http://www.preventionresearch.org/wp-content/uploads/2011/07/SPR-Propensity-pc-workshop-slides.pdf
2. VanEseltine, M. (2013). “Introduction to propensity score analysis” at
CFDR Summer Methods Seminar, June 26, 2013. Slides:
https://www.bgsu.edu/content/dam/BGSU/college-of-arts-and-sciences/center-for-family-and-demographic-research/documents/Workshops/2013%20-workshop-propensity-score-analysis.pdf
References (cont.):
Macros for propensity score matching:
1. Parsons, L. (2004, May). Performing a 1: N case-control match on
propensity score. In Proceedings of the 29th Annual SAS Users Group
international conference (pp. 165-29).
2. Fraeman, K. H. (2010). An introduction to implementing propensity score
matching with SAS®. Bethesda, MD: United BioSource Corporation.
3. Feng, W. W., Jun, Y., & Xu, R. (2006). A method/macro based on
propensity score and Mahalanobis distance to reduce bias in treatment
comparison in observational study. In SAS PharmaSUG 2006
Conference.
4. Coca-Perraillon, M. (2007, April). Local and global optimal propensity
score matching. In SAS Global Forum (Vol. 185, pp. 1-9).


Editor's Notes

  1. Observational study is also to study casual effects Biggest challenge in observational studies is to eliminate or decrease confounding effects as much as possible
  2. PS exists in both RCT and observational studies. In RCT, PS is known and is defined by the study design. In observational studies, PS is generally unknown and has to be estimated from study data. (Austin, 2011)
  3. In this step, the usual diagnostic procedures are not necessary. Neither the predictive performance of the model nor collinearity among covariates is a concern. We only want to obtain predicted probabilities, and we care about whether they yield a balanced sample in the later matching step.
  4. The first assumption can also be paraphrased as “all variables that affect treatment assignment and outcome have been measured; there are no unknown or unmeasured confounders.” If this assumption holds, that is, if unconfoundedness holds given the full set of observed covariates, then unconfoundedness also holds given the propensity score. If both assumptions hold, treatment assignment is strongly ignorable, so conditioning on the propensity score allows one to obtain unbiased estimates of average treatment effects.
  5. Matching is the most widely used method and is the focus of my talk today. Covariate adjustment is the least used of the four methods; I have seen hardly any recent PS studies that used covariate adjustment.
  6. Strata that have very few subjects from either group should be trimmed from the final dataset used to estimate treatment effects.
  7. The IPTW method uses the PS directly in the outcome analysis: treated weight = 1/PS; untreated weight = 1/(1-PS).
  8. Without common support, that is, when the treated and untreated have very different distributions of the confounders, model results can be biased.
  9. Not hugely different, unless the research question concerns the pairs themselves rather than the overall matched sample. Optimal matching picks about the same controls but does a better job of assigning them to treated units.
  10. 1819 subjects in each group.
  11. Overview/tutorial of Propensity Score method: Rosenbaum, P. R., & Rubin, D. B. (1983). The central role of the propensity score in observational studies for causal effects. Biometrika, 70(1), 41-55. Austin, P. C. (2011). An Introduction to Propensity Score Methods for Reducing the Effects of Confounding in Observational Studies. Multivariate Behavioral Research, 46(3), 399–424. http://doi.org/10.1080/00273171.2011.568786 Stuart, E. A. (2010). Matching methods for causal inference: A review and a look forward. Statistical Science: A Review Journal of the Institute of Mathematical Statistics, 25(1), 1–21. http://doi.org/10.1214/09-STS313 Caliendo, M., & Kopeinig, S. (2008). Some practical guidance for the implementation of propensity score matching. Journal of Economic Surveys, 22(1), 31-72. Others: Brookhart, M. A., Schneeweiss, S., Rothman, K. J., Glynn, R. J., Avorn, J., & Stürmer, T. (2006). Variable selection for propensity score models. American Journal of Epidemiology, 163(12), 1149-1156. Shadish, W. R. (2013). Propensity score analysis: promise, reality and irrational exuberance. Journal of Experimental Criminology, 9(2), 129-144. Ho, D. E., Imai, K., King, G., & Stuart, E. A. (2007). Matching as nonparametric preprocessing for reducing model dependence in parametric causal inference. Political Analysis, 15(3), 199-236. Materials from Other Presentations: Stuart, E. (2011). “The why, when, and how of propensity score methods for estimating causal effects” at Society for Prevention Research, May 31, 2011. Slides: http://www.preventionresearch.org/wp-content/uploads/2011/07/SPR-Propensity-pc-workshop-slides.pdf VanEseltine, M. (2013). “Introduction to propensity score analysis” at CFDR Summer Methods Seminar, June 26, 2013. Slides: https://www.bgsu.edu/content/dam/BGSU/college-of-arts-and-sciences/center-for-family-and-demographic-research/documents/Workshops/2013%20-workshop-propensity-score-analysis.pdf Macros for propensity score matching: Parsons, L. (2004, May). Performing a 1:N case-control match on propensity score. In Proceedings of the 29th Annual SAS Users Group International Conference (pp. 165-29). Fraeman, K. H. (2010). An introduction to implementing propensity score matching with SAS®. Bethesda, MD: United BioSource Corporation. Feng, W. W., Jun, Y., & Xu, R. (2006). A method/macro based on propensity score and Mahalanobis distance to reduce bias in treatment comparison in observational study. In SAS PharmaSUG 2006 Conference. Coca-Perraillon, M. (2007, April). Local and global optimal propensity score matching. In SAS Global Forum (Vol. 185, pp. 1-9).
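
The IPTW weights described in note 7 can be written as a single formula. This is only a sketch of that note: the symbols w_i (the weight for subject i) and T_i (shorthand for the treatment indicator, 1 = treated, 0 = untreated) are notation assumed here for illustration, while PS_i and X_i follow the definition of the propensity score given earlier in the slides.

```latex
% Inverse probability of treatment weight for subject i
% (w_i and T_i are assumed notation; PS_i = Pr(T_i = 1 | X_i))
w_i = \frac{T_i}{PS_i} + \frac{1 - T_i}{1 - PS_i}

% Worked example with PS_i = 0.25:
%   treated   (T_i = 1):  w_i = 1 / 0.25       = 4
%   untreated (T_i = 0):  w_i = 1 / (1 - 0.25) \approx 1.33
```

A treated subject who had only a 25% estimated chance of being treated is weighted up by a factor of 4, while an untreated subject with the same PS receives a weight of about 1.33.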