1. Application of Propensity Score
Matching in Observational
Studies Using SAS
Yinghui (Delian) Duan, M.Sc., Ph.D candidate
Department of Community Medicine and Health Care,
University of Connecticut Health Center
Connecticut Institute for Clinical and Translational Science (CICATS)
Email: yduan@uchc.edu
3. Randomized Control Trials (RCTs)
› Treatment assignment is randomized
› Pre-treatment characteristics are balanced,
no confounding effects
› Difference in post-treatment outcomes can
be attributed to treatment effects
› “Gold standard” to estimate the effects of
treatment, interventions, and exposures
4. Observational Studies
› Non-experimental
› Treatment assignment is not determined by
design
› Usually the “treated” and “untreated” are
systematically different in some
characteristics that can affect outcome of
interest (i.e. confounders)
› Difficult to conclude causal effects due to
confounders
6. Propensity Score (PS)
› Defined by Rosenbaum & Rubin in 1983:
the probability of treatment assignment
conditional on observed baseline covariates
PS_i = Pr(Treatment_i = 1 | X_i)
› A useful tool to remove confounding effects
and enhance causal inference in
observational studies
7. Estimating PS
› PS is most often estimated by a logistic
regression model
› Can also be estimated using other methods,
e.g., bagging or boosting, recursive
partitioning or tree-based methods, random
forests, and neural networks.
› No significant advantages reported compared
to logistic regression model
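As a rough illustration of the estimation step, the following Python sketch fits a logistic model by plain gradient ascent and returns the predicted probabilities as propensity scores. It is not the SAS code used in the talk; the toy covariates (centered age in decades, acute MI), the treatment vector, and the function names are all hypothetical, and a real analysis would use a standard logistic regression procedure.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, t, lr=0.5, n_iter=5000):
    """Fit a logistic model by batch gradient ascent; returns [intercept, b1, ...]."""
    n, p = len(X), len(X[0])
    beta = [0.0] * (p + 1)
    for _ in range(n_iter):
        grad = [0.0] * (p + 1)
        for xi, ti in zip(X, t):
            # Residual = observed treatment minus current predicted probability
            resid = ti - sigmoid(beta[0] + sum(b * x for b, x in zip(beta[1:], xi)))
            grad[0] += resid
            for j, x in enumerate(xi):
                grad[j + 1] += resid * x
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta

def propensity_scores(X, beta):
    """Predicted probability of treatment for each subject."""
    return [sigmoid(beta[0] + sum(b * x for b, x in zip(beta[1:], xi))) for xi in X]

# Hypothetical toy data: [centered age in decades, acute MI (0/1)]
X = [[-1.0, 0], [-0.5, 0], [0.0, 0], [0.5, 0], [1.0, 0],
     [-1.0, 1], [-0.5, 1], [0.0, 1], [0.5, 1], [1.0, 1]]
treated = [1, 0, 1, 0, 0, 1, 1, 1, 0, 1]

beta = fit_logistic(X, treated)
ps = propensity_scores(X, beta)
```

Only the predicted probabilities in `ps` are carried forward to matching, stratification, or weighting; the coefficients themselves are not of interest here.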
9. Example dataset: 6-month Mortality after
Percutaneous Coronary Intervention (PCI)
› Study sample: patients who received PCI
› Treatments: usual care alone vs. usual care + a
blood thinner
› Baseline confounders: age, gender, height,
coronary stent placement, acute myocardial
infarction within 7 days, and diabetes
› Outcome: 6-month mortality (0 or 1)
10. Example dataset: 6-month Mortality after
Percutaneous Coronary Intervention (PCI)
› Table: Sample description
                      Usual care alone      Usual care + blood thinner
                      (N = 2,830)           (N = 2,332)
                      n        %            n        %             p-value
Age (mean ± SD)       64.6 ± 4.2            62.0 ± 3.8             <0.001
Female                938      33.1         760      32.6          0.673
Height (mean ± SD)    172.4 ± 10.2          171.6 ± 9.5            0.002
Stent                 1,794    63.4         1,611    69.1          <0.001
Diabetes              659      23.3         438      18.8          <0.001
Acute MI              193      6.8          356      15.3          <0.001
14. Two Important Assumptions
› The assignment of treatment is independent
of potential outcomes conditional on the
observed baseline covariates
› Every subject has a nonzero probability of
receiving either treatment
15. Four Methods
› PS matching – most widely used
› Stratification using PS
16. › Stratification using PS
[Figure: treated (T) and untreated (U) subjects positioned along the PS axis (0.3 to 0.9) and grouped into strata 1 to 5; two regions are labeled "Trimming".]
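A minimal Python sketch of stratification with trimming, under stated assumptions: subjects are split into equal-sized strata by PS rank, strata that lack either treated or untreated subjects are trimmed, and the stratum-specific differences in mean outcome are averaged with weights proportional to stratum size. The function name and the toy data are hypothetical, and the trimming rule here (drop strata with no subjects from one group) is a simplification.

```python
def stratified_effect(ps, treated, outcome, n_strata=5):
    """Return (pooled difference in mean outcome, number of strata kept)."""
    order = sorted(range(len(ps)), key=lambda i: ps[i])
    n = len(order)
    total, weight_sum, kept = 0.0, 0, 0
    for s in range(n_strata):
        idx = order[s * n // n_strata:(s + 1) * n // n_strata]
        y1 = [outcome[i] for i in idx if treated[i] == 1]
        y0 = [outcome[i] for i in idx if treated[i] == 0]
        if not y1 or not y0:
            continue  # trim strata without both groups
        diff = sum(y1) / len(y1) - sum(y0) / len(y0)
        total += diff * len(idx)
        weight_sum += len(idx)
        kept += 1
    effect = total / weight_sum if weight_sum else None
    return effect, kept

# Hypothetical toy data: 10 subjects, 5 strata of 2
ps      = [0.10, 0.15, 0.30, 0.35, 0.50, 0.55, 0.70, 0.75, 0.90, 0.95]
treated = [0,    0,    1,    0,    0,    1,    1,    0,    1,    1]
outcome = [0,    1,    0,    1,    0,    0,    0,    1,    0,    1]

effect, kept = stratified_effect(ps, treated, outcome)
```

In this toy example the lowest stratum has only untreated subjects and the highest has only treated subjects, so both are trimmed and the estimate rests on the three middle strata.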
17. Four Methods
› PS matching – most widely used
› Stratification using PS
› Weighting adjustment, e.g., Inverse
probability of treatment weighting (IPTW)
using PS
› Covariate adjustment using PS – not
recommended
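For the weighting approach, a short Python sketch may help. It assumes the usual IPTW weights (1/PS for treated subjects, 1/(1 − PS) for untreated subjects) and compares weighted mean outcomes between groups. The function names and the four-subject toy data are hypothetical, and normalizing by the sum of weights within each group is one common choice rather than the only one.

```python
def iptw_weights(ps, treated):
    """Treated subjects get 1/PS; untreated subjects get 1/(1 - PS)."""
    return [1.0 / p if t == 1 else 1.0 / (1.0 - p) for p, t in zip(ps, treated)]

def weighted_risk_difference(ps, treated, outcome):
    """Difference in weighted mean outcome, treated minus untreated."""
    w = iptw_weights(ps, treated)
    num1 = sum(wi * y for wi, t, y in zip(w, treated, outcome) if t == 1)
    den1 = sum(wi for wi, t in zip(w, treated) if t == 1)
    num0 = sum(wi * y for wi, t, y in zip(w, treated, outcome) if t == 0)
    den0 = sum(wi for wi, t in zip(w, treated) if t == 0)
    return num1 / den1 - num0 / den0

# Hypothetical toy data
ps      = [0.5, 0.25, 0.8, 0.2]
treated = [1,   1,    0,   0]
outcome = [1,   0,    1,   0]

weights = iptw_weights(ps, treated)
rd = weighted_risk_difference(ps, treated, outcome)
```

Subjects whose observed treatment was unlikely given their covariates (a treated subject with a low PS, or an untreated subject with a high PS) receive the largest weights.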
18. Propensity Score Matching
To form matched sets of
treated and untreated
subjects who share a similar
value of PS
20. Four Methods – Common Support
› PS matching –
› Stratification – only when used together with
trimming
› IPTW – does not explicitly examine common
support
› Covariate adjustment – does not explicitly
examine common support
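One simple way to examine common support is a minimum/maximum comparison: keep only subjects whose PS lies inside the range where the treated and untreated PS distributions overlap. The Python sketch below assumes that rule; the function name and the toy data are hypothetical, and other overlap checks are possible.

```python
def common_support(ps, treated):
    """Return (low, high, indices of subjects inside the overlap region)."""
    t_ps = [p for p, t in zip(ps, treated) if t == 1]
    u_ps = [p for p, t in zip(ps, treated) if t == 0]
    low = max(min(t_ps), min(u_ps))    # larger of the two group minima
    high = min(max(t_ps), max(u_ps))   # smaller of the two group maxima
    keep = [i for i, p in enumerate(ps) if low <= p <= high]
    return low, high, keep

# Hypothetical toy data: five untreated subjects followed by five treated subjects
ps      = [0.30, 0.35, 0.42, 0.55, 0.68, 0.45, 0.60, 0.75, 0.85, 0.90]
treated = [0,    0,    0,    0,    0,    1,    1,    1,    1,    1]

low, high, keep = common_support(ps, treated)
```

Here untreated subjects with a PS below 0.45 and treated subjects with a PS above 0.68 fall outside the overlap region and would be dropped before the outcome analysis.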
21. PS Matching
› Some decisions to be made:
› 1:1 or N:1 matching
› N:1 can improve efficiency, reduce
variance, but increase bias
› With or without replacement
› With-replacement may yield less bias,
but higher variance
› Which algorithm?
22. PSM Algorithms: Nearest-Neighbor
[Figure: treated (T) and untreated (U) subjects positioned along the PS axis, 0.3 to 0.7.]
› Each treated will get a match,
even if it isn’t a very good one
› Creates a problem when a
treated subject has no
controls with a similar PS
› If there are multiple untreated
subjects with the same PS value as
the treated subject, randomly
select one
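The behavior described above can be sketched in Python as a greedy 1:1 nearest-neighbor match without replacement, with random selection among tied controls. The subject IDs, PS values, function name, and the order in which treated subjects are processed are all assumptions made for illustration.

```python
import random

def nearest_neighbor_match(treated, untreated, seed=0):
    """Greedy 1:1 matching; treated/untreated map subject id -> PS."""
    rng = random.Random(seed)
    pool = dict(untreated)          # controls still available
    pairs = []
    for t_id, t_ps in treated.items():
        if not pool:
            break
        best = min(abs(p - t_ps) for p in pool.values())
        # Controls at exactly the smallest distance; pick one at random
        ties = [u for u, p in pool.items() if abs(p - t_ps) == best]
        u_id = rng.choice(ties)
        pairs.append((t_id, u_id))
        del pool[u_id]              # without replacement
    return pairs

# Hypothetical toy data
treated   = {"T1": 0.35, "T2": 0.45, "T3": 0.65, "T4": 0.95}
untreated = {"U1": 0.30, "U2": 0.35, "U3": 0.55, "U4": 0.66}

pairs = nearest_neighbor_match(treated, untreated)
```

In this toy example T4 (PS = 0.95) is still paired with U1 (PS = 0.30), the only control left, which illustrates how nearest-neighbor matching can return a poor match when no similar control exists.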
23. PSM Algorithms: Match within Caliper
› Caliper: limit matches to be within
some range of PS values
› 0.2 of the standard deviation of
the logit of the PS (Austin, 2011)
› 0.25 or 0.5 of the PS standard
deviation
[Figure: treated (T) and untreated (U) subjects positioned along the PS axis, 0.3 to 0.7.]
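A hedged Python sketch of caliper matching, using a caliper of 0.2 standard deviations of the logit of the PS. As a simplification, the standard deviation is computed over all subjects combined; the function names, subject IDs, and PS values are hypothetical.

```python
import math
import statistics

def logit(p):
    return math.log(p / (1.0 - p))

def caliper_match(treated, untreated, caliper_sd=0.2):
    """Greedy 1:1 matching on logit(PS), restricted to controls within the caliper."""
    lt = {k: logit(p) for k, p in treated.items()}
    lu = {k: logit(p) for k, p in untreated.items()}
    # Assumption: SD taken over the whole sample rather than pooled by group
    width = caliper_sd * statistics.stdev(list(lt.values()) + list(lu.values()))
    pool = dict(lu)
    pairs, unmatched = [], []
    for t_id, t_logit in lt.items():
        cands = {u: abs(l - t_logit) for u, l in pool.items()
                 if abs(l - t_logit) <= width}
        if not cands:
            unmatched.append(t_id)   # no control close enough
            continue
        u_id = min(cands, key=cands.get)
        pairs.append((t_id, u_id))
        del pool[u_id]
    return pairs, unmatched, width

# Hypothetical toy data
treated   = {"T1": 0.35, "T2": 0.45, "T3": 0.65, "T4": 0.95}
untreated = {"U1": 0.30, "U2": 0.35, "U3": 0.55, "U4": 0.66}

pairs, unmatched, width = caliper_match(treated, untreated)
```

Unlike plain nearest-neighbor matching, treated subjects without a control inside the caliper (T2 and T4 in this toy example) are left unmatched instead of being given a poor match.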
28. › N:1 match
› Matching iterations are from 8-digit to
1-digit
› E.g., in the 3rd iteration, 6-digit matching,
PS = 0.12345698 is matched with PS =
0.12345605
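The 8-digit to 1-digit iteration can be sketched in Python as follows. This is not the SAS macro itself: it is a 1:1 simplification, and it assumes that "k-digit matching" means agreement on the first k decimal digits after truncation, which is what the example above suggests. The function names and toy data are hypothetical.

```python
def digit_key(ps, k):
    """First k decimal digits of the PS as a string, e.g. '0.123456' for k = 6."""
    return f"{ps:.8f}"[:2 + k]

def digit_match(treated, untreated, max_digits=8):
    """Greedy matching from max_digits down to 1 digit; returns (pairs, unmatched treated)."""
    t_pool, u_pool = dict(treated), dict(untreated)
    pairs = []
    for k in range(max_digits, 0, -1):
        # Index the remaining controls by their k-digit key
        by_key = {}
        for u_id, p in u_pool.items():
            by_key.setdefault(digit_key(p, k), []).append(u_id)
        for t_id in list(t_pool):
            cands = by_key.get(digit_key(t_pool[t_id], k))
            if cands:
                u_id = cands.pop(0)
                pairs.append((t_id, u_id, k))   # record the precision of the match
                del t_pool[t_id]
                del u_pool[u_id]
    return pairs, list(t_pool)

# Hypothetical toy data
treated   = {"T1": 0.12345698, "T2": 0.40, "T3": 0.90}
untreated = {"U1": 0.12345605, "U2": 0.43, "U3": 0.20}

pairs, unmatched = digit_match(treated, untreated)
```

The best available matches are made first, and precision is relaxed one digit at a time for the subjects that remain; T3 finds no control even at 1 digit and stays unmatched.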
29. › All macro variables are required except and
SiteN
› Lib has to be specified even if it’s “work”
(otherwise error will occur)
› If SiteN is specified, then subjects will be
matched within each site
30. › These statements can be modified or
removed to change matching precision
38. Estimating Treatment Effects
› Run the same outcome analyses you would
have done on the original data
› Doubly robust: regression adjustment
for confounders can reduce residual
confounding and increase precision
› If matching is done with replacement, weights
are needed to reflect the fact that some
controls are used more than once
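A small Python sketch of the weighting idea for matching with replacement: each control is weighted by the number of times it was used, and the weighted control mean is compared with the treated mean. The pair list, outcomes, and function name are hypothetical, and this ignores the variance adjustments a full analysis would need.

```python
from collections import Counter

def matched_effect_with_replacement(pairs, outcome):
    """pairs: list of (treated id, control id); outcome: id -> 0/1 outcome."""
    control_weight = Counter(u for _, u in pairs)   # times each control was used
    treated_ids = {t for t, _ in pairs}
    mean_treated = sum(outcome[t] for t in treated_ids) / len(treated_ids)
    total_w = sum(control_weight.values())
    mean_control = sum(w * outcome[u] for u, w in control_weight.items()) / total_w
    return mean_treated - mean_control, dict(control_weight)

# Hypothetical matched pairs: control U1 is used twice
pairs = [("T1", "U1"), ("T2", "U1"), ("T3", "U2")]
outcome = {"T1": 1, "T2": 0, "T3": 0, "U1": 1, "U2": 0}

effect, weights = matched_effect_with_replacement(pairs, outcome)
```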
40. › PS model:
› Non-parsimonious model to estimate PS
› Include covariates that are associated
with outcome, or with both outcome and
treatment; do NOT include covariates
that are strongly correlated with
treatment, but not directly associated
with outcome
› Can include interaction terms and higher-
order terms to improve PS estimation and
matching
41. › Sample size
› At least 1,000 – 1,500 (Shadish 2013)
› Missing data
› List-wise deletion
43. References:
Overview/tutorial of Propensity Score method:
1. Rosenbaum, P. R., & Rubin, D. B. (1983). The central role of the
propensity score in observational studies for causal
effects. Biometrika, 70(1), 41-55.
2. Austin, P. C. (2011). An Introduction to Propensity Score Methods for
Reducing the Effects of Confounding in Observational Studies.
Multivariate Behavioral Research, 46(3), 399–424.
http://doi.org/10.1080/00273171.2011.568786
3. Stuart, E. A. (2010). Matching methods for causal inference: A review
and a look forward. Statistical Science : A Review Journal of the
Institute of Mathematical Statistics, 25(1), 1–21.
http://doi.org/10.1214/09-STS313
4. Caliendo, M., & Kopeinig, S. (2008). Some practical guidance for the
implementation of propensity score matching. Journal of Economic
Surveys, 22(1), 31-72.
44. References (cont.):
Others:
1. Brookhart, M. A., Schneeweiss, S., Rothman, K. J., Glynn, R. J., Avorn, J., &
Stürmer, T. (2006). Variable selection for propensity score
models. American Journal of Epidemiology, 163(12), 1149-1156.
2. Shadish, W. R. (2013). Propensity score analysis: promise, reality and
irrational exuberance. Journal of Experimental Criminology, 9(2),
129-144.
3. Ho, D. E., Imai, K., King, G., & Stuart, E. A. (2007). Matching as
nonparametric preprocessing for reducing model dependence in
parametric causal inference. Political analysis, 15(3), 199-236.
45. References (cont.):
Materials from Other Presentations:
1. Stuart, E. (2011). “The why, when, and how of propensity score methods
for estimating causal effects” at Society for Prevention Research, May
31, 2011. Slides:
http://www.preventionresearch.org/wp-content/uploads/2011/07/SPR-Propensity-pc-workshop-slides.pdf
2. VanEseltine, M. (2013). “Introduction to propensity score analysis” at
CFDR Summer Methods Seminar, June 26, 2013. Slides:
https://www.bgsu.edu/content/dam/BGSU/college-of-arts-and-sciences/center-for-family-and-demographic-research/documents/Workshops/2013%20-workshop-propensity-score-analysis.pdf
46. References (cont.):
Macros for propensity score matching:
1. Parsons, L. (2004, May). Performing a 1: N case-control match on
propensity score. In Proceedings of the 29th Annual SAS Users Group
international conference (pp. 165-29).
2. Fraeman, K. H. (2010). An introduction to implementing propensity score
matching with SAS®. Bethesda, MD: United BioSource Corporation.
3. Feng, W. W., Jun, Y., & Xu, R. (2006). A method/macro based on
propensity score and Mahalanobis distance to reduce bias in treatment
comparison in observational study. In SAS PharmaSUG 2006
Conference.
4. Coca-Perraillon, M. (2007, April). Local and global optimal propensity
score matching. In SAS Global Forum (Vol. 185, pp. 1-9).
Editor's Notes
Observational studies also aim to study causal effects
Biggest challenge in observational studies is to eliminate or decrease confounding effects as much as possible
PS exists in both RCT and observational studies. In RCT, PS is known and is defined by the study design. In observational studies, PS is generally unknown and has to be estimated from study data. (Austin, 2011)
In this step, the usual diagnostic procedures are not necessary. The predictive ability of the model and collinearity among covariates are not concerns. We only want to obtain predicted probabilities, and we care about whether they yield a balanced sample in the later matching step.
The first assumption can also be paraphrased as “all variables that affect treatment assignment and outcome have been measured; there are no unknown or unmeasured confounders”. If this assumption holds, that is, if unconfoundedness holds given the full set of observed covariates, then unconfoundedness also holds given the propensity score.
If both assumptions hold, the treatment assignment is strongly ignorable, therefore conditioning on the propensity score allows one to obtain unbiased estimates of average treatment effects.
Matching is the most widely used method and is also the focus of my talk today.
Covariate adjustment is the least used method among the four. I rarely see recent studies using PS methods that actually used covariate adjustment.
Strata that have very few subjects from either group should be trimmed out from final dataset to estimate treatment effects.
The IPTW method uses the PS directly in the outcome analysis: treated weight = 1/PS; untreated weight = 1/(1 − PS).
Without common support, if treated and untreated have very different distributions of the confounders, model results can be biased
Results are not hugely different unless the research question concerns the pairs themselves rather than the overall matched sample. Optimal matching picks about the same controls, but does a better job of assigning them to treated units.
1819 subjects in each group.