Transcript of "More Predictive Modeling of Total Healthcare Costs Using Pharmacy Claims Data"

  1. More Predictive Modeling of Total Healthcare Costs Using Pharmacy Claims Data: Adherence Dimension and Boosted Regression
     M. Christopher Roebuck & Joshua N. Liberman
     American Society of Health Economists Inaugural Conference, Madison, Wisc., June 6, 2006
  2. Predictive Modeling
     • Forecasts health services utilization and costs of insurance plan members
     • Identifies candidates for disease and therapy management interventions
     • Infers disease state, severity and statistical/econometric methods employed, depending on the classification system used
     • Includes well-known claims- and diagnosis-based “groupers,” such as the chronic disease score, adjusted clinical groups, diagnostic cost groups and episode risk groups
  3. Study Background
     • Pharmacy claims are less costly and contain fewer coding errors than medical data
     • Pharmacy health dimensions (PHD): pharmacy-based risk index that categorizes a year of prescription data into 62 disease indicators
     • A previous study assessed PHD accuracy at predicting prospective total annual healthcare costs, using several econometric techniques to deal with skewness and kurtosis [1]
     • This study is an extension of that work
     [1] Powers, C.A., C.M. Meyer, M.C. Roebuck, and B. Vaziri. 2005. “Predictive Modeling of Total Healthcare Costs Using Pharmacy Claims Data: A Comparison of Alternative Econometric Cost Modeling Techniques.” Medical Care 43(11): 1065-1072.
  4. Study Objectives
     • Examine relationship between plan participant adherence to drug therapy and future total healthcare costs by augmenting PHD with an adherence dimension of predictors
     • Evaluate use of boosted regression modeling as an alternative to other econometric approaches for predicting (commonly) skewed and kurtotic healthcare cost data
  5. About Adherence
     • Adherence is defined as “the extent to which a person’s behavior – taking medication, following a diet and/or executing lifestyle changes – corresponds with agreed recommendations from a healthcare provider.” [1]
     • “Poor adherence to the treatment of chronic diseases is a worldwide problem of striking magnitude. The impact of poor adherence grows as the burden of chronic disease grows.” [1]
     • Adherence to drug therapy is measured as both:
       • Compliance: the extent to which a plan participant takes medicine as prescribed (e.g., medication possession ratio).
       • Persistence: the extent to which a plan participant follows the prescribed length of therapy (e.g., length of continuous therapy in days).
     Source: [1] World Health Organization 2003. “Adherence to Long-term Therapies – Evidence for action.”
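The two adherence measures named on this slide can be illustrated with a small sketch. The slides do not say how compliance or persistence were actually computed, so everything here is an assumption: the function names, the simple "days supplied divided by days in period" form of the medication possession ratio, the cap at 1.0, and the 30-day permissible refill gap are hypothetical choices for illustration only.

```python
from datetime import date, timedelta

def medication_possession_ratio(fills, period_start, period_end):
    """Share of days in the period covered by dispensed supply (capped at 1.0).

    fills: list of (fill_date, days_supply) tuples. Hypothetical input layout.
    """
    period_days = (period_end - period_start).days + 1
    supplied = sum(days for fill_date, days in fills
                   if period_start <= fill_date <= period_end)
    return min(supplied / period_days, 1.0)

def days_persistent(fills, max_gap=30):
    """Length of continuous therapy in days, starting at the first fill.

    Therapy is treated as discontinued once a refill arrives more than
    max_gap days after the previous supply ran out (assumed rule).
    """
    fills = sorted(fills)
    if not fills:
        return 0
    start = fills[0][0]
    covered_until = start + timedelta(days=fills[0][1])
    for fill_date, days in fills[1:]:
        if (fill_date - covered_until).days > max_gap:
            break  # gap too long: therapy considered discontinued
        # Early refills extend coverage from the end of the existing supply.
        covered_until = max(covered_until, fill_date) + timedelta(days=days)
    return (covered_until - start).days

fills = [(date(2003, 1, 1), 30), (date(2003, 2, 1), 30), (date(2003, 6, 1), 30)]
mpr = medication_possession_ratio(fills, date(2003, 1, 1), date(2003, 12, 31))
persistence = days_persistent(fills)
```

In this toy example the member received 90 days of supply over a 365-day year, and the June refill comes too late to count toward continuous therapy.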
  6. About Boosted Regression
     • Originated in computational learning as “boosting” [1]
     • Extends to generalized linear models via the Gradient Boosting Machine [2, 3], which recasts the algorithm in a likelihood framework
     • Fits a regression tree to the residuals from the previously fitted regression tree (beginning with a “guess” of the response variable)
     • Updates the model sequentially to include all previously estimated regression trees, subject to two parameters specified a priori:
       • Number of “splits” (or N-way interactions)
       • Maximum number of iterations
     Sources:
     [1] Freund, Y. and R. E. Schapire. 1997. “A decision-theoretic generalization of online learning and an application to boosting.” Journal of Computer and System Sciences 55(1): 119-139.
     [2] Friedman, J.H. 2001. “Greedy function approximation: a gradient boosting machine.” Annals of Statistics 29(5): 1189-1232.
     [3] Friedman, J.H. 2002. “Stochastic gradient boosting.” Computational Statistics and Data Analysis 38(4): 367-378.
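The residual-fitting loop described on this slide can be sketched in a few lines. This is a deliberately simplified illustration, not the procedure used in the study: it assumes squared-error loss, a single predictor, and one-split trees (stumps), and it omits the random subsampling of stochastic gradient boosting. The names `fit_stump`, `boost_fit`, and `boost_predict` are hypothetical.

```python
def fit_stump(x, residuals):
    """Find the single split on x that minimizes squared error on the residuals."""
    pairs = sorted(zip(x, residuals))
    overall = sum(residuals) / len(residuals)
    best = (float("inf"), float("inf"), overall, overall)  # (sse, threshold, left, right)
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # cannot split between identical x values
        threshold = (pairs[i][0] + pairs[i - 1][0]) / 2
        left = [r for _, r in pairs[:i]]
        right = [r for _, r in pairs[i:]]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]

def boost_fit(x, y, max_iter=200, shrink=0.1):
    """Start from a constant guess, then repeatedly fit a stump to the residuals."""
    base = sum(y) / len(y)  # the initial "guess" of the response
    preds = [base] * len(y)
    stumps = []
    for _ in range(max_iter):
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        threshold, left_mean, right_mean = fit_stump(x, residuals)
        stumps.append((threshold, left_mean, right_mean))
        # Add a shrunken version of the new tree to the running prediction.
        preds = [pi + shrink * (left_mean if xi <= threshold else right_mean)
                 for xi, pi in zip(x, preds)]
    return base, shrink, stumps

def boost_predict(model, xi):
    """Sum the initial guess and all shrunken stump contributions."""
    base, shrink, stumps = model
    return base + sum(shrink * (lm if xi <= t else rm) for t, lm, rm in stumps)

x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [1, 1, 1, 1, 5, 5, 5, 5]
model = boost_fit(x, y)
```

Here `max_iter` plays the role of the maximum number of iterations and `shrink` scales each tree's contribution; allowing more than one split per tree would let the model capture interactions among several predictors.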
  7. Data
     • Utilized integrated medical and pharmacy claims data from a large (N=369,985) U.S. health plan
     • Studied 2003 and 2004 data, which allowed for a baseline/follow-up design
     • Included plan participants continuously eligible for pharmacy benefits for the entire study period
     • Allowed for no other exclusions or restrictions (e.g., all ages and all claims remained in the study)
     • Partitioned data randomly into 70% training and 30% validation samples
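A random 70/30 partition like the one listed on this slide might look as follows. This is only a sketch: the function name, the use of member identifiers as the unit of splitting, and the fixed seed are assumptions for illustration, not details taken from the study.

```python
import random

def split_train_validation(member_ids, train_fraction=0.7, seed=1):
    """Shuffle member IDs reproducibly and cut them into training and validation sets."""
    rng = random.Random(seed)  # a fixed seed makes the partition repeatable
    shuffled = list(member_ids)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

train_ids, validation_ids = split_train_validation(range(1000))
```

Splitting by member rather than by claim keeps all of one person's records in the same sample, so validation results are not inflated by members who also appear in the training data.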
  8. Methods
     • Estimated five multivariate predictive models of total annual healthcare costs (pharmacy and medical) in the follow-up year for each of four conditions:
       • Diabetes, congestive heart failure, hypercholesterolemia, hypertension
     • Included independent variables:
       • Continuous measure of baseline pharmacy costs
       • 14 age/gender categories
       • 62 PHD disease indicators
       • Average co-pay per day supplied
       • Percent mail service days supplied
       • Four adherence dimension variables:
         • Compliance and compliance²
         • Days persistent
         • Number of different drugs
  9. Methods
     • Included five econometric modeling techniques:
       • Ordinary least squares (OLS)
       • Robust regression
       • Two-part model: probit/OLS
       • Two-part model: probit/GLM (gamma, log link)
       • Boosted regression with Stata command syntax:
         boost THC2004_T50 $RHS $OTH, influence distribution(normal) trainfraction(0.7) maxiter(1000) seed(1) bag(0.5) predict(HATS`m') interaction(3) shrink(0.01)
  10. Results: Descriptive statistics
  11. Results: Adherence dimension coefficient estimates from OLS model of prospective total annual healthcare costs (untruncated)
  12. Results: Adherence dimension coefficient estimates from OLS model of prospective total annual healthcare costs (truncated at $50,000)
  13. Results: Validation sample summary results from predictive models of prospective total annual healthcare costs (untruncated)
      * Mean absolute prediction error
  14. Results: Validation sample summary results from predictive models of prospective total annual healthcare costs (truncated at $50,000)
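The result tables themselves are not part of this transcript, but the accuracy measures the slides name (mean absolute prediction error and R²) and the $50,000 truncation are easy to sketch. The function names and the toy cost figures below are made up for illustration and are not study data; whether truncation in the study was applied exactly this way is an assumption.

```python
def truncate_costs(costs, cap=50_000):
    """Cap each annual cost at the given ceiling."""
    return [min(c, cap) for c in costs]

def mean_absolute_prediction_error(actual, predicted):
    """Average absolute gap between observed and predicted costs."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def r_squared(actual, predicted):
    """Share of variance in observed costs explained by the predictions."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

# Made-up costs for three members, one of them a very high-cost outlier.
actual = [1_000, 2_000, 120_000]
predicted = [1_500, 1_500, 40_000]

truncated = truncate_costs(actual)
mape_untruncated = mean_absolute_prediction_error(actual, predicted)
mape_truncated = mean_absolute_prediction_error(truncated, predicted)
```

In this toy case the single outlier dominates the untruncated error, which is the usual motivation for reporting truncated and untruncated results side by side.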
  15. Conclusions
      • Increased compliance was associated with a decrease in next year’s total healthcare costs
      • Each additional day of persistent therapy was associated with a decrease of between $6 and $16 in next year’s total healthcare costs
      • The magnitude of this association varied, as expected, by disease state
      • The number of different drugs (filled within a given year and indicated for that disease state) increased next year’s total healthcare costs, likely signifying:
        • Treatment resistance/failure
        • Therapeutic aggressiveness/intensity
        • Disease severity
  16. Conclusions cont.
      • PHD provided classification and predictive power similar to other prescription-only risk-adjustment groupers
      • Robust regression, as expected, always returned the lowest mean absolute prediction error
      • While boosted regression did offer higher R², overfitting was evident in untruncated models (Note: this study did not attempt to respecify the boosting parameters)
      • Unfortunately, the currently available user-written command BOOST does not output the regression tree structure for application in other samples
      • Boosting is useful in uncovering important interaction terms
  17. Limitations
      • The potential endogeneity of the adherence measures was not examined.
      • Non-adherent behavior may not alter next year’s total healthcare costs, but may affect future periods’ total healthcare costs.
      • The study sample was from a single, national health plan, and the results are therefore not generalizable.
      • Need to consider other measures of accuracy (positive predictive value).
      • Need to tweak boosting specification to reduce overfitting.