Predictive Analytics: It's The Intervention That Matters

2,950 views
2,717 views

Published on

In this two-part webinar, get the detailed knowledge you need to make informed decisions about adopting predictive analytics in healthcare so you can separate today's hype from reality. In part 1, you'll learn key learnings from Dale Sanders including 1) our fixation on predictive analytics in readmissions, 2) the common trap of predictions without interventions, 3) the common misconceptions of correlations verses causation, 4) examples of predictions without algorithms, and 5) the importance of putting the basics first.

In part 2, you'll hear from industry expert David Crockett, PhD in a "graduate level" crash course cover key concepts such as machine learning, algorithms, feature selection, classification, tools and more.

Published in: Health & Medicine, Business

Predictive Analytics: It's The Intervention That Matters

  1. 1. Predictive Analytics: It’s The Intervention That Matters David Crockett Dale Sanders, Sep 2013 © 2013 Health Catalyst ww© 2013 Health Ccom w.healthcatalyst.atalyst www.healthcatalyst.com
  2. 2. An Email Yesterday From a CMIO about today’s webinar Dale, “One thing that I think would be helpful, generally speaking, is helping executives and operational leaders get a really concrete idea of what value true analytics brings to a healthcare organization. These days the terms ‘predictive analytics’ and ‘big data’ are thrown around so frequently and so casually, that for many they have become devoid of actual meaning.” © 2013 Health Catalyst www.healthcatalyst.com
  3. 3. Audience Poll Benefits of predictive analytics to healthcare over the next three years? © 2013 Health Catalyst www.healthcatalyst.com
  4. 4. Overview Dale Sanders • Human interest and color commentary • A selection of stories, concepts and lessons learned, such as… • The parallels between “treating” terrorists and treating patients using predictive analytics David Crockett • A graduate-level crash course • Machine learning, algorithms, feature selection, classification, tools, etc. © 2013 Health Catalyst www.healthcatalyst.com
  5. 5. An Oddly Relevant Career Path US Air Force CIO •Nuclear warfare operations TRW • NSA • Nuclear Command & Control Counter Threat Program • Credit risk scoring, traffic routing & optimization • Strategic Execution Decision Aid 5 © 2013 Health Catalyst www.healthcatalyst.com
  6. 6. Key Messages & Themes 1. We are fixated on predictions and interventions of readmission ● ● We should be predicting and intervening on ADMISSIONS Aim higher…! 2. Predictions without interventions are useless-- and potentially worse than useless 3. Correlation does not imply causation 4. Missing data = Poor predictions ● 5. Some of the most important predictions don’t need a computer algorithm ● 6. Patient outcomes, familial, genomics Nurses and physicians can tell you When it comes to analytics, take care of the basics, first ● The time will come for predictive analytics 6 © 2013 Health Catalyst www.healthcatalyst.com
  7. 7. Most Common Causes for Readmission Robert Wood Johnson Foundation, Feb 2013 1. Patients have no family or other caregiver at home 2. Patients did not receive accurate discharge instructions 3. Patients did not understand discharge instructions 4. Patients discharged too soon 5. Patients referred to outpatient physicians and clinics not affiliated with the hospital 7 © 2013 Health Catalyst www.healthcatalyst.com
  8. 8. Healthcare Analytics Adoption Model Level 8 Cost per Unit of Health Payment & Prescriptive Analytics Contracting for & managing health. Tailoring patient care based on population outcomes. Level 7 Cost per Capita Payment & Predictive Analytics Diagnosis-based financial reimbursement & managing risk proactively Level 6 Cost per Case Payment & The Triple Aim Procedure-based financial risk and applying “closed loop” analytics at the point of care Level 5 Clinical Effectiveness & Accountable Care Measuring & managing evidence based care Level 4 Automated External Reporting Efficient, consistent production & agility Level 3 Automated Internal Reporting Efficient, consistent production Level 2 Standardized Vocabulary & Patient Registries Relating and organizing the core data Level 1 Integrated, Enterprise Data Warehouse Foundation of data and technology Level 0 Fragmented Point Solutions Inefficient, inconsistent versions of the truth © 2013 Health Catalyst www.healthcatalyst.com
  9. 9. Healthcare Analytics Adoption Model: The Details for Organizational Self-Inspection Level 8 Level 7 Cost per Unit of Health Payment & Prescriptive Analytics: Providers Analytic motive expands to wellness management and mass customization of care. Physicians, hospitals, employers, payers and members/patients collaborate to share risk and reward (e.g., financial reward to patients for healthy behavior). Analytics expands to include NLP of text, prescriptive analytics, and interventional © cision support. Prescriptive analytics are available at the de point of care to improve patient specific outcomes based upon population outcomes. Data content expands to include genomic and familial information. The EDW is updated within a few minutes of changes in the source systems. Cost per Capita Payment & Predictive Analytics: Analytic motive expands to address diagnosis-based, fixed-fee per capita reimbursement models. Focus expands from management of cases to collaboration with clinician and payer partners to manage episodes of care, using predictive modeling, forecasting, and risk stratification to support outreach, triage, escalation and referrals. Patients are flagged in registries who are unable or will not participate in care protocols. Data content expands to include external pharmacy data and protocol-specific patient reported outcomes. On average, the EDW is updated within one hour or less of source system changes. Level 6 Cost per Case Payment & The Triple Aim: The “accountable care organization” shares in the financial risk and reward that is tied to clinical outcomes. At least 50% of acute care cases are managed under bundled payments. Analytics are available at the point of care to support the Triple Aim of maximizing the quality of individual patient care, population management, and the economics of care. Data content expands to include bedside devices and detailed activity based costing. Data governance plays a major role in the accuracy of metrics supporting quality-based compensation plans for clinicians and executives. On average, the EDW is updated within one day of source system changes. The EDW reports organizationally to a C-level executive who is accountable for balancing cost of care and quality of care. Level 5 Clinical Effectiveness & Accountable Care: Analytic motive is focused on measuring clinical effectiveness that maximizes quality and minimizes waste and variability. Data governance expands to support care management teams that are focused on improving the health of patient populations. Permanent multidisciplinary teams are in-place that continuously monitor opportunities to improve quality, and reduce risk and cost, across acute care processes, chronic diseases, patient safety scenarios, and internal workflows. Precision of registries is improved by including data from lab, pharmacy, and clinical observations in the definition of the patient cohorts. EDW content is organized into evidence-based, standardized data marts that combine clinical and cost data associated with patient registries. Data content expands to include insurance claims. On average, the EDW is updated within one week of source system changes. Level 4 Automated External Reporting: Analytic motive is focused on consistent, efficient production of reports required for regulatory and accreditation requirements (e.g. CMS, Joint Commission, tumor registry, communicable diseases); payer incentives (e.g. MU, PQRS, VBP, readmission reduction); and specialty society databases (e.g. STS,NRMI, Vermont-Oxford). Adherence to industry-standard vocabularies is required. Clinical text data content is available for simple key word searches. Centralized data governance exists for review and approval of externally released data. Level 3 Automated Internal Reporting: Analytic motive is focused on consistent, efficient production of reports supporting basic management and operation of the healthcare organization. Key performance indicators are easily accessible from the executive level to the front-line manager. Corporate and business unit data analysts meet regularly to collaborate and steer the EDW . Data governance expands to raise the data literacy of the organization and develop a data acquisition strategy for Levels 4 and above. Level 2 Standardized Vocabulary & Patient Registries: Master vocabulary and reference data identified and standardized across disparate source system content in the data warehouse. Naming, definition, and data types are consistent with local standards. Patient registries are defined solely on ICD billing data. Data governance forms around the definition and evolution of patient registries and master data management. Level 1 Integrated, Enterprise Data Warehouse: At a minimum, the following data are co-located in a single data warehouse, locally or hosted: HIMSS EMR Stage 3 data, Revenue Cycle, Financial, Costing, Supply Chain, and Patient Experience. Searchable metadata repository is available across the enterprise. Data content includes insurance claims, if possible. Data warehouse is updated within one month of changes in the source system. Data governance is forming around the data quality of source systems. The EDW reports organizationally to the CIO. Level 0 Fragmented Point Solutions: Vendor-based and internally developed applications are used to address specific analytic needs as they arise. The fragmented Point Solutions are neither co-located in a data warehouse nor otherwise architecturally integrated with one another. Overlapping data content leads to multiple versions of analytic truth. Reports are labor intensive and inconsistent. Data governance is non-existent.
  10. 10. Audience Poll Analytics Adoption Model: At what level does your organization consistently and reliably function? 10 © 2013 Health Catalyst www.healthcatalyst.com
  11. 11. Challenge of Predicting Anything Human 11
  12. 12. Sampling Rate vs. Predictability The sampling rate and volume of data in an experiment is directly proportional to the predictability of the next experiment 12 © 2013 Health Catalyst www.healthcatalyst.com
  13. 13. Can We Learn From Nuclear Warfare Decision Making? “Clinical” observations • Satellites and radar indicate an enemy launch Predictive “diagnosis” • Are we under attack or not? Decision making timeframe • <4 minutes to first impact when enemy subs launch from the east coast of the US “Treatment” & intervention • Launch on warning or not? 13 © 2013 Health Catalyst www.healthcatalyst.com
  14. 14. Desired “Outcomes” 1. Retain US society as described in the Constitution 2. Retain the ability to govern & command US forces 3. Minimize loss of US lives 4. Minimize destruction of US infrastructure 5. Achieve all of this as quickly as possible with minimal expenditure of US military resources 14 © 2013 Health Catalyst www.healthcatalyst.com
  15. 15. Where And How Can A Computer Help? Reduce variability in decision making & improve outcomes 15 © 2013 Health Catalyst www.healthcatalyst.com
  16. 16. 16
  17. 17. Lessons For Healthcare • Humans didn’t trust predictive models when the decision making timeframe was compressed and the consequences of a bad decision were extreme • • Subjective human issues were not well-modeled • • • At present in healthcare, predictive analytics are easier to apply in slowly changing situations, e.g., chronic condition management, elective procedures, ventilator weaning, glucose management in the ER, antibiotic protocols The “Rogue Commander” scenario We need to at least try to quantify the “difficult patient” Without outcomes data, it’s all guesswork • • Thankfully, we don’t have much outcomes data related to nuclear warfare That shouldn’t be the case in healthcare 17 © 2013 Health Catalyst www.healthcatalyst.com
  18. 18. Quantifying the Atypical Patient Not all patients can participate in a protocol At Northwestern, we found that 30% of patients fell into one or more of these categories 1. Cognitive inability 2. Economic inability 3. Physical inability 4. Geographic inability 5. Religious beliefs 6. Contraindications to the protocol 7. Voluntarily non-compliant 18 © 2013 Health Catalyst www.healthcatalyst.com
  19. 19. Accounting For These Patients 30% of your patients will have to be treated and/or reached in a unique way • Your predictive algorithms must be adjusted these attributes, especially for readmission • These patients are a unique numerator in the overall denominator of patients under accountable care • You need a data collection & governance strategy for these patient attributes • You need a different interventional strategy for each of the 7 categories • Your physician compensation model must be adjusted for these patient types 19 © 2013 Health Catalyst www.healthcatalyst.com
  20. 20. Sortie Turnaround Times The Goal: Predictable, fast turnaround of aircraft to a successful battle 20 © 2013 Health Catalyst www.healthcatalyst.com
  21. 21. Patient Fight Path Profiler The Goal: Predictable, fast turnaround of patients to a good life 21
  22. 22. Healthcare As a Battle Field…?? The Order of Battle and the Order of Care Demand forecasting: What do we need and when? 22
  23. 23. NSA, Terrorists, and Patients The Odd Parallels of T errorist Registries and Patient Registries 23 © 2013 Health Catalyst www.healthcatalyst.com
  24. 24. Predicting Terrorist Risk Risk = P(A) × P(S|A) × C • Probability of Attack • Probability of Success if Attack occurs • Consequences of Attack (dollars, lives, national psyche, etc.) • What are the costs of intervention and mitigation? • Do they significantly outweigh the Risk? 24 © 2013 Health Catalyst www.healthcatalyst.com
  25. 25. Predicting Patient Risk 25 © 2013 Health Catalyst www.healthcatalyst.com
  26. 26. We Know the Probabilities What are the consequences? What are the strategies and costs to intervene? 26
  27. 27. Lessons For Healthcare • Multiple predictive models that “vote” are more accurate than single models • In the absence of data, and until more data is available, multiple expert opinion is better than nothing for predicting outcomes and managing risk “Wisdom of crowds” • • Backward chaining predictive models (supervised learning) are the most accurate, but are also inflexible and fragile How did this person become a terrorist? What was their pre-terrorist data profile? • • Friends, family, and what you read are MAJOR predictors of terror risks • We don’t collect familial data in the course of care • If you associate with more than one terrorist group, even greater predictor • • Parallels to a comorbidity Even when we can predict accurately, the cultural willingness and ability to intervene are incredibly difficult and can be very controversial Armed drone “assassinations”, TSA profiling • • BRCA genes and prophylactic mastectomies 27 © 2013 Health Catalyst www.healthcatalyst.com
  28. 28. More Reading 1. Eliciting Probabilities from Experts. Advances in Decision Analysis. Cambridge, UK: Cambridge University Press, 2007. Hora S. ,Edwards W, Miles R, Jr., von Winterfeldt D (eds). 2. Estimating Terrorism Risk. RAND Center for T errorism Risk Management Policy, 2003. Willis H, Morral A, Kelly T Medby J. , 3. Probabilistic Risk Analysis and Terrorism Risk. Risk Analysis, Vol. 30, No. 4, 2010. Barry Charles Ezell, Steven P Bennett, Detlof von Winterfeldt, John . Sokolowski, and Andrew J. Collins. 28 © 2013 Health Catalyst www.healthcatalyst.com
  29. 29. Suggestive Analytics© Surround the decision making environment with suggestions, based on analytic data ● ● Much easier than predicting Leverages “Wisdom of Crowds” data Worth reading ● “Nudge: Improving Decisions About Health, Wealth, and Happiness” © 2013 Health Catalyst www.healthcatalyst.com
  30. 30. 30 © 2013 Health Catalyst www.healthcatalyst.com
  31. 31. Closed Loop Analytics: The Triple Aim 31
  32. 32. The Antibiotic Assistant • Predicting the efficacy and costs of antibiotic protocols for inpatients Antibiotic Protocol Dosage Route Interval Predicted Efficacy Average Cost/Patient Option 1 500mg IV Q12 98% $7,256 Option 2 300mg IV Q24 96% $1,236 Option 3 40mg IV Q6 90% $1,759 Dave Claussen, Scott Evans 32 © 2013 Health Catalyst www.healthcatalyst.com
  33. 33. The Antibiotic Assistant Impact Complications declined 50% Avg # doses declined from 19 -> 5.3 The replicable and bigger story ● Antibiotic cost per treated patient: $123 -> $52 ● By simply displaying the cost to physicians 33 © 2013 Health Catalyst www.healthcatalyst.com
  34. 34. Stories of Correlation vs. Causation • The production of butter in Bangladesh and the S&P 500 • • David Leinweber, UC Berkeley TRW Credit Reporting (Experian) • • • NSA-developed predictive algorithms indicated Black (African American) borrowers were higher risk Sociologists on staff explained the bigger picture Women, Hormone Replacement Therapy, and Cardiovascular Disease • • Women with HRT had lower CVD Women with HRT were from higher income levels and could afford to exercise 34 © 2013 Health Catalyst www.healthcatalyst.com
  35. 35. Audience Poll How confident are you that your organization is prepared to combine the technology of predictive analytics with the processes of intervention? 35 © 2013 Health Catalyst www.healthcatalyst.com
  36. 36. Wrap-Up • Vendors are in the EXTREME hype cycle of predictive analytics • • Without outcomes data, we are largely stuck with predicting the obvious T ake care of the basics of analytics, first • The human “mathematical model” is years away • Suggestive analytics is easier • Intervening to reduce risk is the hard part • • Predicting is the easy part Predicting without intervening is ripe for lawyers 36 © 2013 Health Catalyst www.healthcatalyst.com
  37. 37. Many thanks • Contact information • dale.sanders@healthcatalyst.com • dale.sanders@hsa.ky • @drsanders • www.linkedin.com/in/dalersanders 37 © 2013 Health Catalyst www.healthcatalyst.com
  38. 38. David Crockett: The Graduate Crash Course © 2013 Health Catalyst ww© 2013 Health Ccom t w.healthcatalyst.atalys 38 www.healthcatalyst.com
  39. 39. Objectives T opics we’ll cover today include: Machine Learning Overview Software Examples (Open Source, Commercial) Prediction Modeling Demo 4 Insights to Implementation © 2013 Health Catalyst www.healthcatalyst.com
  40. 40. Machine Learning 101 A scientific discipline concerned with algorithm design and development that allows computers to learn based on data. A major focus of machine learning research is to automatically learn to recognize complex patterns and make intelligent decisions based on data. © 2013 Health Catalyst www.healthcatalyst.com
  41. 41. Machine Learning 102 Extracting useful information from large machine-readable data sets is a problem faced by people in nearly every area of commerce, manufacturing, government, academic discipline and science Machine learning has a wide range of applications: © 2013 Health Catalyst www.healthcatalyst.com
  42. 42. Algorithms Machine learning algorithms are organized into a taxonomy, based on the desired outcome of the algorithm. Supervised Learning Generates a function that maps inputs to desired outputs Reinforcement Learning Learns how to act given an observation of the world Unsupervised Learning Models a set of inputs: labeled examples are not available Transduction Tries to predict new outputs based on training inputs, training outputs, and test inputs © 2013 Health Catalyst www.healthcatalyst.com
  43. 43. The Modeling Process Step 1 Define Problem Gather Data Step 2 Select Model Run/Evaluate Models T est Model INITIAL VALIDATION DATASET DATASET Step 3 Apply Model Run Prediction TEST DATASET © 2013 Health Catalyst www.healthcatalyst.com
  44. 44. A Few Definitions… INPUT • • • • • Features Attributes Variables Class Labels OUTPUT • • • • Outcome Prediction Forecast Trend FEATURE SELECTION Selecting a subset of relevant variables (features) for use in construction a model CLASSIFICATION Use an object's characteristics (features) to identify which class (or group) it belongs to © 2013 Health Catalyst www.healthcatalyst.com
  45. 45. Feature Selection PCA ● principal components analysis ReliefFAttributeEval ● recursive feature elimination CfsSubsetEval ● correlation-based feature subset selection ChiSquaredAttributeEval ● chi-squared statistic with respect to class © 2013 Health Catalyst www.healthcatalyst.com
  46. 46. Know Your Data… It is crucial to understand the data set being used © 2013 Health Catalyst www.healthcatalyst.com
  47. 47. Insight #1 Don’t confuse more data with more insight. © 2013 Health Catalyst www.healthcatalyst.com
  48. 48. Specific Improves Accuracy In our hands: Generic Readmissions Predictor ~ 79% PPV Heart Failure Readmission Predictor ~ 91% PPV © 2013 Health Catalyst www.healthcatalyst.com
  49. 49. Classification Rules Linear (Regression) Trees Neural Networks Support Vector Machine Bayes © 2013 Health Catalyst www.healthcatalyst.com
  50. 50. Classification – Rules Based Rules gender=male AND age>40: getting a little thin on top ~ 65% gender=male AND age>40 AND bald grandpa OR uncle ~ 82% Linear (Regression) Trees Neural Networks Support Vector Machine Bayes © 2013 Health Catalyst www.healthcatalyst.com
  51. 51. Classification – Regression Rules Linear (Regression) male(1)x(41.8) + age(44)x(0.607): getting thin on top ~ 68% male(1)x(41.8)+age(44)x(0.607)+uncle(19.2): going bald ~ 87% Trees Neural Networks Support Vector Machine Bayes © 2013 Health Catalyst www.healthcatalyst.com
  52. 52. Classification – Tree Based Rules Linear (Regression) Trees Neural Networks Support Vector Machine Bayes © 2013 Health Catalyst www.healthcatalyst.com
  53. 53. Insight #2 Don’t confuse insight with value. © 2013 Health Catalyst www.healthcatalyst.com
  54. 54. The Cost of Readmissions Medicare rehospitalization within 30 days after discharge: $17.4 billion Jencks, NEJM 2009 © 2013 Health Catalyst www.healthcatalyst.com
  55. 55. Prediction In Context 87% © 2013 Health Catalyst www.healthcatalyst.com
  56. 56. Data Warehouse Synergy Prediction can be more powerful “in context.” Rothman Index, an early wellness metric. In a data warehouse environment… System architecture and data flow • Incorporate additional site specific data • Filter across multiple data sources • Available to any user and any application • Show when and where needed © 2013 Health Catalyst www.healthcatalyst.com
  57. 57. Evaluating Performance true positive (TP hit) , true negative (TN, correct rejection) false positive (FP false alarm, Type I error) , false negative (FN, miss, Type II error) sensitivity or true positive rate (TPR, hit rate, recall) TPR = TP / P = TP / (TP + FN) specificity (SPC or True Negative Rate) SPC = TN / N = TN / (FP + TN) = 1 − FPR positive predictive value (PPV, precision) PPV = TP / (TP + FP) Sensitivity measures the proportion of actual positives which are correctly identified Specificity measures the proportion of negatives which are correctly identified © 2013 Health Catalyst www.healthcatalyst.com
  58. 58. Always a trade off… However, there is always a trade-off between sensitivity and specificity… For example, airport security scanners can be set to trigger on low-risk items like belt buckles and keys… (very sensitive, but not too specific) This trade-off is often represented graphically as a receiver operating characteristic (ROC) curve. © 2013 Health Catalyst www.healthcatalyst.com
  59. 59. ROC Curves The trade-off is between: Sensitivity vs. Specificity… Area Under the Curve (AUC) Machine learning often uses this AUC statistic for model comparison. (~ c statistic) PPV (precision) is a common metric that’s also used. PPV = TP / (TP + FP) Whiting et al. BMC Med Res Methodol. 2008 Apr 11;8:20 © 2013 Health Catalyst www.healthcatalyst.com
  60. 60. Insight #3 Don’t overestimate the ability to interpret the data. © 2013 Health Catalyst www.healthcatalyst.com
  61. 61. Open Source Tools Many open source tools exist... Waffles http://jmlr.org/mloss/ © 2013 Health Catalyst www.healthcatalyst.com
  62. 62. Open Source Example Various implementations of a given approach can be found. For example, random forest classification (circa 2001): Original implementation by Leo Breiman and Adele Cutler (Fortran) ALGLIB (library, C++, C#, Pascal, Visual Basic, Python) Orange (toolkit, C++, with Python interface) fast-random-forest (Java) RandomForest (Weka) Milk (toolkit, Python) treelearn (Python) http://oz.berkeley.edu/users/breiman/randomforest2001.pdf © 2013 Health Catalyst www.healthcatalyst.com
  63. 63. Open Source Standards Predictive Model Markup Language (PMML), is an industry standard used to represent numerous predictive modeling techniques. Such as: • Association Rules • Cluster Models • Neural Networks • Decision Trees © 2013 Health Catalyst www.healthcatalyst.com
  64. 64. Commercial Tools Many commercial tools also exist... IBM SPSS GE Predictivity The Forrester W ave™: Predictive Analytics Solutions, Q1 2013 © 2013 Health Catalyst www.healthcatalyst.com
  65. 65. Enhancing Commercial Tools Leveraging the enterprise warehouse… IBM SPSS GE Predictivity Late-BindingTM Data Bus Data Acquisition and Storage Metadata Engine © 2013 Health Catalyst www.healthcatalyst.com
  66. 66. Insight #4 Don’t underestimate the challenge of implementation. © 2013 Health Catalyst www.healthcatalyst.com
  67. 67. Predictive Modeling Demo Supervised learning with a known class label (outcome) • Simple example: Pima Indian diabetes Several standard ML techniques have been built into a software "workbench" called Waikato Environment for Knowledge Analysis (WEKA). © 2013 Health Catalyst www.healthcatalyst.com
  68. 68. Weka File Explorer 1 3 2 © 2013 Health Catalyst www.healthcatalyst.com
  69. 69. Weka Classify – Zero Rules 1 2 3 © 2013 Health Catalyst www.healthcatalyst.com
  70. 70. Weka Classify – One Rule T mode:10-fold cross-validation est 1 === Classifier model (full training set) === plas: < 114.5 -> tested_negative < 115.5 -> tested_positive < 133.5 -> tested_negative < 135.5 -> tested_positive < 144.5 -> tested_negative < 152.5 -> tested_positive < 154.5 -> tested_negative >= 154.5 -> tested_positive (585/768 instances correct) Time taken to build model: 0.05 s © 2013 Health Catalyst www.healthcatalyst.com
  71. 71. Weka Classify – JRIP 1 T mode:10-fold cross-validation est === Classifier model (full training set) === JRIP rules: =========== (plas >= 132) and (mass >= 30) => class=tested_positive (182.0/48.0) (age >= 29) and (insu >= 125) and (preg <= 3) => class=tested_positive (19.0/4.0) (age >= 31) and (pedi >= 0.529) and (preg >= 8) and (mass >= 25.9) => class=tested_positive (22.0/5.0) => class=tested_negative (545.0/102.0) Number of Rules : 4 Time taken to build model: 0.06 s © 2013 Health Catalyst www.healthcatalyst.com
  72. 72. Weka Classify – Regression 1 Test mode:10-fold cross-validation === Classifier model (full training set) === Logistic Re gression with ridge parameter o f 1.0E-8 Odds Ratios... Variable tested_negative ============================ preg 0.8841 plas 0.9654 pres 1.0134 skin 0.9994 insu 1.0012 mass 0.9142 pedi 0.3886 age 0.9852 Time taken to build model: 0.06 s © 2013 Health Catalyst www.healthcatalyst.com
  73. 73. Machine Learning Models Step 1 Define Problem Gather Data Step 2 Select Model Run/Evaluate Models T est Model INITIAL VALIDATION DATASET DATASET Step 3 Apply Model Run Prediction TEST DATASET © 2013 Health Catalyst www.healthcatalyst.com
  74. 74. Predictive Analytics: Insights to Implementation/Intervention Lessons Learned: ● Lesson #1: Don’t confuse more data with more insight ● Lesson #2: Don’t confuse insight with value ● Lesson #3: Don’t overestimate the ability to interpret the data ● Lesson #4: Don’t underestimate the challenge of implementation From David Shaywitz, Forbes.com © 2013 Health Catalyst www.healthcatalyst.com
  75. 75. Thank you! Questions and Answers © 2013 Health Catalyst www.healthcatalyst.com
  76. 76. Next Webinars Healthcare Reform: Implications for Your Health System Brian Ahier, Healthcare Evangelist Date: October 8, 2013 from 1:00 PM - 2:00 PM ET Widely recognized as one of healthcare's most knowledgeable speakers on healthcare policy Brian Ahier will provide an in-depth look at current healthcare , reform and more specifically the implications of President Barack Obama's 2010 legislative efforts, namely the Patient Protection and Affordable Care Act, also referred to as 'Obamacare’. Population Health Fundamentals Dr. David A. Burton, MD, Executive Chairman Date: October 9, 2013 from 1:00 PM - 2:00 PM ET Dr. Burton, former Intermountain Executive and current Executive Chairman of the Board of Health Catalyst, will lead a webinar Oct. 9 on the paradigm shift in healthcare analytics from a focus on inpatient (acute care) to a focus on the continuum of care. The shift is being driven by the linked imperatives of Population Health Management and Accountable Care Organizations and other shared accountability arrangements. © 2013 Health Catalyst www.healthcatalyst.com

×