The MD Anderson / IBM Watson Announcement: What does it mean for machine learning in healthcare?

It’s been over six years since IBM’s Watson amazed all of us on Jeopardy, but it has yet to deliver similar breakthroughs in healthcare. The headline of last week’s Forbes article read, “MD Anderson Benches IBM Watson In Setback For Artificial Intelligence In Medicine.” Is it really a setback for the entire industry? Health Catalyst’s EVP for Product Development, Dale Sanders, believes that the challenges are unique to IBM’s machine learning strategy in healthcare. If IBM adjusts that strategy and better manages expectations about what’s possible for machine learning in medicine, the future will be brighter for Watson, its clients, and AI in healthcare in general. Watson’s success is good for all of us, but its failure is bad for all of us, too.

Join Dale as he discusses:

The good news: Machine learning technology is accelerating at a rate beyond Moore’s Law. Dale believes that machine learning algorithms and models are doubling in capability every six months.
The bad news: The healthcare data ecosystem is not nearly as rich as many would believe, and certainly not as rich as that used to train Watson for Jeopardy. Without high-volume, high-quality data, Watson’s potential and the constant advances in machine learning algorithms will hit a glass ceiling in healthcare.
The best news: By adjusting strategy and expectations, there are still plenty of opportunities to do great things with machine learning by using the current data content in healthcare, while we build out the volume and breadth of data we need to truly understand the patient at the center of the healthcare picture… and you don’t need an army of PhD data scientists to do it.
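To put the "beyond Moore’s Law" claim in numbers, here is a minimal sketch of the arithmetic. It assumes the 18-month doubling period the talk’s slides quote for Moore’s Law and takes Dale’s six-month figure at face value; the function name and the three-year horizon are illustrative choices, not something from the webinar.

```python
def growth_factor(years: float, doubling_months: float) -> float:
    """Capability multiple after `years` if capability doubles every `doubling_months`."""
    return 2 ** (years * 12 / doubling_months)


# Assumed doubling periods: 18 months (Moore's Law as quoted in the slides)
# versus 6 months (Dale's estimate for machine learning models).
chips_after_3_years = growth_factor(3, 18)   # two doublings
models_after_3_years = growth_factor(3, 6)   # six doublings

print(f"Chips:  {chips_after_3_years:.0f}x after 3 years")
print(f"Models: {models_after_3_years:.0f}x after 3 years")
```

Under those assumptions, three years gives roughly a 4x gain for chips and a 64x gain for models, which is why the data supply, not the algorithms, becomes the limiting factor.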


  1. 1. Is it really a setback, in general, or not? March 1, 2017 Dale Sanders Executive Vice President, Software Forbes Magazine: “MD Anderson Benches IBM Watson In Setback For Artificial Intelligence In Medicine”
  2. 2. Let’s Make Things Very Clear • IBM and Watson are frequent competitors of Health Catalyst • I do NOT celebrate the difficulties of our competitors, especially IBM • “There but for the grace of God go I” • Watson’s success begets Health Catalyst’s success • Were it not for IBM, I wouldn’t have a career in information technology • IBM was the backbone of the Air Force information systems that taught me so very much
  3. 3. Opening Salvo to Stir Things Up  • Tying Watson to a “cancer moonshot” created the peak of already inflated expectations about Watson • Every executive and politician wants to be John F. Kennedy • We have a generation of political and corporate executives who don’t understand technology and software, even though it’s running their world • Executives are selling technology they don’t understand and executives are buying technology they don’t understand • Information asymmetry always leads to an exploited consumer • Technology professionals have a moral and ethical obligation to speak up when they see this happening
  4. 4. Agenda • My background as it relates to this topic • The fundamental data challenges of applying Watson to healthcare • Health Catalyst’s approach to machine learning and AI in healthcare • I’m not selling here… I’m just informing you about a different approach • History will be the judge about whether the Catalyst approach works or not • These slides are purposely bland… this webinar is not about selling Health Catalyst
  5. 5. Data, data, data… for decision support My Background 1983 2016 B.S. Chemistry, biology minor US Air Force Command, Control, Communication, Computers & Intelligence (C4I) Officer Reagan/Gorbachev Summits TRW/National Security Agency • START Treaty • Nuclear Non-proliferation • Nuclear command & control system threat protection • Knowledge Based Systems Commercialization Nuclear Warfare Planning and Execution: NEACP & Looking Glass Intel Corp, Enterprise Data Warehouse • Chief Data Guy • Regional Director of Medical Informatics, Intermountain Healthcare • CIO, Northwestern • Chief Data Warehousing Guy CIO, Cayman Islands National Health System Product Development, Health Catalyst
  6. 6. The Over-Hype of AI in the 1990s • I lived it. I hyped it. • Military and credit reporting systems managed the largest databases in the world at the time • They pale in comparison to Silicon Valley data content today • My team at TRW, the Knowledge Based Systems Group, was tasked with commercializing our military and intelligence technology in expert systems, fuzzy logic systems, neural nets, and genetic algorithms • Our first target was healthcare. Sound familiar?
  7. 7. I presented the following six slides at a conference in Feb 2012, exactly one year after Watson’s victory on Jeopardy, when hopes for Watson were very high in medicine. I was a fairly lonely contrarian.
  8. 8. What About Watson?
  9. 9. Watson • First, a little background on Dale Sanders • Natural Language Processing and Text Mining • Watson is revolutionary. • It’s the first thing in my IT career that really excited me… everything else has been incremental or variations of the same flavor
  10. 10. Watson’s Technology • Apache • Unstructured Information Management Architecture (UIMA) • Hadoop • Java, C++ • Lexicals and ontologies • DBpedia, WordNet, and YAGO • IBM Content Analytics with Enterprise Search • 90 IBM Power 750 servers enclosed in 10 racks • 16 terabytes of memory • 2,880 processor cores • Linux based
  11. 11. What is Watson? • Near-word associations coupled with semantic mapping and zillions of sources of knowledge… digitized books, encyclopedias, news feeds, magazines, blogs, Wikipedia, etc. • Equivalent to approximately 240 million pages, in memory • Jeopardy answer: “A famous red quaffed clown or just any incompetent fool” • Watson’s correct answer: “Who is Bozo?” • Watson searched its indexes for near-word associations and recognized that Bozo was the most common word in the indexes that was missing from the question
  12. 12. Watson’s Problem With Healthcare • Watson’s training set for Jeopardy was a HUGE collection of human wisdom, academic and otherwise, stretching back thousands of years • What’s the training set for healthcare wisdom? • A few decades of clinical trials journals? • Claims processing data from a dysfunctional healthcare system that doesn’t include patient outcomes? • Progress notes? Radiology reports? Pathology reports? • Watson is not going to impact healthcare in the near term like many hope it will
  13. 13. Factoids • More than 50% of all medicines are prescribed, dispensed or sold inappropriately • Less than 40% of patients in the public sector and 30% in the private sector are treated according to clinical guidelines • Source: World Health Organization, May 2010
  14. 14. Key Points • Watson is a text-centric, Natural Language Processing (NLP) engine • Millions of “near word associations” are processed in seconds • Although related at some level, that’s different from the generic pattern recognition approach to machine learning used for discrete data and images • NLP: “Find things for me faster in all this text.” • Machine Learning: “Make decisions and suggestions for me, and learn from each decision and suggestion.”
  15. 15. Key Points • 80% of healthcare data is text-based clinical notes and diagnostic reports, if you don’t count digital images, but that’s still not very much data in terms of sheer volume, and the quality and consistency of that data varies considerably across clinicians • The source of Watson’s primary knowledge base in healthcare (peer-reviewed journals and clinical trials data) is relatively small in terms of volume and has questionable value in day-to-day healthcare • Watson’s training set for Jeopardy was at least 100x larger than what’s available to train Watson for healthcare
  16. 16. IBM Watson “Learning” Acquisitions • Phytel • Explorys • Truven • Merge • If the fundamental design of Watson is NLP and text-centric, will these acquisitions help Watson learn?
  17. 17. Is training Watson on chemotherapy and radiation therapy protocols the right strategy for treating and preventing cancer? I would argue that it’s not. Current cancer treatment strategies will go down in history alongside bloodletting and trepanation. We need to apply Watson and similar technology breakthroughs on something other than optimizing the status quo, which is anything but great.
  18. 18. The Cancer Data Ecosystem This is the data you need to prevent and treat cancer. Do we have this data in high volume, across many patients, with reasonable quality and consistency? No. • Genomics • Lifestyle • Epigenetics • Microbiome • Environmental • Traditional healthcare delivery data • Quality and length of life outcomes data for long-term survivors • All the above on healthy patients so we understand the target condition
  19. 19. Health Catalyst’s Approach to AI and Machine Learning
  20. 20. Semantics • Machine learning is one thing. Machine doing is another. • In my definition, it’s not Artificial Intelligence until the machine acts on your behalf. • We’ll get there in healthcare, but it will take a long time. • In the meantime, I prefer “Suggestive Analytics” based on machine learning.
  21. 21. Our Simple Mission Our mission is to organize the data in healthcare and make it accessible, useful, and valuable to the clients, patients, and families we serve. With data, all things are possible. Without it, not much.
  22. 22. Our fundamental strategy for Machine Learning: Integrate text and discrete data to inform the vectors and clusters in our models
  23. 23. Your machine learning aspirations must be tempered by the data that’s available, both in breadth and depth. Ironically, it’s easier for us to model and predict bad things in healthcare right now, than good things. We have more data about bad outcomes than good outcomes.
  24. 24. No Data, No Machine Learning • Moore’s Law: Chips double in capacity every 18 months • Sanders’ Law: Machine learning models double in capability every 6 months • But without data content, the models are of no use
  25. 25. For the most part, this is the simple three-part pattern recognition model that we are building and that, I would argue, healthcare should broadly pursue: • Patients like this [pattern] • Who were treated like this [pattern] • Had these outcomes and costs [pattern]
  26. 26. The Human Health Data Ecosystem And, by the way, we don’t have much of any data on healthy patients
  27. 27. We Are Not “Big Data” in Healthcare, Yet
  28. 28. Data Volume vs. Machine Learning Model • “But invariably, simple models and a lot of data trump more elaborate models based on less data.” • Source: “The Unreasonable Effectiveness of Data”, March 2009, IEEE Computer Society; Alon Halevy, Peter Norvig, and Fernando Pereira, of Google
  29. 29. Google’s self-driving car drove 80 million miles before it ever touched a road. Think of a computer sitting in the seat of this computerized driving simulator, not a human.
  30. 30. Retina: The data collection system for feature extraction. Cerebral cortex: The database and algorithms for classification & clustering. The more times you go through this loop with different “data”, the faster and better you become at feature extraction and classifying “people”.
  31. 31. Pattern recognition process • Data acquisition: EHRs, billing, outcomes data, lab, meds, vitals, supply chain, et al • Data reduction: Cleaning out the noisy or bad data, identifying general patterns • Feature extraction: These are properties of the object. Finding new and specific ways to identify new categories and representations of patient types, outcomes, events, encounters, episodes • Classification & Clustering: Using the features to assign patterns to the categories and representations • Confidence evaluation: Evaluating and correcting the confidence in the model’s output
  32. 32. The challenges in healthcare (across data acquisition, data reduction, feature extraction, classification & clustering, and confidence evaluation) • Very limited data. We think we are big data, but we’re not, and what limited data we have is generally about sick patients, not healthy patients. • How, then, do we extract Features that Classify a healthy patient so we know how to achieve that “Healthy Patient” pattern? • If we don’t collect outcomes data, how then do we identify the Features to Classify a healthy or sick patient with good or bad outcomes?
  33. 33. The Machine Learning loop • In healthcare, we have, essentially, no outcomes data, so this is an open loop • If you don’t have a strategy for intervention, predicting something for the sake of predicting has no value
  34. 34. Troubling factoid • Of the 1,958 quality metrics in the National Quality Measures Clearinghouse, only 7% measure clinical outcomes and less than 2% are based on patient-reported outcomes • Source: N Engl J Med 2016; 374:504-506, February 11, 2016
  35. 35. Thank you for the graphs, PreSonus. Healthcare and patients are continuous-flow, analog processes and beings. But if we sample that analog process enough, we can approximately recreate it with digital data.
  36. 36. We are treating physicians and nurses as if they were digital sampling devices. “Every new click of the mouse you guys ask me to do, all in the name of data, sucks another piece of my soul away.” --Beleaguered primary care physician
  37. 37. Predictive and suggestive analytics in the same user interface. The efficacy and costs of antibiotic protocols for inpatients. The Antibiotic Assistant at Intermountain Healthcare: The First Triple Aim
      Antibiotic Protocol   Dosage   Route   Interval   Predicted Efficacy   Average Cost/Patient
      Option 1              500mg    IV      Q12        98%                  $7,256
      Option 2              300mg    IV      Q24        96%                  $1,236
      Option 3              40mg     IV      Q6         90%                  $1,759
  38. 38. Economic Impact • Antiinfective drugs: Average Savings per Patient = $280 • Cost of Hospitalization: Average Savings per Patient = $13,759 • Annual Savings (12-bed ICU): Est. Total Savings per Year = $7,925,184 • Source: New England Journal of Medicine, January 22, 1998
  39. 39. Quality of Care Impact • 30% reduction in Adverse Drug Events • 27.4% reduction in Mortality • 99.1% “on-time” delivery of pre-operative antibiotics • 84.5% reduction in post-operative antibiotic use • Stabilized antibiotic resistance • Source: Annals of Internal Medicine, May 15, 1996
  40. 40. The Shark Tank Story • Chicago-based healthcare IT startups • Three hours of 15 minute presentations • Incredibly creative ideas at the application layer of technology • Absolutely no answer for, or conceptual understanding of, the challenges at the healthcare data layer
  41. 41. This is not an HIE, Clinical Data Repository, or Enterprise Data Warehouse. It’s a little bit of all three but better.
  42. 42. Health Catalyst Data Operating System Kernel Metadata Data Ingest Real-time Streaming Machine Learning NLP Source Connectors Catalyst Analytics Engine Core Services Data Processing Secure Messaging Security, Identity & Compliance Health Catalyst Fabric Registries Terminology & Groupers EHR Integration ISVs PRB Leading Wisely Catalyst Apps Care Management Apps Alerting FHIR Big Data SAMD & SMD Measures Patient & Provider Matching Atlas Risk Classifications Patient Attribution Data Quality Data Governance Data Pattern Recognition Data Export
  43. 43. New Generation Product Briefing • (1) Health Catalyst Data Operating System: Machine Learning Foundation • (2) catalyst.ai: Our machine learning models; our strategy for embedding machine learning into all of our products • (3) healthcare.ai: Our tools to automate machine learning tasks; democratizing machine learning by releasing as open-source
  44. 44. healthcare.ai: Our open-source machine learning software product • Automates key tasks in developing models, or customizing existing models using local data • Makes deployment in an analytics environment easy and ‘production quality’
  45. 45. New Generation Product Briefing: Scaling People • Data Architects: Great domain knowledge; often looking for opportunities to advance career/skills • With the right tools, data architects make great feature engineers and can easily get started in predictive analytics • With healthcare.ai, you have the people to do data science right now.
  46. 46. The healthcare.ai project list (status legend on the slide: In Development, Built, Planned) • Central Line-Associated Bloodstream Infection (CLABSI) Risk – Clinical Decision Support • Congestive Heart Failure, Readmissions Risk – Clinical Decision Support • COPD, Readmissions Risk – Clinical Decision Support • Respiratory (COPD, Asthma, Pneumonia, & Resp. Failure), Readmission Risk – Clinical Decision Support • Predictive Appointment No-shows – Operations and Performance Management • Pre-surgical Risk (Bowel) – Clinical Decision Support and client request • Propensity to Pay – Financial Decision Support • Patient Flight Path, Diabetes Future Risk – Clinical Decision Support • Patient Flight Path, Diabetes Future Cost – Clinical Decision Support • Patient Flight Path, Diabetes Top Treatments – Clinical Decision Support • Patient Flight Path, Diabetes Next Likely Complications (Glaucoma) – Clinical Decision Support • Patient Flight Path, Diabetes Next Likely Complications (Retinopathy) – Clinical Decision Support • Patient Flight Path, Diabetes Next Likely Complications (ESRD) – Clinical Decision Support • Sepsis Risk – Clinical Decision Support • Post-surgical Risk (Hips and Knees) – Clinical Decision Support • Charge-denial Risk – Financial Decision Support • Charge-grouping Guidance – Financial Decision Support • Predictive ETL Batch Load Times – Platform • Hospital Length of Stay – Operations and Performance Management • Hospital Census – Operations and Performance Management • CAUTI and VTE – Clinical Decision Support • Risk-adjusted Comparisons Across Health Systems – CAFÉ • 1-yr Admission Risk – Population Health and Accountable Care • Bronchiolitis Admissions Risk – Clinical Decision Support • Emergency C-section Risk – Clinical Decision Support • Palliative Care vs Invasive Procedure Guidance – Clinical Decision Support • Mortality Risk in Pre-term Births – Clinical Decision Support • Registry Automation via Unsupervised Learning – Clinical Decision Support • Mortality Risk in PICU – Clinical Decision Support
  47. 47. Predictive Seedlings (legend on the slide: Currently possible with healthcare.ai and the right data; Roadmap for healthcare.ai) • Bronchiolitis Admissions Risk • Emergency C-section Risk • Palliative Care vs Invasive Procedure Guidance • Mortality Risk in Pre-term Births • Mortality Risk in PICU • Deep Learning for Large Tabular Data (1M+ rows) • Patients Like This – Modifiable Risk-factor Recommendation for Patient Attributes • Patients Like This – Optimal Treatment Recommendation • Registry Automation via Unsupervised Learning • Radiology Image Classification via Deep Learning • Pathology Image Classification via Deep Learning
  48. 48. In Summary • Watson was overhyped, overbought, oversold… Not maliciously, but rather, probably naively • But it will have a big impact on society • Healthcare data ecosystem is just not quite ready for Watson, especially the text content that Watson thrives on • We have a bright future ahead for machine learning in healthcare, if you adjust your strategy and expectations according to the data content that’s available
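
The three-part pattern from slide 25 ("patients like this, who were treated like this, had these outcomes and costs") can be sketched in a few lines. This is a toy illustration only, not Health Catalyst’s implementation and not the healthcare.ai API: the records, feature values, treatment labels, and function names are all hypothetical, and a plain nearest-neighbour lookup stands in for whatever similarity model a real system would use.

```python
from collections import defaultdict
from math import dist
from statistics import mean

# Hypothetical toy records: (features, treatment, outcome_score, cost).
# Features are made-up, pre-scaled numbers, not real clinical data.
RECORDS = [
    ((0.20, 0.10), "A", 0.9, 1000.0),
    ((0.30, 0.20), "A", 0.7, 1200.0),
    ((0.25, 0.15), "B", 0.6, 500.0),
    ((0.90, 0.80), "B", 0.2, 700.0),
    ((0.80, 0.90), "A", 0.3, 1500.0),
]


def patients_like_this(query, records, k=3):
    """Part 1: the k records whose features are closest to `query`."""
    return sorted(records, key=lambda r: dist(r[0], query))[:k]


def summarize_by_treatment(similar):
    """Parts 2 and 3: group similar patients by treatment, then
    average their outcomes and costs."""
    groups = defaultdict(list)
    for _, treatment, outcome, cost in similar:
        groups[treatment].append((outcome, cost))
    return {
        treatment: {
            "n": len(rows),
            "mean_outcome": mean(o for o, _ in rows),
            "mean_cost": mean(c for _, c in rows),
        }
        for treatment, rows in groups.items()
    }


similar = patients_like_this((0.22, 0.12), RECORDS, k=3)
summary = summarize_by_treatment(similar)
for treatment, stats in sorted(summary.items()):
    print(treatment, stats)
```

Even in this toy form the talk’s central point shows up: the summary is only as good as the outcome and cost columns, and with few similar patients per treatment the averages carry little confidence.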
