The presentation by Nicky Hekster (IBM) at the 'Big Data in de Zorg' (Big Data in Healthcare) conference on 23 November 2011 in Almere. This conference marked the official launch of Almere DataCapital and the Dutch Health Hub.
IBM: heart-lung machine, laser eye surgery, rescuing the chocolate industry, MRI for viruses
However, current medical practice does not provide for the collection of, and access to, the patient “health-related” data that personalized medicine would require. Moving toward it is a natural progression along the path to better, more scientific and data-driven care. The large arrow shows a progression from practicing medicine based on individual knowledge and experience, through intuitive or consensus-based approaches when evidence is sparse, to evidence based on populations, and ultimately to the holy grail of personalized health promotion and care delivery, with evidence based on “patients like me”. Too much care today is “trial and error”, meaning that it is based on individual clinician expertise and knowledge, all too frequently with limited access to relevant patient information and clinical knowledge. Healthcare is too complex and changing too fast to base care only on what an individual clinician can learn and retain. In 1975 there were about 200 clinical trials published. By 2005, the number had grown to over 30,000. Add to that all the industry knowledge generated outside of clinical trials. In short, we have increasing complexity of intervention options, increasing insights into patient heterogeneity and an expanding scope of potential services for prevention, chronic care, etc. It is no longer possible to practice medicine “with the knowledge in a clinician’s head.” We do not have complete knowledge of all diseases today, and it is unlikely that we ever will. Even if a physician has access to more complete patient information and clinical knowledge, when knowledge of the disease, or of the interaction of multiple diseases, does not exist, the physician must depend on intuition to diagnose and to determine the best treatment approach and prognosis. Evidence-based approaches can represent a huge step forward.
The problem with evidence based on populations is called heterogeneity of treatment effects, which describes the variation in results from the same treatment in different patients. For example, some may respond well to a drug, some may respond but poorly, some may have an adverse reaction and some may have no response. Also, what we think are similar diseases based on symptoms may in fact be quite different diseases. For example, we now know that there are over 90 different types of lymphoma and leukemia. Experts today suggest that we have evidence for only about one-quarter to one-third of what we do. We have also been remarkably uncurious about what works, why it works and for whom it works. The share of US health expenses devoted to determining what works best is about one-tenth of one percent.
Personalized healthcare, in the upper right-hand corner, uses more complete information (for example, about the patient, disease states or responses to treatments) to help predict, prevent and aid in early detection of diseases. Then it uses the patient’s unique physiology, and patient preferences where appropriate, to help determine the best preventive or therapeutic approaches.
Regarding the axes, note the reference to diagnostic tools. An incorrect or incomplete diagnosis occurs all too frequently: in up to 50% of cases according to some studies. In a recent report (Medscape, May 2, 2008), researchers state that the rate of diagnostic error is up to 15% and that the cases physicians see as routine and unchallenging are often the ones that end up being misdiagnosed. Also, autopsies suggest that as many as 20% of fatal illnesses are misdiagnosed. Jerome Groopman, MD, author of “How Doctors Think”, suggests that patients pose three questions to their doctor when he or she suggests a diagnosis:
What else could it be? There are over 10,000 diseases and the biggest diagnostic error is premature closure. Computerized diagnostic tools such as Isabel can help. It is now being interfaced to NextGen.
Could two things be going on that would explain my symptoms?
Is there anything in my history, physical examination, laboratory findings or other tests that seems not to fit with the diagnosis?
Regarding the cause of the disease, we need to know the exact cause, not just the symptoms. Many diseases with different causes (and requiring different treatments) share similar symptoms. Regarding comparative effectiveness, we need better information about benefits, risks and costs (for cost effectiveness) of different interventions for different conditions (or multiple conditions) in different patient populations and subpopulations. Also, for the vertical axis, the definition of relevant patient information will expand in a more patient-centric, value-based healthcare system. Clinicians will need to know a lot more about a patient for prevention, prediction, early detection, chronic care coordination, patient compliance and behavior modification than is needed for a specific acute intervention.
Key Points to Make
The healthcare industry is undergoing a major transformation: we all need it and we all use it. Those of us “in healthcare” realize we have a legacy health system that is fee-for-service based, resulting in a care system with high cost and inconsistent quality. We're moving to a more patient-centric, evidence-based and competitive care system where the participants will have to compete on the value they deliver. This transformation is driving new thinking, new business models and a restructuring of clinical and operational care models. The expectation of value is changing and healthcare organizations have to adjust their business to deliver value, not volume. This type of transformation requires innovation: innovation that is about productivity and value, and not just advancing medical technology for technology's sake. The main consideration must be total well-being and total cost. As the backbone of a transformed healthcare system, leveraging clinical information and operational data is an obvious way to improve quality of care, patient satisfaction and business efficiency. This is placing a premium on making information accessible and actionable to optimize outcomes.
Interesting Background Nuggets
Transformation has begun. Evidence:
Implementation of EMR: 98% of hospitals and clinicians have begun planning
HITECH makes $27B available through 2020
Provider consolidation is accelerating; competitors as well as payers/providers are merging
Pay for Performance: payers consider factors such as quality of care, patient satisfaction and business efficiency
Storage capacity shipments are growing at a 54% CAGR (IDC, "Worldwide Disk Storage Systems 2008-2012 Forecast: Content-Centric Customers - Reshaping Market Demand," Doc # 212177, May 2008). Medical images will take up 30% of the world’s storage. The field of medical imaging is driving breakthroughs in diagnosis and treatment in medicine, and that will only accelerate. As a result, there is exponential growth in the number and size of digital medical images. Medical images that used to be two-dimensional and 1MB in size a few years ago are now typically four-dimensional and 1TB in size. By 2010, it’s estimated that 30% of the world’s storage will be taken up by these medical images.¹ Source:
How Can Watson Help Doctors? “In ‘Jeopardy’, built into the game is this notion of confidence; that it’s not worth answering unless you’re sure. And in the real world, there are lots of problems like that. You don’t want your doctor to guess. You want him to have confidence in his answer before he decides to give you a treatment,” commented Dr. Katharine Frase, VP, Industry Solutions and Emerging Business at IBM Research. Healthcare delivery has become increasingly reliant on a sophisticated mix of medical devices, diagnostic and therapeutic equipment, and the multimodal data they provide. The delivery of high-quality healthcare depends on a wide variety of diagnostic, surgical and therapeutic systems, the environment of care, and the supporting IT infrastructure. This abundance of data has created challenges for healthcare providers. “For at least thirty years, it’s been humanly impossible for a physician to master all the material they need to practice at the highest level,” said Dr. Herbert Chase, Professor of Clinical Medicine at Columbia University. “Medical literature has doubled in size every seven years,” he added. “You are never going to replace a trained doctor or nurse,” said Dr. Joseph Jasinski, Healthcare and Life Sciences at IBM Research. “But certainly a system like Watson could be a Physician’s Assistant. Suppose you are a clinician, a doctor, a nurse trying to diagnose a very complex case. You have some ideas, but in order to confirm your hypothesis, confirm what you think is wrong, you need a lot of information,” he added. In addition to assisting with medical diagnosis, Watson could help lower the occurrence of medical errors and hospital-acquired infections, a key issue facing the healthcare system. “Twenty percent of medical errors are diagnostic errors. And it’s not that they’ve missed diagnoses, often that they are delayed,” said Dr. Chase.
“Watson has the capacity to get the diagnoses up there sooner.” A system like Watson could be tied into a hospital’s facilities, biomedical systems and healthcare management infrastructure to ensure that assets are cleaned, sanitized or sterilized before they are made available for use with a new patient. It could also ensure that clinical equipment that has been recalled or is under a service alert is removed from service before it is used on a patient. These scenarios would help to proactively reduce the opportunity for medical errors and preventable hospital-acquired infections. “It is the effective and efficient storage, retrieval, analyses, and use of biomedical information to improve health. At the end of the day, the goal is to improve health,” said Dr. Chase.
NLP: ambiguities, context-dependent, implicit, imprecise. Self-learning. 200 million pages in 3 seconds.
One of the challenges was to extend the approach of classic search engines. In natural language, words can change their meaning depending on context, or even be ambiguous. Let us look at the example: based on the bare keywords on the left-hand side, the answer would presumably read as shown on the right. Interesting, but wrong, because the bare keywords alone do not automatically lead to Vasco da Gama as the correct answer. The system therefore has to do considerably more than just string keywords together. The keywords must be classified and placed in the right context. The system must weigh possible answers against each other and, in the end (in Jeopardy, within 3 seconds), decide whether to answer and which answer is the most probable.
Background information: We faced a lot of technical challenges, but at the center of the problem is dealing with the many ways you can express the same meaning in natural language. Natural language is often very sensitive to context and is often incomplete, tacit and ambiguous. Simplified approaches can easily lead you astray. These next two examples should help motivate our approach. Consider this question. <Read it> Now consider that, based simply on keywords, it would be straightforward to pick up this potentially answer-bearing passage. <read green passage> This is a great hit from a keyword perspective: it shares many common terms (May, Arrived, Anniversary, Portugal, India, etc.), and keyword evidence alone would give good confidence that Gary is the explorer in question. And who's to say Gary is not an explorer? After all, we are all explorers in our own special way. In fact, the next sentence might read: “and then Gary returned home to explore his attic looking for a lost photo album.” Such a sentence would be legitimate evidence that Gary can be classified as an explorer.
Classifications are tricky. We humans are very flexible in how we classify things: we are willing to accept all sorts of variations in meaning to make language work. Of course, in this case the famous explorer Vasco da Gama is the correct answer, but how would a computer know that for sure? A computer system must learn to dig deeper: to find, evaluate and weigh different kinds of evidence, ultimately finding the answer that is best supported by the content. Consider this…<next slide>
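The keyword pitfall described in these notes can be illustrated with a minimal sketch. It is not Watson's method: the stopword list, the tokenizer and the `overlap_score` function are assumptions made for illustration, and only the question and the two passages come from the slides. The sketch scores a passage by the fraction of question keywords it shares, which rewards the misleading “Gary” passage over the passage containing the correct answer.

```python
import re

# Hypothetical, hand-picked stopword list (an assumption for this sketch).
STOPWORDS = {"the", "of", "this", "in", "he", "his", "after", "on", "s"}


def keywords(text):
    """Lowercase the text, split it into alphanumeric tokens, drop stopwords."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return {t for t in tokens if t not in STOPWORDS}


def overlap_score(question, passage):
    """Fraction of the question's keywords that also occur in the passage."""
    q = keywords(question)
    return len(q & keywords(passage)) / len(q)


QUESTION = ("In May 1898 Portugal celebrated the 400th anniversary "
            "of this explorer's arrival in India.")
GARY = ("In May, Gary arrived in India after he celebrated "
        "his anniversary in Portugal.")
VASCO = "On the 27th of May 1498, Vasco da Gama landed in Kappad Beach."

for name, passage in [("Gary", GARY), ("Vasco da Gama", VASCO)]:
    print(name, round(overlap_score(QUESTION, passage), 2))
```

In this toy setup the only keyword the correct passage shares with the question is “may”, so pure keyword overlap ranks the wrong passage first.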
So here is Watson's approach: to evaluate the possible answers, the system needs a so-called confidence level. It parses the statement on the left grammatically and in its context. On the right we see that the system has already picked the right passage out of its data set, but it does not know this for certain, because only the word “May” matches. So the system must now search its entire data set to corroborate the statement and raise the probability of the correct answer. During this search the system already begins to learn and to form new links in its data. That is the task of the algorithms mentioned earlier.
Background information: Here we see the same question on the left. <read it again> To identify and gain confidence in better evidence, the system must parse the question, determine its grammatical structure and identify the main predicates, such as “celebrated” and “arrived”, along with their main arguments (that is, their subjects, objects, etc.): for example, who is doing the celebrating and who is doing the arriving, and, for each of these actions, where and when they are happening. This further requires the system to attempt to distinguish places, dates and people from each other and from other words and phrases in the question. On the right side, we see a passage containing the RIGHT answer, BUT with only one keyword in common: “May”.
<read the green passage> Given just that one common and very popular term, the system must look at a huge amount of unrelated material to even get a chance to consider this passage, and must then employ and weigh the right algorithms to match the question with an accurate confidence. For example, in this case <click> temporal reasoning algorithms can relate a 400th anniversary in 1898 to 1498; statistical paraphrasing algorithms can help the computer learn, from reading lots of text, that “landed in” can imply “arrived in”; and finally, with geospatial reasoning using geographical databases, the system may learn that Kappad Beach is in India, and that if you arrive in Kappad Beach you have therefore arrived in India. And still, all of this will admit numerous errors, since few of these computations will produce 100% certainty in mapping from words to concepts to other words. Just as an example, what if the passage said “considered landing in” rather than “landed in”, or what if the question said “arrival in what he thought to be India”? Question answering technology tries to understand what the user is really asking for and to deliver precise and correct responses. But natural language is hard: the author's intended meaning can be expressed in so many different ways. To achieve high levels of precision and confidence you must consider much more information and analyze it more deeply. We needed a radically different approach that could rapidly admit and integrate many algorithms, considering lots of different bits of evidence from different perspectives, AND that could learn how to combine and weigh these different sorts of evidence, ultimately determining how strongly or weakly they support or refute possible answers.
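The three kinds of evidence named in these notes can be sketched as small scorers whose outputs are combined into one confidence value. This is a toy illustration under stated assumptions, not Watson's implementation: the paraphrase table, the one-entry geographic knowledge base and the weights are hand-written and hypothetical, whereas the notes say the real system learns such paraphrases and weights from data.

```python
import re

# Hypothetical hand-written resources; the real system is said to learn these.
PARAPHRASES = {"arrival in": {"arrived in", "landed in"}}
GEO_KB = {"Kappad Beach": "India"}
WEIGHTS = {"temporal": 0.4, "paraphrase": 0.3, "geo": 0.3}  # assumed, not learned


def temporal_match(question, passage):
    """Date math: an Nth anniversary in year Y points to an event in year Y - N."""
    m = re.search(r"(\d{4}).*?(\d+)(?:st|nd|rd|th) anniversary", question)
    if not m:
        return 0.0
    event_year = int(m.group(1)) - int(m.group(2))
    return 1.0 if str(event_year) in passage else 0.0


def paraphrase_match(question, passage):
    """1.0 if the passage contains a known paraphrase of a question phrase."""
    for phrase, variants in PARAPHRASES.items():
        if phrase in question and any(v in passage for v in variants):
            return 1.0
    return 0.0


def geo_match(question, passage):
    """1.0 if the passage names the region asked about, or a place inside it."""
    for place, region in GEO_KB.items():
        if region in question and (region in passage or place in passage):
            return 1.0
    return 0.0


def confidence(question, passage):
    """Weighted sum of the three evidence scores."""
    scores = {
        "temporal": temporal_match(question, passage),
        "paraphrase": paraphrase_match(question, passage),
        "geo": geo_match(question, passage),
    }
    return sum(WEIGHTS[name] * score for name, score in scores.items())


QUESTION = ("In May 1898 Portugal celebrated the 400th anniversary "
            "of this explorer's arrival in India.")
GARY = ("In May, Gary arrived in India after he celebrated "
        "his anniversary in Portugal.")
VASCO = "On the 27th of May 1498, Vasco da Gama landed in Kappad Beach."

for name, passage in [("Gary", GARY), ("Vasco da Gama", VASCO)]:
    print(name, confidence(QUESTION, passage))
```

With these assumed weights, the Vasco da Gama passage outranks the Gary passage because only it satisfies the date arithmetic, even though it shares a single keyword with the question. Exact substring checks like these would still fail on variants such as “considered landing in”, which is the kind of error the notes warn about.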
Key Points to Make
Organizations are already deploying the breadth of IBM’s solutions to address information-based problems in healthcare:
Master Data and Advanced Case Management
Data Warehouses, Business Analytics including IBM Netezza based solutions
Big data, new data models and more, all based on a common integration framework
Some examples include:
Independent Health: used predictive analytics to create engagement segmentation studies and pinpointed the best way to engage individual customers in lowering health risks through more effective wellness programs.
Geisinger Health System: provided rapid analysis and reporting of vital insights from millions of patient encounters through a first-of-its-kind clinical decision intelligence system, improving patient care, research and innovation.
North Carolina State University: performed data and content analytics on unstructured information sources to reduce the time needed to find target companies for technology investments from months to days.
Aetna: used fraud and abuse analytics to identify suspicious physician, hospital and patient issues in healthcare claims. “To date, the SIU has identified more than $20,000,000 in potential recoveries on the cases.”
Not shown:
Rice University uses POWER7 for healthcare analytics (http://www-03.ibm.com/press/us/en/pressrelease/29315.wss)
Regional University Hospital (supported by Medicaid). Phase 1: treatment effectiveness through patient monitoring during and after discharge. Phase 2: expand treatment effectiveness and move into research areas. Solution consists of analyzed patient care information with customized alerting to Medicaid (IBM Content Analytics with medical annotators).
Large Healthcare Payer. Phase 1: analyzed patient records for claims processing to reduce the cost of evidence collection and automate claims-related decisions; expects to save $7.5 million annually. Solution to consist of analyzed patient care information with claims (case) data (IBM Content Analytics, IBM Case Manager with medical annotators).
Nicky Hekster (IBM) - Watson for Health
What can Watson mean for searching Big Data? Dr N.S. Hekster, 23 November 2011, Big Data in de Zorg
<ul><li>To wrest from nature the secrets which have perplexed philosophers in all ages, to track to their sources the causes of disease, to correlate the vast stores of knowledge, that they are quickly available for the prevention and cure of disease - these are our ambitions</li></ul><ul><li>Sir William Osler (1849-1919), Co-founder of the Faculty of Medicine, Johns Hopkins University</li></ul>
Speaker introduction <ul><li>My name: Nicky Hekster</li></ul><ul><li>Role: Technical Leader Healthcare & LifeSciences, IBM Benelux</li></ul><ul><li>Active in ICT since 1987, in Healthcare & Life Sciences since 2006</li></ul><ul><li>Vendor chair</li></ul>
The eHealth roadmap: from information in order to information of value
Axes: value; maturity of healthcare IT over time
Collaboration in care teams and interaction with the patient
Viewing and exchanging data across the whole care organization through a customizable user interface
Secure exchange and harmonization of reliable clinical data within and between care institutions
Extracting value and insight to improve quality and outcomes and to prevent clinical variation and fluctuating costs
Digitization of information within the hospital or a department
Recording patient information
Actionable clinical and business insights
Integrated patient information
Evidence Based Medicine, Clinical Decision Support
Delivering clinical knowledge and patient-specific information to support the best decision and care pathway
IT optimization
Personalized care: making it a reality requires better access to and analysis of relevant patient information and clinical knowledge
Horizontal axis: access to clinical knowledge (e.g. diagnostic tools, knowledge of the causes of disease, empirical evidence or comparative effectiveness), from moderate to good
Vertical axis: access to relevant patient information, from moderate to good
Trial and error (based on expertise and experience)
Intuitive and by clinical consensus (based on partial access to available patient information and clinical knowledge)
Predictive and evidence-based (based on patient cohorts)
Personalized (based on people like me)
Arrow: value, from “more art than science” to “more science than art”
Source: IBM Global Business Services and IBM Institute for Business Value
Inconsistent quality and rising costs call for change. Increasing effectiveness and efficiency <ul><li>Medication errors lead to some 90,000 unnecessary hospital admissions each year</li></ul><ul><li>The cost of incidents in healthcare amounts to 4 billion euros per year</li></ul><ul><li>Drugs against cancer and Alzheimer's do not work in 75-80% of cases</li></ul>Improving clinical operations <ul><li>The number of people in the Netherlands who die each year from avoidable medical errors is approximately 1,500 to 1,600</li></ul><ul><li>Millions are lost every year through administrative and clinical waste, fraud and abuse</li></ul>
Healthcare data: high volumes and enormous variety
About 80% of all stored healthcare data is unstructured [1]
Healthcare data storage capacity grows by 35% per year [2]
30% of worldwide data storage consists of medical images [3]
Image size: 1 MB per 2D image (2004), 500 MB per 4D image (>2009)
Data exchanged annually between healthcare information systems: 283 terabytes in 2010 and 78 petabytes in 2020, with 774 million connected devices [4]
[Chart: petabytes shipped per year, 2006 to 2011. Labels: structured, unstructured, electronic patient record (EPD), radiology, cardiology, pathology, gastroenterology (MDL), dermatology]
[1] AIIM website, accepted percentage
[2] Recent study by the Enterprise Strategy Group
[3] IBM Global Technology Outlook for 2005
[4] http://www.machinaresearch.com/healthcare2020.html
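As a rough check on the exchange volumes quoted on this slide (283 terabytes in 2010, 78 petabytes in 2020), the implied compound annual growth rate can be worked out as below. This is a back-of-the-envelope sketch: the `cagr` helper is introduced here for illustration, and decimal prefixes (1 PB = 1,000 TB) are an assumption, since the slide does not say which convention it uses.

```python
def cagr(start, end, years):
    """Compound annual growth rate between a start and an end value."""
    return (end / start) ** (1 / years) - 1


TB_2010 = 283            # terabytes exchanged in 2010 (from the slide)
TB_2020 = 78 * 1000      # 78 petabytes in terabytes, assuming 1 PB = 1,000 TB

rate = cagr(TB_2010, TB_2020, years=10)
print(f"Implied annual growth of exchanged data: {rate:.0%}")
```

Under these assumptions the implied rate falls between 70% and 80% per year. It describes data exchanged between systems, so it is not directly comparable to the 35% annual growth in storage capacity quoted on the same slide.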
Watson: the future physician's assistant? <ul><li>Complex diagnoses</li></ul><ul><li>Better and better-tailored medication</li></ul><ul><li>Preventing medical errors</li></ul><ul><li>Evidence Based Medicine (EBM)</li></ul><ul><li>Natural language interface (NLP)</li></ul><ul><li>Not connected to the Internet</li></ul>YouTube, Perspectives on Watson: Healthcare
The keyword matching way
Question: In May 1898 Portugal celebrated the 400th anniversary of this explorer’s arrival in India.
Passage: In May, Gary arrived in India after he celebrated his anniversary in Portugal.
Matched keywords: In May, celebrated, anniversary, in Portugal, arrival in / arrived in, India; candidate for “explorer”: Gary
The Watson way
Question: In May 1898 Portugal celebrated the 400th anniversary of this explorer’s arrival in India.
Passage: On the 27th of May 1498, Vasco da Gama landed in Kappad Beach.
Temporal Reasoning (Date Math): May 1898, 400th anniversary, 27th of May 1498
Statistical Paraphrasing (Paraphrases): arrival in, landed in
GeoSpatial Reasoning (Geo-KB): India, Kappad Beach
Answer for “explorer”: Vasco da Gama
Smarter healthcare through analytics
Improving knowledge, diagnostics and early recognition for rare diseases.
Integrating viral genomics data with clinical data to predict the response to anti-HIV therapy.
Linking DNA expression data, among other sources, with clinical data for a better understanding of, and faster research into, inflammatory bowel diseases.
Real-time analytics on streaming data from medical instruments to monitor premature babies at the Toronto Sick Children's Hospital.