IBM Watson: Clinical Decision Support



Dr. Martin Kohn's presentation to the New York Technology Council at their

Published in: Health & Medicine, Business
  • Question in the form of a statement, answer in the form of a question
  • Here we see the same question on the right <read it again>. To identify and gain confidence in better evidence, the system must parse the question, determine its grammatical structure, and identify the main predicates, such as “celebrated” and “arrived”, along with their main arguments (their subjects, objects, and so on): for example, who is doing the celebrating, who is doing the arriving, and, for each of these actions, where and when it is happening. This further requires the system to distinguish places, dates and people from each other and from the other words and phrases in the question. On the right side, we see a passage containing the RIGHT answer BUT with only one keyword in common: “MAY”. <read the green passage> Given just that one common and very popular term, the system must look at a huge amount of unrelated material even to get a chance to consider this passage, and must then employ and weigh the right algorithms to match the passage to the question with an accurate confidence. For example, in this case <click>: Temporal Reasoning algorithms can relate a 400th anniversary in 1898 to 1498; Statistical Paraphrasing algorithms can help the computer learn, from reading lots of texts, that “landed in” can imply “arrived in”; and finally, with Geospatial Reasoning using geographical databases, the system may learn that Kappad Beach is in India and that if you arrive in Kappad Beach you have therefore arrived in India. And still, all of this will admit numerous errors, since few of these computations will produce 100% certainty in mapping from words to concepts to other words. Just as an example, what if the passage said “considered landing in” rather than “landed in”, or what if the question said “arrival in what he thought to be India”? Question Answering technology tries to understand what the user is really asking for and to deliver precise and correct responses.
But natural language is hard: the author's intended meaning can be expressed in so many different ways. To achieve high levels of precision and confidence you must consider much more information and analyze it more deeply. We needed a radically different approach that could rapidly admit and integrate many algorithms, considering lots of different bits of evidence from different perspectives, AND that could learn how to combine and weigh these different sorts of evidence, ultimately determining how strongly or weakly they support or refute possible answers.
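The date math described in these notes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the decay formula are hypothetical and are not taken from Watson; the only facts used are the 400th anniversary in 1898 and the 1498 landing quoted in the notes.

```python
def implied_event_year(celebration_year: int, anniversary: int) -> int:
    """Year of the original event implied by an Nth anniversary."""
    return celebration_year - anniversary


def temporal_agreement(question_year: int, anniversary: int, passage_year: int) -> float:
    """Return 1.0 when the passage year equals the implied year; otherwise a
    score that shrinks as the gap in years grows (the decay is an arbitrary,
    illustrative choice)."""
    gap = abs(implied_event_year(question_year, anniversary) - passage_year)
    return 1.0 / (1.0 + gap)


# Question: "In May 1898 Portugal celebrated the 400th anniversary ..."
# Passage:  "On the 27th of May 1498, Vasco da Gama landed in Kappad Beach"
print(implied_event_year(1898, 400))        # 1498
print(temporal_agreement(1898, 400, 1498))  # 1.0
```

A score like this would be only one feature among many; as the notes stress, no single computation gives certainty.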
  • <click> Watson, the computer system we developed to play Jeopardy!, is based on the DeepQA software architecture. Here is a look at the DeepQA architecture. This is like looking inside the brain of the Watson system from about 30,000 feet high. Remember, the intended meaning of natural language is ambiguous, tacit and highly contextual. The computer needs to consider many possible meanings, attempting to find the evidence and inference paths that are most confidently supported by the data. So, the primary computational principle supported by the DeepQA architecture is to assume and pursue multiple interpretations of the question, to generate many plausible answers or hypotheses, and to collect and evaluate many different competing evidence paths that might support or refute those hypotheses. Each component in the system adds assumptions about what the question might mean, what the content means, what the answer might be, or why it might be correct. DeepQA is implemented as an extensible architecture and was designed at the outset to support interoperability. <UIMA Mention> For this reason it was implemented using UIMA, a framework and OASIS standard for interoperable text and multi-modal analysis contributed by IBM to the open-source community. Over 100 different algorithms, implemented as UIMA components, were integrated into this architecture to build Watson. In the first step, Question and Category Analysis, parsing algorithms decompose the question into its grammatical components. Other algorithms here identify and tag specific semantic entities like names, places or dates. In particular, the type of thing being asked for, if it is indicated at all, will be identified. We call this the LAT, or Lexical Answer Type, as in this “FISH”, this “CHARACTER” or this “COUNTRY”. In Query Decomposition, different assumptions are made about whether and how the question might be decomposed into sub-questions.
The original question and each identified sub-part follow parallel paths through the system. In Hypothesis Generation, DeepQA does a variety of very broad searches for each of several interpretations of the question. Note that Watson, to compete on Jeopardy!, is not connected to the internet. These searches are performed over a combination of unstructured data (natural language documents) and structured data (the available databases and knowledge bases fed to Watson during training). The goal of this step is to generate possible answers to the question and/or its sub-parts. At this point there is very little confidence in these possible answers, since little intelligence has been applied to understanding the content that might relate to the question. The focus at this point is on generating a broad set of hypotheses or, for this application, what we call “Candidate Answers”. To implement this step for Watson we integrated and advanced multiple open-source text and KB search components. After candidate generation, DeepQA also performs Soft Filtering, where it makes parameterized judgments about which and how many candidate answers are most likely worth investing more computation in, given specific constraints on time and available hardware. Based on a trained threshold for optimizing the tradeoff between accuracy and speed, Soft Filtering uses different lightweight algorithms to judge which candidates are worth gathering evidence for and which should get less attention and continue through the computation as-is. In contrast, if this were a hard filter, those candidates falling below the threshold would be eliminated from consideration entirely at this point. In Hypothesis & Evidence Scoring, the candidate answers are first scored independently of any additional evidence by deeper analysis algorithms. This may, for example, include Typing Algorithms.
These are algorithms that produce a score indicating how likely it is that a candidate answer is an instance of the Lexical Answer Type determined in the first step, for example Country, Agent, Character, City, Slogan, Book, etc. Many of these algorithms may fire, using different resources and techniques to come up with a score. What is the likelihood that “Washington”, for example, refers to a “General” or a “Capital” or a “State” or a “Mountain” or a “Father” or a “Founder”? For each candidate answer, many pieces of additional evidence are searched for. Each of these pieces of evidence is subjected to more algorithms that deeply analyze the evidentiary passages and score the likelihood that the passage supports or refutes the correctness of the candidate answer. These algorithms may consider variations in grammatical structure, word usage, and meaning. In the Synthesis step, if the question had been decomposed into sub-parts, one or more synthesis algorithms will fire. They apply methods for inferring a coherent final answer from the constituent elements derived from the question's sub-parts. Finally, arriving at the last step, Final Merging and Ranking, are many possible answers, each paired with many pieces of evidence, and each of these scored by many algorithms to produce hundreds of feature scores, all giving some evidence for the correctness of each candidate answer. Trained models are applied to weigh the relative importance of these feature scores. These models are trained with ML methods to predict, based on past performance, how best to combine all these scores to produce a final, single confidence number for each candidate answer and to produce the final ranking of all candidates. The answer with the strongest confidence would be Watson’s final answer, and Watson would try to buzz in provided that the top answer’s confidence was above a certain threshold.
---- The DeepQA system defers commitments and carries possibilities through the entire process while searching for increasingly broad contextual evidence and more credible inferences to support the most likely candidate answers. All the algorithms used to interpret questions, generate candidate answers, score answers, collect evidence and score evidence are loosely coupled but work holistically by virtue of DeepQA’s pervasive machine learning infrastructure. No one component could realize its impact on end-to-end performance without being integrated and trained with the other components, AND they are all evolving simultaneously. In fact, what had a 10% impact on some metric one day might, one month later, contribute only 2% to overall performance because of evolving component algorithms and interactions. This is why the system, as it develops, is regularly trained and retrained. DeepQA is a complex system architecture designed to deal extensibly with the challenges of natural language processing applications and to adapt to new domains of knowledge. The Jeopardy! Challenge has greatly inspired its design and implementation for the Watson system.
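The soft filtering, final merging and buzz-in decision described in these notes can be sketched as follows. Everything here is an assumption made for illustration: the `Candidate` class, the feature names, the logistic form of the learned model, and all weights, scores and thresholds are invented and do not come from Watson.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Candidate:
    answer: str
    light_score: float                            # cheap score used by soft filtering
    features: dict = field(default_factory=dict)  # deeper feature scores, if gathered


def soft_filter(candidates, threshold):
    """Split candidates into those promoted to deep evidence gathering and those
    passed through as-is. A hard filter would drop the second group instead."""
    promoted = [c for c in candidates if c.light_score >= threshold]
    passed_through = [c for c in candidates if c.light_score < threshold]
    return promoted, passed_through


def confidence(candidate, weights, bias):
    """Combine feature scores with learned weights (logistic form is assumed)."""
    z = bias + sum(weights.get(k, 0.0) * v for k, v in candidate.features.items())
    return 1.0 / (1.0 + math.exp(-z))


def rank(candidates, weights, bias):
    """Return (confidence, answer) pairs, most confident first."""
    return sorted(((confidence(c, weights, bias), c.answer) for c in candidates),
                  reverse=True)


def should_buzz(ranked, buzz_threshold):
    """Buzz in only if the top answer's confidence clears the threshold."""
    return bool(ranked) and ranked[0][0] >= buzz_threshold


candidates = [Candidate("Vasco da Gama", 0.6), Candidate("Gary", 0.7),
              Candidate("Portugal", 0.1)]
promoted, passed_through = soft_filter(candidates, threshold=0.5)

# Made-up deep feature scores for the promoted candidates only.
deep = {"Vasco da Gama": {"keyword": 0.2, "temporal": 1.0, "geospatial": 1.0},
        "Gary": {"keyword": 0.9, "temporal": 0.0, "geospatial": 1.0}}
for c in promoted:
    c.features = deep[c.answer]

weights = {"keyword": 0.5, "temporal": 3.0, "geospatial": 2.0}  # stand-in for a trained model
ranked = rank(promoted + passed_through, weights, bias=-2.0)
print(ranked[0][1], should_buzz(ranked, buzz_threshold=0.9))
```

With these invented numbers the weakly keyword-matched but temporally and geographically supported answer ranks first, which is the behavior the notes describe for learned weighting of evidence.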
  • What we did for Jeopardy! Applied to Healthcare too. This is one aspect. There are others as well. Makes a small point. Emphasize multiple aspects of evidence?
  • Key points: The timing is as good as it has ever been to introduce new decision support solutions for healthcare. Our strategy is holistic, involving many varied partners. What role is of interest? Providers include doctors, nurses, hospitals and clinics.

    1. Clinical Decision Support: DeepQA
       Martin S. Kohn, MD, MS, FACEP, FACPE
       Chief Medical Scientist, Care Delivery Systems, IBM Research
    2. World Healthcare Expenditures – 2006/2007 WHO Data
         • Total world healthcare spend: $4.75 trillion
         • Total US healthcare spend: $2.05 trillion
         • The US spends 43% of all the healthcare dollars in the world!
         • In 2009, for the first time, the government paid more than 50% of all US healthcare dollars
         • So, our government spends more than 20% of all the money spent on healthcare in the world
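The percentages on this slide follow directly from the two totals. A quick check, assuming the slide's figures are taken at face value and treating the "more than 50%" government share as exactly 50% to get a lower bound (variable names are illustrative):

```python
world_spend = 4.75e12   # total world healthcare spend, USD (from the slide)
us_spend = 2.05e12      # total US healthcare spend, USD (from the slide)

us_share_of_world = us_spend / world_spend
# Lower bound: the slide says government pays "more than 50%" of US spend.
gov_share_of_world = us_share_of_world * 0.50

print(f"US share of world spend: {us_share_of_world:.1%}")
print(f"US government share of world spend: at least {gov_share_of_world:.1%}")
```

The first figure rounds to the slide's 43%, and half of it is a little over 21%, consistent with the "more than 20%" claim.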
    3. Per Capita Health Spending and 15-Year Survival for 45-Year-Old Women
       Source: Peter A. Muennig and Sherry A. Glied, What Changes In Survival Rates Tell Us About US Health Care, Health Affairs, Vol 0, Issue 2010, hlthaff.2010.0073v1-101377201. Copyright ©2010 by Project HOPE, all rights reserved.
    4. WHO – Five Common Shortcomings of Healthcare 2008
         • Inverse care. People with the most means – whose needs for health care are often less – consume the most care, whereas those with the least means and greatest health problems consume the least. Public spending on health services most often benefits the rich more than the poor in high- and low-income countries alike.
         • Impoverishing care. Wherever people lack social protection and payment for care is largely out-of-pocket at the point of service, they can be confronted with catastrophic expenses. Over 100 million people annually fall into poverty because they have to pay for health care.
         • Fragmented and fragmenting care. The excessive specialization of health-care providers and the narrow focus of many disease control programmes discourage a holistic approach to the individuals and the families they deal with and do not appreciate the need for continuity in care. Health services for poor and marginalized groups are often highly fragmented and severely under-resourced, while development aid often adds to the fragmentation.
         • Unsafe care. Poor system design that is unable to ensure safety and hygiene standards leads to high rates of hospital-acquired infections, along with medication errors and other avoidable adverse effects that are an underestimated cause of death and ill-health.
         • Misdirected care. Resource allocation clusters around curative services at great cost, neglecting the potential of primary prevention and health promotion to prevent up to 70% of the disease burden. At the same time, the health sector lacks the expertise to mitigate the adverse effects on health from other sectors and make the most of what these other sectors can contribute to health.
    5. World Health Report 2008 Global Issues (WHO)
         • Demand for primary care
         • Equitable, inclusive and fair systems
         • Universal coverage and universal access
         • Service delivery reforms
         • Integrating public health and primary care
         • “Put people at the centre of health care”
    6. IOM 2001 and WHO “People at the Center of Healthcare” - REDESIGN IMPERATIVES: SIX CHALLENGES
         • Re-engineered care processes
         • Effective use of information technologies
         • Knowledge and skills management
         • Coordination of care across patient conditions, services and sites of care over time
         • Development of effective teams
         • Use of appropriate performance and outcome measures
    7. Patient-Centered Care Concepts
         • Personal Relationship
         • Expanded Access
         • Team Approach
         • Comprehensive
         • Coordination
         • Quality and Safety
         • Added Value and Payment Reform
    8. A value-based health system must appropriately balance resources expended in keeping people healthy
       [Figure: Value-based Health Care System. Axis labels: Health care spending; Health Status. Category labels: Healthy / Low Risk, High Risk, At Risk, Active Disease, Early Symptoms, Early Clinical Symptoms. Callout: 20% of people generate 80% of costs.]
       Source: IBM Global Business Services and IBM Institute for Business Value
    9. IBM’s role and commitment to Healthcare
         • Change agent
             • Enabling national & regional eHealth programmes
             • IBM Research and systems thinking
             • 600+ patents in life sciences, healthcare and medical devices
             • Commitment to open industry standards
         • Health IT services and solution provider
             • >8,000 employees dedicated to healthcare
             • 56 medical doctors, 350 health professionals
             • Broad solution portfolio and business partner ecosystem
         • Supporting European eHealth
             • Participants in EuRESIST, UniversAAL, epSOS
             • Members of COCIR, Continua, IHE Europe
         • Buyer/Customer of Healthcare in the U.S.
             • 450,000 lives, $1.3B in spend
    10. “Smarter Healthcare”
        A smarter health system collaborates across health and care settings, activating individuals in their own health and making best use of resources to treat conditions, keep people healthy and deliver greater individual value.
         • Instrumented + Interconnected + Intelligent
         • Improve operational effectiveness
         • Achieve better quality and outcomes
         • Collaborate for prevention and wellness
    11. Health systems will need to emphasize different competencies to thrive in a changing healthcare environment
        Competencies:
         • Empower Patients: Empower members to assume accountability and make more informed health and financial choices
         • Collaborate with Providers: Help providers become successful in a value-based reimbursement environment
         • Innovate: Collaboratively innovate products and services, operational processes, and business models
         • Optimize Operational Efficiencies: Continue driving costs down in order to maximize margins in a highly regulated industry
         • Enable through Information Technology: Flexible applications, BI, on-demand information, effective operations/management & governance
    12. Informed Decision Making: Search vs. Expert Q&A
         • Search
             • Decision Maker: Has Question; Distills to 2-3 Keywords
             • Search Engine: Finds Documents containing Keywords; Delivers Documents based on Popularity
             • Decision Maker: Reads Documents, Finds Answers; Finds & Analyzes Evidence
         • Expert Q&A
             • Decision Maker: Asks NL Question
             • Expert: Understands Question; Produces Possible Answers & Evidence; Analyzes Evidence, Computes Confidence; Delivers Response, Evidence & Confidence
             • Decision Maker: Considers Answer & Evidence
    13. Basic Game Play: Confidence is King
         • Six categories, 5 levels of difficulty
         • 1 of 3 players selects a clue
         • Host reads the clue out loud
         • Players compete to answer; the 1st to buzz in answers
         • IF correct, gets $ value
         • IF wrong, loses $ value and other players compete again
         • Example clue (category TECHNOLOGY): ALL POLICEMEN CAN THANK STEPHANIE KWOLEK FOR HER INVENTION OF THIS POLYMER FIBER, 5 TIMES TOUGHER THAN STEEL
    14. The Jeopardy! Challenge: A compelling and notable way to drive and measure the technology of automatic Question Answering along 5 Key Dimensions
        Key dimensions: Broad/Open Domain, Complex Language, High Precision, Accurate Confidence, High Speed
        Example clues:
         • $200: If you're standing, it's the direction you should look to check out the wainscoting.
         • $600: In cell division, mitosis splits the nucleus & cytokinesis splits this liquid cushioning the nucleus
         • $1000: The first person mentioned by name in ‘The Man in the Iron Mask’ is this hero of a previous book by the same author.
         • $2000: Of the 4 countries in the world that the U.S. does not have diplomatic relations with, the one that’s farthest north
    15. Keyword Evidence
         • Question: In May 1898 Portugal celebrated the 400th anniversary of this explorer’s arrival in India.
         • Passage: In May, Gary arrived in India after he celebrated his anniversary in Portugal.
         • Keyword matching: In May ↔ In May; celebrated ↔ celebrated; anniversary ↔ anniversary; Portugal ↔ in Portugal; arrival in India ↔ arrived in India; explorer ↔ Gary
         • This evidence suggests “Gary” is the answer, BUT the system must learn that keyword matching may be weak relative to other types of evidence
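Why keyword evidence misleads here can be shown with a simple overlap score. This is a minimal sketch, assuming a crude tokenizer and a hand-picked stopword list; it is not how Watson scores passages. The question and both passages are quoted from the slides.

```python
import re

# Small, hand-picked stopword list (an assumption for this example).
STOPWORDS = {"in", "the", "of", "this", "his", "he", "after", "on", "a"}


def keywords(text: str) -> set:
    """Lower-case word tokens with stopwords removed."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return {t for t in tokens if t not in STOPWORDS}


def keyword_overlap(question: str, passage: str) -> float:
    """Fraction of question keywords that also appear in the passage."""
    q = keywords(question)
    return len(q & keywords(passage)) / len(q)


question = ("In May 1898 Portugal celebrated the 400th anniversary "
            "of this explorer's arrival in India.")
gary_passage = ("In May, Gary arrived in India after he celebrated "
                "his anniversary in Portugal.")
vasco_passage = "On the 27th of May 1498, Vasco da Gama landed in Kappad Beach."

print(keyword_overlap(question, gary_passage))   # higher, but the wrong answer
print(keyword_overlap(question, vasco_passage))  # lower, but the right answer
print(keywords(question) & keywords(vasco_passage))
```

The passage with the right answer shares only "may" with the question, so a system relying on overlap alone would prefer "Gary".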
    16. Deeper Evidence
         • Question: In May 1898 Portugal celebrated the 400th anniversary of this explorer’s arrival in India.
         • Passage: On the 27th of May 1498, Vasco da Gama landed in Kappad Beach.
         • Temporal Reasoning (date math): 400th anniversary in May 1898 ↔ 27th May 1498
         • Statistical Paraphrasing (paraphrases): arrival in ↔ landed in
         • GeoSpatial Reasoning (geo-KB): India ↔ Kappad Beach
         • Stronger evidence can be much harder to find and score. The evidence is still not 100% certain.
         • Search far and wide
         • Explore many hypotheses
         • Find and judge evidence
         • Many inference algorithms
    17. The Missing Link
         • Clue: On hearing of the discovery of George Mallory's body, he told reporters he still thinks he was first.
             • Missing link: Mt Everest (“he was first”)
             • Answer: Edmund Hillary
         • Clue: TV remote controls, Shirts, Telephones
             • Missing link: Buttons
    18. DeepQA: The Technology Behind Watson
        Massively Parallel Probabilistic Evidence-Based Architecture
        Generates and scores many hypotheses using a combination of thousands of Natural Language Processing, Information Retrieval, Machine Learning and Reasoning algorithms. These gather, evaluate, weigh and balance different types of evidence to deliver the answer with the best support it can find.
        [Diagram: Question → Question & Topic Analysis → Question Decomposition → Hypothesis Generation (Primary Search over Answer Sources, Candidate Answer Generation) → Hypothesis and Evidence Scoring (Answer Scoring, Evidence Retrieval from Evidence Sources, Deep Evidence Scoring) → Synthesis → Final Confidence Merging & Ranking → Answer & Confidence. Learned Models help combine and weigh the Evidence. Callouts: Multiple Interpretations; 100’s sources; 100’s Possible Answers; 1000’s of Pieces of Evidence; 100,000’s Scores from many Deep Analysis Algorithms; Balance & Combine.]
    19. DeepQA: Incremental Progress in Answering Precision on the Jeopardy Challenge: 6/2007-11/2010
        [Chart: IBM Watson Playing in the Winners Cloud. Plotted versions: Baseline (12/06), v0.1 (12/07), v0.2 (05/08), v0.3 (08/08), v0.4 (12/08), v0.5 (05/09), v0.6 (10/09), v0.7 (04/10), v0.8 (11/10).]
    20. Watson was built for Jeopardy!, but can be enhanced and paired with other solutions for future applications (8/2/2011)
         • Early 2011: Watson's current capabilities were constrained for Jeopardy! requirements...
             • English only
             • A single questioner per system instance
             • 3-second response time
             • Static content
             • Unstructured text
             • Requires training data – history of questions and answers
         • Future: ...but future Watson enhancements are possible with further development...
             • Multiple, varied users
             • More dynamic content updates
             • More/varied training data
             • Varied response times
             • Additional languages
         • ...and other solutions could ultimately complement Watson capabilities
             • Large amounts of structured data: Cognos, InfoSphere
             • Predictive / statistical capabilities: SPSS
             • Social media analysis: COBRA, Banter
    21. Watson for Healthcare
        Benefits: Providers ... Medical Professionals and Patients
         • Improve quality of care
         • Reduce errors
         • Engage patients
         • Improve audit trails
         • Improve efficiency
         • Better utilize skills
         • Right content, right time
         • Best-practices to point of care
         • Capture value
         • Advance Evidenced-based Care
         • Foster a healthcare analytics ecosystem
         • Capture value
        [Diagram labels: Watson for Healthcare Service (Watson Engine and Scoring and Confidence Models; Evidence Data; Training Data; Relationship Data; Annotators and Algorithms). Public domain, Publishers, Healthcare Providers, ... Private data, Custom algorithms, Custom applications. Customized Watson Solution Appliance; Customized Solution; Watson-enabled Solutions from major healthcare solution providers. Patient Workup, Differential Diagnosis, Second Opinion, ...]
    22. Evidence must be Evaluated for Different Forms
         • Temporal Reasoning
             • Developed for Jeopardy! Has application in healthcare, as the sequence or timing of symptoms may be relevant
         • Geospatial Reasoning
             • Earth geography algorithms can be reworked for the human body (“the pain started in my fingertips and progressed up my left arm”)
         • Statistical Paraphrasing
             • New algorithms required to, for example, map between medical terminology and lay terms.
        TEMPORAL REASONING EXAMPLE
        Typical influenza in adults is characterized by sudden onset of chills, fever, prostration, cough, and generalized aches and pains (especially in the back and legs). Headache is prominent, often with photophobia and retrobulbar aching. Respiratory symptoms may be mild at first, with scratchy sore throat, substernal burning, nonproductive cough, and sometimes coryza. Later, lower respiratory tract illness becomes dominant; cough can be persistent, raspy, and productive. GI symptoms may occur and appear to be more common with the 2009 pandemic H1N1 strain. Children may have prominent nausea, vomiting, or abdominal pain, and infants may present with a sepsis-like syndrome. After 2 to 3 days, acute symptoms rapidly subside, although fever may last up to 5 days. Cough, weakness, sweating, and fatigue may persist for several days or occasionally for weeks.
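The idea of reusing geographic containment reasoning for the body can be sketched with one small "located in" lookup. This is a toy illustration: the `PART_OF` table and `is_within` function are hypothetical, the table is hand-written, and it is not a real geographic or anatomical knowledge base.

```python
# Hand-written "is located in" links, for illustration only.
PART_OF = {
    "Kappad Beach": "India",        # geography, as in the Jeopardy! example
    "left fingertips": "left hand", # body regions, as in the slide's pain example
    "left hand": "left arm",
}


def is_within(place: str, region: str, part_of: dict = PART_OF) -> bool:
    """Follow 'located in' links upward from place until region is found."""
    seen = set()
    current = place
    while current is not None and current not in seen:
        if current == region:
            return True
        seen.add(current)          # guard against accidental cycles in the table
        current = part_of.get(current)
    return False


print(is_within("Kappad Beach", "India"))        # True
print(is_within("left fingertips", "left arm"))  # True
print(is_within("left arm", "left fingertips"))  # False
```

The same traversal serves both domains; only the knowledge base changes, which is the point the slide makes about reworking geography algorithms.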
    23. Why is Watson Technology ideal for Healthcare?
         • Interprets and understands natural language questions: Understands ambiguous and imprecise questions using sophisticated natural language algorithms
         • Analyzes large volumes of unstructured data: Synthesizes broad domain of unstructured data from a variety of selected public, licensed and private sources
         • Quantifies degrees of confidence in potential answers: Generates hypotheses and ranks degrees of confidence in a range of potential answers based on evidence
         • Supports iterative dialogue to refine results: Internal iterative and interactive question and answering to refine and improve results
         • Adapts and learns to improve results over time: Learns from additional evidence, additional questions and mistakes to improve accuracy over time
        Examples shown on the slide:
         • What condition has red eye, pain, inflammation, blurred vision, floating spots and sensitivity to light?
         • Physician Notes, Medical Journals, Pathology results, Clinical Trials, Wikipedia, etc.
         • Family History, Physical Exam, Current Medications, etc.
         • New Clinical Recommendations, New Drugs, Approved use of Drugs, etc.
         • Uveitis 91%, Iritis 48%, Keratitis 29%
        Source: IBM Research, MI, SCIP, BCG analysis
    24. A Range of Watson-enabled Healthcare Solutions
        [Diagram labels: Patient; Caregiver…Nurse…Physician Assistant; Clinician; Specialty Diagnosis & Treatment Options; Longitudinal Patient Electronic Health Information; Specialty Research; Genomic-based Analysis; Coding Analysis & Automation; Caregiver Education; Consumer Portal; Patient Workup; Differential Diagnosis; Treatment Options; Patient Inquiry; On-going Treatment; Treatment Protocol Analysis; Treatment Authorization; Population Analysis & Care Mgmt; Second Opinion; Care Consideration Automation]
    25. Key Elements of the Clinical Diagnostic Reasoning Process
        Source: Bowen J. N Engl J Med 2006;355:2217-2225
    26. Graber, et al. Diagnostic Error in Internal Medicine, Arch Int Med 2005; 165:1493-1499
    27. Role of Electronic Systems in Improving Diagnosis
         • Filtering, organizing, and providing access to information … thoroughness in gathering the patient's history, findings from the physical examination, and other data. … The problem of having too much information is now surpassing that of having too little, and it will become increasingly difficult to review all the patient information that is electronically available.
         • Serving as a place where clinicians, together with patients, document succinct evaluations, craft thoughtful differential diagnoses, and note unanswered questions. Free-text narrative will often be superior to point-and-click boilerplate …
        Source: Can Electronic Clinical Documentation Help Prevent Diagnostic Errors? Gordon D. Schiff, M.D., and David W. Bates, M.D. N Engl J Med 2010; 362:1066-1069
    28. Role of Electronic Systems in Improving Diagnosis
         • A better approach to managing problem lists is needed. The failure to effectively integrate the creation, updating, reorganization, and inactivation of items on problem lists into the clinician's workflow has been one of the great failures of clinical informatics. … allowing specific providers (for instance, specialists or nonphysician staff members) to work selectively with a subset of problems are necessary features …
         • Electronic systems should incorporate checklist prompts to make sure that key questions are asked and relevant diagnoses considered. … diagnostic checklists have so far been neither clinically helpful nor widely used. Yet, human memory alone cannot guarantee that key questions will be asked and important diagnoses considered and accurately weighed. Decision-support software and predictive models have also had limited use to date, but both could become important if their design were more practical and evidence-based (if, for example, they automatically generated differential diagnoses that facilitated both documentation and decision making).
        Source: Can Electronic Clinical Documentation Help Prevent Diagnostic Errors? Gordon D. Schiff, M.D., and David W. Bates, M.D. N Engl J Med 2010; 362:1066-1069
    29. Leveraging Electronic Clinical Documentation to Decrease Diagnostic Error Rates
        Role for Electronic Documentation: Goals and Features of Redesigned Systems
         • Providing access to information: Ensure ease, speed, and selectivity of information searches; aid cognition through aggregation, trending, contextual relevance, and minimizing of superfluous data.
         • Recording and sharing assessments: Provide a space for recording thoughtful, succinct assessments, differential diagnoses, contingencies, and unanswered questions.
         • Providing prompts: Provide checklists to minimize reliance on memory and directed questioning to aid in diagnostic thoroughness and problem solving.
         • Providing access to information sources: Provide instant access to knowledge resources through context-specific “infobuttons” triggered by keywords in notes that link user to relevant textbooks and guidelines.
        Source: Can Electronic Clinical Documentation Help Prevent Diagnostic Errors? Gordon D. Schiff, M.D., and David W. Bates, M.D. N Engl J Med 2010; 362:1066-1069
    30. Watson’s Reasoning
         • “Shallower” reasoning over large volumes of data and presenting alternatives to clinicians for the final decisions
         • Casts a wide net
             • Considers a large amount of data
                 • EMR
                 • Literature
             • Unbiased
             • Learns
         • Not limited by a database structure
         • Watson defers judgment until it has considered many possibilities
    31. Watson’s Reasoning
         • Hits sweet spot of human judgment
             • Problems with bias
             • Difficulty processing large arrays of evidence
         • Health care is inherently “uncertain.” Watson does not make a diagnosis. It provides evidence-based information to help the clinician make an informed decision.
         • Identifies missing information: knows what additional case input information could have improved the confidence in the output analysis
         • Watson’s interactive process helps the clinician vector in on the appropriate decisions
    32. [email_address]
    33. Convergence of Market and Technical Forces Create an Opportunity to Leverage Watson for Healthcare
        [Diagram labels: Demand for Healthcare Reform; Spiraling Costs; Medical Information Overload; Quality and Safety Concerns; Aging Population. Advances in Analytics; Advances in Mobile Computing; Watson / DeepQA platform. Partner Ecosystem: Pharma; Medical Research Centers; Information Providers; Private, Public Insurers; Medical Devices; Providers; Public Health. Next Generation Solutions for Healthcare; Watson for Healthcare.]
    34. Watson: How It Works
        Runtime Pipeline (WaaS API)
         • Input CASes: Question and Answer Key (training/testing questions with medically vetted answers)
         • Question Analysis, Context Analysis: A variety of NLP algorithms analyze the question and the context to attempt to figure out what is being asked (named entities, relations, LAT detection, question class)
         • Query Builder: Build an abstract query from question analysis
         • Primary Search: Retrieve content related to the question using index search on documents, passages and structured repositories
         • Search Result Processing, Candidate Answer Generation: From retrieved content, extract the words or phrases that could be possible answers
         • Context-Independent Answer Scoring: Initial scoring of candidate answers independent of supporting passages
         • Candidate Answer Filtering: Removes candidates that should not proceed to remaining phases
         • Supporting Evidence Retrieval: For each candidate answer, retrieve more content that relates that answer to the question
         • Context-Dependent Answer Scoring: Many algorithms attempt to determine the degree to which the retrieved evidence supports the candidate answers
         • Final Merger: Consider all the scored evidence to produce a final ranked list of answers with confidence
         • Output CASes: Answers & Confidence
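The shape of this runtime pipeline, a shared analysis structure passed through an ordered list of stages, can be sketched as below. This is a deliberately tiny stand-in and all of it is assumed for illustration: the `CAS` is a plain dict rather than a UIMA CAS, several of the slide's stages are collapsed or omitted, each stage is a toy heuristic, and the two-document corpus is written for this example around the Stephanie Kwolek clue from the deck.

```python
from typing import Any, Callable, Dict, List

CAS = Dict[str, Any]            # stand-in for a UIMA CAS; a plain dict here
Stage = Callable[[CAS], CAS]


def question_analysis(cas: CAS) -> CAS:
    # Toy analysis: lower-cased words longer than three characters.
    words = (w.strip(".,?").lower() for w in cas["question"].split())
    cas["terms"] = sorted({w for w in words if len(w) > 3})
    return cas


def primary_search(cas: CAS) -> CAS:
    # Keep documents that mention at least one question term.
    cas["hits"] = [d for d in cas["corpus"]
                   if any(t in d["text"].lower() for t in cas["terms"])]
    return cas


def candidate_generation(cas: CAS) -> CAS:
    # Shortcut: treat each hit's title as a candidate answer.
    cas["candidates"] = [d["title"] for d in cas["hits"]]
    return cas


def answer_scoring(cas: CAS) -> CAS:
    # Score = fraction of question terms found in the supporting text.
    terms = cas["terms"]
    cas["scores"] = {d["title"]: sum(t in d["text"].lower() for t in terms) / len(terms)
                     for d in cas["hits"]}
    return cas


def final_merger(cas: CAS) -> CAS:
    # Produce the final ranked list, best score first.
    cas["ranked"] = sorted(cas["scores"].items(), key=lambda kv: kv[1], reverse=True)
    return cas


PIPELINE: List[Stage] = [question_analysis, primary_search, candidate_generation,
                         answer_scoring, final_merger]


def run_pipeline(question: str, corpus: List[dict], stages: List[Stage] = PIPELINE) -> CAS:
    cas: CAS = {"question": question, "corpus": corpus}
    for stage in stages:
        cas = stage(cas)
    return cas


corpus = [
    {"title": "Kevlar", "text": "Kevlar is a polymer fiber invented by Stephanie Kwolek."},
    {"title": "Nylon", "text": "Nylon is a synthetic polymer first produced at DuPont."},
]
result = run_pipeline("This polymer fiber was invented by Stephanie Kwolek.", corpus)
print(result["ranked"][0][0])
```

Because each stage only reads from and writes to the shared structure, stages can be added, swapped or retrained independently, which is the property the slide's stage list suggests.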