"Toward Generating Domain-specific / Personalized Problem Lists from Electronic Medical Records"

Ching-huei Tsou, senior software engineer in the Watson Algorithms group from IBM Watson Research, presented this at the Cognitive Systems Institute Speaker Series on April 14, 2016.

Published in: Technology

  1. Generating Problem-Oriented Summary from Electronic Medical Records. IBM Watson Health. Ching-Huei Tsou, April 14, 2016
  2. Watson’s post-Jeopardy challenge: Healthcare
     • Our first domain of exploration is medical decision support, because of its mature, complex, and meaningful problem-solving nature
     • After Watson’s win on Jeopardy!, people outside the computer science community assumed that anything that could be phrased as a question could be correctly answered by Watson: “Watson, given my medical record <insert hundreds of pages of structured and unstructured data here>, what’s wrong with me?”
  3. Toward a Clinical Decision Support System
     • Diagram labels (not to scale): Filing System; Summarization; Multi-Step Reasoning; Clinical Knowledge QA; EMR Deep QA & Search; EMR Analysis; Medical QA; Reasoning; EMR QA; Clinical Decision Support System
     • From the generated problem-oriented summary, the physician noticed that the patient’s “Creatinine level is high”
     • What are the causes of creatinine elevation? What are the most likely causes for this patient?
     • What has been done to treat his diabetic nephropathy? What else can I try?
  4. Problem-Oriented Summarization
  5. Electronic Medical Records
     • Unstructured data: clinical notes
     • Semi-structured data, e.g. diagnoses, medications, lab test results
     • 100s of notes for a typical patient; 1,000s for older patients with inpatient notes
  6. Promises of Electronic Medical/Health Record
     • Why a Dr. went into medicine
     • Not why a Dr. went into medicine
     • Prevent medical errors
     • Reduce health care costs
     • Increase administrative efficiency
     • Decrease paperwork
     • Expand access to affordable care
  7. Today’s EMR is Broken
     • Digitizing medical records does not reduce physicians’ cognitive load
     • Today’s EMR is largely billing-oriented
     • Billing compliance regulations require that notes stand on their own, which may promote duplication of text
     • Detailed coding = bill = better quality?
     • General coding: 428.0 CHF NOS; 428.9 HF NOS
     • Specific coding: 428.21 Ac Systolic HF; 428.23 Ac-Chr Systolic HF; 428.31 Ac Diastolic HF; 428.33 Ac-Chr Diastolic HF; 428.41 Ac Comb S&D HF; 428.43 Ac-Chr Comb S&D HF
     • Amounts shown on the slide: $29,716; $53,670
  8. How Can EMR Summarization Help a Physician?
     • Consider a physician who is about to see a patient in an outpatient setting
       • Perhaps this is the physician's first encounter with the patient, or
       • It has been a while since the physician last saw this patient
     • Before seeing the patient, the physician may want to know
       • What are the patient’s current problems?
       • When was a problem last discussed / addressed?
       • How is a problem being managed? Current medications? Related lab test results?
     • Most questions are problem-oriented
  9. Problem-Oriented Medical Record (POMR)
     • Diagram: a Summary built around the Problems List, with Medications, Lab tests, Procedures, Vitals, and Clinical Notes & timeline; relation labels: “treated by”, “measured by”, “discussed in”
     • POMR, as originally defined by Dr. Lawrence Weed in the 1960s, is the official record-keeping method in most US hospitals
     • The problem list is also a mandatory section in the CCD (Continuity of Care Document), part of HL7’s CDA (Clinical Document Architecture) standard
     • The key to success? An accurate problem list
  10. The Problem List Challenge
      • Unfortunately, manually maintained problem lists are not accurate
      • Our assessment of existing problem lists against a gold standard indicates the challenge
      • Entered problem list accuracy: Recall (Sensitivity) = 0.55; Precision (Positive Predictive Value) = 0.28
      • Diagram: overlap between ground-truth problems and problems on the entered list, with regions labeled TP, FP, and FN
      • Diagram annotations: resolved problem; acute problem; problems added for billing purposes; rule-out diagnoses; patient’s pre-existing problems; no time to update the list
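The recall and precision figures on slide 10 compare an entered problem list with a gold-standard list. The sketch below shows how such figures can be computed from two sets of problems. The function name and the example lists are made up for illustration and are not taken from the study data; it also assumes problems have already been normalized to comparable strings.

```python
def precision_recall(entered, ground_truth):
    """Compare an entered problem list with a gold-standard list.

    Both arguments are iterables of normalized problem names (an assumption;
    the slides do not say how problems were matched).
    """
    entered, ground_truth = set(entered), set(ground_truth)
    tp = len(entered & ground_truth)   # on both lists
    fp = len(entered - ground_truth)   # entered but not in the gold standard
    fn = len(ground_truth - entered)   # in the gold standard but never entered
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Made-up example lists, for illustration only.
entered = ["hypertension", "bronchitis", "chest pain", "diabetes"]
truth = ["hypertension", "diabetes", "ckd", "obesity"]
precision, recall = precision_recall(entered, truth)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy case two entered items are false positives (a resolved problem and a symptom) and two true problems are missing, which mirrors the kinds of errors the slide's diagram annotates.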
  11. Automated Problem List Generation
  12. Problem List Definition
      • CMS (Centers for Medicare & Medicaid Services) Meaningful Use Stage 1
      • Problem List: a list of current and active diagnoses as well as past diagnoses relevant to the current care of the patient
  13. Problem-List Ground-Truth Annotation Guidelines
      What to include:
        1. Chronic diseases such as diabetes, hypertension, hyperlipidemia, etc.
        2. History of cardiovascular events such as CVA, MI, DVT, PE.
        3. Non-injury-related musculoskeletal conditions such as degenerative disc disease, osteoarthritis, osteoporosis, and rickets.
        4. History of drug or alcohol ABUSE.
        5. All psychiatric diagnoses.
        6. Obesity and obesity-related problems such as sleep apnea, fatty liver disease, etc.
        7. Resolved problems of high importance such as recurrent PNA, anemia, etc.
        8. Complications from other disease processes, such as diabetic neuropathy, CKD from hypertension, etc.
        9. Malignant neoplasms (or history of) regardless of patient status, and any benign neoplasms that need to be monitored.
      What should NOT be included:
        1. All injuries.
        2. Resolved problems of low importance, or those which have been corrected by surgery (bronchitis, pneumonia, cholecystitis with cholecystectomy, hernia that has been surgically corrected, appendicitis with appendectomy, etc.).
        3. Most dermatologic conditions, including warts and transient skin rashes of low importance that are resolved. The only exception is acne, which is included regardless of severity.
        4. Signs or symptoms of disease: chest pain, headache, abdominal pain, epistaxis, hematuria, etc. Usually these will have some corresponding diagnosis. If not, the sign or symptom is not included. The only exception is lumbago, which IS included because of its usual chronicity.
        5. Severity of disease, as this tends to wax and wane in many chronic problems.
        6. Cause of death or anything from an autopsy report.
      Where to take information from:
        1. Any clinical note, operative report, telephone encounter, etc., where a specific diagnosis is discussed.
        2. Do not make inferences. That is, if a note says fasting glucose of 156, leave the diagnosis off unless the note explicitly says this patient has diabetes.
        3. Words like "probable" or "suspected" before a given diagnosis are situation dependent. Sometimes a later note will confirm or refute that diagnosis.
      Tips:
        1. Remember that notes have places for allergies, past surgeries, procedures, etc., so leaving things off a problem list doesn’t mean the information isn’t available.
        2. Try to make the diagnosis as concise as possible; abbreviations are acceptable.
        3. If you’re unsure, include it; it can always be removed during adjudication.
      Guidelines are subject to interpretation, and extensive domain knowledge is required.
  14. EMRA Problem List Generation
      • Candidate Generation: find everything that looks like a disorder in the clinical notes
      • Scoring & Ranking: look for contextual information and supporting evidence
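The two stages named on slide 14 can be sketched as a small generate-then-rank pipeline. This is only an illustration of the idea, not the EMRA implementation: the mini-lexicon, the `treats` lookup, the function names, and the equal weights are all assumptions, and a real system would match against a medical terminology rather than single lowercase tokens.

```python
import re
from collections import Counter

# Hypothetical mini-lexicon standing in for a medical terminology.
DISORDER_TERMS = {"hypertension", "diabetes", "bronchitis"}


def generate_candidates(notes):
    """Stage 1 (recall first): count every lexicon term mentioned in any note."""
    counts = Counter()
    for note in notes:
        for token in re.findall(r"[a-z]+", note.lower()):
            if token in DISORDER_TERMS:
                counts[token] += 1
    return counts


def score(term, mentions, medications, treats):
    """Stage 2: combine supporting evidence; the weights are illustrative."""
    frequency = min(mentions / 3.0, 1.0)  # capped mention count
    has_med = any(treats.get(med) == term for med in medications)
    return 0.5 * frequency + 0.5 * (1.0 if has_med else 0.0)


def rank_problems(notes, medications, treats):
    """Return (term, score) pairs, highest score first."""
    counts = generate_candidates(notes)
    scored = {t: score(t, n, medications, treats) for t, n in counts.items()}
    return sorted(scored.items(), key=lambda kv: kv[1], reverse=True)


# Made-up notes and medication list, for illustration only.
notes = [
    "Assessment and Plan: hypertension, continue lisinopril.",
    "Follow-up for hypertension. Bronchitis last winter, resolved.",
]
treats = {"lisinopril": "hypertension"}
ranking = rank_problems(notes, ["lisinopril"], treats)
print(ranking)
```

Separating the stages lets the first one favor recall (anything that looks like a disorder becomes a candidate) while the second uses context such as repeated mentions and matching medications to push transient findings down the list.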
  15. Clinical Note
  16. (Watson) Annotated Clinical Note / Entity Linking
      • Parsing: sections, paragraphs, part-of-speech
      • Entity linking: recognition, disambiguation
      • Negation detection
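Slide 16 lists negation detection as one annotation step but does not describe how it is done. As a rough illustration, the sketch below uses a simple trigger-window rule in the spirit of NegEx-style approaches; it is a swapped-in technique, not Watson's method, and the trigger list, window size, and function name are assumptions.

```python
import re

# Illustrative trigger phrases; real trigger lists are much longer.
NEGATION_TRIGGERS = ("no", "denies", "without", "negative for")


def is_negated(sentence, term, window=5):
    """Return True if a trigger occurs within `window` tokens before `term`."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    term_tokens = term.lower().split()
    n = len(term_tokens)
    for i in range(len(tokens) - n + 1):
        if tokens[i:i + n] == term_tokens:
            preceding = " ".join(tokens[max(0, i - window):i])
            if any(re.search(rf"\b{re.escape(t)}\b", preceding)
                   for t in NEGATION_TRIGGERS):
                return True
    return False


print(is_negated("Patient denies chest pain.", "chest pain"))
print(is_negated("History of hypertension, no fever.", "hypertension"))
```

A fixed window like this over-negates in sentences such as "no fever but has cough"; practical systems also track where a negation's scope ends.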
  17. Context-aware Computing
      • Given the context, we have no problem reading the sentences above, even though the characters H and A (and B and 13) are identical
  18. Context in EMR
      • Word: Hypertension
      • Sentence: Hypertension: Yes
      • Section: Assessment and Plan, Hypertension: Yes
      • Note: Mentioned in several other notes
      • Medication & Labs: Taking HTN drugs, elevated BP
      • Similar pattern in other EMRs: Other patients with a similar pattern have been diagnosed with hypertension
  19. EMRA Problem List Generation
      • EMRA problem list accuracy: Recall (Sensitivity) = 0.84; Precision (Positive Predictive Value) = 0.52. Recall oriented: F2 = 0.75
      • Entered problem list accuracy: Recall (Sensitivity) = 0.55; Precision (Positive Predictive Value) = 0.28
      • Pipeline diagram stages: EMR; Text Segmentation; Information Extraction; Candidate Generation; Feature Generation; Scoring / Weighting; Grouping
      • Pipeline inputs and intermediate results: Notes; Structured Data (Medications, Orders, Lab, etc.); Clinical Factors Extraction; CUIs of unique Disorders (100s); CUIs of unique Medications (10s), Orders, Lab, etc.; Candidate Problems (10s); Merging and Clustering Closely Related Problems
      • Features: CUI Confidence; Note Section; Term Frequency; Relationship; LSA / DSRD; CUI Path; Meds; Labs
      • Feature score plots (each on a 0 to 1.0 score axis): Confidence Score; Term Frequency Score; LSA Score; Path Pattern Score (e.g. "A may treat B"); Note Section (e.g. PMH); Note Type
  20. EMRA Problem List Generation
      • EMRA problem list accuracy: Recall (Sensitivity) = 0.70; Precision (Positive Predictive Value) = 0.73. Precision oriented: F1 = 0.72
      • Entered problem list accuracy: Recall (Sensitivity) = 0.55; Precision (Positive Predictive Value) = 0.28
      • Pipeline diagram, features, and feature score plots: same labels as on slide 19
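Slides 19 and 20 report F2 and F1 scores alongside precision and recall. Both are instances of the standard F-beta measure, F_beta = (1 + beta^2) * P * R / (beta^2 * P + R), where beta > 1 weights recall more heavily. The sketch below recomputes the scores from the rounded precision and recall values printed on the slides; the function and variable names are illustrative.

```python
def f_beta(precision, recall, beta=1.0):
    """Standard F-beta score; beta > 1 favors recall, beta = 1 is the F1 score."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Rounded values as printed on the slides.
f2_recall_oriented = f_beta(precision=0.52, recall=0.84, beta=2)
f1_precision_oriented = f_beta(precision=0.73, recall=0.70, beta=1)

print(f"F2 (recall oriented):    {f2_recall_oriented:.3f}")
print(f"F1 (precision oriented): {f1_precision_oriented:.3f}")
```

The recall-oriented setting reproduces the slide's F2 of 0.75 after rounding. The precision-oriented F1 comes out slightly below the slide's 0.72 when computed this way, which presumably reflects rounding in the printed precision and recall.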
  21. EMRA in Action
  22. EMR Summarization
      • Watson generates and groups problems by clinical relevance
      • Watson groups medications by clinical relevance
      • Each panel contains answers to a pre-defined question
  23. Context-aware User Interface
      • When a problem is selected:
        • Current and related meds are highlighted
        • Relevant notes are highlighted
        • Labs show elevated glucose and A1C, among others
  24. Is the patient's diabetes well-controlled?
      • What was the patient's last HbA1c? When was it taken?
      • The patient's hemoglobin A1c is shown in red, indicating it is not within the normal range. The patient’s HbA1c has been high except for a single reading in 2013, so the patient's diabetes has NOT been well-controlled.
      • Chart annotations: A1C went down, why? A1C went up, why?
  25. Is the patient's diabetes well-controlled?
      • A1C went down; why? (Endocrinology note on 03/06/2013)
      • A1C went up in the most recent test despite being on Victoza (liraglutide); why? (Endocrinology note on 07/17/2013)
      • EMRA makes it easy to find and bring up relevant notes
  26. Semantic Find
      • Acute problems are normally not counted as problems and don’t show up in the Summarization UI
      • A patient comes in complaining of a hearing problem
        • Has the patient experienced this before?
        • Was the patient started on any treatment?
  27. Quality Assessment
  28. Quality Assessment
      • “I’d consider Watson extremely useful if it can find one important problem that is missed by physician” (Neil Mehta, M.D., Internist, Cleveland Clinic)
  29. Quality Assessment
      • 6 Cleveland Clinic physicians reviewed 15 EMRs to generate their own problem lists, and then compared and rated the problem lists
        • Each physician reviewed 5 EMRs, and each EMR was reviewed by 2 physicians
        • Watson-generated lists were given after the physicians completed their own lists
      • Physicians were asked to rate the Watson-generated problems one by one and as a whole
        • For each problem: Is it correct? Is it on your list? If correct, how important is it?
        • As a whole: rate each list from 1 to 10 (Likert scale)
      • Chart labels: Ground Truth; Physician; Watson; Very Important; Important; Somewhat Important; Not at all Important
  30. Quality Assessment
      • Lists compared: Manually Maintained; Physician Generated; Watson Generated
      • Average rating:
        • Current system: 5.8
        • Watson: 7.4
        • Physician: 8.4
      • The differences are statistically significant (p=0.02)
  31. Quality Assessment
      • Simple linear regression indicates that the most important factor in a higher Watson rating is the “percentage of very important problems that are missed by physician and found by Watson”
      • On average, Watson found 1.2 very important or important problems missed by the physician per EMR (avg. 6 problems)
  32. Specialty Specific and/or Personalized
  33. Type of False-Positive Problems
      • Breakdown: Transient problem 51%; Correct 21%; Redundant problem 11%; Certainty error 5%; System error 4%; Noise 4%; Negation error 3%; Human error 1%
      • Error analysis showed that most of the false positives are “transient problems”
      • Transient problems are true findings or disorders of the patient that are less important to the medical care
        • Minor / self-limited problem
        • Waxing and waning, e.g. seasonal
        • Resolved
      • The definition is somewhat subjective
        • A resolved problem to one physician may be significant past medical history to another physician
  34. Problem List Definition
      • CMS (Centers for Medicare & Medicaid Services) Meaningful Use Stage 1
      • Problem List: a list of current and active diagnoses as well as past diagnoses relevant to the current care of the patient
      • Diagram: every known finding / risk factor / disorder of the patient
      • “Ideal” problem list for a nephrologist
      • The blue list contains too many irrelevant problems
      • “Ideal” problem list from an internist
      • The green list is too specific and not comprehensive
  35. The Problem List Challenge (chart by body system: Cardiovascular, Digestive, Endocrine, Respiratory, Genitourinary)
  36. The Problem List Challenge (chart by body system: Cardiovascular, Digestive, Endocrine, Respiratory, Genitourinary)
  37. Active Learning (Personalized)
  38. Active Learning (Sample Complexity)
      • Plot: F2 measure (axis from 0.5 to 0.8) against number of training EMRs (axis from 0 to 300)
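Slides 37 and 38 name active learning but do not say which query strategy is used. One common choice is uncertainty sampling, where the learner asks for labels on the examples it is least sure about. The sketch below illustrates that strategy only; the function name, the candidate problems, and the model scores are hypothetical.

```python
def select_most_uncertain(pool, predict_proba, batch_size=2):
    """Pick the unlabeled items whose predicted probability is closest to 0.5."""
    ranked = sorted(pool, key=lambda item: abs(predict_proba(item) - 0.5))
    return ranked[:batch_size]


# Hypothetical model scores: probability that a candidate belongs on the list.
model_scores = {
    "hypertension": 0.95,
    "lumbago": 0.55,
    "rash": 0.10,
    "anemia": 0.40,
}
to_label = select_most_uncertain(list(model_scores), model_scores.get, batch_size=2)
print(to_label)
```

In a personalized setting, the selected candidates would be the ones shown to a physician for a yes/no judgment, so each label is spent where the current model is least certain.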
  39. Current Research Direction
      • Learning
        • Today: supervised (batch learning)
        • Work in progress: supervised (active learning)
      • Features
        • Today: knowledge-based features, O(100), selected using ADT tree / boosting
        • Work in progress: features, O(1,000), extracted and selected by DNN (e.g. auto-encoder)
      • Temporal aspect
        • Today: modeled implicitly
        • Work in progress: explicitly clustering multivariate time series
