Deep Learning in Healthcare

NUS Healthcare AI Datathon and Expo 2018
Hector Yee @eigenhector
Staff Software Engineer, Google AI: Healthcare
Some slide credits: Greg Corrado, Eyal Oren, Kai Chen, Jimbo Wilson, Jack Po

About the speaker
● ML on EHR & medical waveforms
● Image search
● Self-driving car perception
● YouTube personalized video recommendations
● Virtual reality
● Airbnb smart pricing, Shrek 2, Star Wars, C&C games, etc.
● Affiliations - Google

Published a single approach for predicting a variety of events

SOURCE: Landrigan et al., “Temporal Trends in Rates of Patient Harm Resulting from Medical Care.” NEJM 2010;363(22): 2124–34.
1 in 4 patients will have an adverse event in a hospitalization
(e.g. too much medication, cardiac arrest, having to be readmitted).
If ML can predict them, maybe doctors can prevent them.
Opportunity: Care management / decision support

Care management / decision support
Conjecture: Machine Learning can assist in everyday care decision making,
improving care quality, lowering costs, and catching errors.
Example: Smart EMR helps inpatient providers stay on top of care.
Forecast: High likelihood of success, but may take decades.
Challenges:
● Data portability and inconsistency.
● Correlation vs causation.
● Establishing what kind of help providers really need.
● Rigorous evaluation of clinical value and reliability.

THE GOOD
>85% of US hospitals now use EHRs
~150k measurements per hospitalization

THE BAD
● Most of the entered data are never used, even by humans
● Data are idiosyncratic, incomplete, inconsistent.
● Almost all ML models ignore unstructured data like notes
THE RESEARCH GOAL
Show that ML predictions that use more of the EHR can offer more value in supporting care decisions

The data is messy
1. Many different data schemas
2. Many coding systems and codes
3. Many variables, unclear which to use

Problem #1: many different data schemas

Problem #2: many different codes

glucose random urine, pre-hospital blood glucose (mg/dl), serum blood glucose, glucose 1 hour,
glucose-6-phosphate dehydrogenase screen rbc quantitative, action / effect on blood glucose possible side
effects, glucose csf, glucose mg/dl (manual entry), glucose body fluid, blood glucose monitoring, glucose
(manual entry), glucose fasting - external, glucose meter download, glucose water intake (ml), glucose 2
hour, glucose fasting (pregnant) 2 hr gtt, glucose fast'g preg, glucose whole blood, blood glucose meter,
glucose 2 hour - external, glucose plasma (external lab), glucose loading screen, poct glucose, glucose - pre
arrival lab, blood glucose (mg/dl)- manual entry, glucose, estim. avg glucose (eag) (external lab), glucose
fasting, glucose 3 hour (external lab), target glucose values, glucose 60 minute, poct glucose urine, urine
glucose, glucose 1 hour - external, glucose (ua), glucose 120 mins, glucose loading screen - external, glucose
180 mins, glucose non-fasting, mean plasma glucose (external lab), glucose ur
If we wanted to use the patient’s “glucose” in a model,
which of these measurements are synonyms?
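
To make the synonym question concrete, here is a minimal sketch of what a naive substring match does on a handful of the lab names from the glucose list. The subset and the helper name `naive_glucose_match` are chosen only for illustration; this is not how the speaker's pipeline works.

```python
# A few lab names copied from the glucose list (an illustrative subset).
LAB_NAMES = [
    "serum blood glucose",
    "poct glucose",
    "glucose whole blood",
    "glucose random urine",
    "glucose csf",
    "glucose-6-phosphate dehydrogenase screen rbc quantitative",
    "glucose water intake (ml)",
    "target glucose values",
]


def naive_glucose_match(names):
    """Return every name that contains the substring 'glucose'."""
    return [name for name in names if "glucose" in name]


matches = naive_glucose_match(LAB_NAMES)
print(f"{len(matches)} of {len(LAB_NAMES)} names match")
for name in matches:
    print(" -", name)
```

Every name matches, so a substring rule lumps blood glucose together with urine and CSF measurements, an enzyme screen, a fluid-intake field, and a target value. Telling them apart by hand requires a curated mapping per hospital.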

Synonyms are actually a common problem in text
Example Discharge Note
81 year-old female with a history of CAD s/p MI with stent
placed 1 year ago, presenting with abdominal pain radiating
to the back. CT showed multiple stones in a dilated CBD. She
was found to have elevated liver enzymes.
Labs were notable for a leukocytosis to 23, markedly elevated
liver and pancreatic enzymes, but hemodynamically stable
and was afebrile. Placed a stent in her CBD and extracted
multiple stones.
She was given levo and flagyl. She was transferred back to
the floor for monitoring. She is currently afebrile, BP 130/58,
HR 90 satting 98% on 2L by nc.
Variants of “levo” in real notes
levohped, levophase, phenylephrineand, levoqin,
levophenpressors, levothroid, neosynephrine, levocardia,
levofolx, levopment, levoq, levoguin, levocarnitine,
levophedd, levo, levoscoliotic, levoconcave, pressorts,
levoalbuterol, pressorand, levobunolol, levodopa,
phenylephrine, levoquen, levopohed, levopehd,
vasopressinn, levohed, vasopressinf, levohphed, levofl,
levolar, norepinephrine, levofed, dopaminergic, levof,
levora, levophed, levophen, pressores, levophend,
pressorer, phenylephrined, pressore, pressor, pressorxl,
vasopressin, levoconvexity, vasopressors, vasopressor,
levocurvature, pressorss, levop, vasostriction, pressors,
vasopressine, levophedpatient, epinephrine, levoped,
pressory, dopamine, levoflacin, levophe, levophedrine,
levoflxicin, levoscoliosis, levoconvex, vasopressing,
levosoliosis, levoiphed, levotiracetam
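
A rough sketch of why spelling similarity alone does not resolve these variants, using Python's standard `difflib`. The three candidate words come from the list above; the helper `similarity` is a name introduced here. Levophed is a brand name for norepinephrine, while Levothroid is a thyroid drug.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1] from difflib."""
    return SequenceMatcher(None, a, b).ratio()


QUERY = "levophed"
CANDIDATES = ["levohed", "levothroid", "norepinephrine"]

# Rank candidates from most to least similar in spelling.
ranked = sorted(CANDIDATES, key=lambda w: similarity(QUERY, w), reverse=True)
for word in ranked:
    print(f"{word:15s} {similarity(QUERY, word):.3f}")
```

The misspelling "levohed" ranks first, which is helpful, but the unrelated drug "levothroid" outranks "norepinephrine", the true synonym. Spelling distance catches typos but not meaning, which is the gap embeddings are meant to fill.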

Yes, Google is still head-over-heels for word embeddings
SOURCE: Mikolov et al., “Efficient Estimation of Word Representations in Vector Space.” ICLR 2013
[Figure: word-embedding analogy panels: Male - Female, Country - Capital, Disease - Drug]
Disease - Drug pairs shown:
Reflux - Sucralfate
Diabetes - Metformin
Influenza - Ribavirin
Cancer - Cisplatin
Hypertension - Metoprolol
Stroke - Dipyridamole
Hyperlipidemia - Gemfibrozil
Depression - Fluoxetine
Obesity - Orlistat
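
The analogy structure in these panels can be sketched with vector arithmetic. The 3-d vectors below are made up by hand so that each drug sits at its disease plus a shared offset; real word2vec embeddings are learned from text and have hundreds of dimensions. The names `VECTORS`, `cosine`, and `analogy` are illustrative.

```python
import math

# Hypothetical toy vectors: the third coordinate plays the role of a
# shared "disease -> drug" offset. These are not learned embeddings.
VECTORS = {
    "diabetes": [1.0, 0.0, 0.0],
    "metformin": [1.0, 0.0, 1.0],
    "hypertension": [0.0, 1.0, 0.0],
    "metoprolol": [0.0, 1.0, 1.0],
    "depression": [1.0, 1.0, 0.0],
    "fluoxetine": [1.0, 1.0, 1.0],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def analogy(a, b, c):
    """Answer 'a is to b as c is to ?' via the nearest neighbor of b - a + c."""
    target = [vb - va + vc
              for va, vb, vc in zip(VECTORS[a], VECTORS[b], VECTORS[c])]
    candidates = [w for w in VECTORS if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(VECTORS[w], target))


print(analogy("diabetes", "metformin", "hypertension"))
```

With these toy vectors the query returns "metoprolol", mirroring the Hypertension - Metoprolol pair in the figure.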

Nearest neighbors in embeddings from medical text
arrhytmias
dysrhythmias
tachyarrhythmias
bradyarrhytmias
arrhythmia
Neo
neosynephrine
Levophed
Dopa
levo hd
crrt
hemo
cvvhd
esrd
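
One way to see why misspellings and abbreviations end up as neighbors: words used in the same contexts get similar vectors. The sketch below swaps in simple co-occurrence counts for a trained embedding model, and the five note fragments are invented for illustration.

```python
import math
from collections import Counter, defaultdict

# Invented note fragments; real clinical notes are far longer and noisier.
NOTES = [
    "started levophed for hypotension",
    "started levophen for hypotension",
    "started neo for hypotension",
    "started crrt for renal failure",
    "started cvvhd for renal failure",
]


def context_vectors(notes):
    """Map each word to counts of the other words in the same fragment."""
    vectors = defaultdict(Counter)
    for note in notes:
        words = note.split()
        for i, word in enumerate(words):
            for j, other in enumerate(words):
                if i != j:
                    vectors[word][other] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[key] * v[key] for key in u)  # Counter yields 0 if missing
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v)


def nearest(word, vectors, k=2):
    """Return the k words whose context vectors are closest to `word`."""
    others = [w for w in vectors if w != word]
    others.sort(key=lambda w: cosine(vectors[word], vectors[w]), reverse=True)
    return others[:k]


vectors = context_vectors(NOTES)
print(nearest("levophed", vectors))
```

Here "levophen" (a misspelling) and "neo" (a different pressor) come out closest to "levophed" because they appear in identical contexts, while the dialysis terms stay further away.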

How we handle healthcare data in our models
1. Many different data schemas: use FHIR
2. Many coding systems and codes: learn embeddings
3. Many variables, unclear which to use: use all & let the model learn

FHIR as internal data structure
Dataset A, Dataset B, Dataset C, Dataset D → FHIR representation → TensorFlow
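
As a rough illustration of FHIR serving as the common input format, the sketch below flattens one FHIR Observation (as parsed JSON) into string tokens. The field names `code.coding`, `valueQuantity`, and `effectiveDateTime` are standard Observation elements; the token format, the helper name, and the hospital-local code system are assumptions made for this example, not the speaker's actual encoding.

```python
from typing import List


def observation_to_tokens(obs: dict) -> List[str]:
    """Flatten one FHIR Observation dict into string tokens (hypothetical format)."""
    tokens = []
    # Keep each site's own code as-is; no manual mapping to a shared vocabulary.
    for coding in obs.get("code", {}).get("coding", []):
        tokens.append(f"Observation.code:{coding.get('system')}|{coding.get('code')}")
    quantity = obs.get("valueQuantity")
    if quantity is not None:
        tokens.append(f"Observation.value:{quantity.get('value')} {quantity.get('unit')}")
    return tokens


# Hypothetical Observation with a made-up local code system.
OBSERVATION = {
    "resourceType": "Observation",
    "code": {
        "coding": [
            {"system": "http://example.org/hospital-a/labs", "code": "GLU-POCT"}
        ]
    },
    "valueQuantity": {"value": 140, "unit": "mg/dL"},
    "effectiveDateTime": "2018-01-01T08:00:00",
}

print(observation_to_tokens(OBSERVATION))
```

Because the local code is carried through unchanged, a second hospital's differently named glucose code would simply become a different token for the model to learn.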

Single FHIR-based pipeline to train models for 6+ healthcare systems
What past history
should you review
about the patient?
What you need to know
about the patient’s
current state?
What are
opportunities to
intervene?
What are the risks of
future outcomes?
Models that work at hospital 1 don’t work at hospital 2
● Each hospital records data in idiosyncratic data structures and formats
● Models built on a given hospital don’t work with data from a different hospital

Models can be used across health systems
● We map data from multiple sites to a single format based on FHIR
● No manual harmonization of data (data remains in idiosyncratic terminologies)
● Through use of embeddings, data is implicitly harmonized

Sequence models have access to the patient’s entire record
● We temporally sequence all data (tokenized FHIR resources that are embedded) for each patient into a timeline
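
A minimal sketch of the timeline step, assuming each event is already a (timestamp, token) pair. The event data, the token strings, and the function name `build_timeline` are hypothetical. Dropping events after the prediction time is a choice made here to keep future information out of the input.

```python
from datetime import datetime

# Hypothetical events for one patient, deliberately out of order.
EVENTS = [
    ("2018-01-02T09:30:00", "note:abdominal"),
    ("2018-01-01T08:00:00", "lab:glucose"),
    ("2018-01-01T20:00:00", "med:levo"),
]


def build_timeline(events, prediction_time):
    """Sort events by time and keep those at or before the prediction time.

    Returns (hours_before_prediction, token) pairs, oldest first.
    """
    t_pred = datetime.fromisoformat(prediction_time)
    timeline = []
    for stamp, token in sorted(events, key=lambda e: datetime.fromisoformat(e[0])):
        t_event = datetime.fromisoformat(stamp)
        if t_event <= t_pred:
            hours_before = (t_pred - t_event).total_seconds() / 3600.0
            timeline.append((hours_before, token))
    return timeline


timeline = build_timeline(EVENTS, "2018-01-02T08:00:00")
print(timeline)
```

The note written at 09:30 is excluded because it postdates the 08:00 prediction time; the remaining tokens form the ordered sequence a model would consume.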

● Single data structure used for all predictions
● Large variety of input data:
○ Coded diagnoses / procedures
○ Time series of labs, vitals, medications
○ Semi-structured nursing flowsheets
○ Clinical notes
● Wide range of benchmark prediction tasks:
○ 30-day unplanned readmission (a high-cost adverse event)
○ Hospital length-of-stay (for hospital operations)
○ In-hospital mortality (as early warning system)
○ Diagnosis codes (auto-suggest problem list)

More accurate predictions, and sooner too.
● Predict in-hospital mortality
before, at, and after admission
● Baseline: 39 standard risk factors
fed to logistic regression.
0.87 AUC @ +24 hours
● Deep Learning: Eat the FHIR stream
0.90 AUC @ 0 hours
0.93 AUC @ +24 hours
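
For readers unfamiliar with the metric: AUC (area under the ROC curve) equals the probability that a randomly chosen positive case is scored above a randomly chosen negative case. The sketch below computes it by pair counting on made-up labels and scores; it is not the evaluation code behind the numbers above.

```python
def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Made-up example: 2 patients who died in hospital (1) and 3 who did not (0).
LABELS = [1, 1, 0, 0, 0]
SCORES = [0.9, 0.4, 0.5, 0.2, 0.1]

print(round(auc(LABELS, SCORES), 3))
```

Five of the six positive-negative pairs are ordered correctly here, giving about 0.833. An AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking.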

In the paper, “Deep Learning” is an ensemble

Behind the scenes: boosting
● Linear, convex model (boosting) did really well
○ Classic competitive baseline for Deep models
■ Paper used embed + 2 layers + noise for regularization
○ Binarize time sequences
○ Detecting label leaks via direct interpretability
○ Fast to train (~5 minutes per task, interactive)
○ Enables one-vs-all, ICD-9-level feature mining
○ Derived applications
■ Human in the loop model training
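
"Binarize time sequences" can be read as turning each time series into yes/no predicates of the form "feature >= threshold within a time window". The sketch below is one plausible reading under that assumption; the event layout, feature names, and function name `binarize` are hypothetical.

```python
def binarize(events, feature, threshold, max_hours_before):
    """True if any observation of `feature` in the window is >= threshold.

    `events` holds (hours_before_prediction, feature_name, value) tuples.
    """
    return any(
        name == feature and value >= threshold and hours_before <= max_hours_before
        for hours_before, name, value in events
    )


# Hypothetical events for one patient.
EVENTS = [
    (30.0, "heart_rate", 125.0),
    (6.0, "heart_rate", 92.0),
    (2.0, "lactate", 4.1),
]

print(binarize(EVENTS, "heart_rate", 120.0, 48.0))  # high reading 30 h ago
print(binarize(EVENTS, "heart_rate", 120.0, 24.0))  # none in the last 24 h
print(binarize(EVENTS, "lactate", 4.0, 24.0))
```

Each predicate is a single binary feature, so a linear model over them stays readable: every weight attaches to one human-checkable statement about the record.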

Human In the Loop training w/ Kai C., Jimbo W., Alvin R.

Session log Interactive (one round)
INFO: Reading initial training data. Please wait.
INFO: Training done after Iteration 0. New model:
--- Old Predicates ---
--- New Predicates ---
[ 0, Y, -0.0232] #:Composition.section.text.div.tokenized nicu >= 2
[ 1, Y, 0.0235] #:[Base excess in Blood by calculation] >= 2
[ 2, Y, 0.0234] Context age_in_years >= 60.000000 @ t <= 1.000000
[ 12, Y, -0.0234] E:Composition.section.text.div.tokenized neonatology
[ 19, Y, -0.0212] E:Context Patient.gendermale
[ 31, Y, 0.0233] E:[Erythrocytes [#/volume] in Blood by Automated count]
[ 32, Y, 0.0232] E:[Hemoglobin [Mass/volume] in Blood]
[ 40, Y, -2.3079] TRUE
New Model Test Score: 0.731937, Rules: 41
BOOST> d 2 3 3 4 5 6 9 10 13 15 1 18 19 20 1 21 22 23 24 25 26 30 31 32 33 34 35 36 37 38 39
BOOST> score
Test Score: 0.685877
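
The log suggests a model made of weighted predicates plus an always-true bias term, with a `d` command that appears to drop predicates by index. The sketch below assumes a logistic form (sigmoid of the summed weights), which the slides do not confirm. The weights are copied from the log; the short predicate descriptions and both function names are illustrative.

```python
import math

# index -> (weight, description). Weights are from the session log;
# the descriptions are paraphrases for readability.
RULES = {
    0: (-0.0232, "note tokens: nicu >= 2"),
    2: (0.0234, "age_in_years >= 60"),
    40: (-2.3079, "TRUE (bias)"),
}


def predict(rules, active):
    """Sigmoid of the summed weights of the rules that fire (assumed form)."""
    logit = sum(weight for idx, (weight, _) in rules.items() if idx in active)
    return 1.0 / (1.0 + math.exp(-logit))


def delete_rules(rules, indices):
    """Return a copy of the model without the given rule indices."""
    return {idx: rule for idx, rule in rules.items() if idx not in indices}


# A patient aged 60+ with no NICU mentions: rule 2 and the bias fire.
print(predict(RULES, active={2, 40}))

# A human reviewer removes rule 0, as the `d` command seems to do.
smaller = delete_rules(RULES, {0})
print(sorted(smaller))
```

Under this reading, the bias alone gives a baseline risk of roughly 9%, and each remaining predicate nudges it slightly. Deleting rules shrinks the model at some cost in test score, as the drop from 0.731937 to 0.685877 in the log shows.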

Human in the Loop interactive model building
Is Human + AI > AI?
● Human models are smaller
● More trusted
○ E.g. machine models might use “echocardiogram”
Validation AUC vs full dataset iterations

My latest projects: Medical Waveforms
Neurology, Cardiology, Anesthesia, Intensive Care, Consumers
● First-line diagnostics by providers
● Increasingly available to patients directly
● Data collected with high precision (>200Hz) and then dropped on the floor.
● Waveform IT systems and processes are typically much less integrated into EHRs (thus much easier to change).

Thanks!
On behalf of everyone
at Google Research
