How to predict ICU mortality with
digital health data
Max Pumperla
Munich, October 11th 2017
COMPANY OVERVIEW
Founded: 2014
Clients: 14 enterprises
GitHub: 3,500 forks, 7,200 stars
300,000+ DL4J downloads/mo.
Team: ~35; mostly engineers; 6 PhDs
Deeplearning4j
Build, train, and deploy neural
networks on the JVM
ND4J
High-performance linear algebra
libraries for CPU and GPU
DataVec
Data ingestion, normalization, and
vectorization
HEALTHCARE AT SKYMIND
● Part of “Healthy China 2030” initiative
○ Expand health service industry to $2.35 trillion
○ Pilot project with 3600 hospitals in Fuzhou
○ 20 leading industry partners around CEC
○ Fatty liver disease detection
○ Bone fracture detection
○ Other use cases to come
● Intensive care unit (ICU) mortality prediction
DIGITAL HEALTH DATA ADOPTION
Randall Wetzel, Virtual PICU, Children’s Hospital LA
“The patient is data” The patient!
DATA RECORDED IN EHR IN ICU
● Patient-level info (e.g., age, gender)
● Physiologic measurements (e.g., heart rate)
● Lab results (e.g., glucose)
● Clinical assessments (e.g., Glasgow Coma Scale)
● Medications and treatments (one treatment:
mechanical ventilation)
● Clinical notes
● Diagnoses
● Outcomes (one outcome: mortality)
PHYSIONET DATA
● Data from 12,000 ICU stays, published in 2012
○ Only 4,000 labeled records publicly available
○ 4,000 unlabeled records used for tuning during the
competition (not used here)
○ 4,000 test examples not available
● Binary outcome: in-hospital survival or mortality
(~13% mortality)
● Sequences vary in length from hours to weeks
● Observations begin at time of admission, not at
onset of illness
General descriptors
• ID
• age
• gender
• weight
• height
• ICU type (cardiac,
medical, surgical, trauma)
Time-series data
• blood pressure
• blood glucose
• etc.
PHYSIONET CHALLENGE
Goal: Advances toward accurate
patient-specific predictive models
Task: Predict mortality from first 48
hours of data
Challenges:
• Long-term dependencies
• Temporal outcome
• Class imbalances
• Measuring treatment effects
• Missing data
• Irregularly sampled data
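
To make the class-imbalance challenge concrete: with roughly 13% mortality, a model that always predicts survival is already about 87% accurate, so accuracy alone says little. The sketch below assumes inverse-frequency class weights as one common remedy; the slides do not state which remedy was actually used, and the function names and labels are illustrative only.

```python
def majority_baseline_accuracy(positive_rate):
    """Accuracy of a classifier that always predicts the majority class."""
    return max(positive_rate, 1.0 - positive_rate)


def inverse_frequency_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    n, k = len(labels), len(counts)
    return {cls: n / (k * c) for cls, c in counts.items()}


# Illustrative labels only: 13 deaths (1) among 100 stays, mirroring ~13% mortality.
labels = [1] * 13 + [0] * 87
weights = inverse_frequency_weights(labels)
baseline = majority_baseline_accuracy(0.13)
```

With these weights, errors on the rare mortality class count more in the loss than errors on the survival class.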
PREPROCESSING & DATA IMPUTATION
1. Flatten data: (timestamp, variable name, value)
2. One-hot encoding of categorical variables
3. Rescaling to range [0,1] of real-valued variables
4. Missing values:
a. Carry forward imputation (suggested here)
b. “Missing” flag to indicate imputation
5. Replicate descriptors across time
6. Do not resample to fixed timestamps
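
The steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the pipeline used in the talk: the variable names, the minute-based timestamps, the sample values, and the choice of 0.0 as the value before a variable's first measurement are all hypothetical.

```python
def one_hot(value, categories):
    """Step 2: encode a categorical value as a 0/1 vector."""
    return [1.0 if value == c else 0.0 for c in categories]


def rescale(value, lo, hi):
    """Step 3: min-max rescale a real value into [0, 1]."""
    if hi == lo:
        return 0.0
    return (value - lo) / (hi - lo)


def carry_forward(records, variables):
    """Step 4: build one row per observed timestamp.

    records: flattened (timestamp, variable name, value) triples (step 1).
    Each row holds, per variable, the last seen value plus a flag that is
    1.0 when the value was imputed rather than measured at that timestamp.
    Timestamps are kept as observed, not resampled (step 6).
    """
    by_time = {}
    for t, name, value in records:
        by_time.setdefault(t, {})[name] = value

    last = {v: 0.0 for v in variables}  # assumption: 0.0 before first measurement
    rows = []
    for t in sorted(by_time):
        row = []
        for v in variables:
            observed = v in by_time[t]
            if observed:
                last[v] = by_time[t][v]
            row.extend([last[v], 0.0 if observed else 1.0])
        rows.append((t, row))
    return rows


# Hypothetical, already rescaled measurements for one ICU stay (minutes since admission).
records = [(0, "HR", 0.50), (0, "Glucose", 0.30), (60, "HR", 0.55), (150, "Glucose", 0.40)]
rows = carry_forward(records, ["HR", "Glucose"])

# Step 5: replicate static descriptors (here: one-hot ICU type) across time.
icu = one_hot("medical", ["cardiac", "medical", "surgical", "trauma"])
sequence = [row + icu for _, row in rows]
```

Each element of `sequence` is one time step: value and missing flag per variable, followed by the replicated descriptors.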
RECURRENT NEURAL NETWORKS
● Recurrent neural networks (RNNs) are great for modeling temporal
structure in data
● Very flexible in input & output data
• Predict next word (language
modeling)
• Translate from English to
French (machine translation)
• Predict patient outcome from
sequence (today)
• Generate beer review from
category, score
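
The "patient outcome from sequence" case is a many-to-one setup: the network reads a variable-length sequence and emits a single probability. Below is a minimal sketch of a plain (Elman) RNN forward pass in pure Python. The weights are untrained, hypothetical values, and this is not the model from the talk, which uses LSTMs in DL4J.

```python
import math


def rnn_predict(sequence, W_x, W_h, b_h, w_out, b_out):
    """Run a simple (Elman) RNN over a sequence; return one probability.

    sequence: list of feature vectors, one per time step (any length).
    W_x: hidden x input weights, W_h: hidden x hidden weights.
    """
    h = [0.0] * len(b_h)
    for x in sequence:
        # The new hidden state depends on the current input and the previous state.
        h = [
            math.tanh(
                sum(W_x[i][j] * x[j] for j in range(len(x)))
                + sum(W_h[i][k] * h[k] for k in range(len(h)))
                + b_h[i]
            )
            for i in range(len(b_h))
        ]
    # Only the final hidden state feeds the sigmoid output.
    logit = sum(w * h_i for w, h_i in zip(w_out, h)) + b_out
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical, untrained weights: 2 input features, 2 hidden units.
W_x = [[0.5, -0.3], [0.1, 0.8]]
W_h = [[0.2, 0.0], [0.0, 0.2]]
p_short = rnn_predict([[0.5, 0.0]], W_x, W_h, [0.0, 0.0], [1.0, -1.0], 0.0)
p_long = rnn_predict([[0.5, 0.0], [0.55, 1.0], [0.6, 0.0]], W_x, W_h, [0.0, 0.0], [1.0, -1.0], 0.0)
```

The same function handles sequences of different lengths, which is what makes the approach a fit for ICU stays lasting from hours to weeks.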
RNNS FOR TIME-SERIES DATA
BUILDING A MODEL WITH DL4J
● Using long short-term memory networks (LSTMs,
Hochreiter & Schmidhuber, 1997)
● Training done with mini-batch stochastic gradient
descent
source: http://colah.github.io
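
For readers without the diagram, one LSTM time step can be written out directly. This is a sketch of the standard LSTM cell equations in pure Python, not DL4J code. The parameter layout (a dict mapping gate name to a `(W, U, b)` tuple) is an assumption made for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, U, b, x, h):
    """Compute W x + U h + b for one gate."""
    return [
        sum(W[i][j] * x[j] for j in range(len(x)))
        + sum(U[i][k] * h[k] for k in range(len(h)))
        + b[i]
        for i in range(len(b))
    ]


def lstm_step(x, h_prev, c_prev, params):
    """One LSTM time step. params maps gate name -> (W, U, b)."""
    i = [sigmoid(z) for z in affine(*params["input"], x, h_prev)]
    f = [sigmoid(z) for z in affine(*params["forget"], x, h_prev)]
    o = [sigmoid(z) for z in affine(*params["output"], x, h_prev)]
    g = [math.tanh(z) for z in affine(*params["candidate"], x, h_prev)]
    # Cell state: keep part of the old memory, add part of the new candidate.
    c = [f_k * c_k + i_k * g_k for f_k, c_k, i_k, g_k in zip(f, c_prev, i, g)]
    # Hidden state: gated view of the squashed cell state.
    h = [o_k * math.tanh(c_k) for o_k, c_k in zip(o, c)]
    return h, c


# Hypothetical all-zero parameters for a 1-feature, 1-unit cell.
zero_gate = ([[0.0]], [[0.0]], [0.0])
params = {name: zero_gate for name in ("input", "forget", "output", "candidate")}
h, c = lstm_step([0.0], [0.0], [1.0], params)
```

The additive update of the cell state `c` is what lets gradients survive over long sequences, which addresses the long-term dependency challenge listed earlier.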
DISTRIBUTED TRAINING WITH SPARK
● DL4J scale-out module using Spark
● Data-parallel training procedure
● Parameter averaging on master
● Deployment on CDH cluster (Spark on
YARN)
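
The parameter-averaging idea can be shown without a cluster. The sketch below simulates the workers serially in plain Python on a toy one-parameter model. It does not use Spark or the DL4J API; the data, learning rate, and function names are assumptions for illustration.

```python
def sgd_step(w, batch, lr):
    """One SGD step for least squares on a 1-D linear model y = w * x."""
    grad = sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)
    return w - lr * grad


def parameter_averaging_round(w, shards, lr, local_steps):
    """Each worker trains on its own shard from the same starting
    parameters, then the master averages the resulting parameters."""
    worker_params = []
    for shard in shards:
        w_local = w
        for _ in range(local_steps):
            w_local = sgd_step(w_local, shard, lr)
        worker_params.append(w_local)
    return sum(worker_params) / len(worker_params)


# Toy data generated from y = 3x, split across two hypothetical workers.
shards = [[(1.0, 3.0), (2.0, 6.0)], [(3.0, 9.0), (4.0, 12.0)]]
w = 0.0
for _ in range(50):
    w = parameter_averaging_round(w, shards, lr=0.01, local_steps=2)
```

After the rounds, `w` ends up close to 3, the slope used to generate the toy data. In a real data-parallel run the inner loop over shards executes concurrently on the workers, and only the averaging step happens on the master.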
EVALUATION & EXPERIMENTAL
RESULTS
REFERENCES
● Blog article on the Cloudera blog
● O’Reilly AI course on deep learning for time
series
● Strata NYC 2017 Tutorial: Securely building
deep learning models for digital health data
● Machine Learning for Healthcare
Conference: http://mucmd.org/
● Lipton, et al.: A Critical Review of RNNs
● Harutyunyan, et al. Multitask Learning and
Benchmarking with Clinical Time Series
Data.
max@skymind.io
twitter: maxpumperla
github: maxpumperla

GTC Europe 2017 - How to predict ICU mortality with digital health data
