© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
NLP in Healthcare to Predict Adverse Events with Amazon SageMaker
Garin Kessler
Data Scientist
AWS Machine Learning Solutions Lab
AIM346
Mayank Thakkar
Life Sciences Specialist
AWS Solutions Architecture
Goal
Learn how to apply machine learning methods to predict adverse events from reported patient data
… and much more
Background
• Pharmacovigilance and patient safety programs
• Adverse events and FDA regulations
• FAERS
• Workable data
• Call center recording / summaries
• Emails / faxes
Adverse event detection – The challenge
• Disparate data types
• Unstructured data
• Understanding semantic dispositions
• Synonyms, spelling mistakes
• Sentiment detection
• Categorizing interactions
• Various data sources
• Meeting compliance objectives
• True positives, “sleeping doctor”
• Scale, enormous scale
• Cost efficiency
Machine learning to the rescue
• Improve accuracy and reliability
• Doesn’t replace humans – aids humans
• Offload repetitive work – humans can handle edge cases
• Decrease costs
• Repurpose human workforce for ‘value-adding’ endeavors
• Keep up with ongoing research
• Incorporate published articles at scale
Machine learning – The process
Fetch data -> Clean & format data -> Prepare & transform data -> Train model -> Evaluate model -> Integrate with prod -> Monitor / debug / refresh
Data wrangling
• Set up and manage notebook environments
• Get data to notebooks securely
Experimentation
• Set up and manage clusters
• Scale/distribute ML algorithms
Deployment
• Set up and manage inference clusters
• Manage and auto scale inference APIs
• Testing, versioning, and monitoring
6-18 months
Amazon SageMaker
A managed service that provides one of the quickest and easiest ways for your data scientists and developers to get ML models from idea to production
Introducing Amazon SageMaker
• End-to-end machine learning platform
• Zero setup
• Flexible model training - bring your own deep learning script
• Or your custom algorithm Docker image
• Pay by the second
• One step deployment
• A/B testing
• Low latency, high throughput, high reliability
• Choice of several ML algorithms
• Train faster, in a single pass
Introducing Amazon SageMaker
Choice of several ML algorithms
• XGBoost, FM, and Linear for classification and regression
• K-means and PCA for clustering and dimensionality reduction
• LDA and NTM for topic modeling, seq2seq for translation
• Image classification with convolutional neural networks
Natural language processing methods
• Dataset preprocessing - feature generators
  • Latent Dirichlet Allocation (LDA)
  • Comprehend topic modeling
  • BlazingText word embeddings
• Classification - algorithms utilized
  • K-nearest neighbors
  • Logistic regression
  • XGBoost
  • Amazon SageMaker BlazingText Classifier
  • Deep convolutional neural network running on TensorFlow and Amazon SageMaker
Preprocessing
• Data sources
  • Call center summaries
  • Stored in Amazon Simple Storage Service (Amazon S3)
• Preprocessing
  • Lemmatization with Natural Language Toolkit (NLTK)
  • BlazingText with Amazon SageMaker
Using BlazingText reduced the preprocessing time by 10x
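As a rough illustration of the hand-off from preprocessing to BlazingText, the sketch below formats labelled call center summaries as training lines. It relies on BlazingText's supervised mode reading one example per line, with the label prefixed by `__label__` and followed by space-separated tokens. The function name, the 0/1 label values, the sample summaries, and the simple regex tokenizer are assumptions made for this example; the deck does not show the presenters' actual preprocessing code, and the NLTK lemmatization step is left out here.

```python
import re


def to_blazingtext_line(summary: str, is_adverse_event: bool) -> str:
    """Format one labelled summary as a BlazingText supervised-mode line."""
    # Lowercase and keep alphanumeric tokens only; a fuller pipeline
    # would lemmatize the tokens before writing them out.
    tokens = re.findall(r"[a-z0-9]+", summary.lower())
    # Hypothetical label scheme: 1 = adverse event, 0 = no adverse event.
    label = "__label__1" if is_adverse_event else "__label__0"
    return " ".join([label] + tokens)


# Made-up summaries, for illustration only.
examples = [
    ("Pt reported severe rash after second dose.", True),
    ("Caller asked about refill schedule.", False),
]
lines = [to_blazingtext_line(text, label) for text, label in examples]
print("\n".join(lines))
```

Writing these lines to a text file in Amazon S3 would give a training channel in the shape the built-in algorithm reads.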
Amazon SageMaker tooling
• TensorFlow and Keras
  • “Bring your own model”
  • Convolutional neural network
• Built-in algorithms
• Automatic model tuning
  • Spinning out many jobs simultaneously
• Amazon CloudWatch and TensorBoard
  • Monitoring instances and accuracy metrics
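To make "spinning out many jobs simultaneously" concrete, here is a minimal local sketch of a parallel random search. It is not the Amazon SageMaker tuning API: `train_and_score` is a hypothetical stand-in for one training job, its toy objective is invented for the example, and the hyperparameter names and ranges are assumptions.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def train_and_score(params: dict) -> float:
    """Hypothetical stand-in for one training job; returns a validation score."""
    lr, epochs = params["learning_rate"], params["epochs"]
    # Toy objective that peaks at learning_rate=0.05 and epochs=10.
    return 1.0 - abs(lr - 0.05) * 5 - abs(epochs - 10) * 0.01


# Sample eight candidate configurations from assumed ranges.
rng = random.Random(0)
candidates = [
    {"learning_rate": rng.uniform(0.005, 0.1), "epochs": rng.randint(5, 15)}
    for _ in range(8)
]

# Run up to four "jobs" at a time; map() returns scores in candidate order.
with ThreadPoolExecutor(max_workers=4) as pool:
    scores = list(pool.map(train_and_score, candidates))

best_score, best_params = max(zip(scores, candidates), key=lambda pair: pair[0])
print(best_params, round(best_score, 3))
```

A managed tuner follows the same pattern at larger scale: each candidate becomes its own training job, and the metric each job reports decides which configuration wins.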
Architecture
[Diagram: AWS Cloud, AWS Region, VPC; Availability zones 1 and 2, each with a private subnet, Training, Deployment, an Auto Scaling group, and an Endpoint; storage for raw data and model artifacts, and for production data]
Results by algorithm
FPR = false positive rate, FNR = false negative rate

Feature generator / Classifier      Accuracy  AUC    FPR    FNR    Precision  Recall  Sensitivity  Specificity
LDA (Latent Dirichlet Allocation)
  kNN                               0.775     0.767  0.182  0.288  0.729      0.712   0.712        0.818
  Logistic regression               0.728     0.787  0.277  0.257  0.485      0.743   0.743        0.723
  XGBoost                           0.812     0.905  0.152  0.240  0.774      0.759   0.759        0.848
Comprehend topic modeling
  kNN                               0.759     0.718  0.254  0.189  0.516      0.811   0.811        0.742
  Logistic regression               0.516     0.892  0.395  0.602  0.433      0.398   0.398        0.605
  XGBoost                           0.855     0.936  0.069  0.230  0.908      0.769   0.769        0.931
Amazon SageMaker BlazingText
  BlazingText Classifier            0.979     0.997  0.023  0.020  0.980      0.985   0.985        0.970
Amazon SageMaker Deep CNN           0.978     0.998  0.021  0.020  0.978      0.982   0.982        0.972
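For readers who want to see how the rate columns relate to one another, the sketch below computes them from the four confusion-matrix counts using the standard definitions. The counts are made up for illustration and are not the presenters' data. AUC is left out because it needs the classifier's scores, not only the counts.

```python
def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard binary classification metrics from confusion-matrix counts."""
    recall = tp / (tp + fn)  # recall and sensitivity are the same quantity
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "false_positive_rate": fp / (fp + tn),  # equals 1 - specificity
        "false_negative_rate": fn / (fn + tp),  # equals 1 - recall
        "precision": tp / (tp + fp),
        "recall": recall,
        "sensitivity": recall,
        "specificity": tn / (tn + fp),
    }


# Hypothetical counts for 200 labelled summaries.
m = binary_metrics(tp=80, fp=10, tn=90, fn=20)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

In the adverse event setting, the false negative rate is the share of true adverse events the model misses, which is why it is reported alongside accuracy.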
BlazingText embeddings
overview
• Plot the top 5000 most common terms
• Terms overlap with semantically similar terms
• Models leverage these semantics for computation and performance
• Will look at terms in two sections of the word embedding space
BlazingText embeddings: Zoomed in – Part 1
• Model has learned important familial and patient relationships, including caregivers and reporters
• Robust to typos: Patient, Pateint, Pt
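One way to read "robust to typos" is that the misspelled and abbreviated forms sit close to the correct term in the embedding space. The sketch below shows how such closeness is usually measured, with cosine similarity and a nearest-neighbor lookup. The three-dimensional vectors are invented for the example and are not taken from the trained BlazingText model, whose embeddings have many more dimensions.

```python
import math

# Hypothetical toy vectors, chosen so the "patient" variants point the same way.
embeddings = {
    "patient": [0.90, 0.10, 0.05],
    "pateint": [0.88, 0.12, 0.07],
    "pt": [0.85, 0.15, 0.02],
    "caregiver": [0.60, 0.70, 0.10],
    "rash": [0.05, 0.10, 0.95],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest(term, k=2):
    """Return the k terms most similar to `term`, most similar first."""
    scored = [
        (cosine(embeddings[term], vec), word)
        for word, vec in embeddings.items()
        if word != term
    ]
    return [word for _, word in sorted(scored, reverse=True)[:k]]


print(nearest("patient"))
```

With real embeddings the same lookup would be run over the full vocabulary, and clusters like the ones on this slide appear as groups of mutual nearest neighbors.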
BlazingText embeddings: Zoomed in – Part 2
• Model has learned important side effects and adverse drug reactions
• Types of reactions are even clustered
Cost
What does it cost to run this model?

Service      Resources used        Pricing dimension             Cost
Amazon S3    50 GB for one month   $0.023 per GB-month           $1.15
Amazon EFS   Storage                                             $3
                                                                 $1.3
                                   $0.0714 per instance-minute   $8.55
                                   $0.021 per instance-minute    $0.084
Total                                                            $14.08 ($0.11 per 1000 predictions)*

Amazon SageMaker on-demand ML instances let you pay for machine learning compute capacity by the second, with a one-minute minimum, with no long-term commitments.
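The arithmetic behind the cost figures can be checked with a short sketch. The line items and rates are the ones on the slide. The implied instance-minutes and prediction count are derived here by division and are not stated in the deck. `billed_cost` is a hypothetical helper that applies the per-second billing with a one-minute minimum described above.

```python
import math

# Cost column from the slide, in USD.
line_items = [1.15, 3.00, 1.30, 8.55, 0.084]
total = sum(line_items)
print(f"total = ${total:.2f}")

# Derived, not stated on the slide: usage implied by cost / rate.
minutes_at_higher_rate = 8.55 / 0.0714
minutes_at_lower_rate = 0.084 / 0.021
implied_predictions = total / 0.11 * 1000
print(round(minutes_at_higher_rate), round(minutes_at_lower_rate), round(implied_predictions))


def billed_cost(seconds: float, rate_per_minute: float) -> float:
    """Per-second billing with a one-minute minimum charge."""
    billable_seconds = max(60, math.ceil(seconds))
    return billable_seconds * rate_per_minute / 60


# A 30-second job is still charged for a full minute.
print(billed_cost(30, 0.021))
```

The five line items sum to the $14.08 total shown, which suggests the cost column survived extraction intact even though some service names did not.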
To learn more…
• Amazon SageMaker
• Blogs:
• Enhanced text classification and word vectors using Amazon SageMaker BlazingText
• https://tinyurl.com/sagemaker-blazingtext
• Bring your own pre-trained MXNet or TensorFlow models into Amazon SageMaker
• https://tinyurl.com/sagemaker-byom
Questions?
Thank you!
Introduzione a Amazon Elastic Container Service
 

NLP in Healthcare to Predict Adverse Events with Amazon SageMaker (AIM346) - AWS re:Invent 2018

  • 1.
  • 2. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. NLP in Healthcare to Predict Adverse Events with Amazon SageMaker (AIM346)
      Garin Kessler, Data Scientist, AWS Machine Learning Solutions Lab
      Mayank Thakkar, Life Sciences Specialist, AWS Solutions Architecture
  • 3. Goal
      Learn how to apply machine learning methods to predict adverse events from reported patient data … and much more
  • 4.
  • 5. Background
      • Pharmacovigilance and patient safety programs
      • Adverse events and FDA regulations
      • FDA Adverse Event Reporting System (FAERS)
      • Workable data
      • Call center recording / summaries
      • Emails / faxes
  • 6.
  • 7. Adverse event detection – The challenge
      • Disparate data types
      • Unstructured data
      • Understanding semantic dispositions
      • Synonyms, spelling mistakes
      • Sentiment detection
      • Categorizing interactions
      • Various data sources
      • Meeting compliance objectives
      • True positives, “sleeping doctor”
      • Scale, enormous scale
      • Cost efficiency
  • 8.
  • 9. Machine learning to the rescue
      • Improve accuracy and reliability
      • Doesn’t replace humans – aids humans
      • Offload repetitive work – humans can handle edge cases
      • Decrease costs
      • Repurpose human workforce for ‘value-adding’ endeavors
      • Keep up with ongoing research
      • Incorporate published articles at scale
  • 10. Machine learning – The process
      Steps: Fetch data; Clean & format data; Prepare & transform data; Train model; Evaluate model; Integrate with prod; Monitor / debug / refresh
      Data wrangling
          • Set up and manage notebook environments
          • Get data to notebooks securely
      Experimentation
          • Set up and manage clusters
          • Scale/distribute ML algorithms
      Deployment
          • Set up and manage inference clusters
          • Manage and auto scale inference APIs
          • Testing, versioning, and monitoring
      6-18 months
  • 11. Amazon SageMaker
      A managed service that provides one of the quickest and easiest ways for your data scientists and developers to get ML models from idea to production
  • 12. Introducing Amazon SageMaker
      • End-to-end machine learning platform
      • Zero setup
      • Flexible model training - bring your own deep learning script, or your custom algorithm Docker image
      • Pay by the second
      • One step deployment
      • A/B testing
      • Low latency, high throughput, high reliability
      • Choice of several ML algorithms
      • Train faster, in a single pass
  • 13. Introducing Amazon SageMaker: Choice of several ML algorithms
      • XGBoost, FM, and Linear for classification and regression
      • K-means and PCA for clustering and dimensionality reduction
      • LDA and NTM for topic modeling, seq2seq for translation
      • Image classification with convolutional neural networks
  • 14.
  • 15. Natural language processing methods
      • Dataset preprocessing - feature generators
          • Latent Dirichlet Allocation (LDA)
          • Comprehend topic modeling
          • BlazingText word embeddings
      • Classification - algorithms utilized
          • K-nearest neighbors
          • Logistic regression
          • XGBoost
          • Amazon SageMaker BlazingText Classifier
          • Deep convolutional neural network running on TensorFlow and Amazon SageMaker
  • 16. Preprocessing
      • Data sources
          • Call center summaries
          • Stored in Amazon Simple Storage Service (Amazon S3)
      • Preprocessing
          • Lemmatization with Natural Language Toolkit (NLTK)
          • BlazingText with Amazon SageMaker
      Using BlazingText reduced the preprocessing time by 10x
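The preprocessing step on slide 16 can be illustrated with a minimal sketch. It assumes the call center summaries are plain strings and that the classifier is trained in BlazingText's supervised text classification mode, which reads one example per line, with the label prefixed by `__label__` and the tokens separated by spaces. The helper name `to_blazingtext_line`, the label names, and the sample summary are hypothetical, and the NLTK lemmatization step is left out to keep the example dependency-free.

```python
import re


def to_blazingtext_line(summary: str, label: str) -> str:
    """Format one summary as a BlazingText supervised-mode training line."""
    # Lowercase and keep alphanumeric tokens; punctuation is dropped.
    tokens = re.findall(r"[a-z0-9]+(?:'[a-z]+)?", summary.lower())
    return f"__label__{label} " + " ".join(tokens)


# Hypothetical summary and label, invented for illustration only.
sample_line = to_blazingtext_line(
    "Pt reported severe headache after 2nd dose.", "adverse_event"
)
print(sample_line)
```

In a real pipeline the resulting lines would be written to a text file in Amazon S3 and passed to the training job as the training channel.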
  • 17. Amazon SageMaker tooling
      • TensorFlow and Keras
      • “Bring your own model”
      • Convolutional neural network
      • Built-in algorithms
      • Automatic model tuning
      • Spinning out many jobs simultaneously
      • Amazon CloudWatch and TensorBoard
      • Monitoring instances and accuracy metrics
  • 18. Architecture
      [Diagram labels: AWS Cloud; AWS Region; VPC; Availability zone 1 and Availability zone 2, each with a private subnet, Training, Deployment, an Auto Scaling group, and an Endpoint; Raw data and model artifacts; Production data]
  • 19. Results by algorithm
      Feature generator                  Classifier              Accuracy  AUC    FPR    FNR    Precision  Recall  Sensitivity  Specificity
      LDA (Latent Dirichlet Allocation)  kNN                     0.775     0.767  0.182  0.288  0.729      0.712   0.712        0.818
      LDA (Latent Dirichlet Allocation)  Logistic regression     0.728     0.787  0.277  0.257  0.485      0.743   0.743        0.723
      LDA (Latent Dirichlet Allocation)  XGBoost                 0.812     0.905  0.152  0.240  0.774      0.759   0.759        0.848
      Comprehend topic modeling          kNN                     0.759     0.718  0.254  0.189  0.516      0.811   0.811        0.742
      Comprehend topic modeling          Logistic regression     0.516     0.892  0.395  0.602  0.433      0.398   0.398        0.605
      Comprehend topic modeling          XGBoost                 0.855     0.936  0.069  0.230  0.908      0.769   0.769        0.931
      Amazon SageMaker BlazingText       BlazingText Classifier  0.979     0.997  0.023  0.020  0.980      0.985   0.985        0.970
      Amazon SageMaker                   Deep CNN                0.978     0.998  0.021  0.020  0.978      0.982   0.982        0.972
      FPR = false positive rate; FNR = false negative rate.
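Several columns in the results table are tied together by definition: sensitivity is another name for recall, the false positive rate is 1 minus specificity, and the false negative rate is 1 minus recall (in the first row, 0.182 + 0.818 = 1 and 0.288 + 0.712 = 1). The sketch below shows how these metrics follow from a binary confusion matrix. The counts are hypothetical and are not taken from the study; AUC is omitted because it needs per-example scores rather than counts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    tp: int  # adverse events correctly flagged
    fp: int  # non-events incorrectly flagged
    tn: int  # non-events correctly passed
    fn: int  # adverse events missed


def binary_metrics(cm: ConfusionMatrix) -> dict:
    """Compute the count-based metrics reported in the results table."""
    positives = cm.tp + cm.fn
    negatives = cm.tn + cm.fp
    recall = cm.tp / positives
    return {
        "accuracy": (cm.tp + cm.tn) / (positives + negatives),
        "precision": cm.tp / (cm.tp + cm.fp),
        "recall": recall,
        "sensitivity": recall,  # same quantity as recall
        "specificity": cm.tn / negatives,
        "false_positive_rate": cm.fp / negatives,
        "false_negative_rate": cm.fn / positives,
    }


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
metrics = binary_metrics(ConfusionMatrix(tp=80, fp=10, tn=90, fn=20))
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In adverse event detection the false negative rate is the share of real events the model misses, so it is usually the column to look at first.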
  • 20. BlazingText embeddings overview
      • Plot the top 5000 most common terms
      • Terms overlap with semantically similar terms
      • Models leverage these semantics for computation and performance
      • Will look at terms in two sections of the word embedding space
  • 21. BlazingText embeddings: Zoomed in – Part 1
      • Model has learned important familial and patient relationships, including caregivers and reporters
      • Robust to typos: Patient, Pateint, Pt
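The observation on slide 21 that "Patient", "Pateint", and "Pt" sit close together can be made concrete with cosine similarity, the usual way to compare word vectors. This is a sketch under stated assumptions: the three-dimensional vectors below are made up for illustration and are not the embeddings learned in the talk, and `nearest_terms` is a hypothetical helper.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy vectors, invented for illustration; real embeddings have many more dimensions.
toy_embeddings = {
    "patient": (0.90, 0.10, 0.05),
    "pateint": (0.88, 0.12, 0.07),
    "pt": (0.85, 0.15, 0.02),
    "headache": (0.05, 0.20, 0.95),
}


def nearest_terms(term, embeddings, k=2):
    """Return the k terms whose vectors are most similar to the query term."""
    query = embeddings[term]
    scored = [
        (other, cosine_similarity(query, vector))
        for other, vector in embeddings.items()
        if other != term
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [other for other, _ in scored[:k]]


print(nearest_terms("patient", toy_embeddings))
```

With these toy vectors the misspelling and the abbreviation rank ahead of the unrelated symptom term, which is the behavior the slide describes.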
  • 22. BlazingText embeddings: Zoomed in – Part 2
      • Model has learned important side effects and adverse drug reactions
      • Types of reactions are even clustered
  • 23. Cost: What does it cost to run this model?
      Table columns: Service; Resources used; Pricing dimension; Cost
      • Amazon S3; 50 GB for one month; $0.023 per GB-month; $1.15
      • Amazon EFS; Storage; $3
      • $1.3
      • $0.0714 per instance-minute; $8.55
      • $0.021 per instance-minute; $0.084
      • Total: $14.08 ($0.11 per 1000 predictions)*
      Amazon SageMaker on-demand ML instances let you pay for machine learning compute capacity by the second, with a one-minute minimum, with no long-term commitments.
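The figures on the cost slide can be checked with a few lines of arithmetic. This sketch assumes that every dollar amount on the slide that is not a unit price ($1.15, $3, $1.3, $8.55, $0.084) is a line-item cost; the slide text does not say which service each of the last three belongs to, so they are left unlabeled here.

```python
from decimal import Decimal

# Amazon S3 line: 50 GB stored for one month at $0.023 per GB-month.
s3_cost = Decimal("50") * Decimal("0.023")

# Line-item costs as they appear on the slide (assumed to be costs, see above).
line_item_costs = [
    Decimal("1.15"),
    Decimal("3"),
    Decimal("1.3"),
    Decimal("8.55"),
    Decimal("0.084"),
]
total = sum(line_item_costs)
total_rounded = total.quantize(Decimal("0.01"))

# Instance-minutes implied by the $0.084 line at $0.021 per instance-minute.
implied_minutes = Decimal("0.084") / Decimal("0.021")

print(f"S3: ${s3_cost}, total: ${total} (rounded ${total_rounded})")
print(f"Implied instance-minutes for the $0.084 line: {implied_minutes}")
```

Under that assumption the line items add up to the slide's stated total of $14.08 after rounding to cents.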
  • 24. To learn more…
      • Amazon SageMaker here
      • Blogs:
          • Enhanced text classification and word vectors using Amazon SageMaker BlazingText
            https://tinyurl.com/sagemaker-blazingtext
          • Bring your own pre-trained MXNet or TensorFlow models into Amazon SageMaker
            https://tinyurl.com/sagemaker-byom
  • 25. Questions?
      Garin Kessler, Data Scientist, AWS Machine Learning Solutions Lab
      Mayank Thakkar, Life Sciences Specialist, AWS Solutions Architecture
  • 26. Thank you!
      Garin Kessler, Data Scientist, AWS Machine Learning Solutions Lab
      Mayank Thakkar, Life Sciences Specialist, AWS Solutions Architecture
  • 27.