Dr. David Talby
DEEP LEARNING FOR
NATURAL LANGUAGE UNDERSTANDING
CONTENTS
 NLP & THE PROMISE OF DEEP LEARNING
 IN ACTION: NAMED ENTITY RECOGNITION
 GOING TO PRODUCTION
AI VS. DOCTORS
Deep Learning
Computer Vision
Access to Care
Diagnostic Accuracy
NLP IN HEALTHCARE
Deep Learning
NLP
Efficiency
Accuracy
Radiology Diagnostic
Mental Health
Safety Events
Inpatient
Pre-Auth
Key Opinion Leaders
Research
Meta Analysis
Clinical Coding
Financial
Anti-Fraud
Adverse Events
Drug Development
Recruit for Trials
Natural Language Understanding
is an AI-Complete problem.
ED Triage Notes
states started last night, upper abd, took alka seltzer approx
0500, no relief. nausea no vomiting
Since yeatreday 10/10 "constant Tylenol 1 hr ago. +nausea.
diaphoretic. Mid abd radiates to back
Generalized abd radiating to lower x 3 days accompanied
by dark stools. Now with bloody stool this am. Denies dizzy,
sob, fatigue. Visiting from Japan on business.”
Features
Type of Pain
Intensity of Pain
Body part or region
Symptoms
Onset of symptoms
Attempted home remedy
HUMAN LANGUAGE IS CONTEXTUAL
HUMAN LANGUAGE IS NUANCED
THE PROMISE OF DEEP LEARNING
Rules = get by with rules, search, RegEx, attribute extraction
DL = welcome to the world of NLP, ML and DL
Social media
• Rules: Does this social media post contain an offensive word?
• DL: Is this social media post offensive?
Legal
• Rules: Find patents with the terms ‘car’ and ‘battery’, or synonyms
• DL: Who is patenting next-gen electrical car batteries?
Support
• Rules: Find products mentioned in customer emails or phone calls
• DL: What is this customer complaining about?
Finance
• Rules: Extract the fee structure from a mutual fund prospectus
• DL: Are UK pensions allowed to invest in this fund?
Healthcare
• Rules: Extract the patient’s blood pressure reading from a note
• DL: Does this patient have high blood pressure?
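The healthcare pair above can be made concrete with a minimal sketch. The pattern, function name, and sample notes below are illustrative assumptions, not taken from the talk: a regular expression is enough to pull out an explicit reading, but it returns nothing for a note that only implies hypertension, which is where the second kind of question starts.

```python
import re

# Hypothetical pattern: "BP" or "blood pressure", a few non-digit characters,
# then a systolic/diastolic pair such as 150/95.
BP_PATTERN = re.compile(
    r"\b(?:bp|blood pressure)\D{0,10}(\d{2,3})\s*/\s*(\d{2,3})\b",
    re.IGNORECASE,
)


def extract_blood_pressure(note):
    """Return (systolic, diastolic) for the first explicit reading, or None."""
    match = BP_PATTERN.search(note)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


# An explicit reading is easy to extract with a rule.
explicit = extract_blood_pressure("Pt c/o headache. BP 150/95, HR 88.")

# This made-up note suggests high blood pressure without stating a reading,
# so the rule finds nothing.
implied = extract_blood_pressure("hx of HTN, on lisinopril")
```

The second note is the kind of case where extraction alone does not answer "does this patient have high blood pressure?".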
CONTENTS
 NLP & THE PROMISE OF DEEP LEARNING
 IN ACTION: NAMED ENTITY RECOGNITION
 GOING TO PRODUCTION
NAMED ENTITY RECOGNITION
From Sutton & McCallum’s An Introduction to Conditional Random Fields.
FROM CRF TO DEEP LEARNING (AND BACK)
From Yves Peirsman’s Named Entity Recognition and the Road to Deep Learning
• CoNLL-2003 shared task dataset
• CRF++ Implementation
• Feature engineering:
• the token itself
• Its bigram & trigram
• Their prefix & suffix
• Its part of speech
• Its chunk type
• Does it start with a capital?
• Is it uppercase?
• Is it a digit?
• Surrounding context words
Starting Point: “Classic” machine learning approach
81.15%
F-score
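As a rough illustration of the feature engineering listed on this slide, the sketch below builds a feature dictionary for one token. The function name, feature keys, affix length of 3, and the sample sentence with its part-of-speech tags are assumptions for illustration; the chunk type and trigram features are left out for brevity, and this is not the CRF++ template format.

```python
def token_features(tokens, pos_tags, i):
    """Build hand-crafted features for the token at position i."""
    token = tokens[i]
    # Boundary placeholders stand in for missing context at sentence edges.
    prev_token = tokens[i - 1].lower() if i > 0 else "<BOS>"
    next_token = tokens[i + 1].lower() if i < len(tokens) - 1 else "<EOS>"
    return {
        "token": token.lower(),
        "bigram": prev_token + " " + token.lower(),
        "prefix3": token[:3],
        "suffix3": token[-3:],
        "pos": pos_tags[i],
        "starts_with_capital": token[:1].isupper(),
        "is_uppercase": token.isupper(),
        "is_digit": token.isdigit(),
        "prev_token": prev_token,
        "next_token": next_token,
    }


# Illustrative sentence and tags, supplied by hand.
tokens = ["EU", "rejects", "German", "call"]
pos_tags = ["NNP", "VBZ", "JJ", "NN"]
features = token_features(tokens, pos_tags, 0)
```

Each token's dictionary would then be handed to a sequence model such as a CRF, which learns a weight per feature.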
CRF + WORD EMBEDDINGS
From Yves Peirsman’s Named Entity Recognition and the Road to Deep Learning
Replacing curated dictionaries with embeddings to model semantic similarity
84.9%
F-score
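To show what "model semantic similarity" means in practice, here is a small sketch using cosine similarity. The three-dimensional vectors are made up by hand purely for illustration; real embeddings have hundreds of dimensions and are learned from text.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Toy, hand-written vectors (an assumption, not trained embeddings).
toy_embeddings = {
    "paris": [0.9, 0.1, 0.0],
    "london": [0.8, 0.2, 0.1],
    "tuesday": [0.0, 0.1, 0.9],
}

city_pair = cosine_similarity(toy_embeddings["paris"], toy_embeddings["london"])
mixed_pair = cosine_similarity(toy_embeddings["paris"], toy_embeddings["tuesday"])
```

With vectors like these, a model can treat an unseen city name like the cities it has seen, without a curated gazetteer listing every city.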
FORGET CRF. LET’S USE AN LSTM NETWORK
From Yves Peirsman’s Named Entity Recognition and the Road to Deep Learning
An LSTM is a type of RNN, well suited for sequential data with long-term dependencies
64.9%
LSTM F-score
76.1%
biLSTM F-score
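The F-scores compared across these slides can be computed along the following lines. This is a sketch that assumes BIO-style labels and exact-match, entity-level scoring; the function names and example labels are illustrative, and the official evaluation script may differ in details.

```python
def extract_spans(labels):
    """Collect (start, end, type) entity spans from a list of BIO labels.

    An I- label with no open entity of the same type is ignored.
    """
    spans = set()
    start, ent_type = None, None
    for i, label in enumerate(labels + ["O"]):  # sentinel closes the last span
        inside = label.startswith("I-") and label[2:] == ent_type
        if start is not None and not inside:
            spans.add((start, i, ent_type))
            start, ent_type = None, None
        if label.startswith("B-"):
            start, ent_type = i, label[2:]
    return spans


def f_score(gold, predicted):
    """Harmonic mean of precision and recall over exactly matching spans."""
    gold_spans = extract_spans(gold)
    predicted_spans = extract_spans(predicted)
    correct = len(gold_spans & predicted_spans)
    if correct == 0:
        return 0.0
    precision = correct / len(predicted_spans)
    recall = correct / len(gold_spans)
    return 2 * precision * recall / (precision + recall)


gold = ["B-PER", "I-PER", "O", "B-LOC"]
predicted = ["B-PER", "I-PER", "O", "O"]  # misses the location entity
score = f_score(gold, predicted)
```

Here precision is 1 (the one predicted entity is right) and recall is 0.5 (one of two gold entities found), so the F-score is 2/3.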
TRANSFER LEARNING: USE PRETRAINED EMBEDDINGS
From Yves Peirsman’s Named Entity Recognition and the Road to Deep Learning
85.9%
F-score
Reuse the embeddings trained on Wikipedia,
instead of on CoNLL, which only has 200,000 words
ADD CHARACTER BASED MODEL: BI-LSTM OR CNN
From Yves Peirsman’s Named Entity Recognition and the Road to Deep Learning
89.3%
F-score
In addition to token based models, add a character-based biLSTM or CNN
to learn and model word prefixes and suffixes
LET’S GET OVER 90% - BRING BACK THE CRF!
From Yves Peirsman’s Named Entity Recognition and the Road to Deep Learning
90.3%
F-score
Because predicting all labels independently of each other, not taking into account the
labels predicted for the surrounding words, leaves some accuracy on the table
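The point about independent predictions can be sketched with a tiny Viterbi decoder. All scores below are hand-set assumptions for illustration; in a real CRF layer the transition scores are learned during training. Picking the best label per token yields an invalid sequence (I-PER directly after O), while decoding with transition scores yields a consistent one.

```python
def viterbi(emissions, transitions, labels):
    """Return the highest-scoring label sequence.

    emissions: one {label: score} dict per token.
    transitions: {(previous_label, label): score}; missing pairs score 0.
    """
    # best[i][label] = (score of best path ending in label at i, previous label)
    best = [{label: (emissions[0][label], None) for label in labels}]
    for scores in emissions[1:]:
        column = {}
        for cur in labels:
            prev = max(
                labels,
                key=lambda p: best[-1][p][0] + transitions.get((p, cur), 0.0),
            )
            total = best[-1][prev][0] + transitions.get((prev, cur), 0.0) + scores[cur]
            column[cur] = (total, prev)
        best.append(column)
    # Backtrack from the best final label.
    label = max(labels, key=lambda l: best[-1][l][0])
    path = [label]
    for column in reversed(best[1:]):
        label = column[label][1]
        path.append(label)
    return path[::-1]


labels = ["O", "B-PER", "I-PER"]
# Hand-set per-token scores for a two-token input.
emissions = [
    {"O": 2.0, "B-PER": 1.5, "I-PER": 0.0},
    {"O": 0.5, "B-PER": 0.0, "I-PER": 2.0},
]
# Penalize an entity continuation that follows a non-entity token.
transitions = {("O", "I-PER"): -10.0}

independent = [max(scores, key=scores.get) for scores in emissions]
joint = viterbi(emissions, transitions, labels)
```

The independent choice is O followed by I-PER, which is not a well-formed entity; the joint decoding gives B-PER followed by I-PER.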
In deep learning, architecture engineering
is the new feature engineering.
Stephen Merity
CONTENTS
 NLP & THE PROMISE OF DEEP LEARNING
 IN ACTION: NAMED ENTITY RECOGNITION
 GOING TO PRODUCTION
Data Curation
Data Science
Data Engineering
Data Operations
Moving from research to production?
 Business Case
 All four roles on the team
Get the data
• “Inception v3 was trained on 1.28 million images”
• “used over 120,000 retinal images to train a neural network to detect diabetic retinopathy”
Get expert labels
• “In the study, the algorithm went head-to-head against 21 board-certified dermatologists”
• “All images were graded by 3 to 7 different ophthalmologists, from a panel of 54 US-licensed senior residents & ophthalmologists”
Get pretrained datasets & embeddings
• Facebook open sourced pre-trained word vectors for 294 languages, trained on Wikipedia using fastText
• UMLS has over 1 million biomedical concepts and 5 million concept names, from over 100 controlled vocabularies
Read up on state of the art, domain specific research
“How to Train Good Word Embeddings
for Biomedical NLP”.
Chiu et al., In Proceedings of BioNLP’16, August 2016.
“Entity Recognition from Clinical Texts via Recurrent
Neural Network”.
Liu et al., BMC Medical Informatics & Decision Making, July 2017.
Are your ML/DL/NLP libraries research or industrial grade?
Data Sources API
Spark Core API (RDD’s, Project Tungsten)
Spark SQL API (DataFrame, Catalyst Optimizer)
Spark ML API (Pipeline, Transformer, Estimator)
Part of Speech Tagger
Named Entity Recognition
Sentiment Analysis
Spell Checker
Tokenizer
Stemmer
Lemmatizer
Entity Extraction
Topic Modeling
Word2Vec
TF-IDF
String distance calculation
N-grams calculation
Stop word removal
Train/Test & Cross-Validate
Ensembles
High Performance Natural Language Understanding at Scale
DeepLearning4j Spark-NLP
From: Post by Ben Lorica
david@pacific.ai
@davidtalby
in/davidtalby
THANK YOU!

Deep learning for natural language understanding

  • 1. Dr. David Talby DEEP LEARNING FOR NATURAL LANGUAGE UNDERSTANDING
  • 2. CONTENTS  NLP & THE PROMISE OF DEEP LEARNING  IN ACTION: NAMED ENTITY RECOGNITION  GOING TO PRODUCTION
  • 3. AI VS. DOCTORS Deep Learning Computer Vision Access to Care Diagnostic Accuracy
  • 4. NLP IN HEALTHCARE Deep Learning NLP Efficiency Accuracy Radiology Diagnostic Mental Health Safety Events Inpatient Pre- Auth Key Opinion Leaders Research Meta Analysis Clinical Coding Financial Anti- Fraud Adverse Events Drug Development Recruit for Trials
  • 5. Natural Language Understanding is an AI-Complete problem.
  • 6. ED Triage Notes states started last night, upper abd, took alka seltzer approx 0500, no relief. nausea no vomiting Since yeatreday 10/10 "constant Tylenol 1 hr ago. +nausea. diaphoretic. Mid abd radiates to back Generalized abd radiating to lower x 3 days accompanied by dark stools. Now with bloody stool this am. Denies dizzy, sob, fatigue. Visiting from Japan on business.” Features Type of Pain Intensity of Pain Body part of region Symptoms Onset of symptoms Attempted home remedy HUMAN LANGUAGE IS CONTEXTUAL
  • 8. THE PROMISE OF DEEP LEARNING Get by with rules, search, RegEx, attribute extraction Welcome to the world of NLP, ML and DL Social media Does this social media post contain an offensive word? Is this social media post offensive? Legal Find patents with the terms ‘car’ and battery’, or synonyms Who is patenting next-gen electrical car batteries? Support Find products mentioned in customer emails or phone calls What is this customer complaining about? Finance Extract the fee structure from a mutual fund prospectus Are UK pensions allowed to invest in this fund? Healthcare Extract the patient’s blood pressure reading from a note Does this patient have high blood pressure?
  • 9. CONTENTS • NLP & THE PROMISE OF DEEP LEARNING • IN ACTION: NAMED ENTITY RECOGNITION • GOING TO PRODUCTION
  • 10. NAMED ENTITY RECOGNITION From Sutton & McCallum’s An Introduction to Conditional Random Fields.
  • 11. FROM CRF TO DEEP LEARNING (AND BACK) From Yves Peirsman’s Named Entity Recognition and the Road to Deep Learning • CoNLL-2003 shared task dataset • CRF++ implementation • Feature engineering: • The token itself • Its bigram & trigram • Their prefix & suffix • Its part of speech • Its chunk type • Does it start with a capital? • Is it uppercase? • Is it a digit? • Surrounding context words. Starting point: “Classic” machine learning approach, 81.15% F-score
  • 12. CRF + WORD EMBEDDINGS From Yves Peirsman’s Named Entity Recognition and the Road to Deep Learning Replacing curated dictionaries with embeddings to model semantic similarity 84.9% F-score
  • 13. FORGET CRF. LET’S USE AN LSTM NETWORK From Yves Peirsman’s Named Entity Recognition and the Road to Deep Learning An LSTM is a type of RNN, well suited for sequential data with long-term dependencies 64.9% LSTM F-score 76.1% biLSTM F-score
  • 14. TRANSFER LEARNING: USE PRETRAINED EMBEDDINGS From Yves Peirsman’s Named Entity Recognition and the Road to Deep Learning 85.9% F-score Reuse the embeddings trained on Wikipedia, instead of on CoNLL which only has 200,000 words
  • 15. ADD CHARACTER BASED MODEL: BI-LSTM OR CNN From Yves Peirsman’s Named Entity Recognition and the Road to Deep Learning 89.3% F-score In addition to token based models, add a character-based biLSTM or CNN to learn and model word prefixes and suffixes
  • 16. LET’S GET OVER 90% - BRING BACK THE CRF! From Yves Peirsman’s Named Entity Recognition and the Road to Deep Learning 90.3% F-score Because predicting all labels independently of each other, not taking into account the labels predicted for the surrounding words, leaves some accuracy on the table
  • 17. In deep learning, architecture engineering is the new feature engineering. Stephen Merity
  • 18. CONTENTS • NLP & THE PROMISE OF DEEP LEARNING • IN ACTION: NAMED ENTITY RECOGNITION • GOING TO PRODUCTION
  • 19. Data Curation Data Science Data Engineering Data Operations Moving from research to production?  Business Case  All four roles on the team
  • 20. Data Curation Data Science Data Engineering Data Operations Get the data Get expert labels Get pretrained datasets & embeddings “Inception v3 was trained on 1.28 million images” “In the study, the algorithm went head-to-head against 21 board-certified dermatologists” Facebook open sourced pre-trained word vectors for 294 languages, trained on Wikipedia using fastText “used over 120,000 retinal images to train a neural network to detect diabetic retinopathy” “All images were graded by 3 to 7 different ophthalmologists, from a panel of 54 US-licensed senior residents & ophthalmologists” UMLS has over 1 million biomedical concepts and 5 million concept names, from over 100 controlled vocabularies
  • 21. Data Curation Data Science Data Engineering Data Operations Read up on state of the art, domain specific research “How to Train Good Word Embeddings for Biomedical NLP”. Chiu et al., In Proceedings of BioNLP’16, August 2016. “Entity Recognition from Clinical Texts via Recurrent Neural Network”. Liu et al., BMC Medical Informatics & Decision Making, July 2017. Are your ML/DL/NLP libraries research or industrial grade?
  • 22. Data Sources API Spark Core API (RDD’s, Project Tungsten) Spark SQL API (DataFrame, Catalyst Optimizer) Spark ML API (Pipeline, Transformer, Estimator) Part of Speech Tagger Named Entity Recognition Sentiment Analysis Spell Checker Tokenizer Stemmer Lemmatizer Entity Extraction Topic Modeling Word2Vec TF-IDF String distance calculation N-grams calculation Stop word removal Train/Test & Cross-Validate Ensembles High Performance Natural Language Understanding at Scale Data Curation Data Science Data Engineering Data Operations DeepLearning4j Spark-NLP
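
The hand-built features listed on slide 11 can be sketched as a per-token feature function, the kind of input a classic CRF tagger consumes. This is only an illustration under stated assumptions: the function name `token_features`, the feature keys, the three-character prefix/suffix length, and the `<BOS>`/`<EOS>` sentinels are hypothetical choices, not taken from the talk or from CRF++. Chunk type and trigrams are omitted for brevity.

```python
def token_features(tokens, pos_tags, i):
    """Build a feature dict for token i (hypothetical feature names)."""
    word = tokens[i]
    features = {
        "token": word.lower(),                      # the token itself
        "prefix3": word[:3],                        # its prefix
        "suffix3": word[-3:],                       # its suffix
        "pos": pos_tags[i],                         # its part of speech
        "starts_with_capital": word[:1].isupper(),  # does it start with a capital?
        "is_uppercase": word.isupper(),             # is it uppercase?
        "is_digit": word.isdigit(),                 # is it a digit?
    }
    # Surrounding context words, with sentinels at sentence boundaries.
    features["prev_word"] = tokens[i - 1].lower() if i > 0 else "<BOS>"
    features["next_word"] = tokens[i + 1].lower() if i < len(tokens) - 1 else "<EOS>"
    # A bigram built from the previous word and the current token.
    features["bigram"] = features["prev_word"] + " " + features["token"]
    return features


# Example sentence; the part-of-speech tags are supplied by hand here.
tokens = ["Chandler", "met", "Monica", "at", "Central", "Perk"]
pos_tags = ["NNP", "VBD", "NNP", "IN", "NNP", "NNP"]
features = [token_features(tokens, pos_tags, i) for i in range(len(tokens))]
```

Each dict in `features` would be paired with a gold label (for example `B-PER` or `O`) when training a sequence tagger. The later slides replace most of these hand-written features with learned embeddings and character-level models.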

Editor's Notes

  1. There is not one “language” – every vertical and communication channel has its own jargon that includes vocabulary, grammar, assumptions and semantics. For example – in these ED triage notes, none of the sentences is in valid English, and the words “patient” and “pain” do not appear.
  2. Another challenge is that a lot of what we say is not in the text itself – it’s about the relationship, occasion, social norms, feeling to be communicated. Language can be viewed as a compression problem – can you summarize a 2-hour event into a few sentences? How was the movie? What did the doctor say?
  3. Challenges in NER: Going beyond dictionaries and lists. For example, “Chandler” is obviously not the city of Chandler, AZ and “Central Perk” is obviously a place even if you’ve never heard of it (since it is the location of a meeting). There can be many kinds of entities that a given problem will need to extract: companies, people, genes, diseases, financial terms, etc.
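
Slide 16's point, that predicting every label independently leaves accuracy on the table, can be illustrated with a toy decoder. The sketch below compares per-token greedy decoding with Viterbi decoding over transition scores, which is the decoding step a CRF layer adds. All scores here are made up for illustration, and the label set and function names are hypothetical.

```python
LABELS = ["O", "B-LOC", "I-LOC"]

def greedy_decode(emissions):
    """Pick the best label for each token independently."""
    return [max(LABELS, key=lambda label: scores[label]) for scores in emissions]

def viterbi_decode(emissions, transitions):
    """Find the highest-scoring label sequence given transition scores."""
    # best[label] = (score of the best path ending in label, that path)
    best = {label: (emissions[0][label], [label]) for label in LABELS}
    for scores in emissions[1:]:
        new_best = {}
        for cur in LABELS:
            # Choose the previous label that maximizes path score + transition.
            prev = max(LABELS, key=lambda p: best[p][0] + transitions.get((p, cur), 0.0))
            total = best[prev][0] + transitions.get((prev, cur), 0.0) + scores[cur]
            new_best[cur] = (total, best[prev][1] + [cur])
        best = new_best
    return max(best.values(), key=lambda item: item[0])[1]

# Made-up per-token scores for the tokens "at", "Central", "Perk".
emissions = [
    {"O": 2.0, "B-LOC": 0.0, "I-LOC": 0.0},
    {"O": 1.0, "B-LOC": 0.8, "I-LOC": 0.2},
    {"O": 0.5, "B-LOC": 0.2, "I-LOC": 1.0},
]
# Made-up transition scores: I-LOC right after O is heavily penalized,
# B-LOC followed by I-LOC is rewarded; unlisted pairs score 0.
transitions = {("O", "I-LOC"): -10.0, ("B-LOC", "I-LOC"): 1.0}

greedy_path = greedy_decode(emissions)
viterbi_path = viterbi_decode(emissions, transitions)
print(greedy_path)   # ['O', 'O', 'I-LOC']
print(viterbi_path)  # ['O', 'B-LOC', 'I-LOC']
```

Greedy decoding yields an `I-LOC` that follows `O`, which is not a valid entity span. The joint decoder recovers `B-LOC I-LOC` for “Central Perk” because the transition scores make the inconsistent sequence expensive.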