SlideShare a Scribd company logo
David Talby
@davidtalby
CTO, Atigeo
SEMANTIC NATURAL LANGUAGE UNDERSTANDING
WITH SPARK, UIMA & MACHINE-LEARNED ONTOLOGIES
Claudiu Branzan
@melcutz
Principal Lead, Atigeo
2
2
THE PROBLEM
Who needs to
be vaccinated?
Who fits this
clinical trial?
Who is at risk
for sepsis?
Who is getting
meds they’re
allergic to?
Who on this protocol
did not have this
side effect?
3
AT THE BEGINNING, THERE WAS SEARCH
Scalable & robust Indexing pipeline
Tokenizers & analyzers
Synonyms, spellers & Auto-suggest
File formats & header boosting
Rankers, link & reputation boosting
4
THEN THERE WAS SEMANTIC SEARCH
“cheap red prom dresses”
“laptops under $500”
“italian restaurants near me that deliver”
“captain america civil war tonight”
“nba scores”
Dictionary Based Attribute Extraction
Dell - XPS 15.6 4K Ultra HD Touch-Screen
Laptop - Intel Core i5 - 8GB Memory -
256GB Solid State Drive - Silver
Machine Learned Attribute Extraction
If you go for the ambience, you'll be
disappointed. If you go for good,
inexpensive and authentic Mexican food,
then you're in the right place.
5
AND THEN, YOU NEED TO UNDERSTAND LANGUAGE
Prescribing sick days due to diagnosis of influenza. Positive
Jane complains about flu-like symptoms. Speculative
Jane may be experiencing some sort of flu episode. Possible
Jane’s RIDT came back negative for influenza. Negative
Jane is at high risk for flu if she’s not vaccinated. Conditional
Jane’s older brother had the flu last month. Family history
Jane had a severe case of flu last year. Patient history
6
LANGUAGE GETS COMPLEX & DOMAIN SPECIFIC
Joe expressed concerns about the risks of bird flu. Nothing
Joe shows no signs of stroke, except for numbness. Double Negative
Nausea, vomiting and ankle swelling negative. Compound
(it gets worse – in reality a lot of text isn’t valid English)
Patient denies alcohol abuse. Speculative
Allergies: Penicillin, Dust, Sneezing. Compound
7
7
LET’S BUILD THIS!
The input
(patient records)
The processing
framework
The output The query engines
8
8
SENTENCE DETECTION
SECTION DETECTION
TOKENIZER LEMMATIZER
STOPWORD REMOVAL
NEGATION DETECTION
CONDITIONAL SCOPE
SPECULATIVE SCOPE
DATE NUMBER UNIT QUANITITY
CONCEPT EXTRACTION
9
9
First Demo: Annotators & Assertions
1 0
10
MACHINE LEARNED ANNOTATORS
Grammatical Patterns
If … then …
Direct Inferences
Age < 18 ==> Child
Lookups
RIDT (lab test)
Under-diagnosed conditions
Flu Depression
Implied by Context
relevant labs normal
Sometimes, it’s easier to just code an annotation’s business logic
But sometimes it’s easier to learn it from examples:
1 1
11
Second Demo: Machine Learned Annotator
1 2
1 3
13
WHAT ABOUT EXPANDING & UPDATING ONTOLOGIES?
Word2Vec
1 4
14
LET’S BUILD THIS TOO!
1 5
15
Third Demo: Ontology Enrichment
1 6
16
SUMMARY & APPLICATIONS
Who needs to
be vaccinated?
Who fits this
clinical trial?
Who is at risk
for sepsis?
Who is getting
meds they’re
allergic to?
Who on this protocol
did not have this
side effect?
1 7
17
@Atigeo
@melcutz
@davidtalby
© 2015 Atigeo, Corporation. All rights reserved. Atigeo and the xPatterns logo are trademarks of Atigeo. The information herein is for informational purposes only and represents the current view of Atigeo as of the date of this presentation. Because Atigeo
must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Atigeo, and Atigeo cannot guarantee the accuracy of any information provided after the date of this presentation. ATIGEO MAKES NO WARRANTIES,
EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.
APPENDIX
In case the live demo gets cold feet on stage
1 9
2 0
2 1
2 2
2 3
2 4
2 5
2 6
2 7
2 8

More Related Content

Viewers also liked

"There's a bot for that!" - The World of Conversational UIs and Chat Bots
"There's a bot for that!" - The World of Conversational UIs and Chat Bots"There's a bot for that!" - The World of Conversational UIs and Chat Bots
"There's a bot for that!" - The World of Conversational UIs and Chat Bots
Vishrut Shukla
 
Natural Language Processing for the Semantic Web
Natural Language Processing for the Semantic WebNatural Language Processing for the Semantic Web
Natural Language Processing for the Semantic Web
Isabelle Augenstein
 
Tokyo azure meetup #13 build bots with azure bot services
Tokyo azure meetup #13   build bots with azure bot servicesTokyo azure meetup #13   build bots with azure bot services
Tokyo azure meetup #13 build bots with azure bot services
Tokyo Azure Meetup
 
Artificial Intelligence as an Interface - How Conversation Bots Are Changing ...
Artificial Intelligence as an Interface - How Conversation Bots Are Changing ...Artificial Intelligence as an Interface - How Conversation Bots Are Changing ...
Artificial Intelligence as an Interface - How Conversation Bots Are Changing ...
Sage Franch
 
Online Predictive Modeling of Fraud Schemes from Mulitple Live Streams by Cla...
Online Predictive Modeling of Fraud Schemes from Mulitple Live Streams by Cla...Online Predictive Modeling of Fraud Schemes from Mulitple Live Streams by Cla...
Online Predictive Modeling of Fraud Schemes from Mulitple Live Streams by Cla...
Spark Summit
 
Data day2017
Data day2017Data day2017
Data day2017
Sanghamitra Deb
 
From Rocket Science to Data Science
From Rocket Science to Data ScienceFrom Rocket Science to Data Science
From Rocket Science to Data Science
Sanghamitra Deb
 
Extending Data Lake using the Lambda Architecture June 2015
Extending Data Lake using the Lambda Architecture June 2015Extending Data Lake using the Lambda Architecture June 2015
Extending Data Lake using the Lambda Architecture June 2015
DataWorks Summit
 
Big Data Analytics for Healthcare Decision Support- Operational and Clinical
Big Data Analytics for Healthcare Decision Support- Operational and ClinicalBig Data Analytics for Healthcare Decision Support- Operational and Clinical
Big Data Analytics for Healthcare Decision Support- Operational and Clinical
Adrish Sannyasi
 
Using Machine Learning to Automate Clinical Pathways
Using Machine Learning to Automate Clinical PathwaysUsing Machine Learning to Automate Clinical Pathways
Using Machine Learning to Automate Clinical Pathways
diannepatricia
 
Malcolm Pradhan on Pathology in Clincial Decision Support and the role of Dee...
Malcolm Pradhan on Pathology in Clincial Decision Support and the role of Dee...Malcolm Pradhan on Pathology in Clincial Decision Support and the role of Dee...
Malcolm Pradhan on Pathology in Clincial Decision Support and the role of Dee...
Cirdan
 
Clinical Trial Management Systems of next next decade
Clinical Trial Management Systems of next next decadeClinical Trial Management Systems of next next decade
Clinical Trial Management Systems of next next decade
Fotis Stathopoulos
 
Clinical research and clinical data management - Ikya Global
Clinical research and clinical data management - Ikya GlobalClinical research and clinical data management - Ikya Global
Clinical research and clinical data management - Ikya Global
ikya global
 
Oncology Big Data: A Mirage or Oasis of Clinical Value?
Oncology Big Data:  A Mirage or Oasis of Clinical Value? Oncology Big Data:  A Mirage or Oasis of Clinical Value?
Oncology Big Data: A Mirage or Oasis of Clinical Value?
Michael Peters
 
Clinical Data Management: Strategies for unregulated data
Clinical Data Management: Strategies for unregulated dataClinical Data Management: Strategies for unregulated data
Clinical Data Management: Strategies for unregulated data
IUPUI
 
Artificial Intelligence
Artificial Intelligence Artificial Intelligence
Artificial Intelligence
Muhammad Ahad
 
Flexible Study Design in Oracle Clinical and Remote Data Capture 4.6
Flexible Study Design in Oracle Clinical and Remote Data Capture 4.6Flexible Study Design in Oracle Clinical and Remote Data Capture 4.6
Flexible Study Design in Oracle Clinical and Remote Data Capture 4.6Perficient
 
Deep Learning and Recurrent Neural Networks in the Enterprise
Deep Learning and Recurrent Neural Networks in the EnterpriseDeep Learning and Recurrent Neural Networks in the Enterprise
Deep Learning and Recurrent Neural Networks in the Enterprise
Josh Patterson
 
Smart Data Conference: DL4J and DataVec
Smart Data Conference: DL4J and DataVecSmart Data Conference: DL4J and DataVec
Smart Data Conference: DL4J and DataVec
Josh Patterson
 

Viewers also liked (20)

"There's a bot for that!" - The World of Conversational UIs and Chat Bots
"There's a bot for that!" - The World of Conversational UIs and Chat Bots"There's a bot for that!" - The World of Conversational UIs and Chat Bots
"There's a bot for that!" - The World of Conversational UIs and Chat Bots
 
Natural Language Processing for the Semantic Web
Natural Language Processing for the Semantic WebNatural Language Processing for the Semantic Web
Natural Language Processing for the Semantic Web
 
Tokyo azure meetup #13 build bots with azure bot services
Tokyo azure meetup #13   build bots with azure bot servicesTokyo azure meetup #13   build bots with azure bot services
Tokyo azure meetup #13 build bots with azure bot services
 
Artificial Intelligence as an Interface - How Conversation Bots Are Changing ...
Artificial Intelligence as an Interface - How Conversation Bots Are Changing ...Artificial Intelligence as an Interface - How Conversation Bots Are Changing ...
Artificial Intelligence as an Interface - How Conversation Bots Are Changing ...
 
Online Predictive Modeling of Fraud Schemes from Mulitple Live Streams by Cla...
Online Predictive Modeling of Fraud Schemes from Mulitple Live Streams by Cla...Online Predictive Modeling of Fraud Schemes from Mulitple Live Streams by Cla...
Online Predictive Modeling of Fraud Schemes from Mulitple Live Streams by Cla...
 
Data day2017
Data day2017Data day2017
Data day2017
 
From Rocket Science to Data Science
From Rocket Science to Data ScienceFrom Rocket Science to Data Science
From Rocket Science to Data Science
 
Extending Data Lake using the Lambda Architecture June 2015
Extending Data Lake using the Lambda Architecture June 2015Extending Data Lake using the Lambda Architecture June 2015
Extending Data Lake using the Lambda Architecture June 2015
 
Big Data Analytics for Healthcare Decision Support- Operational and Clinical
Big Data Analytics for Healthcare Decision Support- Operational and ClinicalBig Data Analytics for Healthcare Decision Support- Operational and Clinical
Big Data Analytics for Healthcare Decision Support- Operational and Clinical
 
Using Machine Learning to Automate Clinical Pathways
Using Machine Learning to Automate Clinical PathwaysUsing Machine Learning to Automate Clinical Pathways
Using Machine Learning to Automate Clinical Pathways
 
Malcolm Pradhan on Pathology in Clincial Decision Support and the role of Dee...
Malcolm Pradhan on Pathology in Clincial Decision Support and the role of Dee...Malcolm Pradhan on Pathology in Clincial Decision Support and the role of Dee...
Malcolm Pradhan on Pathology in Clincial Decision Support and the role of Dee...
 
Clinical Trial Management Systems of next next decade
Clinical Trial Management Systems of next next decadeClinical Trial Management Systems of next next decade
Clinical Trial Management Systems of next next decade
 
Clinical research and clinical data management - Ikya Global
Clinical research and clinical data management - Ikya GlobalClinical research and clinical data management - Ikya Global
Clinical research and clinical data management - Ikya Global
 
Oncology Big Data: A Mirage or Oasis of Clinical Value?
Oncology Big Data:  A Mirage or Oasis of Clinical Value? Oncology Big Data:  A Mirage or Oasis of Clinical Value?
Oncology Big Data: A Mirage or Oasis of Clinical Value?
 
Clinical Data Management: Strategies for unregulated data
Clinical Data Management: Strategies for unregulated dataClinical Data Management: Strategies for unregulated data
Clinical Data Management: Strategies for unregulated data
 
Artificial Intelligence
Artificial Intelligence Artificial Intelligence
Artificial Intelligence
 
NLP
NLPNLP
NLP
 
Flexible Study Design in Oracle Clinical and Remote Data Capture 4.6
Flexible Study Design in Oracle Clinical and Remote Data Capture 4.6Flexible Study Design in Oracle Clinical and Remote Data Capture 4.6
Flexible Study Design in Oracle Clinical and Remote Data Capture 4.6
 
Deep Learning and Recurrent Neural Networks in the Enterprise
Deep Learning and Recurrent Neural Networks in the EnterpriseDeep Learning and Recurrent Neural Networks in the Enterprise
Deep Learning and Recurrent Neural Networks in the Enterprise
 
Smart Data Conference: DL4J and DataVec
Smart Data Conference: DL4J and DataVecSmart Data Conference: DL4J and DataVec
Smart Data Conference: DL4J and DataVec
 

Similar to Semantic Natural Language Understanding with Spark, UIMA & Machine Learned Ontologies

2014 abic-talk
2014 abic-talk2014 abic-talk
2014 abic-talk
c.titus.brown
 
How Four Statistical Rules Forecast Who Wins a Competitive Bid
How Four Statistical Rules Forecast Who Wins a Competitive BidHow Four Statistical Rules Forecast Who Wins a Competitive Bid
How Four Statistical Rules Forecast Who Wins a Competitive Bid
IntelCollab.com
 
All you need know about testing
All you need know about testingAll you need know about testing
All you need know about testing
Jorge Barroso
 
Essential Biology 04.4 Genetic Engineering & Biotechnology
Essential Biology 04.4 Genetic Engineering & BiotechnologyEssential Biology 04.4 Genetic Engineering & Biotechnology
Essential Biology 04.4 Genetic Engineering & Biotechnology
Stephen Taylor
 
BigData and Algorithms - LA Algorithmic Trading
BigData and Algorithms - LA Algorithmic TradingBigData and Algorithms - LA Algorithmic Trading
BigData and Algorithms - LA Algorithmic Trading
Tim Shea
 
Discount Usability Testing for Agile Teams
Discount Usability Testing for Agile TeamsDiscount Usability Testing for Agile Teams
Discount Usability Testing for Agile Teams
Ben Carey
 
IMA How to Give A Great Research Talk
IMA How to Give A Great Research Talk IMA How to Give A Great Research Talk
IMA How to Give A Great Research Talk
Julie Greensmith
 
Opsec for security researchers
Opsec for security researchersOpsec for security researchers
Opsec for security researchers
vicenteDiaz_KL
 
Gamification of Chaos Testing
Gamification of Chaos TestingGamification of Chaos Testing
Gamification of Chaos Testing
Bram Vogelaar
 
The Semantic Web - This time... its Personal
The Semantic Web - This time... its PersonalThe Semantic Web - This time... its Personal
The Semantic Web - This time... its Personal
Mark Wilkinson
 
4 Factors That Affect Research Reproducibility
4 Factors That Affect Research Reproducibility4 Factors That Affect Research Reproducibility
4 Factors That Affect Research Reproducibility
Cellero
 
Bioanalytical validation house of cards
Bioanalytical validation house of cardsBioanalytical validation house of cards
Bioanalytical validation house of cards
E. Dennis Bashaw
 
Stuart Reid - When Passion Obscures the Facts:The Case For Evidence-Based Te...
Stuart Reid  - When Passion Obscures the Facts:The Case For Evidence-Based Te...Stuart Reid  - When Passion Obscures the Facts:The Case For Evidence-Based Te...
Stuart Reid - When Passion Obscures the Facts:The Case For Evidence-Based Te...
TEST Huddle
 
II-SDV 2015, 20 - 21 April, in Nice
II-SDV 2015, 20 - 21 April, in NiceII-SDV 2015, 20 - 21 April, in Nice
II-SDV 2015, 20 - 21 April, in NiceDr. Haxel Consult
 
Deep Learning Applications in the Enterprise
Deep Learning Applications in the EnterpriseDeep Learning Applications in the Enterprise
Deep Learning Applications in the Enterprise
Ganes Kesari
 
Chaos Engineering Without Observability ... Is Just Chaos
Chaos Engineering Without Observability ... Is Just ChaosChaos Engineering Without Observability ... Is Just Chaos
Chaos Engineering Without Observability ... Is Just Chaos
Charity Majors
 
TCUK 2012, Leah Guren, Golden Rules Redux
TCUK 2012, Leah Guren, Golden Rules ReduxTCUK 2012, Leah Guren, Golden Rules Redux
TCUK 2012, Leah Guren, Golden Rules Redux
TCUK Conference
 
Hogeschool Den Haag Legal Analytics
Hogeschool Den Haag Legal AnalyticsHogeschool Den Haag Legal Analytics
Hogeschool Den Haag Legal Analytics
jcscholtes
 
COVID-19 Antibody Test+Vaccination Certificates: There's an app for that
COVID-19 Antibody Test+Vaccination Certificates: There's an app for thatCOVID-19 Antibody Test+Vaccination Certificates: There's an app for that
COVID-19 Antibody Test+Vaccination Certificates: There's an app for that
meisenstadt
 

Similar to Semantic Natural Language Understanding with Spark, UIMA & Machine Learned Ontologies (20)

2014 abic-talk
2014 abic-talk2014 abic-talk
2014 abic-talk
 
How Four Statistical Rules Forecast Who Wins a Competitive Bid
How Four Statistical Rules Forecast Who Wins a Competitive BidHow Four Statistical Rules Forecast Who Wins a Competitive Bid
How Four Statistical Rules Forecast Who Wins a Competitive Bid
 
All you need know about testing
All you need know about testingAll you need know about testing
All you need know about testing
 
Essential Biology 04.4 Genetic Engineering & Biotechnology
Essential Biology 04.4 Genetic Engineering & BiotechnologyEssential Biology 04.4 Genetic Engineering & Biotechnology
Essential Biology 04.4 Genetic Engineering & Biotechnology
 
BigData and Algorithms - LA Algorithmic Trading
BigData and Algorithms - LA Algorithmic TradingBigData and Algorithms - LA Algorithmic Trading
BigData and Algorithms - LA Algorithmic Trading
 
Discount Usability Testing for Agile Teams
Discount Usability Testing for Agile TeamsDiscount Usability Testing for Agile Teams
Discount Usability Testing for Agile Teams
 
IMA How to Give A Great Research Talk
IMA How to Give A Great Research Talk IMA How to Give A Great Research Talk
IMA How to Give A Great Research Talk
 
Opsec for security researchers
Opsec for security researchersOpsec for security researchers
Opsec for security researchers
 
Plexus Sept Oct 2013
Plexus Sept Oct 2013Plexus Sept Oct 2013
Plexus Sept Oct 2013
 
Gamification of Chaos Testing
Gamification of Chaos TestingGamification of Chaos Testing
Gamification of Chaos Testing
 
The Semantic Web - This time... its Personal
The Semantic Web - This time... its PersonalThe Semantic Web - This time... its Personal
The Semantic Web - This time... its Personal
 
4 Factors That Affect Research Reproducibility
4 Factors That Affect Research Reproducibility4 Factors That Affect Research Reproducibility
4 Factors That Affect Research Reproducibility
 
Bioanalytical validation house of cards
Bioanalytical validation house of cardsBioanalytical validation house of cards
Bioanalytical validation house of cards
 
Stuart Reid - When Passion Obscures the Facts:The Case For Evidence-Based Te...
Stuart Reid  - When Passion Obscures the Facts:The Case For Evidence-Based Te...Stuart Reid  - When Passion Obscures the Facts:The Case For Evidence-Based Te...
Stuart Reid - When Passion Obscures the Facts:The Case For Evidence-Based Te...
 
II-SDV 2015, 20 - 21 April, in Nice
II-SDV 2015, 20 - 21 April, in NiceII-SDV 2015, 20 - 21 April, in Nice
II-SDV 2015, 20 - 21 April, in Nice
 
Deep Learning Applications in the Enterprise
Deep Learning Applications in the EnterpriseDeep Learning Applications in the Enterprise
Deep Learning Applications in the Enterprise
 
Chaos Engineering Without Observability ... Is Just Chaos
Chaos Engineering Without Observability ... Is Just ChaosChaos Engineering Without Observability ... Is Just Chaos
Chaos Engineering Without Observability ... Is Just Chaos
 
TCUK 2012, Leah Guren, Golden Rules Redux
TCUK 2012, Leah Guren, Golden Rules ReduxTCUK 2012, Leah Guren, Golden Rules Redux
TCUK 2012, Leah Guren, Golden Rules Redux
 
Hogeschool Den Haag Legal Analytics
Hogeschool Den Haag Legal AnalyticsHogeschool Den Haag Legal Analytics
Hogeschool Den Haag Legal Analytics
 
COVID-19 Antibody Test+Vaccination Certificates: There's an app for that
COVID-19 Antibody Test+Vaccination Certificates: There's an app for thatCOVID-19 Antibody Test+Vaccination Certificates: There's an app for that
COVID-19 Antibody Test+Vaccination Certificates: There's an app for that
 

More from David Talby

Building State-of-the-art Natural Language Processing Projects with Free Soft...
Building State-of-the-art Natural Language Processing Projects with Free Soft...Building State-of-the-art Natural Language Processing Projects with Free Soft...
Building State-of-the-art Natural Language Processing Projects with Free Soft...
David Talby
 
Turning Medical Expert Knowledge into Responsible Language Models - K1st World
Turning Medical Expert Knowledge into Responsible Language Models - K1st WorldTurning Medical Expert Knowledge into Responsible Language Models - K1st World
Turning Medical Expert Knowledge into Responsible Language Models - K1st World
David Talby
 
How to Apply NLP to Analyze Clinical Trials
How to Apply NLP to Analyze Clinical TrialsHow to Apply NLP to Analyze Clinical Trials
How to Apply NLP to Analyze Clinical Trials
David Talby
 
New Frontiers in Applied NLP​ - PAW Healthcare 2022
New Frontiers in Applied NLP​ - PAW Healthcare 2022New Frontiers in Applied NLP​ - PAW Healthcare 2022
New Frontiers in Applied NLP​ - PAW Healthcare 2022
David Talby
 
Large Language Models, No-Code, and Responsible AI - Trends in Applied NLP in...
Large Language Models, No-Code, and Responsible AI - Trends in Applied NLP in...Large Language Models, No-Code, and Responsible AI - Trends in Applied NLP in...
Large Language Models, No-Code, and Responsible AI - Trends in Applied NLP in...
David Talby
 
Applying NLP to Personalized Healthcare - 2021
Applying NLP to Personalized Healthcare - 2021Applying NLP to Personalized Healthcare - 2021
Applying NLP to Personalized Healthcare - 2021
David Talby
 
Introducing the Open-Source Library for Testing NLP Models - Healthcare NLP S...
Introducing the Open-Source Library for Testing NLP Models - Healthcare NLP S...Introducing the Open-Source Library for Testing NLP Models - Healthcare NLP S...
Introducing the Open-Source Library for Testing NLP Models - Healthcare NLP S...
David Talby
 
Natural Language Understanding in Healthcare
Natural Language Understanding in HealthcareNatural Language Understanding in Healthcare
Natural Language Understanding in Healthcare
David Talby
 
Architecting an Open Source AI Platform 2018 edition
Architecting an Open Source AI Platform   2018 editionArchitecting an Open Source AI Platform   2018 edition
Architecting an Open Source AI Platform 2018 edition
David Talby
 
Deep learning for natural language understanding
Deep learning for natural language understandingDeep learning for natural language understanding
Deep learning for natural language understanding
David Talby
 
Build your open source data science platform
Build your open source data science platformBuild your open source data science platform
Build your open source data science platform
David Talby
 
Architecting a Predictive, Petabyte-Scale, Self-Learning Fraud Detection System
Architecting a Predictive,  Petabyte-Scale, Self-Learning Fraud Detection SystemArchitecting a Predictive,  Petabyte-Scale, Self-Learning Fraud Detection System
Architecting a Predictive, Petabyte-Scale, Self-Learning Fraud Detection System
David Talby
 

More from David Talby (12)

Building State-of-the-art Natural Language Processing Projects with Free Soft...
Building State-of-the-art Natural Language Processing Projects with Free Soft...Building State-of-the-art Natural Language Processing Projects with Free Soft...
Building State-of-the-art Natural Language Processing Projects with Free Soft...
 
Turning Medical Expert Knowledge into Responsible Language Models - K1st World
Turning Medical Expert Knowledge into Responsible Language Models - K1st WorldTurning Medical Expert Knowledge into Responsible Language Models - K1st World
Turning Medical Expert Knowledge into Responsible Language Models - K1st World
 
How to Apply NLP to Analyze Clinical Trials
How to Apply NLP to Analyze Clinical TrialsHow to Apply NLP to Analyze Clinical Trials
How to Apply NLP to Analyze Clinical Trials
 
New Frontiers in Applied NLP​ - PAW Healthcare 2022
New Frontiers in Applied NLP​ - PAW Healthcare 2022New Frontiers in Applied NLP​ - PAW Healthcare 2022
New Frontiers in Applied NLP​ - PAW Healthcare 2022
 
Large Language Models, No-Code, and Responsible AI - Trends in Applied NLP in...
Large Language Models, No-Code, and Responsible AI - Trends in Applied NLP in...Large Language Models, No-Code, and Responsible AI - Trends in Applied NLP in...
Large Language Models, No-Code, and Responsible AI - Trends in Applied NLP in...
 
Applying NLP to Personalized Healthcare - 2021
Applying NLP to Personalized Healthcare - 2021Applying NLP to Personalized Healthcare - 2021
Applying NLP to Personalized Healthcare - 2021
 
Introducing the Open-Source Library for Testing NLP Models - Healthcare NLP S...
Introducing the Open-Source Library for Testing NLP Models - Healthcare NLP S...Introducing the Open-Source Library for Testing NLP Models - Healthcare NLP S...
Introducing the Open-Source Library for Testing NLP Models - Healthcare NLP S...
 
Natural Language Understanding in Healthcare
Natural Language Understanding in HealthcareNatural Language Understanding in Healthcare
Natural Language Understanding in Healthcare
 
Architecting an Open Source AI Platform 2018 edition
Architecting an Open Source AI Platform   2018 editionArchitecting an Open Source AI Platform   2018 edition
Architecting an Open Source AI Platform 2018 edition
 
Deep learning for natural language understanding
Deep learning for natural language understandingDeep learning for natural language understanding
Deep learning for natural language understanding
 
Build your open source data science platform
Build your open source data science platformBuild your open source data science platform
Build your open source data science platform
 
Architecting a Predictive, Petabyte-Scale, Self-Learning Fraud Detection System
Architecting a Predictive,  Petabyte-Scale, Self-Learning Fraud Detection SystemArchitecting a Predictive,  Petabyte-Scale, Self-Learning Fraud Detection System
Architecting a Predictive, Petabyte-Scale, Self-Learning Fraud Detection System
 

Recently uploaded

Vitthal Shirke Microservices Resume Montevideo
Vitthal Shirke Microservices Resume MontevideoVitthal Shirke Microservices Resume Montevideo
Vitthal Shirke Microservices Resume Montevideo
Vitthal Shirke
 
A Study of Variable-Role-based Feature Enrichment in Neural Models of Code
A Study of Variable-Role-based Feature Enrichment in Neural Models of CodeA Study of Variable-Role-based Feature Enrichment in Neural Models of Code
A Study of Variable-Role-based Feature Enrichment in Neural Models of Code
Aftab Hussain
 
GraphSummit Paris - The art of the possible with Graph Technology
GraphSummit Paris - The art of the possible with Graph TechnologyGraphSummit Paris - The art of the possible with Graph Technology
GraphSummit Paris - The art of the possible with Graph Technology
Neo4j
 
Atelier - Innover avec l’IA Générative et les graphes de connaissances
Atelier - Innover avec l’IA Générative et les graphes de connaissancesAtelier - Innover avec l’IA Générative et les graphes de connaissances
Atelier - Innover avec l’IA Générative et les graphes de connaissances
Neo4j
 
Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...
Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...
Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...
Globus
 
Cracking the code review at SpringIO 2024
Cracking the code review at SpringIO 2024Cracking the code review at SpringIO 2024
Cracking the code review at SpringIO 2024
Paco van Beckhoven
 
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
Crescat
 
How Recreation Management Software Can Streamline Your Operations.pptx
How Recreation Management Software Can Streamline Your Operations.pptxHow Recreation Management Software Can Streamline Your Operations.pptx
How Recreation Management Software Can Streamline Your Operations.pptx
wottaspaceseo
 
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
Juraj Vysvader
 
Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus Compute wth IRI Workflows - GlobusWorld 2024Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus
 
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdfDominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
AMB-Review
 
Enhancing Research Orchestration Capabilities at ORNL.pdf
Enhancing Research Orchestration Capabilities at ORNL.pdfEnhancing Research Orchestration Capabilities at ORNL.pdf
Enhancing Research Orchestration Capabilities at ORNL.pdf
Globus
 
Quarkus Hidden and Forbidden Extensions
Quarkus Hidden and Forbidden ExtensionsQuarkus Hidden and Forbidden Extensions
Quarkus Hidden and Forbidden Extensions
Max Andersen
 
Prosigns: Transforming Business with Tailored Technology Solutions
Prosigns: Transforming Business with Tailored Technology SolutionsProsigns: Transforming Business with Tailored Technology Solutions
Prosigns: Transforming Business with Tailored Technology Solutions
Prosigns
 
BoxLang: Review our Visionary Licenses of 2024
BoxLang: Review our Visionary Licenses of 2024BoxLang: Review our Visionary Licenses of 2024
BoxLang: Review our Visionary Licenses of 2024
Ortus Solutions, Corp
 
OpenMetadata Community Meeting - 5th June 2024
OpenMetadata Community Meeting - 5th June 2024OpenMetadata Community Meeting - 5th June 2024
OpenMetadata Community Meeting - 5th June 2024
OpenMetadata
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
Safe Software
 
Large Language Models and the End of Programming
Large Language Models and the End of ProgrammingLarge Language Models and the End of Programming
Large Language Models and the End of Programming
Matt Welsh
 
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoam
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoamOpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoam
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoam
takuyayamamoto1800
 
First Steps with Globus Compute Multi-User Endpoints
First Steps with Globus Compute Multi-User EndpointsFirst Steps with Globus Compute Multi-User Endpoints
First Steps with Globus Compute Multi-User Endpoints
Globus
 

Recently uploaded (20)

Vitthal Shirke Microservices Resume Montevideo
Vitthal Shirke Microservices Resume MontevideoVitthal Shirke Microservices Resume Montevideo
Vitthal Shirke Microservices Resume Montevideo
 
A Study of Variable-Role-based Feature Enrichment in Neural Models of Code
A Study of Variable-Role-based Feature Enrichment in Neural Models of CodeA Study of Variable-Role-based Feature Enrichment in Neural Models of Code
A Study of Variable-Role-based Feature Enrichment in Neural Models of Code
 
GraphSummit Paris - The art of the possible with Graph Technology
GraphSummit Paris - The art of the possible with Graph TechnologyGraphSummit Paris - The art of the possible with Graph Technology
GraphSummit Paris - The art of the possible with Graph Technology
 
Atelier - Innover avec l’IA Générative et les graphes de connaissances
Atelier - Innover avec l’IA Générative et les graphes de connaissancesAtelier - Innover avec l’IA Générative et les graphes de connaissances
Atelier - Innover avec l’IA Générative et les graphes de connaissances
 
Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...
Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...
Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...
 
Cracking the code review at SpringIO 2024
Cracking the code review at SpringIO 2024Cracking the code review at SpringIO 2024
Cracking the code review at SpringIO 2024
 
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
 
How Recreation Management Software Can Streamline Your Operations.pptx
How Recreation Management Software Can Streamline Your Operations.pptxHow Recreation Management Software Can Streamline Your Operations.pptx
How Recreation Management Software Can Streamline Your Operations.pptx
 
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
 
Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus Compute wth IRI Workflows - GlobusWorld 2024Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus Compute wth IRI Workflows - GlobusWorld 2024
 
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdfDominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
 
Enhancing Research Orchestration Capabilities at ORNL.pdf
Enhancing Research Orchestration Capabilities at ORNL.pdfEnhancing Research Orchestration Capabilities at ORNL.pdf
Enhancing Research Orchestration Capabilities at ORNL.pdf
 
Quarkus Hidden and Forbidden Extensions
Quarkus Hidden and Forbidden ExtensionsQuarkus Hidden and Forbidden Extensions
Quarkus Hidden and Forbidden Extensions
 
Prosigns: Transforming Business with Tailored Technology Solutions
Prosigns: Transforming Business with Tailored Technology SolutionsProsigns: Transforming Business with Tailored Technology Solutions
Prosigns: Transforming Business with Tailored Technology Solutions
 
BoxLang: Review our Visionary Licenses of 2024
BoxLang: Review our Visionary Licenses of 2024BoxLang: Review our Visionary Licenses of 2024
BoxLang: Review our Visionary Licenses of 2024
 
OpenMetadata Community Meeting - 5th June 2024
OpenMetadata Community Meeting - 5th June 2024OpenMetadata Community Meeting - 5th June 2024
OpenMetadata Community Meeting - 5th June 2024
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
 
Large Language Models and the End of Programming
Large Language Models and the End of ProgrammingLarge Language Models and the End of Programming
Large Language Models and the End of Programming
 
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoam
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoamOpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoam
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoam
 
First Steps with Globus Compute Multi-User Endpoints
First Steps with Globus Compute Multi-User EndpointsFirst Steps with Globus Compute Multi-User Endpoints
First Steps with Globus Compute Multi-User Endpoints
 

Semantic Natural Language Understanding with Spark, UIMA & Machine Learned Ontologies

  • 1. David Talby @davidtalby CTO, Atigeo SEMANTIC NATURAL LANGUAGE UNDERSTANDING WITH SPARK, UIMA & MACHINE-LEARNED ONTOLOGIES Claudiu Branzan @melcutz Principal Lead, Atigeo
  • 2. 2 2 THE PROBLEM Who needs to be vaccinated? Who fits this clinical trial? Who is at risk for sepsis? Who is getting meds they’re allergic to? Who on this protocol did not have this side effect?
  • 3. 3 AT THE BEGINNING, THERE WAS SEARCH Scalable & robust Indexing pipeline Tokenizers & analyzers Synonyms, spellers & Auto-suggest File formats & header boosting Rankers, link & reputation boosting
  • 4. 4 THEN THERE WAS SEMANTIC SEARCH “cheap red prom dresses” “laptops under $500” “italian restaurants near me that deliver” “captain america civil war tonight” “nba scores” Dictionary Based Attribute Extraction Dell - XPS 15.6 4K Ultra HD Touch-Screen Laptop - Intel Core i5 - 8GB Memory - 256GB Solid State Drive - Silver Machine Learned Attribute Extraction If you go for the ambience, you'll be disappointed. If you go for good, inexpensive and authentic Mexican food, then you're in the right place.
  • 5. 5 AND THEN, YOU NEED TO UNDERSTAND LANGUAGE Prescribing sick days due to diagnosis of influenza. Positive Jane complains about flu-like symptoms. Speculative Jane may be experiencing some sort of flu episode. Possible Jane’s RIDT came back negative for influenza. Negative Jane is at high risk for flu if she’s not vaccinated. Conditional Jane’s older brother had the flu last month. Family history Jane had a severe case of flu last year. Patient history
  • 6. 6 LANGUAGE GETS COMPLEX & DOMAIN SPECIFIC Joe expressed concerns about the risks of bird flu. Nothing Joe shows no signs of stroke, except for numbness. Double Negative Nausea, vomiting and ankle swelling negative. Compound (it gets worse – in reality a lot of text isn’t valid English) Patient denies alcohol abuse. Speculative Allergies: Penicillin, Dust, Sneezing. Compound
  • 7. 7 7 LET’S BUILD THIS! The input (patient records) The processing framework The output The query engines
  • 8. 8 8 SENTENCE DETECTION SECTION DETECTION TOKENIZER LEMMATIZER STOPWORD REMOVAL NEGATION DETECTION CONDITIONAL SCOPE SPECULATIVE SCOPE DATE NUMBER UNIT QUANITITY CONCEPT EXTRACTION
  • 10. 1 0 10 MACHINE LEARNED ANNOTATORS Grammatical Patterns If … then … Direct Inferences Age < 18 ==> Child Lookups RIDT (lab test) Under-diagnosed conditions Flu Depression Implied by Context relevant labs normal Sometimes, it’s easier to just code an annotation’s business logic But sometimes it’s easier to learn it from examples:
  • 11. 1 1 11 Second Demo: Machine Learned Annotator
  • 12. 1 2
  • 13. 1 3 13 WHAT ABOUT EXPANDING & UPDATING ONTOLOGIES? Word2Vec
  • 14. 1 4 14 LET’S BUILD THIS TOO!
  • 15. 1 5 15 Third Demo: Ontology Enrichment
  • 16. 1 6 16 SUMMARY & APPLICATIONS Who needs to be vaccinated? Who fits this clinical trial? Who is at risk for sepsis? Who is getting meds they’re allergic to? Who on this protocol did not have this side effect?
  • 18. © 2015 Atigeo, Corporation. All rights reserved. Atigeo and the xPatterns logo are trademarks of Atigeo. The information herein is for informational purposes only and represents the current view of Atigeo as of the date of this presentation. Because Atigeo must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Atigeo, and Atigeo cannot guarantee the accuracy of any information provided after the date of this presentation. ATIGEO MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.
  • 19. APPENDIX In case the live demo gets cold feet on stage 1 9
  • 20. 2 0
  • 21. 2 1
  • 22. 2 2
  • 23. 2 3
  • 24. 2 4
  • 25. 2 5
  • 26. 2 6
  • 27. 2 7
  • 28. 2 8