Building Natural Language
Processing solutions
For Davidson Machine Learning Group
By Ramu Pulipati,
@botsplash
Introduction to NLP
• Natural Language:
• General purpose communications
• A distinct difference between humans and animals
• Much more difficult to interpret than formal language
• Natural Language Processing (NLP) Advancements
• Earlier focus was on Linguistics and Computer Science
• Current evolution is focused on Machine Learning, specifically
Deep Learning and Neural Networks
• Varied degrees of implementation based on use case
Scope of Natural Language Processing
• Read
• Natural Language Understanding (NLU)
• Write
• Natural Language Generation (NLG)
• Speak
• Speech Recognition / Synthesis
NLP Applications
More Applications …
• Email Spam
• Siri / Alexa / Cortana
• Legal contracts, to find action clauses
• Health Care Records
• Energy Sector / Utilities / Inspection Records
• Automated Agents
• Appointment Scheduling
• Auto Email Responses
• Typing Suggestions
• Spelling Check
• Predicting Crops
• Social Media Propaganda
• Press/Earnings releases
• Weather Reports
• Search Engines
• News categorization
• Chatbot
• NY Times Oped author analysis
State of NLP
Source: https://www.slideshare.net/healess/sk-t-academy-lecture-note
Botsplash AI Strategy
Machine Learning
Natural Language Processing
Predictive Analytics
Routing Intelligence
High Intent Conversion Detection
Trends and Behavior
End Chat, Spam Detection
Content and Sentiment
FAQ, Support, Transaction
Chatbot
Re-engagement
Smart Scheduling
UI Interactions
Focus on solvable/acceptable problems
I’m looking for 30yr mortgage loan in Charlotte, NC
(Named Entity Recognition)
Thanks for your help. Great chatting with you.
(classification)
Lets connect tomorrow. Anytime evening will work for me.
(classification / intent / actionable)
This rate is unacceptable. What can you do?
(sentiment)
Note on leading NLP providers
• AWS Comprehend
• Google Cloud NLP
• Microsoft Project Oxford
• IBM Watson
• Aylien
• Cennest Comparison: https://cognitiveintegratorapp.azurewebsites.net/
Note: None of them provide the results you are looking for. Open source
packages are your best options.
Text Processing Roundup
• Normalization
• Text Classification
• Text Similarity
• Text Extraction
• Topic Modeling
• Semantic Search
• Sentiment Analysis
Word Embeddings
• Paper published by Mikolov et al. in 2013
Example: Man is to Woman as King is to _______
• Multi-dimensional space of word representations, with proximity
based on similarity of the words (word vectors)
• Algebraic expressions can be applied to word vectors
• Building a word embedding: provide a lot of data with features to look
• Word2vec is a popular word embedding model implemented with a neural
network
• Other implementations, such as GloVe, use co-occurrence matrices
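The analogy arithmetic above can be sketched with plain Python. This is only an illustration: the three-dimensional vectors in `TOY_VECTORS` are hand-written assumptions, not learned embeddings, and the `analogy` helper is a hypothetical name, not part of Word2vec or GloVe.

```python
import math

# Hand-written toy vectors (assumption for illustration only).
# Real embeddings are learned from large corpora and have many more dimensions.
TOY_VECTORS = {
    "man":   [1.0, 0.0, 0.2],
    "woman": [1.0, 1.0, 0.2],
    "king":  [1.0, 0.0, 0.9],
    "queen": [1.0, 1.0, 0.9],
    "apple": [0.0, 0.2, 0.1],
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


def analogy(a, b, c, vectors):
    """Answer 'a is to b as c is to ?' using vec(b) - vec(a) + vec(c)."""
    target = [vb - va + vc
              for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])]
    # Exclude the three query words, then pick the nearest remaining word.
    candidates = (w for w in vectors if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(vectors[w], target))


print(analogy("man", "woman", "king", TOY_VECTORS))  # prints "queen"
```

With real pretrained vectors the same arithmetic applies, but the nearest word is found by searching the whole vocabulary.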
Word2vec paper results
NLP Pipeline
• Classical follows traditional ML strategies
• Deep Learning requires a lot of data
Getting started
• Install Python. Use version 3 or later.
• Install data science packages with “pip install” or Anaconda
• Always use “virtualenv” when setting up environments.
• Start with Jupyter notebooks and convert them to production code.
• Use cloud-hosted Jupyter notebooks with access to a GPU from
FloydHub, Paperspace, Google, Amazon or Azure
Python packages for NLP
• NLP Focus Packages
• NLTK
• spaCy
• Gensim
• TextBlob
• scikit-learn
• Stanford NLP (Java)
• WordNet, SentiWordNet
• FastText / MUSE / Faiss
• Deep Learning Frameworks
• TensorFlow / Keras
• PyTorch
• Other Noteworthy
• Scrapy
• Newspaper
• nlp-architect
NLTK Code Tour
• Tokenization (Dictionary and Regex)
• Stemming
• Lemmatization
• NLP Grammar - Chunking and Chinking
• Entity Recognition
• WikiQuiz
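The first two items of the tour can be sketched without any dependencies. This is a minimal illustration using the standard `re` module, not NLTK itself: `TOKEN_PATTERN`, `tokenize`, and `crude_stem` are hypothetical names, and the suffix stripper is far simpler than a real stemmer such as Porter's.

```python
import re

# Words (with an optional apostrophe suffix), numbers, or single punctuation marks.
TOKEN_PATTERN = re.compile(r"[A-Za-z]+(?:'[a-z]+)?|\d+(?:\.\d+)?|[^\w\s]")


def tokenize(text):
    """Split text into tokens using the regex above."""
    return TOKEN_PATTERN.findall(text)


# Checked in order; a crude stand-in for a real stemming algorithm.
SUFFIXES = ("ing", "ed", "es", "s")


def crude_stem(token):
    """Lowercase the token and strip one common suffix if a stem of 3+ letters remains."""
    word = token.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


tokens = tokenize("I'm looking for 30yr mortgage loans in Charlotte, NC")
print(tokens)
print([crude_stem(t) for t in tokens])
```

Note that stemming only chops suffixes, while lemmatization maps a word to its dictionary form and therefore needs a vocabulary.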
Spacy.io Lightning Tour
• Industrial Strength, Fast
• POS Tagging and Dependency Parsing
• Named Entities, Word embedding and Similarity
• Custom Pipelines
• Visualization
Text classification
• Use cases: Spam, Actionable events, Intents
• For Content based or Request based classification
• Steps involve Preparing -> Training -> Prediction
• Feature Extractions
• Bag of Words
• TF-IDF model
• Word Vectors: Averaged, TF-IDF weighted, etc.
• StarSpace model
• FastText
• Classification algorithms: Multinomial Naive Bayes or SVM
• Intent Classification
• RASA NLU
• Snips NLU
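The Preparing -> Training -> Prediction flow with Bag of Words features and a Multinomial Naive Bayes classifier can be sketched in plain Python. This is a minimal illustration under stated assumptions: the training messages and tag names are made up, the `MultinomialNB` class here is a hand-rolled sketch (not the scikit-learn class of the same name), and Laplace smoothing with `alpha=1.0` is an assumed default.

```python
import math
from collections import Counter, defaultdict


def bag_of_words(text):
    """Preparing: lowercase, split on whitespace, count each word."""
    return Counter(text.lower().split())


class MultinomialNB:
    """Hand-rolled multinomial Naive Bayes with Laplace smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, texts, labels):
        """Training: count documents per class and words per class."""
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(bag_of_words(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        """Prediction: return the class with the highest log-probability."""
        total_docs = sum(self.class_counts.values())
        features = bag_of_words(text)
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            score = math.log(doc_count / total_docs)  # class prior
            denom = sum(self.word_counts[label].values()) + self.alpha * len(self.vocab)
            for word, count in features.items():
                if word in self.vocab:  # words never seen in training are ignored
                    smoothed = self.word_counts[label][word] + self.alpha
                    score += count * math.log(smoothed / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up training data for illustration.
TRAIN = [
    ("thanks for your help great chatting with you", "end_chat"),
    ("thank you bye", "end_chat"),
    ("lets connect tomorrow evening works for me", "schedule"),
    ("can we talk tomorrow morning", "schedule"),
    ("win a free prize click now", "spam"),
    ("free offer click here now", "spam"),
]
model = MultinomialNB().fit([t for t, _ in TRAIN], [y for _, y in TRAIN])
print(model.predict("thanks bye"))
print(model.predict("click for a free prize"))
```

Swapping the word counts for TF-IDF weights or averaged word vectors changes only the feature step; the three-stage flow stays the same.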
Steps to classifying your data
1. Identify tags to be applied
2. Manually add tags for the data (possibly in the application)
3. Build a classification algorithm
4. Set up your application to auto classify tags
5. Evaluate silently and then enable the actions
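Step 5 can be sketched as a small "shadow mode" tracker: predictions are logged next to the human-applied tags, and automated actions are switched on only once agreement is high enough. The class name `ShadowEvaluator` and the thresholds `min_samples` and `min_accuracy` are hypothetical choices for illustration, not values from the talk.

```python
from dataclasses import dataclass, field


@dataclass
class ShadowEvaluator:
    """Compare silent predictions with human tags before enabling actions."""

    min_samples: int = 50       # assumed threshold: enough evidence collected
    min_accuracy: float = 0.9   # assumed threshold: acceptable agreement rate
    results: list = field(default_factory=list)

    def record(self, predicted_tag, human_tag):
        """Store whether the classifier agreed with the human for one item."""
        self.results.append(predicted_tag == human_tag)

    @property
    def accuracy(self):
        """Fraction of recorded items where prediction matched the human tag."""
        return sum(self.results) / len(self.results) if self.results else 0.0

    def ready_to_enable(self):
        """True once both the sample count and the accuracy thresholds are met."""
        return len(self.results) >= self.min_samples and self.accuracy >= self.min_accuracy


evaluator = ShadowEvaluator(min_samples=4, min_accuracy=0.75)
pairs = [("spam", "spam"), ("schedule", "schedule"),
         ("spam", "end_chat"), ("end_chat", "end_chat")]
for predicted, human in pairs:
    evaluator.record(predicted, human)
print(evaluator.accuracy, evaluator.ready_to_enable())
```

In practice a per-tag breakdown is often more useful than a single accuracy figure, since a rare tag can be wrong most of the time without moving the overall number.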
Sentiment Analysis
• Use case: Reviews, Chat transcripts, etc
• Supervised techniques are effective for a domain
• Packages:
• SentiWordNet
• StanfordNLP
• spaCy Sentiment Analysis (incomplete)
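For contrast with supervised models, a lexicon-based scorer in the spirit of SentiWordNet can be sketched in a few lines. Everything here is an assumption for illustration: the word weights in `TOY_LEXICON` are made up, and the negation rule only flips the word immediately after a negation term.

```python
import re

# Made-up polarity weights; a real lexicon such as SentiWordNet is far larger.
TOY_LEXICON = {
    "great": 1.0, "good": 0.5, "thanks": 0.5, "help": 0.3,
    "unacceptable": -1.0, "bad": -0.5, "terrible": -1.0,
}
NEGATIONS = {"not", "no", "never"}


def sentiment_score(text):
    """Sum word polarities; a negation flips only the next word."""
    score = 0.0
    negate = False
    for token in re.findall(r"[a-z']+", text.lower()):
        if token in NEGATIONS:
            negate = True
            continue
        value = TOY_LEXICON.get(token, 0.0)
        score += -value if negate else value
        negate = False
    return score


print(sentiment_score("This rate is unacceptable. What can you do?"))
print(sentiment_score("Thanks for your help. Great chatting with you."))
```

A scorer like this has no notion of domain, which is one reason a supervised model trained on your own reviews or transcripts tends to work better.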
Summarization
• Summarization is hard
• Uses a variety of techniques including text extraction, feature matrices,
TF-IDF, collocation, SVD and other methods
• Implement LSA to under
• Review of implementations:
• spaCy
• TextRank
• Pyteaser
• Textteaser
• Sumy
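The simplest form of extractive summarization can be sketched by scoring sentences on word frequency. This is a deliberately basic stand-in, not LSA or TextRank: the `summarize` function, the tiny `STOPWORDS` set, and the sample text are all assumptions for illustration.

```python
import re
from collections import Counter

# Tiny stopword list for the sketch; real lists are much longer.
STOPWORDS = {"a", "an", "the", "to", "for", "of", "in", "and", "is", "was", "up", "with"}


def content_words(text):
    """Lowercased alphabetic words with stopwords removed."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def summarize(text, max_sentences=1):
    """Keep the sentences whose words are most frequent in the whole text."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(content_words(text))

    def score(sentence):
        words = content_words(sentence)
        # Average frequency, so long sentences are not favored automatically.
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:max_sentences])  # restore original sentence order
    return " ".join(sentences[i] for i in keep)


TEXT = ("Word embeddings map words to vectors. "
        "Vectors for similar words end up close together. "
        "The weather was nice yesterday.")
print(summarize(TEXT, max_sentences=2))
```

Methods such as LSA replace the frequency score with one derived from an SVD of the term-sentence matrix, but the select-and-reorder step is the same.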
Chatbots
• Rules Based
• Intent Classification
• Context and Workflow Management
• Handle Special Cases
• Generative
• Sequence to Sequence Chatbot: DeepQA demo
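The rules-based pieces (intent classification, context, special cases) can be sketched as a small state machine. This is an illustrative toy, not Botsplash's implementation: the intent names, regex patterns, reply strings, and the `RuleBot` class are all hypothetical.

```python
import re

# Keyword patterns per intent, checked in insertion order.
INTENT_PATTERNS = {
    "greeting": re.compile(r"\b(hi|hello|hey)\b"),
    "schedule": re.compile(r"\b(schedule|appointment|connect|call)\b"),
    "goodbye": re.compile(r"\b(bye|thanks|thank you)\b"),
}


def classify_intent(message):
    """Return the first intent whose pattern matches, else 'unknown'."""
    text = message.lower()
    for intent, pattern in INTENT_PATTERNS.items():
        if pattern.search(text):
            return intent
    return "unknown"


class RuleBot:
    """Keeps one piece of context: whether we are waiting for a time."""

    def __init__(self):
        self.awaiting_time = False

    def reply(self, message):
        # Context: the previous turn asked for a time, so treat this turn as the answer.
        if self.awaiting_time:
            self.awaiting_time = False
            return f"Booked: {message.strip()}."
        intent = classify_intent(message)
        if intent == "greeting":
            return "Hello! How can I help?"
        if intent == "schedule":
            self.awaiting_time = True
            return "Sure. What time works for you?"
        if intent == "goodbye":
            return "Thanks for chatting. Goodbye!"
        # Special case: hand off when no rule applies.
        return "Sorry, I did not understand. Let me connect you to an agent."


bot = RuleBot()
for turn in ["Hi there", "Can we schedule a call?", "tomorrow evening"]:
    print(bot.reply(turn))
```

A trained intent classifier can replace `classify_intent` without touching the workflow logic, which is the usual path from keyword rules to an NLU-backed bot.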
Code Review / Demo Apps
• Jupyter Notebooks
• NLTK Code Review
• spaCy Code Review
• Word2Vec Samples
• NLTK Grammar Parsing
• WikiQuiz
• Topic Modeling Code Review
• Text Similarity – Phrase Matcher API
Follow up Learning
• Websites:
• Allen AI - NLP
• Fast AI
• Maluuba
• Coursera
• YouTube
• Resources
• Sanni Oluwatoyin Yetunde Google Slides
• Cambridge Data Science Group presentation
• nlp.fast.ai

LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
 
JMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and GrafanaJMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and Grafana
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
 
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMsTo Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
 
ODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User GroupODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User Group
 
Connector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonConnector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a button
 
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
 
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
 
"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
 
Assuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesAssuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyes
 

Building NLP solutions for Davidson ML Group

  • 1. Building Natural Language Processing solutions For Davidson Machine Learning Group By Ramu Pulipati, @botsplash
  • 2. Introduction to NLP • Natural Language: • General purpose communications • Distinct difference between humans and animals • Much more difficult to interpret than formal language • Natural Language Processing (NLP) Advancements • Earlier focus was on Linguistics and Computer Science • Current evolution is focused on Machine Learning, specifically Deep Learning and Neural Networks • Varied degrees of implementation based on use case
  • 3. Scope of Natural Language Processing • Read • Natural Language Understanding (NLU) • Write • Natural Language Generation (NLG) • Speak • Speech Recognition / Synthesis
  • 5. More Applications … • Email Spam • Siri / Alexa / Cortana • Legal Contracts to find Action clauses • Health Care Records • Energy Sector / Utilities / Inspection Records • Automated Agents • Appointment Scheduling • Auto Email Responses • Typing Suggestions • Spelling Check • Predicting Crops • Social Media Propaganda • Press/Earnings releases • Weather Reports • Search Engines • News categorization • Chatbot • NY Times Op-Ed author analysis
  • 6. State of NLP Source: https://www.slideshare.net/healess/sk-t-academy-lecture-note
  • 7. Botsplash AI Strategy Machine Learning Natural Language Processing Predictive Analytics Routing Intelligence High Intent Conversion Detection Trends and Behavior End Chat, Spam Detection Content and Sentiment FAQ, Support, Transaction Chatbot Re-engagement Smart Scheduling UI Interactions
  • 8. Focus on solvable/acceptable problems I’m looking for 30yr mortgage loan in Charlotte, NC (Named Entity Recognition) Thanks for your help. Great chatting with you. (classification) Lets connect tomorrow. Anytime evening will work for me. (classification / intent / actionable) This rate is unacceptable. What can you do? (sentiment)
  • 9. Note on leading NLP providers • AWS Comprehend • Google Cloud NLP • Microsoft Project Oxford • IBM Watson • Aylien • Cennest Comparison: https://cognitiveintegratorapp.azurewebsites.net/ Note: None of them provide the results you are looking for. Open source packages are your best options.
  • 10. Text Processing Roundup • Normalization • Text Classification • Text Similarity • Text Extraction • Topic Modeling • Semantic Search • Sentiment Analysis
  • 11. Word Embeddings • Paper published by Mikolov et al., 2013. Example: Man is to Woman as King is to _______ • Multi-dimensional space of word representations with proximity based on similarity of the words (word vectors) • Algebraic expressions can be applied to word vectors • Building word embeddings: Provide a lot of data with features to look for • Word2vec is a popular word embedding implemented with a neural network • Other implementations such as GloVe use co-occurrence matrices
  • 13. NLP Pipeline • Classical follows traditional ML strategies • Deep Learning requires a lot of data
  • 14. Getting started • Python Installation. Use 3+. • Data science packages installation. Use “pip install” or Anaconda • Always use “virtualenv” when setting up environments. • Start with Jupyter notebooks and convert them to production code. • Use cloud-hosted Jupyter notebooks with access to GPUs from FloydHub, Paperspace, Google, Amazon or Azure
  • 15. Python packages for NLP • NLP Focus Packages • NLTK • spaCy • Gensim • TextBlob • scikit-learn • Stanford NLP (Java) • WordNet, SentiWordNet • FastText / MUSE / Faiss • Deep Learning Frameworks • TensorFlow / Keras • PyTorch • Other Noteworthy • Scrapy • Newspaper • nlp-architect
  • 16. NLTK Code Tour • Tokenization (Dictionary and Regex) • Stemming • Lemmatization • NLP Grammar - Chunking and Chinking • Entity Recognition • WikiQuiz
  • 17. Spacy.io Lightning Tour • Industrial Strength, Fast • POS Tagging and Dependency Parsing • Named Entities, Word embedding and Similarity • Custom Pipelines • Visualization
  • 18. Text classification • Use cases: Spam, Actionable events, Intents • For Content based or Request based classification • Steps involve Preparing -> Training -> Prediction • Feature Extraction • Bag of Words • TF-IDF model • Word Vectors: Averaged, TF-IDF weighted, etc. • StarSpace model • FastText • Classification algorithms: Multinomial Naive Bayes or SVM • Intent Classification • Rasa NLU • Snips NLU
  • 19. Steps to classifying your data 1. Identify tags to be applied 2. Manually add tags for the data (possibly in the application) 3. Build a classification algorithm 4. Set up your application to auto-classify tags 5. Evaluate silently and then enable the actions
  • 20. Sentiment Analysis • Use case: Reviews, Chat transcripts, etc • Supervised techniques are effective for a domain • Packages: • SentiWordNet • StanfordNLP • Spacy Sentiment Analysis (incomplete)
  • 21. Summarization • Summarization is hard • Uses a variety of techniques including Text extraction, Feature Matrix, TF-IDF, Co-location, SVD and other methods • Implement LSA to under • Review of implementations: • spaCy • TextRank • PyTeaser • TextTeaser • Sumy
  • 22. Chatbots • Rules Based • Intent Classification • Context and Workflow Management • Handle Special Cases • Generative • Sequence to Sequence Chatbot: DeepQA demo
  • 23. Code Review / Demo Apps • Jupyter Notebooks • NLTK Code Review • spaCy Code Review • Word2Vec Samples • NLTK Grammar Parsing • WikiQuiz • Topic Modeling Code Review • Text Similarity – Phrase Matcher API
  • 24. Follow up Learning • Websites: • Allen AI - NLP • Fast AI • Malabuba • Coursera • Youtube • Resources • Sanni Oluwatoyin Yetunde Google Slides • Cambridge Data Science Group presentation • nlp.fast.ai
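
The Preparing -> Training -> Prediction flow from slide 18 can be sketched with a bag-of-words model and Multinomial Naive Bayes. This is a minimal pure-Python illustration, not the code shown in the talk: the training sentences, the labels `end_chat` and `schedule`, and the helper names `tokenize`, `train` and `predict` are all hypothetical, and a real project would more likely rely on one of the packages listed on slide 15.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Preparing: lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def train(samples):
    """Training: count words per label (bag of words) and documents per label."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in samples:
        label_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = {word for counts in word_counts.values() for word in counts}
    return word_counts, label_counts, vocab


def predict(text, model):
    """Prediction: pick the label with the highest log posterior."""
    word_counts, label_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        # Log prior of the label.
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word not in vocab:
                continue  # ignore words never seen during training
            # Laplace (add-one) smoothed likelihood of the word given the label.
            score += math.log(
                (word_counts[label][word] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical hand-tagged chat messages (step 2 on slide 19).
TRAINING_DATA = [
    ("Thanks for your help. Great chatting with you.", "end_chat"),
    ("Thank you, that is all I needed. Bye.", "end_chat"),
    ("Lets connect tomorrow. Anytime evening will work for me.", "schedule"),
    ("Can we schedule a call tomorrow morning?", "schedule"),
]

model = train(TRAINING_DATA)
print(predict("Thanks, bye for now", model))
print(predict("Is tomorrow evening ok for a call?", model))
```

With only four training messages the model is a toy; the point is the shape of the pipeline, in which tagging more data improves the counts without changing the code.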

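The "Man is to Woman as King is to ___" example from slide 11 relies on vector arithmetic followed by a nearest-neighbour lookup. The sketch below shows that mechanism with hand-made 3-dimensional vectors. The numbers in `VECTORS` are invented for illustration and are not taken from any trained model; real embeddings such as Word2vec or GloVe are learned from large corpora and have far more dimensions.

```python
import math

# Hypothetical toy vectors; the dimensions loosely stand for
# (royalty, maleness, femaleness). They are assumptions, not trained values.
VECTORS = {
    "king": [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man": [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
    "apple": [0.0, 0.2, 0.2],
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def analogy(a, b, c, vectors):
    """Solve 'a is to b as c is to ?' by computing b - a + c."""
    target = [
        vb - va + vc
        for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])
    ]
    # Exclude the three input words, then pick the closest remaining word.
    candidates = (word for word in vectors if word not in (a, b, c))
    return max(candidates, key=lambda word: cosine(vectors[word], target))


print(analogy("man", "woman", "king", VECTORS))
```

Excluding the input words from the candidates matters: without it, the nearest vector to the target is often one of the words used to build it.
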
Editor's Notes

  1. Natural language is ambiguous, whereas formal language is precise. Example of a formal language: a programming language.
  2. The Botsplash framework builds on strong concepts and strategy to augment business processes and achieve the best outcome for businesses and their customers. Botsplash is a Software-as-a-Service platform with a B2B2C model. We want the “B” (business) to provide the “C” (consumers of the business) with the best, easiest-to-use and most reliable technology to reduce costs and increase business transactions, efficiency and customer satisfaction.
  3. ML Strategies: * Explore data and use visualizations * Create train and test data * Set up the training algorithm and features * Train the model * Test the result * Rinse and repeat until the results are satisfactory
  4. Multinomial Naive Bayes can be used to predict more than two classes. It is a popular Bayes algorithm that assumes the features are independent of each other. Support vector machines are supervised algorithms used for classification, regression, and anomaly/outlier detection. For classification algorithms, we focus on the following metrics: accuracy, precision, recall and F1 score.
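
The four metrics named in note 4 can be computed directly from the true and predicted labels. The following is a small sketch for the binary case; the function name `classification_metrics`, the `spam`/`ham` labels and the sample predictions are made up for illustration.

```python
def classification_metrics(y_true, y_pred, positive="spam"):
    """Return accuracy, precision, recall and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    # Precision: of everything flagged positive, how much was right?
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of everything truly positive, how much was found?
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall.
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical held-out test labels and model predictions.
y_true = ["spam", "spam", "spam", "ham", "ham", "ham", "ham", "ham"]
y_pred = ["spam", "spam", "ham", "spam", "ham", "ham", "ham", "ham"]

metrics = classification_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Accuracy alone can mislead when classes are imbalanced, as is typical for spam detection, which is why precision, recall and F1 are reported alongside it.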