Presented By:
Lipika Sharma
Interaction
Hello
How are you?
I am great; thanks for asking.
How was your day?
Chatbots
Do you remember the last chatbot you interacted with?
https://www.pandorabots.com/mitsuku/
chatterbot-corpus/chatterbot_corpus/data/english at master · gunthercox/chatterbot-corpus ·
GitHub
What is NLP?
Natural language processing (NLP) sits at the intersection of AI, computer science,
and linguistics. NLP aims to make computers as capable as human beings at
understanding natural language in forms such as text and speech.
It comprises two major functions: human-to-machine translation (understanding
what people say or write) and machine-to-human translation (generating language
that people can understand).
Applications of NLP
•Email filters. Email filters are one of the most basic and
initial applications of NLP online. ...
•Smart assistants. ...
•Search results. ...
•Predictive text. ...
•Language translation. ...
•Digital phone calls. ...
•Data analysis. ...
•Text analytics.
Steps towards NLP
Data Preprocessing
• Tokenization
• Stop Words Removal
• Stemming
• Lemmatization
Modelling Techniques
• Bag of Words
• TF-IDF
• Word Embeddings
• Sentiment Analysis
Tool Used - Python
Python is a high-level, interpreted, general-purpose
programming language.
Its design philosophy emphasizes code readability with the use
of significant indentation.
Python Libraries
• NumPy
• Pandas
• Matplotlib
• Seaborn
• NLTK
Art to read the data
Data preprocessing is a data mining
technique used to transform raw data
into a useful and efficient format.
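As a rough illustration of what turning raw text into a more useful format can mean, here is a minimal sketch using only the Python standard library. The function name preprocess and the particular cleaning rules (lowercasing, dropping punctuation, collapsing whitespace) are assumptions chosen for illustration, not steps taken from the deck's demo.

```python
import re


def preprocess(text: str) -> str:
    """Lowercase the text, drop punctuation, and collapse repeated whitespace."""
    text = text.lower()
    # Replace anything that is not a lowercase letter, digit, or whitespace.
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    # Collapse runs of whitespace into single spaces and trim the ends.
    return re.sub(r"\s+", " ", text).strip()


print(preprocess("  Hello!!  How ARE you?  "))
```

Real pipelines often keep some punctuation or casing, depending on the task, so these rules are only one possible choice.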
Demo -
Tokenization –
Tokenization is the process of splitting raw text into smaller
units called tokens, such as sentences, words, or subwords.
Tokens are the basic units that every later NLP step works on.
(This is distinct from data-security tokenization, which replaces
sensitive values such as PANs with surrogate tokens.)
•Sentence tokenization splits a paragraph into sentences.
•Word tokenization splits a sentence into words and punctuation marks.
•The resulting tokens feed later steps such as stop-word removal,
stemming, and lemmatization.
•In NLTK, sent_tokenize and word_tokenize perform these two steps.
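In the NLP sense, tokenization means splitting text into sentences and words. The sketch below approximates that with regular expressions from the standard library. The function names simple_sent_tokenize and simple_word_tokenize are hypothetical stand-ins, and the splitting rules are deliberately naive (for example, they would break on abbreviations such as "Dr."); a library tokenizer would handle many more cases.

```python
import re
from typing import List


def simple_sent_tokenize(text: str) -> List[str]:
    """Split text into sentences at whitespace that follows '.', '!' or '?'."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def simple_word_tokenize(text: str) -> List[str]:
    """Return words and punctuation marks as separate tokens."""
    return re.findall(r"\w+|[^\w\s]", text)


text = "I am great; thanks for asking. How was your day?"
for sentence in simple_sent_tokenize(text):
    print(simple_word_tokenize(sentence))
```

The sample text reuses the greeting from the Interaction slide, so each sentence becomes its own list of word and punctuation tokens.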
Demo
Stemming –
Stemming is the process of reducing a word to its stem by
stripping affixes (suffixes and prefixes). The stem need not be
a valid word, unlike the dictionary form known as a lemma.
Advantages of Stemming
• Stemming is a useful "normalization" technique for words.
• Stemming is used in information retrieval systems such as search engines.
• It is used to determine domain vocabularies in domain analysis.
• Stemming is fast because it simply chops off word endings.
Fun Fact -
• Google Search adopted word stemming in 2003.
Previously, a search for “fish” would not have returned
“fishing” or “fishes”.
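To make the "chopping" idea concrete, here is a minimal sketch of a suffix-stripping stemmer. The suffix list and the minimum stem length of three characters are assumptions made for illustration; this is not the Porter algorithm or any library's stemmer, which apply far more careful rules.

```python
# Illustrative suffixes only, checked in this order.
SUFFIXES = ("ing", "es", "ed", "s")


def naive_stem(word: str) -> str:
    """Strip the first matching suffix, keeping at least three characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


for w in ["fish", "fishing", "fishes", "caring"]:
    print(w, "->", naive_stem(w))
```

With these rules "fishing" and "fishes" both collapse to "fish", which is what lets a search engine match them, while "caring" becomes "car", showing how crude chopping can change the meaning.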
Demo
Lemmatization –
Lemmatization is a text normalization technique used
in Natural Language Processing (NLP). Essentially,
lemmatization reduces any form of a word to its base
or dictionary form, called the lemma.
Difference
Stemming removes the last few characters of a word,
often producing incorrect meanings and spellings.
Lemmatization considers the context and converts the
word to its meaningful base form, called the lemma.
Stemming vs Lemmatization
Stemming
• Stemming removes the last few characters
of a word, often producing incorrect
meanings and spellings.
• For instance, a naive suffix-stripping
stemmer would reduce 'Caring' to 'Car'.
• Stemming suits large datasets where
performance is an issue.
• It is faster to process.
Lemmatization
• Lemmatization considers the
context and converts the word to
its meaningful base form, called
the lemma.
• For instance, lemmatizing the word
'Caring' would return 'Care'.
• Lemmatization is computationally
expensive because it relies on look-up
tables and morphological analysis.
• It is slower.
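The look-up idea behind lemmatization can be sketched with a plain dictionary. The table LEMMA_TABLE below is a tiny hypothetical sample written for this example; real lemmatizers rely on full vocabularies and usually need the word's part of speech, which this sketch ignores.

```python
# Hypothetical look-up table mapping inflected forms to their lemma.
LEMMA_TABLE = {
    "caring": "care",
    "cared": "care",
    "mice": "mouse",
    "better": "good",
    "was": "be",
}


def lookup_lemma(word: str) -> str:
    """Return the lemma from the table, or the lowercased word if unknown."""
    key = word.lower()
    return LEMMA_TABLE.get(key, key)


for w in ["Caring", "mice", "better", "kids"]:
    print(w, "->", lookup_lemma(w))
```

Unlike suffix chopping, the table returns a real word ("care" rather than "car"), at the cost of needing an entry for every form it should recognise; unknown words such as "kids" simply pass through unchanged here.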
Demo
Stop Words –
Stop words are a set of commonly used words in a language.
Examples of stop words in English are “a”, “the”, “is”, and “are”.
Stop-word lists are commonly used in text mining and Natural
Language Processing (NLP) to filter out words that are so
common that they carry very little useful information.
Stop Words Example
Sample text with stop words | Sample text without stop words
Aarush Coaching Classes – A stem learning place for kids | Aarush Coaching Classes, Stem, Learning, Place, kids
Can Listening be exhausting? | Listening, Exhausting
I like Teaching, so I teach | Like, Teaching, Teach
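The stop-word examples can be reproduced with a short filter. This is a minimal sketch: the STOP_WORDS set is a small hand-picked assumption covering just these sentences, whereas a library such as NLTK ships a much longer list.

```python
import re
from typing import List

# Small illustrative stop-word list (an assumption, not a standard list).
STOP_WORDS = {"a", "an", "the", "is", "are", "be", "can", "for", "i", "so"}


def remove_stop_words(text: str) -> List[str]:
    """Tokenize on letters and drop tokens found in STOP_WORDS."""
    tokens = re.findall(r"[A-Za-z]+", text)
    return [t for t in tokens if t.lower() not in STOP_WORDS]


print(remove_stop_words("Can Listening be exhausting ?"))
print(remove_stop_words("I like Teaching, so I teach"))
```

The comparison is done on lowercased tokens so that "Can" and "I" are removed, while the surviving words keep their original casing.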
Demo
Modelling Techniques in NLP
Bag of Words
TF-IDF
Word Embeddings
Sentiment Analysis
Bag of Words
A bag-of-words is a representation of text that
describes the occurrence of words within a
document. It involves two things:
• A vocabulary of known words.
• A measure of the presence of known words.
Bag of Words - Example
The bag-of-words model is an
orderless document representation:
only the counts of words matter. For
instance, in the example "John
likes to watch movies. Mary likes
movies too", the bag-of-words
representation will not reveal that the
verb "likes" always follows a person's
name in this text.
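The two ingredients of a bag-of-words, a vocabulary and a count per word, can be sketched with collections.Counter. The function name bag_of_words and the lowercase, letters-only tokenization are assumptions made for this example.

```python
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Count how often each lowercased word occurs, ignoring word order."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


doc = "John likes to watch movies. Mary likes movies too"
bag = bag_of_words(doc)

vocabulary = sorted(bag)                 # the known words
vector = [bag[w] for w in vocabulary]    # one count per vocabulary word

print(vocabulary)
print(vector)
```

The resulting vector records that "likes" and "movies" each occur twice, but nothing about where in the text they occur, which is exactly the loss of order the slide describes.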
TF-IDF
TF-IDF, short for term frequency–inverse
document frequency, is a numerical statistic
intended to reflect how important a word is to
a document in a collection or corpus.
TF-IDF Explanation
• TF-IDF is the product of two values, TF and IDF.
• TF is the number of times a term occurs in a document
divided by the total number of terms in that document.
• IDF is obtained by dividing the total number of
documents by the number of documents containing the
term, and then taking the logarithm of that quotient.
Formula
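Written out from the definitions in the explanation, the formula would read as follows. The symbols are assumptions introduced here: f_{t,d} is the number of times term t occurs in document d, N is the total number of documents, and n_t is the number of documents containing t.

```latex
\mathrm{tf}(t, d) = \frac{f_{t,d}}{\sum_{t'} f_{t',d}}, \qquad
\mathrm{idf}(t) = \log \frac{N}{n_t}, \qquad
\text{tf-idf}(t, d) = \mathrm{tf}(t, d) \cdot \mathrm{idf}(t)
```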
Steps
That's it 😃! The text is now ready to feed into a machine learning
algorithm.
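The TF-IDF computation described in the explanation can be sketched in a few lines of plain Python. The function name tf_idf, the whitespace tokenization, and the use of the natural logarithm are assumptions; libraries often add smoothing or normalization that this sketch leaves out, and it expects every document to contain at least one word.

```python
import math
from typing import Dict, List


def tf_idf(docs: List[str]) -> List[Dict[str, float]]:
    """Return one {term: tf-idf score} mapping per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term appear?
    doc_freq: Dict[str, int] = {}
    for tokens in tokenized:
        for term in set(tokens):
            doc_freq[term] = doc_freq.get(term, 0) + 1

    scores = []
    for tokens in tokenized:
        row = {}
        for term in set(tokens):
            tf = tokens.count(term) / len(tokens)       # term frequency
            idf = math.log(n_docs / doc_freq[term])      # inverse document frequency
            row[term] = tf * idf
        scores.append(row)
    return scores


scores = tf_idf(["the cat sat", "the dog barked"])
print(scores[0])
```

A word that appears in every document, such as "the" here, gets an IDF of log(1) = 0 and therefore a score of zero, while a word unique to one document scores higher.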
Word Embeddings
A word embedding is a learned representation for text
in which words with similar meanings have similar
representations.
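What "similar representation" means can be illustrated with cosine similarity between word vectors. The three-dimensional vectors in EMBEDDINGS below are hand-made, hypothetical values chosen for this example; real embeddings are learned from a corpus and typically have tens to hundreds of dimensions.

```python
import math
from typing import List

# Hypothetical toy vectors, not learned from data.
EMBEDDINGS = {
    "king": [0.9, 0.8, 0.1],
    "queen": [0.9, 0.7, 0.2],
    "apple": [0.1, 0.2, 0.9],
}


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


print(cosine_similarity(EMBEDDINGS["king"], EMBEDDINGS["queen"]))
print(cosine_similarity(EMBEDDINGS["king"], EMBEDDINGS["apple"]))
```

In this toy setup "king" and "queen" point in nearly the same direction, so their similarity is much higher than that of "king" and "apple".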
Word Embeddings Types
• Word2vec
• GloVe
• fastText
Sentiment Analysis
Sentiment analysis, also referred to as opinion mining, is an approach to
natural language processing (NLP) that identifies the emotional tone
behind a body of text.
“I really like the new design of your website!” → Positive
“The new design is awful!” → Negative
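The two labelled sentences can be reproduced with a very small lexicon-based classifier. This is a deliberately simple substitute technique, not the machine learning approach the deck appears to demo: the POSITIVE and NEGATIVE word sets are hypothetical, and the score is just the difference between matches in each set.

```python
import re

# Hypothetical mini-lexicons chosen for this example.
POSITIVE = {"like", "love", "great", "good"}
NEGATIVE = {"awful", "bad", "hate", "terrible"}


def classify_sentiment(text: str) -> str:
    """Label text by counting positive and negative lexicon words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "Positive"
    if score < 0:
        return "Negative"
    return "Neutral"


print(classify_sentiment("I really like the new design of your website!"))
print(classify_sentiment("The new design is awful!"))
```

A word-counting approach like this misses negation and sarcasm ("not good" would still score as positive), which is one reason trained models are generally preferred for real data.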
Machine Learning Algorithm
Reference: https://www.kaggle.com/datasets/lakshmi25npathi/imdb-dataset-of-50k-movie-reviews
Sentiment Analysis - Demo
Components of NLP
Difference
Phases of NLP
Advantages of NLP
• Less costly than employing human staff
• Provides quicker customer service response times
• Easy to implement
Adieu in NLP Style
Connect with me:
https://github.com/lipika-tech
https://www.youtube.com/c/aarushcoachingclasses

NLP Techniques for Text Analysis
