Presented By:
Lipika Sharma
Interaction
Hello
How are you?
I am great; thanks for asking.
How was your day?
Chatbots
Do you remember the last chatbot you interacted with?
https://www.pandorabots.com/mitsuku/
chatterbot-corpus/chatterbot_corpus/data/english at master · gunthercox/chatterbot-corpus ·
GitHub
What is NLP?
Natural language processing (NLP) sits at the intersection of AI, computer science,
and linguistics. NLP is about enabling computers to understand natural human
language, such as text and speech, as well as people do. It comprises two major
functions: human-to-machine translation and machine-to-human translation.
Applications of NLP
•Email filters. Email filters are one of the most basic and
initial applications of NLP online. ...
•Smart assistants. ...
•Search results. ...
•Predictive text. ...
•Language translation. ...
•Digital phone calls. ...
•Data analysis. ...
•Text analytics.
Steps towards NLP
Data Preprocessing
• Tokenization
• Stop Words Removal
• Stemming
• Lemmatization
Modelling Techniques
• Bag of Words
• TF-IDF
• Word Embeddings
• Sentiment Analysis
Tool Used - Python
Python is a high-level, interpreted, general-purpose
programming language.
Its design philosophy emphasizes code readability with the use
of significant indentation.
Python Libraries
• NumPy
• Pandas
• Matplotlib
• Seaborn
• NLTK
Art to read the data
Data preprocessing is a data mining
technique used to transform
raw data into a useful and efficient
format.
Demo -
Tokenization –
Tokenization is the process of splitting text into smaller units called
tokens, such as sentences, words, or subwords. It is usually the first
preprocessing step, because the later steps operate on tokens.
• Sentence tokenization splits a text into sentences.
• Word tokenization splits a sentence into words and punctuation marks.
Note: in data security, "tokenization" means something different: replacing
sensitive data elements such as PANs or Personally Identifiable Information
with surrogate values (tokens) of the same length and format. Those tokens
are deterministic, so a tokenized database can be searched by tokenizing the
query terms. That sense is not the one used in NLP.
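In the NLP sense, tokenization splits text into sentences and words. The following is a minimal sketch using only regular expressions; the function names and the splitting rules are assumptions made for illustration, and NLTK (listed among the deck's libraries) provides more robust tokenizers such as `sent_tokenize` and `word_tokenize`.

```python
import re

def sentence_tokenize(text):
    """Split text into sentences at '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

def word_tokenize(text):
    """Split text into word tokens and single punctuation tokens."""
    return re.findall(r"\w+|[^\w\s]", text)

text = "Hello. How are you? I am great; thanks for asking."
for sentence in sentence_tokenize(text):
    print(sentence, "->", word_tokenize(sentence))
```

A rule this simple will mis-split abbreviations such as "Dr. Smith", which is one reason to prefer a library tokenizer in practice.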
Demo
Stemming –
Stemming is the process of reducing a word to its word
stem by stripping affixes (suffixes and prefixes). The stem
need not be a valid dictionary word, unlike a lemma.
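As a rough illustration of suffix stripping, here is a minimal sketch of a naive stemmer. The suffix list and the `min_stem_len` parameter are hypothetical choices for this example; real stemmers such as NLTK's `PorterStemmer` apply many more rules.

```python
# Hypothetical, illustrative suffix list; checked in this order.
SUFFIXES = ("ing", "es", "ed", "s")

def naive_stem(word, min_stem_len=3):
    """Strip the first matching suffix, keeping at least min_stem_len characters."""
    lowered = word.lower()
    for suffix in SUFFIXES:
        if lowered.endswith(suffix) and len(lowered) - len(suffix) >= min_stem_len:
            return lowered[: -len(suffix)]
    return lowered

for w in ["fishing", "fishes", "fish", "caring"]:
    print(w, "->", naive_stem(w))
```

With this sketch "fishing", "fishes", and "fish" all map to "fish", while "caring" is chopped to "car", which shows both the benefit and the risk of stemming.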
Advantages of Stemming
• Stemming is a useful "normalization" technique for words.
• Stemming is used in information retrieval systems like search engines.
• It is used to determine domain vocabularies in domain analysis.
• Stemming is fast because it simply chops off word endings.
Fun Fact -
• Google Search adopted word stemming in 2003.
Previously a search for “fish” would not have returned
“fishing” or “fishes”.
Demo
Lemmatization –
Lemmatization is a text normalization technique used
in Natural Language Processing (NLP). Essentially,
lemmatization reduces any inflected form of
a word to its base (dictionary) form, the lemma.
Difference
Stemming removes the last few
characters from a word, often leading to incorrect
meanings and spellings.
Lemmatization considers the context and converts the
word to its meaningful base form, which is called the lemma.
Stemming vs Lemmatization
Stemming
• Stemming removes the last few
characters from a word, often leading to
incorrect meanings and spellings.
• For instance, a naive stemmer that
strips ‘ing’ would reduce ‘Caring’ to ‘Car’.
• Stemming is used with large
datasets where performance is an
issue.
• It is faster to process.
Lemmatization
• Lemmatization considers the
context and converts the word to
its meaningful base form, which is
called the lemma.
• For instance, lemmatizing the word
‘Caring’ would return ‘Care’.
• Lemmatization is computationally
expensive since it involves look-up
tables and morphological analysis.
• It is slower.
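The contrast can be sketched side by side. Both functions below are assumptions for illustration: `strip_ing` is a deliberately crude stemmer, and `LEMMA_TABLE` is a tiny hypothetical look-up table standing in for the dictionary a real lemmatizer (for example NLTK's `WordNetLemmatizer`) would consult.

```python
# Hypothetical look-up table mapping inflected forms to lemmas.
LEMMA_TABLE = {
    "caring": "care",
    "cares": "care",
    "cared": "care",
    "mice": "mouse",
}

def strip_ing(word):
    """Crude stemmer: drop a trailing 'ing' from words longer than five letters."""
    w = word.lower()
    return w[:-3] if w.endswith("ing") and len(w) > 5 else w

def lookup_lemma(word):
    """Return the lemma from the table, or the lowercased word if it is not listed."""
    w = word.lower()
    return LEMMA_TABLE.get(w, w)

for w in ["Caring", "mice"]:
    print(w, "| stem:", strip_ing(w), "| lemma:", lookup_lemma(w))
```

The stemmer is a single string operation, while the lemmatizer depends on the size and quality of its table, which reflects the speed versus accuracy trade-off described above.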
Demo
Stop Words–
Stop words are a set of commonly used words in a language.
Examples of stop words in English are “a”, “the”, “is”, and “are”.
Stop-word lists are commonly used in Text Mining and Natural
Language Processing (NLP) to filter out words that are so
common that they carry very little useful information.
Stop Words Example
Sample Text with Stop Words | Sample Text without Stop Words
Aarush Coaching Classes – A stem learning place for kids | Aarush Coaching Classes, Stem, Learning, Place, kids
Can Listening be exhausting ? | Listening, Exhausting
I like Teaching, so I teach | Like, Teaching, Teach
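Stop-word removal can be sketched as a simple filter over tokens. The stop-word list below is a small hypothetical one chosen to cover the sample sentences; NLTK ships a fuller English list in `nltk.corpus.stopwords`.

```python
import re

# Hypothetical, deliberately small stop-word list for illustration only.
STOP_WORDS = {"a", "an", "the", "is", "are", "be", "can", "for", "i", "so"}

def remove_stop_words(text):
    """Return the alphabetic tokens of text that are not in STOP_WORDS."""
    tokens = re.findall(r"[A-Za-z]+", text)
    return [t for t in tokens if t.lower() not in STOP_WORDS]

samples = [
    "Aarush Coaching Classes – A stem learning place for kids",
    "Can Listening be exhausting ?",
    "I like Teaching, so I teach",
]
for s in samples:
    print(remove_stop_words(s))
```

The comparison is done on lowercased tokens so that "A", "Can", and "I" are filtered, while the original casing of the remaining words is kept.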
Demo
Modelling Techniques in NLP
Bag of Words
TF-IDF
Word Embeddings
Sentiment Analysis
Bag of Words
A bag-of-words is a representation of text that
describes the occurrence of words within a
document. It involves two things:
• A vocabulary of known words.
• A measure of the presence of known words.
Bag of Words - Example
The bag-of-words model is an
orderless document representation:
only the counts of words matter. For
instance, in the example "John
likes to watch movies. Mary likes
movies too", the bag-of-words
representation will not reveal that the
verb "likes" always follows a person's
name in this text.
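A minimal sketch of a bag-of-words for the example sentence, using `collections.Counter`. The function name and the lowercasing step are assumptions for illustration; the second call shows that scrambling the word order produces the same bag.

```python
import re
from collections import Counter

def bag_of_words(text):
    """Count lowercased word occurrences; word order is discarded."""
    return Counter(re.findall(r"[a-z']+", text.lower()))

text = "John likes to watch movies. Mary likes movies too"
bag = bag_of_words(text)

# Vocabulary of known words, and the count (presence measure) for each one.
vocabulary = sorted(bag)
vector = [bag[word] for word in vocabulary]
print(vocabulary)
print(vector)

# The same words in a different order give an identical bag.
scrambled = "movies likes John watch to. too movies likes Mary"
print(bag_of_words(scrambled) == bag)
```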
TF-IDF
TF-IDF, short for term frequency–inverse
document frequency, is a numerical statistic that
is intended to reflect how important a word is to
a document in a collection or corpus.
TF-IDF Explanation
• TF-IDF is the product of two values, TF and IDF.
• TF is the number of times a term occurs in a document divided by the total number of
terms in that document.
• IDF is obtained by dividing the total number of
documents by the number of documents containing the
term and then taking the logarithm of that quotient.
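The three bullets translate almost directly into code. This is a sketch under stated assumptions: the natural logarithm is used because the slide does not name a base, the toy corpus is made up for illustration, and a term found in no document is given an IDF of 0.0 to avoid dividing by zero.

```python
import math

def tf(term, doc_tokens):
    """Term frequency: occurrences of term divided by total terms in the document."""
    return doc_tokens.count(term) / len(doc_tokens)

def idf(term, corpus):
    """Inverse document frequency: log(total documents / documents containing term)."""
    containing = sum(1 for doc in corpus if term in doc)
    if containing == 0:
        return 0.0  # assumption: unseen terms get zero weight
    return math.log(len(corpus) / containing)

def tf_idf(term, doc_tokens, corpus):
    """TF-IDF is the product of TF and IDF."""
    return tf(term, doc_tokens) * idf(term, corpus)

# Toy, pre-tokenized corpus (hypothetical).
corpus = [
    ["john", "likes", "to", "watch", "movies"],
    ["mary", "likes", "movies", "too"],
    ["mary", "also", "likes", "football"],
]
for term in ["likes", "john"]:
    print(term, round(tf_idf(term, corpus[0], corpus), 4))
```

A word such as "likes" that appears in every document gets a score of zero, while "john", which appears in only one document, gets a positive score.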
Formula
Steps
That's it 😃! The text is now ready to feed into a machine learning
algorithm.
Word Embeddings
A word embedding is a learned representation for text
where words that have similar meanings have similar
representations.
Word Embeddings Types
• Word2vec
• GloVe
• fastText
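"Similar representation" is commonly measured with cosine similarity between word vectors. The sketch below uses hand-made three-dimensional toy vectors, which are purely an assumption for illustration; real embeddings from Word2vec, GloVe, or fastText are learned from data and have far more dimensions.

```python
import math

# Hand-made toy vectors (hypothetical); real embeddings are learned, not written by hand.
TOY_EMBEDDINGS = {
    "good": [0.9, 0.1, 0.0],
    "great": [0.8, 0.2, 0.1],
    "awful": [-0.9, 0.1, 0.0],
}

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors: dot(u, v) / (|u| * |v|)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

good, great, awful = (TOY_EMBEDDINGS[w] for w in ("good", "great", "awful"))
print("good vs great:", round(cosine_similarity(good, great), 3))
print("good vs awful:", round(cosine_similarity(good, awful), 3))
```

In this toy setup the vectors for "good" and "great" point in nearly the same direction, while "good" and "awful" point in nearly opposite directions.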
Sentiment Analysis
Sentiment analysis, also referred to as opinion mining, is an approach to
natural language processing (NLP) that identifies the emotional tone
behind a body of text.
“I really like the new design of your website!” → Positive
“The new design is awful!” → Negative
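The two examples can be reproduced with a very simple lexicon-based scorer. Note that this is a swapped-in technique for illustration, not the machine learning approach the deck demonstrates; the word lists are hypothetical and far too small for real use.

```python
import re

# Hypothetical mini-lexicons for illustration only.
POSITIVE = {"like", "love", "great", "good"}
NEGATIVE = {"awful", "bad", "hate", "terrible"}

def lexicon_sentiment(text):
    """Label text by counting positive words minus negative words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "Positive"
    if score < 0:
        return "Negative"
    return "Neutral"

for sentence in [
    "I really like the new design of your website!",
    "The new design is awful!",
]:
    print(sentence, "->", lexicon_sentiment(sentence))
```

A word-counting approach like this ignores negation ("not good") and context, which is why trained classifiers are usually preferred.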
Sentiment Analysis - Demo
Machine Learning Algorithm
Reference: https://www.kaggle.com/datasets/lakshmi25npathi/imdb-dataset-of-50k-movie-reviews
Components of NLP
Difference
Phases of NLP
Advantages of NLP
• Less costly than employing human staff
• Provides quicker customer service response times
• Easy to implement
Adieu in NLP Style
Connect with me:
https://github.com/lipika-tech
https://www.youtube.com/c/aarushcoachingclasses
