NATURAL LANGUAGE PROCESSING
-Jitendra Kumar Yadav
DAV Public School, Gumla
NLP
 It is the sub-field of AI that is focused on enabling
computers to understand and process human
languages.
 It is a subfield of Linguistics, Computer Science,
Information Engineering, and Artificial Intelligence.
 It is concerned with the interactions between
computers and human (natural) languages, in
particular how to program computers to process
and analyse large amounts of natural language
data.
APPLICATIONS OF NATURAL
LANGUAGE PROCESSING
 Automatic Summarization
 Sentiment Analysis
 Text classification
 Virtual Assistants
AUTOMATIC SUMMARIZATION
 It is the process of shortening a set of data
computationally, to create a summary that
represents the most relevant information within the
original content.
 It comes out as the solution to information
overload.
 It can also help in understanding the emotional
meanings within the information.
SENTIMENT ANALYSIS
 It is about identifying sentiment among several posts
or even in the same post where emotion is not always
explicitly expressed.
 Companies use NLP applications, such as sentiment
analysis, to identify opinions and sentiment online to
help them understand what customers think about
their products and services.
 Ex- A post says “I love the new iPhone” and, a few
lines later, “But sometimes it doesn’t work well”. The
person is still talking about the iPhone, so both remarks
count towards the overall indicators of its reputation.
TEXT CLASSIFICATION
 Text classification makes it possible to assign
predefined categories to a document and organize
it, which helps in finding the information needed.
 For example, an application of text categorization
is spam filtering in email.
VIRTUAL ASSISTANTS
 An application program that understands natural
language voice commands and completes tasks
for the user.
 Benefits of AI Assistants:
Improved customer support
Ease of key data collection
Personalized user experience
 Examples:
Chatbots, Voice Assistants, AI Avatars, Domain
Specific Virtual Assistants, etc.
LET’S TALK ABOUT A SCENARIO
THE WORLD IS COMPETITIVE
NOWADAYS
 Everybody wishes to give their best, even in the
tiniest task.
 And, when people are unable to meet these
expectations, they get stressed or depressed.
 People often get depressed due to reasons
like peer pressure, studies, family issues,
relationships, etc.
IS THERE ANY THERAPY FOR THIS?
CBT
 Cognitive Behavioural Therapy (CBT) is
considered to be one of the best methods to
address stress, as it is easy to practise with
people and also gives good results.
 It involves understanding the behaviour and
mindset of a person in their normal life, and
helping them overcome their stress and live a
happy life.
How will an NLP project on “CBT” be
developed?
To understand this, let’s go through the AI
Project Cycle.
PROBLEM SCOPING
 Most therapists treat patients for depression
using the CBT technique.
 But people do not willingly seek the help of a
psychiatrist.
 They try to avoid such interactions as much
as possible.
 Thus, there is a need to bridge the gap
between a person who needs help and the
psychiatrist.
PROBLEM SCOPING… (4Ws CANVAS)
 Who Canvas – Who has the problem?
People suffering from stress/depression.
 What Canvas – What is the nature of the
problem?
People who need help are reluctant to consult a psychiatrist
and hence live miserably.
 Where Canvas – Where does the problem
arise?
When they are going through a stressful period of time.
 Why Canvas – Why do you think it is a problem
worth solving?
People get a platform where they can talk and vent out their
feelings anonymously.
DATA ACQUISITION
 To understand the sentiments of people, we
need to collect their conversational data so the
machine can interpret the words that they use
and understand their meaning.
 Such data can be collected through various means:
1. Surveys
2. Observing the therapist’s sessions
3. Databases available on the internet
4. Interviews, etc.
DATA EXPLORATION
 The textual data collected needs to be
processed and cleaned so that a simpler
version can be sent to the machine.
 The text is normalised through various steps
and is reduced to a minimum vocabulary, since
the machine does not require grammatically
correct statements but only their essence.
MODELLING
 Once the text has been normalised, it is then
fed to an NLP-based AI model.
 In NLP, the data must be pre-processed first;
only after that is it fed to the machine.
 Depending upon the type of chatbot to be
made, an appropriate AI model is used to
develop the foundation of the project.
EVALUATION
 The reliability of the AI model is judged by
feeding the test dataset into the model and
comparing its outputs with the actual answers.
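The comparison described above can be sketched as a simple accuracy score. This is only an illustration: the function name `accuracy` and the sample labels are assumptions, not part of the original slides.

```python
def accuracy(predictions, actual_answers):
    """Fraction of model outputs that match the actual answers."""
    if len(predictions) != len(actual_answers):
        raise ValueError("both lists must have the same length")
    if not actual_answers:
        raise ValueError("the test dataset must not be empty")
    matches = sum(p == a for p, a in zip(predictions, actual_answers))
    return matches / len(actual_answers)

# Hypothetical test dataset: model outputs versus actual answers.
predicted = ["stressed", "calm", "stressed", "calm"]
actual = ["stressed", "calm", "calm", "calm"]
score = accuracy(predicted, actual)
print(score)  # 3 of the 4 outputs match
```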
EVALUATION…
Case-I: If the model’s output does not match the
true function at all, the model is said to be
underfitting and its accuracy is lower.
Case-II: If the model’s performance matches well
with the true function, then the model has optimum
accuracy and it is called a perfect fit.
Case-III: If the model’s performance tries to cover
all the data samples, even those that are out of
alignment with the true function, then the model is
said to be overfitting and this too has a lower
accuracy.
CHATBOTS
 One of the most common applications of Natural
Language Processing is a chatbot.
 A chatbot is an AI software that can simulate a
real human conversation, giving real-time responses
to users based on reinforcement learning.
 AI Chatbots either use text messages, voice
commands, or both.
CHATBOTS…
 Ex-
• Mitsuku Bot
https://www.pandorabots.com/mitsuku/
• CleverBot
https://www.cleverbot.com/
• Jabberwacky
http://www.jabberwacky.com/
• Haptik
https://haptik.ai/contact-us
• Rose
http://ec2-54-215-197-164.us-west-1.compute.amazonaws.com/speech.php
• Ochatbot
https://www.ometrics.com/blog/list-of-fun-chatbots/
CHATBOTS…
 There are 2 types of chatbots:
1. Ex- bots deployed in the customer care section of various
companies
2. Ex- Google Assistant, Alexa, Cortana, Siri, etc.
HUMAN LANGUAGE VS
COMPUTER LANGUAGE
 The human brain continuously processes everything
it receives from its surroundings, makes sense of
it and stores it in some place.
 When someone whispers, the focus of our brain
automatically shifts (giving more priority) to that
speech and starts processing it.
 The computer, on the other hand, understands
only the language of numbers.
 Everything that is sent to the machine has to be
converted to numbers.
DIFFICULTIES DURING PROCESSING
NATURAL LANGUAGE BY A MACHINE
 There are structures/characteristics in the
human language that might be easy for a
human to understand but extremely difficult
for a computer to understand.
 Different syntax, same semantics:
2+3 = 3+2
 Different semantics, same syntax:
2/3 (Python 2.7) ≠ 2/3 (Python 3)
Arrangement of the words and meaning
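The two cases above can be tried out in Python 3. This is a small sketch, not part of the original slides; since Python 2.7 cannot be run side by side here, the floor-division operator `//` is used to reproduce what `2/3` gave for integers in Python 2.7.

```python
# Different syntax, same semantics: both expressions evaluate to 5.
same_meaning = (2 + 3) == (3 + 2)

# Same syntax, different semantics: in Python 3, 2/3 is true division.
true_division = 2 / 3

# In Python 2.7, 2/3 on integers was floor division; in Python 3 the
# operator // gives that older result.
floor_division = 2 // 3

print(same_meaning, true_division, floor_division)
```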
DIFFICULTIES DURING PROCESSING
NATURAL LANGUAGE BY A MACHINE…
=> His face turned red after he found out that
he took the wrong bag.
=> His face turns red after consuming the
medicine.
 The same word can carry a different meaning in
each sentence, so the machine needs the context
to understand it.
Multiple Meanings of a word
DIFFICULTIES DURING PROCESSING
NATURAL LANGUAGE BY A MACHINE…
=> Chickens feed extravagantly while the moon
drinks tea.
 This sentence is grammatically correct, but it
does not make any sense.
Perfect Syntax, but no Meaning
We may face these challenges if we try to
teach computers how to understand and
interact in human language.
So, let’s see how NLP does this magic.
DATA PROCESSING
(TEXT NORMALISATION)
 It involves preparing and cleaning text data
for machines to be able to analyze it.
 This process puts data in workable form and
highlights features in the text that an
algorithm can work with.
 There are several ways this can be done,
including:
DATA PROCESSING…
 Sentence Segmentation:
In this process the whole corpus is divided
into sentences. Each sentence is taken as a
separate piece of data, so the whole corpus
gets reduced to a list of sentences.
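A minimal sketch of sentence segmentation using only Python's standard `re` module. The function name `segment_sentences` and the sample corpus are assumptions made for illustration; the rule used (split after `.`, `!` or `?`) is a simplification and would wrongly split abbreviations such as "Dr.".

```python
import re

def segment_sentences(corpus):
    """Split a corpus wherever '.', '!' or '?' is followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", corpus.strip())
    return [part for part in parts if part]

# Hypothetical corpus, invented for this example.
corpus = "People get stressed. They avoid seeking help! Can a chatbot bridge the gap?"
sentences = segment_sentences(corpus)
for sentence in sentences:
    print(sentence)
```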
DATA PROCESSING…
 Tokenisation:
It is the process of breaking down the
sentences into smaller units (tokens) to work
with. A token can be a word, a number or a
special character.
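Tokenisation can be sketched with a single regular expression. The function name `tokenise` is an assumption, and the sample sentence is built from the words this deck lists for its first document; real tokenisers handle many more cases (contractions, URLs, emoji).

```python
import re

def tokenise(sentence):
    """Return words/numbers and punctuation marks as separate tokens."""
    # \w+ matches a run of letters or digits; [^\w\s] matches one
    # punctuation or special character.
    return re.findall(r"\w+|[^\w\s]", sentence)

tokens = tokenise("Aman and Anil are stressed.")
print(tokens)
```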
DATA PROCESSING…
 Removing Stopwords, Special Characters
and Numbers:
In this process, common words, special
characters, etc. (which do not add any essence
to the information) are removed from the text,
so that the unique words that offer the most
information about the text remain.
Some examples of stopwords are:
a, an, are, for, etc.
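A small sketch of this step. The stopword set below contains only the four examples named on the slide (real stopword lists are much longer), and the function name `remove_noise` and the token list are assumptions for illustration.

```python
# Only the stopwords named on the slide; a real list is much longer.
STOPWORDS = {"a", "an", "are", "for"}

def remove_noise(tokens):
    """Drop stopwords, special characters and numbers from a token list."""
    kept = []
    for token in tokens:
        if token.lower() in STOPWORDS:
            continue  # common word that adds little information
        if not token.isalpha():
            continue  # special character or number
        kept.append(token)
    return kept

# Hypothetical tokens, invented for this example.
tokens = ["Aman", "and", "Anil", "are", "stressed", "!", "for", "2", "days"]
cleaned = remove_noise(tokens)
print(cleaned)
```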
DATA PROCESSING…
 Converting text to a common case:
In this process the whole text is converted
into the same case (usually lower case).
This ensures that the machine treats the same
word alike whatever its case, i.e. it becomes
case-insensitive.
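The effect of converting to a common case can be seen in a few lines of Python. The token list is an invented example.

```python
# Four spellings of the same word that differ only in case.
tokens = ["Hello", "HELLO", "hello", "HeLLo"]

# Convert every token to lower case.
lowered = [token.lower() for token in tokens]

# After lowering, all four collapse into one vocabulary entry.
unique_words = set(lowered)
print(unique_words)
```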
DATA PROCESSING…
 Stemming:
Here, the remaining words are reduced to
their root words.
It is the process in which the affixes of words
are removed and the words are converted to
their base form.
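A deliberately naive sketch of stemming by suffix removal. This is not a standard stemming algorithm such as Porter's; the suffix list, the minimum stem length of 3 and the function name `naive_stem` are all assumptions chosen to keep the example short.

```python
# Suffixes are checked in this order; a toy list, not a real rule set.
SUFFIXES = ("ing", "ed", "es", "s")

def naive_stem(word):
    """Strip the first matching suffix, keeping at least 3 letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

words = ["healing", "healed", "studies", "caring"]
stems = [naive_stem(word) for word in words]
print(stems)
```

Note that "studies" becomes "studi" and "caring" becomes "car": affix removal alone does not guarantee a meaningful word.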
DATA PROCESSING…
 Lemmatization:
The process in which a word is converted to its
meaningful root form.
Stemming and lemmatization are alternatives to
each other, as the role of both processes is the
same: removal of affixes. The difference is that
in lemmatization the word we get after affix
removal (known as the lemma) is always a
meaningful one, whereas a stem need not be.
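Lemmatization can be illustrated with a tiny lookup table. This is a toy sketch: the `LEMMAS` dictionary and the function name `lemmatise` are hypothetical, whereas real lemmatizers rely on a full dictionary of the language and usually on the part of speech of each word.

```python
# Hypothetical lookup table mapping word forms to their lemmas.
LEMMAS = {
    "studies": "study",
    "caring": "care",
    "healed": "heal",
    "healing": "heal",
}

def lemmatise(word):
    """Return the lemma if the word is known, else the lowered word."""
    lowered = word.lower()
    return LEMMAS.get(lowered, lowered)

words = ["studies", "caring", "healed", "Stressed"]
lemmas = [lemmatise(word) for word in words]
print(lemmas)
```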
BAG OF WORDS
 A Natural Language Processing model which
helps in extracting features out of the text
which is very helpful in machine learning
algorithms.
 The occurrences of each word are counted
and the vocabulary for the corpus is
constructed.
BAG OF WORDS…
[Figure: a bag of words gives the vocabulary and the frequency of words.]
BAG OF WORDS…
The step-by-step approach to implement the bag of words
algorithm:
1. Text Normalisation: Collect data and pre-process it.
2. Create Dictionary: Make a list of all the unique
words occurring in the corpus (the vocabulary).
3. Create document vectors: For each document
in the corpus, find out how many times each
word from the unique list of words has
occurred.
4. Create document vectors for all the documents.
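The steps above can be sketched as follows, assuming the text has already been normalised into lists of tokens. The first document uses the tokens this deck gives for its first document; the second document and the function names `build_vocabulary` and `document_vectors` are invented for illustration.

```python
from collections import Counter

def build_vocabulary(documents):
    """Step 2: list every unique word, in order of first appearance."""
    vocabulary = []
    for document in documents:
        for token in document:
            if token not in vocabulary:
                vocabulary.append(token)
    return vocabulary

def document_vectors(documents, vocabulary):
    """Steps 3 and 4: count each vocabulary word in every document."""
    vectors = []
    for document in documents:
        counts = Counter(document)  # missing words count as 0
        vectors.append([counts[word] for word in vocabulary])
    return vectors

documents = [
    ["aman", "and", "anil", "are", "stressed"],  # first document of the deck
    ["anil", "is", "very", "very", "stressed"],  # hypothetical document
]
vocabulary = build_vocabulary(documents)
vectors = document_vectors(documents, vocabulary)
print(vocabulary)
for vector in vectors:
    print(vector)
```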
BAG OF WORDS…
Here are three documents having one sentence each. After text
normalisation, the text becomes:
Note that no tokens have been removed in the
stopwords removal step. It is because we have very
little data and since the frequency of all the words is
almost the same, no word can be said to have lesser
value than the other.
BAG OF WORDS…
List down all the words which occur in all three
documents:
BAG OF WORDS…
In this step,
•The vocabulary is written in the top row.
•Now, for each word in the document, if it matches
with the vocabulary, put a 1 under it.
•If the same word appears again, increment the
previous value by 1.
•And if the word does not occur in that document, put
a 0 under it.
BAG OF WORDS…
In the first document, we have the words: aman,
and, anil, are, stressed. So, all these words get a
value of 1 and the rest of the words get a value of 0.
BAG OF WORDS…
This gives us the document vector table for our corpus. But the tokens have still not
been converted to numbers that reflect their value. This leads us to the final step of our algorithm: TFIDF
BAG OF WORDS…
A plot of occurrence of words versus their value
TFIDF
TFIDF stands for Term Frequency and Inverse Document Frequency.
It helps in identifying the value for each word.
Let us understand each term one by one.
Term Frequency:
Term frequency is the frequency of a word in one
document.
It can easily be found from the document vector table.
TFIDF…
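Inverse Document Frequency:
Document frequency is the number of documents in which
the word occurs. IDF is the total number of documents
divided by the document frequency.
$$\mathrm{IDF}(W) = \frac{\text{Total no. of documents}}{\text{Document frequency of } W}$$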
TFIDF…
TFIDF(W) = TF(W) * log( IDF(W) )
TFIDF…
After calculating all the values:
Conclusion:
The value of a word grows with the log of its IDF
value, so the value falls as the document frequency
of that word rises.
TFIDF…
Ex-
Total Number of documents: 10
Number of documents in which ‘and’ occurs: 10
Therefore, IDF(and) = 10/10 = 1
Which means: log(1) = 0.
Hence, the value of ‘and’ becomes 0.
On the other hand,
Number of documents in which ‘pollution’ occurs: 3
IDF(pollution) = 10/3 = 3.3333…
Which means: log(3.3333) ≈ 0.523 (taking log to the base 10);
Which shows that the word ‘pollution’ has considerable
value in the corpus.
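The arithmetic in this example can be checked with a short script. Base-10 logarithm is assumed because it reproduces the figures on the slide, and a term frequency of 1 is assumed for both words since the slide does not state one; the function names are illustrative.

```python
import math

def idf(total_documents, document_frequency):
    """Total number of documents divided by the document frequency."""
    return total_documents / document_frequency

def tfidf(term_frequency, total_documents, document_frequency):
    """TFIDF(W) = TF(W) * log(IDF(W)), using log to the base 10."""
    return term_frequency * math.log10(idf(total_documents, document_frequency))

# Term frequency of 1 is an assumption; the slide gives only the IDF part.
value_and = tfidf(1, 10, 10)        # 'and' occurs in all 10 documents
value_pollution = tfidf(1, 10, 3)   # 'pollution' occurs in 3 documents
print(value_and, round(value_pollution, 3))
```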
Thank You!!!

  • 12. LET’S TALK ABOUT A SCENARIO
  • 13. THE WORLD IS COMPETITIVE NOWADAYS
  • 14. THE WORLD IS COMPETITIVE NOWADAYS…  Everybody wishes to give their best even in the tiniest task.  And when people are unable to meet these expectations, they become stressed or depressed.  People often get depressed due to reasons like peer pressure, studies, family issues, relationships, etc.
  • 15.
  • 16. IS THERE ANY THERAPY FOR THIS?
  • 17. CBT
  • 18. CBT  Cognitive Behavioural Therapy (CBT) is considered to be one of the best methods to address stress, as it is easy to implement and gives good results.  It involves understanding the behaviour and mindset of a person in their normal life and helping them overcome their stress and live a happy life.
  • 19. How will an NLP project on “CBT” be developed? To understand this, let’s go through the AI Project Cycle.
  • 20. PROBLEM SCOPING  Most therapists treat patients with depression using the CBT technique.  But people do not willingly seek the help of a psychiatrist.  They try to avoid such interactions as much as possible.  Thus, there is a need to bridge the gap between a person who needs help and the psychiatrist.
  • 21. PROBLEM SCOPING  Who Canvas – Who has the problem?  People suffering from stress/depression.  What Canvas – What is the nature of the problem?  People who need help are reluctant to consult a psychiatrist and hence live miserably.  Where Canvas – Where does the problem arise?  When they are going through a stressful period of time.  Why Canvas – Why do you think it is a problem worth solving?  People get a platform where they can talk and vent out their feelings anonymously. (4Ws CANVAS)
  • 23. DATA ACQUISITION  To understand the sentiments of people, we need to collect their conversational data so the machine can interpret the words that they use and understand their meaning.  Such data can be collected from various means: 1. Surveys 2. Observing the therapist’s sessions 3. Databases available on the internet 4. Interviews, etc.
  • 24. DATA EXPLORATION  The textual data collected needs to be processed and cleaned so that a simpler version can be sent to the machine.  The text is normalised through various steps and reduced to a minimal vocabulary, since the machine does not require grammatically correct statements but only their essence.
  • 25. MODELLING  Once the text has been normalised, it is fed to an NLP-based AI model.  In NLP, the data is fed to the machine only after it has been pre-processed.  Depending upon the type of chatbot to be made, an appropriate AI model is used to develop the foundation of the project.
  • 26. EVALUATION  The reliability of the AI model is assessed by feeding the test dataset into the model and comparing its outputs with the actual answers.
  • 27. EVALUATION… If the model’s output does not match the true function at all, the model is said to be underfitting and its accuracy is lower. Case-I
  • 28. EVALUATION… If the model’s performance matches well with the true function, then the model has optimum accuracy and it is called a perfect fit. Case-II
  • 29. EVALUATION… If the model tries to cover all the data samples, even those that are out of alignment with the true function, the model is said to be overfitting, and this too gives lower accuracy. Case-III
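The comparison of model outputs with actual answers described under EVALUATION can be sketched as a simple accuracy score. This is a minimal sketch: the function name `accuracy` and the label lists are hypothetical and only stand in for a real test dataset.

```python
def accuracy(predicted, actual):
    """Return the fraction of predictions that equal the actual answers."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("the test dataset must not be empty")
    matches = sum(p == a for p, a in zip(predicted, actual))
    return matches / len(actual)


# Hypothetical model outputs and true labels for four test samples.
predicted = ["stressed", "calm", "stressed", "calm"]
actual = ["stressed", "calm", "calm", "calm"]

score = accuracy(predicted, actual)
print(score)
```

A score on the test dataset alone does not say which of the three cases applies; comparing it with the score on the training data is one common way to tell underfitting from overfitting.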
  • 30. CHATBOTS  One of the most common applications of Natural Language Processing is a chatbot.  A chatbot is AI software that can simulate a real human conversation with real-time responses to users, based on reinforcement learning.  AI chatbots use text messages, voice commands, or both.
  • 31. CHATBOTS…  Ex- • Mitsuku Bot https://www.pandorabots.com/mitsuku/ • CleverBot https://www.cleverbot.com/ • Jabberwacky http://www.jabberwacky.com/ • Haptik https://haptik.ai/contact-us • Rose http://ec2-54-215-197-164.us-west-1.compute.amazonaws.com/speech.php • Ochatbot https://www.ometrics.com/blog/list-of-fun-chatbots/
  • 32. CHATBOTS…  There are 2 types of chatbots: Ex- bots deployed in the customer care section of various companies
  • 33. CHATBOTS… Ex- Google Assistant, Alexa, Cortana, Siri, etc.
  • 34. HUMAN LANGUAGE VS COMPUTER LANGUAGE  The human brain continuously processes everything it perceives around it, makes sense of it and stores it.  When someone whispers, the focus of our brain automatically shifts (giving more priority) to that speech and starts processing it.  The computer, in contrast, understands the language of numbers.  Everything that is sent to the machine has to be converted to numbers.
  • 35. DIFFICULTIES DURING PROCESSING NATURAL LANGUAGE BY A MACHINE  There are structures/characteristics in the human language that might be easy for a human to understand but extremely difficult for a computer to understand.  Different syntax, same semantics: 2+3 = 3+2  Different semantics, same syntax: 2/3 (Python 2.7) ≠ 2/3 (Python 3) Arrangement of the words and meaning
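The two cases on this slide can be checked directly. A minimal sketch, assuming Python 3: `/` is true division there, while `//` (floor division) gives the value that `2/3` evaluated to for integers in Python 2.7.

```python
# Different syntax, same semantics: addition is commutative.
same_sum = (2 + 3) == (3 + 2)

# Different semantics, same syntax: in Python 3, "/" is true division
# and returns a float.
true_div = 2 / 3

# "//" is floor division; for these positive integers it gives the
# value that the expression 2/3 produced in Python 2.7.
floor_div = 2 // 3

print(same_sum, true_div, floor_div)
```

The expression looks identical in both Python versions, yet the result differs, which is the same kind of ambiguity a machine faces with human language.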
  • 36. DIFFICULTIES DURING PROCESSING NATURAL LANGUAGE BY A MACHINE… => His face turned red after he found out that he took the wrong bag. => His face turns red after consuming the medicine.  In both sentences the word “red” can carry more than one meaning, so the machine needs the context to interpret it. Multiple Meanings of a word
  • 37. DIFFICULTIES DURING PROCESSING NATURAL LANGUAGE BY A MACHINE… => Chickens feed extravagantly while the moon drinks tea.  This sentence is grammatically correct but does not convey any sensible meaning. Perfect Syntax, but no Meaning
  • 38. We may face these challenges if we try to teach computers how to understand and interact in human language. So, let’s see how NLP does this magic.
  • 39. DATA PROCESSING (TEXT NORMALISATION)  It involves preparing and cleaning text data for machines to be able to analyze it.  This process puts data in workable form and highlights features in the text that an algorithm can work with.  There are several ways this can be done, including:
  • 40. DATA PROCESSING…  Sentence Segmentation: In this process the whole corpus is divided into sentences. Each sentence is treated as a separate piece of data, so the whole corpus is reduced to a list of sentences.
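Sentence segmentation can be sketched with a regular expression. This is a simplified illustration, not a full segmenter: it assumes a sentence ends at `.`, `!` or `?` followed by whitespace, so abbreviations such as "Dr." would be split wrongly. The function name and sample corpus are hypothetical.

```python
import re


def segment_sentences(corpus):
    """Split text after '.', '!' or '?' when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", corpus.strip())
    return [part for part in parts if part]


# Hypothetical sample corpus.
corpus = "Aman and Anil are stressed. Aman went to a therapist! Did it help?"
sentences = segment_sentences(corpus)
print(sentences)
```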
  • 42. DATA PROCESSING…  Tokenisation: It is the process of breaking down the sentences into smaller units (tokens) to work with.
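A minimal tokenisation sketch, assuming that a token is either a run of word characters or a single punctuation mark. The function name `tokenise` is hypothetical; libraries apply more elaborate rules.

```python
import re


def tokenise(sentence):
    """Return word, number and punctuation tokens in their original order."""
    return re.findall(r"\w+|[^\w\s]", sentence)


tokens = tokenise("Aman and Anil are stressed.")
print(tokens)
```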
  • 44. DATA PROCESSING…  Removing Stopwords, Special Characters and Numbers: In this step, common words, special characters, numbers, etc. (which do not add any essence to the information) are removed from the text, so that the unique words that offer the most information about the text remain. Some examples of stopwords are: a, an, are, for, etc.
  • 45. DATA PROCESSING…  Converting text to a common case: In this process the whole text is converted to the same case (usually lower case). This ensures that the machine treats words case-insensitively.
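The removal step can be sketched as a filter over the tokens. This sketch assumes a tiny stopword set containing only the four examples named on the slide; the function name `remove_noise` and the token list are hypothetical.

```python
# Only the examples given on the slide; real stopword lists are much longer.
STOPWORDS = {"a", "an", "are", "for"}


def remove_noise(tokens):
    """Keep purely alphabetic tokens that are not stopwords."""
    return [
        token
        for token in tokens
        if token.isalpha() and token.lower() not in STOPWORDS
    ]


cleaned = remove_noise(["Aman", "and", "Anil", "are", "stressed", ".", "2"])
print(cleaned)
```

Note that "and" survives here only because it is not in this small illustrative set.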
  • 46. DATA PROCESSING…  Converting text to a common case:
  • 47. DATA PROCESSING…  Stemming: Here, the remaining words are reduced to their root words. It is the process in which the affixes of words are removed and the words are converted to their base form.
  • 49. DATA PROCESSING…  Lemmatization: The process in which a word is converted to its meaningful root form. Stemming and lemmatization are alternatives to each other, as both play the same role: the removal of affixes. The difference between them is that in lemmatization, the word we get after affix removal (also known as the lemma) is a meaningful one.
  • 51. BAG OF WORDS  A Natural Language Processing model that helps in extracting features from the text, which is very helpful in machine learning algorithms.  The occurrences of each word are counted and the vocabulary for the corpus is constructed.
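The difference between the two processes can be illustrated with a toy example. This is only a sketch under stated assumptions: the suffix list and the lookup table are hypothetical, and real stemmers and lemmatizers use far richer rules and dictionaries.

```python
# Hypothetical suffix list, checked in this order.
SUFFIXES = ("ing", "ies", "ed", "es", "s")

# Hypothetical lookup table standing in for a real dictionary of lemmas.
LEMMAS = {"studies": "study", "studying": "study", "healed": "heal"}


def stem(word):
    """Strip the first matching suffix; the result may not be a real word."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


def lemmatise(word):
    """Look the word up; unknown words are returned unchanged."""
    return LEMMAS.get(word, word)


for word in ("studies", "studying", "healed"):
    print(word, stem(word), lemmatise(word))
```

With these toy rules, stemming "studies" leaves a fragment that is not a word, while the lookup returns the meaningful lemma "study".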
  • 53. BAG OF WORDS… The step-by-step approach to implement the bag of words algorithm: 1. Text Normalisation: Collect data and pre-process it. 2. Create Dictionary: Make a list of all the unique words occurring in the corpus (the vocabulary). 3. Create document vectors: For each document in the corpus, count how many times each word from the vocabulary occurs. 4. Repeat this to create document vectors for all the documents.
  • 55. BAG OF WORDS… Here are three documents having one sentence each. After text normalisation, the text becomes: Note that no tokens have been removed in the stopwords removal step. It is because we have very little data and since the frequency of all the words is almost the same, no word can be said to have lesser value than the other.
  • 56. BAG OF WORDS… List down all the words which occur in all three documents:
  • 57. BAG OF WORDS… In this step, •The vocabulary is written in the top row. •Now, for each word in the document, if it matches with the vocabulary, put a 1 under it. •If the same word appears again, increment the previous value by 1. •And if the word does not occur in that document, put a 0 under it.
  • 58. BAG OF WORDS… Since in the first document, we have words: aman, and, anil, are, stressed. So, all these words get a value of 1 and rest of the words get a 0 value.
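The dictionary and document-vector steps can be sketched in a few lines. In this sketch, document 1 uses the words listed above for the first document; documents 2 and 3 are assumptions made up for illustration, as are the function names.

```python
def build_vocabulary(documents):
    """List every unique word in the corpus, in order of first appearance."""
    vocabulary = []
    for document in documents:
        for word in document.split():
            if word not in vocabulary:
                vocabulary.append(word)
    return vocabulary


def document_vector(document, vocabulary):
    """Count how often each vocabulary word occurs in one document."""
    words = document.split()
    return [words.count(term) for term in vocabulary]


documents = [
    "aman and anil are stressed",              # words of the first document
    "aman went to a therapist",                # assumed for illustration
    "anil went to download a health chatbot",  # assumed for illustration
]

vocabulary = build_vocabulary(documents)
vectors = [document_vector(doc, vocabulary) for doc in documents]

print(vocabulary)
for vector in vectors:
    print(vector)
```

Each row printed is one document vector: a 1 (or higher count) under every vocabulary word the document contains and a 0 under the rest.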
  • 59. BAG OF WORDS… This gives us the document vector table for our corpus. But the tokens have still not been converted to numbers. This leads us to the final step of our algorithm: TFIDF
  • 60. BAG OF WORDS… A plot of occurrence of words versus their value
  • 61. TFIDF TFIDF stands for Term Frequency and Inverse Document Frequency. It helps in identifying the value for each word. Let us understand each term one by one. Term Frequency: Term frequency is the frequency of a word in one document. It can easily be found from the document vector table.
  • 63. TFIDF… Inverse Document Frequency: It is the total number of documents divided by the document frequency, where the document frequency of a word is the number of documents in which it occurs.  IDF(W) = (Total no. of documents) / (Document frequency of W)
  • 64. TFIDF… TFIDF(W) = TF(W) * log( IDF(W) )
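The formula can be sketched as two small functions. This sketch assumes a base-10 logarithm, which is consistent with the figure log(3.3333) = 0.522 quoted in this deck; the function names are hypothetical.

```python
import math


def idf(total_documents, document_frequency):
    """Total number of documents divided by the document frequency."""
    return total_documents / document_frequency


def tfidf(term_frequency, total_documents, document_frequency):
    """TFIDF(W) = TF(W) * log10(IDF(W)); base 10 is an assumption."""
    return term_frequency * math.log10(idf(total_documents, document_frequency))


# A word that occurs once and appears in all 10 of 10 documents.
common = tfidf(1, 10, 10)
# A word that occurs once and appears in only 3 of 10 documents.
rare = tfidf(1, 10, 3)

print(common, round(rare, 4))
```

A word present in every document has IDF 1 and therefore a value of 0, while a word confined to a few documents keeps a positive value.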
  • 65. TFIDF… After calculating all the values: Conclusion: The value of a word falls as its document frequency rises; a word that occurs in every document has an IDF of 1 and therefore a value of 0.
  • 66. TFIDF… Ex- Total Number of documents: 10 Number of documents in which ‘and’ occurs: 10 Therefore, IDF(and) = 10/10 = 1 Which means: log(1) = 0. Hence, the value of ‘and’ becomes 0. On the other hand, Number of documents in which ‘pollution’ occurs: 3 IDF(pollution) = 10/3 = 3.3333… Which means: log(3.3333) = 0.522; Which shows that the word ‘pollution’ has considerable value in the corpus. Thank You!!!

Editor's Notes

  1. Artificial Intelligence is becoming an integral part of our lives, and its applications are used by the majority of people every day.