NATURAL LANGUAGE PROCESSING
-Jitendra Kumar Yadav
DAV Public School, Gumla
NLP
 It is the sub-field of AI that is focused on enabling
computers to understand and process human
languages.
 It is a subfield of Linguistics, Computer Science,
Information Engineering, and Artificial Intelligence.
 It is concerned with the interactions between
computers and human (natural) languages, in
particular how to program computers to process
and analyse large amounts of natural language
data.
APPLICATIONS OF NATURAL
LANGUAGE PROCESSING
 Automatic Summarization
 Sentiment Analysis
 Text classification
 Virtual Assistants
AUTOMATIC SUMMARIZATION
• It is the process of shortening a set of data
computationally, to create a summary that
represents the most relevant information within the
original content.
• It comes out as a solution to information
overload.
• It also helps in understanding the emotional
meanings within the information.
SENTIMENT ANALYSIS
• It is about identifying sentiment among several posts,
or even within the same post, where emotion is not
always explicitly expressed.
• Companies use NLP applications, such as sentiment
analysis, to identify opinions and sentiment online. This
helps them understand what customers think about
their products and services, and gives overall
indicators of their reputation.
• Ex- “I love the new iPhone” and, a few lines later, “But
sometimes it doesn’t work well”: the person is still
talking about the iPhone, but the sentiment has changed.
TEXT CLASSIFICATION
• Text classification makes it possible to assign
predefined categories to a document and organize
it, which helps in finding the information needed.
• For example, an application of text categorization
is spam filtering in email.
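A minimal sketch of the spam filtering idea, assuming a hand-written keyword list. The names SPAM_WORDS and classify, the keywords and the threshold are hypothetical and only for illustration; real spam filters learn from labelled emails.

```python
# Hypothetical keyword list; a real filter would learn this from data.
SPAM_WORDS = {"free", "winner", "prize", "lottery"}

def classify(message, threshold=2):
    """Assign one of two predefined categories: 'spam' or 'not spam'."""
    words = [w.strip(".,!?") for w in message.lower().split()]
    hits = sum(1 for w in words if w in SPAM_WORDS)
    return "spam" if hits >= threshold else "not spam"

label_1 = classify("You are a WINNER! Claim your FREE prize now")
label_2 = classify("Meeting moved to 3 pm tomorrow")
```

Here the first message contains three of the assumed spam keywords and the second contains none, so they fall into different categories.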
VIRTUAL ASSISTANTS
 An application program that understands natural
language voice commands and completes tasks
for the user.
 Benefits of AI Assistants:
Improved customer support
Ease of key data collection
Personalized user experience
 Examples:
Chatbots, Voice Assistants, AI Avatars, Domain
Specific Virtual Assistants, etc.
LET’S TALK ABOUT A SCENARIO
THE WORLD IS COMPETITIVE
NOWADAYS
• Everybody wishes to give their best, even in
the tiniest task.
• When people are unable to meet these
expectations, they may become stressed or depressed.
• People often get depressed due to reasons
like peer pressure, studies, family issues,
relationships, etc.
IS THERE ANY THERAPY FOR THIS?
CBT
• Cognitive Behavioural Therapy (CBT) is
considered to be one of the best methods to
address stress, as it is easy to implement on
people and also gives good results.
• It involves understanding the behaviour and
mindset of a person in their normal life and
helping them overcome their stress and live a
happy life.
How will an NLP project on “CBT” be
developed?
To understand this, let’s go through the AI
Project Cycle.
PROBLEM SCOPING
• Most therapists use the CBT technique to
treat patients suffering from depression.
• But people do not willingly seek the help of a
psychiatrist.
• They try to avoid such interactions as much
as possible.
• Thus, there is a need to bridge the gap
between a person who needs help and the
psychiatrist.
PROBLEM SCOPING… (4Ws CANVAS)
• Who Canvas – Who has the problem?
  People suffering from stress/depression.
• What Canvas – What is the nature of the problem?
  People who need help are reluctant to consult a psychiatrist
  and hence live miserably.
• Where Canvas – Where does the problem arise?
  When they are going through a stressful period of time.
• Why Canvas – Why do you think it is a problem worth solving?
  People get a platform where they can talk and vent out their
  feelings anonymously.
DATA ACQUISITION
• To understand the sentiments of people, we
need to collect their conversational data so the
machine can interpret the words that they use
and understand their meaning.
• Such data can be collected by various means:
1. Surveys
2. Observing therapists’ sessions
3. Databases available on the internet
4. Interviews, etc.
DATA EXPLORATION
• The textual data collected needs to be
processed and cleaned so that a simpler
version can be sent to the machine.
• The text is normalised through various steps
and is reduced to a minimum vocabulary, since
the machine does not require grammatically
correct statements but only their essence.
MODELLING
• Once the text has been normalised, it is then
fed to an NLP based AI model.
• In NLP, the data must be pre-processed first;
only after that is it fed to the machine.
• Depending upon the type of chatbot to be
made, an appropriate AI model is used to
develop the foundation of the project.
EVALUATION
• The reliability of the AI model is checked by
feeding the test dataset into the model and
comparing its outputs with the actual answers.
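A minimal sketch of this comparison, assuming accuracy is measured as the fraction of test outputs that match the actual answers. The function name accuracy and the labels are hypothetical.

```python
def accuracy(predicted, actual):
    """Fraction of model outputs that match the actual answers."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty lists of equal length")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)

# Hypothetical test dataset: actual answers vs. model outputs.
actual = ["stressed", "not stressed", "stressed", "not stressed"]
predicted = ["stressed", "stressed", "stressed", "not stressed"]
score = accuracy(predicted, actual)  # 3 of 4 outputs match
```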
EVALUATION…
Case-I: If the model’s output does not match the true
function at all, the model is said to be underfitting
and its accuracy is lower.
EVALUATION…
Case-II: If the model’s performance matches well with
the true function, then the model has optimum accuracy
and it is called a perfect fit.
EVALUATION…
Case-III: If the model tries to cover all the data
samples, even those that are out of alignment with the
true function, then it is said to be overfitting, and
this too gives a lower accuracy.
CHATBOTS
• One of the most common applications of Natural
Language Processing is a chatbot.
• A chatbot is AI software that can simulate a real
human conversation, with real-time responses
to users, based on reinforcement learning.
• AI chatbots use text messages, voice
commands, or both.
CHATBOTS…
 Ex-
• Mitsuku Bot
https://www.pandorabots.com/mitsuku/
• CleverBot
https://www.cleverbot.com/
• Jabberwacky
http://www.jabberwacky.com/
• Haptik
https://haptik.ai/contact-us
• Rose
http://ec2-54-215-197-164.us-west-1.compute.amazonaws.com/speech.php
• Ochatbot
https://www.ometrics.com/blog/list-of-fun-chatbots/
CHATBOTS…
• There are 2 types of chatbots: script bots and smart bots.
• Script bots: Ex- bots deployed in the customer care section
of various companies
CHATBOTS…
• Smart bots: Ex- Google Assistant, Alexa, Cortana, Siri, etc.
HUMAN LANGUAGE VS
COMPUTER LANGUAGE
• The human brain continuously processes everything
it receives from its surroundings, makes sense of it
and stores it.
• When someone whispers, the focus of our brain
automatically shifts to that speech (giving it more
priority) and starts processing it.
• The computer, on the other hand, understands only
the language of numbers.
• Everything that is sent to the machine has to be
converted to numbers.
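As a small illustration of text being converted to numbers, the sketch below uses Python's built-in ord and str.encode. This shows only how characters are stored as numbers; it is not the feature representation an NLP model uses, which comes later with Bag of Words.

```python
text = "NLP"

# Each character is stored as a number (its Unicode code point).
code_points = [ord(ch) for ch in text]

# The same text as the raw bytes a machine actually stores (UTF-8).
raw_bytes = list(text.encode("utf-8"))
```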
DIFFICULTIES DURING PROCESSING
NATURAL LANGUAGE BY A MACHINE
Arrangement of the words and meaning
• There are structures/characteristics in
human language that might be easy for a
human to understand but extremely difficult
for a computer to understand.
• Different syntax, same semantics:
2+3 = 3+2
• Different semantics, same syntax:
2/3 (Python 2.7) ≠ 2/3 (Python 3)
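Both cases can be checked in Python 3. In this sketch the floor division operator // stands in for what 2/3 meant for integers in Python 2.7, since the snippet itself runs on Python 3.

```python
# Different syntax, same semantics: both expressions mean the same thing.
same_meaning = (2 + 3) == (3 + 2)

# Same syntax, different semantics:
# in Python 3, 2/3 is true division and gives a fraction;
# in Python 2.7, 2/3 on integers was floor division, written 2//3 today.
python3_result = 2 / 3
python27_result = 2 // 3
```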
DIFFICULTIES DURING PROCESSING
NATURAL LANGUAGE BY A MACHINE…
Multiple Meanings of a word
=> His face turned red after he found out that
he took the wrong bag.
=> His face turns red after consuming the
medicine.
• The same words can have different meanings in the
two sentences, depending on the context.
DIFFICULTIES DURING PROCESSING
NATURAL LANGUAGE BY A MACHINE…
Perfect Syntax, but no Meaning
=> Chickens feed extravagantly while the moon
drinks tea.
• This sentence is grammatically correct, but it
does not make any sense.
We may face these challenges if we try to
teach computers how to understand and
interact in human language.
So, let’s see how NLP does this magic.
DATA PROCESSING
(TEXT NORMALISATION)
 It involves preparing and cleaning text data
for machines to be able to analyze it.
 This process puts data in workable form and
highlights features in the text that an
algorithm can work with.
 There are several ways this can be done,
including:
DATA PROCESSING…
• Sentence Segmentation:
In this process the whole corpus is divided
into sentences. Each sentence is treated as a
separate piece of data, so the whole corpus is
reduced to a list of sentences.
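A minimal sketch of sentence segmentation, assuming a sentence ends with a full stop, question mark or exclamation mark followed by whitespace. The function name segment_sentences and the sample corpus are illustrative; real text (abbreviations, decimals) needs a more careful splitter.

```python
import re

def segment_sentences(corpus):
    """Split a corpus into sentences at '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", corpus.strip())
    return [p for p in parts if p]

# Illustrative corpus (assumed for this example).
corpus = ("Aman and Anil are stressed. Aman went to a therapist. "
          "Anil went to download a health chatbot.")
sentences = segment_sentences(corpus)
```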
DATA PROCESSING…
 Tokenisation:
It is the process of breaking each sentence
down into smaller units, called tokens (words,
numbers and special characters), to work with.
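A minimal sketch of tokenisation, assuming every word or number and every special character becomes its own token. The function name tokenise is illustrative.

```python
import re

def tokenise(sentence):
    """Return words/numbers and special characters as separate tokens."""
    # \w+ matches a run of letters or digits; [^\w\s] matches one symbol.
    return re.findall(r"\w+|[^\w\s]", sentence)

tokens = tokenise("Aman and Anil are stressed.")
```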
DATA PROCESSING…
• Removing Stopwords, Special Characters
and Numbers:
In this process, common words, special
characters, etc. (which do not add any
essence to the information) are removed
from the text, so that only the words that offer
the most information about the text remain.
Some examples of stopwords are:
a, an, are, for, etc.
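A minimal sketch of this step. The stopword list is an assumption: the first four entries come from the examples above and the rest are added for illustration; the function name remove_noise is also illustrative.

```python
# 'a', 'an', 'are', 'for' are the examples given; the others are assumed.
STOPWORDS = {"a", "an", "are", "for", "and", "the", "is", "to"}

def remove_noise(tokens):
    """Drop stopwords, special characters and numbers from a token list."""
    kept = []
    for tok in tokens:
        if tok.lower() in STOPWORDS:
            continue  # common word with little information
        if not tok.isalpha():
            continue  # special character or number
        kept.append(tok)
    return kept

tokens = ["Aman", "and", "Anil", "are", "stressed", ".", "2", "times"]
cleaned = remove_noise(tokens)
```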
DATA PROCESSING…
 Converting text to a common case:
In this process the whole text is converted
into the same case (preferably lower case).
This ensures that the machine does not treat
the same word written in different cases as
different words.
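A short sketch of converting text to a common case with str.lower. The sample words are illustrative.

```python
# The same word written in four different cases.
variants = ["Hello", "HELLO", "hello", "HeLLo"]

# After conversion to lower case they all become one word.
lowered = [word.lower() for word in variants]
unique_words = set(lowered)
```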
DATA PROCESSING…
 Stemming:
Here, the remaining words are reduced to
their root words.
It is the process in which the affixes of words
are removed and the words are converted to
their base form.
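A naive sketch of stemming by suffix removal. The suffix list and the minimum stem length of 3 are assumptions for illustration; practical stemmers (for example the Porter stemmer) use many more rules. Note that the stem is not always a meaningful word.

```python
# Assumed suffix list, checked in this order.
SUFFIXES = ("ing", "ed", "es", "s")

def stem(word):
    """Strip the first matching suffix, keeping at least 3 letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[:-len(suffix)]
    return word

stems = [stem(w) for w in ["healing", "studies", "caring"]]
```

With these rules "studies" becomes "studi" and "caring" becomes "car", neither of which is the word a reader would expect.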
DATA PROCESSING…
 Lemmatization:
The process in which a word is converted to its
meaningful root form.
Stemming and lemmatization are alternatives
to each other, as the role of both processes is
the same: removal of affixes. The difference
between them is that in lemmatization, the word
we get after affix removal (also known as the
lemma) is a meaningful one.
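A minimal sketch of lemmatization using a lookup table. The table LEMMAS is a tiny hypothetical dictionary; real lemmatizers rely on a full vocabulary and the word's part of speech.

```python
# Hypothetical lookup table from word form to lemma.
LEMMAS = {"studies": "study", "caring": "care", "went": "go"}

def lemmatize(word):
    """Return the lemma if known, otherwise the lower-cased word itself."""
    key = word.lower()
    return LEMMAS.get(key, key)

lemmas = [lemmatize(w) for w in ["Studies", "caring", "chatbot"]]
```

Unlike stemming by suffix removal, the results here ("study", "care") are meaningful words.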
BAG OF WORDS
• Bag of Words is a Natural Language Processing
model which helps in extracting features out of
text, which is very helpful for machine learning
algorithms.
• The occurrences of each word are counted
and the vocabulary for the corpus is
constructed.
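A short sketch of counting word occurrences and building the vocabulary with collections.Counter. The token list is an illustrative, already normalised corpus.

```python
from collections import Counter

# Illustrative tokens after text normalisation.
tokens = ["aman", "and", "anil", "are", "stressed",
          "aman", "went", "to", "a", "therapist"]

frequency = Counter(tokens)     # word -> number of occurrences
vocabulary = sorted(frequency)  # the unique words of the corpus
```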
BAG OF WORDS…
[Figure: the vocabulary and the frequency of words]
BAG OF WORDS…
The step-by-step approach to implement bag of words
algorithm:
1. Text Normalisation: Collect data and pre-
process it.
2. Create Dictionary: Make a list of all the unique
words occurring in the corpus. (Vocabulary).
3. Create document vectors: For each document
in the corpus, find out how many times the
word from the unique list of words has
occurred.
4. Create document vectors for all the documents.
BAG OF WORDS…
Here are three documents having one sentence each. After text
normalisation, the text becomes:
Note that no tokens have been removed in the
stopwords removal step. It is because we have very
little data and since the frequency of all the words is
almost the same, no word can be said to have lesser
value than the other.
BAG OF WORDS…
List down all the words which occur in all three
documents:
BAG OF WORDS…
In this step,
•The vocabulary is written in the top row.
•Now, for each word in the document, if it matches
with the vocabulary, put a 1 under it.
•If the same word appears again, increment the
previous value by 1.
•And if the word does not occur in that document, put
a 0 under it.
BAG OF WORDS…
In the first document we have the words: aman,
and, anil, are, stressed. So, all these words get a
value of 1 and the rest of the words get a value of 0.
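The dictionary and document vector steps can be sketched as follows. The first document matches the words listed above; the second and third documents are assumptions made for this example, and the function names are illustrative.

```python
def build_vocabulary(documents):
    """List every unique word in the corpus, in order of first appearance."""
    vocabulary = []
    for doc in documents:
        for token in doc:
            if token not in vocabulary:
                vocabulary.append(token)
    return vocabulary

def document_vector(doc, vocabulary):
    """Count how many times each vocabulary word occurs in one document."""
    return [doc.count(word) for word in vocabulary]

documents = [
    ["aman", "and", "anil", "are", "stressed"],
    ["aman", "went", "to", "a", "therapist"],                     # assumed
    ["anil", "went", "to", "download", "a", "health", "chatbot"], # assumed
]
vocabulary = build_vocabulary(documents)
vectors = [document_vector(doc, vocabulary) for doc in documents]
```

Each row of vectors is one line of the document vector table, with the vocabulary as the header row.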
BAG OF WORDS…
This gives us the document vector table for our corpus. However, these raw counts do not
yet tell us how valuable each word is. This leads us to the final step of our algorithm: TFIDF
BAG OF WORDS…
A plot of occurrence of words versus their value
TFIDF
TFIDF stands for Term Frequency and Inverse Document Frequency.
It helps in identifying the value for each word.
Let us understand each term one by one.
Term Frequency:
Term frequency is the frequency of a word in one
document.
It can easily be found from the document vector table.
TFIDF…
Inverse Document Frequency:
Document frequency is the number of documents in
which the word occurs. Inverse document frequency is
the total number of documents divided by the
document frequency.
IDF(W) = (Total no. of documents) / (Document frequency of W)
TFIDF…
TFIDF(W) = TF(W) * log( IDF(W) )
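A minimal sketch of this formula, assuming log is taken to base 10 and that term frequency is the raw count of the word in the document. The function names and the three sample documents are illustrative.

```python
import math

def document_frequency(word, documents):
    """Number of documents in which the word occurs."""
    return sum(1 for doc in documents if word in doc)

def tfidf(word, doc, documents):
    """TFIDF(W) = TF(W) * log10(total documents / document frequency of W)."""
    tf = doc.count(word)
    df = document_frequency(word, documents)
    if df == 0:
        return 0.0  # the word never occurs in the corpus
    return tf * math.log10(len(documents) / df)

# Illustrative normalised corpus (assumed for this example).
documents = [
    ["aman", "and", "anil", "are", "stressed"],
    ["aman", "went", "to", "a", "therapist"],
    ["anil", "went", "to", "download", "a", "health", "chatbot"],
]
value_aman = tfidf("aman", documents[0], documents)          # occurs in 2 of 3
value_stressed = tfidf("stressed", documents[0], documents)  # occurs in 1 of 3
```

In this corpus "stressed" occurs in fewer documents than "aman", so it gets the higher value.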
TFIDF…
After calculating all the values, we can conclude:
The value of a word increases with the IDF value of
that word: the fewer documents a word occurs in, the
higher its value.
TFIDF…
Ex- (taking log to base 10)
Total Number of documents: 10
Number of documents in which ‘and’ occurs: 10
Therefore, IDF(and) = 10/10 = 1
Which means: log(1) = 0.
Hence, the value of ‘and’ becomes 0.
On the other hand,
Number of documents in which ‘pollution’ occurs: 3
IDF(pollution) = 10/3 = 3.3333…
Which means: log(3.3333) ≈ 0.523,
which shows that the word ‘pollution’ has considerable
value in the corpus.
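The arithmetic of this example can be checked with math.log10, assuming the log is taken to base 10. The variable names are illustrative.

```python
import math

total_documents = 10

# 'and' occurs in all 10 documents; 'pollution' occurs in 3.
idf_and = total_documents / 10
idf_pollution = total_documents / 3

value_and = math.log10(idf_and)              # log(1)
value_pollution = math.log10(idf_pollution)  # log(3.3333...)
```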
Thank You!!!

ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...
ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...
 

NLP (Natural Language Processing)

  • 1. NATURAL LANGUAGE PROCESSING -Jitendra Kumar Yadav DAV Public School, Gumla
  • 2. NLP
      • It is the sub-field of AI that is focused on enabling computers to understand and process human languages.
      • It is a subfield of Linguistics, Computer Science, Information Engineering, and Artificial Intelligence.
      • It is concerned with the interactions between computers and human (natural) languages, in particular how to program computers to process and analyse large amounts of natural language data.
  • 3. APPLICATIONS OF NATURAL LANGUAGE PROCESSING
      • Automatic Summarization
      • Sentiment Analysis
      • Text classification
      • Virtual Assistants
  • 5. AUTOMATIC SUMMARIZATION…
      • It is the process of shortening a set of data computationally, to create a summary that represents the most relevant information within the original content.
      • It comes out as the solution to information overload.
      • It is about understanding emotional meanings within the information.
  • 7. SENTIMENT ANALYSIS…
      • It is about identifying sentiment among several posts or even in the same post where emotion is not always explicitly expressed.
      • Companies use NLP applications, such as sentiment analysis, to identify opinions and sentiment online to help them understand what customers think about their products and services, and as an overall indicator of their reputation.
      • Ex- “I love the new iPhone” and, a few lines later, “But sometimes it doesn’t work well”, where the person is still talking about the iPhone.
  • 9. TEXT CLASSIFICATION…
      • Text classification makes it possible to assign predefined categories to a document and organize it, which helps in finding the information needed.
      • For example, an application of text categorization is spam filtering in email.
  • 11. VIRTUAL ASSISTANTS…
      • An application program that understands natural language voice commands and completes tasks for the user.
      • Benefits of AI Assistants: improved customer support, ease of key data collection, personalized user experience.
      • Examples: Chatbots, Voice Assistants, AI Avatars, Domain Specific Virtual Assistants, etc.
  • 12. LET’S TALK ABOUT A SCENARIO
  • 13. THE WORLD IS COMPETITIVE NOWADAYS
  • 14. THE WORLD IS COMPETITIVE NOWADAYS…
      • Everybody wishes to give their best even in the tiniest task.
      • And, when people are unable to meet these expectations, they get stressed or depressed.
      • People often get depressed due to reasons like peer pressure, studies, family issues, relationships, etc.
  • 15.
  • 16. IS THERE ANY THERAPY FOR THIS?
  • 17. CBT
  • 18. CBT
      • Cognitive Behavioural Therapy (CBT) is considered to be one of the best methods to address stress, as it is easy to implement and gives good results.
      • It involves understanding the behaviour and mindset of a person in their normal life, and helping them overcome their stress and live a happy life.
  • 19. How will an NLP project on “CBT” be developed? To understand this, let’s go through the AI Project Cycle.
  • 20. PROBLEM SCOPING
      • Most therapists treat patients with depression using the CBT technique.
      • But people do not willingly seek the help of a psychiatrist.
      • They try to avoid such interactions as much as possible.
      • Thus, there is a need to bridge the gap between a person who needs help and the psychiatrist.
  • 21. PROBLEM SCOPING (4Ws CANVAS)
      • Who Canvas – Who has the problem? People suffering from stress/depression.
      • What Canvas – What is the nature of the problem? People who need help are reluctant to consult a psychiatrist and hence live miserably.
      • Where Canvas – Where does the problem arise? When they are going through a stressful period of time.
      • Why Canvas – Why do you think it is a problem worth solving? People get a platform where they can talk and vent out their feelings anonymously.
  • 23. DATA ACQUISITION
      • To understand the sentiments of people, we need to collect their conversational data so the machine can interpret the words that they use and understand their meaning.
      • Such data can be collected through various means:
          1. Surveys
          2. Observing the therapist’s sessions
          3. Databases available on the internet
          4. Interviews, etc.
  • 24. DATA EXPLORATION
      • The textual data collected needs to be processed and cleaned so that a simplified version can be sent to the machine.
      • The text is normalised through various steps and reduced to a minimal vocabulary, since the machine does not require grammatically correct statements but only their essence.
  • 25. MODELLING
      • Once the text has been normalised, it is fed to an NLP-based AI model.
      • In NLP, modelling requires data pre-processing first; only after that is the data fed to the machine.
      • Depending upon the type of chatbot to be made, an appropriate AI model is used to develop the foundation of the project.
  • 26. EVALUATION
      • The reliability of the AI model is assessed by feeding the test dataset into the model and comparing its outputs with the actual answers.
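The comparison of model outputs with actual answers can be sketched as a simple accuracy score. This is a minimal illustration, not the deck's own method: the function name `accuracy` and the five labels below are hypothetical, made up for the example.

```python
def accuracy(predicted, actual):
    """Return the fraction of predictions that match the actual answers."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("cannot evaluate on an empty test set")
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return correct / len(actual)


# Hypothetical test set: the actual answers and what a model returned.
actual_labels = ["stressed", "calm", "stressed", "calm", "stressed"]
model_outputs = ["stressed", "calm", "calm", "calm", "stressed"]

score = accuracy(model_outputs, actual_labels)
print(score)  # 4 of the 5 outputs match the actual answers
```

Accuracy is only one possible measure; it is used here because the slide speaks of comparing outputs with actual answers.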
  • 27. EVALUATION… Case I: If the model’s output does not match the true function at all, the model is said to be underfitting and its accuracy is lower.
  • 28. EVALUATION… Case II: If the model’s performance matches the true function well, the model has optimum accuracy and is called a perfect fit.
  • 29. EVALUATION… Case III: If the model tries to cover all the data samples even when they are out of alignment with the true function, the model is said to be overfitting, and this too results in lower accuracy.
  • 30. CHATBOTS
      • One of the most common applications of Natural Language Processing is a chatbot.
      • A chatbot is AI software that can simulate a real human conversation, giving real-time responses to users based on reinforcement learning.
      • AI chatbots use text messages, voice commands, or both.
  • 31. CHATBOTS… Ex-
      • Mitsuku Bot https://www.pandorabots.com/mitsuku/
      • CleverBot https://www.cleverbot.com/
      • Jabberwacky http://www.jabberwacky.com/
      • Haptik https://haptik.ai/contact-us
      • Rose http://ec2-54-215-197-164.us-west-1.compute.amazonaws.com/speech.php
      • Ochatbot https://www.ometrics.com/blog/list-of-fun-chatbots/
  • 32. CHATBOTS…  There are 2 types of chatbots: Ex- bots deployed in the customer care section of various companies
  • 33. CHATBOTS… Ex- Google Assistant, Alexa, Cortana, Siri, etc.
  • 34. HUMAN LANGUAGE VS COMPUTER LANGUAGE
      • The human brain continuously processes everything it perceives around it, makes sense of it, and stores it.
      • When someone whispers, the focus of our brain automatically shifts to that speech (giving it more priority) and starts processing it.
      • The computer, by contrast, understands the language of numbers.
      • Everything that is sent to the machine has to be converted to numbers.
  • 35. DIFFICULTIES DURING PROCESSING NATURAL LANGUAGE BY A MACHINE: Arrangement of the words and meaning
      • There are structures/characteristics in the human language that might be easy for a human to understand but extremely difficult for a computer to understand.
      • Different syntax, same semantics: 2+3 = 3+2
      • Different semantics, same syntax: 2/3 (Python 2.7) ≠ 2/3 (Python 3)
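The two cases on this slide can be checked in Python 3. The sketch below assumes a Python 3 interpreter; since Python 2.7 cannot run alongside it, the Python 2.7 result for integers is reproduced with the `//` operator, which is how Python 3 spells that older behaviour. The variable names are illustrative only.

```python
# Different syntax, same semantics: the operands are written in a
# different order, but both expressions mean the same thing.
same_meaning = (2 + 3) == (3 + 2)

# Same syntax, different semantics: in Python 3, `/` on two integers
# is true division and yields a float.
true_division = 2 / 3

# Python 2.7 evaluated the identical text `2/3` as floor division on
# integers. Python 3 writes that operation as `//`.
floor_division = 2 // 3

print(same_meaning, true_division, floor_division)
```

The text `2/3` is identical in both language versions, yet it produces a fraction in one and zero in the other, which is the point the slide makes about syntax and semantics.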
  • 36. DIFFICULTIES DURING PROCESSING NATURAL LANGUAGE BY A MACHINE… Multiple Meanings of a word
      => His face turned red after he found out that he took the wrong bag.
      => His face turns red after consuming the medicine.
      • Both the sentences might have multiple meanings.
  • 37. DIFFICULTIES DURING PROCESSING NATURAL LANGUAGE BY A MACHINE… Perfect Syntax, but no Meaning
      => Chickens feed extravagantly while the moon drinks tea.
      • The sentence is grammatically correct but does not make any sense.
  • 38. We may face these challenges if we try to teach computers how to understand and interact in human language. So, let’s see how NLP does this magic.
  • 39. DATA PROCESSING (TEXT NORMALISATION)
      • It involves preparing and cleaning text data so that machines are able to analyze it.
      • This process puts data in workable form and highlights features in the text that an algorithm can work with.
      • There are several ways this can be done, including:
  • 40. DATA PROCESSING… Sentence Segmentation: In this process the whole corpus is divided into sentences. Each sentence is taken as a separate piece of data, so the whole corpus is reduced to a list of sentences.
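Sentence segmentation can be sketched with a regular expression that splits after sentence-ending punctuation. This is a simplification: the function name `segment_sentences` and the sample corpus are hypothetical, and the rule would wrongly split on abbreviations such as "Dr. Rao".

```python
import re


def segment_sentences(corpus):
    """Split a corpus into sentences after '.', '!' or '?' plus whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", corpus.strip())
    return [part for part in parts if part]


# Hypothetical corpus made up for illustration.
corpus = "Aman and Anil are stressed. Is there any therapy for this? CBT may help!"
sentences = segment_sentences(corpus)
for sentence in sentences:
    print(sentence)
```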
  • 42. DATA PROCESSING… Tokenisation: It is the process of breaking down the sentences into smaller units (tokens) to work with.
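A minimal tokeniser can be written with one regular expression, treating each run of word characters as a token and each punctuation mark as its own token. The name `tokenise` and the sample sentence are assumptions for illustration; real tokenisers handle contractions, hyphens and similar cases more carefully.

```python
import re


def tokenise(sentence):
    """Return word tokens and punctuation tokens in their original order."""
    return re.findall(r"\w+|[^\w\s]", sentence)


tokens = tokenise("Aman and Anil are stressed.")
print(tokens)
```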
  • 44. DATA PROCESSING… Removing Stopwords, Special Characters and Numbers: In this process, common words, special characters, etc. (which do not add any essence to the information) are removed from the text, so that the unique words that offer the most information about the text remain. Some examples of stopwords are: a, an, are, for, etc.
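The removal step can be sketched as a filter over a token list. The stopword set below is a small illustrative sample, not a standard list, and `remove_noise` is a hypothetical helper name. Note that the `isalpha` check also drops mixed tokens such as "covid19", which may or may not be wanted.

```python
# Illustrative stopword sample; real lists are much longer.
STOPWORDS = {"a", "an", "and", "are", "for", "is", "the", "to"}


def remove_noise(tokens):
    """Drop stopwords and any token that is not purely alphabetic."""
    kept = []
    for token in tokens:
        if not token.isalpha():           # special characters and numbers
            continue
        if token.lower() in STOPWORDS:    # common words with little information
            continue
        kept.append(token)
    return kept


cleaned = remove_noise(["Aman", "and", "Anil", "are", "stressed", ".", "2", "#"])
print(cleaned)
```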
  • 45. DATA PROCESSING… Converting text to a common case: In this process the whole text is converted to the same case (lower case). This ensures that the machine treats the text case-insensitively.
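A short sketch of why a common case matters: without it, the same word written three ways counts as three different tokens. The token list is made up for the example.

```python
tokens = ["Aman", "AMAN", "aman"]

# Before lowering, the three spellings are distinct strings.
distinct_before = set(tokens)

# After lowering, they collapse into a single vocabulary entry.
lowered = [token.lower() for token in tokens]
distinct_after = set(lowered)

print(len(distinct_before), len(distinct_after))
```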
  • 46. DATA PROCESSING…  Converting text to a common case:
  • 47. DATA PROCESSING…  Stemming: Here, the remaining words are reduced to their root words. It is the process in which the affixes of words are removed and the words are converted to their base form.
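Affix removal can be sketched with a toy suffix stripper. This is not a standard stemming algorithm such as Porter's: the suffix list, the minimum stem length of three letters, and the name `naive_stem` are all assumptions chosen to keep the example short.

```python
# Hypothetical suffix list, checked in this order.
SUFFIXES = ("ing", "ed", "es", "s")


def naive_stem(word):
    """Strip the first matching suffix, keeping at least three letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


stems = [naive_stem(word) for word in ["stressed", "healing", "studies", "chatbots"]]
print(stems)
```

The result for "studies" is "studi", which is not a dictionary word. Stemming only cuts affixes, so its output is not guaranteed to be meaningful.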
  • 49. DATA PROCESSING… Lemmatization: The process in which a word is converted to its meaningful root form. Stemming and lemmatization are alternatives to each other, as both play the same role: removal of affixes. The difference between them is that in lemmatization, the word we get after affix removal (known as the lemma) is a meaningful one.
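Lemmatization can be sketched as a dictionary lookup that maps each word form to a meaningful base word. The three-entry table and the name `lemmatise` are hypothetical; a real lemmatizer relies on a full vocabulary and on part-of-speech information.

```python
# Hypothetical lookup table; a real lemmatizer uses a full dictionary.
LEMMAS = {"studies": "study", "caring": "care", "went": "go"}


def lemmatise(word):
    """Return the lemma if the word is known, otherwise the word itself."""
    return LEMMAS.get(word, word)


lemmas = [lemmatise(word) for word in ["studies", "caring", "went", "therapist"]]
print(lemmas)
```

Every output here is a real word, which is the property that separates lemmatization from plain suffix stripping.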
  • 51. BAG OF WORDS
      • A Natural Language Processing model that helps in extracting features from text, which is very helpful for machine learning algorithms.
      • The occurrences of each word are counted and the vocabulary for the corpus is constructed.
  • 53. BAG OF WORDS… The step-by-step approach to implement the bag of words algorithm:
      1. Text Normalisation: Collect data and pre-process it.
      2. Create Dictionary: Make a list of all the unique words occurring in the corpus (the vocabulary).
      3. Create document vectors: For each document in the corpus, find out how many times each word from the unique list of words has occurred.
      4. Create document vectors for all the documents.
  • 55. BAG OF WORDS… Here are three documents having one sentence each. After text normalisation, the text becomes: Note that no tokens have been removed in the stopwords removal step. It is because we have very little data and since the frequency of all the words is almost the same, no word can be said to have lesser value than the other.
  • 56. BAG OF WORDS… List down all the words which occur in all three documents:
  • 57. BAG OF WORDS… In this step,
      • The vocabulary is written in the top row.
      • Now, for each word in the document, if it matches a word in the vocabulary, put a 1 under it.
      • If the same word appears again, increment the previous value by 1.
      • And if the word does not occur in that document, put a 0 under it.
  • 58. BAG OF WORDS… The first document contains the words: aman, and, anil, are, stressed. So, all these words get a value of 1 and the rest of the words get a value of 0.
  • 59. BAG OF WORDS… This gives us the document vector table for our corpus. But the tokens have still not been converted to numbers. This leads us to the final step of our algorithm: TFIDF
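The dictionary and document-vector steps can be sketched in a few lines. The first document uses the five words listed for it on the slide; the second and third documents are hypothetical stand-ins, since the original sentences are not in the transcript. The function names are also illustrative.

```python
def build_vocabulary(documents):
    """List every unique word in the corpus, in order of first appearance."""
    vocabulary = []
    for tokens in documents:
        for token in tokens:
            if token not in vocabulary:
                vocabulary.append(token)
    return vocabulary


def document_vector(tokens, vocabulary):
    """Count how many times each vocabulary word occurs in one document."""
    return [tokens.count(word) for word in vocabulary]


documents = [
    ["aman", "and", "anil", "are", "stressed"],  # words given on the slide
    ["aman", "went", "to", "a", "therapist"],    # hypothetical document
    ["anil", "went", "to", "a", "chatbot"],      # hypothetical document
]

vocabulary = build_vocabulary(documents)
vectors = [document_vector(doc, vocabulary) for doc in documents]

print(vocabulary)
for vector in vectors:
    print(vector)
```

A word that appears twice in a document would get a count of 2, matching the "increment the previous value by 1" rule.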
  • 60. BAG OF WORDS… A plot of occurrence of words versus their value
  • 61. TFIDF TFIDF stands for Term Frequency and Inverse Document Frequency. It helps in identifying the value for each word. Let us understand each term one by one. Term Frequency: Term frequency is the frequency of a word in one document. It can easily be found from the document vector table.
  • 63. TFIDF… Inverse Document Frequency: It is the total number of documents divided by the document frequency, i.e. the number of documents in which the word occurs.
      IDF(W) = (Total no. of documents) / (Document frequency of W)
  • 64. TFIDF… TFIDF(W) = TF(W) * log( IDF(W) )
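The formula can be sketched directly in code. One assumption is made loudly here: the slide does not state the base of the logarithm, and base 10 is used because it reproduces the figures in the deck's worked example. The function names and the term frequency of 1 are illustrative.

```python
import math


def idf(total_documents, document_frequency):
    """IDF(W) as defined in the deck: total documents / document frequency."""
    return total_documents / document_frequency


def tfidf(term_frequency, total_documents, document_frequency):
    """TFIDF(W) = TF(W) * log(IDF(W)), assuming a base-10 logarithm."""
    return term_frequency * math.log10(idf(total_documents, document_frequency))


# A word found in all 10 of 10 documents carries no value.
common_word = tfidf(1, 10, 10)

# A word found in only 3 of 10 documents carries a noticeable value.
rare_word = tfidf(1, 10, 3)

print(common_word, round(rare_word, 3))
```

A document frequency of zero would raise a division error, so the sketch assumes the word occurs in at least one document of the corpus.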
  • 65. TFIDF… After calculating all the values: Conclusion: The value of a word is inversely related to its document frequency. Words that occur in fewer documents have a higher IDF and therefore a higher value.
  • 66. TFIDF… Ex-
      • Total number of documents: 10
      • Number of documents in which ‘and’ occurs: 10
      • Therefore, IDF(and) = 10/10 = 1, which means: log(1) = 0. Hence, the value of ‘and’ becomes 0.
      • On the other hand, number of documents in which ‘pollution’ occurs: 3
      • IDF(pollution) = 10/3 = 3.3333…, which means: log(3.3333) ≈ 0.523 (base-10 logarithm). This shows that the word ‘pollution’ has considerable value in the corpus.
    Thank You!!!

Editor's Notes

  1. Artificial Intelligence is becoming an integral part of our lives; its applications are used by the majority of people in their daily lives.