Unit 3 – NATURAL LANGUAGE PROCESSING
NLP OVERVIEW
NLP is a branch of Artificial Intelligence that deals with the processing of
human language by a program.
Machines use it to understand, analyse, manipulate, and interpret human
languages.
It helps in performing tasks such as translation, automatic summarization,
Named Entity Recognition (NER), speech recognition, relationship extraction,
and topic segmentation.
PYTHON IN NLP
NLTK, or Natural Language Toolkit, is a Python package used
in NLP.
NLTK provides a wide range of functionalities and resources for
tasks such as tokenization, stemming, tagging, parsing,
semantic reasoning, and more.
NLTK is widely used in academia and industry for tasks such as
text classification, sentiment analysis, machine translation, and
information extraction.
NLTK INSTALLATION
pip install nltk
import nltk
nltk.download()  # opens the NLTK downloader, from which the required packages can be fetched
APPLICATIONS
NLP PROCESS EXPLAINED
1.MORPHOLOGICAL PROCESSING
Morphological processing refers to the analysis and manipulation of the internal
structure of words to understand their grammatical forms and extract meaningful
information.
Morphological processing involves tasks such as:
1. Tokenization
2. Stop Word Removal
3. Stemming
4. N-Gram Language Model
5. Named Entity Recognition (NER)
6. Chunking & Part-of-Speech (POS) Tagging
1. TOKENIZATION
Tokenization in NLP is the process of breaking a sequence of text into smaller
units, called tokens.
Types of tokenization:
Word Tokenization: breaks text into individual words based on whitespace or
punctuation.
Example: "I love NLP!" → ["I", "love", "NLP", "!"]
Sentence Tokenization: splits a paragraph or text into individual sentences.
Example: "I love NLP. It's fascinating." → ["I love NLP.", "It's fascinating."]
Python Code:
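from nltk.tokenize import sent_tokenize, word_tokenize
data = "All work and no play makes jack a dull boy, all work and no play"
print(word_tokenize(data))
Output:
['All', 'work', 'and', 'no', 'play', 'makes', 'jack', 'a', 'dull', 'boy', ',', 'all', 'work', 'and', 'no', 'play']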
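The two kinds of tokenization can also be sketched with the standard library alone. This is a simplified illustration, not NLTK's algorithm: the function names simple_word_tokenize and simple_sent_tokenize are hypothetical, and the regular expressions ignore cases such as abbreviations and contractions.

```python
import re

def simple_word_tokenize(text):
    """Return runs of word characters and single punctuation marks as tokens."""
    return re.findall(r"\w+|[^\w\s]", text)

def simple_sent_tokenize(text):
    """Split after '.', '!' or '?' when the mark is followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

print(simple_word_tokenize("I love NLP!"))
print(simple_sent_tokenize("I love NLP. It's fascinating."))
```

A trained tokenizer such as NLTK's sent_tokenize is preferable on real text, because a plain split on punctuation breaks on abbreviations like "Dr." or "e.g.".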
2. STOP WORD REMOVAL
In NLP, stop word removal is the process of eliminating commonly used words
(such as "the", "is", "and") that contribute little to the meaning of a text.
Stop word removal is performed to reduce the dimensionality of text data,
improve computational efficiency, and focus on more informative words.
With the stop words removed, the remaining words can help improve the
accuracy of NLP tasks.
After stop word removal, the filtered tokens contain only the words that are
not considered stop words.
Python Code:
stopwords.words('english') returns a list of commonly used stop words in the
English language.
from nltk.corpus import stopwords
a = set(stopwords.words('english'))
print(a)
To remove the stop words from a given sentence:
# data is the sentence defined in the tokenization example
words = [word for word in data.split() if word.lower() not in a]
new_text = " ".join(words)
print(new_text)
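3. STEMMING
Stemming is the process of reducing words to their base form (root form or
stem); it is a kind of word normalization.
Example:
1. John ate Pizza
John → John, ate → eat, Pizza → pizza
(Mapping the irregular form "ate" to "eat" is strictly lemmatization; a
suffix-stripping stemmer does not produce "eat" from "ate".)
2. Stemmer, stemming, stemmed → stem (the exact result depends on the stemmer
used)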
Types of Stemming:
1. Porter Stemming : The Porter stemmer applies a set of rules to
remove common English word suffixes.
2. Snowball Stemming : Snowball enables stemming for various
languages beyond English and provides a framework for creating new
stemming algorithms.
3. Lancaster Stemming : It applies a set of rules to remove English word
suffixes and aims for a more aggressive reduction of words to their
stems compared to the Porter algorithm.
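The three stemmers can be compared side by side. The sketch below assumes NLTK is installed and uses its PorterStemmer, SnowballStemmer, and LancasterStemmer classes; the helper stem_all and the word list are illustrative choices, not part of the original slides.

```python
from nltk.stem import LancasterStemmer, PorterStemmer, SnowballStemmer

# One instance of each stemmer; Snowball needs the language name.
stemmers = {
    "porter": PorterStemmer(),
    "snowball": SnowballStemmer("english"),
    "lancaster": LancasterStemmer(),
}

def stem_all(word):
    """Return the stem produced by each stemmer for a single word."""
    return {name: stemmer.stem(word) for name, stemmer in stemmers.items()}

for word in ["stemming", "stemmed", "running", "generously"]:
    print(word, stem_all(word))
```

Running it on a few words of your own is a quick way to see where the stemmers disagree, in particular how much more aggressively Lancaster cuts than Porter.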
4. N-GRAM LANGUAGE MODEL
An N-gram is a sequence of N consecutive items (usually words) taken from a
text; the "N" refers to the number of items in the sequence.
N-grams are useful for creating features from a text corpus for machine
learning algorithms such as SVM and Naive Bayes.
Because they are used so often, n-gram models for n = 1, 2, 3 have specific
names: Unigram, Bigram, and Trigram models.
N-grams are also useful for building capabilities such as autocorrect,
sentence autocompletion, text summarization, and speech recognition.
Python code:
from nltk.util import ngrams
data = "All work and no play makes jack a dull boy, all work and no play"
n = 1
unigrams = ngrams(data.split(), n)
for item in unigrams:
    print(item)
Output:
With n = 1 each item is a 1-tuple such as ('All',); with n = 2 each item is a
pair such as ('All', 'work').
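To show how n-gram counts become a language model, here is a minimal sketch of a bigram model with maximum-likelihood estimates, using only the standard library. The function name bigram_probability is hypothetical, and lower-casing and comma removal are simplifying assumptions.

```python
from collections import Counter

data = "All work and no play makes jack a dull boy, all work and no play"
# Simplified normalization: lower-case and drop the comma before splitting.
tokens = data.lower().replace(",", "").split()

unigram_counts = Counter(tokens)
bigram_counts = Counter(zip(tokens, tokens[1:]))

def bigram_probability(prev_word, word):
    """Estimate P(word | prev_word) as count(prev_word, word) / count(prev_word)."""
    if unigram_counts[prev_word] == 0:
        return 0.0
    return bigram_counts[(prev_word, word)] / unigram_counts[prev_word]

print(bigram_probability("no", "play"))     # 1.0: "no" is always followed by "play"
print(bigram_probability("play", "makes"))  # 0.5: one of the two "play" tokens precedes "makes"
```

An autocomplete feature could rank candidate next words by this probability; real systems add smoothing so that unseen pairs do not get probability zero.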
5. NAMED ENTITY RECOGNITION (NER)
Named Entity Recognition (NER) is a subtask in NLP that focuses on
identifying and classifying named entities in text into predefined
categories such as person names, organizations, locations, dates, and
more.
The goal of NER is to extract meaningful information from unstructured
text by recognizing and labeling named entities.
Example: "Apple stock prices are going up"
Is "Apple" a fruit or a company? NER resolves this by labeling it as an
organization.
NER – Example 1:
NER – Example 2:
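Python code:
import nltk
from nltk import sent_tokenize, word_tokenize, pos_tag, ne_chunk
sentence = "Apple Inc. is planning to open a new store in New York on July 15th."
# Tokenize and POS-tag the first sentence, then chunk the named entities
tagged = pos_tag(word_tokenize(sent_tokenize(sentence)[0]))
for chunk in ne_chunk(tagged):
    # Named entities are labeled subtrees; ordinary tokens are (word, tag) tuples
    if hasattr(chunk, 'label'):
        print(chunk.label(), ' '.join(c[0] for c in chunk))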
6. CHUNKING & PART-OF-SPEECH (POS) TAGGING
In chunking, the process typically involves two steps:
1. Part-of-speech (POS) tagging
2. Chunking
PART-OF-SPEECH (POS) TAGS
Each word or token in a sentence is assigned a part-of-speech tag,
indicating its grammatical category (noun, verb, adjective, etc.).
POS tagging helps in identifying the role and function of each word in
the sentence.
It is also called grammatical tagging.
Python Code:
import nltk
from nltk import word_tokenize, pos_tag
d = "The dog ate the cat"
tokenize_text = word_tokenize(d)
print(pos_tag(tokenize_text))  # list of (word, tag) pairs
CHUNKING
Based on the POS tags, patterns or rules are applied to group consecutive
words into chunks.
Chunking picks up individual pieces of information and groups them into
larger units, such as noun phrases.
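A rule-based chunker can be sketched with NLTK's RegexpParser. To keep the example free of model downloads, the tokens below are tagged by hand with Penn Treebank tags (an assumption made for illustration); the noun-phrase pattern is one common choice, not the only one.

```python
import nltk

# Hand-tagged tokens, so no tagger model has to be downloaded.
tagged = [("The", "DT"), ("dog", "NN"), ("ate", "VBD"), ("the", "DT"), ("cat", "NN")]

# NP chunk: an optional determiner, any number of adjectives, then a noun.
grammar = "NP: {<DT>?<JJ>*<NN>}"
chunker = nltk.RegexpParser(grammar)
tree = chunker.parse(tagged)

# Collect the words of every subtree labeled NP.
noun_phrases = [
    " ".join(word for word, tag in subtree.leaves())
    for subtree in tree.subtrees()
    if subtree.label() == "NP"
]
print(noun_phrases)  # ['The dog', 'the cat']
```

In practice the tagged list would come from pos_tag, so the chunk rules operate on whatever tags the tagger assigns.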
2. SYNTACTIC ANALYSIS or PARSING
It is the process of analysing natural language against the rules of a
formal grammar to determine the grammatical structure of a sentence.
Syntactic analysis checks whether the text is well-formed according to
those rules.
Example:
Delhi is the capital of India. (accepted)
Is Delhi the of India capital. (rejected: the word order violates the grammar)
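The Delhi example can be checked against a tiny context-free grammar. The sketch below assumes NLTK's CFG and ChartParser; the grammar covers only the words of this example and is written for illustration, and is_grammatical is a hypothetical helper.

```python
import nltk

# Toy grammar covering just the words of the example sentences.
grammar = nltk.CFG.fromstring("""
S -> NP VP
VP -> V NP
NP -> PN | Det N | Det N PP
PP -> P NP
PN -> 'delhi' | 'india'
Det -> 'the'
N -> 'capital'
V -> 'is'
P -> 'of'
""")
parser = nltk.ChartParser(grammar)

def is_grammatical(sentence):
    """Return True if the grammar yields at least one parse tree."""
    tokens = sentence.lower().rstrip(".").split()
    return next(iter(parser.parse(tokens)), None) is not None

print(is_grammatical("Delhi is the capital of India."))  # True
print(is_grammatical("Is Delhi the of India capital."))  # False
```

Note that the parser raises an error for words outside the grammar, so this check only works for sentences built from the listed vocabulary.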
3. SEMANTIC ANALYSIS
The job of the semantic analyser is to check the text for meaningfulness.
The goal of semantic analysis is to enable machines to understand and
interpret human language in a way that goes beyond the mere surface-level
syntactic structure.
Example:
She drank some Milk (meaningful)
She drank some Books (grammatical, but not meaningful)
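One simple way to picture this check is a selectional restriction: a verb constrains the kind of object it can take. The sketch below is a toy, with a hand-written lexicon (NOUN_CLASS, VERB_REQUIRES) and a hypothetical is_meaningful helper; real semantic analysers rely on far richer resources.

```python
# Hypothetical toy lexicon: the semantic class of each noun.
NOUN_CLASS = {"milk": "liquid", "water": "liquid", "books": "object"}

# Hypothetical selectional restriction: the class a verb requires of its object.
VERB_REQUIRES = {"drank": "liquid"}

def is_meaningful(verb, obj):
    """Check whether the object's class satisfies the verb's restriction."""
    required = VERB_REQUIRES.get(verb.lower())
    if required is None:
        return True  # no restriction is known for this verb
    return NOUN_CLASS.get(obj.lower()) == required

print(is_meaningful("drank", "Milk"))   # True
print(is_meaningful("drank", "Books"))  # False
```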
4. DISCOURSE ANALYSIS
Discourse analysis helps us to understand how language is used in real-world
contexts.
It focuses on analyzing the structure, coherence, and meaning of texts or
spoken interactions within their social and cultural contexts.
Example:
Monkeys eat bananas when they wake up.
Who is "they" here? The monkeys.
Monkeys eat bananas when they are ripe.
Who is "they" here? The bananas.
5. PRAGMATIC ANALYSIS
Pragmatic analysis in NLP (Natural Language Processing) is defined as the
process of extracting information from any given text.
Pragmatic analysis takes into account the speaker's intention, the listener's
understanding, and the social context in which the utterance occurs.
Example:
Close the door
Type: Order
Please, close the door
Type: Request, Affirmation
Example of NLP
APPLICATIONS OF NLP
Core Tasks, General Applications, Industry Specific
VIRTUAL AGENTS
Software programs that carry out tasks such as managing schedules, handling
travel needs, booking appointments, sending reminders, playing music,
controlling smart home devices, and resetting passwords are known as virtual
agents (virtual assistants).
Their functions are somewhat more advanced than those of chatbots.
Example:
Virtual agents are commonly used in applications like customer support,
where they can handle frequently asked questions, troubleshoot issues, or
guide users through processes.
VIRTUAL AGENTS ENTERPRISE
INDUSTRY CASE STUDY
IBM SOLUTION:
https://www.ibm.com/case-studies/autodesk-inc
PROBLEM STATEMENT:
As the company (Autodesk) switched from a desktop licensing model to a SaaS
model, its reach improved. But with that growth came an increase in customer
inquiries.
With heavy volume and complex issues, the resolution time for questions was
sometimes 1.5 days or more.
Autodesk’s staff of about 350 customer support agents handles roughly one
million customer and partner contacts per year.
About half of these are simple activation code requests, changes of address,
contract problems, and technical issues.
Spratto, Vice President of Operations at Autodesk, said:
“A lot of what my team does is just problem recognition, trying to identify what
the person wants or is asking.”

AI UNIT 3 - SRCAS JOC.pptx enjoy this ppt

  • 1. Unit 3 – NATURAL LANGUAGE PROCESSING
  • 2. NLP OVERVIEW  NLP is a part of Artificial Intelligence which deals with Human Language by a program.  Used by machines to understand, analyse, manipulate, and interpret human's languages.  It helps to performing tasks such as translation, automatic summarization, Named Entity Recognition (NER), speech recognition, relationship extraction, and topic segmentation.
  • 3. PYTHON IN NLP NLTK, or Natural Language Toolkit, is a Python package used in NLP. NLTK provides a wide range of functionalities and resources for tasks such as tokenization, stemming, tagging, parsing, semantic reasoning, and more. NLTK is widely used in academia and industry for tasks such as text classification, sentiment analysis, machine translation, and information extraction.
  • 4. NLTK INSTALLATION pip install nltk import nltk nltk.download() (#download all the required packages)
  • 7. 1.MORPHOLOGICAL PROCESSING Morphological processing refers to the analysis and manipulation of the internal structure of words to understand their grammatical forms and extract meaningful information. Morphological processing involves tasks such as: 1. Tokenization 2. Stop Word Removal 3. Stemming 4. N –Gram Language Model 5. Name Entity Recognition(ner) 6. Chunking & Part-of-Speech (POS) Tagging
  • 8. 1. TOKENIZATION Tokenization in NLP is the process of breaking a sequence of text into smaller units, called tokens. Types of tokenization : Word Tokenization: This type of tokenization breaks text into individual words based on whitespace or punctuation. Example: "I love NLP!" ["I", "love", "NLP", "!"]. Sentence Tokenization: Sentence tokenization involves splitting a paragraph or text into individual sentences. Example: "I love NLP. It's fascinating." ["I love NLP.", "It's
  • 9.
  • 10. Python Code: from nltk.tokenize import sent_tokenize, word_tokenize data = "All work and no play makes jack a dull boy, all work and no play" print(word_tokenize(data)) Output:
  • 11. 2. STOP WORD REMOVAL In NLP, stop word removal is the process of eliminating commonly used words that do not carry significant meaning of a text. Stop word removal is performed to reduce the dimensionality of text data, improve computational efficiency, and focus on more informative words. By removing stop words, the remaining words in the text used to enhance the accuracy of NLP task. After stop word removal, the filtered tokens only contain the words that are not considered stop words.
  • 12. Python Code: stopwords.words('english') is a function that returns a list of commonly used stop words in the English language. from nltk.corpus import stopwords a = set(stopwords.words('english')) print(a) To remove a stop words in given sentence : words = [word for word in data.split() if word.lower() not in a] new_text = " ".join(words) print(new_text)
  • 13.
  • 14. 3.STEMMING Stemming is a Process of Reducing Words (normalization of words) into its Base form(Root form/Stem form) Example: 1.John ate Pizza John—John Ate----eat Pizza---pizza 2.Stemmer, stemming, stemmed --- stem
  • 15. Types of Stemming: 1. Porter Stemming : The Porter stemming applies a set of rules to remove common English word suffixes. 2. Snowball Stemming : Snowball enables stemming for various languages beyond English and provides a framework for creating new stemming algorithms. 3. Lancaster Stemming : It applies a set of rules to remove English word suffixes and aims for a more aggressive reduction of words to their stems compared to the Porter algorithm.
  • 16. 4.N –GRAM LANGUAGE MODEL  NLP N-Grams are useful to create features from text corpus for machine learning algorithms like SVM, Naive Bayes, etc. Due to their frequent uses, n-gram models for n=1,2,3 have specific names as Unigram, Bigram, and Trigram models The "N" in N-gram refers to the number of items considered in the sequence. N-Grams are useful for creating capabilities like autocorrect, autocompletion of sentences, text summarization, speech recognition, etc.
  • 17.
  • 18. Python code: from nltk.util import ngrams data = "All work and no play makes jack a dull boy, all work and no play" n = 1 unigrams = ngrams(data.split(), n) for item in unigrams: print(item) Output: n=1 n=2
  • 19. 5.NAME ENTITY RECOGNITION(NER) Named Entity Recognition (NER) is a subtask in NLP that focuses on identifying and classifying named entities in text into predefined categories such as person names, organizations, locations, dates, and more. The goal of NER is to extract meaningful information from unstructured text by recognizing and labeling named entities. “Apple stock prices are going up” Apple as a fruit or company ?
  • 22. Python code: import nltk from nltk import sent_tokenize, word_tokenize, pos_tag, ne_chunk sentence = "Apple Inc. is planning to open a new store in New York on July 15th." for chunk in ne_chunk(pos_tag(word_tokenize(sent_tokenize(sentence)[0]))): if hasattr(chunk, 'label'): print(chunk.label(), ' '.join(c[0] for c in chunk))
  • 23. 6. CHUNKING & PART-OF-SPEECH (POS) In chunking, the process typically involves two steps: 1.Part-of-speech (POS) tagging 2.Chunking
  • 24. PART-OF-SPEECH (POS) TAGS Each word or token in a sentence is assigned a part-of-speech tag, indicating its grammatical category (noun, verb, adjective, etc.). POS tagging helps in identifying the role and function of each word in the sentence. It is also called grammatical tagging.
  • 25. Python Code: import nltk from nltk import sent_tokenize, word_tokenize, pos_tag, ne_chunk d = "The dog ate the cat" tokenize_text=word_tokenize(d) nltk.pos_tag(tokenize_text)
  • 26. CHUNKING Based on the POS tags, patterns or rules are applied to group consecutive words into chunks. Picking up Individual pieces of information and grouping them into bigger pieces
  • 27. 2.SYNTACTIC ANALYSIS or PARSING It is the process of analysing the natural language with the rules of formal grammar to find out the dictionary meaning of any sentence. Syntax analysis checks the text for meaningfulness comparing to the rules of formal grammar. Example: Delhi is the capital of India. Is Delhi the of India capital.
  • 28. 3.SEMANTIC ANALYSIS The work of semantic analyser is to check the text for meaningfulness. The goal of semantic analysis is to enable machines to understand and interpret human language in a way that goes beyond the mere surface-level syntactic structure. Example: She drank some Milk She drank some Books
  • 29. 4.DISCOURSE ANALYSIS Discourse analysis is help us to understand how language is used in real-world contexts. It focuses on analyzing the structure, coherence, and meaning of texts or spoken interactions within their social and cultural contexts. Example: Monkeys Eat Banana, When they Wake up. Who is they here? Monkey Monkeys eat Banana, When they ripe. Who is they here? Banana
  • 30. 5.PRAGMATIC ANALYSIS Pragmatic analysis in NLP (Natural Language Processing) is then defined as the process of extracting information from any given text. Pragmatic analysis takes into account the speaker's intention, the listener's understanding, and the social context in which Example: Close the Door Type: Order Please, close the door Type: Request, Affirmation
  • 32. APPLICATIONS OF NLP Core Tasks Industry Specific General Applications
  • 34. VIRTUAL AGENTS Software programs that simulate the tasks such as managing schedules, handling travel needs, booking appointments, sending reminders , playing music, or controlling smart home devices and password resets are known as Virtual Assistants. However, its functions are slightly more advanced than chatbots. Example: Virtual agents are commonly used in applications like CUSTOMER SUPPORT, where they can handle frequently asked questions, troubleshoot issues, or guide users through processes.
  • 35. VIRTUAL AGENTS ENTERPRISE INDUSTRY CASE STUDY IBM SOLUTION: https://www.ibm.com/case-studies/autodesk-inc
  • 36. PROBLEM STATEMENT: As the company switched from a desktop licensing model to a SaaS model, its reach improved. But with that surge came an increase in customer inquiries.  Sometimes with heavy volume and complex issues, the resolution time for questions was 1.5 days or more. Autodesk’s staff of about 350 customer support agents handles roughly one million customer and partner contacts per year. About half of these are simple activation code requests, changes of address, contract problems, and technical issues. Spratto, Vice President of Operations at Autodesk, said: “A lot of what my team does is just problem recognition, trying to identify what the person wants or is asking.”