DLO8012: Natural Language
Processing
Subject Teacher:
Prof. Vikas Dubey
RIZVI COLLEGE OF ENGINEERING
BANDRA (W), MUMBAI
Module-3
Syntax Analysis
CO-3 [10hrs]
CO-3: Be able to model linguistic phenomena with formal grammars.
Conditional Probability and Tags
• P(Verb) is the probability of a randomly selected word being a verb.
• P(Verb|race) is “what’s the probability of a word being a verb given that it’s
the word ‘race’?”
• Race can be a noun or a verb.
• It’s more likely to be a noun.
• P(Verb|race) can be estimated by looking at some corpus and asking “out of
all the times we saw ‘race’, how many were verbs?”
• In the Brown corpus, “race” occurs 98 times, 96 of them as a noun, so
P(Noun|race) = 96/98 = .98
• How to calculate for a tag sequence, say P(NN|DT)?
P(V | race) = Count(race as verb) / Count(race)
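The ratio above is a plain maximum-likelihood estimate; a minimal sketch, using the slide’s Brown-corpus counts (96 of the 98 occurrences of “race” carrying the majority tag):

```python
# Maximum-likelihood estimate of a tag probability from corpus counts.
# The counts (96 out of 98) are the Brown-corpus figures quoted above.

def cond_tag_prob(tag_count: int, word_count: int) -> float:
    """P(tag | word) = Count(word with tag) / Count(word)."""
    return tag_count / word_count

p = cond_tag_prob(96, 98)
print(round(p, 2))  # 0.98
```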
Prof. Vikas Dubey | RCOE | COMP | NLP | BE 2021-22
Stochastic Tagging
• Stochastic taggers generally resolve tagging ambiguities by using a
training corpus to compute the probability of a given word having a
given tag in a given context.
• A stochastic tagger is also called an HMM tagger, a Maximum Likelihood
tagger, or a Markov model tagger, since it is based on the Hidden Markov
Model.
• For a given word sequence, Hidden Markov Model (HMM) Taggers
choose the tag sequence that maximizes,
P(word | tag) * P(tag | previous-n-tags)
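For a single word this maximization can be sketched in a few lines. The probability values below are illustrative stand-ins loosely based on the well-known “to race” textbook example, not trained estimates:

```python
# Score each candidate tag for a word by P(word | tag) * P(tag | previous-tag),
# then take the argmax. All probability values are illustrative only.

emission = {("race", "NN"): 0.00057, ("race", "VB"): 0.00012}   # P(word | tag)
transition = {("TO", "NN"): 0.00047, ("TO", "VB"): 0.83}        # P(tag | prev-tag)

def score(word, tag, prev_tag):
    return emission.get((word, tag), 0.0) * transition.get((prev_tag, tag), 0.0)

# After the tag TO (as in "to race"), the verb reading wins:
best_tag = max(["NN", "VB"], key=lambda t: score("race", t, "TO"))
print(best_tag)  # VB
```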
Stochastic Tagging
• A bigram HMM tagger chooses the tag ti for word wi that is most
probable given the previous tag, ti-1
ti = argmaxj P(tj | ti-1, wi)
• From the chain rule for probability factorization, the joint probability of
the tag and word sequences can be decomposed exactly.
• Some approximations are introduced to simplify the model.
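The equations for this slide did not survive extraction; the standard decomposition and approximations they describe (consistent with the bigram model on the surrounding slides) are, as a reconstruction:

```latex
% Chain rule: exact factorization of the joint probability of tags and words
P(t_1 \ldots t_n, w_1 \ldots w_n)
  = \prod_{i=1}^{n} P(t_i \mid t_1 \ldots t_{i-1}, w_1 \ldots w_{i-1})\,
                    P(w_i \mid t_1 \ldots t_i,     w_1 \ldots w_{i-1})

% Approximations: a tag depends only on the preceding tag(s),
% and a word depends only on its own tag
P(t_i \mid t_1 \ldots t_{i-1}, w_1 \ldots w_{i-1}) \approx P(t_i \mid t_{i-1})
\qquad
P(w_i \mid t_1 \ldots t_i, w_1 \ldots w_{i-1}) \approx P(w_i \mid t_i)
```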
Stochastic Tagging
• The word probability depends only on the tag
• The dependence of a tag on the preceding tag history is limited in
time, i.e., a tag depends only on the two preceding ones.
Statistical POS Tagging (Allen95)
• Let’s step back a minute and remember some probability theory and its use in
POS tagging.
• Suppose, with no context, we just want to know given the word “flies” whether
it should be tagged as a noun or as a verb.
• We use conditional probability for this: we want to know which is greater
PROB(N | flies) or PROB(V | flies)
• Note definition of conditional probability
PROB(a | b) = PROB(a & b) / PROB(b)
– Where PROB(a & b) is the probability of the two events a & b occurring
simultaneously
Calculating POS for “flies”
We need to know which is greater:
• PROB(N | flies) = PROB(flies & N) / PROB(flies)
• PROB(V | flies) = PROB(flies & V) / PROB(flies)
• Count on a Corpus
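“Count on a corpus” takes only a few lines; a sketch over an invented toy tagged corpus (the word/tag pairs are ours, purely for illustration):

```python
# Estimate PROB(N | flies) and PROB(V | flies) by relative frequency
# over a tiny, invented tagged corpus.
from collections import Counter

corpus = [("flies", "N"), ("time", "N"), ("flies", "V"),
          ("flies", "N"), ("like", "V"), ("flies", "N")]

pair_counts = Counter(corpus)
flies_total = sum(c for (w, _t), c in pair_counts.items() if w == "flies")

prob_n = pair_counts[("flies", "N")] / flies_total  # PROB(flies & N) / PROB(flies)
prob_v = pair_counts[("flies", "V")] / flies_total
print(prob_n, prob_v)  # 0.75 0.25 -> tag "flies" as a noun in this corpus
```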
Stochastic Tagging
• The simplest stochastic taggers apply the following approaches for POS
tagging:
Approach 1: Word Frequency Approach
• In this approach, the stochastic taggers disambiguate the words based on
the probability that a word occurs with a particular tag.
• We can also say that the tag encountered most frequently with the word in
the training set is the one assigned to an ambiguous instance of that word.
• The main issue with this approach is that it may yield inadmissible
sequences of tags.
Stochastic Tagging
• Assign each word its most likely POS tag
– If w has tags t1, …, tk, then we can use
– P(ti | w) = c(w,ti )/(c(w,t1) + … + c(w,tk)), where
– c(w, ti ) = number of times w/ti appears in the corpus
– Success: 91% for English
Example heat :: noun/89, verb/5
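Plugging the slide’s “heat” counts into the formula above:

```python
# P(noun | heat) with the slide's training counts: noun 89 times, verb 5 times.
counts = {"noun": 89, "verb": 5}

p_noun = counts["noun"] / sum(counts.values())  # 89 / 94
most_likely = max(counts, key=counts.get)

print(most_likely, round(p_noun, 3))  # noun 0.947
```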
Stochastic Tagging
Approach 2: Tag Sequence Probabilities
• It is another approach to stochastic tagging, in which the tagger
calculates the probability of a given sequence of tags occurring.
• It is also called the n-gram approach.
• It is called so because the best tag for a given word is determined by
the probability with which it occurs with the n previous tags.
Stochastic Tagging
• Given: sequence of words W
– W = w1,w2,…,wn (a sentence)
– e.g., W = heat water in a large vessel
• Assign sequence of tags T:
• T = t1, t2, … , tn
• Find T that maximizes P(T | W)
Stochastic Tagging
• But P(ti|wi) is difficult to compute directly, so the Bayesian
classification rule is used:
P(x|y) = P(x) P(y|x) / P(y)
• When applied to the sequence of words, the most probable tag sequence
would be
P(ti|wi) = P(ti) P(wi|ti)/P(wi)
• where P(wi) does not change across tag sequences and thus does not need
to be calculated
• Thus, the most probable tag sequence is the product of two probabilities for
each possible sequence:
– Prior probability of the tag sequence (the context): P(ti)
– Likelihood of the sequence of words given the sequence of (hidden)
tags: P(wi|ti)
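Since P(wi) is the same for every candidate tag, it can be dropped and the two remaining factors compared directly; a sketch with invented prior and likelihood values:

```python
# Rank tags by prior * likelihood, i.e. P(t) * P(w | t); the constant
# denominator P(w) is omitted. All values are invented for illustration.
prior = {"NN": 0.30, "VB": 0.15}                            # P(t)
likelihood = {("dog", "NN"): 0.002, ("dog", "VB"): 0.0001}  # P(w | t)

def posterior_score(word, tag):
    return prior[tag] * likelihood.get((word, tag), 0.0)

best = max(prior, key=lambda t: posterior_score("dog", t))
print(best)  # NN
```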
Stochastic Tagging
• Two simplifications for computing the most probable sequence of tags:
– The prior probability of the part-of-speech tag of a word depends only
on the tag of the previous word (bigrams: the context is reduced to the
previous tag). This facilitates the computation of P(ti)
– Ex. Probability of noun after determiner
– The probability of a word depends only on its part-of-speech tag
(independent of the other words in the context). This facilitates the
computation of the likelihood P(wi|ti)
• Ex. given the tag noun, probability of word dog
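Under these two simplifications, the most probable tag sequence can be found efficiently with the Viterbi algorithm. A compact sketch; the tag set and every probability table below are toy values of our own, not trained estimates:

```python
# Viterbi decoding with bigram transitions P(t_i | t_{i-1}) and
# emissions P(w_i | t_i). All probability tables are toy values.

def viterbi(words, tags, start, trans, emit):
    """Return the most probable tag sequence for `words`."""
    # best[t] = (probability of the best path ending in tag t, that path)
    best = {t: (start.get(t, 0.0) * emit.get((words[0], t), 0.0), [t])
            for t in tags}
    for w in words[1:]:
        new = {}
        for t in tags:
            # Best predecessor for tag t at this position
            p, path = max(((best[prev][0] * trans.get((prev, t), 0.0),
                            best[prev][1]) for prev in tags),
                          key=lambda x: x[0])
            new[t] = (p * emit.get((w, t), 0.0), path + [t])
        best = new
    return max(best.values(), key=lambda x: x[0])[1]

tags = ["DT", "NN", "VB"]
start = {"DT": 0.6, "NN": 0.3, "VB": 0.1}                  # P(t_1)
trans = {("DT", "NN"): 0.8, ("NN", "VB"): 0.5,
         ("NN", "NN"): 0.3, ("VB", "NN"): 0.4}             # P(t_i | t_{i-1})
emit = {("the", "DT"): 0.7, ("dog", "NN"): 0.04,
        ("runs", "VB"): 0.03, ("runs", "NN"): 0.001}       # P(w_i | t_i)

print(viterbi(["the", "dog", "runs"], tags, start, trans, emit))
# ['DT', 'NN', 'VB']
```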
Stochastic Tagging
• Based on probability of certain tag occurring given
various possibilities
• Necessitates a training corpus
• No probabilities are available for words not in the corpus.
• The training corpus may be too different from the test corpus.
Stochastic Tagging (cont.)
Simple method: choose the most frequent tag in the training text
for each word!
– Result: 90% accuracy
– Why?
– This is a baseline: other methods will do better
– The HMM tagger is one example
Thank You…
17

More Related Content

What's hot

Design cycles of pattern recognition
Design cycles of pattern recognitionDesign cycles of pattern recognition
Design cycles of pattern recognition
Al Mamun
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
Saurabh Kaushik
 
NAMED ENTITY RECOGNITION
NAMED ENTITY RECOGNITIONNAMED ENTITY RECOGNITION
NAMED ENTITY RECOGNITION
live_and_let_live
 
Text classification
Text classificationText classification
Text classification
Harry Potter
 
Linear Regression Algorithm | Linear Regression in Python | Machine Learning ...
Linear Regression Algorithm | Linear Regression in Python | Machine Learning ...Linear Regression Algorithm | Linear Regression in Python | Machine Learning ...
Linear Regression Algorithm | Linear Regression in Python | Machine Learning ...
Edureka!
 
Open vocabulary problem
Open vocabulary problemOpen vocabulary problem
Open vocabulary problem
JaeHo Jang
 
Recurrent neural networks rnn
Recurrent neural networks   rnnRecurrent neural networks   rnn
Recurrent neural networks rnn
Kuppusamy P
 
0.0 Introduction to theory of computation
0.0 Introduction to theory of computation0.0 Introduction to theory of computation
0.0 Introduction to theory of computation
Sampath Kumar S
 
String matching algorithms
String matching algorithmsString matching algorithms
String matching algorithms
Dr Shashikant Athawale
 
Matrix Factorization In Recommender Systems
Matrix Factorization In Recommender SystemsMatrix Factorization In Recommender Systems
Matrix Factorization In Recommender Systems
YONG ZHENG
 
The Role of Natural Language Processing in Information Retrieval
The Role of Natural Language Processing in Information RetrievalThe Role of Natural Language Processing in Information Retrieval
The Role of Natural Language Processing in Information Retrieval
Tony Russell-Rose
 
Deciability (automata presentation)
Deciability (automata presentation)Deciability (automata presentation)
Deciability (automata presentation)
Sagar Kumar
 
1.2. introduction to automata theory
1.2. introduction to automata theory1.2. introduction to automata theory
1.2. introduction to automata theory
Sampath Kumar S
 
Transfer Learning
Transfer LearningTransfer Learning
Transfer Learning
Hichem Felouat
 
NLP State of the Art | BERT
NLP State of the Art | BERTNLP State of the Art | BERT
NLP State of the Art | BERT
shaurya uppal
 
Support Vector Machine ppt presentation
Support Vector Machine ppt presentationSupport Vector Machine ppt presentation
Support Vector Machine ppt presentation
AyanaRukasar
 
K Nearest Neighbors
K Nearest NeighborsK Nearest Neighbors
Word2Vec
Word2VecWord2Vec
Word2Vec
hyunyoung Lee
 
Attention Is All You Need
Attention Is All You NeedAttention Is All You Need
Attention Is All You Need
Illia Polosukhin
 
Text classification presentation
Text classification presentationText classification presentation
Text classification presentation
Marijn van Zelst
 

What's hot (20)

Design cycles of pattern recognition
Design cycles of pattern recognitionDesign cycles of pattern recognition
Design cycles of pattern recognition
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
 
NAMED ENTITY RECOGNITION
NAMED ENTITY RECOGNITIONNAMED ENTITY RECOGNITION
NAMED ENTITY RECOGNITION
 
Text classification
Text classificationText classification
Text classification
 
Linear Regression Algorithm | Linear Regression in Python | Machine Learning ...
Linear Regression Algorithm | Linear Regression in Python | Machine Learning ...Linear Regression Algorithm | Linear Regression in Python | Machine Learning ...
Linear Regression Algorithm | Linear Regression in Python | Machine Learning ...
 
Open vocabulary problem
Open vocabulary problemOpen vocabulary problem
Open vocabulary problem
 
Recurrent neural networks rnn
Recurrent neural networks   rnnRecurrent neural networks   rnn
Recurrent neural networks rnn
 
0.0 Introduction to theory of computation
0.0 Introduction to theory of computation0.0 Introduction to theory of computation
0.0 Introduction to theory of computation
 
String matching algorithms
String matching algorithmsString matching algorithms
String matching algorithms
 
Matrix Factorization In Recommender Systems
Matrix Factorization In Recommender SystemsMatrix Factorization In Recommender Systems
Matrix Factorization In Recommender Systems
 
The Role of Natural Language Processing in Information Retrieval
The Role of Natural Language Processing in Information RetrievalThe Role of Natural Language Processing in Information Retrieval
The Role of Natural Language Processing in Information Retrieval
 
Deciability (automata presentation)
Deciability (automata presentation)Deciability (automata presentation)
Deciability (automata presentation)
 
1.2. introduction to automata theory
1.2. introduction to automata theory1.2. introduction to automata theory
1.2. introduction to automata theory
 
Transfer Learning
Transfer LearningTransfer Learning
Transfer Learning
 
NLP State of the Art | BERT
NLP State of the Art | BERTNLP State of the Art | BERT
NLP State of the Art | BERT
 
Support Vector Machine ppt presentation
Support Vector Machine ppt presentationSupport Vector Machine ppt presentation
Support Vector Machine ppt presentation
 
K Nearest Neighbors
K Nearest NeighborsK Nearest Neighbors
K Nearest Neighbors
 
Word2Vec
Word2VecWord2Vec
Word2Vec
 
Attention Is All You Need
Attention Is All You NeedAttention Is All You Need
Attention Is All You Need
 
Text classification presentation
Text classification presentationText classification presentation
Text classification presentation
 

Similar to Lecture-18(11-02-22)Stochastics POS Tagging.pdf

Parts of Speect Tagging
Parts of Speect TaggingParts of Speect Tagging
Parts of Speect Tagging
theyaseen51
 
Sentiment analysis using naive bayes classifier
Sentiment analysis using naive bayes classifier Sentiment analysis using naive bayes classifier
Sentiment analysis using naive bayes classifier
Dev Sahu
 
2-Chapter Two-N-gram Language Models.ppt
2-Chapter Two-N-gram Language Models.ppt2-Chapter Two-N-gram Language Models.ppt
2-Chapter Two-N-gram Language Models.ppt
milkesa13
 
What is word2vec?
What is word2vec?What is word2vec?
What is word2vec?
Traian Rebedea
 
Enriching Word Vectors with Subword Information
Enriching Word Vectors with Subword InformationEnriching Word Vectors with Subword Information
Enriching Word Vectors with Subword Information
Seonghyun Kim
 
NLP_KASHK:Evaluating Language Model
NLP_KASHK:Evaluating Language ModelNLP_KASHK:Evaluating Language Model
NLP_KASHK:Evaluating Language Model
Hemantha Kulathilake
 
pptphrase-tagset-mapping-for-french-and-english-treebanks-and-its-application...
pptphrase-tagset-mapping-for-french-and-english-treebanks-and-its-application...pptphrase-tagset-mapping-for-french-and-english-treebanks-and-its-application...
pptphrase-tagset-mapping-for-french-and-english-treebanks-and-its-application...
Lifeng (Aaron) Han
 
Pptphrase tagset mapping for french and english treebanks and its application...
Pptphrase tagset mapping for french and english treebanks and its application...Pptphrase tagset mapping for french and english treebanks and its application...
Pptphrase tagset mapping for french and english treebanks and its application...
Lifeng (Aaron) Han
 
A REVIEW ON PARTS-OF-SPEECH TECHNOLOGIES
A REVIEW ON PARTS-OF-SPEECH TECHNOLOGIESA REVIEW ON PARTS-OF-SPEECH TECHNOLOGIES
A REVIEW ON PARTS-OF-SPEECH TECHNOLOGIES
IJCSES Journal
 
Master defence 2020 - Anastasiia Khaburska - Statistical and Neural Language ...
Master defence 2020 - Anastasiia Khaburska - Statistical and Neural Language ...Master defence 2020 - Anastasiia Khaburska - Statistical and Neural Language ...
Master defence 2020 - Anastasiia Khaburska - Statistical and Neural Language ...
Lviv Data Science Summer School
 
Tiancheng Zhao - 2017 - Learning Discourse-level Diversity for Neural Dialog...
Tiancheng Zhao - 2017 -  Learning Discourse-level Diversity for Neural Dialog...Tiancheng Zhao - 2017 -  Learning Discourse-level Diversity for Neural Dialog...
Tiancheng Zhao - 2017 - Learning Discourse-level Diversity for Neural Dialog...
Association for Computational Linguistics
 
Statistical machine translation
Statistical machine translationStatistical machine translation
Statistical machine translation
asimuop
 
Lecture 3: Semantic Role Labelling
Lecture 3: Semantic Role LabellingLecture 3: Semantic Role Labelling
Lecture 3: Semantic Role Labelling
Marina Santini
 
word level analysis
word level analysis word level analysis
word level analysis
tjs1
 
Pptphrase tagset mapping for french and english treebanks and its application...
Pptphrase tagset mapping for french and english treebanks and its application...Pptphrase tagset mapping for french and english treebanks and its application...
Pptphrase tagset mapping for french and english treebanks and its application...
Lifeng (Aaron) Han
 
Barreiro-Batista-LR4NLP@Coling2018-presentation
Barreiro-Batista-LR4NLP@Coling2018-presentationBarreiro-Batista-LR4NLP@Coling2018-presentation
Barreiro-Batista-LR4NLP@Coling2018-presentation
INESC-ID (Spoken Language Systems Laboratory - L2F)
 
Poscat seminar 8
Poscat seminar 8Poscat seminar 8
Poscat seminar 8
Hyungyu Shin
 
AN ADVANCED APPROACH FOR RULE BASED ENGLISH TO BENGALI MACHINE TRANSLATION
AN ADVANCED APPROACH FOR RULE BASED ENGLISH TO BENGALI MACHINE TRANSLATIONAN ADVANCED APPROACH FOR RULE BASED ENGLISH TO BENGALI MACHINE TRANSLATION
AN ADVANCED APPROACH FOR RULE BASED ENGLISH TO BENGALI MACHINE TRANSLATION
cscpconf
 
Dual Learning for Machine Translation (NIPS 2016)
Dual Learning for Machine Translation (NIPS 2016)Dual Learning for Machine Translation (NIPS 2016)
Dual Learning for Machine Translation (NIPS 2016)
Toru Fujino
 
Word2vec slide(lab seminar)
Word2vec slide(lab seminar)Word2vec slide(lab seminar)
Word2vec slide(lab seminar)
Jinpyo Lee
 

Similar to Lecture-18(11-02-22)Stochastics POS Tagging.pdf (20)

Parts of Speect Tagging
Parts of Speect TaggingParts of Speect Tagging
Parts of Speect Tagging
 
Sentiment analysis using naive bayes classifier
Sentiment analysis using naive bayes classifier Sentiment analysis using naive bayes classifier
Sentiment analysis using naive bayes classifier
 
2-Chapter Two-N-gram Language Models.ppt
2-Chapter Two-N-gram Language Models.ppt2-Chapter Two-N-gram Language Models.ppt
2-Chapter Two-N-gram Language Models.ppt
 
What is word2vec?
What is word2vec?What is word2vec?
What is word2vec?
 
Enriching Word Vectors with Subword Information
Enriching Word Vectors with Subword InformationEnriching Word Vectors with Subword Information
Enriching Word Vectors with Subword Information
 
NLP_KASHK:Evaluating Language Model
NLP_KASHK:Evaluating Language ModelNLP_KASHK:Evaluating Language Model
NLP_KASHK:Evaluating Language Model
 
pptphrase-tagset-mapping-for-french-and-english-treebanks-and-its-application...
pptphrase-tagset-mapping-for-french-and-english-treebanks-and-its-application...pptphrase-tagset-mapping-for-french-and-english-treebanks-and-its-application...
pptphrase-tagset-mapping-for-french-and-english-treebanks-and-its-application...
 
Pptphrase tagset mapping for french and english treebanks and its application...
Pptphrase tagset mapping for french and english treebanks and its application...Pptphrase tagset mapping for french and english treebanks and its application...
Pptphrase tagset mapping for french and english treebanks and its application...
 
A REVIEW ON PARTS-OF-SPEECH TECHNOLOGIES
A REVIEW ON PARTS-OF-SPEECH TECHNOLOGIESA REVIEW ON PARTS-OF-SPEECH TECHNOLOGIES
A REVIEW ON PARTS-OF-SPEECH TECHNOLOGIES
 
Master defence 2020 - Anastasiia Khaburska - Statistical and Neural Language ...
Master defence 2020 - Anastasiia Khaburska - Statistical and Neural Language ...Master defence 2020 - Anastasiia Khaburska - Statistical and Neural Language ...
Master defence 2020 - Anastasiia Khaburska - Statistical and Neural Language ...
 
Tiancheng Zhao - 2017 - Learning Discourse-level Diversity for Neural Dialog...
Tiancheng Zhao - 2017 -  Learning Discourse-level Diversity for Neural Dialog...Tiancheng Zhao - 2017 -  Learning Discourse-level Diversity for Neural Dialog...
Tiancheng Zhao - 2017 - Learning Discourse-level Diversity for Neural Dialog...
 
Statistical machine translation
Statistical machine translationStatistical machine translation
Statistical machine translation
 
Lecture 3: Semantic Role Labelling
Lecture 3: Semantic Role LabellingLecture 3: Semantic Role Labelling
Lecture 3: Semantic Role Labelling
 
word level analysis
word level analysis word level analysis
word level analysis
 
Pptphrase tagset mapping for french and english treebanks and its application...
Pptphrase tagset mapping for french and english treebanks and its application...Pptphrase tagset mapping for french and english treebanks and its application...
Pptphrase tagset mapping for french and english treebanks and its application...
 
Barreiro-Batista-LR4NLP@Coling2018-presentation
Barreiro-Batista-LR4NLP@Coling2018-presentationBarreiro-Batista-LR4NLP@Coling2018-presentation
Barreiro-Batista-LR4NLP@Coling2018-presentation
 
Poscat seminar 8
Poscat seminar 8Poscat seminar 8
Poscat seminar 8
 
AN ADVANCED APPROACH FOR RULE BASED ENGLISH TO BENGALI MACHINE TRANSLATION
AN ADVANCED APPROACH FOR RULE BASED ENGLISH TO BENGALI MACHINE TRANSLATIONAN ADVANCED APPROACH FOR RULE BASED ENGLISH TO BENGALI MACHINE TRANSLATION
AN ADVANCED APPROACH FOR RULE BASED ENGLISH TO BENGALI MACHINE TRANSLATION
 
Dual Learning for Machine Translation (NIPS 2016)
Dual Learning for Machine Translation (NIPS 2016)Dual Learning for Machine Translation (NIPS 2016)
Dual Learning for Machine Translation (NIPS 2016)
 
Word2vec slide(lab seminar)
Word2vec slide(lab seminar)Word2vec slide(lab seminar)
Word2vec slide(lab seminar)
 

Recently uploaded

How to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For FlutterHow to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For Flutter
Daiki Mogmet Ito
 
UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5
DianaGray10
 
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
Edge AI and Vision Alliance
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
Matthew Sinclair
 
Artificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopmentArtificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopment
Octavian Nadolu
 
Programming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup SlidesProgramming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup Slides
Zilliz
 
Full-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalizationFull-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalization
Zilliz
 
HCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAUHCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAU
panagenda
 
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
Neo4j
 
UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6
DianaGray10
 
Serial Arm Control in Real Time Presentation
Serial Arm Control in Real Time PresentationSerial Arm Control in Real Time Presentation
Serial Arm Control in Real Time Presentation
tolgahangng
 
RESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for studentsRESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for students
KAMESHS29
 
Driving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success StoryDriving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success Story
Safe Software
 
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
SOFTTECHHUB
 
Pushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 daysPushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 days
Adtran
 
Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1
DianaGray10
 
Building Production Ready Search Pipelines with Spark and Milvus
Building Production Ready Search Pipelines with Spark and MilvusBuilding Production Ready Search Pipelines with Spark and Milvus
Building Production Ready Search Pipelines with Spark and Milvus
Zilliz
 
“I’m still / I’m still / Chaining from the Block”
“I’m still / I’m still / Chaining from the Block”“I’m still / I’m still / Chaining from the Block”
“I’m still / I’m still / Chaining from the Block”
Claudio Di Ciccio
 
Mind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AIMind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AI
Kumud Singh
 
20240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 202420240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 2024
Matthew Sinclair
 

Recently uploaded (20)

How to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For FlutterHow to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For Flutter
 
UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5
 
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
 
Artificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopmentArtificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopment
 
Programming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup SlidesProgramming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup Slides
 
Full-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalizationFull-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalization
 
HCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAUHCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAU
 
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
 
UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6
 
Serial Arm Control in Real Time Presentation
Serial Arm Control in Real Time PresentationSerial Arm Control in Real Time Presentation
Serial Arm Control in Real Time Presentation
 
RESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for studentsRESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for students
 
Driving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success StoryDriving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success Story
 
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
 
Pushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 daysPushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 days
 
Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1
 
Building Production Ready Search Pipelines with Spark and Milvus
Building Production Ready Search Pipelines with Spark and MilvusBuilding Production Ready Search Pipelines with Spark and Milvus
Building Production Ready Search Pipelines with Spark and Milvus
 
“I’m still / I’m still / Chaining from the Block”
“I’m still / I’m still / Chaining from the Block”“I’m still / I’m still / Chaining from the Block”
“I’m still / I’m still / Chaining from the Block”
 
Mind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AIMind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AI
 
20240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 202420240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 2024
 

Lecture-18(11-02-22)Stochastics POS Tagging.pdf

  • 1. DLO8012: Natural Language Processing Subject Teacher: Prof. Vikas Dubey RIZVI COLLEGE OF ENGINEERING BANDRA(W),MUMBAI 1
  • 2. Module-3 Syntax Analysis CO-3 [10hrs] CO-3: Be able to model linguistic phenomena with formal grammars. 2
  • 3. 3 Conditional Probability and Tags • P(Verb) is the probability of a randomly selected word being a verb. • P(Verb|race) is “what’s the probability of a word being a verb given that it’s the word “race”? • Race can be a noun or a verb. • It’s more likely to be a noun. • P(Verb|race) can be estimated by looking at some corpus and saying “out of all the times we saw ‘race’, how many were verbs? • In Brown corpus, P(Verb|race) = 96/98 = .98 • How to calculate for a tag sequence, say P(NN|DT)?  P(V | race) = Count(race is verb) total Count(race) Prof. Vikas Dubey | RCOE | COMP | NLP | BE 2021-22
  • 4. Stochastic Tagging • Stochastic taggers generally resolve tagging ambiguities by using a training corpus to compute the probability of a given word having a given tag in a given context. • Stochastic tagger called also HMM tagger or a Maximum Likelihood Tagger, or a Markov model HMM TAGGER tagger, based on the Hidden Markov Model. • For a given word sequence, Hidden Markov Model (HMM) Taggers choose the tag sequence that maximizes, P(word | tag) * P(tag | previous-n-tags) Prof. Vikas Dubey | RCOE | COMP | NLP | BE 2021-22 4
  • 5. Stochastic Tagging • A bigram HMM tagger chooses the tag ti for word wi that is most probable given the previous tag, ti-1 ti = argmaxj P(tj | ti-1, wi) • From the chain rule for probability factorization, • Some approximation are introduced to simplify the model, such as Prof. Vikas Dubey | RCOE | COMP | NLP | BE 2021-22 5
  • 6. Stochastic Tagging • The word probability depends only on the tag • The dependence of a tag from the preceding tag history is limited in time, e.i. a tag depends only on the two preceding ones, Prof. Vikas Dubey | RCOE | COMP | NLP | BE 2021-22 6
  • 7. 7 Statistical POS Tagging (Allen95) • Let’s step back a minute and remember some probability theory and its use in POS tagging. • Suppose, with no context, we just want to know given the word “flies” whether it should be tagged as a noun or as a verb. • We use conditional probability for this: we want to know which is greater PROB(N | flies) or PROB(V | flies) • Note definition of conditional probability PROB(a | b) = PROB(a & b) / PROB(b) – Where PROB(a & b) is the probability of the two events a & b occurring simultaneously Prof. Vikas Dubey | RCOE | COMP | NLP | BE 2021-22
Calculating POS for “flies”
We need to know which is greater:
• PROB(N | flies) = PROB(flies & N) / PROB(flies)
• PROB(V | flies) = PROB(flies & V) / PROB(flies)
• Count on a corpus
Stochastic Tagging
• The simplest stochastic taggers apply the following approaches for POS tagging.
– Approach 1: Word Frequency Approach
• In this approach, the stochastic tagger disambiguates words based on the probability that a word occurs with a particular tag.
• In other words, the tag encountered most frequently with the word in the training set is the one assigned to an ambiguous instance of that word.
• The main issue with this approach is that it may yield inadmissible sequences of tags.
Stochastic Tagging
• Assign each word its most likely POS tag
– If w has tags t1, …, tk, then we can use
– P(ti | w) = c(w, ti) / (c(w, t1) + … + c(w, tk)), where
– c(w, ti) = number of times w/ti appears in the corpus
– Success: 91% for English
Example: heat :: noun/89, verb/5
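The most-frequent-tag baseline above can be sketched directly. The training pairs below are hypothetical (e.g. "heat" appearing as a noun more often than as a verb, echoing the slide's example), not real corpus counts:

```python
from collections import Counter, defaultdict

# Hypothetical (word, tag) training pairs.
training = [
    ("heat", "NN"), ("heat", "NN"), ("heat", "VB"),
    ("water", "NN"), ("in", "IN"), ("a", "DT"),
    ("large", "JJ"), ("vessel", "NN"),
]

# counts[w][t] = c(w, t): how often word w was tagged t in training.
counts = defaultdict(Counter)
for word, tag in training:
    counts[word][tag] += 1

def most_frequent_tag(word):
    """Assign each word the tag it occurred with most often in training."""
    if word not in counts:
        return "NN"  # naive fallback for unseen words
    return counts[word].most_common(1)[0][0]

print([most_frequent_tag(w) for w in "heat water in a large vessel".split()])
# ['NN', 'NN', 'IN', 'DT', 'JJ', 'NN']
```

Note the weakness mentioned on the previous slide: each word is tagged in isolation, so nothing prevents an inadmissible sequence of tags.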
Stochastic Tagging
Approach 2: Tag Sequence Probabilities
• In this approach, the tagger calculates the probability of a given sequence of tags occurring.
• It is also called the n-gram approach, because the best tag for a given word is determined by the probability with which it occurs with the n previous tags.
Stochastic Tagging
• Given: a sequence of words W
– W = w1, w2, …, wn (a sentence)
– e.g., W = heat water in a large vessel
• Assign a sequence of tags T:
– T = t1, t2, …, tn
• Find the T that maximizes P(T | W)
Stochastic Tagging
• But P(ti | wi) is difficult to compute directly, so Bayes’ rule is used:
P(x | y) = P(x) P(y | x) / P(y)
• Applied to the sequence of words, the most probable tag sequence is given by
P(ti | wi) = P(ti) P(wi | ti) / P(wi)
• where P(wi) does not change across tag sequences and thus does not need to be calculated.
• Thus, the most probable tag sequence is the product of two probabilities for each possible sequence:
– the prior probability of the tag sequence (context), P(ti)
– the likelihood of the sequence of words given a sequence of (hidden) tags, P(wi | ti)
Stochastic Tagging
• Two simplifications for computing the most probable sequence of tags:
– The prior probability of the part-of-speech tag of a word depends only on the tag of the previous word (bigrams; reduce the context to the previous tag). This facilitates the computation of P(ti).
• Ex. the probability of a noun after a determiner
– The probability of a word depends only on its part-of-speech tag (independent of other words in the context). This facilitates the computation of the likelihood P(wi | ti).
• Ex. given the tag noun, the probability of the word dog
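With these two simplifications, the best tag sequence can be found efficiently with the Viterbi algorithm, which is the standard decoder for bigram HMM taggers (the slides do not name it explicitly). A minimal sketch with hand-specified, purely illustrative probabilities:

```python
import math

# Hand-specified bigram HMM (hypothetical probabilities, for illustration).
tags = ["NN", "VB"]
trans = {  # P(tag | prev_tag), with "<s>" as the start state
    ("<s>", "NN"): 0.7, ("<s>", "VB"): 0.3,
    ("NN", "NN"): 0.4, ("NN", "VB"): 0.6,
    ("VB", "NN"): 0.8, ("VB", "VB"): 0.2,
}
emit = {  # P(word | tag); unseen pairs get a tiny floor probability
    ("NN", "flies"): 0.05, ("VB", "flies"): 0.02,
    ("NN", "fruit"): 0.06, ("VB", "fruit"): 0.001,
}

def viterbi(words):
    """Return the tag sequence maximizing P(T) * P(W | T) under the bigram HMM."""
    # best[t] = (log-prob of the best path ending in tag t, that path)
    best = {t: (math.log(trans[("<s>", t)])
                + math.log(emit.get((t, words[0]), 1e-12)), [t])
            for t in tags}
    for w in words[1:]:
        new_best = {}
        for t in tags:
            e = math.log(emit.get((t, w), 1e-12))
            score, prev = max(
                (best[p][0] + math.log(trans[(p, t)]) + e, p) for p in tags)
            new_best[t] = (score, best[prev][1] + [t])
        best = new_best
    return max(best.values())[1]

print(viterbi(["fruit", "flies"]))  # ['NN', 'NN'] under these toy numbers
```

Working in log space avoids numerical underflow when multiplying many small probabilities, and keeping only the best path per tag at each position gives O(n * |tags|^2) time instead of enumerating all tag sequences.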
Stochastic Tagging
• Based on the probability of a certain tag occurring, given various possibilities
• Necessitates a training corpus
• No probabilities for words not in the corpus
• The training corpus may be too different from the test corpus
Stochastic Tagging (cont.)
Simple Method: Choose the most frequent tag in the training text for each word!
– Result: 90% accuracy
– Why?
– Baseline: others will do better
– HMM is an example