Audisankara College of Engineering and Technology (A) :: Gudur, AP.
Department of CSE
Subject Name: Natural Language Processing
Semester: Sixth Sem
VENKATA RATHNAM
Associate Professor, Department of CSE, ASCET
NATURAL LANGUAGE PROCESSING (20DS602)
Topic Name: SYNTACTIC ANALYSIS
UNIT-II: SYLLABUS
• English Word Classes, The Penn Treebank Part-of-Speech Tagset,
Part-of-Speech Tagging, HMM Part-of-Speech Tagging, Maximum
Entropy Markov Models, Grammar Rules for English, Treebanks,
Grammar Equivalence and Normal Form, Lexicalized Grammar.
ENGLISH WORD CLASSES:
 There are four main word classes: nouns, verbs, adjectives, and adverbs.
 Nouns
Nouns are the words we use to describe people, places, objects, feelings,
concepts, etc. Usually, nouns are tangible (touchable) things, such as a table, a
person, or a building.
EX:My sister went to school.
 Verbs
Verbs are words that show an action, event, feeling, or state of being. This
can be a physical action or event, or it can be a feeling that is experienced.
EX:She wished for a sunny day.
 Adjectives
Adjectives are words used to modify nouns, usually by describing them.
Adjectives describe an attribute, quality, or state of being of the noun.
EX:The friendly woman wore a beautiful dress.
 Adverbs
Adverbs are words that work alongside verbs, adjectives, and other
adverbs. They provide further descriptions of how, where, when, and how often
something is done.
EX:'The music was too loud.'
The other five word classes are: prepositions, pronouns, determiners,
conjunctions, and interjections. These are considered functional words, and
they provide structural and relational information in a sentence or phrase.
Prepositions
Prepositions are used to show the relationship between words in terms of
place, time, direction, and agency.
EX:'They went through the tunnel.'
Pronouns
Pronouns take the place of a noun or a noun phrase in a sentence. They
often refer to a noun that has already been mentioned and are commonly
used to avoid repetition. Chloe (noun) → she (pronoun)
Chloe's dog → her dog (possessive pronoun)
EX:'She sat on the chair which was broken.'
Determiners
Determiners work alongside nouns to clarify information about the
quantity, location, or ownership of the noun. A determiner 'determines'
exactly what is being referred to. Much like pronouns, there are several
different types of determiners.
EX:'The first restaurant is better than the other.'
Conjunctions
Conjunctions are words that connect other words, phrases, and clauses
together within a sentence. There are three main types of conjunctions:
• Coordinating conjunctions - these link independent clauses together.
• Subordinating conjunctions - these link dependent clauses to independent
clauses.
• Correlative conjunctions - words that work in pairs to join two parts of a
sentence of equal importance.
EX:
For, and, nor, but, or, yet, so - coordinating conjunctions
After, as, because, when, while, before, if, even though - subordinating conjunctions
Either/or, neither/nor, both/and - correlative conjunctions
'If it rains, I'm not going out.'
Interjections
Interjections are exclamatory words used to express an emotion or a
reaction. They often stand alone from the rest of the sentence and are
accompanied by an exclamation mark.
EX: 'Oh, what a surprise!'
Part of speech tagging
Given input:
this is a simple sentence
the goal is to identify the part of speech (syntactic category) for each
word:
this/DET is/VERB a/DET simple/ADJ sentence/NOUN
The set of part of speech (POS) categories can differ based on the
application, corpus annotators and language.
One universal tagset used by Google (Petrov et al. 2011):
Tag    Description                                     Example
VERB   Verbs (all tenses and modes)                    eat, ate, eats
NOUN   Nouns (common and proper)                       home, Micah
PRON   Pronouns                                        I, you, your, he
ADJ    Adjectives                                      yellow, bigger, wildest
ADV    Adverbs                                         quickly, faster, fastest
ADP    Adpositions (prepositions and postpositions)    of, in, by, under
CONJ   Conjunctions                                    and, or, but
DET    Determiners                                     a, an, the, this
NUM    Cardinal numbers                                one, two, first, second
PRT    Particles, other function words                 up, down, on, off
X      Other: foreign words, typos, abbreviations      brasserie, abcense, HMM
.      Punctuation                                     ?, !, .
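As a quick, hedged illustration: NLTK's pos_tag function can emit this coarse universal tagset directly. This sketch assumes NLTK and its tagger resources are installed; the exact output may vary by NLTK version.

```python
# A minimal sketch of coarse POS tagging with NLTK's universal tagset.
# The downloads fetch the tokenizer and tagger data on first run.
import nltk

nltk.download("punkt", quiet=True)
nltk.download("averaged_perceptron_tagger", quiet=True)
nltk.download("universal_tagset", quiet=True)

tokens = nltk.word_tokenize("this is a simple sentence")
print(nltk.pos_tag(tokens, tagset="universal"))
# Roughly: [('this', 'DET'), ('is', 'VERB'), ('a', 'DET'),
#           ('simple', 'ADJ'), ('sentence', 'NOUN')]
```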
POS tagging is hard:
• Ambiguity:
glass of water/NOUN vs water/VERB the plants
lie/VERB down vs tell a lie/NOUN
wind/VERB down vs a mighty wind/NOUN
• Sparsity:
– Words we never see.
– Word-tag pairs we never see.
A probabilistic model for tagging
Let x_t denote the word and z_t denote the tag at time step t.
• Initialization: z_0 = <s>
• Repeat:
– Choose a tag based on the previous tag: P(z_t | z_{t-1})
– If z_t = </s>: break
– Choose a word conditioned on its tag: P(x_t | z_t)
We could represent this with a state diagram.
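This generative story can also be written directly as a small sampler. The transition and emission tables below are made-up toy values for illustration, not learned parameters.

```python
import random

# Toy generative tagging model: tables are illustrative, not estimated.
transitions = {            # P(tag_t | tag_{t-1})
    "<s>":  {"DET": 0.8, "NOUN": 0.2},
    "DET":  {"NOUN": 0.6, "ADJ": 0.4},
    "ADJ":  {"NOUN": 1.0},
    "NOUN": {"VERB": 0.5, "</s>": 0.5},
    "VERB": {"DET": 0.5, "</s>": 0.5},
}
emissions = {              # P(word | tag)
    "DET":  {"the": 0.7, "a": 0.3},
    "ADJ":  {"simple": 1.0},
    "NOUN": {"sentence": 0.5, "water": 0.5},
    "VERB": {"is": 1.0},
}

def sample(dist):
    """Draw one key from a {key: probability} dict."""
    r, total = random.random(), 0.0
    for key, p in dist.items():
        total += p
        if r <= total:
            return key
    return key  # fallback for floating-point rounding

tag, words = "<s>", []
while True:
    tag = sample(transitions[tag])        # choose tag given previous tag
    if tag == "</s>":
        break
    words.append(sample(emissions[tag]))  # choose word given its tag
print(" ".join(words))
```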
THE PENN TREEBANK PART-OF-SPEECH TAGSET:
 The English Penn Treebank tagset is used with English corpora annotated by the
TreeTagger tool.
 The table below shows the English Penn Treebank tagset with Sketch Engine modifications.
Number Tag Description
1. CC Coordinating conjunction
2. CD Cardinal number
3. DT Determiner
4. EX Existential there
5. FW Foreign word
6. IN Preposition or subordinating conjunction
7. JJ Adjective
8. JJR Adjective, comparative
9. JJS Adjective, superlative
10. LS List item marker
11. MD Modal
12. NN Noun, singular or mass
13. NNS Noun, plural
14. NNP Proper noun, singular
15. NNPS Proper noun, plural
16. PDT Predeterminer
17. POS Possessive ending
18. PRP Personal pronoun
19. PRP$ Possessive pronoun
20. RB Adverb
21. RBR Adverb, comparative
22. RBS Adverb, superlative
23. RP Particle
24. SYM Symbol
25. TO to
26. UH Interjection
27. VB Verb, base form
28. VBD Verb, past tense
29. VBG Verb, gerund or present participle
30. VBN Verb, past participle
31. VBP Verb, non-3rd person singular present
32. VBZ Verb, 3rd person singular present
33. WDT Wh-determiner
34. WP Wh-pronoun
35. WP$ Possessive wh-pronoun
36. WRB Wh-adverb
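By default (without the universal mapping shown earlier), NLTK's pos_tag emits these Penn Treebank tags. A small sketch, assuming the tagger data is installed:

```python
import nltk

# NLTK's default tagger outputs Penn Treebank tags (DT, JJ, NN, VBD, ...).
tokens = nltk.word_tokenize("The friendly woman wore a beautiful dress")
print(nltk.pos_tag(tokens))
# Roughly: [('The', 'DT'), ('friendly', 'JJ'), ('woman', 'NN'),
#           ('wore', 'VBD'), ('a', 'DT'), ('beautiful', 'JJ'), ('dress', 'NN')]
```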
PART-OF-SPEECH TAGGING:
 POS tagging is the process of converting a sentence into a list of tuples, where each
tuple has the form (word, tag). The tag is a part-of-speech tag and signifies whether the
word is a noun, adjective, verb, and so on.
 Most POS tagging approaches fall under rule-based POS tagging, stochastic POS
tagging, or transformation-based tagging.
• Rule-based taggers use a dictionary or lexicon to get the possible tags for each
word.
• Stochastic taggers disambiguate words based on the probability that a word
occurs with a particular tag (see the sketch below).
• Transformation-based tagging is also called Brill tagging. It is an instance of
transformation-based learning (TBL).
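As one concrete instance of a stochastic tagger, the sketch below trains NLTK's UnigramTagger on the bundled Penn Treebank sample. This is an illustrative example, not part of the original slides; it assumes the treebank sample downloads successfully.

```python
import nltk
from nltk.corpus import treebank
from nltk.tag import UnigramTagger

nltk.download("treebank", quiet=True)

# A unigram tagger is a simple stochastic tagger: it assigns each word the
# tag it most frequently received in the training data.
train_sents = treebank.tagged_sents()[:3000]
tagger = UnigramTagger(train_sents)
print(tagger.tag(["the", "stock", "market", "fell"]))
```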
HMM PART-OF-SPEECH TAGGING:
 Before digging deep into HMM POS tagging, we must understand the
concept of the Hidden Markov Model (HMM).
Hidden Markov Model
An HMM may be defined as a doubly-embedded stochastic model,
where the underlying stochastic process is hidden. This hidden
stochastic process can only be observed through another set of
stochastic processes that produces the sequence of observations.
Example
For example, a sequence of hidden coin-tossing experiments is performed,
and we see only the observation sequence consisting of heads and
tails. The actual details of the process - how many coins were used, the
order in which they are selected - are hidden from us. By observing
this sequence of heads and tails, we can build several HMMs to
explain the sequence. Following is one form of Hidden Markov Model
for this problem:
We assume that there are two states in the HMM, each corresponding to
the selection of a different biased coin. The following matrix gives the
state transition probabilities:
A = [ a11  a12 ]
    [ a21  a22 ]
Here,
• aij = probability of transition from state i to state j.
• a11 + a12 = 1 and a21 + a22 = 1.
• P1 = probability of heads of the first coin, i.e., the bias of the first coin.
• P2 = probability of heads of the second coin, i.e., the bias of the second coin.
• We can also create an HMM model assuming that there are 3 coins or more.
This way, we can characterize an HMM by the following elements:
• N, the number of states in the model (in the above example N = 2, only two states).
• M, the number of distinct observations that can appear in each state (in the above example M = 2, i.e., H or T).
• A, the state transition probability distribution (the matrix A in the above example).
• P, the probability distribution of the observable symbols in each state (in our example, P1 and P2).
• I, the initial state distribution.
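To make these elements concrete, here is the two-coin model written out with hypothetical bias values; the numbers below are illustrative only, not taken from the slides.

```python
import numpy as np

# Two-coin HMM with the elements N, M, A, P, I listed above.
# All probability values are invented for illustration.
N = 2                         # number of states (coin 1, coin 2)
M = 2                         # number of distinct observations (H, T)
A = np.array([[0.7, 0.3],     # a11, a12: transitions out of state 1
              [0.4, 0.6]])    # a21, a22: transitions out of state 2
P = np.array([[0.9, 0.1],     # coin 1: P1 = P(H) = 0.9
              [0.2, 0.8]])    # coin 2: P2 = P(H) = 0.2
I = np.array([0.5, 0.5])      # initial state distribution

# Rows of A and P are probability distributions, so each must sum to 1.
assert np.allclose(A.sum(axis=1), 1.0)
assert np.allclose(P.sum(axis=1), 1.0)
```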
Hidden Markov Model
A hidden Markov model (HMM) defines a probability
distribution over a sequence of states and output observations:
• Output sequence: x_{1:T} = x_1, x_2, ..., x_T. We denote these
as vectors, but they can also be scalars or discrete observations.
• State sequence: z_{0:T+1} = z_0, z_1, ..., z_T, z_{T+1}. Each z_t
takes an integer value in {0, ..., K+1} representing the state at time t.
An HMM is specified by:
• A set of states: {0, 1, ..., K+1}
• Transition probabilities: A with A_{i,j} = P_A(z_t = j | z_{t-1} = i)
• An emission distribution for each state: p(x_t | z_t). We denote the
emission distribution as continuous, but it can also be discrete.
• Group the parameters together: θ = {A, φ}, where φ collects the
emission parameters.
The start and end states are special:
• We always start in z_0 = 0.
Transitioning out of this start state is captured by A_{0,j}.
• We always end in z_{T+1} = K + 1.
Transitioning into this final state is captured by A_{j,K+1}.
• States 0 and K+1 are non-emitting: they don't have a
corresponding x when we move into or out of them.
The three HMM problems
Problem 1: The marginal probability
Given an observed sequence x_{1:T} and a trained HMM with
parameters θ, what is the probability of the observed sequence, p(x_{1:T})?
Problem 2: The most likely state sequence
Given an observed sequence x_{1:T} and a trained HMM with
parameters θ, what is the most likely state sequence through the HMM?
argmax over z_{0:T+1} of P(z_{0:T+1} | x_{1:T})
Problem 3: Learning
Given training data x_{1:T}, how do we choose the HMM
parameters θ to maximize p(x_{1:T})?
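Problem 2 is classically solved with the Viterbi dynamic program. Below is a minimal sketch for discrete observations, reusing the two-coin parameters from above; it is an illustrative implementation, not the lecture's reference code.

```python
import numpy as np

def viterbi(obs, A, B, pi):
    """Most likely state sequence for a discrete observation sequence.

    A[i, j] = P(z_t = j | z_{t-1} = i)   (transitions)
    B[i, o] = P(x_t = o | z_t = i)       (emissions)
    pi[i]   = P(z_1 = i)                 (initial distribution)
    """
    T, K = len(obs), A.shape[0]
    delta = np.zeros((T, K))             # best path probability ending in each state
    back = np.zeros((T, K), dtype=int)   # backpointers for recovering the path
    delta[0] = pi * B[:, obs[0]]
    for t in range(1, T):
        scores = delta[t - 1][:, None] * A   # scores[i, j]: come from i, land in j
        back[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) * B[:, obs[t]]
    path = [int(delta[-1].argmax())]         # best final state, then trace back
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]

# Two-coin example: observations coded 0 = H, 1 = T.
A  = np.array([[0.7, 0.3], [0.4, 0.6]])
B  = np.array([[0.9, 0.1], [0.2, 0.8]])
pi = np.array([0.5, 0.5])
print(viterbi([0, 0, 1, 1, 1], A, B, pi))    # [0, 0, 1, 1, 1]
```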
MAXIMUM ENTROPY MARKOV MODELS:
 The Maximum Entropy Markov Model (MEMM) models the dependencies
between each state and the full observation sequence explicitly. This
makes it more expressive than an HMM.
 The HMM uses two probability matrices (state transition and emission
probabilities). We need to predict a tag given an observation, but an HMM
predicts the probability of a tag producing a certain observation. This is
due to its generative approach. Instead of the transition and observation
matrices of the HMM, the MEMM has only one transition probability
matrix. This matrix maps each combination of previous state y_{i-1} and
current observation x_i seen in the training data to a distribution over
the current state y_i.
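Concretely, each MEMM transition is a maximum-entropy (softmax) classifier over features of the previous state and the current observation. The feature templates and weights in this sketch are hypothetical toys, chosen only to show the shape of the computation.

```python
import math

def memm_local_prob(prev_tag, word, tags, weights):
    """P(tag | prev_tag, word) as a maximum-entropy (softmax) model.
    `weights` maps feature strings to real values; the features are toy ones.
    """
    def score(tag):
        feats = [f"prev={prev_tag}&tag={tag}",      # transition-like feature
                 f"word={word}&tag={tag}",          # observation feature
                 f"suffix={word[-2:]}&tag={tag}"]   # MEMMs can use rich features
        return sum(weights.get(f, 0.0) for f in feats)
    exp_scores = {t: math.exp(score(t)) for t in tags}
    z = sum(exp_scores.values())                    # softmax normalizer
    return {t: s / z for t, s in exp_scores.items()}

# Hypothetical weights for illustration only.
w = {"prev=DET&tag=NOUN": 1.5, "word=dog&tag=NOUN": 2.0,
     "prev=DET&tag=VERB": -0.5}
print(memm_local_prob("DET", "dog", ["NOUN", "VERB", "ADJ"], w))
```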
TREEBANKS:
 A Treebank is a parsed text corpus that annotates syntactic or semantic
sentence structure.
 Treebanks are often created on top of a corpus that has already been
annotated with part-of-speech tags.
 Types of Treebank Corpus
• Semantic Treebanks:
These Treebanks use a formal representation of a sentence's semantic
structure.
• Syntactic Treebanks:
Opposite to the semantic Treebanks, syntactic Treebanks annotate the
grammatical structure of sentences, typically as phrase-structure or
dependency parse trees. The Penn Treebank is the best-known example.
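For a hands-on look, NLTK ships a small fragment of the Penn Treebank. A sketch of reading one parsed sentence, assuming the corpus sample downloads successfully:

```python
import nltk
from nltk.corpus import treebank

nltk.download("treebank", quiet=True)

# Each treebank sentence is stored as a parse tree over POS-tagged words.
tree = treebank.parsed_sents()[0]
print(tree.leaves()[:6])   # the first few words of the sentence
tree.pretty_print()        # render the syntactic structure as ASCII art
```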
GRAMMAR RULES FOR ENGLISH:
1. Adjectives and adverbs
2. Pay attention to homophones
3. Use the correct conjugation of the verb
4. Connect your ideas with conjunctions
5. Sentence construction
6. Remember the word order for questions
7. Use the right past form of verbs
8. Get familiar with the main English verb tenses
9. Never use a double negative
GRAMMAR EQUIVALENCE AND NORMAL FORM:
There are many ways to transform grammars so that they are more useful
for a particular purpose. The basic idea:
1. Apply transformation 1 to G to get rid of undesirable property 1.
Show that the language generated by G is unchanged.
2. Apply transformation 2 to G to get rid of undesirable property 2.
Show that the language generated by G is unchanged AND that undesirable
property 1 has not been reintroduced.
3. Continue until the grammar is in the desired form.
Normal Forms:
• If you want to design algorithms, it is often useful to have a limited
number of input forms that you have to deal with. Normal forms are
designed to do just that. Various ones have been developed for various
purposes.
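One widely used normal form is Chomsky Normal Form (CNF), where every rule is either A -> B C or A -> a. The sketch below contrasts a toy grammar with a hand-converted CNF equivalent, using NLTK's CFG class; the grammar itself is invented for illustration.

```python
import nltk

# A toy grammar with a ternary rule on the right-hand side of S.
original = nltk.CFG.fromstring("""
  S  -> NP VP PP
  NP -> 'kids'
  VP -> 'play'
  PP -> 'outside'
""")

# Equivalent grammar: the ternary rule is split with a new nonterminal VP_PP,
# leaving only A -> B C and A -> 'a' rules. The generated language is unchanged.
cnf = nltk.CFG.fromstring("""
  S     -> NP VP_PP
  VP_PP -> VP PP
  NP    -> 'kids'
  VP    -> 'play'
  PP    -> 'outside'
""")

print(original.is_chomsky_normal_form())  # False: S -> NP VP PP has three symbols
print(cnf.is_chomsky_normal_form())       # True
```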
LEXICALIZED GRAMMAR:
 A lexical grammar is a formal grammar defining the syntax of tokens.
 The program is written using characters that are defined by the lexical
structure of the language used.
 The character set is equivalent to the alphabet used by any written
language.
 We say that a grammar is lexicalized if it consists of:
1. A finite set of structures each associated with a lexical item.
2. An operation or operations for combining the structures.
 Each lexical item is called the anchor of the corresponding structure over
which it specifies linguistic constraints.
 Hence, the constraints are local to the anchored structure.
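As a schematic illustration (in the spirit of Tree-Adjoining Grammar, where each elementary structure is anchored by a word), each lexical item can be paired with its own structure and a combination operation. The structures and the substitution operation below are simplified toys, not a real TAG implementation.

```python
# Toy lexicalized grammar: each structure is anchored by one lexical item,
# so the constraints (here, the open argument slots) are local to that structure.
lexicon = {
    "sleeps":  ("S", [("NP", None), ("VP", "sleeps")]),               # intransitive anchor
    "devours": ("S", [("NP", None), ("VP", "devours"), ("NP", None)]),  # transitive anchor
    "cat":     ("NP", "cat"),                                          # noun anchor
    "fish":    ("NP", "fish"),
}

def substitute(structure, arguments):
    """Combine structures by filling each open NP slot with an argument tree."""
    label, children = structure
    args = list(arguments)
    filled = [args.pop(0) if child == ("NP", None) else child
              for child in children]
    return (label, filled)

tree = substitute(lexicon["devours"], [lexicon["cat"], lexicon["fish"]])
print(tree)   # ('S', [('NP', 'cat'), ('VP', 'devours'), ('NP', 'fish')])
```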