Natural Language Processing
Introduction
▪ Natural Language Processing (NLP) is a subfield of Artificial Intelligence
and linguistics devoted to making computers understand the
statements or words written by humans.
▪ A language is a system: a set of symbols and a set of rules.
1. Symbols are combined and used for conveying or
broadcasting information.
2. Rules of grammar are used for handling symbols.
Introduction
▪ The history of NLP generally starts in the 1950s. In 1950, Alan
Turing published an article titled "Computing Machinery and Intelligence", which
proposed what is now called the Turing test as a criterion of
intelligence.
▪ Natural languages are languages that living creatures use for
communication
▪ Artificial Languages are mathematically defined classes of signals
that can be used for communication with machines
▪ A language is a set of sentences that may be used as signals to
convey semantic information
▪ The meaning of a sentence is the semantic information it conveys
Problems faced in NLP
1. Incomplete description
2. The same word can have different meanings.
3. New words, expressions and meanings are generated quite freely.
4. There are many ways of saying the same thing.
STEPS OF NATURAL LANGUAGE PROCESSING
▪ Morphological Analysis: Individual words are analyzed into
their components, and non-word tokens such as punctuation
are separated from the words.
▪ Syntactic Analysis: Linear sequences of words are
transformed into structures that show how the words relate to
each other.
▪ Semantic Analysis: The structures created by the syntactic
analyzer are assigned meanings.
▪ Discourse Integration: The meaning of an individual sentence
may depend on the sentences that precede it and may
influence the meanings of the sentences that follow it.
▪ Pragmatic Analysis: The structure representing what was said
is reinterpreted to determine what was actually meant.
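The morphological step above can be sketched roughly as follows. This is a minimal illustration, not a real analyzer: the function names (tokenize, split_suffix, morphological_analysis) and the three-entry suffix list are hypothetical, and a practical system would consult a lexicon instead of stripping suffixes naively.

```python
import re

# Hypothetical suffix list, for illustration only.
SUFFIXES = ("ing", "ed", "s")


def tokenize(text):
    """Split text into word tokens and single punctuation tokens."""
    return re.findall(r"\w+|[^\w\s]", text)


def split_suffix(word):
    """Naively split a word into (stem, suffix); suffix is '' if none matches."""
    for suffix in SUFFIXES:
        # Require a stem of at least three characters to avoid splitting "is", "bed", ...
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[:-len(suffix)], suffix
    return word, ""


def morphological_analysis(text):
    """Label each token as a word (with stem and suffix) or as punctuation."""
    analysis = []
    for token in tokenize(text):
        if re.fullmatch(r"\w+", token):
            stem, suffix = split_suffix(token)
            analysis.append(("word", stem, suffix))
        else:
            analysis.append(("punct", token, ""))
    return analysis


print(morphological_analysis("Bill printed the file."))
```

Here the word "printed" is split into the stem "print" and the suffix "ed", and the final full stop becomes its own non-word token.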
Syntax analysis
▪ The lexicon of a language is its vocabulary, which includes its words and
expressions. Morphology is the analysis, identification and
description of the structure of words.
▪ It involves dividing a text into paragraphs, sentences and words.
▪ Words are generally accepted as being the smallest units of
syntax. Syntax refers to the rules and principles that govern the
sentence structure of any individual language.
Syntactic Analysis
– S → NP VP
– NP → the NP1
– NP → PRO
– NP → PN
– NP → NP1
– NP1 → ADJS N
– ADJS → ε | ADJ ADJS
– VP → V
– VP → V NP
– N → file | printer
– PN → Bill
– PRO → I
– ADJ → short | long | fast
– V → printed | created | want
A parse tree for a sentence:
S
  NP
    PN
      Bill
  VP
    V
      printed
    NP
      the
      NP1
        ADJS
          ε
        N
          file
▪ Text: Bill printed the file
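The grammar and parse tree above can be reproduced with a small top-down backtracking parser. This is a sketch under assumptions: the dictionary encoding of the rules, the tuple representation of trees, and the function names (parse, parse_sequence, parse_sentence) are choices made for illustration, and the approach only works because this grammar has no left recursion.

```python
# The grammar from the slide; an empty list encodes the ε production.
GRAMMAR = {
    "S":    [["NP", "VP"]],
    "NP":   [["the", "NP1"], ["PRO"], ["PN"], ["NP1"]],
    "NP1":  [["ADJS", "N"]],
    "ADJS": [[], ["ADJ", "ADJS"]],
    "VP":   [["V"], ["V", "NP"]],
    "N":    [["file"], ["printer"]],
    "PN":   [["Bill"]],
    "PRO":  [["I"]],
    "ADJ":  [["short"], ["long"], ["fast"]],
    "V":    [["printed"], ["created"], ["want"]],
}


def parse(symbol, tokens, pos):
    """Yield (tree, next_pos) for every way `symbol` can match tokens[pos:]."""
    if symbol not in GRAMMAR:
        # Terminal symbol: it must equal the next token exactly.
        if pos < len(tokens) and tokens[pos] == symbol:
            yield symbol, pos + 1
        return
    for production in GRAMMAR[symbol]:
        for children, end in parse_sequence(production, tokens, pos):
            yield (symbol, children), end


def parse_sequence(symbols, tokens, pos):
    """Yield (list_of_trees, next_pos) for matching a sequence of symbols."""
    if not symbols:
        yield [], pos
        return
    for tree, mid in parse(symbols[0], tokens, pos):
        for rest, end in parse_sequence(symbols[1:], tokens, mid):
            yield [tree] + rest, end


def parse_sentence(text):
    """Return the first parse tree that covers the whole sentence, or None."""
    tokens = text.split()
    for tree, end in parse("S", tokens, 0):
        if end == len(tokens):
            return tree
    return None


print(parse_sentence("Bill printed the file"))
```

Each tree node is a pair of a non-terminal and its list of children, so the ADJS node for "the file" has an empty child list, matching the ε leaf in the tree above. Terminals are matched case-sensitively, so the input must be written "Bill printed the file".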
Syntax Tree Example
Syntactic Analysis Example
▪ A parse tree :
John ate the apple.
1. S -> NP VP
2. VP -> V NP
3. NP -> NAME
4. NP -> ART N
5. NAME -> John
6. V -> ate
7. ART -> the
8. N -> apple
S
  NP
    NAME
      John
  VP
    V
      ate
    NP
      ART
        the
      N
        apple
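One way to see how the eight numbered rules produce the sentence is to apply them one at a time, starting from S. The sketch below does that; the rule order 1, 3, 5, 2, 6, 4, 7, 8 is one possible leftmost derivation chosen for illustration (it is not given on the slide), and the name derive is hypothetical.

```python
# The numbered rules from the slide as (left-hand side, right-hand side) pairs.
RULES = {
    1: ("S", ["NP", "VP"]),
    2: ("VP", ["V", "NP"]),
    3: ("NP", ["NAME"]),
    4: ("NP", ["ART", "N"]),
    5: ("NAME", ["John"]),
    6: ("V", ["ate"]),
    7: ("ART", ["the"]),
    8: ("N", ["apple"]),
}


def derive(rule_numbers, start="S"):
    """Apply each rule to the leftmost occurrence of its left-hand side.

    Returns the list of sentential forms, beginning with [start].
    Raises ValueError if a rule's left-hand side is not present.
    """
    form = [start]
    steps = [list(form)]
    for number in rule_numbers:
        lhs, rhs = RULES[number]
        index = form.index(lhs)
        form[index:index + 1] = rhs
        steps.append(list(form))
    return steps


for step in derive([1, 3, 5, 2, 6, 4, 7, 8]):
    print(" ".join(step))
```

The printed sequence starts with "S", passes through forms such as "John V NP", and ends with "John ate the apple".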
Semantic Analysis
▪ It must map individual words into appropriate objects in the
knowledge base or database.
▪ It must create the correct structures to correspond to the way the
meanings of the individual words combine with each other.
▪ Thus a mapping is made between the syntactic structures and
objects in the task domain. Structures for which no such mapping
is possible are rejected.
▪ E.g., the sentence “Colorless green ideas…” would be rejected as
semantically anomalous because the combination of “colorless” and “green” makes no
sense.
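The idea of mapping words to knowledge-base objects, and rejecting structures that have no mapping, can be sketched as below. Everything in the knowledge base here is invented for illustration (the entries, the type names and the agent/object slots are assumptions, not the fragment shown on the slides), and the function name interpret is hypothetical.

```python
# Hypothetical knowledge base: every entry is invented for illustration.
KNOWLEDGE_BASE = {
    "Bill": {"type": "Person"},
    "file": {"type": "File"},
    "printer": {"type": "Printer"},
    "printed": {"type": "Printing", "agent": "Person", "object": "File"},
}


def interpret(subject, verb, obj):
    """Map a (subject, verb, object) structure onto knowledge-base entries.

    Returns a small meaning structure, or None when some word has no
    entry or the entries do not combine (the structure is rejected).
    """
    entries = [KNOWLEDGE_BASE.get(word) for word in (subject, verb, obj)]
    if any(entry is None for entry in entries):
        return None
    subject_entry, verb_entry, object_entry = entries
    # The verb entry states which types it accepts as agent and object.
    if subject_entry["type"] != verb_entry.get("agent"):
        return None
    if object_entry["type"] != verb_entry.get("object"):
        return None
    return {"event": verb_entry["type"], "agent": subject, "object": obj}


print(interpret("Bill", "printed", "file"))
print(interpret("Bill", "printed", "printer"))
```

In this sketch "Bill printed file" receives a meaning, while "Bill printed printer" is rejected because a Printer is not an acceptable object of Printing in the assumed knowledge base.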
Knowledge Base Fragment
Partial Meaning for a Sentence
Discourse Integration
▪ The Meaning of an individual sentence may depend on the sentences that
precede it and may influence the meaning of the sentences that follow it.
▪ Example: the word “it” in the sentence “You wanted it” depends on the
prior discourse context.
▪ Specifically we do not know whom the pronoun “I” or the proper noun “Bill”
refers to.
▪ To pin down these references requires an appeal to a model of the current
discourse context, from which we can learn that the current user is
USER068 and that the only person named “Bill” about whom we could be
talking is USER073.
▪ Once the correct referent for Bill is known, we can also determine exactly
which file is being referred to.
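Resolving “I” and “Bill” against a model of the discourse context can be sketched as a simple lookup. The identifiers USER068 and USER073 come from the text above; the dictionary layout and the function name resolve_reference are assumptions made for illustration.

```python
# Discourse context using the identifiers from the text; the layout is assumed.
DISCOURSE_CONTEXT = {
    "current_user": "USER068",
    "known_people": {"Bill": ["USER073"]},
}


def resolve_reference(word, context):
    """Return the referent of a pronoun or proper noun, or None if unresolved."""
    if word == "I":
        # The first-person pronoun refers to whoever is currently speaking.
        return context["current_user"]
    candidates = context["known_people"].get(word, [])
    if len(candidates) == 1:
        return candidates[0]
    # Unknown name, or more than one candidate: the reference stays open.
    return None


print(resolve_reference("I", DISCOURSE_CONTEXT))
print(resolve_reference("Bill", DISCOURSE_CONTEXT))
```

A name with several candidates is deliberately left unresolved here, since the text only covers the case where a single person named “Bill” is available.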
Pragmatic Analysis
▪ The final step toward effective understanding is to decide what to do as a
result.
▪ One possible thing to do is to record what was said as a fact and be done
with it.
▪ For some sentences, whose intended effect is clearly declarative, that is
precisely the correct thing to do.
▪ But for other sentences, including this one, the intended effect is different.
▪ We can discover this intended effect by applying a set of rules that
characterize cooperative dialogues.
▪ The final step in pragmatic processing is to translate from the
knowledge-based representation to a command to be executed by the system.
▪ The result of the understanding process is
Pragmatic Analysis
Summary
▪ We have seen the results of the main processes that combine to form
a natural language system.
▪ In a complete system all of these processes are necessary; together
they form a complete natural language processing system.
▪ But not all programs are written with exactly these components;
sometimes two or more of them are collapsed.
▪ Collapsing the components results in a system that is easier to
build for restricted subsets of English but harder to extend
to wider coverage.
