NISHANTHINI
What is Natural Language?


Language is meant for communicating about
the world.



By studying language, we can come to understand
more about the world.



Natural language refers to the language spoken by people,
e.g. English, Japanese, Swahili.
What Is Natural Language?

One of the aims of Artificial Intelligence (AI) is to
build machines that can "understand" commands in
natural language, written or spoken.
A computer that can do this requires very powerful
hardware and sophisticated software.
At present, this work is at an early stage of
development.
Introduction to NLP
It is not an easy task to teach a person or computer a
natural language.
The main problems are syntax (the rules governing the way
in which words are arranged), and understanding context to
determine the meaning of a word.
To interpret even simple phrases requires a vast amount of
knowledge.
The basic goal of Natural Language Processing is to enable a
person to communicate with a computer in the language that
they use in their everyday life.
Natural Language And Computer Language
Natural languages are those that we use for communicating
with each other, e.g. Arabic, English, French, Japanese, etc.
Natural languages are expressive and easy for us to use.
Computer languages are those that we use for controlling
the operations of a computer, e.g. Prolog, C, C++, C#, Java,
Python, etc.
Computer languages are easy for a computer to understand,
but they are far less expressive.
What is Natural Language Processing?





“Natural language processing (NLP) is a field of computer
science, artificial intelligence (also called machine
learning), and linguistics concerned with the interactions
between computers and human (natural) languages.
Specifically, the process of a computer extracting
meaningful information from natural language input
and/or producing natural language output.”
Why Natural Language Processing?

kJfmmfj mmmvvv nnnffn333
Uj iheale eleee mnster vensi credur

Baboi oi cestnitze
Coovoel2^ ekk; ldsllk lkdf vnnjfj?

Fgmflmllk mlfm kfre xnnn!
Computers Lack Knowledge
•Computers “see” text in English the same way you have
seen the previous text!
•People have no trouble understanding language
•Common sense knowledge
•Reasoning capacity
•Experience
•Computers have
•No common sense knowledge
•No reasoning capacity
Why Natural Language Processing?
Huge amounts of data
•Internet = at least 20 billion pages
•Intranet

Applications for processing large amounts of texts require NLP expertise
•Classify text into categories
•Index and search large texts
•Automatic translation
•Speech understanding
•Understand phone conversations
•Information extraction
•Extract useful information from resumes
•Automatic summarization
•Condense 1 book into 1 page
•Question answering
•Knowledge acquisition
•Text generation / dialogues
Why is Computer Processing of Human Language Difficult?
How can a machine understand these differences?

Decorate the cake with the frosting.
Decorate the cake with the kids.
Throw out the cake with the frosting.
Throw out the cake with the kids.
How To Tackle These Problems?

Solution is NATURAL LANGUAGE PROCESSING
Goals Of Natural Language Processing?
•Scientific Goal
•Identify the computational machinery
needed for an agent to exhibit various forms
of linguistic behavior
•Engineering Goal
•Design, implement, and test systems that
process natural languages for practical
applications
Where does it fit in the CS taxonomy?
[Diagram: taxonomy tree of computer science. Node labels: Computers; Databases; Robotics; Information Retrieval; Artificial Intelligence; Algorithms; Networking; Search; Natural Language Processing; Machine Translation; Language Analysis; Semantics; Parsing]
Methods In Natural Language Processing

•Natural Language Understanding (NLU)
The NLU task is to understand and reason about
input given in a natural language.
•Natural Language Generation (NLG)
NLG is a subfield of Natural Language
Processing.
NLG is also referred to as text generation.
Linguistic And Language Processing

Linguistics is the science of language. Its study includes
•Sounds (phonology),
•Word formation (morphology),
•Sentence structure (syntax),
•Meaning (semantics) and understanding (pragmatics), etc.
Levels Of Linguistic Analysis
Steps in Natural Language Processing
Natural Language Processing is done at 5 Levels

Morphological Analysis
Syntactic Analysis
Semantic Analysis
Discourse integration

Pragmatic Analysis
Morphological Analysis
Individual words are analyzed into their components, and non-word
tokens such as punctuation are separated from the words.
Morphology is the study of the structure of words.
It is concerned with inflection.
It is also concerned with the derivation of new words from existing
ones.
In NLP, words are also known as lexical items, and a set of
words forms a lexicon.
Why is it important?
Any NL analysis system needs a lexicon (a module that tells what
words there are and what properties they have).
The simplest model is a full-form dictionary that lists every word
explicitly.

Simply expanding the dictionary fails to take advantage of the
regularities in word formation.
No dictionary contains all the words one is likely to encounter in
real input.
- Languages with highly productive morphology (e.g. Finnish, where a
verb can have many thousands of forms)
- Noun compounding
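
The difference between listing every word form and exploiting regularities can be sketched in a few lines of Python. This is only a toy illustration, not a real morphological analyzer: the stem list, the suffix table, and the function name `analyze` are assumptions made up for this example.

```python
# Hypothetical toy lexicon: only stems are stored, not every inflected form.
STEMS = {"print": "verb", "walk": "verb", "bake": "verb", "file": "noun"}

# Hypothetical inflectional suffixes and the feature each one signals.
SUFFIXES = [("ing", "progressive"), ("ed", "past"), ("s", "plural-or-3sg")]


def analyze(word):
    """Return (stem, part_of_speech, feature), or None if the word is unknown."""
    word = word.lower()
    if word in STEMS:
        return (word, STEMS[word], "base")
    for suffix, feature in SUFFIXES:
        if word.endswith(suffix):
            stem = word[: -len(suffix)]
            # Try the bare stem, then the stem with a restored final "e"
            # (for example "baking" -> "bak" -> "bake").
            for candidate in (stem, stem + "e"):
                if candidate in STEMS:
                    return (candidate, STEMS[candidate], feature)
    return None
```

With four stems and three suffix rules, the sketch recognizes forms such as "printing", "walked", "baking" and "files" without listing any of them explicitly. A full-form dictionary would need one entry per form.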
Morphological Analysis: Example

Suppose we have an English interface to an
operating system and the following sentence is
typed:
I want to print Bill’s .init file.

Morphological analysis must do the following
things:
Pull apart the word “Bill’s” into proper noun “Bill” and
the possessive suffix “’s”
Recognize the sequence “.init” as a file extension that is
functioning as an adjective in the sentence.
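
The two steps above can be approximated with a small regular-expression tokenizer. This is a minimal sketch under stated assumptions: the pattern, the label names, and the functions `tokenize` and `label` are hypothetical and only cover this kind of sentence. For example, a period followed directly by a letter is always treated as the start of a dot-prefixed name.

```python
import re

# Alternatives are tried left to right, so the more specific patterns come first.
TOKEN_PATTERN = re.compile(
    r"""
    \.[A-Za-z]\w*      # a dot-prefixed name such as .init
  | ['’]s\b            # possessive suffix, straight or curly apostrophe
  | \w+                # ordinary word
  | [^\w\s]            # any other single punctuation mark
    """,
    re.VERBOSE,
)


def tokenize(sentence):
    """Split a sentence into words, possessive suffixes, dot-names and punctuation."""
    return TOKEN_PATTERN.findall(sentence)


def label(token):
    """Attach a coarse, hypothetical category to a token."""
    if token in ("'s", "’s"):
        return "possessive-suffix"
    if token.startswith(".") and len(token) > 1:
        return "file-extension"
    if token[0].isalnum() or token[0] == "_":
        return "word"
    return "punctuation"


tokens = tokenize("I want to print Bill's .init file.")
labeled = [(tok, label(tok)) for tok in tokens]
```

Here `tokens` separates "Bill" from the possessive suffix and keeps ".init" as a single token, which a later stage could treat as a modifier of "file".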
Syntactic Analysis
•Here the words in a sentence are analyzed to
determine the grammatical structure of the sentence.
•The words are transformed into structures that
show how the words relate to each other.
•Some word sequences may be rejected if they
violate the rules of the language for how words
may be combined.
•Example: “Boy the go the to store” is rejected.
Syntactic Analysis : Example
John hit the ball

S - Sentence
NP - Noun Phrase
VP - Verb Phrase
Det - Determiner
N - Noun
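
To illustrate how a grammar can both build a structure for “John hit the ball” and reject “Boy the go the to store”, here is a small recursive-descent recognizer. It is a sketch only: the three grammar rules (S -> NP VP, NP -> PN | Det N, VP -> V NP), the extra categories PN (proper noun), V (verb) and P (preposition), the lexicon, and all function names are assumptions chosen for this example, not a grammar of English.

```python
# Hypothetical lexicon mapping each known word to one category.
LEXICON = {
    "john": "PN", "the": "Det", "ball": "N", "boy": "N",
    "store": "N", "hit": "V", "go": "V", "to": "P",
}


def parse_np(words, i):
    """NP -> PN | Det N. Returns (tree, next_index) or None."""
    if i < len(words) and LEXICON.get(words[i]) == "PN":
        return ("NP", words[i]), i + 1
    if (i + 1 < len(words)
            and LEXICON.get(words[i]) == "Det"
            and LEXICON.get(words[i + 1]) == "N"):
        return ("NP", ("Det", words[i]), ("N", words[i + 1])), i + 2
    return None


def parse_vp(words, i):
    """VP -> V NP. Returns (tree, next_index) or None."""
    if i < len(words) and LEXICON.get(words[i]) == "V":
        result = parse_np(words, i + 1)
        if result is not None:
            np_tree, j = result
            return ("VP", ("V", words[i]), np_tree), j
    return None


def parse_sentence(sentence):
    """S -> NP VP. Returns a tree, or None if the word sequence is rejected."""
    words = sentence.lower().split()
    np_result = parse_np(words, 0)
    if np_result is None:
        return None
    np_tree, i = np_result
    vp_result = parse_vp(words, i)
    if vp_result is None:
        return None
    vp_tree, j = vp_result
    if j != len(words):  # leftover words mean the sentence is not covered
        return None
    return ("S", np_tree, vp_tree)
```

Every word of “Boy the go the to store” is in the lexicon, yet `parse_sentence` returns `None` for it because no rule allows that order, whereas “John hit the ball” yields a nested S / NP / VP tuple.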
Semantic Analysis

Semantic analysis is concerned with the meaning
of the language.
The first step in any semantic processing system is to
look up the individual words in a dictionary
(or lexicon) and extract their meanings.
Semantic Analysis
Unfortunately, many words have several meanings, for example, the
word ‘diamond’ might have the following set of meanings:
(1) a geometrical shape with four equal sides.
(2) a baseball field
(3) an extremely hard and valuable gemstone
Semantic Analysis
To select the correct meaning for the word ‘diamond’ in the
sentence “Joan saw Susan’s diamond shimmering from across the
room”, it is necessary to know that neither geometrical shapes nor baseball
fields shimmer, whereas gemstones do (process of elimination).
The process of determining the correct meaning of an individual
word is called word sense disambiguation or lexical disambiguation.
It is done by associating, with each word in the lexicon,
information about the contexts in which each of the word’s senses
may appear.
Semantic Analysis
Other useful semantic markers are
PHYSICAL-OBJECT
ANIMATE-OBJECT
ABSTRACT-OBJECT
Using these markers, the correct meaning of ‘diamond’
in the sentence “I dropped my diamond” can be
computed.
As part of the lexical entry, the verb ‘drop’ will specify
that its object must be a PHYSICAL-OBJECT.
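
The marker-based selection described above can be sketched as a lookup. This is a toy illustration under assumptions: the assignment of markers to the three senses of ‘diamond’ is a guess made for the example, LOCATION is a hypothetical marker that does not appear on the slide, and the names `SENSES`, `VERB_OBJECT_MARKER` and `disambiguate` are invented here.

```python
# Hypothetical sense inventory: each sense carries one semantic marker.
SENSES = {
    "diamond": [
        ("geometrical shape", "ABSTRACT-OBJECT"),
        ("baseball field", "LOCATION"),  # hypothetical marker, not from the slide
        ("gemstone", "PHYSICAL-OBJECT"),
    ]
}

# Restriction stored in each verb's lexical entry: the marker its object must have.
VERB_OBJECT_MARKER = {"drop": "PHYSICAL-OBJECT"}


def disambiguate(verb, noun):
    """Return the senses of `noun` that satisfy the verb's object restriction."""
    required = VERB_OBJECT_MARKER[verb]
    return [gloss for gloss, marker in SENSES[noun] if marker == required]
```

For “I dropped my diamond”, `disambiguate("drop", "diamond")` keeps only the gemstone sense, because it is the only one marked PHYSICAL-OBJECT in this toy lexicon.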
Semantic Analysis

Finally, we have to process the text at sentence level.
There are four approaches to this.
•semantic grammar
•case grammar
•conceptual parsing
•approximately compositional semantic interpretation.
Discourse Integration
The meaning of an individual sentence may
depend on the sentences that precede it and
may influence the meaning of the sentences
that follow it.

Example: “You wanted it”
The word “it” can only be interpreted from the preceding discourse.
Once the correct referent for “it” is known, we
can determine exactly what is being
referred to.
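
One very naive way to look for the referent of “it” is to pick the most recently mentioned noun in the preceding sentences. The sketch below only illustrates that idea. The noun list, the example context sentences, and the function name `find_referent` are assumptions, and practical systems use far richer information than recency.

```python
import re

# Hypothetical list of nouns the system knows about.
KNOWN_NOUNS = {"cake", "file", "diamond"}


def find_referent(previous_sentences):
    """Return the most recently mentioned known noun, or None if there is none."""
    referent = None
    for sentence in previous_sentences:
        for word in re.findall(r"[a-z]+", sentence.lower()):
            if word in KNOWN_NOUNS:
                referent = word  # later mentions overwrite earlier ones
    return referent


context = ["I saved the file.", "Then I baked a cake."]
referent_of_it = find_referent(context)  # candidate referent for "You wanted it"
```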
Pragmatic Analysis
Pragmatic analysis is an additional stage of analysis concerned with the pragmatic
use of the language.
This is important in the understanding of texts and dialogues.
The idea is that what was said is reinterpreted to determine what was actually
meant.
Example: “Do you know what time it is?”
should be interpreted as a request.
The final step in pragmatic processing is to translate from the knowledge-based
representation to a command to be executed by the system.
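
The reinterpretation of a question as a request can be sketched with a small pattern table. This is a minimal illustration, not a model of pragmatics: the list of question openers, the REQUEST and LITERAL labels, and the function name `interpret` are all assumptions made for the example.

```python
import re

# Hypothetical openers that conventionally signal an indirect request.
INDIRECT_REQUEST = re.compile(
    r"^(do you know|can you tell me|could you tell me)\s+(.*?)\??$",
    re.IGNORECASE,
)


def interpret(utterance):
    """Map an utterance to ("REQUEST", content) or ("LITERAL", utterance)."""
    text = utterance.strip()
    match = INDIRECT_REQUEST.match(text)
    if match:
        # The speaker wants the content, not a yes/no answer.
        return ("REQUEST", match.group(2))
    return ("LITERAL", text)
```

Under these assumptions, `interpret("Do you know what time it is?")` yields a request for "what time it is" instead of a yes/no question, and a system could then translate that request into a command to execute.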
Real-World NLP Applications
Machine Translation
Information Retrieval / NL interface
Information Visualization
Autonomous interacting bots
Grammar Checking Systems

Speech Recognition Systems / Speech Synthesizers
Document Summary Systems
Machine Translation: Deluxe Universal Translator
Information Retrieval: Buzzcity
AltaVista Search Engine
Information Visualization: Cartia’s Themescape
Autonomous interacting bots: Eliza’s grand-daughter -Lisa

http://stuff.simplenet.com/files/doorsam/lisa18.zip
Grammar Checking Systems: MS Word Grammar Checker