3. What is Natural Language?
Language is meant for communicating about
the world.
By studying language, we can come to understand
more about the world.
Natural language refers to the languages spoken by people,
e.g. English, Japanese, Swahili, etc.
4. What Is Natural Language?
One of the aims of Artificial Intelligence (AI) is to
build machines that can "understand" commands in
natural language, written or spoken.
A computer that can do this requires very powerful
hardware and sophisticated software.
At the present time, this is at the early stages of
development.
5. Introduction to NLP
It is not an easy task to teach a person or computer a
natural language.
The main problems are syntax (the rules governing the way
in which words are arranged), and understanding context to
determine the meaning of a word.
To interpret even simple phrases requires a vast amount of
knowledge.
The basic goal of Natural Language Processing is to enable a
person to communicate with a computer in a language that
they use in their everyday life.
6. Natural Language And Computer Language
Natural languages are those that we use for communicating
with each other, e.g. Arabic, English, French, Japanese, etc.
Natural languages are expressive and easy for us to use.
Computer languages are those that we use for controlling
the operations of a computer, e.g. Prolog, C, C++, C#, Java,
Python, etc.
Computer languages are easy for a computer to understand,
but they are not expressive.
7. What is Natural Language Processing?
“Natural language processing (NLP) is a field of computer
science, artificial intelligence, and linguistics concerned
with the interactions between computers and human (natural)
languages.
Specifically, the process of a computer extracting
meaningful information from natural language input
and/or producing natural language output.”
9. Computers Lack Knowledge
•Computers “see” text in English the same way you have
seen the previous text!
•People have no trouble understanding language
•Common sense knowledge
•Reasoning capacity
•Experience
•Computers have
•No common sense knowledge
•No reasoning capacity
10. Why Natural Language Processing?
Huge amounts of data
•Internet = at least 20 billion pages
•Intranet
Applications for processing large amounts of text
require NLP expertise
•Classify text into categories
•Index and search large texts
•Automatic translation
•Speech understanding
•Understand phone conversations
•Information extraction
•Extract useful information from resumes
•Automatic summarization
•Condense 1 book into 1 page
•Question answering
•Knowledge acquisition
•Text generation / dialogues
12. How can a machine understand these differences?
Decorate the cake with the frosting.
Decorate the cake with the kids.
Throw out the cake with the frosting.
Throw out the cake with the kids.
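One way to picture the difference is to ask what the phrase "with the ..." attaches to: the verb (how or with whom the action is done) or the noun "cake" (what the cake has on it). The sketch below is only an illustration under that assumption. The tables `NOUN_TYPE` and `ATTACHMENT` are hypothetical, hand-written for these four sentences, and stand in for the world knowledge a real system would need.

```python
# Hypothetical world knowledge: a coarse type for each noun.
NOUN_TYPE = {"frosting": "topping", "kids": "people"}

# (verb, noun type) -> (what "with ..." attaches to, role it plays).
# These entries are hand-written for the four example sentences only.
ATTACHMENT = {
    ("decorate", "topping"): ("verb", "material used for decorating"),
    ("decorate", "people"): ("verb", "companions who help"),
    ("throw out", "topping"): ("noun", "part of the cake"),
    ("throw out", "people"): ("verb", "companions who help"),
}


def attach(verb, noun):
    """Look up the preferred reading of '<verb> the cake with the <noun>'."""
    return ATTACHMENT[(verb, NOUN_TYPE[noun])]


for verb in ("decorate", "throw out"):
    for noun in ("frosting", "kids"):
        print(verb, "+", noun, "->", attach(verb, noun))
```

The four sentences have the same surface structure, so the choice cannot come from syntax alone; here it comes entirely from the lookup tables.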
13. How To Tackle These Problems?
The solution is NATURAL LANGUAGE PROCESSING.
14. Goals Of Natural Language Processing?
•Scientific Goal
•Identify the computational machinery
needed for an agent to exhibit various forms
of linguistic behavior
•Engineering Goal
•Design, implement, and test systems that
process natural languages for practical
applications
15. Where does it fit in the CS taxonomy?
Computers
Databases
Robotics
Information Retrieval
Artificial Intelligence
Algorithms
Networking
Search
Natural Language Processing
Machine Translation
Language Analysis
Semantics
Parsing
16. Methods In Natural Language Processing
•Natural Language Understanding (NLU)
The NLU task is to understand and reason
about input given in a natural language
•Natural Language Generation (NLG)
NLG is a subfield of Natural Language
Processing
NLG is also referred to as text generation
17. Linguistics And Language Processing
Linguistics is the science of language. Its study includes
•Sounds (phonology),
•Word formation (morphology),
•Sentence structure (syntax),
•Meaning (semantics) and understanding (pragmatics), etc.
20. Steps in Natural Language Processing
Natural Language Processing is done at 5 Levels
Morphological Analysis
Syntactic Analysis
Semantic Analysis
Discourse integration
Pragmatic Analysis
21. Morphological Analysis
Individual words are analyzed into their components and non-word
tokens such as punctuation are separated from the words.
Morphology is the structure of words.
It is concerned with inflection.
It is also concerned with derivation of new words from existing
ones.
In NLP, words are also known as lexical items, and a set of
words forms a lexicon.
23. Why is it important?
Any NL analysis system needs a lexicon (a module that tells what
words there are and what properties they have).
The simplest model is a full-form dictionary that lists every word
explicitly.
Simply expanding the dictionary fails to take advantage of the
regularities of word formation.
No dictionary contains all the words one is likely to encounter in
real input.
- Languages with highly productive morphology (e.g. Finnish, where a
verb can have many thousands of forms)
- Noun compounding
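The contrast between a full-form dictionary and a lexicon that exploits regularities can be sketched as follows. This is a minimal illustration, not a real morphological analyzer: the stems in `STEMS` and the rules in `SUFFIX_RULES` are hypothetical, and the sketch ignores spelling changes (for example, "filed" is not recognized because the stem would need its final "e" restored).

```python
# Toy stem lexicon: stem -> part of speech (hypothetical entries).
STEMS = {"print": "verb", "walk": "verb", "file": "noun", "book": "noun"}

# Suffix rules: (suffix, part of speech it attaches to, feature it adds).
SUFFIX_RULES = [
    ("ing", "verb", "progressive"),
    ("ed", "verb", "past"),
    ("s", "verb", "3rd-person-singular"),
    ("s", "noun", "plural"),
]


def analyze(word):
    """Return all (stem, part_of_speech, feature) analyses of a word."""
    word = word.lower()
    analyses = []
    if word in STEMS:
        analyses.append((word, STEMS[word], "base"))
    for suffix, pos, feature in SUFFIX_RULES:
        if word.endswith(suffix):
            stem = word[: -len(suffix)]
            # The rule applies only if the remaining stem is known
            # and has the part of speech the suffix attaches to.
            if STEMS.get(stem) == pos:
                analyses.append((stem, pos, feature))
    return analyses


print(analyze("printing"))
print(analyze("books"))
```

Four stems and four rules cover forms such as "prints", "printing", "walked" and "books" without listing any of them explicitly, which is the regularity a full-form dictionary cannot use.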
24. Morphological Analysis: Example
Suppose we have an English interface to an
operating system and the following sentence is
typed:
I want to print Bill’s .init file.
Morphological analysis must do the following
things:
Pull apart the word “Bill’s” into proper noun “Bill” and
the possessive suffix “’s”
Recognize the sequence “.init” as a file extension that is
functioning as an adjective in the sentence.
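The two steps above can be sketched with a small regular-expression tokenizer. This is only an assumption about how such a step might look: the pattern `TOKEN_PATTERN`, the token labels, and the function `tokenize` are invented for this example and do not come from any particular system.

```python
import re

# Alternatives are tried left to right, so the more specific
# patterns (extension, possessive) come before the general ones.
TOKEN_PATTERN = re.compile(
    r"""
    (?P<extension>\.[A-Za-z0-9]+)   # file extension such as .init
  | (?P<possessive>'s\b)            # possessive suffix
  | (?P<word>[A-Za-z0-9]+)          # ordinary word or number
  | (?P<punct>[^\sA-Za-z0-9])       # any other non-space symbol
    """,
    re.VERBOSE,
)


def tokenize(sentence):
    """Split a sentence into (label, text) pairs."""
    # Normalize the typographic apostrophe so that 's is recognized.
    sentence = sentence.replace("\u2019", "'")
    return [(m.lastgroup, m.group()) for m in TOKEN_PATTERN.finditer(sentence)]


for label, text in tokenize("I want to print Bill's .init file."):
    print(label, text)
```

On the example sentence, "Bill" and "'s" come out as separate tokens, ".init" is labeled as an extension, and the final period is separated as punctuation. Deciding that ".init" functions as an adjective is left to later stages.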
26. Syntactic Analysis
•Here the analysis is of words in a sentence to
know the grammatical structure of a sentence.
•The words are transformed into structures that
show how the words relate to each other.
•Some word sequences may be rejected if they
violate the rules of the language for how words
may be combined.
•Example : “Boy the go the to store”
27. Syntactic Analysis : Example
John hit the ball
S - Sentence
NP - Noun Phrase
VP - Verb Phrase
Det - Determiner
N - Noun
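A parse of "John hit the ball" using these categories can be sketched with a tiny hand-written parser. The grammar rules (S -> NP VP, NP -> Det N | N, VP -> V NP), the extra category V for verbs, and the word list in `LEXICON` are assumptions made for this illustration.

```python
# Hypothetical lexicon: word -> category.
LEXICON = {"john": "N", "ball": "N", "boy": "N", "the": "Det", "hit": "V"}


def parse_np(tags, words, i):
    """NP -> Det N | N. Return (tree, next_index) or None."""
    if i < len(tags) and tags[i] == "Det":
        if i + 1 < len(tags) and tags[i + 1] == "N":
            return ("NP", ("Det", words[i]), ("N", words[i + 1])), i + 2
        return None
    if i < len(tags) and tags[i] == "N":
        return ("NP", ("N", words[i])), i + 1
    return None


def parse_vp(tags, words, i):
    """VP -> V NP. Return (tree, next_index) or None."""
    if i < len(tags) and tags[i] == "V":
        result = parse_np(tags, words, i + 1)
        if result:
            np, j = result
            return ("VP", ("V", words[i]), np), j
    return None


def parse_sentence(sentence):
    """S -> NP VP. Return a parse tree, or None if the grammar is violated."""
    words = sentence.split()
    tags = [LEXICON.get(w.lower()) for w in words]
    result = parse_np(tags, words, 0)
    if not result:
        return None
    np, i = result
    result = parse_vp(tags, words, i)
    if not result:
        return None
    vp, j = result
    # Every word must be consumed for the sentence to be accepted.
    return ("S", np, vp) if j == len(words) else None


print(parse_sentence("John hit the ball"))
print(parse_sentence("Boy the go the to store"))
```

The first call returns a nested tree rooted at S, while the scrambled word sequence is rejected (the function returns `None`) because no rule lets a determiner follow the subject noun phrase.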
28. Semantic Analysis
Semantic analysis is concerned with the meaning
of the language.
The first step in any semantic processing system is to
look up the individual words in a dictionary
(or lexicon) and extract their meanings.
29. Semantic Analysis
Unfortunately, many words have several meanings, for example, the
word ‘diamond’ might have the following set of meanings:
(1) a geometrical shape with four equal sides.
(2) a baseball field
(3) an extremely hard and valuable gemstone
30. Semantic Analysis
To select the correct meaning for the word ‘diamond’ in the
sentence “Joan saw Susan’s diamond shimmering from across the
room”, it is necessary to know that neither geometrical shapes nor
baseball fields shimmer, whereas gemstones do (process of elimination).
The process of determining the correct meaning of an individual
word is called word sense disambiguation or lexical disambiguation.
It is done by associating, with each word in the lexicon,
information about the contexts in which each of the word’s senses
may appear.
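The idea of attaching context information to each sense can be sketched as below. This is a simplified illustration: the cue words in `SENSES` are hypothetical, and matching a token by its prefix ("shimmering" starts with "shimmer") is a crude stand-in for proper morphological analysis.

```python
# Hypothetical sense inventory: each sense of a word lists cue words
# (here, things that sense can do) that signal it in context.
SENSES = {
    "diamond": {
        "geometrical shape": {"rotate", "tile"},
        "baseball field": {"flood", "host"},
        "gemstone": {"shimmer", "sparkle"},
    }
}


def disambiguate(word, sentence):
    """Return the senses of `word` that are compatible with the sentence."""
    tokens = [t.strip(".,!?\"'").lower() for t in sentence.split()]
    survivors = []
    for sense, cues in SENSES[word].items():
        # Crude matching: a token counts if it starts with a cue word.
        if any(tok.startswith(cue) for tok in tokens for cue in cues):
            survivors.append(sense)
    return survivors


print(disambiguate(
    "diamond",
    "Joan saw Susan's diamond shimmering from across the room",
))
```

Only the gemstone sense survives, because it is the only one whose context information mentions shimmering.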
31. Semantic Analysis
Other useful semantic markers are
PHYSICAL-OBJECT
ANIMATE-OBJECT
ABSTRACT-OBJECT
Using these markers, the correct meaning of ‘diamond’
in the sentence “I dropped my diamond” can be
computed.
As part of the lexical entry, the verb ‘drop’ will specify
that its object must be a PHYSICAL-OBJECT.
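A minimal sketch of this marker check follows. The marker assigned to each sense is an assumption made for illustration; in particular, the `LOCATION` marker for the baseball field sense is not one of the markers listed above.

```python
# Semantic marker for each sense (assignments are illustrative).
SENSE_MARKERS = {
    "diamond": {
        "geometrical shape": "ABSTRACT-OBJECT",
        "baseball field": "LOCATION",  # assumed marker, not listed in the text
        "gemstone": "PHYSICAL-OBJECT",
    }
}

# Lexical entry for each verb: the marker its object must carry.
VERB_OBJECT_CONSTRAINT = {"drop": "PHYSICAL-OBJECT"}


def senses_allowed_as_object(verb, noun):
    """Return the senses of `noun` that satisfy the verb's object constraint."""
    required = VERB_OBJECT_CONSTRAINT[verb]
    return [
        sense
        for sense, marker in SENSE_MARKERS[noun].items()
        if marker == required
    ]


print(senses_allowed_as_object("drop", "diamond"))
```

For "I dropped my diamond", only the gemstone sense carries the PHYSICAL-OBJECT marker that ‘drop’ requires of its object.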
32. Semantic Analysis
Finally, we have to process the text at sentence level.
There are four approaches to this.
•semantic grammar
•case grammar
•conceptual parsing
•approximately compositional semantic interpretation.
33. Discourse Integration
The meaning of an individual sentence may
depend on the sentences that precede it and
may influence the meaning of the sentences
that follow it.
Example: “You wanted it”
The word “it” cannot be interpreted from this sentence alone. Once the
correct referent for “it” is found in the preceding sentences, we can
determine exactly what is being referred to.
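As a rough illustration of looking back through earlier sentences, the sketch below resolves "it" to the most recently mentioned noun. This recency heuristic is an assumption chosen for simplicity, not a description of how discourse integration is done in practice, and the word list `KNOWN_NOUNS` is hypothetical.

```python
# Hypothetical mini-lexicon of nouns that "it" could refer to.
KNOWN_NOUNS = {"phone", "book", "cake"}


def resolve_it(previous_sentences):
    """Return the most recently mentioned known noun, or None."""
    for sentence in reversed(previous_sentences):
        tokens = [t.strip(".,!?").lower() for t in sentence.split()]
        for token in reversed(tokens):
            if token in KNOWN_NOUNS:
                return token
    return None


context = ["Mary baked a cake.", "I bought a new phone."]
print(resolve_it(context))  # candidate referent for "it" in "You wanted it"
```

With this context the heuristic picks "phone". With no preceding sentences it returns `None`, which mirrors the point that the sentence cannot be fully interpreted on its own.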
34. Pragmatic Analysis
Pragmatic analysis is an additional stage of analysis concerned with the
pragmatic use of the language.
This is important in the understanding of texts and dialogues.
The idea is that what was said is reinterpreted to determine what was
actually meant.
Example: “Do you know what time it is?” should be interpreted as a
request to be told the time.
The final step in pragmatic processing is to translate from the
knowledge-based representation to a command to be executed by the system.
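The reinterpretation of such a question can be sketched with a simple pattern match. The pattern `INDIRECT_REQUEST`, the list of opening phrases, and the labels `REQUEST`, `QUESTION` and `STATEMENT` are assumptions made for this example; a real system would work from a richer representation than the raw string.

```python
import re

# Openings that literally ask about ability or knowledge but are
# normally meant as requests (a hypothetical, incomplete list).
INDIRECT_REQUEST = re.compile(
    r"^(do you know|can you tell me|could you tell me)\s+(.*?)\??$",
    re.IGNORECASE,
)


def interpret(utterance):
    """Map an utterance to a (speech_act, content) pair."""
    text = utterance.strip()
    match = INDIRECT_REQUEST.match(text)
    if match:
        # What was said is a yes/no question; what is meant is a request.
        return ("REQUEST", match.group(2))
    if text.endswith("?"):
        return ("QUESTION", text.rstrip("?"))
    return ("STATEMENT", text)


print(interpret("Do you know what time it is?"))
print(interpret("Is it raining?"))
```

Answering "yes" to the first utterance would respect its literal form but miss its intent, so the sketch labels it a request whose content is "what time it is".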
35. Real World NLP Application
Machine Translation
Information Retrieval / NL interface
Information Visualization
Autonomous interacting bots
Grammar Checking Systems
Speech Recognition Systems / Speech Synthesizers
Document Summary Systems