3.
USER : Men are all alike.
ELIZA : In what way?
USER : They're always bugging us about something or other.
ELIZA : Can you think of a specific example?
USER : Well, my boyfriend made me come here.
ELIZA : Your boyfriend made you come here?
USER : He says I'm depressed much of the time.
ELIZA : I am sorry to hear you are depressed.
USER : It's true, I am unhappy.
ELIZA : Do you think coming here will help you not to be unhappy?
USER : I need some help; that much seems certain.
ELIZA : What would it mean to you if you got some help?
USER : Perhaps I could learn to get along with my mother.
ELIZA : Tell me more about your family.
USER : My mother takes care of me.
ELIZA : Who else in your family takes care of you?
USER : My father.
ELIZA : Your father?
USER : You are like my father in some ways.
ELIZA : What resemblance do you see?
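A dialogue like the one above can be produced by simple pattern matching, without any real understanding. The following is a minimal sketch of that idea, not the original ELIZA script: the rules in `RULES`, the `DEFAULT` reply and the `respond` function are hypothetical and cover only a few of the turns shown.

```python
import re

# Hypothetical rules (pattern, response template); the first matching rule wins.
RULES = [
    (re.compile(r"\bmy (mother|father|family)\b", re.I), "Tell me more about your family."),
    (re.compile(r"\bi am (.+)", re.I), "I am sorry to hear you are {0}."),
    (re.compile(r"\bi need (.+)", re.I), "What would it mean to you if you got {0}?"),
    (re.compile(r"\balike\b", re.I), "In what way?"),
]
DEFAULT = "Please go on."  # assumed fallback when no rule matches


def respond(utterance: str) -> str:
    """Return a canned reply by reusing fragments of the user's own words."""
    text = utterance.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            # Captured groups fill the {0} slot; templates without slots ignore them.
            return template.format(*match.groups())
    return DEFAULT


print(respond("Men are all alike."))
print(respond("It's true, I am unhappy."))
```

The sketch only reflects the user's words back; it keeps no memory of earlier turns.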
4. • A subfield of Artificial Intelligence, studied since the 1960s
• Concerned with the interactions between computers and
human languages, with one ultimate goal: computers that can
"understand" human language
• Many real-world applications
5. • Natural language unit?
  • Natural language understanding
  • Natural language generation
• Data?
  • Speech processing
  • Text processing
Natural language text understanding!
6. • The task of generating natural language from a machine
representation
• May be viewed as the opposite of natural language
understanding
• Applications:
  • Joke generation
  • Textual summaries of databases
  • Enhancing accessibility
7. • An advanced subtopic of NLP that deals with reading
comprehension
• More complex than NLG
• Much commercial interest in this field:
  • News-gathering
  • Data mining
  • Voice activation
  • Large-scale content analysis
8. • Logic is too clear-cut; the loss of flexibility causes
difficulties in NLP
• Examples:
  • Time flies like an arrow
    Can be understood in 7 ways!
  • I never said she stole my money!
    The meaning changes with the stressed word:
    • "I": Someone else said it, but I didn't.
    • "never": I simply didn't ever say it.
    • "said": I might have implied it in some way, but I never explicitly said it.
    • "she": I said someone took it; I didn't say it was she.
    • "stole": I just said she probably borrowed it.
    • "my": I said she stole someone else's money.
    • "money": I said she stole something, but not my money.
15. • Word combination and division
• Stress placement on words
• The properties of subjects
  • We gave the monkeys the bananas because they were hungry
  • We gave the monkeys the bananas because they were over-ripe
• Specifying which word an adjective applies to
  • A pretty little girls' school
16. • Involves reasoning about the world
• Embedded in a social system of people interacting
  • persuading, insulting and amusing them
  • changing over time
• Homonyms
28. • ePi Group:
  • Automatic Vietnamese processing system
  • www.baomoi.com
    • Collecting news from all Vietnamese e-newspapers
• EVTrans - Softex Co. Ltd.
• Cyclop
• VnKim
33. • Morphological analysis
  • Individual words are analyzed into their components
• Syntactic analysis
  • Linear sequences of words are transformed into structures
    that show how the words relate to each other
• Semantic analysis
  • A transformation is made from the input text to an internal
    representation that reflects the meaning
• Pragmatic analysis
  • Reinterpreting what was said as what was actually meant
• Discourse analysis
  • Resolving references between sentences
36. • Morphemes: the smallest meaningful units of language
  • Stem: book, cat, car, …
  • Affixes: un-, -s, -es, …
  • Clitics: 've, 'm
• Morphological parsing: parsing a word into stem and affixes
and identifying the parts and their relationships
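Morphological parsing as described above can be sketched with plain affix stripping. This is only an illustration under strong assumptions: the stem list `STEMS`, the affix lists and the function `parse_word` are hypothetical, and real systems typically use finite-state transducers rather than string slicing.

```python
# Hypothetical toy lexicon; a real parser would use a full dictionary.
STEMS = {"book", "cat", "car", "box", "happy"}
PREFIXES = ("un",)
SUFFIXES = ("es", "s")


def parse_word(word):
    """Split a word into (prefix, stem, suffix), or return None if no split fits."""
    word = word.lower()
    for prefix in ("",) + PREFIXES:
        if not word.startswith(prefix):
            continue
        rest = word[len(prefix):]
        for suffix in ("",) + SUFFIXES:
            if suffix and not rest.endswith(suffix):
                continue
            stem = rest[:len(rest) - len(suffix)] if suffix else rest
            if stem in STEMS:
                # Empty affixes are reported as None.
                return (prefix or None, stem, suffix or None)
    return None


print(parse_word("books"))    # (None, 'book', 's')
print(parse_word("boxes"))    # (None, 'box', 'es')
print(parse_word("unhappy"))  # ('un', 'happy', None)
```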
37. • Word classes
  • Parts of speech: noun, verb, adjective, etc.
  • Word class dictates how a word combines with morphemes
    to form new words
• Examples
  • Books = book + s
  • Unladylike = un + lady + like
38. • Vietnamese?
  • Ăn = ăn
  • Uống = uống
  • Xe = xe
• No "Xes" in Vietnamese!
• The problem is text tokenization.
39. • Why parse words?
  • To identify a word's part of speech
  • To identify a word's stem (for information retrieval)
… then?
  • Spell-checking
  • To predict the next words
  • To predict the word's accent
40. • Ambiguity
  • I want her to go to the cinema with me
    To - infinitive?
    To - preposition?
  • Con ngựa đá đá con ngựa đá.
    đá = đá? ("stone" or "to kick")
41. • How to implement?
  • Regular expressions
  • Finite State Transducers (FST)
  • Finite State Acceptors (FSA)
*.exe
ir??man
\b[0-9]+ *(Mb|[Mm]egabytes?)\b
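The three patterns on this slide can be tried out in Python. This is a sketch under two assumptions: the first two patterns are shell-style wildcards (matched here with the standard `fnmatch` module), and the third is a regular expression whose word-boundary markers are written `\b`. The sample strings and the helper `find_sizes` are made up for illustration.

```python
import fnmatch
import re

# Shell-style wildcards: * matches any run of characters, ? exactly one.
print(fnmatch.fnmatchcase("setup.exe", "*.exe"))    # True
print(fnmatch.fnmatchcase("ironman", "ir??man"))    # True

# Regular expression: a number, optional spaces, then "Mb", "megabyte(s)" or "Megabyte(s)".
MEGABYTES = re.compile(r"\b[0-9]+ *(Mb|[Mm]egabytes?)\b")


def find_sizes(text):
    """Return every substring of text that the megabyte pattern matches."""
    return [m.group(0) for m in MEGABYTES.finditer(text)]


print(find_sizes("The disk has 500 Mb free and 2 megabytes used"))
```

The trailing `\b` keeps the pattern from matching inside longer tokens such as "100 Mbps".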
43. • Related terms:
  • Stem, stemming
  • Part of speech
  • N-gram
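Of the terms above, the N-gram is the easiest to show in code: it is a run of N consecutive tokens. A minimal sketch, assuming whitespace tokenization; the function name `ngrams` and the sample sentence are chosen only for illustration.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all runs of n consecutive tokens as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


tokens = "time flies like an arrow".split()
print(ngrams(tokens, 2))

# Counting N-grams over a corpus is the basis of statistical next-word prediction.
counts = Counter(ngrams("a rose is a rose".split(), 2))
print(counts[("a", "rose")])  # 2
```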
46. • Linear sequences of words are transformed into
structures that show how the words relate to
each other.
• Determine the grammatical structure.
• I am a boy = [Subject] [Verb] [Article] [Noun]
48. • Syntax
  • The actual structure of a sentence
• Grammar
  • The rule set used in the analysis
49. • A grammar defines the syntactically legal sentences
  • I ate an apple (syntactically legal)
  • I ate apple (not syntactically legal)
  • I ate a building (syntactically legal, but?)
Being syntactically legal doesn't mean that it's meaningful!
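The three example sentences can be checked against a toy grammar. This is a sketch under stated assumptions: the rules in `GRAMMAR` (S -> PRON VP, VP -> VERB NP, NP -> DET NOUN) and the word list in `LEXICON` are hypothetical, and the recognizer ignores a/an agreement.

```python
# Hypothetical toy grammar and lexicon, just large enough for the examples.
GRAMMAR = {
    "S": [["PRON", "VP"]],
    "VP": [["VERB", "NP"]],
    "NP": [["DET", "NOUN"]],
}
LEXICON = {"i": "PRON", "ate": "VERB", "a": "DET", "an": "DET",
           "apple": "NOUN", "building": "NOUN"}


def expand(symbol, tags, start):
    """Return the positions where symbol can end if it begins at start."""
    if symbol not in GRAMMAR:  # a word class: must match the tag at this position
        return {start + 1} if start < len(tags) and tags[start] == symbol else set()
    ends = set()
    for production in GRAMMAR[symbol]:
        positions = {start}
        for part in production:
            positions = {e for p in positions for e in expand(part, tags, p)}
        ends |= positions
    return ends


def is_legal(sentence):
    """True if the whole sentence can be derived from S."""
    tags = [LEXICON.get(word) for word in sentence.lower().split()]
    return len(tags) in expand("S", tags, 0)


print(is_legal("I ate an apple"))     # True
print(is_legal("I ate apple"))        # False
print(is_legal("I ate a building"))   # True: legal, though not meaningful
```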
53. • What could this mean…
  • Representations of linguistic inputs that capture
    the meanings of those inputs
• For us it means
  • Representations that permit or facilitate
    semantic processing
  • Permit us to reason about their truth
    (relationship to some world)
  • Permit us to answer questions based on their
    content
  • Permit us to perform inference (answer
    questions and determine the truth of things we
    don't actually know)
57. • Pragmatics: concerns how sentences are
used in different situations and how use
affects the interpretation of the sentence
• Discourse: concerns how the
immediately preceding sentences affect
the interpretation of the next sentence
58. • "He", "it", "his" can be inferred from the
previous sentence
• It's discourse
69. • Can we use previously translated text to learn how to
translate new texts?
  • Yes! But it's not so easy
  • Two paradigms: statistical MT and example-based MT (EBMT)
• Requirements:
  • A large aligned parallel corpus of translated sentences
    • {S_source ↔ S_target}
  • A bilingual dictionary for intra-S alignment
  • Generalization patterns (names, numbers, dates…)
70. • Simplest: Translation Memory
  • If S_new = S_source in the corpus, output the aligned S_target
• Compositional EBMT
  • If a fragment of S_new matches a fragment of S_source, output the
    corresponding fragment of the aligned S_target
  • Prefer maximal-length fragments
  • Maximize grammatical compositionality
    • Via a target-language grammar
    • Or via an N-gram statistical language model
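The two ideas above, exact translation-memory lookup and preferring maximal-length fragments, can be sketched as a greedy longest-match loop. The tiny English-Vietnamese `MEMORY` and the function `translate` are hypothetical; the sketch does no fragment alignment or grammatical re-ordering, which a real EBMT system needs.

```python
# Hypothetical translation memory: source fragment -> aligned target fragment.
MEMORY = {
    "good morning": "chào buổi sáng",
    "thank you": "cảm ơn",
}


def translate(sentence, memory):
    """Greedily cover the sentence with the longest fragments found in memory."""
    words = sentence.lower().split()
    output, i = [], 0
    while i < len(words):
        for j in range(len(words), i, -1):  # try the longest fragment first
            fragment = " ".join(words[i:j])
            if fragment in memory:
                output.append(memory[fragment])
                i = j
                break
        else:
            output.append(f"<{words[i]}>")  # mark words with no stored translation
            i += 1
    return " ".join(output)


print(translate("Good morning", MEMORY))        # whole sentence found: plain translation memory
print(translate("Thank you teacher", MEMORY))   # fragment match plus one unknown word
```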
71. • Requires an interlingua - a language-neutral Knowledge
Representation (KR)
  • Philosophical debate: is there an interlingua?
  • FOL is not totally language-neutral (predicates and
    functions are expressed in a language)
  • Other near-interlinguas (Conceptual Dependency)
• Requires a fully disambiguating parser
  • Domain model of legal objects, actions, relations
• Requires an NL generator (KR -> text)
• Applicable only to well-defined technical domains
• Produces high-quality MT in those domains
73. • Each approach has its own strengths
  • Rapidly adaptable: statistical, example-based
  • Good grammar: rule-based (grammar)
  • High precision in a narrow domain: interlingua
75. • Spider - a browser-like program that downloads web pages.
• Crawler - a program that automatically follows all of the
links on each web page.
• Indexer - a program that analyzes web pages downloaded
by the spider and the crawler.
• Database - storage for downloaded and processed pages.
• Results engine - extracts search results from the database.
• Web server - a server that is responsible for interaction
between the user and other search engine components.
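The indexer, database and results engine can be illustrated with an inverted index: a map from each word to the pages that contain it. This is a minimal sketch; the page names and texts in `PAGES` are made up, and the spider and crawler steps are assumed to have already produced them.

```python
from collections import defaultdict

# Hypothetical pages, standing in for what the spider and crawler downloaded.
PAGES = {
    "a.html": "natural language processing",
    "b.html": "language generation",
    "c.html": "speech processing",
}


def build_index(pages):
    """Indexer: map every word to the set of pages that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


def search(index, query):
    """Results engine: return the pages that contain every query word."""
    words = query.lower().split()
    if not words:
        return []
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return sorted(results)


index = build_index(PAGES)
print(search(index, "language"))             # ['a.html', 'b.html']
print(search(index, "language processing"))  # ['a.html']
```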
80. • The idea is to "extract" particular types of information from
arbitrary text or transcribed speech
• Examples:
  • Named entities: people, places, organizations
  • Telephone numbers
  • Dates
• Many uses:
  • Question answering systems, gisting of news or mail…
  • Job ads, financial information, terrorist attacks
81. • Often uses a set of simple templates or frames with slots
to be filled in from the input text. Everything else is ignored.
  • Husni's number is 966-3-860-2624.
  • The inventor of the first plane was Abbas ibnu Fernas
  • The British King died in March of 1932.
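Template filling of this kind can be sketched with regular expressions whose named groups act as the slots. The two templates below are hypothetical and written only to fit the first and third example sentences; real extraction systems use far more robust patterns or trained models.

```python
import re

# Hypothetical templates: each named group is one slot of the frame.
TEMPLATES = [
    ("phone_number",
     re.compile(r"(?P<person>[A-Z][a-z]+)['’]s number is (?P<number>[\d-]*\d)")),
    ("death",
     re.compile(r"(?P<person>The [A-Z][a-z]+ [A-Z][a-z]+) died in "
                r"(?P<month>[A-Z][a-z]+) of (?P<year>\d{4})")),
]


def extract(text):
    """Fill every template that matches; text that matches nothing is ignored."""
    frames = []
    for name, pattern in TEMPLATES:
        for match in pattern.finditer(text):
            frames.append({"template": name, **match.groupdict()})
    return frames


text = "Husni's number is 966-3-860-2624. The British King died in March of 1932."
for frame in extract(text):
    print(frame)
```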
82. • Named Entity recognition (NE)
  • Finds and classifies names, places, etc.
• Co-reference Resolution (CO)
  • Identifies identity relations between entities in texts.
• Template Element construction (TE)
  • Adds descriptive information to NE results (using CO).
• Template Relation construction (TR)
  • Finds relations between TE entities.
• Scenario Template production (ST)
  • Fits TE and TR results into specified event scenarios.
89. • AIML = Artificial Intelligence Markup Language
• Alice
90. • A.L.I.C.E. (Artificial Linguistic Internet Computer
Entity)
  • An award-winning free natural language artificial
    intelligence chat robot.
• Rule-based
• Human-like answers without a complicated "brain"
• Multi-language
92. • NLP course, Husni Al-Muhtaseb
• Lexical descriptions for Vietnamese language processing
• en.wikipedia.org
• www.xulyngonngu.com