SlideShare a Scribd company logo
1 of 63
Natural Language Processing
•
•
•
•
•
•

Machine Translation
Predicate argument structures
Syntactic parses
Lexical Semantics
Probabilistic Parsing
Ambiguities in sentence interpretation

CSE391 – 2005

1

NLP
Machine Translation
• One of the first applications for computers
– bilingual dictionary > word-word translation

• Good translation requires understanding!
– War and Peace, The Sound and The Fury?

• What can we do? Sublanguages.
– technical domains, static vocabulary
– Meteo in Canada, Caterpillar Tractor Manuals,
Botanical descriptions, Military Messages
CSE391 – 2005

2

NLP
Example translation

CSE391 – 2005

3

NLP
Translation Issues:
Korean to English
- Word order
- Dropped arguments
- Lexical ambiguities
- Structure vs morphology

CSE391 – 2005

4

NLP
Common Thread
• Predicate-argument structure
– Basic constituents of the sentence and how
they are related to each other

• Constituents
– John, Mary, the dog, pleasure, the store.

• Relations
– Loves, feeds, go, to, bring

CSE391 – 2005

5

NLP
Abstracting away from surface
structure

CSE391 – 2005

6

NLP
Transfer lexicons

CSE391 – 2005

7

NLP
Machine Translation Lexical Choice- Word
Sense Disambiguation

Iraq lost the battle.
Ilakuka centwey ciessta.
[Iraq ] [battle] [lost].
John lost his computer.
John-i computer-lul ilepelyessta.
[John] [computer] [misplaced].
CSE391 – 2005

8

NLP
Natural Language Processing
• Syntax
– Grammars, parsers, parse trees,
dependency structures

• Semantics
– Subcategorization frames, semantic
classes, ontologies, formal semantics

• Pragmatics
– Pronouns, reference resolution, discourse
models
CSE391 – 2005

9

NLP
Syntactic Categories
• Nouns, pronouns, Proper nouns
• Verbs, intransitive verbs, transitive verbs,
ditransitive verbs (subcategorization
frames)
• Modifiers, Adjectives, Adverbs
• Prepositions
• Conjunctions
CSE391 – 2005

10

NLP
Syntactic Parsing
• The cat sat on the mat.
Det Noun Verb Prep Det Noun

• Time flies like an arrow.
Noun Verb

Prep Det Noun

• Fruit flies like a banana.
Noun Noun
CSE391 – 2005

Verb Det Noun

11

NLP
Parses
The cat sat on the mat

S
NP
Det
the

VP
N
cat

PP

V
sat
Prep
on

CSE391 – 2005

12

NP
Det
the

NLP

N
mat
Parses
Time flies like an arrow.

S
NP
VP
N
time

V
flies

PP
Prep
like

CSE391 – 2005

13

NP
Det
an

N
arrow
NLP
Parses
Time flies like an arrow.

S
NP
N
time

N
flies

VP
V
like

NP
Det
an

CSE391 – 2005

14

N
arrow

NLP
Recursive transition nets for CFGs
NP

S

S1

VP
S2

pp

det

NP

S4

S3
noun

noun

S5

adj

S6

pronoun

• s :- np,vp.
• np:- pronoun; noun; det,adj, noun; np,pp.
CSE391 – 2005

15

NLP
Lexicon
noun(cat).

noun(flies).

noun(mat).

noun(time).

det(the).

noun(arrow).

det(a).

det(an).

verb(sat).

verb(flies).

prep(on).

verb(time).
prep(like).

CSE391 – 2005

16

NLP
Lexicon with Roots
noun(cat,cat).

noun(flies,fly).

noun(mat,mat).

noun(time,time).

det(the,the)

noun(arrow,arrow).

det(a,a).

det(an,an).

verb(sat,sit).

verb(flies,fly).

prep(on,on).

verb(time,time).
prep(like,like).

CSE391 – 2005

17

NLP
Parses
The old can can hold the water.

S
NP

VP
det
the

adj
old

N
can

CSE391 – 2005

aux
can V
hold

18

NP
det
the

N
water

NLP
Structural ambiguities
• That factory can can tuna.
• That factory cans cans of tuna and salmon.
• Have the students in cse91 finish the exam in
212.
• Have the students in cse91 finished the exam in
212?
CSE391 – 2005

19

NLP
Lexicon
The old can can hold the water.
Noun(can,can)

Verb(hold,hold)

Noun(cans,can)

Verb(holds,hold)

Noun(water,water)

Aux(can,can)

Noun(hold,hold)

Adj(old,old)

Noun(holds,hold)
Det(the,the)
CSE391 – 2005

20

NLP
Simple Context Free Grammar in BNF
notation
S
NP

→
→

NP VP
Pronoun | Noun | Det Adj Noun |NP PP

PP

→

Prep NP

V
VP

→
→

Verb | Aux Verb
V | V NP | V NP NP | V NP PP | VP PP

CSE391 – 2005

21

NLP
Top-down parse in progress
[The, old, can, can, hold, the, water]

S → NP VP
NP → NP?
NP → Pronoun?
Pronoun? fail
NP → Noun?
Noun? fail
NP → Det Adj Noun?
Det? the
ADJ?old
Noun? Can
Succeed.
Succeed.

VP?
CSE391 – 2005

22

NLP
Top-down parse in progress
[can, hold, the, water]

VP → VP?
V → Verb?
Verb? fail
V → Aux Verb?
Aux? can
Verb? hold
succeed
succeed

fail

[the, water]

CSE391 – 2005

23

NLP
Top-down parse in progress
[can, hold, the, water]
VP → VP NP
V → Verb?
Verb? fail
V → Aux Verb?
Aux? can
Verb? hold
NP → Pronoun?
Pronoun? fail
NP → Noun?
Noun? fail
NP → Det Adj Noun?
Det? the
ADJ? fail

CSE391 – 2005

24

NLP
Lexicon
Noun(can,can)

Verb(hold,hold)

Noun(cans,can)

Verb(holds,hold)

Noun(water,water)

Aux(can,can)

Noun(hold,hold)

Adj(old,old)

Noun(holds,hold)

Adj( ,

)

Det(the,the)
CSE391 – 2005

25

NLP
Top-down parse in progress
[can, hold, the, water]
VP → V NP?
V → Verb?
Verb? fail
V → Aux Verb?
Aux? can
Verb? hold
NP → Pronoun?
Pronoun? fail
NP → Noun?
Noun? fail
NP → Det Adj Noun?
Det? the
ADJ?
Noun? water
SUCCEED
SUCCEED
CSE391 – 2005

26

NLP
Lexicon
Noun(can,can)

Verb(hold,hold)

Noun(cans,can)

Verb(holds,hold)

Noun(water,water)

Aux(can,can)

Noun(hold,hold)

Adj(old,old)

Noun(holds,hold)

Adj( ,

)

Det(the,the)
Noun(old,old)
CSE391 – 2005

27

NLP
Top-down approach
• Start with goal of sentence
S → NP VP
S → Wh-word Aux NP VP

• Will try to find an NP 4 different ways
before trying a parse where the verb
comes first.
• What does this remind you of?
– search

• What would be better?
CSE391 – 2005

28

NLP
Bottom-up approach
• Start with words in sentence.
• What structures do they correspond to?
• Once a structure is built, kept on a
CHART.

CSE391 – 2005

29

NLP
Bottom-up parse in progress

det adj noun
The old
can
det noun

CSE391 – 2005

aux
can

verb det
hold the

noun.
water.

aux/verb noun/verb noun det noun.

30

NLP
Bottom-up parse in progress

det adj noun
The old
can
det noun

CSE391 – 2005

aux
can

verb det
hold the

noun.
water.

aux/verb noun/verb noun det noun.

31

NLP
Bottom-up parse in progress

det adj noun
The old
can
det noun

CSE391 – 2005

aux
can

verb det
hold the

noun.
water.

aux/verb noun/verb noun det noun.

32

NLP
Top-down vs. Bottom-up
• Helps with POS
ambiguities – only
consider relevant
POS
• Rebuilds the same
structure repeatedly
• Spends a lot of time
on impossible parses

CSE391 – 2005

• Has to consider every
POS
• Builds each structure
once
• Spends a lot of time
on useless structures

What would be better?
33

NLP
Hybrid approach
• Top-down with a chart
• Use look ahead and heuristics to pick
most likely sentence type
• Use probabilities for pos tagging, pp
attachments, etc.
CSE391 – 2005

34

NLP
Features
• C for Case, Subjective/Objective
– She visited her.

• P for Person agreement, (1st, 2nd, 3rd)
– I like him, You like him, He likes him,

• N for Number agreement, Subject/Verb
– He likes him, They like him.

• G for Gender agreement, Subject/Verb
– English, reflexive pronouns He washed himself.
– Romance languages, det/noun

• T for Tense,
– auxiliaries, sentential complements, etc.
– * will finished is bad
CSE391 – 2005

35

NLP
Example Lexicon Entries
Using Features:
Case, Number, Gender, Person
pronoun(subj, sing, fem, third, she, she).
pronoun(obj, sing, fem, third, her, her).
pronoun(obj, Num, Gender, second, you, you).
pronoun(subj, sing, Gender, first, I, I).
noun(Case, plural, Gender, third, flies,fly).
CSE391 – 2005

36

NLP
Language to Logic
How do we get there?
• John went to the book store.
∃ John ∃ store1, go(John, store1)

• John bought a book.
buy(John,book1)

• John gave the book to Mary.
give(John,book1,Mary)

• Mary put the book on the table.
put(Mary,book1,table1)
CSE391 – 2005

37

NLP
Lexical Semantics
Same event - different sentences
John broke the window with a hammer.
John broke the window with the crack.
The hammer broke the window.
The window broke.

CSE391 – 2005

38

NLP
Same event - different syntactic frames
John broke the window with a hammer.
SUBJ VERB

OBJ

MODIFIER

John broke the window with the crack.
SUBJ VERB

OBJ

MODIFIER

The hammer broke the window.
SUBJ VERB

OBJ

The window broke.
SUBJ VERB
CSE391 – 2005

39

NLP
Semantics -predicate arguments
break(AGENT, INSTRUMENT, PATIENT)
AGENT
PATIENT
INSTRUMENT
John broke the window with a hammer.
INSTRUMENT
PATIENT
The hammer broke the window.
PATIENT
The window broke.
Fillmore 68 - The case for case
CSE391 – 2005

40

NLP
AGENT

PATIENT

INSTRUMENT

John broke the window with a hammer.
SUBJ

OBJ

INSTRUMENT

MODIFIER
PATIENT

The hammer broke the window.
SUBJ

OBJ

PATIENT

The window broke.
SUBJ
CSE391 – 2005

41

NLP
Constraint Satisfaction
break (Agent: animate,
Instrument: tool,
Patient: physical-object)
Agent
<=> subj
Instrument <=> subj, with-pp
Patient
<=> obj, subj
ACL81,ACL85,ACL86,MT90,CUP90,AIJ93
CSE391 – 2005

42

NLP
Syntax/semantics interaction
• Parsers will produce syntactically valid
parses for semantically anomalous
sentences
• Lexical semantics can be used to rule
them out

CSE391 – 2005

43

NLP
Constraint Satisfaction
give

(Agent: animate,
Patient: physical-object
Recipient: animate)

Agent
<=> subj
Patient
<=> object
Recipient <=> indirect-object, to-pp

CSE391 – 2005

44

NLP
Subcategorization Frequencies
• The women kept the dogs on the beach.
– Where keep? Keep on beach 95%
• NP XP 81%

– Which dogs? Dogs on beach 5%
• NP 19%

• The women discussed the dogs on the beach.
– Where discuss? Discuss on beach 10%
• NP PP 24%

– Which dogs? Dogs on beach 90%
• NP 76%

CSE391 – 2005

Ford, Bresnan, Kaplan 82, Jurafsky 98, Roland,Jurafsky 99
45
NLP
Reading times
• NP-bias (slower times to bold word)
The waiter confirmed the reservation was
made yesterday.
The defendant accepted the verdict would
be decided soon.
CSE391 – 2005

46

NLP
Reading times
• S-bias (no slower times to bold word)
The waiter insisted the reservation was
made yesterday.
The defendant wished the verdict would be
decided soon.
CSE391 – 2005

Trueswell, Tanenhaus and Kello, 93
47
Trueswell and Kim 98NLP
Probabilistic Context Free
Grammars
• Adding probabilities
• Lexicalizing the probabilities

CSE391 – 2005

48

NLP
Simple Context Free Grammar in BNF
S
NP

→
→

PP
V

→
→

VP

→

CSE391 – 2005

NP VP
Pronoun
| Noun
| Det Adj Noun
|NP PP
Prep NP
Verb
| Aux Verb
V
| V NP
| V NP NP
| V NP PP
| VP PP
49

NLP
Simple Probabilistic CFG
S
NP

→
→

PP
V

→
→

VP

→

CSE391 – 2005

NP VP
Pronoun
| Noun
| Det Adj Noun
|NP PP
Prep NP
Verb
| Aux Verb
V
| V NP
| V NP NP
| V NP PP
| VP PP
50

[0.10]
[0.20]
[0.50]
[0.20]
[1.00]
[0.20]
[0.20]
[0.10]
[0.40]
[0.10]
[0.20]
[0.20]
NLP
Simple Probabilistic Lexicalized CFG
S
NP

→
→

PP
V

→
→

VP

→

CSE391 – 2005

NP VP
Pronoun
| Noun
| Det Adj Noun
|NP PP
Prep NP
Verb
| Aux Verb
V
| V NP
| V NP NP
| V NP PP
| VP PP
51

[0.10]
[0.20]
[0.50]
[0.20]
[1.00]
[0.20]
[0.20]
[0.87] {sleep, cry, laugh}
[0.03]
[0.00]
[0.00]
[0.10]
NLP
Simple Probabilistic Lexicalized CFG
VP

→

V
| V NP
| V NP NP
| V NP PP
| VP PP

[0.30]
[0.60] {break,split,crack..}
[0.00]
[0.00]
[0.10]

VP

→

V
| V NP
| V NP NP
| V NP PP
| VP PP

[0.10]
[0.40]
[0.10]
[0.20]
[0.20]

CSE391 – 2005

52

what about
leave?
leave1, leave2?

NLP
A TreeBanked Sentence
(S (NP-SBJ Analysts)
(VP have
(VP been
VP
(VP expecting
(NP (NP a GM-Jaguar pact)
have VP
(SBAR (WHNP-1 that)
(S (NP-SBJ *T*-1)
NP-SBJ
been VP
(VP would
Analyst
(VP give
expectingNP
s
(NP the U.S. car maker)
SBAR
(NP (NP an eventual (ADJP 30 %) stake)
NP
S
(PP-LOC in (NP the British company))))))))))))
a GM-Jaguar WHNP-1
VP
pact
that NP-SBJ
VP
*T*-1 would
NP
give
PP-LOC
NP
Analysts have been expecting a GM-Jaguar
NP
the US car
pact that would give the U.S. car maker an
NP
an eventual
maker
eventual 30% stake in the British company.
in
the British
30% stake
company
S

CSE391 – 2005

53

NLP
The same sentence, PropBanked
(S Arg0 (NP-SBJ Analysts)
(VP have
(VP been
Arg1
(VP expecting
Arg1 (NP (NP a GM-Jaguar pact)
(SBAR (WHNP-1 that)
(S Arg0 (NP-SBJ *T*-1)
a GM-Jaguar
(VP would
pact
(VP give
Arg2 (NP the U.S. car maker)
Arg1 (NP (NP an eventual (ADJP 30 %) stake)
(PP-LOC in (NP the British company))))))))))))

have been expecting
Arg0

Analyst
s

Arg0

*T*-1

that would give
Arg2

Arg1

an eventual 30% stake in the
British company

the US car
maker

CSE391 – 2005

expect(Analysts, GM-J pact)
give(GM-J pact, US car maker, 30% stake)
54

NLP
Headlines
• Police Begin Campaign To Run Down Jaywalkers
• Iraqi Head Seeks Arms
• Teacher Strikes Idle Kids
• Miners Refuse To Work After Death
• Juvenile Court To Try Shooting Defendant

CSE391 – 2005

55

NLP
Events
• From KRR lecture

CSE391 – 2005

56

NLP
Context Sensitivity
• Programming languages are Context Free
• Natural languages are Context Sensitive?
– Movement
– Features
– respectively
John, Mary and Bill ate peaches, pears and apples,
respectively.
CSE391 – 2005

57

NLP
The Chomsky Grammar Hierarchy
• Regular grammars, aabbbb
S → aS | nil | bS

• Context free grammars, aaabbb
S →

aSb | nil

• Context sensitive grammars, aaabbbccc
xSy → xby

• Turing Machines
CSE391 – 2005

58

NLP
Recursive transition nets for CFGs
NP

S

S1

VP
S2

pp

det

NP

S4

S3
noun

noun

S5

adj

S6

pronoun

• s :- np,vp.
• np:- pronoun; noun; det,adj, noun;
np,pp.
CSE391 – 2005
59

NLP
Most parsers are Turing Machines
• To give a more natural and
comprehensible treatment of movement
• For a more efficient treatment of features
• Not because of respectively – most
parsers can’t handle it.

CSE391 – 2005

60

NLP
Nested Dependencies and
Crossing Dependencies
CF The dog chased the cat that bit the mouse that ran.

CF The mouse the cat the dog chased bit ran.

CS John, Mary and Bill ate peaches, pears and
apples, respectively
CSE391 – 2005

61

NLP
Movement
What did John give to Mary?
*Where did John give to Mary?
John gave cookies to Mary.
John gave <what> to Mary.
CSE391 – 2005

62

NLP
Handling Movement:
Hold registers/Slash Categories
• S

:- Wh, S/NP

• S/NP
• S/NP

:- VP
:- NP VP/NP

• VP/NP :- Verb

CSE391 – 2005

63

NLP

More Related Content

Recently uploaded

Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embeddingZilliz
 

Recently uploaded (20)

Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embedding
 

Featured

2024 State of Marketing Report – by Hubspot
2024 State of Marketing Report – by Hubspot2024 State of Marketing Report – by Hubspot
2024 State of Marketing Report – by HubspotMarius Sescu
 
Everything You Need To Know About ChatGPT
Everything You Need To Know About ChatGPTEverything You Need To Know About ChatGPT
Everything You Need To Know About ChatGPTExpeed Software
 
Product Design Trends in 2024 | Teenage Engineerings
Product Design Trends in 2024 | Teenage EngineeringsProduct Design Trends in 2024 | Teenage Engineerings
Product Design Trends in 2024 | Teenage EngineeringsPixeldarts
 
How Race, Age and Gender Shape Attitudes Towards Mental Health
How Race, Age and Gender Shape Attitudes Towards Mental HealthHow Race, Age and Gender Shape Attitudes Towards Mental Health
How Race, Age and Gender Shape Attitudes Towards Mental HealthThinkNow
 
AI Trends in Creative Operations 2024 by Artwork Flow.pdf
AI Trends in Creative Operations 2024 by Artwork Flow.pdfAI Trends in Creative Operations 2024 by Artwork Flow.pdf
AI Trends in Creative Operations 2024 by Artwork Flow.pdfmarketingartwork
 
PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024Neil Kimberley
 
Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)contently
 
How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024Albert Qian
 
Social Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsSocial Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsKurio // The Social Media Age(ncy)
 
Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024Search Engine Journal
 
5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summarySpeakerHub
 
ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd Clark Boyd
 
Getting into the tech field. what next
Getting into the tech field. what next Getting into the tech field. what next
Getting into the tech field. what next Tessa Mero
 
Google's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentGoogle's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentLily Ray
 
Time Management & Productivity - Best Practices
Time Management & Productivity -  Best PracticesTime Management & Productivity -  Best Practices
Time Management & Productivity - Best PracticesVit Horky
 
The six step guide to practical project management
The six step guide to practical project managementThe six step guide to practical project management
The six step guide to practical project managementMindGenius
 
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...RachelPearson36
 

Featured (20)

2024 State of Marketing Report – by Hubspot
2024 State of Marketing Report – by Hubspot2024 State of Marketing Report – by Hubspot
2024 State of Marketing Report – by Hubspot
 
Everything You Need To Know About ChatGPT
Everything You Need To Know About ChatGPTEverything You Need To Know About ChatGPT
Everything You Need To Know About ChatGPT
 
Product Design Trends in 2024 | Teenage Engineerings
Product Design Trends in 2024 | Teenage EngineeringsProduct Design Trends in 2024 | Teenage Engineerings
Product Design Trends in 2024 | Teenage Engineerings
 
How Race, Age and Gender Shape Attitudes Towards Mental Health
How Race, Age and Gender Shape Attitudes Towards Mental HealthHow Race, Age and Gender Shape Attitudes Towards Mental Health
How Race, Age and Gender Shape Attitudes Towards Mental Health
 
AI Trends in Creative Operations 2024 by Artwork Flow.pdf
AI Trends in Creative Operations 2024 by Artwork Flow.pdfAI Trends in Creative Operations 2024 by Artwork Flow.pdf
AI Trends in Creative Operations 2024 by Artwork Flow.pdf
 
Skeleton Culture Code
Skeleton Culture CodeSkeleton Culture Code
Skeleton Culture Code
 
PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024
 
Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)
 
How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024
 
Social Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsSocial Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie Insights
 
Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024
 
5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary
 
ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd
 
Getting into the tech field. what next
Getting into the tech field. what next Getting into the tech field. what next
Getting into the tech field. what next
 
Google's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentGoogle's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search Intent
 
How to have difficult conversations
How to have difficult conversations How to have difficult conversations
How to have difficult conversations
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
 
Time Management & Productivity - Best Practices
Time Management & Productivity -  Best PracticesTime Management & Productivity -  Best Practices
Time Management & Productivity - Best Practices
 
The six step guide to practical project management
The six step guide to practical project managementThe six step guide to practical project management
The six step guide to practical project management
 
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
 

Nlp.set

  • 1. Natural Language Processing • • • • • • Machine Translation Predicate argument structures Syntactic parses Lexical Semantics Probabilistic Parsing Ambiguities in sentence interpretation CSE391 – 2005 1 NLP
  • 2. Machine Translation • One of the first applications for computers – bilingual dictionary > word-word translation • Good translation requires understanding! – War and Peace, The Sound and The Fury? • What can we do? Sublanguages. – technical domains, static vocabulary – Meteo in Canada, Caterpillar Tractor Manuals, Botanical descriptions, Military Messages CSE391 – 2005 2 NLP
  • 4. Translation Issues: Korean to English - Word order - Dropped arguments - Lexical ambiguities - Structure vs morphology CSE391 – 2005 4 NLP
  • 5. Common Thread • Predicate-argument structure – Basic constituents of the sentence and how they are related to each other • Constituents – John, Mary, the dog, pleasure, the store. • Relations – Loves, feeds, go, to, bring CSE391 – 2005 5 NLP
  • 6. Abstracting away from surface structure CSE391 – 2005 6 NLP
  • 8. Machine Translation Lexical Choice- Word Sense Disambiguation Iraq lost the battle. Ilakuka centwey ciessta. [Iraq ] [battle] [lost]. John lost his computer. John-i computer-lul ilepelyessta. [John] [computer] [misplaced]. CSE391 – 2005 8 NLP
  • 9. Natural Language Processing • Syntax – Grammars, parsers, parse trees, dependency structures • Semantics – Subcategorization frames, semantic classes, ontologies, formal semantics • Pragmatics – Pronouns, reference resolution, discourse models CSE391 – 2005 9 NLP
  • 10. Syntactic Categories • Nouns, pronouns, Proper nouns • Verbs, intransitive verbs, transitive verbs, ditransitive verbs (subcategorization frames) • Modifiers, Adjectives, Adverbs • Prepositions • Conjunctions CSE391 – 2005 10 NLP
  • 11. Syntactic Parsing • The cat sat on the mat. Det Noun Verb Prep Det Noun • Time flies like an arrow. Noun Verb Prep Det Noun • Fruit flies like a banana. Noun Noun CSE391 – 2005 Verb Det Noun 11 NLP
  • 12. Parses The cat sat on the mat S NP Det the VP N cat PP V sat Prep on CSE391 – 2005 12 NP Det the NLP N mat
  • 13. Parses Time flies like an arrow. S NP VP N time V flies PP Prep like CSE391 – 2005 13 NP Det an N arrow NLP
  • 14. Parses Time flies like an arrow. S NP N time N flies VP V like NP Det an CSE391 – 2005 14 N arrow NLP
  • 15. Recursive transition nets for CFGs NP S S1 VP S2 pp det NP S4 S3 noun noun S5 adj S6 pronoun • s :- np,vp. • np:- pronoun; noun; det,adj, noun; np,pp. CSE391 – 2005 15 NLP
  • 18. Parses The old can can hold the water. S NP VP det the adj old N can CSE391 – 2005 aux can V hold 18 NP det the N water NLP
  • 19. Structural ambiguities • That factory can can tuna. • That factory cans cans of tuna and salmon. • Have the students in cse91 finish the exam in 212. • Have the students in cse91 finished the exam in 212? CSE391 – 2005 19 NLP
  • 20. Lexicon The old can can hold the water. Noun(can,can) Verb(hold,hold) Noun(cans,can) Verb(holds,hold) Noun(water,water) Aux(can,can) Noun(hold,hold) Adj(old,old) Noun(holds,hold) Det(the,the) CSE391 – 2005 20 NLP
  • 21. Simple Context Free Grammar in BNF notation S NP → → NP VP Pronoun | Noun | Det Adj Noun |NP PP PP → Prep NP V VP → → Verb | Aux Verb V | V NP | V NP NP | V NP PP | VP PP CSE391 – 2005 21 NLP
  • 22. Top-down parse in progress [The, old, can, can, hold, the, water] S → NP VP NP → NP? NP → Pronoun? Pronoun? fail NP → Noun? Noun? fail NP → Det Adj Noun? Det? the ADJ?old Noun? Can Succeed. Succeed. VP? CSE391 – 2005 22 NLP
  • 23. Top-down parse in progress [can, hold, the, water] VP → VP? V → Verb? Verb? fail V → Aux Verb? Aux? can Verb? hold succeed succeed fail [the, water] CSE391 – 2005 23 NLP
  • 24. Top-down parse in progress [can, hold, the, water] VP → VP NP V → Verb? Verb? fail V → Aux Verb? Aux? can Verb? hold NP → Pronoun? Pronoun? fail NP → Noun? Noun? fail NP → Det Adj Noun? Det? the ADJ? fail CSE391 – 2005 24 NLP
  • 26. Top-down parse in progress [can, hold, the, water] VP → V NP? V → Verb? Verb? fail V → Aux Verb? Aux? can Verb? hold NP → Pronoun? Pronoun? fail NP → Noun? Noun? fail NP → Det Adj Noun? Det? the ADJ? Noun? water SUCCEED SUCCEED CSE391 – 2005 26 NLP
  • 28. Top-down approach • Start with goal of sentence S → NP VP S → Wh-word Aux NP VP • Will try to find an NP 4 different ways before trying a parse where the verb comes first. • What does this remind you of? – search • What would be better? CSE391 – 2005 28 NLP
  • 29. Bottom-up approach • Start with words in sentence. • What structures do they correspond to? • Once a structure is built, kept on a CHART. CSE391 – 2005 29 NLP
  • 30. Bottom-up parse in progress det adj noun The old can det noun CSE391 – 2005 aux can verb det hold the noun. water. aux/verb noun/verb noun det noun. 30 NLP
  • 31. Bottom-up parse in progress det adj noun The old can det noun CSE391 – 2005 aux can verb det hold the noun. water. aux/verb noun/verb noun det noun. 31 NLP
  • 32. Bottom-up parse in progress det adj noun The old can det noun CSE391 – 2005 aux can verb det hold the noun. water. aux/verb noun/verb noun det noun. 32 NLP
  • 33. Top-down vs. Bottom-up • Helps with POS ambiguities – only consider relevant POS • Rebuilds the same structure repeatedly • Spends a lot of time on impossible parses CSE391 – 2005 • Has to consider every POS • Builds each structure once • Spends a lot of time on useless structures What would be better? 33 NLP
  • 34. Hybrid approach • Top-down with a chart • Use look ahead and heuristics to pick most likely sentence type • Use probabilities for pos tagging, pp attachments, etc. CSE391 – 2005 34 NLP
  • 35. Features • C for Case, Subjective/Objective – She visited her. • P for Person agreement, (1st, 2nd, 3rd) – I like him, You like him, He likes him, • N for Number agreement, Subject/Verb – He likes him, They like him. • G for Gender agreement, Subject/Verb – English, reflexive pronouns He washed himself. – Romance languages, det/noun • T for Tense, – auxiliaries, sentential complements, etc. – * will finished is bad CSE391 – 2005 35 NLP
  • 36. Example Lexicon Entries Using Features: Case, Number, Gender, Person pronoun(subj, sing, fem, third, she, she). pronoun(obj, sing, fem, third, her, her). pronoun(obj, Num, Gender, second, you, you). pronoun(subj, sing, Gender, first, I, I). noun(Case, plural, Gender, third, flies,fly). CSE391 – 2005 36 NLP
  • 37. Language to Logic How do we get there? • John went to the book store. ∃ John ∃ store1, go(John, store1) • John bought a book. buy(John,book1) • John gave the book to Mary. give(John,book1,Mary) • Mary put the book on the table. put(Mary,book1,table1) CSE391 – 2005 37 NLP
  • 38. Lexical Semantics Same event - different sentences John broke the window with a hammer. John broke the window with the crack. The hammer broke the window. The window broke. CSE391 – 2005 38 NLP
  • 39. Same event - different syntactic frames John broke the window with a hammer. SUBJ VERB OBJ MODIFIER John broke the window with the crack. SUBJ VERB OBJ MODIFIER The hammer broke the window. SUBJ VERB OBJ The window broke. SUBJ VERB CSE391 – 2005 39 NLP
  • 40. Semantics -predicate arguments break(AGENT, INSTRUMENT, PATIENT) AGENT PATIENT INSTRUMENT John broke the window with a hammer. INSTRUMENT PATIENT The hammer broke the window. PATIENT The window broke. Fillmore 68 - The case for case CSE391 – 2005 40 NLP
  • 41. AGENT PATIENT INSTRUMENT John broke the window with a hammer. SUBJ OBJ INSTRUMENT MODIFIER PATIENT The hammer broke the window. SUBJ OBJ PATIENT The window broke. SUBJ CSE391 – 2005 41 NLP
  • 42. Constraint Satisfaction break (Agent: animate, Instrument: tool, Patient: physical-object) Agent <=> subj Instrument <=> subj, with-pp Patient <=> obj, subj ACL81,ACL85,ACL86,MT90,CUP90,AIJ93 CSE391 – 2005 42 NLP
  • 43. Syntax/semantics interaction • Parsers will produce syntactically valid parses for semantically anomalous sentences • Lexical semantics can be used to rule them out CSE391 – 2005 43 NLP
  • 44. Constraint Satisfaction give (Agent: animate, Patient: physical-object Recipient: animate) Agent <=> subj Patient <=> object Recipient <=> indirect-object, to-pp CSE391 – 2005 44 NLP
  • 45. Subcategorization Frequencies • The women kept the dogs on the beach. – Where keep? Keep on beach 95% • NP XP 81% – Which dogs? Dogs on beach 5% • NP 19% • The women discussed the dogs on the beach. – Where discuss? Discuss on beach 10% • NP PP 24% – Which dogs? Dogs on beach 90% • NP 76% CSE391 – 2005 Ford, Bresnan, Kaplan 82, Jurafsky 98, Roland,Jurafsky 99 45 NLP
  • 46. Reading times • NP-bias (slower times to bold word) The waiter confirmed the reservation was made yesterday. The defendant accepted the verdict would be decided soon. CSE391 – 2005 46 NLP
  • 47. Reading times • S-bias (no slower times to bold word) The waiter insisted the reservation was made yesterday. The defendant wished the verdict would be decided soon. CSE391 – 2005 Trueswell, Tanenhaus and Kello, 93 47 Trueswell and Kim 98NLP
  • 48. Probabilistic Context Free Grammars • Adding probabilities • Lexicalizing the probabilities CSE391 – 2005 48 NLP
  • 49. Simple Context Free Grammar in BNF S NP → → PP V → → VP → CSE391 – 2005 NP VP Pronoun | Noun | Det Adj Noun |NP PP Prep NP Verb | Aux Verb V | V NP | V NP NP | V NP PP | VP PP 49 NLP
  • 50. Simple Probabilistic CFG S NP → → PP V → → VP → CSE391 – 2005 NP VP Pronoun | Noun | Det Adj Noun |NP PP Prep NP Verb | Aux Verb V | V NP | V NP NP | V NP PP | VP PP 50 [0.10] [0.20] [0.50] [0.20] [1.00] [0.20] [0.20] [0.10] [0.40] [0.10] [0.20] [0.20] NLP
  • 51. Simple Probabilistic Lexicalized CFG S NP → → PP V → → VP → CSE391 – 2005 NP VP Pronoun | Noun | Det Adj Noun |NP PP Prep NP Verb | Aux Verb V | V NP | V NP NP | V NP PP | VP PP 51 [0.10] [0.20] [0.50] [0.20] [1.00] [0.20] [0.20] [0.87] {sleep, cry, laugh} [0.03] [0.00] [0.00] [0.10] NLP
  • 52. Simple Probabilistic Lexicalized CFG VP → V | V NP | V NP NP | V NP PP | VP PP [0.30] [0.60] {break,split,crack..} [0.00] [0.00] [0.10] VP → V | V NP | V NP NP | V NP PP | VP PP [0.10] [0.40] [0.10] [0.20] [0.20] CSE391 – 2005 52 what about leave? leave1, leave2? NLP
  • 53. A TreeBanked Sentence (S (NP-SBJ Analysts) (VP have (VP been VP (VP expecting (NP (NP a GM-Jaguar pact) have VP (SBAR (WHNP-1 that) (S (NP-SBJ *T*-1) NP-SBJ been VP (VP would Analyst (VP give expectingNP s (NP the U.S. car maker) SBAR (NP (NP an eventual (ADJP 30 %) stake) NP S (PP-LOC in (NP the British company)))))))))))) a GM-Jaguar WHNP-1 VP pact that NP-SBJ VP *T*-1 would NP give PP-LOC NP Analysts have been expecting a GM-Jaguar NP the US car pact that would give the U.S. car maker an NP an eventual maker eventual 30% stake in the British company. in the British 30% stake company S CSE391 – 2005 53 NLP
  • 54. The same sentence, PropBanked (S Arg0 (NP-SBJ Analysts) (VP have (VP been Arg1 (VP expecting Arg1 (NP (NP a GM-Jaguar pact) (SBAR (WHNP-1 that) (S Arg0 (NP-SBJ *T*-1) a GM-Jaguar (VP would pact (VP give Arg2 (NP the U.S. car maker) Arg1 (NP (NP an eventual (ADJP 30 %) stake) (PP-LOC in (NP the British company)))))))))))) have been expecting Arg0 Analyst s Arg0 *T*-1 that would give Arg2 Arg1 an eventual 30% stake in the British company the US car maker CSE391 – 2005 expect(Analysts, GM-J pact) give(GM-J pact, US car maker, 30% stake) 54 NLP
  • 55. Headlines • Police Begin Campaign To Run Down Jaywalkers • Iraqi Head Seeks Arms • Teacher Strikes Idle Kids • Miners Refuse To Work After Death • Juvenile Court To Try Shooting Defendant CSE391 – 2005 55 NLP
  • 56. Events • From KRR lecture CSE391 – 2005 56 NLP
  • 57. Context Sensitivity • Programming languages are Context Free • Natural languages are Context Sensitive? – Movement – Features – respectively John, Mary and Bill ate peaches, pears and apples, respectively. CSE391 – 2005 57 NLP
  • 58. The Chomsky Grammar Hierarchy • Regular grammars, aabbbb S → aS | nil | bS • Context free grammars, aaabbb S → aSb | nil • Context sensitive grammars, aaabbbccc xSy → xby • Turing Machines CSE391 – 2005 58 NLP
  • 59. Recursive transition nets for CFGs NP S S1 VP S2 pp det NP S4 S3 noun noun S5 adj S6 pronoun • s :- np,vp. • np:- pronoun; noun; det,adj, noun; np,pp. CSE391 – 2005 59 NLP
  • 60. Most parsers are Turing Machines • To give a more natural and comprehensible treatment of movement • For a more efficient treatment of features • Not because of respectively – most parsers can’t handle it. CSE391 – 2005 60 NLP
  • 61. Nested Dependencies and Crossing Dependencies CF The dog chased the cat that bit the mouse that ran. CF The mouse the cat the dog chased bit ran. CS John, Mary and Bill ate peaches, pears and apples, respectively CSE391 – 2005 61 NLP
  • 62. Movement What did John give to Mary? *Where did John give to Mary? John gave cookies to Mary. John gave <what> to Mary. CSE391 – 2005 62 NLP
  • 63. Handling Movement: Hold registers/Slash Categories • S :- Wh, S/NP • S/NP • S/NP :- VP :- NP VP/NP • VP/NP :- Verb CSE391 – 2005 63 NLP

Editor's Notes

  1. Also [you] time flies like an arrows, with “time” as a verb.
  2. &lt;number&gt; anything missing? noun([fly|T],T), verb([fly|T],T). verb([sit|T],T).
  3. &lt;number&gt; anything missing? noun([fly|T],T), verb([fly|T],T). verb([sit|T],T).
  4. &lt;number&gt; Do lexicon on board!
  5. &lt;number&gt; anything missing? noun([fly|T],T), verb([fly|T],T). verb([sit|T],T).
  6. &lt;number&gt; anything missing? noun([fly|T],T), verb([fly|T],T). verb([sit|T],T).
  7. &lt;number&gt; anything missing? noun([fly|T],T), verb([fly|T],T). verb([sit|T],T).
  8. &lt;number&gt; Here is an example from the Wall Street Journal Treebank II. The sentence is in the light blue box on the bottom left corner, the syntactic structure as a parse tree is in the middle, the actual annotation is in the light gray box.
  9. &lt;number&gt; Here the same annotation is still in the gray box, with the argument labels added. The tree represents the dependency structure, which gives rise to the predicates in the light blue box. Notice that the trace links back to the GM-Jaguar Pact. Notice also that it could just as easily have said, “a GM-Jaguar pact that would give an eventual…stake to the US car maker.” where it would be “ ARG0: a GM-Jaguar pact that would give an ARG1: eventual…stake to ARG2: the US car maker.” This works in exactly the same way for Chinese and Korean as it works for English (and presumably will for Arabic as well.)