Natural Language Processing: Parsing
This lecture covers parsing. It gives a brief overview of the lexicon, categorization, grammar rules, syntactic trees, word senses, and various challenges of natural language processing.

Published in: Education, Technology

Transcript

  • 1. Artificial Intelligence. Natural Language Processing: Parsing. Rushdi Shams, Computational Linguistics Lab, Western University. rshams@uwo.ca
  • 2. Natural Language • Natural language means any language that humans speak • We need to process natural language (text, speech, etc.) so that machines can exploit it • Applications: numerous! – Watson (Jeopardy) – MS Word
  • 3. Parsing • The first task for any NLP-based system is to read (that is, to parse) the text • Parsing depends on three components of a language: 1. Lexicon 2. Categorization 3. Grammar Rules
  • 4. Lexicon
      stench | breeze | glitter | nothing | wumpus | pit | pits | gold | east | …
      is | see | smell | shoot | feel | stinks | go | grab | carry | kill | turn | …
      right | left | east | south | back | smelly | …
      here | there | nearby | ahead | right | left | east | south | back | …
      me | you | I | it | S/HE | Y’ALL | …
      John | Mary | Boston | UCB | PAJC | …
      the | a | an | …
      to | in | on | near | …
      and | or | but | …
      0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
      Rushdi Shams, Dept of CSE, KUET, Bangladesh
  • 5. Categorization
      Noun -> stench | breeze | glitter | nothing | wumpus | pit | pits | gold | east | …
      Verb -> is | see | smell | shoot | feel | stinks | go | grab | carry | kill | turn | …
      Adjective -> right | left | east | south | back | smelly | …
      Adverb -> here | there | nearby | ahead | right | left | east | south | back | …
      Pronoun -> me | you | I | it | S/HE | Y’ALL | …
      Name -> John | Mary | Boston | UCB | PAJC | …
      Article -> the | a | an | …
      Preposition -> to | in | on | near | …
      Conjunction -> and | or | but | …
      Digit -> 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
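The categorized lexicon above can be pictured as a lookup table. The following is a minimal Python sketch, not part of the original slides: the names `LEXICON` and `categories` are hypothetical, and only the words actually listed on the slide are included (the elided "…" entries are left out).

```python
# Toy lexicon copied from the slide: each category maps to the words listed for it.
LEXICON = {
    "Noun": {"stench", "breeze", "glitter", "nothing", "wumpus", "pit", "pits", "gold", "east"},
    "Verb": {"is", "see", "smell", "shoot", "feel", "stinks", "go", "grab", "carry", "kill", "turn"},
    "Adjective": {"right", "left", "east", "south", "back", "smelly"},
    "Adverb": {"here", "there", "nearby", "ahead", "right", "left", "east", "south", "back"},
    "Pronoun": {"me", "you", "i", "it"},
    "Name": {"john", "mary", "boston"},
    "Article": {"the", "a", "an"},
    "Preposition": {"to", "in", "on", "near"},
    "Conjunction": {"and", "or", "but"},
}


def categories(word):
    """Return the sorted list of categories that a word can belong to."""
    w = word.lower()
    return sorted(cat for cat, words in LEXICON.items() if w in words)


print(categories("east"))  # a word may belong to several categories
print(categories("the"))
```

Note that a word such as "east" appears under more than one category, which is one reason categorization alone does not settle how a sentence should be parsed.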
  • 6. Grammar Rules • “The large cat” • An NLP system can parse this phrase if its grammar contains a rule such as Noun Phrase -> Determiner + Adjective + Noun • If the system meets a phrase or sentence whose pattern is not covered by its set of grammar rules, it will not be able to parse it.
  • 7. Therefore... • Parsing is the process of using grammar rules to determine whether a sentence is legal, • and to obtain its Syntactic Tree
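To make slides 6 and 7 concrete, here is a minimal sketch of a parser that checks legality and returns a tree. It assumes a hypothetical three-rule grammar and a six-word lexicon chosen to fit the example sentence used in the following slides; the names `RULES`, `parse` and `is_legal` are illustrative, and each phrase has exactly one rule, which real grammars do not.

```python
# Hypothetical mini-lexicon: one category per word, enough for the example sentence.
LEXICON = {
    "the": "Article", "large": "Adjective", "small": "Adjective",
    "cat": "Noun", "rat": "Noun", "eats": "Verb",
}
# Hypothetical grammar: each phrase rewrites to one fixed sequence of symbols.
RULES = {
    "Sentence": ["NounPhrase", "VerbPhrase"],
    "NounPhrase": ["Article", "Adjective", "Noun"],
    "VerbPhrase": ["Verb", "NounPhrase"],
}


def parse(symbol, words, pos=0):
    """Try to parse `symbol` starting at words[pos].

    Returns (tree, next_position) on success, or None on failure.
    """
    if symbol in RULES:
        children = []
        for part in RULES[symbol]:
            result = parse(part, words, pos)
            if result is None:
                return None
            subtree, pos = result
            children.append(subtree)
        return (symbol, children), pos
    # Otherwise the symbol is a word category: look the word up in the lexicon.
    if pos < len(words) and LEXICON.get(words[pos].lower()) == symbol:
        return (symbol, words[pos]), pos + 1
    return None


def is_legal(sentence):
    """A sentence is legal if the whole word sequence parses as a Sentence."""
    words = sentence.split()
    result = parse("Sentence", words)
    return result is not None and result[1] == len(words)


print(is_legal("The large cat eats the small rat"))
print(is_legal("The cat eats the small rat"))  # pattern not covered by the rules
```

The second call fails because no rule describes a noun phrase without an adjective, which is the situation slide 6 warns about.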
  • 8. Syntactic Tree ‘The large cat eats the small rat’ http://www.digitalenema.com/2012_07_01_archive.html
  • 9. Syntactic Tree • Words: The large cat eats the small rat
  • 10. Syntactic Tree • Word categories: Article Adjective Noun Verb Article Adjective Noun • Words: The large cat eats the small rat
  • 11. Syntactic Tree • Phrases: Noun phrase (the small rat) • Word categories: Article Adjective Noun Verb Article Adjective Noun • Words: The large cat eats the small rat
  • 12. Syntactic Tree • Phrases: Noun phrase (The large cat), Noun phrase (the small rat) • Word categories: Article Adjective Noun Verb Article Adjective Noun • Words: The large cat eats the small rat
  • 13. Syntactic Tree • Phrases: Noun phrase (The large cat), Verb phrase (Verb + Noun phrase: eats the small rat) • Word categories: Article Adjective Noun Verb Article Adjective Noun • Words: The large cat eats the small rat
  • 14. Syntactic Tree • Sentence: Noun phrase + Verb phrase • Phrases: Noun phrase (The large cat), Verb phrase (Verb + Noun phrase: eats the small rat) • Word categories: Article Adjective Noun Verb Article Adjective Noun • Words: The large cat eats the small rat
  • 15. Label Bracketing • It is a way of representing the syntactic tree in another form.
  • 16. Do it yourself: label-bracket the tree
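As an illustration of labelled bracketing, the sketch below renders the tree for "The large cat eats the small rat" as a bracketed string. The tree encoding (nested tuples) and the abbreviated labels S, NP, VP, Art, Adj, N and V are assumptions made for this example, not notation taken from the slides.

```python
def label_bracket(tree):
    """Render a (label, children-or-word) tree as a labelled bracket string."""
    label, body = tree
    if isinstance(body, str):  # leaf: a word category over a single word
        return f"[{label} {body}]"
    inner = " ".join(label_bracket(child) for child in body)
    return f"[{label} {inner}]"


# Hand-written tree for the example sentence from the Syntactic Tree slides.
tree = ("S", [
    ("NP", [("Art", "The"), ("Adj", "large"), ("N", "cat")]),
    ("VP", [
        ("V", "eats"),
        ("NP", [("Art", "the"), ("Adj", "small"), ("N", "rat")]),
    ]),
])

print(label_bracket(tree))
# [S [NP [Art The] [Adj large] [N cat]] [VP [V eats] [NP [Art the] [Adj small] [N rat]]]]
```

Each pair of brackets corresponds to one node of the tree, so the string carries exactly the same information as the drawing.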
  • 17. Evaluation of Parsing • The two most frequent and basic measures to evaluate parsing:
  • 18. Precision, Recall, and F1-Score • The notions are much clearer with a contingency table:
  • 19. Evaluation of Parsing
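The formulas on the evaluation slides are not present in this transcript. The sketch below therefore uses the standard definitions of precision, recall and F1, applied to constituents written as (label, start, end) triples; that representation, the function name `prf`, and the example gold and predicted sets are assumptions made for illustration.

```python
def prf(gold, predicted):
    """Precision, recall and F1 over two sets of (label, start, end) constituents."""
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Gold constituents for "The large cat eats the small rat" (word positions 0..7).
gold = {("S", 0, 7), ("NP", 0, 3), ("VP", 3, 7), ("NP", 4, 7)}
# A hypothetical parser output: three correct constituents and two wrong ones.
predicted = {("S", 0, 7), ("NP", 0, 3), ("VP", 3, 7), ("NP", 5, 7), ("AdjP", 4, 6)}

precision, recall, f1 = prf(gold, predicted)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Precision asks how many of the predicted constituents are right, recall asks how many of the gold constituents were found, and F1 is their harmonic mean.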
  • 20. However… http://www.cafepress.com/barrysworld/1486105
  • 21. And…
  • 22. Ambiguity • There are 2 types of ambiguity: 1. Lexical Ambiguity: the sentence contains an idiom, word, or term that has more than one meaning. "Glasses" means both drinking glasses and spectacles
  • 23. Ambiguity 2. Structural Ambiguity: the sentence has more than one syntactic tree • I saw the boy with the telescope • Did you use the telescope to see the boy? Or did you see the boy who had the telescope?
  • 24. Structural Ambiguity
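Structural ambiguity can be made measurable by counting how many syntactic trees a grammar assigns to a sentence. The sketch below does this with a CKY-style chart over a small hypothetical grammar in binary form; the rule set, the lexicon and the name `count_parses` are assumptions chosen so that the prepositional phrase can attach either to the verb phrase or to the noun phrase.

```python
from collections import defaultdict

# Hypothetical binary grammar rules: (parent, left child, right child).
BINARY = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),   # the PP modifies the seeing
    ("NP", "Det", "N"),
    ("NP", "NP", "PP"),   # the PP modifies the boy
    ("PP", "P", "NP"),
]
# Hypothetical lexicon: the categories each word can take.
LEXICAL = {
    "i": ["NP"], "saw": ["V"], "the": ["Det"],
    "boy": ["N"], "telescope": ["N"], "with": ["P"],
}


def count_parses(sentence, root="S"):
    """Count the distinct trees rooted in `root` that cover the whole sentence."""
    words = sentence.lower().split()
    n = len(words)
    # chart[(i, j)][X] = number of trees with root X spanning words[i:j]
    chart = defaultdict(lambda: defaultdict(int))
    for i, word in enumerate(words):
        for category in LEXICAL.get(word, []):
            chart[(i, i + 1)][category] += 1
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):  # split point between the two children
                for parent, left, right in BINARY:
                    chart[(i, j)][parent] += chart[(i, k)][left] * chart[(k, j)][right]
    return chart[(0, n)][root]


print(count_parses("I saw the boy with the telescope"))
print(count_parses("I saw the boy"))
```

A count greater than one means the sentence is structurally ambiguous under this grammar; the two trees here correspond to the two readings given on slide 23.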
  • 25. Ambiguity • Which of the following examples have lexical ambiguity and which carry structural ambiguity? Justify. 1. The painter put on another coat 2. We like flying planes 3. Visiting relatives can be tiresome
  • 26. Ambiguity • He wrote the note yesterday • You mean you carried the information by a bus? • Connecting wires are tiring in electronics lab • Squad helps dog bite victim
  • 27. Word Sense • Most lexical ambiguity arises from differences in word sense. • Word senses vary due to several factors: – Synonymy – Antonymy – Homonymy – Polysemy and – Heteronymy
  • 28. Synonymy • Synonyms are different words (or sometimes phrases) with identical or very similar meanings. • Words that are synonyms are said to be synonymous, and the state of being a synonym is called synonymy
  • 29. Synonymy • student and pupil (noun) • buy and purchase (verb) • sick and ill (adjective) • quickly and speedily (adverb) • on and upon (preposition)
  • 30. Synonymy is a relation between senses rather than words • Note that synonyms are defined with respect to certain senses of words • pupil as the "aperture in the iris of the eye" is not synonymous with student. • Similarly, "he expired" means the same as "he died", yet "my passport has expired" cannot be replaced by "my passport has died".
  • 31. Synonymy is a relation between senses rather than words • Consider the words big and large • Are they synonyms? – How big is the plane? – Are we travelling with a large or small plane? • How about: – Mrs Benjamin became a big sister of him – Mrs Benjamin became a large sister of him
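The point that synonymy holds between senses, not words, can be sketched with a tiny hand-built sense inventory. Everything below is hypothetical: the sense identifiers, the `SENSES` table and the helper names are invented for illustration and are not drawn from WordNet or any other resource.

```python
# Hypothetical sense inventory: each word lists the senses it can express.
SENSES = {
    "pupil": {"learner", "eye_aperture"},
    "student": {"learner"},
    "expire": {"die", "cease_to_be_valid"},
    "die": {"die"},
    "big": {"large_in_size", "older"},
    "large": {"large_in_size"},
}


def shared_senses(word_a, word_b):
    """Senses that both words can express; empty if they are never synonymous."""
    return SENSES.get(word_a, set()) & SENSES.get(word_b, set())


def substitutable(word_a, word_b, intended_sense):
    """True if word_b can replace word_a when word_a is used in intended_sense."""
    return intended_sense in shared_senses(word_a, word_b)


print(substitutable("big", "large", "large_in_size"))  # "How big is the plane?"
print(substitutable("big", "large", "older"))          # "a big sister"
```

Under this toy inventory, big and large are interchangeable when talking about the size of a plane but not in "big sister", which mirrors the contrast on slide 31.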
  • 32. Heteronymy • Heteronyms (also known as heterophones) are words with – identical spellings (or characters) – but different pronunciations and meanings.
  • 33. Antonymy • Antonyms are words with opposite or nearly opposite meanings. • short and tall • dead and alive • increase and decrease
  • 34. Homonymy • A homonym is one of a group of words that – share the same spelling or the same pronunciation but – have distinct meanings • Bank (financial institution) vs Bank (sloping land) • Bat (a club for hitting the ball) vs Bat (mammal) • Homographs share a spelling (Bank/Bank, Bat/Bat) • Homophones share a pronunciation (Right/Write, Piece/Peace)
  • 35. Polysemy • A word whose distinct senses are related to each other – The bank was constructed in 1971 (the building that houses a financial institution) – I draw money from the bank (the financial institution itself)
  • 36. Hypernymy and Hyponymy • Superclass-subclass structure – Car is a hypernym of Honda – Honda is a hyponym of Car
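The superclass-subclass structure can be sketched as a chain of parent links. The fragment below is hand-built and hypothetical: only the Honda and car pair comes from the slide, while the higher levels and the names `HYPERNYM_OF`, `hypernyms` and `is_hyponym_of` are assumptions added for illustration.

```python
# Hypothetical fragment of a hypernym hierarchy, stored as child -> parent.
HYPERNYM_OF = {
    "Honda": "car",
    "car": "motor vehicle",
    "motor vehicle": "vehicle",
}


def hypernyms(term):
    """Return all hypernyms of a term, from the nearest to the most general."""
    chain = []
    while term in HYPERNYM_OF:
        term = HYPERNYM_OF[term]
        chain.append(term)
    return chain


def is_hyponym_of(term, ancestor):
    """True if `ancestor` appears anywhere above `term` in the hierarchy."""
    return ancestor in hypernyms(term)


print(hypernyms("Honda"))
print(is_hyponym_of("Honda", "vehicle"))
```

Because the relation is followed upward step by step, a hyponym of car is automatically a hyponym of everything above car as well.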
  • 37. Zeugma Test • A test to see whether two uses of a word have the same sense – Which flights serve breakfast? – Does Lufthansa serve Philadelphia? • Simply conjoin the two uses: – Does Lufthansa serve breakfast and Philadelphia? • If the conjunction sounds odd, the two uses are different senses.
  • 38. WordNet 3.0 • A hierarchically organized lexical database • On-line thesaurus + aspects of a dictionary • Some other languages available or under development – (Arabic, Finnish, German, Portuguese…)
      Category     Unique Strings
      Noun         117,798
      Verb         11,529
      Adjective    21,479
      Adverb       4,481
  • 39. Senses of “bass” in Wordnet
  • 40. WordNet Hypernym Hierarchy for “bass”
  • 41. WordNet Noun Relations
  • 42. WordNet 3.0 • Where it is: – http://wordnetweb.princeton.edu/perl/webwn • Libraries – Python: WordNet from NLTK • http://www.nltk.org/Home – Java: • JWNL, extJWNL on sourceforge
  • 43. Difficulties with Natural Language: Anaphora • Using pronouns to refer back to entities already introduced in the text – After Mary proposed to John, they found a preacher and got married. For the honeymoon, they went to Hawaii – Mary saw a ring through the window and asked John for it – Mary threw a rock at the window and broke it
  • 44. Difficulties with Natural Language: Indexicality • Indexical sentences refer to the utterance situation (place, time, etc.) – I am over here – Why did you do that?
  • 45. Difficulties with Natural Language: Metonymy • Using one noun phrase to stand for another – I've read Shakespeare – Chrysler announced record profits – The ham sandwich on Table 4 wants another beer
  • 46. Difficulties with Natural Language: Metaphor • "Non-literal" usage of words and phrases, often systematic. – I've tried killing the process but it won't die. Its parent keeps it alive.
  • 47. Summary • The components of a language – Lexicon – Categorization – Grammar rules • Syntactic Tree • Label Bracketing • Evaluation of Parsing • Word sense • Problem of Parsing