Natural Language Processing
4 
Why “natural language”? 
 Natural vs. artificial 
 Not precise, ambiguous, wide range of 
expression 
 Language vs. English 
 English, French, Japanese, Spanish 
 Natural language processing = programs and 
theories for understanding a problem 
or question posed in natural language and 
answering it
5 
Approaches 
 System building 
 Interactive 
 Understanding only 
 Generation only 
 Theoretical 
 Draws on linguistics, psychology, 
philosophy
6 
 Building an NL system is hard 
 Unlikely to be possible without solid 
theoretical underpinnings
7 
Natural language is useful 
 Question-answering systems 
 http://tangra.si.umich.edu/clair/NSIR/NSIR.cgi 
 Mixed initiative systems 
 http://www.cs.columbia.edu/~noemie/match.mpg 
 Information extraction 
 http://nlp.cs.nyu.edu/info-extr/biomedical-snapshot.jpg 
 Systems that write/speak 
 http://www-2.cs.cmu.edu/~awb/synthesizers.html 
 MAGIC 
 Machine translation 
 http://world.altavista.com/babelfish
8 
Topics 
 Syntax 
 Semantics 
 Pragmatics 
 Statistical NLP: combining learning 
and NL processing
9 
Goal of Interpretation 
 Identify sentence meaning 
 Do something with meaning 
 Need some representation of 
action/meaning
10 
Analysis of form: Syntax 
 Which parts were damaged by larger 
machines? 
 Which parts damaged larger machines? 
 Which larger machines damaged parts? 
 Approaches: 
 Statistical part of speech tagging 
 Parsing using a grammar 
 Shallow parsing: identify meaningful 
chunks
11 
Which parts were damaged by larger 
machines? 
S (Q) 
  NP 
    ADJ larger 
    N machines 
  VP 
    V (past) damage 
    NP (Q) 
      Det (Q) which 
      N parts
12 
Which parts were damaged by larger 
machines? (with functional roles) 
S (Q) 
  NP (SUBJ) 
    ADJ larger 
    N machines 
  VP 
    V (past) damage 
    NP (Q) (OBJ) 
      Det (Q) which 
      N parts
13 
Which parts damaged larger machines? (with 
functional roles) 
S (Q) 
  NP (Q) (SUBJ) 
    Det (Q) which 
    N parts 
  VP 
    V (past) damage 
    NP (OBJ) 
      ADJ larger 
      N machines
14 
Parsers 
 Grammar 
 S -> NP VP 
 NP -> DET {ADJ*} N 
 Different types of grammars 
 Context Free vs. Context Sensitive 
 Lexical Functional Grammar vs. Tree Adjoining 
Grammars 
 Different ways of acquiring grammars 
 Hand-encoded vs. machine learned 
 Domain independent (TreeBank, Wall Street 
Journal) 
 Domain dependent (Medical texts)
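The two grammar rules on this slide can be exercised with a very small recursive-descent recognizer. This is only a sketch: the lexicon and the rule VP -> V NP are assumptions added for illustration and do not come from the slides.

```python
# Toy lexicon (an assumption for illustration; not from the slides).
LEXICON = {
    "the": "DET", "which": "DET",
    "larger": "ADJ", "heavy": "ADJ",
    "machines": "N", "parts": "N",
    "damaged": "V", "lifted": "V",
}

def parse_np(tags, i):
    """NP -> DET {ADJ*} N. Return (tree, next_index) or None."""
    if i < len(tags) and tags[i][1] == "DET":
        children = [tags[i]]
        i += 1
        while i < len(tags) and tags[i][1] == "ADJ":
            children.append(tags[i])
            i += 1
        if i < len(tags) and tags[i][1] == "N":
            children.append(tags[i])
            return ("NP", children), i + 1
    return None

def parse_vp(tags, i):
    """VP -> V NP (assumed rule). Return (tree, next_index) or None."""
    if i < len(tags) and tags[i][1] == "V":
        np = parse_np(tags, i + 1)
        if np:
            return ("VP", [tags[i], np[0]]), np[1]
    return None

def parse_sentence(sentence):
    """S -> NP VP. Return a tree, or None if the grammar does not cover the input."""
    words = sentence.lower().rstrip("?.").split()
    tags = [(w, LEXICON.get(w)) for w in words]
    np = parse_np(tags, 0)
    if not np:
        return None
    vp = parse_vp(tags, np[1])
    if vp and vp[1] == len(tags):
        return ("S", [np[0], vp[0]])
    return None

print(parse_sentence("Which parts damaged the larger machines?"))
```

Because the NP rule requires a determiner, "Which parts damaged larger machines?" is rejected by this toy grammar. Gaps like this are one reason hand-encoded grammars grow quickly.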
15 
Semantics: analysis of meaning 
 Word meaning 
 John picked up a bad cold. 
 John picked up a large rock. 
 John picked up Radio Netherlands on his radio. 
 John picked up a hitchhiker on Highway 66. 
 Phrasal meaning 
 Baby bonuses -> allocations 
 Senior citizens -> personnes âgées 
 Causing havoc -> sème le désarroi 
 Approaches 
 Representing meaning 
 Statistical word disambiguation 
 Symbolic rule-based vs. shallow statistical 
semantics
16 
Representing Meaning - WordNet
17
18 
OMEGA 
 http://omega.isi.edu:8007/index 
 http://omega.is.edu/doc/browsers.html
19
Statistical Word Sense Disambiguation 
20 
Context within the sentence determines which sense is 
correct 
 The candidate picked up [sense6] thousands of 
additional votes. 
 He picked up [sense2] the book and started to read. 
 Her performance in school picked up [sense13]. 
 The swimmers got out of the river and climbed the 
bank [sloping land] to retrieve their towels. 
 The investors took their money out of the bank 
[financial institution] and moved it into stocks and 
bonds.
21 
Goal 
 A program which can predict which sense 
is the correct sense given a new sentence 
containing “pick up” or “bank” 
 Avoid manually itemizing all words which 
can occur in sentences with different 
meanings 
 Can we use machine learning?
22 
What do we need? 
 Data 
 Features 
 Machine Learning algorithm 
 Decision tree vs. SVM/Naïve Bayes 
 Inspecting the output 
 Accuracy of these methods
23 
Using Categories from Roget’s 
Thesaurus (e.g., machine vs. animal) 
for training
24 
Training data for “machines”
25
26 
Predicting the correct sense in unseen 
text 
 Use presence of the salient words in 
context 
 50 word window 
 Use Bayes' rule to compute 
probabilities for different categories
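The prediction step can be sketched as a small Naive Bayes classifier over context words. The training sentences below are invented for illustration, add-one smoothing is an assumed choice, and splitting the 50-word window evenly around the target is one possible reading of the slide.

```python
import math
from collections import Counter, defaultdict

def window(tokens, index, width=50):
    """Context tokens around position `index`, excluding the target word.
    Assumption: the window is split evenly, width // 2 tokens per side."""
    half = width // 2
    return tokens[max(0, index - half):index] + tokens[index + 1:index + 1 + half]

def train(examples):
    """examples: iterable of (category, context_words) pairs."""
    cat_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for category, words in examples:
        cat_counts[category] += 1
        word_counts[category].update(words)
        vocab.update(words)
    return cat_counts, word_counts, vocab

def predict(model, context):
    """Return the category maximising log P(cat) + sum of log P(word | cat)."""
    cat_counts, word_counts, vocab = model
    total = sum(cat_counts.values())
    best, best_score = None, -math.inf
    for category in cat_counts:
        score = math.log(cat_counts[category] / total)
        denom = sum(word_counts[category].values()) + len(vocab)
        for w in context:
            if w in vocab:  # words never seen in training carry no evidence
                score += math.log((word_counts[category][w] + 1) / denom)
        if score > best_score:
            best, best_score = category, score
    return best

# Hypothetical training contexts for "crane" (not taken from the slides).
TRAIN = [
    ("machine", "the crane lifted heavy steel beams at the construction site".split()),
    ("machine", "workers used a crane to lift the heavy load".split()),
    ("animal", "the crane waded in the marsh looking for fish".split()),
    ("animal", "a crane is a bird with long legs and a long neck".split()),
]
model = train(TRAIN)
tokens = "treadmills attached to cranes were used to lift heavy objects".split()
print(predict(model, window(tokens, tokens.index("cranes"))))
```

With real data the categories would come from a thesaurus and the contexts from a large corpus. The toy data only shows the mechanics.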
27 
“Crane” 
 Occurred 74 times in Grolier's, 36 
as animal, 38 as machine 
 Predictions in new sentences were 
99% correct 
 Example: lift water and to grind 
grain .PP Treadmills attached to 
cranes were used to lift heavy 
objects from Roman times.
28
29
30 
Going Home – A play in one act 
 Scene 1: Pennsylvania Station, NYC 
Bonnie: Long Beach? 
Passerby: Downstairs, LIRR Station 
 Scene 2: ticket counter: LIRR 
Bonnie: Long Beach? 
Clerk: $4.50 
 Scene 3: Information Booth, LIRR 
Bonnie: Long Beach? 
Clerk: 4:19, Track 17 
 Scene 4: On the train, vicinity of Forest Hills 
Bonnie: Long Beach? 
Conductor: Change at Jamaica 
 Scene 5: On the next train, vicinity of Lynbrook 
Bonnie: Long Beach? 
Conductor: Right after Island Park.
31 
Question Answering on the web 
 Input: English question 
 Data: documents retrieved by a 
search engine from the web 
 Output: The phrase(s) within the 
documents that answer the question
32 
Examples 
 When was X born? 
When was Mozart born? 
Mozart was born in 1756. 
When was Gandhi born? 
 Gandhi (1869-1948) 
 Where are the Rocky Mountains 
located? 
 What is nepotism?
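The two answer forms shown for "When was X born?" suggest simple surface patterns. The sketch below encodes them as regular expressions. The function name and the exact expressions are assumptions for illustration, not the patterns used by any particular system.

```python
import re

def birth_year(name, text):
    """Return the birth year for `name` found in `text`, or None."""
    n = re.escape(name)
    patterns = [
        rf"{n} was born in (\d{{4}})",          # "Mozart was born in 1756."
        rf"{n} \((\d{{4}})\s*[-–]\s*\d{{4}}\)",  # "Gandhi (1869-1948)"
    ]
    for pattern in patterns:
        match = re.search(pattern, text)
        if match:
            return match.group(1)
    return None

print(birth_year("Mozart", "Mozart was born in 1756."))
print(birth_year("Gandhi", "Gandhi (1869-1948)"))
```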
33 
Common Approach 
 Create a query from the question 
 When was Mozart born -> Mozart born 
 Use WordNet to expand terms and increase 
recall: 
 Which high school was ranked highest in the US in 
1998? 
 “high school” -> (high&school)| 
(senior&high&school)|(senior&high)|high| 
highschool 
 Use search engine to find relevant 
documents 
 Pinpoint passage within document that 
has answer using patterns 
 From IR to NLP
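The first two steps, query creation and term expansion, can be sketched as follows. The stopword list and the `SYNONYMS` table are hypothetical stand-ins; a real system would take the variants from WordNet rather than from a hand-written dictionary.

```python
import re

# Assumed stopword list, just large enough for the examples on the slides.
STOPWORDS = {"when", "was", "were", "is", "are", "what", "where",
             "which", "who", "the", "in", "a", "an", "of"}

# Hypothetical stand-in for a WordNet lookup.
SYNONYMS = {
    "high school": ["senior high school", "senior high", "high", "highschool"],
}

def make_query(question):
    """Drop question words and other stopwords, keep content words in order."""
    tokens = re.findall(r"[A-Za-z0-9]+", question)
    return [t for t in tokens if t.lower() not in STOPWORDS]

def expand_term(term):
    """Build a boolean OR of the term and its variants; multiword variants are ANDed."""
    variants = [term] + SYNONYMS.get(term.lower(), [])
    parts = []
    for variant in variants:
        words = variant.split()
        parts.append("(" + "&".join(words) + ")" if len(words) > 1 else variant)
    return "|".join(parts)

print(make_query("When was Mozart born?"))
print(expand_term("high school"))
```

Expansion trades precision for recall: each added variant matches more documents, including some irrelevant ones.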
34 
PRODUCE A BIOGRAPHY OF [PERON]. 
Only these fields are Relevant: 
1. Name(s), aliases: 
2. *Date of Birth or Current Age: 
3. *Date of Death: 
4. *Place of Birth: 
5. *Place of Death: 
6. Cause of Death: 
7. Religion (Affiliations): 
8. Known locations and dates: 
9. Last known address: 
10. Previous domiciles: 
11. Ethnic or tribal affiliations: 
12. Immediate family members 
13. Native Language spoken: 
14. Secondary Languages spoken: 
15. Physical Characteristics 
16. Passport number and country of issue: 
17. Professional positions: 
18. Education 
19. Party or other organization affiliations: 
20. Publications (titles and dates):
35 
Biography of Han Ming 
 Han Ming, born 1944 March in Pyongyan, South 
Korean Lei Fa Women’s University in French law, 
literature, a former female South Korean people, 
chairman of South Korea women’s groups,…Han, 
62, has championed women’s rights and liberal 
political ideas. Han was imprisoned from 1979 to 
1981 on charges of teaching pro-Communist 
ideas to workers, farmers and low-income 
women. She became the first minister of gender 
equality in 2001 and later served as an 
environment minister.
36 
Biography – two approaches 
 To obtain high precision, we handle 
each slot independently using 
bootstrapping to learn IE patterns. 
 To improve the recall, we utilize a 
biography Language Model.
37 
Approach 
 Characteristics of the IE approach 
 Training resource: Wikipedia and its manual 
annotations 
 Bootstrapping interleaves two corpora to improve 
precision 
 Wikipedia: reliable but small 
 Web: noisy but many relevant documents 
 No manual annotation or automatic tagging of corpus 
 Use seed tuples (person, date-of-birth) to find patterns 
 This approach is scalable for any corpus 
 Irrespective of size 
 Irrespective of whether it is static or dynamic 
 The IE system is augmented with language models to 
increase recall
38 
Biography as an IE task 
 We need patterns to extract information 
from a sentence 
 Creating patterns manually is a time 
consuming task, and not scalable 
 We want to find these patterns 
automatically
39 
Biography patterns from Wikipedia
40 
Biography patterns from Wikipedia 
• Martin Luther King, Jr., (January 15, 1929 – April 4, 
1968) was the most … 
• Martin Luther King, Jr., was born on January 15, 1929, 
in Atlanta, Georgia.
41 
Run IdFinder on these sentences 
 <Person> Martin Luther King, Jr. </Person>, 
(<Date>January 15, 1929</Date> – <Date> 
April 4, 1968</Date>) was the most… 
 <Person> Martin Luther King, Jr. </Person>, was 
born on <Date> January 15, 1929 </Date>, in 
<GPE> Atlanta, Georgia </GPE>. 
 Take the token sequence that includes the tags of 
interest + some context (2 tokens before and 2 
tokens after)
42 
Convert to Patterns: 
 <My_Person> (<My_Date> – <Date>) was the 
 <My_Person> , was born on <My_Date>, in 
 Remove more specific patterns: if one 
pattern contains another, keep the smallest one 
with more than k tokens. 
  <MY_Person> , was born on <My_Date> 
  <My_Person> (<My_Date> – <Date>) 
 Finally, verify the patterns manually to remove 
irrelevant patterns.
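The conversion and pruning steps might look like the sketch below. The tokenizer, the way the context window is cut, and the reading of "smallest > k tokens" are all assumptions, so the output will not match the slide's patterns in every case.

```python
import re

TAGGED = re.compile(r"<(Person|Date|GPE)>\s*(.*?)\s*</\1>")

def to_pattern(tagged_sentence, person, value, context=2):
    """Replace the seed person/value with <My_...> placeholders, keep other
    entities as bare tags, and cut a window of `context` tokens around them."""
    def swap(match):
        tag, text = match.group(1), match.group(2)
        if tag == "Person" and text == person:
            return " <My_Person> "
        if text == value:
            return f" <My_{tag}> "
        return f" <{tag}> "
    replaced = TAGGED.sub(swap, tagged_sentence)
    tokens = re.findall(r"<[^>]+>|\w+|[^\w\s]", replaced)
    hits = [i for i, t in enumerate(tokens) if t.startswith("<My_")]
    if len(hits) < 2:
        return None  # the sentence does not contain both seed elements
    lo = max(0, hits[0] - context)
    hi = min(len(tokens), hits[-1] + context + 1)
    return " ".join(tokens[lo:hi])

def prune(patterns, k=3):
    """Keep the shortest patterns; drop any pattern containing a kept one.
    Assumption: patterns with k tokens or fewer are discarded as too general."""
    kept = []
    for p in sorted(set(patterns), key=lambda s: len(s.split())):
        if len(p.split()) <= k:
            continue
        if not any(f" {q} " in f" {p} " for q in kept):
            kept.append(p)
    return kept

sentence = ("<Person> Martin Luther King, Jr. </Person>, was born on "
            "<Date> January 15, 1929 </Date>, in <GPE> Atlanta, Georgia </GPE>.")
pattern = to_pattern(sentence, "Martin Luther King, Jr.", "January 15, 1929")
print(pattern)
print(prune([pattern, "<My_Person> , was born on <My_Date>"]))
```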
43 
Examples of Patterns: 
 502 distinct place-of-birth patterns: 
 600 <MY_Person> was born in <MY_GPE> 
 169 <MY_Person> ( born <Date> in <MY_GPE> ) 
 44 Born in <MY_GPE> <MY_Person> 
 10 <MY_Person> was a native <MY_GPE> 
 10 <MY_Person> 's hometown of <MY_GPE> 
 1 <MY_Person> was baptized in <MY_GPE> 
 … 
 291 distinct date-of-death patterns: 
 770 <MY_Person> ( <Date> - <MY_Date> ) 
 92 <MY_Person> died on <MY_Date> 
 19 <MY_Person> <Date> - <MY_Date> 
 16 <MY_Person> died in <GPE> on <MY_Date> 
 3 < MY_Person> passed away on < MY_Date > 
 1 < MY_Person> committed suicide on <MY_Date> 
 …
44 
Biography as an IE task 
 This approach is good for the 
consistently annotated fields in 
Wikipedia: place of birth, date of 
birth, place of death, date of death 
 Not all fields of interest are 
annotated; a different approach is 
needed to cover the rest of the slots
45 
Bouncing between Wikipedia and Google 
 Use one seed only: 
<my person> and <target field> 
 Google: “Arafat” “civil engineering”, we get:
46
48 
Bouncing between Wikipedia and Google 
 Use one seed tuple only: 
 <my person> and <target field> 
 Google: “Arafat” “civil engineering”, we get: 
=> Arafat graduated with a bachelor’s degree in civil 
engineering 
=> Arafat studied civil engineering 
=> Arafat, a civil engineering student 
=> … 
 Using these snippets, corresponding patterns are 
created, then filtered out manually 
 To get more seed pairs, go to Wikipedia biography 
pages only and search for: 
 “graduated with a bachelor’s degree in” 
 We get:
49
50 
Bouncing between Wikipedia and Google 
 New seed tuples: 
 “Burnie Thompson” “political science” 
 “Henrey Luke” “Environment Studies” 
 “Erin Crocker” “industrial and management 
engineering” 
 “Denise Bode” “political science” 
 … 
 Go back to Google and repeat the 
process to get more seed patterns!
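The bouncing loop can be sketched over two toy sentence lists that stand in for Google snippets and Wikipedia biography pages. Both lists and all function names are hypothetical. A real run would query a search engine and include the manual filtering step.

```python
import re

def learn_patterns(sentences, seeds):
    """For each seed pair, take the text between person and field as a pattern."""
    patterns = set()
    for person, value in seeds:
        for s in sentences:
            if person in s and value in s:
                start = s.index(person) + len(person)
                end = s.index(value)
                if start < end:
                    patterns.add(s[start:end].strip())
    return patterns

def apply_patterns(sentences, patterns):
    """Find new (person, field) pairs in sentences that match a known pattern."""
    pairs = set()
    for p in patterns:
        rx = re.compile(r"^(.+?)\s+" + re.escape(p) + r"\s+(.+?)\.?$")
        for s in sentences:
            m = rx.match(s)
            if m:
                pairs.add((m.group(1), m.group(2)))
    return pairs

def bootstrap(web, wiki, seeds, rounds=2):
    seeds, patterns = set(seeds), set()
    for _ in range(rounds):
        patterns |= learn_patterns(web, seeds)   # "Google" step: seeds -> patterns
        seeds |= apply_patterns(wiki, patterns)  # "Wikipedia" step: patterns -> seeds
    return seeds, patterns

# Toy corpora (invented sentences reusing names from the slides).
WEB = [
    "Arafat graduated with a bachelor's degree in civil engineering.",
    "Erin Crocker studied industrial and management engineering.",
]
WIKI = [
    "Erin Crocker graduated with a bachelor's degree in industrial and management engineering.",
    "Denise Bode studied political science.",
]
seeds, patterns = bootstrap(WEB, WIKI, {("Arafat", "civil engineering")})
print(sorted(patterns))
print(sorted(seeds))
```

Starting from a single seed, the second round picks up the pattern "studied" and with it a pair that the first pattern could not reach. Without filtering, noisy patterns spread in the same way.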
51 
Bouncing between Wikipedia and Google 
 This approach worked well for a few 
fields, such as Education, Publications, 
Immediate family members, and Party or other 
organization affiliations 
 It did not provide good patterns for 
every field: for Religion, Ethnic or tribal 
affiliations, and Previous domiciles, we got a lot 
of noise 
 For some slots, we created some 
patterns manually
52 
Biography as Sentence Selection and Ranking 
 To obtain high recall, we also want to include 
sentences that IE may miss, perhaps due to ill-formed 
sentences (ASR and MT) 
 Get the top 100 documents from Indri 
 Extract all sentences that contain the person or 
reference to him/her 
 Use a variety of features to rank these 
sentence…

More Related Content

What's hot

Natural language processing
Natural language processingNatural language processing
Natural language processing
Saurav Aryal
 
Natural lanaguage processing
Natural lanaguage processingNatural lanaguage processing
Natural lanaguage processing
gulshan kumar
 
Nlp
NlpNlp
NLP
NLPNLP
Introduction to natural language processing (NLP)
Introduction to natural language processing (NLP)Introduction to natural language processing (NLP)
Introduction to natural language processing (NLP)
Alia Hamwi
 
NLP
NLPNLP
Natural Language Processing in AI
Natural Language Processing in AINatural Language Processing in AI
Natural Language Processing in AI
Saurav Shrestha
 
Introduction to Natural Language Processing
Introduction to Natural Language ProcessingIntroduction to Natural Language Processing
Introduction to Natural Language Processing
rohitnayak
 
Natural Language Processing seminar review
Natural Language Processing seminar review Natural Language Processing seminar review
Natural Language Processing seminar review
Jayneel Vora
 
Natural Language Processing (NLP)
Natural Language Processing (NLP)Natural Language Processing (NLP)
Natural Language Processing (NLP)
Yuriy Guts
 
Natural language processing
Natural language processingNatural language processing
Natural language processing
Abash shah
 
Natural language processing PPT presentation
Natural language processing PPT presentationNatural language processing PPT presentation
Natural language processing PPT presentation
Sai Mohith
 
Lightweight Natural Language Processing (NLP)
Lightweight Natural Language Processing (NLP)Lightweight Natural Language Processing (NLP)
Lightweight Natural Language Processing (NLP)
Lithium
 
Introduction to Named Entity Recognition
Introduction to Named Entity RecognitionIntroduction to Named Entity Recognition
Introduction to Named Entity Recognition
Tomer Lieber
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
Jaganadh Gopinadhan
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
Varunjeet Singh Rekhi
 
Natural language processing: feature extraction
Natural language processing: feature extractionNatural language processing: feature extraction
Natural language processing: feature extraction
Gabriel Hamilton
 
Practical Natural Language Processing
Practical Natural Language ProcessingPractical Natural Language Processing
Practical Natural Language Processing
Jaganadh Gopinadhan
 
Natural language processing
Natural language processingNatural language processing
Natural language processing
Hansi Thenuwara
 
Introduction to natural language processing
Introduction to natural language processingIntroduction to natural language processing
Introduction to natural language processing
Minh Pham
 

What's hot (20)

Natural language processing
Natural language processingNatural language processing
Natural language processing
 
Natural lanaguage processing
Natural lanaguage processingNatural lanaguage processing
Natural lanaguage processing
 
Nlp
NlpNlp
Nlp
 
NLP
NLPNLP
NLP
 
Introduction to natural language processing (NLP)
Introduction to natural language processing (NLP)Introduction to natural language processing (NLP)
Introduction to natural language processing (NLP)
 
NLP
NLPNLP
NLP
 
Natural Language Processing in AI
Natural Language Processing in AINatural Language Processing in AI
Natural Language Processing in AI
 
Introduction to Natural Language Processing
Introduction to Natural Language ProcessingIntroduction to Natural Language Processing
Introduction to Natural Language Processing
 
Natural Language Processing seminar review
Natural Language Processing seminar review Natural Language Processing seminar review
Natural Language Processing seminar review
 
Natural Language Processing (NLP)
Natural Language Processing (NLP)Natural Language Processing (NLP)
Natural Language Processing (NLP)
 
Natural language processing
Natural language processingNatural language processing
Natural language processing
 
Natural language processing PPT presentation
Natural language processing PPT presentationNatural language processing PPT presentation
Natural language processing PPT presentation
 
Lightweight Natural Language Processing (NLP)
Lightweight Natural Language Processing (NLP)Lightweight Natural Language Processing (NLP)
Lightweight Natural Language Processing (NLP)
 
Introduction to Named Entity Recognition
Introduction to Named Entity RecognitionIntroduction to Named Entity Recognition
Introduction to Named Entity Recognition
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
 
Natural language processing: feature extraction
Natural language processing: feature extractionNatural language processing: feature extraction
Natural language processing: feature extraction
 
Practical Natural Language Processing
Practical Natural Language ProcessingPractical Natural Language Processing
Practical Natural Language Processing
 
Natural language processing
Natural language processingNatural language processing
Natural language processing
 
Introduction to natural language processing
Introduction to natural language processingIntroduction to natural language processing
Introduction to natural language processing
 

Viewers also liked

Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
Rishikese MR
 
Fuzzy logic and application in AI
Fuzzy logic and application in AIFuzzy logic and application in AI
Fuzzy logic and application in AI
Ildar Nurgaliev
 
From Natural Language Processing to Artificial Intelligence
From Natural Language Processing to Artificial IntelligenceFrom Natural Language Processing to Artificial Intelligence
From Natural Language Processing to Artificial Intelligence
Jonathan Mugan
 
Genetic Algorithms Made Easy
Genetic Algorithms Made EasyGenetic Algorithms Made Easy
Genetic Algorithms Made Easy
Prakash Pimpale
 
Genetic Algorithms
Genetic AlgorithmsGenetic Algorithms
Genetic Algorithms
Shruti Railkar
 
Genetic Algorithm by Example
Genetic Algorithm by ExampleGenetic Algorithm by Example
Genetic Algorithm by Example
Nobal Niraula
 
Genetic algorithm
Genetic algorithmGenetic algorithm
Genetic algorithm
garima931
 
Genetic Algorithms - Artificial Intelligence
Genetic Algorithms - Artificial IntelligenceGenetic Algorithms - Artificial Intelligence
Genetic Algorithms - Artificial Intelligence
Sahil Kumar
 
Chapter 5 - Fuzzy Logic
Chapter 5 - Fuzzy LogicChapter 5 - Fuzzy Logic
Chapter 5 - Fuzzy Logic
Ashique Rasool
 
Fuzzy Sets Introduction With Example
Fuzzy Sets Introduction With ExampleFuzzy Sets Introduction With Example
Fuzzy Sets Introduction With Example
raisnasir
 

Viewers also liked (10)

Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
 
Fuzzy logic and application in AI
Fuzzy logic and application in AIFuzzy logic and application in AI
Fuzzy logic and application in AI
 
From Natural Language Processing to Artificial Intelligence
From Natural Language Processing to Artificial IntelligenceFrom Natural Language Processing to Artificial Intelligence
From Natural Language Processing to Artificial Intelligence
 
Genetic Algorithms Made Easy
Genetic Algorithms Made EasyGenetic Algorithms Made Easy
Genetic Algorithms Made Easy
 
Genetic Algorithms
Genetic AlgorithmsGenetic Algorithms
Genetic Algorithms
 
Genetic Algorithm by Example
Genetic Algorithm by ExampleGenetic Algorithm by Example
Genetic Algorithm by Example
 
Genetic algorithm
Genetic algorithmGenetic algorithm
Genetic algorithm
 
Genetic Algorithms - Artificial Intelligence
Genetic Algorithms - Artificial IntelligenceGenetic Algorithms - Artificial Intelligence
Genetic Algorithms - Artificial Intelligence
 
Chapter 5 - Fuzzy Logic
Chapter 5 - Fuzzy LogicChapter 5 - Fuzzy Logic
Chapter 5 - Fuzzy Logic
 
Fuzzy Sets Introduction With Example
Fuzzy Sets Introduction With ExampleFuzzy Sets Introduction With Example
Fuzzy Sets Introduction With Example
 

Similar to Natural Language Processing

Year 1 AI.ppt
Year 1 AI.pptYear 1 AI.ppt
Year 1 AI.ppt
KrishnaMadala1
 
I want to know more about compuerized text analysis
I want to know more about   compuerized text analysisI want to know more about   compuerized text analysis
I want to know more about compuerized text analysis
Luke Czarnecki
 
NLP Introduction.ppt machine learning presentation
NLP  Introduction.ppt machine learning presentationNLP  Introduction.ppt machine learning presentation
NLP Introduction.ppt machine learning presentation
PriyankaRamavath3
 
Human Brain Essay.pdf
Human Brain Essay.pdfHuman Brain Essay.pdf
Human Brain Essay.pdf
Jennifer Reese
 
IE: Named Entity Recognition (NER)
IE: Named Entity Recognition (NER)IE: Named Entity Recognition (NER)
IE: Named Entity Recognition (NER)
Marina Santini
 
From Research Objects to Reproducible Science Tales
From Research Objects to Reproducible Science TalesFrom Research Objects to Reproducible Science Tales
From Research Objects to Reproducible Science Tales
Bertram Ludäscher
 
Topic models, vector semantics and applications
Topic models, vector semantics and applicationsTopic models, vector semantics and applications
Topic models, vector semantics and applications
Vasileios Lampos
 
An-Exploration-of-scientific-literature-using-Natural-Language-Processing
An-Exploration-of-scientific-literature-using-Natural-Language-ProcessingAn-Exploration-of-scientific-literature-using-Natural-Language-Processing
An-Exploration-of-scientific-literature-using-Natural-Language-Processing
Theodore J. LaGrow
 
Big Data LDN 2018: PROMISE AND PITFALLS OF TEXT ANALYTICS
Big Data LDN 2018: PROMISE AND PITFALLS OF TEXT ANALYTICSBig Data LDN 2018: PROMISE AND PITFALLS OF TEXT ANALYTICS
Big Data LDN 2018: PROMISE AND PITFALLS OF TEXT ANALYTICS
Matt Stubbs
 
Question Answering over Linked Data: Challenges, Approaches & Trends (Tutoria...
Question Answering over Linked Data: Challenges, Approaches & Trends (Tutoria...Question Answering over Linked Data: Challenges, Approaches & Trends (Tutoria...
Question Answering over Linked Data: Challenges, Approaches & Trends (Tutoria...
Andre Freitas
 
Intro to nlp
Intro to nlpIntro to nlp
Intro to nlp
Rutu Mulkar-Mehta
 
Text mining voor Business Intelligence toepassingen
Text mining voor Business Intelligence toepassingenText mining voor Business Intelligence toepassingen
Text mining voor Business Intelligence toepassingen
jcscholtes
 
Talking to your Data: Natural Language Interfaces for a schema-less world (Ke...
Talking to your Data: Natural Language Interfaces for a schema-less world (Ke...Talking to your Data: Natural Language Interfaces for a schema-less world (Ke...
Talking to your Data: Natural Language Interfaces for a schema-less world (Ke...
Andre Freitas
 
Concepts in Application Context ( How we may think conceptually )
Concepts in Application Context ( How we may think conceptually )Concepts in Application Context ( How we may think conceptually )
Concepts in Application Context ( How we may think conceptually )
Steffen Staab
 
How can text-mining leverage developments in Deep Learning? Presentation at ...
How can text-mining leverage developments in Deep Learning?  Presentation at ...How can text-mining leverage developments in Deep Learning?  Presentation at ...
How can text-mining leverage developments in Deep Learning? Presentation at ...
jcscholtes
 
Question Answering over Linked Data (Reasoning Web Summer School)
Question Answering over Linked Data (Reasoning Web Summer School)Question Answering over Linked Data (Reasoning Web Summer School)
Question Answering over Linked Data (Reasoning Web Summer School)
Andre Freitas
 
Frontiers of Computational Journalism week 1 - Introduction and High Dimensio...
Frontiers of Computational Journalism week 1 - Introduction and High Dimensio...Frontiers of Computational Journalism week 1 - Introduction and High Dimensio...
Frontiers of Computational Journalism week 1 - Introduction and High Dimensio...
Jonathan Stray
 
Headed Paper Facebook - A Letterhead, Or Letterhead
Headed Paper Facebook - A Letterhead, Or LetterheadHeaded Paper Facebook - A Letterhead, Or Letterhead
Headed Paper Facebook - A Letterhead, Or Letterhead
Richard Hogue
 
Comparison And Contrast Essay Outline Sample
Comparison And Contrast Essay Outline SampleComparison And Contrast Essay Outline Sample
Comparison And Contrast Essay Outline Sample
Bridget Dodson
 
Big, Open, Data and Semantics for Real-World Application Near You
Big, Open, Data and Semantics for Real-World Application Near YouBig, Open, Data and Semantics for Real-World Application Near You
Big, Open, Data and Semantics for Real-World Application Near You
Biplav Srivastava
 

Similar to Natural Language Processing (20)

Year 1 AI.ppt
Year 1 AI.pptYear 1 AI.ppt
Year 1 AI.ppt
 
I want to know more about compuerized text analysis
I want to know more about   compuerized text analysisI want to know more about   compuerized text analysis
I want to know more about compuerized text analysis
 
NLP Introduction.ppt machine learning presentation
NLP  Introduction.ppt machine learning presentationNLP  Introduction.ppt machine learning presentation
NLP Introduction.ppt machine learning presentation
 
Human Brain Essay.pdf
Human Brain Essay.pdfHuman Brain Essay.pdf
Human Brain Essay.pdf
 
IE: Named Entity Recognition (NER)
IE: Named Entity Recognition (NER)IE: Named Entity Recognition (NER)
IE: Named Entity Recognition (NER)
 
From Research Objects to Reproducible Science Tales
From Research Objects to Reproducible Science TalesFrom Research Objects to Reproducible Science Tales
From Research Objects to Reproducible Science Tales
 
Topic models, vector semantics and applications
Topic models, vector semantics and applicationsTopic models, vector semantics and applications
Topic models, vector semantics and applications
 
An-Exploration-of-scientific-literature-using-Natural-Language-Processing
An-Exploration-of-scientific-literature-using-Natural-Language-ProcessingAn-Exploration-of-scientific-literature-using-Natural-Language-Processing
An-Exploration-of-scientific-literature-using-Natural-Language-Processing
 
Big Data LDN 2018: PROMISE AND PITFALLS OF TEXT ANALYTICS
Big Data LDN 2018: PROMISE AND PITFALLS OF TEXT ANALYTICSBig Data LDN 2018: PROMISE AND PITFALLS OF TEXT ANALYTICS
Big Data LDN 2018: PROMISE AND PITFALLS OF TEXT ANALYTICS
 
Question Answering over Linked Data: Challenges, Approaches & Trends (Tutoria...
Question Answering over Linked Data: Challenges, Approaches & Trends (Tutoria...Question Answering over Linked Data: Challenges, Approaches & Trends (Tutoria...
Question Answering over Linked Data: Challenges, Approaches & Trends (Tutoria...
 
Intro to nlp
Intro to nlpIntro to nlp
Intro to nlp
 
Text mining voor Business Intelligence toepassingen
Text mining voor Business Intelligence toepassingenText mining voor Business Intelligence toepassingen
Text mining voor Business Intelligence toepassingen
 
Talking to your Data: Natural Language Interfaces for a schema-less world (Ke...
Talking to your Data: Natural Language Interfaces for a schema-less world (Ke...Talking to your Data: Natural Language Interfaces for a schema-less world (Ke...
Talking to your Data: Natural Language Interfaces for a schema-less world (Ke...
 
Concepts in Application Context ( How we may think conceptually )
Concepts in Application Context ( How we may think conceptually )Concepts in Application Context ( How we may think conceptually )
Concepts in Application Context ( How we may think conceptually )
 
How can text-mining leverage developments in Deep Learning? Presentation at ...
How can text-mining leverage developments in Deep Learning?  Presentation at ...How can text-mining leverage developments in Deep Learning?  Presentation at ...
How can text-mining leverage developments in Deep Learning? Presentation at ...
 
Question Answering over Linked Data (Reasoning Web Summer School)
Question Answering over Linked Data (Reasoning Web Summer School)Question Answering over Linked Data (Reasoning Web Summer School)
Question Answering over Linked Data (Reasoning Web Summer School)
 
Frontiers of Computational Journalism week 1 - Introduction and High Dimensio...
Frontiers of Computational Journalism week 1 - Introduction and High Dimensio...Frontiers of Computational Journalism week 1 - Introduction and High Dimensio...
Frontiers of Computational Journalism week 1 - Introduction and High Dimensio...
 
Headed Paper Facebook - A Letterhead, Or Letterhead
Headed Paper Facebook - A Letterhead, Or LetterheadHeaded Paper Facebook - A Letterhead, Or Letterhead
Headed Paper Facebook - A Letterhead, Or Letterhead
 
Comparison And Contrast Essay Outline Sample
Comparison And Contrast Essay Outline SampleComparison And Contrast Essay Outline Sample
Comparison And Contrast Essay Outline Sample
 
Big, Open, Data and Semantics for Real-World Application Near You
Big, Open, Data and Semantics for Real-World Application Near YouBig, Open, Data and Semantics for Real-World Application Near You
Big, Open, Data and Semantics for Real-World Application Near You
 

More from Ila Group

Automation consultants Company profile - jan 2015
Automation consultants   Company profile - jan 2015Automation consultants   Company profile - jan 2015
Automation consultants Company profile - jan 2015
Ila Group
 
Useful Techniques in Artificial Intelligence
Useful Techniques in Artificial IntelligenceUseful Techniques in Artificial Intelligence
Useful Techniques in Artificial Intelligence
Ila Group
 
Shine Technology data sheet
Shine Technology data sheetShine Technology data sheet
Shine Technology data sheet
Ila Group
 
Red lambda FAQ's
Red lambda FAQ'sRed lambda FAQ's
Red lambda FAQ's
Ila Group
 
Red lambda Brochure Meta Grid Executive Overview
Red lambda Brochure  Meta Grid Executive OverviewRed lambda Brochure  Meta Grid Executive Overview
Red lambda Brochure Meta Grid Executive Overview
Ila Group
 
Global Telecom trends by 2020
Global Telecom trends by 2020Global Telecom trends by 2020
Global Telecom trends by 2020
Ila Group
 
Big Data Analytics Research Report
Big Data Analytics Research ReportBig Data Analytics Research Report
Big Data Analytics Research Report
Ila Group
 
Analyst Report for Next Generation Firewall
Analyst Report for Next Generation FirewallAnalyst Report for Next Generation Firewall
Analyst Report for Next Generation Firewall
Ila Group
 
Understanding Artificial intelligence
Understanding Artificial intelligenceUnderstanding Artificial intelligence
Understanding Artificial intelligence
Ila Group
 
Cyber security Guide
Cyber security GuideCyber security Guide
Cyber security Guide
Ila Group
 
Analyst report for Next Generation Firewalls
Analyst report for Next Generation FirewallsAnalyst report for Next Generation Firewalls
Analyst report for Next Generation Firewalls
Ila Group
 
Next generation Search Engines
Next generation Search EnginesNext generation Search Engines
Next generation Search Engines
Ila Group
 

Natural Language Processing

  • 2. Why “natural language”?  Natural vs. artificial  Language vs. English
  • 3. Why “natural language”?  Natural vs. artificial  Not precise, ambiguous, wide range of expression  Language vs. English  English, French, Japanese, Spanish
  • 4. Why “natural language”?  Natural vs. artificial  Not precise, ambiguous, wide range of expression  Language vs. English  English, French, Japanese, Spanish  Natural language processing = programs, theories towards understanding a problem or question in natural language and answering it
  • 5. Approaches  System building  Interactive  Understanding only  Generation only  Theoretical  Draws on linguistics, psychology, philosophy
  • 6. Building an NL system is hard  Unlikely to be possible without solid theoretical underpinnings
  • 7. Natural language is useful  Question-answering systems  http://tangra.si.umich.edu/clair/NSIR/NSIR.cgi  Mixed initiative systems  http://www.cs.columbia.edu/~noemie/match.mpg  Information extraction  http://nlp.cs.nyu.edu/info-extr/biomedical-snapshot.jpg  Systems that write/speak  http://www-2.cs.cmu.edu/~awb/synthesizers.html  MAGIC  Machine translation  http://world.altavista.com/babelfish
  • 8. Topics  Syntax  Semantics  Pragmatics  Statistical NLP: combining learning and NL processing
  • 9. Goal of Interpretation  Identify sentence meaning  Do something with meaning  Need some representation of action/meaning
  • 10. Analysis of form: Syntax  Which parts were damaged by larger machines?  Which parts damaged larger machines?  Which larger machines damaged parts?  Approaches:  Statistical part of speech tagging  Parsing using a grammar  Shallow parsing: identify meaningful chunks
  • 11. Which parts were damaged by larger machines? S (Q) NP VP N NP (Q) machines V (past) damage Det (Q) N which parts ADJ larger
  • 12. Which parts were damaged by machines? – with functional roles S (Q) NP (SUBJ) VP N NP (Q) (OBJ) machines V (past) damage Det (Q) N which parts ADJ larger
  • 13. Which parts damaged machines? – with functional roles S (Q) NP (OBJ) VP N machines V (past) damage parts NP (Q) (SUBJ) Det (Q) N which ADJ larger
  • 14. Parsers  Grammar  S -> NP VP  NP -> DET {ADJ*} N  Different types of grammars  Context Free vs. Context Sensitive  Lexical Functional Grammar vs. Tree Adjoining Grammars  Different ways of acquiring grammars  Hand-encoded vs. machine learned  Domain independent (TreeBank, Wall Street Journal)  Domain dependent (Medical texts)
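The two grammar rules on the Parsers slide can be tried out with a very small recognizer. The sketch below is only an illustration: the lexicon, the `VP -> V NP` rule (the slide does not define VP), and the function names are assumptions added here, not part of the deck.

```python
# Toy lexicon: this word-to-category mapping is an assumption for illustration.
LEXICON = {
    "which": "DET", "the": "DET",
    "larger": "ADJ",
    "parts": "N", "machines": "N",
    "damaged": "V",
}


def parse_np(tags, i):
    """NP -> DET ADJ* N. Return the index just past the NP, or None."""
    if i < len(tags) and tags[i] == "DET":
        i += 1
        while i < len(tags) and tags[i] == "ADJ":
            i += 1
        if i < len(tags) and tags[i] == "N":
            return i + 1
    return None


def parse_vp(tags, i):
    """VP -> V NP (assumed rule; the slide leaves VP undefined)."""
    if i < len(tags) and tags[i] == "V":
        return parse_np(tags, i + 1)
    return None


def recognize(sentence):
    """S -> NP VP. True only if the rules cover the whole sentence."""
    words = sentence.lower().rstrip("?.").split()
    tags = [LEXICON.get(w) for w in words]
    if None in tags:  # unknown word: the toy lexicon cannot tag it
        return False
    after_np = parse_np(tags, 0)
    if after_np is None:
        return False
    return parse_vp(tags, after_np) == len(tags)


print(recognize("Which parts damaged the larger machines?"))
print(recognize("parts damaged machines"))
```

A hand-written recognizer like this is the "hand-encoded" end of the spectrum the slide mentions; it accepts or rejects a sentence but builds no tree.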
  • 15. Semantics: analysis of meaning  Word meaning  John picked up a bad cold.  John picked up a large rock.  John picked up Radio Netherlands on his radio.  John picked up a hitchhiker on Highway 66.  Phrasal meaning  Baby bonuses -> allocations  Senior citizens -> personnes âgées  Causing havoc -> sème le désarroi  Approaches  Representing meaning  Statistical word disambiguation  Symbolic rule-based vs. shallow statistical semantics
  • 18. OMEGA  http://omega.isi.edu:8007/index  http://omega.is.edu/doc/browsers.html
  • 20. Statistical Word Sense Disambiguation  Context within the sentence determines which sense is correct  The candidate picked up [sense6] thousands of additional votes.  He picked up [sense2] the book and started to read.  Her performance in school picked up [sense13].  The swimmers got out of the river and climbed the bank [sloping land] to retrieve their towels.  The investors took their money out of the bank [financial institution] and moved it into stocks and bonds.
  • 21. Goal  A program which can predict which sense is the correct sense given a new sentence containing “pick up” or “bank”  Avoid manually itemizing all words which can occur in sentences with different meanings  Can we use machine learning?
  • 22. What do we need?  Data  Features  Machine Learning algorithm  Decision tree vs. SVM/Naïve Bayes  Inspecting the output  Accuracy of these methods
  • 23. Using Categories from Roget’s Thesaurus (e.g., machine vs. animal) for training
  • 24. Training data for “machines”
  • 26. Predicting the correct sense in unseen text  Use presence of the salient words in context  50 word window  Use Bayes’ rule to compute probabilities for different categories
  • 27. “Crane”  Occurred 74 times in Grolier’s, 36 as animal, 38 as machine  Predictions in new sentences were 99% correct  Example: lift water and to grind grain .PP Treadmills attached to cranes were used to lift heavy objects from Roman times.
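The prediction step described above (salient context words in a window, combined with Bayes' rule) can be sketched as a small naive Bayes classifier. Everything concrete here is an assumption for illustration: the four training sentences are invented toy data rather than Roget's or Grolier's text, the window is read as up to 50 tokens on each side of the target, and add-one smoothing is a choice made for the sketch.

```python
import math
from collections import Counter

# Invented toy training contexts (assumption; not the Roget's/Grolier's data).
TRAIN = [
    ("ANIMAL", "the crane is a wading bird that nests in marshes"),
    ("ANIMAL", "a crane feeds on fish and frogs near water"),
    ("MACHINE", "the crane lifted heavy steel beams at the construction site"),
    ("MACHINE", "operators use a crane to lift heavy objects"),
]


def train(labelled_contexts):
    """Count how often each context word appears under each category."""
    cat_counts = Counter()
    word_counts = {}
    for cat, text in labelled_contexts:
        cat_counts[cat] += 1
        word_counts.setdefault(cat, Counter()).update(text.lower().split())
    return cat_counts, word_counts


def context_window(tokens, target, size=50):
    """Up to `size` tokens on each side of the first occurrence of target."""
    i = tokens.index(target)
    return tokens[max(0, i - size):i] + tokens[i + 1:i + 1 + size]


def classify(text, target, cat_counts, word_counts):
    """Pick the category maximizing log P(cat) + sum of log P(word | cat)."""
    context = context_window(text.lower().split(), target)
    vocab = {w for counts in word_counts.values() for w in counts}
    total = sum(cat_counts.values())
    best_cat, best_score = None, -math.inf
    for cat in cat_counts:
        score = math.log(cat_counts[cat] / total)
        n = sum(word_counts[cat].values())
        for w in context:
            # Add-one smoothing so unseen words do not zero out a category.
            score += math.log((word_counts[cat][w] + 1) / (n + len(vocab) + 1))
        if score > best_score:
            best_cat, best_score = cat, score
    return best_cat


cat_counts, word_counts = train(TRAIN)
sentence = "treadmills attached to the crane were used to lift heavy objects"
print(classify(sentence, "crane", cat_counts, word_counts))
```

With real data the per-category counts would come from the thesaurus-category training contexts the slides describe; the toy corpus only shows the mechanics.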
  • 30. Going Home – A play in one act  Scene 1: Pennsylvania Station, NYC Bonnie: Long Beach? Passerby: Downstairs, LIRR Station  Scene 2: ticket counter: LIRR Bonnie: Long Beach? Clerk: $4.50  Scene 3: Information Booth, LIRR Bonnie: Long Beach? Clerk: 4:19, Track 17  Scene 4: On the train, vicinity of Forest Hills Bonnie: Long Beach? Conductor: Change at Jamaica  Scene 5: On the next train, vicinity of Lynbrook Bonnie: Long Beach? Conductor: Right after Island Park.
  • 31. Question Answering on the web  Input: English question  Data: documents retrieved by a search engine from the web  Output: The phrase(s) within the documents that answer the question
  • 32. Examples  When was X born? When was Mozart born? Mozart was born in 1756. When was Gandhi born?  Gandhi (1869-1948)  Where are the Rocky Mountains located?  What is nepotism?
  • 33. Common Approach  Create a query from the question  When was Mozart born -> Mozart born  Use WordNet to expand terms and increase recall:  Which high school was ranked highest in the US in 1998?  “high school” -> (high&school)|(senior&high&school)|(senior&high)|high|highschool  Use search engine to find relevant documents  Pinpoint passage within document that has answer using patterns  From IR to NLP
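The first two steps of the common approach (turning a question into a query, then expanding terms) can be sketched as follows. The stopword list and the `SYNONYMS` table are hypothetical stand-ins: a real system would look up variants in WordNet, which this sketch does not call.

```python
import re

# Assumed stopword list; a real system would use a fuller one.
STOPWORDS = {"when", "was", "where", "are", "is", "the", "what",
             "which", "who", "in", "a", "an", "of"}

# Hypothetical stand-in for a WordNet lookup.
SYNONYMS = {"high school": ["senior high school", "highschool"]}


def make_query(question):
    """Keep only the content words of the question."""
    tokens = re.findall(r"[a-z0-9]+", question.lower())
    return [t for t in tokens if t not in STOPWORDS]


def expand(phrase):
    """Build an OR of AND-groups: one group per variant of the phrase."""
    variants = [phrase] + SYNONYMS.get(phrase, [])
    return "|".join("(" + "&".join(v.split()) + ")" for v in variants)


print(make_query("When was Mozart born?"))
print(expand("high school"))
```

Expansion trades precision for recall: more documents match, so the later passage-pinpointing step has to do more filtering.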
  • 34. PRODUCE A BIOGRAPHY OF [PERON]. Only these fields are Relevant: 1. Name(s), aliases: 2. *Date of Birth or Current Age: 3. *Date of Death: 4. *Place of Birth: 5. *Place of Death: 6. Cause of Death: 7. Religion (Affiliations): 8. Known locations and dates: 9. Last known address: 10. Previous domiciles: 11. Ethnic or tribal affiliations: 12. Immediate family members 13. Native Language spoken: 14. Secondary Languages spoken: 15. Physical Characteristics 16. Passport number and country of issue: 17. Professional positions: 18. Education 19. Party or other organization affiliations: 20. Publications (titles and dates):
  • 35. Biography of Han Ming  Han Ming, born 1944 March in Pyongyan, South Korean Lei Fa Women’s University in French law, literature, a former female South Korean people, chairman of South Korea women’s groups,…Han, 62, has championed women’s rights and liberal political ideas. Han was imprisoned from 1979 to 1981 on charges of teaching pro-Communist ideas to workers, farmers and low-income women. She became the first minister of gender equality in 2001 and later served as an environment minister.
  • 36. Biography – two approaches  To obtain high precision, we handle each slot independently using bootstrapping to learn IE patterns.  To improve the recall, we utilize a biography Language Model.
  • 37. Approach  Characteristics of the IE approach  Training resource: Wikipedia and its manual annotations  Bootstrapping interleaves two corpora to improve precision  Wikipedia: reliable but small  Web: noisy but many relevant documents  No manual annotation or automatic tagging of corpus  Use seed tuples (person, date-of-birth) to find patterns  This approach is scalable for any corpus  Irrespective of size  Irrespective of whether it is static or dynamic  The IE system is augmented with language models to increase recall
  • 38. Biography as an IE task  We need patterns to extract information from a sentence  Creating patterns manually is a time consuming task, and not scalable  We want to find these patterns automatically
  • 39. Biography patterns from Wikipedia
  • 40. Biography patterns from Wikipedia • Martin Luther King, Jr., (January 15, 1929 – April 4, 1968) was the most … • Martin Luther King, Jr., was born on January 15, 1929, in Atlanta, Georgia.
  • 41. Run IdFinder on these sentences  <Person> Martin Luther King, Jr. </Person>, (<Date>January 15, 1929</Date> – <Date> April 4, 1968</Date>) was the most…  <Person> Martin Luther King, Jr. </Person>, was born on <Date> January 15, 1929 </Date>, in <GPE> Atlanta, Georgia </GPE>.  Take the token sequence that includes the tags of interest + some context (2 tokens before and 2 tokens after)
  • 42. Convert to Patterns:  <My_Person> (<My_Date> – <Date>) was the  <My_Person> , was born on <My_Date>, in  Remove more specific patterns: if one pattern contains another, keep the smallest one with more than k tokens.   <MY_Person> , was born on <My_Date>   <My_Person> (<My_Date> – <Date>)  Finally, verify the patterns manually to remove irrelevant patterns.
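The conversion from a tagged sentence to a pattern can be sketched as below. This is a minimal illustration, not the authors' code: the function name, the tokenization regex, and the way context is clipped at sentence boundaries are assumptions, and the pruning and manual verification steps are left out.

```python
import re

# Matches an entity tag pair such as <Date> January 15, 1929 </Date>.
TAG_RE = re.compile(r"<(Person|Date|GPE)>\s*(.*?)\s*</\1>")


def to_pattern(tagged, person, value, value_tag="Date", context=2):
    """Generalize a tagged sentence into a pattern for one seed tuple.

    Entities matching the seed become <My_Person> / <My_Date>; other
    entities keep only their type. The pattern spans the seed tags plus
    `context` tokens on each side (clipped at the sentence boundary).
    """
    def repl(match):
        tag, text = match.group(1), match.group(2)
        if tag == "Person" and text == person:
            return " <My_Person> "
        if tag == value_tag and text == value:
            return f" <My_{tag}> "
        return f" <{tag}> "

    # Tokens are tags, words, or single punctuation marks (assumed scheme).
    tokens = re.findall(r"<[^>]+>|\w+|[^\w\s]", TAG_RE.sub(repl, tagged))
    targets = [i for i, tok in enumerate(tokens) if tok.startswith("<My_")]
    if len(targets) < 2:  # both halves of the seed tuple must be present
        return None
    start = max(0, targets[0] - context)
    end = min(len(tokens), targets[-1] + 1 + context)
    return " ".join(tokens[start:end])


sentence = ("<Person> Martin Luther King, Jr. </Person>, was born on "
            "<Date> January 15, 1929 </Date>, in "
            "<GPE> Atlanta, Georgia </GPE>.")
print(to_pattern(sentence, "Martin Luther King, Jr.", "January 15, 1929"))
```

For the sentence above this produces the "was born on" pattern from the slide, up to spacing around punctuation.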
  • 43. Examples of Patterns:  502 distinct place-of-birth patterns:  600 <MY_Person> was born in <MY_GPE>  169 <MY_Person> ( born <Date> in <MY_GPE> )  44 Born in <MY_GPE> <MY_Person>  10 <MY_Person> was a native <MY_GPE>  10 <MY_Person> 's hometown of <MY_GPE>  1 <MY_Person> was baptized in <MY_GPE>  …  291 distinct date-of-death patterns:  770 <MY_Person> ( <Date> - <MY_Date> )  92 <MY_Person> died on <MY_Date>  19 <MY_Person> <Date> - <MY_Date>  16 <MY_Person> died in <GPE> on <MY_Date>  3 <MY_Person> passed away on <MY_Date>  1 <MY_Person> committed suicide on <MY_Date>  …
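Once patterns like these exist, applying them to new text is a matching problem. The sketch below compiles two of the place-of-birth patterns into regular expressions and tries the most frequent one first. The capitalized-word `NAME` expression is a crude, assumed stand-in for a named-entity tagger such as the IdFinder mentioned earlier, and the function names are hypothetical.

```python
import re

# Crude stand-in for a named-entity tagger: a run of capitalized words.
NAME = r"(?:[A-Z][\w.'-]*)(?: [A-Z][\w.'-]*)*"

# (frequency, pattern) pairs taken from the place-of-birth list above.
PATTERNS = [
    (600, "<MY_Person> was born in <MY_GPE>"),
    (1, "<MY_Person> was baptized in <MY_GPE>"),
]


def compile_pattern(pattern):
    """Turn a slot pattern into a regex with named groups."""
    parts = []
    for token in pattern.split():
        if token == "<MY_Person>":
            parts.append(f"(?P<person>{NAME})")
        elif token == "<MY_GPE>":
            parts.append(f"(?P<place>{NAME})")
        else:
            parts.append(re.escape(token))
    return re.compile(r"\s+".join(parts))


def extract_birthplace(text, patterns=PATTERNS):
    """Try patterns from most to least frequent; return (person, place)."""
    for _, pattern in sorted(patterns, reverse=True):
        match = compile_pattern(pattern).search(text)
        if match:
            return match.group("person"), match.group("place")
    return None


print(extract_birthplace("Wolfgang Amadeus Mozart was born in Salzburg in 1756."))
```

Ordering by frequency is one plausible way to use the counts; the slides do not say how the counts were used at extraction time.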
  • 44. Biography as an IE task  This approach works well for the fields that are consistently annotated in Wikipedia: place of birth, date of birth, place of death, date of death  Not all fields of interest are annotated; a different approach is needed to cover the remaining slots
  • 45. Bouncing between Wikipedia and Google  Use one seed only: <my person> and <target field>  Google: “Arafat” “civil engineering”, we get:
  • 47. Bouncing between Wikipedia and Google  Use one seed only:  <my person> and <target field>  Google: “Arafat” “civil engineering”, we get: ⇒ Arafat graduated with a bachelor’s degree in civil engineering ⇒ Arafat studied civil engineering ⇒ Arafat, a civil engineering student ⇒ …  Using these snippets, corresponding patterns are created, then filtered manually.
  • 48. Bouncing between Wikipedia and Google  Use one seed tuple only:  <my person> and <target field>  Google: “Arafat” “civil engineering”, we get: ⇒ Arafat graduated with a bachelor’s degree in civil engineering ⇒ Arafat studied civil engineering ⇒ Arafat, a civil engineering student ⇒ …  Using these snippets, corresponding patterns are created, then filtered manually  To get more seed pairs, go to Wikipedia biography pages only and search for:  “graduated with a bachelor’s degree in”  We get:
  • 50. Bouncing between Wikipedia and Google  New seed tuples:  “Burnie Thompson” “political science”  “Henrey Luke” “Environment Studies”  “Erin Crocker” “industrial and management engineering”  “Denise Bode” “political science”  …  Go back to Google and repeat the process to get more seed patterns!
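The back-and-forth between the two sources can be sketched as a loop over two in-memory lists. This is an illustrative assumption throughout: the `WEB` and `WIKI` sentences are invented toy data standing in for Google snippets and Wikipedia biography pages, a pattern is simply the text between the person and the field value, and the manual filtering step from the slides is omitted.

```python
import re

# Invented toy data standing in for web snippets and Wikipedia sentences.
WEB = [
    "Arafat graduated with a bachelor's degree in civil engineering",
    "Arafat studied civil engineering",
    "Denise Bode earned a degree in political science",
]
WIKI = [
    "Denise Bode graduated with a bachelor's degree in political science.",
    "Erin Crocker studied industrial and management engineering.",
]


def learn_patterns(seeds, snippets):
    """A pattern is the text found between a seed person and seed value."""
    patterns = set()
    for person, value in seeds:
        for snippet in snippets:
            m = re.search(re.escape(person) + r"(.+?)" + re.escape(value), snippet)
            if m:
                patterns.add(m.group(1).strip())
    return patterns


def find_tuples(patterns, sentences):
    """Apply each pattern to find new (person, value) pairs."""
    found = set()
    for pattern in patterns:
        rx = re.compile(r"^(.+?)\s+" + re.escape(pattern) + r"\s+(.+?)\.?$")
        for sentence in sentences:
            m = rx.match(sentence)
            if m:
                found.add((m.group(1), m.group(2)))
    return found


def bootstrap(seeds, web, wiki, rounds=2):
    """Alternate: learn patterns on the web, harvest new seeds from wiki."""
    seeds, patterns = set(seeds), set()
    for _ in range(rounds):
        patterns |= learn_patterns(seeds, web)
        seeds |= find_tuples(patterns, wiki)
    return seeds, patterns


seeds, patterns = bootstrap({("Arafat", "civil engineering")}, WEB, WIKI)
print(sorted(seeds))
print(sorted(patterns))
```

In this toy run the second round picks up a pattern that the original seed alone could not find, which is the point of repeating the process; without filtering, real runs would also accumulate noisy patterns.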
  • 51. Bouncing between Wikipedia and Google  This approach worked well for a few fields, such as education, publications, immediate family members, and party or other organization affiliations  It did not provide good patterns for every field: for religion, ethnic or tribal affiliations, and previous domiciles we got a lot of noise  For some slots, we created patterns manually
  • 52. Biography as Sentence Selection and Ranking  To obtain high recall, we also want to include sentences that IE may miss, perhaps due to ill-formed sentences (ASR and MT)  Get the top 100 documents from Indri  Extract all sentences that contain the person or reference to him/her  Use a variety of features to rank these sentence…