NED with two-stage
coherence optimization
or
How I taught my bottle of Jack Daniel’s not to turn into a 168-year-old
person with a net income of $120,000,000
Filip Ilievski
Supervisors:
Marieke van Erp Piek Vossen
Stefan Schlobach Wouter Beek
Kick-off
● Combinatorial explosion
● FrameNet
● NWR’s Events and Situations Ontology (ESO)
● Clash of the worlds:
○ Roles (A0, A1, A2), frames, selectional restrictions
vs
○ types, subjects with domains, objects with ranges
Outline
1. Background
a. Language processing
b. Identity
2. Problem statement
3. Research scope
4. Contemporary approaches
5. Solution design
6. Tools and examples
7. Experimental Setup
8. Remarks
Language processing: Motivation
● “Ninety percent of all the data in the world was
produced in the last two years.
○ This trend is expected to grow.”
● “80 percent of all the information in the world is
unstructured information.”
● We need computers that can understand this
flood of information.
○ i.e. we need tools to automatically process
language
(IBM on Watson, 2012)
On the Ambiguity of language
● Language is at the base of our cognition, our ability to understand the
world
● Language is subjective and relates to a discourse world
● Language is incredibly imprecise: we luv reusing and messing up
○ How can a slim chance and a fat chance be the same, but a wise man
and a wise guy are opposites?
○ How can a house burn up as it burns down?
○ Why do we fill in a form by filling it out?
● And language is amazingly accurate
○ despite all its inconsistencies, irregularities and contradictions, we
convey so much meaning and accomplish so much collaboration.
(IBM on Watson, 2012)
The burden of context in language
● Language is context-dependent
● Verbal context
○ Ford fell from a tree.
■ What is “Ford”?
● Social context
○ What is “2+2”?
■ In mathematics it is 4
■ In the car domain it is a car configuration: 2 front + 2 back seats
■ In psychology it is a family with 2 parents and 2 children
Identity
● Problem of identity in philosophy
Identity in language
“Any two entities are both similar and dissimilar with respect to an infinite
number of properties.” (Murphy & Medin, 1985)
● Entity linking becomes tricky
○ e.g. Temporality: Is Old Amsterdam identical to New Amsterdam?
● Continuum of identity
● Metonymy
○ “Plato is on the top shelf” - Who or what is Plato?
Identity in Semantic Web
● owl:sameAs follows Leibniz’s law (maybe?)
● Approaches:
○ based on the intrinsic properties
○ weaker definitions: near-identity, intransitive,
nonsymmetric, non-reflexive constructs
Resources
● Natural Language Processing: grammatical structure and meaning of words
● Semantic Web: background knowledge
● Lexical resources: structured linguistic information
Research scope
● Semantic layer of the NLP pipeline
● Named Entity Disambiguation (NED) is the
task of determining the identity of entities
mentioned in text.
● Semantic Role Labelling (SRL) detects the
semantic arguments associated with the
predicate or verb of a sentence and classifies
them into specific roles.
Research question
● Can NED accuracy be improved by optimizing the coherence of the
entities based on binary logic and probabilistic models?
○ What is the accuracy of each phase of the solution?
1. the binary filtering phase
2. the probabilistic phase
○ Which components of the system are useful/useless?
○ How does this solution compare to the existing approaches (FRED,
NewsReader or DBpedia Spotlight)?
Example
“The United States transferred six detainees
from the Guantánamo Bay prison to Uruguay
this weekend, the Defense Department
announced early Sunday.”
State-of-the-art:
United States             | Guantanamo Bay    | Uruguay             | Defence Department
--------------------------+-------------------+---------------------+-------------------------------------
Geographical region       | GB detention camp | Geographical region | US Dept. of Defence
Fed. Government           | Place             | Football team       | Ministry of Defence of Rep. of Korea
Men’s soccer team         | The naval base    | River               |
Women’s soccer team       | Battle of GB      | Rugby union team    |
Rugby union team          |                   | U20 football team   |
Men’s ice hockey team     |                   | U17 football team   |
Men’s basketball team     |                   |                     |
Secondary education in US |                   |                     |
State-of-the-art: FRED
Shall we go a step further?
Proposed solution
Optimization in two phases:
1. The first phase excludes what is impossible
○ based on entity types and predicate restrictions
2. The second phase selects what is most probable
○ based on frequency and semantic coherence of the
entities
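The two phases can be sketched as a filter followed by a ranking. This is only an illustrative sketch: the function name `two_phase_disambiguate`, the candidate records, the `prior` scores and the fallback when the filter removes every candidate are all assumptions, not part of the presented system.

```python
from typing import Any, Callable, Dict, List

# Hypothetical candidate record: a label, a set of types and a prior score.
Candidate = Dict[str, Any]


def two_phase_disambiguate(
    candidates: List[Candidate],
    is_possible: Callable[[Candidate], bool],
    score: Callable[[Candidate], float],
) -> List[Candidate]:
    """Phase I drops impossible candidates; Phase II ranks the survivors."""
    survivors = [c for c in candidates if is_possible(c)]
    # Assumption: if the filter removes everything, fall back to the full list.
    pool = survivors or candidates
    return sorted(pool, key=score, reverse=True)


# Toy data, invented for illustration (not real knowledge-base values).
uruguay = [
    {"label": "Uruguay (country)", "types": {"Place"}, "prior": 0.7},
    {"label": "Uruguay football team", "types": {"Organization"}, "prior": 0.2},
    {"label": "Uruguay River", "types": {"Place"}, "prior": 0.1},
]

ranked = two_phase_disambiguate(
    uruguay,
    is_possible=lambda c: "Place" in c["types"],  # role restricted to a location
    score=lambda c: c["prior"],
)
print([c["label"] for c in ranked])
```

Keeping the filter and the scorer as separate callables makes it possible to measure each phase on its own, which is what the research question asks for.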
Proposed solution (II)
Phase I: Binary Filtering
Tools: VerbNet
● largest online verb lexicon currently
available for English
● hierarchical (not ontological though)
● domain-independent
+ good coverage of predicates
+ defines syntactic-semantic relations
- thematic roles are too generic
VN: send-11.1 (predicate: transferred)
● A0: United States
○ Restriction: A0 is Animate or Organization
○ Check: United States is Animate or Organization
● A1: from Guantanamo Bay
○ Restriction: A1 is Location
○ Check: Guantanamo Bay is a location
● A2: to Uruguay
○ Restriction: A2 is Location
○ Check: Uruguay is a location
VN: say-37.7 (predicate: announced)
● A0: the Defence Department
○ Restriction: A0 is Animate or Organization
○ Check: The Defence Department is an Animate or an Organization
After VerbNet
United States             | Guantanamo Bay    | Uruguay             | Defence Department
--------------------------+-------------------+---------------------+-------------------------------------
Geographical region       | GB detention camp | Geographical region | US Dept. of Defence
Fed. Government           | Place             | Football team       | Ministry of Defence of Rep. of Korea
Men’s soccer team         | The naval base    | River               |
Women’s soccer team       | Battle of GB      | Rugby union team    |
Rugby union team          |                   | U20 football team   |
Men’s ice hockey team     |                   | U17 football team   |
Men’s basketball team     |                   |                     |
Secondary education in US |                   |                     |
Tools: WordNet & FrameNet
WordNet
● lexical database for English
● groups words into synsets
● provides definitions and semantic
relations between the synsets.
+ very good coverage of the verb
hierarchy
- does not capture syntactic or
semantic behaviour
FrameNet
● large-scale lexical resource with
information on semantic frames
(situations) and semantic roles.
+ Good generalization across
predicates
- Does not define selectional
restrictions for semantic roles
- Has limited coverage
Tools: NWR Events & Situations Ontology
● Reuse of existing ESO ontology (current version: 0.6)
● Global automotive industry
● Manually constructed, hence (hopefully) trustworthy
● A previous experiment on 1.3 million car industry articles showed that the
59 ESO classes with FrameNet and SUMO mappings cover 23% of the
predicates
● Will contain domain and range information for frequently used classes
linked to FN frames
● Complements VerbNet when it offers no (or not enough) restrictions or granularity
ESO+ example
“Detroit Lions fired Mornhinweg.”
● VN comes up with two predicates:
○ fire-10.10
○ throw-17.1
● Mornhinweg can be a: Concrete, Animate or Organization
● But in the car domain they usually dismiss an employee :-)
ESO+ example
firing
● FN: Shoot_projectiles
● FN: Use_firearm
● FN: Firing
● ESO+: LeaveAnOrganization (linked via correspondsToFNFrame)
Tools: DBpedia/Schema.org
● Semantic web resources with domain-independent factual information
○ DBpedia has 685 classes, over 4 million instances
● But how to map the VerbNet / ESO+ restrictions to DBpedia classes?
○ Easily and manually
○ ESO+ is natively mapped. As for VN:
VerbNet      | DBpedia      | Schema.org
-------------+--------------+-------------
Animate      | Person       | Person
Organization | Organization | Organization
Location     | Place        | Place
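A minimal sketch of how the mapping above could drive the binary filter. The dictionary mirrors the VerbNet-to-DBpedia column of the mapping; the function `satisfies` and the class sets assigned to the two candidates are hypothetical and were not looked up in DBpedia.

```python
from typing import Iterable, Set

# VerbNet restriction -> DBpedia class, as listed in the mapping above.
VN_TO_DBPEDIA = {
    "Animate": "Person",
    "Organization": "Organization",
    "Location": "Place",
}


def satisfies(restrictions: Iterable[str], candidate_classes: Set[str]) -> bool:
    """True if the candidate carries a class allowed by any alternative."""
    allowed = {VN_TO_DBPEDIA[r] for r in restrictions}
    return bool(allowed & candidate_classes)


# A0 of "transferred" must be Animate or Organization.
a0_restrictions = ["Animate", "Organization"]

# Hypothetical class assignments for two "United States" candidates.
candidates = {
    "Fed. Government": {"Organization"},
    "Geographical region": {"Place"},
}

kept = [name for name, classes in candidates.items()
        if satisfies(a0_restrictions, classes)]
print(kept)
```

A real implementation would presumably also need to respect the class hierarchy (a subclass of Person should satisfy Animate); this sketch compares class names directly.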
Phase II: Probabilistic optimization
Module: DBpedia Spotlight
United States             | Guantanamo Bay    | Uruguay             | Defence Department
--------------------------+-------------------+---------------------+-------------------------------------
Geographical region       | GB detention camp | Geographical region | US Dept. of Defence
Fed. Government           | Place             | Football team       | Ministry of Defence of Rep. of Korea
Men’s soccer team         | The naval base    | River               |
Women’s soccer team       | Battle of GB      | Rugby union team    |
Rugby union team          |                   | U20 football team   |
Men’s ice hockey team     |                   | U17 football team   |
Men’s basketball team     |                   |                     |
Secondary education in US |                   |                     |
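United States | Guantanamo Bay | Uruguay | Defence Department
--------------+----------------+---------+-------------------
817.71        | 1.74           | 10.95   | 1.60
6.69          | 0.89           | 2.73    | 0.26
4.31          | 0.46           | 0.48    |
0.72          | 0.34           | 0.14    |
0.29          |                | 1.82    |
0.51          |                | 0.51    |
0.32          |                |         |
2.55          |                |         |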
Module: Popularity = #ins / #outs
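The ratio can be written out directly. This sketch reads #ins and #outs as the counts of incoming and outgoing links of an entity, which the slide does not spell out, and the handling of an entity with no outgoing links is an assumption.

```python
def popularity(in_links: int, out_links: int) -> float:
    """Popularity as the ratio #ins / #outs."""
    if out_links == 0:
        # Assumption: the slide does not define this case.
        return float("inf") if in_links > 0 else 0.0
    return in_links / out_links


# Toy counts, invented for illustration.
print(popularity(30, 10))
```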
Optimization: Graph proximity
Counter-example
“With the help of C. Harold Wills, Ford
designed, built, and successfully raced a 26-
horsepower automobile in October 1901.”
Using DBpedia
                       | (Henry Ford, C. Wills) | (Gerald Ford, C. Wills)
-----------------------+------------------------+------------------------
Shared property values | 3                      | 0
Graph distance         | direct connection      | no direct connection
Abstract search        | keywords found         | no keywords found
Class distance         | 0                      | 0
Popularity             | /                      | /
Frame model            | /                      | /
DBpedia Spotlight rank | > 10                   | 5
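Two of the features compared above, shared property values and a direct graph connection, can be sketched over a list of triples. The function names, the placeholder properties (`p1`, `p2`, `p3`) and the toy triples are invented for illustration and do not reproduce DBpedia data or the counts reported above.

```python
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, property, value)


def property_values(triples: List[Triple], entity: str) -> Set[Tuple[str, str]]:
    """All (property, value) pairs asserted for an entity."""
    return {(p, o) for s, p, o in triples if s == entity}


def shared_property_values(triples: List[Triple], a: str, b: str) -> int:
    """Number of (property, value) pairs that both entities have in common."""
    return len(property_values(triples, a) & property_values(triples, b))


def directly_connected(triples: List[Triple], a: str, b: str) -> bool:
    """True if some triple links the two entities in either direction."""
    return any((s == a and o == b) or (s == b and o == a) for s, _, o in triples)


# Toy graph with placeholder properties and values.
toy = [
    ("henry_ford", "p1", "x"),
    ("henry_ford", "p2", "y"),
    ("c_wills", "p1", "x"),
    ("c_wills", "p2", "y"),
    ("c_wills", "p3", "henry_ford"),
    ("gerald_ford", "p1", "z"),
]

print(shared_property_values(toy, "henry_ford", "c_wills"))
print(directly_connected(toy, "gerald_ford", "c_wills"))
```

Both features favour the contextually coherent candidate even when a popularity-driven ranking would not, which is the point of the counter-example.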
Points to make
● No module is perfect
○ but >1 module will be helpful
● Popularity Bias
● Topic Bias
○ old movies, Western
culture, entertainment, etc.
● Computational complexity
(reduced implicitly)
● Incompleteness and maintenance
of datasets
● The lexical resources are not
ontological
Experimental setup
Data
● 120 news articles
○ 30 GM, Ford, Chrysler car data
○ 30 Apple corpus articles
○ 30 Boeing Airbus
○ 30 stock market
● Training data
○ 1.3 M car articles
++Awesome things I won’t do
Cross-sentence check
Build a model over more ontologies
FrameNet situations reasoning
Questions?
Appendices
Identity in language
“Any two entities are both similar and dissimilar with respect to an infinite
number of properties.” (Murphy & Medin, 1985)
● Entity coreference becomes tricky
○ Temporality: Is Old Amsterdam identical to New Amsterdam ?
○ Pragmatism: Is Lord Lipton identical to the wealthiest tea importer ?
○ Granularization: Are passengers and people identical?
● Continuum of identity
Identity in NLP
● Traditional techniques interpret identity in a shallow way, based on:
○ popularity
○ TF-IDF score
○ bag-of-words similarity
● Contemporary techniques start looking into semantic coherence
○ pair-wise interpretation
○ collective interpretation
Problem statement
● Entity linking ambiguity in the NLP world
● Also present within the Semantic Web
● Lexical resources are not exploited enough
State-of-the-art:
C. Harold Wills:
Ford:
State-of-the-art: FRED
Tools: ConceptNet 5
● because common-sense knowledge is often lacking in DBpedia and similar
resources

More Related Content

Similar to Mini seminar presentation on context-based NED optimization

Deep Learning for Natural Language Processing: Word Embeddings
Deep Learning for Natural Language Processing: Word EmbeddingsDeep Learning for Natural Language Processing: Word Embeddings
Deep Learning for Natural Language Processing: Word Embeddings
Roelof Pieters
 
Visual-Semantic Embeddings: some thoughts on Language
Visual-Semantic Embeddings: some thoughts on LanguageVisual-Semantic Embeddings: some thoughts on Language
Visual-Semantic Embeddings: some thoughts on Language
Roelof Pieters
 
How can text-mining leverage developments in Deep Learning? Presentation at ...
How can text-mining leverage developments in Deep Learning?  Presentation at ...How can text-mining leverage developments in Deep Learning?  Presentation at ...
How can text-mining leverage developments in Deep Learning? Presentation at ...
jcscholtes
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
Ila Group
 
No technology? No Problem! Creative Ways of Using UDL without Technology (Jul...
No technology? No Problem! Creative Ways of Using UDL without Technology (Jul...No technology? No Problem! Creative Ways of Using UDL without Technology (Jul...
No technology? No Problem! Creative Ways of Using UDL without Technology (Jul...
Matt Bergman
 
Social Media and Online Collaboration ToolsBusiness In.docx
Social Media and Online Collaboration ToolsBusiness In.docxSocial Media and Online Collaboration ToolsBusiness In.docx
Social Media and Online Collaboration ToolsBusiness In.docx
whitneyleman54422
 
Aera 2013a
Aera 2013aAera 2013a
Aera 2013a
avega4
 
Biemann ibm cog_comp_jan2015_noanim
Biemann ibm cog_comp_jan2015_noanimBiemann ibm cog_comp_jan2015_noanim
Biemann ibm cog_comp_jan2015_noanim
diannepatricia
 
Knowledge Representation Reasoning and Acquisition.pdf
Knowledge Representation Reasoning and Acquisition.pdfKnowledge Representation Reasoning and Acquisition.pdf
Knowledge Representation Reasoning and Acquisition.pdf
Gan Keng Hoon
 
What knowledge bases know (and what they don't)
What knowledge bases know (and what they don't)What knowledge bases know (and what they don't)
What knowledge bases know (and what they don't)
srazniewski
 
21st century skills NCE, Ede, the Netherlands
21st century skills NCE, Ede, the Netherlands21st century skills NCE, Ede, the Netherlands
21st century skills NCE, Ede, the Netherlands
Michael Harris
 
The Problem is the Solution: PBL in Social Studies
The Problem is the Solution: PBL in Social StudiesThe Problem is the Solution: PBL in Social Studies
The Problem is the Solution: PBL in Social Studies
Glenn Wiebe
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
Yasir Khan
 
Eurocall 2010 panel on call and the learner
Eurocall 2010 panel on call and the learnerEurocall 2010 panel on call and the learner
Eurocall 2010 panel on call and the learner
hayoreinders
 
Fhs talk
Fhs talkFhs talk
Fhs talk
Brock Dubbels
 
Deep learning for natural language embeddings
Deep learning for natural language embeddingsDeep learning for natural language embeddings
Deep learning for natural language embeddings
Roelof Pieters
 
Semantic_properties-BlackboxNLP
Semantic_properties-BlackboxNLPSemantic_properties-BlackboxNLP
Semantic_properties-BlackboxNLP
Pia Sommerauer
 
Robinson Wang Creating Effective World Language Programs
Robinson Wang Creating Effective World Language ProgramsRobinson Wang Creating Effective World Language Programs
Robinson Wang Creating Effective World Language Programs
Center for Global Education at Asia Society
 
Word Embeddings, why the hype ?
Word Embeddings, why the hype ? Word Embeddings, why the hype ?
Word Embeddings, why the hype ?
Hady Elsahar
 
NLP_guest_lecture.pdf
NLP_guest_lecture.pdfNLP_guest_lecture.pdf
NLP_guest_lecture.pdf
Soha82
 

Similar to Mini seminar presentation on context-based NED optimization (20)

Deep Learning for Natural Language Processing: Word Embeddings
Deep Learning for Natural Language Processing: Word EmbeddingsDeep Learning for Natural Language Processing: Word Embeddings
Deep Learning for Natural Language Processing: Word Embeddings
 
Visual-Semantic Embeddings: some thoughts on Language
Visual-Semantic Embeddings: some thoughts on LanguageVisual-Semantic Embeddings: some thoughts on Language
Visual-Semantic Embeddings: some thoughts on Language
 
How can text-mining leverage developments in Deep Learning? Presentation at ...
How can text-mining leverage developments in Deep Learning?  Presentation at ...How can text-mining leverage developments in Deep Learning?  Presentation at ...
How can text-mining leverage developments in Deep Learning? Presentation at ...
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
 
No technology? No Problem! Creative Ways of Using UDL without Technology (Jul...
No technology? No Problem! Creative Ways of Using UDL without Technology (Jul...No technology? No Problem! Creative Ways of Using UDL without Technology (Jul...
No technology? No Problem! Creative Ways of Using UDL without Technology (Jul...
 
Social Media and Online Collaboration ToolsBusiness In.docx
Social Media and Online Collaboration ToolsBusiness In.docxSocial Media and Online Collaboration ToolsBusiness In.docx
Social Media and Online Collaboration ToolsBusiness In.docx
 
Aera 2013a
Aera 2013aAera 2013a
Aera 2013a
 
Biemann ibm cog_comp_jan2015_noanim
Biemann ibm cog_comp_jan2015_noanimBiemann ibm cog_comp_jan2015_noanim
Biemann ibm cog_comp_jan2015_noanim
 
Knowledge Representation Reasoning and Acquisition.pdf
Knowledge Representation Reasoning and Acquisition.pdfKnowledge Representation Reasoning and Acquisition.pdf
Knowledge Representation Reasoning and Acquisition.pdf
 
What knowledge bases know (and what they don't)
What knowledge bases know (and what they don't)What knowledge bases know (and what they don't)
What knowledge bases know (and what they don't)
 
21st century skills NCE, Ede, the Netherlands
21st century skills NCE, Ede, the Netherlands21st century skills NCE, Ede, the Netherlands
21st century skills NCE, Ede, the Netherlands
 
The Problem is the Solution: PBL in Social Studies
The Problem is the Solution: PBL in Social StudiesThe Problem is the Solution: PBL in Social Studies
The Problem is the Solution: PBL in Social Studies
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
 
Eurocall 2010 panel on call and the learner
Eurocall 2010 panel on call and the learnerEurocall 2010 panel on call and the learner
Eurocall 2010 panel on call and the learner
 
Fhs talk
Fhs talkFhs talk
Fhs talk
 
Deep learning for natural language embeddings
Deep learning for natural language embeddingsDeep learning for natural language embeddings
Deep learning for natural language embeddings
 
Semantic_properties-BlackboxNLP
Semantic_properties-BlackboxNLPSemantic_properties-BlackboxNLP
Semantic_properties-BlackboxNLP
 
Robinson Wang Creating Effective World Language Programs
Robinson Wang Creating Effective World Language ProgramsRobinson Wang Creating Effective World Language Programs
Robinson Wang Creating Effective World Language Programs
 
Word Embeddings, why the hype ?
Word Embeddings, why the hype ? Word Embeddings, why the hype ?
Word Embeddings, why the hype ?
 
NLP_guest_lecture.pdf
NLP_guest_lecture.pdfNLP_guest_lecture.pdf
NLP_guest_lecture.pdf
 

More from Filip Ilievski

The Commonsense Knowledge Graph
The Commonsense Knowledge GraphThe Commonsense Knowledge Graph
The Commonsense Knowledge Graph
Filip Ilievski
 
Commonsense knowledge in Wikidata
Commonsense knowledge in WikidataCommonsense knowledge in Wikidata
Commonsense knowledge in Wikidata
Filip Ilievski
 
SemEval-2018 task 5: Counting events and participants in the long tail
SemEval-2018 task 5: Counting events and participants in the long tailSemEval-2018 task 5: Counting events and participants in the long tail
SemEval-2018 task 5: Counting events and participants in the long tail
Filip Ilievski
 
A look inside Babelfy: Examining the bubble
A look inside Babelfy: Examining the bubbleA look inside Babelfy: Examining the bubble
A look inside Babelfy: Examining the bubble
Filip Ilievski
 
2nd Spinoza workshop: Looking at the Long Tail - introductory slides
2nd Spinoza workshop: Looking at the Long Tail - introductory slides2nd Spinoza workshop: Looking at the Long Tail - introductory slides
2nd Spinoza workshop: Looking at the Long Tail - introductory slides
Filip Ilievski
 
Systematic Study of Long Tail Phenomena in Entity Linking
Systematic Study of Long Tail Phenomena in Entity LinkingSystematic Study of Long Tail Phenomena in Entity Linking
Systematic Study of Long Tail Phenomena in Entity Linking
Filip Ilievski
 
NoSQL databases
NoSQL databasesNoSQL databases
NoSQL databases
Filip Ilievski
 
LOTUS: Adaptive Text Search for Big Linked Data
LOTUS: Adaptive Text Search for Big Linked DataLOTUS: Adaptive Text Search for Big Linked Data
LOTUS: Adaptive Text Search for Big Linked Data
Filip Ilievski
 
Lotus: Linked Open Text UnleaShed - ISWC COLD '15
Lotus: Linked Open Text UnleaShed - ISWC COLD '15Lotus: Linked Open Text UnleaShed - ISWC COLD '15
Lotus: Linked Open Text UnleaShed - ISWC COLD '15
Filip Ilievski
 
NAF2SEM and cross-document Event Coreference
NAF2SEM and cross-document Event CoreferenceNAF2SEM and cross-document Event Coreference
NAF2SEM and cross-document Event Coreference
Filip Ilievski
 

More from Filip Ilievski (10)

The Commonsense Knowledge Graph
The Commonsense Knowledge GraphThe Commonsense Knowledge Graph
The Commonsense Knowledge Graph
 
Commonsense knowledge in Wikidata
Commonsense knowledge in WikidataCommonsense knowledge in Wikidata
Commonsense knowledge in Wikidata
 
SemEval-2018 task 5: Counting events and participants in the long tail
SemEval-2018 task 5: Counting events and participants in the long tailSemEval-2018 task 5: Counting events and participants in the long tail
SemEval-2018 task 5: Counting events and participants in the long tail
 
A look inside Babelfy: Examining the bubble
A look inside Babelfy: Examining the bubbleA look inside Babelfy: Examining the bubble
A look inside Babelfy: Examining the bubble
 
2nd Spinoza workshop: Looking at the Long Tail - introductory slides
2nd Spinoza workshop: Looking at the Long Tail - introductory slides2nd Spinoza workshop: Looking at the Long Tail - introductory slides
2nd Spinoza workshop: Looking at the Long Tail - introductory slides
 
Systematic Study of Long Tail Phenomena in Entity Linking
Systematic Study of Long Tail Phenomena in Entity LinkingSystematic Study of Long Tail Phenomena in Entity Linking
Systematic Study of Long Tail Phenomena in Entity Linking
 
NoSQL databases
NoSQL databasesNoSQL databases
NoSQL databases
 
LOTUS: Adaptive Text Search for Big Linked Data
LOTUS: Adaptive Text Search for Big Linked DataLOTUS: Adaptive Text Search for Big Linked Data
LOTUS: Adaptive Text Search for Big Linked Data
 
Lotus: Linked Open Text UnleaShed - ISWC COLD '15
Lotus: Linked Open Text UnleaShed - ISWC COLD '15Lotus: Linked Open Text UnleaShed - ISWC COLD '15
Lotus: Linked Open Text UnleaShed - ISWC COLD '15
 
NAF2SEM and cross-document Event Coreference
NAF2SEM and cross-document Event CoreferenceNAF2SEM and cross-document Event Coreference
NAF2SEM and cross-document Event Coreference
 

Recently uploaded

The debris of the ‘last major merger’ is dynamically young
The debris of the ‘last major merger’ is dynamically youngThe debris of the ‘last major merger’ is dynamically young
The debris of the ‘last major merger’ is dynamically young
Sérgio Sacani
 
GBSN - Biochemistry (Unit 6) Chemistry of Proteins
GBSN - Biochemistry (Unit 6) Chemistry of ProteinsGBSN - Biochemistry (Unit 6) Chemistry of Proteins
GBSN - Biochemistry (Unit 6) Chemistry of Proteins
Areesha Ahmad
 
Thornton ESPP slides UK WW Network 4_6_24.pdf
Thornton ESPP slides UK WW Network 4_6_24.pdfThornton ESPP slides UK WW Network 4_6_24.pdf
Thornton ESPP slides UK WW Network 4_6_24.pdf
European Sustainable Phosphorus Platform
 
Describing and Interpreting an Immersive Learning Case with the Immersion Cub...
Describing and Interpreting an Immersive Learning Case with the Immersion Cub...Describing and Interpreting an Immersive Learning Case with the Immersion Cub...
Describing and Interpreting an Immersive Learning Case with the Immersion Cub...
Leonel Morgado
 
Medical Orthopedic PowerPoint Templates.pptx
Medical Orthopedic PowerPoint Templates.pptxMedical Orthopedic PowerPoint Templates.pptx
Medical Orthopedic PowerPoint Templates.pptx
terusbelajar5
 
ESR spectroscopy in liquid food and beverages.pptx
ESR spectroscopy in liquid food and beverages.pptxESR spectroscopy in liquid food and beverages.pptx
ESR spectroscopy in liquid food and beverages.pptx
PRIYANKA PATEL
 
The cost of acquiring information by natural selection
The cost of acquiring information by natural selectionThe cost of acquiring information by natural selection
The cost of acquiring information by natural selection
Carl Bergstrom
 
THEMATIC APPERCEPTION TEST(TAT) cognitive abilities, creativity, and critic...
THEMATIC  APPERCEPTION  TEST(TAT) cognitive abilities, creativity, and critic...THEMATIC  APPERCEPTION  TEST(TAT) cognitive abilities, creativity, and critic...
THEMATIC APPERCEPTION TEST(TAT) cognitive abilities, creativity, and critic...
Abdul Wali Khan University Mardan,kP,Pakistan
 
在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样
在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样
在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样
vluwdy49
 
The use of Nauplii and metanauplii artemia in aquaculture (brine shrimp).pptx
The use of Nauplii and metanauplii artemia in aquaculture (brine shrimp).pptxThe use of Nauplii and metanauplii artemia in aquaculture (brine shrimp).pptx
The use of Nauplii and metanauplii artemia in aquaculture (brine shrimp).pptx
MAGOTI ERNEST
 
Bob Reedy - Nitrate in Texas Groundwater.pdf
Bob Reedy - Nitrate in Texas Groundwater.pdfBob Reedy - Nitrate in Texas Groundwater.pdf
Bob Reedy - Nitrate in Texas Groundwater.pdf
Texas Alliance of Groundwater Districts
 
Authoring a personal GPT for your research and practice: How we created the Q...
Authoring a personal GPT for your research and practice: How we created the Q...Authoring a personal GPT for your research and practice: How we created the Q...
Authoring a personal GPT for your research and practice: How we created the Q...
Leonel Morgado
 
Immersive Learning That Works: Research Grounding and Paths Forward
Immersive Learning That Works: Research Grounding and Paths ForwardImmersive Learning That Works: Research Grounding and Paths Forward
Immersive Learning That Works: Research Grounding and Paths Forward
Leonel Morgado
 
molar-distalization in orthodontics-seminar.pptx
molar-distalization in orthodontics-seminar.pptxmolar-distalization in orthodontics-seminar.pptx
molar-distalization in orthodontics-seminar.pptx
Anagha Prasad
 
8.Isolation of pure cultures and preservation of cultures.pdf
8.Isolation of pure cultures and preservation of cultures.pdf8.Isolation of pure cultures and preservation of cultures.pdf
8.Isolation of pure cultures and preservation of cultures.pdf
by6843629
 
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
Sérgio Sacani
 
Compexometric titration/Chelatorphy titration/chelating titration
Compexometric titration/Chelatorphy titration/chelating titrationCompexometric titration/Chelatorphy titration/chelating titration
Compexometric titration/Chelatorphy titration/chelating titration
Vandana Devesh Sharma
 
Applied Science: Thermodynamics, Laws & Methodology.pdf
Applied Science: Thermodynamics, Laws & Methodology.pdfApplied Science: Thermodynamics, Laws & Methodology.pdf
Applied Science: Thermodynamics, Laws & Methodology.pdf
University of Hertfordshire
 
Oedema_types_causes_pathophysiology.pptx
Oedema_types_causes_pathophysiology.pptxOedema_types_causes_pathophysiology.pptx
Oedema_types_causes_pathophysiology.pptx
muralinath2
 
Unlocking the mysteries of reproduction: Exploring fecundity and gonadosomati...
Unlocking the mysteries of reproduction: Exploring fecundity and gonadosomati...Unlocking the mysteries of reproduction: Exploring fecundity and gonadosomati...
Unlocking the mysteries of reproduction: Exploring fecundity and gonadosomati...
AbdullaAlAsif1
 

Recently uploaded (20)

The debris of the ‘last major merger’ is dynamically young
The debris of the ‘last major merger’ is dynamically youngThe debris of the ‘last major merger’ is dynamically young
The debris of the ‘last major merger’ is dynamically young
 
GBSN - Biochemistry (Unit 6) Chemistry of Proteins
GBSN - Biochemistry (Unit 6) Chemistry of ProteinsGBSN - Biochemistry (Unit 6) Chemistry of Proteins
GBSN - Biochemistry (Unit 6) Chemistry of Proteins
 
Thornton ESPP slides UK WW Network 4_6_24.pdf
Thornton ESPP slides UK WW Network 4_6_24.pdfThornton ESPP slides UK WW Network 4_6_24.pdf
Thornton ESPP slides UK WW Network 4_6_24.pdf
 
Describing and Interpreting an Immersive Learning Case with the Immersion Cub...
Describing and Interpreting an Immersive Learning Case with the Immersion Cub...Describing and Interpreting an Immersive Learning Case with the Immersion Cub...
Describing and Interpreting an Immersive Learning Case with the Immersion Cub...
 
Medical Orthopedic PowerPoint Templates.pptx
Medical Orthopedic PowerPoint Templates.pptxMedical Orthopedic PowerPoint Templates.pptx
Medical Orthopedic PowerPoint Templates.pptx
 
ESR spectroscopy in liquid food and beverages.pptx
ESR spectroscopy in liquid food and beverages.pptxESR spectroscopy in liquid food and beverages.pptx
ESR spectroscopy in liquid food and beverages.pptx
 
The cost of acquiring information by natural selection
The cost of acquiring information by natural selectionThe cost of acquiring information by natural selection
The cost of acquiring information by natural selection
 
THEMATIC APPERCEPTION TEST(TAT) cognitive abilities, creativity, and critic...
THEMATIC  APPERCEPTION  TEST(TAT) cognitive abilities, creativity, and critic...THEMATIC  APPERCEPTION  TEST(TAT) cognitive abilities, creativity, and critic...
THEMATIC APPERCEPTION TEST(TAT) cognitive abilities, creativity, and critic...
 
在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样
在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样
在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样
 

Mini seminar presentation on context-based NED optimization

  • 1. NED with two-stage coherence optimization or How I taught my bottle of Jack Daniel’s not to turn into a 168-years-old person with a net income of $120.000.000 ● Filip Ilievski ● Supervisors: Marieke van Erp, Piek Vossen, Stefan Schlobach, Wouter Beek
  • 2. Kick-off ● Combinatorial explosion ● FrameNet ● NWR’s Events and Situations Ontology (ESO) ● Clash of the worlds: roles (A0, A1, A2), frames, selectional restrictions vs types, subjects with domains, objects with ranges
  • 3. Outline 1. Background a. Language processing b. Identity 2. Problem statement 3. Research scope 4. Contemporary approaches 5. Solution design 6. Tools and examples 7. Experimental Setup 8. Remarks
  • 4. Language processing: Motivation ● “Ninety percent of all the data in the world was produced in the last two years. ○ This trend is expected to grow.” ● “80 percent of all the information in the world is unstructured information.” ● We need computers that can understand this flood of information. ○ i.e. we need tools to automatically process language (IBM on Watson, 2012)
  • 5. On the Ambiguity of language ● Language is at the base of our cognition, our ability to understand the world ● Language is subjective and relates to a discourse world ● Language is incredibly imprecise: we luv reusing and messing up ○ How can a slim chance and a fat chance be the same, but a wise man and a wise guy are opposites? ○ How can a house burn up as it burns down? ○ Why do we fill in a form by filling it out? ● And language is amazingly accurate ○ despite all its inconsistencies, irregularities and contradictions, we convey so much meaning and accomplish so much collaboration. (IBM on Watson, 2012)
  • 8. The burden of context in language ● Language is context-dependent ● Verbal context ○ Ford fell from a tree. ■ What is “Ford”? ● Social context ○ What is “2+2”? ■ In mathematics it is 4 ■ In the car domain it is a car configuration: 2 front + 2 back seats ■ In psychology it is a family with 2 parents and 2 children
  • 9. Identity ● Problem of identity in philosophy ? =
  • 10. Identity in language “Any two entities are both similar and dissimilar with respect to an infinite number of properties.” (Murphy & Medin, 1985) ● Entity linking becomes tricky ○ e.g. Temporality: Is Old Amsterdam identical to New Amsterdam? ● Continuum of identity ● Metonymy ○ “Plato is on the top shelf” - Who or what is Plato?
  • 11. Identity in Semantic Web ● owl:sameAs follows Leibniz’s law (maybe?) ● Approaches: ○ based on the intrinsic properties ○ weaker definitions: near-identity, intransitive, nonsymmetric, non-reflexive constructs
  • 12. Resources ● Grammatical structure and meaning of words ● Background knowledge ● Structured linguistic information ● Semantic Web ● Natural Language Processing ● Lexical resources
  • 13. Research scope ● Semantic layer of the NLP pipeline ● Named Entity Disambiguation (NED) is the task of determining the identity of entities mentioned in text. ● Semantic Role Labelling (SRL) detects the semantic arguments associated with the predicate or verb of a sentence and classifies them into specific roles.
  • 14. Research question ● Can NED accuracy be improved by optimizing the coherence of the entities based on binary logic and probabilistic models? ○ What is the accuracy of each phase of the solution? 1. the binary filtering phase 2. the probabilistic phase ○ Which components of the system are useful/useless? ○ How does this solution compare to existing approaches (FRED, NewsReader or DBpedia Spotlight)?
  • 15. Example “The United States transferred six detainees from the Guantánamo Bay prison to Uruguay this weekend, the Defense Department announced early Sunday.”
  • 16. State-of-the-art: ● United States: Geographical region, Fed. Government, Men’s soccer team, Women’s soccer team, Rugby union team, Men’s ice hockey team, Men’s basketball team, Secondary education in US ● Guantanamo Bay: GB detention camp, Place, The naval base, Battle of GB ● Uruguay: Geographical region, Football team, River, Rugby union team, U20 football team, U17 football team ● Defence Department: US Dept. of Defence, Ministry of Defence of Rep. of Korea
  • 21. Shall we go a step further?
  • 22. Proposed solution Optimization in two phases: 1. Phase I excludes what is impossible ○ based on entity types and predicate restrictions 2. Phase II tells what is most probable ○ based on frequency and semantic coherence of the entities
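The two phases on this slide can be read as a filter followed by a ranking step. The sketch below is only an illustration of that control flow, not the system presented in the talk: the `Candidate` type, the type labels attached to each candidate, and the scores are all hypothetical values chosen for the example.

```python
from typing import List, NamedTuple, Optional, Set


class Candidate(NamedTuple):
    label: str
    types: Set[str]   # hypothetical type labels for the candidate entity
    score: float      # hypothetical Phase II score (e.g. frequency or coherence)


def phase_one(candidates: List[Candidate], allowed: Set[str]) -> List[Candidate]:
    """Binary filtering: drop candidates whose types violate the restriction."""
    return [c for c in candidates if c.types & allowed]


def phase_two(candidates: List[Candidate]) -> Optional[Candidate]:
    """Probabilistic step: pick the highest-scoring surviving candidate."""
    return max(candidates, key=lambda c: c.score) if candidates else None


def disambiguate(candidates: List[Candidate], allowed: Set[str]) -> Optional[Candidate]:
    return phase_two(phase_one(candidates, allowed))


# Illustrative candidates for the mention "Uruguay" (types and scores are made up).
uruguay = [
    Candidate("Uruguay national football team", {"Organization"}, 0.9),
    Candidate("Uruguay (country)", {"Place"}, 0.6),
    Candidate("Uruguay River", {"Place"}, 0.2),
]

# "to Uruguay" fills a role restricted to locations, so only places survive Phase I.
best = disambiguate(uruguay, {"Place"})
```

In this toy run the highest-scoring candidate overall is removed by Phase I, so Phase II only chooses among the two places.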
  • 24. Phase I: Binary Filtering
  • 25. Tools: VerbNet ● largest online verb lexicon currently available for English ● hierarchical (not ontological though) ● domain-independent + good coverage of predicates + defines syntactic-semantic relations - thematic roles are too generic
  • 27. VN: send-11.1 (“transferred”) ● A0: United States ○ A0 is Animate or Organization ○ United States is Animate or Organization ● A1: from Guantanamo Bay ○ A1 is Location ○ Guantanamo Bay is a location ● A2: to Uruguay ○ A2 is Location ○ Uruguay is a location
  • 28. VN: say-37.7 (“announced”) ● A0: the Defence Department ○ A0 is Animate or Organization ○ The Defence Department is an Animate or an Organization
  • 29. After VerbNet ● United States: Geographical region, Fed. Government, Men’s soccer team, Women’s soccer team, Rugby union team, Men’s ice hockey team, Men’s basketball team, Secondary education in US ● Guantanamo Bay: GB detention camp, Place, The naval base, Battle of GB ● Uruguay: Geographical region, Football team, River, Rugby union team, U20 football team, U17 football team ● Defence Department: US Dept. of Defence, Ministry of Defence of Rep. of Korea
  • 30. Tools: WordNet & FrameNet ● WordNet ○ lexical database for English ○ groups words into synsets ○ provides definitions and semantic relations between the synsets + very good coverage of the verb hierarchy - captures neither syntactic nor semantic behaviour ● FrameNet ○ large-scale lexical resource with information on semantic frames (situations) and semantic roles + good generalization across predicates - does not define selectional restrictions for semantic roles - has limited coverage
  • 31. Tools: NWR Events & Situations Ontology ● Reuse of the existing ESO (current version: 0.6) ● Global automotive industry ● Manually constructed, hence (hopefully) trustworthy ● A previous experiment on 1.3 million car industry articles demonstrated that the 59 ESO classes with FrameNet and SUMO mappings cover 23% of the predicates ● Will contain domain and range information for frequently used classes linked to FN frames ● Complements VerbNet where it offers no (or not enough) restrictions or granularity
  • 32. ESO+ example “Detroit Lions fired Mornhinweg.” ● VN comes up with two predicates: ○ fire-10.10 ○ throw-17.1 ● Mornhinweg can be a: Concrete, Animate or Organization ● But in the car domain they usually dismiss an employee :-)
  • 33. ESO+ example: “firing” ● FN: Shoot_projectiles ● FN: Use_firearm ● FN: Firing ● ESO+: LeaveAnOrganization ● correspondsToFNFrame
  • 34. Tools: DBpedia/Schema.org ● Semantic web resources with domain-independent factual information ○ DBpedia has 685 classes, over 4 million instances ● But how to map the VerbNet / ESO+ restrictions to DBpedia classes? ○ Easily and manually ○ ESO+ is natively mapped. As for VN (VerbNet → DBpedia / Schema.org): ■ Animate → Person / Person ■ Organization → Organization / Organization ■ Location → Place / Place
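The manual mapping on this slide is small enough to write down as a lookup table. The sketch below combines it with the role restrictions shown for send-11.1 to decide whether a candidate's DBpedia class is admissible for a role. The function names are hypothetical, and treating a role without restrictions as "anything goes" is an assumption of this sketch rather than something the slides state.

```python
from typing import Dict, Set

# VerbNet restriction -> DBpedia class, as listed on the slide.
VN_TO_DBPEDIA: Dict[str, str] = {
    "Animate": "Person",
    "Organization": "Organization",
    "Location": "Place",
}

# Role restrictions for send-11.1 ("transferred"), as shown on the VerbNet slide.
SEND_11_1: Dict[str, Set[str]] = {
    "A0": {"Animate", "Organization"},
    "A1": {"Location"},
    "A2": {"Location"},
}


def allowed_dbpedia_classes(restrictions: Dict[str, Set[str]], role: str) -> Set[str]:
    """Translate the VerbNet restrictions of one role into DBpedia classes."""
    return {VN_TO_DBPEDIA[r] for r in restrictions.get(role, set()) if r in VN_TO_DBPEDIA}


def satisfies(candidate_classes: Set[str], restrictions: Dict[str, Set[str]], role: str) -> bool:
    """True if the candidate is admissible; unrestricted roles accept everything (assumption)."""
    allowed = allowed_dbpedia_classes(restrictions, role)
    return not allowed or bool(candidate_classes & allowed)
```

For example, `satisfies({"Place"}, SEND_11_1, "A2")` holds, while a candidate typed only as `Organization` would be rejected for the same role.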
  • 35. Phase II: Probabilistic optimization
  • 36. Module: DBpedia Spotlight ● United States: Geographical region, Fed. Government, Men’s soccer team, Women’s soccer team, Rugby union team, Men’s ice hockey team, Men’s basketball team, Secondary education in US ● Guantanamo Bay: GB detention camp, Place, The naval base, Battle of GB ● Uruguay: Geographical region, Football team, River, Rugby union team, U20 football team, U17 football team ● Defence Department: US Dept. of Defence, Ministry of Defence of Rep. of Korea
  • 37. Module: Popularity = #ins / #outs United States Guantanamo Bay Uruguay Defence Department 817.71 1.74 10.95 1.60 6.69 0.89 2.73 0.26 4.31 0.46 0.48 0.72 0.34 0.14 0.29 1.82 0.51 0.51 0.32 2.55
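The slide defines popularity only as "#ins / #outs". One plausible reading, assumed here, is the number of incoming links of an entity divided by its number of outgoing links in a link graph. The sketch below computes that ratio from a list of directed edges; the edge list is a toy example, and returning infinity for nodes without outgoing links is a choice made for this sketch.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def popularity(edges: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """Return in-degree / out-degree for every node that appears in the edge list."""
    ins: Counter = Counter()
    outs: Counter = Counter()
    for src, dst in edges:
        outs[src] += 1
        ins[dst] += 1
    scores = {}
    for node in set(ins) | set(outs):
        # Nodes with no outgoing links get infinity (assumption of this sketch).
        scores[node] = ins[node] / outs[node] if outs[node] else float("inf")
    return scores


# Toy graph: three pages link to "hub", which links back to one of them.
toy_edges = [("a", "hub"), ("b", "hub"), ("c", "hub"), ("hub", "a")]
toy_scores = popularity(toy_edges)
```

Here `hub` receives three links and emits one, so it scores higher than any page that merely points to it.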
  • 40. Counter-example “With the help of C. Harold Wills, Ford designed, built, and successfully raced a 26-horsepower automobile in October 1901.”
  • 41. Using DBpedia: (Henry Ford, C. Wills) vs (Gerald Ford, C. Wills) ● Shared property values: 3 vs 0 ● Graph distance: direct connection vs no direct connection ● Abstract search: keywords found vs no keywords found ● Class distance: 0 vs 0 ● Popularity: / vs / ● Frame model: / vs / ● DBpedia Spotlight rank: > 10 vs 5
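One simple way to combine the module signals in this comparison is a majority vote, where each module votes for the candidate it favours and abstains on ties or missing values. This is an assumed combination rule for illustration only; the slides do not say how the modules are weighted. The votes below encode the comparison as read from the slide, taking a lower Spotlight rank to be the better one.

```python
from collections import Counter
from typing import Dict, Optional

# Which candidate each module favours for the mention "Ford" (None = tie or no value).
evidence: Dict[str, Optional[str]] = {
    "shared_property_values": "Henry Ford",   # 3 vs 0
    "graph_distance": "Henry Ford",           # direct connection vs none
    "abstract_search": "Henry Ford",          # keywords found vs not found
    "class_distance": None,                   # 0 vs 0, a tie
    "popularity": None,                       # no value on the slide
    "frame_model": None,                      # no value on the slide
    "dbpedia_spotlight_rank": "Gerald Ford",  # rank 5 vs rank > 10
}


def majority_vote(votes: Dict[str, Optional[str]]) -> Optional[str]:
    """Return the candidate with the most votes, or None if there is no clear winner."""
    tally = Counter(v for v in votes.values() if v is not None)
    ranked = tally.most_common()
    if not ranked:
        return None
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]


winner = majority_vote(evidence)
```

Under this rule the three DBpedia-based signals outvote the Spotlight ranking, which is the point of the counter-example.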
  • 42. Points to make ● No module is perfect ○ but >1 module will be helpful ● Popularity Bias ● Topic Bias ○ old movies, the western culture, entertainment, etc. ● Computational complexity (reduced implicitly) ● Incompleteness and maintenance of datasets ● The lexical resources are not ontological
  • 44. Data ● 120 news articles ○ 30 GM, Ford, Chrysler car data ○ 30 Apple corpus articles ○ 30 Boeing Airbus ○ 30 stock market ● Training data ○ 1.3 M car articles
  • 45. ++Awesome things I won’t do ● Cross-sentence check ● Build a model over more ontologies ● FrameNet situations reasoning
  • 48. Identity in language “Any two entities are both similar and dissimilar with respect to an infinite number of properties.” (Murphy & Medin, 1985) ● Entity coreference becomes tricky ○ Temporality: Is Old Amsterdam identical to New Amsterdam? ○ Pragmatism: Is Lord Lipton identical to the wealthiest tea importer? ○ Granularization: Are passengers and people identical? ● Continuum of identity
  • 49. Identity in NLP ● Traditional techniques interpret identity in a shallow way, based on: ○ popularity ○ TF-IDF score ○ bag-of-words similarity ● Contemporary techniques start looking into semantic coherence ○ pair-wise interpretation ○ collective interpretation
  • 50. Problem statement ● Entity linking ambiguity in the NLP world ● Also present within the Semantic Web ● Lexical resources are not exploited enough
  • 53. Tools: ConceptNet 5 ● because common-sense knowledge is often lacking in DBpedia and similar resources