The Use of Class Assertions and Hypernyms to
Induce and Disambiguate Word Senses
Artem Revenko Victor Mireles
artem.revenko@semantic-web.com
@revenkoartem
MLKG, August 27, 2019
Ambiguity of Natural Languages Co-occurrence Graph Clustering Disambiguation Experiments
Lynx Project
Artem Revenko, Victor Mireles (Semantic Web Company) artem.revenko@semantic-web.com @revenkoartem 2 / 17
Ambiguity of Natural Languages
Co-occurrence Graph Clustering
Disambiguation Experiments
Finding Mentioned Entity
A Knowledge Base:
Thing
  Animal
    Jaguar
  Car
    BMW
    Jaguar

BMW has designed a car that is going to drive Jaguar X1 out of the market.
Finding Mentioned Entity
A Knowledge Base:
Thing
  Animal
    Jaguar

BMW has designed a car that is going to drive Jaguar X1 out of the market.
?
MeSH on Demand
Example
"The author presents an interpretation of Hitchcock's 'Vertigo', focusing on the way in which its protagonist's drama resonates with the analyst's struggle with deep unconscious identifications..."
MeSH extraction tool: https://meshb.nlm.nih.gov/MeSHonDemand
Overall Description
Plan
1. Induction: Find all senses of the target E
   1. Represent the sense space
   2. Introduce information from the knowledge base
   3. Cluster senses
2. Disambiguation: Assign the correct sense to each occurrence of E
Word Sense Induction
Describing Senses: Co-occurrence Graph
"You shall know a word by the company it keeps." (Firth, J. R. 1957)
Dice score: $s(A, B) = \frac{2\,|AB|}{|A| + |B|}$
Example
|A| = #(car) = 50 occurrences,
|B| = #(BMW) = 20 occurrences,
|AB| = #(BMW and car within 10 words) = 10,
$s(\text{BMW}, \text{car}) = \frac{2 \cdot 10}{20 + 50} = 2/7 \approx 0.3$
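As a small illustration of the Dice score above, here is a minimal Python sketch. The function names `dice_score` and `count_cooccurrences`, and the counting convention (an occurrence of one term counts once if the other term appears within the window), are assumptions made for this example; the slide only gives the formula and the 10-word window.

```python
def dice_score(count_a: int, count_b: int, count_ab: int) -> float:
    """Dice score s(A, B) = 2|AB| / (|A| + |B|)."""
    total = count_a + count_b
    if total == 0:
        return 0.0
    return 2 * count_ab / total


def count_cooccurrences(tokens, a, b, window=10):
    """Count occurrences of `a` that have `b` within `window` tokens.

    This convention is an assumption; other definitions of |AB| are possible.
    """
    pos_a = [i for i, tok in enumerate(tokens) if tok == a]
    pos_b = [i for i, tok in enumerate(tokens) if tok == b]
    return sum(1 for i in pos_a if any(abs(i - j) <= window for j in pos_b))


# Numbers from the slide: #(car) = 50, #(BMW) = 20, co-occurrences = 10.
slide_score = dice_score(50, 20, 10)
print(round(slide_score, 2))  # 0.29

# Tiny hypothetical text, only to exercise the counting helper.
toy_tokens = "bmw has designed a car".split()
toy_cooc = count_cooccurrences(toy_tokens, "bmw", "car")
```

The score is symmetric in A and B, so it can be used directly as an undirected edge weight in a co-occurrence graph.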
Co-occurrence Graph Clustering
[Figure, shown in successive builds: a co-occurrence graph centred on Target E, first with neighbours Term1 to Term4 and then with Term1 to Term6. Edge weights shown: 0.2, 0.15, 0.12, 0.1, 0.22, 0.16, 0.2, 0.1, 0.2. Without knowledge-base information, the terms are grouped into Sense1 and Sense2. With a Hypernym node added to the graph, they are grouped into a Hypernym Sense and an “Other” Sense.]
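The clustering step pictured above can be sketched roughly as follows. This is a simplified, hub-based illustration loosely in the spirit of HyperLex (the method named in the Evaluation slides), not the authors' implementation. The functions `build_graph`, `pick_hubs` and `assign_to_hubs` are hypothetical, and the edge list is invented, because the figure's exact edges cannot be recovered from the slide text.

```python
def build_graph(edges):
    """Build an undirected weighted graph as a dict of dicts."""
    graph = {}
    for a, b, weight in edges:
        graph.setdefault(a, {})[b] = weight
        graph.setdefault(b, {})[a] = weight
    return graph


def pick_hubs(graph, max_hubs=2):
    """Greedily pick hubs by weighted degree, skipping neighbours of earlier hubs."""
    degree = {node: sum(nbrs.values()) for node, nbrs in graph.items()}
    hubs, blocked = [], set()
    for node in sorted(degree, key=degree.get, reverse=True):
        if node in blocked:
            continue
        hubs.append(node)
        blocked.add(node)
        blocked.update(graph[node])  # neighbours belong to this hub's sense
        if len(hubs) == max_hubs:
            break
    return hubs


def assign_to_hubs(graph, hubs):
    """Attach every non-hub term to the hub it is most strongly linked to."""
    clusters = {hub: [hub] for hub in hubs}
    for node in graph:
        if node in hubs:
            continue
        best = max(hubs, key=lambda hub: graph[node].get(hub, 0.0))
        if graph[node].get(best, 0.0) > 0:
            clusters[best].append(node)
    return clusters


# Hypothetical co-occurrence edges between context terms of the target.
example_edges = [
    ("Term1", "Term2", 0.22),
    ("Term1", "Term3", 0.16),
    ("Term2", "Term3", 0.2),
    ("Term4", "Term5", 0.2),
    ("Term5", "Term6", 0.1),
]
example_graph = build_graph(example_edges)
example_hubs = pick_hubs(example_graph)
example_clusters = assign_to_hubs(example_graph, example_hubs)
```

Each resulting cluster plays the role of one induced sense. In this simplified version a term with no direct edge to any hub stays unassigned.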
Setup
Data: github.com/artreven/thesaural_wsi
Created using Wikilinks: http://www.iesl.cs.umass.edu/data/wiki-links

Cocktails
  Knowledge base: 250 concepts about cocktails
  Corpora: 13 corpora, 1227 texts
  Examples: Americano, B52, Margarita, ...

MeSH
  Knowledge base: 28000 concepts from MeSH (Medical Subject Headings)
  Corpora: 8 corpora, 784 texts
  Examples: Amnesia, Vertigo, ...
Example
Americano
45 texts: 13 about the cocktail, 32 about the coffee.
Potential hubs:
1. “campari”
2. “mix”
3. “espresso”
With hypernyms (cocktail, Mixed drink, The Unforgettables): mix.
Reorder potential hubs:
1. “espresso”
2. “campari”
Final hubs: mix, espresso.
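One possible reading of the Americano example can be sketched as below. It is an interpretation, not something the slides confirm: the potential hub most strongly associated with the hypernyms becomes the hub of the hypernym sense, and the remaining candidates are reordered so that the one least associated with the hypernyms comes first. The function `select_hubs` and the affinity values are hypothetical and chosen only so that the outcome matches the slide.

```python
def select_hubs(potential_hubs, hypernym_affinity):
    """Pick a hypernym-sense hub and an "other"-sense hub (illustrative only).

    `hypernym_affinity` maps each candidate hub to an assumed association
    score with the hypernyms, e.g. a Dice score.
    """
    hypernym_hub = max(potential_hubs, key=lambda hub: hypernym_affinity.get(hub, 0.0))
    remaining = [hub for hub in potential_hubs if hub != hypernym_hub]
    # Candidates least related to the hypernyms are tried first.
    reordered = sorted(remaining, key=lambda hub: hypernym_affinity.get(hub, 0.0))
    other_hub = reordered[0] if reordered else None
    return hypernym_hub, reordered, other_hub


# Candidate hubs from the slide; the affinity scores are invented.
candidates = ["campari", "mix", "espresso"]
affinity = {"campari": 0.3, "mix": 0.5, "espresso": 0.05}
hypernym_hub, reordered, other_hub = select_hubs(candidates, affinity)
final_hubs = [hypernym_hub, other_hub]
```

Under these assumed scores the sketch reproduces the slide's outcome: "mix" for the cocktail sense and "espresso" for the coffee sense, with "campari" dropped.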
Evaluation

Cocktails
Method         Macro average  Micro average
Baseline       0.725          0.737
HyperLex       0.784          0.821
HyperHyperLex  0.841          0.896

MeSH
Method         Macro average  Micro average
Baseline       0.68           0.735
HyperLex       0.617          0.621
HyperHyperLex  0.723          0.739
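To make the difference between the two columns concrete, here is a minimal sketch of macro and micro averaging. It assumes the averaged quantity is per-corpus accuracy, which the slides do not state. The function `macro_micro` and the counts are hypothetical.

```python
def macro_micro(results):
    """Return (macro, micro) averages from {corpus: (correct, total)} counts."""
    per_corpus = [correct / total for correct, total in results.values()]
    macro = sum(per_corpus) / len(per_corpus)  # every corpus weighs the same
    all_correct = sum(correct for correct, _ in results.values())
    all_total = sum(total for _, total in results.values())
    micro = all_correct / all_total  # every text weighs the same
    return macro, micro


# Invented counts for two target corpora, for illustration only.
toy_results = {"Americano": (40, 45), "Margarita": (10, 20)}
macro_avg, micro_avg = macro_micro(toy_results)
```

The micro average is dominated by large corpora, while the macro average treats each target entity equally, so the two can differ noticeably when corpus sizes are unbalanced.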
SemEval 2013 Task 13
Table: Results of SemEval 2013 Task 13 challenge

System         Jaccard Index  Positionally-Weighted Tau  Weighted NDCG Recall  Fuzzy NMI  Fuzzy BCubed Fscore
HyperHyperLex  0.193          0.603                      0.369                 0.092      0.393
AI-KU          0.197          0.620                      0.387                 0.065      0.390
UoS (top3)     0.232          0.625                      0.374                 0.045      0.448
Conclusion
Features:
  Links to a domain-specific knowledge base without requiring knowledge of all senses.
  Combined unsupervised / knowledge-based approach.
  Takes the knowledge graph into account, but uses no external resources.
Observations:
  Requires a corpus containing the different senses for training.
  Cannot deal with unseen senses.
Thank you
Questions?
Artem Revenko Victor Mireles
artem.revenko@semantic-web.com
@revenkoartem
