The document describes a method for inducing and disambiguating word senses using class assertions and hypernyms. It involves building a co-occurrence graph to represent word senses, introducing information from a knowledge base, and clustering the graph to induce senses. Experiments applying this method achieved improved disambiguation performance over baselines on datasets involving ambiguous words related to cocktails and medical concepts.
WSID with KGs @ MLKG DEXA
1. The Use of Class Assertions and Hypernyms to Induce and Disambiguate Word Senses
Artem Revenko, Victor Mireles
artem.revenko@semantic-web.com
@revenkoartem
MLKG, August 27, 2019
2. Lynx Project
Artem Revenko, Victor Mireles (Semantic Web Company) artem.revenko@semantic-web.com @revenkoartem 2 / 17
3. Outline
Ambiguity of Natural Languages
Co-occurrence Graph Clustering
Disambiguation Experiments
4. Finding Mentioned Entity
A Knowledge Base (diagram nodes): Thing, Animal, Jaguar, Car, BMW, Jaguar
BMW has designed a car that is going to drive Jaguar X1 out of the market.
6. Finding Mentioned Entity
A Knowledge Base (diagram nodes): Thing, Animal, Jaguar
BMW has designed a car that is going to drive Jaguar X1 out of the market.
?
7. MeSH on Demand
Example
The author presents an interpretation of Hitchcock's 'Vertigo', focusing on the way in which its protagonist's drama resonates with the analyst's struggle with deep unconscious identifications...
MeSH extraction tool: https://meshb.nlm.nih.gov/MeSHonDemand
12. Overall Description
Plan
1. Induction: Find all senses of the target E
   1. Represent the sense space
   2. Introduce information from the knowledge base
   3. Cluster senses
2. Disambiguation: Assign the correct sense to each occurrence of E
13. Word Sense Induction
Describing Senses: Co-occurrence Graph
"You shall know a word by the company it keeps." (Firth, J. R., 1957)
Dice score: $s(A, B) = \frac{2\,|AB|}{|A| + |B|}$
Example
|A| = #(car) = 50 occurrences,
|B| = #(BMW) = 20 occurrences,
|AB| = #(BMW and car within 10 words) = 10,
$s(\mathrm{BMW}, \mathrm{car}) = \frac{2 \cdot 10}{20 + 50} = \frac{2}{7} \approx 0.3$
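The Dice score can be checked with a few lines of Python. This is a minimal sketch: the function name `dice_score` and its parameter names are illustrative and do not come from the slides.

```python
def dice_score(count_a: int, count_b: int, count_ab: int) -> float:
    """Dice score s(A, B) = 2|AB| / (|A| + |B|).

    count_a, count_b: occurrence counts of the two terms.
    count_ab: number of times both terms occur close to each other.
    """
    total = count_a + count_b
    if total == 0:
        # Neither term occurs, so there is no association to measure.
        return 0.0
    return 2 * count_ab / total


# Counts from the slide: #(car) = 50, #(BMW) = 20, 10 co-occurrences.
score = dice_score(count_a=50, count_b=20, count_ab=10)
print(round(score, 3))
```

With the counts from the example the result is 2/7, which the slide rounds to 0.3.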
18. Co-occurrence Graph Clustering
[Figure: co-occurrence graph around Target E with nodes Term1 to Term6; edge weights range from 0.1 to 0.22; the terms are grouped into two clusters, Sense1 and Sense2.]
21. Co-occurrence Graph Clustering
[Figure: co-occurrence graph around Target E with nodes Term1 to Term6 and an added Hypernym node; edge weights range from 0.1 to 0.22; the terms are grouped into a Hypernym Sense and an "Other" Sense.]
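A weighted co-occurrence graph like the one in the figure could be built roughly as follows. This is a sketch under assumptions that the slides do not confirm: counts are taken per text (a pair counts once per text if both terms occur within `window` tokens of each other), which keeps every weight between 0 and 1. The function name `cooccurrence_graph` and the toy texts are hypothetical, and this is not the authors' implementation.

```python
from collections import Counter


def cooccurrence_graph(texts, window=10):
    """Return {term: {neighbour: dice_weight}} for tokenized texts.

    Assumption: |A| is the number of texts containing A and |AB| is the
    number of texts where A and B occur within `window` tokens.
    """
    term_docs = Counter()
    pair_docs = Counter()
    for tokens in texts:
        term_docs.update(set(tokens))
        pairs = set()
        for i, left in enumerate(tokens):
            # Look at the next `window` tokens after position i.
            for right in tokens[i + 1 : i + 1 + window]:
                if left != right:
                    pairs.add(frozenset((left, right)))
        pair_docs.update(pairs)

    graph = {}
    for pair, n_ab in pair_docs.items():
        a, b = tuple(pair)
        weight = 2 * n_ab / (term_docs[a] + term_docs[b])
        graph.setdefault(a, {})[b] = weight
        graph.setdefault(b, {})[a] = weight
    return graph


# Invented toy texts, for illustration only.
texts = [
    ["americano", "campari", "mix"],
    ["americano", "espresso"],
    ["americano", "espresso", "milk"],
]
graph = cooccurrence_graph(texts)
```

In this toy corpus "americano" and "espresso" share two of the texts they appear in, so their edge weight is 2 * 2 / (3 + 2) = 0.8.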
23. Setup
Data: github.com/artreven/thesaural_wsi
Created using Wikilinks: http://www.iesl.cs.umass.edu/data/wiki-links
Cocktails
  Knowledge base: 250 concepts about cocktails
  Corpora: 13 corpora, 1227 texts
  Examples: Americano, B52, Margarita, ...
MeSH
  Knowledge base: 28000 MeSH (medical) concepts
  Corpora: 8 corpora, 784 texts
  Examples: Amnesia, Vertigo, ...
27. Example
Americano
45 texts: 13 about cocktail, 32 about coffee.
Potential hubs:
1. "campari"
2. "mix"
3. "espresso"
With hypernyms cocktail, Mixed drink, The Unforgettables: "mix".
Reorder potential hubs:
1. "espresso"
2. "campari"
Final hubs: "mix", "espresso".
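The hub selection in the Americano example can be imitated with a small greedy procedure. This is a simplified sketch, not the authors' algorithm: it assumes that candidates linked to a hypernym are chosen first and that the neighbours of a chosen hub are dropped from the candidate list, as in HyperLex-style hub selection. It does not reproduce the reordering step shown on the slide. The function name `select_hubs`, the toy graph, and its weights are invented for illustration.

```python
def select_hubs(graph, candidates, hypernyms=()):
    """Greedily pick hubs, preferring candidates adjacent to a hypernym.

    graph: {term: {neighbour: weight}} without the target word.
    candidates: potential hubs in their initial order.
    """
    hypernyms = set(hypernyms)

    def touches_hypernym(term):
        return bool(hypernyms & set(graph.get(term, {})))

    # Stable sort: hypernym-linked candidates move to the front,
    # all other candidates keep their original relative order.
    queue = sorted(candidates, key=lambda t: not touches_hypernym(t))

    hubs, excluded = [], set()
    for term in queue:
        if term in excluded:
            continue
        hubs.append(term)
        # Neighbours of a chosen hub cannot become hubs themselves.
        excluded.update(graph.get(term, {}))
    return hubs


# Invented toy graph; the weights are not taken from the experiments.
toy_graph = {
    "campari": {"mix": 0.2},
    "mix": {"campari": 0.2, "cocktail": 0.22},
    "espresso": {"coffee": 0.2},
    "cocktail": {"mix": 0.22},
    "coffee": {"espresso": 0.2},
}
candidates = ["campari", "mix", "espresso"]
plain_hubs = select_hubs(toy_graph, candidates)
hypernym_hubs = select_hubs(toy_graph, candidates, hypernyms=["cocktail"])
```

On this toy graph the plain run keeps "campari" and "espresso", while the run with the hypernym "cocktail" ends with "mix" and "espresso", the same final hubs as on the slide.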
29. Evaluation
Cocktails
Method         Macro average  Micro average
Baseline       0.725          0.737
HyperLex       0.784          0.821
HyperHyperLex  0.841          0.896
MeSH
Method         Macro average  Micro average
Baseline       0.68           0.735
HyperLex       0.617          0.621
HyperHyperLex  0.723          0.739
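The difference between the two averages can be illustrated as follows. The slides do not state the underlying metric or the unit being averaged, so this sketch assumes per-target accuracy: the macro average is the mean of the per-target scores, and the micro average pools all texts. The function name `macro_micro` and the toy counts are hypothetical and are not results from the experiments.

```python
def macro_micro(results):
    """results: {target: (n_correct, n_total)} -> (macro, micro)."""
    per_target = [correct / total for correct, total in results.values()]
    # Macro: every target word counts equally.
    macro = sum(per_target) / len(per_target)
    # Micro: every text counts equally, so large corpora dominate.
    all_correct = sum(correct for correct, _ in results.values())
    all_total = sum(total for _, total in results.values())
    micro = all_correct / all_total
    return macro, micro


# Invented counts, for illustration only.
toy_results = {"americano": (40, 45), "b52": (6, 10)}
macro, micro = macro_micro(toy_results)
```

Here the micro average (46/55) is higher than the macro average because the larger corpus is also the better-scoring one.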
30. SemEval 2013 Task 13
Table: Results of SemEval 2013 Task 13 challenge
System         Jaccard Index  Positionally-Weighted Tau  Weighted NDCG Recall  Fuzzy NMI  Fuzzy BCubed Fscore
HyperHyperLex  0.193          0.603                      0.369                 0.092      0.393
AI-KU          0.197          0.620                      0.387                 0.065      0.390
UoS (top3)     0.232          0.625                      0.374                 0.045      0.448
31. Conclusion
Features:
  Links to a domain-specific knowledge base without requiring knowledge of all senses.
  Combined unsupervised / knowledge-based approach.
  Takes the knowledge graph into account, but uses no external resources.
Observations:
  Requires a training corpus that contains the different senses.
  Cannot deal with unseen senses.