A Survey on Unsupervised Graph-based Word Sense Disambiguation

Elena-Oana Tabaranu
Presents comparative evaluations of graph-based
word sense disambiguation techniques using several measures of
word semantic similarity and several ranking algorithms. Unsupervised
word sense disambiguation has received a lot of attention lately because
of its fast execution time and its ability to make the most of a small
input corpus. Recent state-of-the-art graph-based systems have tried to
close the gap between the supervised and the unsupervised approaches.


Transcript of "A Survey on Unsupervised Graph-based Word Sense Disambiguation"

  1. A Survey on Unsupervised Graph-based Word Sense Disambiguation
     Elena-Oana Tabaranu, elena.tabaranu@info.uaic.ro, UAIC, Iasi
  2. Plan
     1. Introduction
     2. State of the Art
     3. Experiments and Results
     4. Conclusions
     5. References
  3. Introduction
     ● WSD = automatically assigning the most appropriate meaning to a polysemous word within a given context (Sinha et al., 2007)
     ● Use cases:
       ● Machine translation
       ● Speech processing
       ● Boosting the performance of tasks like text retrieval, document classification and document clustering
  4. State of the Art
     ● Supervised WSD vs Unsupervised WSD
     ● GWSD and Semantic Graph Construction
     ● SAN Method
     ● Page-Rank Method
     ● HITS Method
     ● P-Rank Method
  5. Supervised WSD vs Unsupervised WSD
     Supervised WSD:
     ● Most approaches transform the sense of the word into a feature vector
     ● Low execution time
     ● Accuracy of 60%-70%
     ● Major disadvantage: knowledge acquisition bottleneck (accuracy is tied to the amount of manually annotated data)
     Unsupervised WSD:
     ● Identify the best sense candidate for a model of the word sense dependency in text
     ● Ranking algorithm to choose the most likely combination of senses
     ● Window-based or graph-based representation of the model
     ● Fast execution time
     ● Accuracy of 40%-60%
  6. Graph-based WSD
     ● GWSD = graph representation used to model word sense dependencies in text (WSD with graphs, not just a word window)
     ● Goal: identify the most probable sense (label) for each word
     ● Advantage: takes into account information drawn from the entire graph
  7. Semantic Graph Construction (I)
     ● Example (Sinha et al., 2007)
  8. Semantic Graph Construction (II)
     ● Example (Tsatsaronis et al., 2010)
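The figures for the two construction examples are not part of this transcript. As a rough illustration only, the sketch below builds a weighted sense graph in the general spirit of the similarity-based approach: nodes are candidate senses, and edges connect senses of different nearby words. The function name `build_sense_graph`, the `max_distance` parameter, the sense labels, and the toy similarity table are all assumptions made for this example and are not taken from the cited papers.

```python
def build_sense_graph(candidates, similarity, max_distance=2):
    """Build an undirected weighted graph over candidate senses.

    candidates: list of (word, [sense labels]) in text order.
    similarity: function (sense, sense) -> float, an assumed
                stand-in for a real semantic similarity measure.
    max_distance: how many following words each word is linked to
                  (hypothetical parameter).
    Returns a dict mapping (sense_a, sense_b) -> edge weight.
    """
    edges = {}
    for i, (_, senses_i) in enumerate(candidates):
        last = min(i + 1 + max_distance, len(candidates))
        for j in range(i + 1, last):
            for s in senses_i:
                for t in candidates[j][1]:
                    weight = similarity(s, t)
                    if weight > 0:
                        # Senses of the same word are never linked,
                        # because i and j are always different words.
                        edges[(s, t)] = weight
    return edges


# Toy similarity table with made-up values.
toy_similarity = {("bank#1", "money#1"): 0.9}


def lookup(s, t):
    return toy_similarity.get((s, t), 0.0)


candidates = [("bank", ["bank#1", "bank#2"]), ("money", ["money#1"])]
edges = build_sense_graph(candidates, lookup)
```

A ranking algorithm can then be run over `edges` to score each sense node.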
  9. The Page-Rank Method (Brin and Page, 1998)
     ● Ranking algorithm based on the idea of voting: when one node links to another, it casts a vote for that other node
     ● The higher the number of votes for a node, the higher the importance of the node
     ● Recursively score the candidate nodes for a weighted undirected graph
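The voting idea on a weighted undirected graph can be sketched as follows. This is a minimal illustration, assuming the commonly used weighted update score(a) = (1 - d) + d * sum over neighbours b of (w_ab / total weight at b) * score(b). The damping value 0.85, the fixed iteration count, and the toy sense labels and weights are assumptions for this example, not values from the slides.

```python
def weighted_pagerank(edges, damping=0.85, iterations=50):
    """Score nodes of an undirected weighted graph.

    edges: dict mapping (u, v) -> positive weight; each pair is
           treated as an undirected edge.
    """
    neighbors = {}
    for (u, v), w in edges.items():
        neighbors.setdefault(u, {})[v] = w
        neighbors.setdefault(v, {})[u] = w

    scores = {node: 1.0 for node in neighbors}
    for _ in range(iterations):
        new_scores = {}
        for node, adjacent in neighbors.items():
            rank = 0.0
            for nb, w in adjacent.items():
                # Each neighbour splits its vote in proportion
                # to the weights of its own edges.
                rank += w / sum(neighbors[nb].values()) * scores[nb]
            new_scores[node] = (1 - damping) + damping * rank
        scores = new_scores
    return scores


# Toy graph: two senses of "bank" linked to context senses.
toy_edges = {
    ("bank#1", "money#1"): 0.9,
    ("bank#2", "money#1"): 0.1,
    ("bank#2", "river#1"): 0.2,
}
scores = weighted_pagerank(toy_edges)
best_bank_sense = max(["bank#1", "bank#2"], key=scores.get)
```

In this toy graph the sense most strongly tied to the context receives the higher score, which is the selection criterion the ranking step relies on.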
  10. The P-Rank Method (Zhao et al., 2009)
     ● Check the structural similarity of nodes in an information network
     ● Based on the idea that two nodes are similar if they are referenced by, and also reference, similar nodes
     ● Represents a generalization of other state-of-the-art measures like CoCitation, Coupling, Amsler, SimRank
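A minimal sketch of the recursion behind this idea, assuming the usual formulation in which the similarity of two distinct nodes is a decayed blend of the average similarity of their in-neighbours and the average similarity of their out-neighbours, with every node fully similar to itself. The parameter names `decay` and `balance`, their default values, and the toy graph are assumptions for this example.

```python
from itertools import product


def p_rank(out_links, decay=0.8, balance=0.5, iterations=10):
    """Iteratively compute pairwise structural similarity.

    out_links: dict mapping node -> iterable of nodes it references.
    balance:   weight of the in-link term; (1 - balance) weights
               the out-link term.
    """
    nodes = set(out_links)
    for targets in out_links.values():
        nodes.update(targets)
    outs = {n: set(out_links.get(n, ())) for n in nodes}
    ins = {n: set() for n in nodes}
    for src, targets in outs.items():
        for dst in targets:
            ins[dst].add(src)

    def average(sim, xs, ys):
        # Mean similarity over all pairs; 0 if either set is empty.
        if not xs or not ys:
            return 0.0
        return sum(sim[x, y] for x in xs for y in ys) / (len(xs) * len(ys))

    sim = {(a, b): 1.0 if a == b else 0.0
           for a, b in product(nodes, repeat=2)}
    for _ in range(iterations):
        updated = {}
        for a, b in sim:
            if a == b:
                updated[a, b] = 1.0
            else:
                updated[a, b] = decay * (
                    balance * average(sim, ins[a], ins[b])
                    + (1 - balance) * average(sim, outs[a], outs[b])
                )
        sim = updated
    return sim


# Toy graph: x and y share both their referrer and their target.
sim = p_rank({"p": ["x", "y"], "x": ["q"], "y": ["q"]})
```

With `balance=1.0` only in-links are used and with `balance=0.0` only out-links, which is how a single recursion can cover the one-directional measures as special cases.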
  11. The HITS Method (Kleinberg, 1999)
     ● Identify authorities = the most important nodes in the graph
     ● Identify hubs = the nodes which point to authorities
     ● The sense with the highest authority is chosen as the most likely one for each word
     ● Major disadvantage: densely connected nodes can attract the highest score (clique attack)
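The hub and authority updates can be sketched as below. This is a minimal illustration of the standard mutual-reinforcement iteration on a directed graph with L2 normalisation after each step; the iteration count and the toy graph are assumptions for this example.

```python
import math


def hits(out_links, iterations=50):
    """Return (hub, authority) score dicts for a directed graph.

    out_links: dict mapping node -> iterable of nodes it points to.
    """
    nodes = set(out_links)
    for targets in out_links.values():
        nodes.update(targets)

    def normalise(values):
        norm = math.sqrt(sum(v * v for v in values.values())) or 1.0
        return {n: v / norm for n, v in values.items()}

    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Authority: sum of hub scores of the nodes pointing at it.
        auth = {n: 0.0 for n in nodes}
        for src, targets in out_links.items():
            for dst in targets:
                auth[dst] += hub[src]
        auth = normalise(auth)
        # Hub: sum of authority scores of the nodes it points to.
        hub = normalise({
            n: sum(auth[dst] for dst in out_links.get(n, ()))
            for n in nodes
        })
    return hub, auth


# Toy graph: a and b both point to c, and c points to d.
hub, auth = hits({"a": ["c"], "b": ["c"], "c": ["d"]})
top_authority = max(auth, key=auth.get)
```

Node `c` ends up as the top authority because two hubs point to it, which also hints at the clique issue: a tightly interlinked group of nodes can reinforce each other's scores.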
  12. Experiments and Results (I)
     ● Senseval 2 and 3 data sets are often used for testing
     ● Occurrences for Senseval 2 using WordNet 2
     ● Occurrences for Senseval 3 using WordNet 2
  13. Experiments and Results (II)
     ● Accuracies on the Senseval 2 and 3 English All Words Task data sets (Tsatsaronis et al.)
  14. Conclusions
     ● Recent systems narrow the gap between supervised and unsupervised approaches.
     ● The graph-based methods make the most of the rich semantic model they employ.
     ● Unsupervised approaches seek the optimal value for the parameters using as little training data as possible and testing on as large a dataset as possible.
     ● Future work: implement P-Rank using a different representation, for example that of Sinha et al.
  15. References
     1. Tsatsaronis, G., Varlamis, I., Norvag, K.: An Experimental Study on Unsupervised Graph-based Word Sense Disambiguation. In Proc. of CICLing (2010).
     2. Sinha, R., Mihalcea, R.: Unsupervised graph-based word sense disambiguation using measures of semantic similarity. In Proc. of ICSC (2007).
     3. Mihalcea, R., Csomai, A.: Senselearner: Word sense disambiguation for all words in unrestricted text. In Proc. of ACL, pages 53-56 (2005).
     4. Tsatsaronis, G., Vazirgiannis, M., Androutsopoulos, I.: Word Sense Disambiguation with Spreading Activation Networks Generated from Thesauri. In Proc. of IJCAI (2007).
  16. Questions?