A Survey on Unsupervised Graph-based Word Sense Disambiguation

Presents comparative evaluations of graph-based
word sense disambiguation techniques using several measures of
word semantic similarity and several ranking algorithms. Unsupervised
word sense disambiguation has received a lot of attention lately because
of its fast execution time and its ability to make the most of a small
input corpus. Recent state-of-the-art graph-based systems have tried to
close the gap between the supervised and the unsupervised approaches.

    Presentation Transcript

    • A Survey on Unsupervised Graph-based Word Sense Disambiguation Elena-Oana Tabaranu elena.tabaranu@info.uaic.ro UAIC, Iasi
    • Plan 1. Introduction 2. State of the Art 3. Experiments and Results 4. Conclusions 5. References
    • Introduction ● WSD = automatically assign the most appropriate meaning to a polysemous word within a given context (Sinha et al, 2007) ● Use cases: ● Machine translation ● Speech processing ● Boosting the performance of tasks like text retrieval, document classification and document clustering
    • State of the Art ● Supervised WSD vs Unsupervised WSD ● GWSD and Semantic Graph Construction ● SAN Method ● PageRank Method ● HITS Method ● P-Rank Method
    • Supervised WSD vs Unsupervised WSD ● Supervised WSD: ● Most approaches transform the sense of the word into a feature vector ● Low execution time ● Accuracy of 60%-70% ● Major disadvantage: knowledge acquisition bottleneck (accuracy connected to the amount of manually annotated data) ● Unsupervised WSD: ● Identify the best sense candidate for a model of the word sense dependency in text ● Ranking algorithm to choose their most likely combination ● Window, graph-based representation of the model ● Fast execution time ● Accuracy of 40%-60%
    • Graph-based WSD ● GWSD = graph representation used to model word sense dependencies in text (WSD with graphs, not just word window) ● Goal: identify the most probable sense (label) for each word ● Advantage: takes into account information drawn from the entire graph
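
To make the graph representation concrete, here is a minimal Python sketch of building a sense graph for a short context. It is an illustration only, not the construction used by Sinha et al or Tsatsaronis et al: the `SENSES` inventory, its glosses, the `STOPWORDS` set, the gloss-overlap similarity and the function names are all hypothetical stand-ins. A real system would take senses from WordNet and use one of the semantic similarity measures the survey compares.

```python
from itertools import combinations

# Hypothetical toy sense inventory: word -> {sense label: gloss}.
# A real system would read senses and glosses from WordNet.
SENSES = {
    "bank": {
        "bank#1": "financial institution that accepts deposits of money",
        "bank#2": "sloping land beside a river or lake",
    },
    "deposit": {
        "deposit#1": "money placed in a financial institution account",
        "deposit#2": "layer of sediment left by a river",
    },
    "money": {
        "money#1": "medium of exchange accepted by a financial institution",
    },
}

# Hypothetical stopword list, just large enough for the toy glosses.
STOPWORDS = {"a", "of", "or", "in", "by", "that", "the"}


def gloss_overlap(gloss_a, gloss_b):
    """Toy similarity: number of shared non-stopword gloss tokens."""
    tokens_a = set(gloss_a.split()) - STOPWORDS
    tokens_b = set(gloss_b.split()) - STOPWORDS
    return len(tokens_a & tokens_b)


def build_sense_graph(words, senses=SENSES, window=2):
    """Return {(sense_u, sense_v): weight} linking senses of distinct
    words that occur at most `window` positions apart."""
    edges = {}
    for (i, w1), (j, w2) in combinations(enumerate(words), 2):
        if j - i > window or w1 == w2:
            continue
        for s1, g1 in senses[w1].items():
            for s2, g2 in senses[w2].items():
                weight = gloss_overlap(g1, g2)
                if weight > 0:  # keep only edges with some similarity
                    edges[(s1, s2)] = weight
    return edges


graph = build_sense_graph(["bank", "deposit", "money"])
for edge, weight in sorted(graph.items()):
    print(edge, weight)
```

Each node is one candidate sense, and senses of the same word are never linked, so a ranking algorithm run over this graph has to choose between them using evidence from the other words in the context.
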
    • Semantic Graph Construction (I) ● Example (Sinha et al, 2007)
    • Semantic Graph Construction (II) ● Example (Tsatsaronis et al, 2010)
    • The PageRank Method (Brin and Page, 1998) ● Ranking algorithm based on the idea of voting: when one node links to another it offers a vote to that other node ● The higher the number of votes for a node, the higher the importance of the node ● Recursively score the candidate nodes for a weighted undirected graph
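
The voting idea can be sketched as a simple fixed-point iteration over a weighted undirected graph. The version below is a minimal Python sketch, assuming the common weighted formulation in which each neighbour passes on its score in proportion to edge weight. The function name, the `damping` and `iterations` defaults, and the example edges are assumptions made for illustration, not values taken from the slides.

```python
def weighted_pagerank(edges, damping=0.85, iterations=50):
    """edges: {(u, v): weight} for an undirected graph with positive weights.
    Returns {node: score}."""
    neighbours = {}
    for (u, v), w in edges.items():
        neighbours.setdefault(u, {})[v] = w
        neighbours.setdefault(v, {})[u] = w
    # Total edge weight attached to each node.
    strength = {n: sum(adj.values()) for n, adj in neighbours.items()}
    scores = {n: 1.0 for n in neighbours}
    for _ in range(iterations):
        # Each neighbour m "votes" for n with the share w / strength[m]
        # of its current score.
        scores = {
            n: (1 - damping) + damping * sum(
                w / strength[m] * scores[m] for m, w in adj.items()
            )
            for n, adj in neighbours.items()
        }
    return scores


# Hypothetical sense graph: labels are "word#sense", weights are similarities.
edges = {
    ("bank#1", "deposit#1"): 3,
    ("bank#1", "money#1"): 2,
    ("deposit#1", "money#1"): 2,
    ("bank#2", "deposit#2"): 1,
}
scores = weighted_pagerank(edges)

# Keep the highest-scoring sense for each word.
best = {}
for sense, score in scores.items():
    word = sense.split("#")[0]
    if word not in best or score > scores[best[word]]:
        best[word] = sense
print(best)
```

In this toy graph the financial senses reinforce each other through three mutually connected nodes, so they end up ranked above the river senses, which are supported by a single weak edge.
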
    • The P-Rank Method (Zhao et al, 2009) ● Check the structural similarity of nodes in an information network ● Based on the idea that two nodes are similar if they are referenced by and also reference similar nodes ● Represents a generalization of other state-of-the-art measures like CoCitation, Coupling, Amsler, SimLink
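
The recursive definition can be sketched as an iteration that averages the similarity of in-neighbour pairs and out-neighbour pairs. This is a minimal Python sketch of that idea on a small directed graph. The parameter names `decay` (damping applied at each step) and `in_weight` (balance between the in-link and out-link terms), their default values, and the example graph are assumptions chosen for illustration.

```python
from itertools import product


def p_rank(edges, decay=0.8, in_weight=0.5, iterations=10):
    """edges: iterable of directed (source, target) pairs.
    Returns {(a, b): similarity} for every ordered node pair."""
    edges = set(edges)
    nodes = sorted({n for edge in edges for n in edge})
    in_nb = {n: [s for s, t in edges if t == n] for n in nodes}
    out_nb = {n: [t for s, t in edges if s == n] for n in nodes}
    # Start from the identity: a node is only similar to itself.
    sim = {(a, b): 1.0 if a == b else 0.0
           for a, b in product(nodes, repeat=2)}

    def average(xs, ys, current):
        """Mean similarity over all pairs (x, y); 0 if either side is empty."""
        if not xs or not ys:
            return 0.0
        total = sum(current[(x, y)] for x in xs for y in ys)
        return total / (len(xs) * len(ys))

    for _ in range(iterations):
        updated = {}
        for a, b in sim:
            if a == b:
                updated[(a, b)] = 1.0
                continue
            in_term = average(in_nb[a], in_nb[b], sim)
            out_term = average(out_nb[a], out_nb[b], sim)
            updated[(a, b)] = decay * (
                in_weight * in_term + (1 - in_weight) * out_term
            )
        sim = updated
    return sim


# Hypothetical graph: x and y share both a referrer (p) and a target (q);
# z shares only the referrer.
sim = p_rank([("p", "x"), ("p", "y"), ("p", "z"), ("x", "q"), ("y", "q")])
print(round(sim[("x", "y")], 3), round(sim[("x", "z")], 3))
```

Setting `in_weight` to 1 keeps only the in-link term and setting it to 0 keeps only the out-link term, which is how a measure of this shape can cover the one-directional measures listed on the slide as special cases.
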
    • The HITS Method (Kleinberg, 1999) ● Identify authorities = the most important nodes in the graph ● Identify hubs = the nodes which point to authorities ● The sense with the highest authority is chosen as the most likely one for each word ● Major disadvantage: densely connected nodes can attract the highest score (clique attack)
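
The mutual reinforcement between hubs and authorities can be sketched as two alternating update steps followed by normalisation. This is a minimal Python sketch on a small directed graph. The function names, the iteration count and the node labels are assumptions made for illustration, and the slides do not say how edge direction is assigned in the WSD graphs.

```python
import math


def normalise(scores):
    """Scale scores to unit Euclidean length (no-op if all are zero)."""
    norm = math.sqrt(sum(value * value for value in scores.values()))
    if norm == 0:
        return dict(scores)
    return {node: value / norm for node, value in scores.items()}


def hits(edges, iterations=50):
    """edges: iterable of directed (source, target) pairs.
    Returns (hub_scores, authority_scores)."""
    edges = list(edges)
    nodes = {n for edge in edges for n in edge}
    hubs = {n: 1.0 for n in nodes}
    auths = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Authority update: sum the hub scores of nodes pointing at you.
        new_auths = {n: 0.0 for n in nodes}
        for src, dst in edges:
            new_auths[dst] += hubs[src]
        # Hub update: sum the authority scores of the nodes you point at.
        new_hubs = {n: 0.0 for n in nodes}
        for src, dst in edges:
            new_hubs[src] += new_auths[dst]
        auths = normalise(new_auths)
        hubs = normalise(new_hubs)
    return hubs, auths


# Hypothetical graph: a and b both point to c; a also points to d.
hubs, auths = hits([("a", "c"), ("b", "c"), ("a", "d")])
top_authority = max(auths, key=auths.get)
top_hub = max(hubs, key=hubs.get)
print(top_authority, top_hub)
```

Because scores only flow along links, a tightly interlinked group of nodes keeps feeding score back to itself, which is the clique effect named on the slide as the main weakness.
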
    • Experiments and Results (I) ● Senseval 2 and 3 data sets often used for testing ● Occurrences for Senseval 2 using WordNet 2 ● Occurrences for Senseval 3 using WordNet 2
    • Experiments and Results (II) ● Accuracies on the Senseval 2 and 3 English All Words Task data sets (Tsatsaronis et al)
    • Conclusions ● Recent systems minimise the gap between supervised and unsupervised approaches. ● The graph-based methods make the most of the rich semantic model they employ. ● Unsupervised approaches seek the optimal value for the parameters using as little training data as possible and testing on as large a dataset as possible. ● Future work: implement P-Rank using a different representation, for example Sinha et al.
    • References 1. Tsatsaronis, G., Varlamis, I., Norvag, K.: An Experimental Study on Unsupervised Graph-based Word Sense Disambiguation. In Proc. of CICLing (2010). 2. Sinha, R., Mihalcea, R.: Unsupervised graph-based word sense disambiguation using measures of semantic similarity. In Proc. of ICSC (2007). 3. Mihalcea, R., Csomai, A.: Senselearner: Word sense disambiguation for all words in unrestricted text. In Proc. of ACL, pages 53-56 (2005). 4. Tsatsaronis, G., Vazirgiannis, M., Androutsopoulos, I.: Word Sense Disambiguation with Spreading Activation Networks Generated from Thesauri. In Proc. of IJCAI (2007).
    • Questions?