A Survey on Unsupervised Graph-based Word Sense Disambiguation

Elena-Oana Tăbăranu

Faculty of Computer Science
“Alexandru I. Cuza” University of Iași
{elena.tabaranu@info.uaic.ro}




Abstract. This paper presents comparative evaluations of graph-based word sense disambiguation techniques using several measures of word semantic similarity and several ranking algorithms. Unsupervised word sense disambiguation has received a lot of attention lately because of its fast execution time and its ability to make the most of a small input corpus. Recent state-of-the-art graph-based systems have tried to close the gap between the supervised and the unsupervised approaches.

Key words: WordNet, WSD, Semantic Graphs, SAN, HITS, PageRank, P-Rank



1    Introduction

The problem of word sense disambiguation (WSD) is defined by Sinha et al. [2] as the task of automatically assigning the most appropriate meaning to a polysemous word within a given context.
    WSD methods are critical for solving natural language processing tasks like
machine translation and speech processing, but also boost the performance of
other tasks like text retrieval, document classification and document clustering.
Approaches found in the literature face a trade-off between unsupervised and supervised methods: the former have fast execution times but low accuracy, while the latter require training on a large amount of manually annotated data.
Graph-based methods make the most of the semantic model they employ, thus trying to close the gap between the unsupervised and supervised approaches.
This paper is organized as follows. It first describes the latest state-of-the-art methods for unsupervised graph-based word sense disambiguation. Next, it presents several comparative evaluations carried out on the Senseval data sets using the same semantic representation.

2     State of the Art

2.1     Supervised Word Sense Disambiguation

Supervised word sense disambiguation systems have an accuracy of 60%-70%, while the unsupervised ones range between 45% and 60%. Most approaches transform the sense of a particular word into a feature vector to be used in the learning process. The major disadvantage of such supervised learning methods is the knowledge acquisition bottleneck: their accuracy is strongly tied to the amount of annotated corpora available.
State-of-the-art results include the SenseLearner system of Mihalcea and Csomai [3], which employs seven semantic models trained using a memory-based algorithm, the Simil-Prime1 system, and the results reported by Hoste et al.2
SenseLearner uses a minimally supervised approach: its aim is to process a relatively small data set for training and to generalize the learned concepts as global models for general word categories. SenseLearner takes as input raw text, which is preprocessed before computing the feature vector. Next, a semantic model is learned for all predefined word categories, which are defined as groups of words that share some common syntactic or semantic properties. Once defined and trained, the models are used to annotate the ambiguous words in the test corpus with their corresponding meaning.
The SenseLearner system was trained on the SemCor semantically annotated corpus, and evaluation was done on the Senseval 2 and 3 English All Words data sets, with results of 71.3% and 68.1% respectively. The best supervised results were reported by the SMUaw3 and GAMBL4 systems, winners of the Senseval 2 and 3 All English Words Tasks. The former is based on pattern learning from sense-tagged corpora and instance-based learning with automatic feature selection, while the latter needs extensive training using memory-based classifiers.


2.2     Unsupervised Word Sense Disambiguation

Unsupervised word sense disambiguation systems seek to identify the best sense candidates using a model of the word sense dependencies in text. Such systems combine a metric of semantic similarity, which computes the relatedness between senses, with an algorithm which chooses their most likely combination.
1 Kohomban, U., Lee, W.: Learning semantic classes for word sense disambiguation. In Proc. of ACL, pages 34-41, 2005.
2 Hoste, V., Daelemans, W., Hendrickx, I., van den Bosch, A.: Evaluating the results of the memory-based word-expert approach to unrestricted word sense disambiguation. In Proc. of the ACL Workshop on Word Sense Disambiguation, 2002.
3 Mihalcea, R.: Word sense disambiguation with pattern learning and automatic feature selection. Natural Language Engineering, 1(1):1-15, 2002.
4 Decadt, B., Hoste, V., Daelemans, W., van den Bosch, A.: GAMBL, genetic algorithm optimization for memory-based WSD. In Proc. of Senseval-3: Third International Workshop on the Evaluation of Systems for the Semantic Analysis of Text, 2004.




                    Fig. 1. Semantic model learning in SenseLearner.


Sinha et al. [2] have evaluated six measures of semantic similarity taking as input a pair of concepts from the WordNet5 hierarchy: Leacock & Chodorow6 (lch), Lesk7 (lesk), Wu & Palmer8, Resnik9, Lin10, and Jiang & Conrath11 (jcn). They also use a normalization technique to implement a combination of the similarity measures, which accounts for the strength of each individual metric.
Leacock & Chodorow is a similarity metric computed using equation (1), where length is the length of the shortest path between two concepts using node-counting, and D is the maximum depth of the taxonomy.

    sim_{lch} = -\log \frac{length}{2 \cdot D}    (1)

5 Fellbaum, C.: WordNet: An Electronic Lexical Database. MIT Press, 1998.
6 Leacock, C., Chodorow, M.: Combining local context and WordNet sense similarity for word sense identification. In WordNet, An Electronic Lexical Database. The MIT Press, 1998.
7 Lesk, M.: Automatic sense disambiguation using machine readable dictionaries: How to tell a pine cone from an ice cream cone. In Proceedings of the SIGDOC Conference 1986, Toronto, June 1986.
8 Wu, Z., Palmer, M.: Verb semantics and lexical selection. In Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics, Las Cruces, New Mexico, 1994.
9 Resnik, P.: Using information content to evaluate semantic similarity. In Proceedings of the 14th International Joint Conference on Artificial Intelligence, Montreal, Canada, 1995.
10 Lin, D.: An information-theoretic definition of similarity. In Proceedings of the 15th International Conference on Machine Learning, Madison, WI, 1998.
11 Jiang, J., Conrath, D.: Semantic similarity based on corpus statistics and lexical taxonomy. In Proceedings of the International Conference on Research in Computational Linguistics, Taiwan, 1997.
The metric introduced by Jiang & Conrath uses the least common subsumer (LCS) and combines the information content (IC) of the two input concepts:

    sim_{jcn} = \frac{1}{IC(concept_1) + IC(concept_2) - 2 \cdot IC(LCS)}    (2)

The information content is defined as:

    IC(c) = -\log P(c)    (3)
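As an illustration, both measures are exposed by NLTK's WordNet interface; a minimal sketch, assuming NLTK with the 'wordnet' and 'wordnet_ic' corpora downloaded:

```python
# Sketch: lch (eq. 1) and jcn (eqs. 2-3) similarities via NLTK's WordNet API.
from nltk.corpus import wordnet as wn
from nltk.corpus import wordnet_ic

brown_ic = wordnet_ic.ic('ic-brown.dat')  # P(c) estimated from the Brown corpus

cat, dog = wn.synset('cat.n.01'), wn.synset('dog.n.01')
print('lch:', cat.lch_similarity(dog))            # -log(length / (2 * D))
print('jcn:', cat.jcn_similarity(dog, brown_ic))  # 1 / (IC1 + IC2 - 2 * IC(LCS))
```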

Table 1 shows that a combination of the jcn, lch and lesk measures performs better than using them individually.


Table 1. Results for the individual and combined similarity measures

             jcn    lch    lesk  combined
Precision   51.57  41.47  51.87   53.43
Recall      19.12  16.02  44.97   53.43
F-measure   27.89  23.11  48.17   53.43


Tsatsaronis et al. [1] apply the recently introduced node-similarity measure P-Rank to their graph representation, which does not actually seem to perform better than the other unsupervised methods. They justify the lower results by Navigli and Lapata's12 observations, which also reported lower performance for the betweenness and indegree measures of structural similarity.

2.3   Graph-based Methods

Graph-based methods model the word sense dependencies in text using a graph representation. Senses are represented as labelled nodes in the graph, and weighted edges are added to mark the dependencies among them. Each word has a window associated with it, including several words before and after it, which in turn means that each word has a corresponding graph associated with it; it is that word that gets disambiguated after the ranking algorithms are run on its graph. The node with the highest value is chosen as the most probable sense for that word.
Sinha et al. [2] have noticed a remarkable property that makes these graph-based algorithms appealing: they take into account information drawn from the entire graph, capturing relationships among all the words in a sequence, which makes them superior to other approaches that rely only on local information individually derived for each word.

12 Navigli, R., Lapata, M.: Graph connectivity measures for unsupervised word sense disambiguation. In Proc. of IJCAI, pages 1683-1688, 2007.


2.4   Semantic Graph Construction

Graph-based methods usually associate nodes with each word to be processed. Senses can be represented as labels, and their dependencies are indicated as edge weights. The likelihood of each sense can be determined using a graph-based ranking algorithm, which runs over the graph of potential senses and identifies the ‘best’ one.
Given a sequence of words W = w_1, w_2, w_3, w_4 and their corresponding labels L_{w_i} = l_{w_i}^1, l_{w_i}^2, \ldots, l_{w_i}^{N_{w_i}}, Sinha et al. [2] define a labeled graph G = (V, E) such that there is a node v \in V for every possible label l_{w_i}^j, i = 1..n, j = 1..N_{w_i}. Edges e \in E map the dependencies between pairs of labels.




Fig. 2. Sample semantic representation used by Sinha et al. [2] for a sequence of four words w1, w2, w3, w4 and their corresponding labels.
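A minimal sketch of such a construction, assuming NLTK's WordNet as the sense inventory; build_sense_graph, the window size and the similarity placeholder are illustrative choices, not the authors' implementation:

```python
# Hypothetical sense-graph builder: one node per candidate sense, weighted
# edges between senses of distinct words that fall inside a context window.
from itertools import combinations
from nltk.corpus import wordnet as wn

def build_sense_graph(words, similarity, window=3):
    nodes = [(i, s) for i, w in enumerate(words) for s in wn.synsets(w)]
    edges = {}
    for (i, si), (j, sj) in combinations(nodes, 2):
        if i == j or abs(i - j) > window:
            continue                     # never link senses of the same word
        try:
            weight = similarity(si, sj)  # any metric from Section 2.2
        except Exception:                # some metrics need matching POS
            weight = None
        if weight:
            edges[(i, si), (j, sj)] = weight
    return nodes, edges
```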


Tsatsaronis et al. [1] have used a semantic model which contains only the words that have an entry in the WordNet thesaurus. Their approach first adds all the words and their corresponding senses, represented by WordNet synsets, to the network (Initial Phase). The expansion phase then extends the network iteratively for each word with all the semantically related senses from WordNet (Expansion Round 1) until the network is connected; failing to construct a connected network implies that the words in the sentence cannot be disambiguated. In the next step, weights computed from the frequency of each edge type are added (Expansion Round 2). At some point in the construction phase, some nodes could share the same sense (Expansion Example 2), and in this particular case only one labelled node is added to the network.




Fig. 3. Sample semantic representation used by Tsatsaronis et al. [1] for words t_i and t_j and their corresponding senses.


Other approaches in the literature have used the gloss words of the WordNet entries13, have defined additional composite semantic relations14, or have used the Extended WordNet to enhance their model15.

13 Veronis, J., Ide, N.: Word sense disambiguation with very large neural networks extracted from machine readable dictionaries. In Proc. of COLING, pages 389-394, 1990.
14 Mihalcea, R., Tarau, P., Figa, E.: PageRank on semantic networks with application to word sense disambiguation. In Proc. of COLING, 2004.
15 Agirre, E., Soroa, A.: Personalizing PageRank for word sense disambiguation. In Proc. of EACL, pages 33-41, 2009.

2.5   Spreading of Activation (SAN) Method

The spreading of activation in semantic networks proposed by Tsatsaronis et al. [4] considers all nodes to have an activation level of 0, except for the input nodes, which have a value of 1. At each iteration p, a node j propagates its output activation O_j(p) to its neighbours as a function f of its current activation level A_j(p) and the weights of the edges that connect it with its neighbours.

    O_j(p) = f(A_j(p))    (4)

The activation level of a node k at iteration p is influenced by the output function at iteration p − 1 of every neighbour j with a direct edge e_{jk}; W_{jk} is the weight function for the edges.

    A_k(p) = \sum_j O_j(p-1) \cdot W_{jk}    (5)

The function computing the output activation level must be chosen carefully, since the network can be flooded. Tsatsaronis et al. [1] use the function of equation (6) with a threshold value τ to prevent nodes with a low activation level from influencing their neighbours. The factor 1/(1+p) reduces the influence of a node on its neighbours as iterations go by, while the F_j factor reduces the influence of nodes that connect to many neighbours. This algorithm requires no training.

    O_j(p) = \begin{cases} 0 & \text{if } A_j(p) < \tau \\ \frac{F_j}{p+1} \cdot A_j(p) & \text{otherwise} \end{cases}    (6)

C_T represents the total number of nodes, while C_j is the number of nodes with a direct edge from j.

    F_j = 1 - \frac{C_j}{C_T}    (7)
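A minimal sketch of this iteration over a plain adjacency-list graph, following equations (4)-(7); the data layout, the iteration cap and the clamping of the input nodes at 1.0 are assumptions of this sketch, not details from the paper:

```python
# Hypothetical spreading-of-activation sketch over a weighted directed graph.
# graph: {node: [(neighbour, weight), ...]}; seeds are the input nodes.
def spread_activation(graph, seeds, tau=0.05, iterations=10):
    total = len(graph)                    # C_T, eq. (7)
    fanout = {j: 1 - len(out) / total     # F_j = 1 - C_j / C_T
              for j, out in graph.items()}
    activation = {n: 1.0 if n in seeds else 0.0 for n in graph}
    for p in range(1, iterations + 1):
        # eq. (6): nodes below the threshold tau emit no output activation
        output = {j: 0.0 if a < tau else fanout[j] / (p + 1) * a
                  for j, a in activation.items()}
        new = {n: 0.0 for n in graph}
        for j, out in graph.items():      # eq. (5): A_k(p) = sum O_j(p-1) * W_jk
            for k, w in out:
                new[k] += output[j] * w
        for s in seeds:
            new[s] = 1.0                  # assumption: keep input nodes at 1.0
        activation = new
    return activation
```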


2.6   Page-Rank Method

Page-Rank is a graph ranking algorithm based on the idea of “voting” or “rec-
ommendation”. When one node links to another one, it basically offers a recom-
mendation for that other node. The higher the number of recommendations that
are offered for a node, the higher the importance of the node. Furthermore, the
importance of the node offering the recommendation determines how important
the vote itself is, and this information is also taken into account by the ranking
algorithm.

                                                              P ageRank(Vb )
             P ageRank(Va ) = (1 − d) + d                                        (8)
                                                               |degree(Vb )|
                                                (Va ,Vb )∈E

Sinha et al. [2] have used the Page-Rank algorithm to recursively score the candidate nodes of a weighted undirected graph. V_a and V_b are two nodes in the graph connected by edges with weight w_{ba}, and the Page-Rank score is computed based on the following equation:

    PageRank(V_a) = (1 - d) + d \sum_{(V_a,V_b) \in E} \frac{w_{ba}}{\sum_{(V_c,V_b) \in E} w_{bc}} \, PageRank(V_b)    (9)
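A minimal sketch of the weighted Page-Rank of equation (9) over an adjacency-list graph; the damping factor d = 0.85 and the fixed iteration count are conventional defaults, not values taken from the paper:

```python
# Hypothetical weighted Page-Rank sketch for an undirected weighted graph.
# edges: {node: [(neighbour, weight), ...]}, each edge listed from both ends.
def weighted_pagerank(edges, d=0.85, iterations=30):
    score = {v: 1.0 for v in edges}
    # denominator of eq. (9): total weight incident to each neighbour b
    total_weight = {v: sum(w for _, w in nbrs) for v, nbrs in edges.items()}
    for _ in range(iterations):
        score = {a: (1 - d) + d * sum(w / total_weight[b] * score[b]
                                      for b, w in edges[a])
                 for a in edges}
    return score
```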



2.7     HITS Method

Tsatsaronis et al. [1] use the same semantic representation for the HITS ranking algorithm. This approach identifies the most important nodes in the graph, also known as authorities, and the nodes that point to this kind of nodes, also known as hubs. The major disadvantage of the HITS algorithm is that densely connected nodes can attract the highest scores (clique attack). Every node has attached a pair of values for its authority and hub score, with initial values set to 1. Hubs and authorities are iteratively updated using equations (10) and (11).

    authority(p) = \sum_{q \in In(p)} hub(q)    (10)

    hub(p) = \sum_{r \in Out(p)} authority(r)    (11)

In(i) denotes all the nodes that link to i, and Out(i) all the nodes i links to. Equations (10) and (11) are extended with weights for the graph edges: in equations (12) and (13), w_{i,j} is the weight of the edge connecting node i with node j.

    authority(p) = \sum_{q \in In(p)} w_{q,p} \cdot hub(q)    (12)

    hub(p) = \sum_{r \in Out(p)} w_{p,r} \cdot authority(r)    (13)

The scores are normalized by dividing each authority value by the sum of all authority values, and each hub value by the sum of all hub values. The sense with the highest authority score is chosen as the most likely one for each word.
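A minimal sketch of the weighted, normalized HITS iteration of equations (12) and (13); the adjacency-list layout and the fixed iteration count are assumptions of this sketch:

```python
# Hypothetical weighted HITS sketch. Both dicts must cover the same node set:
# in_edges[p] = [(q, w_qp), ...] and out_edges[p] = [(r, w_pr), ...].
def weighted_hits(in_edges, out_edges, iterations=30):
    auth = {p: 1.0 for p in in_edges}
    hub = {p: 1.0 for p in in_edges}
    for _ in range(iterations):
        auth = {p: sum(w * hub[q] for q, w in in_edges[p])
                for p in in_edges}                           # eq. (12)
        hub = {p: sum(w * auth[r] for r, w in out_edges[p])
               for p in out_edges}                           # eq. (13)
        a_sum, h_sum = sum(auth.values()), sum(hub.values())
        auth = {p: a / a_sum for p, a in auth.items()}       # normalize authorities
        hub = {p: h / h_sum for p, h in hub.items()}         # normalize hubs
    return auth, hub
```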


2.8     P-Rank Method

The P-Rank measure16 is a recently introduced method for the structural similarity of nodes in an information network and represents a generalization of other state-of-the-art measures like CoCitation17, Coupling18, Amsler19 and SimRank20. P-Rank is based on the idea that two nodes are similar if they are referenced by, and also reference, similar nodes. R_{k+1}(a, b) represents the P-Rank score for nodes a and b at iteration k + 1 and is computed with the recursive equation:

    R_{k+1}(a,b) = \lambda \cdot \frac{C}{|I(a)||I(b)|} \sum_{i=1}^{|I(a)|} \sum_{j=1}^{|I(b)|} R_k(I_i(a), I_j(b)) + (1-\lambda) \cdot \frac{C}{|O(a)||O(b)|} \sum_{i=1}^{|O(a)|} \sum_{j=1}^{|O(b)|} R_k(O_i(a), O_j(b))

In equations (14) and (15), Incoming(a) and Outgoing(a) are the lists of incoming and outgoing neighbours of node a, and the definitions of |I(a)| and |O(a)| take into consideration the weights of all the edges that connect the neighbours of node a. The parameter λ ∈ [0, 1] balances the weight of the in- and out-link directions; the value Tsatsaronis et al. [1] have chosen for their experiments is 0.5. C ∈ [0, 1] is a damping factor for the in- and out-link directions with a usual value of 0.8.

16 Zhao, P., Han, J., Sun, Y.: P-Rank: a comprehensive structural similarity measure over information networks. In Proc. of CIKM, pages 553-562, 2009.

    |I(a)| = \sum_{i \in Incoming(a)} w_{i,a}    (14)

    |O(a)| = \sum_{j \in Outgoing(a)} w_{a,j}    (15)
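A minimal sketch of the P-Rank recursion; for brevity it uses plain neighbour counts rather than the weighted definitions of equations (14) and (15), and the iteration count is an arbitrary choice:

```python
# Hypothetical unweighted P-Rank sketch. inc/out map every node to its
# incoming/outgoing neighbour lists; R_0 is the identity relation.
from itertools import product

def p_rank(nodes, inc, out, lam=0.5, C=0.8, iterations=5):
    R = {(a, b): 1.0 if a == b else 0.0 for a in nodes for b in nodes}
    for _ in range(iterations):
        new = {}
        for a, b in R:
            if a == b:
                new[(a, b)] = 1.0   # a node is maximally similar to itself
                continue
            s_in = s_out = 0.0
            if inc[a] and inc[b]:   # in-link term, weighted by lambda
                s_in = C / (len(inc[a]) * len(inc[b])) * sum(
                    R[(i, j)] for i, j in product(inc[a], inc[b]))
            if out[a] and out[b]:   # out-link term, weighted by 1 - lambda
                s_out = C / (len(out[a]) * len(out[b])) * sum(
                    R[(i, j)] for i, j in product(out[a], out[b]))
            new[(a, b)] = lam * s_in + (1 - lam) * s_out
        R = new
    return R
```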



3      Experiments and Results

The Senseval 2 and 3 All English Words Task data sets are often used for testing WSD systems, since they are manually annotated by human experts. Tables 2 and 3 present the statistics of the data sets for nouns (N), verbs (V), adjectives (Adj), adverbs (Adv) and all words, computed considering their senses from the WordNet 2 thesaurus. Verbs are the most difficult to disambiguate, with an average polysemy close to 11, while adverbs have an average polysemy close to 1.
17 Small, H. G.: Co-citation in the scientific literature: A new measure of relationship between two documents. Journal of the American Society for Information Science, 24(4):265-269, 1973.
18 Kessler, M. M.: Bibliographic coupling between scientific papers. American Documentation, 14(1):10-25, 1963.
19 Amsler, R.: Application of citation-based automatic classification. Technical report, The University of Texas at Austin, Linguistics Research Center, Austin, TX, 1972.
20 Jeh, G., Widom, J.: SimRank: A measure of structural-context similarity. In Proc. of KDD, pages 538-543, 2002.

Table 2. Polysemous and monosemous occurrences for the Senseval 2 words using WordNet 2

                              N      V    Adj    Adv    All
Monosemous                  260     33     80     91    464
Polysemous                  813    502    352    172   1839
Average polysemy           4.21    9.9   3.94   3.23   5.37
Average polysemy (P. only) 5.24  10.48   4.61   4.41   6.48

Table 3. Polysemous and monosemous occurrences for the Senseval 3 words using WordNet 2

                              N      V    Adj    Adv    All
Monosemous                  193     39     72     13    317
Polysemous                  699    686    276      1   1662
Average polysemy           5.07  11.49   4.13   1.07   7.23
Average polysemy (P. only) 6.19  12.08   4.95    2.0   8.41



A baseline was computed by selecting a random sense from WordNet. Other, supervised, systems have used as a baseline the most frequent sense in the thesaurus.
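This baseline is straightforward to reproduce; a minimal sketch, assuming NLTK's WordNet interface:

```python
# Hypothetical random-sense baseline: pick a uniformly random WordNet sense.
import random
from nltk.corpus import wordnet as wn

def random_sense_baseline(word, pos=None):
    senses = wn.synsets(word, pos=pos)
    return random.choice(senses) if senses else None
```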
Table 4 presents a comparison between different WSD results, independently of the type of method used. The top three unsupervised methods, PR, HITS and the method of Agirre and Soroa, are compared with the highest results reported in the literature for the Senseval 2 and 3 data sets. The best performing method is the supervised approach Simil-Prime, with an overall accuracy of 65%. The results table shows that, though the unsupervised systems do not perform as well as the supervised ones, they have indeed reduced the gap between the two approaches.


Table 4. Accuracies on the Senseval 2 and 3 All English Words Task data sets

            SenseLearner  Simil-Prime   SSI    WE    FS    PR   HITS  Agi09
Senseval 2     64.82         65.00      n/a   63.2  63.7  58.8  58.3   59.5
Senseval 3     63.01         65.85      60.4   n/a  61.3  56.7  57.4   57.4




4     Conclusions

Recent state-of-the-art WSD systems minimise the gap between supervised and unsupervised approaches. This paper describes several graph-based methods which make the most of the rich semantic model they employ. Unsupervised systems also have the advantage of seeking the optimal parameter values using as little data as possible, while testing on as large a data set as possible. Future work could investigate the results of the recently introduced P-Rank algorithm on a different model, like the one proposed by Sinha et al. [2]; this way, the influence of the model on each algorithm's results could be investigated.


5   References

1. Tsatsaronis, G., Varlamis, I., Norvag, K.: An Experimental Study on Unsupervised Graph-based Word Sense Disambiguation. In Proc. of CICLing (2010).
2. Sinha, R., Mihalcea, R.: Unsupervised graph-based word sense disambiguation using measures of semantic similarity. In Proc. of ICSC (2007).
3. Mihalcea, R., Csomai, A.: SenseLearner: Word sense disambiguation for all words in unrestricted text. In Proc. of ACL, pages 53-56 (2005).
4. Tsatsaronis, G., Vazirgiannis, M., Androutsopoulos, I.: Word Sense Disambiguation with Spreading Activation Networks Generated from Thesauri. In Proc. of IJCAI (2007).

More Related Content

What's hot

A Data-driven Method for the Detection of Close Submitters in Online Learning...
A Data-driven Method for the Detection of Close Submitters in Online Learning...A Data-driven Method for the Detection of Close Submitters in Online Learning...
A Data-driven Method for the Detection of Close Submitters in Online Learning...MIT
 
Deep Neural Networks in Text Classification using Active Learning
Deep Neural Networks in Text Classification using Active LearningDeep Neural Networks in Text Classification using Active Learning
Deep Neural Networks in Text Classification using Active LearningMirsaeid Abolghasemi
 
Attention scores and mechanisms
Attention scores and mechanismsAttention scores and mechanisms
Attention scores and mechanismsJaeHo Jang
 
EXTENDING OUTPUT ATTENTIONS IN RECURRENT NEURAL NETWORKS FOR DIALOG GENERATION
EXTENDING OUTPUT ATTENTIONS IN RECURRENT NEURAL NETWORKS FOR DIALOG GENERATIONEXTENDING OUTPUT ATTENTIONS IN RECURRENT NEURAL NETWORKS FOR DIALOG GENERATION
EXTENDING OUTPUT ATTENTIONS IN RECURRENT NEURAL NETWORKS FOR DIALOG GENERATIONijaia
 
Paper id 28201441
Paper id 28201441Paper id 28201441
Paper id 28201441IJRAT
 
Intelligent information extraction based on artificial neural network
Intelligent information extraction based on artificial neural networkIntelligent information extraction based on artificial neural network
Intelligent information extraction based on artificial neural networkijfcstjournal
 
Non-parametric Subject Prediction
Non-parametric Subject PredictionNon-parametric Subject Prediction
Non-parametric Subject PredictionShenghui Wang
 
The effect of number of concepts on readability of schemas 2
The effect of number of concepts on readability of schemas 2The effect of number of concepts on readability of schemas 2
The effect of number of concepts on readability of schemas 2Saman Sara
 
A SURVEY ON QUESTION ANSWERING SYSTEMS: THE ADVANCES OF FUZZY LOGIC
A SURVEY ON QUESTION ANSWERING SYSTEMS: THE ADVANCES OF FUZZY LOGICA SURVEY ON QUESTION ANSWERING SYSTEMS: THE ADVANCES OF FUZZY LOGIC
A SURVEY ON QUESTION ANSWERING SYSTEMS: THE ADVANCES OF FUZZY LOGICcscpconf
 
Unit 1 Introduction to Data Compression
Unit 1 Introduction to Data CompressionUnit 1 Introduction to Data Compression
Unit 1 Introduction to Data CompressionDr Piyush Charan
 
Dms introduction Sharmila Chidaravalli
Dms introduction Sharmila ChidaravalliDms introduction Sharmila Chidaravalli
Dms introduction Sharmila ChidaravalliSharmilaChidaravalli
 
Data Tactics Analytics Brown Bag (Aug 22, 2013)
Data Tactics Analytics Brown Bag (Aug 22, 2013)Data Tactics Analytics Brown Bag (Aug 22, 2013)
Data Tactics Analytics Brown Bag (Aug 22, 2013)Rich Heimann
 
Discrete Structure
Discrete Structure Discrete Structure
Discrete Structure Syed Shah
 
Review: Semi-Supervised Learning Methods for Word Sense Disambiguation
Review: Semi-Supervised Learning Methods for Word Sense DisambiguationReview: Semi-Supervised Learning Methods for Word Sense Disambiguation
Review: Semi-Supervised Learning Methods for Word Sense DisambiguationIOSR Journals
 
76 s201906
76 s20190676 s201906
76 s201906IJRAT
 
Data Tactics Data Science Brown Bag (April 2014)
Data Tactics Data Science Brown Bag (April 2014)Data Tactics Data Science Brown Bag (April 2014)
Data Tactics Data Science Brown Bag (April 2014)Rich Heimann
 
Stock markets and_human_genomics
Stock markets and_human_genomicsStock markets and_human_genomics
Stock markets and_human_genomicsShyam Sarkar
 

What's hot (20)

A Data-driven Method for the Detection of Close Submitters in Online Learning...
A Data-driven Method for the Detection of Close Submitters in Online Learning...A Data-driven Method for the Detection of Close Submitters in Online Learning...
A Data-driven Method for the Detection of Close Submitters in Online Learning...
 
Using Knowledge Graph for Promoting Cognitive Computing
Using Knowledge Graph for Promoting Cognitive ComputingUsing Knowledge Graph for Promoting Cognitive Computing
Using Knowledge Graph for Promoting Cognitive Computing
 
Deep Neural Networks in Text Classification using Active Learning
Deep Neural Networks in Text Classification using Active LearningDeep Neural Networks in Text Classification using Active Learning
Deep Neural Networks in Text Classification using Active Learning
 
Attention scores and mechanisms
Attention scores and mechanismsAttention scores and mechanisms
Attention scores and mechanisms
 
EXTENDING OUTPUT ATTENTIONS IN RECURRENT NEURAL NETWORKS FOR DIALOG GENERATION
EXTENDING OUTPUT ATTENTIONS IN RECURRENT NEURAL NETWORKS FOR DIALOG GENERATIONEXTENDING OUTPUT ATTENTIONS IN RECURRENT NEURAL NETWORKS FOR DIALOG GENERATION
EXTENDING OUTPUT ATTENTIONS IN RECURRENT NEURAL NETWORKS FOR DIALOG GENERATION
 
Paper id 28201441
Paper id 28201441Paper id 28201441
Paper id 28201441
 
Intelligent information extraction based on artificial neural network
Intelligent information extraction based on artificial neural networkIntelligent information extraction based on artificial neural network
Intelligent information extraction based on artificial neural network
 
Non-parametric Subject Prediction
Non-parametric Subject PredictionNon-parametric Subject Prediction
Non-parametric Subject Prediction
 
The effect of number of concepts on readability of schemas 2
The effect of number of concepts on readability of schemas 2The effect of number of concepts on readability of schemas 2
The effect of number of concepts on readability of schemas 2
 
IFITT PhD Seminar 2015. Text Mining Ideas & Examples
IFITT PhD Seminar 2015. Text Mining Ideas & ExamplesIFITT PhD Seminar 2015. Text Mining Ideas & Examples
IFITT PhD Seminar 2015. Text Mining Ideas & Examples
 
A SURVEY ON QUESTION ANSWERING SYSTEMS: THE ADVANCES OF FUZZY LOGIC
A SURVEY ON QUESTION ANSWERING SYSTEMS: THE ADVANCES OF FUZZY LOGICA SURVEY ON QUESTION ANSWERING SYSTEMS: THE ADVANCES OF FUZZY LOGIC
A SURVEY ON QUESTION ANSWERING SYSTEMS: THE ADVANCES OF FUZZY LOGIC
 
Unit 1 Introduction to Data Compression
Unit 1 Introduction to Data CompressionUnit 1 Introduction to Data Compression
Unit 1 Introduction to Data Compression
 
Dms introduction Sharmila Chidaravalli
Dms introduction Sharmila ChidaravalliDms introduction Sharmila Chidaravalli
Dms introduction Sharmila Chidaravalli
 
Data Tactics Analytics Brown Bag (Aug 22, 2013)
Data Tactics Analytics Brown Bag (Aug 22, 2013)Data Tactics Analytics Brown Bag (Aug 22, 2013)
Data Tactics Analytics Brown Bag (Aug 22, 2013)
 
Discrete Structure
Discrete Structure Discrete Structure
Discrete Structure
 
Review: Semi-Supervised Learning Methods for Word Sense Disambiguation
Review: Semi-Supervised Learning Methods for Word Sense DisambiguationReview: Semi-Supervised Learning Methods for Word Sense Disambiguation
Review: Semi-Supervised Learning Methods for Word Sense Disambiguation
 
76 s201906
76 s20190676 s201906
76 s201906
 
Data Tactics Data Science Brown Bag (April 2014)
Data Tactics Data Science Brown Bag (April 2014)Data Tactics Data Science Brown Bag (April 2014)
Data Tactics Data Science Brown Bag (April 2014)
 
228-SE3001_2
228-SE3001_2228-SE3001_2
228-SE3001_2
 
Stock markets and_human_genomics
Stock markets and_human_genomicsStock markets and_human_genomics
Stock markets and_human_genomics
 

Similar to A Survey on Unsupervised Graph-based Word Sense Disambiguation

Computational Intelligence Methods for Clustering of Sense Tagged Nepali Docu...
Computational Intelligence Methods for Clustering of Sense Tagged Nepali Docu...Computational Intelligence Methods for Clustering of Sense Tagged Nepali Docu...
Computational Intelligence Methods for Clustering of Sense Tagged Nepali Docu...IOSR Journals
 
A COMPARISON OF DOCUMENT SIMILARITY ALGORITHMS
A COMPARISON OF DOCUMENT SIMILARITY ALGORITHMSA COMPARISON OF DOCUMENT SIMILARITY ALGORITHMS
A COMPARISON OF DOCUMENT SIMILARITY ALGORITHMSgerogepatton
 
A COMPARISON OF DOCUMENT SIMILARITY ALGORITHMS
A COMPARISON OF DOCUMENT SIMILARITY ALGORITHMSA COMPARISON OF DOCUMENT SIMILARITY ALGORITHMS
A COMPARISON OF DOCUMENT SIMILARITY ALGORITHMSgerogepatton
 
Usage of word sense disambiguation in concept identification in ontology cons...
Usage of word sense disambiguation in concept identification in ontology cons...Usage of word sense disambiguation in concept identification in ontology cons...
Usage of word sense disambiguation in concept identification in ontology cons...Innovation Quotient Pvt Ltd
 
Automated Essay Scoring Using Generalized Latent Semantic Analysis
Automated Essay Scoring Using Generalized Latent Semantic AnalysisAutomated Essay Scoring Using Generalized Latent Semantic Analysis
Automated Essay Scoring Using Generalized Latent Semantic AnalysisGina Rizzo
 
Sentence Validation by Statistical Language Modeling and Semantic Relations
Sentence Validation by Statistical Language Modeling and Semantic RelationsSentence Validation by Statistical Language Modeling and Semantic Relations
Sentence Validation by Statistical Language Modeling and Semantic RelationsEditor IJCATR
 
An Enhanced Suffix Tree Approach to Measure Semantic Similarity between Multi...
An Enhanced Suffix Tree Approach to Measure Semantic Similarity between Multi...An Enhanced Suffix Tree Approach to Measure Semantic Similarity between Multi...
An Enhanced Suffix Tree Approach to Measure Semantic Similarity between Multi...iosrjce
 
IDENTIFYING THE SEMANTIC RELATIONS ON UNSTRUCTURED DATA
IDENTIFYING THE SEMANTIC RELATIONS ON UNSTRUCTURED DATAIDENTIFYING THE SEMANTIC RELATIONS ON UNSTRUCTURED DATA
IDENTIFYING THE SEMANTIC RELATIONS ON UNSTRUCTURED DATAijistjournal
 
Identifying the semantic relations on
Identifying the semantic relations onIdentifying the semantic relations on
Identifying the semantic relations onijistjournal
 
Context Sensitive Relatedness Measure of Word Pairs
Context Sensitive Relatedness Measure of Word PairsContext Sensitive Relatedness Measure of Word Pairs
Context Sensitive Relatedness Measure of Word PairsIJCSIS Research Publications
 
IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...
IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...
IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...IJERD Editor
 
A SURVEY ON SIMILARITY MEASURES IN TEXT MINING
A SURVEY ON SIMILARITY MEASURES IN TEXT MINING A SURVEY ON SIMILARITY MEASURES IN TEXT MINING
A SURVEY ON SIMILARITY MEASURES IN TEXT MINING mlaij
 
NAMED ENTITY RECOGNITION IN TURKISH USING ASSOCIATION MEASURES
NAMED ENTITY RECOGNITION IN TURKISH USING ASSOCIATION MEASURESNAMED ENTITY RECOGNITION IN TURKISH USING ASSOCIATION MEASURES
NAMED ENTITY RECOGNITION IN TURKISH USING ASSOCIATION MEASURESacijjournal
 
SEMANTIC INTEGRATION FOR AUTOMATIC ONTOLOGY MAPPING
SEMANTIC INTEGRATION FOR AUTOMATIC ONTOLOGY MAPPING SEMANTIC INTEGRATION FOR AUTOMATIC ONTOLOGY MAPPING
SEMANTIC INTEGRATION FOR AUTOMATIC ONTOLOGY MAPPING cscpconf
 
A SYSTEM OF SERIAL COMPUTATION FOR CLASSIFIED RULES PREDICTION IN NONREGULAR ...
A SYSTEM OF SERIAL COMPUTATION FOR CLASSIFIED RULES PREDICTION IN NONREGULAR ...A SYSTEM OF SERIAL COMPUTATION FOR CLASSIFIED RULES PREDICTION IN NONREGULAR ...
A SYSTEM OF SERIAL COMPUTATION FOR CLASSIFIED RULES PREDICTION IN NONREGULAR ...ijaia
 

Similar to A Survey on Unsupervised Graph-based Word Sense Disambiguation (20)

Computational Intelligence Methods for Clustering of Sense Tagged Nepali Docu...
Computational Intelligence Methods for Clustering of Sense Tagged Nepali Docu...Computational Intelligence Methods for Clustering of Sense Tagged Nepali Docu...
Computational Intelligence Methods for Clustering of Sense Tagged Nepali Docu...
 
L017158389
L017158389L017158389
L017158389
 
A COMPARISON OF DOCUMENT SIMILARITY ALGORITHMS
A COMPARISON OF DOCUMENT SIMILARITY ALGORITHMSA COMPARISON OF DOCUMENT SIMILARITY ALGORITHMS
A COMPARISON OF DOCUMENT SIMILARITY ALGORITHMS
 
A COMPARISON OF DOCUMENT SIMILARITY ALGORITHMS
A COMPARISON OF DOCUMENT SIMILARITY ALGORITHMSA COMPARISON OF DOCUMENT SIMILARITY ALGORITHMS
A COMPARISON OF DOCUMENT SIMILARITY ALGORITHMS
 
Usage of word sense disambiguation in concept identification in ontology cons...
Usage of word sense disambiguation in concept identification in ontology cons...Usage of word sense disambiguation in concept identification in ontology cons...
Usage of word sense disambiguation in concept identification in ontology cons...
 
Automated Essay Scoring Using Generalized Latent Semantic Analysis
Automated Essay Scoring Using Generalized Latent Semantic AnalysisAutomated Essay Scoring Using Generalized Latent Semantic Analysis
Automated Essay Scoring Using Generalized Latent Semantic Analysis
 
Sentence Validation by Statistical Language Modeling and Semantic Relations
Sentence Validation by Statistical Language Modeling and Semantic RelationsSentence Validation by Statistical Language Modeling and Semantic Relations
Sentence Validation by Statistical Language Modeling and Semantic Relations
 
F017243241
F017243241F017243241
F017243241
 
An Enhanced Suffix Tree Approach to Measure Semantic Similarity between Multi...
An Enhanced Suffix Tree Approach to Measure Semantic Similarity between Multi...An Enhanced Suffix Tree Approach to Measure Semantic Similarity between Multi...
An Enhanced Suffix Tree Approach to Measure Semantic Similarity between Multi...
 
IDENTIFYING THE SEMANTIC RELATIONS ON UNSTRUCTURED DATA
IDENTIFYING THE SEMANTIC RELATIONS ON UNSTRUCTURED DATAIDENTIFYING THE SEMANTIC RELATIONS ON UNSTRUCTURED DATA
IDENTIFYING THE SEMANTIC RELATIONS ON UNSTRUCTURED DATA
 
Identifying the semantic relations on
Identifying the semantic relations onIdentifying the semantic relations on
Identifying the semantic relations on
 
Context Sensitive Relatedness Measure of Word Pairs
Context Sensitive Relatedness Measure of Word PairsContext Sensitive Relatedness Measure of Word Pairs
Context Sensitive Relatedness Measure of Word Pairs
 
IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...
IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...
IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...
 
call for papers, research paper publishing, where to publish research paper, ...
call for papers, research paper publishing, where to publish research paper, ...call for papers, research paper publishing, where to publish research paper, ...
call for papers, research paper publishing, where to publish research paper, ...
 
A SURVEY ON SIMILARITY MEASURES IN TEXT MINING
A SURVEY ON SIMILARITY MEASURES IN TEXT MINING A SURVEY ON SIMILARITY MEASURES IN TEXT MINING
A SURVEY ON SIMILARITY MEASURES IN TEXT MINING
 
mlss
mlssmlss
mlss
 
NAMED ENTITY RECOGNITION IN TURKISH USING ASSOCIATION MEASURES
NAMED ENTITY RECOGNITION IN TURKISH USING ASSOCIATION MEASURESNAMED ENTITY RECOGNITION IN TURKISH USING ASSOCIATION MEASURES
NAMED ENTITY RECOGNITION IN TURKISH USING ASSOCIATION MEASURES
 
SEMANTIC INTEGRATION FOR AUTOMATIC ONTOLOGY MAPPING
SEMANTIC INTEGRATION FOR AUTOMATIC ONTOLOGY MAPPING SEMANTIC INTEGRATION FOR AUTOMATIC ONTOLOGY MAPPING
SEMANTIC INTEGRATION FOR AUTOMATIC ONTOLOGY MAPPING
 
A SYSTEM OF SERIAL COMPUTATION FOR CLASSIFIED RULES PREDICTION IN NONREGULAR ...
A SYSTEM OF SERIAL COMPUTATION FOR CLASSIFIED RULES PREDICTION IN NONREGULAR ...A SYSTEM OF SERIAL COMPUTATION FOR CLASSIFIED RULES PREDICTION IN NONREGULAR ...
A SYSTEM OF SERIAL COMPUTATION FOR CLASSIFIED RULES PREDICTION IN NONREGULAR ...
 
P13 corley
P13 corleyP13 corley
P13 corley
 

More from Elena-Oana Tabaranu

Recunoasterea organizatiilor in postarile pe Tweeter
Recunoasterea organizatiilor in postarile pe TweeterRecunoasterea organizatiilor in postarile pe Tweeter
Recunoasterea organizatiilor in postarile pe TweeterElena-Oana Tabaranu
 
SXSW 2012 JavaScript MythBusters
SXSW 2012 JavaScript MythBustersSXSW 2012 JavaScript MythBusters
SXSW 2012 JavaScript MythBustersElena-Oana Tabaranu
 
Semantic Tagging for the XWiki Platform with Zemanta and DBpedia
Semantic Tagging for the XWiki Platform with Zemanta and DBpediaSemantic Tagging for the XWiki Platform with Zemanta and DBpedia
Semantic Tagging for the XWiki Platform with Zemanta and DBpediaElena-Oana Tabaranu
 
Miscarea "NoSQL" in contextul Web-ului social/semantic
Miscarea "NoSQL" in contextul Web-ului social/semanticMiscarea "NoSQL" in contextul Web-ului social/semantic
Miscarea "NoSQL" in contextul Web-ului social/semanticElena-Oana Tabaranu
 
Folosirea instumentului Zemanta in recomandarea de continut
Folosirea instumentului Zemanta in recomandarea de continutFolosirea instumentului Zemanta in recomandarea de continut
Folosirea instumentului Zemanta in recomandarea de continutElena-Oana Tabaranu
 

More from Elena-Oana Tabaranu (7)

Recunoasterea organizatiilor in postarile pe Tweeter
Recunoasterea organizatiilor in postarile pe TweeterRecunoasterea organizatiilor in postarile pe Tweeter
Recunoasterea organizatiilor in postarile pe Tweeter
 
SXSW 2012 JavaScript MythBusters
SXSW 2012 JavaScript MythBustersSXSW 2012 JavaScript MythBusters
SXSW 2012 JavaScript MythBusters
 
Notes on a Standard: Unicode
Notes on a Standard: UnicodeNotes on a Standard: Unicode
Notes on a Standard: Unicode
 
Semantic Tagging for the XWiki Platform with Zemanta and DBpedia
Semantic Tagging for the XWiki Platform with Zemanta and DBpediaSemantic Tagging for the XWiki Platform with Zemanta and DBpedia
Semantic Tagging for the XWiki Platform with Zemanta and DBpedia
 
Miscarea "NoSQL" in contextul Web-ului social/semantic
Miscarea "NoSQL" in contextul Web-ului social/semanticMiscarea "NoSQL" in contextul Web-ului social/semantic
Miscarea "NoSQL" in contextul Web-ului social/semantic
 
Folosirea instumentului Zemanta in recomandarea de continut
Folosirea instumentului Zemanta in recomandarea de continutFolosirea instumentului Zemanta in recomandarea de continut
Folosirea instumentului Zemanta in recomandarea de continut
 
Adobe Flex Framework
Adobe Flex FrameworkAdobe Flex Framework
Adobe Flex Framework
 

Recently uploaded

Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESmohitsingh558521
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxBkGupta21
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 

Recently uploaded (20)

Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptx
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 

A Survey on Unsupervised Graph-based Word Sense Disambiguation

  • 1. A Survey on Unsupervised Graph-based Word Sense Disambiguation Elena-Oana T˘b˘ranu a a Faculty of Computer Science “Alexandru I. Cuza” University of Ia¸i s {elena.tabaranu@info.uaic.ro} Abstract. This paper presents comparative evaluations of graph based word sense disambiguation techniques using several measures of word semantic similarity and several ranking algorithms. Unsupervised word sense disambiguation has received a lot of attention lately because of it’s fast execution time and it’s ability to make the most of a small input corpus. Recent state of the art graph based systems have tried to close the gap between the supervised and the unsupervised approaches. Key words: WordNet, WSD, Semantic Graphs, SAN, HITS, PageR- ank, P-Rank 1 Introduction The problem of word sense disambiguation (WSD) is defined by Sinha et al[2] as the task of automatically assigning the most appropriate meaning to a poly- semous word within a given context. WSD methods are critical for solving natural language processing tasks like machine translation and speech processing, but also boost the performance of other tasks like text retrieval, document classification and document clustering. Approaches found in the bibliography face the trade off between unsupervised and supervised methods: the first one has fast execution time, but low accuracy and the second one requires training in a large amount of manually annotated data. The graph based methods make the most of the semantic model they em- ploy, thus trying to close the gap between the unsupervised and supervised ap- proaches. This paper is organized as follows. It describes the latest state-of the art methods for unsupervised graph-based word sense disambiguation. Next, this paper presents several comparative evaluations carried on the Senseval data sets using the same semantic representation.
  • 2. 2 A Survey on Unsupervised Graph-based Word Sense Disambiguation 2 State of the Art 2.1 Supervised Word Sense Disambiguation Supervised word sense disambiguation systems have an accuracy of 60%-70% while the unsupervised ones struggle between 45% and 60%. Most approaches transform the sense of a particular word into a feature vector to be used in the learning process. The major disadvantage of using such supervised learning methods emerges from the knowledge acquisition bottleneck problem because their accuracy is strongly connected to the amount of annotated corpus available. State of the art results include Mihalcea and Csomai[3]’s SenseLearner which employs seven semantic models trained using a memory based algorithm, the Simil-Prime1 system and the results reported by Hoste et al.2 The SenseLearner uses a minimal supervised approach because it’s aim is to process a relatively small data set for training and also generalize the learned concepts as global models for general word categories. SenseLearner takes as input raw text which is preprocessed before computing the feature vector. Next, a semantic model is learned for all predefined word categories, which are defined as groups of words that share some common syntactic or semantic properties. Once defined and trained, the models are used to annotate the ambiguous words in the test corpus with their corresponding meaning. Training the SenseLearner system used the SemCor semantically annotated dataset and evaluation was done with Senseval 2 and 3 English All Words data sets with results of 71.3% and 68.1% respectively. The best supervised results were reported by SMUaw3 and GAMBL4 systems as winners of the Senseval 2 and 3 All English Words Task. The former is based on pattern learning from sense tagged corpora and instance based learning with automatic feature selection, while the latter needs extensive training using memory based classifiers. 2.2 Unsupervised Word Sense Disambiguation Unsupervised Word Sense Disambiguation systems seek to identify the best sense candidate for a model of the word sense dependency in text. Such systems use a metric of semantic similarity to compute the relatedness between the senses and an algorithm which chooses their most likely combination. 1 Kohomban, U., Lee, W.: Learning semantic classes for word sense disambiguation. In Proc. of ACL, pages 34-41, 2005. 2 Hoste, V., Daelemens, W., Hendrickx, I., van den Bosch, A.: Evaluating the results of the memory-based word-expert approach to unrestricted word sense disambiguation. In Proc. of the ACL Workshop on Word Sense Disambiguation, 2002. 3 Mihalcea, R.: Word sense disambiguation with pattern learning and automatic fea- ture selection. Natural Language Engineering, 1(1):1-15, 2002. 4 Decadt, B., Hoste, V., Daelemens, W., van den Bosch, A.: GAMBL, genetic algo- rithm optimization for memory-based wsd. In Proc. of the Senseval3: Third Inter- national Workshop on the Evaluation of Systems for the Semantic Analysis of Text, 2004.
  • 3. A Survey on Unsupervised Graph-based Word Sense Disambiguation 3 Fig. 1. Semantic model learning in SenseLearner. Sinha et al.[2] have evaluated six methods of semantic similarity assuming as input a pair of concepts from the WordNet5 hierarchy: Leacock & Chodorow6 (lcj), Lesk7 (lesk), Wu & Palmer8 , Resnik9 , Lin10 , and Jiang & Conrath11 (jcn). They also use a normalization technique to implement a combination of the similarity measures, which accounts for the strength of each individual metric. Leacock & Chodorow is a similarity metric computed using the equation (1) where the length is the length of the shortest path between two concepts using 5 Fellbaum, C.: WordNet an electronic lexical database. MIT Press, 1998. 6 Leacock, C., Chodorow, M.: Combining local context and WordNet sense similarity for word sense identification. In WordNet, An Electronic Lexical Database. The MIT Press, 1998 7 Lesk, M.: Automatic sense disambiguation using machine readable dictionaries: How to tell a pine cone from an ice cream cone. In Proceedings of the SIGDOC Conference 1986, Toronto, June 1986. 8 Wu, Z., Palmer, M.: Verb semantics and lexical selection.In Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics, Las Cruces, New Mexico, 1994. 9 Resnik, P.: Using information content to evaluate semantic similarity. In Proceed- ings of the 14th International Joint Conference on Artificial Intelligence, Montreal, Canada, 1995. 10 Lin, D.: An information-theoretic definition of similarity. In Proceedings of the 15th International Conference on Machine Learning, Madison, WI, 1998. 11 Jiang, J., Conrath, D.: Semantic similarity based on corpus statistics and lexical taxonomy. In Proceedings of the International Conference on Research in Computa- tional Linguistics, Taiwan, 1997.
The metric introduced by Jiang & Conrath uses the least common subsumer (LCS) and combines the information content (IC) of the two input concepts:

    sim_{jcn} = \frac{1}{IC(concept_1) + IC(concept_2) - 2 \cdot IC(LCS)}    (2)

The information content is defined as:

    IC(c) = -\log P(c)    (3)

Table 1 shows that a combination of the jcn, lch and lesk measures performs better than using any of them individually.

Table 1. Results for the individual and combined similarity measures

              jcn     lch     lesk    combined
  Precision   51.57   41.47   51.87   53.43
  Recall      19.12   16.02   44.97   53.43
  F-measure   27.89   23.11   48.17   53.43

Tsatsaronis et al.[1] propose a new node similarity algorithm, P-Rank, for their graph representation, which in fact does not perform better than the other unsupervised methods. They justify the lower results based on the observations of Navigli and Lapata12, who also reported lower performance for the betweenness and indegree measures of structural similarity.

12 Navigli, R., Lapata, M.: Graph connectivity measures for unsupervised word sense disambiguation. In Proc. of IJCAI, pages 1683-1688, 2007.
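To make these measures concrete, the following sketch scores the sense pairs of two words with the lch and jcn measures through NLTK's WordNet interface and then combines them. The combination step, which divides each score by the per-measure maximum before summing, is an illustrative assumption of this sketch, not the exact normalization of Sinha et al.[2].

# A minimal sketch, assuming NLTK with its WordNet and information-content
# corpora installed (nltk.download('wordnet'), nltk.download('wordnet_ic')).
from nltk.corpus import wordnet as wn
from nltk.corpus import wordnet_ic

ic = wordnet_ic.ic('ic-semcor.dat')  # information content estimated from SemCor

def pairwise_scores(word1, word2, pos=wn.NOUN):
    """Score every sense pair of two words with the lch and jcn measures."""
    scores = []
    for s1 in wn.synsets(word1, pos=pos):
        for s2 in wn.synsets(word2, pos=pos):
            lch = s1.lch_similarity(s2)      # equation (1)
            jcn = s1.jcn_similarity(s2, ic)  # equation (2)
            scores.append((s1.name(), s2.name(), lch, jcn))
    return scores

def combine(scores):
    """Normalize each measure by its maximum and sum the results, so that
    no single measure dominates (an illustrative choice)."""
    max_lch = max(s[2] for s in scores)
    max_jcn = max(s[3] for s in scores)
    return [(a, b, lch / max_lch + jcn / max_jcn) for a, b, lch, jcn in scores]

combined = combine(pairwise_scores('church', 'bell'))
print(max(combined, key=lambda t: t[2]))  # most related sense pair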
2.3 Graph-based Methods

Graph-based methods model the word sense dependencies in text using a graph representation. Senses are represented as labelled nodes, and weighted edges are added to mark the dependencies among them. Each word has a context window associated with it, including several words before and after it, and therefore a corresponding graph; the word is disambiguated after a ranking algorithm is run on that graph. The node with the highest score is chosen as the most probable sense for that word.
    Sinha et al.[2] have noticed a remarkable property that makes these graph-based algorithms appealing: they take into account information drawn from the entire graph, capturing relationships among all the words in a sequence, which makes them superior to approaches that rely only on local information derived individually for each word.

2.4 Semantic Graph Construction

Graph-based methods usually associate nodes with each word to be processed: senses are represented as labels, and their dependencies are indicated as edge weights. The likelihood of each sense can then be determined using a graph-based ranking algorithm, which runs over the graph of candidate senses and identifies the "best" one.
    Given a sequence of words W = w_1, w_2, w_3, w_4 and their corresponding labels L_{w_i} = l^1_{w_i}, l^2_{w_i}, ..., l^{N_{w_i}}_{w_i}, Sinha et al.[2] define a labelled graph G = (V, E) such that there is a node v \in V for every possible label l^j_{w_i}, i = 1..n, j = 1..N_{w_i}. Edges e \in E map the dependencies between pairs of labels.

Fig. 2. Sample semantic representation used by Sinha et al.[2] for a sequence of four words w1, w2, w3, w4 and their corresponding labels.

Tsatsaronis et al.[1] have used a semantic model which contains only the words that have an entry in the WordNet thesaurus. Their approach first adds all the words and their corresponding senses, represented by WordNet synsets, to the network (Initial Phase). The expansion phase then extends the network iteratively, adding for each word all the semantically related senses from WordNet (Expansion Round 1), until the network is connected; failing to construct a connected network implies that the words in the sentence cannot be disambiguated. In the next step, edge weights are computed based on the frequency of each edge type (Expansion Round 2). At some point in the construction phase some nodes may share the same sense (Expansion Example 2), in which case only one labelled node is added to the network.
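As a concrete illustration of the label-graph construction of Sinha et al.[2], the sketch below builds one node per candidate sense and adds similarity-weighted edges between senses of words that co-occur within a window. The window size and the similarity callback are assumptions of this sketch; any measure from Section 2.2 could be plugged in.

# A minimal sketch of the sense-graph construction, assuming networkx;
# `similarity` stands in for any measure of Section 2.2.
import networkx as nx
from nltk.corpus import wordnet as wn

def build_sense_graph(words, similarity, window=3):
    """One node per (word index, sense); edges between senses of words
    that co-occur within `window` positions, weighted by similarity."""
    g = nx.Graph()
    senses = [wn.synsets(w) for w in words]
    for i, synsets in enumerate(senses):
        for s in synsets:
            g.add_node((i, s.name()))
    for i in range(len(words)):
        for j in range(i + 1, min(i + window + 1, len(words))):
            for si in senses[i]:
                for sj in senses[j]:
                    w = similarity(si, sj)
                    if w and w > 0:  # skip None / zero similarities
                        g.add_edge((i, si.name()), (j, sj.name()), weight=w)
    return g

# Usage: a simple path-based similarity as the edge weight.
g = build_sense_graph(['drink', 'wine', 'glass'],
                      lambda a, b: a.path_similarity(b))
print(g.number_of_nodes(), g.number_of_edges())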
Fig. 3. Sample semantic representation used by Tsatsaronis et al.[1] for words t_i and t_j and their corresponding senses.

Other approaches in the literature have used the gloss words of the WordNet entries13, have defined additional composite semantic relations14, or have used the Extended WordNet to enhance the model15.

2.5 Spreading of Activation (SAN) Method

The spreading of activation in semantic networks proposed by Tsatsaronis et al.[4] starts by assigning all nodes an activation level of 0, except for the input nodes, which have an activation level of 1. At each iteration p, a node j propagates its output activation O_j(p) to its neighbours as a function f of its current activation level A_j(p) and the weights of the edges that connect it with its neighbours:

    O_j(p) = f(A_j(p))    (4)

The activation level of a node k at iteration p is determined by the outputs at iteration p - 1 of every neighbour j with a direct edge e_{jk}, where W_{jk} is the weight of that edge:

    A_k(p) = \sum_j O_j(p-1) \cdot W_{jk}    (5)

The output function must be chosen carefully, since otherwise the network can be flooded with activation. Tsatsaronis et al.[1] use the function of equation (6) with a threshold value \tau, which prevents nodes with a low activation level from influencing their neighbours. The factor \frac{1}{p+1} reduces the influence of a node on its neighbours as iterations go by, while the function F_j reduces the influence of nodes that connect to many neighbours. This algorithm requires no training.

    O_j(p) = \begin{cases} 0 & \text{if } A_j(p) < \tau \\ \frac{F_j}{p+1} \cdot A_j(p) & \text{otherwise} \end{cases}    (6)

In equation (7), C_T is the total number of nodes, while C_j is the number of nodes with a direct edge from j:

    F_j = 1 - \frac{C_j}{C_T}    (7)

13 Veronis, J., Ide, N.: Word sense disambiguation with very large neural networks extracted from machine readable dictionaries. In Proc. of COLING, pages 389-394, 1990.
14 Mihalcea, R., Tarau, P., Figa, E.: PageRank on semantic networks with application to word sense disambiguation. In Proc. of COLING, 2004.
15 Agirre, E., Soroa, A.: Personalizing PageRank for word sense disambiguation. In Proc. of EACL, pages 33-41, 2009.
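A minimal sketch of this propagation scheme, following equations (4)-(7), is given below; the adjacency-dict representation, the parameter values and the toy network are assumptions of the sketch.

# Spreading activation over a weighted sense network, equations (4)-(7).

def spread_activation(neighbours, weights, input_nodes, tau=0.1, iterations=10):
    """neighbours: node -> list of adjacent nodes; weights: (j, k) -> W_jk."""
    total = len(neighbours)  # C_T, the total number of nodes
    activation = {n: 1.0 if n in input_nodes else 0.0 for n in neighbours}
    for p in range(1, iterations + 1):
        # Output function of eq. (6): zero below threshold, damped otherwise.
        output = {}
        for j, a in activation.items():
            fj = 1.0 - len(neighbours[j]) / total  # eq. (7): fan-out penalty
            output[j] = 0.0 if a < tau else (fj / (p + 1)) * a
        # Eq. (5): each node accumulates its neighbours' weighted outputs.
        activation = {
            k: sum(output[j] * weights[(j, k)] for j in neighbours[k])
            for k in neighbours
        }
    return activation

# Usage on a toy three-node network (symmetric weights for undirected edges).
nbrs = {'a': ['b'], 'b': ['a', 'c'], 'c': ['b']}
w = {('a', 'b'): 0.5, ('b', 'a'): 0.5, ('b', 'c'): 0.8, ('c', 'b'): 0.8}
print(spread_activation(nbrs, w, input_nodes={'a'}))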
2.6 Page-Rank Method

Page-Rank is a graph ranking algorithm based on the idea of "voting" or "recommendation". When one node links to another, it basically offers a recommendation for that node: the higher the number of recommendations a node receives, the higher its importance. Furthermore, the importance of the node offering the recommendation determines how important the vote itself is, and this information is also taken into account by the ranking algorithm:

    PageRank(V_a) = (1 - d) + d \cdot \sum_{(V_a, V_b) \in E} \frac{PageRank(V_b)}{degree(V_b)}    (8)

Sinha et al.[2] have used the Page-Rank algorithm to recursively score the candidate nodes of a weighted undirected graph. For two nodes V_a and V_b connected by an edge with weight w_{ba}, the weighted Page-Rank score is computed with the following equation:

    PageRank(V_a) = (1 - d) + d \cdot \sum_{(V_a, V_b) \in E} \frac{w_{ba}}{\sum_{(V_c, V_b) \in E} w_{bc}} \cdot PageRank(V_b)    (9)
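A minimal sketch of the weighted Page-Rank of equation (9) follows; the dict-of-dicts graph representation, the damping factor and the fixed iteration count are assumptions of the sketch.

# Weighted PageRank on an undirected sense graph, equation (9).

def weighted_pagerank(adj, d=0.85, iterations=30):
    """adj: node -> {neighbour: edge weight} for an undirected graph."""
    pr = {v: 1.0 for v in adj}
    # Denominator of eq. (9): the total edge weight incident to each node.
    strength = {v: sum(adj[v].values()) for v in adj}
    for _ in range(iterations):
        pr = {
            a: (1 - d) + d * sum(w_ba / strength[b] * pr[b]
                                 for b, w_ba in adj[a].items())
            for a in adj
        }
    return pr

# Usage on a toy triangle; the highest-scoring node wins.
adj = {'x': {'y': 1.0, 'z': 0.5},
       'y': {'x': 1.0, 'z': 2.0},
       'z': {'x': 0.5, 'y': 2.0}}
print(weighted_pagerank(adj))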
2.7 HITS Method

Tsatsaronis et al.[1] use the same semantic representation for the HITS ranking algorithm. This approach identifies the most important nodes in the graph, known as authorities, and the nodes that point to such nodes, known as hubs. The major disadvantage of the HITS algorithm is that densely connected nodes can attract the highest scores (the clique attack). Every node has a pair of values attached for its authority and hub scores, with initial values set to 1. Hubs and authorities are iteratively updated using equations (10) and (11), where In(p) is the set of nodes that link to p and Out(p) is the set of nodes p links to:

    authority(p) = \sum_{q \in In(p)} hub(q)    (10)

    hub(p) = \sum_{r \in Out(p)} authority(r)    (11)

Equations (10) and (11) are extended with weights for the graph edges. In equations (12) and (13), w_{i,j} is the weight of the edge connecting node i with node j:

    authority(p) = \sum_{q \in In(p)} w_{q,p} \cdot hub(q)    (12)

    hub(p) = \sum_{r \in Out(p)} w_{p,r} \cdot authority(r)    (13)

The scores are normalized by dividing each authority value by the sum of all authority values and each hub value by the sum of all hub values. The sense with the highest authority score is chosen as the most likely one for each word.
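The sketch below implements the weighted updates of equations (12) and (13) with the normalization step; the edge-list representation and the iteration count are assumptions of the sketch.

# Weighted HITS with per-iteration normalization, equations (12) and (13).

def weighted_hits(edges, iterations=20):
    """edges: list of (source, target, weight) for a directed graph."""
    nodes = {n for s, t, _ in edges for n in (s, t)}
    auth = {n: 1.0 for n in nodes}
    hub = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Eq. (12): authority collects the weighted hub scores of in-links.
        new_auth = {n: 0.0 for n in nodes}
        for s, t, w in edges:
            new_auth[t] += w * hub[s]
        # Eq. (13): hub collects the weighted authority scores of out-links.
        new_hub = {n: 0.0 for n in nodes}
        for s, t, w in edges:
            new_hub[s] += w * new_auth[t]
        # Normalize so that densely connected cliques cannot blow up.
        a_sum, h_sum = sum(new_auth.values()), sum(new_hub.values())
        auth = {n: v / a_sum for n, v in new_auth.items()}
        hub = {n: v / h_sum for n, v in new_hub.items()}
    return auth, hub

auth, hub = weighted_hits([('a', 'b', 1.0), ('c', 'b', 2.0), ('b', 'c', 1.0)])
print(max(auth, key=auth.get))  # sense with the highest authority score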
2.8 P-Rank Method

The P-Rank measure16 is a recently introduced method for the structural similarity of nodes in an information network and represents a generalization of other state-of-the-art measures like CoCitation17, Coupling18, Amsler19 and SimRank20. P-Rank is based on the idea that two nodes are similar if they are referenced by, and also reference, similar nodes. R_{k+1}(a, b), the P-Rank score of nodes a and b at iteration k + 1, is computed with the recursive equation:

    R_{k+1}(a, b) = \lambda \cdot \frac{C}{|I(a)||I(b)|} \sum_{i=1}^{|I(a)|} \sum_{j=1}^{|I(b)|} R_k(I_i(a), I_j(b)) + (1 - \lambda) \cdot \frac{C}{|O(a)||O(b)|} \sum_{i=1}^{|O(a)|} \sum_{j=1}^{|O(b)|} R_k(O_i(a), O_j(b))

In equations (14) and (15), Incoming(a) and Outgoing(a) are the lists of incoming and outgoing neighbours of node a, and the definitions of |I(a)| and |O(a)| take into consideration the weights of all the edges that connect the neighbours of the node:

    |I(a)| = \sum_{i \in Incoming(a)} w_{i,a}    (14)

    |O(a)| = \sum_{j \in Outgoing(a)} w_{a,j}    (15)

The parameter \lambda \in [0, 1] balances the weight of the in- and out-link directions; the value Tsatsaronis et al.[1] have chosen for their experiments is 0.5. C \in [0, 1] is a damping factor for the in- and out-link directions, with a usual value of 0.8.

16 Zhao, P., Han, J., Sun, Y.: P-Rank: a comprehensive structural similarity measure over information networks. In Proc. of CIKM, pages 553-562, 2009.
17 Small, H. G.: Co-citation in the scientific literature: A new measure of the relationship between two documents. Journal of the American Society for Information Science, 24(4):265-269, 1973.
18 Kessler, M. M.: Bibliographic coupling between scientific papers. American Documentation, 14(1):10-25, 1963.
19 Amsler, R.: Application of citation-based automatic classification. Technical report, The University of Texas at Austin, Linguistics Research Center, Austin, TX, 1972.
20 Jeh, G., Widom, J.: SimRank: A measure of structural-context similarity. In Proc. of KDD, pages 538-543, 2002.
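For illustration, the sketch below runs the P-Rank recursion on plain (unweighted) in- and out-neighbour sets; this is a deliberate simplification of the weighted variant of equations (14) and (15), and the toy graph and parameter defaults are assumptions of the sketch.

# A simplified, unweighted P-Rank recursion (lambda and C as in the paper).
from itertools import product

def p_rank(in_nbrs, out_nbrs, lam=0.5, c=0.8, iterations=5):
    """in_nbrs / out_nbrs: node -> list of in-/out-neighbours."""
    nodes = list(in_nbrs)
    # Base case: a node is maximally similar to itself.
    r = {(a, b): 1.0 if a == b else 0.0 for a in nodes for b in nodes}
    for _ in range(iterations):
        nr = {}
        for a in nodes:
            for b in nodes:
                if a == b:
                    nr[(a, b)] = 1.0
                    continue
                in_part = out_part = 0.0
                if in_nbrs[a] and in_nbrs[b]:  # in-link evidence
                    in_part = (c / (len(in_nbrs[a]) * len(in_nbrs[b]))) * sum(
                        r[(i, j)] for i, j in product(in_nbrs[a], in_nbrs[b]))
                if out_nbrs[a] and out_nbrs[b]:  # out-link evidence
                    out_part = (c / (len(out_nbrs[a]) * len(out_nbrs[b]))) * sum(
                        r[(i, j)] for i, j in product(out_nbrs[a], out_nbrs[b]))
                # lambda balances the in-link and out-link directions.
                nr[(a, b)] = lam * in_part + (1 - lam) * out_part
        r = nr
    return r

# Usage: a and b are similar because both are referenced by c.
in_n = {'a': ['c'], 'b': ['c'], 'c': []}
out_n = {'a': [], 'b': [], 'c': ['a', 'b']}
print(p_rank(in_n, out_n)[('a', 'b')])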
3 Experiments and Results

The Senseval 2 and 3 All English Words Task data sets are often used for testing WSD systems, since they are manually annotated by human experts. Tables 2 and 3 show the statistics of the data sets for nouns (N), verbs (V), adjectives (Adj), adverbs (Adv) and all words, computed considering their senses from the WordNet 2 thesaurus. Verbs are the most difficult to disambiguate, with an average polysemy close to 11, while adverbs have an average polysemy close to 1.

Table 2. Polysemous and monosemous occurrences for the Senseval 2 words using WordNet 2

                                       N      V      Adj    Adv    All
  Monosemous                           260    33     80     91     464
  Polysemous                           813    502    352    172    1839
  Average polysemy                     4.21   9.9    3.94   3.23   5.37
  Average polysemy (polysemous only)   5.24   10.48  4.61   4.41   6.48

Table 3. Polysemous and monosemous occurrences for the Senseval 3 words using WordNet 2

                                       N      V      Adj    Adv    All
  Monosemous                           193    39     72     13     317
  Polysemous                           699    686    276    1      1662
  Average polysemy                     5.07   11.49  4.13   1.07   7.23
  Average polysemy (polysemous only)   6.19   12.08  4.95   2.0    8.41

A baseline was computed by selecting a random sense from WordNet; other, supervised systems have used the most frequent sense in the thesaurus as a baseline.
    Table 4 presents a comparison of different WSD results, independently of the type of method used. The top three unsupervised methods, PR, HITS and the method of Agirre and Soroa, are compared with the highest results reported in the literature for the Senseval 2 and 3 data sets. The best performing method is the supervised approach Simil-Prime, with an overall accuracy of about 65%. The table shows that, though the unsupervised systems do not perform as well as the supervised ones, they have indeed reduced the gap between the two approaches.

Table 4. Accuracies on the Senseval 2 and 3 All English Words Task data sets

  Dataset      SenseLearner  Simil-Prime  SSI    WE     FS     PR     HITS   Agi09
  Senseval 2   64.82         65.00        n/a    63.2   63.7   58.8   58.3   59.5
  Senseval 3   63.01         65.85        60.4   n/a    61.3   56.7   57.4   57.4

4 Conclusions

The recent state-of-the-art WSD systems minimise the gap between supervised and unsupervised approaches. This paper described several graph-based methods which make the most of the rich semantic model they employ. Unsupervised systems also have the advantage of seeking the optimal values for their parameters using as little data as possible while testing on as large a data set as possible.
Future work could investigate the results of the recently introduced P-Rank algorithm on a different model, like the one proposed by Sinha et al.[2]; this would show the influence of the model on the results of each algorithm.

5 References

1. Tsatsaronis, G., Varlamis, I., Norvag, K.: An Experimental Study on Unsupervised Graph-based Word Sense Disambiguation. In Proc. of CICLing (2010).
2. Sinha, R., Mihalcea, R.: Unsupervised graph-based word sense disambiguation using measures of semantic similarity. In Proc. of ICSC (2007).
3. Mihalcea, R., Csomai, A.: SenseLearner: Word sense disambiguation for all words in unrestricted text. In Proc. of ACL, pages 53-56 (2005).
4. Tsatsaronis, G., Vazirgiannis, M., Androutsopoulos, I.: Word Sense Disambiguation with Spreading Activation Networks Generated from Thesauri. In Proc. of IJCAI (2007).