Introduction         Pattern-Based Similarity Measures   Hybrid Semantic Similarity Measures




               Semantic Similarity Measures for
                Semantic Relation Extraction

                            Alexander Panchenko
               Center for Natural Language Processing (CENTAL)
                  Université catholique de Louvain – Belgium
                    alexander.panchenko@uclouvain.be



                                  September 21, 2012







Plan




      Introduction


      Pattern-Based Similarity Measures


      Hybrid Semantic Similarity Measures








Semantic Similarity Measures


         1. A similarity measure $s_{ij} = sim(c_i, c_j)$, with $s_{ij} \in [0, 1]$
               • $c_i, c_j$ – terms
               • $s_{ij}$ – high for semantically related pairs $c_i, c_j$
                      • synonyms, hyponyms, co-hyponyms
               • $s_{ij}$ – low for other pairs $c_i, c_j$
         2. Semantic similarity measures are useful for NLP/IR:
               •   WSD (Patwardhan et al., 2003)
               •   Query Expansion (Hsu et al., 2006)
               •   QA (Sun et al., 2005)
               •   Text Categorization (Tikk et al., 2003)
               •   Text Similarity (Šarić et al., 2012)







State of the Art
          • WordNet-based measures
              • WuPalmer (1994), LeacockChodorow (1998), Resnik (1995)
              • rely on manually crafted resources
              • highest precision, limited coverage
          • Dictionary-based measures
              • ExtendedLesk (Banerjee and Pedersen, 2003), GlossVectors
                (Patwardhan and Pedersen, 2006) and WiktionaryOverlap
                (Zesch et al., 2008)
              • rely on manually crafted resources
              • high precision, limited coverage
          • Corpus-based measures
              • ContextWindow (Van de Cruys, 2010), SyntacticContext (Lin,
                1998), LSA (Landauer et al., 1998)
              • no semantic resources are needed
              • low precision, high recall
          • Combined e.g. WikiRelate! (Strube and Ponzetto, 2006) . . .

Introduction


Plan

      Introduction
      Pattern-Based Similarity Measures
         Introduction
         Lexico-Syntactic Patterns
         Semantic Similarity Measures
         Results
         Conclusion
      Hybrid Semantic Similarity Measures
         Introduction
         Features: Single Similarity Measures
         Hybrid Similarity Measures
         Results
         Conclusion



Reference Paper




          • Panchenko A., Morozova O., Naets H. “A Semantic
               Similarity Measure Based on Lexico-Syntactic Patterns”.
               In Proceedings of KONVENS 2012, pp.174–178, 2012






Try a Demo
          • http://serelex.cental.be/





Lexico-Syntactic Patterns


General architecture


          • 6 classical Hearst (1992) patterns
          • 12 further patterns
          • extracting hypernyms, co-hyponyms and synonyms






The main transducer

          • A cascade of FSTs
          • Unitex






The 2nd pattern




          • Allow for language variation, preserving precision
          • Compare to surface-based patterns (Bollegala et al., 2007)



Explicit extraction rules
          • positive/negative contexts,
          • dictionaries,
          • insertions of adjectives, . . .






Patterns are applied to corpora




          • No preprocessing is needed
          • Corpora are processed in 250Mb blocks
          • 1 block takes about 1 hour on an Intel i5 M520 @ 2.40GHz






Patterns extract concordances


          • such diverse {[occupations]} as {[doctors]},
               {[engineers]} and {[scientists]}[PATTERN=1]
          • such {non-alcoholic [sodas]} as {[root beer]} and
               {[cream soda]}[PATTERN=1]
          • {traditional[food]}, such as
               {[sandwich]},{[burger]}, and {[fry]}[PATTERN=2]
      Number of concordances:
          • WaCkypedia – 1,196,468
          • ukWaC – 2,227,025
          • WaCkypedia+ukWaC – 3,423,493
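
A minimal sketch of how such concordances could be turned into pair counts, assuming the markup seen in the three examples above: noun phrases in {...}, head terms in [...], and the pattern id in a trailing [PATTERN=n] tag. The function names `parse_concordance` and `count_pairs` are hypothetical and are not taken from the PatternSim code.

```python
import re
from collections import Counter
from itertools import combinations

# Assumed markup: {optional modifiers [head term]} ... [PATTERN=n]
TERM = re.compile(r"\{[^{}]*?\[([^\[\]]+)\][^{}]*?\}")
PATTERN_ID = re.compile(r"\[PATTERN=(\d+)\]")


def parse_concordance(line):
    """Return (pattern_id, list of head terms) for one concordance line."""
    match = PATTERN_ID.search(line)
    pattern_id = int(match.group(1)) if match else None
    return pattern_id, TERM.findall(line)


def count_pairs(lines):
    """Count how often two head terms share a concordance (unordered pairs)."""
    counts = Counter()
    for line in lines:
        _, terms = parse_concordance(line)
        for a, b in combinations(sorted(set(terms)), 2):
            counts[(a, b)] += 1
    return counts


examples = [
    "such diverse {[occupations]} as {[doctors]}, {[engineers]} and {[scientists]}[PATTERN=1]",
    "such {non-alcoholic [sodas]} as {[root beer]} and {[cream soda]}[PATTERN=1]",
    "{traditional[food]}, such as {[sandwich]},{[burger]}, and {[fry]}[PATTERN=2]",
]
pair_counts = count_pairs(examples)
```

Counting every unordered pair inside one concordance is one possible reading of the co-occurrence frequency used later in the talk; the actual extraction may treat hypernym and hyponym slots differently.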



Semantic Similarity Measures


General procedure






Reranking

          • Efreq. No re-ranking.

               $$ s_{ij} = e_{ij} $$

               $s_{ij}$ – semantic similarity between terms $c_i, c_j \in C$
               $e_{ij}$ – frequency of co-occurrence of $c_i$ and $c_j$ in the concordances $K$
          • Efreq-Rfreq. Penalizes terms strongly related to many words.

               $$ s_{ij} = \frac{2 \cdot \alpha \cdot e_{ij}}{e_{i*} + e_{*j}} $$

               $e_{i*}$ – the number of concordances containing the word $c_i$
               $\alpha$ – the expected number of semantically related words per term


          • Efreq-Rnum. Penalizes terms strongly related to many words:

               $$ s_{ij} = \frac{2 \cdot \mu_b \cdot e_{ij}}{b_{i*} + b_{*j}} $$

               $b_{i*} = \sum_{j : e_{ij} \geq \beta} 1$ – the number of extractions with a frequency $\geq \beta$
               $\mu_b = \frac{1}{|C|} \sum_{i=1}^{|C|} b_{i*}$ – the average number of relations per term
          • Efreq-Cfreq. Penalizes relations to general words, e.g. “item”.

               $$ s_{ij} = \frac{P(c_i, c_j)}{P(c_i) P(c_j)} $$

               $P(c_i, c_j) = \frac{e_{ij}}{\sum_{ij} e_{ij}}$ – extraction probability of the pair $c_i, c_j$
               $P(c_i) = \frac{f_i}{\sum_i f_i}$ – probability of the word $c_i$
               $f_i$ – frequency of $c_i$ in the corpus
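
A small sketch of Efreq-Rnum and Efreq-Cfreq on toy data. It assumes symmetric relations stored once per unordered pair, so the sum over all pairs differs from a sum over ordered pairs only by a constant factor. All function names and numbers are hypothetical.

```python
from collections import defaultdict


def relation_counts(e, beta):
    """b_i*: number of partners of each term extracted with frequency >= beta."""
    b = defaultdict(int)
    for (ci, cj), e_ij in e.items():
        if e_ij >= beta:
            b[ci] += 1
            b[cj] += 1
    return b


def efreq_rnum(e, beta, vocabulary):
    """s_ij = 2 * mu_b * e_ij / (b_i* + b_*j)."""
    b = relation_counts(e, beta)
    mu_b = sum(b[c] for c in vocabulary) / len(vocabulary)
    return {
        (ci, cj): 2.0 * mu_b * e_ij / (b[ci] + b[cj])
        for (ci, cj), e_ij in e.items()
        if b[ci] + b[cj] > 0  # pairs below the threshold on both sides are skipped
    }


def efreq_cfreq(e, f):
    """s_ij = P(ci, cj) / (P(ci) * P(cj)), with f the corpus frequencies."""
    total_e = sum(e.values())
    total_f = sum(f.values())
    return {
        (ci, cj): (e_ij / total_e) / ((f[ci] / total_f) * (f[cj] / total_f))
        for (ci, cj), e_ij in e.items()
    }


e = {("apple", "fruit"): 4, ("apple", "item"): 2, ("pear", "fruit"): 2}
f = {"apple": 10, "fruit": 10, "item": 70, "pear": 10}
rnum = efreq_rnum(e, beta=2, vocabulary=sorted(f))
cfreq = efreq_cfreq(e, f)
```

In this toy setting the general word "item" has a high corpus frequency, so Efreq-Cfreq scores ("apple", "item") far below ("apple", "fruit").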




          • Efreq-Rnum-Cfreq-Pnum. Combines the previous formulas with
               pattern redundancy.

               $$ s_{ij} = \sqrt{p_{ij}} \cdot \frac{2 \cdot \mu_b}{b_{i*} + b_{*j}} \cdot \frac{P(c_i, c_j)}{P(c_i) P(c_j)} $$

               $p_{ij} \in \{1, \dots, 18\}$ – the number of distinct patterns that extracted the relation $c_i, c_j$
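
As a worked illustration, the combined score is a product of three factors. The sketch below takes the already computed quantities as arguments; the function name and the example numbers are hypothetical.

```python
import math


def efreq_rnum_cfreq_pnum(p_ij, b_i, b_j, mu_b, p_pair, p_i, p_j):
    """Product of pattern redundancy, relation-count penalty and frequency penalty.

    p_ij        -- number of distinct patterns that extracted the pair (1..18)
    b_i, b_j    -- b_i* and b_*j, relation counts of the two terms
    mu_b        -- average number of relations per term
    p_pair      -- P(ci, cj), extraction probability of the pair
    p_i, p_j    -- P(ci), P(cj), corpus probabilities of the terms
    """
    redundancy = math.sqrt(p_ij)
    rnum = 2.0 * mu_b / (b_i + b_j)
    cfreq = p_pair / (p_i * p_j)
    return redundancy * rnum * cfreq


score = efreq_rnum_cfreq_pnum(p_ij=4, b_i=2, b_j=2, mu_b=1.5,
                              p_pair=0.5, p_i=0.1, p_j=0.1)
```

Note that the pair frequency enters only through P(ci, cj) in this formula, so it is not passed separately.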





Results


Correlation with Human Judgements
            term, ci     term, cj      judgement, s    sim, ŝ    judgement rank, r    sim rank, r̂
              tiger         cat            7.35         0.85             1                 3
              book         paper           7.46         0.95             2                 2
           computer      keyboard          7.62         0.81             3                 1
                ...          ...            ...          ...            ...               ...
           possibility      girl           1.94         0.25            64                65
             sugar       approach          0.88         0.05            65                23

      Data:
          • WordSim353 – 353 term pairs (Finkelstein et al., 2002)
          • MC – 30 term pairs (Miller and Charles, 1991)
          • RG – 65 term pairs (Rubenstein and Goodenough, 1965)
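      Criteria:
          • Pearson correlation: $\rho = \frac{cov(s, \hat{s})}{\sigma(s) \sigma(\hat{s})}$
          • Spearman's correlation: $r = \frac{cov(r, \hat{r})}{\sigma(r) \sigma(\hat{r})}$,
               computed on the rank vectors $r$ and $\hat{r}$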
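
Both criteria can be sketched in plain Python. Spearman's correlation is computed here as the Pearson correlation of the rank vectors, with tied values given their average rank; the tie handling is an assumption, since the slides do not specify it.

```python
import math


def pearson(x, y):
    """cov(x, y) / (sigma(x) * sigma(y)) for two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sigma_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sigma_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sigma_x * sigma_y)


def ranks(values):
    """Rank 1 goes to the largest value; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    result = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        average_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            result[order[k]] = average_rank
        pos = end + 1
    return result


def spearman(x, y):
    """Pearson correlation of the rank vectors of x and y."""
    return pearson(ranks(x), ranks(y))


# The five visible rows of the table above (illustrative only; the full
# datasets contain 30 to 353 pairs).
judgement = [7.35, 7.46, 7.62, 1.94, 0.88]
similarity = [0.85, 0.95, 0.81, 0.25, 0.05]
rho = pearson(judgement, similarity)
r = spearman(judgement, similarity)
```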


Semantic Relation Ranking
                         term, ci      term, cj           relation type, t
                          judge        adjudicate         syn
                          judge        arbitrate          syn
                          judge        chancellor         syn
                            ...        ...                ...
                          judge        pc                 random
                          judge        fare               random
                          judge        lemon              random

          • BLESS (Baroni and Lenci, 2011)
               • 26554 relations
               • hypernyms, co-hyponyms, meronyms, associations,
                  attributes, random relations
          • SN (Panchenko and Morozova, 2012)
               • 14682 relations
               • synonyms, co-hyponyms, hyponyms, random relations
          • In both datasets about half of the relations are random: $\frac{|R_{random}|}{|R|} \approx 0.5$


          • Based on the number of correctly ranked relations.
          • $R$ – all non-random relations
          • $\hat{R}(k)$ – top k% relations of the targets

      Criteria
               • Precision: $P(k) = \frac{|R \cap \hat{R}(k)|}{|\hat{R}(k)|}$
               • Recall: $R(k) = \frac{|R \cap \hat{R}(k)|}{|R|}$

          • We use P(10), P(20), P(50), R(50).




          • Precision $P(50\%) = \frac{6}{7} \approx 0.86$ (6 of the top 7 relations are non-random)

                  term, ci       term, cj           relation type    sij
                  aficionado      enthusiast         syn              0.07197
                  aficionado      fan                syn              0.05195
                  aficionado      admirer            syn              0.01964
                  aficionado      addict             syn              0.01326
                  aficionado      devotee            syn              0.01163
                  aficionado      foundling          random           0.00777
                  aficionado      fanatic            syn              0.00414
                  aficionado      adherent           syn              0.00353
                  aficionado      capital            random           0.00232
                  aficionado      statute            random           0.00029
                  aficionado      blot               random           0.00025
                  aficionado      meddler            random           0.00005
                  aficionado      enlargement        random           0.00003
                  aficionado      bawdyhouse         random           0.00000
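
The example above can be reproduced with a short sketch. It assumes that the top k% cutoff is rounded down and that precision and recall are computed for a single target word; on the full datasets the counts would be pooled over all targets. The function name is hypothetical.

```python
def precision_recall_at_k(relations, k_percent):
    """relations: list of (term, relation_type, score) for one target word.

    Every relation whose type is not "random" counts as correct.
    """
    ordered = sorted(relations, key=lambda rel: rel[2], reverse=True)
    cutoff = int(len(ordered) * k_percent / 100)  # assumption: round down
    top = ordered[:cutoff]
    if not top:
        return 0.0, 0.0
    hits = sum(1 for _, rel_type, _ in top if rel_type != "random")
    relevant = sum(1 for _, rel_type, _ in ordered if rel_type != "random")
    return hits / len(top), hits / relevant


aficionado = [
    ("enthusiast", "syn", 0.07197), ("fan", "syn", 0.05195),
    ("admirer", "syn", 0.01964), ("addict", "syn", 0.01326),
    ("devotee", "syn", 0.01163), ("foundling", "random", 0.00777),
    ("fanatic", "syn", 0.00414), ("adherent", "syn", 0.00353),
    ("capital", "random", 0.00232), ("statute", "random", 0.00029),
    ("blot", "random", 0.00025), ("meddler", "random", 0.00005),
    ("enlargement", "random", 0.00003), ("bawdyhouse", "random", 0.00000),
]
precision, recall = precision_recall_at_k(aficionado, 50)
```

With 14 candidate relations, the top 50% contains 7 relations, 6 of which are synonyms, so both precision and recall equal 6/7 for this target.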





      Figure: Precision-Recall graphs calculated on the BLESS dataset: (a)
      PatternSim measures; (b) the best PatternSim measure versus baselines.






Semantic Relation Extraction




                 Figure: Semantic relation extraction: precision at k.

          • 49 words – vocabulary of the RG dataset
          • three annotators, binary annotations

Conclusion


Conclusion


          • We presented a similarity measure based on manually-crafted
               lexico-syntactic patterns.
          • The measure provides results comparable to the baselines
               and does not require semantic resources.
          • Future work – using a supervised model to
              • combine different factors;
              • tune the meta-parameters.

      Data: http://cental.fltr.ucl.ac.be/team/~panchenko/sim-eval/
      Code: http://github.com/cental/patternsim/
      Demo: http://serelex.cental.be/



Hybrid Semantic Similarity Measures

Introduction


Reference Paper




          • Panchenko A., Morozova O. “A Study of Hybrid Similarity
               Measures for Semantic Relation Extraction”. In
               Proceedings of the Workshop on Innovative Hybrid Approaches to
               the Processing of Textual Data, EACL 2012,
               pp. 10–18, 2012






The State of the Art

          • A multitude of complementary measures were proposed to
            extract synonyms, hypernyms, and co-hyponyms
          • Most of them are based on one of the 5 key approaches:
                1.   distributional analysis (Lin, 1998b)
                2.   web as a corpus (Cilibrasi and Vitanyi, 2007)
                3.   lexico-syntactic patterns (Bollegala et al., 2007)
                4.   semantic networks (Resnik, 1995)
                5.   definitions of dictionaries or encyclopedias (Zesch et al., 2008a)
          • Some attempts were made to combine measures (Curran,
               2002; Cederberg and Widdows, 2003; Mihalcea et al., 2006;
               Agirre et al., 2009; Yang and Callan, 2009)
          • However, most studies still do not take into account all 5
               existing extraction approaches.



Contributions


          • A systematic analysis of
              • 16 baseline similarity measures based on 5 key extraction principles
              • their combinations with 8 fusion methods
          • Hybrid similarity measures based on all the 5 extraction
               approaches:
                1.   distributional analysis
                2.   Web as a corpus
                3.   lexico-syntactic patterns
                4.   semantic networks
                5.   definitions of dictionaries or encyclopedias






Single and Hybrid Similarity Measures


          • 16 single measures
              • 5 measures based on a semantic network
              • 3 web-based measures
              • 5 corpus-based measures
                   • 2 distributional
                   • 1 based on lexico-syntactic patterns
                   • 2 other co-occurrence-based
              • 3 definition-based measures
          • 64 hybrid measures
              • 8 combination methods
              • 8 measure sets obtained with 3 measure selection techniques





Features: Single Similarity Measures


Measures Based on a Semantic Network

         1. Wu and Palmer (1994)
         2. Leacock and Chodorow (1998)
         3. Resnik (1995)
         4. Jiang and Conrath (1997)
         5. Lin (1998)
      Data:
          • WordNet 3.0
          • SemCor corpus
      Variables:
          • Lengths of the shortest paths between terms in the network
          • Probability of terms derived from a corpus
      Coverage: 155,287 English terms encoded in WordNet 3.0.


Web-based Measures

      Normalized Google Distance (NGD) (Cilibrasi and Vitanyi, 2007)
         6. NGD-Yahoo!
         7. NGD-Bing
         8. NGD-Google over wikipedia.org domain
      Data: number of times the terms co-occur in the documents as
      indexed by an IR system.
      Variables:
          • number of hits returned by the query "$c_i$"
          • number of hits returned by the query "$c_i$ AND $c_j$"

      Coverage: huge vocabulary in dozens of languages.
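
For reference, the Normalized Google Distance of Cilibrasi and Vitanyi (2007) can be computed from these hit counts as sketched below. The total number of indexed pages `n_pages` is an extra parameter not listed on the slide, and the hit counts in the example are invented. How the distance is converted into a similarity score is not stated here, so the sketch returns the distance itself.

```python
import math


def ngd(hits_x, hits_y, hits_xy, n_pages):
    """Normalized Google Distance from hit counts.

    hits_x, hits_y -- hits for the single-term queries
    hits_xy        -- hits for the conjunctive query "x AND y"
    n_pages        -- assumed total number of pages indexed by the engine
    """
    if hits_x == 0 or hits_y == 0 or hits_xy == 0:
        return float("inf")  # terms never (co-)occur: maximal distance
    log_x, log_y, log_xy = math.log(hits_x), math.log(hits_y), math.log(hits_xy)
    return (max(log_x, log_y) - log_xy) / (math.log(n_pages) - min(log_x, log_y))


always_together = ngd(1000, 1000, 1000, 10**6)
rarely_together = ngd(1000, 1000, 10, 10**6)
```

A distance of 0 means the two terms always occur together; larger values mean weaker association.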




Corpus-based Measures



         9. Bag-of-words Distributional Analysis (BDA) (Sahlgren, 2006)
       10. Syntactic Distributional Analysis (SDA) (Curran, 2003)
      Data: WaCkypedia (800M tokens) and PukWaC (2000M tokens)
      corpora (Baroni et al., 2009)
      Variables:
          • feature vector based on the context window
          • feature vector based on the syntactic context
      Coverage: a word must occur in the corpora.





       11. A measure based on lexico-syntactic patterns
      Data: WaCkypedia corpus (800M tokens)
      Method:
          • 10 patterns for hypernymy extraction: 6 Hearst (1992)
               patterns + 4 other patterns
          • such diverse {[occupations]} as {[doctors]},
               {[engineers]} and {[scientists]}[PATTERN=1]
          • Efreq: the semantic similarity $s_{ij}$ between terms $c_i, c_j \in C$ is the
               number of term co-occurrences in the same concordance, $n_{ij}$,
               normalized by its maximum:

               $$ sim(c_i, c_j) = s_{ij} = \frac{n_{ij}}{\max_{ij}(n_{ij})}. $$





       12. Latent Semantic Analysis (LSA) on TASA corpus
           (Landauer and Dumais, 1997)
       13. NGD on Factiva corpus (Veksler et al., 2008)






Definition-based Measures



       14. Extended Lesk (Banerjee and Pedersen, 2003)
       15. GlossVectors (Patwardhan and Pedersen, 2006)
      Data: WordNet glosses.
      Variables:
          • bag-of-words vector of a term ci derived from the glosses
          • relation between words (ci , cj ) in the network
      Coverage: 117,659 glosses encoded in WordNet 3.0






       16. WktWiki – BDA on definitions of Wiktionary and Wikipedia (the method
           stems from the work of Zesch et al., 2008)

      Data: Wikipedia abstracts, Wiktionary.
      Method:
          • Definition = abstract of the Wikipedia article with the title "$c_i$" +
                glosses, examples, quotations, related words, categories from
                Wiktionary for $c_i$
          • Represent a definition as a bag-of-words vector
          • Calculate similarities with cosine
          • Update similarities according to relations in the Wiktionary.
      Coverage: Wiktionary: 536,594 glosses, Wikipedia: 3.8M articles
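
The bag-of-words and cosine steps can be sketched as follows. Lowercasing and whitespace tokenization are simplifying assumptions, the two definitions are invented, and the final update from Wiktionary relations is not covered.

```python
import math
from collections import Counter


def bag_of_words(definition):
    """Very rough tokenization (assumption): lowercase and split on whitespace."""
    return Counter(definition.lower().split())


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[word] * v[word] for word in u.keys() & v.keys())
    norm_u = math.sqrt(sum(count * count for count in u.values()))
    norm_v = math.sqrt(sum(count * count for count in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical definitions standing in for Wikipedia abstracts and Wiktionary glosses.
cat = bag_of_words("a small domesticated carnivorous mammal")
tiger = bag_of_words("a large wild carnivorous mammal")
sugar = bag_of_words("sweet crystalline substance")
cat_tiger = cosine(cat, tiger)
cat_sugar = cosine(cat, sugar)
```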

Hybrid Similarity Measures


Combination Methods



          • The goal of a combination method is to produce “better”
               similarity scores than the scores of single measures.
          • A combination method takes as input the score sets $\{S_1, \dots, S_K\}$
               produced by K single measures and outputs $S_{cmb}$.
          • $s^{k}_{ij} \in S_k$ is the pairwise similarity score of terms $c_i$ and $c_j$
               produced by the k-th measure.
          • We tested 8 combination methods.




         1. Mean. A mean of K pairwise similarity scores:

               $$S_{cmb} = \frac{1}{K} \sum_{k=1}^{K} S_k \;\Leftrightarrow\; s_{ij}^{cmb} = \frac{1}{K} \sum_{k=1}^{K} s_{ij}^{k}.$$

         2. Mean-Nnz. A mean of scores having non-zero value:

               $$s_{ij}^{cmb} = \frac{1}{|\{k : s_{ij}^{k} > 0,\; k = 1, \ldots, K\}|} \sum_{k=1}^{K} s_{ij}^{k}.$$

         3. Mean-Zscore. A mean of scores transformed into Z-scores:

               $$S_{cmb} = \frac{1}{K} \sum_{k=1}^{K} \frac{S_k - \mu_k}{\sigma_k},$$

               where $\mu_k$ and $\sigma_k$ are the mean and the standard deviation of the
               scores of the k-th measure ($S_k$).
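
The three unsupervised combinations above can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the data layout (a list of K score lists, aligned so that position p always refers to the same term pair), the function names, the use of the population standard deviation, and the handling of all-zero pairs and constant measures are assumptions made for the example.

```python
from statistics import mean, pstdev


def combine_mean(scores):
    """Mean: average the K scores of every term pair."""
    return [mean(pair) for pair in zip(*scores)]


def combine_mean_nnz(scores):
    """Mean-Nnz: sum of the K scores divided by the number of non-zero scores."""
    combined = []
    for pair in zip(*scores):
        nonzero = sum(1 for s in pair if s > 0)
        # Assumption: a pair that no measure scores above zero gets 0.0.
        combined.append(sum(pair) / nonzero if nonzero else 0.0)
    return combined


def combine_mean_zscore(scores):
    """Mean-Zscore: standardise each measure, then average the Z-scores."""
    standardised = []
    for s_k in scores:
        mu, sigma = mean(s_k), pstdev(s_k)
        # Assumption: a constant measure (sigma == 0) contributes zeros.
        standardised.append([(s - mu) / sigma if sigma else 0.0 for s in s_k])
    return combine_mean(standardised)


# Toy input: K = 2 measures, 3 term pairs (made-up numbers).
toy_scores = [
    [0.8, 0.0, 0.2],  # measure 1
    [0.6, 0.4, 0.0],  # measure 2
]
print(combine_mean(toy_scores))
print(combine_mean_nnz(toy_scores))
print(combine_mean_zscore(toy_scores))
```

Mean-Nnz differs from Mean only for pairs on which some measure returns zero, which typically means the measure does not cover the pair; Mean-Zscore puts measures with different score ranges on a common scale before averaging.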
                                                                                                        47 / 60
Introduction                 Pattern-Based Similarity Measures                   Hybrid Semantic Similarity Measures

Hybrid Similarity Measures


Combination Methods

         4. Median. A median of K pairwise similarities:

               $$s_{ij}^{cmb} = \mathrm{median}(s_{ij}^{1}, \ldots, s_{ij}^{K}).$$

         5. Max. A maximum of K pairwise similarities:

               $$s_{ij}^{cmb} = \max(s_{ij}^{1}, \ldots, s_{ij}^{K}).$$

         6. RankFusion. A mean of scores converted to ranks:

               $$s_{ij}^{cmb} = \frac{1}{K} \sum_{k=1}^{K} r_{ij}^{k},$$

               where $r_{ij}^{k}$ is the rank corresponding to the similarity score $s_{ij}^{k}$.
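
A minimal Python sketch of Median, Max and RankFusion follows. It uses the same assumed layout as a list of K aligned score lists. The ranking convention is not stated on the slide, so the example assumes that rank 1 goes to the highest score and that tied scores share their average rank; under that assumption a lower fused value means a more similar pair.

```python
from statistics import median


def combine_median(scores):
    """Median of the K scores of every term pair."""
    return [median(pair) for pair in zip(*scores)]


def combine_max(scores):
    """Maximum of the K scores of every term pair."""
    return [max(pair) for pair in zip(*scores)]


def to_ranks(s_k):
    """Convert one measure's scores to ranks (assumed: 1 = highest score)."""
    order = sorted(range(len(s_k)), key=lambda i: -s_k[i])
    ranks = [0.0] * len(s_k)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the group while the next score ties with the current one.
        while end + 1 < len(order) and s_k[order[end + 1]] == s_k[order[pos]]:
            end += 1
        average_rank = (pos + end) / 2 + 1  # 1-based average rank of the tie group
        for i in order[pos:end + 1]:
            ranks[i] = average_rank
        pos = end + 1
    return ranks


def combine_rank_fusion(scores):
    """RankFusion: mean of the K ranks of every term pair."""
    ranked = [to_ranks(s_k) for s_k in scores]
    return [sum(pair) / len(pair) for pair in zip(*ranked)]


# Toy input: K = 3 measures, 3 term pairs (made-up numbers).
toy_scores = [
    [0.8, 0.0, 0.2],
    [0.6, 0.4, 0.0],
    [0.9, 0.1, 0.3],
]
print(combine_median(toy_scores))
print(combine_max(toy_scores))
print(combine_rank_fusion(toy_scores))
```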

                                                                                                              48 / 60
Introduction                 Pattern-Based Similarity Measures   Hybrid Semantic Similarity Measures

Hybrid Similarity Measures


Combination Methods




         7. RelationFusion.
                 • Unions the top relations found by each measure separately.
                 • A relation extracted by several measures has more weight.
                 • See (Panchenko and Morozova, 2012) for details.
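
The idea of the two bullets above can be illustrated with a simple voting scheme. This is a loose sketch under stated assumptions, not the procedure of the cited paper: each measure is assumed to be a dictionary from term pairs to scores, each measure contributes its `top_n` highest-scored pairs, and the weight of a relation is the number of measures that contributed it. The name `relation_fusion` and the parameter `top_n` are hypothetical.

```python
from collections import Counter


def relation_fusion(measures, top_n):
    """Union the top_n relations of each measure; weight = number of votes."""
    votes = Counter()
    for scores in measures:
        # Top relations of this measure alone, highest score first.
        top = sorted(scores, key=scores.get, reverse=True)[:top_n]
        votes.update(top)
    return dict(votes)


# Toy input: 2 measures scoring the same 3 candidate relations (made-up numbers).
measure_a = {("car", "auto"): 0.9, ("car", "vehicle"): 0.7, ("car", "lemon"): 0.1}
measure_b = {("car", "auto"): 0.8, ("car", "lemon"): 0.3, ("car", "vehicle"): 0.2}
print(relation_fusion([measure_a, measure_b], top_n=2))
```

In this toy run the pair found by both measures receives weight 2, while pairs found by only one measure receive weight 1.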




                                                                                              49 / 60
Introduction                 Pattern-Based Similarity Measures           Hybrid Semantic Similarity Measures

Hybrid Similarity Measures


Combination Methods
         8. Logit. A supervised combination of similarity measures
                 • Training a binary classifier (a Logistic Regression) on a set of
                     manually constructed semantic relations R (BLESS or SN)
                 • Positive training examples are “meaningful” relations
                     (synonyms, hyponyms, co-hyponyms, associations)
                 • Negative training examples are pairs of semantically
                     unrelated words (generated randomly and verified manually).
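                 • A relation $\langle c_i, t, c_j \rangle \in R$ is represented with an N-dimensional
                     vector of pairwise similarities: $x_{ij} = (s_{ij}^{1}, \ldots, s_{ij}^{N})$.
                 • Category $y_{ij}$:
                       $$y_{ij} = \begin{cases} 0 & \text{if } \langle c_i, t, c_j \rangle \text{ is a random relation} \\ 1 & \text{otherwise} \end{cases}$$
                 • Using the model $(w_1, \ldots, w_K)$ to combine measures:
                       $$s_{ij}^{cmb} = \frac{1}{1 + e^{-z}}, \quad z = w_0 + \sum_{k=1}^{K} w_k s_{ij}^{k}.$$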
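
The supervised combination can be sketched in plain Python as follows. The slide does not say how the Logistic Regression was fitted, so batch gradient descent on the log-loss, the learning rate, the number of epochs and the toy training data are all assumptions made for illustration; only the scoring formula in `combine_logit` is taken from the slide.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def linear(w, x):
    """z = w_0 + sum_k w_k * s_k, with w[0] as the bias term w_0."""
    return w[0] + sum(wk * xk for wk, xk in zip(w[1:], x))


def train_logit(X, y, epochs=2000, lr=0.5):
    """Fit the weights by batch gradient descent (an assumed training method)."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, label in zip(X, y):
            err = sigmoid(linear(w, x)) - label
            grad[0] += err
            for k, xk in enumerate(x, start=1):
                grad[k] += err * xk
        w = [wk - lr * g / len(X) for wk, g in zip(w, grad)]
    return w


def combine_logit(w, x):
    """Combined similarity of one pair: 1 / (1 + exp(-z))."""
    return sigmoid(linear(w, x))


# Toy training set: each row holds the scores of 2 measures for one pair;
# label 1 = meaningful relation, label 0 = random pair (made-up numbers).
X = [[0.9, 0.8], [0.7, 0.9], [0.8, 0.6], [0.1, 0.2], [0.2, 0.0], [0.0, 0.3]]
y = [1, 1, 1, 0, 0, 0]
w = train_logit(X, y)
print(combine_logit(w, [0.9, 0.9]), combine_logit(w, [0.05, 0.1]))
```

Unlike the unsupervised methods, the learned weights let informative measures count for more, at the cost of requiring labelled relations such as BLESS or SN.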
                                                                                                      50 / 60
Introduction                 Pattern-Based Similarity Measures   Hybrid Semantic Similarity Measures

Hybrid Similarity Measures


Measure Selection


      A problem
      Number of ways to choose which of the 16 single measures to combine
      (non-empty subsets):

                                     $2^{16} - 1$ = 65,535

          • Expert choice of measures – 5, 9 and 15 measures
          • Forward Stepwise Procedure – 7, 8a, 8b, 10 measures
          • Analysis of LR weights – 12 measures
          • The best predictors: C-BDA, C-SDA, C-LSA-Tasa,
               D-WktWiki, D-GlossVectors, D-ExtendedLesk.
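
Exhaustive search over all subsets is impractical, which is why a greedy procedure is attractive. The sketch below shows a generic forward stepwise selection; the slide gives no details of the procedure actually used, so the stopping rule (stop when no candidate improves the score) and the `evaluate` callback, which stands for any quality score of a hybrid measure built from a subset, are assumptions. The measure names and gains in the toy example are hypothetical.

```python
def forward_stepwise(measures, evaluate):
    """Greedily add the measure that most improves evaluate(subset)."""
    selected, best = [], float("-inf")
    remaining = list(measures)
    while remaining:
        # Score every one-measure extension of the current subset.
        score, candidate = max(
            ((evaluate(selected + [m]), m) for m in remaining),
            key=lambda pair: pair[0],
        )
        if score <= best:  # assumed stopping rule: no further improvement
            break
        selected.append(candidate)
        remaining.remove(candidate)
        best = score
    return selected, best


# Number of non-empty subsets of 16 measures.
n_subsets = 2 ** 16 - 1

# Toy scoring function: each hypothetical measure adds a fixed gain.
toy_gain = {"A": 0.5, "B": 0.3, "C": -0.1}


def toy_evaluate(subset):
    return sum(toy_gain[m] for m in subset)


print(n_subsets)
print(forward_stepwise(["A", "B", "C"], toy_evaluate))
```

With 16 candidates the greedy loop needs at most 16 + 15 + ... + 1 = 136 evaluations, compared with 65,535 for exhaustive search.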


                                                                                              52 / 60
Introduction         Pattern-Based Similarity Measures   Hybrid Semantic Similarity Measures

Results


Plan

      Introduction
      Pattern-Based Similarity Measures
         Introduction
         Lexico-Syntactic Patterns
         Semantic Similarity Measures
         Results
         Conclusion
      Hybrid Semantic Similarity Measures
         Introduction
         Features: Single Similarity Measures
         Hybrid Similarity Measures
         Results
         Conclusion

                                                                                      53 / 60
Introduction        Pattern-Based Similarity Measures   Hybrid Semantic Similarity Measures

Results


Single Similarity Measures




      Figure: Performance of 16 single similarity measures on human
      judgement datasets (MC, RG, WordSim353). The best scores in a
      group are in bold.




                                                                                     54 / 60
Introduction         Pattern-Based Similarity Measures   Hybrid Semantic Similarity Measures

Results


Single Similarity Measures




      Figure: Performance of 16 single similarity measures on human
      judgement datasets (MC, RG, WordSim353) and semantic relation
      datasets (BLESS and SN). The best scores in a group are in bold.




                                                                                      55 / 60
Introduction        Pattern-Based Similarity Measures   Hybrid Semantic Similarity Measures

Results


Hybrid Similarity Measures




      Figure: Performance of 16 single and 8 hybrid similarity measures on
      human judgement datasets (MC, RG, WordSim353) and semantic
      relation datasets (BLESS and SN). The best scores in a group
      (single/hybrid) are in bold; the very best scores are in grey.
                                                                                     56 / 60
Introduction         Pattern-Based Similarity Measures   Hybrid Semantic Similarity Measures

Results


Hybrid Similarity Measures




      Figure: Precision-Recall graphs calculated on the BLESS dataset of (a)
      16 single measures and the best hybrid measure H-Logit-E15; (b) 8
      hybrid measures.



                                                                                      57 / 60
Introduction         Pattern-Based Similarity Measures   Hybrid Semantic Similarity Measures

Conclusion


Plan

      Introduction
      Pattern-Based Similarity Measures
         Introduction
         Lexico-Syntactic Patterns
         Semantic Similarity Measures
         Results
         Conclusion
      Hybrid Semantic Similarity Measures
         Introduction
         Features: Single Similarity Measures
         Hybrid Similarity Measures
         Results
         Conclusion

                                                                                      58 / 60
Introduction            Pattern-Based Similarity Measures   Hybrid Semantic Similarity Measures

Conclusion


Conclusion:


          • We have undertaken a study of 16 baseline measures, 8
               combination methods, and 3 measure selection techniques.
          • The proposed hybrid measures:
                 • use all 5 main types of baseline measures;
                 • outperform the single measures on all datasets.
          • The best results were provided by
              • a combination of 15 corpus-, web-, network-, and
                definition-based measures
              • with Logistic Regression
              • ρ = 0.870, P(20) = 0.987, R(50) = 0.814.




                                                                                         59 / 60
Introduction   Pattern-Based Similarity Measures   Hybrid Semantic Similarity Measures

Conclusion




      Thank you! Questions?




                                                                                60 / 60
