Statistics-based Approaches to Lexical Semantics

My Ph.D. trial lecture presentation, given February 5th 2010.


  1. Statistics-based Approaches to Lexical Semantics
     Martin Thorsen Ranang, Department of Computer and Information Science (IDI)
     Trial Lecture, February 5th 2010
     www.ntnu.no
  2. Outline
     • Introduction
       • What is Lexical Semantics?
       • Natural Language Processing (NLP) Applications
       • My PhD Research
     • Statistics-based Approaches to Lexical Semantics
       • Word Sense Disambiguation (WSD)
       • Vector Space Model (VSM)
       • Dimensionality Reduction
       • Ontology Merging and Alignment
     • Summary
  4. Lexical Semantics
     • “The study of how and what the words of a language denote.” (Pustejovsky, 1998)
     • Lexical semantic relations like synonymy, antonymy (“close vs. distant”), and hypo-/hypernymy (“car vs. vehicle”)
     • Polysemy (lexical ambiguity)
     • Selectional restrictions: “Joe ate <. . .> in a hurry.”
     • Typical resources:
       • Dictionaries, Machine Readable Dictionaries (MRDs) (Wilks et al., 1996)
       • Ontologies and Semantic Networks
  5. The Distributional Hypothesis
     • “You shall know a word by the company it keeps.” (Firth, 1957)
     • “There is a positive relationship between the degree of synonymy (semantic similarity) existing between a pair of words and the degree to which their contexts are similar.” (Rubenstein and Goodenough, 1965)
     • “The meaning of entities, and the meaning of grammatical relations among them, is related to the restriction of combinations of these entities relative to other entities.” (Harris, 1968)
  7. Example Areas
     • Word Sense Disambiguation (WSD)
     • Natural Language Understanding (NLU) and Text Interpretation (TI)
     • Machine Translation (MT)
     • Information Retrieval (IR)
     What parts of Natural Language Processing (NLP) are not affected by Lexical Semantics?
  9. My PhD Research
     • Developed a method for automatically mapping words from languages other than English to concepts in the Princeton WordNet (Miller et al., 1990; Fellbaum, 1998)
  10. WordNet Example
  11. Why Statistics-based?
     • Based on frequencies of actual language usage
     • Adapts to changes in that usage
     • Well suited to provide generalizations and to summarize features of huge text corpora (Manning and Schütze, 1999)
  13. Word Sense Disambiguation (WSD)
     Figure: three senses of “bass”: Morone saxatilis (striped bass, a fish), tones of low frequency, and a Marchione bass guitar.
  14. Usage Context
     • “He fished for bass using scented attractants.”
     • “Joe played the bass fluently, while George played the piano.”
     • “When the neighbors play their music I can’t hear the tune but can hear the bass tones.”
  15. Word Sense Disambiguation (WSD)
     Two main approaches:
     • Integrated approach: disambiguation is postponed until semantic analysis, where ill-formed semantic representations are eliminated
     • Stand-alone approach: independent of, and prior to, compositional semantic analysis; more often statistics-based
  16. Statistics-based Stand-alone Approaches I
     Supervised learning
     • Training: sense-tagged corpus; naïve Bayesian classifiers; feature vectors; “sliding window”
     • Feature vectors represent local context, and may include words and part-of-speech (POS) tags.
     • Application: use the trained classifier on unseen ambiguous words, given a local-context feature vector
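
The supervised approach on this slide can be sketched in a few lines. The following is a minimal illustration, not the lecture's implementation: the toy sense-tagged corpus, the window size of 3, the add-one smoothing, and the function names (`context_features`, `train`, `classify`) are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict

# Toy sense-tagged corpus, invented for illustration: (sentence, sense of "bass").
TRAINING = [
    ("he fished for bass in the lake", "fish"),
    ("the bass swam near the river bank", "fish"),
    ("joe played the bass in a band", "music"),
    ("she played piano and bass guitar on stage", "music"),
]


def context_features(tokens, target="bass", window=3):
    """Sliding window: words within `window` positions of the target word."""
    i = tokens.index(target)
    return tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]


def train(examples):
    """Count senses and context words per sense."""
    sense_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for sentence, sense in examples:
        feats = context_features(sentence.split())
        sense_counts[sense] += 1
        word_counts[sense].update(feats)
        vocab.update(feats)
    return sense_counts, word_counts, vocab


def classify(sentence, model):
    """Pick the sense maximizing log P(sense) + sum of log P(word | sense)."""
    sense_counts, word_counts, vocab = model
    feats = context_features(sentence.split())
    total = sum(sense_counts.values())
    best_sense, best_score = None, float("-inf")
    for sense, count in sense_counts.items():
        score = math.log(count / total)
        # Add-one smoothing (an assumption) so unseen words do not zero out a sense.
        denom = sum(word_counts[sense].values()) + len(vocab)
        for word in feats:
            score += math.log((word_counts[sense][word] + 1) / denom)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense


model = train(TRAINING)
print(classify("george played the bass and the piano", model))
print(classify("we fished for bass near the river", model))
```

With so little training data the result is only suggestive; a realistic classifier would need a much larger sense-tagged corpus and, as the slide notes, could add POS tags to the feature vector.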
  17. Statistics-based Stand-alone Approaches II
     • Bootstrapping: a small number of training instances are used as seeds; a classifier is trained through supervised learning
     • Unsupervised disambiguation: sense discrimination, not sense tagging; groups of similar words, based on their local context
     • Dictionary-based approach: count the overlap between the sliding window and the dictionary definition of each candidate sense
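
The dictionary-based approach can be sketched as a simple overlap count, in the spirit of what is often called the simplified Lesk algorithm. This is only an illustration: the glosses in `SENSES`, the stopword list, and the function names are invented for the example, and the whole sentence is used as the window.

```python
STOPWORDS = {"a", "an", "the", "of", "in", "on", "for", "and", "or",
             "to", "with", "that", "is", "often"}

# Hypothetical mini-dictionary for "bass"; the glosses were written for this example.
SENSES = {
    "fish": "a fish caught in lakes and rivers",
    "instrument": "a guitar played with low notes, often alongside piano or drums",
    "tone": "the low frequency tones heard in music",
}


def content_words(text):
    """Lower-cased words with surrounding punctuation and stopwords removed."""
    return {w.strip(".,;\"'").lower() for w in text.split()} - STOPWORDS


def dictionary_wsd(context, senses):
    """Return the sense whose gloss shares the most words with the context."""
    ctx = content_words(context)
    overlaps = {sense: len(ctx & content_words(gloss))
                for sense, gloss in senses.items()}
    return max(overlaps, key=overlaps.get), overlaps


print(dictionary_wsd(
    "Joe played the bass fluently, while George played the piano.", SENSES))
print(dictionary_wsd(
    "When the neighbors play their music I can't hear the tune "
    "but can hear the bass tones.", SENSES))
```

Because the sketch matches exact word forms, "fished" would not overlap with "fish"; a practical version would likely stem or lemmatize both the context and the glosses.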
  19. Vector Space Model (Salton, 1971)
     • Term frequency (importance of term $i$ to doc $j$): $\mathrm{tf}_{i,j} = \dfrac{n_{i,j}}{\sum_k n_{k,j}}$
     • Inverse document frequency (common words are less descriptive): $\mathrm{idf}_i = \log \dfrac{|D|}{|\{d : t_i \in d\}|}$
     • Vector elements: $w_{i,j} = \mathrm{tf}_{i,j} \cdot \mathrm{idf}_i$
     • Term-document weight matrix, whose columns are the document vectors $v_1, v_2, \dots, v_d$:
       $$\begin{bmatrix} w_{1,1} & w_{1,2} & \dots & w_{1,d} \\ w_{2,1} & w_{2,2} & \dots & w_{2,d} \\ \vdots & \vdots & \ddots & \vdots \\ w_{N,1} & w_{N,2} & \dots & w_{N,d} \end{bmatrix}$$
     • Weight vector for doc $d$: $v_d = [w_{1,d}, w_{2,d}, \dots, w_{N,d}]^T$
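
The weighting on this slide can be computed directly. The sketch below follows the tf and idf definitions given above; the three toy documents, the whitespace tokenizer, the natural logarithm (the slide does not state a base), and the name `tf_idf` are assumptions made for illustration.

```python
import math
from collections import Counter

# Toy documents, invented for illustration.
DOCS = [
    "the astronaut boarded the rocket",
    "the cosmonaut boarded the rocket",
    "the cat sat on the mat",
]


def tf_idf(docs):
    """Return (vocabulary, weight vectors), one vector of w_ij per document j."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({term for doc in tokenized for term in doc})
    n_docs = len(tokenized)
    # Document frequency: the number of documents that contain each term.
    df = Counter(term for doc in tokenized for term in set(doc))
    idf = {term: math.log(n_docs / df[term]) for term in vocab}
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        total = len(doc)  # sum over k of n_kj
        vectors.append([counts[term] / total * idf[term] for term in vocab])
    return vocab, vectors


vocab, vectors = tf_idf(DOCS)
for term, weight in zip(vocab, vectors[0]):
    print(f"{term:10s} {weight:.4f}")
```

The word "the" occurs in every document, so its idf is log(1) = 0 and it contributes nothing to any vector, which is the "common words are less descriptive" effect named on the slide.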
  20. Vector Space Model
     Figure labels: Astronaut, Rocket, Cosmonaut
     • Enables comparison with other documents, based on content.
     • Does it really describe a document’s meaning?
     • Restrictions?
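
One way to make the comparison and its restriction concrete is a similarity measure over document vectors. The sketch below uses cosine similarity, which is a common choice but an assumption here, since the slide does not name a measure; the raw-count vectors and the helper names are also illustrative.

```python
import math

# The three terms shown on the slide, used as the dimensions of the space.
AXES = ("astronaut", "cosmonaut", "rocket")


def to_vector(text):
    """Raw term counts along the three axes."""
    tokens = text.lower().split()
    return [tokens.count(term) for term in AXES]


def cosine(u, v):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


shared = cosine(to_vector("astronaut rocket"), to_vector("cosmonaut rocket"))
disjoint = cosine(to_vector("astronaut"), to_vector("cosmonaut"))
print(shared, disjoint)
```

Documents that share the term "rocket" get a positive similarity, while a document mentioning only "astronaut" and one mentioning only "cosmonaut" get similarity 0.0, even though the two words are near-synonyms. The plain model compares surface terms, not meanings.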
  21. Semantic Augmentation of the Vector Space Model
     Several attempts to improve document retrieval efficiency by incorporating lexical semantic information:
     • Voorhees (1994, 1998)
     • Moldovan and Mihalcea (2000)
     • Buscaldi et al. (2005)
     No, or small, improvements to IR; some improvement for document classification.
  23. Latent Semantic Analysis (LSA) / Indexing (LSI)
     • Discrete entities are mapped onto a continuous vector space;
     • the mapping is determined by global correlation patterns; and
     • dimensionality reduction is an integral part of the process
     (Landauer and Dumais, 1997; Ando, 2000; Bellegarda, 2007)
  24. Dimensionality Reduction
     • Singular Value Decomposition
     Figure labels: {0.65 Cosmonaut, 0.35 Astronaut}, Rocket
     Quantitative evaluation of different semantic word space models: Van de Cruys (2010)
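
To illustrate how dimensionality reduction can bring synonyms together, the sketch below keeps only the largest singular value of a tiny term-document matrix. It is a deliberately small stand-in, not the method used in the lecture: the matrix is invented, the dominant singular triplet is found by power iteration in pure Python rather than by a library SVD routine, and only one dimension is kept, whereas practical LSA retains many more.

```python
import math

# Rows are terms, columns are documents (toy counts, invented for illustration).
TERMS = ("cosmonaut", "astronaut", "rocket")
A = [
    [1.0, 0.0],  # cosmonaut appears only in document 1
    [0.0, 1.0],  # astronaut appears only in document 2
    [1.0, 1.0],  # rocket appears in both
]


def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def transpose(m):
    return [list(col) for col in zip(*m)]


def normalize(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def top_singular_triplet(a, iterations=100):
    """Power iteration on A^T A: (sigma, u, v) for the largest singular value."""
    at = transpose(a)
    v = normalize([1.0] * len(at))
    for _ in range(iterations):
        v = normalize(matvec(at, matvec(a, v)))
    av = matvec(a, v)
    sigma = math.sqrt(sum(x * x for x in av))
    u = [x / sigma for x in av]
    return sigma, u, v


def rank_one_approximation(a):
    """Best rank-1 approximation sigma * u * v^T of the matrix."""
    sigma, u, v = top_singular_triplet(a)
    return [[sigma * ui * vj for vj in v] for ui in u]


reduced = rank_one_approximation(A)
before = cosine(A[0], A[1])              # cosmonaut vs. astronaut, original space
after = cosine(reduced[0], reduced[1])   # the same terms after reduction
print(before, after)
```

In the original space the two terms never co-occur, so their similarity is 0.0. After the reduction both are pulled toward the dimension they share with "rocket". With a single retained dimension every term vector is collinear, so the similarity of 1.0 overstates the effect; keeping more dimensions gives a graded result.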
  26. Ontology Matching
     • Lacher and Groh (2001) used tf-idf signature vectors for computing similarity between two ontology nodes.
  27. Summary
     • Lexical semantics
     • How this relates to my PhD research
     • Examples of statistics-based approaches to Lexical Semantics, including:
       • different Word Sense Disambiguation techniques
       • semantic augmentation of the vector space model
       • how LSA/dimensionality reduction of vector spaces handles synonymy
       • how statistics-based similarity measures are used to align and merge ontologies
  28. References I
     • Ando, Rie Kubota. 2000. Latent semantic space: Iterative scaling improves precision of inter-document similarity measurement. In SIGIR’00.
     • Bellegarda, Jerome R. 2007. Latent Semantic Mapping: Principles & Applications, vol. 3 of Synthesis Lectures on Speech and Audio Processing. Morgan & Claypool Publishers.
     • Buscaldi, D., P. Rosso, and E.S. Arnal. 2005. A WordNet-based query expansion method for geographical information retrieval. In Working Notes for the CLEF Workshop.
  29. References II
     • Van de Cruys, Tim. 2010. A quantitative evaluation of semantic word space models. In Computational Linguistics In The Netherlands (CLIN) 20. Utrecht, Netherlands.
     • Fellbaum, Christiane, ed. 1998. WordNet: An electronic lexical database. Language, Speech, and Communication. Cambridge, Massachusetts, USA: The MIT Press.
     • Firth, John Rupert. 1957. Papers in linguistics 1934–1951. Oxford, UK: Oxford University Press.
     • Harris, Zellig Sabbettai. 1968. Mathematical structures of language. Krieger Publishing Company.
  30. References III
     • Lacher, Martin S., and Georg Groh. 2001. Facilitating the exchange of explicit knowledge through ontology mappings. In Proceedings of the Fourteenth International Florida Artificial Intelligence Research Society Conference, 305–309. AAAI Press.
     • Landauer, Thomas K., and Susan T. Dumais. 1997. A solution to Plato’s problem: The latent semantic analysis theory of acquisition, induction and representation of knowledge. Psychological Review 104:211–240.
     • Manning, Christopher D., and Hinrich Schütze. 1999. Foundations of statistical natural language processing. Cambridge, Massachusetts, USA: The MIT Press.
  31. References IV
     • Miller, George A., Richard Beckwith, Christiane Fellbaum, Derek Gross, and Katherine J. Miller. 1990. Introduction to WordNet: an on-line lexical database. International Journal of Lexicography 3(4):235–244. (Revised August 1993).
     • Moldovan, Dan I., and Rada Mihalcea. 2000. Using WordNet and lexical operators to improve Internet searches. Internet Computing, IEEE 4:34–43.
     • Pustejovsky, James. 1998. The generative lexicon. Cambridge, Massachusetts, USA: The MIT Press.
     • Rubenstein, Herbert, and John B. Goodenough. 1965. Contextual correlates of synonymy. Commun. ACM 8(10):627–633.
  32. References V
     • Salton, Gerard, ed. 1971. The SMART retrieval system: Experiments in automatic document processing. Englewood Cliffs, NJ: Prentice-Hall.
     • Voorhees, Ellen M. 1994. Query expansion using lexical-semantic relations. In SIGIR’94: Proceedings of the 17th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 61–69.
     • Voorhees, Ellen M. 1998. Using WordNet for text retrieval. In Fellbaum (1998), chap. 12, 285–304.
     • Wilks, Yorick, Louise Guthrie, and Brian M. Slator. 1996. Electric words: Dictionaries, computers, and meanings. Cambridge, Massachusetts, USA: The MIT Press.
