
2020-10-26: Language Change, Stuttgart Workshop


An introduction to lexical semantic change by Nina Tahmasebi at Online Workshop on Automatic Detection of Semantic Change in Stuttgart, October 2020

https://www.ims.uni-stuttgart.de/en/institute/news/event/Online-Workshop-on-Automatic-Detection-of-Semantic-Change/

https://languagechange.org/


  1. An introduction to lexical semantic change. Nina Tahmasebi, Associate Professor, University of Gothenburg. October 2020, Stuttgart.
  2. Motivating example: awesome. The sentence "He was an awesome leader!" reads differently at different points in time, because the meaning of awesome has changed.
  3. Aims of semantic change detection: find word sense changes automatically, i.e., what changed, how it changed, and when it changed. (Figure: senses of "rock": stone, music, lifestyle.)
  4. Vision: given a word a in a document at time t, (1) mark words that are likely to have changed meaning, and (2) find all changes to the word a.
  5. LSC (lexical semantic change).
  6. Methods for computational semantic change.
  7. Pipeline: text (collective or individual) → signal (topic, cluster, vector, …) → change.
  8. (Figure: two representations of a word: single-sense vs. sense-differentiated.)
  9. Timeline of methods, 2008–2020. Single-sense: embeddings, dynamic embeddings, neural embeddings (Sagi et al. 2009; Kim et al. 2014; Kulkarni et al. 2015; Basile et al. 2016; Hamilton et al. 2016; Bamler & Mandt 2018). Sense-differentiated: topic models, word sense induction, contextual embeddings (Tahmasebi et al. 2008; Wijaya & Yeniterzi 2011; Lau et al. 2012; Mitra et al. 2015; Frermann & Lapata 2016; Tahmasebi & Risse 2017; Hu et al. 2019; Giulianelli et al. 2020).
  10. Word-level semantic change: embeddings / context-based methods, dynamic embeddings, neural embeddings.
  11. Context-based method (Sagi et al., GEMS 2009): build context vectors for a word w at times ti and tj, with the data set split into appropriate time sets; the spread of the vectors signals broadening or narrowing of sense, and with grouping, added or removed senses. BUT: (1) no discrimination between senses; (2) no alignment of senses over time! A minimal sketch of the idea follows below.
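A minimal sketch in Python, assuming `corpus_t1` and `corpus_t2` are hypothetical lists of sentences from two time periods; the TF-IDF vectorizer and mean-pairwise-similarity measure are stand-ins for the LSA-style context vectors of Sagi et al., not their exact setup:

```python
# Sketch of a context-vector density test; the vectorizer and measure are
# illustrative assumptions, not the exact setup of Sagi et al. (2009).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def context_windows(sentences, target, width=5):
    """Collect bag-of-words context windows around each occurrence of `target`."""
    windows = []
    for sent in sentences:
        toks = sent.lower().split()
        for i, tok in enumerate(toks):
            if tok == target:
                ctx = toks[max(0, i - width):i] + toks[i + 1:i + 1 + width]
                windows.append(" ".join(ctx))
    return windows

def density(windows, vectorizer):
    """Mean pairwise cosine similarity of the context vectors: high density
    suggests a narrow usage, low density a broad/varied usage."""
    X = vectorizer.transform(windows)
    sims = cosine_similarity(X)
    n = sims.shape[0]
    return (sims.sum() - n) / (n * (n - 1))  # average, excluding the diagonal

# corpus_t1 / corpus_t2: hypothetical lists of sentences from two periods.
early = context_windows(corpus_t1, "awesome")
late = context_windows(corpus_t2, "awesome")
vec = TfidfVectorizer().fit(early + late)
print(density(early, vec), density(late, vec))
```

A drop in density from the earlier to the later period suggests broadening of the word's usage; a rise suggests narrowing.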
  12. Difficulty: what does a word mean? When are two meanings the same?
  13. (Figure: single-sense representation of a word.)
  14. Word embedding-based models (Kulkarni et al., WWW 2015): project a word onto a vector/point (using POS, frequency, and embeddings) and track the vectors over time. See also Kim et al. (LACSS 2014), Basile et al. (CLiC-it 2016), and Hamilton et al. (ACL 2016). (Image: Kulkarni et al., WWW 2015.) A per-slice training sketch follows below.
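One common realization of this family trains a separate skip-gram model per time slice, for example with gensim. In this sketch, `sents_1990s` and `sents_2000s` are hypothetical tokenized corpora:

```python
# Hedged sketch: one skip-gram model per time slice (gensim 4.x API).
# sents_1990s / sents_2000s are hypothetical lists of tokenized sentences.
from gensim.models import Word2Vec

slices = {"1990s": sents_1990s, "2000s": sents_2000s}
models = {
    period: Word2Vec(sentences, vector_size=100, window=5,
                     min_count=10, sg=1, epochs=5, seed=42)
    for period, sentences in slices.items()
}
# Each model lives in its own coordinate system, so vectors are not directly
# comparable across slices; an alignment step (next slide) is needed.
```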
  15. LSC with individually trained embedding spaces: train a single-point embedding space for each of multiple time points ti, align the spaces, track an individual word w over time, and detect the change point or degree of change. An alignment sketch follows below.
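A sketch of the alignment step using orthogonal Procrustes (as popularized by Hamilton et al. 2016), building on the hypothetical per-slice `models` from the previous sketch:

```python
# Sketch of orthogonal Procrustes alignment between two embedding spaces;
# `models` is the dict of per-slice Word2Vec models from the previous sketch.
import numpy as np
from scipy.linalg import orthogonal_procrustes

shared = [w for w in models["1990s"].wv.index_to_key
          if w in models["2000s"].wv]                   # common vocabulary
A = np.vstack([models["1990s"].wv[w] for w in shared])  # earlier space
B = np.vstack([models["2000s"].wv[w] for w in shared])  # later space
R, _ = orthogonal_procrustes(A, B)                      # rotation mapping A onto B

def change_score(word):
    """Cosine distance between the aligned early vector and the late vector."""
    v1, v2 = models["1990s"].wv[word] @ R, models["2000s"].wv[word]
    return 1 - v1 @ v2 / (np.linalg.norm(v1) * np.linalg.norm(v2))

# Ranking the shared vocabulary by change_score surfaces change candidates.
```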
  16. Dynamic embeddings: sharing data across all time points is highly beneficial and avoids alignment. Bamler & Mandt: Bayesian skip-gram; Yao et al.: PPMI embeddings; Rudolph & Blei: exponential family embeddings (Bernoulli embeddings).
  17. Temporal referencing (Dubossarsky et al.; SGNS and PPMI embeddings): sharing data is highly beneficial, so share contexts across all time points while keeping individual vectors for target words in each time bin; this avoids alignment. A preprocessing sketch follows below.
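A sketch of the preprocessing behind temporal referencing, again with hypothetical tokenized corpora `sents_1990s` and `sents_2000s` and an illustrative target set:

```python
# Minimal sketch of temporal referencing: only the target words are tagged with
# their time bin; all context words stay shared across bins.
from gensim.models import Word2Vec

def temporal_reference(sentences, targets, bin_label):
    """Rewrite e.g. 'rock' as 'rock_1990s' for target words only."""
    return [[f"{tok}_{bin_label}" if tok in targets else tok for tok in sent]
            for sent in sentences]

targets = {"rock", "awesome"}                   # illustrative target words
# sents_1990s / sents_2000s: hypothetical tokenized corpora, as before.
pooled = (temporal_reference(sents_1990s, targets, "1990s") +
          temporal_reference(sents_2000s, targets, "2000s"))

# One model, one space: 'rock_1990s' and 'rock_2000s' are directly comparable,
# so no post-hoc alignment is needed.
model = Word2Vec(pooled, vector_size=100, min_count=5, sg=1)
print(model.wv.similarity("rock_1990s", "rock_2000s"))
```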
  18. Sense-differentiated semantic change: topic models, word sense induction, contextual embeddings.
  19. Topic-based methods (Wijaya & Yeniterzi, DETECT 2011; Cook et al., COLING 2014; Lau et al., EACL 2014; Frermann & Lapata, TACL 2016). With two collections, e.g. the BNC (A) and ukWaC (B): (1) train a topic model (HDP); (2) assign topics to all instances of a word; (3) if a word sense WSi is assigned to collection 2 but not to collection 1, then WSi is a novel word sense. BUT: only two time points (typically there is much noise!) and no alignment of senses over time! A sketch of this test follows below.
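A hedged sketch of the novel-sense test in the spirit of Lau et al. (2014), using gensim's HDP implementation; `windows_1` and `windows_2` are assumed to be tokenized context windows of one target word drawn from the two collections:

```python
# Hedged sketch of the novel-sense test; windows_1 / windows_2 are assumed to be
# tokenized context windows of one target word from collections A and B.
from gensim.corpora import Dictionary
from gensim.models import HdpModel

docs = windows_1 + windows_2
dictionary = Dictionary(docs)
corpus = [dictionary.doc2bow(doc) for doc in docs]
hdp = HdpModel(corpus, id2word=dictionary)      # HDP infers the number of topics

def dominant_topic(bow):
    """Topic (induced sense) with the highest probability for one instance."""
    topics = hdp[bow]
    return max(topics, key=lambda t: t[1])[0] if topics else None

senses_a = {dominant_topic(b) for b in corpus[:len(windows_1)]}
senses_b = {dominant_topic(b) for b in corpus[len(windows_1):]}
novel = senses_b - senses_a                     # candidate novel senses in B
```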
  20. Downsides of topic models: topic change is not the same as sense change.
  21. Word sense induction (Tahmasebi & Risse, RANLP 2017): word sense induction via curvature clustering over individual time slices. Step 1: detect stable senses (units); Step 2: relate units across slices; Step 3: form paths. (Figure: senses of "rock": stone, music, lifestyle.) An illustrative sketch of the three steps follows below.
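An illustrative sketch of the three steps, with standard k-means standing in for the curvature clustering actually used by Tahmasebi & Risse; the per-slice context-vector matrices are assumed given:

```python
# Illustrative sketch: k-means stands in for curvature clustering; the input
# per time slice is an assumed (n_occurrences x dim) array of context vectors.
from sklearn.cluster import KMeans
from sklearn.metrics.pairwise import cosine_similarity

def induce_units(context_vectors, k=4):
    """Step 1: cluster one slice's context vectors into stable sense units."""
    km = KMeans(n_clusters=k, n_init=10).fit(context_vectors)
    return km.cluster_centers_

def relate_units(units_prev, units_next, threshold=0.6):
    """Step 2: link units in adjacent slices whose centroids are similar."""
    sims = cosine_similarity(units_prev, units_next)
    return [(i, j) for i in range(sims.shape[0])
            for j in range(sims.shape[1]) if sims[i, j] >= threshold]

# Step 3: chaining these links across all slices yields sense paths; a unit with
# no incoming link is a candidate novel sense, one with no outgoing link a lost one.
```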
  22. Complexity: O(|S|T).
  23. Type-based embedding methods: every occurrence of w ("Sentence with w and more", "Different sentence with w and more", "Last sentence with w and more") is collapsed into a single vector for w.
  24. Token-based embedding methods: each occurrence of w in each sentence gets its own vector. A contextual-embedding sketch follows below.
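A sketch of extracting token-level vectors with a pre-trained BERT model via Hugging Face transformers; in practice these occurrence vectors are then clustered into senses (cf. Hu et al. 2019, Giulianelli et al. 2020). The model choice and the subword-matching heuristic are illustrative assumptions:

```python
# Sketch: one vector per occurrence of `target`, via Hugging Face transformers.
# Model choice and the subword-matching heuristic are illustrative assumptions.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def token_vectors(sentences, target):
    """Return one last-layer hidden vector per occurrence of `target`."""
    vecs = []
    for sent in sentences:
        enc = tokenizer(sent, return_tensors="pt")
        with torch.no_grad():
            hidden = model(**enc).last_hidden_state[0]          # (seq_len, 768)
        tokens = tokenizer.convert_ids_to_tokens(enc["input_ids"][0])
        vecs.extend(hidden[i] for i, tok in enumerate(tokens) if tok == target)
    return vecs

usages = token_vectors(["They sat on a rock.", "She loves rock music."], "rock")
# Clustering `usages` (e.g. with k-means) groups the occurrences into senses.
```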
  25. Evaluation, three ways: controlled data; a pre-determined list of positive examples, negative examples, and pairs; and inspection of top/bottom results.
  26. Summary of methods:
     • Most co-occurrence methods are outperformed by type embeddings.
     • Type embeddings average over all occurrences, need alignment across corpora, and need very much data.
     • Dynamic embeddings 'remember' too much historical data.
     • Topic-based methods have little correspondence to senses (and run badly on very large datasets).
     • WSI-based methods typically have too low coverage.
     • Contextual embeddings need to be clustered into senses.
  27. Requirements: handle both large and small data; separate senses; disambiguate instances. (Figure: senses of "rock": stone, music, lifestyle.)
  28. Thank you! (Danke!) Contact: Nina.tahmasebi@gu.se, nina@tahmasebi.se
