Introduction to Statistical Semantics: From the Distributional Hypothesis to word2vec

Published in: Technology, Education

  1. 1. 2014/02/06, PFI. Introduction to Statistical Semantics: From the Distributional Hypothesis to word2vec. Preferred Infrastructure (@unnonouno)
  2. 2. (@unnonouno): IBM, PFI
  3. 3. Semantics
  4. 4. [Bird+10] 10 10.1 10.2 10.3 10.4 10.5 10.6 10.7 10.8
  5. 5. [ +96] 5. 5.1 5.2 5.3 5.4
  6. 6. Wikipedia
  7. 7. Statistical Semantics
  8. 8. Statistical Semantics; Distributional Semantics
  9. 9. [Evert10] Stefan Evert's NAACL 2010 tutorial, Distributional Semantic Models
  10. 10. ??? [Evert10]
  11. 11. ??? 2 cat pig knife [Evert10]
  12. 12. dog [Evert10]
  13. 13. Distributional Hypothesis: "The Distributional Hypothesis is that words that occur in the same contexts tend to have similar meanings (Harris, 1954)." (ACL wiki)
  14. 14. Statistical Semantics: Statistical Semantics is the study of "how the statistical patterns of human word usage can be used to figure out what people mean, at least to a level sufficient for information access" (ACL wiki)
  15. 15. [ 13]
  17. 17. PFI; 1
  18. 18. 3; ex: etc…; ex: - etc…; ex: NN NN etc…
  19. 19. Latent Semantic Indexing (LSI), Latent Semantic Analysis (LSA) [Deerwester+90]
  20. 20. LSI: rank-k singular value decomposition (SVD), $X \approx U_k \Sigma_k V_k^\top$, keeping the k largest singular values
  21. 21. LSI; SVD
  23. 23. LSI, NMF, PLSI, LDA, NNLM, RNNLM, NTF, Skip-gram; NN
  24. 24. LSI; Good; Bad
  26. 26. Probabilistic Latent Semantic Indexing (PLSI) [Hofmann99]; LSI; ex: LSI
  27. 27. PLSI
  28. 28. Latent Dirichlet Allocation (LDA) [Blei+03]; PLSI; PLSI, LDA
  29. 29. LDA; NLP; 1
  30. 30. ex: etc.; 1.0
  31. 31. Good; Bad; LSI, SVD
  32. 32. Non-negative Matrix Factorization (NMF) [Lee+99]; SVD; [Lee+99]
  33. 33. NMF = PLSI [Ding+08]; NMF, PLSI; NMF, PLSI
  34. 34. Non-negative Tensor Factorization (NTF) [Cruys10]; 3; 2, 3
  35. 35. SVD
  36. 36. Good; Bad; word2vec
  37. 37. Neural Network Language Model (NNLM) [Bengio+03]; N, NN, N-1
  38. 38. Recurrent Neural Network Language Model (RNNLM) [Mikolov+10]; t-1, t; NNLM, N; http://rnnlm.org
  39. 39. RNNLM; [Mikolov+13a]
  40. 40. RNNLM; Transition-based parser, RNNLM; Stack, recurrent, Transition-based parser
  41. 41. Skip-gram (word2vec) [Mikolov+13b]; CBOW; Analogical reasoning; Parser
  42. 42. Skip-gram [Mikolov+13b]: $w_1, w_2, \ldots, w_T$; $w_i$, $c$; $v_w$, $w$; 5
  44. 44. [Mikolov+13c]
  45. 45. word2vec; NMF
  46. 46. [Kim+13] "good", "best", "better"
  47. 47. [Mikolov+13d]
  48. 48. NN; 2013; Mikolov, 15
  49. 49. N; NN; NN, N
  50. 50. NN
  51. 51. Statistical Semantics; 3; NN; NN
  52. 52. References 1
      [Bird+10] Steven Bird, Ewan Klein, Edward Loper. 2010.
      [ +96] 1996.
      [Evert10] Stefan Evert. Distributional Semantic Models. NAACL 2010 Tutorial.
      [ 13] 2013.
      [Deerwester+90] Scott Deerwester, Susan T. Dumais, George W. Furnas, Thomas K. Landauer, Richard Harshman. Indexing by Latent Semantic Analysis. JASIS, 1990.
  53. 53. References 2
      [Hofmann99] Thomas Hofmann. Probabilistic Latent Semantic Indexing. SIGIR, 1999.
      [Blei+03] David M. Blei, Andrew Y. Ng, Michael I. Jordan. Latent Dirichlet Allocation. JMLR, 2003.
      [Lee+99] Daniel D. Lee, H. Sebastian Seung. Learning the parts of objects by non-negative matrix factorization. Nature, vol 401, 1999.
      [Ding+08] Chris Ding, Tao Li, Wei Peng. On the equivalence between Non-negative Matrix Factorization and Probabilistic Latent Semantic Indexing. Computational Statistics & Data Analysis, 52(8), 2008.
      [Cruys10] Tim Van de Cruys. A Non-negative Tensor Factorization Model for Selectional Preference Induction. Natural Language Engineering, 16(4), 2010.
  54. 54. References 3 (NN 1)
      [Bengio+03] Yoshua Bengio, Réjean Ducharme, Pascal Vincent, Christian Jauvin. A Neural Probabilistic Language Model. JMLR, 2003.
      [Mikolov+10] Tomas Mikolov, Martin Karafiat, Lukas Burget, Jan "Honza" Cernocky, Sanjeev Khudanpur. Recurrent neural network based language model. Interspeech, 2010.
      [Mikolov+13a] Tomas Mikolov, Wen-tau Yih, Geoffrey Zweig. Linguistic Regularities in Continuous Space Word Representations. HLT-NAACL, 2013.
      [Mikolov+13b] Tomas Mikolov, Kai Chen, Greg Corrado, Jeffrey Dean. Efficient Estimation of Word Representations in Vector Space. CoRR, 2013.
  55. 55. References 4 (NN 2)
      [Mikolov+13c] Tomas Mikolov, Ilya Sutskever, Kai Chen, Gregory S. Corrado, Jeffrey Dean. Distributed Representations of Words and Phrases and their Compositionality. NIPS, 2013.
      [Kim+13] Joo-Kyung Kim, Marie-Catherine de Marneffe. Deriving adjectival scales from continuous space word representations. EMNLP, 2013.
      [Mikolov+13d] Tomas Mikolov, Quoc V. Le, Ilya Sutskever. Exploiting Similarities among Languages for Machine Translation. CoRR, 2013.
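
Slides 19 to 21 present LSI as a singular value decomposition truncated to rank k. The sketch below illustrates that idea with NumPy on a toy term-document count matrix. The matrix, the word list, and the helper names `lsi` and `cosine` are assumptions made up for illustration; none of them come from the slides.

```python
import numpy as np

def lsi(term_doc, k):
    """Rank-k LSI sketch: truncated SVD of a term-document count matrix."""
    u, s, vt = np.linalg.svd(term_doc, full_matrices=False)
    term_vecs = u[:, :k] * s[:k]             # one k-dimensional row per term
    doc_vecs = vt[:k, :].T * s[:k]           # one k-dimensional row per document
    approx = (u[:, :k] * s[:k]) @ vt[:k, :]  # best rank-k approximation of the input
    return term_vecs, doc_vecs, approx

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical toy corpus: rows are terms, columns are documents.
terms = ["cat", "dog", "pet", "knife", "fork"]
X = np.array([
    [2, 1, 0, 0],  # cat
    [1, 2, 0, 0],  # dog
    [1, 1, 0, 0],  # pet
    [0, 0, 2, 1],  # knife
    [0, 0, 1, 2],  # fork
], dtype=float)

term_vecs, doc_vecs, approx = lsi(X, k=2)
cat, dog, knife = (term_vecs[terms.index(w)] for w in ("cat", "dog", "knife"))
print("cat~dog  :", cosine(cat, dog))
print("cat~knife:", cosine(cat, knife))
```

In this toy matrix the animal terms and the cutlery terms never share a document, so the two retained dimensions separate them: "cat" and "dog" end up with nearly identical directions, while "cat" and "knife" are close to orthogonal.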
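
Slide 42 lists the ingredients of Skip-gram: a word sequence w1, w2, …, wT, a context size c, and word vectors v_w. The slide's own formula did not survive, so the block below is a reconstruction of the objective as it is usually written following [Mikolov+13c], not a copy of the slide. The output vectors $v'_w$ and the vocabulary size $W$ are notation from that paper and are assumptions here.

```latex
\frac{1}{T}\sum_{t=1}^{T}\;\sum_{\substack{-c \le j \le c \\ j \ne 0}} \log p(w_{t+j} \mid w_t),
\qquad
p(w_O \mid w_I) = \frac{\exp\!\left({v'_{w_O}}^{\top} v_{w_I}\right)}{\sum_{w=1}^{W} \exp\!\left({v'_{w}}^{\top} v_{w_I}\right)}
```

Training maximizes the left-hand average: each word is asked to predict the words within c positions of it, and the softmax on the right turns vector inner products into those prediction probabilities.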
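
Slides 39 and 41 mention analogical reasoning over word vectors [Mikolov+13a]. A minimal sketch of the vector-offset idea follows. The three-dimensional vectors are hand-made for illustration and are not trained embeddings, and the function name `analogy` is hypothetical.

```python
import numpy as np

# Hypothetical 3-dimensional "embeddings", hand-made for illustration only.
vectors = {
    "king":  np.array([0.9, 0.8, 0.1]),
    "queen": np.array([0.9, 0.1, 0.8]),
    "man":   np.array([0.1, 0.9, 0.1]),
    "woman": np.array([0.1, 0.1, 0.9]),
    "apple": np.array([0.5, 0.5, 0.5]),
}

def analogy(a, b, c, vectors):
    """Answer "a is to b as c is to ?" by cosine similarity to vec(b) - vec(a) + vec(c)."""
    target = vectors[b] - vectors[a] + vectors[c]
    best_word, best_sim = None, -np.inf
    for word, vec in vectors.items():
        if word in (a, b, c):
            continue  # the query words themselves are excluded from the candidates
        sim = float(vec @ target / (np.linalg.norm(vec) * np.linalg.norm(target)))
        if sim > best_sim:
            best_word, best_sim = word, sim
    return best_word

print(analogy("man", "king", "woman", vectors))
```

With real embeddings the same search runs over the whole vocabulary, which is why excluding the three query words matters: one of them is often the nearest neighbor of the offset vector.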
