# Introduction to Statistical Semantics: From the Distributional Hypothesis to word2vec



1. 2014/02/06, PFI seminar: "Statistical Semantics: From the Distributional Hypothesis to word2vec." Preferred Infrastructure (@unnonouno)
2. About the speaker (@unnonouno): IBM, then Preferred Infrastructure (PFI)
3. Semantics
4. [Bird+10], Chapter 10 (sections 10.1–10.8)
5. [ +96], Chapter 5 (sections 5.1–5.4)
6. Wikipedia
7. Statistical Semantics
8. Statistical Semantics and Distributional Semantics
9. [Evert10] Stefan Evert, "Distributional Semantic Models," NAACL 2010 tutorial
10. "???" word-guessing example [Evert10]
11. "???" word-guessing example, round 2: cat, pig, knife [Evert10]
12. dog [Evert10]
13. The Distributional Hypothesis: "words that occur in the same contexts tend to have similar meanings" (Harris, 1954; quoted from the ACL wiki)
14. Statistical Semantics: the study of "how the statistical patterns of human word usage can be used to figure out what people mean, at least to a level sufficient for information access" (ACL wiki)
15. [ 13]
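The distributional hypothesis quoted above can be made concrete with a toy co-occurrence model. The sketch below is illustrative only: the corpus, window size, and similarity measure are hypothetical choices, not taken from the slides, echoing the cat/pig/knife → dog guessing game.

```python
from collections import Counter, defaultdict
from math import sqrt

# Hypothetical toy corpus; sentences invented for illustration.
corpus = [
    "the cat chased the mouse",
    "the dog chased the cat",
    "the dog ate the bone",
    "the cat ate the fish",
    "he cut the bread with a knife",
    "she sliced the apple with a knife",
]

def cooccurrence_vectors(sentences, window=2):
    """Count, for each word, the words appearing within +/-window positions."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.split()
        for i, word in enumerate(tokens):
            for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(c * v[w] for w, c in u.items())
    norm = lambda vec: sqrt(sum(c * c for c in vec.values()))
    return dot / (norm(u) * norm(v))

vecs = cooccurrence_vectors(corpus)
print(cosine(vecs["cat"], vecs["dog"]))    # high: shared contexts ("the", "chased", "ate")
print(cosine(vecs["cat"], vecs["knife"]))  # zero: no shared contexts in this corpus
```

Words that share contexts ("cat" and "dog") end up with similar vectors; words that do not ("cat" and "knife") do not, which is exactly the hypothesis in operational form.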
16. (slide text not recovered)
17. PFI (remaining slide text not recovered)
18. Three families of approaches (ex: …; ex: …; ex: NN-based methods)
19. Latent Semantic Indexing (LSI) / Latent Semantic Analysis (LSA) [Deerwester+90]
20. LSI: truncated singular value decomposition (SVD), X ≈ U Σ Vᵀ, keeping the top k singular values
21. LSI and SVD
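The LSI factorization on slide 20 — an SVD of the term–document matrix, truncated to the top k singular values — can be sketched as follows. The 5×4 count matrix and its term labels are hypothetical.

```python
import numpy as np

# Hypothetical term-document count matrix: rows = terms, columns = documents.
terms = ["cat", "dog", "pet", "stock", "market"]
X = np.array([
    [2.0, 1.0, 0.0, 0.0],   # cat    (pet-themed documents)
    [1.0, 2.0, 0.0, 0.0],   # dog
    [2.0, 1.0, 0.0, 0.0],   # pet
    [0.0, 0.0, 2.0, 1.0],   # stock  (finance-themed documents)
    [0.0, 0.0, 1.0, 2.0],   # market
])

# Full SVD, then truncate: X ~= U_k S_k V_k^T.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
k = 2
term_vecs = U[:, :k] * s[:k]    # k-dimensional term representations
X_k = term_vecs @ Vt[:k, :]     # rank-k reconstruction of X

def cos(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

print(cos(term_vecs[0], term_vecs[2]))  # cat vs pet: similar
print(cos(term_vecs[0], term_vecs[3]))  # cat vs stock: unrelated
```

In the reduced k-dimensional space, terms with similar document distributions land close together even when they never co-occur in the same document.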
22. (slide text not recovered)
23. Method map: LSI, NMF, PLSI, LDA, NNLM, RNNLM, NTF, Skip-gram, NN
24. LSI: good points / bad points
25. (slide text not recovered)
26. Probabilistic Latent Semantic Indexing (PLSI) [Hofmann99]: a probabilistic reformulation of LSI
27. PLSI
28. Latent Dirichlet Allocation (LDA) [Blei+03]: a Bayesian extension of PLSI
29. LDA in NLP
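For reference, LDA's generative story from [Blei+03] — PLSI's per-document topic mixtures with a Dirichlet prior placed on top — can be written as:

```latex
% For each document d: draw a topic distribution
\theta_d \sim \mathrm{Dirichlet}(\alpha)
% For each word position n in document d:
z_{d,n} \sim \mathrm{Multinomial}(\theta_d)        % choose a topic
w_{d,n} \sim \mathrm{Multinomial}(\beta_{z_{d,n}}) % choose a word from that topic
```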
30. (slide text not recovered)
31. Good points / bad points; relation to LSI's SVD
32. Non-negative Matrix Factorization (NMF) [Lee+99]: a factorization like SVD, but with non-negativity constraints on the factors
33. NMF = PLSI [Ding+08]: NMF and PLSI optimize equivalent objectives
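The multiplicative updates of [Lee+99] for the Frobenius objective ‖V − WH‖² take only a few lines. This is a minimal sketch on random non-negative data; the matrix sizes, rank, and iteration count are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
V = rng.random((6, 8))          # hypothetical non-negative data matrix
k = 3                           # number of latent components
W = rng.random((6, k)) + 0.1    # non-negative initializations
H = rng.random((k, 8)) + 0.1
eps = 1e-9                      # guards against division by zero

err0 = np.linalg.norm(V - W @ H)
for _ in range(500):
    # Lee-Seung multiplicative updates: factors stay non-negative throughout.
    H *= (W.T @ V) / (W.T @ W @ H + eps)
    W *= (V @ H.T) / (W @ H @ H.T + eps)
err = np.linalg.norm(V - W @ H)
print(err0, "->", err)
```

Because updates are multiplicative, non-negative factors stay non-negative, which is what lets the parts-based interpretation of [Lee+99] go through.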
34. Non-negative Tensor Factorization (NTF) [Cruys10]: extends the two-way matrix factorization to a three-way tensor
35. (slide text not recovered; mentions SVD)
36. Good points / bad points; word2vec
37. Neural Network Language Model (NNLM) [Bengio+03]: a neural network predicts the N-th word from the preceding N−1 words
38. Recurrent Neural Network Language Model (RNNLM) [Mikolov+10]: the hidden state at time t is computed from the input and the state at t−1, so unlike NNLM there is no fixed N-gram window. http://rnnlm.org
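The RNNLM recurrence from [Mikolov+10], with input word vector w(t), hidden state s(t), and output distribution y(t), is:

```latex
s(t) = f\bigl(U\,w(t) + W\,s(t-1)\bigr)  % f: sigmoid
y(t) = g\bigl(V\,s(t)\bigr)              % g: softmax over the vocabulary
```

Because s(t−1) feeds back into s(t), the hidden state can in principle summarize an unbounded history rather than a fixed N−1 words.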
39. RNNLM word vectors exhibit linguistic regularities [Mikolov+13a]
40. RNNLM in a transition-based parser; stack-recurrent transition-based parsing
41. Skip-gram (word2vec) [Mikolov+13b]; CBOW; analogical reasoning; parser experiments
42. Skip-gram [Mikolov+13b]: given a corpus w1, w2, …, wT, each word wi predicts the words within a context window of size c (e.g. 5) using its vector vw
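The skip-gram training objective from [Mikolov+13b], with corpus w1, …, wT and window size c, maximizes the average log-probability of the context words, with the conditional defined by a softmax over input vectors v and output vectors v′:

```latex
\frac{1}{T}\sum_{t=1}^{T}\;\sum_{\substack{-c \le j \le c \\ j \ne 0}} \log p(w_{t+j} \mid w_t),
\qquad
p(w_O \mid w_I) = \frac{\exp\!\bigl({v'_{w_O}}^{\top} v_{w_I}\bigr)}{\sum_{w=1}^{W} \exp\!\bigl({v'_{w}}^{\top} v_{w_I}\bigr)}
```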
43. (slide text not recovered)
44. [Mikolov+13c]
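[Mikolov+13c] replaces the full softmax above with negative sampling: each observed (input, output) pair is discriminated from k noise words drawn from a noise distribution Pn(w), giving the per-pair objective

```latex
\log \sigma\!\bigl({v'_{w_O}}^{\top} v_{w_I}\bigr)
+ \sum_{i=1}^{k} \mathbb{E}_{w_i \sim P_n(w)}\!\left[ \log \sigma\!\bigl(-{v'_{w_i}}^{\top} v_{w_I}\bigr) \right]
```

which avoids normalizing over the whole vocabulary at every step.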
45. word2vec; NMF
46. [Kim+13]: deriving adjectival scales ("good", "better", "best")
47. [Mikolov+13d]
48. NN-based methods (2013; Mikolov)
49. (slide text not recovered; contrasts N and NN)
50. NN (remaining slide text not recovered)
51. Summary: Statistical Semantics; three families of approaches; NN-based methods
52. References (1)
    - [Bird+10] Steven Bird, Ewan Klein, Edward Loper. Natural Language Processing with Python (Japanese translation), 2010.
    - [ +96] (Japanese reference, 1996)
    - [Evert10] Stefan Evert. Distributional Semantic Models. NAACL 2010 tutorial.
    - [ 13] (Japanese reference, 2013)
    - [Deerwester+90] Scott Deerwester, Susan T. Dumais, George W. Furnas, Thomas K. Landauer, Richard Harshman. Indexing by Latent Semantic Analysis. JASIS, 1990.
53. References (2)
    - [Hofmann99] Thomas Hofmann. Probabilistic Latent Semantic Indexing. SIGIR, 1999.
    - [Blei+03] David M. Blei, Andrew Y. Ng, Michael I. Jordan. Latent Dirichlet Allocation. JMLR, 2003.
    - [Lee+99] Daniel D. Lee, H. Sebastian Seung. Learning the parts of objects by non-negative matrix factorization. Nature, 401, 1999.
    - [Ding+08] Chris Ding, Tao Li, Wei Peng. On the equivalence between Non-negative Matrix Factorization and Probabilistic Latent Semantic Indexing. Computational Statistics & Data Analysis, 52(8), 2008.
    - [Cruys10] Tim Van de Cruys. A Non-negative Tensor Factorization Model for Selectional Preference Induction. Natural Language Engineering, 16(4), 2010.
54. References (3)
    - [Bengio+03] Yoshua Bengio, Réjean Ducharme, Pascal Vincent, Christian Jauvin. A Neural Probabilistic Language Model. JMLR, 2003.
    - [Mikolov+10] Tomas Mikolov, Martin Karafiát, Lukas Burget, Jan "Honza" Cernocky, Sanjeev Khudanpur. Recurrent neural network based language model. Interspeech, 2010.
    - [Mikolov+13a] Tomas Mikolov, Wen-tau Yih, Geoffrey Zweig. Linguistic Regularities in Continuous Space Word Representations. HLT-NAACL, 2013.
    - [Mikolov+13b] Tomas Mikolov, Kai Chen, Greg Corrado, Jeffrey Dean. Efficient Estimation of Word Representations in Vector Space. CoRR, 2013.
55. References (4)
    - [Mikolov+13c] Tomas Mikolov, Ilya Sutskever, Kai Chen, Gregory S. Corrado, Jeffrey Dean. Distributed Representations of Words and Phrases and their Compositionality. NIPS, 2013.
    - [Kim+13] Joo-Kyung Kim, Marie-Catherine de Marneffe. Deriving adjectival scales from continuous space word representations. EMNLP, 2013.
    - [Mikolov+13d] Tomas Mikolov, Quoc V. Le, Ilya Sutskever. Exploiting Similarities among Languages for Machine Translation. CoRR, 2013.