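34. • The computational model of Gluck and Myers [1993]: (probably) the first
study to argue, using an autoencoder [Hinton, 1989], that the hippocampus learns a
compressed representation of its input stimuli without supervision. (The model is
nearly identical to a restricted Boltzmann machine.)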
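• DeepMind has recently published a string of computational-neuroscience studies
on spatial encoding. (The basic claim is that a compressed representation that
predicts the stimuli of upcoming steps, the successor representation, is effective.)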
- The hippocampus as a predictive map [Stachenfeld, Nat Neurosci 2017]
- The successor representation in human reinforcement learning [Momennejad,
Nat Hum Behav 2017]
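• The model in Grid-Like Navigation [Banino, Nature 2018], which attracted attention
this year, is also split into an LSTM that acquires representations and a policy
LSTM. In that sense it closely resembles MERLIN.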
• MERLIN suggests the importance of latent-space representations that are
conditioned on the agent's own past actions.
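As a rough illustration of the successor representation mentioned above, the sketch below computes it for a toy environment. The ring-shaped state space, the random-walk policy, the discount factor, and all function names are assumptions made for this example and are not taken from the cited papers. It uses the standard definition of the successor representation as the expected discounted future occupancy of each state, M = sum over k of gamma^k T^k.

```python
def ring_random_walk(n):
    """Transition matrix of an unbiased random walk on a ring of n states (toy assumption)."""
    T = [[0.0] * n for _ in range(n)]
    for s in range(n):
        T[s][(s - 1) % n] += 0.5
        T[s][(s + 1) % n] += 0.5
    return T


def successor_representation(T, gamma=0.9, n_iter=500):
    """Iterate M <- I + gamma * T M, which converges to (I - gamma * T)^-1 for gamma < 1."""
    n = len(T)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    M = [row[:] for row in identity]
    for _ in range(n_iter):
        M = [
            [identity[i][j] + gamma * sum(T[i][k] * M[k][j] for k in range(n))
             for j in range(n)]
            for i in range(n)
        ]
    return M


T = ring_random_walk(6)
M = successor_representation(T, gamma=0.9)

# With the SR in hand, state values for any reward vector are V = M r.
reward = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]
V = [sum(M[s][j] * reward[j] for j in range(6)) for s in range(6)]
```

Once M is available, the value of any reward vector follows from a single matrix-vector product, which is one reason this predictive representation is attractive for reinforcement learning.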
Background for introducing the MBP, part 2: spatial representation in the hippocampus
35. • What the MBP does → modeling the environment
• World Models [Ha, 2018]: modeling the environment and learning the policy are
completely decoupled. The Controller in that work corresponds to MERLIN's policy LSTM.
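The decoupling described above can be sketched in two stages: first fit a model of the environment, then tune a controller using only that learned model. Everything concrete here is an assumption for illustration: the 1-D linear environment, the least-squares model, the proportional controller, and the grid search are small stand-ins for the much larger neural components of the cited work.

```python
import random

TRUE_A, TRUE_B = 1.1, 0.5  # hidden dynamics of the toy environment (assumed)


def env_step(x, u):
    """Real environment: x' = a * x + b * u."""
    return TRUE_A * x + TRUE_B * u


def fit_world_model(n_samples=200, seed=0):
    """Stage 1: fit (a_hat, b_hat) by least squares from random transitions."""
    rng = random.Random(seed)
    sxx = sxu = suu = sxy = suy = 0.0
    for _ in range(n_samples):
        x, u = rng.uniform(-1, 1), rng.uniform(-1, 1)
        y = env_step(x, u)
        sxx += x * x
        sxu += x * u
        suu += u * u
        sxy += x * y
        suy += u * y
    det = sxx * suu - sxu * sxu  # determinant of the 2x2 normal equations
    a_hat = (sxy * suu - sxu * suy) / det
    b_hat = (sxx * suy - sxu * sxy) / det
    return a_hat, b_hat


def imagined_cost(k, model, x0=1.0, horizon=20):
    """Roll out the controller u = -k * x inside the learned model only."""
    a_hat, b_hat = model
    x, cost = x0, 0.0
    for _ in range(horizon):
        x = a_hat * x + b_hat * (-k * x)
        cost += x * x
    return cost


def train_controller(model):
    """Stage 2: pick the gain with the lowest imagined cost; the real env is never called."""
    candidates = [i / 10 for i in range(41)]  # gains 0.0, 0.1, ..., 4.0
    return min(candidates, key=lambda k: imagined_cost(k, model))


model = fit_world_model()
best_k = train_controller(model)
```

Because the second stage never queries `env_step`, the controller can be swapped or retrained without collecting new data from the environment.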
About World Models
Ha et al., 2018; Schmidhuber, 2015
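36. • The Kanerva Machine [Wu, ICLR 2018]: in a word, a conditional VAE conditioned
on an external memory, in which everything, including "reading" from and "writing"
to the external memory, can be expressed as probabilistic inference.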
• In MERLIN, by contrast, writing is a mechanical operation with no room for learning.
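To make "a mechanical operation with no room for learning" concrete, here is a minimal sketch of an external memory whose write rule has no trainable parameters. It is a simplified assumption, not MERLIN's actual implementation: the class name `SlotMemory`, the next-free-row write rule, and the fixed `sharpness` of the cosine-similarity read are all hypothetical choices for this example.

```python
import math


def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class SlotMemory:
    def __init__(self, n_slots, dim):
        self.rows = [[0.0] * dim for _ in range(n_slots)]
        self.next_slot = 0

    def write(self, z):
        """Mechanical write: copy z into the next row, wrapping around when full."""
        self.rows[self.next_slot] = list(z)
        self.next_slot = (self.next_slot + 1) % len(self.rows)

    def read(self, key, sharpness=10.0):
        """Soft content-based read: softmax over cosine similarity to the key."""
        sims = [cosine(key, row) for row in self.rows]
        top = max(sims)
        weights = [math.exp(sharpness * (s - top)) for s in sims]
        total = sum(weights)
        return [
            sum(w / total * row[d] for w, row in zip(weights, self.rows))
            for d in range(len(key))
        ]


memory = SlotMemory(n_slots=3, dim=2)
memory.write([1.0, 0.0])
memory.write([0.0, 1.0])
recalled = memory.read([1.0, 0.0])
```

In this sketch only the read key could be produced by a learned network; where and what to write is fixed in advance, which is the contrast with a model that treats writing as inference.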
Deep generative models with external memory
[Figure panels: Generative model; Reading inference; Writing inference]
50. 1. https://www.chiikunote.com/entry/conditioning
2. R. S. Sutton. “Learning to Predict by the Methods of Temporal Differences,” 1988
3. C.J.C.H. Watkins. “Learning from delayed rewards,” 1989
4. http://discovermagazine.com/2015/may/17-resetting-the-addictive-brain
5. Wolfram Schultz, Peter Dayan, P. Read Montague. “A neural substrate of prediction and reward,” 1997
6. Kenji Doya. “Metalearning and neuromodulation,” 2002
7. http://www.actioforma.net/kokikawa/Evolutional_aspects/Evolutional_aspects.html
8. Volodymyr Mnih, Adrià Puigdomènech Badia, Mehdi Mirza, Alex Graves, Timothy P. Lillicrap, Tim Harley, David Silver, Koray Kavukcuoglu. “Asynchronous Methods for Deep Reinforcement Learning,” 2016
9. John Schulman, Philipp Moritz, Sergey Levine, Michael Jordan, Pieter Abbeel. “High-Dimensional Continuous Control Using Generalized Advantage Estimation,” 2015
10. Alex Graves, Greg Wayne, Ivo Danihelka. “Neural Turing Machines,” 2014
11. Alex Graves et al. “Hybrid computing using a neural network with dynamic external memory,” 2016
12. Rajesh P. N. Rao, Dana H. Ballard. “Predictive coding in the visual cortex: a functional interpretation of some extra-classical receptive-field effects,” 1999
13. Nicholas C. Hindy, Felicia Y. Ng, Nicholas B. Turk-Browne. “Linking pattern completion in the hippocampus to predictive coding in visual cortex,” 2016
14. William Lotter, Gabriel Kreiman, David Cox. “Deep Predictive Coding Networks for Video Prediction and Unsupervised Learning,” 2016
15. Aaron van den Oord, Yazhe Li, Oriol Vinyals. “Representation Learning with Contrastive Predictive Coding,” 2018
16. Karl J. Friston, Stefan Kiebel. “Predictive coding under the free-energy principle,” 2009
17. Karl J. Friston, Jean Daunizeau, Stefan J. Kiebel. “Reinforcement Learning or Active Inference?,” 2009
18. Karl J. Friston. “The free-energy principle: a unified brain theory?,” 2010
19. Andy Clark. “Whatever next? Predictive brains, situated agents, and the future of cognitive science,” 2013
20. Martin Biehl, Christian Guckelsberger, Christoph Salge, Simón C. Smith, Daniel Polani. “Expanding the Active Inference Landscape: More Intrinsic Motivations in the Perception-Action Loop,” 2018
21. Mark A. Gluck, Catherine E. Myers. “Hippocampal mediation of stimulus representation: A computational theory,” 1993
22. G. E. Hinton, R. R. Salakhutdinov. “Reducing the Dimensionality of Data with Neural Networks,” 2006
23. Kimberly L. Stachenfeld, Matthew M. Botvinick, Samuel J. Gershman. “The hippocampus as a predictive map,” 2017
24. I. Momennejad, E. M. Russek, J. H. Cheong, M. M. Botvinick, N. D. Daw, S. J. Gershman. “The successor representation in human reinforcement learning,” 2017
25. Andrea Banino et al. “Vector-based navigation using grid-like representations in artificial agents,” 2018
26. David Ha, Jürgen Schmidhuber. “World Models,” 2018
27. Jürgen Schmidhuber. “On Learning to Think: Algorithmic Information Theory for Novel Combinations of Reinforcement Learning Controllers and Recurrent Neural World Models,” 2015
28. Yan Wu, Greg Wayne, Alex Graves, Timothy Lillicrap. “The Kanerva Machine: A Generative Distributed Memory,” 2018
29. Wojciech Zaremba, Ilya Sutskever. “Reinforcement Learning Neural Turing Machines - Revised,” 2015
References