This report introduces the key theoretical foundations and implementation details of AlphaGo, AlphaGo Zero, and AlphaZero as presented in the literature (mainly the two Nature papers), followed by a short discussion of the imitation learning paradigm.
The presentation is an introduction to AI (deep learning). The key to success with AI is “asking good questions.” The talk was given in the "Seminar in Information Systems and Applications" at National Tsing Hua University in Taiwan. During this talk, we discussed what a good question is, how we use the design thinking process to improve our question, and how we can “answer” the question with deep learning.
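15. There are basically three approaches
1. Policy based: learn a policy function, that is, the "action function."
2. Value based: learn a value function, which estimates the reward obtained by taking a given action in a given state.
3. Model based: learn or construct the entire environment (wow, that sounds fancy).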
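To make the difference between the three approaches on this slide concrete, here is a minimal Python sketch of what each one actually learns. It is only an illustration under assumptions: the two-state, two-action toy setup, all function names, the softmax policy, the Q-learning-style update, and the count-based transition model are hypothetical choices and do not come from the talk.

```python
import math

# Hypothetical toy setup (an assumption, not from the slide): 2 states, 2 actions.
N_STATES, N_ACTIONS = 2, 2

# 1. Policy based: the learned object is a policy, mapping a state to
#    action probabilities (here a softmax over per-state preferences).
def policy(prefs, state):
    exps = [math.exp(p) for p in prefs[state]]
    total = sum(exps)
    return [e / total for e in exps]

# 2. Value based: the learned object is a value function Q(state, action),
#    an estimate of the reward for taking `action` in `state`.
#    The update below is a Q-learning-style step, chosen for illustration.
def q_update(q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])

# 3. Model based: the learned object is a model of the environment,
#    here transition counts that estimate P(next_state | state, action).
def model_update(counts, state, action, next_state):
    counts[(state, action)][next_state] += 1

def model_predict(counts, state, action):
    row = counts[(state, action)]
    total = sum(row)
    # With no observations yet, fall back to a uniform guess.
    return [c / total for c in row] if total else [1 / N_STATES] * N_STATES
```

In this sketch the three approaches differ only in what is stored and updated: action preferences for the policy, a table of reward estimates for the value function, and transition statistics for the model. A model-based agent would then plan against `model_predict` instead of acting in the real environment.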