Reinforcement Learning is learning what to do – what action to take in a specific situation – in order to maximize some type of reward. It’s one of the most promising areas of Machine Learning today. It plays an important part in some very high-profile success stories of AI, such as mastering Go, learning to play computer games, autonomous driving, autonomous stock trading, and more. In this talk we’ll introduce the main theoretical and practical aspects of Reinforcement Learning, discuss its very distinctive set of challenges, and explore what the future looks like for self-training machines.
2. "Reinforcement learning [is learning] what to do ... [so as to] maximize a numerical reward signal ... by trying them"
-- Reinforcement Learning: An Introduction
R. Sutton, A. Barto, MIT Press, 1998
35.
qmodel.init('random')
policy.init('greedy')
for N episodes do:
    state = environment.init_episode()
    do:
        action = policy.select_action(state, qmodel)
        (reward, next_state, done) = environment.step(state, action)
        qmodel.learn((state, action, reward, next_state))
        state = next_state
    while (!done)
end for
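The training loop on this slide could be turned into runnable Python roughly as follows. This is a minimal sketch under stated assumptions, not the speaker's implementation: the `Corridor` environment, the tabular `QModel` with a one-step Q-learning update, the `alpha`, `gamma` and `epsilon` values, and the use of an epsilon-greedy policy (instead of a purely greedy one, so the agent keeps exploring) are all hypothetical choices made for illustration.

```python
import random


class Corridor:
    """Hypothetical toy environment: states 0..4 in a row, reward 1.0 at the right end."""
    n_states = 5
    actions = (-1, 1)  # step left / step right

    def init_episode(self):
        return 0  # every episode starts at the left end

    def step(self, state, action):
        # Move, clamping to the corridor boundaries.
        next_state = min(max(state + action, 0), self.n_states - 1)
        done = next_state == self.n_states - 1
        reward = 1.0 if done else 0.0
        return reward, next_state, done


class QModel:
    """Tabular action-value estimates, stored as q[state][action]."""

    def __init__(self, env, rng, alpha=0.5, gamma=0.9):
        self.alpha, self.gamma = alpha, gamma
        terminal = env.n_states - 1
        # 'random' initialisation: small random values; the terminal state stays at 0.
        self.q = {
            s: {a: 0.0 if s == terminal else rng.random() * 0.01 for a in env.actions}
            for s in range(env.n_states)
        }

    def learn(self, transition):
        # Assumed learning rule: one-step Q-learning.
        state, action, reward, next_state = transition
        target = reward + self.gamma * max(self.q[next_state].values())
        self.q[state][action] += self.alpha * (target - self.q[state][action])


class EpsilonGreedyPolicy:
    """Mostly greedy with respect to the Q-model, random with probability epsilon."""

    def __init__(self, env, rng, epsilon=0.1):
        self.actions, self.rng, self.epsilon = env.actions, rng, epsilon

    def select_action(self, state, qmodel):
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        values = qmodel.q[state]
        return max(values, key=values.get)


def train(n_episodes=200, seed=0):
    rng = random.Random(seed)
    environment = Corridor()
    qmodel = QModel(environment, rng)
    policy = EpsilonGreedyPolicy(environment, rng)
    for _ in range(n_episodes):
        state = environment.init_episode()
        done = False
        while not done:
            action = policy.select_action(state, qmodel)
            reward, next_state, done = environment.step(state, action)
            qmodel.learn((state, action, reward, next_state))
            state = next_state
    return qmodel


if __name__ == "__main__":
    trained = train()
    for s in range(Corridor.n_states - 1):
        print(s, max(trained.q[s], key=trained.q[s].get))
```

The loop body mirrors the pseudocode line by line; only the environment, model and policy classes are filled in so that the sketch can run on its own.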
36.
do:
    action = policy.select_action(state, qmodel)
    (reward, next_state, done) = environment.step(state, action)
    qmodel.learn((state, action, reward, next_state))
    state = next_state
while (!done)
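The slides do not say what `qmodel.learn` does with each transition. One common choice, assumed here purely for illustration, is the one-step Q-learning update, where the step size α and the discount factor γ are hypothetical parameters, s and a are `state` and `action`, r is `reward`, and s' is `next_state`:

```latex
Q(s, a) \leftarrow Q(s, a) + \alpha \left[ r + \gamma \max_{a'} Q(s', a') - Q(s, a) \right]
```

The bracketed term is the gap between the current estimate and a target built from the observed reward plus the discounted value of the best action in the next state.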