2019: RL review

January
Deep Neural Networks
play StarCraft II
DeepMind introduces AlphaStar, a system that plays StarCraft II, a
challenging real-time strategy game. The system employs a deep neural
network trained using supervised learning to learn good strategies.
This network is then used by multiple reinforcement learning agents
that play against each other to improve these strategies. This work
combined the established power of supervised learning with game
theory and sparked interest in multi-agent reinforcement learning.

April
Deep Neural Networks
play Dota 2
OpenAI Five, a system in development since 2017, beats the
world-champion team at Dota 2 and wins 99.4% of its games against
players from the internet. The system employs five neural networks that
coordinate with each other in a simple but effective way. OpenAI trains
the neural networks with a simple learning algorithm, Proximal Policy
Optimization, and emphasizes the importance of training on many
randomized environments.
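
As a rough illustration of why Proximal Policy Optimization counts as simple, the sketch below computes its clipped surrogate objective for a small batch in plain Python. It is not OpenAI Five's training code: the function name, the list-based inputs, and the default clip range of 0.2 are assumptions made for this example.

```python
import math


def ppo_clip_objective(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective over a batch (to be maximized).

    new_logp / old_logp: log-probabilities of the taken actions under the
    current policy and under the policy that collected the data.
    advantages: advantage estimates for those actions.
    clip_eps: clip range (0.2 is an assumed default for illustration).
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        # Probability ratio between the new and the old policy.
        ratio = math.exp(lp_new - lp_old)
        # Keep the ratio inside [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Take the more pessimistic of the two estimates.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

When the new policy equals the old one, every ratio is 1 and the objective reduces to the mean advantage. Once the ratio leaves the clip range in the direction that would increase the objective, the clipped term takes over and the incentive to move further disappears.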

May
Agents play
Capture the Flag
DeepMind solves Quake III Arena Capture the Flag, a complex
multi-agent game where agents need to learn to cooperate with their
teammates to capture the flag of the opposing team. The agents are
trained using deep reinforcement learning and develop their own
temporally hierarchical representations, which help them develop
human-level strategies.

September
Agents play
Hide and Seek
OpenAI trains reinforcement learning agents that play hide-and-seek in a
simulated environment. As the agents train by competing against each
other, we can observe how they continuously adapt to their opponents and
to changes in their environment. This work showed that complex
intelligent behaviour can emerge without human supervision.

October
A robot hand learns to
manipulate a Rubik’s Cube
OpenAI creates a robotic hand that uses reinforcement learning to
manipulate a Rubik’s Cube. Although the sequence of moves that solves
the puzzle is computed by an algorithm that does not use AI, the task
remains hard, as it requires fine manipulation skills. The important
contribution of this work was that the hand appears robust to
perturbations it has not been trained on, which is considered a first
step towards general-purpose robotics.

November
An agent masters Atari, Go,
Chess and Shogi
Although different AI systems have solved these games in the past,
DeepMind’s MuZero is the first agent to rule them all. MuZero does not
need a description of the rules of the game, but learns an internal model
using model-based reinforcement learning. It cannot yet handle every kind
of game, such as Poker, where there is partial observability, nor
real-world problems, but it is a first step towards general agents.
