January
Deep Neural Networks
play StarCraft II
DeepMind introduces AlphaStar, a system that plays StarCraft II, a
challenging real-time strategy game. The system first trains a deep neural
network with supervised learning on replays of human games to learn good
strategies. This network then initialises multiple reinforcement learning
agents, which play against each other to improve these strategies. This
work combined the established power of supervised learning with game
theory and sparked interest in multi-agent reinforcement learning.
April
Deep Neural Networks
play Dota 2
OpenAI Five, a system that has been in development since 2017, beats the
world-champion team at Dota 2 and then plays against the internet, winning
99.4% of its games. The system employs 5 neural networks, one per hero,
that coordinate with each other in a simple but effective way. OpenAI
trains the neural networks with a simple learning algorithm, Proximal
Policy Optimization, and emphasizes the importance of training on
multiple randomized environments.
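
To make "a simple learning algorithm" concrete, here is a minimal sketch of the clipped surrogate objective at the core of Proximal Policy Optimization. It is an illustration only, not OpenAI Five's training code: the function name `ppo_clipped_objective`, its arguments, and the default `clip_eps=0.2` are assumptions chosen for the example.

```python
import math


def ppo_clipped_objective(new_logps, old_logps, advantages, clip_eps=0.2):
    """Average clipped surrogate objective over a batch of sampled actions.

    new_logps / old_logps: log-probabilities of the sampled actions under the
    current policy and under the policy that collected the data.
    advantages: advantage estimates for the same actions.
    clip_eps: hypothetical default; the clipping range is a tunable setting.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_logps, old_logps, advantages):
        # Probability ratio between the current and the data-collecting policy.
        ratio = math.exp(new_lp - old_lp)
        # Keep the ratio inside [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Take the more pessimistic of the clipped and unclipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

The policy is updated to maximise this quantity. Clipping removes the incentive to move the probability ratio far from 1 in a single update, which is what keeps each policy change small.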
May
Agents play
Capture the Flag
DeepMind reaches human-level performance in Quake III Arena Capture the
Flag, a complex multi-agent game where agents need to learn to cooperate
with their teammates to capture the flag of the opposing team. The agents
are trained using deep reinforcement learning and develop their own
temporally hierarchical representations, which help them develop
human-level strategies.
September
Agents play
Hide and Seek
OpenAI trains reinforcement learning agents that play hide-and-seek in a
simulated environment. As the agents train by competing against each other,
we can observe how they continuously adapt to their opponents and to
changes in their environment. This work showed that complex
intelligent behaviour can emerge without human supervision.
October
A robot hand learns to
manipulate a Rubik’s Cube
OpenAI creates a robotic hand that uses reinforcement learning to
manipulate a Rubik’s Cube. Although the solution itself is computed by an
algorithm that does not use AI, the task remains hard, as executing it
requires fine manipulation skills. The important contribution of this work
is that the hand appears robust to perturbations it was not trained on,
which is considered a first step towards general-purpose robotics.
November
An agent masters Atari, Go,
Chess and Shogi
Although different AI systems have mastered these games in the past,
DeepMind’s MuZero is the first single agent to master all of them. MuZero
is not given the rules of the game, but learns an internal model of it
using model-based reinforcement learning. It cannot yet handle every kind
of game, for example Poker, which is only partially observable, nor
real-world problems, but it is a first step towards general agents.
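
The idea of planning with an internal model instead of the game's rules can be sketched as follows. This is a toy illustration under strong assumptions, not MuZero itself: `representation` and `dynamics` are hypothetical hand-written stand-ins for functions that a real agent would learn as neural networks, the policy and value predictions are omitted, and an exhaustive fixed-depth search replaces the Monte Carlo tree search used in practice.

```python
from itertools import product


def representation(observation):
    """Hypothetical stand-in: map a raw observation to a hidden state."""
    return float(observation)


def dynamics(hidden_state, action):
    """Hypothetical stand-in: predict the next hidden state and reward."""
    next_state = hidden_state + action
    reward = -abs(next_state)  # toy reward: staying near zero is good
    return next_state, reward


def plan(observation, actions=(-1, 0, 1), depth=3):
    """Return the first action of the best sequence found by the model.

    The search only ever calls representation() and dynamics(); it never
    queries the real environment or its rules.
    """
    root = representation(observation)
    best_return, best_first = float("-inf"), None
    for sequence in product(actions, repeat=depth):
        state, total = root, 0.0
        for action in sequence:
            state, reward = dynamics(state, action)
            total += reward
        if total > best_return:
            best_return, best_first = total, sequence[0]
    return best_first
```

Calling `plan(2)` searches every three-step action sequence inside the model and returns the first action of the one with the highest predicted total reward.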

2019: RL review
