Gym & Universe
Ashish Kumar
LinkedIn: https://www.linkedin.com/in/ashkmr1
Twitter : @ashish_fagna
e-mail : ashish.fagna@gmail.com
OpenAI - Introduction
● OpenAI is a non-profit artificial intelligence (AI) research company.
● It aims to promote and develop friendly AI in a way that benefits
humanity as a whole.
● It aims to "freely collaborate" by making its patents and research open
to the public.
Latest in News
● OpenAI Five vs human players in Dota 2
● The event was streamed online
● OpenAI Five is a set of five neural networks
Source: https://medium.com/deep-math-machine-learning-ai/different-types-of-machine-learning-and-their-types-34760b9128a2
Reinforcement Learning
● One of the most important types of machine learning.
● An agent learns how to behave in an environment by performing actions and
seeing the results.
Reinforcement Learning
There are two basic concepts in reinforcement learning:
1. Environment (namely, the outside world) and
2. Agent (namely, the algorithm you are writing).
The agent sends actions to the environment, and the environment replies with
observations and rewards (that is, a score).
Example: Reinforcement Learning
Imagine you’re a child in a living room.
Action 1: You see a fireplace and approach it. It’s warm
(positive reward, +1).
Action 2: You then try to touch the fire. It burns your hand
(negative reward, -1).
Learning 1: Fire is positive when you are a sufficient distance
away, because it produces warmth.
Learning 2: Get too close to it and you will be burned.
Image Source: https://medium.freecodecamp.org/an-introduction-to-reinforcement-learning-4339519de419
That’s how humans learn: through interaction.
Reinforcement learning is just a computational approach to learning from actions.
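The fireplace example can be sketched in a few lines of Python. This is only an illustrative sketch, not something from the original slides: the action names, the `learn_action_values` function, the exploration rate `epsilon` and the running-average update are all assumptions chosen to keep the idea small. The "child" tries each action, observes the reward, and keeps an average of what each action has paid off so far.

```python
import random

# Hypothetical rewards taken from the fireplace example:
# approaching the fire is pleasant (+1), touching it hurts (-1).
REWARDS = {"approach": 1.0, "touch": -1.0}


def learn_action_values(episodes=200, epsilon=0.1, seed=0):
    """Estimate the average reward of each action by trial and error."""
    rng = random.Random(seed)
    values = {action: 0.0 for action in REWARDS}  # current estimates
    counts = {action: 0 for action in REWARDS}    # times each action was tried

    for _ in range(episodes):
        untried = [a for a in REWARDS if counts[a] == 0]
        if untried:
            # Try every action at least once before trusting the estimates.
            action = untried[0]
        elif rng.random() < epsilon:
            # Explore occasionally (assumed exploration rate).
            action = rng.choice(list(REWARDS))
        else:
            # Otherwise pick the action that looks best so far.
            action = max(values, key=values.get)

        reward = REWARDS[action]
        counts[action] += 1
        # Incremental mean: move the estimate toward the observed reward.
        values[action] += (reward - values[action]) / counts[action]

    return values


values = learn_action_values()
print(values)
```

Because the rewards here are fixed, the estimates settle at +1 for approaching and -1 for touching, which mirrors "Learning 1" and "Learning 2" above.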
Reinforcement Learning
Actions influence the state, which determines reward.
Image Source : https://medium.freecodecamp.org/an-introduction-to-reinforcement-learning-4339519de419
Reinforcement Learning Process (Super Mario)
Let’s imagine an agent learning to play Super Mario.
The Reinforcement Learning (RL) process can be modeled as a loop that works
like this:
● Our agent receives state S0 from the environment (in our case, the first
frame of the game (state) from Super Mario (environment)).
● Based on state S0, the agent takes an action A0 (our agent will move right).
● The environment transitions to a new state S1 (a new frame).
● The environment gives some reward R1 to the agent (not dead: +1).
This RL loop outputs a sequence of states, actions and rewards.
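The loop above can be sketched as follows. `ToyMarioEnv`, `run_episode` and `always_right` are hypothetical names made up for illustration; the toy environment is a stand-in for a real game, and its `reset`/`step` methods are a simplified imitation of the interface Gym environments expose, not Gym itself.

```python
class ToyMarioEnv:
    """Hypothetical stand-in for a game: walk right along a short track."""

    def __init__(self, track_length=5):
        self.track_length = track_length
        self.position = 0

    def reset(self):
        """Start a new episode and return the initial state S0."""
        self.position = 0
        return self.position

    def step(self, action):
        """Apply an action; return (next_state, reward, done)."""
        if action == "right":
            self.position += 1
        done = self.position >= self.track_length
        reward = 1.0  # "not dead: +1", as in the Super Mario example
        return self.position, reward, done


def run_episode(env, policy, max_steps=100):
    """Run the agent-environment loop and record (state, action, reward)."""
    trajectory = []
    state = env.reset()                              # agent receives S0
    for _ in range(max_steps):
        action = policy(state)                       # agent picks an action
        next_state, reward, done = env.step(action)  # new state and reward
        trajectory.append((state, action, reward))
        state = next_state
        if done:
            break
    return trajectory


def always_right(state):
    """Trivial policy (an assumption): always move right."""
    return "right"


trajectory = run_episode(ToyMarioEnv(), always_right)
print(trajectory)
```

Each tuple in `trajectory` is one pass through the loop, so the printed list is the "sequence of states, actions and rewards" the slide refers to.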
The Reinforcement Learning process
● The goal of the agent is to maximize the expected cumulative reward.
● By running more and more loops, the agent will learn to play better and
better.
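As a rough illustration of "cumulative reward", the sketch below adds up the rewards from one episode. The discount factor `gamma` is an assumption that does not appear in the slides: it is a common way to weight near-term rewards more heavily, and setting it to 1.0 gives the plain sum. The "expected" part would come from averaging this quantity over many episodes.

```python
def discounted_return(rewards, gamma=0.99):
    """Sum rewards, weighting the reward k steps ahead by gamma**k."""
    total = 0.0
    # Work backwards so each step is one multiply and one add.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


rewards = [1.0, 1.0, 1.0]
plain_sum = discounted_return(rewards, gamma=1.0)   # no discounting
discounted = discounted_return(rewards, gamma=0.5)  # later rewards count less
print(plain_sum, discounted)
```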
OpenAI Gym
● Gym is a toolkit for developing and comparing
reinforcement learning algorithms.
● It supports teaching agents everything
from walking to playing games like Pong
or Pinball.
● Gym Envs:
https://gym.openai.com/envs/#mujoco
● Example environment: Ant-v2
(make a 3D four-legged robot walk).
OpenAI Universe
● Platform for measuring and training an AI's general intelligence across
games, websites and other applications.
● Makes it possible for any existing program to become an OpenAI Gym
environment, without needing special access to the program's internals,
source code, or APIs.
● It does this by packaging the program into a Docker container, and presenting
the AI with the same interface a human uses: sending keyboard and mouse
events, and receiving screen pixels.
● Contains over 1,000 environments in which an AI agent can take actions and
gather observations.
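To make the "same interface a human uses" idea concrete, here is a small plain-Python sketch. It does not use the Universe library: `KeyEvent`, `PointerEvent`, `press_key`, `click` and `mean_brightness` are hypothetical names invented for illustration. The point is only that an action is a list of keyboard and mouse events, and an observation is a grid of screen pixels.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class KeyEvent:
    """Hypothetical keyboard event: a key going down or up."""
    key: str
    down: bool


@dataclass(frozen=True)
class PointerEvent:
    """Hypothetical mouse event: pointer position plus button state."""
    x: int
    y: int
    button_pressed: bool


Action = List[Union[KeyEvent, PointerEvent]]


def press_key(key: str) -> Action:
    """Build the events for tapping a key: down, then up."""
    return [KeyEvent(key, True), KeyEvent(key, False)]


def click(x: int, y: int) -> Action:
    """Build the events for a click: press, then release, at (x, y)."""
    return [PointerEvent(x, y, True), PointerEvent(x, y, False)]


def mean_brightness(pixels: List[List[int]]) -> float:
    """Toy observation processing: average a grayscale frame."""
    flat = [value for row in pixels for value in row]
    return sum(flat) / len(flat)


action = press_key("ArrowUp") + click(10, 20)
frame = [[0, 255], [255, 0]]  # a made-up 2x2 grayscale screen
print(action)
print(mean_brightness(frame))
```

An agent built this way needs no knowledge of the program's internals: it only emits events and reads pixels, which is why any program that runs in a container can in principle be wrapped as an environment.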
Command: Create and Activate a Conda Environment
● Conda is an open source package management system and environment
management system that runs on Windows, macOS and Linux.
● Conda quickly installs, runs and updates packages and their dependencies.
● Conda easily creates, saves, loads and switches between environments on
your local computer.
Command:
conda create --name universe-starter-agent python=3.5
source activate universe-starter-agent
OpenAI Universe Demo
Gym vs Universe
● OpenAI Universe is like a much bigger OpenAI Gym.
● OpenAI Gym has some basic tasks, such as balancing a pole or swinging a
pendulum upright, and some more difficult ones, such as Atari games like
Space Invaders.
● Gym is like an enclosed world, a “gym” in which to exercise and develop RL algorithms.
● OpenAI Universe has a much wider variety of tasks, and is more involved in
giving RL networks/algorithms the ability to interact with the real world:
playing games, using an actual (virtual) keyboard and mouse to interact with
buttons and sliders on webpages, etc.
● Universe is based on Gym
Use Cases
Environments for doing various tasks, such as:
● Sending an email
● Clicking the mouse and sending keyboard events
More and more environments are being added.
Thank You

OpenAI Gym & Universe

  • 1. Gym & Universe Ashish Kumar LinkedIn: https://www.linkedin.com/in/ashkmr1 Twitter : @ashish_fagna e-mail : ashish.fagna@gmail.com
  • 2. OpenAI - Introduction ● OpenAI is an artificial intelligence (AI) research company, founded as a non-profit. ● It aims to promote and develop friendly AI in a way that benefits humanity as a whole. ● It aims to "freely collaborate" by making its patents and research open to the public.
  • 3. Latest in News ● OpenAI Five vs. human players in Dota 2 ● The event was streamed online ● OpenAI Five is a set of five neural networks
  • 5. Reinforcement Learning ● One of the most important types of machine learning. ● An agent learns how to behave in an environment by performing actions and observing the results.
  • 6. Reinforcement Learning There are two basic concepts in reinforcement learning: 1. Environment (namely, the outside world) and 2. Agent (namely, the algorithm you are writing). The agent sends actions to the environment, and the environment replies with observations and rewards (that is, a score).
  • 7. Example: Reinforcement Learning Imagine you’re a child in a living room. Action 1: You see a fireplace and approach it. It’s warm (positive reward +1). Action 2: But when you try to touch the fire, it burns your hand (negative reward -1). Learning 1: Fire is positive when you are a sufficient distance away, because it produces warmth. Learning 2: But get too close and you will be burned. Image Source: https://medium.freecodecamp.org/an-introduction-to-reinforcement-learning-4339519de419 That’s how humans learn: through interaction. Reinforcement learning is just a computational approach to learning from actions.
  • 8. Reinforcement Learning Actions influence the state, which determines reward. Image Source: https://medium.freecodecamp.org/an-introduction-to-reinforcement-learning-4339519de419
  • 9. Reinforcement Learning Process (Super Mario) Let’s imagine an agent learning to play Super Mario. The reinforcement learning (RL) process can be modeled as a loop that works like this: ● The agent receives state S0 from the environment (in our case, the first frame of the game (state) from Super Mario (environment)) ● Based on state S0, the agent takes an action A0 (our agent moves right) ● The environment transitions to a new state S1 (a new frame) ● The environment gives a reward R1 to the agent (not dead: +1) This RL loop outputs a sequence of states, actions and rewards.
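The loop on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: ToyGame is a hypothetical stand-in for Super Mario (states are frame indices, not pixels), and the names ToyGame, run_episode and always_right are invented for this sketch rather than taken from any library.

```python
class ToyGame:
    """Hypothetical stand-in for the game: a state is just a frame index."""

    def __init__(self, length=3):
        self.length = length
        self.frame = 0

    def reset(self):
        # Return the initial state S0.
        self.frame = 0
        return self.frame

    def step(self, action):
        # Moving right advances one frame; any other action stays in place.
        if action == "right":
            self.frame += 1
        reward = 1  # not dead: +1
        done = self.frame >= self.length
        return self.frame, reward, done


def always_right(state):
    """A fixed policy: ignore the state and move right."""
    return "right"


def run_episode(env, policy, max_steps=100):
    """Run the agent-environment loop and record (state, action, reward)."""
    trajectory = []
    state = env.reset()                              # agent receives S0
    for _ in range(max_steps):
        action = policy(state)                       # agent picks A_t from S_t
        next_state, reward, done = env.step(action)  # environment returns S_t+1, R_t+1
        trajectory.append((state, action, reward))
        state = next_state
        if done:
            break
    return trajectory


trajectory = run_episode(ToyGame(length=3), always_right)
print(trajectory)
```

Each tuple in the printed list is one pass through the loop, which is the "sequence of states, actions and rewards" the slide refers to.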
  • 10. The Reinforcement Learning process ● The goal of the agent is to maximize the expected cumulative reward. ● By running more and more loops, the agent will learn to play better and better.
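To make "expected cumulative reward" concrete, here is a minimal sketch. The slide does not mention discounting, so the gamma parameter is an assumption added for illustration (gamma=1.0 gives the plain sum), and the function names are hypothetical. The expectation is approximated by averaging over sampled episodes.

```python
def cumulative_reward(rewards, gamma=1.0):
    """Sum of rewards for one episode, optionally discounted by gamma per step."""
    total = 0.0
    # Work backwards so each earlier step adds its reward to the discounted tail.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


def expected_return(episodes, gamma=1.0):
    """Estimate the expected cumulative reward as the mean over episodes."""
    returns = [cumulative_reward(rewards, gamma) for rewards in episodes]
    return sum(returns) / len(returns)


print(cumulative_reward([1, 1, 1]))             # plain sum of rewards
print(cumulative_reward([1, 1, 1], gamma=0.5))  # later rewards count for less
print(expected_return([[1, 1, 1], [1]]))        # average over two episodes
```

An agent that "plays better" in the slide's sense is one whose policy makes this average larger.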
  • 11. OpenAI Gym ● Gym is a toolkit for developing and comparing reinforcement learning algorithms. ● It supports teaching agents everything from walking to playing games like Pong or Pinball. ● Gym Envs: https://gym.openai.com/envs/#mujoco Example environment: Ant-v2, make a 3D four-legged robot walk.
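The sketch below imitates the calling convention of classic Gym environments, where reset() returns an observation and step(action) returns (observation, reward, done, info). It deliberately does not import Gym: CoinFlipEnv, evaluate and the two policies are hypothetical names made up for this example, so it runs with the standard library alone. Note that newer Gym releases and Gymnasium changed these return values, so check the installed version before relying on this shape.

```python
import random


class CoinFlipEnv:
    """Toy environment: the observation is a coin (0 or 1); guessing it earns +1."""

    def __init__(self, episode_length=10, seed=0):
        self.episode_length = episode_length
        self.rng = random.Random(seed)
        self.steps = 0
        self.coin = 0

    def reset(self):
        self.steps = 0
        self.coin = self.rng.choice((0, 1))
        return self.coin

    def step(self, action):
        # Reward the agent if its action matches the coin it was shown.
        reward = 1.0 if action == self.coin else 0.0
        self.steps += 1
        self.coin = self.rng.choice((0, 1))
        done = self.steps >= self.episode_length
        return self.coin, reward, done, {}


def copy_policy(observation):
    """Repeat the observed coin as the action."""
    return observation


def always_zero(observation):
    """Baseline that ignores the observation."""
    return 0


def evaluate(env, policy, episodes=5):
    """Average total reward per episode, used to compare policies."""
    totals = []
    for _ in range(episodes):
        observation = env.reset()
        done = False
        total = 0.0
        while not done:
            observation, reward, done, _info = env.step(policy(observation))
            total += reward
        totals.append(total)
    return sum(totals) / len(totals)


copy_score = evaluate(CoinFlipEnv(seed=0), copy_policy)
zero_score = evaluate(CoinFlipEnv(seed=0), always_zero)
print(copy_score, zero_score)
```

Because every environment exposes the same two methods, the same evaluate function can score any policy on any environment, which is what makes comparing algorithms straightforward.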
  • 12. OpenAI Universe ● A platform for measuring and training an AI's general intelligence across games, websites and other applications. ● Makes it possible for any existing program to become an OpenAI Gym environment, without needing special access to the program's internals, source code, or APIs. ● It does this by packaging the program into a Docker container and presenting the AI with the same interface a human uses: sending keyboard and mouse events, and receiving screen pixels. ● Contains over 1,000 environments in which an AI agent can take actions and gather observations.
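The "pixels in, keyboard and mouse events out" interface can be pictured with plain data types. The sketch below is not the Universe API: KeyEvent, PointerEvent, brightest_pixel and click_brightest are hypothetical names, and the frame is a tiny grayscale grid rather than a real screen capture. It only shows the shape of an agent that sees nothing but pixels and answers with input events.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass(frozen=True)
class KeyEvent:
    key: str    # e.g. "ArrowUp"
    down: bool  # True for press, False for release


@dataclass(frozen=True)
class PointerEvent:
    x: int
    y: int
    button_mask: int  # 1 = left button held, 0 = released


Event = Union[KeyEvent, PointerEvent]
Frame = List[List[int]]  # rows of grayscale pixel values, 0-255


def brightest_pixel(frame: Frame) -> Tuple[int, int]:
    """Return the (x, y) position of the brightest pixel in the frame."""
    best_position = (0, 0)
    best_value = -1
    for y, row in enumerate(frame):
        for x, value in enumerate(row):
            if value > best_value:
                best_value = value
                best_position = (x, y)
    return best_position


def click_brightest(frame: Frame) -> List[Event]:
    """Toy agent: click (press, then release) on the brightest pixel."""
    x, y = brightest_pixel(frame)
    return [PointerEvent(x, y, 1), PointerEvent(x, y, 0)]


frame = [
    [0, 10, 0],
    [0, 0, 200],
]
events = click_brightest(frame)
print(events)
```

Since the agent depends only on the frame and the event types, the same agent code could in principle drive any program that can be shown on a screen, which is the idea behind the bullet about not needing the program's internals.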
  • 13. Command: Create a Conda Environment for the Universe Starter Agent ● Conda is an open source package management system and environment management system that runs on Windows, macOS and Linux. ● Conda quickly installs, runs and updates packages and their dependencies. ● Conda easily creates, saves, loads and switches between environments on your local computer. Commands: (1) conda create --name universe-starter-agent python=3.5 (2) source activate universe-starter-agent
  • 15. Gym vs Universe ● OpenAI Universe is like a much bigger OpenAI Gym. ● OpenAI Gym has some basic tasks, such as pole balancing and pendulum swing-up, and some more difficult ones, such as Atari games like Space Invaders. ● Gym is like an enclosed world, a “gym” in which to exercise and develop RL algorithms. ● OpenAI Universe has a much wider variety of tasks, and is more focused on giving RL networks/algorithms the ability to interact with the real world: playing games, using an actual (virtual) keyboard and mouse to interact with buttons and sliders on webpages, etc. ● Universe is built on top of Gym.
  • 16. Use Cases Environments for various tasks, such as: ● Sending an email ● Clicking the mouse and sending keyboard events ● More and more environments are being added
  • 17. Thank You Ashish Kumar LinkedIn: https://www.linkedin.com/in/ashkmr1 Twitter: @ashish_fagna E-mail: ashish.fagna@gmail.com