© 2019, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
AWS DeepRacer
Revving up with
Reinforcement Learning
How can we put
reinforcement learning
in the hands of all
developers? literally
Robotic autonomous race car
Racing League
Virtual simulator, to train and experiment
What is AWS DeepRacer?
My first attempt at building a self-driving car… (2014)
AWS Robocar Rally (2017)
What is Reinforcement Learning?
SUPERVISED UNSUPERVISED REINFORCEMENT
Machine learning overview
RL vs. other approaches for robotic racing
METHOD Supervised learning
HOW IT WORKS An expert driver controls a real-world car that has a camera. Save the camera images as inputs and the corresponding driving actions (speed and steering angle) as outputs. Train a model.
RESULT Provide a state (image) to the model and receive a driving action.
METHOD Reinforcement learning
HOW IT WORKS A virtual agent repeatedly interacts with a simulated environment and logs experience (image, action, new state, reward). The experience is used to train a model, and the new model is used to gather more experience.
RESULT Provide a state (image) to the model and receive a driving action.
AUTONOMOUS CARS FINANCIAL TRADING DATACENTER COOLING FLEET LOGISTICS
Reinforcement Learning use cases
RL for AB Testing
Reinforcement learning terms
AGENT ENVIRONMENT STATE
ACTION
EPISODE REWARD
VALUE FUNCTION
POLICY FUNCTION
How does learning happen?
Policy Function
Input
Output
RL algorithms: Vanilla policy gradient
[Figure: objective J(θ); example weights 0.4 ± 𝛿 and 0.3 ± 𝛿 are updated to new weights]
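The slide only hints at how the weights move, so the following is a minimal sketch of a vanilla policy gradient update rather than the presenter's own code. It assumes a two-action toy problem (a bandit with no state), a softmax policy, and a running-average baseline; the function names, reward means, learning rate, and step count are all hypothetical choices made for illustration.

```python
import math
import random


def softmax(prefs):
    """Turn raw preferences (the weights theta) into action probabilities."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def train_bandit(mean_rewards, steps=2000, lr=0.1, seed=0):
    """Nudge theta along reward * grad(log pi) for a stateless toy problem."""
    rng = random.Random(seed)
    theta = [0.0] * len(mean_rewards)
    baseline = 0.0  # running average reward, used to reduce variance
    for t in range(1, steps + 1):
        probs = softmax(theta)
        action = rng.choices(range(len(theta)), weights=probs)[0]
        reward = mean_rewards[action] + rng.gauss(0.0, 0.1)
        baseline += (reward - baseline) / t
        for a in range(len(theta)):
            # Gradient of log softmax: 1 - p for the chosen action, -p otherwise.
            grad_log = (1.0 if a == action else 0.0) - probs[a]
            theta[a] += lr * (reward - baseline) * grad_log
    return softmax(theta)


if __name__ == "__main__":
    # Hypothetical mean rewards: the second action is the better one.
    print(train_bandit([0.2, 1.0]))
```

Each update is a small step (the ± 𝛿 on the slide) in the direction that makes well-rewarded actions more likely.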
RL algorithms: Proximal policy optimization (PPO)
Experience: (state, action, reward, next state) = (s_t, a_t, r_t, s_{t+1})
Advantage
Improved model
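The slide does not spell out the objective, so this is a sketch under an assumption: it uses the clipped surrogate term commonly associated with PPO, applied to a single sample. `ratio` stands for the new policy's probability of the action divided by the old policy's, `advantage` is the advantage estimate named on the slide, and `epsilon=0.2` is a typical but hypothetical clipping range.

```python
def clipped_surrogate(ratio, advantage, epsilon=0.2):
    """Per-sample PPO-style objective: limit how far one update can move the policy."""
    clipped_ratio = max(1.0 - epsilon, min(1.0 + epsilon, ratio))
    # Taking the minimum keeps the more pessimistic of the two estimates.
    return min(ratio * advantage, clipped_ratio * advantage)


if __name__ == "__main__":
    # A good action whose probability already rose by 50%: the gain is capped.
    print(clipped_surrogate(1.5, 2.0))
    # A bad action whose probability already fell by 50%: no extra credit is given.
    print(clipped_surrogate(0.5, -1.0))
```

The clipping is what keeps the "improved model" close to the model that collected the experience.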
What does a reward function look like?
def reward_function(on_track, x, y, distance_from_center, car_orientation,
                    progress, steps, throttle, steering, track_width, waypoints,
                    closest_waypoint):
    import math

    # Example centerline-following reward function
    marker_1 = 0.1 * track_width
    marker_2 = 0.25 * track_width
    marker_3 = 0.5 * track_width

    reward = 1e-3
    if distance_from_center >= 0.0 and distance_from_center <= marker_1:
        reward = 1
    elif distance_from_center <= marker_2:
        reward = 0.5
    elif distance_from_center <= marker_3:
        reward = 0.1
    else:
        reward = 1e-3  # likely crashed / close to off track

    return float(reward)
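A quick way to sanity-check a reward function before training is to call it with a few hand-picked distances. The sketch below is not the DeepRacer API: `centerline_reward` is a hypothetical two-argument helper that reproduces the same tiers as the slide's function, and the track width of 1.0 is an assumed value.

```python
def centerline_reward(distance_from_center, track_width):
    """Hypothetical helper mirroring the slide's tiers: closer to centre, higher reward."""
    marker_1 = 0.1 * track_width
    marker_2 = 0.25 * track_width
    marker_3 = 0.5 * track_width

    if 0.0 <= distance_from_center <= marker_1:
        return 1.0
    if distance_from_center <= marker_2:
        return 0.5
    if distance_from_center <= marker_3:
        return 0.1
    return 1e-3  # likely off track


if __name__ == "__main__":
    # Assumed track width of 1.0; distances sweep from the centre outward.
    for distance in (0.05, 0.2, 0.4, 0.6):
        print(distance, centerline_reward(distance, track_width=1.0))
```

The reward drops in steps as the car drifts away from the centre line, which is the behaviour the tiers are meant to encourage.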
Snakes on the (control) plane
(@frankmunz)
Fish and Chips Chole Poori Paneer Uttappam Khara Dosa
Explore vs Exploit
Explore the grid and accumulate rewards
Episode: the process of exploring the grid and earning rewards until the car moves out of bounds or reaches the goal.
Figure labels: Out of bounds, Final destination.
Iterate! Learning doesn’t happen on the first go!
The model learns which sequence of actions will result in the
highest cumulative reward.
Agent Improves as it gains more experience.
As the agent gains more and more experience, it learns to
stay on the central squares to get higher rewards.
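The grid example can be sketched as a single episode rollout. This is an illustration under assumptions, not the simulator used in the talk: the 3 x 5 grid, the start and goal squares, and the reward values (1.0 on the central row, 0.1 elsewhere) are all hypothetical.

```python
def run_episode(actions, rows=3, cols=5, start=(1, 0), goal=(1, 4)):
    """Roll out one episode and return (cumulative_reward, outcome)."""
    moves = {"up": (-1, 0), "down": (1, 0), "right": (0, 1), "left": (0, -1)}
    centre_row = rows // 2
    row, col = start
    total = 0.0
    for action in actions:
        d_row, d_col = moves[action]
        row, col = row + d_row, col + d_col
        if not (0 <= row < rows and 0 <= col < cols):
            return total, "out of bounds"  # episode ends with no further reward
        # Central squares pay more, so staying on them maximises the total.
        total += 1.0 if row == centre_row else 0.1
        if (row, col) == goal:
            return total, "goal"
    return total, "unfinished"


if __name__ == "__main__":
    print(run_episode(["right"] * 4))
    print(run_episode(["up", "right", "right", "right", "right", "down"]))
    print(run_episode(["up", "up"]))
```

Comparing the totals for the two routes that reach the goal shows why an agent that has seen enough episodes prefers the central squares.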
Exploration
Agent Improves as it gains more experience.
Exploration vs. exploitation
EXPLORATION EXPLOITATION
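One common way to balance the two is an epsilon-greedy rule, shown here as a minimal sketch; the slides do not say which strategy the talk had in mind, so treat the technique and the names `epsilon_greedy` and `estimates` as assumptions. With probability `epsilon` the agent explores a random option; otherwise it exploits the option with the highest estimated reward.

```python
import random


def epsilon_greedy(estimates, epsilon, rng):
    """Return the index of the chosen option."""
    if rng.random() < epsilon:
        # Explore: try any option, including ones that currently look worse.
        return rng.randrange(len(estimates))
    # Exploit: pick the option with the best estimate so far.
    return max(range(len(estimates)), key=lambda i: estimates[i])


if __name__ == "__main__":
    rng = random.Random(0)
    estimates = [0.1, 0.9, 0.3]  # hypothetical reward estimates for three options
    picks = [epsilon_greedy(estimates, 0.1, rng) for _ in range(20)]
    print(picks)
```

A larger `epsilon` early in training favours exploration; lowering it over time shifts the agent toward exploitation.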
Convergence
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 

Revving up with Reinforcement Learning by Ricardo Sueiras

  • 1. © 2019, Amazon Web Services, Inc. or its Affiliates. All rights reserved. AWS DeepRacer Revving up with Reinforcement Learning
  • 2. How can we put reinforcement learning in the hands of all developers? literally
  • 3. What is AWS DeepRacer? Robotic autonomous race car; Racing League; Virtual simulator, to train and experiment
  • 4. My first attempt at building a self-driving car… (2014)
  • 6.
  • 7.
  • 8. What is Reinforcement Learning?
  • 9. Machine learning overview: SUPERVISED, UNSUPERVISED, REINFORCEMENT
  • 10. RL vs. other approaches for robotic racing
      METHOD: Supervised learning
      HOW IT WORKS: An expert driver controls a real-world car that has a camera. The camera images are saved as inputs and the corresponding driving actions (speed and steering angle) as outputs, and a model is trained on these pairs.
      RESULT: Provide a state (image) to the model and receive a driving action.
      METHOD: Reinforcement learning
      HOW IT WORKS: A virtual agent repeatedly interacts with a simulated environment and logs experience (image, action, new state, reward). The experience is used to train a model, and the new model is used to gather more experience.
      RESULT: Provide a state (image) to the model and receive a driving action.
  • 11. Reinforcement Learning use cases: AUTONOMOUS CARS, FINANCIAL TRADING, DATACENTER COOLING, FLEET LOGISTICS
  • 12. RL for A/B Testing
  • 13. Reinforcement learning terms: AGENT, ENVIRONMENT, STATE, ACTION, EPISODE, REWARD
  • 14. How does learning happen? VALUE FUNCTION, POLICY FUNCTION
  • 15. Policy Function: Input → Output
  • 16. RL algorithms: Vanilla policy gradient [Figure labels: objective J(θ); weights 0.4 ± 𝛿 and 0.3 ± 𝛿; new weights]
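To make the vanilla policy gradient idea concrete, here is a minimal sketch under assumptions that are not from the slides: a softmax policy over two actions, fixed known rewards, and an exact (not sampled) gradient. Each step nudges the weights in the direction that increases the expected reward J(θ). The names `softmax`, `expected_reward`, and `policy_gradient_step`, the reward values, and the learning rate are all illustrative.

```python
import math

def softmax(weights):
    """Turn raw weights into action probabilities."""
    largest = max(weights)
    exps = [math.exp(w - largest) for w in weights]
    total = sum(exps)
    return [e / total for e in exps]

def expected_reward(weights, rewards):
    """J(theta): the reward we expect when sampling actions from the policy."""
    return sum(p * r for p, r in zip(softmax(weights), rewards))

def policy_gradient_step(weights, rewards, learning_rate=0.5):
    """One gradient-ascent step on J(theta) for a softmax policy."""
    probs = softmax(weights)
    j = sum(p * r for p, r in zip(probs, rewards))
    # For a softmax policy, dJ/dtheta_i = pi_i * (r_i - J).
    grad = [p * (r - j) for p, r in zip(probs, rewards)]
    return [w + learning_rate * g for w, g in zip(weights, grad)]

weights = [0.4, 0.3]   # starting weights (illustrative)
rewards = [0.1, 1.0]   # assumed: the second action is the better one
history = [expected_reward(weights, rewards)]
for _ in range(200):
    weights = policy_gradient_step(weights, rewards)
    history.append(expected_reward(weights, rewards))

print(f"J before: {history[0]:.3f}, J after: {history[-1]:.3f}")
```

In a real setting the rewards are not known in advance, so the gradient is estimated from sampled experience rather than computed exactly as it is here.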
  • 17. RL algorithms: Proximal policy optimization (PPO) [Figure labels: (state, action, reward, next state) $(s_t, a_t, r_t, s_{t+1})$; Advantage; Improved model]
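The PPO slide shows experience tuples and an advantage estimate feeding an improved model. One widely used form of PPO, the clipped surrogate objective, is sketched below for a single sample. The slides do not say which variant DeepRacer uses, so the clipping range `epsilon=0.2` and the function name `ppo_clipped_objective` are assumptions.

```python
def ppo_clipped_objective(ratio, advantage, epsilon=0.2):
    """Per-sample clipped surrogate: min(r * A, clip(r, 1 - eps, 1 + eps) * A).

    ratio     -- new-policy probability of the action divided by the old-policy one
    advantage -- how much better the action was than expected
    """
    clipped_ratio = max(1.0 - epsilon, min(ratio, 1.0 + epsilon))
    return min(ratio * advantage, clipped_ratio * advantage)

# A large ratio with a positive advantage is capped at (1 + epsilon) * advantage.
print(ppo_clipped_objective(1.5, 1.0))
# A ratio inside the clipping range passes through unchanged.
print(ppo_clipped_objective(1.0, 2.0))
```

The cap removes the incentive to move the new policy far from the old one in a single update, which is the "proximal" part of the name.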
  • 18. What does a reward function look like?

        def reward_function(on_track, x, y, distance_from_center, car_orientation,
                            progress, steps, throttle, steering, track_width,
                            waypoints, closest_waypoint):
            import math

            # Example centerline-following reward function
            marker_1 = 0.1 * track_width
            marker_2 = 0.25 * track_width
            marker_3 = 0.5 * track_width

            reward = 1e-3
            if distance_from_center >= 0.0 and distance_from_center <= marker_1:
                reward = 1
            elif distance_from_center <= marker_2:
                reward = 0.5
            elif distance_from_center <= marker_3:
                reward = 0.1
            else:
                reward = 1e-3  # likely crashed / close to off track

            return float(reward)
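A quick way to see how the tiers in the slide's reward function behave is to evaluate them at a few distances from the center line. The sketch below restates only the tier logic in a hypothetical helper, `centerline_reward`, with an assumed track width of 1.0. It is not the full DeepRacer function signature.

```python
def centerline_reward(distance_from_center, track_width):
    """Tiered reward: the closer to the center line, the higher the reward."""
    if 0.0 <= distance_from_center <= 0.1 * track_width:
        return 1.0
    if distance_from_center <= 0.25 * track_width:
        return 0.5
    if distance_from_center <= 0.5 * track_width:
        return 0.1
    return 1e-3  # likely crashed or close to off track

# Evaluate the tiers at four sample distances (track width assumed to be 1.0).
samples = {d: centerline_reward(d, track_width=1.0) for d in (0.05, 0.2, 0.4, 0.6)}
print(samples)  # {0.05: 1.0, 0.2: 0.5, 0.4: 0.1, 0.6: 0.001}
```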
  • 19. Snakes on the (control) plane (@frankmunz)
  • 20. Explore vs Exploit: Fish and Chips, Chole Poori, Paneer Uttappam, Khara Dosa
  • 21. Explore the grid and accumulate rewards. Episode: the process of exploring the grid and earning rewards until the car moves out of bounds or reaches the goal. [Figure labels: Out of bounds; Final Destination]
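The episode definition on this slide can be sketched as a loop over a small grid. Everything concrete here is assumed for illustration: the 3x5 grid, the reward values (higher on the middle row), the goal cell, the three moves, and the function name `run_episode`. The episode ends when the agent steps out of bounds or reaches the goal.

```python
GRID_REWARDS = [
    [0.1, 0.1, 0.1, 0.1, 0.1],
    [1.0, 1.0, 1.0, 1.0, 1.0],  # the center row pays the most
    [0.1, 0.1, 0.1, 0.1, 0.1],
]
GOAL = (1, 4)  # assumed final destination: end of the center row
MOVES = {"up": (-1, 0), "down": (1, 0), "forward": (0, 1)}

def run_episode(actions, start=(1, 0)):
    """Apply actions in order; return (total reward, reason the episode ended)."""
    row, col = start
    total = 0.0
    for action in actions:
        d_row, d_col = MOVES[action]
        row, col = row + d_row, col + d_col
        in_bounds = 0 <= row < len(GRID_REWARDS) and 0 <= col < len(GRID_REWARDS[0])
        if not in_bounds:
            return total, "out_of_bounds"
        total += GRID_REWARDS[row][col]
        if (row, col) == GOAL:
            return total, "goal"
    return total, "unfinished"

print(run_episode(["forward"] * 4))  # (4.0, 'goal')
print(run_episode(["up", "up"]))     # (0.1, 'out_of_bounds')
```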
  • 22. Iterate! Learning doesn’t happen on the first go! The model learns which subsequent actions will result in the highest cumulative reward.
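"Highest cumulative reward" can be made concrete by computing, for each step of an episode, the sum of the rewards collected from that step onward. The sketch below does that. The optional discount factor `gamma` is an assumption that the slide does not mention (with `gamma=1.0` the result is a plain sum), and `cumulative_rewards` is a hypothetical name.

```python
def cumulative_rewards(rewards, gamma=1.0):
    """Return, for every step, the (optionally discounted) reward from there on."""
    returns = []
    running = 0.0
    # Walk backwards so each step can reuse the total of the steps after it.
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns

print(cumulative_rewards([1.0, 1.0, 1.0]))             # [3.0, 2.0, 1.0]
print(cumulative_rewards([1.0, 1.0, 1.0], gamma=0.5))  # [1.75, 1.5, 1.0]
```

An action early in the episode is credited with everything that follows it, which is why the agent can learn to prefer moves that pay off only later.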
  • 23. Agent Improves as it gains more experience. As the agent gains more and more experience, it learns to stay on the central squares to get higher rewards.
  • 24. Exploration
  • 25. Agent Improves as it gains more experience.
  • 26. Exploration vs. exploitation [Figure labels: EXPLORATION; EXPLOITATION]
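One simple way to balance exploration against exploitation is an epsilon-greedy rule: with probability epsilon pick a random action, otherwise pick the action with the highest estimated value. This is a generic sketch and not necessarily how DeepRacer explores. The function name `epsilon_greedy`, the value estimates, and the seed are illustrative assumptions.

```python
import random

def epsilon_greedy(action_values, epsilon, rng):
    """Explore with probability epsilon, otherwise exploit the best-known action."""
    if rng.random() < epsilon:
        return rng.randrange(len(action_values))  # explore: any action
    # exploit: index of the highest estimated value
    return max(range(len(action_values)), key=lambda i: action_values[i])

rng = random.Random(0)   # seeded so the run is repeatable
values = [0.2, 0.9, 0.5]  # assumed value estimates for three actions

greedy_picks = [epsilon_greedy(values, 0.0, rng) for _ in range(10)]
mixed_picks = [epsilon_greedy(values, 0.3, rng) for _ in range(1000)]

print(set(greedy_picks))          # {1}: pure exploitation always picks action 1
print(sorted(set(mixed_picks)))   # every action gets tried when epsilon > 0
```

Lowering epsilon over time is a common way to explore early and exploit later, once the value estimates are more trustworthy.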
  • 27. Convergence
  • 28.
  • 29.
  • 30.