© 2019, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
AWS DeepRacer
Revving up with
Reinforcement Learning
How can we put
reinforcement learning
in the hands of all
developers? Literally.
Robotic autonomous race car
Virtual simulator, to train and experiment
Racing League
What is AWS DeepRacer?
My first attempt at
building a self driving
car…
(2014)
AWS Robocar Rally (2017)
What is Reinforcement Learning?
SUPERVISED UNSUPERVISED REINFORCEMENT
Machine learning overview
RL vs. other approaches for robotic racing
METHOD Supervised learning
HOW IT WORKS An expert driver controls a real-world car that has a camera. The camera images are saved as inputs and the corresponding driving actions (speed and steering angle) as outputs. A model is trained on these pairs.
RESULT Provide a state (image) to the model and receive a driving action.
METHOD Reinforcement learning
HOW IT WORKS A virtual agent repeatedly interacts with a simulated environment and logs experience (image, action, new state, reward). The experience is used to train a model, and the new model is used to gather more experience.
RESULT Provide a state (image) to the model and receive a driving action.
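The reinforcement learning row above can be sketched as a loop that logs experience tuples. This is only a minimal illustration and not the DeepRacer simulator: `ToyTrack`, `Experience` and `collect_experience` are hypothetical names, and a one-dimensional track position stands in for the camera image.

```python
import random
from collections import namedtuple

# One logged interaction: (state, action, reward, new state).
Experience = namedtuple("Experience", ["state", "action", "reward", "new_state"])


class ToyTrack:
    """Hypothetical 1-D stand-in for the simulator: position 0 is the centre line."""

    def __init__(self):
        self.position = 0

    def step(self, action):
        # action is -1 (steer left), 0 (straight) or +1 (steer right)
        self.position += action
        reward = 1.0 if self.position == 0 else 0.1
        done = abs(self.position) > 2  # the car has left the track
        return self.position, reward, done


def collect_experience(policy, max_steps=20):
    """Run one episode with the given policy and return the logged experience."""
    env = ToyTrack()
    state, log = env.position, []
    for _ in range(max_steps):
        action = policy(state)
        new_state, reward, done = env.step(action)
        log.append(Experience(state, action, reward, new_state))
        state = new_state
        if done:
            break
    return log


rng = random.Random(0)
experience = collect_experience(lambda state: rng.choice([-1, 0, 1]))
print(len(experience), experience[0])
```

In training, the logged tuples would feed a model update, and the updated model would then act as the policy for the next round of collection.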
AUTONOMOUS CARS FINANCIAL TRADING DATACENTER COOLING FLEET LOGISTICS
Reinforcement Learning use cases
RL for AB Testing
Reinforcement learning terms
AGENT ENVIRONMENT STATE
ACTION
EPISODE REWARD
VALUE FUNCTION
POLICY FUNCTION
How does learning happen?
Policy Function
Input
Output
RL algorithms: Vanilla policy gradient
[Figure: objective J(θ); weights 0.4 ± δ and 0.3 ± δ are adjusted to give new weights]
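As a rough illustration of how a vanilla policy gradient nudges weights towards actions that earn more reward, here is a small sketch on a toy problem with one state and two actions. The problem, the fixed rewards, the learning rate and the function names are all assumptions made for the example and are not taken from the slides.

```python
import math
import random


def softmax(weights):
    """Turn raw weights into action probabilities."""
    peak = max(weights)
    exps = [math.exp(w - peak) for w in weights]
    total = sum(exps)
    return [e / total for e in exps]


def train_policy_gradient(rewards, steps=500, learning_rate=0.5, seed=0):
    """Vanilla policy gradient on a toy problem with a single state.

    rewards[i] is the (assumed, fixed) reward for taking action i.
    """
    rng = random.Random(seed)
    weights = [0.0] * len(rewards)
    for _ in range(steps):
        probs = softmax(weights)
        action = rng.choices(range(len(rewards)), weights=probs)[0]
        reward = rewards[action]
        # d/dw_i of log pi(action) is (1 if i == action else 0) - probs[i].
        for i in range(len(weights)):
            grad_log_prob = (1.0 if i == action else 0.0) - probs[i]
            weights[i] += learning_rate * reward * grad_log_prob
    return softmax(weights)


probs = train_policy_gradient(rewards=[1.0, 0.0])
print(probs)  # probability mass shifts towards action 0
```

Each update moves the weights in the direction that makes rewarded actions more likely, which is the intuition behind maximising J(θ).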
RL algorithms: Proximal policy optimization (PPO)
(State, action, reward, next state)
(s_t, a_t, r_t, s_{t+1})
Advantage
Improved model
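The path from experience tuples to an advantage and then to an improved model can be sketched with the two quantities below. This is a simplified, single-sample view: the one-step advantage estimate, the value inputs, and the `gamma` and `epsilon` defaults are assumptions chosen for illustration, while the clipped term follows the standard PPO clipped surrogate objective.

```python
def one_step_advantage(reward, value, next_value, gamma=0.99):
    """Estimate how much better the action was than expected, from (s_t, a_t, r_t, s_{t+1}).

    value and next_value are value-function estimates for s_t and s_{t+1}.
    """
    return reward + gamma * next_value - value


def ppo_clipped_term(ratio, advantage, epsilon=0.2):
    """Clipped surrogate for one sample; ratio = new_policy_prob / old_policy_prob."""
    clipped_ratio = max(1.0 - epsilon, min(ratio, 1.0 + epsilon))
    return min(ratio * advantage, clipped_ratio * advantage)


advantage = one_step_advantage(reward=1.0, value=0.5, next_value=1.0, gamma=0.9)
print(advantage)
print(ppo_clipped_term(ratio=1.5, advantage=1.0))   # gain is capped once the ratio exceeds 1 + epsilon
print(ppo_clipped_term(ratio=0.5, advantage=-1.0))  # the more pessimistic of the two terms is kept
```

The clipping keeps each update close to the policy that collected the experience, which is what "proximal" refers to.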
What does a reward function look like?
def reward_function(on_track, x, y, distance_from_center, car_orientation,
                    progress, steps, throttle, steering, track_width, waypoints,
                    closest_waypoint):
    import math

    # Example centerline-following reward function
    marker_1 = 0.1 * track_width
    marker_2 = 0.25 * track_width
    marker_3 = 0.5 * track_width

    reward = 1e-3
    if 0.0 <= distance_from_center <= marker_1:
        reward = 1
    elif distance_from_center <= marker_2:
        reward = 0.5
    elif distance_from_center <= marker_3:
        reward = 0.1
    else:
        reward = 1e-3  # likely crashed / close to off track

    return float(reward)
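To see how the bands in the reward function behave, the sketch below isolates the same thresholds in a hypothetical helper, `centerline_reward`, and evaluates it at a few distances. It assumes a non-negative distance and a track width of 1.0, both chosen only for illustration.

```python
def centerline_reward(distance_from_center, track_width):
    """Banded reward: the closer to the centre line, the higher the reward."""
    bands = [(0.1, 1.0), (0.25, 0.5), (0.5, 0.1)]  # (fraction of track width, reward)
    for fraction, reward in bands:
        if distance_from_center <= fraction * track_width:
            return reward
    return 1e-3  # likely crashed or close to off track


for distance in (0.05, 0.2, 0.4, 0.6):
    print(distance, centerline_reward(distance, track_width=1.0))
```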
Snakes on the (control) plane
(@frankmunz)
Fish and Chips Chole Poori Paneer Uttappam Khara Dosa
Explore vs Exploit
Explore the grid and accumulate rewards
Episode: the process of exploring the grid and earning rewards until the car moves out of bounds or reaches the goal.
[Figure legend: Out of bounds; Final destination]
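An episode on a grid like this can be sketched as follows. The grid size, the goal square, the reward values and the names `run_episode` and `square_reward` are assumptions made for the example; the slides do not give them.

```python
GRID_ROWS, GRID_COLS = 3, 5   # assumed grid size
GOAL = (1, 4)                 # assumed goal: right end of the middle row
MOVES = {"up": (-1, 0), "down": (1, 0), "right": (0, 1)}


def square_reward(row):
    """Assumed rewards: the central row pays more than the edge rows."""
    return 1.0 if row == 1 else 0.1


def run_episode(actions, start=(1, 0)):
    """Follow the actions until the car leaves the grid, reaches the goal, or runs out of moves."""
    row, col = start
    total = 0.0
    for action in actions:
        d_row, d_col = MOVES[action]
        row, col = row + d_row, col + d_col
        if not (0 <= row < GRID_ROWS and 0 <= col < GRID_COLS):
            return total, "out of bounds"
        total += square_reward(row)
        if (row, col) == GOAL:
            return total, "goal"
    return total, "unfinished"


print(run_episode(["right"] * 4))          # stays on the central row and reaches the goal
print(run_episode(["up", "up", "right"]))  # leaves the grid after one rewarded step
```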
Iterate! Learning doesn’t happen on the first go!
The model learns which subsequent actions will result in the
highest cumulative reward.
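One way to make "cumulative reward" concrete is to compute, for each step of an episode, the total reward collected from that step onwards. The sketch below is an illustration only: `reward_to_go` is a hypothetical helper, and the optional discount factor `gamma` is an assumption that does not appear in the slides.

```python
def reward_to_go(rewards, gamma=1.0):
    """Cumulative (optionally discounted) reward from each step to the end of the episode."""
    returns = [0.0] * len(rewards)
    running = 0.0
    for i in reversed(range(len(rewards))):
        running = rewards[i] + gamma * running
        returns[i] = running
    return returns


print(reward_to_go([1.0, 0.5, 0.1]))             # undiscounted totals
print(reward_to_go([1.0, 0.5, 0.1], gamma=0.5))  # later rewards count for less
```

An action early in the episode is credited with everything that follows it, which is why the agent can learn to prefer actions whose payoff arrives later.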
Agent Improves as it gains more experience.
As the agent gains more and more experience, it learns to
stay on the central squares to get higher rewards.
Exploration
Agent Improves as it gains more experience.
Exploration vs. exploitation
EXPLORATION EXPLOITATION
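A common way to balance the two is an epsilon-greedy rule, sketched below. It is offered as one possible illustration and is not necessarily what DeepRacer uses; `epsilon_greedy` and its parameters are hypothetical names.

```python
import random


def epsilon_greedy(estimated_rewards, epsilon, rng=random):
    """Explore with probability epsilon, otherwise exploit the best-known action."""
    if rng.random() < epsilon:
        # Explore: pick any action at random.
        return rng.randrange(len(estimated_rewards))
    # Exploit: pick the action with the highest estimated reward.
    return max(range(len(estimated_rewards)), key=estimated_rewards.__getitem__)


estimates = [0.1, 0.9, 0.3]
print(epsilon_greedy(estimates, epsilon=0.0))  # always exploits the best-known action
print(epsilon_greedy(estimates, epsilon=1.0))  # always explores
```

A high epsilon early in training favours exploration, and lowering it over time shifts the agent towards exploiting what it has learned.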
Convergence
 
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
 
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 
Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdf
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
 

Revving up with Reinforcement Learning by Ricardo Sueiras

  • 1. © 2019, Amazon Web Services, Inc. or its Affiliates. All rights reserved. AWS DeepRacer Revving up with Reinforcement Learning
  • 2. How can we put reinforcement learning in the hands of all developers? Literally.
  • 3. What is AWS DeepRacer? Robotic autonomous race car; virtual simulator, to train and experiment; Racing League.
  • 4. My first attempt at building a self-driving car… (2014)
  • 6.
  • 7.
  • 8. What is Reinforcement Learning?
  • 9. Machine learning overview: SUPERVISED, UNSUPERVISED, REINFORCEMENT
  • 10. RL vs. other approaches for robotic racing. METHOD: Supervised learning. HOW IT WORKS: An expert driver controls a real-world car that has a camera. The camera images are saved as inputs and the corresponding driving actions (speed and steering angle) as outputs, and a model is trained on these pairs. RESULT: Provide the state (image) to the model and receive a driving action. METHOD: Reinforcement learning. HOW IT WORKS: A virtual agent repeatedly interacts with a simulated environment and logs experience (image, action, new state, reward). The experience is used to train a model, and the new model is used to gather more experience. RESULT: Provide the state (image) to the model and receive a driving action.
  • 11. Reinforcement Learning use cases: AUTONOMOUS CARS, FINANCIAL TRADING, DATACENTER COOLING, FLEET LOGISTICS
  • 12. RL for A/B Testing
  • 13. Reinforcement learning terms: AGENT, ENVIRONMENT, STATE, ACTION, EPISODE, REWARD
  • 14. How does learning happen? VALUE FUNCTION, POLICY FUNCTION
  • 15. Policy Function: Input, Output
  • 16. RL algorithms: Vanilla policy gradient. (Figure labels: J(θ); new weights 0.4 ± 𝛿, 0.3 ± 𝛿)
  • 17. RL algorithms: Proximal policy optimization (PPO). (State, action, reward, next state) = (s_t, a_t, r_t, s_{t+1}); Advantage; Improved model
  • 18. What does a reward function look like?

      def reward_function(on_track, x, y, distance_from_center, car_orientation,
                          progress, steps, throttle, steering, track_width,
                          waypoints, closest_waypoint):
          import math

          # Example centerline-following reward function
          marker_1 = 0.1 * track_width
          marker_2 = 0.25 * track_width
          marker_3 = 0.5 * track_width

          reward = 1e-3
          if distance_from_center >= 0.0 and distance_from_center <= marker_1:
              reward = 1
          elif distance_from_center <= marker_2:
              reward = 0.5
          elif distance_from_center <= marker_3:
              reward = 0.1
          else:
              reward = 1e-3  # likely crashed / close to off track

          return float(reward)
  • 19. Snakes on the (control) plane (@frankmunz)
  • 20. Explore vs Exploit: Fish and Chips, Chole Poori, Paneer Uttappam, Khara Dosa
  • 21. Explore the grid and accumulate rewards. Episode: the process of exploring the grid and earning rewards until the car moves out of bounds or reaches the goal. (Figure labels: Out of bounds, Final Destination)
  • 22. Iterate! Learning doesn’t happen on the first go! The model learns which subsequent actions will result in the highest cumulative rewards.
  • 23. Agent improves as it gains more experience. As the agent gains more and more experience, it learns to stay on the central squares to get higher rewards.
  • 24. Exploration
  • 25. Agent improves as it gains more experience.
  • 26. Exploration vs. exploitation: EXPLORATION, EXPLOITATION
  • 27. Convergence
  • 28.
  • 29.
  • 30.
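
Slide 17 names PPO and the advantage but does not show the objective. As a rough sketch, assuming the commonly used clipped-surrogate form of PPO (the slides do not confirm which variant is meant), the per-sample term could look like this. The function name, the `clip_eps` default of 0.2, and the sample numbers are illustrative assumptions.

```python
def ppo_clipped_term(ratio, advantage, clip_eps=0.2):
    """Per-sample clipped surrogate term (assumed PPO variant).

    ratio     : new_policy_prob / old_policy_prob for the action taken
    advantage : how much better the action was than expected
    clip_eps  : assumed clipping range, not taken from the slides
    """
    # Keep the probability ratio within [1 - eps, 1 + eps].
    clipped_ratio = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
    # Take the more pessimistic of the clipped and unclipped estimates.
    return min(ratio * advantage, clipped_ratio * advantage)


if __name__ == "__main__":
    # A large ratio with a positive advantage is capped by the clip.
    print(ppo_clipped_term(1.5, 1.0))
    # A large ratio with a negative advantage is not capped.
    print(ppo_clipped_term(1.5, -1.0))
```

The clip limits how far a single update can push the policy when the action looked good, which is one way to read the "Improved model" step on the slide as a sequence of small, bounded improvements.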
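
To see how the slide-18 reward function behaves, here is a minimal sketch that keeps only the two inputs the centerline logic actually reads. The name `centerline_reward` and the track width of 1.0 are assumptions made for illustration; the thresholds and reward values are those shown on the slide.

```python
def centerline_reward(distance_from_center, track_width):
    """Reduced copy of the slide-18 logic (hypothetical helper name)."""
    marker_1 = 0.1 * track_width
    marker_2 = 0.25 * track_width
    marker_3 = 0.5 * track_width

    if 0.0 <= distance_from_center <= marker_1:
        return 1.0   # very close to the centerline
    if distance_from_center <= marker_2:
        return 0.5   # drifting
    if distance_from_center <= marker_3:
        return 0.1   # near the edge of the track
    return 1e-3      # likely crashed / close to off track


if __name__ == "__main__":
    track_width = 1.0  # assumed value, for illustration only
    for distance in (0.0, 0.05, 0.2, 0.4, 0.6):
        print(distance, centerline_reward(distance, track_width))
```

The reward drops in steps as the car moves away from the centerline, so an agent that maximizes it is nudged toward the middle of the track.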
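
Slide 22 says the model learns which actions lead to the highest cumulative rewards. One common way to make "cumulative" concrete is a discounted sum of per-step rewards. The sketch below assumes that formulation; the discount factor `gamma` and the two sample episodes are illustrative and do not come from the slides.

```python
def discounted_return(rewards, gamma=0.9):
    """Sum of rewards where later steps count slightly less (gamma is assumed)."""
    total = 0.0
    # Work backwards so each step adds its reward plus the discounted future.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


if __name__ == "__main__":
    # Hypothetical episodes using the reward levels from slide 18.
    stays_centered = [1.0, 1.0, 1.0, 1.0]
    drifts_off = [1.0, 0.5, 0.1, 1e-3]
    print(discounted_return(stays_centered))
    print(discounted_return(drifts_off))
```

An episode that stays near the centerline accumulates more reward than one that drifts off, which is the signal the agent learns from over many iterations.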
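
The explore vs. exploit trade-off on slides 20 and 26 can be illustrated with an epsilon-greedy rule on a small multi-armed bandit. This is a swapped-in textbook technique, not something the slides specify. The option payoffs, the `epsilon` value, and all function names below are hypothetical.

```python
import random


def epsilon_greedy(estimates, epsilon, rng):
    """Pick a random option with probability epsilon, else the best known one."""
    if rng.random() < epsilon:
        return rng.randrange(len(estimates))  # explore
    return max(range(len(estimates)), key=lambda i: estimates[i])  # exploit


def run_bandit(true_means, epsilon=0.1, steps=2000, seed=0):
    """Simulate repeated choices and track a running average reward per option."""
    rng = random.Random(seed)
    estimates = [0.0] * len(true_means)
    counts = [0] * len(true_means)
    for _ in range(steps):
        choice = epsilon_greedy(estimates, epsilon, rng)
        reward = rng.gauss(true_means[choice], 1.0)  # noisy payoff
        counts[choice] += 1
        # Incremental mean update for the chosen option.
        estimates[choice] += (reward - estimates[choice]) / counts[choice]
    return estimates, counts


if __name__ == "__main__":
    # Hypothetical average "enjoyment" of four menu options.
    estimates, counts = run_bandit([0.2, 0.8, 0.5, 0.4])
    print(estimates)
    print(counts)
```

With `epsilon=0` the agent only exploits what it already believes, and with `epsilon=1` it only explores; values in between trade the two off.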