This document summarizes Jouni Luoma's experience winning an AWS DeepRacer competition. The key points are:
1) Jouni Luoma won the DeepRacer league competition in Stockholm with a time of 8.7 seconds, after only about 2 weeks of prior experience with DeepRacer.
2) AWS DeepRacer uses reinforcement learning to train a neural network model to control a 1/18th scale race car in a simulator and real races. The model is trained using rewards to optimize lap times.
3) Jouni found that training in the simulator did not translate well to the real race, as the real track conditions like reflections were different. With testing on a physical track
2. Who am I
Rolf Koski
CTO
Cybercom AWS Business Group
rolf.koski@cybercom.com
rolle
therolle
- “Guy with the sticker”
- Cloud Advisor & Evangelist
- Community Leader
- AWS Partner Ambassador
- Well-Architected Lead
3. Outline
- Intro
- AWS DeepRacer
- Reinforcement Learning
- Building a model & Simulator
- Reality vs. Simulation
- The Race!
4. Jouni Luoma
• Won the DeepRacer league competition at the AWS Summit in Stockholm on 22.5.2019
• Winning time was 8.7 s
• Some experience with AWS, a bit more with machine learning, and 15+ years in the IT consulting business
• About 2 weeks of experience with DeepRacer before the Summit
jouni.luoma@cybercom.com
https://www.linkedin.com/in/jouniluoma/
6. AWS DeepRacer
• 1/18th scale race car driven by reinforcement learning
• 3D racing simulator
• Global racing league
• https://aws.amazon.com/deepracer/
7. DeepRacer – Principles
• The DeepRacer car just runs inference on images captured from the on-board camera
– The model defines the actions (speed, steering)
• The DeepRacer console is a service for training the model
– AWS RoboMaker provides the simulation environment
– AWS SageMaker trains the neural network
• The DeepRacer league is a competition on
– Real tracks & cars at AWS Summits
– A virtual league in the simulator
8. DeepRacer Car Specs
CAR: 1/18th scale 4WD with monster truck chassis
CPU: Intel Atom™ Processor
MEMORY: 4GB RAM
STORAGE: 32GB (expandable)
WI-FI: 802.11ac
CAMERA: 4 MP camera with MJPEG
SOFTWARE: Ubuntu OS 16.04.3 LTS, Intel® OpenVINO™ toolkit, ROS Kinetic
DRIVE BATTERY: 7.4V/1100mAh lithium polymer
COMPUTE BATTERY: 13600mAh USB-C PD
PORTS: 4x USB-A, 1x USB-C, 1x Micro-USB, 1x HDMI
SENSORS: Integrated accelerometer and gyroscope
13. Reinforcement Learning
• Reinforcement Learning (RL) is a type of machine learning, alongside supervised and unsupervised learning.
• RL is the study of agents and how they learn by trial and error.
– Formalizes the idea of rewarding or punishing an agent in order to …
– Make it more likely to repeat or avoid certain behavior in the future.
14. Reinforcement Learning – Principles
[Diagram: the Agent sends an Action (a_t) to the Environment; the Environment returns a State and Reward (s_t, r_t) to the Agent]
• The agent interacts with the environment
• At every step, the agent sees an observation of the state of the environment
– Fully observed
– Partially observed
• The agent decides on an action based on the observation
• The action taken is based on the policy
• The agent gets a new observation of the state and a reward
• The goal is to maximize the cumulative reward, called the return
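The "cumulative reward, called return" can be illustrated with a short calculation. The sketch below is not from the slides: the function name `discounted_return`, the discount factor `gamma`, and the reward values are all assumptions chosen for illustration. With `gamma=1.0` it is the plain sum of rewards; smaller values weight later rewards less.

```python
def discounted_return(rewards, gamma=0.99):
    """Return the sum of rewards r_t weighted by gamma**t.

    `gamma` (the discount factor) is an assumption for this sketch;
    gamma=1.0 gives the plain, undiscounted cumulative reward.
    """
    total = 0.0
    # Walk backwards so each step is: r_t + gamma * (return from t+1).
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


# Hypothetical per-step rewards for one short episode.
episode_rewards = [1.0, 1.0, 0.5, 0.0]

print(discounted_return(episode_rewards, gamma=1.0))  # plain sum
print(discounted_return(episode_rewards, gamma=0.9))  # later rewards count less
```

The agent's policy is trained so that this quantity, averaged over many episodes, goes up.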
16. Building a model & Simulation
• Your job:
– Select the action space for your model
– Provide the logic for how the agent (the car) gets rewards
– Provide hyperparameters for training the model
• AWS DeepRacer takes care of the rest
– AWS DeepRacer console to start and evaluate the training
– AWS RoboMaker for simulating the driving
– The data for neural network training comes from here
– AWS SageMaker for training the model
– AWS Kinesis Video Streams for visual feedback
– AWS CloudWatch logs of the simulation and training events
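To make "select the action space" more concrete: a discrete action space can be thought of as a grid of steering angle and speed combinations that the model picks from. The sketch below is an illustration only. The helper `build_action_space`, its parameters, and the example values (30 degrees, 3 m/s) are assumptions, and the exact way the DeepRacer console generates its grid may differ.

```python
def build_action_space(max_steering_deg, steering_levels, max_speed, speed_levels):
    """Build a grid of (steering angle, speed) actions.

    Hypothetical helper: steering angles are spread evenly between
    -max_steering_deg and +max_steering_deg, speeds evenly up to max_speed.
    """
    if steering_levels < 2 or speed_levels < 1:
        raise ValueError("need at least 2 steering levels and 1 speed level")

    step = 2 * max_steering_deg / (steering_levels - 1)
    angles = [-max_steering_deg + i * step for i in range(steering_levels)]
    speeds = [max_speed * (j + 1) / speed_levels for j in range(speed_levels)]

    # Every combination of angle and speed is one discrete action.
    return [{"steering_angle": a, "speed": s} for a in angles for s in speeds]


# Illustrative values only, not the settings used in the race.
actions = build_action_space(30.0, 5, 3.0, 3)
print(len(actions))
for action in actions[:3]:
    print(action)
```

A larger grid gives the model finer control but more actions to learn, so training tends to take longer.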
18. Building a model & Simulation
• Build the reward logic based on the following inputs
"all_wheels_on_track": Boolean, # flag to indicate if the vehicle is on the track
"x": float, # vehicle's x-coordinate in meters
"y": float, # vehicle's y-coordinate in meters
"distance_from_center": float, # distance in meters from the track center
"is_left_of_center": Boolean, # flag to indicate if the vehicle is to the left of the track center or not
"heading": float, # vehicle's yaw in degrees
"progress": float, # percentage of track completed
"steps": int, # number of steps completed
"speed": float, # vehicle's speed in meters per second (m/s)
"steering_angle": float, # vehicle's steering angle in degrees
"track_width": float, # width of the track
"waypoints": [[float, float], … ], # list of [x,y] as milestones along the track center
"closest_waypoints": [int, int] # indices of the two nearest waypoints
• https://docs.aws.amazon.com/deepracer/latest/developerguide/deepracer-reward-function-input.html
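As an illustration of how these inputs might be combined, here is a minimal sketch of a reward function. It is not the function used for the winning model: the centering term, the `SPEED_WEIGHT` constant, and the small off-track reward are assumptions picked for the example. Only the `reward_function(params)` entry point and the input keys come from the DeepRacer input list above.

```python
def reward_function(params):
    """Reward staying near the center line and driving fast.

    Illustrative sketch only; weights and thresholds are assumptions.
    """
    # Give almost no reward when the car has left the track.
    if not params["all_wheels_on_track"]:
        return 1e-3

    # 1.0 on the center line, falling linearly to 0.0 at the track edge.
    half_width = 0.5 * params["track_width"]
    centering = max(0.0, 1.0 - params["distance_from_center"] / half_width)

    # Hypothetical weight for how much speed matters relative to centering.
    SPEED_WEIGHT = 0.5
    reward = centering + SPEED_WEIGHT * params["speed"]

    return float(max(reward, 1e-3))


# Example call with made-up values for the inputs used above.
example = {
    "all_wheels_on_track": True,
    "distance_from_center": 0.25,
    "track_width": 1.0,
    "speed": 1.0,
}
print(reward_function(example))
```

Raising the speed weight is one way to express "more rewards for speed"; it trades stability for faster laps.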
20. Building a model & Simulation
• The first model did laps of around 45 s in the simulator. Not that good!
– The fastest times on the real track were ~8 s at this point
• Continue training
– More rewards for speed
– Training with faster speeds → 20 s laps, still not good!
• Continue training
– Testing different reward functions and parameters → ~12 s laps
– Still not getting near the 8 s
– But the car runs at top speed almost all the time? What now?
22. Cybercom Circuit Tampere
• We received a track at our office a week before the Summit
• It was HUGE and we had to move some furniture to even get it unfolded
• Good enough, let’s try!
25. Towards Results
• The car seemed to drive off the track all the time!
– We did not have the visual borders around the track
– The track surface was quite shiny → reflections
• Tried different models (also at night)
• Built visual borders
– Hung gardening fabric over office chairs
• Even tried attaching a polarizing filter to the camera
• After a while, got a couple of models doing full laps
27. The Race!
• The doors to the Summit opened at 7.45
• Was first in line at the DeepRacer track at 8.00
• Developers have 4 minutes on the track
– Only the fastest lap counts → 11 s lap on the first try!
– With a model that had 20 s laps in the simulator
• Stayed at the top for 1-2 hours, but then competitors got under 10 s laps.
• Hitting the queue again (and again)
– Found a model that was fast, but quite unstable
– After some tries hit the 8.7 s time
– Did not manage to get a better time for the rest of the day