Simulation To Reality: Reinforcement Learning For Autonomous Vehicles
Donal Byrne
October 13th, 2019
Who am I?
Donal Byrne
dbyrne6@jaguarlandrover.com
https://www.linkedin.com/in/donal-byrne-ai/
Objective
● Practical talk
● Understand what RL is
● How it can be used
● Using RL in the real world
Agenda
Intro
● What is RL
● Where it can be applied
AD Case Study
● Identifying the problem
● Designing the agent
● Training
● Simulation to reality
How To Get Started
● Simple project
● Learning resources
● Libraries
Intro
What is Reinforcement Learning?
● Originates from behaviourism
● Learn the best action to take, given a specific scenario
● Can be applied to a wide range of problems
RL Lifecycle
[Diagram: the agent observes a state from the environment, takes an action, and receives a reward along with the next state.]
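To make the lifecycle concrete, here is a minimal sketch of that loop using the OpenAI Gym API (Gym appears later in the deck). The environment and the random-action "agent" are placeholders; a trained policy would choose the action instead.

```python
import gym

# Minimal sketch of the RL lifecycle: state -> action -> reward -> next state.
# 'CartPole-v1' is just a stand-in environment; the random action below is
# where a learned policy would go.
env = gym.make("CartPole-v1")

for episode in range(3):
    state = env.reset()                      # initial state from the environment
    done, total_reward = False, 0.0
    while not done:
        action = env.action_space.sample()   # placeholder for agent.act(state)
        state, reward, done, info = env.step(action)  # environment responds
        total_reward += reward
    print(f"episode {episode}: return = {total_reward}")

env.close()
```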
Use Cases
Gaming:
● Atari Games - 2013
● Go - 2015
● Dota 2 - 2018
● AlphaStar - 2019
Live Use Cases:
● Robotics/Manufacturing
● Medical
● Finance
● Autonomous Driving
● Education
● Resource Management
Case Study: RL For Autonomous Driving
Project Steps
1. Identify your problem
Autonomous Driving: Motion Control
Should we use RL?
● Can it be optimized/learned?
● Can you explain the goal simply?
● Will the agent have enough information?
● Can you simulate it?
● Is there a better solution?
Project Steps
1. Identify your problem
2. Reward
What is good driving?
Can you explain this in a single sentence?
Good vs Bad
Accuracy, Smoothness, Adaptability
What about...
How to design a reward
● Find its simplest form
● Base it on the outcome, not the method
● Shaping
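To illustrate "find its simplest form" and "base it on the outcome, not the method", below is a hypothetical lane-keeping reward built from the accuracy and smoothness outcomes above. The signal names, terms and weights are assumptions for illustration, not the reward actually used in the project.

```python
def driving_reward(cross_track_error, speed, target_speed, steering_delta):
    """Hypothetical shaped reward: good driving = accurate, smooth, on pace.

    All inputs and weights are illustrative; in practice the terms are
    iterated on many times and scaled to comparable magnitudes.
    """
    accuracy   = -abs(cross_track_error)     # stay close to the lane centre
    pace       = -abs(speed - target_speed)  # hold the desired speed
    smoothness = -abs(steering_delta)        # penalise jerky steering

    # Weight each outcome so no single term dominates the total reward.
    return 1.0 * accuracy + 0.5 * pace + 0.25 * smoothness
```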
Project Steps
1. Identify your problem
2. Reward
3. State Space
State Information
Key Info:
● Position on the road
● Current speed
● Angle of the steering wheel
● Velocity (yaw, pitch, roll)
Concerns:
● How much data is being taken in?
● What format?
● Correct credit assignment
● Is this info generic?
● Will it reflect the real environment?
● Should it have noise or latency?
Learning Environment
• Where the agent will live and learn
• Will produce all state features
• Bespoke or prebuilt: OpenAI Gym, Unity ML-Agents
Criteria
• Provide all required state features
• Run at 10x real-time speed
• Utilise GPU or parallel processing
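A bespoke environment usually means wrapping your simulator in the Gym interface so it produces the state features listed earlier. The skeleton below is a rough sketch; the `sim` handle and its methods are hypothetical stand-ins for whatever simulation backend you use.

```python
import gym
import numpy as np
from gym import spaces

class DrivingEnv(gym.Env):
    """Skeleton of a bespoke driving environment (simulator calls are hypothetical)."""

    def __init__(self, sim):
        self.sim = sim  # hypothetical handle to the underlying simulator
        # State: position on road, speed, steering angle, yaw rate.
        self.observation_space = spaces.Box(-np.inf, np.inf, shape=(4,), dtype=np.float32)
        # Action: steering command in [-1, 1].
        self.action_space = spaces.Box(-1.0, 1.0, shape=(1,), dtype=np.float32)

    def reset(self):
        self.sim.reset(random_start=True)      # random reset helps generalization
        return self._get_state()

    def step(self, action):
        self.sim.apply_steering(float(action[0]))
        self.sim.advance()                     # tick the simulator forward
        state = self._get_state()
        reward = -abs(state[0])                # placeholder: penalise lane offset
        done = self.sim.off_track() or self.sim.episode_timeout()
        return state, reward, done, {}

    def _get_state(self):
        return np.array([self.sim.lane_offset(), self.sim.speed(),
                         self.sim.steering_angle(), self.sim.yaw_rate()],
                        dtype=np.float32)
```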
Project Steps
1. Identify your problem
2. Reward
3. State Space
4. Algorithm
The Brain
Key Factors:
● How does it learn?
● Is it sample efficient?
● Does it scale?
● Does it require a model of its world?
● Curiosity driven
● Probabilistic vs Deterministic
A non-exhaustive, but useful taxonomy of algorithms in modern RL - Spinning Up In Deep RL
What Algorithm Should I Choose?
It depends…
3 Requirements For Practical RL Algos:
• Sample efficient
• Learn from previous experiences (Off Policy)
• Robust to hyperparameters and environment
Algorithm for Autonomous Driving
Then: Deep Deterministic Policy Gradient (DDPG)
• State of the art (at the time…)
• Deterministic
• Sample efficient ✓
• Off policy ✓
• Robust ✗ (brittle to hyperparameters)
Now: Soft Actor Critic (SAC)
• Improves upon DDPG's shortcomings
• Real-world robotics
• Non-deterministic (stochastic policy)
How To Choose An Algorithm
• Identify what is critical
• Find papers or examples of similar problems
• Quickly experiment with high-level libraries
• Try with simple toy environments
Good places to start:
SAC: https://github.com/tensorflow/agents/blob/master/tf_agents/colabs/7_SAC_minitaur_tutorial.ipynb
PPO: https://github.com/tensorflow/agents/tree/master/tf_agents/agents/ppo
rlkit: https://github.com/vitchyr/rlkit
TF Agents: https://github.com/tensorflow/agents
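Quickly experimenting with a high-level library can be as little as the sketch below, which follows the shape of the linked TF-Agents SAC minitaur tutorial on a toy continuous-control task. Module paths and constructor arguments differ between TF-Agents versions, so treat this as an approximate outline rather than exact, copy-paste code.

```python
import tensorflow as tf
from tf_agents.agents.ddpg import critic_network
from tf_agents.agents.sac import sac_agent
from tf_agents.environments import suite_gym, tf_py_environment
from tf_agents.networks import actor_distribution_network

# Toy continuous-control environment to build intuition before scaling up.
env = tf_py_environment.TFPyEnvironment(suite_gym.load("Pendulum-v0"))

critic_net = critic_network.CriticNetwork(
    (env.observation_spec(), env.action_spec()),
    joint_fc_layer_params=(256, 256))

actor_net = actor_distribution_network.ActorDistributionNetwork(
    env.observation_spec(), env.action_spec(),
    fc_layer_params=(256, 256))

agent = sac_agent.SacAgent(
    env.time_step_spec(),
    env.action_spec(),
    actor_network=actor_net,
    critic_network=critic_net,
    actor_optimizer=tf.keras.optimizers.Adam(3e-4),
    critic_optimizer=tf.keras.optimizers.Adam(3e-4),
    alpha_optimizer=tf.keras.optimizers.Adam(3e-4))
agent.initialize()
# From here, collect experience into a replay buffer and call agent.train(...)
# in a loop, as shown in the linked tutorial.
```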
Project Steps
1. Identify your problem
2. Reward
3. State Space
4. Algorithm
5. Training
Training
Training plan:
● Complex enough to learn how to generalize
● Not so hard that the agent can't succeed
● Reset the agent at random locations on the track
Curriculum Based Training:
● Introduce small tasks one at a time
● The agent learns a task and then builds upon it with the next task
● Meta Learning
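A rough sketch of how that training plan might look in code: random resets on every episode plus a simple curriculum that only advances once the agent copes with the current task. The `env`/`agent` interfaces, task names and thresholds are all hypothetical.

```python
# Hypothetical curriculum stages and per-stage success thresholds.
curriculum = ["straight_road", "gentle_curves", "sharp_corners", "full_track"]
success_threshold = {"straight_road": 200, "gentle_curves": 150,
                     "sharp_corners": 100, "full_track": 100}

for task in curriculum:
    env.set_task(task)                        # hypothetical: configure the scenario
    solved = False
    while not solved:
        state = env.reset()                   # reset at a random location on the track
        done, episode_return = False, 0.0
        while not done:
            action = agent.act(state)         # hypothetical agent interface
            state, reward, done, _ = env.step(action)
            agent.observe(state, reward, done)
            episode_return += reward
        agent.learn()                         # update from collected experience
        # Move to the next task only once this one is reliably solved.
        solved = agent.average_return(last_n=50) > success_threshold[task]
```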
You Are Here
Things Will Go Wrong…
• When deep learning fails, it fails silently
• Parameters are hard to find
• Can get stuck in local optima
• Catastrophic forgetting
• Exploration / Exploitation
• Cost: Time / Money
Training the agent
Fully Trained Agent
Project Steps
1. Identify your problem
2. Reward
3. State Space
4. Algorithm
5. Training
6. Evaluation
Is your agent learning?
After training comes evaluation:
● Test across tracks, speeds and vehicles
● Use KPIs and metrics that are not just based on your reward function
● Track every run, make one change at a time!
Validation Experiments
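In code, a validation sweep like the one described above might look roughly like this: run the frozen agent over every combination of track, vehicle and speed, and record KPIs that are independent of the training reward. The environment factory, `info` fields and metrics are hypothetical.

```python
import itertools
import numpy as np

def evaluate(agent, make_env, tracks, vehicles, speeds, episodes=10):
    """Hypothetical validation sweep; make_env builds an evaluation environment."""
    results = {}
    for track, vehicle, speed in itertools.product(tracks, vehicles, speeds):
        env = make_env(track=track, vehicle=vehicle, max_speed=speed)
        lane_errors, completions = [], []
        for _ in range(episodes):
            state, done = env.reset(), False
            while not done:
                action = agent.act(state, explore=False)       # deterministic policy
                state, _, done, info = env.step(action)
                lane_errors.append(abs(info["lane_offset"]))   # hypothetical KPI signal
            completions.append(info["lap_completed"])          # hypothetical KPI signal
        # KPIs independent of the reward: tracking error and completion rate.
        results[(track, vehicle, speed)] = {
            "mean_lane_error": float(np.mean(lane_errors)),
            "completion_rate": float(np.mean(completions)),
        }
    return results
```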
Project Steps
1. Identify your problem
2. Reward
3. State Space
4. Algorithm
5. Training
6. Evaluation
7. Simulation To Reality
How to go from simulation to reality?
Good Parenting
• Algorithm is important for learning a task
• For moving to reality, how it is trained is more important
• Nurture vs Nature
Key Criteria
• Generalizable Reward
• Independent State Space
• Randomized & Noisy Training
• Mix Of Challenges During Training
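One common way to get "randomized & noisy training" is an observation wrapper that injects sensor noise and a small random latency, so the policy never overfits to a perfectly clean simulator signal. This is a generic sketch of that idea, not the project's actual implementation; the noise level and delay are illustrative.

```python
import random
from collections import deque

import gym
import numpy as np

class NoisyDelayedObs(gym.ObservationWrapper):
    """Adds Gaussian sensor noise and a small random latency to observations."""

    def __init__(self, env, noise_std=0.02, max_delay=2):
        super().__init__(env)
        self.noise_std = noise_std
        self.buffer = deque(maxlen=max_delay + 1)

    def reset(self, **kwargs):
        self.buffer.clear()                  # drop stale observations between episodes
        return super().reset(**kwargs)

    def observation(self, obs):
        noisy = obs + np.random.normal(0.0, self.noise_std, size=obs.shape)
        self.buffer.append(noisy)
        # Randomly serve a slightly stale observation to mimic sensor latency.
        delay = random.randint(0, len(self.buffer) - 1)
        return self.buffer[-1 - delay]
```

Usage would simply wrap the training environment, e.g. `env = NoisyDelayedObs(DrivingEnv(sim))` with the hypothetical environment sketched earlier.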
Additional Considerations
Effect Of Memory
1. Adding an LSTM head to the agent
2. Allows the agent to know the rate of change in state values
3. Greater capacity to generalize
Additional Training Info
1. Arbitrary information given to the value network
2. Learns to evaluate good actions faster
3. Learns the optimal policy faster
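A minimal Keras sketch of the memory idea: feed a short history of states through an LSTM before the policy output so the network can infer rates of change. The layer sizes, history length and state layout are illustrative assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers

HISTORY_LEN = 8    # number of past states fed to the agent (illustrative)
STATE_DIM   = 4    # e.g. lane offset, speed, steering angle, yaw rate
ACTION_DIM  = 1    # steering command

# Policy network with an LSTM head: the recurrent layer summarises how the
# state has been changing, which a single-frame input cannot capture.
states = layers.Input(shape=(HISTORY_LEN, STATE_DIM))
x = layers.LSTM(64)(states)
x = layers.Dense(64, activation="relu")(x)
action = layers.Dense(ACTION_DIM, activation="tanh")(x)   # bounded steering output

policy = tf.keras.Model(states, action)
policy.summary()
```

The second point follows the same spirit: the value/critic network can be fed extra, simulator-only information during training, as long as the actor never depends on it at inference time.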
Reinforcement Learning: How To Get Started
Where to get started
Go over the theory:
● Learn the basics of how RL works
● DeepMind Lectures
● OpenAI Spinning Up
Get your hands dirty:
● Take some small toy examples
● Train some prebuilt agents
● See what works
● TF Agents
● OpenAI Gym
Build your Proof of Concept:
● Now you have some intuition
● Go through the project steps
● Build a custom environment
● Unity ML-Agents
Final Thoughts
Should you use RL?
Pros:
● Can achieve incredible results
● Capable of finding the optimal solution to a wide range of problems
● Technology is growing rapidly, providing better and simpler solutions
Cons:
● Not quite there yet
● Can be very expensive and time consuming
● Not a silver bullet
Thank You!


Editor's Notes

  • #3 College, Work, Passions
  • #4 Practical talk, understand what RL is, how it can be used, how it is currently being used, how to go about starting your own RL project
  • #7 A type of machine learning that mimics how humans and animals learn. Learn behaviours from the outcomes of past experiences. A very generalized method of learning; this is the benefit of RL, it can generalize.
  • #8 Best example: training your dog. When you are trying to teach a dog good behaviour, what do you do? Well, you are going to have an idea of what that good behaviour is (learning to sit on command). Every time they get close to the desired behaviour, you give them a treat, or a reward. But when they exhibit bad behaviour, like eating your shoes, you are going to give out to them. Over time, they learn the behaviour with the maximum positive outcome.
  • #9 What do you think is the most important part here? Why?
  • #10 Very popular and exciting. Achieved a lot of amazing milestones, mainly in gaming: Atari, Go, Dota, Hide and Seek. Started to be used in real world use cases: NLP, Recommender Systems, Manufacturing, Energy Optimisation.
  • #12 Now that we know a little bit about what RL is, let's go through what is actually involved in applying this to a real problem. A few months ago Asanka was asked to look into RL and its potential for autonomous driving. This next section is going to go through how that project unfolded and the process the team went through.
  • #13 There are several steps involved in taking an RL project from start to finish. Going to go through these one by one and share the lessons we learned along the way.
  • #14 What are we trying to solve, and should we use RL? Can it be optimized? RL is an optimization technique. Can you explain the goal simply? Is there enough information? Can you simulate it? Is there a better solution? Our problem was controlling a vehicle; there were good solutions out there, but we wanted to see what RL was capable of. The key benefit is that it's generalizable. Could we teach a controller to learn the essence of good driving, and apply it to any vehicle or scenario? ADAS stack, where the controller sits.
  • #15 Arguably the most important part.
  • #16 A perfectly built AI will fail if it isn't being rewarded correctly. When we are creating our reward, we want to make it as simple as possible. Can we explain good driving in a single sentence? We all know what it is: we want things like accuracy, smoothness and adaptability.
  • #17 If the task is very complex, then break it up. We broke it up into longitudinal and lateral control. Simple rewards make it easy to assign credit. Credit assignment can be a big problem: if your agent is being rewarded or punished, it must be able to associate the behaviour with it. Design based on the outcome, not the method: the cobra story. Finally, what shape will your data take? Is it sparse or continuous? Scale matters; scale everything, as it is easier to weight the rewards based on scale. Spent a long time iteratively designing the reward; it is key to a good agent.
  • #19 Much like the reward function, if the agent doesn't have enough info, it won't be able to learn. We talked about credit assignment before. If you are training your dog and they pee on the rug, but you only find out hours later, giving out to them won't help; they don't know why you're upset and won't be able to connect the two events. We need to give our AI enough key info so that it can identify what good behaviour is based on the rewards it's getting at each state. Spent a lot of time experimenting with what the right state parameters were. Practical concerns: it can't be too big or too small. Is it generic to the problem? Is it noisy?
  • #20 The state space will be represented through a learning environment, usually some sort of simulation. There are some good environments like Gym for getting familiar with RL, but you will need to build a custom environment for your problem. When building, take into account some practical concerns: it should be capable of replicating the real environment as closely as possible and run faster than real time. It takes a lot of experience to train these agents, so you'll want to be training at about 10x. When we were building ours we were limited by the environment and could only run at 0.3x speed; it took 10 hours just to get about 4 hours of driving time.
  • #22 This is an example of the environment we used. High fidelity, but very slow and very restricting; this took 70% of the time.
  • #23 Gonna get a little technical
  • #24 There are a lot of options out there. Not going to go into all of them, but the main split is model-based and model-free. Most modern methods are Actor Critic based (a driving student and instructor, but both learning). Curiosity driven.
  • #25 What algorithm should I choose? Dunno; a shitty answer, but it depends… 3 requirements for practical RL.
  • #26 What we chose. At the time it was SOTA. Meets 2/3 criteria; brittle to hyperparameters (the knobs that you turn to make it work better). Spent 70% of the time tuning. There are better ways of doing it now, BOHB etc. Would now use something more sophisticated like SAC.
  • #27 Take a high level library. Quickly experiment on simple problems and get some intuition about the algorithms. Then scale up and find what works best for your problem. We didn't have TF Agents, so we had to hand build most implementations and test them.
  • #29 The fun part…
  • #30 So after you have sorted out all the other stuff (you've got a reward, a state space, some sort of simulator and even an algorithm that you are sure of), you can now start training your real agent. Because this is still a relatively new field, the act of designing these things is a bit of a fine art. There is no set path; this is changing, but for the moment you need to use your intuition. The training environment needs to be complex enough that the agent can generalize, but simple enough to be capable of converging. Requires experience of lots of scenarios, curiosity, random resets, memory, curriculum based learning. Let's go back to our project. Phase 1: pick the reward, state and algorithm. Made a simple 2D environment; works great. Now let's train it on a real simulator…
  • #31 Progress isn't linear. Big jump from phase 1 to phase 2.
  • #32 Things are going to go wrong, very wrong. Will this ever work? Silently failing, plateauing and forgetting. When these happen, repeat the previous steps and simplify. It takes a long time and training can be expensive. Train for a small amount locally, then when most bugs are worked out, move to the cloud. Supervised learning techniques do not directly transfer to RL. Things like dropout and batch norm, which give great improvements to supervised networks, don't provide much here.
  • #33 Not gonna lie, it looked like that for a while… Going from the pygame version to the real simulator was a big leap, and the agent struggled to grasp driving at this level.
  • #34 After several iterations of working at this stage, we managed to get these results
  • #35 This all sounds great, but is it doing well?
  • #36 So now that you are actually getting some results, how do you know how good they are? You need to identify some metrics to base this on, not just how high the reward is. Reward is relative, can't be compared to other techniques, and can change over time. Create a benchmark, or several, and constantly compare results on these benchmarks. During our project we had two validation tracks, 3 vehicles and different max speeds. We would run new algorithms across all of these and compare the results to truly distinguish improvements. Automate as much as you can; there will be a lot of tests. It's very easy to get sucked into watching your agent training for hours.
  • #39 So, we have finally come to taking the agent from the simulation to the real world. This is undoubtedly the most difficult part of the process, as it is the true test of how well the agent has learned to generalise. The team put a lot of thought and discussion into teaching the agent not just how to succeed on the training track, but how to take what it had learned and apply it to new scenarios. Writing the code correctly and using the best algorithms will get you a sophisticated agent that will solve difficult tasks, but generalization depends on how it was trained, not how it was built. Nature vs Nurture: good AI requires good parenting.
  • #40 Some other cool things that would improve this, but we didn't have time to implement: 1) improving memory by using LSTM cells, 2) arbitrary information given during training but not used at inference. It's like preparing for a test by doing a few open-book tests.