#IA Track - Practical applications
Reinforcement learning is a rapidly growing branch of artificial intelligence that has achieved superhuman performance in board games such as Go and chess, and in video games such as StarCraft. Research papers and open code in this field are widely available.
However, unlike in other fields of machine learning, open code and research have so far largely failed to translate into real-world applications.
In this talk, we draw on the indust.ai team's experience building its own reinforcement learning practice to discuss the challenges involved. These include poor reproducibility, varying code quality, prohibitive computation and data requirements, the difference in mindset between traditional machine learning and reinforcement learning, and the difficulty of finding the skills required to transfer academic research to the real world. We will also present some of our approaches to overcoming these issues.
#OSSPARIS19 - Overcoming open source challenges in reinforcement learning - WILLIAM CLEMENTS, Stealth
1. William Clements, PhD
Mastering the challenges of open source
reinforcement learning
william.clements@indust.ai
indust.ai
Independent applied AI R&D lab
2. What this talk is about
RL has huge potential for decision-making and control
We were one of the first companies to start a dedicated applied
RL lab in mid-2018
In this talk, we will share what we learned in the process
3. Who we are
Solutions in time series analysis, speech recognition, sentiment analysis, …
World-class research in learning from bad data and in decision-making:
3 publications (including AAAI 2020)
Academic collaborations: Ecole polytechnique, Oxford, FAIR, …
Providing medium-size businesses with easy access to the rare, deep and broad pool of AI resources they need to unlock future growth
A skilled and experienced team, passionate
about applying AI to the core of industrial and
financial services businesses
4. Overview
What is reinforcement learning, and what is it good for?
Use case: building an RL system for smart homes
Challenges to overcome
Is open source the answer?
Our approach
5. What is reinforcement learning?
Create agents that learn to maximize a given objective by interacting with the
environment
Advances have been made possible by combining reinforcement learning and neural
networks
Diagram: the agent sends actions to the environment and receives observations and rewards back, learning by trial and error.
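The agent-environment loop above can be sketched in a few lines of Python. Everything here (the toy environment, the reward, the random policy) is made up for illustration, not taken from a real RL library:

```python
import random

class ToyEnvironment:
    """A tiny made-up environment: drive a counter toward a target value."""
    def __init__(self, target=10):
        self.target = target
        self.state = 0

    def step(self, action):
        # Apply the action (+1 or -1), then score closeness to the target.
        self.state += action
        observation = self.state
        reward = -abs(self.target - self.state)
        done = self.state == self.target
        return observation, reward, done

def run_episode(env, policy, max_steps=100):
    """The trial-and-error loop: observe, act, receive a reward, repeat."""
    total_reward = 0
    obs = env.state
    for _ in range(max_steps):
        action = policy(obs)
        obs, reward, done = env.step(action)
        total_reward += reward
        if done:
            break
    return total_reward

# A random policy: pure exploration, no learning yet.
random.seed(0)
episode_reward = run_episode(ToyEnvironment(), lambda obs: random.choice([-1, 1]))
```

A learning agent would replace the random policy with one that improves from the collected rewards.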
6. Reinforcement learning: state of the art
RL has taken off in the last few years, and achieved impressive results
Go: learns to beat humans with no initial knowledge apart from the rules of the game.
Silver, David, et al. "Mastering the game of Go with deep neural networks and tree search." Nature 529 (2016): 484.
Chart: RL publications per year (source: Henderson, Peter, et al. "Deep reinforcement learning that matters." Thirty-Second AAAI Conference on Artificial Intelligence, 2018).
7. How RL differs from traditional ML
ML:
- Requires data
- First collect data, then learn
- Mature field
- Cannot outperform the people who label the data
RL:
- Requires a simulator
- Exploration to collect the right experience
- New field, mostly academic
- Can outperform human-specified systems
8. Industrial applications
Potential applications exist in any system that reacts to external stimulus
Healthcare: a human body reacts to treatments (digital doctor, dynamic treatments)
Recommendation: a customer reacts to advertisement (Facebook notifications, newspaper links)
Thermal Control: a building reacts to heating elements (Google data centers, 3D printing, industrial ovens)
Robotics: a robot reacts to motor controls (delivery drones, sorting robots, robotic tool use)
Other fields: logistics, data networks, autonomous driving, portfolio optimization, etc.
Despite this potential, there are very few real-world implementations
9. Example Use Case: Smart Home
Role playing: Imagine you’ve been asked to use RL to improve thermal control
How would you do this?
Observations: thermostat + energy readings
Actions: air conditioning or radiator controls
RL can work here but it won’t be easy
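As a sketch, the observations, actions and reward above map onto a Gym-style environment interface. All numbers and dynamics below are invented for illustration; a real simulator would be built and validated with domain experts:

```python
class SmartHomeEnv:
    """Gym-style thermal-control environment (illustrative; values are made up)."""
    def __init__(self, target_temp=21.0):
        self.target_temp = target_temp
        self.temp = 18.0          # indoor temperature (degrees C)
        self.energy_used = 0.0    # cumulative energy (kWh)

    def reset(self):
        self.temp, self.energy_used = 18.0, 0.0
        return self._observe()

    def _observe(self):
        # Observations: thermostat + energy readings.
        return (self.temp, self.energy_used)

    def step(self, heater_power):
        # Action: heating/cooling power in [-1, 1]; toy dynamics with heat loss.
        self.temp += 0.5 * heater_power - 0.1 * (self.temp - 15.0)
        self.energy_used += abs(heater_power) * 0.25
        # Reward trades off comfort against energy cost (one possible choice).
        reward = -abs(self.temp - self.target_temp) - 0.5 * abs(heater_power)
        return self._observe(), reward, False, {}

env = SmartHomeEnv()
obs = env.reset()
obs, reward, done, info = env.step(1.0)
```

Note that the reward here is just one candidate; as discussed later, choosing it badly is one of the main ways RL projects fail.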
10. Why RL is hard – human aspects
Requires a change of mindset away from traditional ML
Very little expertise available outside of academia
RL requires either a simulator or exploration in real life
Will your customer trust your RL agent?
11. Why RL is hard – technical aspects
High compute requirements: 100 years of simulated experience just for a robot hand to turn a cube around. Equip smart homes with GPUs?
Bad at generalizing: a car trained with a green background fails when the color changes. What will happen in a heat wave?
Hard to specify: the "cobra effect", you may not get what you want. Heat/cost may not be the best reward for a smart home.
12. Open Source RL to the rescue?
Why open source:
- In 2018, 72% of companies used open source software, even for critical tasks (source: Linux Foundation)
- Open source ML tools have been hugely successful (scikit-learn, TensorFlow, PyTorch, Keras, etc.)
Two choices in RL: build from scratch or build from open source (there is virtually no proprietary software yet)
However, open source RL is not the same as open source ML
13. Open Source RL vs Open Source ML
Maturity:
Open source ML has been around for a long time (scikit-learn: 2007), with both industry
and academia in mind
Open source RL is much more recent (OpenAI Gym: 2016), built mostly for the academic community
Structure:
Open source ML provides the algorithm, you provide the data
Open source RL can provide both the algorithm and the simulator
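One way to see the structural difference: in ML you call `model.fit(X, y)` on a fixed dataset, while in RL the "dataset" is experience generated by interacting with a simulator. A minimal sketch, where the stub environment is a hypothetical stand-in for a Gym-style simulator:

```python
class StubEnv:
    """Stand-in for an open source simulator such as an OpenAI Gym environment."""
    def reset(self):
        self.t = 0
        return 0.0

    def step(self, action):
        self.t += 1
        done = self.t >= 5          # short episodes, for illustration
        return float(self.t), 1.0, done, {}

def collect_experience(env, policy, n_steps=20):
    """In RL the 'dataset' is generated by interaction, not handed to you."""
    trajectory, obs = [], env.reset()
    for _ in range(n_steps):
        action = policy(obs)
        next_obs, reward, done, _ = env.step(action)
        trajectory.append((obs, action, reward))
        obs = env.reset() if done else next_obs
    return trajectory

experience = collect_experience(StubEnv(), policy=lambda obs: 0)
```

Because the data depends on the policy that collects it, the simulator is as much a part of the pipeline as the algorithm itself.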
15. Open Source Resources: Environments
Environments: Spriteworld, OpenAI Gym, BSuite, DM Control, AI Safety Gridworlds, PyBullet, OpenSpiel, RLCard, OffWorld Gym, Industrial Benchmark, DMLab
16. Issues with open source Environments
Environments are generally designed as academic benchmarks, with no
connection to real world applications
You will have to make
your own environment
17. Open source Algorithms and Frameworks
- Allow for comparisons between algorithms
- Can support distributed calculations
- Adapted for specific applications
Examples: OpenAI Baselines (https://github.com/openai/baselines), Surreal (https://surreal.stanford.edu/), ReAgent (https://github.com/facebookresearch/ReAgent)
18. Issues with open source Algorithms and Frameworks
Open source algorithms tend to be:
- Designed for specific environments
- Not easily customisable
- Written by researchers, not developers
- Not always reproducible
Example: different implementations of the same algorithm produce different results (Henderson, Peter, et al. "Deep reinforcement learning that matters." Thirty-Second AAAI Conference on Artificial Intelligence, 2018).
You will have to make
your own framework
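Reproducibility in particular starts with controlling randomness. A minimal stdlib-only sketch (real frameworks also seed NumPy, PyTorch and the environment, and the "training run" here is a hypothetical stand-in):

```python
import random

def seed_everything(seed):
    """Pin the randomness we control; a common first step toward
    reproducible RL runs."""
    random.seed(seed)

def noisy_training_run(seed):
    # Stand-in for a full training run: the final 'score' depends on the seed.
    seed_everything(seed)
    return sum(random.random() for _ in range(100))

# Same seed, same result. Since results vary across seeds, report
# performance over several seeds rather than cherry-picking one.
assert noisy_training_run(0) == noisy_training_run(0)
scores = [noisy_training_run(s) for s in range(5)]
```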
19. Our approach
Unreliable and opaque open source frameworks: we built our own, which we plan to open source! (Also great for upskilling.)
Unsuitable open source environments: we make our own (proprietary) simulators, working with industry experts.
Disconnect between academia and industry: we started a research activity in collaboration with academics.
20. Smart home: our solution
- Design a simulator with industry experts, using the OpenAI Gym template
- Validate the simulator using real-world data
- Build an RL framework from the ground up
- Benchmark algorithms on academic environments
- Train algorithms on the simulator, then run real-world tests and refinements
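The "validate the simulator using real-world data" step can start as simply as comparing simulated and measured traces before any training is trusted. A sketch with hypothetical temperature data:

```python
def validate_simulator(simulated, measured, tolerance=1.0):
    """Compare a simulated trajectory to real sensor logs; only train on the
    simulator once the mean error is within tolerance. (Illustrative check;
    real validation would use held-out data and richer metrics.)"""
    errors = [abs(s - m) for s, m in zip(simulated, measured)]
    mean_error = sum(errors) / len(errors)
    return mean_error <= tolerance, mean_error

# Hypothetical indoor-temperature traces (degrees C): simulator vs. a real home.
sim_trace = [18.0, 18.4, 18.9, 19.3, 19.8]
real_trace = [18.0, 18.6, 19.1, 19.2, 19.9]
ok, err = validate_simulator(sim_trace, real_trace)
```

An agent trained on an unvalidated simulator optimizes the simulator, not the home, so this gate comes before any real-world deployment.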
21. How we tackle the RL human challenge
Challenges are not just technical:
- Managers often don’t see opportunities: we are evangelists who work on changing mindsets
- Talent shortage: we do outreach
- “Nuit de l’IA” at Polytechnique
- Lecture at the Machine Learning Summer School 2019
- “Maths et IA” event at Université Paris-Sud 2019
22. Conclusion
RL has huge potential for impact on industry
Open source RL does not yet rise to the challenge
We’ve built a dedicated R&D lab to solve RL for industry
RL is the most computing-power-intensive discipline in AI. Below are the training times for some of the most advanced models in RL, NLP and machine vision:
- BERT (Google, NLP): 64 GPUs, 1 week of training
- OpenAI Five (OpenAI, RL): 256 GPUs, 64,000 CPU cores, 1 month of training
- ImageNet (fast.ai, CV): 128 GPUs, 18 minutes
RL is barely taught at university: out of 42 classes in the leading French AI teaching program (MVA, ENS Cachan), 2 are dedicated to RL and 16 to computer vision (as of Jan 2019).