#IA Track - Practical applications
Reinforcement learning is a rapidly growing branch of artificial intelligence that has achieved superhuman performance in board games such as Go and chess, and in video games such as StarCraft. Research papers and open code in this field are widely available.
However, unlike in other fields of machine learning, open code and research have so far largely failed to translate into real-world applications.
In this talk, we draw on the indust.ai team's experience building its own reinforcement learning practice to discuss the challenges involved. These include poor reproducibility, varying code quality, prohibitive computation and data requirements, the difference in mindset between traditional machine learning and reinforcement learning, and the difficulty of finding the skills required to transfer academic research to the real world. We will also present some of our approaches to overcoming these issues.
#OSSPARIS19 - Overcoming open source challenges in reinforcement learning - WILLIAM CLEMENTS, Stealth
1. William Clements, PhD
Mastering the challenges of open source
reinforcement learning
william.clements@indust.ai
indust.ai
Independent applied AI R&D lab
2. What this talk is about
RL has huge potential for decision-making and control
We were one of the first companies to start a dedicated applied
RL lab in mid-2018
In this talk, we will share what we learned in the process
3. Who we are
Solutions in time series analysis, speech recognition, sentiment analysis, …
World-class research in learning from bad data and in decision-making:
3 publications (including AAAI 2020)
Academic collaborations: Ecole polytechnique, Oxford, FAIR, …
Providing medium-size businesses with easy access to the rare, deep and broad pool of AI resources they need to unlock future growth
A skilled and experienced team, passionate
about applying AI to the core of industrial and
financial services businesses
4. Overview
What is reinforcement learning, and what is it good for?
Use case: building an RL system for smart homes
Challenges to overcome
Is open source the answer?
Our approach
5. What is reinforcement learning?
Create agents that learn to maximize a given objective by interacting with the
environment
Advances have been made possible by combining reinforcement learning and neural
networks
Diagram: the agent sends actions to the environment and receives observations and rewards back, learning by trial and error.
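The agent-environment loop above can be sketched in a few lines of Python. Everything here (the toy environment, the reward, the random policy) is made up for illustration, not taken from a real RL library:

```python
import random

class ToyEnvironment:
    """A tiny made-up environment: drive a counter toward a target value."""
    def __init__(self, target=10):
        self.target = target
        self.state = 0

    def step(self, action):
        # Apply the action (+1 or -1), then score closeness to the target.
        self.state += action
        observation = self.state
        reward = -abs(self.target - self.state)
        done = self.state == self.target
        return observation, reward, done

def run_episode(env, policy, max_steps=100):
    """The trial-and-error loop: observe, act, receive a reward, repeat."""
    total_reward = 0
    obs = env.state
    for _ in range(max_steps):
        action = policy(obs)
        obs, reward, done = env.step(action)
        total_reward += reward
        if done:
            break
    return total_reward

# A random policy: pure exploration, no learning yet.
random.seed(0)
episode_reward = run_episode(ToyEnvironment(), lambda obs: random.choice([-1, 1]))
```

A learning agent would replace the random policy with one that improves from the collected rewards.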
6. Reinforcement learning: state of the art
RL has taken off in the last few years, and achieved impressive results
Go: learns to beat humans with no initial knowledge apart from the rules of the game.
Silver, David, et al. "Mastering the game of Go with deep neural networks and tree search." Nature 529 (2016): 484.
Chart: RL publications per year (source: Henderson, Peter, et al. "Deep reinforcement learning that matters." Thirty-Second AAAI Conference on Artificial Intelligence, 2018).
7. How RL differs from traditional ML
ML:
- Requires data
- First collect data, then learn
- Mature field
- Cannot outperform the people who label the data
RL:
- Requires a simulator
- Exploration to collect the right experience
- New field, mostly academic
- Can outperform human-specified systems
8. Industrial applications
Potential applications exist in any system that reacts to external stimulus
Healthcare: a human body reacts to treatments (digital doctor, dynamic treatments)
Recommendation: a customer reacts to advertisement (Facebook notifications, newspaper links)
Thermal Control: a building reacts to heating elements (Google data centers, 3D printing, industrial ovens)
Robotics: a robot reacts to motor controls (delivery drones, sorting robots, robotic tool use)
Other fields: logistics, data networks, autonomous driving, portfolio optimization, etc.
Despite this potential, there are very few real-world implementations
9. Example Use Case: Smart Home
Role playing: Imagine you’ve been asked to use RL to improve thermal control
How would you do this?
Observations: thermostat + energy readings
Actions: air conditioning or radiator controls
RL can work here but it won’t be easy
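As a sketch, the observations, actions and reward above map onto a Gym-style environment interface. All numbers and dynamics below are invented for illustration; a real simulator would be built and validated with domain experts:

```python
class SmartHomeEnv:
    """Gym-style thermal-control environment (illustrative; values are made up)."""
    def __init__(self, target_temp=21.0):
        self.target_temp = target_temp
        self.temp = 18.0          # indoor temperature (degrees C)
        self.energy_used = 0.0    # cumulative energy (kWh)

    def reset(self):
        self.temp, self.energy_used = 18.0, 0.0
        return self._observe()

    def _observe(self):
        # Observations: thermostat + energy readings.
        return (self.temp, self.energy_used)

    def step(self, heater_power):
        # Action: heating/cooling power in [-1, 1]; toy dynamics with heat loss.
        self.temp += 0.5 * heater_power - 0.1 * (self.temp - 15.0)
        self.energy_used += abs(heater_power) * 0.25
        # Reward trades off comfort against energy cost (one possible choice).
        reward = -abs(self.temp - self.target_temp) - 0.5 * abs(heater_power)
        return self._observe(), reward, False, {}

env = SmartHomeEnv()
obs = env.reset()
obs, reward, done, info = env.step(1.0)
```

Note that the reward here is just one candidate; as discussed later, choosing it badly is one of the main ways RL projects fail.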
10. Why RL is hard – human aspects
Requires a change of mindset away from traditional ML
Very little expertise available outside of academia
RL requires either a simulator or exploration in real life
Will your customer trust your RL agent?
11. Why RL is hard – technical aspects
High compute requirements: 100 years of simulated experience just for a robot hand to turn a cube around. Equip smart homes with GPUs?
Bad at generalizing: a car trained with a green background fails when the color changes. What will happen in a heat wave?
Hard to specify: the "cobra effect", you may not get what you want. Heat/cost may not be the best reward for a smart home.
12. Open Source RL to the rescue?
Why open source:
- In 2018, 72% of companies used open source software, even for critical tasks (source: Linux Foundation)
- Open source ML tools have been hugely successful (scikit-learn, TensorFlow, PyTorch, Keras, etc.)
Two choices in RL: build from scratch or build from open source (there is virtually no proprietary software yet)
However, open source RL is not the same as open source ML
13. Open Source RL vs Open Source ML
Maturity:
Open source ML has been around for a long time (scikit-learn: 2007), with both industry
and academia in mind
Open source RL is much more recent (OpenAI Gym: 2016), built mostly for the academic community
Structure:
Open source ML provides the algorithm, you provide the data
Open source RL can provide both the algorithm and the simulator
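One way to see the structural difference: in ML you call `model.fit(X, y)` on a fixed dataset, while in RL the "dataset" is experience generated by interacting with a simulator. A minimal sketch, where the stub environment is a hypothetical stand-in for a Gym-style simulator:

```python
class StubEnv:
    """Stand-in for an open source simulator such as an OpenAI Gym environment."""
    def reset(self):
        self.t = 0
        return 0.0

    def step(self, action):
        self.t += 1
        done = self.t >= 5          # short episodes, for illustration
        return float(self.t), 1.0, done, {}

def collect_experience(env, policy, n_steps=20):
    """In RL the 'dataset' is generated by interaction, not handed to you."""
    trajectory, obs = [], env.reset()
    for _ in range(n_steps):
        action = policy(obs)
        next_obs, reward, done, _ = env.step(action)
        trajectory.append((obs, action, reward))
        obs = env.reset() if done else next_obs
    return trajectory

experience = collect_experience(StubEnv(), policy=lambda obs: 0)
```

Because the data depends on the policy that collects it, the simulator is as much a part of the pipeline as the algorithm itself.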
15. Open Source Resources: Environments
Environments: Spriteworld, OpenAI Gym, BSuite, DM Control, AI Safety Gridworlds, PyBullet, OpenSpiel, RLCard, OffWorld Gym, Industrial Benchmark, DMLab
16. Issues with open source Environments
Environments are generally designed as academic benchmarks, with no
connection to real world applications
You will have to make
your own environment
17. Open source Algorithms and Frameworks
- Allow for comparisons between algorithms
- Can support distributed calculations
- Adapted for specific applications
Examples: OpenAI Baselines (https://github.com/openai/baselines), Surreal (https://surreal.stanford.edu/), ReAgent (https://github.com/facebookresearch/ReAgent)
18. Issues with open source Algorithms and Frameworks
Open source algorithms tend to be:
- Designed for specific environments
- Not easily customisable
- Written by researchers, not developers
- Not always reproducible
Example: different implementations of the same algorithm produce different results (Henderson, Peter, et al. "Deep reinforcement learning that matters." Thirty-Second AAAI Conference on Artificial Intelligence, 2018).
You will have to make
your own framework
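Reproducibility in particular starts with controlling randomness. A minimal stdlib-only sketch (real frameworks also seed NumPy, PyTorch and the environment, and the "training run" here is a hypothetical stand-in):

```python
import random

def seed_everything(seed):
    """Pin the randomness we control; a common first step toward
    reproducible RL runs."""
    random.seed(seed)

def noisy_training_run(seed):
    # Stand-in for a full training run: the final 'score' depends on the seed.
    seed_everything(seed)
    return sum(random.random() for _ in range(100))

# Same seed, same result. Since results vary across seeds, report
# performance over several seeds rather than cherry-picking one.
assert noisy_training_run(0) == noisy_training_run(0)
scores = [noisy_training_run(s) for s in range(5)]
```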
19. Our approach
Unreliable and opaque open source frameworks: we built our own, which we plan to open source! (Also great for upskilling.)
Unsuitable open source environments: we make our own (proprietary) simulators, working with industry experts.
Disconnect between academia and industry: we started a research activity in collaboration with academics.
20. Smart home: our solution
- Design a simulator with industry experts, using the OpenAI Gym template
- Validate the simulator using real-world data
- Build an RL framework from the ground up
- Benchmark algorithms on academic environments
- Train algorithms on the simulator, then run real-world tests and refinements
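The "validate the simulator using real-world data" step can start as simply as comparing simulated and measured traces before any training is trusted. A sketch with hypothetical temperature data:

```python
def validate_simulator(simulated, measured, tolerance=1.0):
    """Compare a simulated trajectory to real sensor logs; only train on the
    simulator once the mean error is within tolerance. (Illustrative check;
    real validation would use held-out data and richer metrics.)"""
    errors = [abs(s - m) for s, m in zip(simulated, measured)]
    mean_error = sum(errors) / len(errors)
    return mean_error <= tolerance, mean_error

# Hypothetical indoor-temperature traces (degrees C): simulator vs. a real home.
sim_trace = [18.0, 18.4, 18.9, 19.3, 19.8]
real_trace = [18.0, 18.6, 19.1, 19.2, 19.9]
ok, err = validate_simulator(sim_trace, real_trace)
```

An agent trained on an unvalidated simulator optimizes the simulator, not the home, so this gate comes before any real-world deployment.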
21. How we tackle the RL human challenge
Challenges are not just technical:
- Managers often don’t see opportunities: we are evangelists who work on changing mindsets
- Talent shortage: we do outreach
- “Nuit de l’IA” at Polytechnique
- Lecture at the Machine Learning Summer School 2019
- “Maths et IA” event at Université Paris-Sud 2019
22. Conclusion
RL has huge potential for impact on industry
Open source RL does not yet rise to the challenge
We’ve built a dedicated R&D lab to solve RL for industry
RL is the most computing-power-intensive discipline in AI. Below are the training times for some of the most advanced models in RL, NLP and machine vision:
- BERT (Google, NLP): 64 GPUs, 1 week of training
- OpenAI Five (OpenAI, RL): 256 GPUs, 64,000 CPU cores, 1 month of training
- ImageNet (fast.ai, CV): 128 GPUs, 18 minutes
RL is barely taught at university: out of 42 classes in the leading French AI teaching program (MVA, ENS Cachan), 2 are dedicated to RL and 16 to computer vision (as of Jan 2019).