How can reinforcement learning help us fly balloons in the stratosphere?
This talk describes the use of reinforcement learning to create a high-performing flight controller for Loon superpressure balloons. The Google algorithm uses data augmentation and a self-correcting design to overcome the key technical challenge of reinforcement learning from imperfect data, which has proved to be a major obstacle to its application to physical systems.
Marc G. Bellemare, from the Google Brain team in Montreal, will present recent work, published in Nature, on using reinforcement learning to fly tennis-court-sized balloons in the stratosphere.
Agenda:
-----------
3:45pm - 4:00pm: Arrival & Networking
4:00pm - 4:15pm: News & Intro
4:15pm - 5:15pm: RL to fly balloons in the stratosphere
5:15pm - 5:30pm: Virtual Snack & Networking
DSDT Meetup - May 26, 2021
5. A special thanks to our contributors…
Thanks / Merci
The (virtual) venue, sponsor & snacks
The brain
...
6. DSDT Mtl meetup
Virtual Meetups
Until we can do in-person events again in Montreal…
Past (and future) presentations available on Slideshare.
http://www.slideshare.net/DSDT_MTL
7. Monthly cadence, on Wednesdays.
Incredible sessions already planned for May, June and July.
Contact us with your expectations & ideas.
What is coming in 2021
Topics: ML Validation, Reinforcement Learning, Explainable AI, RNN & Time Series
Dates: April 28, May 26, June 16, July 21
Your ideas, your meetup.
http://bit.ly/DSDTsurvey2021
8. Our 2021 campaign to fight against poverty and social exclusion.
Data Science. Design. Technology.
https://centraide-mtl.org/dsdtmtl
10. Decisions from data: Controlling complex systems with reinforcement learning
Marc G. Bellemare (1), Salvatore Candido (2), Pablo Samuel Castro (1), Jun Gong (2), Marlos C. Machado (1), Subhodeep Moitra (1), Sameera S. Ponda (2), Ziyu Wang (1)
(1) Google Research, Brain team
(2) Loon
With thanks to: Beth Reid, Joshua Greaves, Bradley Rhodes, and many more
https://www.nature.com/articles/s41586-020-2939-8
30. Deep reinforcement learning for balloon navigation
+16 ambient variables
Forecast + measurements + Gaussian process = wind column
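To make the "forecast + measurements + Gaussian process" recipe concrete, here is a minimal sketch, not the Loon implementation. It assumes a squared-exponential kernel over altitude and treats the forecast as the prior mean, so the Gaussian process only models the forecast error. The function names, kernel hyperparameters, and example numbers are all hypothetical.

```python
import numpy as np

def rbf_kernel(a, b, length_scale, variance):
    """Squared-exponential covariance between two sets of altitudes."""
    d = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (d / length_scale) ** 2)

def wind_column(levels, forecast, obs_levels, obs_wind, forecast_at_obs,
                length_scale=2.0, variance=9.0, noise=0.25):
    """Posterior mean and variance of wind speed at each altitude level.

    The GP is placed on the forecast error (measurement minus forecast),
    so far from any measurement the estimate falls back to the forecast.
    """
    k_oo = rbf_kernel(obs_levels, obs_levels, length_scale, variance)
    k_oo = k_oo + noise * np.eye(len(obs_levels))      # measurement noise
    k_lo = rbf_kernel(levels, obs_levels, length_scale, variance)
    residual = obs_wind - forecast_at_obs
    mean = forecast + k_lo @ np.linalg.solve(k_oo, residual)
    # Diagonal of k_lo @ k_oo^{-1} @ k_lo.T, without forming the full matrix.
    reduction = np.einsum("ij,ji->i", k_lo, np.linalg.solve(k_oo, k_lo.T))
    return mean, variance - reduction

# Hypothetical example: a flat 5 m/s forecast and one measurement at 17 km.
levels = np.linspace(15.0, 20.0, 11)                   # altitude in km
forecast = np.full(levels.shape, 5.0)                  # forecast wind, m/s
mean, var = wind_column(levels, forecast,
                        obs_levels=np.array([17.0]),
                        obs_wind=np.array([8.0]),
                        forecast_at_obs=np.array([5.0]))
```

In this sketch the estimate is pulled toward the measurement near 17 km and its variance shrinks there, while levels far from the balloon keep close to the forecast with higher uncertainty.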
31. The ERA5 reanalysis (dataset) provides baseline winds:
Like real, but low resolution.
Baseline winds are upsampled using procedural noise:
Statistically plausible
High resolution
Effectively infinite supply
The simulator
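The upsampling idea can be sketched as follows. This is an illustration under stated assumptions, not the simulator's actual noise model: it works on a 1-D wind profile, interpolates the coarse values linearly, and adds a few octaves of generic value noise. The names `value_noise` and `upsample_winds` and all parameters are hypothetical.

```python
import numpy as np

def value_noise(n_points, n_knots, rng):
    """Smooth 1-D noise: random values at coarse knots, linearly interpolated."""
    knots = rng.standard_normal(n_knots)
    x = np.linspace(0.0, n_knots - 1.0, n_points)
    return np.interp(x, np.arange(n_knots), knots)

def upsample_winds(coarse, factor, amplitude, seed, octaves=3):
    """Interpolate a coarse wind profile and add multi-octave noise.

    Each seed yields a different, reproducible high-resolution field
    that still follows the coarse baseline.
    """
    rng = np.random.default_rng(seed)
    coarse = np.asarray(coarse, dtype=float)
    n_fine = (len(coarse) - 1) * factor + 1
    x_fine = np.linspace(0.0, len(coarse) - 1.0, n_fine)
    fine = np.interp(x_fine, np.arange(len(coarse)), coarse)
    for octave in range(octaves):
        n_knots = len(coarse) * 2 ** (octave + 1)      # finer detail per octave
        fine = fine + amplitude / 2 ** octave * value_noise(n_fine, n_knots, rng)
    return fine

# Hypothetical coarse baseline (m/s) upsampled 8x with two different seeds.
baseline = np.array([4.0, 6.0, 5.0, 7.0])
field_a = upsample_winds(baseline, factor=8, amplitude=0.5, seed=0)
field_b = upsample_winds(baseline, factor=8, amplitude=0.5, seed=1)
```

Changing the seed gives a new plausible field over the same baseline, which is what makes the supply of training winds effectively unlimited.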
32. Design and training
2-day training simulations
In the tropics (±25° latitude)
Starting up to 200 km away
Light filtering of “impossible” conditions
Distributional predictions (QR-DQN)
Distributed training:
100 actors
4 replay buffers
1 GPU
1.1B training steps (~30 days wall time)
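For readers unfamiliar with the distributional predictions mentioned above, the core of QR-DQN is a quantile regression Huber loss. The sketch below computes it in NumPy for a single state-action pair. It illustrates the published loss and is not the training code used for the balloons; the function name and the example inputs are hypothetical.

```python
import numpy as np

def quantile_huber_loss(pred_quantiles, target_samples, kappa=1.0):
    """Quantile regression Huber loss for one state-action pair.

    pred_quantiles: shape (N,), predicted quantiles of the return.
    target_samples: shape (M,), samples from the target return distribution.
    """
    n = len(pred_quantiles)
    taus = (np.arange(n) + 0.5) / n                    # quantile midpoints
    u = target_samples[None, :] - pred_quantiles[:, None]   # errors, (N, M)
    abs_u = np.abs(u)
    huber = np.where(abs_u <= kappa,
                     0.5 * u ** 2,
                     kappa * (abs_u - 0.5 * kappa))
    # Asymmetric weight: overestimates and underestimates of the tau-th
    # quantile are penalized differently.
    weight = np.abs(taus[:, None] - (u < 0.0).astype(float))
    # Average over target samples, sum over predicted quantiles.
    return float((weight * huber).mean(axis=1).sum())

# Hypothetical check: quantiles of the target beat a shifted prediction.
targets = np.linspace(-1.0, 1.0, 101)
good = np.quantile(targets, (np.arange(4) + 0.5) / 4)
loss_good = quantile_huber_loss(good, targets)
loss_shifted = quantile_huber_loss(good + 1.0, targets)
```

Minimizing this loss drives each output toward a different quantile of the return, so the network predicts a distribution over outcomes rather than a single expected value.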
37. Pacific Ocean Experiment
26 Oct 2019 – 25 Jan 2020
13 balloons
Total 2884 RL flight hours
Longest RL flight ~16 days
0°N, 114°W
“StationSeeker” balloons
“Perciatelli” balloons
48. Deep reinforcement learning
Bellemare, Naddaf, Veness, Bowling (2013)
Mnih, Silver, Kavukcuoglu, Rusu, Veness, Bellemare, et al. (Nature, 2015)
Levine et al. (2016)
Kalashnikov, Irpan, et al. (2018)
Silver, Huang, Maddison, et al. (2016, 2017)
Bard, Foerster, Chandar, et al. (2020)
OpenAI et al. (2019)
Vinyals et al. (2019)
49. Mirhoseini, Goldie, et al. (arXiv, 2020)
Won et al. (2020)
Glavic et al. (2017)
Ie et al. (2019)
Reddy et al. (2018)
50. Merci / Thank You
@DsdtMtl
Data Science | Design | Technology
(Check for next DSDT meetup at https://www.meetup.com/DSDTmtl)
http://bit.ly/dsdtmtl-in