DSDT Meetup May 2021

Data Science | Design | Technology
https://www.meetup.com/DSDTMTL
May 26, 2021

Please don't forget to mute yourself.

JL Maréchaux, DSDT Co-Organizer
Marc G. Bellemare, Research Scientist, Google Research, Brain Team

Agenda
3:45 - 4:00 Arrival & Networking
4:00 - 4:15 News & Intro
4:15 - 5:15 How can reinforcement learning help us fly balloons in the stratosphere?
5:15 - 5:30 Virtual Snack & Networking

A special thanks to our contributors…
Thanks / Merci
The (virtual) venue sponsor & snacks
The brain ...

Virtual Meetups
Until we can do in-person events again in Montreal…
Past (and future) presentations available on Slideshare.
http://www.slideshare.net/DSDT_MTL

Monthly cadence, on Wednesdays.
Incredible sessions already planned for May, June and July.
Contact us with your expectations & ideas.
What is coming in 2021
Topics: ML Validation, Reinforcement Learning, Explainable AI, RNN & Time Series
Dates: April 28, May 26, June 16, July 21
Your ideas,
your meetup.
http://bit.ly/DSDTsurvey2021

Our 2021 campaign to fight against poverty and social exclusion.
Data Science.
Design.
Technology.
https://centraide-mtl.org/dsdtmtl

How can reinforcement learning help us fly balloons in the stratosphere?
Marc G. Bellemare

Decisions from data: Controlling complex systems with reinforcement learning
Marc G. Bellemare (1), Salvatore Candido (2), Pablo Samuel Castro (1), Jun Gong (2), Marlos C. Machado (1), Subhodeep Moitra (1), Sameera S. Ponda (2), Ziyu Wang (1)
(1) Google Research, Brain team
(2) Loon
With thanks to: Beth Reid, Joshua Greaves, Bradley Rhodes, and many more
https://www.nature.com/articles/s41586-020-2939-8

Image credit: https://bejofo.net/ttt

Reinforcement learning = trial and error
data → decisions

Credit assignment is at the heart of RL

Credit assignment via the Bellman equation
Markov decision process
Implemented as a deep neural network
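
The slide names the Bellman equation without reproducing it. For reference, its standard optimality form for the action-value function of a Markov decision process is given below. The symbols (state s, action a, reward r, discount factor γ, transition kernel P) follow the usual textbook convention and are an assumption here, not notation taken from the talk.

```latex
Q^*(s, a) = \mathbb{E}_{s' \sim P(\cdot \mid s, a)}
  \left[ r(s, a) + \gamma \max_{a'} Q^*(s', a') \right]
```

In deep reinforcement learning, Q^* is approximated by a neural network whose parameters are adjusted to reduce the gap between the two sides of this equation.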

80 units; Tesauro (1995)
40 x 256 convolutional filters; Silver et al. (2017)
Deep reinforcement learning

Many RL problems are...
Underactuated
Partially observable

The quasi-biennial oscillation, Baldwin et al. (2001)

312 Days in the Stratosphere, Loon, Oct 28 2020.
Loon proprietary

Long-term objective, binary signal
Partial observability
Limited power
Underactuated system, stochastic dynamics

StationSeeker in equations
1) Wind score.
2) Per-altitude score.
3) Setpoint to max. scoring altitude.
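
The three steps above can be sketched as follows. The slide gives only the step names, so the scoring rule here (cosine of the angle between the wind and the bearing to the station) and all function names are hypothetical stand-ins, not the actual StationSeeker equations.

```python
import math


def wind_score(wind_u, wind_v, to_station_x, to_station_y):
    """Step 1 (hypothetical rule): cosine of the angle between the wind
    vector and the direction to the station. +1 blows straight at it."""
    wind_speed = math.hypot(wind_u, wind_v)
    distance = math.hypot(to_station_x, to_station_y)
    if wind_speed == 0.0 or distance == 0.0:
        return 0.0
    dot = wind_u * to_station_x + wind_v * to_station_y
    return dot / (wind_speed * distance)


def altitude_scores(wind_column, to_station):
    """Step 2: score every altitude from the wind found at that altitude."""
    return {
        altitude: wind_score(u, v, to_station[0], to_station[1])
        for altitude, (u, v) in wind_column.items()
    }


def choose_setpoint(wind_column, to_station):
    """Step 3: set the altitude setpoint to the best-scoring altitude."""
    scores = altitude_scores(wind_column, to_station)
    return max(scores, key=scores.get)


# Illustrative numbers only: altitude in metres -> (u, v) wind in m/s.
wind_column = {16000: (5.0, 0.0), 17000: (-4.0, 1.0), 18000: (0.0, 6.0)}
to_station = (-10.0, 0.0)  # the station lies to the west of the balloon
print(choose_setpoint(wind_column, to_station))  # prints 17000
```

A greedy rule of this kind looks only at the current wind column, which is one reason a learned controller that plans over longer horizons can do better.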

Deep reinforcement learning for balloon navigation
+16 ambient variables
Forecast + measurements + Gaussian process = wind column
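
One way to read "forecast + measurements + Gaussian process" is to treat the forecast as the prior mean and condition a Gaussian process on the balloon's own measurements. The sketch below does this for a single wind component as a function of altitude. The RBF kernel, its length scale, the noise level and all names are assumptions made for illustration; the slide does not specify them.

```python
import math


def rbf(a, b, length_scale=1.0):
    """Squared-exponential kernel between two altitudes (km)."""
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def wind_column(forecast, measured, query_alts, length_scale=1.0, noise=1e-6):
    """Posterior mean of the wind at each query altitude.

    forecast: function altitude_km -> forecast wind (used as prior mean)
    measured: list of (altitude_km, measured wind) pairs
    """
    alts = [a for a, _ in measured]
    residuals = [w - forecast(a) for a, w in measured]
    gram = [
        [rbf(a, b, length_scale) + (noise if i == j else 0.0)
         for j, b in enumerate(alts)]
        for i, a in enumerate(alts)
    ]
    weights = solve(gram, residuals)
    return [
        forecast(q) + sum(rbf(q, a, length_scale) * w
                          for a, w in zip(alts, weights))
        for q in query_alts
    ]


# Near the measurement the estimate follows the measurement;
# far from it the estimate falls back to the forecast.
column = wind_column(lambda alt: 2.0 * alt - 30.0, [(17.0, 8.0)], [17.0, 27.0])
print(column)
```

A full Gaussian process would also return a posterior variance per altitude; only the mean is shown here to keep the sketch short.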

The ERA5 reanalysis (dataset) provides baseline winds
Like real, but low resolution.
Baseline winds are upsampled using procedural noise:
Statistically plausible
High resolution
Effectively infinite supply
The simulator
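
The upsampling idea can be illustrated in one dimension: interpolate the coarse baseline, then add smooth seeded noise so that each seed yields a different but plausible high-resolution field. The sketch uses simple value noise as a stand-in; the slide does not say which noise function, frequency or amplitude the simulator uses, so those choices and all names below are hypothetical.

```python
import math
import random


def lerp(a, b, t):
    """Linear interpolation between a and b."""
    return a + (b - a) * t


def value_noise(x, seed):
    """Smooth noise in [-1, 1]: random values at integer lattice points,
    blended with a smoothstep curve in between."""
    x0 = math.floor(x)
    t = x - x0
    t = t * t * (3.0 - 2.0 * t)

    def lattice(i):
        # Deterministic per (seed, lattice index).
        return random.Random(seed * 1_000_003 + i).uniform(-1.0, 1.0)

    return lerp(lattice(x0), lattice(x0 + 1), t)


def upsample(coarse, factor, amplitude, seed, frequency=4.0):
    """Interpolate `coarse` (at least two samples) to `factor` times the
    resolution, then add noise. `frequency` is noise cycles per coarse cell."""
    fine = []
    for i in range((len(coarse) - 1) * factor + 1):
        pos = i / factor
        left = min(int(pos), len(coarse) - 2)
        base = lerp(coarse[left], coarse[left + 1], pos - left)
        fine.append(base + amplitude * value_noise(pos * frequency, seed))
    return fine


baseline = [0.0, 10.0, 5.0]  # illustrative low-resolution wind speeds
for seed in (1, 2):
    # Same baseline, different fine-scale detail for each seed.
    print(upsample(baseline, factor=4, amplitude=0.5, seed=seed))
```

Because the noise is a deterministic function of the seed, any number of distinct wind fields can be generated from the same baseline, which matches the "effectively infinite supply" point above.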

Design and training
2-day training simulations
In the tropics (±25° latitude)
Starting up to 200 km away
Light filtering of “impossible” conditions
Distributional predictions (QR-DQN)
Distributed training:
100 actors
4 replay buffers
1 GPU
1.1B training steps (~30 days wall time)
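
For readers unfamiliar with QR-DQN: the network predicts a set of quantiles of the return distribution rather than a single expected value, and is trained with a quantile Huber loss. The sketch below computes that loss for one state-action pair in plain Python. It follows the standard published form of the loss and is not the training code used for the balloon agent; function names are illustrative.

```python
def huber(u, kappa=1.0):
    """Huber loss: quadratic near zero, linear beyond kappa."""
    if abs(u) <= kappa:
        return 0.5 * u * u
    return kappa * (abs(u) - 0.5 * kappa)


def quantile_huber_loss(predicted, targets, kappa=1.0):
    """Quantile Huber loss between predicted quantiles and target samples.

    predicted: N quantile estimates for one state-action pair
    targets:   samples of the Bellman target distribution
    """
    n = len(predicted)
    loss = 0.0
    for i, theta in enumerate(predicted):
        tau = (2 * i + 1) / (2 * n)  # quantile midpoint for estimate i
        for target in targets:
            u = target - theta
            # Asymmetric weight: low quantiles are penalised more for
            # overestimating, high quantiles more for underestimating.
            weight = abs(tau - (1.0 if u < 0 else 0.0))
            loss += weight * huber(u, kappa) / kappa
    return loss / len(targets)


print(quantile_huber_loss([0.0], [2.0]))  # prints 0.75
```

In practice the targets are themselves quantiles produced by a target network for the next state, and the loss is averaged over a minibatch drawn from the replay buffers.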

Pacific Ocean Experiment
26 Oct 2019 – 25 Jan 2020
13 balloons
Total 2884 RL flight hours
Longest RL flight ~16 days
0°N 114°W
“StationSeeker” balloons
“Perciatelli” balloons

         StationSeeker   Perciatelli
TWR50    72%             79%
Power    33 W            29 W
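
TWR50 is the fraction of flight time spent within 50 km of the station. A minimal sketch of how such a figure could be computed from a flight log follows; it assumes distance samples taken at evenly spaced times, and the function name and sample values are illustrative only.

```python
def time_within_range(distances_km, radius_km=50.0):
    """Fraction of evenly spaced samples within `radius_km` of the station."""
    if not distances_km:
        raise ValueError("need at least one distance sample")
    within = sum(1 for d in distances_km if d <= radius_km)
    return within / len(distances_km)


# Illustrative distances, not data from the experiment.
print(time_within_range([10.0, 40.0, 60.0, 120.0]))  # prints 0.5
```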

312 Days in the Stratosphere, Loon, Oct 28 2020.

Bellemare, Naddaf, Veness, Bowling (2013)
Mnih, Silver, Kavukcuoglu, Rusu, Veness, Bellemare, et al. (Nature, 2015)
Levine et al. (2016)
Kalashnikov, Irpan, et al. (2018)
Silver, Huang, Maddison, et al. (2016, 2017)
Bard, Foerster, Chandar, et al. (2020)
OpenAI et al. (2019)
Vinyals et al. (2019)
Deep reinforcement learning

Mirhoseini, Goldie, et al. (arXiv, 2020)
Won et al. (2020)
Glavic et al. (2017)
Ie et al. (2019)
Reddy et al. (2018)

Merci / Thank You
@DsdtMtl
(Check for next DSDT meetup at https://www.meetup.com/DSDTmtl)
http://bit.ly/dsdtmtl-in
