A Recurrent Neural Network (RNN) is used for the short-term prediction of weather radar images. The aim is to predict future rain fields.
The model structure has been carefully tailored to the problem to guide the model and make learning more efficient.
1. Recurrent Neural Network tailored for Weather Radar Nowcasting
Andreas Scheidegger
Eawag: Swiss Federal Institute of Aquatic Science and Technology
June 23, 2016
2. Andreas Scheidegger – Eawag
Weather Radar Nowcasting
[Figure: observed radar images t−4 ... t0 are fed to the nowcast model, which predicts the images t+1 ... t+5]
Available inputs: every pixel of the previous images
Desired outputs: every pixel of the next n images
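To make the input/output structure concrete, here is a minimal Python sketch of one nowcasting sample; the frame counts, image size, and the persistence placeholder model are illustrative assumptions, not values from the talk.

```python
import numpy as np

# Hypothetical nowcasting sample: map the last m radar images to the next n.
m, n = 5, 5            # number of input / output frames (assumed)
H, W = 128, 128        # pixels per radar image (assumed)

past   = np.random.rand(m, H, W)   # observed images t-4 ... t0
future = np.random.rand(n, H, W)   # target images t+1 ... t+5

def nowcast(past_images, n_steps):
    """Placeholder model: persistence forecast (repeat the last image)."""
    return np.repeat(past_images[-1][None], n_steps, axis=0)

prediction = nowcast(past, n)
assert prediction.shape == future.shape   # every pixel of the next n images
```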
4. Andreas Scheidegger – Eawag
Recurrent ANN
[Figure (built up over slides 4–6): the last observed image P_t and hidden state H_t enter the ANN, which returns the prediction P̂_t+1 and the updated state H_t+1; feeding each prediction and state back into the ANN yields P̂_t+2, P̂_t+3, P̂_t+4, ...]
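A minimal sketch of this unrolled recurrence; the `ann` step function, the state update, and all sizes are placeholder assumptions, not the model from the talk.

```python
import numpy as np

def ann(p, h):
    """Hypothetical one-step model: maps (image, hidden state) to
    (predicted next image, updated hidden state)."""
    h_new = np.tanh(h + p.mean())   # placeholder state update
    p_hat = p                       # placeholder prediction (persistence)
    return p_hat, h_new

def unroll(p_t, h_t, n_steps):
    """Feed each prediction back in: P̂_t+k+1, H_t+k+1 = ANN(P̂_t+k, H_t+k)."""
    predictions = []
    p, h = p_t, h_t
    for _ in range(n_steps):
        p, h = ann(p, h)
        predictions.append(p)
    return predictions

p_t = np.random.rand(128, 128)           # last observed image (size assumed)
h_t = np.zeros(64)                       # initial hidden state (size assumed)
p_hats = unroll(p_t, h_t, n_steps=4)     # P̂_t+1 ... P̂_t+4
```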
7. Andreas Scheidegger – Eawag
Model structure
[Figure: one ANN step mapping the image P_t and state H_t to the prediction P̂_t+1 and state H_t+1]
Traditional Artificial Neural Networks (ANN) are (nested) non-linear regressions:
v_j = σ(∑_k w_jk u_k + b_j)
How can we do better?
8. Andreas Scheidegger – Eawag
Model structure
How can we do better? → Tailor the model structure to the problem.
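Reading the formula above as code: each output v_j is a sigmoid of a weighted sum of all inputs u_k plus a bias. A minimal numpy sketch (the layer sizes are illustrative):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def dense_layer(u, W, b):
    """v_j = sigma(sum_k w_jk * u_k + b_j), computed for all j at once."""
    return sigmoid(W @ u + b)

rng = np.random.default_rng(0)
u = rng.normal(size=16)        # input vector u_k (size assumed)
W = rng.normal(size=(8, 16))   # weights w_jk
b = np.zeros(8)                # biases b_j
v = dense_layer(u, W, b)       # 8 outputs, each squashed into (0, 1)
```

Stacking several such layers, each feeding its outputs v into the next layer's inputs u, gives the nested non-linear regression mentioned above.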
12. Andreas Scheidegger – Eawag
Spatial Transformer
[Figure: input image and its spatially transformed output]
Jaderberg, M., Simonyan, K., Zisserman, A., and Kavukcuoglu, K. (2015). Spatial Transformer Networks. arXiv:1506.02025 [cs].
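The core idea of the Spatial Transformer module is a differentiable warp of the input by a learned transformation. A rough illustrative sketch of an affine warp with nearest-neighbour sampling (the actual module of Jaderberg et al. uses bilinear sampling and predicts the transform parameters with a sub-network; everything below is a simplified assumption):

```python
import numpy as np

def affine_warp(image, theta):
    """Sample `image` at locations given by the 2x3 affine matrix `theta`,
    mapping normalized output coordinates to input coordinates."""
    H, W = image.shape
    out = np.zeros_like(image)
    ys, xs = np.meshgrid(np.linspace(-1, 1, H), np.linspace(-1, 1, W),
                         indexing="ij")
    coords = np.stack([xs.ravel(), ys.ravel(), np.ones(H * W)])  # homogeneous
    x_src, y_src = theta @ coords                                # source coords
    # back to pixel indices, nearest-neighbour rounding
    i = np.clip(np.round((y_src + 1) / 2 * (H - 1)).astype(int), 0, H - 1)
    j = np.clip(np.round((x_src + 1) / 2 * (W - 1)).astype(int), 0, W - 1)
    out.flat[:] = image[i, j]
    return out

# identity plus a small translation to the right (illustrative)
theta = np.array([[1.0, 0.0, 0.2],
                  [0.0, 1.0, 0.0]])
warped = affine_warp(np.random.rand(64, 64), theta)
```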
27.–34. Andreas Scheidegger – Eawag
Conclusions and Outlook
● Deep learning may be a hype – but it's a useful tool nevertheless!
● Combine domain knowledge with data-driven modeling: v_j = σ(∑_k w_jk u_k + b_j)
● Online parameter adaptation: θ_t+1 = θ_t − λ ∇L(θ_t) (toy sketch below)
● Other objective functions? L(θ)
● Include other inputs
● Predict the prediction uncertainty
Interested in collaboration? andreas.scheidegger@eawag.ch
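A toy sketch of the online parameter adaptation bullet: one gradient step per newly arriving observation. The linear model, squared-error loss, and learning rate are placeholder assumptions.

```python
import numpy as np

def grad(theta, x, y):
    """Gradient of a placeholder squared-error loss of a linear toy model."""
    return 2 * x.T @ (x @ theta - y)

def online_update(theta, x, y, lam=1e-3):
    """theta_t+1 = theta_t - lambda * grad L(theta_t), one step per sample."""
    return theta - lam * grad(theta, x, y)

rng = np.random.default_rng(0)
theta = np.zeros(4)
for _ in range(100):                 # stand-in for a stream of new radar data
    x = rng.normal(size=(1, 4))      # one new input
    y = rng.normal(size=1)           # one new observation
    theta = online_update(theta, x, y)
```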
36. Andreas Scheidegger – Eawag
Training and Implementation
Training:
● Loss function: L(θ) = ∫₀^∞ RMS(τ; θ) p(τ) dτ (discretization sketched below)
● Stochastic Gradient Descent: θ_new = θ_old − λ ∇L(θ_old)
● Scheduled learning (Bengio et al., 2015)
● Regularisation: drop-out (Srivastava et al., 2014)
Implementation:
● Based on the Python library Chainer (Tokui et al., 2015)
● Runs on CUDA-compatible GPUs
Bengio, et al. (2015). Scheduled sampling for sequence prediction with recurrent neural networks. arXiv preprint arXiv:1506.03099.
Srivastava, et al. (2014). Dropout: A simple way to prevent neural networks from overfitting. The Journal of Machine Learning Research 15, 1929–1958.
Tokui, et al. (2015). Chainer: a Next-Generation Open Source Framework for Deep Learning.
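A sketch of how the loss above might be evaluated: the integral over lead times τ is approximated by a weighted sum over the discrete forecast steps. The weights p and the toy data are assumptions for illustration.

```python
import numpy as np

def rms(pred, obs):
    """Root-mean-square error over all pixels of one image."""
    return np.sqrt(np.mean((pred - obs) ** 2))

def loss(preds, obs, p):
    """Discretized L(theta) = sum_tau RMS(tau; theta) * p(tau): the error at
    each lead time is weighted by an assumed importance density p."""
    return sum(rms(pr, ob) * w for pr, ob, w in zip(preds, obs, p))

# toy example: 5 predicted vs. observed images, earlier lead times weighted more
preds = [np.random.rand(64, 64) for _ in range(5)]
obs   = [np.random.rand(64, 64) for _ in range(5)]
p     = np.array([0.35, 0.25, 0.2, 0.12, 0.08])   # assumed weights, sum to 1
print(loss(preds, obs, p))
```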
37. Andreas Scheidegger – Eawag
Training
[Figure: the unrolled ANN during training; each step receives the observed image (P_t, P_t+1, P_t+2, ...) and the hidden state, and emits a training prediction P̂_t+1 ... P̂_t+4. Legend: observed images in, predicted images out.]
38. Andreas Scheidegger – Eawag
Scheduled Training
[Figure: as in the training figure, but each step receives either the observed image P_t+k or the model's own prediction P̂_t+k, chosen at random. Legend: observed or predicted images in, predicted images out.]
Bengio, et al. (2015). Scheduled sampling for sequence prediction with recurrent neural networks. arXiv preprint arXiv:1506.03099.
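A minimal sketch of the scheduled-sampling idea from Bengio et al.: with some probability each unrolled step consumes the model's own previous prediction instead of the observed image, and that probability is increased over training. The step function and schedule below are placeholder assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def ann_step(p, h):
    """Hypothetical one-step model, as in the earlier sketch."""
    return p, np.tanh(h + p.mean())

def scheduled_rollout(observed, h, eps):
    """Unroll over a training sequence; each step consumes the true image
    with probability (1 - eps), else the model's own previous prediction."""
    p_in, preds = observed[0], []
    for obs_next in observed[1:]:
        p_hat, h = ann_step(p_in, h)
        preds.append(p_hat)
        use_prediction = rng.random() < eps
        p_in = p_hat if use_prediction else obs_next
    return preds

observed = [np.random.rand(32, 32) for _ in range(6)]
for epoch in range(10):
    eps = epoch / 10        # anneal from teacher forcing to free running
    preds = scheduled_rollout(observed, np.zeros(8), eps)
```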
42. Andreas Scheidegger – Eawag
Traditional model structures
Estimate a velocity field and translate the last radar image (optical flow; toy sketch below)
→ seems to be the most "natural" approach
Identify rain cells, then track and extrapolate them
→ can model the growth and decay of cells
● Models can be difficult to tune
● Local features (e.g. a mountain range) cannot be modeled
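For intuition on the optical-flow approach above, a toy sketch: simply shift the last image by a fixed displacement per step. Real methods estimate a spatially varying velocity field from consecutive frames; the displacement and data here are assumptions.

```python
import numpy as np

def translate(image, dy, dx):
    """Shift an image by an integer displacement, padding with zeros
    (i.e. rain moving out of the frame disappears)."""
    out = np.zeros_like(image)
    H, W = image.shape
    ys  = slice(max(dy, 0), H + min(dy, 0))
    xs  = slice(max(dx, 0), W + min(dx, 0))
    ys0 = slice(max(-dy, 0), H + min(-dy, 0))
    xs0 = slice(max(-dx, 0), W + min(-dx, 0))
    out[ys, xs] = image[ys0, xs0]
    return out

last = np.random.rand(64, 64)   # last observed radar image (toy data)
dy, dx = 2, -1                  # displacement per time step (assumed constant)
forecast = [last]
for _ in range(5):              # extrapolate 5 steps ahead
    forecast.append(translate(forecast[-1], dy, dx))
```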