Deep Learning
for
Extreme Weather Detection
- 1 -
Prabhat
Lawrence Berkeley National Laboratory
SAMSI Climate Workshop
8/23/2017
Outline
•Scientific Motivation
•Deep Learning in Computer Vision
– Trends, Successes
•Deep Learning in Climate Science
– Results, Challenges
•Future of AI in Climate Science
- 2 -
Extreme Weather
- 3 -
- 4 -
- 5 -
- 6 -
- 7 -
How will extreme weather change in the future?
- 8 -
How will extreme weather change in the future?
•Look back in time (Paleoclimate records)
•Look forward in time (Climate simulations)
– Internal climate system variability
– External forcings
– Anthropogenic influence
•Need an objective tool for detecting extremes
– Pattern detection task
– Can Deep Learning come to the rescue?
- 9 -
Climate Simulations
- 10 -
Outline
•
•Deep Learning in Computer Vision
– Trends, Successes
•
–
•
- 11 -
Computer Vision tasks
- 12 -
Deep Learning for Computer Vision
- 13 -
ImageNet Dataset
•1000 object classes
•1.2M training examples
- 14 -
AlexNet Architecture
- 15 -
ImageNet Performance
- 16 -
Can Deep Learning Work for Climate Science?
- 17 -
• Similarities
– Tasks:
• Classification, Localization, Detection, Segmentation
• Clustering
• Feature Learning
• Differences
– Unique attributes of Climate Data
• Multi-channel / Multi-variate
• Different Spatio-temporal scales
• Double precision floating point
• Underlying statistics are likely different
Challenge: Multi-Variate Data
- 18 -
Outline
•
•
–
•Deep Learning in Climate Science
– Results, Challenges
•
- 19 -
- 20 -
Task: Find Extreme Weather Patterns
Supervised Binary Classification
•Training Input: Cropped, Centered, Multi-variate
patches with Labels
–Tropical Cyclone (TC)
–Atmospheric River (AR)
–Weather Front (WF)
•Output: Binary (Yes/No) on Test patches
– Is there a TC in the patch?
– Is there an AR in the patch?
– Is there a WF in the patch?
- 21 -
CLASSIFICATION Image
Dimension
Variables Total Examples
(+ve) (-ve)
Tropical Cyclone 32x32 PSL,UBOT,VBOT,TMQ,
U850,V850,T200,T500
10000 10000
Atmospheric
Rivers
148x224 TMQ, Land Sea mask 6500 6800
Weather Fronts 27x60 T2m, Precip, PSL 5600 6500
Training Data
CLASSIFICATION Conv1 Pool1 Conv2 Pool2 Full Full
Tropical Cyclone 5x5-8 2x2 5x5-16 2x2 50 2
Atmospheric River 12x12-8 3x3 12x12-16 2x2 200 2
Weather Fronts 5x5-16 2x2 5x5-16 2x2 400 2
Supervised Convolutional Architecture
Logistic
Regression
K-Nearest
Neighbor
Support Vector
Machine
Random
Forest
ConvNet
Train Test Train Test Train Test Train Test Train Test
Tropical
Cyclone
96.8 95.85 98.1 97.85 97.0 95.85 99.2 99.4 99.3 99.1
Atmospheric
Rivers
81.97 82.65 79.7 81.7 81.6 83.0 87.9 88.4 90.5 90.0
Weather
Fronts
84.9 89.8 72.46 76.45 84.35 90.2 80.97 87.5 88.7 89.4
Hyper-parameter optimization applied with Spearmint for all methods
Supervised Classification Accuracy
Semi-Supervised Detection
•Objectives:
– Create unified architecture for all weather patterns
– Want to predict bounding box location for weather pattern
– Might have few/no labels for several weather patterns; want
to discover new patterns
- 25 -
Semi-Supervised Convolutional
Architecture
Encoder Decoder
Classification + Bounding Box Regression
Contributors: Evan Racah, Chris Beckham, Tegan Maharaj, Chris
Pal
Reconstruction Results
- 27 -
Original Reconstruction
Classification + Regression Results
- 28 -
Ground Truth
Prediction
Weather Front Detection
- 30 -
Contributors: Jim Biard, Ken Kunkel, Evan Racah
- 31 -
- 32 -
- 33 -
Current Status
• SC’17 paper on scaling semi-supervised
architecture to 600,000 cores, 15PF performance
•2 papers under review
–BAMS article: “Deep Learning for Detecting Extreme
Weather Patterns”
– NIPS paper: “ExtremeWeather: A large-scale climate
dataset for semi-supervised detection, localization, and
understanding of extreme weather events”
•Upcoming papers
– Front detection
– Atmospheric river detection
- 34 -
Next Steps
• Goal: Create unified architecture for finding
extreme weather in all climate datasets
• Create schema and reference dataset for climate
community; release reference architecture
• Hybrid LSTM + CNN architectures for storm tracking
• Classification -> Localization -> Detection ->
Segmentation
- 35 -
Open Challenges
• Hyper-Parameter Optimization
– Tuning #layers, #filters, learning rates, schedule is a black art
• Performance and Scaling
– Current networks take days to train on O(10) GB datasets, we
have O(10TB) datasets on hand
• Scarcity of Labeled Data
– Community needs to self-organize and run labeling campaigns
– Need some theory around limits of semi-supervised learning
• Interpretability and Visualization
– ‘Black Box’ classifier
– Need to incorporate domain science principles (physical
consistency, etc)
- 36 -
Outline
•
•
–
•
–
•Future of AI in Climate Science
- 37 -
Timeline of 2 communities
- 38 -
1970 1980 1990 2000 2010
• Neural Nets • Multi-layer
perceptrons,
Backpropagation
• Eigenface
• First principles
Physics,
Geometry
• Random Forests
• Hand-tuned
features (SIFT,
HOG, …)
• Edge, Corner
detectors
• SVMs, Machine
Learning
• PASCAL VOC
• ImageNet
• ILSVRC
• AlexNet
• ResNet
• …
• Hand-tuned features, first principles physics,…
• IMILAST, ARTMIP
Pattern Detection in Computer Vision
Pattern Detection in Climate Science
Disclaimer…
- 39 -
2018-2020
•Climate Science community will self-organize and
conduct labeling campaigns
– Datasets and reference architectures will become easily
available
•Researchers will exploit low-hanging fruit
– Classification, Clustering, Regression problems will be
(nearly) completely solved
•Broad applicability for analyzing observational
(satellite, weather station) and model (reanalysis,
CMIP-6) output
- 40 -
2020+
•Broad Deployment of tools at HPC centers, Cloud
and universities
– Interactive analytics on 100TB-PB datasets
•Entire climate archives are segmented and
classified
– Anomaly detection; Correlation; Causal Analysis
•Hard questions are formulated and addressed
– Interpretability, incorporating domain science principles
• What is the ‘value add’ of the climate data analyst?
- 41 -
2020+ Climate Analyst Workflow
- 42 -
CMIP
archives
Observational
Data
• Interactive Exploration
• Semantic Labels
• Patterns
• Clusters
• Anomalies
• Mechanisms
• Hypothesis
Conclusions
• Characterizing Extreme Weather in future climate
regimes is an important goal
• Deep Learning can be applied effectively to solve pattern
recognition problem
– Supervised and semi-supervised CNNs
– Segmentation and LSTM architectures in development
• Unique challenges
– Performance and Scaling, Interpretability, Hyper-parameter
Optimization, Lack of labeled data
• Potential for innovation and impact, open to
collaboration!
- 43 -
Acknowledgements
• Intel Research: Nadathur Satish, Narayanan Sundaram, Mostofa
Patwary, Amir Khosrowshahi, Pradeep Dubey, Joe Curley
• Stanford: Ioannis Mitliagkas, Jian Yang, Yunjie Liu, Chris Re
• UC Berkeley: Mayur Mudigonda
• MILA: Evan Racah, Chris Beckham, Tegan Maharaj, Chris Pal
• LBNL: Thorsten Kurth, Wahid Bhimji, Michael Wehner, William
Collins
• NOAA: Jim Biard, Ken Kunkel
- 44 -
Questions?
prabhat@lbl.gov

Program on Mathematical and Statistical Methods for Climate and the Earth System Opening Workshop, Deep Learning for Extreme Weather Detection - Prabhat, Aug 23, 2017

  • 1.
    Deep Learning for Extreme WeatherDetection - 1 - Prabhat Lawrence Berkeley National Laboratory SAMSI Climate Workshop 8/23/2017
  • 2.
    Outline •Scientific Motivation •Deep Learningin Computer Vision – Trends, Successes •Deep Learning in Climate Science – Results, Challenges •Future of AI in Climate Science - 2 -
  • 3.
  • 4.
  • 5.
  • 6.
  • 7.
  • 8.
    How will extremeweather change in the future? - 8 -
  • 9.
    How will extremeweather change in the future? •Look back in time (Paleoclimate records) •Look forward in time (Climate simulations) – Internal climate system variability – External forcings – Anthropogenic influence •Need an objective tool for detecting extremes – Pattern detection task – Can Deep Learning come to the rescue? - 9 -
  • 10.
  • 11.
    Outline • •Deep Learning inComputer Vision – Trends, Successes • – • - 11 -
  • 12.
  • 13.
    Deep Learning forComputer Vision - 13 -
  • 14.
    ImageNet Dataset •1000 objectclasses •1.2M training examples - 14 -
  • 15.
  • 16.
  • 17.
    Can Deep LearningWork for Climate Science? - 17 - • Similarities – Tasks: • Classification, Localization, Detection, Segmentation • Clustering • Feature Learning • Differences – Unique attributes of Climate Data • Multi-channel / Multi-variate • Different Spatio-temporal scales • Double precision floating point • Underlying statistics are likely different
  • 18.
  • 19.
    Outline • • – •Deep Learning inClimate Science – Results, Challenges • - 19 -
  • 20.
    - 20 - Task:Find Extreme Weather Patterns
  • 21.
    Supervised Binary Classification •TrainingInput: Cropped, Centered, Multi-variate patches with Labels –Tropical Cyclone (TC) –Atmospheric River (AR) –Weather Front (WF) •Output: Binary (Yes/No) on Test patches – Is there a TC in the patch? – Is there an AR in the patch? – Is there a WF in the patch? - 21 -
  • 22.
    CLASSIFICATION Image Dimension Variables TotalExamples (+ve) (-ve) Tropical Cyclone 32x32 PSL,UBOT,VBOT,TMQ, U850,V850,T200,T500 10000 10000 Atmospheric Rivers 148x224 TMQ, Land Sea mask 6500 6800 Weather Fronts 27x60 T2m, Precip, PSL 5600 6500 Training Data
  • 23.
    CLASSIFICATION Conv1 Pool1Conv2 Pool2 Full Full Tropical Cyclone 5x5-8 2x2 5x5-16 2x2 50 2 Atmospheric River 12x12-8 3x3 12x12-16 2x2 200 2 Weather Fronts 5x5-16 2x2 5x5-16 2x2 400 2 Supervised Convolutional Architecture
  • 24.
    Logistic Regression K-Nearest Neighbor Support Vector Machine Random Forest ConvNet Train TestTrain Test Train Test Train Test Train Test Tropical Cyclone 96.8 95.85 98.1 97.85 97.0 95.85 99.2 99.4 99.3 99.1 Atmospheric Rivers 81.97 82.65 79.7 81.7 81.6 83.0 87.9 88.4 90.5 90.0 Weather Fronts 84.9 89.8 72.46 76.45 84.35 90.2 80.97 87.5 88.7 89.4 Hyper-parameter optimization applied with Spearmint for all methods Supervised Classification Accuracy
  • 25.
    Semi-Supervised Detection •Objectives: – Createunified architecture for all weather patterns – Want to predict bounding box location for weather pattern – Might have few/no labels for several weather patterns; want to discover new patterns - 25 -
  • 26.
    Semi-Supervised Convolutional Architecture Encoder Decoder Classification+ Bounding Box Regression Contributors: Evan Racah, Chris Beckham, Tegan Maharaj, Chris Pal
  • 27.
    Reconstruction Results - 27- Original Reconstruction
  • 28.
    Classification + RegressionResults - 28 - Ground Truth Prediction
  • 30.
    Weather Front Detection -30 - Contributors: Jim Biard, Ken Kunkel, Evan Racah
  • 31.
  • 32.
  • 33.
  • 34.
    Current Status • SC’17paper on scaling semi-supervised architecture to 600,000 cores, 15PF performance •2 papers under review –BAMS article: “Deep Learning for Detecting Extreme Weather Patterns” – NIPS paper: “ExtremeWeather: A large-scale climate dataset for semi-supervised detection, localization, and understanding of extreme weather events” •Upcoming papers – Front detection – Atmospheric river detection - 34 -
  • 35.
    Next Steps • Goal:Create unified architecture for finding extreme weather in all climate datasets • Create schema and reference dataset for climate community; release reference architecture • Hybrid LSTM + CNN architectures for storm tracking • Classification -> Localization -> Detection -> Segmentation - 35 -
  • 36.
    Open Challenges • Hyper-ParameterOptimization – Tuning #layers, #filters, learning rates, schedule is a black art • Performance and Scaling – Current networks take days to train on O(10) GB datasets, we have O(10TB) datasets on hand • Scarcity of Labeled Data – Community needs to self-organize and run labeling campaigns – Need some theory around limits of semi-supervised learning • Interpretability and Visualization – ‘Black Box’ classifier – Need to incorporate domain science principles (physical consistency, etc) - 36 -
  • 37.
  • 38.
    Timeline of 2communities - 38 - 1970 1980 1990 2000 2010 • Neural Nets • Multi-layer perceptrons, Backpropagation • Eigenface • First principles Physics, Geometry • Random Forests • Hand-tuned features (SIFT, HOG, …) • Edge, Corner detectors • SVMs, Machine Learning • PASCAL VOC • ImageNet • ILSVRC • AlexNet • ResNet • … • Hand-tuned features, first principles physics,… • IMILAST, ARTMIP Pattern Detection in Computer Vision Pattern Detection in Climate Science
  • 39.
  • 40.
    2018-2020 •Climate Science communitywill self-organize and conduct labeling campaigns – Datasets and reference architectures will become easily available •Researchers will exploit low-hanging fruit – Classification, Clustering, Regression problems will be (nearly) completely solved •Broad applicability for analyzing observational (satellite, weather station) and model (reanalysis, CMIP-6) output - 40 -
  • 41.
    2020+ •Broad Deployment oftools at HPC centers, Cloud and universities – Interactive analytics on 100TB-PB datasets •Entire climate archives are segmented and classified – Anomaly detection; Correlation; Causal Analysis •Hard questions are formulated and addressed – Interpretability, incorporating domain science principles • What is the ‘value add’ of the climate data analyst? - 41 -
  • 42.
    2020+ Climate AnalystWorkflow - 42 - CMIP archives Observational Data • Interactive Exploration • Semantic Labels • Patterns • Clusters • Anomalies • Mechanisms • Hypothesis
  • 43.
    Conclusions • Characterizing ExtremeWeather in future climate regimes is an important goal • Deep Learning can be applied effectively to solve pattern recognition problem – Supervised and semi-supervised CNNs – Segmentation and LSTM architectures in development • Unique challenges – Performance and Scaling, Interpretability, Hyper-parameter Optimization, Lack of labeled data • Potential for innovation and impact, open to collaboration! - 43 -
  • 44.
    Acknowledgements • Intel Research:Nadathur Satish, Narayanan Sundaram, Mostofa Patwary, Amir Khosrowshahi, Pradeep Dubey, Joe Curley • Stanford: Ioannis Mitliagkas, Jian Yang, Yunjie Liu, Chris Re • UC Berkeley: Mayur Mudigonda • MILA: Evan Racah, Chris Beckham, Tegan Maharaj, Chris Pal • LBNL: Thorsten Kurth, Wahid Bhimji, Michael Wehner, William Collins • NOAA: Jim Biard, Ken Kunkel - 44 -
  • 45.