Evaluating global climate models using simple, explainable neural networks
@ZLabe
Zachary M. Labe
with Elizabeth A. Barnes
Colorado State University
Department of Atmospheric Science
17 December 2021
NG51A-06 – AGU Fall Meeting
Climate Variability Across Scales and Climate States and
Neural Earth System Modeling [Oral Session I]
THE REAL WORLD
(Observations)
Map of temperature
Anomaly is relative to 1951-1980
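The baseline subtraction noted above can be sketched as follows. This is a minimal illustration, assuming annual-mean temperature maps stored as a NumPy array of shape (year, lat, lon). The function name `anomalies_relative_to_baseline` and the synthetic data are hypothetical and are not taken from the talk.

```python
import numpy as np

def anomalies_relative_to_baseline(temps, years, start=1951, end=1980):
    """Subtract the time-mean map over [start, end] from every year's map."""
    years = np.asarray(years)
    in_baseline = (years >= start) & (years <= end)
    if not in_baseline.any():
        raise ValueError("no years fall inside the baseline period")
    climatology = temps[in_baseline].mean(axis=0)   # shape (lat, lon)
    return temps - climatology                      # broadcasts over the year axis

# Synthetic example: 1950-2019 on a coarse 4 x 8 grid (illustrative values only)
years = np.arange(1950, 2020)
rng = np.random.default_rng(0)
warming = 0.01 * (years - 1950)[:, None, None]
temps = 14.0 + warming + rng.normal(0, 0.2, (years.size, 4, 8))
anoms = anomalies_relative_to_baseline(temps, years)
```

By construction, the anomalies average to zero at every grid point over the baseline years.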
CLIMATE MODEL
ENSEMBLES
Range of ensembles
= internal variability (noise)
Mean of ensembles
= forced response (climate change)
But let’s remove
climate change…
After removing the
forced response…
anomalies/noise!
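The separation described on these slides (ensemble mean as the forced response, remainder as internal variability) can be sketched as below. The array layout (member, year, lat, lon), the function name `split_forced_and_internal`, and the synthetic ensemble are assumptions made for illustration only.

```python
import numpy as np

def split_forced_and_internal(ensemble):
    """ensemble: array of shape (member, year, lat, lon) from ONE climate model."""
    forced = ensemble.mean(axis=0, keepdims=True)   # ensemble mean = forced response
    internal = ensemble - forced                    # remainder = internal variability
    return forced[0], internal

# Synthetic ensemble: 16 members sharing one warming signal plus random noise
rng = np.random.default_rng(1)
n_members, n_years = 16, 70
signal = 0.02 * np.arange(n_years)[None, :, None, None]
ensemble = signal + rng.normal(0, 0.3, (n_members, n_years, 4, 8))

forced, internal = split_forced_and_internal(ensemble)
spread = internal.max(axis=0) - internal.min(axis=0)   # range across members
```

After the subtraction, the member-average of `internal` is zero everywhere, so only the noise about the forced response remains.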
2-m Temperature (°C)
THERE ARE MANY CLIMATE MODEL LARGE ENSEMBLES…
Annual mean 2-m temperature
7 global climate models
16 ensembles each
ERA5-BE (observations)
STANDARD EVALUATION OF
CLIMATE MODELS
Pattern correlation
RMSE
EOFs
Trends, anomalies, mean state
Climate modes of variability
CORRELATION
[R]
Negative Correlation Positive Correlation
PATTERN CORRELATION – T2M
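For reference, a pattern correlation between two maps can be computed as in the sketch below. The cosine-of-latitude area weighting is an assumption (the slides do not state how grid cells are weighted), and `pattern_correlation` is a hypothetical helper name.

```python
import numpy as np

def pattern_correlation(map_a, map_b, lats):
    """Centered, area-weighted Pearson correlation between two (lat, lon) maps."""
    weights = np.cos(np.deg2rad(lats))[:, None] * np.ones_like(map_a)
    weights = weights / weights.sum()
    a = map_a - np.sum(weights * map_a)     # remove the weighted spatial mean
    b = map_b - np.sum(weights * map_b)
    covariance = np.sum(weights * a * b)
    return covariance / np.sqrt(np.sum(weights * a**2) * np.sum(weights * b**2))

# Illustrative maps on a 36 x 72 grid
lats = np.linspace(-87.5, 87.5, 36)
rng = np.random.default_rng(2)
model_map = rng.normal(0, 1, (36, 72))
obs_map = model_map + rng.normal(0, 0.5, (36, 72))
r = pattern_correlation(model_map, obs_map, lats)
```

A value of R near +1 means the two spatial patterns match, and a value near -1 means they are mirror images of each other.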
INPUT
[DATA]
PREDICTION
Machine
Learning
----ANN----
2 Hidden Layers
10 Nodes each
Ridge Regularization
Early Stopping
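The architecture listed above (2 hidden layers, 10 nodes each, ridge regularization, early stopping) could look roughly like the NumPy sketch below. Everything beyond those four items is an assumption: ReLU activations, plain full-batch gradient descent, the initialization, the learning rate, and applying the ridge penalty to the first layer only. The synthetic data stand in for flattened temperature maps.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)            # subtract row max for stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def forward(params, x):
    """Return activations of every layer; the last entry holds class probabilities."""
    acts = [x]
    for i, (w, b) in enumerate(params):
        z = acts[-1] @ w + b
        is_output = i == len(params) - 1
        acts.append(softmax(z) if is_output else np.maximum(z, 0.0))  # ReLU when hidden
    return acts

def cross_entropy(params, x, y):
    p = forward(params, x)[-1]
    return -np.mean(np.log(p[np.arange(len(y)), y] + 1e-12))

def train_ann(x_tr, y_tr, x_val, y_val, n_classes, n_hidden=10, ridge=0.01,
              lr=0.05, patience=20, max_epochs=1000, seed=0):
    rng = np.random.default_rng(seed)
    sizes = [x_tr.shape[1], n_hidden, n_hidden, n_classes]      # 2 hidden layers
    params = [(rng.normal(0, np.sqrt(2.0 / a), (a, b)), np.zeros(b))
              for a, b in zip(sizes[:-1], sizes[1:])]
    onehot = np.eye(n_classes)[y_tr]
    best_loss, best_params, wait = np.inf, params, 0
    for _ in range(max_epochs):
        acts = forward(params, x_tr)
        delta = (acts[-1] - onehot) / len(x_tr)     # d(mean cross-entropy)/d(logits)
        grads = [None] * len(params)
        for i in reversed(range(len(params))):
            w = params[i][0]
            gw = acts[i].T @ delta
            if i == 0:
                gw = gw + 2.0 * ridge * w           # ridge (L2) on first layer (assumption)
            grads[i] = (gw, delta.sum(axis=0))
            if i > 0:
                delta = (delta @ w.T) * (acts[i] > 0)   # backpropagate through ReLU
        params = [(w - lr * gw, b - lr * gb)
                  for (w, b), (gw, gb) in zip(params, grads)]
        val_loss = cross_entropy(params, x_val, y_val)
        if val_loss < best_loss - 1e-6:
            best_loss, best_params, wait = val_loss, params, 0
        else:
            wait += 1
            if wait >= patience:                    # early stopping on validation loss
                break
    return best_params

# Synthetic stand-in data: 3 classes, 20 input features per sample
rng = np.random.default_rng(0)
n_classes, n_features = 3, 20
centers = rng.normal(0, 1.0, (n_classes, n_features))

def make_split(n_per_class):
    y = np.repeat(np.arange(n_classes), n_per_class)
    return centers[y] + rng.normal(0, 0.5, (y.size, n_features)), y

x_tr, y_tr = make_split(60)
x_val, y_val = make_split(20)
params = train_ann(x_tr, y_tr, x_val, y_val, n_classes)
probs = forward(params, x_val)[-1]
```

In practice a deep learning library would replace the hand-written training loop. The sketch only makes the listed ingredients concrete.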
TEMPERATURE
We know some metadata…
+ What year is it?
+ Where did it come from?
Train on data from the
Multi-Model Large
Ensemble Archive
NEURAL NETWORK
CLASSIFICATION TASK
HIDDEN LAYERS
INPUT LAYER
CLIMATE MODEL
MAP
[DATA]
Machine
Learning
CLASSIFICATION
CLASSIFICATION
Machine
Learning
CLIMATE MODEL
MAP
[DATA]
Explainable AI
Learn new
science!
LAYER-WISE RELEVANCE PROPAGATION (LRP)
Image Classification LRP: Volcano, Great White Shark, Timber Wolf
https://heatmapping.org/
LRP heatmaps show regions of “relevance” that contribute to the neural network’s decision-making process for a sample belonging to a particular output category
Neural Network
WHY
WHY
WHY
Backpropagation – LRP
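To make the backward pass concrete, here is a minimal sketch of relevance propagation through a small dense ReLU network. It uses the epsilon stabilised rule as an assumption; the slides do not say which LRP rule was applied. The random weights, layer sizes, and function names are hypothetical.

```python
import numpy as np

def forward(weights, biases, x):
    """Return the activations of each layer; the last entry holds the output logits."""
    acts = [x]
    for i, (w, b) in enumerate(zip(weights, biases)):
        z = acts[-1] @ w + b
        acts.append(z if i == len(weights) - 1 else np.maximum(z, 0.0))
    return acts

def lrp_epsilon(weights, biases, x, target, eps=1e-6):
    """Propagate relevance from output unit `target` back to the input features."""
    acts = forward(weights, biases, x)
    relevance = np.zeros_like(acts[-1])
    relevance[target] = acts[-1][target]            # start with the chosen logit
    # Walk the layers backward, pairing each weight matrix with its input activations
    for w, b, a in zip(weights[::-1], biases[::-1], acts[-2::-1]):
        z = a @ w + b
        z = z + eps * np.where(z >= 0, 1.0, -1.0)   # stabiliser avoids division by zero
        s = relevance / z
        relevance = a * (w @ s)                     # share by contribution a_j * w_jk
    return relevance

# Tiny random network: 12 inputs (a flattened map), two hidden layers, 7 classes
rng = np.random.default_rng(0)
sizes = [12, 10, 10, 7]
weights = [rng.normal(0, 0.5, (a, b)) for a, b in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(b) for b in sizes[1:]]
x = rng.normal(0, 1, sizes[0])

logits = forward(weights, biases, x)[-1]
target = int(np.argmax(logits))
heatmap = lrp_epsilon(weights, biases, x, target)
```

With zero biases, the relevance is approximately conserved: the input relevances sum to the logit of the explained class. Reshaping `heatmap` to (lat, lon) would give the kind of map shown on the slides.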
LAYER-WISE RELEVANCE PROPAGATION (LRP)
Image Classification LRP
https://heatmapping.org/
NOT PERFECT
Crock Pot
Neural Network
WHY
Backpropagation – LRP
[Adapted from Adebayo et al., 2020]
EXPLAINABLE AI IS NOT PERFECT
THERE ARE MANY METHODS
COMPARING CLIMATE MODELS
LRP (Explainable AI): colorbar from Low to High
Raw data (Difference from multi-model mean): colorbar from Colder to Warmer
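The raw-data comparison (each model's difference from the multi-model mean) can be sketched as below. The array layout (model, member, year, lat, lon), the averaging over members and years before differencing, and the equal weighting of models are all assumptions for illustration.

```python
import numpy as np

def difference_from_multimodel_mean(data):
    """data: (model, member, year, lat, lon). Returns one (lat, lon) map per model."""
    model_means = data.mean(axis=(1, 2))          # average over members and years
    multimodel_mean = model_means.mean(axis=0)    # each model weighted equally
    return model_means - multimodel_mean

# Synthetic archive: 7 models x 16 members x 10 years on a 4 x 8 grid
rng = np.random.default_rng(3)
model_bias = rng.normal(0, 1.0, (7, 1, 1, 1, 1))  # one constant offset per model
data = model_bias + rng.normal(0, 0.3, (7, 16, 10, 4, 8))
diff_maps = difference_from_multimodel_mean(data)
```

Positive values mark where a model is warmer than the multi-model mean, and negative values mark where it is colder.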
EXPLAINABLE AI
Which climate model does the neural network predict for each year of observations?
APPLYING METHODOLOGY TO REGIONS
Panels: PREDICTION FOR EACH YEAR IN OBSERVATIONS; PATTERN CORRELATIONS FOR EACH YEAR IN THE ARCTIC (colorbar: CORRELATION [R])
Panels: PREDICTION FOR EACH YEAR IN OBSERVATIONS; TRENDS IN 2-M TEMPERATURE FROM 2005 TO 2019 (colorbar: Colder to Warmer, °C)
Panels: PREDICTION FOR EACH YEAR IN OBSERVATIONS; RAW DATA (DIFFERENCE FROM MULTI-MODEL MEAN) (colorbar: Colder to Warmer, °C)
LRP maps (colorbar: Low to High): RECENT ARCTIC AMPLIFICATION; HISTORICAL PERIOD; DIFFERENCE IN LAYER-WISE RELEVANCE PROPAGATION
APPLY SOFTMAX OPERATOR IN THE OUTPUT LAYER
Confidence/Probability    RANK
[ 0.71 ]                  [ 1 ]
[ 0.05 ]                  [ 4 ]
[ 0.01 ]                  [ 7 ]
[ 0.01 ]                  [ 6 ]
[ 0.03 ]                  [ 5 ]
[ 0.11 ]                  [ 2 ]
[ 0.08 ]                  [ 3 ]
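The softmax and ranking step can be sketched as follows. The logits are hypothetical values chosen only so that the ordering matches the ranks on the slide; they do not reproduce the slide's probabilities. Note that the two 0.01 entries on the slide are tied after rounding, so their ranks (7 and 6) must come from the unrounded values.

```python
import numpy as np

def softmax(logits):
    shifted = logits - np.max(logits)       # subtract the max for numerical stability
    e = np.exp(shifted)
    return e / e.sum()

def rank_classes(probabilities):
    """Rank 1 = highest probability. Exact ties keep their original order."""
    order = np.argsort(-np.asarray(probabilities), kind="stable")
    ranks = np.empty(order.size, dtype=int)
    ranks[order] = np.arange(1, order.size + 1)
    return ranks

# Hypothetical output-layer values for the 7 climate model classes
logits = np.array([2.0, -0.5, -2.3, -2.0, -1.0, 0.3, 0.0])
confidence = softmax(logits)     # sums to 1 across the 7 classes
ranks = rank_classes(confidence)
```

The class with rank 1 is the network's prediction, and its softmax value is read as the confidence of that prediction.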
EVALUATING THE ANN’S CONFIDENCE
Confidence for a single ANN from 1950 to 2019
100 ANNs: combinations of training/testing/seeds
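One way to summarise the output of many networks is sketched below. The slides state only that 100 ANNs were built from combinations of training/testing/seeds; the aggregation shown here (mean confidence and vote fractions per year) is an assumption, and the probabilities are random placeholders rather than real predictions.

```python
import numpy as np

def summarize_ann_ensemble(probabilities):
    """probabilities: (network, year, model) softmax outputs for the observations."""
    n_networks, n_years, n_models = probabilities.shape
    winners = probabilities.argmax(axis=2)                       # (network, year)
    counts = np.stack([(winners == m).sum(axis=0) for m in range(n_models)], axis=1)
    return {
        "mean_confidence": probabilities.mean(axis=0),           # (year, model)
        "vote_fraction": counts / n_networks,                    # (year, model)
        "most_frequent_model": counts.argmax(axis=1),            # (year,)
    }

# Placeholder outputs: 100 networks, 70 years (1950-2019), 7 climate models
rng = np.random.default_rng(4)
fake_probs = rng.dirichlet(np.ones(7), size=(100, 70))
summary = summarize_ann_ensemble(fake_probs)
```

The vote fraction shows how consistently the networks agree on a climate model for a given year, which single-network confidence cannot show.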
RANKING CLIMATE MODEL PREDICTIONS FOR EACH YEAR IN OBSERVATIONS
KEY POINTS
Zachary Labe
zmlabe@rams.colostate.edu
@ZLabe
1. Explainable neural networks can be used to identify unique differences in temperature simulated between global climate model large ensembles
2. As a method of climate model evaluation, we input maps from observations into the neural network in order to classify each year with a climate model
3. The neural network architecture can be used in regions with known large biases, such as over the Arctic, or for different methods of preprocessing climate data
