GFDL Polar Climate Interest Group (Presentation): An intro to explainable AI for polar climate science, NOAA GFDL, Princeton, NJ.
References:
Labe, Z.M. and E.A. Barnes (2022), Comparison of climate model large ensembles with observations in the Arctic using simple neural networks. Earth and Space Science, 9(7), e2022EA002348, https://doi.org/10.1029/2022EA002348
Labe, Z.M. and E.A. Barnes (2021), Detecting climate signals using explainable AI with single-forcing large ensembles. Journal of Advances in Modeling Earth Systems, e2021MS002464, https://doi.org/10.1029/2021MS002464
1. An intro to explainable AI for polar climate science
Zachary M. Labe
Postdoc in Seasonal-to-Decadal (S2D) Variability and Predictability Division
with Elizabeth A. Barnes (CSU), Thomas L. Delworth (GFDL), and Nathaniel C. Johnson (GFDL)
26 March 2024
Polar Climate
Interest Group Meeting
https://zacklabe.com/ @ZLabe
2. WHY ELSE SHOULD WE CONSIDER MACHINE LEARNING?
• Do it better: e.g., parameterizations in climate models are not perfect; use ML to make them more accurate
• Do it faster: e.g., code in climate models is very slow (but we know the right answer); use ML methods to speed things up
• Do something new: e.g., go looking for non-linear relationships you didn’t know were there
3. Doing something new is especially relevant for research: it may be slower and worse, but you can still learn something.
5. STANDARD EVALUATION OF CLIMATE MODELS
• Pattern correlation
• RMSE
• EOFs
• Trends, anomalies, mean state
• Climate modes of variability
[Figure: pattern correlation of near-surface air temperature (T2M); colorbar from negative to positive correlation]
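The first two metrics can be sketched in a few lines of NumPy. This is a minimal, hypothetical implementation (the function names and the cos-latitude area weighting are my own choices, not code from the talk):

```python
import numpy as np

def area_weights(lat, shape):
    """cos(latitude) weights broadcast over the grid, normalized to sum to 1."""
    w = np.cos(np.deg2rad(lat))[:, None] * np.ones(shape)
    return w / w.sum()

def pattern_correlation(model, obs, lat):
    """Area-weighted spatial (pattern) correlation of two lat-lon fields."""
    w = area_weights(lat, model.shape)
    mm, mo = (w * model).sum(), (w * obs).sum()
    cov = (w * (model - mm) * (obs - mo)).sum()
    return cov / np.sqrt((w * (model - mm) ** 2).sum() * (w * (obs - mo) ** 2).sum())

def rmse(model, obs, lat):
    """Area-weighted root-mean-square error."""
    w = area_weights(lat, model.shape)
    return np.sqrt((w * (model - obs) ** 2).sum())

# Toy fields on a 10 x 20 Arctic grid: "model" = "obs" plus small noise
rng = np.random.default_rng(0)
lat = np.linspace(60, 89, 10)
obs = rng.standard_normal((10, 20))
model = obs + 0.1 * rng.standard_normal((10, 20))
print(pattern_correlation(model, obs, lat), rmse(model, obs, lat))
```

With noise a tenth the size of the field, the pattern correlation lands near 1 and the RMSE near 0.1, as expected.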
8. ANN: 2 hidden layers, 10 nodes each, ridge regularization, early stopping
TEMPERATURE: We know some metadata…
+ What year is it? (Labe & Barnes, 2021)
+ Where did it come from?
9. Train on data from the Multi-Model Large Ensemble Archive
10. NEURAL NETWORK CLASSIFICATION TASK
[Schematic: input layer → hidden layers → output layer]
15. THERE ARE MANY CLIMATE MODEL LARGE ENSEMBLES…
[Figure: annual mean 2-m air temperature (°C) for 7 global climate models (16 ensemble members each) and ERA5 (observations)]
17. EXPLAINABLE AI (XAI)
1. Is the prediction correct for the right reasons?
• Is it consistent with our physical understanding of the climate system?
2. Provide insights for improving the machine learning model
• Is the model overfitting? Can the model be further optimized?
3. Learn new science
• For example, in climate prediction this could be a new forecast of opportunity or teleconnection
https://doi.org/10.1175/AIES-D-22-0001.1
21. EXPLAINABLE AI (XAI)
Sensitivity: refers to how much the value of the output will
change for a unit change in a specific feature
Such as… Gradient (Saliency Maps), Smooth Gradient (first derivative of the output with respect to input)
Signal: all the information in the input that is relevant to the
prediction task (i.e., signal component versus distractor)
Such as… PatternNet
Attribution: refers to the relative contribution of an input
feature to the output
Such as… Input*Gradient, Integrated Gradients, Layer-wise Relevance Propagation (LRP), Deep Taylor, DeepSHAP
https://doi.org/10.1175/AIES-D-22-0012.1
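The sensitivity-versus-attribution distinction above can be made concrete on a toy network. Below is a NumPy sketch (random weights, purely illustrative) that computes a gradient saliency map and the Input*Gradient attribution for a one-hidden-layer network, then checks the analytic gradient against finite differences:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy dense network: 8 inputs -> 5 hidden (tanh) -> 1 output
W1, b1 = rng.standard_normal((5, 8)), np.zeros(5)
w2 = rng.standard_normal(5)

def forward(x):
    return w2 @ np.tanh(W1 @ x + b1)

def input_gradient(x):
    """Analytic d(output)/d(input), chained back through the tanh layer."""
    dtanh = 1.0 - np.tanh(W1 @ x + b1) ** 2
    return (w2 * dtanh) @ W1

x = rng.standard_normal(8)
saliency = input_gradient(x)         # sensitivity per input feature
attribution = x * input_gradient(x)  # Input*Gradient attribution

# Sanity check against finite differences
eps = 1e-6
fd = np.array([(forward(x + eps * np.eye(8)[i]) - forward(x)) / eps
               for i in range(8)])
print(np.allclose(saliency, fd, atol=1e-4))  # True
```

Saliency answers "how would the output change if this input changed," while Input*Gradient weights that sensitivity by the input value itself, turning it into a contribution estimate.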
22. ATTRIBUTION-BASED XAI METHODS
XAI heatmaps show regions of “relevance” that contribute to the neural network’s decision-making process for a sample belonging to a particular output category
[Figure: image-classification XAI heatmaps from https://heatmapping.org/ for a neural network classifying “volcano”, “great white shark”, and “timber wolf”; relevance is propagated backward through the network via backpropagation rules]
30. Input a map of sea surface temperatures; the neural network classifies it as [0] La Niña or [1] El Niño [Toms et al. 2020, JAMES]
31. Visualizing something we already know…
Input maps of sea surface
temperatures (SST) to
identify El Niño or La Niña
Use ‘LRP’ to see how the
neural network is making
its decision
[Toms et al. 2020, JAMES]
Layer-wise Relevance Propagation (LRP)
[Figure: composite SST observations (SST anomaly, −1.5 to 1.5 °C, colder to warmer) and the corresponding LRP relevance map (0.00 to 0.75, low to high)]
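The LRP idea can be sketched in NumPy for a tiny ReLU classifier. This uses the LRP-epsilon rule on a synthetic stand-in network (not the network from Toms et al.); each layer's relevance is redistributed to the layer below in proportion to each neuron's contribution, so the total relevance is conserved:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy ReLU classifier standing in for the SST network: 6 inputs -> 4 hidden -> 2 classes
W1, b1 = rng.standard_normal((4, 6)), np.zeros(4)
W2, b2 = rng.standard_normal((2, 4)), np.zeros(2)

def lrp_epsilon(x, target, eps=1e-9):
    """Redistribute the target-class score back to the inputs (LRP-epsilon rule)."""
    h = np.maximum(0.0, W1 @ x + b1)                 # forward pass
    out = W2 @ h + b2

    # Output layer -> hidden relevance
    z2 = W2[target] * h
    s2 = z2.sum() + b2[target]
    R_h = z2 / (s2 + eps * np.sign(s2)) * out[target]

    # Hidden layer -> input relevance
    z1 = W1 * x                                      # z1[j, i] = W1[j, i] * x[i]
    s1 = z1.sum(axis=1) + b1
    R_x = (z1 / (s1 + eps * np.sign(s1))[:, None] * R_h[:, None]).sum(axis=0)
    return R_x, out

x = rng.standard_normal(6)
while not np.maximum(0.0, W1 @ x + b1).any():        # ensure at least one active hidden unit
    x = rng.standard_normal(6)

R_x, out = lrp_epsilon(x, target=1)                  # relevance toward class [1], e.g. El Nino
print(np.isclose(R_x.sum(), out[1]))                 # True: relevance is conserved
```

The conservation check is the defining property: the input relevances sum back to the class score being explained, which is what makes the resulting maps interpretable as contributions.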
32. Visualizing something we already know…
Input maps of sea surface
temperatures (SST) to
identify El Niño or La Niña
Use ‘Backward Optimization’ to
identify synthetic input that
maximizes the neural network’s
confidence of the prediction
[Toms et al. 2020, JAMES]
Backward Optimization
[Figure: composite SST observations (SST anomaly, −1.5 to 1.5 °C) and the backward-optimization “optimal input” (SST anomaly, −1.0 to 1.0 °C), colder to warmer]
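Backward optimization amounts to gradient ascent on the input rather than the weights. Here is a toy sketch for a linear stand-in "network" (all names and values are illustrative) with a small L2 penalty so the synthetic input stays bounded:

```python
import numpy as np

rng = np.random.default_rng(3)

# Stand-in "trained" linear network: 6 SST grid points -> 2 class scores
W = rng.standard_normal((2, 6))

def backward_optimize(k, steps=300, lr=0.1, l2=0.5):
    """Gradient ascent on the INPUT to maximize the class-k score,
    penalized by 0.5 * l2 * ||x||^2 to keep the synthetic input bounded."""
    x = np.zeros(6)
    for _ in range(steps):
        grad = W[k] - l2 * x          # d/dx of (score_k - 0.5 * l2 * ||x||^2)
        x = x + lr * grad
    return x

optimal_input = backward_optimize(k=1)
# The ascent converges to x* = W[1] / l2: the input pattern the network
# is "looking for" when it predicts class 1.
print(np.allclose(optimal_input, W[1] / 0.5, atol=1e-3))  # True
```

For a real neural network the gradient comes from backpropagation rather than a closed form, but the loop is the same: nudge the input until the network's confidence in the chosen class is maximized.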
40. 1. Shuffle ensemble member and year
dimensions (bootstrap-like method)
2. Apply true labels (unshuffled years)
3. Apply same ANN architecture and LRP
4. Repeat 500x by using different
combinations of training/testing data and
initialization seeds
5. Compute 95th percentile of the distribution
of LRP at all grid points
Uncertainty for XAI
[Labe and Barnes 2021, JAMES]
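The five steps above can be sketched with toy data. In this sketch the null relevance maps are simply drawn as noise, whereas in the paper each one comes from retraining the ANN on shuffled data and rerunning LRP:

```python
import numpy as np

rng = np.random.default_rng(4)
n_grid = 100                                # flattened grid points

# Toy "actual" LRP map: weak noise plus a strong signal at three grid points
actual_lrp = rng.normal(0.0, 0.1, n_grid)
signal_points = [5, 17, 42]
actual_lrp[signal_points] += 1.0

# Null distribution from 500 shuffled repeats (drawn as noise here;
# in the paper each map comes from a retrained ANN)
null_lrp = rng.normal(0.0, 0.1, (500, n_grid))

# Step 5: 95th percentile of the null distribution at every grid point
threshold = np.percentile(null_lrp, 95, axis=0)

# Mask relevance that could plausibly arise from noise
robust = np.where(actual_lrp > threshold, actual_lrp, np.nan)
robust_points = np.flatnonzero(~np.isnan(robust))
print(robust_points)   # includes 5, 17, 42 (plus occasional false positives by chance)
```

Grid points whose relevance clears the 95th percentile of the shuffled distribution survive the mask; everything else is treated as noise, which is the point of the uncertainty quantification on the next slide.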
41. Uncertainty for XAI
Ultimately, we are trying to
mask noise in the LRP output
Identify robust climate pattern indicators!
[Labe and Barnes 2021, JAMES]
45. Interpretable vs. Explainable
Explainable AI (XAI): method to explain black box after
training model – approximate model behavior
Interpretable AI: model is inherently interpretable and
provides own explanation – degree to which a model can
be understood
https://www.nature.com/articles/s42256-019-0048-x
a priori vs. a posteriori
NO CONSENSUS!
46. ETHICAL, RESPONSIBLE, TRUSTWORTHY AI
Adapted from McGovern et al. (2022, EDS) at https://doi.org/10.1017/eds.2022.5
1. Issues related to training data
• Non-representative training data, including lack of geo-diversity
• Training labels are biased or faulty
• Data is affected by adversaries
2. Issues related to AI models
• Model training choices
• Algorithm learns faulty strategies
• AI learns to fake something plausible
• AI model used in inappropriate situations
• Non-trustworthy AI model deployed
• Lack of robustness in the AI model
3. Other issues related to workforce and society
• Globally applicable AI approaches may stymie burgeoning efforts in developing countries
• Lack of input or consent on data collection and model training
• Scientists might feel disenfranchised
• Increase of carbon emissions due to computing
See also McGovern et al. (2024, AI)
57. KEY POINTS
1. An artificial neural network can identify which climate model produced an annual mean map of near-surface temperature in the Arctic
2. The classification network is evaluated using input from atmospheric reanalysis as a method of comparing climate models and observations
3. An XAI method reveals the regional temperature patterns the artificial neural network uses to classify observations with different climate models
Labe, Z.M. and E.A. Barnes (2022), Comparison of climate model large ensembles with observations in the Arctic using simple neural networks. Earth and Space Science, 9(7), e2022EA002348, https://doi.org/10.1029/2022EA002348