For many AI applications, a prediction is not enough. End-users need to understand the “why” behind a prediction to make decisions and take next steps. Explainable AI techniques today can provide some insight into what your model has learned, but recent research highlights the need for interactivity with XAI tools. End-users need to interact and test “what if” scenarios in order to understand and build trust with an AI system. In this talk, I’ll discuss what human-factors research tells us about human decision making and how users build trust (or lose trust) in systems. I’ll also present interaction design techniques that can be applied to XAI services design.
2. Meg Kurdziolek
● PhD in Human-Computer Interaction
● Staff UXR for Intrinsic.ai, an Alphabet Company
● 10+ years of industry experience, the past 5 of which have been in the Robotics & AI/ML space
3. Human Factors of Explainable AI
Presentation Outline
01 What do we need XAI for?
Discussion of why XAI is essential for the growth, adoption, and engineering of ML.
02 Human factors of AI and XAI
Different audiences will have different needs for model outputs and explanations. Also, we are all subject to human biases.
03 We’ve actually been explaining complex things for a long time
We’ll take a look at an analogy of explaining complex weather data to end-users.
04 The UX of XAI
Recommendations on how to think about and design XAI for your audience.
05 Examples of Interactive XAI
Some examples of interactive XAI and “lessons learned”.
06 Parting thoughts
A recap of what we talked about today and some resources for you if you want to learn more.
5. “The danger is in creating and
using decisions that are not
justifiable, legitimate, or that
simply do not allow obtaining
detailed explanations of their
behavior.”
(Arrieta et al., 2020)
6. ● Identifying and troubleshooting illegitimate conclusions
○ Deficiencies in the training data, as well as data “skews” or shifts, can result in illegitimate conclusions. Without knowing the “why” behind a prediction, these problems are difficult to diagnose.
● Feature engineering and data pipeline optimization
○ Removing features/data that are unnecessary for achieving the desired model performance (a minimal sketch follows below)
Explainability is important to the development, assessment, optimization, and troubleshooting of ML systems
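Not from the original slides: a minimal sketch of what that pruning step can look like, assuming a scikit-learn model and permutation importance as the explanation technique. The synthetic dataset, feature names, and the 0.01 cutoff are illustrative only.

# Minimal sketch (assumption, not from the talk): using permutation importance
# to spot features that contribute little to model performance.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=8, n_informative=3,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Shuffle each feature in turn and measure how much the held-out score drops.
result = permutation_importance(model, X_test, y_test, n_repeats=10,
                                random_state=0)
for i, importance in enumerate(result.importances_mean):
    flag = "  <- candidate for removal" if importance < 0.01 else ""
    print(f"feature_{i}: {importance:.3f}{flag}")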
7. ● Identifying bias in datasets/models
○ Models can arrive at unfair, discriminatory, or biased decisions. Without a means of understanding the underlying decision making, these issues are difficult to assess (a minimal sketch follows below).
Explainability is important to assessing fairness and addressing bias
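Not from the original slides: a minimal sketch of one simple check that can surface potential bias, assuming binary predictions and a binary sensitive attribute. The metric shown (demographic parity difference) and the arrays are illustrative placeholders.

# Minimal sketch (assumption, not from the talk): compare positive-prediction
# rates across two groups; a large gap warrants a closer look at the data and
# the features driving the decisions.
import numpy as np

y_pred = np.array([1, 0, 1, 1, 0, 1, 0, 0])   # hypothetical model predictions
group  = np.array([0, 0, 0, 0, 1, 1, 1, 1])   # hypothetical sensitive attribute

rate_a = y_pred[group == 0].mean()   # positive-prediction rate for group 0
rate_b = y_pred[group == 1].mean()   # positive-prediction rate for group 1

print(f"demographic parity difference: {abs(rate_a - rate_b):.2f}")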
8. ● Trust and adoption
○ Humans are reluctant to adopt or trust technologies they do not understand
● Utility requires understanding
○ In cases where humans use the technology to make critical decisions, they require explanations in order to effectively exercise their own judgment
Explainability is essential for end-user adoption and the ultimate utility of ML applications
13. Human Bias
● Anchoring Bias - relying too heavily on
the first piece of information we are given
about a topic. We interpret newer
information from the reference point of
our anchor, instead of seeing it objectively.
● Availability bias - tendency to believe
that examples or cases that come readily
to mind are more representative of a
population than they actually are.
“When we become anchored to a
specific figure or plan of action, we
end up filtering all new information
through the framework we initially
drew up in our head, distorting our
perception. This makes us reluctant
to make significant changes to our
plans, even if the situation calls for
it.”
- Why we tend to rely heavily upon the first
piece of information we receive
14. Human Bias
● Confirmation Bias - seeking and favoring information that supports one’s prior beliefs. Can result in unjustified trust and mistrust.
● Unjustified Trust/“Over-trust” - end-users may place a higher degree of trust in a system than they should, depending on the format in which explanations are presented.
“They found that participants
tended to place “unwarranted” faith
in numbers. For example, the AI
group participants often ascribed
more value to mathematical
representations than was justified,
while the non-AI group participants
believed the numbers signaled
intelligence — even if they couldn’t
understand the meaning.”
- Even experts are too quick to rely on
AI explanations
20. Meteorologist Interviews

dBZ    Rain Rate (in/hr)
65     16+
60     8.00
55     4.00
52     2.50
47     1.25
41     0.50
36     0.25
30     0.10
20     Trace
< 20   No rain
What does a quarter inch of rain
per hour feel like?
“That’s a solid rain. But not a
downpour. You would want an
umbrella, but you’d be okay if you
needed to make a quick dash to
your car or something.”
21. What do you think you’d
experience in a rainstorm that
looked like this?
“I think that if I was right in the
middle of it, in that orange spot
right there, I would not want to
be outside. I bet it would be
raining real heavy. Might flood
the storm drains.”
User Interviews
22. Lining up the expert and non-expert experience

dBZ    Rain Rate (in/hr)
65     16+
60     8.00
55     4.00
52     2.50
47     1.25
41     0.50
36     0.25
30     0.10
20     Trace
< 20   No rain

[Figure: the dBZ scale annotated against the meteorologist experience and the end-user experience, highlighting a big jump around ~35 dBZ and a big difference around ~55 dBZ]
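Not from the original slides: a minimal sketch of how the table above could be encoded so a UI can translate radar reflectivity into end-user language. The thresholds come from the table; the plain-language descriptions are illustrative wording, not from the study.

# Minimal sketch (assumption, not from the talk): map dBZ values to a rain
# rate and a plain-language description an end-user can act on.
DBZ_SCALE = [
    (65, "16+ in/hr",  "extreme downpour"),
    (60, "8.00 in/hr", "torrential rain"),
    (55, "4.00 in/hr", "very heavy rain"),
    (52, "2.50 in/hr", "heavy rain"),
    (47, "1.25 in/hr", "moderate-to-heavy rain"),
    (41, "0.50 in/hr", "steady rain"),
    (36, "0.25 in/hr", "solid rain, bring an umbrella"),
    (30, "0.10 in/hr", "light rain"),
    (20, "trace",      "drizzle"),
]

def describe_dbz(dbz):
    """Return a rain rate and plain-language description for a dBZ value."""
    for threshold, rate, description in DBZ_SCALE:
        if dbz >= threshold:
            return rate, description
    return "no rain", "dry"

print(describe_dbz(37))   # -> ('0.25 in/hr', 'solid rain, bring an umbrella')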
23. New radar palette is launched
[Images: old radar palette vs. new radar palette]
“Absolutely fantastic! I
abandoned WU a while back
because of the ‘dramatic
imagery’ that didn't match
reality on the ground / in the
field; and so I am very happy
that feedback was heard, that
you studied the complaint and
data, as well as communicated
with pros, observers and end
users. Time to bookmark and
load the WU apps again; and test
it out.”
- User feedback on Radar
Palette Improvements
blog post (2014)
25. “What counts as an explanation
depends on what the user needs,
what knowledge the user already
has, and especially the user's
goals.”
(Hoffman et al., 2019)
26. How can we help end-users meet their goals
and make better decisions?
Designing explanations to meet user goals
27. Designing explanations for better decision making
Designing Theory-Driven User-Centric Explainable AI (Wang et al., 2019)
28. Designing explanations for better decision making
Designing Theory-Driven User-Centric Explainable AI (Wang et al., 2019)
29. “The property of ‘being an
explanation’ is not a property of
statements, it is an interaction.”
(Hoffman et al., 2019)
30. How can we build understanding through
interaction?
Designing explanations for interaction
32. Editing Model Inputs + Example Based Explanations
Intuitively Assessing ML Model Reliability through Example-Based Explanations and Editing Model Inputs (Suresh et al., 2022)
33. “Grounding interpretability in real examples,
facilitating comparison across them, and
visualizing class distributions can help users
grasp the model’s uncertainty and connect it to
relevant challenges of the task.
Moreover, by looking at and comparing real
examples, users can discover or ask
questions about limitations of the data —
and doing so does not damage trust, but can
play an important role in building it.”
(Suresh et al., 2022)
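Not from the paper’s own code: a minimal sketch of the underlying idea of grounding a prediction in real training examples by retrieving its nearest neighbors. It assumes scikit-learn, uses the built-in digits dataset as a stand-in, and is only loosely inspired by Suresh et al. (2022).

# Minimal sketch (assumption): explain a prediction by showing the closest
# labeled training examples, so a user can compare the input against concrete
# cases and their labels.
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors

digits = load_digits()
X, y = digits.data, digits.target

model = LogisticRegression(max_iter=2000).fit(X, y)
index = NearestNeighbors(n_neighbors=3).fit(X)

query = X[[42]]                         # hypothetical input to be explained
prediction = model.predict(query)[0]
_, neighbor_ids = index.kneighbors(query)

print("model prediction:", prediction)
print("nearest training examples and their labels:",
      [(int(i), int(y[i])) for i in neighbor_ids[0]])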
42. XAI = interaction; interaction design is a cycle
Discover → Ideate → Create → Evaluate (and repeat)
43. User-centric evaluation of XAI methods
● Understandability - Does the XAI method provide
explanations in human-readable terms with sufficient
detail to be understandable to the intended end-users?
● Satisfaction - Does the XAI method provide explanations
such that users feel that they understand the AI system
and are satisfied?
● Utility - Does the XAI method provide explanations such
that end-users can make decisions and take further action
on the prediction?
● Trustworthiness - After interacting with the explanation,
do users trust the AI model prediction to an appropriate
degree?
Interaction design for XAI
Evaluate
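Not from the original slides: a minimal sketch of one way to operationalize these four criteria as post-task Likert ratings aggregated per dimension. The questions, 1–5 scale, and scores are hypothetical placeholders, not a validated instrument.

# Minimal sketch (assumption): aggregate per-dimension Likert ratings from a
# post-task questionnaire about the explanation.
from statistics import mean

responses = {
    "understandability": [4, 5, 3],  # "I understood why the model made this prediction"
    "satisfaction":      [4, 4, 2],  # "The explanation was satisfying"
    "utility":           [5, 3, 4],  # "The explanation helped me decide what to do next"
    "trustworthiness":   [3, 4, 4],  # "I trust the prediction to an appropriate degree"
}

for dimension, scores in responses.items():
    print(f"{dimension}: mean {mean(scores):.1f} / 5 (n={len(scores)})")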
44. XAI “101” article (an article I wrote)
● Explaining the Unexplainable by Meg
Kurdziolek
Resources
Intrinsic.ai (where I work)
● Intrinsic.ai
● RSVP for Keynote May 15th
Articles I recommend
● Intuitively Assessing ML Model Reliability through
Example-Based Explanations and Editing Model Inputs
by Harini Suresh, Kathleen M Lewis, John Guttag, Arvind
Satyanarayan
● Designing Theory-Driven User-Centric Explainable AI by
Danding Wang, Qian Yang, Ashraf Abdul, Brian Y. Lim
● Metrics for Explainable AI: Challenges and Prospects by
Robert R. Hoffman, Shane T. Mueller, Gary Klein, Jordan
Litman
● Building SMILY, a Human-Centric, Similar-Image Search
Tool for Pathology by Narayan Hegde and Carrie J. Cai
● The Language Interpretability Tool (LIT): Interactive
Exploration and Analysis of NLP Models by James
Wexler and Ian Tenney
Let’s Talk! (LinkedIn and Email)
● linkedin.com/in/mdickeykurdziolek/
● meg.kurdziolek@gmail.com