For many AI applications, a prediction is not enough. End-users need to understand the “why” behind a prediction to make decisions and take next steps. Explainable AI techniques today can provide some insight into what your model has learned, but recent research highlights the need for interactivity with XAI tools. End-users need to interact and test “what if” scenarios in order to understand and build trust with an AI system. In this talk, I’ll discuss what human-factors research tells us about human decision making and how users build trust (or lose trust) in systems. I’ll also present interaction design techniques that can be applied to XAI services design.
2. Meg Kurdziolek
● PhD in Human-Computer Interaction
● Staff UXR for Intrinsic.ai, an Alphabet Company
● 10+ years of industry experience, the past 5 of which have been in the Robotics & AI/ML space
3. Human Factors of Explainable AI
Presentation Outline
01 What do we need XAI for?
Discussion of why XAI is essential for the growth, adoption, and engineering of ML.
02 Human factors of AI and XAI
Different audiences will have different needs for model outputs and explanations. Also, we are all subject to human biases.
03 We’ve actually been explaining complex things for a long time
We’ll take a look at an analogy of explaining complex weather data to end-users.
04 The UX of XAI
Recommendations on how to think about and design XAI for your audience.
05 Examples of Interactive XAI
Some examples of interactive XAI and “lessons learned”.
06 Parting thoughts
A recap of what we talked about today and some resources for you if you want to learn more.
5. “The danger is in creating and
using decisions that are not
justifiable, legitimate, or that
simply do not allow obtaining
detailed explanations of their
behavior.”
(Arrieta et al., 2020)
6. ● Identifying and troubleshooting illegitimate conclusions
○ Deficiencies in the training data, as well as data “skews” or shifts, can result in illegitimate conclusions. Without knowing the “why” behind a prediction, these problems are difficult to diagnose.
● Feature engineering and data pipeline optimization
○ Removing features/data that are unnecessary for achieving the desired model performance (a minimal sketch follows below)
Explainability is important to the development, assessment, optimization, and troubleshooting of ML systems
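Not from the original slides: a minimal sketch of what that pruning step can look like, assuming a scikit-learn model and permutation importance as the explanation technique. The synthetic dataset, feature names, and the 0.01 cutoff are illustrative only.

# Minimal sketch (assumption, not from the talk): using permutation importance
# to spot features that contribute little to model performance.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=8, n_informative=3,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Shuffle each feature in turn and measure how much the held-out score drops.
result = permutation_importance(model, X_test, y_test, n_repeats=10,
                                random_state=0)
for i, importance in enumerate(result.importances_mean):
    flag = "  <- candidate for removal" if importance < 0.01 else ""
    print(f"feature_{i}: {importance:.3f}{flag}")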
7. ● Identifying bias in datasets/models
○ Models can arrive at unfair, discriminatory, or biased decisions. Without a means of understanding the underlying decision making, these issues are difficult to assess (a minimal sketch follows below).
Explainability is important to assessing fairness and addressing bias
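Not from the original slides: a minimal sketch of one simple check that can surface potential bias, assuming binary predictions and a binary sensitive attribute. The metric shown (demographic parity difference) and the arrays are illustrative placeholders.

# Minimal sketch (assumption, not from the talk): compare positive-prediction
# rates across two groups; a large gap warrants a closer look at the data and
# the features driving the decisions.
import numpy as np

y_pred = np.array([1, 0, 1, 1, 0, 1, 0, 0])   # hypothetical model predictions
group  = np.array([0, 0, 0, 0, 1, 1, 1, 1])   # hypothetical sensitive attribute

rate_a = y_pred[group == 0].mean()   # positive-prediction rate for group 0
rate_b = y_pred[group == 1].mean()   # positive-prediction rate for group 1

print(f"demographic parity difference: {abs(rate_a - rate_b):.2f}")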
8. ● Trust and adoption
○ Humans are reluctant to adopt or trust technologies they do not understand
● Utility requires understanding
○ In cases where humans use the technology to make critical decisions, they require explanations in order to effectively exercise their own judgment
Explainability is essential for end-user adoption and the ultimate utility of ML applications
13. Human Bias
● Anchoring Bias - relying too heavily on
the first piece of information we are given
about a topic. We interpret newer
information from the reference point of
our anchor, instead of seeing it objectively.
● Availability bias - tendency to believe
that examples or cases that come readily
to mind are more representative of a
population than they actually are.
“When we become anchored to a
specific figure or plan of action, we
end up filtering all new information
through the framework we initially
drew up in our head, distorting our
perception. This makes us reluctant
to make significant changes to our
plans, even if the situation calls for
it.”
- Why we tend to rely heavily upon the first
piece of information we receive
14. Human Bias
● Confirmation Bias - seeking and favoring information that supports one’s prior beliefs. Can result in unjustified trust and mistrust.
● Unjustified Trust/“Over-trust” - end-users may place a higher degree of trust in a system than they should, depending on the format in which explanations are presented.
“They found that participants
tended to place “unwarranted” faith
in numbers. For example, the AI
group participants often ascribed
more value to mathematical
representations than was justified,
while the non-AI group participants
believed the numbers signaled
intelligence — even if they couldn’t
understand the meaning.”
- Even experts are too quick to rely on
AI explanations
20. Meteorologist Interviews

dBZ    Rain Rate (in/hr)
65     16+
60     8.00
55     4.00
52     2.50
47     1.25
41     0.50
36     0.25
30     0.10
20     Trace
< 20   No rain
What does a quarter inch of rain
per hour feel like?
“That’s a solid rain. But not a
downpour. You would want an
umbrella, but you’d be okay if you
needed to make a quick dash to
your car or something.”
21. What do you think you’d
experience in a rainstorm that
looked like this?
“I think that if I was right in the
middle of it, in that orange spot
right there, I would not want to
be outside. I bet it would be
raining real heavy. Might flood
the storm drains.”
User Interviews
22. Lining up the expert and non-expert experience

dBZ    Rain Rate (in/hr)
65     16+
60     8.00
55     4.00
52     2.50
47     1.25
41     0.50
36     0.25
30     0.10
20     Trace
< 20   No rain

[Figure: the dBZ scale annotated against the meteorologist experience and the end-user experience, highlighting a big jump around ~35 dBZ and a big difference around ~55 dBZ]
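Not from the original slides: a minimal sketch of how the table above could be encoded so a UI can translate radar reflectivity into end-user language. The thresholds come from the table; the plain-language descriptions are illustrative wording, not from the study.

# Minimal sketch (assumption, not from the talk): map dBZ values to a rain
# rate and a plain-language description an end-user can act on.
DBZ_SCALE = [
    (65, "16+ in/hr",  "extreme downpour"),
    (60, "8.00 in/hr", "torrential rain"),
    (55, "4.00 in/hr", "very heavy rain"),
    (52, "2.50 in/hr", "heavy rain"),
    (47, "1.25 in/hr", "moderate-to-heavy rain"),
    (41, "0.50 in/hr", "steady rain"),
    (36, "0.25 in/hr", "solid rain, bring an umbrella"),
    (30, "0.10 in/hr", "light rain"),
    (20, "trace",      "drizzle"),
]

def describe_dbz(dbz):
    """Return a rain rate and plain-language description for a dBZ value."""
    for threshold, rate, description in DBZ_SCALE:
        if dbz >= threshold:
            return rate, description
    return "no rain", "dry"

print(describe_dbz(37))   # -> ('0.25 in/hr', 'solid rain, bring an umbrella')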
23. New radar palette is launched
[Images: old radar palette vs. new radar palette]
“Absolutely fantastic! I
abandoned WU a while back
because of the ‘dramatic
imagery’ that didn't match
reality on the ground / in the
field; and so I am very happy
that feedback was heard, that
you studied the complaint and
data, as well as communicated
with pros, observers and end
users. Time to bookmark and
load the WU apps again; and test
it out.”
- User feedback on Radar
Palette Improvements
blog post (2014)
25. “What counts as an explanation
depends on what the user needs,
what knowledge the user already
has, and especially the user's
goals.”
(Hoffman et al., 2019)
26. How can we help end-users meet their goals
and make better decisions?
Designing explanations to meet user goals
27. Designing explanations for better decision making
Designing Theory-Driven User-Centric Explainable AI (Wang et al., 2019)
28. Designing explanations for better decision making
Designing Theory-Driven User-Centric Explainable AI (Wang et al., 2019)
29. “The property of ‘being an
explanation’ is not a property of
statements, it is an interaction.”
(Hoffman et al., 2019)
30. How can we build understanding through
interaction?
Designing explanations for interaction
32. Editing Model Inputs + Example Based Explanations
Intuitively Assessing ML Model Reliability through Example-Based Explanations and Editing Model Inputs (Suresh et al., 2022)
33. “Grounding interpretability in real examples,
facilitating comparison across them, and
visualizing class distributions can help users
grasp the model’s uncertainty and connect it to
relevant challenges of the task.
Moreover, by looking at and comparing real
examples, users can discover or ask
questions about limitations of the data —
and doing so does not damage trust, but can
play an important role in building it.”
(Suresh et al., 2022)
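Not from the paper’s own code: a minimal sketch of the underlying idea of grounding a prediction in real training examples by retrieving its nearest neighbors. It assumes scikit-learn, uses the built-in digits dataset as a stand-in, and is only loosely inspired by Suresh et al. (2022).

# Minimal sketch (assumption): explain a prediction by showing the closest
# labeled training examples, so a user can compare the input against concrete
# cases and their labels.
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors

digits = load_digits()
X, y = digits.data, digits.target

model = LogisticRegression(max_iter=2000).fit(X, y)
index = NearestNeighbors(n_neighbors=3).fit(X)

query = X[[42]]                         # hypothetical input to be explained
prediction = model.predict(query)[0]
_, neighbor_ids = index.kneighbors(query)

print("model prediction:", prediction)
print("nearest training examples and their labels:",
      [(int(i), int(y[i])) for i in neighbor_ids[0]])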
42. XAI = interaction; interaction design is a cycle
Discover → Ideate → Create → Evaluate (and repeat)
43. User-centric evaluation of XAI methods
● Understandability - Does the XAI method provide
explanations in human-readable terms with sufficient
detail to be understandable to the intended end-users?
● Satisfaction - Does the XAI method provide explanations
such that users feel that they understand the AI system
and are satisfied?
● Utility - Does the XAI method provide explanations such
that end-users can make decisions and take further action
on the prediction?
● Trustworthiness - After interacting with the explanation,
do users trust the AI model prediction to an appropriate
degree?
Interaction design for XAI
Evaluate
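Not from the original slides: a minimal sketch of one way to operationalize these four criteria as post-task Likert ratings aggregated per dimension. The questions, 1–5 scale, and scores are hypothetical placeholders, not a validated instrument.

# Minimal sketch (assumption): aggregate per-dimension Likert ratings from a
# post-task questionnaire about the explanation.
from statistics import mean

responses = {
    "understandability": [4, 5, 3],  # "I understood why the model made this prediction"
    "satisfaction":      [4, 4, 2],  # "The explanation was satisfying"
    "utility":           [5, 3, 4],  # "The explanation helped me decide what to do next"
    "trustworthiness":   [3, 4, 4],  # "I trust the prediction to an appropriate degree"
}

for dimension, scores in responses.items():
    print(f"{dimension}: mean {mean(scores):.1f} / 5 (n={len(scores)})")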
44. XAI “101” article (an article I wrote)
● Explaining the Unexplainable by Meg
Kurdziolek
Resources
Intrinsic.ai (where I work)
● Intrinsic.ai
● RSVP for Keynote May 15th
Articles I recommend
● Intuitively Assessing ML Model Reliability through
Example-Based Explanations and Editing Model Inputs
by Harini Suresh, Kathleen M Lewis, John Guttag, Arvind
Satyanarayan
● Designing Theory-Driven User-Centric Explainable AI by
Danding Wang, Qian Yang, Ashraf Abdul, Brian Y. Lim
● Metrics for Explainable AI: Challenges and Prospects by
Robert R. Hoffman, Shane T. Mueller, Gary Klein, Jordan
Litman
● Building SMILY, a Human-Centric, Similar-Image Search
Tool for Pathology by Narayan Hegde and Carrie J. Cai
● The Language Interpretability Tool (LIT): Interactive
Exploration and Analysis of NLP Models by James
Wexler and Ian Tenney
Let’s Talk! (LinkedIn and Email)
● linkedin.com/in/mdickeykurdziolek/
● meg.kurdziolek@gmail.com