Interactive XAI
Meg Kurdziolek, PhD
meg.kurdziolek@gmail.com
Meg Kurdziolek
● PhD in Human-Computer Interaction
● Staff UXR for Intrinsic.ai, an Alphabet
Company
● 10+ years of industry experience; the past 5 have been in the Robotics & AI/ML space
Presentation Outline

01 What do we need XAI for?
Discussion of why XAI is essential for the growth, adoption, and engineering of ML.

02 Human factors of AI and XAI
Different audiences will have different needs for model outputs and explanations. Also, we are all subject to human biases.

03 We’ve actually been explaining complex things for a long time
We’ll take a look at an analogy of explaining complex weather data to end-users.

04 The UX of XAI
Recommendations on how to think about and design XAI for your audience.

05 Examples of Interactive XAI
Some examples of interactive XAI and “lessons learned”.

06 Parting thoughts
A recap of what we talked about today and some resources for you if you want to learn more.
What do we need XAI for?
1
“The danger is in creating and
using decisions that are not
justifiable, legitimate, or that
simply do not allow obtaining
detailed explanations of their
behavior.”
(Arrieta et al., 2020)
Explainability is important to the development, assessment, optimization, and troubleshooting of ML systems

● Identifying and troubleshooting illegitimate conclusions
○ Deficiencies in the training data, and data “skews” or shifts, can result in illegitimate conclusions. Without knowing the “why” behind a prediction, these are difficult to diagnose.
● Feature engineering and data pipeline optimization
○ Removing features/data that are unnecessary for achieving the desired model performance.
Explainability is important to assessing fairness and addressing bias

● Identifying bias in datasets/models
○ Models can arrive at unfair, discriminatory, or biased decisions. Without a means of understanding the underlying decision making, these issues are difficult to assess.
Explainability is essential for end-user adoption and the ultimate utility of ML applications

● Trust and adoption
○ Humans are reluctant to adopt or trust technologies they do not understand.
● Utility requires understanding
○ In cases where humans use the technology to make critical decisions, they require explanations in order to effectively exercise their own judgment.
Human factors of AI and XAI
2
Depending on who the audience is, the explanation
may need to account for different domain
expertise, cognitive abilities, and context of use.
Prediction + Explanation: different audiences

● Developers, Operators, and Engineers
● Data Scientists / Model Builders (expert on ML, NOT an expert on the data domain)
● Domain experts (expert on the data domain, NOT an expert on ML)
● Lay-persons / Consumers (NOT an expert on ML, NOT an expert on the data domain)
● Auditors / regulatory agencies
Human Bias
● Anchoring Bias - relying too heavily on
the first piece of information we are given
about a topic. We interpret newer
information from the reference point of
our anchor, instead of seeing it objectively.
● Availability Bias - the tendency to believe
that examples or cases that come readily
to mind are more representative of a
population than they actually are.
“When we become anchored to a
specific figure or plan of action, we
end up filtering all new information
through the framework we initially
drew up in our head, distorting our
perception. This makes us reluctant
to make significant changes to our
plans, even if the situation calls for
it.”
- Why we tend to rely heavily upon the first
piece of information we receive
Human Bias
● Confirmation Bias - the tendency to seek out and favor information that supports one’s prior beliefs. Can result in unjustified trust or mistrust.
● Unjustified Trust / “Over-trust” - end-users may place a higher degree of trust in a system than they should, depending on the format in which explanations are presented.
“They found that participants
tended to place “unwarranted” faith
in numbers. For example, the AI
group participants often ascribed
more value to mathematical
representations than was justified,
while the non-AI group participants
believed the numbers signaled
intelligence — even if they couldn’t
understand the meaning.”
- Even experts are too quick to rely on
AI explanations
We’ve actually been
explaining complex things
for a long time
3
Weather is an example of
just one of the many
complex systems we
explain and interpret
today.
We had a problem: Weather Underground’s radar imagery felt inaccurate to users.

“Stop sensationalizing storms in your maps…”
- user feedback
Different Sites, Different Storms
[Side-by-side radar imagery: Intellicast vs. AccuWeather]
NWS dBZ to Rain Rate

dBZ     Rain Rate (in/hr)
65      16+
60      8.00
55      4.00
52      2.50
47      1.25
41      0.50
36      0.25
30      0.10
20      Trace
< 20    No rain
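As an aside on where numbers like these come from: the table values line up closely with the standard Marshall-Palmer Z-R relationship commonly used for radar rainfall estimates. A minimal sketch of that conversion follows; the Z-R constants and the inches conversion are background I am adding, not part of the original slide, so treat this as illustrative rather than the source of the NWS table.

```python
def dbz_to_rain_rate_in_per_hr(dbz: float, a: float = 200.0, b: float = 1.6) -> float:
    """Convert radar reflectivity (dBZ) to a rain rate in inches/hour using
    the Marshall-Palmer Z-R relationship, Z = a * R**b (R in mm/hr)."""
    z = 10.0 ** (dbz / 10.0)               # dBZ -> reflectivity factor Z
    rain_mm_per_hr = (z / a) ** (1.0 / b)  # invert Z = a * R**b
    return rain_mm_per_hr / 25.4           # mm/hr -> in/hr

# Rough check against the table: 55 dBZ -> ~4 in/hr, 41 dBZ -> ~0.5 in/hr
for dbz in (65, 55, 41, 30):
    print(dbz, round(dbz_to_rain_rate_in_per_hr(dbz), 2))
```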
Meteorologist Interviews
What does a quarter inch of rain
per hour feel like?
“That’s a solid rain. But not a
downpour. You would want an
umbrella, but you’d be okay if you
needed to make a quick dash to
your car or something.”
User Interviews

What do you think you’d experience in a rainstorm that looked like this?

“I think that if I was right in the middle of it, in that orange spot right there, I would not want to be outside. I bet it would be raining real heavy. Might flood the storm drains.”
Lining up the expert and non-expert experience

[Figure: the NWS dBZ-to-rain-rate table shown alongside the meteorologist experience and the end-user experience, with annotations marking a big jump around ~35 dBZ and a big difference around ~55 dBZ.]
New radar palette is launched
[Old palette vs. new palette radar imagery]
“Absolutely fantastic! I
abandoned WU a while back
because of the ‘dramatic
imagery’ that didn't match
reality on the ground / in the
field; and so I am very happy
that feedback was heard, that
you studied the complaint and
data, as well as communicated
with pros, observers and end
users. Time to bookmark and
load the WU apps again; and test
it out.”
- User feedback on Radar
Palette Improvements
blog post (2014)
The UX of XAI
4
“What counts as an explanation
depends on what the user needs,
what knowledge the user already
has, and especially the user's
goals.”
(Hoffman et al., 2019)
Designing explanations to meet user goals

How can we help end-users meet their goals and make better decisions?
Designing explanations for better decision making
Designing Theory-Driven User-Centric Explainable AI (Wang et al, 2019)
“The property of ‘being an
explanation’ is not a property of
statements, it is an interaction.”
(Hoffman et al., 2019)
Designing explanations for interaction

How can we build understanding through interaction?
Examples of Interactive
XAI
5
Editing Model Inputs + Example-Based Explanations
Intuitively Assessing ML Model Reliability through Example-Based Explanations and Editing Model Inputs (Suresh et al., 2022)
“Grounding interpretability in real examples,
facilitating comparison across them, and
visualizing class distributions can help users
grasp the model’s uncertainty and connect it to
relevant challenges of the task.
Moreover, by looking at and comparing real
examples, users can discover or ask
questions about limitations of the data —
and doing so does not damage trust, but can
play an important role in building it.”
(Suresh et al., 2022)
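As a rough, hypothetical illustration of the idea (not the authors' actual implementation): retrieve a query's nearest training examples in some embedding space and show the class distribution among them. The embedding model, `train_embeddings`, and `train_labels` below are all placeholders.

```python
import numpy as np
from collections import Counter
from sklearn.neighbors import NearestNeighbors

def nearest_example_explanation(query_embedding, train_embeddings, train_labels, k=10):
    """Retrieve the k training examples most similar to the query (in an
    embedding space) and summarize the class distribution among them, so a
    user can inspect real examples and gauge the model's (un)certainty."""
    index = NearestNeighbors(n_neighbors=k, metric="cosine").fit(train_embeddings)
    distances, indices = index.kneighbors(np.asarray(query_embedding).reshape(1, -1))
    neighbor_labels = [train_labels[i] for i in indices[0]]
    return indices[0], distances[0], Counter(neighbor_labels)
```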
Interaction Example: The What-If Tool
https://pair-code.github.io/what-if-tool/
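If you want to try the What-If Tool yourself, a minimal notebook sketch looks roughly like the following; `examples` and `predict_fn` are placeholders, and the exact setup options are documented on the pair-code site linked above.

```python
# pip install witwidget   (intended for Jupyter/Colab notebooks)
from witwidget.notebook.visualization import WitConfigBuilder, WitWidget

# Placeholders: `examples` is a list of tf.Example protos, and `predict_fn`
# maps a list of examples to model output scores.
config_builder = WitConfigBuilder(examples).set_custom_predict_fn(predict_fn)
WitWidget(config_builder, height=600)  # renders the interactive tool inline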
Interaction Example: The Language Interpretability Tool (LIT)
https://ai.googleblog.com/2020/11/the-language-interpretability-tool-lit.html
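LIT can be launched from a notebook in a similar way. A minimal sketch, assuming you have already wrapped your model and dataset in LIT's Model and Dataset APIs; `my_model` and `my_dataset` are placeholders for those wrappers.

```python
# pip install lit-nlp   (run inside a notebook)
from lit_nlp import notebook

# Placeholders: `my_model` and `my_dataset` are your own wrappers implementing
# LIT's Model and Dataset APIs (lit_nlp.api.model / lit_nlp.api.dataset).
widget = notebook.LitWidget(
    models={"my_model": my_model},
    datasets={"my_data": my_dataset},
    height=800,
)
widget.render()  # embeds the LIT UI in the notebook output
```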
Interaction Example: SMILY (Similar Image Search for Histopathology)
https://ai.googleblog.com/2019/07/building-smily-human-centric-similar.html
Example: ChatGPT
Parting thoughts
6
XAI = interaction, and interaction design is a cycle:
Discover → Ideate → Create → Evaluate → (back to Discover)
User-centric evaluation of XAI methods
● Understandability - Does the XAI method provide
explanations in human-readable terms with sufficient
detail to be understandable to the intended end-users?
● Satisfaction - Does the XAI method provide explanations
such that users feel that they understand the AI system
and are satisfied?
● Utility - Does the XAI method provide explanations such
that end-users can make decisions and take further action
on the prediction?
● Trustworthiness - After interacting with the explanation,
do users trust the AI model prediction to an appropriate
degree?
[Interaction design for XAI: the Evaluate stage of the cycle]
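One lightweight way to turn the four criteria above into a study instrument is a short post-task questionnaire with one item per dimension. The items below are hypothetical examples of my own wording, not a validated scale; for validated metrics, see the Hoffman et al. paper in the resources.

```python
# Hypothetical post-task survey items (illustrative only, not a validated scale),
# keyed by the four evaluation criteria. Responses on a 1-5 Likert scale.
XAI_EVAL_ITEMS = {
    "understandability": "The explanation was written in terms I could understand.",
    "satisfaction": "After reading the explanation, I feel I understand how the system arrived at its prediction.",
    "utility": "The explanation helped me decide what to do next with the prediction.",
    "trustworthiness": "Given the explanation, I trust the system's prediction to an appropriate degree.",
}
```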
XAI “101” article (an article I wrote)
● Explaining the Unexplainable by Meg
Kurdziolek
Resources
Intrinsic.ai (where I work)
● Intrinsic.ai
● RSVP for Keynote May 15th
Articles I recommend
● Intuitively Assessing ML Model Reliability through
Example-Based Explanations and Editing Model Inputs
by Harini Suresh, Kathleen M Lewis, John Guttag, Arvind
Satyanarayan
● Designing Theory-Driven User-Centric Explainable AI by
Danding Wang, Qian Yang, Ashraf Abdul, Brian Y. Lim
● Metrics for Explainable AI: Challenges and Prospects by
Robert R. Hoffman, Shane T. Mueller, Gary Klein, Jordan
Litman
● Building SMILY, a Human-Centric, Similar-Image Search
Tool for Pathology by Narayan Hegde and Carrie J. Cai
● The Language Interpretability Tool (LIT): Interactive
Exploration and Analysis of NLP Models by James
Wexler and Ian Tenney
Let’s Talk! (LinkedIn and Email)
● linkedin.com/in/mdickeykurdziolek/
● meg.kurdziolek@gmail.com
https://rsvp.withgoogle.com/events/intrinsic-product-keynote
Any Questions?
Thank you!

Interactive XAI for ODSC East 2023
