Explainability in AI and RecSys:
let’s make it interactive!
Martijn Willemsen
Why do we need explainability?
• Model validation: avoid biases, unfairness or overfitting, detect
issues in the training data, adhere to ethical/legal requirements
• Model debugging and improvement: improving the model fit,
adversarial learning (fooling a model with ‘hacked’ inputs), reliability
& robustness (sensitivity to small input changes)
• Knowledge discovery: explanations provide feedback to the Data
Scientist or user that can result in new insights by revealing hidden
underlying correlations/patterns.
• Trust and technology acceptance: explanations might convince
users to adopt the technology and give them more control
Poll: What is a good explanation?
A: complete and accurate evidence for the decision
B: gives a single good reason for this decision
C: tells me what I need to change to get a different decision
What is important for explainability in ML?
• Accuracy: does the explanation predict unseen data? Is it as
accurate as the model itself?
• Fidelity: does the explanation approximate the prediction of the
model? Especially important for black-box models (local
fidelity).
• Consistency: same explanations for different models?
• Stability: similar explanations for similar instances?
• Comprehensibility: do humans get it? (see previous slide)
Some of these are hard to achieve with some models…
https://christophm.github.io/interpretable-ml-book/properties.html
What is a good explanation (for humans)?
Confalonieri et al. (2020) & Molnar (2020) based on Miller:
• Contrastive: why was this prediction made instead of
another?
• Selective: focus on a few important causes (not all
features that contributed to the model).
• Social: should fit the mental model of the explainee /
target audience, consider the social context, and fit
their prior beliefs.
• Abnormalness: humans like rare causes (related to
counterfactuals)
• (Truthfulness: less important for humans than
selectiveness!)
https://christophm.github.io/interpretable-ml-book/explanation.html
Machine learning / AI interpretability
Some methods are inherently interpretable (glass-box or white-box models)
• Regression, decision trees, GAMs
• Some RecSys algorithms (content-based or classical CF)
Many others are not: black-box models
• Neural networks (CNN/RNN), random forests, matrix factorization, etc.
• These often require post-hoc explanations (which leave the model intact)
Further distinction can be made between:
• Model-specific methods (the explanation is specific to the ML technique)
• Model-agnostic methods (the explanation treats the ML model as a black box,
using only its inputs/outputs); a small sketch of the contrast follows below
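To make the glass-box/black-box contrast concrete, here is a minimal sketch (my own illustration, assuming scikit-learn and a standard toy dataset, not code from the talk): a logistic regression explains itself through its coefficients, while a random forest on the same data offers no such direct reading.

```python
# Minimal sketch of the glass-box vs. black-box distinction
# (illustrative; assumes scikit-learn and a toy dataset).
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

X, y = load_breast_cancer(return_X_y=True, as_frame=True)

# Glass-box: the model is its own explanation; each coefficient gives
# the direction and strength of a feature's contribution.
glass_box = LogisticRegression(max_iter=5000).fit(X, y)
top5 = sorted(zip(X.columns, glass_box.coef_[0]), key=lambda t: -abs(t[1]))[:5]
for name, coef in top5:
    print(f"{name}: {coef:+.3f}")

# Black-box: no such direct reading exists; understanding this model
# requires post-hoc explanations (permutation importance, LIME, SHAP, ...).
black_box = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
```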
Explanations can be global, component-based, or local.
[Figure: a GAM component as a global explanation, a SHAP dependence plot as a component-level explanation, and SHAP local explanations]
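As a hedged illustration of the component-based and local varieties (my own sketch, assuming the `shap` package and a scikit-learn regressor; not the slide's exact setup):

```python
# Sketch of component-based and local explanations with SHAP
# (illustrative; assumes `pip install shap scikit-learn`).
import shap
from sklearn.datasets import fetch_california_housing
from sklearn.ensemble import RandomForestRegressor

X, y = fetch_california_housing(return_X_y=True, as_frame=True)
model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X.iloc[:200])  # subsample for speed

# Component-based: how one feature's contribution varies across the data.
shap.dependence_plot("MedInc", shap_values, X.iloc[:200])

# Local: per-feature contributions to a single prediction.
shap.force_plot(explainer.expected_value, shap_values[0], X.iloc[0],
                matplotlib=True)
```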
Interpreting Interpretability: Understanding Data Scientists'
Use of Interpretability Tools for Machine Learning
Kaur et al. CHI 2020
Data scientists also do not get these visualizations!
Global explanations (how does it work in general?)
How does the model behave on average for the dataset? An overall
approximation of the (black-box) ML model.
• Feature importance ranks: permute/remove features and
see how the model output changes, to find each feature's importance
• Feature effects: the effect of a specific feature on the model's
outcome, via Partial Dependence Plots (marginal effects) or
Accumulated Local Effects plots (conditional effects); both are sketched below
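Both families are available off the shelf. A minimal sketch using scikit-learn's built-in tools (my own illustration; note that ALE plots are not in scikit-learn and need a separate package such as PyALE):

```python
# Sketch of the two global-explanation families named above
# (illustrative; assumes scikit-learn >= 1.0 and matplotlib).
import matplotlib.pyplot as plt
from sklearn.datasets import fetch_california_housing
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.inspection import PartialDependenceDisplay, permutation_importance

X, y = fetch_california_housing(return_X_y=True, as_frame=True)
model = GradientBoostingRegressor(random_state=0).fit(X, y)

# Feature importance ranks: permute each feature and measure how much
# the model's score degrades.
result = permutation_importance(model, X, y, n_repeats=10, random_state=0)
for i in result.importances_mean.argsort()[::-1]:
    print(f"{X.columns[i]}: {result.importances_mean[i]:.3f}")

# Feature effects: partial dependence (marginal effect) of one feature.
PartialDependenceDisplay.from_estimator(model, X, features=["MedInc"])
plt.show()
```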
Local explanations: why do I get this prediction?
LIME (Local Interpretable Model-agnostic Explanations), an
algorithm that can explain the predictions of any classifier or
regressor in a faithful way, by approximating it locally with an
interpretable (surrogate) model.
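In practice, LIME is a few lines of code. A hedged sketch on tabular data (assuming the `lime` package and an arbitrary scikit-learn classifier; the dataset is illustrative):

```python
# Sketch of LIME on tabular data (illustrative; assumes `pip install lime`).
from lime.lime_tabular import LimeTabularExplainer
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

data = load_breast_cancer()
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(data.data, data.target)

explainer = LimeTabularExplainer(
    data.data,
    feature_names=list(data.feature_names),
    class_names=list(data.target_names),
    mode="classification",
)

# Fit an interpretable surrogate around one instance and report the few
# features that matter locally (selective and locally faithful).
exp = explainer.explain_instance(data.data[0], model.predict_proba,
                                 num_features=5)
print(exp.as_list())
```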
Local explanations that are model-agnostic…
By “explaining a prediction", we mean presenting textual or
visual artifacts that provide qualitative understanding of
the relationship between the instance's components (e.g.
words in text, patches in an image) and the model's
prediction.
Criteria:
Interpretable: provide qualitative understanding between the
input variables and the response.
Local fidelity: for an explanation to be meaningful, it must at
least be locally faithful.
Model-agnostic: an explainer should be able to explain any
model.
LIME output: which algorithm works better?
Two algorithms with similar accuracy predicting whether the text
below is about Christianity or atheism.
Poll: Which model should you trust more, 1 or 2?
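The slide's setup is the 20-newsgroups Christianity/atheism task from the LIME paper. A hedged sketch of how such a text explanation is produced (the pipeline below is my own stand-in, not the two classifiers shown on the slide):

```python
# Sketch of a LIME text explanation (illustrative pipeline;
# assumes `pip install lime scikit-learn`).
from lime.lime_text import LimeTextExplainer
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

cats = ["alt.atheism", "soc.religion.christian"]
train = fetch_20newsgroups(subset="train", categories=cats)

pipe = make_pipeline(TfidfVectorizer(), MultinomialNB())
pipe.fit(train.data, train.target)

explainer = LimeTextExplainer(class_names=["atheism", "christian"])
exp = explainer.explain_instance(train.data[0], pipe.predict_proba,
                                 num_features=6)
print(exp.as_list())  # words pushing toward either class
```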
Works very well, but…
Sentiment of the sentence “This is not bad”
LIME can show that the sentiment is
detected correctly because of the conjunction
of “not” and “bad”
Same results for two very different models
But do you notice a difference?
Valence of the decision class: which is more
understandable?
Logistic regression on unigrams
LSTM on sentence embeddings
Ribeiro et al. 2016, Model-Agnostic Interpretability of Machine Learning, arXiv:1606.05386v1
Improving understandability of feature contributions in
model-agnostic explainable AI tools (CHI 2022)
Sophia Hadash, Martijn Willemsen, Chris Snijders, and Wijnand IJsselsteijn
Jheronimus Academy of Data Science
Human-Technology Interaction, TU/e
Visualizations of LIME (and SHAP) can be counterintuitive!
Prediction class: bad (ineligible for loan) (Data: credit-g)
Cognitively challenging due to (double) negations!
Proposed improvements
1) Frame feature contributions toward the decision class that the
reader perceives positively.
2) Add semantic labels to the feature contributions. Both fixes are sketched below.
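A toy illustration of both fixes (not the authors' code): contributions in LIME's `as_list()` format are flipped toward the positively perceived class and annotated with a semantic label.

```python
# Toy illustration of the two proposed improvements (not the study's code).
# Raw LIME-style output for prediction class "bad" (ineligible): positive
# weights push toward "bad", which forces double negations on the reader.
raw = [("checking_status < 0", +0.11),
       ("duration > 24", +0.07),
       ("credit_history = all paid", -0.05)]

def reframe(contribs, positive_label="eligibility"):
    """Express every contribution toward the class the reader perceives
    positively, and attach a semantic label to each weight."""
    for feature, weight in contribs:
        w = -weight  # flip: toward "eligible" instead of toward "ineligible"
        sign = "+" if w >= 0 else "-"
        print(f"{feature}: {sign}{abs(w):.0%} {positive_label}")

reframe(raw)
# checking_status < 0: -11% eligibility
# duration > 24: -7% eligibility
# credit_history = all paid: +5% eligibility
```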
Empirical user study
⚫ 133 participants (61 male), university database + convenience sampling
Factors:
⚫ Loan applications and music recommendations (within-subjects)
⚫ Framing: positive or negative (within-subjects)
⚫ Semantic labelling: no labels, “eligibility/like”, or “ineligibility/dislike”
(between-subjects, to prevent carry-over learning effects)
Measurement: perceived understandability on a 4-pt Likert scale.
⚫ 6 trials per within-subjects condition, 24 per participant
Results
Positive framing leads to higher understandability, even when
the prediction/decision class is negative.
Results
Negatively framed semantic labels do not improve understandability.
⚫ (e.g. “+5% ineligibility”)
⚫ Not even when compatible with the negative decision class…
Results
Positively framed semantic labels improve understandability.
⚫ (e.g. “+5% eligibility”)
Framing is no longer relevant for understandability.
Takeaway: do not forget the psychology!
Positive framing always works better than negative
framing (even for negative decision classes).
• Requires that decision classes are inherently “positive” or “negative”
Use of semantic labelling can improve the understandability
of the visualizations of interpretability tools.
• Reduces framing effects!
Drawbacks of post-hoc explanations
These tools still provide only a retrospective explanation of the
outcome…
• Static, lacking contrastive, counterfactual insights…
Ben Shneiderman promoted prospective user interfaces
• Interactive tools that show you what aspects influence and
change the outcome of an AI
How would that work? It has already been done for decades!
How do we make explanations contrastive,
and selective?
How do we make sure they fit our mental
models and beliefs?
Let’s make them interactive!
Interactive ML is not new…
Dudley (2018) and Amershi (2019) show that two
decades of research have already looked at these
issues in communities like IUI and CHI…
Example: Crayons, 2003
Fails & Olsen, Crayons, IUI (2003)
Traditional ML
Amershi et al. 2014: Power to the People
• ML practitioners work with domain experts on
feature selection / data representation
• Use ML, build predictions, go
back to the expert for validation
• Long and slow cycle, big steps
• Exploration is mostly on the side
of the ML / data scientist
Interactive ML
• User directly interacts with the
model
• Incremental but fast updates,
small steps, low-cost trial & error
• Smaller cycles give a better
understanding of what happens
• Can be done by low-expertise
users
• Examples: recommender
systems and tools like Crayons
Amershi et al. 2014: Power to the People
Interface elements of an IML (Dudley 2018, sec. 4)
‘These elements represent distinct
functionality that the interface
must typically support to deliver a
comprehensive IML workflow’
Not necessarily physically distinct:
e.g., Crayons merges sample
review and feedback assignment
IML workflow (Dudley 2018, sec. 5)
Key Solution Principles according to Dudley (2018)
Exploit interactivity and promote rich interactions
• Interaction for understanding: many UX principles are hard to achieve
in IML (e.g., direct manipulation principles)
• Make the most of the user: balance the effort and value of input, avoid
repeated requests, let users retrace steps and undo
Engage the user
• Provide feedback, show partial predictions, do not ask for trivial labeling
tasks
• This might encourage users to spend more time and improve the modeling
Guidelines for Human-AI interaction
Amershi et al., CHI 2019
18 guidelines
• UX design process
• Brings knowledge from many related fields together
• Goes back to earlier classical work: strongly founded in
the mixed-initiative work of Horvitz (IUI 1999)
Two example applications of interactive AI
/ RecSys from my lab that I consider to be
prospective user interfaces
Preparing for a marathon
Target finish times: not too fast, not too slow
Pacing (min/km) strategy: a constant ‘flat’ speed is associated
with the best performance
Work by Heleen Rutjes
Prediction model for setting a
challenging, yet realistic finish time.
Model predictions are based on similar runners:
If runner *sunglasses* has had similar past performances to runner *hat*, yet has a
better Personal Best (PB), then runner *hat* can potentially achieve that too.
Approach: ‘case-based reasoning’ (CBR)
We asked coaches what aspects
they would like to control:
- Select similar runners?
- Select the best races to serve as a case?
Research by Barry Smyth: http://medium.com/running-with-data/
Making the model interactive
Running coaches could indicate for every previous race how ‘representative’
they considered it. By setting the slider, the model prediction was
continuously updated; a minimal sketch of the idea follows.
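A minimal sketch of the interaction loop (my own illustration under assumed details; the study's actual CBR model is more elaborate): each past race carries a coach-set representativeness weight, and moving a slider simply re-runs the prediction.

```python
# Illustrative sketch of slider-driven reweighting (not the study's model).
from dataclasses import dataclass

@dataclass
class Race:
    finish_time_min: float     # past finish time in minutes
    representativeness: float  # coach-set slider value in [0, 1]

def predict_finish_time(races, improvement=0.97):
    """Representativeness-weighted average of past performances, scaled
    by an assumed improvement factor derived from similar runners."""
    total = sum(r.representativeness for r in races)
    base = sum(r.finish_time_min * r.representativeness for r in races) / total
    return base * improvement

# The coach de-weights one race; every slider change re-runs the
# prediction, so the displayed finish time updates continuously.
races = [Race(238, 1.0), Race(252, 0.2), Race(231, 0.8)]
print(f"predicted finish: {predict_finish_time(races):.0f} min")
```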
Model interactivity increased trust and acceptance
Acceptance: coaches were more inclined to accept a model that they could interact with.
Trust: model interactivity increased coaches’ perceived competence of the model.
“Without my adjustments the model did not make
sense, but by eliminating the race from Eindhoven,
we’re getting somewhere.”
(Coach 53, familiar runner, interactive condition)
Coaches improved the accuracy of the model
Model accuracy improved through coaches’ interactions
(mean percent error dropped from 3.14 to 2.33, p = .018)
What did the coaches adjust?
Systematic adjustments: more recent races were indicated as more
representative (p < .001)
‘Anecdotal’ adjustments: based on knowledge of the specific runner,
running in general, environmental circumstances, etc.
Even when working with unfamiliar runners:
“There is clearly something going on with this lady. Maybe she
stopped training, or she has a persistent injury?”
(Coach 45, unfamiliar runner, non-interactive condition)
Music Genre Exploration with
mood control and visualizations
Work by Yu Liang (IUI 2021)
How can we better support users in exploring a new music genre?
[Millecamp, M., Htun, N. N., Jin, Y., & Verbert, K. 2018]
[Bostandjiev, S., O’Donovan, J., & Höllerer, T. 2012]
[Andjelkovic, I., et al. 2019], [He, C., Parra, D., & Verbert, K. 2016]
Simple bar plot visualization to explain recommendations [Millecamp et al. 2019]
Bar charts: easy to understand, but not very informative; they
present only averaged preferences.
More complex contour plot visualization
Contour plots: 1) show the relation between the recommendations,
users’ current preferences, and the new genre; 2) show the intensity
of users’ preferences. A bit harder to understand.
Mood control sliders let users steer the recommendations.
Contour plot + mood control (most helpful?)
Easily see how the recommendations change.
Research questions
RQ1: How do different types of visualizations (bar charts/contour
plots) influence the perceived helpfulness for new music genre
exploration?
RQ2: How does mood control improve the perceived helpfulness
for new music genre exploration?
Study design
2×2 mixed factorial design:
Mood control: between-subjects
Visualization: within-subjects
Measurements
• Subjective measures: post-task questionnaires
Perceived helpfulness, perceived control, perceived
informativeness and understandability
• Objective measures: user interactions with the
system
• Musical sophistication (active engagement &
emotional engagement)
• Participants: mainly university students
• 102 valid responses
[Figure: genre selection frequencies; the genres participants wanted to explore]
Which is more helpful?
Contour plot (vs. bar charts):
• More helpful
• Total effect: β = .378, se = .082, p < .001
Control (vs. no control):
• Seems to be more helpful
• Total effect: β = .238, se = .123, p = .053 (marginally significant)
Contour + control:
• More helpful
• Total effect: β = .242, se = .123, p = .049
What we have found…
Good visualization is key for understandability and explainability
The contour plot is perceived as more helpful than the bar chart
• More informative, thus more understandable & helpful
• Better mental model?
Interaction only helps with a good mental model/understanding
Mood control by itself does not make the system more helpful
• Paired with the contour plot, it benefits perceived
helpfulness, mostly through increased informativeness
Further work on genre exploration
RecSys 2021: the role of default settings in genre
selection and exploration:
• A trade-off slider, from genre-representative to more
personalized songs
• Defaults had a strong effect on how far users explored…
RecSys 2022 (just accepted): a longitudinal study in which
participants used the same tool for 4 weeks
• Default effects fade over the weeks
• Users find the tool helpful and keep exploring after 4 weeks
• Some actual change in music profiles after 6 weeks!
Conclusions
Two separate worlds:
• Interactive machine learning: interpretability for data scientists
• Human-AI interaction work focused on the user at CHI, UMAP,
IUI (and RecSys)
We should learn from each other and bring these worlds closer together!
Human-AI interaction requires a solid understanding of mental models,
cognitive processes and biases, visualization guidelines, and user
experience research!
Questions?
M.C.Willemsen@tue.nl
@MCWillemsen