This document summarizes the results of a user experiment that tested how model interpretability affects people's ability to understand a machine learning model and detect its mistakes. Participants were tasked with predicting apartment prices using models of varying transparency. The study found that more transparent models hindered people's ability to correct the model's mistakes, an effect that was reduced by including attention checks. The researchers suggest that model transparency can anchor users' judgments and impair oversight of unusual cases.
Manipulating and measuring model interpretability
1. Manipulating and Measuring Model Interpretability
Forough Poursabzi-Sangdeh, Dan Goldstein, Jake Hofman, Jenn Wortman Vaughan, Hanna Wallach
Microsoft Research NYC
6. DIFFERENT SCENARIOS, DIFFERENT PEOPLE, DIFFERENT NEEDS
[Diagram: different needs (explain a prediction, understand the model, make better decisions, debug the model, de-bias the model, inspire trust) and different people (CEOs → Approach A, data scientists → Approach C, regulators → Approach B, laypeople) call for different approaches.]
9.–12. INTERPRETABILITY AS A LATENT PROPERTY
[Diagram, built up over slides 9–12: interpretability is treated as a latent property. Properties of the model and system design (number of features, linearity, black-box vs. clear, visualizations, types of features, ...) feed into it, and it is measured through properties of human behavior (trust, ability to debug, ability to simulate, ability to explain, ability to detect mistakes). We need interdisciplinary approaches.]
13. FOCUS ON LAYPEOPLE
[Same diagram; the approach taken here: randomized human-subject experiments with laypeople.]
14. USER EXPERIMENT, PREDICTIVE TASK
• Predict the price of apartments in NYC with the help of a model
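For intuition, here is a minimal sketch of the kind of transparent, few-feature linear pricing model a participant in a CLEAR-2-style condition might be asked to reason about. The feature names and coefficients below are illustrative assumptions, not the model or weights used in the study.

```python
# Illustrative sketch of a transparent, two-feature linear pricing model
# (a CLEAR-2-style condition). Feature names and coefficients are made up
# for illustration; the study's actual model and weights are not shown here.

def predict_price(bathrooms: float, square_feet: float) -> float:
    """Return a predicted apartment price in dollars from two listing features."""
    intercept = 100_000.0       # hypothetical base price
    per_bathroom = 75_000.0     # hypothetical dollars per bathroom
    per_square_foot = 500.0     # hypothetical dollars per square foot
    return intercept + per_bathroom * bathrooms + per_square_foot * square_feet

# Example: a 1-bath, 700 sq ft listing
print(predict_price(bathrooms=1, square_feet=700))  # 525000.0
```

In the clear conditions participants could see a breakdown like this; in the black-box conditions they saw only the final price.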
23. USER INTERFACE AND INTERACTIONS
• Training phase: participants get familiar with the model
• Testing phase step 1: simulate the model's prediction
[Screenshot: interface prompt "Simulate the model".]
24. USER INTERFACE AND INTERACTIONS
• Testing phase step 2: observe the model's prediction and guess the price
[Screenshot: interface prompt "Predict actual selling price".]
25. PRE-REGISTERED HYPOTHESES
• The clear two-feature model (CLEAR-2) will be easiest for participants to simulate
• Participants will trust the clear two-feature model (CLEAR-2) more than the black-box eight-feature model (BB-8)
• Participants' behavior will vary when they see unusual examples where the model makes inaccurate predictions
https://aspredicted.org/xy5s6.pdf
26.–29. SIMULATION ERROR
Hypothesis: the clear two-feature model (CLEAR-2) will be easiest for participants to simulate.
[Bar chart, built up over slides 26–29: mean simulation error ($0k–$200k) by condition (CLEAR−2, CLEAR−8, BB−2, BB−8).]
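Simulation error here can be read as the gap between what a participant guesses the model will output and what the model actually outputs, averaged over the test apartments. A minimal sketch of such a measure, with variable names and the averaging choice chosen here for illustration rather than taken from the slides:

```python
# Sketch: simulation error as the absolute gap between a participant's guess
# of the model's output and the model's actual output, averaged over test
# apartments. Variable names and the averaging choice are assumptions made
# here for illustration.

def mean_simulation_error(participant_guesses, model_predictions):
    pairs = list(zip(participant_guesses, model_predictions))
    return sum(abs(guess - prediction) for guess, prediction in pairs) / len(pairs)

# Example: guesses vs. model outputs for three test apartments (in dollars)
print(mean_simulation_error([400_000, 550_000, 300_000],
                            [420_000, 500_000, 390_000]))  # ≈ 53333.33
```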
30.–33. TRUST (DEVIATION FROM THE MODEL)
Hypothesis: participants will trust the clear two-feature model (CLEAR-2) more than the black-box eight-feature model (BB-8).
[Bar chart, built up over slides 30–33: mean deviation from the model ($0k–$150k) by condition (CLEAR−2, CLEAR−8, BB−2, BB−8).]
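Deviation from the model is used as an inverse proxy for trust: the smaller the gap between a participant's own final price and the model's prediction, the more the participant is taken to rely on the model. A sketch of how such a per-condition mean might be computed; the record layout and example numbers are assumptions for illustration only:

```python
# Sketch: mean deviation from the model per condition, read as an inverse
# proxy for trust (smaller deviation = heavier reliance on the model's
# prediction). The record layout and example numbers are assumptions.
from collections import defaultdict

def mean_deviation_by_condition(records):
    """records: iterable of (condition, participant_price, model_price)."""
    totals, counts = defaultdict(float), defaultdict(int)
    for condition, participant_price, model_price in records:
        totals[condition] += abs(participant_price - model_price)
        counts[condition] += 1
    return {condition: totals[condition] / counts[condition] for condition in totals}

print(mean_deviation_by_condition([
    ("CLEAR-2", 500_000, 480_000),   # deviates from the model by $20k
    ("CLEAR-2", 450_000, 480_000),   # deviates from the model by $30k
    ("BB-8",    700_000, 600_000),   # deviates from the model by $100k
]))  # {'CLEAR-2': 25000.0, 'BB-8': 100000.0}
```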
37.–38. DETECTION OF MISTAKES
Hypothesis: participants' behavior will vary when they see unusual examples where the model makes inaccurate predictions.
Apartment 12: 1 bed, 3 bath.
[Bar chart: mean deviation from the model for apartment 12 ($0k–$300k) by condition (CLEAR−2, CLEAR−8, BB−2, BB−8).]
When participants see unusual examples, they are less likely to correct inaccurate predictions made by clear models than by black-box models.
45. USER INTERFACE AND INTERACTIONS
• We remove potential anchors
46. PRE-REGISTERED HYPOTHESES
• Explicit attention checks on unusual inputs will affect participants' ability to detect the model's mistakes
• Model transparency affects participants' ability to detect the model's mistakes, both with and without attention checks
https://aspredicted.org/5xy8y.pdf
47.–55. DETECTION OF MISTAKES
[Bar chart, built up over slides 47–55: mean participant prediction ($0M–$1.5M) for Apartment 6 (1 bed, 3 bath, 726 sq ft) and Apartment 8 (1 bed, 3 bath, 350 sq ft), with and without attention checks, shown alongside the model's prediction, for the CLEAR and BB conditions.]
• No attention checks: clear models lower users' ability to correct the model's mistakes
• Attention checks improve users' ability to correct the model's mistakes
• With attention checks, there is no difference between the clear and black-box models
56. SUMMARY OF RESULTS
• A clear model with a small number of features is easier for participants to simulate
- People have a better understanding of simple, transparent models
• No significant difference in participants' trust in the model
- Contrary to intuition, people do not necessarily trust simple, transparent models more
• Participants were less able to correct inaccurate predictions of a clear model than of a black-box model
- Too much transparency can be harmful
- Design implications (e.g., highlighting unusual inputs, displaying model internals on demand); see the sketch below
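As a concrete reading of the "highlighting unusual inputs" implication, here is a minimal sketch of flagging an atypical listing before showing the model's prediction, in the spirit of the attention checks above. The rule below is a toy heuristic chosen for illustration, not the check used in the study.

```python
# Sketch of highlighting unusual inputs before showing the model's prediction,
# in the spirit of the attention checks above. The rule below is a toy
# heuristic chosen for illustration, not the check used in the study.

def flag_unusual_listing(bedrooms: int, bathrooms: int) -> bool:
    """Return True when a listing looks atypical and deserves extra scrutiny."""
    return bathrooms > bedrooms

# Apartment 12 from the slides: 1 bed, 3 bath -> flagged as unusual
print(flag_unusual_listing(bedrooms=1, bathrooms=3))  # True
```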
57. TAKEAWAYS
• Interpretability is not a purely computational problem
- We need interdisciplinary research to understand interpretability
• Our surprising results underscore that interpretability research is more complicated than it may seem
- We need more empirical studies
- Other scenarios, domains, models, factors, outcomes
58. Thanks!
https://csel.cs.colorado.edu/~fopo5620/
forough.poursabzi@microsoft.com