The document discusses explainability and bias in machine learning/AI models. It covers several topics:
1. Why model explainability matters, including for laypeople who use models and for potential legal requirements to explain decisions.
2. Methods for explainability, including directly interpretable models and post-hoc explainability methods such as LIME and SHAP, which provide feature attributions.
3. Issues of bias in machine learning models and different definitions of fairness, along with techniques for measuring and mitigating bias, such as reweighting the data (see the sketch below) or using adversarial learning.
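To make the reweighting idea concrete, here is a minimal sketch (not taken from the document) of reweighing in the spirit of Kamiran & Calders: each training row gets a weight so that the label and a sensitive attribute become statistically independent in the weighted data. The column names (`approved`, `gender`) are purely illustrative assumptions.

```python
import pandas as pd

def reweighing_weights(df: pd.DataFrame, label: str, group: str) -> pd.Series:
    """Per-row weights that make `label` and `group` independent when applied."""
    n = len(df)
    weights = pd.Series(1.0, index=df.index)
    for g in df[group].unique():
        for y in df[label].unique():
            mask = (df[group] == g) & (df[label] == y)
            observed = mask.sum() / n                                     # P(group=g, label=y)
            expected = (df[group] == g).mean() * (df[label] == y).mean()  # P(g) * P(y)
            if observed > 0:
                weights[mask] = expected / observed
    return weights

# Illustrative usage: pass the weights to any estimator that accepts them, e.g.
# w = reweighing_weights(df, label="approved", group="gender")
# model.fit(X, y, sample_weight=w)
```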
[Video recording available at https://www.youtube.com/playlist?list=PLewjn-vrZ7d3x0M4Uu_57oaJPRXkiS221]
Artificial Intelligence is increasingly playing an integral role in determining our day-to-day experiences. Moreover, with the proliferation of AI-based solutions in areas such as hiring, lending, criminal justice, healthcare, and education, the resulting personal and professional implications of AI are far-reaching. The dominant role played by AI models in these domains has led to a growing concern regarding potential bias in these models, and a demand for model transparency and interpretability. In addition, model explainability is a prerequisite for building trust and adoption of AI systems in high-stakes domains requiring reliability and safety such as healthcare and automated transportation, and critical industrial applications with significant economic implications such as predictive maintenance, exploration of natural resources, and climate change modeling.
As a consequence, AI researchers and practitioners have focused their attention on explainable AI to help them better trust and understand models at scale. The challenges for the research community include (i) defining model explainability, (ii) formulating explainability tasks for understanding model behavior and developing solutions for these tasks, and finally (iii) designing measures for evaluating the performance of models in explainability tasks.
In this tutorial, we present an overview of model interpretability and explainability in AI, key regulations / laws, and techniques / tools for providing explainability as part of AI/ML systems. Then, we focus on the application of explainability techniques in industry, wherein we present practical challenges / guidelines for effectively using explainability techniques and lessons learned from deploying explainable models for several web-scale machine learning and data mining applications. We present case studies across different companies, spanning application domains such as search & recommendation systems, hiring, sales, and lending. Finally, based on our experiences in industry, we identify open problems and research directions for the data mining / machine learning community.
An introductory presentation on Explainable AI, motivating its importance. We briefly describe the main techniques available as of March 2020 and share many references so the reader can continue their studies.
Explainable AI (XAI) is becoming a must-have non-functional requirement (NFR) for most AI-enabled product or solution deployments. Keen to hear viewpoints and explore collaboration opportunities.
An Introduction to XAI! Towards Trusting Your ML Models! (Mansour Saffar)
Machine learning (ML) is currently disrupting almost every industry and is being used as the core component in many systems. The decisions made by these systems may have a great impact on society and specific individuals and thus the decision-making process has to be clear and explainable so humans can trust it. Explainable AI (XAI) is a rather new field in ML in which researchers try to develop models that are able to explain the decision-making process behind ML models. In this talk, we'll learn about the fundamentals of XAI and discuss why we need to start to integrate XAI with our ML models!
Presented in Edmonton DataScience Meetup on October 2nd, 2019. Learn more: https://youtu.be/gEkPXOsDt_w
Artificial Intelligence is increasingly playing an integral role in determining our day-to-day experiences. Moreover, with the proliferation of AI-based solutions in areas such as hiring, lending, criminal justice, healthcare, and education, the resulting personal and professional implications of AI are far-reaching. The dominant role played by AI models in these domains has led to a growing concern regarding potential bias in these models, and a demand for model transparency and interpretability. In addition, model explainability is a prerequisite for building trust and adoption of AI systems in high-stakes domains requiring reliability and safety such as healthcare and automated transportation, and critical industrial applications with significant economic implications such as predictive maintenance, exploration of natural resources, and climate change modeling.
As a consequence, AI researchers and practitioners have focused their attention on explainable AI to help them better trust and understand models at scale. The challenges for the research community include (i) defining model explainability, (ii) formulating explainability tasks for understanding model behavior and developing solutions for these tasks, and finally (iii) designing measures for evaluating the performance of models in explainability tasks.
In this tutorial, we present an overview of model interpretability and explainability in AI, key regulations / laws, and techniques / tools for providing explainability as part of AI/ML systems. Then, we focus on the application of explainability techniques in industry, wherein we present practical challenges / guidelines for effectively using explainability techniques and lessons learned from deploying explainable models for several web-scale machine learning and data mining applications. We present case studies across different companies, spanning application domains such as search & recommendation systems, sales, lending, and fraud detection. Finally, based on our experiences in industry, we identify open problems and research directions for the data mining / machine learning community.
Spark 2019: Equifax's SVP Data & Analytics, Peter Maynard, discusses the notion (and importance) of explainable AI in the financial services sector. He looks at the work Equifax have done to crack open the black box by creating patented AI technology that helps companies make smarter, explainable decisions using AI.
Slides for the Arithmer Seminar given by Dr. Daisuke Sato (Arithmer) at Arithmer Inc.
The topic is "explainable AI".
The "Arithmer Seminar" is held weekly; professionals from within and outside our company give lectures on their respective areas of expertise.
These slides were made by a lecturer from outside our company and are shared here with their permission.
Arithmer Inc. is a mathematics company that grew out of the Graduate School of Mathematical Sciences at the University of Tokyo. We apply modern mathematics to bring new, advanced AI systems into solutions across many fields. Our job is to think about how to use AI well to make work more efficient and to produce results that are useful to people.
Arithmer began at the University of Tokyo Graduate School of Mathematical Sciences. Today, our research in modern mathematics and AI systems can provide solutions to tough, complex issues. At Arithmer we believe it is our job to realize the potential of AI by improving work efficiency and producing more useful results for society.
It's a well-known fact that the best explanation of a simple model is the model itself. But often we use complex models, such as ensemble methods or deep networks, so we cannot use the original model as its own best explanation because it is not easy to understand.
In the context of this topic, we will discuss how methods for interpreting model predictions work and try to understand the practical value of these methods.
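As a concrete illustration of such a post-hoc method (an assumption for this write-up, not something stated in the talk itself), here is a minimal SHAP sketch that attributes a tree ensemble's predictions to individual features on synthetic data:

```python
import shap
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

# A "complex" model that is hard to read directly.
X, y = make_classification(n_samples=500, n_features=8, random_state=0)
model = GradientBoostingClassifier(random_state=0).fit(X, y)

# Post-hoc explanation: one additive feature-contribution vector per prediction.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X[:10])
print(shap_values[0])  # contributions of the 8 features to the first prediction
```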
Explainable Artificial Intelligence (XAI)
Presented at the Lightning Talk session at ICACCI'18 on 20th September 2018
An Explainable AI (XAI) or Transparent AI is an artificial intelligence (AI) whose actions can be easily understood by humans. It contrasts with the "black box" concept in machine learning, where even a system's designers cannot explain why the AI arrived at a specific decision.
https://en.wikipedia.org/wiki/Explainable_Artificial_Intelligence
Artificial Intelligence is increasingly playing an integral role in determining our day-to-day experiences. Moreover, with the proliferation of AI-based solutions in areas such as hiring, lending, criminal justice, healthcare, and education, the resulting personal and professional implications of AI are far-reaching. The dominant role played by AI models in these domains has led to a growing concern regarding potential bias in these models, and a demand for model transparency and interpretability. In addition, model explainability is a prerequisite for building trust and adoption of AI systems in high-stakes domains requiring reliability and safety such as healthcare and automated transportation, as well as critical industrial applications with significant economic implications such as predictive maintenance, exploration of natural resources, and climate change modeling.
As a consequence, AI researchers and practitioners have focused their attention on explainable AI to help them better trust and understand models at scale. The challenges for the research community include (i) defining model explainability, (ii) formulating explainability tasks for understanding model behavior and developing solutions for these tasks, and finally (iii) designing measures for evaluating the performance of models in explainability tasks.
In this tutorial, we will first motivate the need for model interpretability and explainability in AI from societal, legal, customer/end-user, and model developer perspectives. [Note: Due to time constraints, we will not focus on techniques/tools for providing explainability as part of AI/ML systems.] Then, we will focus on the real-world application of explainability techniques in industry, wherein we present practical challenges / implications for using explainability techniques effectively and lessons learned from deploying explainable models for several web-scale machine learning and data mining applications. We will present case studies across different companies, spanning application domains such as search and recommendation systems, sales, lending, and fraud detection. Finally, based on our experiences in industry, we will identify open problems and research directions for the research community.
As AI becomes more and more prevalent, the decisions it makes for us have an ever greater impact on our lives and the lives of others.
How can we help people trust the models we're building? The field of Explainable AI focuses on making any machine learning model interpretable by non-experts.
Explainable AI - making ML and DL models more interpretable (Aditya Bhattacharya)
Abstract –
Industries have started to adopt AI and Machine Learning in almost every sector to solve complex business problems, but are these models always trustworthy? Machine Learning models are not oracles; they are scientific methods and mathematical models that best describe the data. But science is all about explaining complex natural phenomena in the simplest way possible! So, can we make ML and DL models more interpretable, so that any business user can understand these models and trust their results?
To find out, please join me in this session, in which I will talk about the concepts of Explainable AI and discuss its necessity and the principles that help us demystify black-box AI models. I will discuss popular approaches like Feature Importance, Key Influencers, and decomposition trees used to make classical Machine Learning models interpretable. We will discuss various techniques used for Deep Learning model interpretation, like Saliency Maps, Grad-CAMs, and Visual Attention Maps, and finally go into more detail about frameworks like LIME, SHAP, ELI5, SKATER, and TCAV, which help us make Machine Learning and Deep Learning models more interpretable, trustworthy, and useful!
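As a hedged illustration of the LIME usage pattern mentioned above (the dataset and model here are assumptions of this write-up, not the ones used in the talk):

```python
from lime.lime_tabular import LimeTabularExplainer
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

data = load_breast_cancer()
model = RandomForestClassifier(random_state=0).fit(data.data, data.target)

explainer = LimeTabularExplainer(
    data.data,
    feature_names=list(data.feature_names),
    class_names=list(data.target_names),
    mode="classification",
)

# Explain a single prediction with a local, interpretable surrogate model.
explanation = explainer.explain_instance(
    data.data[0], model.predict_proba, num_features=5
)
print(explanation.as_list())  # top features and their local weights
```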
Algorithmic Bias: Challenges and Opportunities for AI in Healthcare (Gregory Nelson)
Gregory S. Nelson, VP, Analytics and Strategy – Vidant Health | Adjunct Faculty Duke University
The promise of AI is quickly becoming a reality for a number of industries, including healthcare. For example, we have seen early successes in augmenting clinical intelligence for diagnostic imaging and in early detection of pneumonia and sepsis. But what happens when the algorithms are biased? In this presentation, we will outline a framework for AI governance and discuss ways in which we can address algorithmic bias in machine learning.
Objective 1: Illustrate the issues of bias in AI through examples specific to healthcare.
Objective 2: Summarize the growing body of work in the legal, regulatory, and ethical oversight of AI models and the implications for healthcare.
Objective 3: Outline steps that we can take to establish an AI governance strategy for our organizations.
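One concrete starting point for Objective 1 is simply comparing error rates across patient groups. The sketch below (an illustrative assumption, not part of the presentation) computes the gap in true positive rate between two groups, a common equal-opportunity style check:

```python
import numpy as np

def true_positive_rate(y_true, y_pred):
    positives = y_true == 1
    return (y_pred[positives] == 1).mean() if positives.any() else float("nan")

def equal_opportunity_gap(y_true, y_pred, group):
    """Difference in recall (TPR) between two groups; a large gap means the
    model misses positive cases more often for one group than the other."""
    y_true, y_pred, group = map(np.asarray, (y_true, y_pred, group))
    tpr_a = true_positive_rate(y_true[group == 0], y_pred[group == 0])
    tpr_b = true_positive_rate(y_true[group == 1], y_pred[group == 1])
    return tpr_a - tpr_b

# Illustrative usage with hypothetical arrays:
# gap = equal_opportunity_gap(sepsis_labels, model_predictions, patient_group)
```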
Responsible AI in Industry (Tutorials at AAAI 2021, FAccT 2021, and WWW 2021) - Krishnaram Kenthapadi
[Video available at https://sites.google.com/view/ResponsibleAITutorial]
Artificial Intelligence is increasingly being used in decisions and processes that are critical for individuals, businesses, and society, especially in areas such as hiring, lending, criminal justice, healthcare, and education. Recent ethical challenges and undesirable outcomes associated with AI systems have highlighted the need for regulations, best practices, and practical tools to help data scientists and ML developers build AI systems that are secure, privacy-preserving, transparent, explainable, fair, and accountable – to avoid unintended and potentially harmful consequences and compliance challenges.
In this tutorial, we will present an overview of responsible AI, highlighting model explainability, fairness, and privacy in AI, key regulations/laws, and techniques/tools for providing understanding around AI/ML systems. Then, we will focus on the application of explainability, fairness assessment/unfairness mitigation, and privacy techniques in industry, wherein we present practical challenges/guidelines for using such techniques effectively and lessons learned from deploying models for several web-scale machine learning and data mining applications. We will present case studies across different companies, spanning many industries and application domains. Finally, based on our experiences in industry, we will identify open problems and research directions for the AI community.
Explainable AI makes algorithms transparent: their behavior can be interpreted, visualized, and explained, and then integrated into fair, secure, and trustworthy AI applications.
This tutorial extensively covers the definitions, nuances, challenges, and requirements for the design of interpretable and explainable machine learning models and systems in healthcare. We discuss many use cases in which interpretable machine learning models are needed in healthcare and how they should be deployed. Additionally, we explore the landscape of recent advances that address the challenges of model interpretability in healthcare and describe how one would go about choosing the right interpretable machine learning algorithm for a given problem in healthcare.
Presented at #H2OWorld 2017 in Mountain View, CA.
Enjoy the video: https://youtu.be/TBJqgvXYhfo.
Learn more about H2O.ai: https://www.h2o.ai/.
Follow @h2oai: https://twitter.com/h2oai.
- - -
Abstract:
Machine learning is at the forefront of many recent advances in science and technology, enabled in part by the sophisticated models and algorithms that have been recently introduced. However, as a consequence of this complexity, machine learning models essentially act as black boxes as far as users are concerned, making it incredibly difficult to understand, predict, or "trust" their behavior. In this talk, I will describe our research on approaches that explain the predictions of ANY classifier in an interpretable and faithful manner.
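The general recipe behind such model-agnostic explanations (perturb the input, query the black box, fit a simple local surrogate) can be sketched in a few lines. This is a deliberately simplified illustration, not the speaker's actual method:

```python
import numpy as np
from sklearn.linear_model import Ridge

def local_explanation(predict_proba, x, scale=0.1, n_samples=500, seed=0):
    """Fit a weighted linear surrogate around x for any binary classifier that
    exposes predict_proba; returns one coefficient (local importance) per feature."""
    rng = np.random.default_rng(seed)
    perturbed = x + rng.normal(0.0, scale, size=(n_samples, x.shape[0]))
    target = predict_proba(perturbed)[:, 1]                   # black-box outputs
    distances = np.linalg.norm(perturbed - x, axis=1)
    weights = np.exp(-(distances ** 2) / (2 * scale ** 2))    # nearby samples count more
    surrogate = Ridge(alpha=1.0).fit(perturbed, target, sample_weight=weights)
    return surrogate.coef_

# Illustrative usage: coefs = local_explanation(model.predict_proba, X_test[0])
```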
Sameer's Bio:
Dr. Sameer Singh is an Assistant Professor of Computer Science at the University of California, Irvine. He is working on large-scale and interpretable machine learning applied to natural language processing. Sameer was a Postdoctoral Research Associate at the University of Washington and received his PhD from the University of Massachusetts, Amherst, during which he also worked at Microsoft Research, Google Research, and Yahoo! Labs on massive-scale machine learning. He was awarded the Adobe Research Data Science Faculty Award, was selected as a DARPA Riser, won the grand prize in the Yelp dataset challenge, and received the Yahoo! Key Scientific Challenges fellowship. Sameer has published extensively at top-tier machine learning and natural language processing conferences. (http://sameersingh.org)
Interpreting deep learning and machine learning models is not just another regulatory burden to be overcome. Scientists, physicians, researchers, and analysts who use these technologies for their important work have the right to trust and understand their models and the answers they generate. This talk is an overview of several techniques for interpreting deep learning and machine learning models and telling stories from their results.
Speaker: Patrick Hall is a Data Scientist and Product Engineer at H2O.ai. He's also an Adjunct Professor at George Washington University in the Department of Decision Sciences. Prior to joining H2O, Patrick spent many years as a Senior Data Scientist at SAS and has worked with many Fortune 500 companies on their data science and machine learning problems. https://www.linkedin.com/in/jpatrickhall
This was presented at the London Artificial Intelligence & Deep Learning Meetup.
https://www.meetup.com/London-Artificial-Intelligence-Deep-Learning/events/245251725/
Enjoy the recording: https://youtu.be/CY3t11vuuOM.
- - -
Kasia discussed complexities of interpreting black-box algorithms and how these may affect some industries. She presented the most popular methods of interpreting Machine Learning classifiers, for example, feature importance or partial dependence plots and Bayesian networks. Finally, she introduced Local Interpretable Model-Agnostic Explanations (LIME) framework for explaining predictions of black-box learners – including text- and image-based models - using breast cancer data as a specific case scenario.
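For one of the methods Kasia mentions, partial dependence, a minimal scikit-learn sketch looks like this (the dataset choice mirrors her breast cancer example, but the model and code are otherwise assumptions of this write-up):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import partial_dependence

data = load_breast_cancer()
model = GradientBoostingClassifier(random_state=0).fit(data.data, data.target)

# Average model response as feature 0 is varied over a grid, with the other
# features held at their observed values (then averaged).
pd_result = partial_dependence(model, data.data, features=[0])
print(pd_result["average"])  # partial dependence of the prediction on feature 0
```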
Kasia Kulma is a Data Scientist at Aviva with a soft spot for R. She obtained a PhD in evolutionary biology (Uppsala University, Sweden) in 2013 and has been working on all things data ever since. For example, she has built recommender systems, customer segmentations, and predictive models, and she is now leading an NLP project at the UK's leading insurer. In her spare time she tries to relax by hiking & camping, but if that doesn't work ;) she co-organizes R-Ladies meetups and writes a data science blog, R-tastic (https://kkulma.github.io/).
https://www.linkedin.com/in/kasia-kulma-phd-7695b923/
These slides were presented at a meetup in Kansas City by Bahador Khaleghi of H2O.ai.
More details can be viewed here: https://www.meetup.com/Kansas-City-Artificial-Intelligence-Deep-Learning/events/265662978/
Data Con LA 2020
Description
More and more organizations are embracing AI by infusing it into their products and services to differentiate themselves from their competitors. AI is being utilized in some sensitive areas of human life. In this session, let's look at some of the principles governing the adoption of AI in a responsible manner. Why are companies accelerating adoption of AI?
Organizations are increasingly accelerating the adoption of AI to differentiate their products and services in the market. We have seen the outcomes of this digital transformation in the areas of optimizing operations, engaging customers, empowering employees, and transforming products and services.
*List some of the sensitive use cases where AI is being applied
*Why governing AI is important and what are those principles?
*How Microsoft is approaching it?
Speaker
Suresh Paulraj, Microsoft, Principal Cloud Solution Architect Data & AI
Spark + AI Summit - The Importance of Model Fairness and Interpretability in ... (Francesca Lazzeri, PhD)
Machine learning model fairness and interpretability are critical for data scientists, researchers, and developers to explain their models and understand the value and accuracy of their findings. Interpretability is also important to debug machine learning models and make informed decisions about how to improve them. In this session, Francesca will go over a few methods and tools that enable you to "unpack" machine learning models, gain insights into how and why they produce specific results, assess your AI systems' fairness, and mitigate any observed fairness issues.
Using open source fairness and interpretability packages, attendees will learn how to:
- Explain model prediction by generating feature importance values for the entire model and/or individual datapoints.
- Achieve model interpretability on real-world datasets at scale, during training and inference.
- Use an interactive visualization dashboard to discover patterns in data and explanations at training time.
- Leverage additional interactive visualizations to assess which groups of users might be negatively impacted by a model and compare multiple models in terms of their fairness and performance.
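For the first bullet (global feature importance values), a minimal library-agnostic sketch using scikit-learn's permutation importance is shown below; the specific open-source packages used in the session are not assumed here:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=10, n_informative=4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Global importance: average drop in held-out accuracy when a feature is shuffled.
result = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=0)
for i in np.argsort(result.importances_mean)[::-1][:5]:
    print(f"feature_{i}: {result.importances_mean[i]:.3f}")
```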
Machine Learning Interpretability - Mateusz Dymczyk - H2O AI World London 2018 (Sri Ambati)
This talk was recorded in London on Oct 30, 2018 and can be viewed here: https://youtu.be/p4iAnxwC_Eg
The good news is that building fair, accountable, and transparent machine learning systems is possible. The bad news is that it's harder than many blogs and software package docs would have you believe. The truth is that nearly all interpretable machine learning techniques generate approximate explanations, that the fields of eXplainable AI (XAI) and Fairness, Accountability, and Transparency in Machine Learning (FAT/ML) are very new, and that few best practices have been widely agreed upon. This combination can lead to some ugly outcomes!
This talk aims to make your interpretable machine learning project a success by describing the fundamental technical challenges you will face in building an interpretable machine learning system, defining the real-world value proposition of approximate explanations for exact models, and then outlining viable techniques for debugging, explaining, and testing machine learning models.
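One standard member of this family, an approximate global surrogate for an exact model, can be sketched as follows (an illustration chosen for this write-up, not necessarily one of the techniques covered in the talk):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = make_classification(n_samples=1000, n_features=6, random_state=0)
black_box = GradientBoostingClassifier(random_state=0).fit(X, y)

# Train a shallow tree to mimic the black box's predictions, not the true labels.
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0)
surrogate.fit(X, black_box.predict(X))

# Fidelity: how often the approximate explanation agrees with the exact model.
fidelity = (surrogate.predict(X) == black_box.predict(X)).mean()
print(f"surrogate fidelity: {fidelity:.2%}")
print(export_text(surrogate, feature_names=[f"f{i}" for i in range(6)]))
```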
Mateusz is a software developer who loves all things distributed and machine learning, and hates buzzwords. His favourite hobby is data juggling.
He obtained his M.Sc. in Computer Science from AGH UST in Krakow, Poland, during which he did an exchange at ECE Paris in France and worked on distributed flight booking systems. After graduation he moved to Tokyo to work as a researcher at Fujitsu Laboratories on machine learning and NLP projects, where he is still based.
The Incredible Disappearing Data Scientist (Rebecca Bilbro)
The last decade saw advances in compute power combine with an avalanche of open source software development, resulting in a revolution in machine learning and scalable analytics. “Data science” and “data product” are now household terms. This led to a new job description, the Data Scientist, which quickly became one of the most significant, exciting, and misunderstood jobs of the 21st century. One part statistician, one part computer scientist, and one part domain expert, data scientists seem poised to become the most pivotal value creators of the information age. And yet, danger (supposedly) lies ahead: human decisions are increasingly outsourced to algorithms of questionable ethical design; we’re putting everything on the blockchain; and perhaps most disturbingly, data science salaries are dropping precipitously as new graduates and Machine Learning as a Service (MLaaS) offerings flood the market. As we move into a future where predictive analytics is no longer a differentiator but instead a core business function, will data scientists proliferate or be automated out of a job?
In this talk, one humble data scientist attempts to cut through the hype to present an alternate vision of what data science is and can become. If not the “Sexiest Job of the 21st Century" as the Harvard Business Review once quipped, what is it like to be a workaday data scientist? What problems are we solving? How do we integrate with mature engineering teams? How do we engage with clients and product owners? How do we deploy non-deterministic models in production? In particular, we’ll examine critical integration points — technological and otherwise — we are currently tackling, which will ultimately determine our success, and our viability, over the next 10 years.
Invited talk at the ExUM workshop at the UMAP 2022 conference
abstract:
Explainability has become an important topic both in Data Science and AI in general and in recommender systems in particular, as algorithms have become much less inherently explainable. However, explainability has different interpretations and goals in different fields. For example, interpretability and explainability tools in machine learning are predominantly developed for Data Scientists to understand and scrutinize their models. Current tools are therefore often quite technical and not very 'user-friendly'. I will illustrate this with our recent work on improving the explainability of model-agnostic tools such as LIME and SHAP. Another stream of research on explainability in the HCI and XAI fields focuses more on users' needs for explainability, such as contrastive and selective explanations and explanations that fit with the mental models and beliefs of the user. However, how to satisfy those needs is still an open question. Based on recent work in interactive AI and machine learning, I will propose that explainability goes together with interactivity, and will illustrate this with examples from our own work on music genre exploration, which combines visualizations and interactive tools to help users understand and tune our exploration model.
Coder Name: Rebecca Oquendo
Coding Categories:
Episode
Aggressive Behavior
Neutral Behavior
Virtuous Behavior
Aggressive Gaming
Neutral Gaming
Virtuous Gaming
An older peer began using slurs or derogatory language
An older peer suggested that the team should cheat
The child witnessed an older peer intentionally leave out another player
An older player suggested that they play a different game
The child lost the game with older players on their team
The child witnessed an older player curse every time a mistake was made
Index:
· In this case, aggressive behavior would constitute mimicking older members’ undesired behaviors or becoming especially angry or agitated in game. A neutral behavior would be playing as they usually would, not mimicking older players’ behaviors or trying to fit in with their more aggressive styles. A virtuous behavior would be steering the game away from aggression, voicing an opinion about the excessive aggression, or finding a way to express their gaming experience in a positive way. The same can be applied to the similar categories in “gaming”.
· Each category can be scored on a scale of 1-7 indicating how strongly the child’s dialogue tended in that direction, both behavior-wise and gaming-wise, with a 1 indicating little to no effort in that direction and a 7 indicating extreme effort in that category.
1. What are the different types of attributes? Provide examples of each attribute.
2. Describe the components of a decision tree. Give an example problem and provide an example of each component in your decision making tree
3. Conduct research over the Internet and find an article on data mining. The article has to be less than 5 years old. Summarize the article in your own words. Make sure that you use APA formatting for this assignment.
Questions from attached files
1. Obtain one of the data sets available at the UCI Machine Learning Repository and apply as many of the different visualization techniques described in the chapter as possible. The bibliographic notes and book Web site provide pointers to visualization software.
2. Identify at least two advantages and two disadvantages of using color to visually represent information.
3. What are the arrangement issues that arise with respect to three-dimensional plots?
4. Discuss the advantages and disadvantages of using sampling to reduce the number of data objects that need to be displayed. Would simple random sampling (without replacement) be a good approach to sampling? Why or why not?
5. Describe how you would create visualizations to display information that describes the following types of systems.
a) Computer networks. Be sure to include both the static aspects of the network, such as connectivity, and the dynamic aspects, such as traffic.
b) The distribution of specific plant and animal species around the world for a specific moment in time.
c) The use of computer resources, such as processor time, main me ...
Usage of AI and machine learning models is likely to become more commonplace as larger swaths of the economy embrace automation and data-driven decision-making. While these predictive systems can be quite accurate, they have historically been treated as inscrutable black boxes that produce only numeric predictions with no accompanying explanations. Unfortunately, recent studies and recent events have drawn attention to mathematical and sociological flaws in prominent weak AI and ML systems, yet practitioners usually don’t have the right tools to pry open machine learning black boxes and debug them.
This presentation introduces several new approaches that increase transparency, accountability, and trustworthiness in machine learning models. If you are a data scientist or analyst and you want to explain a machine learning model to your customers or managers (or if you have concerns about documentation, validation, or regulatory requirements), then this presentation is for you!
The importance of model fairness and interpretability in AI systems - Francesca Lazzeri, PhD
Machine learning model fairness and interpretability are critical for data scientists, researchers and developers to explain their models and understand the value and accuracy of their findings. Interpretability is also important to debug machine learning models and make informed decisions about how to improve them.
In this session, Francesca will go over a few methods and tools that enable you to "unpack" machine learning models, gain insights into how and why they produce specific results, assess your AI systems' fairness, and mitigate any observed fairness issues.
Using open-source fairness and interpretability packages, attendees will learn how to:
- Explain model prediction by generating feature importance values for the entire model and/or individual data points.
- Achieve model interpretability on real-world datasets at scale, during training and inference.
- Use an interactive visualization dashboard to discover patterns in data and explanations at training time.
- Leverage additional interactive visualizations to assess which groups of users might be negatively impacted by a model and compare multiple models in terms of their fairness and performance.
Interpretable machine learning? Let's do this with the Break Down package, part of the DrWhy.AI initiative. Find more information at https://github.com/pbiecek/DALEX/
Trusted, Transparent and Fair AI using Open Source - Animesh Singh
Fairness, robustness, and explainability in AI are some of the key cornerstones of trustworthy AI. Through its open source projects, IBM and IBM Research bring together the developer, data science and research community to accelerate the pace of innovation and instrument trust into AI.
Adversarial Analytics - 2013 Strata & Hadoop World Talk - Robert Grossman
This is a talk I gave at the Strata Conference and Hadoop World in New York City on October 28, 2013. It describes predictive modeling in the context of modeling an adversary's behavior.
Using AI to Build Fair and Equitable Workplaces - Data Con LA
Data Con LA 2020
Description
With recent events putting a spotlight on anti-racism, social-justice, climate change, and mental health there's a call for increased ethics and transparency in business. Companies are, rightfully, feeling responsible for providing underrepresented employees with the same treatment and opportunities as their majority counterparts. AI can, and will, be used to help companies understand their environment, develop strategies for improvement and monitor progress. And, as AI is used to make increasingly complex and life-changing decisions, it is critical to ensure that these decisions are fair, equitable and explainable. Unfortunately, it is becoming increasingly clear that, much like humans, AI can be biased. It is therefore imperative that as we develop AI solutions, we are fully aware of the dangers of bias, understand how bias can manifest and know how to take steps to address and minimize it.
In this session you will learn:
*Definitions of fairness, regulated domains and protected classes
*How bias can manifest in AI
*How bias in AI can be measured, tracked and reduced
*Best practices for ensuring that bias doesn't creep into AI/ML models over time
*How explainability can be used to perform real-time checks on predictions
Speakers
Lawrence Spracklen, RSquared AI, Engineering Leadership
Sonya Balzer, RSquared.ai, Director of AI Marketing
GDG Cloud Southlake #17: Meg Dickey-Kurdziolek: Explainable AI is for Everyone - James Anderson
If Artificial Intelligence (AI) is a black box, how can a human comprehend and trust the results of Machine Learning (ML) algorithms? Explainable AI (XAI) tries to shed light into that AI black box so humans can trust what is going on. Our speaker Meg Dickey-Kurdziolek is currently a UX Researcher for Google Cloud AI and Industry Solutions, where she focuses her research on Explainable AI and Model Understanding. Recording of the presentation: https://youtu.be/6N2DNN_HDWU
Deciphering AI - Unlocking the Black Box of AIML with State-of-the-Art Techno... - Analytics India Magazine
Most organizations understand the predictive power and the potential gains from AIML, but AI and ML are still a black-box technology for them. While deep learning and neural networks can provide excellent inputs to businesses, leaders are challenged to use them because of the complete blind faith required to ‘trust’ AI. In this talk we will use the latest technological developments from researchers, the US defense department, and the industry to unbox the black box and provide businesses a clear understanding of the policy levers that they can pull, why, and by how much, to make effective decisions.
Walk Through a Real World ML Production Project - Bill Liu
Success in productionizing ML models is difficult to achieve due to tools, processes and operational procedures. In this session, we demonstrate how data scientists and ML engineers collaborate and efficiently deploy models to production with the Wallaroo platform.
Using a real world scenario we will click down into the ML production journey that Data Scientists and ML engineers go through to take ML models into production. In this session you will learn:
The current pain points and blockers to production
The two persona roles in the ML production process: Data Scientist (DS) and ML Engineer
How the ML engineer creates a workspace in Wallaroo, and invites the DS to collaborate
How the DS uploads and deploys models to WL performing simple validation checks on output
How the ML Engineer can check model health (inference speed, etc)
How the DS checks logs, looks for anomalies
How the DS switches model in the pipeline
Speakers: Nina Zumel, Martin Bald
Redefining MLOps with Model Deployment, Management and Observability in Produ... - Bill Liu
Tech talk: https://www.aicamp.ai/event/eventdetails/W2022052410
What happens after your machine learning models are deployed in production? How do you make sure that your model performance does not degrade as data and the world change?
The constantly changing data creates challenges for data scientists and engineering teams on how to detect which models have been affected and how to get their ML applications up and running seamlessly.
In this session we will take a deep dive into the new ML model monitoring and drift detection technology. We will discuss:
- How to track the ongoing accuracy of their models in production
- How to immediately detect drift before it causes significant damage to the business
- How to locate the cause of model drifting in live environments.
We will also discuss how data scientists and ML engineers can collaborate effectively using their respective tools to identify issues and take the necessary actions with a live demo and a real world use case.
Speaker: Younes Amar, Head of Product Wallaroo AI.
Resources: https://docs.wallaroo.ai/
These days, training of the Machine Learning models at the device Edge is still a risky endeavor. It is frequently considered a purely academic subject with little value for real-life product development.
In her talk, Vera will challenge this misconception, talk about the advantages of learning at the Edge and guide you through the Edge learning decision-making framework and design principles.
https://www.aicamp.ai/event/eventdetails/W2021102210
Attention Is All You Need.
With these simple words, the Deep Learning industry was forever changed. Transformers were initially introduced in the field of Natural Language Processing to enhance language translation, but they demonstrated astonishing results even outside language processing. In particular, they recently spread in the Computer Vision community, advancing the state-of-the-art on many vision tasks. But what are Transformers? What is the mechanism of self-attention, and do we really need it? How did they revolutionize Computer Vision? Will they ever replace convolutional neural networks?
These and many other questions will be answered during the talk.
In this tech talk, we will discuss:
- A piece of history: Why did we need a new architecture?
- What is self-attention, and where does this concept come from?
- The Transformer architecture and its mechanisms
- Vision Transformers: An Image is worth 16x16 words
- Video Understanding using Transformers: the space + time approach
- The scale and data problem: Is Attention what we really need?
- The future of Computer Vision through Transformers
Speaker: Davide Coccomini, Nicola Messina
Website: https://www.aicamp.ai/event/eventdetails/W2021101110
Deep AutoViML For Tensorflow Models and MLOps Workflows - Bill Liu
deep_autoviml is a powerful new deep learning library with a very simple design goal: Make it as easy as possible for novices and experts alike to experiment with and build tensorflow.keras preprocessing pipelines and models in as few lines of code as possible.
deep_autoviml will enable data scientists, ML engineers and data engineers to fast prototype tensorflow models and data pipelines for MLOps workflows using the latest TF 2.4+ and keras preprocessing layers. You can now upload your saved model to any Cloud provider and make predictions out of the box since all the data preprocessing layers are attached to the model itself!
In this webinar, we will discuss the problems that deep_AutoViML can solve, its architecture design and demo how to build powerful TF.Keras models on structured data, NLP and Image data domains.
https://www.aicamp.ai/event/eventdetails/W2021080918
Metaflow: The ML Infrastructure at Netflix - Bill Liu
Metaflow was started at Netflix to answer a pressing business need: how to enable an organization of data scientists, who are not software engineers by training, to build and deploy end-to-end machine learning workflows and applications independently. We wanted to provide the best possible user experience for data scientists, allowing them to focus on the parts they like (modeling using their favorite off-the-shelf libraries) while providing robust built-in solutions for the foundational infrastructure: data, compute, orchestration, and versioning.
Today, the open-source Metaflow powers hundreds of business-critical ML projects at Netflix and other companies from bioinformatics to real estate.
In this talk, you will learn about:
- What to expect from a modern ML infrastructure stack.
- Using Metaflow to boost the productivity of your data science organization, based on lessons learned from Netflix.
- Deployment strategies for a full stack of ML infrastructure that plays nicely with your existing systems and policies.
https://www.aicamp.ai/event/eventdetails/W2021080510
AI stands on three pillars: algorithms, hardware and training data. While the first two have already become commodities on the market, the latter - reliable labelled data - is still a bottleneck in the industry.
Need to add twice as much data to the training set to improve your model? Want to validate the accuracy of a new classifier in an hour? Or maybe you are building a human-in-the-loop process with 90% of cases processed automatically and the trickiest 10% of cases fine-tuned by people in real time. You can do it all with crowdsourcing, but only with crowdsourcing done right.
In this talk, we will discuss how the new generation of methods and tools makes it possible to collect high-quality human-labelled data on a large scale, and why every ML specialist should know how to use crowdsourcing.
You will learn from the talk:
* Understand the applicability, benefits and limits of the crowdsourcing approach.
* Integrate an on-demand workforce into your processes and build human-in-the-loop processes.
* Control the quality and accuracy of data labeling to develop high performing ML models.
* Understand the full-cycle crowdsourcing project
Speaker: Daria Baidakova(Toloka)
Building large scale transactional data lake using Apache Hudi - Bill Liu
Data is a critical infrastructure for building machine learning systems. From ensuring accurate ETAs to predicting optimal traffic routes, providing safe, seamless transportation and delivery experiences on the Uber platform requires reliable, performant large-scale data storage and analysis. In 2016, Uber developed Apache Hudi, an incremental processing framework, to power business critical data pipelines at low latency and high efficiency, and helps distributed organizations build and manage petabyte-scale data lakes.
In this talk, I will describe what Apache Hudi is and its architectural design, and then deep dive into how it improves data operations by providing features such as data versioning and time travel.
We will also go over how Hudi brings kappa architecture to big data systems and enables efficient incremental processing for near real time use cases.
Speaker: Satish Kotha (Uber)
Apache Hudi committer and Engineer at Uber. Previously, he worked on building real time distributed storage systems like Twitter MetricsDB and BlobStore.
website: https://www.aicamp.ai/event/eventdetails/W2021043010
Deep Reinforcement Learning and Its Applications - Bill Liu
What is the most exciting AI news in recent years? AlphaGo!
What are key techniques for AlphaGo? Deep learning and reinforcement learning (RL)!
What are application areas for deep RL? A lot! In fact, besides games, deep RL has been making tremendous achievements in diverse areas like recommender systems and robotics.
In this talk, we will introduce deep reinforcement learning, present several applications, and discuss issues and potential solutions for successfully applying deep RL in real life scenarios.
https://www.aicamp.ai/event/eventdetails/W2021042818
Big Data and AI in Fighting Against COVID-19 - Bill Liu
Website: https://learn.xnextcon.com/event/eventdetails/W20070810
As the COVID-19 pandemic sweeps the globe, big data and AI have emerged as crucial tools for everything from diagnosis and epidemiology to therapeutic and vaccine development.
In this talk, we collect and review how big data is fighting back against COVID-19. We also provide a deep dive into two interesting use cases: 1) using NLP and BERT to answer scientific questions, and 2) the COVID-19 data lake from Databricks, Google and Amazon.
Agenda:
Introduction
Supercomputers for Scientific Research
Covid-19 Tracking and Prediction
Covid-19 Research and Diagnosis
Use Case 1 NLP and BERT to answer scientific questions
Use Case 2 Covid-19 Data Lake and Platform
Highly-scalable Reinforcement Learning RLlib for Real-world Applications - Bill Liu
website: https://learn.xnextcon.com/event/eventdetails/W20051110
video: https://www.youtube.com/watch?v=8tG8PJC6oaU
In reinforcement learning (RL), an agent learns how to optimize performance solely by collecting experience in the real world or via a simulator. RL is being applied to problems such as decision making, process optimization (e.g., manufacturing and supply chains), ad serving, recommendations, self-driving cars, and algorithmic trading.
In this talk, I will discuss RLlib, a reinforcement learning library built on Ray with a strong focus on large-scale execution and scalability, ease-of-use for general users, as well as customizability for developers and researchers.
RLlib offers autonomous task-learning via many common RL algorithms and it scales from a laptop to a cluster with hundreds of machines. It is used by dozens of organizations, from startups to research labs to large organizations. You will see RLlib in action with a live demo.
Build computer vision models to perform object detection and classification w... - Bill Liu
event: https://learn.xnextcon.com/event/eventdetails/W20042918
video:
description: Computer Vision has received significant attention over recent years, both within academia and industry. As the state-of-the-art rapidly improves, the art-of-the-possible follows, offering innovative forms of computer vision applications for different scenarios.
In this talk, Ramine will cover the background and development of computer vision, and demonstrate how to use AWS to build robust, computer vision models to perform object detection and classification.
Key Takeaways:
Understand the history of Computer Vision
Learn how to use Amazon SageMaker to build and Deploy Computer Vision Models
How to orchestrate multiple models for implementing a real-world use case
Causal Inference in Data Science and Machine Learning - Bill Liu
Event: https://learn.xnextcon.com/event/eventdetails/W20042010
Video: https://www.youtube.com/channel/UCj09XsAWj-RF9kY4UvBJh_A
Modern machine learning techniques are able to learn highly complex associations from data, which has led to amazing progress in computer vision, NLP, and other predictive tasks. However, there are limitations to inference from purely probabilistic or associational information. Without understanding causal relationships, ML models are unable to provide actionable recommendations, perform poorly in new, but related environments, and suffer from a lack of interpretability.
In this talk, I provide an introduction to the field of causal inference, discuss its importance in addressing some of the current limitations in machine learning, and provide some real-world examples from my experience as a data scientist at Brex.
https://learn.xnextcon.com/event/eventdetails/W20040610
This talk explains how to practically bring the power of convolutional neural networks and deep learning to memory and power-constrained devices like smartphones. You will learn various strategies to circumvent obstacles and build mobile-friendly shallow CNN architectures that significantly reduce the memory footprint and therefore make them easier to store on a smartphone;
The talk also dives into how to use a family of model compression techniques to prune the network size for live image processing, enabling you to build a CNN version optimized for inference on mobile devices. Along the way, you will learn practical strategies to preprocess your data in a manner that makes the models more efficient in the real world.
Weekly #105: AutoViz and Auto_ViML Visualization and Machine Learning - Bill Liu
https://learn.xnextcon.com/event/eventdetails/W20040310
I will describe what is available in terms of open source and proprietary tools for automating Data Science tasks and introduce two new tools: one to visualize any sized data set with one click, and another to try multiple ML models and techniques with a single call. I will provide the GitHub repos for both for free in the talk.
Transcript: Selling digital books in 2024: Insights from industry leaders - T... - BookNet Canada
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo... - James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti... - Jeffrey Haguewood
Sidekick Solutions uses Bonterra Impact Management (fka Social Solutions Apricot) and automation solutions to integrate data for business workflows.
We believe integration and automation are essential to user experience and the promise of efficient work through technology. Automation is the critical ingredient to realizing that full vision. We develop integration products and services for Bonterra Case Management software to support the deployment of automations for a variety of use cases.
This video focuses on the notifications, alerts, and approval requests using Slack for Bonterra Impact Management. The solutions covered in this webinar can also be deployed for Microsoft Teams.
Interested in deploying notification automations for Bonterra Impact Management? Contact us at sales@sidekicksolutionsllc.com to discuss next steps.
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality - Inflectra
In this insightful webinar, Inflectra explores how artificial intelligence (AI) is transforming software development and testing. Discover how AI-powered tools are revolutionizing every stage of the software development lifecycle (SDLC), from design and prototyping to testing, deployment, and monitoring.
Learn about:
• The Future of Testing: How AI is shifting testing towards verification, analysis, and higher-level skills, while reducing repetitive tasks.
• Test Automation: How AI-powered test case generation, optimization, and self-healing tests are making testing more efficient and effective.
• Visual Testing: Explore the emerging capabilities of AI in visual testing and how it's set to revolutionize UI verification.
• Inflectra's AI Solutions: See demonstrations of Inflectra's cutting-edge AI tools like the ChatGPT plugin and Azure Open AI platform, designed to streamline your testing process.
Whether you're a developer, tester, or QA professional, this webinar will give you valuable insights into how AI is shaping the future of software delivery.
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do... - UiPathCommunity
💥 Speed, accuracy, and scaling – discover the superpowers of GenAI in action with UiPath Document Understanding and Communications Mining™:
See how to accelerate model training and optimize model performance with active learning
Learn about the latest enhancements to out-of-the-box document processing – with little to no training required
Get an exclusive demo of the new family of UiPath LLMs – GenAI models specialized for processing different types of documents and messages
This is a hands-on session specifically designed for automation developers and AI enthusiasts seeking to enhance their knowledge in leveraging the latest intelligent document processing capabilities offered by UiPath.
Speakers:
👨🏫 Andras Palfi, Senior Product Manager, UiPath
👩🏫 Lenka Dulovicova, Product Program Manager, UiPath
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -... - DanBrown980551
Do you want to learn how to model and simulate an electrical network from scratch in under an hour?
Then welcome to this PowSyBl workshop, hosted by Rte, the French Transmission System Operator (TSO)!
During the webinar, you will discover the PowSyBl ecosystem as well as handle and study an electrical network through an interactive Python notebook.
PowSyBl is an open source project hosted by LF Energy, which offers a comprehensive set of features for electrical grid modelling and simulation. Among other advanced features, PowSyBl provides:
- A fully editable and extendable library for grid component modelling;
- Visualization tools to display your network;
- Grid simulation tools, such as power flows, security analyses (with or without remedial actions) and sensitivity analyses;
The framework is mostly written in Java, with a Python binding so that Python developers can access PowSyBl functionalities as well.
What you will learn during the webinar:
- For beginners: discover PowSyBl's functionalities through a quick general presentation and the notebook, without needing any expert coding skills;
- For advanced developers: master the skills to efficiently apply PowSyBl functionalities to your real-world scenarios.
Securing your Kubernetes cluster: a step-by-step guide to success! - KatiaHIMEUR1
Today, after several years of existence, an extremely active community and an ultra-dynamic ecosystem, Kubernetes has established itself as the de facto standard in container orchestration. Thanks to a wide range of managed services, it has never been so easy to set up a ready-to-use Kubernetes cluster.
However, this ease of use means that the subject of security in Kubernetes is often left for later, or even neglected. This exposes companies to significant risks.
In this talk, I'll show you step-by-step how to secure your Kubernetes cluster for greater peace of mind and reliability.
Kubernetes & AI - Beauty and the Beast!?! @KCD Istanbul 2024 - Tobias Schneck
As AI technology pushes into IT, I found myself wondering, as an “infrastructure container Kubernetes guy”, how this fancy AI technology gets managed from an infrastructure operations point of view. Is it possible to apply our lovely cloud native principles as well? What benefits could the two technologies bring to each other?
Let me take these questions and provide you with a short journey through existing deployment models and use cases for AI software. Using practical examples, we discuss what cloud/on-premise strategy we may need to apply it to our own infrastructure and get it to work from an enterprise perspective. I want to give an overview of infrastructure requirements and technologies, and what could be beneficial for, or limiting to, your AI use cases in an enterprise environment. An interactive demo will give you some insights into which approaches I already have working in practice.
Generating a custom Ruby SDK for your web service or Rails API using Smithy - g2nightmarescribd
Have you ever wanted a Ruby client API to communicate with your web service? Smithy is a protocol-agnostic language for defining services and SDKs. Smithy Ruby is an implementation of Smithy that generates a Ruby SDK using a Smithy model. In this talk, we will explore Smithy and Smithy Ruby to learn how to generate custom feature-rich SDKs that can communicate with any web service, such as a Rails JSON API.
Neuro-symbolic is not enough, we need neuro-*semantic* - Frank van Harmelen
Neuro-symbolic (NeSy) AI is on the rise. However, simply doing machine learning on just any symbolic structure is not sufficient to really harvest the gains of NeSy. These will only be gained when the symbolic structures have an actual semantics. I give an operational definition of semantics as “predictable inference”.
All of this illustrated with link prediction over knowledge graphs, but the argument is general.
4. Why Explainability?
• More use of ML/AI models by laypersons.
• Laypersons need explanations.
• Developers also need quick explanations to debug models faster.
• There may be a legal need for explanations:
• If you deny someone a loan, you may need to explain the reason for the denial.
7. Explainability vs Performance Tradeoff
• Some machine learning models are more explainable than others.
[Chart: performance vs. explainability. Deep learning models sit at the high-performance, low-explainability end; linear models and decision trees are more explainable but typically less performant.]
9. What Features? Interpretable Features
• We need interpretable features.
• Difficult for laypersons to understand raw feature spaces (e.g. word embeddings).
• Humans are good at understanding presence or absence of components.
10. Interpretable Instance
• E.g.
• For text: convert to a binary vector indicating presence or absence of words.
• For images: convert to a binary vector indicating presence or absence of pixels or contiguous regions.
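To make this concrete, here is a minimal sketch (not from the original deck) of mapping raw text to the interpretable presence/absence representation described above; the vocabulary and the whitespace tokenizer are illustrative assumptions:

# Minimal sketch: map raw text to an interpretable binary representation
# (presence/absence of vocabulary words). Vocabulary and tokenization are
# illustrative assumptions, not part of the original deck.
def to_interpretable(text, vocabulary):
    """Return a 0/1 vector indicating which vocabulary words appear in text."""
    tokens = set(text.lower().split())
    return [1 if word in tokens else 0 for word in vocabulary]

vocab = ["loan", "denied", "credit", "approved"]
print(to_interpretable("Your loan was denied", vocab))  # [1, 1, 0, 0]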
11. Method 1: LIME
Local Interpretable Model-agnostic Explanations
From https://github.com/marcotcr/lime
Ribeiro, M.T., Singh, S. and Guestrin, C., 2016, August. Why Should I Trust You?: Explaining the Predictions of Any Classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 1135-1144). ACM.
12. Method 1: LIME
[Diagram: the instance is represented as a binary vector; perturbed binary vectors are fed to the black-box model (“any classifier”) to get predictions; a sparse local linear model is fit to these perturbed samples (“enforce sparsity”), and the weights for the linear classifier then give us feature importances.]
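A minimal usage sketch of the LIME package referenced above, assuming a scikit-learn text pipeline as the black-box model; the tiny training set and class names are made up for illustration and this is not the original example:

# Sketch: explain a text classifier with LIME (https://github.com/marcotcr/lime).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from lime.lime_text import LimeTextExplainer

texts = ["great movie, loved it", "terrible plot, awful acting",
         "wonderful and moving film", "boring, awful, a waste of time"]
labels = [1, 0, 1, 0]  # 1 = positive, 0 = negative

# Black-box model: TF-IDF features + logistic regression.
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(texts, labels)

# LIME perturbs the text (removing words), queries the model on the
# perturbations, and fits a sparse local linear model whose weights are
# the per-word attributions.
explainer = LimeTextExplainer(class_names=["negative", "positive"])
exp = explainer.explain_instance("loved the film, great acting",
                                 model.predict_proba, num_features=4)
print(exp.as_list())  # [(word, weight), ...] from the local linear model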
15. Explanations for Multi-Label Classifiers (Ribeiro, Singh and Guestrin, 2016)
16. Using LIME for Debugging (E.g. 1) (Ribeiro, Singh and Guestrin, 2016)
19. Method 2: SHAP
Unifies many different feature attribution methods and has some desirable properties:
1. LIME
2. Integrated Gradients
3. Shapley values
4. DeepLift
Lundberg, S.M. and Lee, S.I., 2017. A unified approach to interpreting model predictions. In Advances in Neural Information Processing Systems (pp. 4765-4774).
20. Method 2: SHAP
• Derives from game-theoretic foundations.
• Shapley values are used in game theory to assign values to players in cooperative games.
21. What are Shapley values?
• Suppose there is a set N of players participating in a game, with the payoff for any subset S of participating players given by a function v(S).
• Shapley values provide one fair way of dividing up the total payoff v(N) among the players.
22. Shapley Value
For player i, the Shapley value is
\phi_i(v) = \sum_{S \subseteq N \setminus \{i\}} \frac{|S|!\,(|N|-|S|-1)!}{|N|!} \bigl( v(S \cup \{i\}) - v(S) \bigr)
where \phi_i(v) is the Shapley value for player i, v(S \cup \{i\}) is the payoff for a group including player i, and v(S) is the payoff for the same group without player i.
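As a worked illustration (not from the deck), the formula can be evaluated exactly for a toy game by enumerating subsets; this is only feasible for a handful of players, which is why SHAP relies on approximations:

# Toy exact Shapley computation by enumerating all subsets of the other players.
from itertools import combinations
from math import factorial

def shapley(players, payoff):
    """Exact Shapley value per player, given payoff(frozenset) -> float."""
    n = len(players)
    values = {}
    for i in players:
        others = [p for p in players if p != i]
        total = 0.0
        for r in range(len(others) + 1):
            for subset in combinations(others, r):
                S = frozenset(subset)
                weight = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
                total += weight * (payoff(S | {i}) - payoff(S))
        values[i] = total
    return values

# Example 2-player game: each player alone earns 1, together they earn 4.
v = lambda S: {0: 0, 1: 1, 2: 4}[len(S)]
print(shapley(["a", "b"], v))  # {'a': 2.0, 'b': 2.0}: the surplus is split fairly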
24. SHAP Implementation
(https://github.com/slundberg/shap)
Different kinds of explainers:
1. TreeExplainer: fast and exact SHAP values for tree ensembles
2. KernelExplainer: approximate explainer for black box estimators
3. DeepExplainer: high-speed approximate explainer for deep learning models.
4. ExpectedGradients: SHAP-based extension of integrated gradients
25. XGBoost on UCI Income Dataset
Output is the probability of income over 50k.
[SHAP force plot: individual features (e.g. f87, f23, f3, f34, f41) push the prediction for one instance from the base value toward the model output.]
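A minimal sketch, assuming the shap and xgboost packages, of how this kind of plot is typically produced; this is not the original notebook, and the force plot renders inside a Jupyter notebook:

# Sketch: SHAP TreeExplainer on an XGBoost model trained on the UCI Adult
# ("income > 50k") dataset that ships with the shap package.
import shap
import xgboost

X, y = shap.datasets.adult()              # features and binary income labels
model = xgboost.XGBClassifier().fit(X, y)

explainer = shap.TreeExplainer(model)      # fast, exact SHAP values for trees
shap_values = explainer.shap_values(X)     # one attribution per feature per row

# In a notebook, this draws the force plot: features push the prediction
# from the base value (expected_value) toward the model output for row 0.
shap.initjs()
shap.force_plot(explainer.expected_value, shap_values[0, :], X.iloc[0, :])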
27. Is This Form of Explainability Enough?
• Explainability does not provide us with recourse.
• Recourse: information needed to change a specific prediction to a desired value.
• “If you had paid your credit card balance in full for the last three months, you would have got that loan.”
28. Issues with SHAP and LIME
SHAP and LIME values are highly variable for instances that are very similar, for non-linear models.
On the Robustness of Interpretability Methods: https://arxiv.org/abs/1806.08049
30. Issues with SHAP and LIME
SHAP and LIME values don’t provide insight into how the model will behave on new instances.
High-Precision Model-Agnostic Explanations: https://www.aaai.org/ocs/index.php/AAAI/AAAI18/paper/view/16982
31. Take-home message
• Explainability is possible and need not come at the cost of performance.
• Explainability is not enough:
• Recourse, etc.
33. Fairness and Bias in Machine Learning
1. Bias in this context is unfairness (more or less).
2. Note we are not talking about standard statistical bias in machine learning (the bias in the bias vs. variance tradeoff).
3. For completeness, this is one definition of statistical bias in machine learning:
• Bias = expected value of the model's prediction − true value
34. Definitions of Fairness or Bias
1. Many, many, many definitions exist.
2. Application dependent. No one definition is better.
3. See the “21 Definitions of Fairness” tutorial by Arvind Narayanan, ACM FAT* 2018.
• Key point: dozens of definitions exist (and not just 21).
35. Setting
1. Classifier C with binary output d in {+, −} and a real-valued score s.
• Instances or data points are generally humans.
• The + class is desired and the − class is not desired.
2. Input X, including one or more sensitive/protected attributes G (e.g. gender) that are part of the input, e.g. possible values of G = {m, f}.
3. A set of instances sharing a common sensitive attribute value is privileged (receives more + labels); the other is unprivileged (receives fewer + labels).
4. True output Y.
36. 1. Fairness through Unawareness
• Simple idea: do not consider any sensitive attributes when building the model.
• Advantage: some support in the law (disparate treatment)?
• Disadvantage: other attributes may be correlated with sensitive attributes (such as job history, geographical location, etc.)
37. 2. Statistical Parity Difference
• Different groups should have the same proportion (or probability) of positive and negative labels. Ideally the value below should be close to zero:
• SPD = P(d = + | G = unprivileged) − P(d = + | G = privileged)
• Advantages: legal support in the form of a rule known as the four-fifths rule. May remove historical bias.
• Disadvantages:
• Trivial classifiers, such as classifiers which randomly assign the same proportion of labels across different groups, satisfy this definition.
• A perfect classifier (d = Y) may not be allowed if ground-truth rates of labels are different across groups.
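For concreteness, a tiny sketch (toy numbers, not from the deck) of computing the statistical parity difference directly from decisions and group membership:

# Toy computation of the statistical parity difference defined above;
# a value near zero indicates parity between groups.
import numpy as np

d = np.array([1, 0, 1, 1, 0, 0, 1, 0])        # model decisions (1 = favorable)
group = np.array(["unpriv", "unpriv", "unpriv", "unpriv",
                  "priv", "priv", "priv", "priv"])

spd = d[group == "unpriv"].mean() - d[group == "priv"].mean()
print(spd)  # 0.75 - 0.25 = 0.5 in this toy example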
38. 3. Equal Opportunity Difference
• Different groups should have the same true positive rate. Ideally the value below should be close to zero:
• EOD = P(d = + | Y = +, G = unprivileged) − P(d = + | Y = +, G = privileged)
• Advantages:
• A perfect classifier is allowed.
• Disadvantages:
• May perpetuate historical biases.
• E.g. a hiring application with 100 privileged and 100 unprivileged applicants, but 40 qualified in the privileged group and only 4 in the unprivileged group.
• By hiring 20 from the privileged group and 2 from the unprivileged group (half of the qualified candidates in each), you satisfy this definition.
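A quick check of the hiring example above (toy numbers): equal true positive rates can hold even though very different numbers of people are hired from each group:

# Toy check of the hiring example: true positive rate = hired qualified / qualified.
def tpr(hired_qualified, total_qualified):
    return hired_qualified / total_qualified

tpr_priv = tpr(hired_qualified=20, total_qualified=40)    # 0.5
tpr_unpriv = tpr(hired_qualified=2, total_qualified=4)    # 0.5
print(tpr_unpriv - tpr_priv)  # equal opportunity difference = 0.0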
39. 4. False Negative Error Balance
• If the application is punitive in nature:
• Different groups should have the same false negative rates.
• Example:
• The proportion of black defendants who don’t recidivate and receive high risk scores
should be the same as
• the proportion of white defendants who don’t recidivate and receive high risk scores.
41. Impossibility Results
• Core of the debate in COMPAS.
• ProPublica: false negatives should be the same across different groups.
• Northpointe: scores should have the same meaning across groups (test fairness).
• Result: if prevalence rates (ground-truth proportion of labels across different groups) are different, and test fairness is satisfied, then false negatives will differ across groups.
Chouldechova, A., 2017. Fair prediction with disparate impact: A study of bias in recidivism prediction instruments. Big Data, 5(2), pp.153-163.
42. Tools for Measuring Bias
AI Fairness 360 (AIF360) for measuring bias: https://github.com/IBM/AIF360 (a minimal usage sketch follows below).
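A minimal sketch of measuring bias with AIF360; the toy DataFrame, the 'sex' protected attribute, and the group encodings are illustrative assumptions, not taken from the slides:

# Sketch: compute group fairness metrics on a labeled dataset with AIF360.
import pandas as pd
from aif360.datasets import BinaryLabelDataset
from aif360.metrics import BinaryLabelDatasetMetric

df = pd.DataFrame({
    "sex":   [0, 0, 0, 0, 1, 1, 1, 1],   # 0 = unprivileged, 1 = privileged
    "age":   [25, 40, 31, 52, 28, 45, 36, 60],
    "label": [0, 0, 1, 0, 1, 1, 0, 1],   # 1 = favorable outcome
})

dataset = BinaryLabelDataset(df=df, label_names=["label"],
                             protected_attribute_names=["sex"],
                             favorable_label=1, unfavorable_label=0)

metric = BinaryLabelDatasetMetric(dataset,
                                  unprivileged_groups=[{"sex": 0}],
                                  privileged_groups=[{"sex": 1}])
print(metric.statistical_parity_difference())  # P(+|unpriv) - P(+|priv)
print(metric.disparate_impact())               # same comparison as a ratio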
43. Mitigation: Removing Bias
• Mitigation can happen in three different places:
• Before the model is built, in the training data
• In the model
• After the model is built, with the predictions
45. Before the model is built
• Reweighing (roughly, at a high level); see the sketch after this list:
• Increase weights for some instances:
• unprivileged with positive labels
• privileged with negative labels
• Decrease weights for some instances:
• unprivileged with negative labels
• privileged with positive labels
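A minimal sketch of this pre-processing step using AIF360's Reweighing, assuming dataset is a BinaryLabelDataset like the one constructed in the measurement sketch above:

# Sketch: reweigh training data with AIF360 before model training.
from aif360.algorithms.preprocessing import Reweighing

rw = Reweighing(unprivileged_groups=[{"sex": 0}],
                privileged_groups=[{"sex": 1}])
dataset_transf = rw.fit_transform(dataset)

# Reweighing does not change features or labels; it only adjusts
# instance_weights (up-weighting unprivileged-positive and privileged-negative
# examples and down-weighting the opposite combinations), which a downstream
# classifier can consume via sample_weight.
print(dataset_transf.instance_weights)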
47. In the model
Adversarial learning: Zhang, B.H., Lemoine, B. and Mitchell, M., 2018, December. Mitigating unwanted biases with adversarial learning. In Proceedings of the 2018 AAAI/ACM Conference on AI, Ethics, and Society (pp. 335-340). ACM.
49. After the model is built
• Reject option classification:
• Assume the classifier outputs a probability score.
• If the classifier score is within a small band around 0.5:
• if unprivileged, then predict positive;
• if privileged, then predict negative.
[Plot: probability of the + label vs. probability of the − label for the unprivileged group, both on 0-1 axes, illustrating the reject-option band.]
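A rough sketch of the reject option rule above in plain NumPy (AIF360 also ships a RejectOptionClassification post-processor); the band width and scores are illustrative:

# Sketch: inside an uncertainty band around 0.5, favor the unprivileged group
# and disfavor the privileged group; outside the band, keep the usual threshold.
import numpy as np

def reject_option_predict(scores, is_privileged, band=0.1):
    """scores: P(+) from the classifier; is_privileged: boolean array."""
    preds = (scores >= 0.5).astype(int)
    in_band = np.abs(scores - 0.5) <= band
    preds[in_band & ~is_privileged] = 1   # give unprivileged the benefit of the doubt
    preds[in_band & is_privileged] = 0
    return preds

scores = np.array([0.45, 0.55, 0.80, 0.48])
priv = np.array([True, True, False, False])
print(reject_option_predict(scores, priv))  # [0 0 1 1]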
52. Take-home message
• Many forms of fairness and bias exist; most of them are incompatible with each other.
• Bias can be decreased with algorithms (usually with some loss in performance).