Measures and mismeasures of algorithmic fairness (Manojit Nandi)
This document discusses various measures and challenges of achieving algorithmic fairness. It begins by defining algorithmic fairness and noting it is inherently a social concept. It then covers three main types of algorithmic biases: bias in allocation, representation, and weaponization. It outlines three families of fairness measures: anti-classification, classification parity, and calibration. It notes each approach has dangers and no single definition of fairness exists. The document concludes by discussing proposed standards for documenting datasets and models to improve algorithmic transparency and accountability.
Presentation by Nick Schmidt of BLDS, LLC to the Maryland AI meetup, June 4, 2019 (https://www.meetup.com/Maryland-AI). Nick discusses ideas of fairness and how they apply to machine learning. He explores recent academic work on identifying and mitigating bias, and how his work in lending and employment can be applied to other industries. Nick explains how to measure whether an algorithm is fair and demonstrates techniques that model builders can use to ameliorate bias when it is found.
[Video available at https://sites.google.com/view/ResponsibleAITutorial]
Artificial Intelligence is increasingly being used in decisions and processes that are critical for individuals, businesses, and society, especially in areas such as hiring, lending, criminal justice, healthcare, and education. Recent ethical challenges and undesirable outcomes associated with AI systems have highlighted the need for regulations, best practices, and practical tools to help data scientists and ML developers build AI systems that are secure, privacy-preserving, transparent, explainable, fair, and accountable – to avoid unintended and potentially harmful consequences and compliance challenges.
In this tutorial, we will present an overview of responsible AI, highlighting model explainability, fairness, and privacy in AI, key regulations/laws, and techniques/tools for providing understanding around AI/ML systems. Then, we will focus on the application of explainability, fairness assessment/unfairness mitigation, and privacy techniques in industry, wherein we present practical challenges/guidelines for using such techniques effectively and lessons learned from deploying models for several web-scale machine learning and data mining applications. We will present case studies across different companies, spanning many industries and application domains. Finally, based on our experiences in industry, we will identify open problems and research directions for the AI community.
Data Con LA 2020
Description
More and more organizations are embracing AI technology by infusing it in their products and services to differentiate themselves from their competitors. AI is being utilized in some sensitive areas of human life. In this session, let's look at some of the principles governing adoption of AI in a responsible manner. Why are companies accelerating adoption of AI?
Increasingly, organizations are accelerating adoption of AI to differentiate their products and services in the market. We have seen outcomes of this digital transformation in the areas of optimizing operations, engaging customers, empowering employees, and transforming products and services.
*List some of the sensitive use cases where AI is being applied
*Why governing AI is important and what are those principles?
*How Microsoft is approaching it?
Speaker
Suresh Paulraj, Microsoft, Principal Cloud Solution Architect Data & AI
This collection of slides is meant as a starting point and tutorial for those who want to understand AI ethics, and in particular the challenges around bias and fairness. Furthermore, I have also included studies on how we as humans perceive AI's influence in our private as well as working lives.
The document summarizes key topics around fairness and bias in machine learning:
- It discusses different types of biases that can arise such as historical, representation, measurement, and aggregation biases.
- It explores how bias can be introduced and amplified at various stages of an ML system from data collection to deployment.
- Various definitions of fairness are presented, including demographic parity, equal odds, and equal opportunity.
- Methods for quantifying and mitigating bias are outlined, such as preprocessing techniques like reweighing and disparate impact removal, inprocessing approaches like prejudice removal and adversarial debiasing, and postprocessing options like tuning for equal odds.
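The reweighing technique listed above can be sketched in a few lines. This is a minimal, standalone illustration of the idea (assign each training example a weight so that the protected group and the outcome label become statistically independent in the weighted data); the toy data and function name are hypothetical, not taken from any particular toolkit:

```python
from collections import Counter

def reweigh(groups, labels):
    """Reweighing: weight each (group, label) pair by
    P(group) * P(label) / P(group, label), so under-represented
    combinations are up-weighted and over-represented ones down-weighted."""
    n = len(labels)
    p_group = Counter(groups)              # counts per protected group
    p_label = Counter(labels)              # counts per outcome label
    p_joint = Counter(zip(groups, labels)) # counts per (group, label) pair
    return [
        (p_group[g] / n) * (p_label[y] / n) / (p_joint[(g, y)] / n)
        for g, y in zip(groups, labels)
    ]

# Toy data: group "a" receives the positive label more often than group "b".
groups = ["a", "a", "a", "b", "b", "b"]
labels = [1, 1, 0, 0, 0, 1]
weights = reweigh(groups, labels)
# Rare pairs like ("a", 0) and ("b", 1) get weights above 1.
```

Training a classifier with these sample weights is the pre-processing step; the in-processing and post-processing families instead change the learning objective or the decision thresholds.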
The document discusses explainability and bias in machine learning/AI models. It covers several topics:
1. Why explainability of models is important, including for laypeople using models and potential legal needs for explanations of decisions.
2. Methods for explainability including using interpretable models directly and post-hoc explainability methods like LIME and SHAP which provide feature attributions.
3. Issues with bias in machine learning models and different definitions of fairness. It also discusses techniques for measuring and mitigating bias, such as reweighting data or using adversarial learning.
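The feature-attribution idea behind post-hoc methods like LIME and SHAP can be illustrated with a much cruder occlusion probe: score each feature by how much the prediction changes when that feature is replaced with a baseline value. This sketch is only an illustration of the concept, not the LIME or SHAP algorithms themselves, and the model here is hypothetical:

```python
def occlusion_attributions(predict, x, baseline):
    """Toy feature attribution: the score for feature i is the drop in the
    model's output when feature i is replaced with its baseline value.
    LIME and SHAP are far more principled, but share this basic goal of
    explaining one prediction via per-feature contributions."""
    base_pred = predict(x)
    scores = []
    for i in range(len(x)):
        perturbed = list(x)
        perturbed[i] = baseline[i]  # "remove" feature i
        scores.append(base_pred - predict(perturbed))
    return scores

# Hypothetical linear scorer: score = 2 * education + 1 * hours_worked.
predict = lambda x: 2 * x[0] + 1 * x[1]
attributions = occlusion_attributions(predict, x=[4, 10], baseline=[0, 0])
# For a linear model this recovers each feature's contribution: [8, 10].
```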
The document discusses responsible AI and the need for proper guidelines to ensure user trust and privacy as AI becomes more ubiquitous. It notes that while AI demonstrates potential for economic and social value, there are also risks that must be balanced. There is currently a lack of clear and consistent responsible AI benchmarks, standards, and compliance. However, awareness of ethics is growing as seen in the increasing number of frameworks and principles from different organizations. The document outlines India's national strategy for AI developed by NITI Aayog and NASSCOM's responsible AI toolkit which provides guidance and tools to help businesses leverage AI responsibly.
An algorithm is considered fair if its results and performance are independent of sensitive variables like gender, ethnicity, etc. Fairness can be introduced at different stages of model development, such as in data collection, preparation, and model selection. Techniques for identifying and mitigating bias include causal reasoning, explainability, fairness metrics, and counterfactuals. Counterfactual fairness evaluates predictions across different protected attribute values while holding other variables constant. Explainability helps ensure models make decisions for the right reasons. Overall fairness aims to achieve equal outcomes or opportunities across groups.
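The counterfactual check described above (evaluate predictions across protected attribute values while holding other variables constant) can be written as a simple probe. This is a sketch with hypothetical models; a full causal treatment of counterfactual fairness would also adjust features that are downstream of the protected attribute, which this probe deliberately ignores:

```python
def counterfactual_gap(predict, x, protected_index, values):
    """Hold all other features fixed, swap the protected attribute across
    its possible values, and report the spread of the model's predictions.
    A large gap suggests the prediction depends on the protected attribute."""
    preds = []
    for v in values:
        cf = list(x)
        cf[protected_index] = v  # counterfactual instance
        preds.append(predict(cf))
    return max(preds) - min(preds)

# Hypothetical scorers: one leaks the protected attribute (feature 0), one does not.
biased = lambda x: 0.3 * x[0] + 0.5 * x[1]
fair = lambda x: 0.5 * x[1]
gap_biased = counterfactual_gap(biased, x=[0, 4], protected_index=0, values=[0, 1])
gap_fair = counterfactual_gap(fair, x=[0, 4], protected_index=0, values=[0, 1])
# The biased model's prediction moves when the attribute flips; the fair one's does not.
```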
Invited talk on fairness in AI systems at the 2nd Workshop on Interactive Natural Language Technology for Explainable AI co-located with the International Conference on Natural Language Generation, 18/12/2020.
Responsible Data Use in AI - core tech pillars (Sofus Macskássy)
In this deck, we cover four core pillars of responsible data use in AI: fairness, transparency, explainability, and data governance.
Explainable AI is not yet Understandable AI (epsilon_tud)
Keynote of Dr. Nava Tintarev at RCIS'2020. Decision-making at individual, business, and societal levels is influenced by online content. Filtering and ranking algorithms such as those used in recommender systems are used to support these decisions. However, it is often not clear to a user whether the advice given is suitable to be followed, e.g., whether it is correct, whether the right information was taken into account, or if the user's best interests were taken into consideration. In other words, there is a large mismatch between the representation of the advice by the system versus the representation assumed by its users. This talk addresses why we (might) want to develop advice-giving systems that can explain themselves, and how we can assess whether we are successful in this endeavor. This talk will also describe some of the state-of-the-art in explanations in a number of domains (music, tweets, and news articles) that help link the mental models of systems and people. However, it is not enough to generate rich and complex explanations; more is required in order to understand and be understood. This entails, among other factors, decisions around which information to select to show to people, and how to present that information, often depending on the target users and contextual factors.
While AI may take some menial jobs, it is unlikely to dominate upper-level or skilled blue-collar positions held by humans. Many laws have been enacted to protect citizen privacy from AI by requiring explicit consent for personal data collection and storage of data on devices rather than the cloud. As AI systems grow more sophisticated and potentially develop their own languages, the debate around according rights to intelligent robots may become a future issue requiring discussion.
This document discusses the importance of context and connections for developing responsible artificial intelligence (AI) systems. It provides examples of how a lack of context has led to issues with biased, unexplainable, or inappropriate AI applications. The document argues that graph databases and knowledge graphs can help address these issues by providing AI systems with more robust contextual data and understanding of relationships. It highlights several companies and use cases that are leveraging graph technologies to develop more accurate, fair, and transparent AI.
Artificial Intelligence Machine Learning Deep Learning PPT PowerPoint Present... (SlideTeam)
This PPT is for mid-level managers, giving information about Artificial Intelligence (AI), Machine Learning (ML), Deep Learning (DL), supervised machine learning, unsupervised machine learning, and reinforcement learning. You can also learn the difference between artificial intelligence and machine learning, and how to decide which of AI, ML, or DL will be better for your business. You will also get to know about expert systems: their examples, characteristics, components, etc. https://bit.ly/2ApMbXB
Explore how different industries are embracing the utility of AI to create and deliver new value for their customers and organisation
* Discuss the state of maturity of AI across industries
* Get an appreciation of business posture to AI projects
We also review the utility of AI across several industries including:
* Healthcare
* Newsroom & Journalism
* Travel
* Finance
* Supply Chain / eCommerce / Retail
* Streaming & Gaming
* Transportation
* Logistics
* Manufacturing
* Agriculture
* Defense & Cybersecurity
Part of the What Matters in AI series as published on www.andremuscat.com
This document discusses fairness in artificial intelligence and machine learning. It begins by noting that AI can encode and amplify human biases, leading to unfair outcomes at scale. It then discusses different ways to measure fairness, such as demographic parity and equality of opportunity. The document presents an example of predicting income using census data and shows how the initial model is unfair, with low probabilities for certain groups. It explores potential sources of bias in systems and methods for enforcing fairness, such as adversarial training to iteratively train a classifier and adversarial model. The document emphasizes that fairness is complex with many approaches and no single solution, requiring active work to avoid unfair outcomes.
Today, I will be presenting on the topic of
"Generative AI, responsible innovation, and the law."
Artificial Intelligence has been making rapid strides in recent years,
and its applications are becoming increasingly diverse.
Generative AI, in particular, has emerged as a promising area of innovation, with the potential to create highly realistic and compelling outputs.
This document discusses bias in artificial intelligence and algorithms. It begins with an introduction to the topic and why it is important. It then explores how to detect bias through various fairness metrics and how to mitigate bias through preprocessing, inprocessing, and postprocessing techniques. The document provides examples of different sources of bias and strategies to address them. It also recommends resources like the AI Fairness 360 toolkit to help evaluate models for fairness and identify potential biases.
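One of the fairness metrics a toolkit like AI Fairness 360 reports is the disparate impact ratio, which is simple enough to compute standalone. The sketch below is a minimal, hypothetical version (the data is toy data, and the function is not the AIF360 API):

```python
def disparate_impact(decisions, groups, unprivileged, privileged):
    """Disparate impact ratio: selection rate of the unprivileged group
    divided by that of the privileged group. Values below 0.8 are often
    flagged under the informal 'four-fifths rule'."""
    def rate(g):
        sel = [d for d, grp in zip(decisions, groups) if grp == g]
        return sum(sel) / len(sel)
    return rate(unprivileged) / rate(privileged)

# Toy binary decisions (1 = favorable outcome) for two groups.
decisions = [1, 0, 0, 0, 1, 1, 1, 0]
groups    = ["b", "b", "b", "b", "a", "a", "a", "a"]
ratio = disparate_impact(decisions, groups, unprivileged="b", privileged="a")
# Group "b" is selected 25% of the time vs 75% for "a": ratio 1/3, well below 0.8.
```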
This document provides an overview of artificial intelligence and machine learning. It begins by defining AI as computer systems that can perform tasks autonomously and adaptively. Machine learning is described as getting computers to learn without being explicitly programmed. Examples of machine learning in daily life are discussed. The basics of supervised and unsupervised learning are explained. Ethical issues around AI like bias, fairness, and determining appropriate use are then discussed. Options for addressing these issues like ensuring diversity of data and viewpoints are presented. The document concludes by providing recommendations for further learning.
AI and ML Series - Introduction to Generative AI and LLMs - Session 1 (DianaGray10)
Session 1
👉This first session will cover an introduction to Generative AI & harnessing the power of large language models. The following topics will be discussed:
Introduction to Generative AI & harnessing the power of large language models.
What’s generative AI & what’s LLM.
How are we using it in our document understanding & communication mining models?
How to develop a trustworthy and unbiased AI model using LLM & GenAI.
Personal Intelligent Assistant
Speakers:
📌George Roth - AI Evangelist at UiPath
📌Sharon Palawandram - Senior Machine Learning Consultant @ Ashling Partners & UiPath MVP
📌Russel Alfeche - Technology Leader RPA @qBotica & UiPath MVP
The document discusses the importance of context for developing responsible artificial intelligence (AI) systems. It provides examples of AI systems that lacked proper context and oversight, which led to harmful or inappropriate behaviors. Specifically, it discusses how graphs can help address these issues by providing AI with more contextual data and connections to learn from. This allows for more accurate, fair, explainable and trustworthy AI solutions. The document advocates for incorporating adjacent information as context for AI using knowledge graphs, which will help drive reliable AI and become a standard approach.
Artificial Intelligence Introduction & Business use cases (Vikas Jain)
This document discusses artificial intelligence and the fourth industrial revolution. It provides background on AI, including its history and increasing importance due to lower hardware costs, availability of data, and improved algorithms. It describes different types of AI and discusses how AI is being applied in various industries like customer service, retail, e-commerce, warehousing, healthcare, agriculture, and finance. It also addresses some of the threats, ethics, and vocabulary related to AI.
AI Vs ML Vs DL PowerPoint Presentation Slide Templates Complete Deck (SlideTeam)
AI Vs ML Vs DL PowerPoint Presentation Slide Templates Complete Deck is loaded with easy-to-follow content, and intuitive design. Introduce the types and levels of artificial intelligence using the highly-effective visuals featured in this PPT slide deck. Showcase the AI-subfield of machine learning, as well as deep learning through our comprehensive PowerPoint theme. Represent the differences, and interrelationship between AI, ML, and DL. Elaborate on the scope and use case of machine intelligence in healthcare, HR, banking, supply chain, or any other industry. Take advantage of the infographic-style layout to describe why AI is flourishing in today’s day and age. Elucidate AI trends such as robotic process automation, advanced cybersecurity, AI-powered chatbots, and more. Cover all the essentials of machine learning and deep learning with the help of this PPT slideshow. Outline the application, algorithms, use cases, significance, and selection criteria for machine learning. Highlight the deep learning process, types, limitations, and significance. Describe reinforcement training, neural network classifications, and a lot more. Hit download and begin personalization. Our AI Vs ML Vs DL PowerPoint Presentation Slide Templates Complete Deck are topically designed to provide an attractive backdrop to any subject. Use them to look like a presentation pro. https://bit.ly/3ngJCKf
Artificial Intelligence is increasingly playing an integral role in determining our day-to-day experiences. Moreover, with proliferation of AI based solutions in areas such as hiring, lending, criminal justice, healthcare, and education, the resulting personal and professional implications of AI are far-reaching. The dominant role played by AI models in these domains has led to a growing concern regarding potential bias in these models, and a demand for model transparency and interpretability. In addition, model explainability is a prerequisite for building trust and adoption of AI systems in high stakes domains requiring reliability and safety such as healthcare and automated transportation, and critical industrial applications with significant economic implications such as predictive maintenance, exploration of natural resources, and climate change modeling.
As a consequence, AI researchers and practitioners have focused their attention on explainable AI to help them better trust and understand models at scale. The challenges for the research community include (i) defining model explainability, (ii) formulating explainability tasks for understanding model behavior and developing solutions for these tasks, and finally (iii) designing measures for evaluating the performance of models in explainability tasks.
In this tutorial, we present an overview of model interpretability and explainability in AI, key regulations / laws, and techniques / tools for providing explainability as part of AI/ML systems. Then, we focus on the application of explainability techniques in industry, wherein we present practical challenges / guidelines for effectively using explainability techniques and lessons learned from deploying explainable models for several web-scale machine learning and data mining applications. We present case studies across different companies, spanning application domains such as search & recommendation systems, sales, lending, and fraud detection. Finally, based on our experiences in industry, we identify open problems and research directions for the data mining / machine learning community.
Shift AI 2020: How to identify and treat biases in ML Models | Navdeep Sharma... (Shift Conference)
Shift AI was a success, connecting hundreds of professionals that were eager to propel the progress of AI and discuss the newest technologies in data mining, machine learning and neural networks. More at https://ai.shiftconf.co/.
Talk description:
With all the breakthroughs in the machine learning space, ML models are now being used to make decisions affecting the lives of humans, more than ever. Hence the quality of a model can no longer be judged by accuracy, precision, and recall alone. It's important to ensure that each individual and group of people is treated with equality, free of any historical bias that existed in the data. This talk focuses on some of the many potential ways to establish fairness metrics for ML models in your organization, along with the learnings and challenges I encountered while building a fairness tool for data scientists and business stakeholders.
Demo: Algorithmic Fairness Tool (AFT) was an innovation project, done at Accenture The Dock, which focused on bringing the latest research from academia and building a tool for the industry.
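The "fairness as metrics" idea in the talk above amounts to reporting group-level gaps alongside accuracy. A minimal sketch of two common gaps, demographic parity (selection-rate difference) and equal opportunity (true-positive-rate difference), on hypothetical data:

```python
def group_rates(y_true, y_pred, groups, group):
    """Selection rate and true-positive rate for one protected group."""
    idx = [i for i, g in enumerate(groups) if g == group]
    sel = sum(y_pred[i] for i in idx) / len(idx)
    pos = [i for i in idx if y_true[i] == 1]
    tpr = sum(y_pred[i] for i in pos) / len(pos)
    return sel, tpr

def fairness_gaps(y_true, y_pred, groups, g1, g2):
    """Demographic-parity gap and equal-opportunity gap between two groups."""
    sel1, tpr1 = group_rates(y_true, y_pred, groups, g1)
    sel2, tpr2 = group_rates(y_true, y_pred, groups, g2)
    return abs(sel1 - sel2), abs(tpr1 - tpr2)

# Toy labels and model predictions for two groups.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
dp_gap, eo_gap = fairness_gaps(y_true, y_pred, groups, "a", "b")
# Group "a" is selected at rate 0.75 vs 0.25 for "b", and qualified members
# of "a" are approved at TPR 1.0 vs 0.5 for "b": both gaps are 0.5.
```

A dashboard like the fairness tool described in the talk would surface such gaps next to the usual accuracy metrics so business stakeholders can weigh both.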
In AI We Trust? Ethics & Bias in Machine Learning
Hosted by Seven Peaks Ventures and Fast Forward Labs
September 2016
Decision-making about critical life stages (college admissions, creditworthiness, employability, jail sentencing) is rapidly becoming centralized and predicted by automated systems like machine learning models. As Data Scientists, the creators of those models, how do we take responsibility for those decisions? How do we define our goals, and how do we measure the effect? As this presents a unique opportunity and risk to businesses, we all become invested in the answers to these questions. Here, I focus on the tactical elements of measuring fairness as well as the forward-looking concerns and opportunities this paradigmatic change presents.
https://www.eventbrite.com/e/in-ai-we-trust-ethics-bias-in-machine-learning-tickets-26313436196#
An algorithm is considered fair if its results and performance are independent of sensitive variables like gender, ethnicity, etc. Fairness can be introduced at different stages of model development, such as in data collection, preparation, and model selection. Techniques for identifying and mitigating bias include causal reasoning, explainability, fairness metrics, and counterfactuals. Counterfactual fairness evaluates predictions across different protected attribute values while holding other variables constant. Explainability helps ensure models make decisions for the right reasons. Overall fairness aims to achieve equal outcomes or opportunities across groups.
Invited talk on fairness in AI systems at the 2nd Workshop on Interactive Natural Language Technology for Explainable AI co-located with the International Conference on Natural Language Generation, 18/12/2020.
Responsible Data Use in AI - core tech pillarsSofus Macskássy
In this deck, we cover four core pillars of responsible data use in AI, including fairness, transparency, explainability -- as well as data governance.
Explainable AI is not yet Understandable AIepsilon_tud
Keynote of Dr. Nava Tintarev at RCIS'2020. Decision-making at individual, business, and societal levels is influenced by online content. Filtering and ranking algorithms such as those used in recommender systems are used to support these decisions. However, it is often not clear to a user whether the advice given is suitable to be followed, e.g., whether it is correct, whether the right information was taken into account, or if the user’s best interests were taken into consideration. In other words, there is a large mismatch between the representation of the advice by the system versus the representation assumed by its users. This talk addresses why we (might) want to develop advice-giving systems that can explain themselves, and how we can assess whether we are successful in this endeavor. This talk will also describe some of the state-of-the-art in explanations in a number of domains (music, tweets, and news articles) that help link the mental models of systems and people. However, it is not enough to generate rich and complex explanations; more is required in order to understand and be understood. This entails among other factors decisions around which information to select to show to people, and how to present that information, often depending on the target users and contextual factors
While AI may take some menial jobs, it is unlikely to dominate upper-level or skilled blue-collar positions held by humans. Many laws have been enacted to protect citizen privacy from AI by requiring explicit consent for personal data collection and storage of data on devices rather than the cloud. As AI systems grow more sophisticated and potentially develop their own languages, the debate around according rights to intelligent robots may become a future issue requiring discussion.
This document discusses the importance of context and connections for developing responsible artificial intelligence (AI) systems. It provides examples of how a lack of context has led to issues with biased, unexplainable, or inappropriate AI applications. The document argues that graph databases and knowledge graphs can help address these issues by providing AI systems with more robust contextual data and understanding of relationships. It highlights several companies and use cases that are leveraging graph technologies to develop more accurate, fair, and transparent AI.
Artificial Intelligence Machine Learning Deep Learning PPT PowerPoint Present...SlideTeam
This PPT is for the mid level managers giving information about AI Artificial Intelligence, Machine Learning ML, Deep Learning DL, Supervised Machine Learning, Unsupervised Machine Learning, Reinforcement Learning. You can also learn the difference between Artificial Intelligence and Machine Learning and deciding which out of AI or DL or ML will be better for your business. You will also get to know about the Expert System, its examples, characteristics, components, etc. https://bit.ly/2ApMbXB
Explore how different industries are embracing the utility of AI to create and deliver new value for their customers and organisation
* Discuss the state of maturity of AI across industries
* Get an appreciation of business posture to AI projects
We also review the utility of AI across several industries including:
* Healthcare
* Newsroom & Journalism
* Travel
* Finance
* Supply Chain / eCommerce / Retail
* Streaming & Gaming
* Transportation
* Logistics
* Manufacturing
* Agriculture
* Defense & Cybersecurity
Part of the What Matters in AI series as published on www.andremuscat.com
This document discusses fairness in artificial intelligence and machine learning. It begins by noting that AI can encode and amplify human biases, leading to unfair outcomes at scale. It then discusses different ways to measure fairness, such as demographic parity and equality of opportunity. The document presents an example of predicting income using census data and shows how the initial model is unfair, with low probabilities for certain groups. It explores potential sources of bias in systems and methods for enforcing fairness, such as adversarial training to iteratively train a classifier and adversarial model. The document emphasizes that fairness is complex with many approaches and no single solution, requiring active work to avoid unfair outcomes.
Today, I will be presenting on the topic of
"Generative AI, responsible innovation, and the law."
Artificial Intelligence has been making rapid strides in recent years,
and its applications are becoming increasingly diverse.
Generative AI, in particular, has emerged as a promising area of innovation, with the potential to create highly realistic and compelling outputs.
This document discusses bias in artificial intelligence and algorithms. It begins with an introduction to the topic and why it is important. It then explores how to detect bias through various fairness metrics and how to mitigate bias through preprocessing, inprocessing, and postprocessing techniques. The document provides examples of different sources of bias and strategies to address them. It also recommends resources like the AI Fairness 360 toolkit to help evaluate models for fairness and identify potential biases.
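One of the preprocessing techniques alluded to above is reweighing, which assigns each training example a weight so that group membership becomes statistically independent of the label; toolkits such as AI Fairness 360 ship an implementation. A minimal self-contained sketch of the idea, on hypothetical toy data:

```python
from collections import Counter

def reweighing_weights(groups, labels):
    """Per-example weights making group membership independent of the
    label: weight(g, y) = P(g) * P(y) / P(g, y), estimated from counts."""
    n = len(labels)
    g_count = Counter(groups)
    y_count = Counter(labels)
    gy_count = Counter(zip(groups, labels))
    return [
        (g_count[g] * y_count[y]) / (n * gy_count[(g, y)])
        for g, y in zip(groups, labels)
    ]

# Toy data: group "a" has a higher positive rate than group "b".
groups = ["a", "a", "a", "b", "b", "b"]
labels = [1, 1, 0, 1, 0, 0]
weights = reweighing_weights(groups, labels)
```

Training with these sample weights equalizes the weighted positive rate across groups before any model is fit, which is why it counts as a preprocessing (rather than inprocessing or postprocessing) technique.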
This document provides an overview of artificial intelligence and machine learning. It begins by defining AI as computer systems that can perform tasks autonomously and adaptively. Machine learning is described as getting computers to learn without being explicitly programmed. Examples of machine learning in daily life are discussed. The basics of supervised and unsupervised learning are explained. Ethical issues around AI like bias, fairness, and determining appropriate use are then discussed. Options for addressing these issues like ensuring diversity of data and viewpoints are presented. The document concludes by providing recommendations for further learning.
AI and ML Series - Introduction to Generative AI and LLMs - Session 1 (DianaGray10)
Session 1
👉This first session will cover an introduction to Generative AI & harnessing the power of large language models. The following topics will be discussed:
Introduction to Generative AI & harnessing the power of large language models.
What’s generative AI & what’s LLM.
How are we using it in our document understanding & communication mining models?
How to develop a trustworthy and unbiased AI model using LLM & GenAI.
Personal Intelligent Assistant
Speakers:
📌George Roth - AI Evangelist at UiPath
📌Sharon Palawandram - Senior Machine Learning Consultant @ Ashling Partners & UiPath MVP
📌Russel Alfeche - Technology Leader RPA @qBotica & UiPath MVP
The document discusses the importance of context for developing responsible artificial intelligence (AI) systems. It provides examples of AI systems that lacked proper context and oversight, which led to harmful or inappropriate behaviors. Specifically, it discusses how graphs can help address these issues by providing AI with more contextual data and connections to learn from. This allows for more accurate, fair, explainable and trustworthy AI solutions. The document advocates for incorporating adjacent information as context for AI using knowledge graphs, which will help drive reliable AI and become a standard approach.
Artificial Intelligence Introduction & Business use cases (Vikas Jain)
This document discusses artificial intelligence and the fourth industrial revolution. It provides background on AI, including its history and increasing importance due to lower hardware costs, availability of data, and improved algorithms. It describes different types of AI and discusses how AI is being applied in various industries like customer service, retail, e-commerce, warehousing, healthcare, agriculture, and finance. It also addresses some of the threats, ethics, and vocabulary related to AI.
Artificial Intelligence is increasingly playing an integral role in determining our day-to-day experiences. Moreover, with proliferation of AI based solutions in areas such as hiring, lending, criminal justice, healthcare, and education, the resulting personal and professional implications of AI are far-reaching. The dominant role played by AI models in these domains has led to a growing concern regarding potential bias in these models, and a demand for model transparency and interpretability. In addition, model explainability is a prerequisite for building trust and adoption of AI systems in high stakes domains requiring reliability and safety such as healthcare and automated transportation, and critical industrial applications with significant economic implications such as predictive maintenance, exploration of natural resources, and climate change modeling.
As a consequence, AI researchers and practitioners have focused their attention on explainable AI to help them better trust and understand models at scale. The challenges for the research community include (i) defining model explainability, (ii) formulating explainability tasks for understanding model behavior and developing solutions for these tasks, and finally (iii) designing measures for evaluating the performance of models in explainability tasks.
In this tutorial, we present an overview of model interpretability and explainability in AI, key regulations / laws, and techniques / tools for providing explainability as part of AI/ML systems. Then, we focus on the application of explainability techniques in industry, wherein we present practical challenges / guidelines for effectively using explainability techniques and lessons learned from deploying explainable models for several web-scale machine learning and data mining applications. We present case studies across different companies, spanning application domains such as search & recommendation systems, sales, lending, and fraud detection. Finally, based on our experiences in industry, we identify open problems and research directions for the data mining / machine learning community.
Shift AI 2020: How to identify and treat biases in ML Models | Navdeep Sharma... (Shift Conference)
Shift AI was a success, connecting hundreds of professionals that were eager to propel the progress of AI and discuss the newest technologies in data mining, machine learning and neural networks. More at https://ai.shiftconf.co/.
Talk description:
With all the breakthroughs in the Machine Learning space, ML models are being used to make decisions affecting human lives more than ever. The quality of a model can therefore no longer be judged by accuracy, precision, and recall alone; it is important to verify that every individual and group is treated equitably, free of the historical bias present in the data. This talk focuses on some of the many potential ways to establish fairness metrics for ML models in your organization, along with the lessons and challenges I encountered while building a fairness tool for data scientists and business stakeholders.
Demo: The Algorithmic Fairness Tool (AFT) was an innovation project at Accenture The Dock, focused on bringing the latest research from academia into a tool for industry.
In AI We Trust? Ethics & Bias in Machine Learning
Hosted by Seven Peaks Ventures and Fast Forward Labs
September 2016
Decision-making about critical life stages (college admissions, creditworthiness, employability, jail sentencing) is rapidly being centralized and delegated to automated systems such as machine learning models. As Data Scientists, the creators of those models, how do we take responsibility for those decisions? How do we define our goals, and how do we measure the effect? As this presents a unique opportunity and risk to businesses, we all become invested in the answers to these questions. Here, I focus on the tactical elements of measuring fairness as well as the forward-looking concerns and opportunities this paradigmatic change presents.
https://www.eventbrite.com/e/in-ai-we-trust-ethics-bias-in-machine-learning-tickets-26313436196#
Ethical Issues in Artificial Intelligence: Examining Bias and Discrimination (TechCyber Vision)
The document discusses several key issues regarding ensuring ethical and unbiased artificial intelligence (AI), including:
1. AI systems can unintentionally learn and perpetuate biases from historical data, resulting in discriminatory outcomes. Addressing bias requires attention to diverse and representative datasets, identification and removal of biases in data, and fairness metrics in algorithm design.
2. Governance frameworks and regulations are needed to establish ethical principles, promote transparency, accountability and privacy, require impact assessments and audits, and mandate algorithmic explainability. International collaboration is important for consistent standards.
3. Mitigating discrimination involves defining fairness metrics, addressing biases in training data, regular evaluation, stakeholder involvement, transparency, and continuous improvement of deployed models.
Artificial Intelligence and Ethics from MS It seems.docx (4934bk)
Artificial intelligence systems show promise but also raise ethical concerns about discrimination. Researchers have found examples where face recognition algorithms more accurately identify white males than black women. If algorithms are used by law enforcement or doctors, the consequences of such biases could seriously impact people. Developers must remove bias from AI by ensuring training data is vast, diverse, and reflective of the population to avoid systems that discriminate against certain groups. The advantages of AI must be weighed against the risks of uncorrected biases.
This document summarizes a webinar on data ethics when designing civil justice interventions. It provides an agenda for the webinar which includes introductions, a discussion of how machines can learn to discriminate with Solon Barocas speaking, and a discussion of digital decision making with Ali Lange speaking. It also includes information about the speakers and a question period. Key topics discussed are how big data can unintentionally reinforce biases and disparities if human oversight is lacking, and the importance of considering data sensitivities, consumer protection laws, and empowering clients when using big data.
How do we protect privacy of users when building large-scale AI based systems? How do we develop machine learned models and systems taking fairness, accountability, and transparency into account? With the ongoing explosive growth of AI/ML models and systems, these are some of the ethical, legal, and technical challenges encountered by researchers and practitioners alike. In this talk, we will first motivate the need for adopting a "fairness and privacy by design" approach when developing AI/ML models and systems for different consumer and enterprise applications. We will then focus on the application of fairness-aware machine learning and privacy-preserving data mining techniques in practice, by presenting case studies spanning different LinkedIn applications (such as fairness-aware talent search ranking, privacy-preserving analytics, and LinkedIn Salary privacy & security design), and conclude with the key takeaways and open challenges.
Responsible AI in Industry: Practical Challenges and Lessons Learned (Krishnaram Kenthapadi)
How do we develop machine learning models and systems taking fairness, accuracy, explainability, and transparency into account? How do we protect the privacy of users when building large-scale AI based systems? Model fairness and explainability and protection of user privacy are considered prerequisites for building trust and adoption of AI systems in high stakes domains such as hiring, lending, and healthcare. We will first motivate the need for adopting a “fairness, explainability, and privacy by design” approach when developing AI/ML models and systems for different consumer and enterprise applications from the societal, regulatory, customer, end-user, and model developer perspectives. We will then focus on the application of responsible AI techniques in practice through industry case studies. We will discuss the sociotechnical dimensions and practical challenges, and conclude with the key takeaways and open challenges.
The document discusses several topics related to artificial intelligence including the ethics of AI, types of bias in AI systems, criteria for fairness in AI, and model cards. It describes the ethics of AI as concerning both the moral behavior of humans designing AI systems and the behavior of machines. It outlines six types of bias that can occur in AI systems including historical, representation, measurement, aggregation, evaluation, and development bias. It discusses criteria for fairness in AI such as demographic parity, equal opportunity, and equal accuracy. It also provides an overview of what model cards are and the types of information they can include such as intended use, factors, metrics, evaluation data, training data, quantitative analyses, and ethical considerations.
This document discusses responsible artificial intelligence. It begins by showing survey results that most consumers want organizations to be accountable for misusing AI and want privacy protected. It then defines responsible AI as evaluating, developing, and implementing AI safely, reliably, and ethically. The main principles discussed are privacy using differential privacy, fairness by mitigating unfair impacts on groups, and transparency through explainable AI tools. General recommendations are given such as clarifying a system's purpose, considering social biases, and encouraging feedback. Benefits of responsible AI include minimizing unintentional bias and ensuring transparency.
This document outlines several ethical considerations for developing and using artificial intelligence (AI), including where data comes from, potential environmental impacts, privacy concerns, accountability, responsibility over the AI's use, avoiding anthropomorphism, ensuring trustworthy and explainable outputs, using high-quality data, avoiding disproportionate harms or benefits, transparency, and independent auditing. Key points addressed include informed consent in data collection, reducing energy usage, privacy protections, clear accountability, democratic input on AI governance, avoiding emotional manipulation, accuracy of outputs, explainability, addressing biases, equitable access, transparency on limitations and risks, and verifying developers' claims.
1. Human-centered design is an approach to designing AI systems that focuses on understanding and serving people's needs. It involves understanding needs, considering potential harms, prototyping without AI first, and allowing people to challenge systems.
2. Bias in AI can discriminate and harm individuals. Data bias like underrepresentation and flawed data can result in models that fail to perform well for all groups.
3. Model cards are proposed as a way to increase transparency of AI systems by communicating key information about trained models to broad audiences.
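A model card can be as lightweight as a structured record. The sketch below is a hypothetical, minimal rendering of the sections commonly proposed for model cards (intended use, factors, metrics, ethical considerations); the field names and example values are illustrative, not a standard schema.

```python
from dataclasses import dataclass, field

@dataclass
class ModelCard:
    """Minimal, illustrative model-card record; not a standard schema."""
    model_name: str
    intended_use: str
    factors: list = field(default_factory=list)   # groups evaluated across
    metrics: dict = field(default_factory=dict)   # metric name -> per-group results
    ethical_considerations: str = ""

# Hypothetical card for an income classifier like the census example above.
card = ModelCard(
    model_name="income-classifier-v1",
    intended_use="Research demo only; not for lending or hiring decisions.",
    factors=["sex", "age_bucket"],
    metrics={"accuracy": {"overall": 0.84, "sex=female": 0.81, "sex=male": 0.86}},
    ethical_considerations="Trained on census data with known historical skews.",
)
```

Reporting metrics per group, rather than a single overall number, is the core transparency step the model-card proposal asks for.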
Responsible AI in Industry (Tutorials at AAAI 2021, FAccT 2021, and WWW 2021) (Krishnaram Kenthapadi)
This document provides an overview of explainable AI techniques. It discusses how explainable AI aims to make AI models more transparent and understandable by providing explanations for their predictions. Various explanation methods are covered, including model-specific techniques like interpreting gradients in neural networks, as well as model-agnostic approaches like Shapley values from game theory. The document explains how explanations are important for building user trust in AI systems and can help with debugging, analyzing robustness, and extracting rules from complex models.
There has been very little analysis of big data ethics from an Ignatian or Catholic Social Thought point of view. The Jesuit tradition, with its focus on persons and community, as well as the CST tradition, certainly provides some direction for navigating these difficult Big Data questions.
This document discusses algorithmic fairness and the impacts of machine learning and AI systems on society. It provides an outline of topics to be covered, including sources of algorithmic bias, examination of key research papers in the field, and sketching out a different direction for the discussion. The document reviews several influential papers that proposed definitions of fairness for algorithms and techniques for achieving fairness, but also showed limitations and tradeoffs. It discusses how combining the results of these papers suggests that achieving perfect algorithmic fairness through technical means alone may not be possible. The document argues for taking a broader view of the problem that considers social and normative issues, in addition to technical approaches.
The Ethical Considerations of AI in Retail: Bias, Transparency, and User Priv... (tamizhias2003)
Good Systems: Policy & Ethical Implications of AI slide presentation at ASPA 202 Digital conference
Algorithmic Bias: Challenges and Opportunities for AI in Healthcare (Gregory Nelson)
Gregory S. Nelson, VP, Analytics and Strategy – Vidant Health | Adjunct Faculty Duke University
The promise of AI is quickly becoming a reality for a number of industries, including healthcare. For example, we have seen early successes in augmenting clinical intelligence for diagnostic imaging and in the early detection of pneumonia and sepsis. But what happens when the algorithms are biased? In this presentation, we will outline a framework for AI governance and discuss ways in which we can address algorithmic bias in machine learning.
Objective 1: Illustrate the issues of bias in AI through examples specific to healthcare.
Objective 2: Summarize the growing body of work in the legal, regulatory, and ethical oversight of AI models and the implications for healthcare.
Objective 3: Outline steps that we can take to establish an AI governance strategy for our organizations.
ARTIFICIAL INTELLIGENCE AND ETHICS 29.pptx (AmalaPaulson)
This document discusses human-centered design approaches to AI, potential harms of machine learning systems, and ways to address bias and promote fairness. It outlines six types of bias that can occur, including historical, representation, measurement, aggregation, evaluation and deployment bias. It also describes four common fairness criteria: demographic parity, equal opportunity, equal accuracy, and group unawareness. The document explains that model cards can be used to promote transparency by documenting key details about machine learning models, including intended uses, technical information, and evaluations of model performance and limitations.
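Of the four criteria listed, equal accuracy is perhaps the simplest to check directly: compute accuracy separately per group and compare. A minimal sketch on hypothetical toy data (group unawareness, by contrast, is a property of the inputs: the sensitive attribute is simply excluded from the features):

```python
def group_accuracy(preds, labels, groups, g):
    """Accuracy computed only over members of group g; 'equal accuracy'
    asks that this value be similar across groups."""
    pairs = [(p, y) for p, y, grp in zip(preds, labels, groups) if grp == g]
    return sum(p == y for p, y in pairs) / len(pairs)

# Toy data: the model is noticeably less accurate on group "a".
preds  = [1, 0, 1, 0, 1, 1]
labels = [1, 0, 0, 0, 1, 1]
groups = ["a", "a", "a", "b", "b", "b"]
acc_a = group_accuracy(preds, labels, groups, "a")
acc_b = group_accuracy(preds, labels, groups, "b")
```

A large gap between the two values is exactly the kind of per-group evaluation result a model card would report.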
4. Ideally, we would use statistics that cleanly separate categories, but overlapping categories are the norm
Fairness & Ethics in AI
Credit score: higher scores represent a higher likelihood of payback
◦ A single statistic can stand in for many different variables, boiling them down to one number
◦ A credit score, computed from a number of factors (e.g., income, promptness in paying debts), attempts to represent the likelihood of a person paying off a loan or defaulting
◦ Can a credit score really generalize real-world cases to one statistical number?
◦ The relationship is usually fuzzy; it is rare to find a statistic that correlates perfectly with real-world outcomes
[Figure: score distributions in "An Ideal World" vs. "The Reality"]
https://research.google.com/bigpicture/attacking-discrimination-in-ml/
5. People whose credit scores are below the cut-off threshold are denied the loan; people above it are granted the loan
[Figure: a threshold classifier over the score distribution]
6. All paying applicants are granted the loan, but the number of defaulters getting the loan also increases
[Figure: threshold that maximizes customer satisfaction]
7. All defaulters are denied the loan, but a large number of paying applicants are also wrongly denied the loan
[Figure: threshold that minimizes default risks]
8. Optimal profit is attained using a threshold credit score of 54
Profit could be a driver of the threshold
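The profit-driven threshold above can be sketched as a simple scan over candidate thresholds. This is a toy illustration with synthetic scores and outcomes; the $300 gain and $700 loss figures come from the slides, everything else is assumed.

```python
def profit_at_threshold(scores, repaid, threshold, gain=300, loss=700):
    """Total profit if loans are granted only to scores at or above the threshold."""
    total = 0
    for score, did_repay in zip(scores, repaid):
        if score >= threshold:
            total += gain if did_repay else -loss
    return total

def best_threshold(scores, repaid, thresholds=range(0, 101)):
    """Scan all candidate thresholds (credit scores run 0-100) and keep the most profitable."""
    return max(thresholds, key=lambda t: profit_at_threshold(scores, repaid, t))
```

On real data the same scan, run separately per group, is what yields the group-specific thresholds discussed in the following slides.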
10. The statistic behind a score may distribute differently across various groups
▪ The issue of how the correct decision is defined, and with sensitivity to which factors, becomes particularly thorny when a statistic like a credit score ends up distributed differently between two groups
▪ Imagine we have two groups of people: blue and orange
▪ We are interested in making small loans, subject to the following rules:
◦ A successful loan makes $300
◦ An unsuccessful loan costs $700
◦ Everyone has a credit score between 0 and 100
11. The two distributions are slightly different, even though blue and orange people are equally likely to pay off a loan
21. Group Unaware, Demographic Parity, and Equal Opportunity
Group Unaware (fairness through unawareness)
▪ The group attribute is not used in the classification; all groups are held to the same, single standard
▪ Ignores real differences between groups; for example, women generally pay less for life insurance than men since they tend to live longer
▪ Differences in score distributions cause the orange group to get fewer loans if the most profitable group-unaware threshold is used
Demographic Parity (aka statistical parity or group fairness)
▪ The same fraction of each group receives the intervention, i.e., the same positive rate for each group
▪ The bank uses different loan thresholds that yield the same fraction of loans to each group
▪ Similar individuals (having similar attribute values) in different groups may be discriminated against
Equal Opportunity
▪ The same chance for the positive ones in each group: the True Positive Rate (TPR) is identical between groups
▪ For people who can pay back a loan, the same fraction in each group should actually be granted a loan
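The difference between demographic parity and equal opportunity can be made concrete in a few lines. A hedged sketch on made-up loan decisions for the blue and orange groups; all numbers are illustrative, not taken from the slides' charts:

```python
def approval_rate(approved):
    """Positive rate: fraction of the group granted a loan (demographic parity compares these)."""
    return sum(approved) / len(approved)

def tpr(approved, repays):
    """True positive rate: approval rate among applicants who would actually repay
    (equal opportunity compares these)."""
    among_payers = [a for a, r in zip(approved, repays) if r]
    return sum(among_payers) / len(among_payers)

# Toy decisions: 1 = loan granted; repays: True = would pay back.
blue_approved, blue_repays = [1, 1, 0, 0], [True, True, True, False]
orange_approved, orange_repays = [1, 0, 0, 0], [True, True, False, False]
```

Here demographic parity would compare approval rates 0.5 vs 0.25, while equal opportunity would compare TPRs of 2/3 vs 0.5; the two criteria can disagree on the same decisions.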
22. Why does fairness matter?
Regardless of one's definition of fairness, everyone wants to be treated fairly.
Ensuring fairness is a moral and ethical imperative.
23. What is fairness anyway?
There are 20+ definitions of fairness.
Some of the definitions are contradictory.
The way fairness is defined impacts bias.
24. Data + Math ≠ Objectivity
25. Given essentially any scoring system, it is possible to efficiently find thresholds that meet the chosen criteria.
In other words, even if you don't have control over the underlying scoring system (a common case), it is still possible to attack the issue of discrimination.
27. Most bias comes from the data used in classification
Gender | Spending Power | … | Number of Accidents | ZIP Code | Premium
M | 8 | … | 5 | 06230 | ?
F | 8 | … | 3 | 92100 | ?
F | 10 | … | 0 | 75300 | ?
F | 8 | … | 0 | 13500 | ?
M | 12 | … | 2 | 45000 | ?
Gender is the protected attribute, the remaining columns are non-protected attributes, and Premium is the target.
The protected attribute can induce direct bias; correlated non-protected attributes (e.g., ZIP code) can induce indirect bias; both are bias induced by sensitive information.
Other sources include sample bias, representation bias, label bias, and model bias.
28. Bias can be induced by sample representation
[Figure: oversampling adds copies of the minority class to the original dataset; undersampling keeps only samples of the majority class]
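The balancing strategy in the figure can be sketched as follows. This is a naive illustration (real pipelines typically use a library such as imbalanced-learn); undersampling is the mirror image, discarding random majority-class rows instead:

```python
import random

def split_by_class(rows, labels):
    """Group rows by their class label."""
    groups = {}
    for row, y in zip(rows, labels):
        groups.setdefault(y, []).append(row)
    return groups

def oversample(rows, labels, seed=0):
    """Duplicate random minority-class rows until both classes are the same size."""
    rng = random.Random(seed)
    (min_y, minority), (maj_y, majority) = sorted(
        split_by_class(rows, labels).items(), key=lambda kv: len(kv[1]))
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    new_rows = majority + minority + extra
    new_labels = [maj_y] * len(majority) + [min_y] * (len(minority) + len(extra))
    return new_rows, new_labels
```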
29. Bias can be induced by sample collection, e.g., using biased sources
36. Fairness Terminology
Protected Attribute: an attribute that partitions a population into groups whose outcomes should have parity (e.g., race, gender, age, and religion).
Privileged Protected Attribute: a protected attribute value indicating a group that has historically been at a systematic advantage.
Group Fairness: groups defined by protected attributes receiving similar treatments or outcomes.
Individual Fairness: similar individuals receiving similar treatments or outcomes.
Fairness Metric: a measure of unwanted bias in training data or models.
Favorable Label: a label whose value corresponds to an outcome that provides an advantage to the recipient.
37. Removing the protected attributes may not be sufficient due to the problem of proxies
▪ A common concern with AI models is that they may create proxies for protected attributes, where the complexity of the model leads to class membership being used to make decisions in a way that cannot easily be found and corrected
▪ If the attributes used in the model have a strong relationship with the protected attributes, spurious correlation or poor model building could lead to a proxy problem
▪ Measuring within-class disparities (differences in treatment that occur only for some members of a class) is much harder
39. Most of the research on bias and fairness in AI is about making sure that a system does not have a disproportionate effect on some group of users relative to other groups.
The primary focus of AI ethics is on distribution checks and similar analytics.
41. Interest in algorithmic fairness and bias has been growing recently
▪ Machine-learning-based predictive tools are increasingly used in problems that can have a drastic impact on people's lives, e.g., criminal justice, education, public health, workforce development, and social services
▪ Recent work has raised concerns about the risk of unintended bias in these models, affecting individuals from certain groups unfairly
▪ While many bias metrics and fairness definitions have been proposed, there is no consensus on which definitions and metrics should be used in practice to evaluate and audit these systems
42. Aequitas audits the predictions of ML-based risk assessment tools to understand different types of biases
▪ The Aequitas toolkit is a flexible bias-audit utility for algorithmic decision-making models, accessible via a Python API, a command line interface (CLI), and a web application
▪ Aequitas is used to evaluate model performance across several bias and fairness metrics, and to apply the most relevant metrics in model selection
▪ Aequitas helps to:
• Understand where biases exist in the model(s)
• Compare the level of bias between groups in the samples (bias disparity)
• Visualize absolute bias metrics and their related disparities for rapid comprehension and decision-making
43. Aequitas audits the evidence of disparate representation and disparate errors
▪ Aequitas can audit systems for two types of biases:
◦ Disparate representation: biased actions / interventions that are not allocated in a way that is representative of the population
◦ Disparate errors: biased outcomes through actions or interventions that result from the system being wrong about certain groups of people
▪ To assess these biases, the following data are needed:
◦ Data about the overall population considered for interventions, along with the protected attributes to be audited (e.g., race, gender, age, income)
◦ The set of individuals in the target population that the system recommends / selects for intervention or action; this should be unseen data, not the training dataset
◦ To audit for biases due to disparate errors of the system, actual outcomes for the individuals who were selected and not selected are also required
44. Different bias and fairness criteria need to be used for different types of interventions
Equal Parity (均等): also known as Demographic Parity or Statistical Parity; each group is represented equally among the selected set
Proportional Parity: also known as Impact Parity or Minimizing Disparate Impact; each group is represented in proportion to its share of the overall population
False Positive Parity: each group has equal False Positive Rates; desirable when interventions are punitive
False Negative Parity: each group has equal False Negative Rates; desirable when interventions are assistive / preventative
45. The fairness tree describes the interpretation of the metrics
Do you trust the labels?
▪ No → COUNTERFACTUAL FAIRNESS
▪ Yes → Do you want to be fair based on disparate representation OR based on disparate errors of your system?
◦ Representation: Do you need to select an equal number of people from each group, or proportional to their % in the overall population?
• Equal numbers → EQUAL SELECTION PARITY
• Proportional → DEMOGRAPHIC PARITY
◦ Errors: Are your interventions punitive (could hurt individuals) or assistive (will help individuals)?
• Punitive: Among which group are you most concerned with ensuring predictive equity?
- Everyone, without regard for actual outcome → FP/GS PARITY
- People for whom intervention is taken → FDR PARITY
- People for whom intervention is not warranted → FPR PARITY
• Assistive: Can you intervene with most people with need or only a small fraction?
- Most people → RECALL PARITY
- Small fraction: Among which group are you most concerned with ensuring predictive equity?
- Everyone, without regard for actual need → FN/GS PARITY
- People not receiving assistance → FOR PARITY
- People with actual need → FNR PARITY
46. Preliminary Concepts
Score S ∈ [0,1]: a real-valued score assigned to each entity by the predictor.
Decision Ŷ ∈ {0,1}: a binary-valued prediction assigned to an entity.
True Outcome Y ∈ {0,1}: a binary-valued label (ground truth) of an entity.
Attribute A = {a_1, a_2, ⋯, a_n}: a multi-valued attribute, e.g., gender = {female, male, other}.
Group g(a_i): a group formed by all entities having the same attribute value, e.g., race = Asian.
Reference Group g(a_r): a group formed by all entities having the reference attribute value, e.g., gender = male.
Labelled Positive LP_g: number of entities within g(a_i) with positive label, i.e., Y = 1.
Labelled Negative LN_g: number of entities within g(a_i) with negative label, i.e., Y = 0.
Predicted Positive PP_g: number of entities within g(a_i) with positive prediction, i.e., Ŷ = 1.
Total Predicted Positive K = Σ over A = a_1..a_n of PP_g(a_i): total number of entities with positive prediction across all groups g(a_i) formed by all possible attribute values of A.
Predicted Negative PN_g: number of entities within g(a_i) with negative prediction, i.e., Ŷ = 0.
False Positive FP_g: number of entities within g(a_i) with false positive prediction, i.e., Ŷ = 1 ∧ Y = 0.
False Negative FN_g: number of entities within g(a_i) with false negative prediction, i.e., Ŷ = 0 ∧ Y = 1.
True Positive TP_g: number of entities within g(a_i) with true positive prediction, i.e., Ŷ = 1 ∧ Y = 1.
True Negative TN_g: number of entities within g(a_i) with true negative prediction, i.e., Ŷ = 0 ∧ Y = 0.
47. Basic Metrics
Prevalence (Prev): Prev_g = LP_g / |g| = P(Y = 1 | A = a_i). Fraction of entities within g(a_i) with positive label. Example: given your race, what is your chance of being denied bail?
Predicted Prevalence (PPrev): PPrev_g = PP_g / |g| = P(Ŷ = 1 | A = a_i). Fraction of entities within g(a_i) with positive prediction. Example: given your race, what is your predicted chance of being denied bail?
Predicted Positive Rate (PPR): PPR_g = PP_g / K = P(A = a_i | Ŷ = 1). Ratio of the number of entities within g(a_i) with positive prediction to the Total Predicted Positive over all groups g(a_i) formed by all possible attribute values of A. Example: given the predicted denials of bail over all races, what is the chance of your race being denied bail?
Recall / True Positive Rate (TPR): TPR_g = TP_g / LP_g = P(Ŷ = 1 | Y = 1, A = a_i). Fraction of entities within g(a_i) with positive label that also have a positive prediction. Example: among people with need, what is your chance of receiving assistance given your gender?
48. Basic Metrics
False Negative Rate (FNR): FNR_g = FN_g / LP_g = P(Ŷ = 0 | Y = 1, A = a_i). Fraction of entities within g(a_i) with positive label but negative prediction. Example: among people with need, what is your chance of not receiving any assistance given your gender?
False Positive Rate (FPR): FPR_g = FP_g / LN_g = P(Ŷ = 1 | Y = 0, A = a_i). Fraction of entities within g(a_i) with negative label but positive prediction. Example: among people who should be granted bail, what is your chance of being denied bail given your race?
False Discovery Rate (FDR) / Calibration: FDR_g = FP_g / PP_g = P(Y = 0 | Ŷ = 1, A = a_i). Fraction of entities within g(a_i) with positive prediction that are false. Example: among people being denied bail, what is your chance of being innocent given your race?
False Omission Rate (FOR): FOR_g = FN_g / PN_g = P(Y = 1 | Ŷ = 0, A = a_i). Fraction of entities within g(a_i) with negative prediction that are false. Example: among people who do not receive assistance, what is your chance of requiring assistance given your gender?
49. Basic Metrics
False Positive over Group Size (FP/GS): FP/GS_g = FP_g / |g| = P(Ŷ = 1, Y = 0 | A = a_i). Ratio of the number of entities within g(a_i) with a wrong positive prediction to the number of entities within g(a_i). Example: what is your chance of being wrongly denied bail given your race?
False Negative over Group Size (FN/GS): FN/GS_g = FN_g / |g| = P(Ŷ = 0, Y = 1 | A = a_i). Ratio of the number of entities within g(a_i) with a wrong negative prediction to the number of entities within g(a_i). Example: what are your chances of being wrongly left out of assistance given your race?
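The counts and rates defined in the tables above can be computed per group from three parallel lists. A minimal sketch (PPR is omitted because it also needs the total predicted positives K across all groups; a real audit would use a toolkit such as Aequitas):

```python
def group_metrics(groups, y, y_hat, g):
    """Basic per-group fairness metrics from group labels, ground truth y,
    and predictions y_hat, for the group with attribute value g."""
    rows = [(yt, yp) for a, yt, yp in zip(groups, y, y_hat) if a == g]
    n = len(rows)
    tp = sum(1 for yt, yp in rows if yt == 1 and yp == 1)
    fp = sum(1 for yt, yp in rows if yt == 0 and yp == 1)
    fn = sum(1 for yt, yp in rows if yt == 1 and yp == 0)
    tn = n - tp - fp - fn
    lp, ln, pp, pn = tp + fn, fp + tn, tp + fp, fn + tn
    return {
        "Prev": lp / n,      # P(Y=1 | A=g)
        "PPrev": pp / n,     # P(Y^=1 | A=g)
        "TPR": tp / lp,      # recall
        "FNR": fn / lp,
        "FPR": fp / ln,
        "FDR": fp / pp,
        "FOR": fn / pn,
        "FP/GS": fp / n,
        "FN/GS": fn / n,
    }
```

Note the toy code assumes every denominator is nonzero; production code would guard against empty groups.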
51. COMPAS was reported to have unfair disparities; Northpointe pushed back; who is right and who is wrong?
▪ In 2016, ProPublica reported on racial inequality in COMPAS, a risk assessment tool
▪ The algorithm was shown to lead to unfair disparities in False Negative Rates and False Positive Rates
▪ In the case of recidivism, it was shown that black defendants faced disproportionately high risk scores, while white defendants received disproportionately low risk scores
▪ Northpointe, the company responsible for the algorithm, responded by arguing they calibrated the algorithm to be fair in terms of False Discovery Rate, also known as calibration
▪ The Bias Report provides metrics on each type of disparity, which adds clarity to the bias auditing process
53. COMPAS Recidivism Risk Assessment Dataset
score is the binary assessment made by the predictive model; 1 denotes an individual selected for the intervention
label_value is the binary-valued ground truth; 1 denotes a positive case, which is required for auditing biases based on disparate errors
57. Fair: a group's metric value is between 80~125% of the reference group's metric value
Unfair: outside the 80~125% range of the reference group's metric value
In general: threshold ≤ Fair Parity ≤ 1 / threshold (a threshold of 0.8 gives the 80~125% band)
66. Only a few bias metric values fall within the 80~125% range
FDR Disparity_Asian = FDR_Asian / FDR_Caucasian = 0.25 / 0.41 = 0.61
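The 80~125% band and the FDR disparity calculation above fit in a few lines; the numbers mirror the slide's example:

```python
def disparity(group_metric, reference_metric):
    """Ratio of a group's metric to the reference group's metric."""
    return group_metric / reference_metric

def is_fair(ratio, threshold=0.8):
    """Apply the band: threshold <= ratio <= 1/threshold (80~125% for 0.8)."""
    return threshold <= ratio <= 1 / threshold

fdr_disparity_asian = disparity(0.25, 0.41)  # ~0.61, outside the band
```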
68. What is modifiable determines what mitigation algorithms can be used
Pre-Processing Algorithms: bias mitigation algorithms applied to the training data
In-Processing Algorithms: bias mitigation algorithms applied to a model during its training
Post-Processing Algorithms: bias mitigation algorithms applied to the predicted labels
69. Bias mitigation can be applied at different phases of the machine learning pipeline
Pre-processing Algorithms (mitigate bias in training data)
▪ Reweighing: modifies the weights of different training examples
▪ Disparate Impact Remover: edits feature values to improve group fairness
▪ Optimized Preprocessing: modifies training data features & labels
▪ Learning Fair Representation: learns a fair representation by obfuscating information about protected attributes
In-processing Algorithms (mitigate bias in classifiers)
▪ Adversarial Debiasing: uses adversarial techniques to maximise accuracy & reduce evidence of protected attributes in predictions
▪ Prejudice Remover: adds a discrimination-aware regularization term to the learning objective
▪ Meta Fair Classifier: takes the fairness metric as part of the input & returns a classifier optimized for the metric
Post-processing Algorithms (mitigate bias in predictions)
▪ Reject-Option Classification: changes predictions from a classifier to make them fairer
▪ Calibrated Equalized Odds: optimizes over calibrated classifier score outputs that lead to fair output labels
▪ Equalized Odds: modifies the predicted label using an optimization scheme to make predictions fairer
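As one concrete example, the Reweighing idea (Kamiran & Calders) can be sketched in a few lines: each (group, label) pair gets weight P(A=a) · P(Y=y) / P(A=a, Y=y), which makes the group attribute and the label statistically independent in the weighted training data. A toy sketch, not the AIF360 implementation:

```python
from collections import Counter

def reweighing_weights(groups, labels):
    """Weight for each (group, label) pair: P(A=a) * P(Y=y) / P(A=a, Y=y)."""
    n = len(labels)
    count_a = Counter(groups)
    count_y = Counter(labels)
    count_ay = Counter(zip(groups, labels))
    return {(a, y): (count_a[a] / n) * (count_y[y] / n) / (count_ay[(a, y)] / n)
            for (a, y) in count_ay}
```

Under-represented favorable combinations (e.g., a disadvantaged group with a positive label) receive weights above 1, over-represented ones below 1.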
71. A fully autonomous car is transporting a human being (A) to its desired destination. Suddenly, in a twist of fate, some living being (B) appears on the road. The AI (i.e., the computer) that controls the vehicle (i.e., the machine) must come to a decision within a fraction of a second: take evasive action or continue straight ahead. If it does try to dodge B, the vehicle skids and hits a tree, A dies, and B survives. If not, A survives, but B dies. For simplification purposes, we shall assume that collateral damage is negligible or identical in both cases.
73.
▪ To help data scientists practice data ethics
▪ To evaluate considerations related to advanced analytics and machine learning applications from data collection through deployment
▪ To ensure that risks inherent to AI-empowered technology do not escalate into threats to an organization's constituents, reputation, or society more broadly
▪ To provide concrete, actionable reminders to the developers who have influence over how data science gets done
▪ A lightweight, open-source command line tool that facilitates integration into ML workflows
74. The checklist items map to different phases of the machine learning pipeline
DATA COLLECTION: INFORMED CONSENT, COLLECTION BIAS, LIMITING PII EXPOSURE, DOWNSTREAM BIAS MITIGATION
DATA STORAGE: DATA SECURITY, RIGHT TO BE FORGOTTEN, DATA RETENTION PLAN
ANALYSIS: MISSING PERSPECTIVE, DATASET BIAS, HONEST REPRESENTATION, PRIVACY IN ANALYSIS, AUDITABILITY
MODELING: PROXY DISCRIMINATION, FAIRNESS ACROSS GROUPS, METRIC SELECTION, EXPLAINABILITY, COMMUNICATING BIAS
DEPLOYMENT: REDRESS, ROLL BACK, CONCEPT DRIFT, UNINTENDED USE
76. A. Data Collection
A.1 INFORMED CONSENT: If there are human subjects, have they given informed consent, where subjects affirmatively opt in and have a clear understanding of the data uses to which they consent?
A.2 COLLECTION BIAS: Have we considered sources of bias that could be introduced during data collection and survey design, and taken steps to mitigate those?
A.3 LIMIT PII EXPOSURE: Have we considered ways to minimize exposure of personally identifiable information (PII), for example through anonymization or by not collecting information that isn't relevant for the analysis?
A.4 DOWNSTREAM BIAS MITIGATION: Have we considered ways to enable testing downstream results for biased outcomes (e.g., collecting data on protected group status like race or gender)?
77. Facebook uses phone numbers provided for two-factor authentication to target users with ads
▪ FB confirmed it in fact used phone numbers that users had provided for security purposes to also target them with ads
▪ Specifically, phone numbers handed over for two-factor authentication (2FA)
▪ 2FA is a security technique that adds a second layer of authentication to help keep accounts secure
https://techcrunch.com/2018/09/27/yes-facebook-is-using-your-2fa-phone-number-to-target-you-with-ads/
A.1 Informed Consent
78. Low smartphone penetration areas contribute less to big data and consequently become digitally invisible
▪ Data fundamentalism is the notion that correlation always indicates causation, and that massive data sets & predictive analytics always reflect objective truth
▪ Datasets are not objective; they are creations of human design
▪ We give numbers their voice, draw inferences from them, and define their meaning through our interpretation
▪ Hidden biases in both the collection and analysis stages present considerable risks
▪ Biases are as important to the big-data equation as the numbers themselves
https://hbr.org/2013/04/the-hidden-biases-in-big-data
A.2 Collection Bias
79. Low smartphone penetration areas contribute less to big data and consequently become digitally invisible
▪ The greatest number of tweets (20 million) were generated from Manhattan around the strike of Hurricane Sandy, creating the illusion that Manhattan was the hub of the disaster
▪ Very few messages originated from more severely affected locations, such as Breezy Point, Coney Island, and Rockaway
▪ As extended power blackouts drained batteries and limited cellular access, even fewer tweets came from the worst-hit areas
▪ In fact, there was much more going on outside the privileged, urban experience of Sandy that Twitter data failed to convey, especially in aggregate
▪ Data are assumed to accurately reflect the social world, but there are significant gaps, with little or no signal coming from particular communities
https://hbr.org/2013/04/the-hidden-biases-in-big-data
A.2 Collection Bias
80. Personal information on taxi drivers can be accessed in the poorly anonymized taxi trips dataset of New York City
▪ New York City released data on 173M individual taxi trips, including information like the time & location of the pickup & drop-off, as well as an anonymised licence number and medallion number
▪ Even though no PII (Personally Identifiable Information) was included in the data, de-anonymising the data to reveal personal identities was then found to be trivial
▪ The licence number and medallion number had not been anonymised properly, and the anonymisation was trivial to undo with other publicly available data
https://www.theguardian.com/technology/2014/jun/27/new-york-taxi-details-anonymised-data-researchers-warn
A.3 Limiting PII Exposure
81. Personal information on taxi drivers can be accessed in the poorly anonymized taxi trips dataset of New York City
▪ 3 medallion number formats giving 22M possibilities: 9X99, XX999, XXX999 (some numbers are skipped)
▪ 2 licence number formats giving 2M possibilities: 5XXXXX, 5XXXXXX
▪ Both numbers were anonymised by hashing with MD5
▪ With the number of possibilities at only 24M, it was a matter of minutes to determine which number was associated with a hash
▪ The entire dataset was de-anonymised within one hour
https://www.theguardian.com/technology/2014/jun/27/new-york-taxi-details-anonymised-data-researchers-warn
A.3 Limiting PII Exposure
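The attack described above amounts to hashing every possible plaintext and inverting the table. A simplified sketch, using a toy format of '5' followed by five digits rather than the exact NYC formats:

```python
import hashlib

def md5_hex(text):
    """Unsalted MD5, as used for the taxi dataset's 'anonymisation'."""
    return hashlib.md5(text.encode()).hexdigest()

# Enumerate the whole candidate space once and map hash -> plaintext.
candidates = ("5%05d" % i for i in range(100_000))
lookup = {md5_hex(c): c for c in candidates}

def deanonymise(hashed):
    """Recover the original licence number from its hash, if it is in the space."""
    return lookup.get(hashed)
```

Because the plaintext space is tiny, the full lookup table is built in seconds; the 24M real-world possibilities are only a few hundred times larger.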
82. Personal information on taxi drivers can be accessed in the poorly anonymized taxi trips dataset of New York City
▪ There was a ton of resources on the NYC Taxi and Limousine Commission, including a mapping from licence number to driver name and a way to look up the owners of medallions
▪ The anonymisation was so poor that anyone could, with less than two hours' work, figure out which driver drove every single trip in the entire dataset, calculate drivers' gross income, or infer where they live
▪ NYC could have simply assigned random numbers to each licence plate, making it much more difficult to work backwards
https://www.theguardian.com/technology/2014/jun/27/new-york-taxi-details-anonymised-data-researchers-warn
A.3 Limiting PII Exposure
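The fix suggested in the last bullet can be sketched as follows: assign each licence number a random pseudonym, so there is no function of the plaintext left to brute-force (the publisher must keep the mapping secret, or destroy it). The identifier format here is an illustrative assumption:

```python
import random

def pseudonymise(values, seed=None):
    """Replace each distinct value with a random, meaning-free identifier."""
    rng = random.Random(seed)
    unique = sorted(set(values))
    ids = list(range(len(unique)))
    rng.shuffle(ids)
    mapping = {value: "ID%06d" % i for value, i in zip(unique, ids)}
    return [mapping[v] for v in values], mapping
```

The same plaintext still maps to the same pseudonym, so per-driver analyses remain possible, but nothing about the pseudonym can be reversed without the mapping.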
83. In six major cities, Amazon's same-day delivery service excludes many predominantly black neighborhoods
https://www.bloomberg.com/graphics/2016-amazon-same-day/
A.4 Downstream Bias Mitigation
85. B. Data Storage
B.1 DATA SECURITY: Do we have a plan to protect and secure data (e.g., encryption at rest and in transit, access controls on internal users and third parties, access logs, and up-to-date software)?
B.2 RIGHT TO BE FORGOTTEN: Do we have a mechanism through which an individual can request that their personal information be removed?
B.3 DATA RETENTION PLAN: Is there a schedule or plan to delete the data after it is no longer needed?
86. Equifax revealed the exact scope of the massive breach that exposed sensitive data about millions of Americans
▪ Equifax is one of the US's biggest credit reporting agencies
▪ In a filing with the SEC in 2018, Equifax acknowledged that 145.5M Social Security Numbers were compromised; more than 200,000 credit card numbers and expiration dates were also collected, as well as government-issued identification documents (e.g., driver's licenses, taxpayer ID cards, passports, and military IDs) that about 182,000 consumers uploaded when they disputed credit reports with Equifax
Data Element Stolen | Data Analyzed | Approximate Number of Impacted US Consumers
Name | First Name, Last Name, Middle Name, Suffix, Full Name | 146.6 million
Date of Birth | D.O.B. | 146.6 million
Social Security Number | SSN | 145.5 million
Address Information | Address, Address2, City, State, Zip | 99 million
Gender | Gender | 27.3 million
Phone Number | Phone, Phone2 | 20.3 million
Driver's License Number | DL | 17.6 million
Email Address | Email Address | 1.8 million
Payment Card Number and Expiration Date | CC Number, Exp Date | 209,000
Tax ID | TaxID | 97,500
Driver's License State | DL License State | 27,000
https://www.nbcnews.com/news/us-news/equifax-breaks-down-just-how-bad-last-year-s-data-n872496
B.1 Data Security
87. There is no express right to be forgotten under Hong Kong's PDPO (Personal Data Privacy Ordinance)
▪ The PDPO only includes a general obligation on a data user to take all practicable steps to erase personal data held by it where the data is no longer required for the purpose for which it was used (unless such erasure is prohibited under any law, or it is in the public interest, including historical interest, for the data not to be erased)
▪ In the banking context, however: (i) the Privacy Commissioner has published a specific code of practice on consumer credit data, such that a credit provider must inform data subjects that they have the right to instruct the credit provider to request that a credit reference agency delete account data relating to a terminated account; and (ii) the Code of Banking Practice published by the Hong Kong Association of Banks requires institutions to have in place appropriate control and protection mechanisms that acknowledge the rights of customers to obtain prompt correction and/or deletion of inaccurate, or unlawfully collected or processed, data
▪ The Privacy Commissioner has the power, by way of issuing an enforcement notice, to request a data user to remove personal data if the use of the personal data contravenes the PDPO
▪ This power has been exercised by the Privacy Commissioner in the past and was upheld on a legal challenge against the Privacy Commissioner's decision
https://www.linklaters.com/en/insights/data-protected/data-protected---hong-kong
B.2 Right to be Forgotten
88. Unsecured server exposed thousands of FedEx customer records
▪ FedEx was reported in 2018 to have exposed customers' private information after a legacy server was left open without a password
▪ The server contained more than 112,000 unencrypted files: completed US Postal Service forms (including names, home addresses, phone numbers, and handwritten signatures) used to authorize the handling of mail, drivers' licenses, national ID cards, work ID cards, voting cards, utility bills, resumes, vehicle registration forms, medical insurance cards, firearms licenses, a few US military identification cards, and even a handful of credit cards that customers used to verify their identity with the FedEx division
▪ Despite the division's shutdown in 2015, some documents remained valid, and the breach put customers at risk of identity theft
https://www.zdnet.com/article/unsecured-server-exposes-fedex-customer-records/
B.3 Data Retention Plan
90. C. Analysis
C.1 MISSING PERSPECTIVES: Have we sought to address blind spots in the analysis through engagement with relevant stakeholders (e.g., checking assumptions and discussing implications with affected communities and subject matter experts)?
C.2 DATASET BIAS: Have we examined the data for possible sources of bias and taken steps to mitigate or address these biases (e.g., stereotype perpetuation, confirmation bias, imbalanced classes, or omitted confounding variables)?
C.3 HONEST REPRESENTATION: Are our visualizations, summary statistics, and reports designed to honestly represent the underlying data?
C.4 PRIVACY IN ANALYSIS: Have we ensured that data with PII are not used or displayed unless necessary for the analysis?
C.5 AUDITABILITY: Is the process of generating the analysis well documented and reproducible if we discover issues in the future?
92. D. Modelling
D.1 PROXY DISCRIMINATION: Have we ensured that the model does not rely on variables, or proxies for variables, that are unfairly discriminatory?
D.2 FAIRNESS ACROSS GROUPS: Have we tested model results for fairness with respect to different affected groups (e.g., tested for disparate error rates)?
D.3 METRIC SELECTION: Have we considered the effects of optimizing for our defined metrics and considered additional metrics?
D.4 EXPLAINABILITY: Can we explain in understandable terms a decision the model made in cases where a justification is needed?
D.5 COMMUNICATE BIAS: Have we communicated the shortcomings, limitations, and biases of the model to relevant stakeholders in ways that can be generally understood?
94. E. Deployment
E.1 REDRESS: Have we discussed with our organization a plan for response if users are harmed by the results (e.g., how does the data science team evaluate these cases and update analysis and models to prevent future harm)?
E.2 ROLL BACK: Is there a way to turn off or roll back the model in production if necessary?
E.3 CONCEPT DRIFT: Do we test and monitor for concept drift to ensure the model remains fair over time?
E.4 UNINTENDED USE: Have we taken steps to identify and prevent unintended uses and abuse of the model, and do we have a plan to monitor these once the model is deployed?