
Lab presentation: A Framework for Understanding Unintended Consequences of Machine Learning

A framework for bias in the ML process


  1. A Framework for Understanding Unintended Consequences of Machine Learning. Authors: Harini Suresh (MIT), John V. Guttag (MIT). Presented by: Chenguang Xu "Shine".
  2. The Problem with Biased Data
     • Various unwanted consequences of ML algorithms arise in some way from biased data.
     • Bias refers to an unintended or potentially harmful property of the data.
     • Data is not a fixed artifact: it is the product of a process shaped by many factors.
  3. An Illustrative Scenario
     • A lack of data on women initially hurt performance; introducing more data solved that issue.
     • However, the use of a proxy label (human assessment of quality) rather than the true label (actual qualification) still allowed the model to discriminate by gender.
  4. Five Sources of Bias in ML
  5. Historical Bias
     • A fundamental, structural issue with the very first step of the data-generation process: even perfectly sampled and measured data reflects the world as it is or was.
  6. Representation Bias
     • It arises when defining and sampling from a population (a minimal sampling sketch appears after this list).
     • It can arise for several reasons:
       • The sampling methods only reach a portion of the population.
       • The population of interest has changed or is distinct from the population used during model training.
  7. Representation Bias (cont.)
     Shankar, Shreya, et al. "No classification without representation: Assessing geodiversity issues in open data sets for the developing world." arXiv preprint arXiv:1711.08536 (2017).
  8. Representation Bias (cont.)
     • Figure: photos of bridegrooms from different countries, aligned by the log-likelihood that a classifier trained on Open Images assigns to the "bridegroom" class (Shankar et al., 2017).
  9. Measurement Bias
     • It arises when subsequently choosing and measuring the particular features of interest (see the proxy-label sketch after this list).
     • It can arise in several ways:
       • The granularity of data varies across groups.
       • The quality of data varies across groups.
       • The defined classification task is an oversimplification.
  10. Aggregation Bias
      • It arises when a one-size-fits-all model is used for groups with different conditional distributions (see the pooled-versus-per-group sketch after this list).
  11. Evaluation Bias
      • It occurs when the evaluation and/or benchmark data for an algorithm doesn't represent the target population (see the disaggregated-evaluation sketch after this list).
      Buolamwini, Joy, and Timnit Gebru. "Gender shades: Intersectional accuracy disparities in commercial gender classification." Conference on Fairness, Accountability and Transparency. 2018.
  12. Formalizations and Mitigations
      • The data-generation and ML pipeline can be viewed as a series of mapping functions (a sketch of this formalization appears after this list).
      • Mitigating representation bias: improve the sampling function s.
      • Mitigating measurement and historical bias: adjusting s alone will likely be ineffective.
      • Mitigating aggregation bias: adjust the model g, or change the transformations r or t applied to the data.
      • Mitigating evaluation bias: redefine the evaluation k, or adjust the measured X̂ and Ŷ.
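
A minimal numerical sketch of representation bias (slide 6), using only NumPy on hypothetical data: the sampling step reaches one group far more often than another, so a statistic estimated from the sample drifts away from the true population value.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical population: 50% group A (mean 0.0), 50% group B (mean 2.0).
true_population_mean = 0.5 * 0.0 + 0.5 * 2.0  # = 1.0

# A sampling method that reaches group B only 5% of the time.
sample = np.concatenate([
    rng.normal(0.0, 1.0, size=950),  # group A: overrepresented
    rng.normal(2.0, 1.0, size=50),   # group B: underrepresented
])

print(f"true population mean: {true_population_mean:.2f}")
print(f"skewed sample mean:   {sample.mean():.2f}")  # close to 0, not 1
```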
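The proxy-label problem from slides 3 and 9 can be simulated directly. This is a toy sketch on synthetic data (the group variable, effect sizes, and thresholds are all made up for illustration): when the proxy label, a human assessment, systematically under-rates one group, anything trained to reproduce that proxy inherits the bias.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10_000

group = rng.integers(0, 2, size=n)       # protected attribute, e.g. gender
skill = rng.normal(0.0, 1.0, size=n)     # drives the true label

y_true = (skill > 0).astype(int)         # true label: actual qualification

# Proxy label: a human assessment that under-rates group 1 by a fixed amount.
assessment = skill - 0.8 * group + rng.normal(0.0, 0.3, size=n)
y_proxy = (assessment > 0).astype(int)

# Among the truly qualified, group 1 receives far fewer positive proxy labels,
# so a model imitating y_proxy would discriminate even with abundant data --
# which is why adding more data alone did not fix the scenario on slide 3.
for g in (0, 1):
    qualified = (group == g) & (y_true == 1)
    print(f"group {g}: positive proxy rate among the qualified = "
          f"{y_proxy[qualified].mean():.2f}")
```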
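For aggregation bias (slide 10), a one-size-fits-all model can be visibly wrong for every group at once. A minimal sketch with synthetic data: two groups whose conditional distributions y | x have opposite slopes, so a single pooled least-squares fit lands near zero and serves neither group.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 500

# Two groups with different conditional distributions y | x.
x = rng.uniform(-1.0, 1.0, size=(2, n))
y = np.stack([+2.0 * x[0] + rng.normal(0, 0.1, n),   # group 0: slope +2
              -2.0 * x[1] + rng.normal(0, 0.1, n)])  # group 1: slope -2

def fit_slope(xs, ys):
    # Least-squares slope for a line through the origin.
    return float((xs * ys).sum() / (xs * xs).sum())

pooled = fit_slope(x.ravel(), y.ravel())   # the one-size-fits-all model
for g in (0, 1):
    print(f"group {g}: per-group slope = {fit_slope(x[g], y[g]):+.2f}, "
          f"pooled slope = {pooled:+.2f}")   # pooled is ~0 for both groups
```

This is what slide 12's mitigation targets: either change the model g (for example, fit per-group models) or transform the data so one model can fit all groups.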
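A common response to evaluation bias (slide 11), in the spirit of Buolamwini and Gebru's per-subgroup analysis, is to report accuracy disaggregated by group rather than a single aggregate number. A small helper sketch; the function name and toy data are hypothetical:

```python
import numpy as np

def disaggregated_accuracy(y_true, y_pred, groups):
    """Accuracy overall and per subgroup; an aggregate score alone can
    hide poor performance on groups the benchmark underrepresents."""
    y_true, y_pred, groups = map(np.asarray, (y_true, y_pred, groups))
    report = {"overall": float((y_true == y_pred).mean())}
    for g in np.unique(groups):
        mask = groups == g
        report[str(g)] = float((y_true[mask] == y_pred[mask]).mean())
    return report

# Toy illustration: errors are concentrated in the small group "b", yet
# the aggregate number still looks respectable.
y_true = [1] * 100
y_pred = [1] * 85 + [0] * 15
groups = ["a"] * 85 + ["b"] * 15
print(disaggregated_accuracy(y_true, y_pred, groups))
# {'overall': 0.85, 'a': 1.0, 'b': 0.0}
```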
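The mapping-function view on slide 12 can be written out explicitly. The roles below are inferred from the symbols the slide uses (s, r, t, g, k, X̂, Ŷ) and may differ from the paper's exact notation; treat this as a sketch, not the authors' definition.

```latex
\begin{align*}
  s &: \mathcal{P} \to D
      && \text{sampling: population of interest to dataset (representation bias)} \\
  r &: X \to \hat{X}, \quad t : Y \to \hat{Y}
      && \text{measurement: true features/labels to measured proxies (measurement bias)} \\
  g &: \hat{X} \to \hat{Y}
      && \text{learning: a single model over all groups (aggregation bias)} \\
  k &: (g, D_{\mathrm{bench}}) \to \mathbb{R}
      && \text{evaluation against a chosen benchmark (evaluation bias)}
\end{align*}
```

Reading the mitigations through this lens: representation bias lives in s, measurement bias in r and t, aggregation bias in g, and evaluation bias in k and the benchmark data, which is why adjusting s alone cannot repair measurement or historical bias.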
