Frontiers of Computational Journalism week 5 - Algorithmic Accountability and Discrimination

Taught at Columbia Journalism School, Fall 2018
Full syllabus and lecture videos at http://www.compjournalism.com/?p=218


  1. Frontiers of Computational Journalism, Columbia Journalism School, Week 5: Algorithmic Accountability and Discrimination, October 10, 2018
  2. This class • Algorithms in society • Previous algorithmic accountability stories • Measuring race and gender bias • Unpacking ProPublica’s “Machine Bias”
  3. Algorithms in Society
  4. Algorithms in society • Personalized search • Political microtargeting • Credit score / loans / insurance • Predictive policing • Price discrimination • Algorithmic trading / markets • Terrorist threat prediction • Hiring models
  5. Institute for the Future’s “unintended harms of technology” ethicalos.org
  6. algorithmtips.org
  7. Algorithm Tips: A Resource for Algorithmic Accountability in Government, Trielli, Stark, Diakopoulos. Search terms used to find “algorithms” on .gov sites
  8. Algorithm Tips: A Resource for Algorithmic Accountability in Government, Trielli, Stark, Diakopoulos
  9. From myfico.com
  10. Algorithmic Accountability Stories
  11. Analyzing Released NYC Value-Added Data Part 2, Gary Rubinstein
  12. Websites Vary Prices, Deals Based on Users' Information, Valentino-Devries, Singer-Vine and Soltani, WSJ, 2012
  13. Message Machine, Jeff Larson, Al Shaw, ProPublica, 2012
  14. Chicago Area Disparities in Car Insurance Premiums, Al Shaw, Jeff Larson, Julia Angwin, ProPublica, 2017
  15. Minority Neighborhoods Pay Higher Car Insurance Premiums Than White Areas With the Same Risk, Angwin, Larson, Kirchner, Mattu, ProPublica, 2017
  16. How Uber surge pricing really works, Nick Diakopoulos
  17. Measuring Race and Gender Bias
  18. Title VII of Civil Rights Act, 1964 • It shall be an unlawful employment practice for an employer • (1) to fail or refuse to hire or to discharge any individual, or otherwise to discriminate against any individual with respect to his compensation, terms, conditions, or privileges of employment, because of such individual’s race, color, religion, sex, or national origin; or • (2) to limit, segregate, or classify his employees or applicants for employment in any way which would deprive or tend to deprive any individual of employment opportunities or otherwise adversely affect his status as an employee, because of such individual’s race, color, religion, sex, or national origin.
  19. Regulated Domains • Credit (Equal Credit Opportunity Act) • Education (Civil Rights Act of 1964; Education Amendments of 1972) • Employment (Civil Rights Act of 1964) • Housing (Fair Housing Act) • ‘Public Accommodation’ (Civil Rights Act of 1964). Fairness in Machine Learning, NIPS 2017 Tutorial, Solon Barocas and Moritz Hardt
  20. Protected Classes • Race (Civil Rights Act of 1964) • Color (Civil Rights Act of 1964) • Sex (Equal Pay Act of 1963; Civil Rights Act of 1964) • Religion (Civil Rights Act of 1964) • National origin (Civil Rights Act of 1964) • Citizenship (Immigration Reform and Control Act) • Age (Age Discrimination in Employment Act of 1967) • Pregnancy (Pregnancy Discrimination Act) • Familial status (Civil Rights Act of 1968) • Disability status (Rehabilitation Act of 1973; Americans with Disabilities Act of 1990) • Veteran status (Vietnam Era Veterans' Readjustment Assistance Act of 1974; Uniformed Services Employment and Reemployment Rights Act) • Genetic information (Genetic Information Nondiscrimination Act). Fairness in Machine Learning, NIPS 2017 Tutorial, Solon Barocas and Moritz Hardt
  21. Race and gender correlate with everything
  22. Learning from Facebook likes. From Kosinski et al., Private traits and attributes are predictable from digital records of human behavior
  23. Predicting gender from Twitter. Zamal et al., Homophily and Latent Attribute Inference: Inferring Latent Attributes of Twitter Users from Neighbors
  24. Predicting race from Twitter. Pennacchiotti and Popescu, A Machine Learning Approach to Twitter User Classification (a toy prediction sketch follows the slide list)
  25. Unpacking ProPublica’s “Machine Bias”
  26. Should Prison Sentences Be Based On Crimes That Haven’t Been Committed Yet?, FiveThirtyEight
  27. COMPAS “CORE” questionnaire, 2011. Includes criminal history, family history, gang involvement, drug use…
  28. How We Analyzed the COMPAS Recidivism Algorithm, ProPublica
  29. Stephanie Wykstra, personal communication
  30. ProPublica argument. False positive rate: P(high risk | black, no arrest) = C/(C+A) = 0.45; P(high risk | white, no arrest) = G/(G+E) = 0.23. False negative rate: P(low risk | black, arrested) = B/(B+D) = 0.28; P(low risk | white, arrested) = F/(F+H) = 0.48. Northpointe response. Positive predictive value: P(arrest | black, high risk) = D/(C+D) = 0.63; P(arrest | white, high risk) = H/(G+H) = 0.59. (A per-group computation of these rates follows the slide list.)
  31. P(outcome | score) is fair. Fair prediction with disparate impact: A study of bias in recidivism prediction instruments, Chouldechova
  32. Or, as ProPublica put it: How We Analyzed the COMPAS Recidivism Algorithm, ProPublica
  33. Even if two groups of the population admit simple classifiers, the whole population may not (from How Big Data is Unfair; a synthetic demonstration follows the slide list)
  34. The Problem. Fair prediction with disparate impact: A study of bias in recidivism prediction instruments, Chouldechova
  35. Impossibility theorem: “When the base rates differ by protected group and when there is not separation, one cannot have both conditional use accuracy equality and equality in the false negative and false positive rates. … The goal of complete race or gender neutrality is unachievable. … Altering a risk algorithm to improve matters can lead to difficult stakeholder choices. If it is essential to have conditional use accuracy equality, the algorithm will produce different false positive and false negative rates across the protected group categories. Conversely, if it is essential to have the same rates of false positives and false negatives across protected group categories, the algorithm cannot produce conditional use accuracy equality. Stakeholders will have to settle for an increase in one for a decrease in the other.” Fairness in Criminal Justice Risk Assessments: The State of the Art, Berk et al. (A numeric check of this trade-off follows the slide list.)
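
Sketch for slides 21–24 (predicting race and gender from social data). This is a minimal illustration, not the pipeline used by Kosinski et al., Zamal et al., or Pennacchiotti and Popescu: it fits a logistic regression to a synthetic "likes" matrix to show how a protected attribute that merely correlates with behavioral features can be recovered from them. The data, feature counts, and coefficients are all invented for the example.

    # Toy sketch: recovering a protected attribute from behavioral features.
    # Hypothetical data only; not the actual method of the cited papers.
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import roc_auc_score

    rng = np.random.default_rng(0)

    # Hypothetical like matrix: rows = users, columns = pages (1 = liked).
    n_users, n_pages = 2000, 300
    X = rng.binomial(1, 0.05, size=(n_users, n_pages))

    # Synthetic "gender" label correlated with a subset of pages,
    # standing in for the real-world correlations the papers report.
    weights = np.zeros(n_pages)
    weights[:30] = rng.normal(0, 2.0, 30)
    p = 1 / (1 + np.exp(-(X @ weights)))
    y = rng.binomial(1, p)

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=0)

    clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    print("AUC:", roc_auc_score(y_test, clf.predict_proba(X_test)[:, 1]))

The point is only that an attribute never given to the model explicitly can still be predicted well above chance from ordinary digital traces.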
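
Sketch for slide 30 (the ProPublica / Northpointe disagreement). It computes the three per-group quantities on that slide, false positive rate, false negative rate, and positive predictive value, from a small table of risk labels and rearrest outcomes. The column names and toy rows are assumptions for illustration; this is not ProPublica's published analysis code.

    # Per-group FPR, FNR, PPV from a table of (race, high_risk, rearrested).
    # Toy data and column names are assumptions, not the COMPAS dataset.
    import pandas as pd

    df = pd.DataFrame({
        "race":       ["black"] * 4 + ["white"] * 4,
        "high_risk":  [1, 1, 0, 0, 1, 1, 0, 0],
        "rearrested": [1, 0, 1, 0, 1, 0, 1, 0],
    })

    def rates(g):
        fp = ((g.high_risk == 1) & (g.rearrested == 0)).sum()
        tn = ((g.high_risk == 0) & (g.rearrested == 0)).sum()
        fn = ((g.high_risk == 0) & (g.rearrested == 1)).sum()
        tp = ((g.high_risk == 1) & (g.rearrested == 1)).sum()
        return pd.Series({
            "FPR": fp / (fp + tn),   # P(high risk | no rearrest)
            "FNR": fn / (fn + tp),   # P(low risk  | rearrest)
            "PPV": tp / (tp + fp),   # P(rearrest  | high risk)
        })

    print(df.groupby("race")[["high_risk", "rearrested"]].apply(rates))

ProPublica's argument compares FPR and FNR across groups; Northpointe's response compares PPV. Both can be computed from the same table, which is why the dispute is about which comparison defines fairness, not about the arithmetic.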
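
Sketch for slide 33 (from How Big Data is Unfair): a synthetic example in which each group is perfectly classifiable by a simple rule, but the two rules point in opposite directions, so a single model fit to the pooled data does well on the majority group and badly on the minority group. All numbers are made up.

    # Each group is easy to classify alone; the pooled model fails on the minority.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(1)

    def make_group(n, flip):
        x = rng.normal(0, 1, size=(n, 1))
        y = (x[:, 0] > 0).astype(int)
        if flip:                      # minority group follows the opposite rule
            y = 1 - y
        return x, y

    x_maj, y_maj = make_group(5000, flip=False)   # majority group
    x_min, y_min = make_group(500,  flip=True)    # minority group

    X = np.vstack([x_maj, x_min])
    y = np.concatenate([y_maj, y_min])

    pooled  = LogisticRegression().fit(X, y)
    per_min = LogisticRegression().fit(x_min, y_min)

    print("pooled model, majority accuracy:", pooled.score(x_maj, y_maj))
    print("pooled model, minority accuracy:", pooled.score(x_min, y_min))
    print("per-group model, minority accuracy:", per_min.score(x_min, y_min))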
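
Sketch for slide 35 (the impossibility theorem). Chouldechova's paper derives the identity FPR = p/(1-p) * (1-PPV)/PPV * (1-FNR), where p is a group's base rate of rearrest. The snippet below plugs in equal PPV and FNR for two groups with different base rates (illustrative numbers, not the COMPAS values) and shows that the implied false positive rates must then differ, which is the trade-off Berk et al. describe.

    # Numeric check: equal PPV and FNR plus unequal base rates
    # force unequal false positive rates (Chouldechova's identity).
    def implied_fpr(base_rate, ppv, fnr):
        return base_rate / (1 - base_rate) * (1 - ppv) / ppv * (1 - fnr)

    ppv, fnr = 0.6, 0.3          # held equal for both groups
    for group, base_rate in [("group A", 0.5), ("group B", 0.35)]:
        print(group, "base rate", base_rate,
              "-> implied FPR", round(implied_fpr(base_rate, ppv, fnr), 3))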
