Frontiers of Computational Journalism week 5 - Algorithmic Accountability and Discrimination

Frontiers of
Computational Journalism
Columbia Journalism School
Week 5: Algorithmic Accountability and Discrimination
October 10, 2018
This class
• Algorithms in society
• Previous algorithmic accountability stories
• Measuring race and gender Bias
• Unpacking ProPublica’s “Machine Bias”
Algorithms in Society
Algorithms in society
• Personalized search
• Political microtargeting
• Credit score / loans / insurance
• Predictive policing
• Price discrimination
• Algorithmic trading / markets
• Terrorist threat prediction
• Hiring models
Institute for the Future’s “unintended harms of technology”
ethicalos.org
algorithmtips.org
Algorithm Tips: A Resource for Algorithmic Accountability in Government
Trielli, Stark, Diakopolous
Search terms used to find “algorithms” on .gov sites
Algorithm Tips: A Resource for Algorithmic Accountability in Government
Trielli, Stark, Diakopolous
From myfico.com
Algorithmic Accountability Stories
Analyzing Released NYC Value-Added Data Part 2, Gary Rubenstein
Websites Vary Prices, Deals Based on Users' Information
Valentino-Devries, Singer-Vine and Soltani, WSJ, 2012
Message Machine
Jeff Larson, Al Shaw, ProPublica, 2012
Chicago Area Disparities in Car Insurance Premiums
Al Shaw, Jeff Larson, Julia Angwin, ProPublica, 2017
Minority Neighborhoods Pay Higher Car Insurance Premiums Than White Areas With the
Same Risk, Angwin, Larson, Kirchner, Mattu, ProPublica, 2017
How Uber surge pricing really works, Nick Diakopolous
Measuring Race and Gender Bias
Title VII of Civil Rights Act, 1964
• It shall be an unlawful employment practice for an employer -
• (1) to fail or refuse to hire or to discharge any individual, or otherwise to
discriminate against any individual with respect to his compensation,
terms, conditions, or privileges of employment, because of such
individual’s race, color, religion, sex, or national origin; or
• (2) to limit, segregate, or classify his employees or applicants for
employment in any way which would deprive or tend to deprive any
individual of employment opportunities or otherwise adversely affect his
status as an employee, because of such individual’s race, color,
religion, sex, or national origin.
Regulated Domains
Credit (Equal Credit Opportunity Act)
Education (Civil Rights Act of 1964; Education Amendments of 1972)
Employment (Civil Rights Act of 1964)
Housing (Fair Housing Act)
‘Public Accommodation’ (Civil Rights Act of 1964)
Fairness in Machine Learning, NIPS 2017 Tutorial
Solon Barocas and Moritz Hardt
Protected Classes
Race (Civil Rights Act of 1964); Color (Civil Rights Act of 1964); Sex (Equal
Pay Act of 1963; Civil Rights Act of 1964); Religion (Civil Rights Act of
1964);National origin (Civil Rights Act of 1964); Citizenship (Immigration
Reform and Control Act); Age (Age Discrimination in Employment Act of
1967);Pregnancy (Pregnancy Discrimination Act); Familial status (Civil
Rights Act of 1968); Disability status (Rehabilitation Act of 1973; Americans
with Disabilities Act of 1990); Veteran status (Vietnam Era Veterans'
Readjustment Assistance Act of 1974; Uniformed Services Employment and
Reemployment Rights Act); Genetic information (Genetic Information
Nondiscrimination Act)
Fairness in Machine Learning, NIPS 2017 Tutorial
Solon Barocas and Moritz Hardt
Race and gender
correlate with everything
From Kosinski et. al., Private traits and attributes are predictable from digital records of
human behavior
Learning from Facebook likes
Frontiers of Computational Journalism week 5 - Algorithmic Accountability and Discrimination
Predicting gender from Twitter
Zamal et. al., Homophily and Latent Attribute Inference: Inferring Latent Attributes of
Twitter Users from Neighbors
Predicting race from Twitter
Pennacchiotti and Popescu, A Machine Learning Approach to Twitter User Classification
Unpacking ProPublica’s
“Machine Bias”
Should Prison Sentences Be Based On Crimes That Haven’t Been Committed
Yet?, FiveThirtyEight
COMPAS ”CORE” questionnaire, 2011.
Includes criminal history, family history, gang involvement, drug use…
How We Analyzed the COMPAS Recidivism Algorithm, ProPublica
Stephanie Wykstra, personal communication
ProPublica argument
False positive rate
P(high risk |black, no arrest) = C/(C+A) = 0.45
P(high risk |white, no arrest) = G/(G+E) = 0.23
False negative rate
P(low risk | black, arrested ) = B/(B+D) = 0.28
P(low risk | white, arrested ) = F/(F+H) = 0.48
Northpointe response
Positive predictive value
P(arrest| black, high risk) = D/(C+D) = 0.63
P(arrest| white, high risk) = H/(G+H) = 0.59
P(outcome | score) is fair
Fair prediction with disparate impact: A study of bias in recidivism prediction instruments,
Chouldechova
How We Analyzed the COMPAS Recidivism Algorithm, ProPublica
Or, as ProPublica put it
Even if two groups of the population admit simple classifiers, the whole population may
not (from How Big Data is Unfair)
The Problem
Fair prediction with disparate impact: A study of bias in recidivism prediction instruments,
Chouldechova
When the base rates differ by protected group and when there is not separation, one cannot have
both conditional use accuracy equality and equality in the false negative and false positive rates.
…
The goal of complete race or gender neutrality is unachievable.
…
Altering a risk algorithm to improve matters can lead to difficult stakeholder choices. If it is
essential to have conditional use accuracy equality, the algorithm will produce different false
positive and false negative rates across the protected group categories. Conversely, if it is
essential to have the same rates of false positives and false negatives across protected group
categories, the algorithm cannot produce conditional use accuracy equality. Stakeholders will
have to settle for an increase in one for a decrease in the other.
Fairness in Criminal Justice Risk Assessments: The State of the Art, Berk et. al.
Impossibility theorem
1 of 36

Recommended

Frontiers of Computational Journalism week 6 - Quantitative Fairness by
Frontiers of Computational Journalism week 6 - Quantitative FairnessFrontiers of Computational Journalism week 6 - Quantitative Fairness
Frontiers of Computational Journalism week 6 - Quantitative FairnessJonathan Stray
544 views54 slides
Frontiers of Computational Journalism week 7 - Randomness and Statistical Sig... by
Frontiers of Computational Journalism week 7 - Randomness and Statistical Sig...Frontiers of Computational Journalism week 7 - Randomness and Statistical Sig...
Frontiers of Computational Journalism week 7 - Randomness and Statistical Sig...Jonathan Stray
484 views92 slides
Essays on using Formal Concept Analysis in Information Engineering by
Essays on using Formal Concept Analysis in Information EngineeringEssays on using Formal Concept Analysis in Information Engineering
Essays on using Formal Concept Analysis in Information EngineeringWitology
599 views110 slides
Frameworks for Algorithmic Bias by
Frameworks for Algorithmic BiasFrameworks for Algorithmic Bias
Frameworks for Algorithmic BiasJonathan Stray
538 views59 slides
Analyzing Bias in Data - IRE 2019 by
Analyzing Bias in Data - IRE 2019Analyzing Bias in Data - IRE 2019
Analyzing Bias in Data - IRE 2019Jonathan Stray
921 views27 slides
algorithmic-bias.pptx by
algorithmic-bias.pptxalgorithmic-bias.pptx
algorithmic-bias.pptxTewodrosEshete1
26 views23 slides

More Related Content

Similar to Frontiers of Computational Journalism week 5 - Algorithmic Accountability and Discrimination

Detecting Algorithmic Bias (keynote at DIR 2016) by
Detecting Algorithmic Bias (keynote at DIR 2016)Detecting Algorithmic Bias (keynote at DIR 2016)
Detecting Algorithmic Bias (keynote at DIR 2016)Carlos Castillo (ChaTo)
1.4K views47 slides
The Importance Of Voter Apathy by
The Importance Of Voter ApathyThe Importance Of Voter Apathy
The Importance Of Voter ApathyMegan Espinoza
5 views77 slides
The Path Forward: The Digital Transformation in Social Determinants of Health by
The Path Forward: The Digital Transformation in Social Determinants of HealthThe Path Forward: The Digital Transformation in Social Determinants of Health
The Path Forward: The Digital Transformation in Social Determinants of HealthCHC Connecticut
772 views30 slides
Voter Identification Legislation by
Voter Identification LegislationVoter Identification Legislation
Voter Identification LegislationSarah Jimenez
2 views77 slides
Biased Election Maps: Fighting Gerrymandering in 2020 by
Biased Election Maps: Fighting Gerrymandering in 2020Biased Election Maps: Fighting Gerrymandering in 2020
Biased Election Maps: Fighting Gerrymandering in 2020Nick Mapmeld
108 views33 slides
A Rational Choice Model of Compulsory Voting by
A Rational Choice Model of Compulsory VotingA Rational Choice Model of Compulsory Voting
A Rational Choice Model of Compulsory VotingJonathon Flegg
1.7K views14 slides

Similar to Frontiers of Computational Journalism week 5 - Algorithmic Accountability and Discrimination(20)

The Path Forward: The Digital Transformation in Social Determinants of Health by CHC Connecticut
The Path Forward: The Digital Transformation in Social Determinants of HealthThe Path Forward: The Digital Transformation in Social Determinants of Health
The Path Forward: The Digital Transformation in Social Determinants of Health
CHC Connecticut772 views
Voter Identification Legislation by Sarah Jimenez
Voter Identification LegislationVoter Identification Legislation
Voter Identification Legislation
Sarah Jimenez2 views
Biased Election Maps: Fighting Gerrymandering in 2020 by Nick Mapmeld
Biased Election Maps: Fighting Gerrymandering in 2020Biased Election Maps: Fighting Gerrymandering in 2020
Biased Election Maps: Fighting Gerrymandering in 2020
Nick Mapmeld108 views
A Rational Choice Model of Compulsory Voting by Jonathon Flegg
A Rational Choice Model of Compulsory VotingA Rational Choice Model of Compulsory Voting
A Rational Choice Model of Compulsory Voting
Jonathon Flegg1.7K views
Consequentialism Theory Essay by Nina Vazquez
Consequentialism Theory EssayConsequentialism Theory Essay
Consequentialism Theory Essay
Nina Vazquez2 views
National Civic Summit - University Of Minnestoa - Chris Uggen by National Civic Summit
National Civic Summit - University Of Minnestoa - Chris UggenNational Civic Summit - University Of Minnestoa - Chris Uggen
National Civic Summit - University Of Minnestoa - Chris Uggen
Running head PERCEPTION AND DISEFRANCHISEMENT 1PERCEPTION AN.docx by toltonkendal
Running head PERCEPTION AND DISEFRANCHISEMENT 1PERCEPTION AN.docxRunning head PERCEPTION AND DISEFRANCHISEMENT 1PERCEPTION AN.docx
Running head PERCEPTION AND DISEFRANCHISEMENT 1PERCEPTION AN.docx
toltonkendal3 views
3Victimization inthe United StatesAn OverviewCHAPTE.docx by lorainedeserre
3Victimization inthe United StatesAn OverviewCHAPTE.docx3Victimization inthe United StatesAn OverviewCHAPTE.docx
3Victimization inthe United StatesAn OverviewCHAPTE.docx
lorainedeserre10 views
The Importance Of Wrongful Convictions by Shannon Wright
The Importance Of Wrongful ConvictionsThe Importance Of Wrongful Convictions
The Importance Of Wrongful Convictions
Shannon Wright3 views
14 Congressional Digest n www.CongressionalDigest.com n Januar.docx by drennanmicah
14 Congressional Digest n www.CongressionalDigest.com n Januar.docx14 Congressional Digest n www.CongressionalDigest.com n Januar.docx
14 Congressional Digest n www.CongressionalDigest.com n Januar.docx
drennanmicah4 views
Voter Turnout by atrantham
Voter TurnoutVoter Turnout
Voter Turnout
atrantham80 views
The Controversy Over Voter Identification Laws Essay by Alexis Adams
The Controversy Over Voter Identification Laws EssayThe Controversy Over Voter Identification Laws Essay
The Controversy Over Voter Identification Laws Essay
Alexis Adams2 views

More from Jonathan Stray

Frontiers of Computational Journalism week 11 - Privacy and Security by
Frontiers of Computational Journalism week 11 - Privacy and SecurityFrontiers of Computational Journalism week 11 - Privacy and Security
Frontiers of Computational Journalism week 11 - Privacy and SecurityJonathan Stray
720 views91 slides
Frontiers of Computational Journalism week 10 - Truth and Trust by
Frontiers of Computational Journalism week 10 - Truth and TrustFrontiers of Computational Journalism week 10 - Truth and Trust
Frontiers of Computational Journalism week 10 - Truth and TrustJonathan Stray
5.1K views32 slides
Frontiers of Computational Journalism week 9 - Knowledge representation by
Frontiers of Computational Journalism week 9 - Knowledge representationFrontiers of Computational Journalism week 9 - Knowledge representation
Frontiers of Computational Journalism week 9 - Knowledge representationJonathan Stray
2.6K views37 slides
Frontiers of Computational Journalism week 8 - Visualization and Network Anal... by
Frontiers of Computational Journalism week 8 - Visualization and Network Anal...Frontiers of Computational Journalism week 8 - Visualization and Network Anal...
Frontiers of Computational Journalism week 8 - Visualization and Network Anal...Jonathan Stray
700 views72 slides
Frontiers of Computational Journalism - Final project suggestions by
Frontiers of Computational Journalism - Final project suggestionsFrontiers of Computational Journalism - Final project suggestions
Frontiers of Computational Journalism - Final project suggestionsJonathan Stray
98 views11 slides
Frontiers of Computational Journalism week 4 - Statistical Inference by
Frontiers of Computational Journalism week 4 - Statistical InferenceFrontiers of Computational Journalism week 4 - Statistical Inference
Frontiers of Computational Journalism week 4 - Statistical InferenceJonathan Stray
338 views54 slides

More from Jonathan Stray(9)

Frontiers of Computational Journalism week 11 - Privacy and Security by Jonathan Stray
Frontiers of Computational Journalism week 11 - Privacy and SecurityFrontiers of Computational Journalism week 11 - Privacy and Security
Frontiers of Computational Journalism week 11 - Privacy and Security
Jonathan Stray720 views
Frontiers of Computational Journalism week 10 - Truth and Trust by Jonathan Stray
Frontiers of Computational Journalism week 10 - Truth and TrustFrontiers of Computational Journalism week 10 - Truth and Trust
Frontiers of Computational Journalism week 10 - Truth and Trust
Jonathan Stray5.1K views
Frontiers of Computational Journalism week 9 - Knowledge representation by Jonathan Stray
Frontiers of Computational Journalism week 9 - Knowledge representationFrontiers of Computational Journalism week 9 - Knowledge representation
Frontiers of Computational Journalism week 9 - Knowledge representation
Jonathan Stray2.6K views
Frontiers of Computational Journalism week 8 - Visualization and Network Anal... by Jonathan Stray
Frontiers of Computational Journalism week 8 - Visualization and Network Anal...Frontiers of Computational Journalism week 8 - Visualization and Network Anal...
Frontiers of Computational Journalism week 8 - Visualization and Network Anal...
Jonathan Stray700 views
Frontiers of Computational Journalism - Final project suggestions by Jonathan Stray
Frontiers of Computational Journalism - Final project suggestionsFrontiers of Computational Journalism - Final project suggestions
Frontiers of Computational Journalism - Final project suggestions
Jonathan Stray98 views
Frontiers of Computational Journalism week 4 - Statistical Inference by Jonathan Stray
Frontiers of Computational Journalism week 4 - Statistical InferenceFrontiers of Computational Journalism week 4 - Statistical Inference
Frontiers of Computational Journalism week 4 - Statistical Inference
Jonathan Stray338 views
Frontiers of Computational Journalism week 3 - Information Filter Design by Jonathan Stray
Frontiers of Computational Journalism week 3 - Information Filter DesignFrontiers of Computational Journalism week 3 - Information Filter Design
Frontiers of Computational Journalism week 3 - Information Filter Design
Jonathan Stray530 views
Frontiers of Computational Journalism week 2 - Text Analysis by Jonathan Stray
Frontiers of Computational Journalism week 2 - Text AnalysisFrontiers of Computational Journalism week 2 - Text Analysis
Frontiers of Computational Journalism week 2 - Text Analysis
Jonathan Stray527 views
Frontiers of Computational Journalism week 1 - Introduction and High Dimensio... by Jonathan Stray
Frontiers of Computational Journalism week 1 - Introduction and High Dimensio...Frontiers of Computational Journalism week 1 - Introduction and High Dimensio...
Frontiers of Computational Journalism week 1 - Introduction and High Dimensio...
Jonathan Stray481 views

Recently uploaded

BÀI TẬP BỔ TRỢ TIẾNG ANH 11 THEO ĐƠN VỊ BÀI HỌC - CẢ NĂM - CÓ FILE NGHE (FRIE... by
BÀI TẬP BỔ TRỢ TIẾNG ANH 11 THEO ĐƠN VỊ BÀI HỌC - CẢ NĂM - CÓ FILE NGHE (FRIE...BÀI TẬP BỔ TRỢ TIẾNG ANH 11 THEO ĐƠN VỊ BÀI HỌC - CẢ NĂM - CÓ FILE NGHE (FRIE...
BÀI TẬP BỔ TRỢ TIẾNG ANH 11 THEO ĐƠN VỊ BÀI HỌC - CẢ NĂM - CÓ FILE NGHE (FRIE...Nguyen Thanh Tu Collection
100 views91 slides
StudioX.pptx by
StudioX.pptxStudioX.pptx
StudioX.pptxNikhileshSathyavarap
106 views18 slides
Introduction to AERO Supply Chain - #BEAERO Trainning program by
Introduction to AERO Supply Chain  - #BEAERO Trainning programIntroduction to AERO Supply Chain  - #BEAERO Trainning program
Introduction to AERO Supply Chain - #BEAERO Trainning programGuennoun Wajih
123 views78 slides
Guidelines & Identification of Early Sepsis DR. NN CHAVAN 02122023.pptx by
Guidelines & Identification of Early Sepsis DR. NN CHAVAN 02122023.pptxGuidelines & Identification of Early Sepsis DR. NN CHAVAN 02122023.pptx
Guidelines & Identification of Early Sepsis DR. NN CHAVAN 02122023.pptxNiranjan Chavan
42 views48 slides
Interaction of microorganisms with vascular plants.pptx by
Interaction of microorganisms with vascular plants.pptxInteraction of microorganisms with vascular plants.pptx
Interaction of microorganisms with vascular plants.pptxMicrobiologyMicro
58 views33 slides
MIXING OF PHARMACEUTICALS.pptx by
MIXING OF PHARMACEUTICALS.pptxMIXING OF PHARMACEUTICALS.pptx
MIXING OF PHARMACEUTICALS.pptxAnupkumar Sharma
125 views35 slides

Recently uploaded(20)

BÀI TẬP BỔ TRỢ TIẾNG ANH 11 THEO ĐƠN VỊ BÀI HỌC - CẢ NĂM - CÓ FILE NGHE (FRIE... by Nguyen Thanh Tu Collection
BÀI TẬP BỔ TRỢ TIẾNG ANH 11 THEO ĐƠN VỊ BÀI HỌC - CẢ NĂM - CÓ FILE NGHE (FRIE...BÀI TẬP BỔ TRỢ TIẾNG ANH 11 THEO ĐƠN VỊ BÀI HỌC - CẢ NĂM - CÓ FILE NGHE (FRIE...
BÀI TẬP BỔ TRỢ TIẾNG ANH 11 THEO ĐƠN VỊ BÀI HỌC - CẢ NĂM - CÓ FILE NGHE (FRIE...
Introduction to AERO Supply Chain - #BEAERO Trainning program by Guennoun Wajih
Introduction to AERO Supply Chain  - #BEAERO Trainning programIntroduction to AERO Supply Chain  - #BEAERO Trainning program
Introduction to AERO Supply Chain - #BEAERO Trainning program
Guennoun Wajih123 views
Guidelines & Identification of Early Sepsis DR. NN CHAVAN 02122023.pptx by Niranjan Chavan
Guidelines & Identification of Early Sepsis DR. NN CHAVAN 02122023.pptxGuidelines & Identification of Early Sepsis DR. NN CHAVAN 02122023.pptx
Guidelines & Identification of Early Sepsis DR. NN CHAVAN 02122023.pptx
Niranjan Chavan42 views
Interaction of microorganisms with vascular plants.pptx by MicrobiologyMicro
Interaction of microorganisms with vascular plants.pptxInteraction of microorganisms with vascular plants.pptx
Interaction of microorganisms with vascular plants.pptx
Payment Integration using Braintree Connector | MuleSoft Mysore Meetup #37 by MysoreMuleSoftMeetup
Payment Integration using Braintree Connector | MuleSoft Mysore Meetup #37Payment Integration using Braintree Connector | MuleSoft Mysore Meetup #37
Payment Integration using Braintree Connector | MuleSoft Mysore Meetup #37
JQUERY.pdf by ArthyR3
JQUERY.pdfJQUERY.pdf
JQUERY.pdf
ArthyR3107 views
Nelson_RecordStore.pdf by BrynNelson5
Nelson_RecordStore.pdfNelson_RecordStore.pdf
Nelson_RecordStore.pdf
BrynNelson550 views
11.30.23A Poverty and Inequality in America.pptx by mary850239
11.30.23A Poverty and Inequality in America.pptx11.30.23A Poverty and Inequality in America.pptx
11.30.23A Poverty and Inequality in America.pptx
mary850239181 views
Guess Papers ADC 1, Karachi University by Khalid Aziz
Guess Papers ADC 1, Karachi UniversityGuess Papers ADC 1, Karachi University
Guess Papers ADC 1, Karachi University
Khalid Aziz105 views
INT-244 Topic 6b Confucianism by S Meyer
INT-244 Topic 6b ConfucianismINT-244 Topic 6b Confucianism
INT-244 Topic 6b Confucianism
S Meyer49 views

Frontiers of Computational Journalism week 5 - Algorithmic Accountability and Discrimination

  • 1. Frontiers of Computational Journalism Columbia Journalism School Week 5: Algorithmic Accountability and Discrimination October 10, 2018
  • 2. This class • Algorithms in society • Previous algorithmic accountability stories • Measuring race and gender Bias • Unpacking ProPublica’s “Machine Bias”
  • 4. Algorithms in society • Personalized search • Political microtargeting • Credit score / loans / insurance • Predictive policing • Price discrimination • Algorithmic trading / markets • Terrorist threat prediction • Hiring models
  • 5. Institute for the Future’s “unintended harms of technology” ethicalos.org
  • 7. Algorithm Tips: A Resource for Algorithmic Accountability in Government Trielli, Stark, Diakopolous Search terms used to find “algorithms” on .gov sites
  • 8. Algorithm Tips: A Resource for Algorithmic Accountability in Government Trielli, Stark, Diakopolous
  • 11. Analyzing Released NYC Value-Added Data Part 2, Gary Rubenstein
  • 12. Websites Vary Prices, Deals Based on Users' Information Valentino-Devries, Singer-Vine and Soltani, WSJ, 2012
  • 13. Message Machine Jeff Larson, Al Shaw, ProPublica, 2012
  • 14. Chicago Area Disparities in Car Insurance Premiums Al Shaw, Jeff Larson, Julia Angwin, ProPublica, 2017
  • 15. Minority Neighborhoods Pay Higher Car Insurance Premiums Than White Areas With the Same Risk, Angwin, Larson, Kirchner, Mattu, ProPublica, 2017
  • 16. How Uber surge pricing really works, Nick Diakopolous
  • 17. Measuring Race and Gender Bias
  • 18. Title VII of Civil Rights Act, 1964 • It shall be an unlawful employment practice for an employer - • (1) to fail or refuse to hire or to discharge any individual, or otherwise to discriminate against any individual with respect to his compensation, terms, conditions, or privileges of employment, because of such individual’s race, color, religion, sex, or national origin; or • (2) to limit, segregate, or classify his employees or applicants for employment in any way which would deprive or tend to deprive any individual of employment opportunities or otherwise adversely affect his status as an employee, because of such individual’s race, color, religion, sex, or national origin.
  • 19. Regulated Domains Credit (Equal Credit Opportunity Act) Education (Civil Rights Act of 1964; Education Amendments of 1972) Employment (Civil Rights Act of 1964) Housing (Fair Housing Act) ‘Public Accommodation’ (Civil Rights Act of 1964) Fairness in Machine Learning, NIPS 2017 Tutorial Solon Barocas and Moritz Hardt
  • 20. Protected Classes Race (Civil Rights Act of 1964); Color (Civil Rights Act of 1964); Sex (Equal Pay Act of 1963; Civil Rights Act of 1964); Religion (Civil Rights Act of 1964);National origin (Civil Rights Act of 1964); Citizenship (Immigration Reform and Control Act); Age (Age Discrimination in Employment Act of 1967);Pregnancy (Pregnancy Discrimination Act); Familial status (Civil Rights Act of 1968); Disability status (Rehabilitation Act of 1973; Americans with Disabilities Act of 1990); Veteran status (Vietnam Era Veterans' Readjustment Assistance Act of 1974; Uniformed Services Employment and Reemployment Rights Act); Genetic information (Genetic Information Nondiscrimination Act) Fairness in Machine Learning, NIPS 2017 Tutorial Solon Barocas and Moritz Hardt
  • 21. Race and gender correlate with everything
  • 22. From Kosinski et. al., Private traits and attributes are predictable from digital records of human behavior Learning from Facebook likes
  • 24. Predicting gender from Twitter Zamal et. al., Homophily and Latent Attribute Inference: Inferring Latent Attributes of Twitter Users from Neighbors
  • 25. Predicting race from Twitter Pennacchiotti and Popescu, A Machine Learning Approach to Twitter User Classification
  • 27. Should Prison Sentences Be Based On Crimes That Haven’t Been Committed Yet?, FiveThirtyEight
  • 28. COMPAS ”CORE” questionnaire, 2011. Includes criminal history, family history, gang involvement, drug use…
  • 29. How We Analyzed the COMPAS Recidivism Algorithm, ProPublica
  • 31. ProPublica argument False positive rate P(high risk |black, no arrest) = C/(C+A) = 0.45 P(high risk |white, no arrest) = G/(G+E) = 0.23 False negative rate P(low risk | black, arrested ) = B/(B+D) = 0.28 P(low risk | white, arrested ) = F/(F+H) = 0.48 Northpointe response Positive predictive value P(arrest| black, high risk) = D/(C+D) = 0.63 P(arrest| white, high risk) = H/(G+H) = 0.59
  • 32. P(outcome | score) is fair Fair prediction with disparate impact: A study of bias in recidivism prediction instruments, Chouldechova
  • 33. How We Analyzed the COMPAS Recidivism Algorithm, ProPublica Or, as ProPublica put it
  • 34. Even if two groups of the population admit simple classifiers, the whole population may not (from How Big Data is Unfair)
  • 35. The Problem Fair prediction with disparate impact: A study of bias in recidivism prediction instruments, Chouldechova
  • 36. When the base rates differ by protected group and when there is not separation, one cannot have both conditional use accuracy equality and equality in the false negative and false positive rates. … The goal of complete race or gender neutrality is unachievable. … Altering a risk algorithm to improve matters can lead to difficult stakeholder choices. If it is essential to have conditional use accuracy equality, the algorithm will produce different false positive and false negative rates across the protected group categories. Conversely, if it is essential to have the same rates of false positives and false negatives across protected group categories, the algorithm cannot produce conditional use accuracy equality. Stakeholders will have to settle for an increase in one for a decrease in the other. Fairness in Criminal Justice Risk Assessments: The State of the Art, Berk et. al. Impossibility theorem