Algorithmic Bias : What is it? Why should we care? What can we do about it?
1. Algorithmic Bias
What is it? Why should we care?
What can we do about it?
Ted Pedersen
Department of Computer Science / UMD
tpederse@d.umn.edu
@SeeTedTalk
http://umn.edu/home/tpederse
2. Me?
Computer Science Professor at UMD since 1999
Research in Natural Language Processing since even before then
How can we determine what a word means in a given context?
Automatically, with a computer
Have used Machine Learning and other Data Driven techniques for many years
In the last decade these techniques have entered the real world
Important to think about impacts and consequences of that
3. Our Plan
What are Algorithms? What is Bias? What is Algorithmic Bias?
What are some examples of Algorithmic Bias?
Why should we care?
What can we do about it?
Interactive Workshop - I'll talk, and I hope you will too. At various points along the
way we'll share some ideas and experiences.
4. What are Algorithms?
A series of steps that we follow to accomplish a task.
Computer programs are a specific way of describing an algorithm.
IF (MAJOR == "Computer Science") AND (GPA > 3.00)
THEN PRINT job offer letter
ELSE DELETE application
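As a rough illustration, the rule above could be written as a small Python function. The function name `screen_application` and the string results are assumptions made for this sketch; they are not part of the original slide.

```python
def screen_application(major: str, gpa: float) -> str:
    """Apply the hand-written hiring rule: CS major and GPA above 3.00."""
    if major == "Computer Science" and gpa > 3.00:
        return "print job offer letter"
    return "delete application"


print(screen_application("Computer Science", 3.5))
print(screen_application("History", 3.9))
```

Every step here was written by a person, so anyone reading the code can see exactly why an application was kept or deleted.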
5. What is Machine Learning / Artificial Intelligence?
Machine Learning and AI are often used synonymously. We can think of them as a
special class of algorithms. These are often the source of algorithmic bias.
Machine Learning algorithms find patterns in data and use those to build
classifiers that make decisions on our behalf.
These classifiers can be simple sets of rules (IF THEN ELSE) or they might be
more complicated models where features are automatically assigned weights.
These algorithms are often very complex and very mathematical. Not easy to
understand what they are doing (even for experts).
6. What is Bias?
Whatever causes an unfair action or representation that often leads to harm.
Origins can be in prejudice, hate, or ignorance.
Real life is full of many examples.
But how does this relate to Algorithms?
Machine Learning is complex and mathematical, so isn't it objective??
7. Machine Learning and Algorithmic Bias
IF (MAJOR == "Computer Science") AND (GENDER == "Male") AND (GPA > 3.00)
THEN PRINT job offer letter
ELSE DELETE application
Unreasonable? Unfair? Harmful? Biased? Yes. But a Machine Learning system
could easily learn this rule from your hiring history if your company has only
employed male programmers.
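A minimal sketch of how such a rule could be picked up from data. The hiring history below is invented for illustration, and the "learner" is a toy that only counts past outcomes per gender; it is not a real Machine Learning library, and the names `HISTORY`, `learn_hire_rate_by_gender`, and `predict` are hypothetical.

```python
from collections import defaultdict

# Hypothetical past decisions: each record is (gender, gpa, hired).
# All applicants are Computer Science majors; only men were ever hired.
HISTORY = [
    ("Male", 3.6, True),
    ("Male", 3.2, True),
    ("Male", 2.8, False),
    ("Female", 3.9, False),
    ("Female", 3.4, False),
    ("Female", 2.7, False),
]


def learn_hire_rate_by_gender(history):
    """Return the fraction of past applicants who were hired, per gender."""
    hired = defaultdict(int)
    total = defaultdict(int)
    for gender, _gpa, was_hired in history:
        total[gender] += 1
        hired[gender] += int(was_hired)
    return {gender: hired[gender] / total[gender] for gender in total}


def predict(rates, gender):
    """Recommend an offer only if at least half of similar past applicants were hired."""
    return rates.get(gender, 0.0) >= 0.5


rates = learn_hire_rate_by_gender(HISTORY)
print(rates)
print(predict(rates, "Female"))
```

Nobody wrote "GENDER == Male" into this program. The pattern comes entirely from the past decisions, which is why the highest GPA in the data (3.9) still ends in a rejection.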
8. What kind of data could lead Machine Learning to biased
conclusions?
1.
2.
3.
9. What is Algorithmic Bias?
Whatever causes an algorithm to produce unfair actions or representations.
The data that Machine Learning / AI rely on is often created by humans, or by
other algorithms!
There are many decisions along the way to developing a computer system where
humans, and the data they create, enter the process.
Biases that exist in a workplace, community, or culture can (easily) enter into the
process and be codified in programs and models.
Many examples ...
10. Facial recognition systems that don't "see" non-white faces
Joy Buolamwini / MIT
Twitter : @jovialjoy
How I'm Fighting Bias in Algorithms (TED talk) :
https://www.youtube.com/watch?v=UG_X_7g63rY
Gender Shades :
http://gendershades.org/
Nova :
https://www.pbs.org/wgbh/nova/article/ai-bias/
11. Risk assessment systems that overstate the odds of black
men being a flight risk or re-offending
ProPublica investigation (focused on Broward County, Florida):
https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing
Wisconsin also has some history:
https://www.wisconsinwatch.org/2019/02/q-a-risk-assessments-explained/
12. Resume screening systems that filter out women
Amazon Scraps Secret AI Recruiting Tool - Reuters story (Oct 2018) :
https://www.reuters.com/article/us-amazon-com-jobs-automation-insight/amazon-scraps-secret-ai-recruiting-tool-that-showed-bias-against-women-idUSKCN1MK08G
Hiring Algorithms are not Neutral - Harvard Business Review (Nov 2016) :
https://hbr.org/2016/12/hiring-algorithms-are-not-neutral
13. Online advertising that systematically suggests that people
with "black" names are more likely to have criminal records
Latanya Sweeney / Harvard
http://latanyasweeney.org
CACM paper (April 2013):
https://queue.acm.org/detail.cfm?id=2460278
MIT Technology Review (Feb 2013):
https://www.technologyreview.com/s/510646/racism-is-poisoning-online-ad-delivery-says-harvard-professor/
14. Search engines that rank hate speech, misinformation, and
pornography highly in response to neutral queries
Safiya Umoja Noble / USC, Oxford U
Twitter : @safiyanoble
Algorithms of Oppression: How Search Engines
Reinforce Racism :
https://www.youtube.com/watch?v=Q7yFysTBpAo
15. What examples of Algorithmic Bias have you encountered?
1.
2.
3.
16. Where does Algorithmic Bias come from?
Machine Learning isn't magic. There is a lot of human engineering that goes into
these systems.
1) Create or collect training data
2) Decide what features in the data are relevant and important
3) Decide what you want to predict or classify and what you conclude from that
Bias can be introduced at any (or all) of these points
17. How does Bias affect Training Data?
Historical Bias - data captures bias and unfairness that has existed in society
Marginalized communities are over-policed, so there is more data about
searches and arrests, which leads to predictions of more of the same
Women are not well represented in computing, so there is little data about
their hiring and success, which leads to predictions to keep doing more of the same
What if we add more training data??
Adding more training data just gives you more historical bias.
18. How does Bias affect Training Data?
Representational Bias - sample in training data is skewed or not representative of
entire possible population
Facial recognition system is trained on photographs of faces. 80% of faces
are white, 75% of those are male.
Fake profile detector trained on name database made up of First Last names
(John Smith, Mary Jones). Other names more likely to be considered "fake".
If we are careful and add more representative data, this might help.
Can have high overall accuracy while doing poorly on smaller classes.
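The last point can be checked with simple arithmetic. The counts below are invented for illustration (they are not results from any real system), and the group names are placeholders.

```python
# Hypothetical evaluation counts: group -> (number correct, number tested).
RESULTS = {
    "majority group": (780, 800),
    "minority group": (120, 200),
}


def accuracy(correct, total):
    """Fraction of test cases the system got right."""
    return correct / total


overall_correct = sum(correct for correct, _ in RESULTS.values())
overall_total = sum(total for _, total in RESULTS.values())
overall = accuracy(overall_correct, overall_total)
per_group = {name: accuracy(c, t) for name, (c, t) in RESULTS.items()}

print(f"overall: {overall:.3f}")
for name, acc in per_group.items():
    print(f"{name}: {acc:.3f}")
```

With these made-up numbers the system is right 900 times out of 1000 overall, yet only 120 times out of 200 for the smaller group. Reporting accuracy per group, not just overall, is what makes the gap visible.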
19. Features
What features do we decide to include in our data?
What information do we collect in surveys, applications, arrest reports, etc?
What information do we give to our Machine Learning algorithms?
We don't collect information about race or gender!
Does that mean our system is free from racism or sexism?
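One way to probe that question is to check whether a feature we do collect can stand in for one we left out. The sketch below uses invented records and placeholder zip codes ("Z1", "Z2") and group labels ("A", "B"); the function name `proxy_strength` is hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical applicants: (zip_code, group). The group column is NOT
# given to the model; it is used here only to audit the zip_code feature.
APPLICANTS = [
    ("Z1", "A"), ("Z1", "A"), ("Z1", "A"), ("Z1", "B"),
    ("Z2", "B"), ("Z2", "B"), ("Z2", "B"), ("Z2", "A"),
]


def proxy_strength(records):
    """Fraction of records whose group is guessed correctly from the feature alone.

    For each feature value, guess the most common group seen with that value.
    """
    by_value = defaultdict(Counter)
    for value, group in records:
        by_value[value][group] += 1
    correct = sum(counts.most_common(1)[0][1] for counts in by_value.values())
    return correct / len(records)


print(proxy_strength(APPLICANTS))
```

In this toy data the two groups are the same size, so guessing without any feature would be right half the time. Knowing only the zip code raises that to 6 out of 8, which means a model given zip code has indirect access to group membership even though the group was never collected.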
22. Proxies as Conclusions
We often want to predict outcomes that we can't specifically measure. Proxies are
features that stand in for that outcome.
Will a student succeed in college?
What do we mean by success?
Finish first year, graduate, make Dean's List, active in student clubs ???
What proxies can we use to predict "success"?
???
24. What proxies might decide if a search result is "good"?
1.
2.
3.
4.
25. The Problem with Proxies
They often end up measuring something else, something that introduces bias
1. Socio-economic status
2. Race
3. Gender
4. Religion
5.
6.
7.
8.
9.
26. Why should we care?
Feedback loops
Algorithms are making decisions about us and for us, and those decisions
become data for the next round of learning algorithms. Biased decisions today
become the biased machine learning training data of tomorrow.
Machine Learning is great if you want the future to look like the past.
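A toy simulation of such a feedback loop, under strong simplifying assumptions: two areas with the same true incident rate, patrols allocated in proportion to past records, and new records created only where patrols go. All numbers and the function name `simulate_feedback` are invented for this sketch.

```python
def simulate_feedback(initial_records, rounds, patrols_per_round=100, true_rate=0.1):
    """Allocate patrols in proportion to past records, then record new incidents.

    Both areas share the same true incident rate, so any difference in the
    records comes only from where patrols were sent.
    Returns the first area's share of all records after each round.
    """
    records = list(initial_records)
    shares = []
    for _ in range(rounds):
        total = sum(records)
        patrols = [patrols_per_round * r / total for r in records]
        new_incidents = [p * true_rate for p in patrols]
        records = [r + n for r, n in zip(records, new_incidents)]
        shares.append(records[0] / sum(records))
    return shares


shares = simulate_feedback(initial_records=[60, 40], rounds=5)
print([round(s, 2) for s in shares])
```

The first area starts with 60% of the records and keeps about 60% in every round, even though both areas behave identically. The data never corrects the initial skew because each round's decisions generate the next round's data.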
Two different kinds of harm (Kate Crawford & colleagues)
Resources are allocated based on algorithms
Representations are reinforced and amplified by algorithms.
27. What can we do about it? Say Something
UMD Climate
http://d.umn.edu/campus-climate
Algorithmic Justice League - report bias
https://www.ajlunited.org/fight#report-bias
Share it, Tweet it
Screenshots and other documentation are very important
28. What can we do about it? Learn More
AI Now Institute
2018 Annual Report, includes 10 recommendations for AI
https://ainowinstitute.org/AI_Now_2018_Report.pdf
Algorithmic Accountability Policy toolkit
https://ainowinstitute.org/aap-toolkit.pdf
29. What can we do? Learn More
Kate Crawford / Microsoft Research, AI Now Institute
Twitter : @katecrawford
The Trouble with Bias :
https://www.youtube.com/watch?v=fMym_BKWQzk
There is a Blind Spot in AI Research :
https://www.nature.com/news/there-is-a-blind-spot-in-ai-research-1.20805
30. What can we do? Learn More
Virginia Eubanks / U of Albany
Twitter : @PopTechWorks
Automating Inequality: How High-Tech Tools
Profile, Police, and Punish the Poor :
https://www.youtube.com/watch?v=TmRV17kAumc
31. What can we do? Learn More
Cathy O'Neil
Twitter : @mathbabedotorg
Weapons of Math Destruction
https://www.youtube.com/watch?v=TQHs8SA1qpk
32. Conclusion
Algorithms are not objective
Can be used to codify and harden biases under the guise of technology
Machine Learning is great if you want the future to look like the past
We should expect transparency and accountability from Algorithms
Why did it make this decision?
What consequences exist when decisions are biased?