
Algorithmic Bias: What is it? Why should we care? What can we do about it?

Interactive Workshop presented February 27, 2019 at the UMD Summit on Equity, Race, and Ethnicity: Justice and Equity for All?


  1. Algorithmic Bias: What is it? Why should we care? What can we do about it?
     Ted Pedersen, Department of Computer Science / UMD
     tpederse@d.umn.edu | @SeeTedTalk | http://umn.edu/home/tpederse
  2. Me? Computer Science Professor at UMD since 1999; research in Natural Language Processing since even before then.
     How can we determine what a word means in a given context? Automatically, with a computer.
     I have used Machine Learning and other data-driven techniques for many years. In the last decade these techniques have entered the real world, and it is important to think about the impacts and consequences of that.
  3. Our Plan: What are Algorithms? What is Bias? What is Algorithmic Bias? What are some examples of Algorithmic Bias? Why should we care? What can we do about it?
     This is an interactive workshop: I'll talk, and I hope you will too. At various points along the way we'll share some ideas and experiences.
  4. What are Algorithms? A series of steps that we follow to accomplish a task. Computer programs are a specific way of describing an algorithm. For example (a runnable version appears below):

         IF (MAJOR == 'Computer Science') AND (GPA > 3.00)
         THEN PRINT job offer letter
         ELSE DELETE application
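Here is a minimal runnable version of that rule in Python. The Applicant record and its field names are hypothetical, invented only to make the slide's pseudocode concrete:

```python
from dataclasses import dataclass

@dataclass
class Applicant:
    # Hypothetical fields mirroring the slide's MAJOR and GPA.
    major: str
    gpa: float

def screen(applicant: Applicant) -> str:
    """The slide's rule, step by step: that is all an algorithm is."""
    if applicant.major == "Computer Science" and applicant.gpa > 3.00:
        return "print job offer letter"
    return "delete application"

print(screen(Applicant("Computer Science", 3.40)))  # print job offer letter
print(screen(Applicant("Mathematics", 3.90)))       # delete application
```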
  5. What is Machine Learning / Artificial Intelligence? The two terms are often used synonymously. We can think of them as a special class of algorithms, and these are often the source of algorithmic bias.
     Machine Learning algorithms find patterns in data and use those patterns to build classifiers that make decisions on our behalf. These classifiers can be simple sets of rules (IF THEN ELSE) or more complicated models where features are automatically assigned weights (a sketch of the weighted case follows).
     These algorithms are often very complex and very mathematical, and it is not easy to understand what they are doing (even for experts).
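A minimal sketch of the "weighted features" case, assuming scikit-learn (the slides name no particular library) and invented data. A logistic regression assigns each feature a learned weight rather than following an explicit IF/THEN rule:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical training data: two features per applicant (GPA, years of
# experience) and a past hire/no-hire decision to learn from.
X = np.array([[3.8, 2], [3.5, 4], [2.1, 0], [2.9, 1], [3.9, 5], [2.4, 3]])
y = np.array([1, 1, 0, 0, 1, 0])  # 1 = hired, 0 = rejected

model = LogisticRegression().fit(X, y)

# The "pattern" the algorithm found is just a weight per feature:
print(model.coef_, model.intercept_)
print(model.predict([[3.2, 2]]))  # a decision made on our behalf
```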
  6. What is Bias? Whatever causes an unfair action or representation that often leads to harm. Its origins can be in prejudice, hate, or ignorance, and real life is full of examples.
     But how does this relate to algorithms? Machine Learning is complex and mathematical, so isn't it objective?
  7. Machine Learning and Algorithmic Bias:

         IF (MAJOR == 'Computer Science') AND (GENDER == 'Male') AND (GPA > 3.00)
         THEN PRINT job offer letter
         ELSE DELETE application

     Unreasonable? Unfair? Harmful? Biased? Yes. But a Machine Learning system could easily learn this rule from your hiring history if your company has only employed male programmers (a sketch follows).
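A sketch of how that could happen, again assuming scikit-learn and an invented hiring-history table. Nobody writes the gender test by hand; the tree discovers it because the historical labels reward it:

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical hiring history, one row per applicant: [cs_major, male, gpa].
# In this history, only male CS majors were ever hired.
X = [[1, 1, 3.6], [1, 1, 3.2], [1, 0, 3.9], [0, 1, 3.1], [1, 0, 3.5], [1, 1, 3.8]]
y = [1, 1, 0, 0, 0, 1]  # 1 = hired, 0 = rejected

tree = DecisionTreeClassifier().fit(X, y)
print(export_text(tree, feature_names=["cs_major", "male", "gpa"]))
# The learned tree splits on "male": the bias in the history has been
# codified into the model, even though nobody programmed it explicitly.
```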
  8. Discussion prompt: What kind of data could lead Machine Learning to biased conclusions?
  9. What is Algorithmic Bias? Whatever causes an algorithm to produce unfair actions or representations.
     The data that Machine Learning / AI rely on is often created by humans, or by other algorithms! There are many, many decisions along the way to developing a computer system where humans and the data they create enter the process. Biases that exist in a workplace, community, or culture can (easily) enter that process and be codified in programs and models.
     Many examples follow …
  10. Facial recognition systems that don't "see" non-white faces.
      Joy Buolamwini / MIT, Twitter: @jovialjoy
      How I'm Fighting Bias in Algorithms (TED talk): https://www.youtube.com/watch?v=UG_X_7g63rY
      Gender Shades: http://gendershades.org/
      Nova: https://www.pbs.org/wgbh/nova/article/ai-bias/
  11. Risk assessment systems that overstate the odds of black men being a flight risk or re-offending.
      ProPublica investigation (focused on Broward County, Florida): https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing
      Wisconsin also has some history: https://www.wisconsinwatch.org/2019/02/q-a-risk-assessments-explained/
  12. Resume screening systems that filter out women.
      Amazon Scraps Secret AI Recruiting Tool, Reuters story (Oct 2018): https://www.reuters.com/article/us-amazon-com-jobs-automation-insight/amazon-scraps-secret-ai-recruiting-tool-that-showed-bias-against-women-idUSKCN1MK08G
      Hiring Algorithms are not Neutral, Harvard Business Review (Nov 2016): https://hbr.org/2016/12/hiring-algorithms-are-not-neutral
  13. Online advertising that systematically suggests that people with "black" names are more likely to have criminal records.
      Latanya Sweeney / Harvard: http://latanyasweeney.org
      CACM paper (April 2013): https://queue.acm.org/detail.cfm?id=2460278
      MIT Technology Review (Feb 2013): https://www.technologyreview.com/s/510646/racism-is-poisoning-online-ad-delivery-says-harvard-professor/
  14. Search engines that rank hate speech, misinformation, and pornography highly in response to neutral queries.
      Safiya Umoja Noble / USC, Oxford U, Twitter: @safiyanoble
      Algorithms of Oppression: How Search Engines Reinforce Racism: https://www.youtube.com/watch?v=Q7yFysTBpAo
  15. Discussion prompt: What examples of Algorithmic Bias have you encountered?
  16. Where does Algorithmic Bias come from? Machine Learning isn't magic; there is a lot of human engineering that goes into these systems:
      1) Create or collect training data
      2) Decide what features in the data are relevant and important
      3) Decide what you want to predict or classify, and what you conclude from that
      Bias can be introduced at any (or all) of these points.
  17. How does Bias affect Training Data? Historical bias: the data captures bias and unfairness that has existed in society.
      Marginalized communities are over-policed, so there is more data about searches and arrests, which leads to predictions of more of the same.
      Women are not well represented in computing, so there is little data about their hiring and success, which leads to predictions to keep doing more of the same.
      What if we add more training data? Adding more training data just gives you more historical bias.
  18. How does Bias affect Training Data? Representational bias: the sample in the training data is skewed, not representative of the entire possible population.
      A facial recognition system is trained on photographs of faces, but 80% of the faces are white, and 75% of those are male.
      A fake profile detector is trained on a name database made up of First Last names (John Smith, Mary Jones), so other names are more likely to be considered "fake".
      If we are careful and add more representative data, this might help. Note that a system can have high overall accuracy while doing poorly on smaller classes (see the sketch below).
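A small numeric illustration of that last point, with invented numbers rather than any real system: the overall accuracy looks excellent while one group is served badly.

```python
# Hypothetical evaluation: 900 majority-group faces, 100 minority-group faces.
# The classifier gets 99% of the majority right but only 60% of the minority.
majority_total, majority_correct = 900, 891   # 99% accurate
minority_total, minority_correct = 100, 60    # 60% accurate

overall = (majority_correct + minority_correct) / (majority_total + minority_total)
print(f"overall accuracy:  {overall:.1%}")                             # 95.1%
print(f"majority accuracy: {majority_correct / majority_total:.1%}")   # 99.0%
print(f"minority accuracy: {minority_correct / minority_total:.1%}")   # 60.0%
# Reporting only the overall number hides the 60% -- this is why
# per-group evaluation (as in Gender Shades) matters.
```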
  19. Features: What features do we decide to include in our data? What information do we collect in surveys, applications, arrest reports, etc.? What information do we give to our Machine Learning algorithms?
      "We don't collect information about race or gender!" Does that mean our system is free from racism or sexism?
  20. Discussion prompt: What features could signal race (without stating it)?

  21. Discussion prompt: What features could signal gender (without stating it)?
  22. Proxies as Conclusions: We often want to predict outcomes that we can't specifically measure. Proxies are features that stand in for that outcome.
      Will a student succeed in college? What do we mean by success? Finish the first year, graduate, make the Dean's List, be active in student clubs?
      What proxies can we use to predict "success"?
  23. Discussion prompt: What proxies might be used to evaluate job candidates?

  24. Discussion prompt: What proxies might decide if a search result is "good"?
  25. The Problem with Proxies: they often end up measuring something else, something that introduces bias (a sketch follows):
      1. Socio-economic status
      2. Race
      3. Gender
      4. Religion
      ...
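A sketch of how a proxy can leak a protected attribute, with entirely invented data. "Zip code" is never labeled as race, but if neighborhoods in the (hypothetical) data are segregated, a model using zip code effectively uses race:

```python
from sklearn.tree import DecisionTreeClassifier

# Invented records: [zip_code, gpa]. Race is never given to the model, but in
# this made-up city the two zip codes are racially segregated, and the
# historical "approved" labels track neighborhood rather than GPA.
X = [[55401, 3.1], [55401, 2.6], [55401, 2.9],
     [55411, 3.8], [55411, 3.4], [55411, 3.0]]
y = [1, 1, 1, 0, 0, 0]  # historical approve (1) / deny (0) decisions

model = DecisionTreeClassifier().fit(X, y)
print(model.predict([[55411, 4.0]]))  # [0]: denied despite a perfect GPA
# Omitting the race column did not remove the bias; zip code carried it in.
```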
  26. Why should we care? Feedback loops: algorithms are making decisions about us and for us, and those decisions become data for the next round of learning algorithms. Biased decisions today become the biased machine learning training data of tomorrow. Machine Learning is great if you want the future to look like the past (a toy simulation follows).
      Two different kinds of harm (Kate Crawford & colleagues): resources are allocated based on algorithms, and representations are reinforced and amplified by algorithms.
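A toy simulation of the feedback loop, with all numbers invented: patrols are sent where past arrests were recorded, and new arrests follow the patrols, so the system reproduces its own history.

```python
# Two districts with identical true offense rates, but district A starts
# with more recorded arrests because it was historically over-policed.
arrests = {"A": 60, "B": 40}

for year in range(1, 6):
    total = arrests["A"] + arrests["B"]
    # Patrols are allocated in proportion to past arrest data...
    patrols = {d: n / total for d, n in arrests.items()}
    # ...and new arrests follow the patrols, not the (equal) offense rates.
    for d in arrests:
        arrests[d] += round(100 * patrols[d])
    share = arrests["A"] / (arrests["A"] + arrests["B"])
    print(f"year {year}: district A share of arrests = {share:.0%}")
# Prints 60% every year: the system never discovers that the districts are
# identical. Yesterday's biased decisions are tomorrow's training data, so
# the future keeps looking like the past.
```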
  27. What can we do about it? Say something.
      UMD Climate: http://d.umn.edu/campus-climate
      Algorithmic Justice League (report bias): https://www.ajlunited.org/fight#report-bias
      Share it, tweet it. Screenshots and other documentation are very important.
  28. What can we do about it? Learn more.
      AI Now Institute 2018 Annual Report, which includes 10 recommendations for AI: https://ainowinstitute.org/AI_Now_2018_Report.pdf
      Algorithmic Accountability Policy Toolkit: https://ainowinstitute.org/aap-toolkit.pdf
  29. What can we do? Learn more.
      Kate Crawford / Microsoft Research, AI Now Institute, Twitter: @katecrawford
      The Trouble with Bias: https://www.youtube.com/watch?v=fMym_BKWQzk
      There is a Blind Spot in AI Research: https://www.nature.com/news/there-is-a-blind-spot-in-ai-research-1.20805
  30. What can we do? Learn more.
      Virginia Eubanks / U of Albany, Twitter: @PopTechWorks
      Automating Inequality: How High-Tech Tools Profile, Police, and Punish the Poor: https://www.youtube.com/watch?v=TmRV17kAumc
  31. What can we do? Learn more.
      Cathy O'Neil, Twitter: @mathbabedotorg
      Weapons of Math Destruction: https://www.youtube.com/watch?v=TQHs8SA1qpk
  32. Conclusion: Algorithms are not objective; they can be used to codify and harden biases under the guise of technology. Machine Learning is great if you want the future to look like the past.
      We should expect transparency and accountability from algorithms: Why did it make this decision? What consequences exist when decisions are biased?
