
AI Fails: Avoiding bias in your systems

Bad AI showing sexist or racist correlations makes headlines. Nobody sets out to make a bad system, so why does this happen? I take a look at all the ways bias creeps into AI and where you should put effort to avoid it.

Slides annotated from a talk given at the ImpactfulAI meetup, 19th June 2019, London.

AI Fails: Avoiding bias in your systems

  1. AI Fails: how can you begin to overcome bias in design and test? Dr Janet Bastiman (@yssybyl), StoryStream.ai
  2. About StoryStream – the world’s leading automotive content platform. StoryStream is a dedicated automotive content platform, trusted by some of the world’s leading car brands, specifically created to help automotive brands provide a more relevant, engaging customer experience, fuelled with authentic content and designed for efficiently scaling content operations across global teams. The core StoryStream benefits: ● Grow customer engagement and conversions by up to 25% ● Reduce content creation and management costs by up to 60% ● Provide a more authentic customer experience ● Understand your customer in a deeper way
  3. Tonight I’m going to be looking at why so many big companies have a problem with bias, and what checks and balances you can put in place to help you avoid falling victim to these types of errors. For argument’s sake, I’m using AI as a superset of machine learning, deep learning and all other techniques that lead to a system that appears to make intelligent decisions. Also, since this is a short talk, bear in mind that each one of these slides warrants a full presentation in itself, so this will be a high-level introduction to get you thinking about things.
  4. AI fails - Bias ***
  5. All those headlines make us feel uncomfortable, and rightly so. They cover Amazon’s sexist recruitment AI, Google’s image tagging and the COMPAS system used to predict re-offending rates. I think everyone in this room would nod sagely about how bad these are and claim it would never happen on their watch. So why do we keep seeing this happen? What about the AI that doesn’t make the headlines, the ones quietly deciding whether you get sent a special offer, a credit card, a cancer diagnosis? The things most of us are working on. What if our work is flawed but never makes the headlines – would we know? Nobody in this field sets out to make a bad AI, so why does this happen? It’d be easy to say “have a diverse team and diverse data”, but that’s not good enough.
  6. What is bias? An unwarranted correlation between input variable and output classification. IMPACT is more important than ACCURACY.
  7. Let’s take a step back and look at “What is bias?” If I gave you all a test, no doubt you’d write answers around underfitting and overfitting – mathematical answers. Focus on the more descriptive definition: an unwarranted correlation. For many of us, from a position of privilege, it’s hard to really understand the impact of being on the receiving end of these correlations. So how are these biases introduced? Let’s take a look at the maths…
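A quick sketch of my own (not from the slides) of how you might put a first number on an “unwarranted correlation” before getting to the maths: compare the model’s positive rate across a variable the decision should not depend on. The gender and prediction arrays below are invented toy data standing in for your own model’s output.

```python
# Toy illustration: does the model's output track a protected variable it
# has no business depending on? All data here is made up.
import numpy as np

rng = np.random.default_rng(0)
gender = rng.integers(0, 2, size=5_000)                           # protected variable, 0 or 1
# a pretend model that says "yes" more often for group 1
predictions = (rng.random(5_000) < (0.30 + 0.10 * gender)).astype(int)

rate_0 = predictions[gender == 0].mean()
rate_1 = predictions[gender == 1].mean()
correlation = np.corrcoef(gender, predictions)[0, 1]

print(f"positive rate, group 0: {rate_0:.1%}")
print(f"positive rate, group 1: {rate_1:.1%}")
print(f"gap: {abs(rate_1 - rate_0):.1%}   correlation: {correlation:.3f}")
# A gap on its own is not proof of unfairness, but it is a question you
# should be able to answer before the model ships.
```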
  8. Maths. “Fairness” assumes: A. calibration within groups; B. balance for the negative class; C. balance for the positive class. All three can only be achieved if prediction is perfect or there are completely equal base rates. Conclusions: ● You cannot balance everything ● Either the prediction is unbiased or the error is unbiased ● Fairness is personal. Chouldechova https://arxiv.org/abs/1610.07524 | Kleinberg et al https://arxiv.org/abs/1609.05807
  9. Both of these papers were studies into whether the COMPAS system was deliberately biased, and they come to the same conclusions via different proofs. They are both well worth a read. As a side note, there’s a minor mathematical error in the Kleinberg paper (which does not affect the proof), but it’s worth noting that you shouldn’t just blindly implement what you read in papers. Starting from a definition of fairness, both papers conclude that unless you have perfect prediction and a balanced population, you will have either bias in positive prediction or bias in error rates.
  10. So to avoid bias, mathematically, we need to live in an unbiased world. Sadly this is not the case. You cannot have positive parity and error parity at the same time; you can only choose which is least unacceptable. COMPAS chose to minimise false negatives and as a result created something that was racially biased. For all real problems you will violate one of the fairness measures. Typically we focus on overall accuracy and most practitioners don’t think further. We are post-GDPR now, so if you are making inferences against protected variables, make sure you are storing them correctly and have some explainability (*whole other talk!)
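To see the trade-off concretely, here is a toy sketch (invented data, not the COMPAS analysis from the papers) that scores the same imperfect model against thresholded versions of the three conditions from slide 8, for two groups with different base rates. The calibration of a positive call and the two error rates cannot all come out equal.

```python
# Toy check of the three fairness conditions for two groups with different
# base rates, using the same imperfect model for both. Invented data.
import numpy as np

rng = np.random.default_rng(0)

def condition_checks(y_true, y_score, threshold=0.5):
    y_pred = y_score >= threshold
    tp = np.sum(y_pred & (y_true == 1))
    fp = np.sum(y_pred & (y_true == 0))
    fn = np.sum(~y_pred & (y_true == 1))
    tn = np.sum(~y_pred & (y_true == 0))
    return {
        "A: calibration of a positive call (precision)": tp / max(tp + fp, 1),
        "B: error against true negatives (FPR)": fp / max(fp + tn, 1),
        "C: error against true positives (FNR)": fn / max(fn + tp, 1),
    }

for name, base_rate in [("50% base rate group", 0.5), ("20% base rate group", 0.2)]:
    y_true = (rng.random(20_000) < base_rate).astype(int)
    # identical, imperfect scoring rule applied to both groups
    y_score = np.clip(0.7 * y_true + rng.normal(0.15, 0.2, size=20_000), 0, 1)
    print(name, {k: round(float(v), 3) for k, v in condition_checks(y_true, y_score).items()})
```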
  11. Data errors: ● Selection bias ● Random sampling ● Overcoverage ● Undercoverage ● Measurement (response) error ● Processing errors ● Participation bias
  12. In addition to the mathematics of creating AI, bias creeps in earlier in the chain. Unless you are lucky enough to get a full view of your data pipeline, you may not have a good understanding of how you’ve ended up with the data in front of you. If you’ve done statistical sampling theory then you’ll be aware of this, but here’s a taster. There are seven key data sampling errors that you should know and be able to ask about before building any model. The data available to any company is by nature limited to a subset of all possible data. The graph shows the mathematical spread in the measured accuracy of a system predicting a 50% average score, as a function of sample size. Small data sets can cause large variations. Extrapolate to your own data – where are the holes?
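If you want to recreate the shape of that graph yourself, here is a minimal simulation (mine, with assumed numbers): repeatedly sample a population whose true positive rate is 50% and watch how much the measured rate swings at small sample sizes.

```python
# Spread of a measured 50% rate at different sample sizes (toy simulation).
import numpy as np

rng = np.random.default_rng(1)
true_rate = 0.5

for n in (10, 50, 100, 500, 1_000, 10_000):
    # 1,000 repeated studies, each drawing a sample of size n
    estimates = rng.binomial(n, true_rate, size=1_000) / n
    print(f"sample size {n:>6}: measured rate ranges "
          f"{estimates.min():.2f} to {estimates.max():.2f} (std {estimates.std():.3f})")
```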
  13. Example: is Oxford racially biased in its admissions? A couple of years ago, admissions data from Oxford University showed that a much lower proportion of BAME students were offered places than their proportion in the general population. While I’m not discounting that there was racial bias occurring, let’s look at some of the data biases involved: - Students at private schools are more likely to apply than state school students with the same grades (selection) - Students at private schools are mostly white (undercoverage) - Students from state schools are more likely to apply to popular / oversubscribed courses due to curriculum restrictions (participation) All of these affect the perceived outcome and can exacerbate or mask a true result. Know the provenance of your data and where the sampling impacts your results.
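To make the participation effect above concrete, here is a toy calculation with entirely invented numbers: every applicant faces exactly the same per-course offer rate, but one group applies disproportionately to the oversubscribed course, so its aggregate offer rate looks far worse.

```python
# Invented numbers: identical per-course offer rates, different application
# patterns, very different aggregate outcomes.
offer_rate = {"oversubscribed course": 0.10, "less popular course": 0.40}

applications = {
    "group A": {"oversubscribed course": 200, "less popular course": 800},
    "group B": {"oversubscribed course": 800, "less popular course": 200},
}

for group, apps in applications.items():
    offers = sum(n * offer_rate[course] for course, n in apps.items())
    total = sum(apps.values())
    print(f"{group}: overall offer rate {offers / total:.0%}")

# group A: 34%, group B: 16% - a large gap with no per-course difference at all.
```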
  14. Everyone is biased. You are no exception.
  15. This is really important. Accept that everyone is biased in some way. We are biased by our experiences (positive and negative) and we are biased by the comments from the networks we trust. Every day our biases are reinforced. Our data sets are affected by our biases. Our test sets are affected by our biases. We need to get into a different mindset. The image on the next slide is from: https://www.designhacks.co/products/cognitive-bias-codex-poster Buy a copy and put it somewhere you can see it every day. I have!
  16. [The Cognitive Bias Codex poster, linked on the previous slide]
  17. These are your biases and why – please give the people who created this the traffic and buy the poster! This is how you are manipulated. This is how you justify bad behaviour. Apply this to your day to day life. Question yourself if you find yourself agreeing or disagreeing on “gut instinct”. Stop yourself if you make sweeping generalisations. Challenge yourself. This is why we are bad at gathering data and why we are bad at analysing it. We are primed to see patterns even when they are not there. I’ve had blazing rows with more than one C-level exec because they have seen something in the data that just isn’t there. Saying a model is wrong because it doesn’t fit expectations is just as bad as saying it is correct just because it does fit your own biases.
  18. Without understanding your biases you will create data sets that fit your own experience profile. You will discard data points that don’t fit without being conscious of it. If you get the results you expect you will not test them as thoroughly as if they disagreed with your expectations. You will twist your models for an experience that makes you comfortable at the expense of others. Be cognisant of your own biases. Diverse teams help here, but even with this, challenge yourselves. Which brings me to testing… Most AI practitioners validate their models but do not test them in the way that test engineers do…
  19. AI testing is not TESTING. What happens if your model gets bad data? Humans just love to prove superiority over tech. Learn how to break everything you create. https://www.sempf.net/post/On-Testing1
  20. Sure, you test your models against known data and you probably have a golden test set and do final validation against that. You may even have a pipeline for constant sampling and retesting as live data goes through your system. The problem is that fails are accepted as part of the overall statistics: “it’s only 1%”, “the system wasn’t designed for that”, “that failed because of [thing you’re not going to change]”. The issue is that most people are reluctant to really and thoroughly test their systems. If you’ve come from a software engineering background then you should be familiar with these concepts, but optimisation and testing are the two biggest omissions in every AI course I’ve seen. Learn to break your models…
  21. All models fail in some circumstances – find those situations, and go out of your way to understand your models so thoroughly that you are never surprised. Test them with the broadest range of data you can. Do your own adversarial attacks. I regularly test my team’s models with pictures of my cats and static… Read Bill Sempf’s blog post for a great example of how to test a simple input box that expects a number. Extrapolate this to your systems (*whole other talk in this!) This doesn’t mean that systems have to be perfect to be released. We live in the real world where you will be pushed (probably by people like me) to get solutions out. Push back, and be clear on limitations so that decisions can be made about the risks. Add in safeguards for when your model is wrong. Even with all this testing, the thing that should be at the front of your mind is not accuracy but impact.
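As one sketch of what “break your models” can look like in practice (my own illustration; predict_probabilities is a hypothetical stand-in for whatever model you are testing), here is a test that feeds static and blank frames to an image classifier and fails if the model is ever highly confident about them.

```python
# "Break it" test sketch: the model should never be confident about garbage.
import numpy as np

def predict_probabilities(image: np.ndarray) -> np.ndarray:
    """Hypothetical wrapper: wire this to your own model; returns class probabilities."""
    raise NotImplementedError

def test_model_is_humble_about_garbage(max_confidence: float = 0.7) -> None:
    rng = np.random.default_rng(42)
    garbage = [rng.integers(0, 256, (224, 224, 3), dtype=np.uint8) for _ in range(20)]   # static
    garbage += [np.zeros((224, 224, 3), np.uint8), np.full((224, 224, 3), 255, np.uint8)]  # black / white frames
    for image in garbage:
        probs = predict_probabilities(image)
        assert probs.max() < max_confidence, (
            f"{probs.max():.0%} confident about noise - add a reject option or a confidence safeguard"
        )
```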
  22. Impact > accuracy
  23. All businesses should care about the impact of the AI they create. Rather than talking about accuracy, recall and precision, let’s shift to impact. What is the impact of mislabelling a car? Getting someone’s gender incorrect? Refusing a loan? Incarcerating an innocent person? Missing a diagnosis of a terminal illness? The answer may be different for different individuals – what might be a non-issue for one person could be life-changing for another. Stop thinking from your own position of privilege and your own biases and take a broader view. How is the information used – will there be a human in the loop? Put yourself in the position of the most vulnerable and marginalised users of your system and ask what the impact of a false positive or false negative is on them. Don’t brush off those results if they don’t fit with your experience.
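One way to operationalise “impact > accuracy” (a sketch with assumed harm weights; not something prescribed in the talk) is to weight each error type by the harm it causes the person affected and report it per group, rather than averaging it away into a single accuracy figure.

```python
# Impact-weighted error report, broken down per group. The harm weights are
# assumptions for illustration; in practice they should come from the people
# affected, not from the data science team.
import numpy as np

HARM = {"false_positive": 1.0, "false_negative": 10.0}

def impact_report(y_true, y_pred, group, harm=HARM):
    report = {}
    for g in np.unique(group):
        mask = group == g
        fp = np.sum((y_pred == 1) & (y_true == 0) & mask)
        fn = np.sum((y_pred == 0) & (y_true == 1) & mask)
        report[g] = (fp * harm["false_positive"] + fn * harm["false_negative"]) / mask.sum()
    return report  # average harm per person in each group, not one global accuracy
```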
  24. Summary: You are biased. Challenge your biases. Understand data provenance. Break everything you create. Create AI mindful of the impact on the individual.
