2. • Adversarial attacks
• Tricking Models
• Reverse engineering of models (leaking information)
• Poisoned data
• Bias in data
• Building the right model with the wrong conclusions
• Ethics
• Privacy
• Using AI for good
• Deepfakes
• GPT-2
• Security in general
10. The square peg bias
This is where you just choose the wrong data set because it's what you have.
For example: you want to model sportswear purchases for your online clothing store, but you only have data on what
people have been buying at brick-and-mortar shops.
11. Sampling bias
You choose your data to represent an environment.
Generally, you choose a subset of data that is representative and sufficiently large, but you have to watch out for
human biases in picking that data; it can be as innocent as forgetting to include nighttime images in a training set
for facial recognition. A quick audit of the training distribution can surface gaps like this (see the sketch after the links below).
http://beauty.ai
https://www.techrepublic.com/article/top-10-ai-failures-of-2016/
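One way to catch this kind of gap is to audit how each capture condition is represented before training. A minimal sketch (my own addition; the `time_of_day` column and the 25% floor are hypothetical):

```python
import pandas as pd

# Hypothetical metadata for a face-recognition training set;
# in practice this would come from your dataset's manifest.
meta = pd.DataFrame({
    "image_id": range(6),
    "time_of_day": ["day", "day", "day", "day", "day", "night"],
})

# Audit the sampling: what share of the data does each condition get?
counts = meta["time_of_day"].value_counts(normalize=True)
print(counts)  # e.g. day ~0.83, night ~0.17

# Flag any condition that falls below an (arbitrary) minimum share.
FLOOR = 0.25  # hypothetical threshold
for condition, share in counts.items():
    if share < FLOOR:
        print(f"WARNING: '{condition}' is only {share:.0%} of the data")
```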
12. Bias-variance trade-off
You may introduce bias by overcorrecting for variance: if your model is too sensitive to variance, small fluctuations
cause it to model random noise; add too much bias to correct this, and the model misses real complexity.
Figure: nearest-neighbor prediction regions. Lighter colors indicate less certainty about predictions; the value of k
has a direct impact on the bias-variance trade-off.
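To make the k-NN picture concrete, here is a minimal sketch using scikit-learn (my own illustration; the slides don't prescribe a library or dataset). A tiny k gives a low-bias, high-variance model that chases noise; a very large k gives a smooth, high-bias model that misses structure:

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Noisy two-class data, so variance effects are visible.
X, y = make_moons(n_samples=400, noise=0.3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, random_state=0)

for k in (1, 15, 101):
    model = KNeighborsClassifier(n_neighbors=k).fit(X_train, y_train)
    print(f"k={k:3d}  train={model.score(X_train, y_train):.2f}  "
          f"test={model.score(X_test, y_test):.2f}")

# Typical pattern: k=1 scores ~1.0 on train but worse on test
# (variance / overfitting); a very large k underfits (bias);
# an intermediate k balances the two.
```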
15. Measurement bias
This is when the device you use to collect the data has bias built in, like a scale that consistently overestimates
weight; the data still looks statistically sound, so no statistical correction would catch it.
Having multiple measuring devices can help guard against this.
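A minimal sketch of the multiple-devices idea (my own illustration; the 2 kg offset and the readings are simulated): each device's data looks plausible in isolation, but a paired comparison exposes the systematic error:

```python
import numpy as np

rng = np.random.default_rng(0)
true_weights = rng.uniform(50, 100, size=200)  # unknown ground truth

# Device A is well calibrated; device B overestimates by 2 kg.
device_a = true_weights + rng.normal(0, 0.5, size=200)
device_b = true_weights + 2.0 + rng.normal(0, 0.5, size=200)

# Within device B alone, the data looks perfectly plausible,
# but measuring the same objects with both devices reveals the bias.
offset = np.mean(device_b - device_a)
print(f"Mean disagreement between devices: {offset:.2f} kg")
# ~2.00 kg: a systematic error no single-device statistic would flag.
```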
16. Stereotype bias
You're training a machine learning algorithm to recognize people at work, so you give it lots of images of male
doctors and female teachers.
This might even be mathematically sound, since the stereotype is social and may exist in the data without you
getting involved. But if you want a stronger ML model, you'll need to correct for that social stereotype.
https://arxiv.org/pdf/1607.06520.pdf
man − woman ≈ computer programmer − homemaker.
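The cited paper derives this relation from word2vec embeddings. As a hedged sketch (the gensim model name and the underscored token are my assumptions; the pretrained vectors are a large first-time download), the analogy can be queried as vector arithmetic:

```python
import gensim.downloader

# Pretrained word2vec vectors (~1.6 GB download on first use).
vectors = gensim.downloader.load("word2vec-google-news-300")

# The analogy as vector arithmetic:
# man - woman ≈ computer_programmer - x  =>  solve for x.
result = vectors.most_similar(
    positive=["computer_programmer", "woman"],
    negative=["man"],
    topn=3,
)
print(result)  # the paper reports "homemaker" among the top answers
```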
17. • Have objective acceptance criteria: know the amount of error you and your users are willing to accept (see the sketch below).
• Test with new data.
• Don’t count on all results being accurate.
• Understand the architecture of the network as part of the testing process.
• Communicate the level of confidence you have in the results to management and users.
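A minimal sketch of an objective acceptance criterion (my own illustration; the dataset, model, and 90% threshold are placeholders, not from the slides): encode the agreed error tolerance as a test that runs against new, held-out data:

```python
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Agreed with users up front: at least 90% accuracy on held-out data.
MIN_ACCURACY = 0.90  # hypothetical acceptance criterion

def test_model_meets_acceptance_criteria():
    X, y = load_digits(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, random_state=0)
    model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    accuracy = model.score(X_test, y_test)  # data never trained on
    assert accuracy >= MIN_ACCURACY, (
        f"Accuracy {accuracy:.2%} is below the agreed {MIN_ACCURACY:.0%}")
```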
18. • Model performance
• Metamorphic testing (see the sketch after this list)
• C. Murphy, G. E. Kaiser, L. Hu, and L. Wu, “Properties of Machine Learning Applications for Use in Metamorphic Testing,” in SEKE, 2008, vol. 8, pp. 867–872
• Driverless vehicle example:
https://www.researchgate.net/publication/331289445_Metamorphic_testing_of_driverless_cars
• Dual coding
• Coverage guided fuzzing
• TensorFuzz: Debugging Neural Networks with Coverage-Guided Fuzzing
• Comparison with simplified, linear models
• Testing with different data slices
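A minimal sketch of one metamorphic relation in this family (my own illustration, following the permutation property discussed by Murphy et al.): reordering the training examples must not change a k-NN classifier's predictions, which gives a pass/fail test even without ground-truth "oracle" outputs:

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.neighbors import KNeighborsClassifier

# Metamorphic relation: training-set order must not affect predictions.
X, y = make_moons(n_samples=300, noise=0.2, random_state=0)
rng = np.random.default_rng(1)
perm = rng.permutation(len(X))

original = KNeighborsClassifier(n_neighbors=5).fit(X, y)
shuffled = KNeighborsClassifier(n_neighbors=5).fit(X[perm], y[perm])

# No oracle needed: the two models must agree on every probe input.
probe, _ = make_moons(n_samples=100, noise=0.2, random_state=2)
assert (original.predict(probe) == shuffled.predict(probe)).all(), \
    "Metamorphic relation violated: predictions depend on data order"
print("Passed: predictions are invariant to training-data order")
```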
19. Alan Turing
“If a machine is expected to be infallible, it cannot also be intelligent.”
21. Dr. Vivienne Ming (@SuperNova)
“Good technology shouldn’t substitute what we do, it should make us better at doing it.”
Example: “Sexy Face”
22. Zeynep Tufekci (@AMLD2019)
“#ML and #AI surfacing and exploiting #biases that we are not even aware of, because we had no way to identify them at scale.”
Asking them open-ended questions…