Machine Learning in Practice, presented at CTO Summit Chicago 2019. A set of lessons and high-level notes on what we've learnt over the years at Rainforest QA.
11. Tester Quality
● Aim - make sure testers are doing the right thing.
● Journey:
  ◦ Pure code rules: anti-fraud
  ◦ Rule-based system: anti-tiredness / laziness
  ◦ Classifiers for various actions: clicks, scroll, drag, typing, etc.
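To make the "pure code rules" stage of the journey concrete, here is a minimal sketch of what a hand-written check on a tester session might look like. Everything in it is hypothetical and not taken from the talk: the `Session` fields, the `flag_session` helper, the rule names and the 5-second threshold are illustrative assumptions only.

```python
from dataclasses import dataclass


@dataclass
class Session:
    # Hypothetical summary of one tester's work on one test step.
    tester_id: str
    duration_s: float
    clicks: int
    keystrokes: int


def flag_session(session, min_duration_s=5.0):
    """Return the names of the hand-written rules this session trips.

    The rules and the threshold are illustrative assumptions.
    """
    reasons = []
    # Finishing a step implausibly fast suggests the work was not done.
    if session.duration_s < min_duration_s:
        reasons.append("too_fast")
    # No recorded clicks or typing suggests the tester did not interact.
    if session.clicks == 0 and session.keystrokes == 0:
        reasons.append("no_interaction")
    return reasons


print(flag_session(Session("t1", duration_s=2.0, clicks=0, keystrokes=0)))
print(flag_session(Session("t2", duration_s=30.0, clicks=4, keystrokes=12)))
```

Rules like these are cheap to write and easy to explain, which is presumably why they come first in the journey; per-action classifiers only become worthwhile once fixed thresholds stop being enough.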
22. Team matters:
6 x Data scientist
2 x Production Engineer
1 x Data Engineer
23. High-level SDLC for AI
1. Figure out problem
2. Get data
3. Get / make training data
4. Build prototype model
5. Train model (can take non-trivial time)
6. Test using training data
7. Go to #1 (bad data) or #4 (hope-of-good-data + “bad model”)
8. Deploy
9. Test in production
10. Iterate (aka, go to #1 or #4)
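The loop above can be sketched end to end with a toy problem. This is an illustration under stated assumptions, not the process used at Rainforest QA: the data is synthetic, the "model" is a single threshold, the 0.9 accuracy target is arbitrary, and the test step here holds back part of the labelled data, which is a choice made for this sketch. All function names are hypothetical.

```python
import random


def make_data(n, seed=0):
    # Steps 2-3: get data and labels. Toy assumption: label is 1 when x > 0.5.
    rng = random.Random(seed)
    xs = [rng.random() for _ in range(n)]
    return [(x, int(x > 0.5)) for x in xs]


def train(rows):
    # Steps 4-5: the "model" is one threshold, the midpoint of the class means.
    pos = [x for x, y in rows if y == 1]
    neg = [x for x, y in rows if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def accuracy(threshold, rows):
    # Step 6: fraction of rows where the prediction matches the label.
    return sum(int(x > threshold) == y for x, y in rows) / len(rows)


def next_step(acc, target=0.9):
    # Step 7: deploy, or loop back to the data (#1) or the model (#4).
    return "deploy" if acc >= target else "go back to #1 or #4"


data = make_data(200)
train_rows, test_rows = data[:150], data[150:]  # hold back 50 rows for testing
threshold = train(train_rows)
acc = accuracy(threshold, test_rows)
print(f"threshold={threshold:.3f} accuracy={acc:.3f} -> {next_step(acc)}")
```

In a real project each function hides most of the work: getting and labelling data usually dominates, and training can take hours rather than microseconds.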
29. mturk
• Super early AWS service; public since 2005, invented before 2001
• 24 x 7, on-demand, programmatic interface to do Human Intelligence Tasks
• “Automate” the un-automatable
30. mturk
Pay (lots of) humans to do (lots of) things. Classic things:
• Extract data from receipts
• Identify things in photos
• Search for data for you (e.g. find the phone number of XYZ restaurant)
• Transcribe audio
• Data science - ground truths for ML / AI
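For the ground-truth use case, one common pattern (an assumption here, not something the slides describe) is to send the same item to several workers and keep a label only when enough of them agree. The sketch below shows that aggregation step alone; the `majority_label` helper and the 0.6 agreement threshold are hypothetical, and it does not call the mturk API.

```python
from collections import Counter


def majority_label(answers, min_agreement=0.6):
    """Return the most common answer, or None if agreement is too low.

    `answers` holds the labels several workers gave for one item.
    The 0.6 threshold is an illustrative assumption.
    """
    if not answers:
        return None
    label, count = Counter(answers).most_common(1)[0]
    return label if count / len(answers) >= min_agreement else None


print(majority_label(["cat", "cat", "dog"]))  # two of three workers agree
print(majority_label(["cat", "dog"]))         # split vote, no label kept
```

Items that come back as `None` can be sent out again to more workers or reviewed by hand, rather than entering the training set with a doubtful label.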
31. mturk
◦ It’s kinda hard to use right
◦ Single-Purpose APIs make this easier
46. “We study, for the first time, automated inference on criminality
based solely on still face images, which is free of any biases of
subjective judgments of human observers.”
Xiaolin Wu, Xi Zhang, “Automated Inference on Criminality using Face Images”
50. Perception of AI
● Logical and calculating
● Almost never wrong, reliable
● Mistakes are absurd when they do happen
● Trustworthy
● Objective and free of any biases
52. “Anything we build using data is going to reflect the biases and
decisions we make when collecting that data.”
Fred Benenson, ex-VP Data @ Kickstarter
58. Summary
● Be pragmatic; use the easiest thing
● Data quantity and quality really matter
● Bias and ethics are a thing
● Testing is hard
● It’s mostly still 1999, ops-wise