
When Recommendation Systems Go Bad (RecSys 2016, 9/17/16)


Evan Estola, Meetup
RecSys 2016 Industry Track



  1. When Recommendation Systems Go Bad
     Evan Estola, RecSys 2016, 9/17/16
  2. About Me: Evan Estola, Lead Machine Learning Engineer @ Meetup (@estola)
  3. We want a world full of real, local community.
     (Women's Veterans Meetup, San Antonio, TX)
  4. Why Recs at Meetup are Hard: cold start, sparsity, lies, Schenectady
  5. Data Science impacts lives: the ads you see; the apps you download;
     your friends' activity in your Facebook feed; the news you're exposed
     to; whether a product is available; whether you can get a ride; the
     price you pay for things; admittance into college; the job openings
     you find and get; whether you can get a loan.
  6. Recommendation Systems: Collaborative Filtering
     Completely normal book recommendations for Asimov's Foundation:
     Foundation and Empire; Second Foundation; Prelude to Foundation;
     Forward the Foundation; Foundation's Edge; Foundation and Earth
  7. Completely Normal Search Engine Results
     Query: "Obama birth place": 1. Honolulu, HI; 2. Wikipedia: Obama birth
     place conspiracy theories; 3. Birth certificate
     Query: "Obama birth certificate fake": 1. 10 Facts that show Obama
     Birth Certificate is FAKE; 2. OBAMA'S LAWYERS ADMIT TO FAKING BIRTH
     CERTIFICATE; 3. Video: Proof Obama Birth Certificate is Fake
  8. You just wanted a kitchen scale; now the internet thinks you're a drug
     dealer. You purchased: mini digital pocket kitchen scale! You probably
     want: 100-pack subtle resealable baggies; 250 perfectly legal
     'cigarette' paper booklets; a totally reasonable number of small
     plastic bags; 1000 'cigar' wraps. Completely normal product results.
  9. Orbitz
  10. Ego: put the member/customer/user first. Focus on building the best
      product, not on being the most clever data scientist. It's much
      harder to spin a positive user story than a story about how smart
      you are.
  11. "Google searches involving black-sounding names are more likely to
      serve up ads suggestive of a criminal record."
      "Black-sounding" names were 25% more likely to be served an ad
      suggesting a criminal record ("NAME arrested?"). The ads suggest the
      queried name is associated with an arrest and warrants a background
      check, and advertise services related to recovering from arrest or
      incarceration.
  12. Ethics: we have accepted that machine learning can seem creepy, but
      how do we prevent it from becoming immoral? We have an ethical
      obligation not to teach machines to be prejudiced.
  13. Data Ethics Awareness: Talk about it! Identify groups that could be
      negatively impacted by your work. Make a choice. Take a stand.
  14. Interpretable Models: for simple problems, simple solutions are often
      worth a small concession in performance. Inspectable models make it
      easier to debug problems in data collection, feature engineering, etc.
      Only include features that work the way you want, and don't include
      feature interactions that you don't want.
  15. Logistic Regression (feature weights):
      StraightDistanceFeature(-0.0311f), ChapterZipScore(0.0250f),
      RsvpCountFeature(0.0207f), AgeUnmatchFeature(-1.5876f),
      GenderUnmatchFeature(-3.0459f), StateMatchFeature(0.4931f),
      CountryMatchFeature(0.5735f), FacebookFriendsFeature(1.9617f),
      SecondDegreeFacebookFriendsFeature(0.1594f),
      ApproxAgeUnmatchFeature(-0.2986f), SensitiveUnmatchFeature(-0.1937f),
      KeywordTopicScoreFeatureNoSuppressed(4.2432f),
      TopicScoreBucketFeatureNoSuppressed(1.4469f, 0.257f, 10f),
      TopicScoreBucketFeatureSuppressed(0.2595f, 0.099f, 10f),
      ExtendedTopicsBucketFeatureNoSuppressed(1.6203f, 1.091f, 10f),
      ChapterRelatedTopicsBucketFeatureNoSuppressed(0.1702f, 0.252f, 0.641f),
      ChapterRelatedTopicsBucketFeatureNoSuppressed(0.4983f, 0.641f, 10f),
      DoneChapterTopicsFeatureNoSuppressed(3.3367f)
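Why a weighted linear model like the one on slide 15 counts as "inspectable": each feature's contribution to the score is just weight times value, so a surprising recommendation can be traced to the feature responsible. A minimal sketch using three of the slide's weights; the `member` dict and the `predict`/`explain` helpers are illustrative, not Meetup's code.

```python
import math

# Weights taken from a few of the features on slide 15.
WEIGHTS = {
    "GenderUnmatchFeature": -3.0459,   # group targets a gender you don't match
    "FacebookFriendsFeature": 1.9617,  # your friends are already in the group
    "StateMatchFeature": 0.4931,       # you live in the group's state
}

def predict(features):
    """Logistic regression: sigmoid of the weighted feature sum."""
    z = sum(WEIGHTS[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))

def explain(features):
    """Per-feature contributions to the score, largest magnitude first."""
    contribs = {name: WEIGHTS[name] * v for name, v in features.items()}
    return sorted(contribs.items(), key=lambda kv: -abs(kv[1]))

# Hypothetical member: gender mismatch, but friends in the group, same state.
member = {"GenderUnmatchFeature": 1.0,
          "FacebookFriendsFeature": 1.0,
          "StateMatchFeature": 1.0}
print(predict(member))   # below 0.5: the mismatch dominates
print(explain(member))   # GenderUnmatchFeature tops the explanation
```

With a model like this, a single `explain` call shows which feature drove a bad recommendation, which is exactly the debugging property slide 14 argues for.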
  16. Feature Engineering and Interactions
      ● Good feature:
        ○ Join! You're interested in Tech x Meetup is about Tech
      ● Good feature:
        ○ Don't join! Group is intended only for women x you are a man
      ● Bad feature:
        ○ Don't join! Group is mostly men x you are a woman
      ● Horrible feature:
        ○ Don't join! Meetup is about Tech x you are a woman
      Meetup is not interested in propagating gender stereotypes.
  17. Ensemble Models and Data Segregation
      Ensemble models combine the outputs of several classifiers for
      increased accuracy. If you have features that are useful but you're
      worried about their interactions (and your model builds interactions
      automatically), use ensemble modeling to restrict those features to
      separate models.
  18. Ensemble Model, Data Segregation (diagram)
      Model 1 data: *Interests, Searches, Friends, Location
      Model 2 data: *Gender, Friends, Location
      Model 1 Prediction + Model 2 Prediction -> Final Prediction
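The segregation idea in the diagram can be sketched in a few lines: two sub-models see disjoint sensitive features, and only their scalar outputs reach the combination step, so the final model structurally cannot learn a "Tech x woman" interaction. The function names, weights, and feature keys below are illustrative assumptions, not Meetup's implementation.

```python
def interest_model(feats):
    # Sees interests/searches/location; never sees gender.
    return 0.8 * feats["topic_score"] + 0.2 * feats["distance_score"]

def eligibility_model(feats):
    # Sees gender only to enforce explicit group rules
    # (e.g. a women-only group), nothing else.
    return 0.0 if feats["gender_excluded"] else 1.0

def final_prediction(feats):
    # The combination layer sees only the two sub-model outputs,
    # so no interest-x-gender interaction can be learned here.
    return interest_model(feats) * eligibility_model(feats)

member = {"topic_score": 0.9, "distance_score": 0.5,
          "gender_excluded": False}
print(final_prediction(member))  # positive: recommend
```

The design choice is that gender can still gate hard eligibility rules, but it is quarantined from every score that ranks groups by interest.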
  19. "Women less likely to be shown ads for high-paid jobs on Google,
      study shows"
      Carnegie Mellon's 'AdFisher' project created fake profiles and
      tracked the ads they were served. For an ad offering career coaching
      for "200k+" executive jobs: male group, 1,852 impressions; female
      group, 318.
  20. Diversity Controlled Testing: the same technique can work to find
      bias in your own models! Generate test data by randomizing the
      sensitive feature in a real data set, run the model, and evaluate
      for unacceptable biased treatment.
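The slide's recipe can be sketched as a small harness: copy the real rows, alter the sensitive feature, re-run the model, and measure how many predictions change when nothing but that feature did. For clarity this sketch flips the feature deterministically rather than randomizing it; the row format, model callables, and helper names are all hypothetical.

```python
def flip_sensitive(rows, key="gender"):
    """Copy the data set with the sensitive feature altered."""
    flipped = [dict(row) for row in rows]
    for row in flipped:
        row[key] = "F" if row[key] == "M" else "M"
    return flipped

def flipped_fraction(model, rows, key="gender"):
    """Fraction of rows whose prediction changes when only the
    sensitive feature changes; near zero means the model ignores it."""
    pairs = zip(rows, flip_sensitive(rows, key))
    changed = sum(model(a) != model(b) for a, b in pairs)
    return changed / len(rows)

rows = [{"gender": "M", "topic_score": 0.9},
        {"gender": "F", "topic_score": 0.9}]

# A model that ignores gender vs. one that keys on it.
fair = lambda r: 1 if r["topic_score"] > 0.5 else 0
biased = lambda r: 1 if r["gender"] == "M" and r["topic_score"] > 0.5 else 0

print(flipped_fraction(fair, rows))    # 0.0: no prediction depends on gender
print(flipped_fraction(biased, rows))  # 1.0: every prediction flips
```

Comparing predictions row-by-row, rather than comparing aggregate rates, catches models whose gender bias cancels out across a balanced data set.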
  21. What about automating this? The FairTest algorithm (Florian Tramèr).
      It still needs you to decide which features are bad. Humanity
      required.
  22. "'Holy F**k': When Facial Recognition Algorithms Go Wrong"
      Google Photos' automatic image tagging tagged an African American
      couple as "gorillas."
  23. "Twitter taught Microsoft's AI chatbot to be a racist asshole in
      less than a day"
      ● Twitter bot
      ● "Garbage in, garbage out"
      ● Responsibility?
      "In the span of 15 hours Tay referred to feminism as a 'cult' and a
      'cancer,' as well as noting 'gender equality = feminism' and 'i love
      feminism now.' Tweeting 'Bruce Jenner' at the bot got similar mixed
      response, ranging from 'caitlyn jenner is a hero & is a stunning,
      beautiful woman!' to the transphobic 'caitlyn jenner isn't a real
      woman yet she won woman of the year?'"
  24. Defense, Diversity, Design: use diverse test data. Outliers can
      matter; the real world is messy; some people will mess with you;
      some people look and act different than you.
  25. "There's software used across the country to predict future
      criminals. And it's biased against blacks."
      An algorithm for predicting repeat offenders is used in deciding how
      harsh the sentence for a crime should be. It is a proprietary model
      with an undisclosed algorithm, features, etc., and it claims not to
      use race as a factor. Yet it was nearly twice as likely to falsely
      label black defendants as likely future criminals, and more likely
      to mislabel white defendants as low risk.
  26. You know racist computers are a bad idea. Don't let your company
      invent racist computers. @estola