
Evan Estola, Lead Machine Learning Engineer, Meetup, at MLconf NYC 2017

Evan is a Lead Machine Learning Engineer working on the Data Team at Meetup. Combining product design, machine learning research and software engineering, Evan builds systems that help Meetup’s members find the best thing in the world: real local community. Before Meetup, Evan worked on hotel recommendations at Orbitz Worldwide, and he began his career in the Information Retrieval Lab at the Illinois Institute of Technology.

Abstract Summary:

Machine Learning Heresy and the Church of Optimality:
As Machine Learning continues to grow in both usage and impact on people’s lives, there has been a growing concern around the ethics of using these systems. In application areas such as hiring selection, loan review, and even prison sentencing, ML is being used in ways that raise questions about the fairness of these algorithms. But what does it mean for an algorithm to be fair? An algorithm will consistently make the same decision when given the same data, leading some people to argue that building an optimal algorithm is inherently fair. Even in the case of using sensitive features like age, race and gender, if the data is predictive, aren’t we just modeling reality?

In this talk, I will argue that these questions do not let us off the hook with regard to the impact of the systems we build as Machine Learning engineers. I think it is important to question how ‘optimal’ a model can even be in the first place. Finally, I will discuss the kinds of organizational resistance engineers might run into, and how to respond when ethically questionable decisions are justified for the sake of being ‘optimal’.



  1. 1. Machine Learning Heresy and the Church of Optimality Evan Estola MLconf 3/24/17
  2. 2. About Me ● Evan Estola ● Staff Machine Learning Engineer, Data Team Lead @ Meetup ● @estola
  3. 3. Meetup ● Do more of what’s most important to you ● 270,000 Meetups, ~30 million members ● Recommendations ○ Cold Start ○ Sparsity ○ Lies
  4. 4. Data Science impacts lives ● Ads you see ● Friend’s Activity/Facebook feed ● News you’re exposed to ● If a product is available ● If you can get a ride ● Price you pay for things ● Admittance into college ● If you can get a loan ● Job openings you find ● Job openings you can get ● Punishment for crime
  5. 5. You just wanted a kitchen scale, now Amazon thinks you’re a drug dealer
  6. 6. ● “Black-sounding” names 25% more likely to be served ad suggesting criminal record
  7. 7. ● Fake profiles, track ads ● Career coaching for “200k+” Executive jobs Ad ● Male group: 1852 impressions ● Female group: 318
  8. 8. ● Twitter bot ● “Garbage in, garbage out” ● Responsibility? “In the span of 15 hours Tay referred to feminism as a "cult" and a "cancer," as well as noting "gender equality = feminism" and "i love feminism now." Tweeting "Bruce Jenner" at the bot got similar mixed response, ranging from "caitlyn jenner is a hero & is a stunning, beautiful woman!" to the transphobic "caitlyn jenner isn't a real woman yet she won woman of the year?"”
  9. 9. You know racist computers are a bad idea Don’t let your company invent racist computers @estola
  10. 10. Brief Math Aside ● Summary statistics are crap on multimodal distributions ● “there is no presently generally agreed summary statistic (or set of statistics) to quantify the parameters of a general bimodal distribution”
  11. 11. By restricting or removing certain features aren’t you sacrificing performance? Isn’t it actually adding bias if you decide which features to put in or not? If the data shows that there is a relationship between X and Y, isn’t that your ground truth? Isn’t that sub-optimal?
  12. 12. Bad Features ● Not all features are ok! ○ ‘Time travelling’ ■ Rating a movie => watched the movie ■ Went to a Meetup => joined the Meetup
  13. 13. Benign Features ● Not all features are useful! ○ Member-only features don’t affect ranking (in simple models) ○ Clicked an email => likely to join/RSVP/etc.
  14. 14. “It’s difficult to make predictions, especially about the future”
  15. 15. Misguided Models ● Offline performance != Online performance ● Predicting past behavior != Influencing behavior ● Clicks vs. buy behavior in ads
  16. 16. “Computers are useless, they can only give you answers”
  17. 17. Asking the right questions ● Need a human ○ Choosing features ○ Choosing the right target variable ○ Value-added ML
  18. 18. Asking the right questions ● Need a human ○ Auto-ethics ■ Tramer, FairTest ■ Defining un-ethical features ■ Who decides to look for fairness in the first place?
  19. 19.
  20. 20. Example ● Questionable real-world applications ○ Screen job applications ○ Screen college applications ○ Predict salary ○ Predict recidivism ● Features? ○ Race ○ Gender ○ Age
  21. 21. Correlating features ● Name -> Gender ● Name -> Age ● Grad Year -> Age ● Zip -> Socioeconomic Class ● Zip -> Race ● Likes -> Age, Gender, Race, Sexual Orientation... ● Credit score, SAT score, College prestigiousness...
  22. 22. At your job... Not everyone will have the same ethical values, but you don’t have to take ‘optimality’ as an argument against doing the right thing.
  23. 23. “All models are wrong, but some are useful” Your model is already biased, it will never be optimal. Don’t turn wisdom into heresy. @estola
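
The "Correlating features" slide (Zip -> Race, and so on) makes a point that is easy to state and easy to underestimate: dropping a sensitive column does not remove its influence if another feature stands in for it. The sketch below is one hypothetical illustration of that point, not anything from the talk. The records, the group labels "A" and "B", the zip codes, the 0.6 threshold, and the helper names `rate_by` and `approve` are all invented for the example.

```python
from collections import defaultdict

# Synthetic, made-up records: (zip_code, group, repaid_loan).
# "group" stands in for a sensitive attribute. None of this is real data.
# Within each zip code both groups repay at the same rate (0.8 and 0.5),
# but the groups are distributed very differently across the two zip codes.
RECORDS = (
    [("10001", "A", 1)] * 40 + [("10001", "A", 0)] * 10
    + [("10001", "B", 1)] * 8 + [("10001", "B", 0)] * 2
    + [("20002", "A", 1)] * 5 + [("20002", "A", 0)] * 5
    + [("20002", "B", 1)] * 20 + [("20002", "B", 0)] * 20
)


def rate_by(records, key_index, value_fn):
    """Average value_fn(record), grouped by the field at key_index."""
    totals = defaultdict(lambda: [0, 0])  # key -> [sum, count]
    for rec in records:
        bucket = totals[rec[key_index]]
        bucket[0] += value_fn(rec)
        bucket[1] += 1
    return {key: s / n for key, (s, n) in totals.items()}


# Historical repayment rate per zip code: the only "feature" the model sees.
zip_repay_rate = rate_by(RECORDS, 0, lambda rec: rec[2])


def approve(rec):
    """Toy model: approve if the applicant's zip code repays at >= 0.6.

    The group column is never read here.
    """
    return 1 if zip_repay_rate[rec[0]] >= 0.6 else 0


# Approval rate per group, even though the model never used the group.
approval_by_group = rate_by(RECORDS, 1, approve)

if __name__ == "__main__":
    print("repayment rate by zip:", zip_repay_rate)
    print("approval rate by group:", approval_by_group)
```

In this toy data the two groups behave identically inside each zip code, yet the zip-only rule approves group A about 83% of the time and group B 20% of the time. The model is "blind" to the group and still reproduces the split, which is why removing the obvious column is not the end of the fairness question.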