
Josh Wills, Head of Data Engineering, Slack at MLconf SF 2017


I Build The Black Box: Grappling with Product and Policy:

The rate of improvement in techniques for building machine learning models over the past two years has been astounding: generalized embedding models like StarSpace and scalable, portable classifiers like XGBoost mean that we can now compress months of work into days or even hours. Unfortunately, we have not seen similar improvements in our ability to solve the product and policy problems that so often go hand in hand with building and deploying models. If anything, our reliance on self-optimizing black-box techniques means that these problems are only getting harder, and as we bring machine learning to bear on more diverse domains, the stakes are only getting higher.

Bio: Josh Wills is the head of data engineering at Slack. Prior to Slack, he built and led data science teams at Cloudera and Google. He is the founder of the Apache Crunch project, co-authored an O’Reilly book on advanced analytics with Apache Spark, and wrote a popular tweet about data scientists.

Published in: Technology

  1. Josh Wills, June 17, 2017. I Build The Black Box: Grappling With Product and Policy
  2. About Me
     ● Cloudera’s (Former) Director of Data Science
     ● Slack’s (Former) Director of Data Engineering
     ● SLI Engineer @ Slack
  3. Agenda: Technology, Product, Policy
  4. My Last Decade Technology
  5. Historical Perspective
     ● The Product Problem Was a Given
       ○ Identify fraud/spam; predict clicks on ads
     ● Policy Concerns Were Limited
       ○ Either already well-defined or essentially nonexistent
     ● Everything Was About the Data and the Tech
       ○ Basic algorithmic choices followed by an endless cycle of feature engineering and experimentation
  6. My First Day Back
  7. Embeddings
  8. A Seismic Shift
  9. Keeping Up With The arXiv
  10. Deep Architectures as Custom Hardware
  11. AutoML as Cross-Platform Compiler
  12. The Incredible Shrinking Technology Problems Technology Product Policy
  13. The Data Product Problem
  14. One Approach
  15. Understanding Black Boxes
  16. Black Boxes and Black Hats
  17. Caring The Least At Scale
  18. The Logic of Collective Action
  19. Starting Small
  20. Scaling Up
  21. Scaling Way Up
  22. Thank You!