Nikhil Garg, Engineering Manager, Quora at MLconf SF 2016

Building a Machine Learning Platform at Quora: Each month, over 100 million people use Quora to share and grow their knowledge. Machine learning has played a critical role in enabling us to grow to this scale, with applications ranging from understanding content quality to identifying users’ interests and expertise. By investing in a reusable, extensible machine learning platform, our small team of ML engineers has been able to productionize dozens of different models and algorithms that power many features across Quora.

In this talk, I’ll discuss the core ideas behind our ML platform, as well as some of the specific systems, tools, and abstractions that have enabled us to scale our approach to machine learning.

Published in: Technology

  1. 1. Building a Machine Learning Platform at Quora Nikhil Garg @nikhilgarg28 @Quora @MLconf 11/11/16 The Quora Answer To “Build vs Buy” For ML Platforms
  2. 2. ● At Quora since 2012 ● Currently leading two ML engineering teams: ○ Content Quality ○ ML Platform A bit about me... @nikhilgarg28
  3. 3. To Grow And Share The World’s Knowledge
  4. 4. Over 100 million monthly uniques Millions of questions & answers In hundreds of thousands of topics Supported by 80 engineers
  5. 5. What Slows Down ML Innovation?
  6. 6. ● Pipeline jungles ● Lots of glue code to get data in/out of general purpose packages. ● Strong coupling between business logic, data, ML algorithms and configuration. Curse Of Complexity
  7. 7. ● Online vs offline ● Production vs experimentation ● C++ vs Python ● Engineering vs research ● ...even more glue code and pipeline jungles. Clash Of Titans
  8. 8. ● Hard to reuse existing features, data, algorithms, tooling etc. ● Too costly to even get off the ground. Getting New Applications Off The Ground http://www.qvidian.com/blog/resistance-to-change-sales-organizations
  9. 9. Many Faces Of Chaos
  10. 10. One ring to bring them all and in the darkness bind them!
  11. 11. Collection of systems to sustainably increase the business impact of ML at scale. Machine Learning Platform
  12. 12. ML Platform: Build or Buy?
  13. 13. The Quora Answer: Build For Seven Reasons
  14. 14. Reason # 7 Just Can’t Buy Everything!
  15. 15. ● No matter how powerful the platform is, we still need to maintain some form of integration. ● This thin integration layer then becomes the platform. ● Real questions -- ○ How much does this in-house layer delegate? ○ How much control does it have over delegation? Degree Of Integration & Delegation
  16. 16. Reason # 6 Fast Scalable Production Systems
  17. 17. End-To-End Online Production Systems ● External platforms at best can deploy “predictive models”, as services, not end-to-end online systems ● Gains come from optimizing the whole pipeline, not just algorithms. ● Latency: tens of milliseconds. Managing sharding, batching, data locality, caching, streaming, stragglers, graceful degradation... ● Real world systems -- boosts, diversity constraints, holes in data, skipping stages, hard filters… sounds familiar? Candidate Generation Feature Extraction Scoring Post Processing Data
  18. 18. Reason # 5 Blurry Line Between Experimentation & Production
  19. 19. ● We want the same code/systems/tools to work for both experimentation & production. ● But we need to carefully “control” the production code to keep it fast. ● So we need to “control” offline experimentation systems too. Candidate Generation Feature Extraction Scoring Post Processing Data Candidate Generation Feature Extraction Training
  20. 20. Reason # 4 Openly Using Open Source
  21. 21. ● Logistic Regression ● Elastic Nets ● Random Forests ● Gradient Boosted Decision Trees ● Matrix Factorization ● (Deep) Neural Networks ● LambdaMART ● Clustering ● Random walk based methods ● Word Embeddings ● LDA ● ... Production ML Algorithms At Quora Candidate Generation Feature Extraction Training/Scoring Post Processing Data
  22. 22. ● Open source is great -- lots of great technologies! ● Commercial ML platforms are also open sourcing stuff. ● Learning and cherry-picking favorite parts from ANY open source systems. ● May write our own algorithms too (e.g. QMF). ● Building our own platform = controlling the delegation, not lack of delegation.
  23. 23. Reason # 3 Commercial Platforms’ Offerings Are Not Super Valuable To Us
  24. 24. ● Main offerings of external platforms are: ○ Lower operational overhead of running machines. ○ Out-of-box distributed training. ● Operational overhead ○ Gets amortized over time. ○ Shared with non-ML infrastructure. ● Can often train most models on a single multi-core machine.
  25. 25. Reason # 2 Blurry Line Between ML & Product Dev
  26. 26. ● Answer ranking ● Feed ranking ● Search ranking ● User recommendations ● Topic recommendations ● Duplicate questions ● Email Digest ● Request Answers ● Trending now ● Topic expertise prediction ● Spam, abuse detection ● …. Blurry Line Between ML/Non-ML Product
  27. 27. Blurry Line Between ML/Non-ML Data [Figure: graph of entities (Users, Questions, Answers, Topics, Votes, Comments) linked by relationships (Follow, Ask, Write, Cast, Have, Contain, Get)] Billions of relationships and words
  28. 28. Blurry Line Between ML/Non-ML Codebase ● Integration with other utility libraries/services, e.g. A/B testing, debug tools, monitoring, alerting, data transfer, ... ● Empowering all product engineers to do ML.
  29. 29. Reason # 1 ML As Quora’s Core Competency
  30. 30. ● ML gives us a strategic competitive advantage. ● Want to control and develop deep expertise in the whole stack. ● Quora has a long term focus -- investment in platform more than pays off in the long term. ● Single most important reason to build ML Platform! ML: Critical For Our Strategic Focus Relevance Quality Demand
  31. 31. Summary
  32. 32. ● Anyone doing non-trivial ML needs an ML platform to sustain innovation at scale. ● Build vs buy decision is not all-or-nothing. ● Surface area and importance of ML are deciding factors in the build vs buy decision.
  33. 33. Nikhil Garg @nikhilgarg28 Thank You! YES, WE ARE HIRING :)
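
The four-stage pipeline named on slides 17 and 19 (Candidate Generation, Feature Extraction, Scoring, Post Processing) can be sketched roughly as below. This is only an illustration of the idea that the same candidate generation and feature extraction code serves both offline training and online ranking. It is not Quora's actual code: every class, function, and feature name here (`RankingPipeline`, `featurize`, `toy_extract`, `upvotes`, and so on) is a hypothetical stand-in.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Features = Dict[str, float]


@dataclass
class Candidate:
    item_id: str
    features: Features = field(default_factory=dict)
    score: float = 0.0


@dataclass
class RankingPipeline:
    """Hypothetical pipeline: each stage is a pluggable callable."""
    generate: Callable[[str], List[Candidate]]
    extract: Callable[[str, Candidate], Features]
    scorer: Callable[[Features], float]
    post_process: Callable[[List[Candidate]], List[Candidate]]

    def featurize(self, user_id: str) -> List[Candidate]:
        # Shared by offline training and online ranking, so both
        # paths see features computed by exactly the same code.
        candidates = self.generate(user_id)
        for candidate in candidates:
            candidate.features = self.extract(user_id, candidate)
        return candidates

    def rank(self, user_id: str) -> List[Candidate]:
        # Online path: featurize, score, sort, then post-process.
        candidates = self.featurize(user_id)
        for candidate in candidates:
            candidate.score = self.scorer(candidate.features)
        candidates.sort(key=lambda c: c.score, reverse=True)
        return self.post_process(candidates)


# Toy stages, invented purely for illustration.
def toy_generate(user_id: str) -> List[Candidate]:
    return [Candidate("a1"), Candidate("a2"), Candidate("a3")]


def toy_extract(user_id: str, candidate: Candidate) -> Features:
    upvotes = {"a1": 3.0, "a2": 10.0, "a3": 1.0}
    return {"upvotes": upvotes[candidate.item_id]}


def toy_scorer(features: Features) -> float:
    return 0.1 * features["upvotes"]


def toy_post_process(candidates: List[Candidate]) -> List[Candidate]:
    # Stand-in for hard filters, boosts, or diversity constraints.
    return candidates[:2]


pipeline = RankingPipeline(toy_generate, toy_extract, toy_scorer, toy_post_process)

# Online use: ranked item ids for one user.
ranked_ids = [c.item_id for c in pipeline.rank("u1")]

# Offline use: the same featurize() call produces training rows.
training_rows = [c.features for c in pipeline.featurize("u1")]
```

Keeping each stage behind a plain callable is one way to let the in-house layer decide how much to delegate: a scorer could wrap an open source model or a hand-written one without the rest of the pipeline changing.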
