
Accelerating development velocity of production ML systems with Docker


Accelerating development velocity of production ML systems with Docker by Kinnary Jangla

The rise of microservices has allowed ML systems to grow in complexity but has also introduced new challenges when things inevitably go wrong. Most companies provide isolated development environments for engineers to work within. While a necessity once a team reaches even a small size, this same organizational choice introduces potentially frustrating dependencies when those individual environments inevitably drift. Kinnary Jangla explains how Pinterest dockerized the services powering its home feed to accelerate development and decrease operational complexity and outlines the benefits Pinterest gained from this change that may be applicable to other microservice-based ML systems. This project was initially motivated by challenges arising from the difficulty of testing individual changes in a reproducible way. Without standardized environments, predeployment testing often yielded nonrepresentative results, causing downtime and confusion for those responsible for keeping the service up.

The Docker solution that was eventually deployed prepackages all dependencies found in each microservice, allowing developers to quickly set up large portions of the home feed stack and always test on the current team-wide configs. This architecture has enabled the team to debug latency issues, expand its testing suite to include connecting to simulated databases, and develop its Thrift APIs more quickly.
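
To make the idea concrete, a setup like the one described could be expressed as a Docker Compose file. The following is only a minimal sketch under stated assumptions, not Pinterest's actual configuration: the service names, build paths, ports, environment variables, and the use of Redis as a stand-in datastore are all hypothetical and do not come from the talk.

```yaml
# docker-compose.yml (illustrative only: every name, path, and port here is
# a hypothetical placeholder, not taken from the talk)
services:
  candidate-generator:
    # The Dockerfile in this directory would bundle the service with its dependencies.
    build: ./candidate-generator
    ports:
      - "9090:9090"

  ranking:
    build: ./ranking
    depends_on:
      - candidate-generator
      - simulated-db
    environment:
      # On the Compose network, services reach each other by service name.
      CANDIDATES_ADDR: candidate-generator:9090
      DB_ADDR: simulated-db:6379

  simulated-db:
    # Stand-in for a simulated database used in tests (assumed, not from the source).
    image: redis:7
```

With a file along these lines checked into a shared repository, each developer would start the same set of containers from the same definitions, which is one way to keep individual environments from drifting apart.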

Kinnary shares tips and tricks for dockerizing a large-scale legacy production service and discusses how an architectural change like this can change how an ML team works.

Published in: Technology


  1. May 24th, 2018. Kinnary Jangla. Accelerating development velocity of production ML systems using Docker
  2. Mission: Pinterest is the visual discovery engine.
  3. Pin: A visual bookmark saved from the internet by a user.
  4. Boards
  5. Homefeed: Most engaging surface; drives ~40-50% of Pinterest’s user engagement and ads revenue; first product powered by machine learning; core problem: predict the engagement likelihood between a user and a pin.
  6. As of Sep 14, 2017: 100b+ Pins; 75%+ Sign-ups; 50%+ of users are outside of the U.S.
  7. Machine learning @ Pinterest: Interest-based feed, Recommendations, Search, Ads, Visual Discovery
  8. Monolith to Microservices
  9. Monolith
  10. Microservices
  11. Agenda
  12. Smartfeed, Search, Pinnability, Picked-For-You
  13. Challenges with Microservices
  14. End-to-end Model Building: Training Data Generation, Model Training, Offline Evaluation, Online Experiment
  15. Model training: train models, evaluate models, A/B test, rollout, generate hypotheses
  16. How does Homefeed work? (Diagram labels: Followed Topics, Followed Users, Recommendations; three Candidate Generators; Candidates; Ranking Model; Policy Layer; Homefeed)
  17. Services During Model Serving
  18. Debugging
  19. Solutions?
  20. What is…
  21. (1) Bundles an application with its runtime environment; (2) Runs on its own network interface; (3) Similar to a virtual machine
  22. Dockerfile
  23. Connecting services
  24. Docker Compose
  25. Dockerfile, Docker-compose file
  26. Demo. "Anything that can go wrong will go wrong." Edward Murphy
  27. Docker commands: Inspect; Attach; Exec; Pause, Stop, Remove; Top
  28. Software Engineering Daily Podcast; Docker for microservices, Apress
  29. Q&A
  30. © Copyright, All Rights Reserved, Pinterest Inc. 2018