
Deploying ML models to production (frequently and safely) - PYCON 2018

Applying continuous delivery principles to the deployment of machine learning models.



  1. How to deploy machine learning models to production (frequently and safely)
  2. Hello PyCon! David Tan (@davified), Developer @ ThoughtWorks
  3. About us: @thoughtworks, https://www.thoughtworks.com/intelligent-empowerment
  4. Part 1: First, a story about all of us...
  5. (image-only slide)
  6. Temperature check: who has... ● trained an ML model before? ● deployed an ML model for fun? ● deployed an ML model at work? ● an automated deployment pipeline for ML models?
  7. The million-dollar question: how can we reliably and repeatably take our models from our laptops to production?
  8. What today's talk is about: principles and practices that make it easier for teams to iteratively deploy better ML products; what to strive towards, and how to strive towards it
  9. Standing on the shoulders of giants: @jezhumble, @davefarley77, @mat_kelcey, @codingnirvana, @kief
  10. The stack for today's demo
  11. Demo
  12. Part 2: Why deploy frequently and safely?
  13. Why deploy? Until the model is in production, it creates value for no one except ourselves.
  14. Why deploy frequently? ● To iteratively improve our model (training with new {data, hyperparameters, features}) ● To correct any biases ● To counter model decay ● If it's hard, do it more often
  15. Why deploy safely? One of these things is not like the others.
  16. Why deploy safely? ● ML models affect decisions that impact lives… in real time ● Hippocratic oath for us: do no harm ● Safety enables us to iteratively improve ML products so they better serve people
  17. Machine learning is only one part of the problem/solution: finding the right business problem to solve → collecting data / data engineering → training ML models → deploying and monitoring ML models (the focus of this talk). Source: Hidden Technical Debt in Machine Learning Systems (Google, 2015)
  18. Goal of today's talk: move from the notebook/playground :-( to PROD :-) with continuous delivery: experiment/develop → commit and push → test → deploy → monitor.
  19. Part 4: So, how do we get there? Challenges (and solutions from continuous delivery practices)
  20. Our story's main characters: Mario the data scientist and Luigi the engineer, working from local to PROD.
  21. Key concept: the CI/CD pipeline. Local env → push to version control → (trigger) run unit tests → train and evaluate model (reading from the data/feature repository, writing to the model repository) → deploy candidate model to STAGING → (manual trigger) deploy model to PROD, with feedback flowing back at every stage. Source: Continuous Delivery (Jez Humble, Dave Farley)
  22. #1: Automated configuration management. Challenge ● Snowflake (dev) environments ● "Works on my machine!" Solution ● Single-command setup ● Version-control all dependencies and configuration Benefits ● Enables experimentation by all teammates ● A production-like environment means potential deployment issues are discovered early (see the sketch below)
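     A minimal sketch of the "fail fast on environment drift" idea behind single-command setup; the file name and the use of requirements.txt are assumptions for illustration, not from the talk:

         # check_env.py: verify the active environment matches the
         # version-controlled dependency pins before doing any work.
         import pkg_resources

         with open("requirements.txt") as f:
             pinned = [line.strip() for line in f
                       if line.strip() and not line.startswith("#")]

         # Raises DistributionNotFound / VersionConflict on any drift
         # from the pinned versions.
         pkg_resources.require(pinned)
         print("Environment matches requirements.txt")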
  23. #1: Automated configuration management (Demo)
  24. #2: Test pyramid. Challenge ● How can I ensure my changes haven't broken anything? ● How can I enforce the "goodness" of our models? Solution ● A testing strategy: automated unit tests, narrow/broad integration tests and ML metrics tests at the base, manual tests at the tip ● Test every method Benefits ● Fast feedback ● A safety harness that lets the team boldly try new things and refactor (see the sketch below)
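     A toy "ML metrics test" sketch from the middle of the pyramid: train on a fixed split and assert the model clears a quality floor before anything ships. The dataset, model and 0.85 floor are placeholders for illustration:

         # test_model_metrics.py: run with pytest alongside the unit tests.
         from sklearn.datasets import load_iris
         from sklearn.linear_model import LogisticRegression
         from sklearn.metrics import accuracy_score
         from sklearn.model_selection import train_test_split

         def test_model_clears_accuracy_floor():
             # A fixed random_state keeps the check deterministic in CI.
             X, y = load_iris(return_X_y=True)
             X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=42)
             model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
             assert accuracy_score(y_val, model.predict(X_val)) >= 0.85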
  25. #2: Test pyramid (Demo)
  26. #3: Continuous integration (CI) pipeline for automated testing. Challenge ● Not everyone runs the tests, and "goodness" checks are done manually ● We could deploy {bugs, errors, bad models} to production Solution ● A CI/CD pipeline automates unit tests → train → test → deploy (to staging) ● Every code change is tested (assuming tests exist) ● Source code is the only source of software/models Benefits ● Fast feedback (pipeline so far: local dev → VCS → unit tests → train & test; see the sketch below)
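     One way a CI server could drive those stages is a single entry-point script. This is a hedged sketch; the train.py and evaluate.py script names are hypothetical:

         # ci.py: run on every push; any failing step fails the pipeline.
         import subprocess
         import sys

         STEPS = [
             ["python", "-m", "pytest", "tests/"],  # unit tests
             ["python", "train.py"],                # train the candidate model
             ["python", "evaluate.py"],             # ML metrics tests on the candidate
         ]

         for step in STEPS:
             if subprocess.run(step).returncode != 0:
                 sys.exit(1)  # nothing downstream gets deployed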
  27. #3: CI pipeline (Demo)
  28. #4: Artifact versioning. Challenge ● How can we revert to previous models? ● Retraining is time-consuming ● Manual renaming/redeployments of old models (if we still have them) Solution ● Build your binaries once ● Tag each artifact with metadata (training data, hyperparameters, datetime) Benefits ● Save on build times ● Confidence in an artifact increases as it moves down the pipeline ● Metadata enables reproducibility (pipeline so far: local dev → VCS → unit tests → train & test → version artifact; see the sketch below)
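     A sketch of what tagging an artifact with reproducibility metadata might look like; the file layout and field names are illustrative assumptions:

         # version_artifact.py: write a metadata file next to each trained model.
         import hashlib
         import json
         from datetime import datetime, timezone

         def version_artifact(model_path, training_data_path, hyperparameters):
             with open(training_data_path, "rb") as f:
                 data_hash = hashlib.sha256(f.read()).hexdigest()
             metadata = {
                 "model_path": model_path,
                 "training_data_sha256": data_hash,  # pins the exact training data
                 "hyperparameters": hyperparameters,
                 "trained_at": datetime.now(timezone.utc).isoformat(),
             }
             with open(model_path + ".metadata.json", "w") as f:
                 json.dump(metadata, f, indent=2)
             return metadata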
  29. #5: Continuous delivery (CD) pipeline for automated deployment. Challenge ● Deployments are scary ● Manual deployments invite mistakes Solution ● Automated deployments triggered by the pipeline ● Single-command deployment to staging/production ● Eliminate manual deployments Benefits ● More rehearsal means more confidence ● Disaster recovery: single-command redeployment of the last good model in production (pipeline so far: local dev → VCS → unit tests → train & test → version artifact → deploy-staging)
  30. #5: CD pipeline for automated deployment (Demo)
     # Deploy model (the actual model)
     gcloud beta ml-engine versions create $VERSION_NAME --model $MODEL_NAME --origin $DEPLOYMENT_SOURCE --runtime-version=1.5 --framework $FRAMEWORK --python-version=3.5
  31. #5: CD pipeline for automated deployment (Demo)
     # Deploy to prod
     gcloud ml-engine versions set-default $version_to_deploy_to_prod --model=$MODEL_NAME
  32. #6: Canary releases + monitoring. Challenge ● How can I tell whether I'm deploying a better or worse model? ● Deployment to production may not work as expected Solution ● Request shadowing pattern (credit: @codingnirvana) Benefits ● Confidence increases along the pipeline, backed by metrics ● Monitoring in production is an important source of feedback (pipeline so far: local dev → VCS → unit tests → train & test → version artifact → deploy-staging → deploy-canary-prod; see the sketch below)
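     A hedged sketch of the request-shadowing pattern: serve every answer from the production model, mirror the same request to the canary, and log both predictions for offline comparison. The model interface and log schema are assumptions:

         # shadow.py
         import json
         import logging

         logger = logging.getLogger("shadow")

         def handle_request(features, prod_model, canary_model):
             prod_prediction = prod_model.predict([features])[0]
             try:
                 canary_prediction = canary_model.predict([features])[0]
             except Exception:
                 canary_prediction = None  # the canary must never break live traffic
             logger.info(json.dumps({
                 "features": features,
                 "prod": str(prod_prediction),
                 "canary": str(canary_prediction),
             }))
             return prod_prediction  # only the production model's answer is served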
  33. #6: Canary releases + monitoring (Demo)
  34. #7: Start simple (tracer bullet). Challenge ● Complex models take longer to develop and debug ● Getting all the "right" features can take weeks or months Solution ● Start with a simple model + simple features ● Create a solid pipeline first ● But go no simpler than what is required (and don't take expensive shortcuts) Benefits ● Discover integration issues/requirements sooner ● Demonstrate working software to stakeholders in less time (see the sketch below)
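     A sketch of a tracer-bullet model: the simplest thing that can exercise the whole pipeline end to end. sklearn's majority-class DummyClassifier is one option; the iris dataset stands in for real data:

         # tracer_bullet.py: wire up train -> version -> deploy -> monitor
         # with this before investing in real features.
         from sklearn.datasets import load_iris
         from sklearn.dummy import DummyClassifier

         X, y = load_iris(return_X_y=True)
         baseline = DummyClassifier(strategy="most_frequent").fit(X, y)
         print("Baseline accuracy:", baseline.score(X, y))  # the bar real models must beat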
  35. #7: Start simple (tracer bullet) (Demo): dev → run-unit-tests → train-and-evaluate-model → deploy
  36. #8: Collect more and better data with every release. Challenge ● Data collection is hard ● Garbage in, garbage out Solution ● Think about how you can collect labels (immediately or eventually) after serving predictions (credit: @mat_kelcey) ● Create bug reports for clients ● Complete the data pipeline cycle ● Caution: watch for attempts to game your ML system Benefits ● More and better data. 'Nuff said. (Full pipeline: local dev → VCS → unit tests → train & test → version artifact → deploy-staging → deploy-canary-prod → deploy-prod; see the sketch below)
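     One hedged way to make served predictions joinable with labels that arrive later; CSV storage and the column layout are illustrative choices only:

         # log_predictions.py
         import csv
         import json
         import uuid
         from datetime import datetime, timezone

         def log_prediction(features, prediction, log_path="predictions.csv"):
             prediction_id = str(uuid.uuid4())
             with open(log_path, "a", newline="") as f:
                 csv.writer(f).writerow([
                     prediction_id,
                     datetime.now(timezone.utc).isoformat(),
                     json.dumps(features),
                     prediction,
                 ])
             # Hand the ID back to the caller so later feedback/labels can cite it.
             return prediction_id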
  37. #9: Build cross-functional teams. Challenge ● How can we do all of the above? Solution ● Build cross-functional teams (data scientist, data engineer, software engineer, UX, BA) Benefits ● Fewer nails (because not everyone is a hammer) ● Improved empathy and fewer silos mean greater productivity
  38. #10: Kaizen mindset. Challenge ● How can we do all of the above? Solution ● Kaizen == 改善 == change for the better ● Go through deployment health checklists as a team Benefits ● Iteratively get to good
  39. #10: Kaizen - Health checklists ❏ General software engineering practices ❏ Source control (e.g. git) ❏ Unit tests ❏ CI pipeline to run automated tests ❏ Automated deployments ❏ Data / feature-related tests ❏ Test all code that creates input features, both in training and serving ❏ ... ❏ Model-related tests ❏ Test against a simpler model as a baseline ❏ ... Source: What's your ML Test Score? A rubric for ML production systems (Google, 2016)
  40. #10: Kaizen - Health checks ● How much calendar time does it take to deploy a model from staging to production? ● How much calendar time does it take to add a new feature to the production model? ● How comfortable does your team feel about iteratively deploying models?
  41. (image-only slide)
  42. Conclusion
  43. A generalizable approach for deploying ML models frequently and safely: local env → push to version control → (trigger) run unit tests → train and evaluate model (reading from the data/feature repository, writing to the model repository) → deploy candidate model to STAGING → (manual trigger) deploy model to PROD, with feedback flowing back at every stage. Credit: Continuous Delivery (Jez Humble, Dave Farley)
  44. Solve the right problem: we don't have a machine learning problem; we have a {business, data, software delivery, ML, UX} problem.
  45. Solve the right problem: 01 data collection → 02 machine learning → 03 deployment and monitoring (the focus of today's talk)
  46. How to deploy models to prod {frequently, safely, repeatably, reliably}? 1. Automate configuration management 2. Think about your test pyramid 3. Set up a continuous integration (CI) pipeline 4. Version your artifacts (i.e. models) 5. Automate deployment 6. Try canary releases 7. Start simple (tracer bullet) 8. Collect more and better data with every release 9. Build cross-functional teams 10. Adopt a kaizen / continuous improvement mindset
  47. THANK YOU
  48. 48. 52 We’re hiring! ● Software Developers (>= junior-level devs welcome) ● UX Designer ● Senior Information Security Consultant
  49. Resources for further reading ● Visibility and monitoring for machine learning (12-min video) ● Using continuous delivery with machine learning models to tackle fraud ● What's your ML Test Score? A rubric for ML production systems (Google) ● Rules of Machine Learning (Google) ● Continuous Delivery (Jez Humble, Dave Farley) ● Why you need to improve your training data and how to do it
