Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

Data Science Africa 2018 Deploying ML models

419 views

Published on

This talk was given at Data Science Africa 2018 in Nyeri, Kenya http://www.datascienceafrica.org/dsa2018/

Machine learning only delivers on its promise when it is deployed.

We discuss different kinds of deployments, suggestions for workflows and best practice for running machine learning projects and systems. We also do a live demo of Clipper, an open source model serving framework from UC Berkeley RISE lab.

Published in: Technology

Data Science Africa 2018 Deploying ML models

  1. 1. Confidential Restricted © 2017 Arm Limited1 Confidential Restricted © 2017 Arm Limited Deploying models 4 June 2018 Data Science Africa, Nyeri, Kenya Damon Civin Principal data scientist @DCivin With many thanks to Gen-Tao Chiang and Vikas Kache
  2. 2. Confidential Restricted © 2017 Arm Limited2
  3. 3. Confidential Restricted © 2017 Arm Limited3 My bio in 5 photos MSc in applied maths and noise punk in South Africa PhD in maths at Cambridge Drove from Cambridge to Mongolia in a tiny car Developer at a cyber- security startup Data Science strategy at ARM
  4. 4. Confidential Restricted © 2017 Arm Limited4 Outline About me Why do we do data science? Making the promise of ML real: deployment (systems) Demo: Clipper (RISE) Best practices
  5. 5. Confidential Restricted © 2017 Arm Limited5 What is Data Science?
  6. 6. Confidential Restricted © 2017 Arm Limited6 Why do we do data science? New features Optimising features Dashboards for decision making Ad-hoc analysis
  7. 7. Confidential Restricted © 2017 Arm Limited7 Why do we do data science?
  8. 8. Confidential Restricted © 2017 Arm Limited8 Why do we do data science?
  9. 9. Confidential Restricted © 2017 Arm Limited9 Why do we do data science?
  10. 10. Confidential Restricted © 2017 Arm Limited10 Why do we do data science?
  11. 11. Confidential Restricted © 2017 Arm Limited11 Why do we do data science?
  12. 12. Confidential Restricted © 2017 Arm Limited12 Deployment makes the promise of ML real
  13. 13. Confidential Restricted © 2017 Arm Limited13 Deployment makes the promise of ML real
  14. 14. Confidential Restricted © 2017 Arm Limited14
  15. 15. Confidential Restricted © 2017 Arm Limited15 Simplifies integration of ML in apps Clipper makes product teams happy. Simplifies model deployment Clipper makes data scientists happy. Throughput: Clipper makes the infra-team less unhappy. Accuracy: Clipper makes users happy.
  16. 16. Confidential Restricted © 2017 Arm Limited16 Model deployment demo
  17. 17. Confidential Restricted © 2017 Arm Limited17 Try it! curl -X POST -d '{"input": [4.3, 2.0,1.0, 0.1]}' 172.16.34.147:1337/iris/predict
  18. 18. Confidential Restricted © 2017 Arm Limited18
  19. 19. Confidential Restricted © 2017 Arm Limited19 Containers & docker • Standardized packaging for software and dependencies • Isolate apps from each other • Share the same OS kernel • Works with all major Linux and Windows Server
  20. 20. Confidential Restricted © 2017 Arm Limited20 Scaling with Kubernetes
  21. 21. Confidential Restricted © 2017 Arm Limited21 Clipper docs - http://clipper.ai/
  22. 22. Confidential Restricted © 2017 Arm Limited22
  23. 23. Confidential Restricted © 2017 Arm Limited23 1. Solve the right problem 2. Fail better every day 3. Data quality matters 4. Simplicity is your friend 5. Debugging makes you a wizard 6. Fairness & privacy are not dirty words
  24. 24. Confidential Restricted © 2017 Arm Limited24 What is your problem?
  25. 25. Confidential Restricted © 2017 Arm Limited25 What is your problem? DJ Patil US Chief Data Scientist
  26. 26. Confidential Restricted © 2017 Arm Limited26 What is your problem? Sheryl Sandberg COO Facebook
  27. 27. Confidential Restricted © 2017 Arm Limited27 Failure
  28. 28. Confidential Restricted © 2017 Arm Limited28 Failure
  29. 29. Confidential Restricted © 2017 Arm Limited29 Failure
  30. 30. Confidential Restricted © 2017 Arm Limited30 Failure
  31. 31. Confidential Restricted © 2017 Arm Limited31 Data quality
  32. 32. Confidential Restricted © 2017 Arm Limited32 Data quality
  33. 33. Confidential Restricted © 2017 Arm Limited33 Data quality
  34. 34. Confidential Restricted © 2017 Arm Limited34 Data quality
  35. 35. Confidential Restricted © 2017 Arm Limited35 Data quality
  36. 36. Confidential Restricted © 2017 Arm Limited36 Data quality
  37. 37. Confidential Restricted © 2017 Arm Limited37 Data quality
  38. 38. Confidential Restricted © 2017 Arm Limited38 Simplicity
  39. 39. Confidential Restricted © 2017 Arm Limited39 Simplicity "There is a lot of value in combining many different datasets and computing simple stats ” -- John Quinn, DSA2017
  40. 40. Confidential Restricted © 2017 Arm Limited40 Simplicity Visualisation Linear models Rules (gross!)
  41. 41. Confidential Restricted © 2017 Arm Limited41 Simplicity • Very soon, your simple prototype will become a complex system • You are uncertain at the beginning of a project, so capture that! • Simple models can be a baseline to measure complex models later
  42. 42. Confidential Restricted © 2017 Arm Limited42 Want to be a wizard? Get debugging! 1 2 3 4 5 6 7 8 9 from unnecessary_math import multiply def test_numbers_3_4(): assert multiply(3,4) == 12 def test_strings_a_3(): assert multiply('a',3) == 'aaa' http://pythontesting.net/framework/nose /nose-introduction/
  43. 43. Confidential Restricted © 2017 Arm Limited43 Want to be a wizard? Get debugging! 1 2 3 4 5 6 7 8 9 from unnecessary_math import multiply def test_numbers_3_4(): assert multiply(3,4) == 12 def test_strings_a_3(): assert multiply('a',3) == 'aaa' > nosetests test_um_nose.py .. ---------------------------------------------------------------------- Ran 2 tests in 0.000s OK http://pythontesting.net/framework/nose /nose-introduction/
  44. 44. Confidential Restricted © 2017 Arm Limited44 Want to be a wizard? Get debugging!
  45. 45. Confidential Restricted © 2017 Arm Limited45 Deployment makes the promise of ML real
  46. 46. Confidential Restricted © 2017 Arm Limited46 Want to be a wizard? Get debugging! • how asking dumb questions is actually a superpower • how you can read the source code to programs when all other avenues fail • debugging tools that make you FEEL like a wizard • Understanding what your _organization_ needs can make you amazing -- Julia Evans https://jvns.ca/blog/so-you-want-to-be-a-wizard/
  47. 47. Confidential Restricted © 2017 Arm Limited47 Fairness & Privacy Surveillance or Assistance?
  48. 48. Confidential Restricted © 2017 Arm Limited48 Fairness & Privacy
  49. 49. Confidential Restricted © 2017 Arm Limited49 Can we use the tools above to build fair ML? 1. Solve the right problem 2. Fail better every day 3. Data quality matters 4. Simplicity is your friend 5. Debugging makes you a wizard 1. Thresholds the same for all groups? 2. Commit to fix fairness issues 3. Bias is in the data, not the algorithms 4. Easier to interpret at first 5. Unit test for fairness
  50. 50. Confidential Restricted © 2017 Arm Limited50 @DCivin Further reading Martin Zinkevitch – Rules of ML for Google engineers Andrew Ng -- ML systems at NIPS 2016 Peter Warden – How and why to improve your training data Michael Jordan – The AI revolution hasn’t happened yet Lydia Liu et al – delayed impact of fairness in ML My more detailed notes on data science projects Jason Brownlee – Concept Drift Info Katie Malone -- Picking data science projects

×