
ML Workshop 1: A New Architecture for Machine Learning Logistics

Having heard the high-level rationale for the rendezvous architecture in the introduction to this series, we will now dig in deeper to talk about how and why the pieces fit together. In terms of components, we will cover why streams work, why they need to be persistent, performant and pervasive in a microservices design and how they provide isolation between components. From there, we will talk about some of the details of the implementation of a rendezvous architecture including discussion of when the architecture is applicable, key components of message content and how failures and upgrades are handled. We will touch on the monitoring requirements for a rendezvous system but will save the analysis of the recorded data for later. Listen to the webinar on demand: https://mapr.com/resources/webinars/machine-learning-workshop-1/

Published in: Data & Analytics

  1. © 2017 MapR Technologies. Machine Learning Model Management: The working of the rendezvous framework
  2. Contact Information
      Ted Dunning, PhD
      Chief Application Architect, MapR Technologies
      Committer, PMC member, board member, ASF
      O’Reilly author
      Email: tdunning@mapr.com, tdunning@apache.org
      Twitter: @Ted_Dunning
  3. Traditional View
  4. Traditional View: This isn’t the whole story
  5. 90% of the effort in successful machine learning isn’t in the training or model dev… It’s the logistics
  6. Rendezvous Architecture (diagram labels: request, Input, Model 1, Model 2, Model 3, Scores, Rendezvous, Results, response)
  7. What We Ultimately Want (diagram labels: request, Model, response)
  8. But This Isn’t The Answer (diagram labels: request, Load balancer, Model 1, Model 2, Model 3, response)
  9. First Try with Streams (diagram labels: request, Input, Model 1, Model 2, Model 3, response, ?)
  10. First Rendezvous (diagram labels: request, Input, Model 1, Model 2, Model 3, Scores, Rendezvous, Results, response)
  11. Some Key Points
      • Note that all models see identical inputs
      • All models run in production setting
      • All models send scores to same stream
      • The rendezvous server decides which scores to ignore
      • Roll forward, roll back, correlated comparison are all now trivial
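
The key points on slide 11 can be illustrated with a minimal sketch. It is not the authors' implementation: plain Python lists stand in for the input and score streams, and the names `fan_out`, `rendezvous`, and `preference` are hypothetical. Every model scores the same request, all scores land in one shared list, and the rendezvous step keeps one score per request and ignores the rest.

```python
from collections import defaultdict


def fan_out(request, models):
    """Score one request with every model; all scores go to one shared list.

    The list stands in for the shared scores stream (an assumption of this sketch).
    """
    return [
        {"messageId": request["messageId"], "model": name, "score": model(request["data"])}
        for name, model in models.items()
    ]


def rendezvous(scores, preference):
    """Keep one score per request, following a preference order; ignore the rest."""
    by_request = defaultdict(dict)
    for entry in scores:
        by_request[entry["messageId"]][entry["model"]] = entry["score"]
    results = {}
    for message_id, by_model in by_request.items():
        for name in preference:
            if name in by_model:
                results[message_id] = (name, by_model[name])
                break
    return results


# Toy models: every one of them sees the identical input.
models = {
    "model_1": lambda x: x + 1,
    "model_2": lambda x: x + 2,
    "model_3": lambda x: x + 3,
}
scores = fan_out({"messageId": "r1", "data": 10}, models)

# Rolling forward or back only changes the preference order, not the models.
rolled_forward = rendezvous(scores, ["model_3", "model_1"])
rolled_back = rendezvous(scores, ["model_1"])
```

Because all models have already scored the request, switching from `model_3` back to `model_1` requires no redeployment in this sketch, only a different preference list.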
  12. Reality Check, Injecting External State (diagram labels: The world, request, Raw, Add external data, Database, Input, Model 1, Model 2, Model 3)
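
One way to read slide 12 is that external state is looked up once and attached to the raw request before any model sees it. The sketch below assumes that reading. The `inject_state` function, the `userId` and `external` fields, and the dictionary standing in for the database are all hypothetical.

```python
import copy


def inject_state(raw, database):
    """Return a copy of the raw request enriched with a snapshot of external state.

    The snapshot is copied so that later database updates cannot change
    what the models were shown (field names here are assumptions).
    """
    enriched = copy.deepcopy(raw)
    enriched["external"] = copy.deepcopy(database.get(raw["userId"], {}))
    return enriched


database = {"u1": {"visits": 3}}  # stand-in for the external database
raw = {"messageId": "r1", "userId": "u1"}

model_input = inject_state(raw, database)

# A later update to the database does not alter the input already produced.
database["u1"]["visits"] = 4
```

Every model then reads `model_input` instead of querying the database itself, so all models see the same external data.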
  13. Recording Raw Data (as it really was) (diagram labels: Input, Decoy, Model 2, Model 3, Scores, Archive)
  14. Quality & Reproducibility of Input Data is Important!
      • Recording raw-ish data is really a big deal
        – Data as seen by a model is worth gold
        – Data reconstructed later often has time-machine leaks
        – Databases were made for updates, streams are safer
      • Raw data is useful for non-ML cases as well (think flexibility)
      • Decoy model records training data as seen by models under development & evaluation
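
A decoy along the lines of slides 13 and 14 can be sketched as a "model" that produces no score and only archives what it was given. This is an illustration under stated assumptions: the `Decoy` class and its `score` method are hypothetical names, and an in-memory `io.StringIO` stands in for the archive.

```python
import io
import json


class Decoy:
    """Looks like a model to the framework, but only records its inputs."""

    def __init__(self, archive):
        self.archive = archive  # any writable text file-like object

    def score(self, message):
        # Write the input exactly as received, one JSON document per line.
        self.archive.write(json.dumps(message, sort_keys=True) + "\n")
        return None  # a decoy never contributes a score


archive = io.StringIO()  # stand-in for durable storage
decoy = Decoy(archive)
decoy.score({"messageId": "r1", "features": {"visits": 3}})
decoy.score({"messageId": "r2", "features": {"visits": 7}})

# Later, training code can replay the inputs as the models actually saw them.
recorded = [json.loads(line) for line in archive.getvalue().splitlines()]
```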
  15. Canary for Comparison (diagram labels: Input, Real model, Canary, Decoy, Archive, ∆, Result)
  16. What Does the Canary Do?
      • The canary is a real model, but is very rarely updated
      • The canary results are almost never used for decisioning
      • The virtue of the canary is stability
      • Comparing to the canary results gives insight into new models
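
The comparison against the canary on slides 15 and 16 can be sketched as a per-request score difference. The slides do not say which statistic to use; the mean absolute difference below is an assumption, as are the function name `score_deltas` and the toy scores.

```python
from statistics import mean


def score_deltas(model_scores, canary_scores):
    """Difference between model and canary scores on requests both have scored."""
    shared = sorted(model_scores.keys() & canary_scores.keys())
    return {mid: model_scores[mid] - canary_scores[mid] for mid in shared}


# Scores keyed by messageId (toy values chosen for illustration).
canary = {"r1": 0.5, "r2": 0.5, "r3": 0.25}
candidate = {"r1": 0.75, "r2": 0.25, "r4": 1.0}

deltas = score_deltas(candidate, canary)
mean_abs_delta = mean(abs(d) for d in deltas.values())
```

Since the canary rarely changes, a shift in these differences over time points at the candidate model or at the input data rather than at the baseline.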
  17. Isolated Development With Stream Replication (diagram labels: The world, request, Raw, Add external data, Input, Model 1, Model 2, Model 3, Internal 1, Internal 2, Internal 3, Production, Raw, New external data, Input, Model 4, Internal 4, Development)
  18. (diagram labels: Raw, Input, Features / profiles, m1, m2, m3, Decoy, Archive, Scores)
  19. (diagram labels: Raw, Input, Features / profiles, m1, m2, m3, Decoy, Archive, Scores, Rendezvous, Results)
  20. (diagram labels: Raw, Input, Features / profiles, m1, m2, m3, Decoy, Archive, Scores, Rendezvous, Results, Metrics, Metrics)
  21. Some Details
      • Inside the rendezvous server
        – Message contents … highlight return address
        – Rendezvous mailbox
        – Schedule ideas
      • Inside a model container
        – Identical inputs make scaling easy
        – Nearly stateless models
        – Streaming shims, latency rig
  22. Message Content
      • Input request contains request data plus administrivia
        {
          timestamp: 1501020498314,
          messageId: "2a5f2b61fdd848d7954a51b49c2a9e2c",
          return: "proxy-217",
          provenance: { ... },
          diagnostics: { ... },
          ... application specific data here ..
        }
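
A request envelope with the fields shown on slide 22 might be built as follows. Only the field names come from the slide. Treating `timestamp` as milliseconds since the epoch and `messageId` as a random UUID in hex form are assumptions based on the sample values, and `make_request` is a hypothetical helper.

```python
import time
import uuid


def make_request(payload, return_address):
    """Wrap application data in the administrative fields named on the slide."""
    return {
        "timestamp": int(time.time() * 1000),  # assumed: milliseconds since epoch
        "messageId": uuid.uuid4().hex,         # assumed: 32-character hex id
        "return": return_address,              # where the response should be sent
        "provenance": {},                      # contents not specified on the slide
        "diagnostics": {},                     # contents not specified on the slide
        **payload,                             # application-specific data
    }


request = make_request({"amount": 42}, "proxy-217")
```

The return address travels with the message, so whichever component produces the final result knows where to deliver it without shared connection state.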
  23. Rendezvous Schedules
      • Simple part
        – Up to deadline, accept preferred models
        – Up to next deadline, accept more models
        – Near final deadline, accept default answer
      • But also some probabilistic choice
      • And also consider external experimental control
        – Inject as external state
        – Use in rendezvous to select model result
        – Open question how much power to expose
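
The "simple part" of the schedule on slide 23 can be sketched as a list of time windows, each naming the models whose scores are acceptable in that window. The deadlines, the model names, and the `choose` function are hypothetical, and the probabilistic and experimental-control parts of the slide are left out.

```python
def choose(schedule, default, arrived, elapsed_ms):
    """Pick a result given the scores that have arrived so far.

    schedule: list of (deadline_ms, acceptable_model_names), earliest first.
    arrived:  dict of model name -> score received so far.
    Returns (model_name, score), or None to keep waiting.
    """
    for deadline_ms, acceptable in schedule:
        if elapsed_ms <= deadline_ms:
            for name in acceptable:
                if name in arrived:
                    return name, arrived[name]
            return None  # still inside this window and nothing acceptable yet
    return "default", default  # past the final deadline


# Assumed schedule: only the preferred model up to 50 ms, then also a fallback up to 100 ms.
schedule = [(50, ["preferred"]), (100, ["preferred", "fallback"])]

early = choose(schedule, 0.0, {"fallback": 0.2}, elapsed_ms=30)
later = choose(schedule, 0.0, {"fallback": 0.2}, elapsed_ms=80)
timed_out = choose(schedule, 0.0, {}, elapsed_ms=120)
fast = choose(schedule, 0.0, {"preferred": 0.9}, elapsed_ms=10)
```

In this sketch the fallback score is held back during the first window, which gives the preferred model a chance to answer before a less preferred result is released.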
  24. The rendezvous server is simpler than it looks at first
  25. Model Life Cycle
      • Developer / modeler produces container spec
        – And uses this to build their development article
      • QA inspects container spec
        – And uses this to build a test article
      • Security inspects container spec
        – And uses this to build final artifact
      • Important to use tools like Grafeas to inspect supply chain http://bit.ly/grafeas
      • Important that each step be inspectable
  26. Almost all of the framework scales by trivial parallelism
  27. Scaling Up
      • Note about streams
        – At millions of updates per server, the streams aren’t part of the streaming question
      • Scaling up state injection
        – Partition raw input, replicate state injector
        – Beware external throughput limits
        – State injection does avoid duplicate queries
      • Scaling up models
        – Stateless models allow trivial scaling
        – Sequence state typically also trivial to scale
      • Scaling up the rendezvous
        – Match partition on raw and scores
        – Replicate trivially
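
"Match partition on raw and scores" on slide 27 can be read as: derive the partition from the same key on both streams, so that one rendezvous replica sees a request and all of its scores. The sketch below assumes the key is `messageId` and uses a SHA-256 digest for a hash that is stable across processes. The function `partition_for` and the partition count are hypothetical, and a real streaming system would normally apply its own partitioner.

```python
import hashlib


def partition_for(message_id, num_partitions):
    """Map a message id to a partition using a process-independent hash."""
    digest = hashlib.sha256(message_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


NUM_PARTITIONS = 4  # assumed; must be the same for the raw and scores streams
message_id = "2a5f2b61fdd848d7954a51b49c2a9e2c"

# The raw request and every score for it carry the same messageId,
# so both land on the same partition and the same rendezvous replica.
raw_partition = partition_for(message_id, NUM_PARTITIONS)
score_partition = partition_for(message_id, NUM_PARTITIONS)
```

Python's built-in `hash` is avoided here because string hashing is randomized per process by default, which would break the match between producers.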
  28. (diagram; same labels as slide 20)
  29. (diagram; same labels as slide 20)
  30. (diagram; same labels as slide 20)
  31. (diagram; same labels as slide 20)
  32. In-place update of the framework via modified Chandy-Lamport
  33. Transition Message (diagram labels: Raw, Input, Features / profiles)
  34. Transition Message (diagram labels: Raw, Input, Features / profiles, Features / profiles)
  35. Transition Message (diagram labels: Raw, Input, Features / profiles, Features / profiles)
  36. Summary: This is easy-ish
  37. Summary: This is easy-ish
  38. Summary: This is easy-ish. Well, it isn’t real hard
  39. First Rendezvous (diagram labels: request, Input, Model 1, Model 2, Model 3, Scores, Rendezvous, Results, response)
  40. Additional Resources
      • O’Reilly report by Ted Dunning & Ellen Friedman, © March 2017. Read free courtesy of MapR: https://mapr.com/geo-distribution-big-data-and-analytics/
      • O’Reilly book by Ted Dunning & Ellen Friedman, © March 2016. Read free courtesy of MapR: https://mapr.com/streaming-architecture-using-apache-kafka-mapr-streams/
  41. Additional Resources
      • O’Reilly book by Ted Dunning & Ellen Friedman, © June 2014. Read free courtesy of MapR: https://mapr.com/practical-machine-learning-new-look-anomaly-detection/
      • O’Reilly book by Ellen Friedman & Ted Dunning, © February 2014. Read free courtesy of MapR: https://mapr.com/practical-machine-learning/
  42. Additional Resources
      • By Ellen Friedman, 8 Aug 2017, on MapR blog: https://mapr.com/blog/tensorflow-mxnet-caffe-h2o-which-ml-best/
      • By Ted Dunning, 13 Sept 2017, in InfoWorld: https://www.infoworld.com/article/3223688/machine-learning/machine-learning-skills-for-software-engineers.html
  43. New book: Machine Learning Logistics: Model Management in the Real World
      • O’Reilly book by Ellen Friedman & Ted Dunning, © Sept 2017
      • Download free from MapR: http://info.mapr.com/2017_Content_Machine-Learning-Logistics_eBook_Prereg_RegistrationPage.html
      • Going to Strata Data NYC? Book will be released 26 Sept 2017. Visit MapR booth for free book signings or to talk about logistics
  44. Please support women in tech: help build girls’ dreams of what they can accomplish. © Ellen Friedman 2015. #womenintech #datawomen
  45. Q&A. Engage with us: @mapr, @Ted_Dunning, tdunning@mapr.com
