
Machine Learning at Work (仕事ではじめる機械学習)

How to run a project for building data-driven products, and the difficulties specific to building machine learning systems.

Published in: Engineering

  1. Aki Ariga | Field Data Scientist | 2018.05.17
  2. © Cloudera, Inc. All rights reserved.
     • Field Data Scientist at Cloudera
     • Previously research engineer at Toshiba, Rails developer at Cookpad
     • Co-author of “ ”
     • Founder of kawasaki.rb & MLCT
     • Twitter: @chezou
     • GitHub: https://github.com/chezou/
  3. Project procedure + Culture + Hidden technical debt in machine learning systems [2]
  4. Building a data-driven product ≠ Research
  5. A journey to a data-driven product: seven numbered steps [step text not preserved; steps 3 and 4 mention A/B], with the labels Culture, BI, Statistics, ML. Source: http://tjo.hatenablog.com/entry/2016/01/18/080000
  6. Procedure in a machine learning project: eight numbered steps [step text not preserved; surviving note: “Step.4 7”]
  7. Recommended members for a typical ML project [five bullets, text not preserved; one mentions “Web”]
  8. What’s the difference between academia and industry for ML?
  9. Production [image: “Production” by Nick Youngson, CC BY-SA 3.0, Alpha Stock Images]
  10. Sample data science/machine learning workflow: from data to exploration to action
     [Diagram] Stages: Data Engineering; Data Science (Exploratory); Production (Operational)
     Steps: Acquisition, Curation, Data Wrangling, Data Exploration, Model Training & Testing, Production Data Pipelines, Batch Scoring, Online Scoring, Serving
     Spanning all stages: Data Governance
     Outputs: Reports, Dashboards; Data Models; Predictions; Business value
  11. [Agenda: three numbered items, text not preserved; surviving keywords: Production, MLOps]
  12. [Agenda: three numbered items, text not preserved; surviving keywords: Production, MLOps]
  13. Four ML system patterns:
     1. Train by batch, predict on the fly, serve via REST API
     2. Train by batch, predict by batch, serve through the shared DB
     3. Train, predict, serve by streaming
     4. Train by batch, predict on mobile app
  14. ML System Pattern 1: Train by batch, predict on the fly, serve via REST API
     [Diagram] Components: Web Application, API Server (REST API), Batch System, DB, Trained Model
     Labels: Activity log/Contents data, Extract feature, Feature, Execute training, Training result, User ID/Item ID, Prediction result
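
A minimal sketch of Pattern 1, using only the Python standard library. It is an illustration under stated assumptions, not the architecture from the talk: the count-based "model", the function names (`train_batch`, `predict`, `make_handler`, `serve`), the `/predict?user_id=...` route, and port 8000 are all hypothetical.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse


def train_batch(activity_log):
    """Batch job (hypothetical): count how often each user viewed each item."""
    model = {}
    for user_id, item_id in activity_log:
        counts = model.setdefault(user_id, {})
        counts[item_id] = counts.get(item_id, 0) + 1
    return model


def predict(model, user_id):
    """On-the-fly prediction: the user's most viewed item, or None if unseen."""
    counts = model.get(user_id)
    if not counts:
        return None
    # sorted() makes ties deterministic: the alphabetically first item wins.
    return max(sorted(counts), key=counts.get)


def make_handler(model):
    """Build a handler that answers GET /predict?user_id=... with JSON."""

    class PredictHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            query = parse_qs(urlparse(self.path).query)
            user_id = query.get("user_id", [""])[0]
            payload = {"user_id": user_id, "item_id": predict(model, user_id)}
            body = json.dumps(payload).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, format, *args):
            pass  # keep the sketch quiet

    return PredictHandler


def serve(model, port=8000):
    """API server: loads the batch-trained model once, predicts per request."""
    HTTPServer(("127.0.0.1", port), make_handler(model)).serve_forever()
```

In this arrangement the web application only needs to know the REST route, which is one way to get the loose coupling the pattern aims for; the batch job and the API server share nothing but the trained model.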
  15. Example architecture: PMML + OpenScoring
     [Diagram] Layers: Model building layer; Predicting & serving layer
     Labels: CDSW, HDFS, Activity log, Extract feature & Train/update model, Trained Model, Export model as PMML, Updated model, Load model, Request to predict, Extract feature & Predict, Prediction results
  16. Example architecture: Docker-based API server
     [Diagram] Layers: Model building layer; Predicting & serving layer
     Labels: CDSW, HDFS, Object storage, Activity log, Extract feature & Train/update model, Trained Model, Save model on object storage, Pack the runtime env with Docker, Updated model, Load model, Request to predict, Extract feature & Predict, Prediction results
  17. Pattern 2: Train by batch, predict by batch, serve through the shared DB
     [Diagram] Components: Web Application, Batch System (Training Batch, Prediction Batch), DB, Trained Model
     Labels: Activity log/Contents data, Extract feature, Feature, Execute training, Training result, Prediction result, Serve prediction
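
A minimal sketch of Pattern 2, assuming SQLite as a stand-in for the shared DB. The table layout and the function names (`init_db`, `prediction_batch`, `lookup_prediction`) are hypothetical; the point is only that the batch writes predictions ahead of time and the web application does a plain lookup.

```python
import sqlite3


def init_db(conn):
    """Create the shared predictions table (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS predictions ("
        "user_id TEXT PRIMARY KEY, item_id TEXT NOT NULL)"
    )


def prediction_batch(conn, model, user_ids):
    """Prediction batch: score every user and upsert the results."""
    rows = [(user_id, model(user_id)) for user_id in user_ids]
    with conn:  # commit all rows in one transaction
        conn.executemany(
            "INSERT OR REPLACE INTO predictions (user_id, item_id) VALUES (?, ?)",
            rows,
        )
    return len(rows)


def lookup_prediction(conn, user_id):
    """Web application side: read the precomputed result, no model needed."""
    row = conn.execute(
        "SELECT item_id FROM predictions WHERE user_id = ?", (user_id,)
    ).fetchone()
    return row[0] if row else None
```

Because the web application never calls the model, a slow prediction step does not affect request latency; the cost is that results are only as fresh as the last batch run.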
  18. Example architecture: Serving by HBase/Kudu
     [Diagram] Layers: Model building & predicting layer; Serving layer
     Labels: CDSW, HDFS, Kudu/HBase, Activity log, Historical data, Extract feature & Train/update model, Trained Model, Updated model, Load trained model, Extract feature & Predict, Prediction results
  19. Pattern 3: Train, predict, serve by streaming
     [Diagram] Components: Web Application, Message queue (e.g. Kafka), Stream-based ML System (e.g. Spark Streaming), Trained Model
     Labels: Log data, Recent log data, Extract feature, Feature, Train & Predict, Model updates, Model, Prediction results
     Web Application notes:
     - Querying for prediction
     - Showing or sending alerts
     - This component may work with a message queue like Kafka
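
A minimal sketch of Pattern 3, assuming a toy anomaly alert as the prediction task. `queue.Queue` stands in for the message queue (the slide names Kafka), and the "model" is a running mean and variance updated with Welford's online algorithm; the class and function names and the z-score threshold are hypothetical choices for the example.

```python
import math
import queue


class RunningStats:
    """Welford's online mean/variance: the model is updated one event at a time."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # sum of squared deviations from the running mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def zscore(self, x):
        if self.n < 2:
            return 0.0  # not enough history to judge
        std = math.sqrt(self.m2 / (self.n - 1))
        return 0.0 if std == 0 else (x - self.mean) / std


def process_stream(events, results, threshold=3.0):
    """Predict on each event with the current model, then train on that event."""
    model = RunningStats()
    while True:
        x = events.get()
        if x is None:  # sentinel marking the end of the stream
            break
        is_alert = abs(model.zscore(x)) > threshold
        results.put((x, is_alert))
        model.update(x)
    return model
```

Training and prediction share one loop here, which is what makes the latency low and also what makes this pattern the hardest to operate: there is no separate batch step at which a bad model update can be caught.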
  20. Pattern 4: Train by batch, predict on a mobile app
     [Diagram] Components: Mobile Application, Batch System, DB, Trained Model
     Labels: Activity log/Contents data, Extract feature, Feature, Execute training, Training result, Convert model, Request for prediction, Prediction result
  21. Example architecture: Serving on a mobile app
     [Diagram] Layers: Model building layer; Predicting & serving layer
     Labels: CDSW, HDFS, Storage in a smartphone, Activity log, Extract feature & Train/update model, Trained Model, Convert model to TFLite/CoreML, Updated model, Load model, Request to predict, Extract feature & Predict, Prediction results
  22. Pattern 4’: Federated learning. Source: https://research.googleblog.com/2017/04/federated-learning-collaborative.html
  23. Comparison of the 4 patterns (NRT: near real time)

     | | Pattern 1 (REST API) | Pattern 2 (Shared DB) | Pattern 3 (Streaming) | Pattern 4 (Mobile app) |
     |---|---|---|---|---|
     | Training | by batch | by batch | NRT (by streaming) | by batch |
     | Prediction | NRT (on the fly) | by batch | NRT (by streaming) | NRT (on the fly) |
     | Prediction result delivery | NRT (via REST API) | NRT (through the shared DB) | NRT (by streaming via MQ) | NRT (via in-process API on mobile) |
     | Latency for prediction from getting new data | So-so | So-so to long | Very low | Low |
     | Required time to predict | Short | Long | Short | Short |
     | Tight/loose coupling with app | Loose | Loose | Loose | Tight |
     | Language dependency | Independent | Independent | Independent | Depends on frameworks |
     | System management difficulty | So-so | Easy | Very hard | So-so |
  24. CI, CD, and blue-green deployment. Source: https://www.slideshare.net/hiroakikudo77/ss-84593653/14
  25. [Agenda: three numbered items, text not preserved; surviving keywords: Production, MLOps]
  26. [Three bullets, text mostly not preserved] Surviving fragment: /Feedback loop
  27. [Bullets mostly not preserved] Surviving fragments: MeCab; /Feedback loop. Source: https://twitter.com/hagino3000/status/986257856730034177
  28. [Bullets mostly not preserved] Surviving fragments: “safe to serve” & “desired prediction quality” [4]; (offline) / (online); “Silent failures” [3]; Join; serving
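
One way to read the “safe to serve” and “desired prediction quality” checks is as a gate that a candidate model has to pass before it replaces the current one. The sketch below is an assumption about how such a gate could look, not the TFX implementation from [4]: the function names, the accuracy metric, the 0.5 cutoff, and the tolerance are all hypothetical.

```python
import math


def is_safe_to_serve(model, sample_inputs):
    """True if the model returns a finite number for every sample without raising."""
    for features in sample_inputs:
        try:
            score = model(features)
        except Exception:
            return False
        if not isinstance(score, (int, float)) or not math.isfinite(score):
            return False
    return True


def accuracy(model, labeled_examples, cutoff=0.5):
    """Fraction of examples where the thresholded score matches the label."""
    hits = sum((model(x) >= cutoff) == bool(y) for x, y in labeled_examples)
    return hits / len(labeled_examples)


def should_push(candidate, current, sample_inputs, holdout, tolerance=0.01):
    """Push only if the candidate is safe and not clearly worse than the current model."""
    if not is_safe_to_serve(candidate, sample_inputs):
        return False
    return accuracy(candidate, holdout) >= accuracy(current, holdout) - tolerance
```

Running the safety check first means a model that crashes or emits NaN is rejected before any quality comparison is attempted, which is the kind of failure that would otherwise stay silent until serving.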
  29. [Bullets mostly not preserved] Surviving fragments: citations [1], [4], [2,4]; examples: DVC, Bitemporal Modeling
  30. [Agenda: three numbered items, text not preserved; surviving keywords: Production, MLOps]
  31. [Bullets mostly not preserved] Surviving fragments: [7]; Google, Facebook [4, 9]; Researcher, Dev, Ops: https://www.slideshare.net/syou6162/ss-88255142
  32. [Bullets mostly not preserved] Surviving fragments: IoT [8]; (GDPR)
  33. [Bullets mostly not preserved] Surviving fragments: Data-driven product; ML systems Production
  34. References
     [1] “My model has higher BLEU, can I ship it? The Joel Test for machine learning systems”, L. Park, ACML-AIMLP Workshop, 2017
     [2] “Hidden Technical Debt in Machine Learning Systems”, D. Sculley et al., NIPS 2015
     [3] “Rules of Machine Learning: Best Practices for ML Engineering”, M. Zinkevich
     [4] “TFX: A TensorFlow-Based Production-Scale Machine Learning Platform”, D. Baylor et al., KDD 2017
     [5] “What’s your ML test score? A rubric for ML production systems”, E. Breck et al., Reliable Machine Learning in the Wild, NIPS 2016 Workshop
     [6] [author and title not preserved], ML Ops Study #1, 2017
     [7] [author and title not preserved], HACKER TACKLE 2018, 2018
     [8] “DevOps for models: How to manage millions of models in production—and at the edge”, T. Tung et al., Strata Data Singapore, 2017
     [9] “Applied Machine Learning at Facebook: A Datacenter Infrastructure Perspective”, K. Hazelwood et al., IEEE HPCA, 2018
  35. Thank you