
ODSC London - ML in production

Productionising machine learning pipelines can be a daunting task for data scientists. Fortunately, many new tools and technologies have become available in recent years that make it easier than ever to deploy ML models into production. In this session, Oliver walks through some of the best options for operationalising ML pipelines within the TensorFlow ecosystem and on Google Cloud Platform, based on actual case studies.

Published in: Technology


  1. 1. ML IN PRODUCTION: Serverless and Painless. Oliver Gindele, @tinyoli, oliver@datatonic.com. 22.11.2019, ODSC London
  2. 2. Who is Oliver? + Head of Machine Learning + PhD in computational physics Who is datatonic? We are a strong team of data scientists, machine learning experts, software engineers and mathematicians. Our mission is to provide tailor-made systems to help your organization get smart actionable insights from large data volumes.
  3. 3. Why is moving models into production hard?
  4. 4. ML Project Life Cycle (diagram; steps are numbered 1 to 10 on the slide). Start a new ML project. Define ML use cases: define specific ML use cases for the project. Select algorithm: choose the right ML algorithm for the task. Build ML model: develop the first iteration of the ML model. Present results: present results of the model in a way that demonstrates its value to stakeholders. Iterate ML model: refine the ML model to improve performance and efficacy. Data pipeline & feature engineering: create the right features from raw data for the ML task. Plan for deployment: prepare for deployment in production. Operationalize model: deploy and operationalize the ML model in production. Monitor model: monitor the deployed ML model and retrain or rebuild when performance degrades. Data exploration: perform exploratory analysis to understand the data. Phase labels: Discover Model Build Deploy.
  5. 5. ML Project Life Cycle (diagram repeated from slide 4), annotated: The hard part! Needs SE/DevOps skills
  6. 6. The Model + MobileNetV2 + Lots of image augmentation, careful data selection + Transfer learning → retrain top layers + Small changes to overall architecture + Hyperparameter tuning → F1 score: 45% -> 97%
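The F1 score quoted on slide 6 is the harmonic mean of precision and recall. The slides do not say how it was averaged across classes, so the following is only a minimal sketch for the binary case; the function name and the example counts are hypothetical.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 for a single class from true positives, false positives
    and false negatives (binary case, assumed for illustration)."""
    if tp == 0:
        # No true positives: precision and recall are both 0 (or undefined)
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # Harmonic mean of precision and recall
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical counts, not taken from the talk
    print(f1_score(tp=8, fp=2, fn=2))
```

Because it is a harmonic mean, F1 stays low unless both precision and recall are high, which makes it a stricter summary than accuracy on imbalanced image classes.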
  7. 7. ML Project Life Cycle (diagram repeated from slide 4)
  8. 8. We want: + Quick development cycles 🚀 + Continuous delivery 🔁 + Ship ML models like “normal” software 📦 + As much automation as possible 🤖
  10. 10. Why MLOps? + Orchestration of multiple pipelines + Scalable ML Applications + Flexible, self-serve R&D + Reliable APIs + Ongoing data validation + Monitoring and validation of ML predictions + Model governance/versioning + Continuous Integration and Deployment
  11. 11. Avoid learning all this!
  12. 12. The Landscape (a sample): TensorFlow Extended (TFX), Machine-learning Studio, AWS SageMaker, GCP AI Platform
  13. 13. Current issues with data science platforms (generalisations, 1/2) + Incomplete, missing features + Not stable yet, still maturing (and changing!) + New custom languages/APIs to learn + Vendor Lock-in
  14. 14. Current issues with data science platforms (generalisations, 2/2) + Requires a non-DS/ML skillset (Docker, Spark, K8s) + Experimentation can’t be contained to a platform + Data access/data silos not solved + Focus on DS/ML but not business value
  15. 15. What now? (2019 View) Check if one of these tools ticks all your boxes → If not: roll your own tailored solution - It’s not that hard → Serverless and managed options are your friend! Google Cloud Platform example tools: Composer: managed Airflow (orchestration); BigQuery: fully managed data warehouse; GCS: blob storage; AI Platform: managed (training, deployment, notebooks); Dataflow: managed ETL
  16. 16. Let’s Build a Machine Learning Pipeline for Computer Vision 🖼 Preprocess images ⚙ Automate model training and evaluation 🔬 Monitor and track model performance 🏷 Store and version models
  17. 17. Training Images → ❓ → .tflite model → ❓ → Android and iOS apps 🤷🏻‍♀ 🤷🏼‍♂
  18. 18. Even more challenges on mobile 🏎 Processor speed 📦 Storage space 🔋 Power ⏰ Latency 📴 Disconnections 📡 Bandwidth
  19. 19. Data preparation Machine Learning Pipeline Overview Image Upload GCS
  20. 20. Data preparation: Image Upload (GCS); Convert images to TFRecords (Dataflow); Data augmentation (AI Platform); Train/Eval Sets (GCS)
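Slide 20 ends with separate train and eval sets, but the transcript does not say how images were assigned to each. One common approach, assumed here purely for illustration, is to hash each filename so the assignment stays stable across pipeline re-runs. The function name, the 20% eval share and the example filenames are all hypothetical.

```python
import hashlib


def assign_split(filename: str, eval_percent: int = 20) -> str:
    """Deterministically assign a file to 'train' or 'eval'.

    The same filename always lands in the same split, so re-running
    the pipeline never leaks eval images into the training set.
    """
    digest = hashlib.sha256(filename.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100  # stable bucket in 0..99
    return "eval" if bucket < eval_percent else "train"


if __name__ == "__main__":
    # Hypothetical filenames
    for name in ["img_0001.jpg", "img_0002.jpg", "img_0003.jpg"]:
        print(name, assign_split(name))
```

A hash-based split needs no shared state between workers, which suits a parallel batch job, but the resulting proportions are only approximately equal to `eval_percent`.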
  21. 21. Convert images to TFRecords 🏃 Dataflow is a serverless runner for Beam pipelines 🌬 Converted 50k JPEG training images in 10 minutes 🖼 TFRecords are serialized images stored in a set of files 🐍 Batch data processing in Apache Beam (Python)
  22. 22. Machine Learning: Model training & evaluation (AI Platform); Convert model to TFLite (AI Platform); Store evaluation metrics (BigQuery); Model versions (GCS)
  23. 23. Train, evaluate and deploy model 📈 Serverless scalable model training with AI Platform 🔀 Distributed training & hyperparameter tuning ♻ TensorFlow code can run on AI Platform with no change 🔗 Easily attach GPUs and TPUs 📲 Convert to .tflite (3.5 MB with full integer quantization)
  24. 24. Quantization https://heartbeat.fritz.ai/8-bit-quantization-and-tensorflow-lite-speeding-up-mobile-inference-with-low-precision-a882dfcafbbd
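To make the idea behind slides 23 and 24 concrete, here is a minimal pure-Python sketch of 8-bit affine quantization, where a real value is approximated as `scale * (q - zero_point)` with `q` stored as an int8. It illustrates the general scheme only: the actual TFLite converter chooses its parameters from calibration data and differs in details, and all function names below are hypothetical.

```python
def quant_params(values, qmin=-128, qmax=127):
    """Pick scale and zero point so [min, max] maps onto [qmin, qmax]."""
    # Extend the range to include 0.0 so that zero is exactly representable
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    if hi == lo:
        return 1.0, 0
    scale = (hi - lo) / (qmax - qmin)
    zero_point = int(round(qmin - lo / scale))
    return scale, zero_point


def quantize(values, scale, zero_point, qmin=-128, qmax=127):
    """Map floats to integers, clamping to the int8 range."""
    return [max(qmin, min(qmax, int(round(v / scale)) + zero_point))
            for v in values]


def dequantize(qvalues, scale, zero_point):
    """Recover approximate floats from the quantized integers."""
    return [scale * (q - zero_point) for q in qvalues]


if __name__ == "__main__":
    weights = [-1.0, -0.25, 0.0, 0.5, 1.0]  # hypothetical weights
    scale, zp = quant_params(weights)
    q = quantize(weights, scale, zp)
    print(q, dequantize(q, scale, zp))
```

Each quantized value occupies one byte instead of the four bytes of a float32, which is where the roughly fourfold reduction in model size comes from; the cost is a rounding error of at most about half a `scale` step per value.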
  25. 25. Evaluate new models
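The transcript of slide 25 gives only the title, so how new models were compared is not recoverable. A simple pattern, assumed here for illustration, is to promote a newly trained model only when its stored evaluation metric beats the best previous version. The function name and the `min_gain` parameter are hypothetical.

```python
from typing import Optional


def should_promote(candidate_f1: float,
                   best_f1: Optional[float],
                   min_gain: float = 0.0) -> bool:
    """Return True if the candidate should replace the current best model.

    best_f1 is None when no earlier version has been evaluated yet.
    """
    if best_f1 is None:
        return True
    # Require a strict improvement of more than min_gain
    return candidate_f1 > best_f1 + min_gain


if __name__ == "__main__":
    print(should_promote(0.97, 0.45))  # candidate clearly better
    print(should_promote(0.90, 0.97))  # candidate worse
```

In a setup like the one in the talk, `best_f1` would come from the metrics table and the promoted model would be written as a new version, but those lookups are left out of this sketch.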
  26. 26. Automation (Composer). Data preparation: Image Upload (GCS); Convert images to TFRecords (Dataflow); Data augmentation (AI Platform); Train/Eval Sets (GCS). Machine Learning: Model training & evaluation (AI Platform); Convert model to TFLite (AI Platform); Store evaluation metrics (BigQuery); Model versions (GCS).
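Slide 26 places the whole pipeline under Composer (managed Airflow), whose core job is to run tasks in dependency order. The sketch below shows that ordering idea with the Python standard library (`graphlib`, Python 3.9+). It is not Airflow code, and the task names and dependency edges are assumptions, since the arrows of the original diagram are not recoverable from the transcript.

```python
from graphlib import TopologicalSorter

# Each task maps to the set of tasks it depends on (assumed edges)
PIPELINE = {
    "image_upload": set(),
    "convert_to_tfrecords": {"image_upload"},
    "data_augmentation": {"convert_to_tfrecords"},
    "train_eval_sets": {"data_augmentation"},
    "train_and_evaluate": {"train_eval_sets"},
    "convert_to_tflite": {"train_and_evaluate"},
    "store_metrics": {"train_and_evaluate"},
    "store_model_version": {"convert_to_tflite"},
}


def execution_order(dag):
    """Return one valid order in which every task runs after its dependencies."""
    return list(TopologicalSorter(dag).static_order())


if __name__ == "__main__":
    for step in execution_order(PIPELINE):
        print(step)
```

An orchestrator adds scheduling, retries and logging on top of this ordering, which is the part the talk suggests handing to a managed service rather than building yourself.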
  27. 27. ✅ Serverless ML pipeline built on GCP ✅ Exported to TFLite and deployed to Android and iOS apps ✅ Huge improvement of model performance
  28. 28. Custom MLOps on GCP + Orchestration of multiple pipelines ✔ + Scalable ML Applications ✔ + Monitoring and validation of ML predictions ✔ + Ongoing data validation ✔ + Model Governance ✔ + Continuous Integration and Deployment ✔ (tools shown: AI Platform, Composer, BigQuery, GCS, Composer, GitLab, AI Platform)
  29. 29. Takeaways + Building your own custom solution is not that hard! + Cloud vendors already offer easy-to-use, fully managed and battle-tested components + New MLOps platforms are maturing rapidly → Kubeflow, TFX, MLflow → Can’t wait for 2020!
  30. 30. Thank you. oliver@datatonic.com @tinyoli www.datatonic.com
