Productionizing Machine Learning Pipelines with Databricks & Azure ML
Trace Smith & Amirhessam Tahmassebi
Data Scientists - ExxonMobil
Agenda
▪ Motivation
▪ Tutorial Overview
▪ Key Learnings
▪ Live Tutorial
Machine Learning Templates
▪ Deployment of Machine Learning Applications
▪ Moving from dev to production takes a significant amount of time
▪ Requires resources and experience to design and implement
▪ Developing Reproducible Frameworks
▪ Quickly jumpstart and accelerate Data Science projects into production
▪ Series of code repositories with production-ready code templates
▪ Developed around previous use cases
▪ Segmented into infrastructure code and business logic
▪ Data scientists can focus more on developing models
▪ Template Example
▪ Integrating Databricks with Azure Machine Learning and Azure DevOps for end-to-end model deployment
[Diagram: Data Ingestion, Model Training, Model Management; Azure Machine Learning; Azure Pipelines]
Tutorial Overview
Source: https://www.kaggle.com/c/dogs-vs-cats
▪ Deep Learning: Computer Vision
▪ Image Classification: Convolutional Neural Network (CNN)
▪ Libraries: TensorFlow & PyTorch
▪ Dataset: Cats and Dogs
▪ Open source dataset
▪ Reference: https://www.microsoft.com/en-us/download/
▪ Total Images: 25,000
▪ Images stored in Azure Blob Storage
▪ Mounted to Databricks Workspace
Source: https://cs231n.github.io/convolutional-networks/
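The slide says the images sit in Azure Blob Storage and are mounted to the Databricks workspace, but not how. The following is a minimal sketch of one common way to do that from a notebook, assuming account-key authentication over the `wasbs://` driver. The helper names `build_mount_args` and `mount_if_needed`, and every account, container, and path value, are hypothetical; `dbutils` is the object Databricks provides inside a notebook.

```python
from typing import Any, Dict


def build_mount_args(account: str, container: str, account_key: str,
                     mount_point: str) -> Dict[str, Any]:
    """Assemble keyword arguments for dbutils.fs.mount.

    Assumption: account-key auth with the WASB driver. The talk does not
    say which authentication method was actually used.
    """
    host = f"{account}.blob.core.windows.net"
    return {
        "source": f"wasbs://{container}@{host}",
        "mount_point": mount_point,
        "extra_configs": {f"fs.azure.account.key.{host}": account_key},
    }


def mount_if_needed(dbutils: Any, args: Dict[str, Any]) -> bool:
    """Mount the container unless the mount point already exists.

    Returns True if a new mount was created, False if it was already there.
    """
    existing = {m.mountPoint for m in dbutils.fs.mounts()}
    if args["mount_point"] in existing:
        return False
    dbutils.fs.mount(**args)
    return True
```

In a notebook this might be called as `mount_if_needed(dbutils, build_mount_args("myaccount", "images", key, "/mnt/cats-dogs"))`, with `key` read from a secret scope via `dbutils.secrets.get` rather than hard-coded. Those names are placeholders, not values from the tutorial.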
Key Learnings
Machine Learning
▪ Custom containerization
▪ Databricks Connect
▪ Deep Learning (CNN)
▪ Parallelized grid search
▪ Model management with MLflow
Azure Machine Learning
▪ AzureML Workspace
▪ AzureML Pipelines
▪ Azure Model Registry
▪ Custom inference script
▪ Web service deployment
Deployment
▪ Azure DevOps
▪ Code quality checks
▪ Automated unit testing
▪ Azure Pipelines
▪ Continuous deployment
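The "parallelized grid search" item can be illustrated with a small sketch. The slides do not say how the search was parallelized in the tutorial, so this version uses a standard-library thread pool as a stand-in for cluster workers. `expand_grid`, `parallel_grid_search`, and `toy_score` are hypothetical names, and `toy_score` only mimics "train a CNN and return validation accuracy".

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product
from typing import Any, Callable, Dict, List, Tuple

Params = Dict[str, Any]


def expand_grid(grid: Dict[str, List[Any]]) -> List[Params]:
    """Return every combination of the listed hyperparameter values."""
    keys = list(grid)
    return [dict(zip(keys, values)) for values in product(*(grid[k] for k in keys))]


def parallel_grid_search(score_fn: Callable[[Params], float],
                         grid: Dict[str, List[Any]],
                         max_workers: int = 4) -> Tuple[Params, float]:
    """Score all candidates concurrently and return the best (params, score)."""
    candidates = expand_grid(grid)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so scores[i] belongs to candidates[i].
        scores = list(pool.map(score_fn, candidates))
    best = max(range(len(candidates)), key=scores.__getitem__)
    return candidates[best], scores[best]


def toy_score(params: Params) -> float:
    """Hypothetical stand-in for a real train-and-validate run (higher is better)."""
    return (-abs(params["learning_rate"] - 0.01)
            - 0.001 * abs(params["batch_size"] - 64))


if __name__ == "__main__":
    grid = {"learning_rate": [0.1, 0.01, 0.001], "batch_size": [32, 64]}
    best_params, best_score = parallel_grid_search(toy_score, grid)
    print(best_params, best_score)
```

Each candidate is independent of the others, which is what makes the search easy to spread across workers. In a real run, each call to the scoring function would also be a natural place to log parameters and metrics to MLflow.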
Thanks!
Questions?
