DevOps for AI Apps
Richin Jain, Software Engineer (@richinjain)
Vivek Gupta, Data Scientist (@gkeviv)
Agenda
• Background
• DevOps Introduction
• Enterprise AI use case
• Workflows
• Traditional v/s AI App
• Proposed approach
• Pipeline
Background
• Over the course of engaging with various enterprise customers on their AI use cases, this has been a common ask.
• At one point, Data Science was a one-off task: a question was answered once and the work ended there.
• Now it is integrated with real-time applications, along with retraining and A/B testing.
• Because of this, we have to revisit the Data Science process and how it can be integrated with an existing software stack.
• The goal was to look at best practices from software engineering and see how they could best be applied here.
What is DevOps?
DevOps brings together people, processes, and technology, automating software delivery to provide continuous value to your users.
Continuous Integration (CI)
• Focuses on blending the work of
individual developers together into a
repository.
• Each time you commit code, it's automatically built and tested, so bugs are detected faster.
Continuous Deployment (CD)
• Automates the entire process from code commit to production, provided your CI/CD tests pass.
Continuous Learning & Monitoring
• CI/CD practices, paired with monitoring tools, let you safely deliver features to your customers as soon as they're ready.
DevOps Maturity
[Diagram: pipeline stages Configure → Code → Build → Test → Package → Deploy → Monitor, moving from a single environment to multiple environments]
• Infrastructure as Code
• Continuous Integration
• Automated Testing
• Continuous Deployment
• Release Management
• Load Testing & Auto-Scale
• App Performance Monitoring
Data Science
Activities
• Experimentation
• Modeling
• Versioning
• Lineage
• Conversion
• Export
• Quantization
• Inferencing
• Retraining
• A/B Testing
Pain Points
• Need to solve ML problems quickly.
• The ML stack might differ from the rest of the application stack.
• Lots of glue code.
• Testing the accuracy of an ML model is hard.
• ML code is not always version controlled.
• Hard to reproduce models.
• Integrating a model into an application can take weeks.
• Featurization and scoring code must be rewritten multiple times (in different languages).
• Teams want to start using customer data to build models.
• Hard to track breaking changes.
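One of the pain points above, hard-to-reproduce models, can be reduced by pinning random seeds and fingerprinting each run's configuration. A minimal sketch, assuming nothing beyond the standard library (the function names are illustrative, not from the deck):

```python
import hashlib
import json
import random

def set_seed(seed: int) -> None:
    """Pin the random seed so repeated training runs are comparable."""
    random.seed(seed)
    # A real project would also seed numpy / the ML framework here.

def fingerprint_run(seed: int, params: dict) -> str:
    """Hash the seed and hyperparameters so a run can be identified later."""
    payload = json.dumps({"seed": seed, "params": params}, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()[:12]

set_seed(42)
run_id = fingerprint_run(42, {"lr": 0.01, "epochs": 5})
```

Storing the fingerprint alongside the trained model ties an artifact back to the exact configuration that produced it.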
Enterprise AI use case
• Contoso LLC has an image recognition scenario. The Data Science team develops a state-of-the-art image recognition model.
• There are four ways it could be consumed:
• Users upload images to Contoso's website and get instant results.
• Users upload several images, or point to a folder, and get results in batch.
• A native mobile app.
• Edge devices.
API-based model integration
• Real time
• Batch
Embedded models
• Native Apps
• Edge devices
Basic Workflow: Software Engineer
Basic Workflow: Data Scientist
https://ai.google/research/pubs/pub45742
From the paper, four categories of ML system tests:
• Data Tests
• Model Tests
• ML Infrastructure Tests
• Monitoring Tests
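These test categories can be made concrete with small executable checks. A hedged sketch of a data test, schema and range checks on an incoming batch, where the field names and label set are invented for illustration:

```python
def validate_batch(records):
    """Data test: every record must carry the expected fields,
    with pixel values in range and a known label."""
    errors = []
    for i, rec in enumerate(records):
        if set(rec) != {"image", "label"}:
            errors.append(f"record {i}: unexpected schema {sorted(rec)}")
            continue
        if not all(0 <= p <= 255 for p in rec["image"]):
            errors.append(f"record {i}: pixel value out of range")
        if rec["label"] not in {"cat", "dog", "other"}:
            errors.append(f"record {i}: unknown label {rec['label']!r}")
    return errors

good = {"image": [0, 128, 255], "label": "cat"}
bad = {"image": [300], "label": "cat"}
```

A CI stage can fail the build whenever `validate_batch` returns a non-empty error list.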
Proposed Approach
CI pipeline (triggered when the Data Scientist commits to Data-Science-Repo):
Get Source Code → Create Conda Environment → Install Requirements → Pylint → Unit-test → Code Coverage → Publish test results
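The Unit-test stage above runs tests like the following sketch, where `normalize_image` is a hypothetical featurization helper, not code from the deck:

```python
# Minimal sketch of the kind of unit test the CI "Unit-test" stage runs.
def normalize_image(pixels):
    """Scale raw 0-255 pixel values into [0, 1]."""
    return [p / 255.0 for p in pixels]

def test_normalize_image_range():
    out = normalize_image([0, 128, 255])
    assert min(out) >= 0.0 and max(out) <= 1.0

def test_normalize_image_preserves_length():
    assert len(normalize_image([1, 2, 3])) == 3
```

Running these under pytest in CI also feeds the Code Coverage and Publish-test-results steps.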
CD pipeline (triggered when the Data Scientist's pull request to Data-Science-Repo passes):
Get Source Code → Create Conda Environment → Install Requirements → Unit-test → Create Docker Image → Register Model → Deploy on Test → Test deployed image → Model Testing and Validation
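The Model Testing and Validation step can be a simple quality gate: only register the model and build the artifact when the candidate clears an accuracy floor and does not regress against the current baseline. A sketch with invented thresholds:

```python
def passes_validation(candidate_acc: float,
                      baseline_acc: float,
                      min_acc: float = 0.85,
                      max_regression: float = 0.01) -> bool:
    """Gate run before 'Register Model' / artifact build:
    the candidate must clear an absolute accuracy floor and must
    not fall more than max_regression below the baseline."""
    if candidate_acc < min_acc:
        return False
    return candidate_acc >= baseline_acc - max_regression
```

The thresholds would normally live in pipeline configuration so they can be tightened without code changes.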
Release pipeline:
• Build Artifact → Create Conda Environment → Install Requirements → Deploy to Test (create/update). Test environment (continuous deployment).
• Dev Artifact → Create Conda Environment → Install Requirements → Deploy to Staging (update). Staging environment (nightly; other services test here).
• Test Artifact → Create Conda Environment → Install Requirements → Deploy to Prod (update). Prod environment (end of sprint).
Model conversion pipeline (end of sprint, and every time there is a new model):
Get Source → Create Conda Environment → Install Requirements → Convert model to other formats (ONNX, CoreML, WinML)
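Which conversions run depends on the consumption targets from the earlier use-case slide. A sketch of that routing; the mapping itself is an assumption for illustration, not from the deck:

```python
# Illustrative mapping of consumption targets to export formats.
TARGET_FORMATS = {
    "web_api": "onnx",    # real-time / batch scoring service
    "ios_app": "coreml",  # native mobile app
    "windows": "winml",   # Windows / edge devices
}

def formats_for(targets):
    """Return the distinct model formats the conversion stage must produce."""
    unknown = [t for t in targets if t not in TARGET_FORMATS]
    if unknown:
        raise ValueError(f"no known format for targets: {unknown}")
    return sorted({TARGET_FORMATS[t] for t in targets})
```

Driving the stage from such a table means adding a new consumption target is a one-line change.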
Retraining pipeline (every night, or triggered when new data is uploaded to blob storage):
Get Source → Create Conda Environment → Install Requirements → Data Validation → Retrain model on new data → Model Testing and Validation → A/B Testing → Model Management & Promotion
It runs in a pre-prod environment, so it has access to production data; the retrained model is not promoted unless it passes A/B tests against prod data.
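The promotion rule above, that the retrained model is promoted only if it beats the production model on prod data, can be sketched as:

```python
def promote(candidate_metrics: dict, prod_metrics: dict,
            min_lift: float = 0.0) -> bool:
    """Promotion gate for the retraining pipeline: the retrained
    model must beat production accuracy on the same prod-data
    holdout by more than min_lift."""
    lift = candidate_metrics["accuracy"] - prod_metrics["accuracy"]
    return lift > min_lift
```

In practice `min_lift` would be set above zero so that noise-level differences do not trigger a promotion.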
• Core features of the Azure ML service are exposed through a Python SDK and CLI.
• An easy, simple pip install.
• Makes CI/CD much simpler.
• Best practices for architecting and managing an enterprise-ready AI application lifecycle.
• Azure DevOps and Azure ML ease the adoption of DevOps by data science teams.
• Adoption will increase the agility, quality, and delivery speed of data science teams.
Thank you !

Editor's Notes

  • #6 https://blogs.msdn.microsoft.com/visualstudioalmrangers/2017/04/20/set-up-a-cicd-pipeline-to-run-automated-tests-efficiently/
  • #7 Microsoft DevOps site - https://www.microsoft.com/en-us/cloud-platform/development-operations Source - http://www.itproguy.com/devops-practices/
  • #14 Link to Google paper - https://ai.google/research/pubs/pub45742
  • #15 Link to Google paper - https://ai.google/research/pubs/pub45742
  • #21  https://azure.microsoft.com/en-us/blog/what-s-new-in-azure-machine-learning-service/