1. DevOps for AI Apps
Richin Jain, Software Engineer (@richinjain)
Vivek Gupta, Data Scientist (@gkeviv)
2. Agenda
• Background
• DevOps Introduction
• Enterprise AI use case
• Workflows
• Traditional vs. AI App
• Proposed approach
• Pipeline
3. Background
• Across engagements with enterprise customers on their AI use cases, DevOps for AI apps has been a common ask.
• Data Science used to be a one-off task that delivered a single answer.
• Now it is integrated into real-time applications, along with retraining and A/B testing.
• Because of this, we have to take another look at the Data Science process and how it can be integrated with the existing software stack.
• The goal was to look at best practices from software engineering and how they could best be applied here.
4. What is DevOps?
DevOps brings together people, processes, and technology, automating software delivery to provide continuous value to your users.
Continuous Integration (CI)
• Focuses on blending the work of individual developers together into a repository.
• Each time you commit code, it’s automatically built and tested, and bugs are detected faster.
Continuous Deployment (CD)
• Automate the entire process from code commit to production if your CI/CD tests are successful.
Continuous Learning & Monitoring
• Using CI/CD practices, paired with monitoring tools, safely deliver features to your customers as soon as they’re ready.
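A minimal sketch of the CI/CD gating idea above: stages run in order, and deployment only happens if every earlier stage succeeds. The stage functions (build, unit_tests, failing_tests, deploy) are hypothetical stand-ins, not part of any real CI system.

```python
from typing import Callable, List, Tuple

Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: List[Stage]) -> List[str]:
    """Run stages in order and stop at the first one that fails."""
    completed = []
    for name, step in stages:
        if not step():
            break  # a failed stage blocks everything after it
        completed.append(name)
    return completed


# Hypothetical stage functions standing in for real build/test/deploy steps.
def build() -> bool:
    return True


def unit_tests() -> bool:
    return True


def failing_tests() -> bool:
    return False


def deploy() -> bool:
    return True


green = run_pipeline([("build", build), ("test", unit_tests), ("deploy", deploy)])
red = run_pipeline([("build", build), ("test", failing_tests), ("deploy", deploy)])
```

In the second run the deploy stage is never reached, which is the behavior a commit-to-production pipeline relies on.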
7. Data Science
ACTIVITIES
• Experimentation
• Modeling
• Versioning
• Lineage
• Conversion
• Export
• Quantization
• Inferencing
• Retraining
• A/B Testing
PAIN POINTS
• Need to solve ML problem quickly.
• ML stack might be different from rest of the application stack.
• Lots of glue code.
• Testing accuracy of ML model.
• ML code is not always version controlled.
• Hard to reproduce models
• Integrating model into application can take weeks
• Need to re-write featurizing and scoring code multiple times (in different languages)
• Want to start using customer data to build models
• Hard to track breaking changes
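One way to approach the versioning, lineage, and reproducibility items above is to record a fingerprint of everything that determines a trained model. The sketch below is an assumption for illustration, not the presenters' method; the function name model_fingerprint and its inputs (a code version string, a parameter dict, raw training data bytes) are hypothetical.

```python
import hashlib
import json


def model_fingerprint(code_version: str, params: dict, data: bytes) -> str:
    """Hash the inputs that determine a trained model (code, params, data)."""
    payload = json.dumps(
        {
            "code_version": code_version,
            "params": params,
            "data_sha256": hashlib.sha256(data).hexdigest(),
        },
        sort_keys=True,  # key order must not change the fingerprint
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Same inputs (even with reordered params) give the same fingerprint.
fp_a = model_fingerprint("abc123", {"lr": 0.01, "epochs": 5}, b"training-data-v1")
fp_b = model_fingerprint("abc123", {"epochs": 5, "lr": 0.01}, b"training-data-v1")
# Changing the training data changes the fingerprint.
fp_c = model_fingerprint("abc123", {"lr": 0.01, "epochs": 5}, b"training-data-v2")
```

Storing this value next to each model artifact makes it possible to tell which code, parameters, and data produced it.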
8. Enterprise AI use case
• Contoso LLC has an image recognition scenario. The Data Science team develops a state-of-the-art image recognition model.
• Four ways it could be consumed:
• A user uploads images to Contoso's website and gets instant results.
• A user uploads several images, or points to a folder, and gets results.
• Native mobile app
• Edge devices
9. API-based model integration
• Real time
• Batch
Embedded models
• Native Apps
• Edge devices
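A minimal sketch of how the real-time and batch API paths could share one featurizing and scoring function, so that code is not rewritten per consumer. The featurizer and the "bright"/"dark" labels are placeholders chosen for illustration; a real image recognition model would replace them.

```python
from typing import Dict, Iterable, List


def featurize(image: bytes) -> List[float]:
    """Placeholder featurizer: scaled length and mean byte value in [0, 1]."""
    if not image:
        return [0.0, 0.0]
    return [len(image) / 1000.0, sum(image) / (255.0 * len(image))]


def score(image: bytes) -> Dict[str, float]:
    """Single scoring path used by both entry points (stand-in for a model)."""
    _size, brightness = featurize(image)
    return {"bright": brightness, "dark": 1.0 - brightness}


def handle_request(image: bytes) -> Dict[str, float]:
    """Real-time entry point: one image in, one result out."""
    return score(image)


def handle_batch(images: Iterable[bytes]) -> List[Dict[str, float]]:
    """Batch entry point: many images in, one result per image out."""
    return [score(img) for img in images]


images = [b"\x00\x00", b"\xff\xff"]
batch_results = handle_batch(images)
single_results = [handle_request(img) for img in images]
```

Because both entry points call the same score function, the batch output matches the real-time output image by image.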
18.
• Test environment (continuous deployment): Build Artifact → Create Conda Environment → Install Requirements → Deploy to Test (create/update)
• Staging environment (nightly, other services test here): Dev Artifact → Create Conda Environment → Install Requirements → Deploy to Staging (update)
• Prod environment (end of sprint): Test Artifact → Create Conda Environment → Install Requirements → Deploy to Prod (update)
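The three release stages on this slide can be sketched as data, assuming each stage takes the previous stage's artifact, repeats the same environment setup, and deploys to its own target. The Release class and its field names are hypothetical and only mirror the labels on the slide.

```python
from dataclasses import dataclass
from typing import List

# Steps shared by every release stage on the slide.
COMMON_STEPS = ["Create Conda Environment", "Install Requirements"]


@dataclass(frozen=True)
class Release:
    source_artifact: str  # artifact consumed by this stage
    target: str           # environment deployed to
    trigger: str          # when the stage runs
    mode: str             # deployment mode shown on the slide

    def steps(self) -> List[str]:
        """Ordered steps for this stage."""
        return [
            f"{self.source_artifact} Artifact",
            *COMMON_STEPS,
            f"Deploy to {self.target} ({self.mode})",
        ]


RELEASES = [
    Release("Build", "Test", "continuous deployment", "create/update"),
    Release("Dev", "Staging", "nightly", "update"),
    Release("Test", "Prod", "end of sprint", "update"),
]

targets = [r.target for r in RELEASES]
first_steps = RELEASES[0].steps()
```

Keeping the shared steps in one place makes it harder for the environments to drift apart.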
19.
• Get Source → Create Conda Environment → Install Requirements → Convert model to other formats: ONNX, CoreML, WinML (end of sprint, every time there is a new model)
• Retraining Pipeline (every night, or triggered when new data is uploaded to blob): Get Source → Create Conda Environment → Install Requirements → Retrain model on new data
• The retraining pipeline runs in a pre-prod environment, so it has access to production data. The retrained model is not promoted unless it passes A/B tests against prod data.
• Data Validation
• A/B Testing
• Model Testing and Validation
• Model Management & Promotion
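A minimal sketch of the promotion rule described on this slide: a retrained model is promoted only if the new data passes validation and the candidate beats the production model. The function names, the accuracy metric, and the min_gain threshold are assumptions for illustration; the slide does not specify how the A/B comparison is scored.

```python
from typing import Sequence


def validate_data(rows: Sequence[dict], required: Sequence[str]) -> bool:
    """Data validation: batch is non-empty and every row has the required fields."""
    return len(rows) > 0 and all(all(k in row for k in required) for row in rows)


def should_promote(candidate_acc: float, prod_acc: float, min_gain: float = 0.0) -> bool:
    """A/B gate: the candidate must beat production by more than min_gain."""
    return candidate_acc > prod_acc + min_gain


def gate(rows: Sequence[dict], required: Sequence[str],
         candidate_acc: float, prod_acc: float, min_gain: float = 0.0) -> str:
    """Combine both checks into a single promotion decision."""
    if not validate_data(rows, required):
        return "rejected: bad data"
    if not should_promote(candidate_acc, prod_acc, min_gain):
        return "rejected: no improvement"
    return "promoted"


good_rows = [{"image": "a.png", "label": "cat"}, {"image": "b.png", "label": "dog"}]
bad_rows = [{"image": "a.png"}]  # missing the label field

decision_ok = gate(good_rows, ["image", "label"], 0.93, 0.91, min_gain=0.01)
decision_flat = gate(good_rows, ["image", "label"], 0.91, 0.91)
decision_bad = gate(bad_rows, ["image", "label"], 0.99, 0.91)
```

Validating the data before comparing metrics keeps a malformed upload from producing a model that looks better by accident.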
20. • Core features of the Azure ML service are exposed through a Python SDK and CLI.
• Easy and simple pip install
• Makes CI/CD much simpler.
23. • Best practices for architecting and managing an enterprise-ready AI application lifecycle.
• Azure DevOps and Azure ML ease the adoption of DevOps by DS teams.
• Adoption will increase the agility, quality and delivery of DS teams.