2. Continuous Integration (CI)
• Blend together the work of individual engineers in a repository.
• Each time you commit code, it's automatically built and tested, and bugs are detected faster.
Continuous Deployment (CD)
• Automate the entire process from code commit to production (if your CI/CD tests pass).
Continuous Learning & Monitoring
• Safely deliver features to your customers as soon as they're ready.
• Monitor your features in production and know when they aren't behaving as expected.
4. ML DevOps lifecycle
• Experiment: business understanding, data acquisition, initial modeling
• Develop: modeling, experimentation + testing, continuous integration
• Operate: continuous deployment, continuous delivery, data feedback loop, system + model monitoring
6. Overcome the pattern where data science teams own only their experiments, instead of being responsible for the end-to-end flow from experiment to production to operational support of AI.
Benefits:
• Continuous delivery of value (data insights, models) to end users.
• End-to-end ownership of the analytics lifecycle by data science teams.
• A consistent, enforced approach to building and deploying AI.
• Extending data science with software engineering practices to increase delivery quality and cadence.
• A framework for continuous learning, lineage, auditability and regulatory compliance.
• Improved team collaboration through standardized delivery practices.
8. Produce repeatable experiments
• Capture run metrics, intermediate outputs, output logs and models.
• Use leaderboards, side-by-side run comparison and model selection.
• Use well-defined pipelines to capture the end-to-end model training process.
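As a hedged sketch of what this run tracking can look like with the (v1) Azure ML Python SDK; the experiment name, metric names and output path are made up for illustration:

```python
from azureml.core import Workspace, Experiment

# Load an existing workspace from a local config.json (assumes one exists).
ws = Workspace.from_config()

# Group related training runs under a named experiment.
exp = Experiment(workspace=ws, name="churn-model")  # hypothetical name

run = exp.start_logging()           # begin an interactive run
run.log("accuracy", 0.85)           # scalar metrics show up in run comparisons
run.log("auc", 0.91)
run.upload_file("outputs/model.pkl", "outputs/model.pkl")  # persist the trained model
run.complete()                      # mark the run finished so it can be compared
```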
9. • Track model versions & metadata with a centralized model registry
• Leverage containers to capture runtime dependencies for inference
• Leverage an orchestrator like Kubernetes to provide scalable inference
• Capture model telemetry – health, performance, inputs / outputs
• Encapsulate each step in the lifecycle to enable CI/CD and DevOps
• Automatically optimize models to take advantage of hardware acceleration
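A minimal sketch of the container + orchestrator points, assuming the v1 Azure ML SDK; the model name, entry script and conda file are placeholders:

```python
from azureml.core import Workspace, Environment
from azureml.core.model import Model, InferenceConfig
from azureml.core.webservice import AciWebservice

ws = Workspace.from_config()

# Fetch a registered model by name (resolves to the latest version).
model = Model(ws, name="churn-model")  # hypothetical registry entry

# The entry script + conda environment are baked into a Docker image,
# capturing the runtime dependencies for inference.
env = Environment.from_conda_specification("inference-env", "environment.yml")
inference_config = InferenceConfig(entry_script="score.py", environment=env)

# Deploy to Azure Container Instances for a dev/test endpoint; swap in
# AksWebservice.deploy_configuration() for scalable Kubernetes serving.
deployment_config = AciWebservice.deploy_configuration(cpu_cores=1, memory_gb=1)
service = Model.deploy(ws, "churn-scoring", [model], inference_config, deployment_config)
service.wait_for_deployment(show_output=True)
```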
10. Data science workflow diagram: Prepare (prepare data, feature engineering) → Experiment (model training & testing) → Deploy (register & manage model, package & validate model, deploy service, monitor model).
15. Workflow diagram: the app developer works in their IDE and pushes changes through a DevOps pipeline that validates the app, updates the application and deploys it; the data scientist works in their own IDE and publishes a model, which the application consumes to serve predictions such as [ { "cat": 0.99218, "feline": 0.81242 } ].
16. Workflow diagram, extended: the data scientist now publishes the model to a model store, and the DevOps pipeline consumes the model from that store before validating the app, updating the application and deploying it.
17. Workflow diagram, extended: the DevOps pipeline now also validates the model in isolation, then validates model + app together, before updating and deploying the application.
18. Workflow diagram, extended: feedback is collected from the deployed application and A/B tested, closing the loop with a retrain-model step back to the data scientist.
20. Full workflow diagram: the app developer and the data scientist each work in their own IDE; the data scientist publishes the model to a model store and can customize and retrain it; the DevOps pipeline consumes the model, validates & flights model + app, updates the application, and deploys the model and the application to cloud services, apps and edge devices; predictions (e.g. [ { "cat": 0.99218, "feline": 0.81242 } ]), model telemetry and collected feedback flow back to drive retraining.
21. MODEL CI/CD (Machine Learning as a Service + DevOps) – built on Azure DevOps, Azure Machine Learning and Azure Data Factory.
TRAIN MODEL: a code change (or new data landing in the data lake) triggers CI; the DevOps pipeline unit-tests the source code and starts a new training job whenever code is pushed. The ML training pipeline handles data movement, data prep, training and evaluation – certifying that the model is of high quality – and registers the model in the model store.
DEPLOY MODEL: a newly registered model (or new inference code) triggers a release; the DevOps pipeline packages the model, validates it, gets human approval, deploys to DevTest and then to PROD.
DATA: a data-cooking pipeline moves data from the data lake into the data warehouse; data preparation services (labeling, feedback, drift) handle the inference data.
22. Continuous Integration and Delivery
Build Model (testing + validation, via Azure ML Experiments in a Docker + Conda environment) → Deploy Resources → Deploy Model (real-time on Azure Kubernetes Service; batch via Azure ML Pipelines) → Logging & Monitoring (application performance monitoring, model / data monitoring, data collection).
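As a hedged sketch of the data-collection piece, here is what a scoring script can look like with the v1 SDK's azureml-monitoring ModelDataCollector (supported for AKS deployments); the model name, feature names and model file are placeholders:

```python
import json
import os

import joblib
from azureml.monitoring import ModelDataCollector


def init():
    global model, inputs_dc, prediction_dc
    # AZUREML_MODEL_DIR points at the registered model inside the container.
    model_path = os.path.join(os.getenv("AZUREML_MODEL_DIR", "."), "model.pkl")
    model = joblib.load(model_path)
    # Collectors stream inputs and predictions to blob storage, feeding
    # model / data monitoring and drift analysis.
    inputs_dc = ModelDataCollector("churn-model", designation="inputs",
                                   feature_names=["age", "plan"])        # placeholders
    prediction_dc = ModelDataCollector("churn-model", designation="predictions",
                                       feature_names=["churn_probability"])


def run(raw_data):
    data = json.loads(raw_data)["data"]
    result = model.predict_proba(data)[:, 1].tolist()
    inputs_dc.collect(data)        # log what the model saw
    prediction_dc.collect(result)  # log what the model answered
    return result
```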
24. • Training data
• Featurization code (w/ tests)
• Training pipeline
• Training environment
• Evidence chain
• Model config
• Training job info
• Sample data
• Data profile
Use repeatable pipelines for your ML workflow – they can get complicated.
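One hedged sketch of capturing part of this evidence chain, using the (v1) Azure ML SDK; the tag keys and values are illustrative, not a prescribed schema:

```python
from azureml.core import Workspace
from azureml.core.model import Model

ws = Workspace.from_config()

# Register the trained model together with pointers back to how it was
# produced, so the evidence chain (code, data, run) is queryable later.
model = Model.register(
    workspace=ws,
    model_path="outputs/model.pkl",          # local file produced by training
    model_name="churn-model",                # hypothetical name
    tags={
        "git_commit": "abc1234",             # illustrative: pin to source version
        "build_id": "20230101.1",            # illustrative: pin to CI build
        "training_dataset": "churn-data:v3", # illustrative: dataset name:version
    },
    properties={"framework": "scikit-learn"},
)
print(model.name, model.version)
```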
25. Source Control
• Track changes in code (and configuration) over time to support integration of work, reproducibility and collaboration.
Dataset Versioning
• Training data plays an important role in the quality of the software build, so data must be versioned for reproducibility.
Model Versioning
• Version trained models in relation to code and training data for traceability.
Experiment Tracking
• Version model experiment runs to understand which code, data and (e.g.) selected features led to what output and performance, and to allow for reproducibility.
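A hedged sketch of dataset versioning with the (v1) Azure ML SDK; the datastore path and dataset name are assumptions for illustration:

```python
from azureml.core import Workspace, Dataset

ws = Workspace.from_config()

# Build a tabular dataset from files in the workspace's default datastore.
datastore = ws.get_default_datastore()
dataset = Dataset.Tabular.from_delimited_files(
    path=(datastore, "churn/training-data/*.csv")  # hypothetical path
)

# Registering with create_new_version=True versions the data alongside the
# code and model, so a run can be traced back to the exact training data.
dataset = dataset.register(
    workspace=ws,
    name="churn-data",            # hypothetical name
    create_new_version=True,
)
print(dataset.name, dataset.version)
```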
28. Edge cases
• The model response on a given record is not the expected one.
• Investigate the training set and detect potential bias.
• Ensure that the preprocessing is not clipping any values, etc.
• Document these corner cases & add them to the validation process.
Null values / unknown categories
• This type of bug concerns the resiliency of the model when values are missing, and how well it can handle unseen categorical values.
Input issues
• An input stream may stop producing data, causing unexpected responses by the model.
31. Test Type              Data Scientist    App Dev / Ops
Unit Tests                 X
Data Integrity Tests       X
Model Performance          X
Model Validation           X
Integration Tests          X                 X
Load Tests                                   X
Data Monitoring            X
Skew Monitoring            X
Model Monitoring           X                 X
32. • Data (changes to shape / profile)
• Model in isolation (offline A/B)
• Model + app (functional testing)
• Only deploy after initial validation passes
• Ramp up traffic to the new model using A/B experimentation, comparing functional behavior and performance characteristics
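As a hedged, framework-agnostic sketch of that traffic ramp-up: a weighted split between the incumbent and candidate models, with the serving variant logged so the two can be compared; the scoring functions and percentage are illustrative:

```python
import random

# Illustrative stand-ins for the incumbent and candidate scoring functions.
def score_current(record):
    return 0.5   # placeholder score

def score_candidate(record):
    return 0.5   # placeholder score

CANDIDATE_TRAFFIC = 0.10  # start small; ramp up only while metrics hold

def predict(record):
    """Route a share of live traffic to the candidate model (online A/B)."""
    if random.random() < CANDIDATE_TRAFFIC:
        variant, score = "candidate", score_candidate(record)
    else:
        variant, score = "current", score_current(record)
    # Record which variant served the request so functional behavior and
    # performance characteristics can be compared per model afterwards.
    print({"variant": variant, "score": score})
    return score
```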
34. • Which data?
• Which experiment / previous model(s)?
• Where's the code / notebook?
• Was it converted / quantized?
• Private / compliant data
36. • Focus on ML, not DevOps
• Get telemetry for service health and model behavior
• Code generation
• API specifications / interfaces
• Cloud services
• Mobile / embedded applications
• Edge devices
• Quantize / optimize models for the target platform
• Compliant + safe
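One hedged example of the quantize/optimize point: exporting a scikit-learn model to ONNX so ONNX Runtime can apply graph optimizations for the target platform (assumes the skl2onnx and onnxruntime packages are installed):

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from skl2onnx import convert_sklearn
from skl2onnx.common.data_types import FloatTensorType
import onnxruntime as ort

# Train a small model to have something to convert.
X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=500).fit(X, y)

# Convert to ONNX, a portable graph format that runtimes can optimize
# for the target hardware (CPU, GPU, mobile / edge accelerators).
onnx_model = convert_sklearn(
    model, initial_types=[("input", FloatTensorType([None, 4]))]
)
with open("model.onnx", "wb") as f:
    f.write(onnx_model.SerializeToString())

# Score with ONNX Runtime, which applies graph optimizations on load.
session = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
inputs = {session.get_inputs()[0].name: X[:2].astype(np.float32)}
print(session.run(None, inputs))
```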
38. ML DevOps lifecycle (recap of slide 4)
40. Continuous Integration and Delivery (recap of slide 22)
44. Azure Machine Learning service: a set of Azure cloud services, with a Python SDK, that enables you to prepare data, build models, train models, manage models, track experiments, and deploy models.
46. Azure ML service artifact: Workspace
The workspace is the top-level resource for the Azure Machine Learning service.
It provides a centralized place to work with all the artifacts you create when using Azure Machine Learning service.
The workspace keeps a list of compute targets that can be used to train your model. It also keeps a history of the training runs, including logs, metrics, output, and a snapshot of your scripts.
Models are registered with the workspace.
You can create multiple workspaces, and each workspace can be shared by multiple people.
When you create a new workspace, it automatically creates these Azure resources:
• Azure Container Registry – registers Docker containers that are used during training and when deploying a model.
• Azure Storage – used as the default datastore for the workspace.
• Azure Application Insights – stores monitoring information about your models.
• Azure Key Vault – stores secrets used by compute targets and other sensitive information needed by the workspace.
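A minimal sketch, assuming the v1 azureml-core SDK, of creating a workspace and reconnecting to it later; the subscription ID, resource group and region are placeholders:

```python
from azureml.core import Workspace

# Create a workspace (provisions the associated ACR, Storage,
# Application Insights and Key Vault resources described above).
ws = Workspace.create(
    name="mlops-workspace",                 # hypothetical name
    subscription_id="<subscription-id>",    # placeholder
    resource_group="mlops-rg",              # hypothetical resource group
    create_resource_group=True,
    location="westeurope",                  # placeholder region
)

# Persist a config.json so later sessions can reconnect with one call.
ws.write_config()
ws = Workspace.from_config()
```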
49. ML Pipelines
Increase experiment velocity, reliability and repeatability:
• Use the technology of your choice for each step
• Create & manage ML workflows concurrently
• Define steps to prepare data, train, deploy and evaluate
• Use diverse languages & run on diverse compute
• Easy to compose and swap out steps as your workflow evolves
Features:
• Sequencing and parallelization of steps, declarative data dependencies
• Unattended execution for long-running pipelines; mixed and diverse (heterogeneous) compute across steps
• Data management and reusable components – share pipelines, code, intermediate data, and models
• REST API w/ parameters enables retraining and batch scoring
• Fine controls for compute provisioning and deprovisioning
(Diagram: one ML pipeline whose numbered steps run across several compute targets.)
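A hedged sketch of a two-step pipeline with the (v1) Azure ML SDK, published so retraining can be triggered over REST; the script names, compute target and pipeline name are assumptions:

```python
from azureml.core import Workspace, Experiment
from azureml.pipeline.core import Pipeline
from azureml.pipeline.steps import PythonScriptStep

ws = Workspace.from_config()
compute = ws.compute_targets["cpu-cluster"]   # hypothetical compute target

# Each step wraps a script; steps can target different compute and languages.
prep = PythonScriptStep(name="prep data", script_name="prep.py",
                        compute_target=compute, source_directory="src")
train = PythonScriptStep(name="train model", script_name="train.py",
                         compute_target=compute, source_directory="src")
train.run_after(prep)  # declarative sequencing between the two steps

pipeline = Pipeline(workspace=ws, steps=[train])
run = Experiment(ws, "churn-training").submit(pipeline)  # one-off run

# Publishing exposes a parameterized REST endpoint for retraining / batch scoring.
published = pipeline.publish(name="churn-training-pipeline")
print(published.endpoint)
```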
55. Azure ML – Models and Model Registry
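A small hedged sketch of working with the model registry via the (v1) SDK; the model name and version are again placeholders:

```python
from azureml.core import Workspace
from azureml.core.model import Model

ws = Workspace.from_config()

# List every registered version of a model.
for m in Model.list(ws, name="churn-model"):       # hypothetical name
    print(m.name, m.version, m.tags.get("git_commit"))

# Pin an exact version for deployment or rollback.
model_v3 = Model(ws, name="churn-model", version=3)
model_v3.download(target_dir="downloaded_model")   # fetch the artifacts locally
```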
58. Azure DevOps Pipelines
Cloud-hosted pipelines for Linux, Windows and macOS.
Any language, any platform, any cloud:
Build, test, and deploy Node.js, Python, Java, PHP, Ruby, C/C++, .NET, Android, and iOS apps. Run in parallel on Linux, macOS, and Windows. Deploy to Azure, AWS, GCP or on-premises.
Extensible:
Explore and implement a wide range of community-built build, test, and deployment tasks, along with hundreds of extensions from Slack to SonarCloud. Support for YAML, reporting and more.
Containers and Kubernetes:
Easily build and push images to container registries like Docker Hub and Azure Container Registry. Deploy containers to individual hosts or Kubernetes.
https://azure.com/pipelines
61. Continuous Integration and Delivery (recap of slide 22)
Editor's Notes
Continuous Integration (CI) enables individual developers to collaborate more effectively with each other and blend their work into a code repository
Each time you commit code, it’s automatically built and tested, and bugs are detected faster.
Continuous Delivery (CD) is the process of building, testing, configuring and deploying from a build to a production environment.
Key here is repeatability and consistency of the process: making sure it is well understood, repeatable by others, and able to aid in verifying correctness.
Continuous integration (CI)
Increase code coverage.
Build faster by splitting test and build runs
Automatically ensure you don't ship broken code.
Run tests continually.
Continuous delivery (CD)
Automatically deploy code to production.
Ensure deployment targets have latest code.
Use tested code from CI process.
More info can be found here: https://docs.microsoft.com/en-us/azure/devops/learn/what-is-devops
Here is the data scientist’s inner loop of work
Make this slide an animation.
Developers work in the IDE of their choice on the application code.
They commit the code to the source control of their choice (VSTS has good support for various source control systems).
Separately, data scientists work on developing their model.
Once happy, they publish the model to a model repository (we can extend this with Vienna).
A build is kicked off in VSTS based on the commit in GitHub.
The VSTS build pipeline pulls the latest model from the blob container (can be extended with the Vienna Model Management Service) and creates a container.
VSTS pushes the image to a private image repository in Azure Container Registry.
On a set schedule (nightly), the release pipeline is kicked off.
The latest image from ACR is pulled and deployed across the Kubernetes cluster on ACS.
User requests for the app go through the DNS server.
The DNS server passes the request to the load balancer and sends the response back to the user.
(The same animation notes apply to the other animated workflow slides.)
[10:50 AM] Tim Scarfe: Jordan Edwards, thanks for this! Assumptions: 1) the model store is keyed in some way on the build ID and/or the git commit ID? 2) the ML pipeline is calling out to Databricks using the jobs API with Python source checked into git, i.e. not calling a mutable notebook.
[11:23 AM] Jordan Edwards: Tim Scarfe –
yes, the model is pinned with the git commit as well as the pipeline / build ID (so you have an audit trail to exactly how it was produced)
yes, the job should submit sources that are in git, not in a magic notebook on the file system
<https://teams.microsoft.com/l/message/19:bfb1b4d771ff441393e2c89c9e80d14c@thread.skype/1547059832334?tenantId=72f988bf-86f1-41af-91ab-2d7cd011db47&groupId=66aa6f64-da6b-491b-b2e3-8e43ae872a7c&parentMessageId=1547054108278&teamName=DevOps for A.I. V-Team&channelName=General&createdTime=1547059832334>
Ideally, release in my opinion will be automated to a staging environment once a new model hits the model store, then integration testing, and then a manual release gate for deployment to production. So I would not have the arrow from the repo with inference code changes directly triggering a release ... changes to inference code should trigger the build pipeline too. Perhaps there is room for triggering a different build pipeline based on filter conditions (path filters) that follows a separate path other than registering a new model?
The What…
Azure Pipelines is our offering for the heart of your DevOps needs… CI/CD… continuous integration & deployment.
Azure Pipelines is the perfect launchpad for your code – automating everything… from your builds and deployments so you spend less time with the nuts and bolts and more time being creative
At Microsoft we do just that. We deploy over 78k times a day with Azure Pipelines.
Open & extensible…
It’s great for any type of application, any platform or any cloud.
It has cloud-hosted pools of Linux, Mac & Windows VMs that we manage for you.
You're not restricted to the functionality we provide; Pipelines has rich extensibility. Partners and the community can contribute extensions in our marketplace for everyone.
One of my favourite things is when new extensions show up. We have over 500 today, ranging from community-built extensions to services from Slack to SonarCloud.
If you want to build & test a Node app in a GitHub repo and deploy it via a Docker container to AWS… go for it.
Containers / Modern…
Containers are becoming more and more the unit of deployment, and Azure Pipelines is great for containers.
Azure Pipelines can build images and push them to container registries like Docker Hub and Azure Container Registry. You can deploy to any container host, including Kubernetes.
Transition…
Donovan is going to show us Azure Pipelines in action.