Continuous Deployment
for Deep Learning
—
Nick Pentreath
Principal Engineer
@MLnick
About
IBM Developer / © 2019 IBM Corporation
– @MLnick on Twitter & Github
– Principal Engineer, IBM CODAIT (Center for
Open-Source Data & AI Technologies)
– Machine Learning & AI
– Apache Spark committer & PMC
– Author of Machine Learning with Spark
– Various conferences & meetups
2
CODAIT
Improving the Enterprise AI Lifecycle in Open Source
Center for Open Source
Data & AI Technologies
IBM Developer / © 2019 IBM Corporation 3
CODAIT aims to make AI solutions dramatically
easier to create, deploy, and manage in the
enterprise.
We contribute to and advocate for the open-source
technologies that are foundational to IBM’s AI
offerings.
30+ open-source developers!
Agenda
IBM Developer / © 2019 IBM Corporation
– Overview of Continuous Integration &
Deployment
– The Machine Learning Workflow
– How is CI/CD for ML Different & Challenges
– Model Asset Exchange
– Conclusion
4
Continuous Integration
& Deployment
IBM Developer / © 2019 IBM Corporation 5
Testing
IBM Developer / © 2019 IBM Corporation 6
App
Data Store
Public Front-end Services Back-end Services
Service
Remote
Service
Remote
Service
Testing
IBM Developer / © 2019 IBM Corporation 7
App
Data Store
Public Front-end Services Back-end Services
Service
Remote
Service
Remote
Service
Unit testing
Testing
IBM Developer / © 2019 IBM Corporation 8
App
Data Store
Public Front-end Services Back-end Services
Service
Remote
Service
Remote
Service
Integration testing
Testing
IBM Developer / © 2019 IBM Corporation 9
App
Data Store
Public Front-end Services Back-end Services
Service
Remote
Service
Remote
Service
Post-deploy / Acceptance testing
Development Process
IBM Developer / © 2019 IBM Corporation 10
Build Test Merge Deploy
Dependencies
Starting Point
IBM Developer / © 2019 IBM Corporation 11
Build Test Merge
Dependencies
Manual
Automated
Deploy
Continuous Integration
IBM Developer / © 2019 IBM Corporation 12
Build Test Merge
Dependencies
Manual
Automated
Deploy
Continuous Deployment
IBM Developer / © 2019 IBM Corporation 13
Build Test Merge
Dependencies
Manual
Automated
Deploy
Source of changes
IBM Developer / © 2019 IBM Corporation 14
Build Test Merge Deploy
Dependencies
Changes come from:
• Our code
• Internal dependencies
• 3rd party dependencies
The Machine Learning
Workflow
IBM Developer / © 2019 IBM Corporation 15
Perception
IBM Developer / © 2019 IBM Corporation 16
In reality the
workflow spans teams …
IBM Developer / © 2019 IBM Corporation 17
… and tools …
IBM Developer / © 2019 IBM Corporation 18
… and is a small (but critical!)
piece of the puzzle
*Source: Hidden Technical Debt in Machine Learning Systems
IBM Developer / © 2019 IBM Corporation 19
What is a “model”?
IBM Developer / © 2019 IBM Corporation 20
Deep learning pipeline
IBM Developer / © 2019 IBM Corporation 21
beagle: 0.82
basset: 0.09
bluetick: 0.07
...
Input image Image pre-processing Prediction
Decode image
Resize
Normalization
Convert types / format
Inference Post-processing
[0.2, 0.3, … ]
(label, prob)
Sort
Label map
PIL, OpenCV, tf.image,
…
Custom
Python
* Logos trademarks of their respective projects
Pipelines, not Models
– Deploying (and testing) just the model
part of the workflow is not enough
– Entire pipeline must be taken into
account
• Data transforms
• Feature extraction & pre-processing
• DL / ML model
• Prediction transformation
– Even ETL is part of the pipeline!
– Pipelines in frameworks
• scikit-learn
• Spark ML pipelines
• TensorFlow Transform
• pipeliner (R)
IBM Developer / © 2019 IBM Corporation 22
Continuous Integration
for Machine Learning
IBM Developer / © 2019 IBM Corporation 23
Source of changes
IBM Developer / © 2019 IBM Corporation 24
Build Test Merge Deploy
Dependencies
Changes come from:
• Our code
• Internal dependencies
• 3rd party dependencies
• Data
• Model
• Time
Data Model
Data
IBM Developer / © 2019 IBM Corporation
– Data types
• Images, video, audio, raw text,
unstructured
– Size
– Schemas
* Logos trademarks of their respective projects
25
Models
IBM Developer / © 2019 IBM Corporation
– Size of models
– Resource requirements
– Hardware
• CPU, GPU, TPU, Mobile, Edge
– Need to manage and bridge many
different languages, frameworks
– Formats
– State of the art is changing very
rapidly
* Logos trademarks of their respective projects
26
Monitoring & Feedback
over Time
IBM Developer / © 2019 IBM Corporation 27
Monitoring
Performance Business
Monitoring
Software
Traditional software monitoring
Latency, throughput, resource usage, etc
Model performance metrics
Traditional ML evaluation measures
(accuracy, prediction error, AUC etc)
Business metrics
Impact of predictions on business
outcomes
• Additional revenue - e.g. uplift from recommender
• Cost savings – e.g. value of fraud prevented
• Metrics implicitly influencing these – e.g. user
engagement
IBM Developer / © 2019 IBM Corporation
Feedback
Data
Transform
TrainDeploy
Feedback
Adapt
An intelligent system must automatically
learn from & adapt to the world around it
Continual learning
Retraining, online learning,
reinforcement learning
Feedback loops
Explicit: models create or directly
influence their own training data
Implicit: predictions influence behavior
in longer-term or indirect ways
Humans in the loop
IBM Developer / © 2019 IBM Corporation
AI Fairness,
Transparency &
Security
30IBM Developer / © 2019 IBM Corporation
AI Fairness,
Transparency &
Security
31IBM Developer / © 2019 IBM Corporation
https://github.com/IBM/AIX360/
https://github.com/IBM/AIF360
https://github.com/IBM/adversarial-robustness-toolbox
32
Model
Asset
eXchange
(MAX)
ibm.biz/model-exchange
IBM Developer / © 2019 IBM Corporation
What is
MAX?
33
- One place for state-of-art open
source deep learning models
- Wide variety of domains
- Tested code and IP
- Free and open source
- Both trainable and trained versions
IBM Developer / © 2019 IBM Corporation
Deployable asset on MAX
IBM Developer / © 2019 IBM Corporation 34
Data ExpertiseComputeModel
REST API specificationPre Trained ModelI/O processing
Deployable Asset on Model Asset
Exchange
MetadataInferenceSwagger
Specification
Deployable asset on MAX
IBM Developer / © 2019 IBM Corporation 35
MetadataInferenceSwagger
Specification
Deployable asset on MAX
IBM Developer / © 2019 IBM Corporation 36
Hosted resources
IBM Developer / © 2019 IBM Corporation 37
Model
Service
Web App
Kubernetes
Trainable asset on MAX
IBM Developer / © 2019 IBM Corporation 38
Data ExpertiseComputeModel
REST API specificationPre Trained ModelI/O processing
Deployable Asset on Model Asset
Exchange
Trainable asset on MAX
IBM Developer / © 2019 IBM Corporation 39
Data ExpertiseComputeModel
REST API specificationPre Trained ModelI/O processing
Deployable Asset on Model Asset
Exchange
Standardized scripts & wrappers
MAX and CI/CD
IBM Developer / © 2019 IBM Corporation 40
– TravisCI for GitHub repositories
– Jenkins for more complex workflows
• Deploying long-running instances to
Kubernetes
• Periodic jobs
• Trainable model testing (using Watson
Machine Learning service)
Conclusion
– ML Pipelines, not just models
– Need to take into account
• Data
• Models
• Variation over time
• Fairness, explainability,
transparency
– Containers / Kube?
• KubeFlow, MLFlow
• Tekton
IBM Developer / © 2019 IBM Corporation 41
Thank you
IBM Developer / © 2019 IBM Corporation
Sign up for IBM Cloud and try Watson Studio: https://ibm.biz/BdznGk
codait.org
twitter.com/MLnick
github.com/MLnick
developer.ibm.com
42
Model Asset Exchange:
https://ibm.biz/model-exchange
IBM Developer / © 2019 IBM Corporation 43

Continuous Deployment for Deep Learning

  • 1.
    Continuous Deployment for DeepLearning — Nick Pentreath Principal Engineer @MLnick
  • 2.
    About IBM Developer /© 2019 IBM Corporation – @MLnick on Twitter & Github – Principal Engineer, IBM CODAIT (Center for Open-Source Data & AI Technologies) – Machine Learning & AI – Apache Spark committer & PMC – Author of Machine Learning with Spark – Various conferences & meetups 2
  • 3.
    CODAIT Improving the EnterpriseAI Lifecycle in Open Source Center for Open Source Data & AI Technologies IBM Developer / © 2019 IBM Corporation 3 CODAIT aims to make AI solutions dramatically easier to create, deploy, and manage in the enterprise. We contribute to and advocate for the open-source technologies that are foundational to IBM’s AI offerings. 30+ open-source developers!
  • 4.
    Agenda IBM Developer /© 2019 IBM Corporation – Overview of Continuous Integration & Deployment – The Machine Learning Workflow – How is CI/CD for ML Different & Challenges – Model Asset Exchange – Conclusion 4
  • 5.
    Continuous Integration & Deployment IBMDeveloper / © 2019 IBM Corporation 5
  • 6.
    Testing IBM Developer /© 2019 IBM Corporation 6 App Data Store Public Front-end Services Back-end Services Service Remote Service Remote Service
  • 7.
    Testing IBM Developer /© 2019 IBM Corporation 7 App Data Store Public Front-end Services Back-end Services Service Remote Service Remote Service Unit testing
  • 8.
    Testing IBM Developer /© 2019 IBM Corporation 8 App Data Store Public Front-end Services Back-end Services Service Remote Service Remote Service Integration testing
  • 9.
    Testing IBM Developer /© 2019 IBM Corporation 9 App Data Store Public Front-end Services Back-end Services Service Remote Service Remote Service Post-deploy / Acceptance testing
  • 10.
    Development Process IBM Developer/ © 2019 IBM Corporation 10 Build Test Merge Deploy Dependencies
  • 11.
    Starting Point IBM Developer/ © 2019 IBM Corporation 11 Build Test Merge Dependencies Manual Automated Deploy
  • 12.
    Continuous Integration IBM Developer/ © 2019 IBM Corporation 12 Build Test Merge Dependencies Manual Automated Deploy
  • 13.
    Continuous Deployment IBM Developer/ © 2019 IBM Corporation 13 Build Test Merge Dependencies Manual Automated Deploy
  • 14.
    Source of changes IBMDeveloper / © 2019 IBM Corporation 14 Build Test Merge Deploy Dependencies Changes come from: • Our code • Internal dependencies • 3rd party dependencies
  • 15.
    The Machine Learning Workflow IBMDeveloper / © 2019 IBM Corporation 15
  • 16.
    Perception IBM Developer /© 2019 IBM Corporation 16
  • 17.
    In reality the workflowspans teams … IBM Developer / © 2019 IBM Corporation 17
  • 18.
    … and tools… IBM Developer / © 2019 IBM Corporation 18
  • 19.
    … and isa small (but critical!) piece of the puzzle *Source: Hidden Technical Debt in Machine Learning Systems IBM Developer / © 2019 IBM Corporation 19
  • 20.
    What is a“model”? IBM Developer / © 2019 IBM Corporation 20
  • 21.
    Deep learning pipeline IBMDeveloper / © 2019 IBM Corporation 21 beagle: 0.82 basset: 0.09 bluetick: 0.07 ... Input image Image pre-processing Prediction Decode image Resize Normalization Convert types / format Inference Post-processing [0.2, 0.3, … ] (label, prob) Sort Label map PIL, OpenCV, tf.image, … Custom Python * Logos trademarks of their respective projects
  • 22.
    Pipelines, not Models –Deploying (and testing) just the model part of the workflow is not enough – Entire pipeline must be taken into account • Data transforms • Feature extraction & pre-processing • DL / ML model • Prediction transformation – Even ETL is part of the pipeline! – Pipelines in frameworks • scikit-learn • Spark ML pipelines • TensorFlow Transform • pipeliner (R) IBM Developer / © 2019 IBM Corporation 22
  • 23.
    Continuous Integration for MachineLearning IBM Developer / © 2019 IBM Corporation 23
  • 24.
    Source of changes IBMDeveloper / © 2019 IBM Corporation 24 Build Test Merge Deploy Dependencies Changes come from: • Our code • Internal dependencies • 3rd party dependencies • Data • Model • Time Data Model
  • 25.
    Data IBM Developer /© 2019 IBM Corporation – Data types • Images, video, audio, raw text, unstructured – Size – Schemas * Logos trademarks of their respective projects 25
  • 26.
    Models IBM Developer /© 2019 IBM Corporation – Size of models – Resource requirements – Hardware • CPU, GPU, TPU, Mobile, Edge – Need to manage and bridge many different languages, frameworks – Formats – State of the art is changing very rapidly * Logos trademarks of their respective projects 26
  • 27.
    Monitoring & Feedback overTime IBM Developer / © 2019 IBM Corporation 27
  • 28.
    Monitoring Performance Business Monitoring Software Traditional softwaremonitoring Latency, throughput, resource usage, etc Model performance metrics Traditional ML evaluation measures (accuracy, prediction error, AUC etc) Business metrics Impact of predictions on business outcomes • Additional revenue - e.g. uplift from recommender • Cost savings – e.g. value of fraud prevented • Metrics implicitly influencing these – e.g. user engagement IBM Developer / © 2019 IBM Corporation
  • 29.
    Feedback Data Transform TrainDeploy Feedback Adapt An intelligent systemmust automatically learn from & adapt to the world around it Continual learning Retraining, online learning, reinforcement learning Feedback loops Explicit: models create or directly influence their own training data Implicit: predictions influence behavior in longer-term or indirect ways Humans in the loop IBM Developer / © 2019 IBM Corporation
  • 30.
    AI Fairness, Transparency & Security 30IBMDeveloper / © 2019 IBM Corporation
  • 31.
    AI Fairness, Transparency & Security 31IBMDeveloper / © 2019 IBM Corporation https://github.com/IBM/AIX360/ https://github.com/IBM/AIF360 https://github.com/IBM/adversarial-robustness-toolbox
  • 32.
  • 33.
    What is MAX? 33 - Oneplace for state-of-art open source deep learning models - Wide variety of domains - Tested code and IP - Free and open source - Both trainable and trained versions IBM Developer / © 2019 IBM Corporation
  • 34.
    Deployable asset onMAX IBM Developer / © 2019 IBM Corporation 34 Data ExpertiseComputeModel REST API specificationPre Trained ModelI/O processing Deployable Asset on Model Asset Exchange MetadataInferenceSwagger Specification
  • 35.
    Deployable asset onMAX IBM Developer / © 2019 IBM Corporation 35 MetadataInferenceSwagger Specification
  • 36.
    Deployable asset onMAX IBM Developer / © 2019 IBM Corporation 36
  • 37.
    Hosted resources IBM Developer/ © 2019 IBM Corporation 37 Model Service Web App Kubernetes
  • 38.
    Trainable asset onMAX IBM Developer / © 2019 IBM Corporation 38 Data ExpertiseComputeModel REST API specificationPre Trained ModelI/O processing Deployable Asset on Model Asset Exchange
  • 39.
    Trainable asset onMAX IBM Developer / © 2019 IBM Corporation 39 Data ExpertiseComputeModel REST API specificationPre Trained ModelI/O processing Deployable Asset on Model Asset Exchange Standardized scripts & wrappers
  • 40.
    MAX and CI/CD IBMDeveloper / © 2019 IBM Corporation 40 – TravisCI for GitHub repositories – Jenkins for more complex workflows • Deploying long-running instances to Kubernetes • Periodic jobs • Trainable model testing (using Watson Machine Learning service)
  • 41.
    Conclusion – ML Pipelines,not just models – Need to take into account • Data • Models • Variation over time • Fairness, explainability, transparency – Containers / Kube? • KubeFlow, MLFlow • Tekton IBM Developer / © 2019 IBM Corporation 41
  • 42.
    Thank you IBM Developer/ © 2019 IBM Corporation Sign up for IBM Cloud and try Watson Studio: https://ibm.biz/BdznGk codait.org twitter.com/MLnick github.com/MLnick developer.ibm.com 42 Model Asset Exchange: https://ibm.biz/model-exchange
  • 43.
    IBM Developer /© 2019 IBM Corporation 43