MLOps – Past, Present and Future
Nisha Talagala
CEO, Pyxeda AI
In this talk
• What is “MLOps”
• MLOps Areas and Evolution
• Two MLOps topics
• Cloud - End to end pipelines in the Cloud
• Drift – suddenly everywhere
• MLOps Future
Sophisticated AI technologies, more available every day
Each logo is a (separate) service offered by GCP, AWS or Azure for part of an AI workflow
Problem: Minimal adoption
https://www.oreilly.com/library/view/the-new-artificial/9781492048978/
https://emerj.com/ai-sector-overviews/valuing-the-artificial-intelligence-market-graphs-and-predictions/
Despite the advanced services available, AI usage is still minimal
Why?
• Flow is complex and multi-faceted
• Tools are at different levels and tackle subsets of the workflow
• Multiple roles collaborate
[Diagram: workflow stages (Business Need, Data, Develop Model(s), Train Model(s), Test Model(s), Deploy Model(s), Connect to Business app, Monitor and Optimize) and roles (App developers, Data Scientists, ML Engineers, Operations)]
Many levels of complexity
A typical flow
• Use case definition
• Data prep
• Modeling
• Training
• Deploy
• Integrate
• Monitor/Optimize
• Iterate
[Diagram: workflow stages (Business Need, Data, Develop Model(s), Train Model(s), Test Model(s), Deploy Model(s), Connect to Business app, Monitor and Optimize) and roles (App developers, Data Scientists, ML Engineers, Operations)]
MLOps – the term
• Production ML has been done for many years in large web companies and others
• First MLOps Platform for enterprises – from ParallelM in 2018
• Inspired by Database practices and DBAs, Software lifecycle
• Focus on full lifecycle tooling combined with Best Practices
• https://www.kdnuggets.com/2018/04/operational-machine-learning-successful-mlops.html
MLOps (a compound of “machine learning” and
“operations”) is a practice for collaboration and
communication between data scientists and operations
professionals to help manage production ML (or deep
learning) lifecycle.[1] Similar to
the DevOps or DataOps approaches, MLOps looks to
increase automation and improve the quality of production
ML while also focusing on business and regulatory
requirements. While MLOps also started as a set of best
practices, it is slowly evolving into an independent
approach to ML lifecycle management. MLOps applies to
the entire lifecycle - from integrating with model
generation (software development lifecycle, continuous
integration/continuous delivery), orchestration, and
deployment, to health, diagnostics, governance, and
business metrics.
https://www.kdnuggets.com/2018/04/operational-machine-learning-successful-mlops.html
MLOps Evolution
Taken from: https://www.kdnuggets.com/2018/04/operational-machine-learning-successful-mlops.html
• Deployment and Serving
• Many tools but also startups
• Reproducibility
• Many open source tools
• Containerization, Orchestration
• Established patterns
• Governance
• Emerging rules and tools
• Monitoring, Health Diagnostics
• Many startups
• Lifecycle - Emerging
2018 2020
In this talk
• Origin of “MLOps”
• MLOps Areas and Evolution
• Two MLOps topics
• End to end pipelines in the Cloud
• Drift – suddenly everywhere
• MLOps Future
MLOps in Today’s Context – AWS Example
Substantial tooling available for each stage from multiple cloud vendors
Labeling
Data Prep and Visualization
Modeling and Deployment
Marketplaces
Service APIs
Manipulate raw data
Build, tune or deploy your own models
Buy a model or algorithm
Use a pre-built AI (example: voice to text, etc.)
Infrastructure: Compute, Authentication, Data source, Logs etc.
Where your AI runs and what monitors it
A Sample ML Lifecycle in AWS
[Diagram components: S3; Dataset prep and transform; AWS Sagemaker; Sagemaker endpoint + Marketplace inference code; External endpoint (basic Lambda); API Gateway; Postman; AWS Marketplace; Custom Docker Containers. Labeled flows: Dataset, Modified Dataset, Model Artifact, Request, Prediction]
In this talk
• Origin of “MLOps”
• MLOps Areas and Evolution
• Two MLOps topics
• End to end pipelines in the Cloud
• ML Health and Drift – suddenly everywhere
• MLOps Future
MLOps and COVID
What is Drift?
https://www.weforum.org/agenda/2020/05/here-s-how-to-check-in-on-your-ai-system-as-covid-19-plays-havoc/
May 22, 2020
Drift
• Types of Drift
• Concept Drift
• Data Drift
• Prediction inputs
• Training vs Prediction
Data Drift Example - Gradual change in distribution
[Figure panels: Training Data; Production Data: 1-500; Production Data: 500-1000; Production Data: 1000-1500]
Bank churn data from Kaggle – Drift Test Generator by Pyxeda (contact info@pyxeda.ai)
How can this type of shift in distribution be detected?
Simple rules like
• monitoring the mean (for continuous variables)
• RMSE (for categorical variables)
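A minimal sketch of the "monitor the mean" rule for one continuous variable. The function name `mean_shift_alert`, the threshold of 3 standard errors, and the synthetic data are assumptions made for illustration; they are not taken from the talk or from the Pyxeda tool.

```python
import random
import statistics


def mean_shift_alert(train, window, threshold=3.0):
    """Flag a production window whose mean is far from the training mean.

    The gap is expressed in standard errors of the window mean, using the
    training standard deviation. The threshold is an assumed default.
    """
    mu = statistics.fmean(train)
    sigma = statistics.stdev(train)
    std_err = sigma / len(window) ** 0.5
    z = (statistics.fmean(window) - mu) / std_err
    return abs(z) > threshold, z


# Synthetic data: the production mean creeps upward window by window.
rng = random.Random(0)
train = [rng.gauss(40.0, 10.0) for _ in range(2000)]
windows = [
    [rng.gauss(40.0 + shift, 10.0) for _ in range(500)]
    for shift in (0.0, 3.0, 6.0)
]

for i, window in enumerate(windows):
    alert, z = mean_shift_alert(train, window)
    print(f"window {i}: z={z:.2f} alert={alert}")
```

A rule like this only sees changes in the mean. A change in spread or shape with the same mean would pass unnoticed, which is one reason to also use distribution-level measures.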
Other techniques to measure distribution divergence in one dimension
• Kolmogorov-Smirnov test (continuous)
• Bhattacharyya distance (categorical)
• Earth mover's distance (categorical)
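Two of the one-dimensional measures above can be sketched in plain Python. The function names, the column names, and the sample data are hypothetical. The threshold uses the standard large-sample approximation for the two-sample Kolmogorov-Smirnov test, so treat it as a rough guide rather than an exact test.

```python
import math
import random
from collections import Counter


def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Advance both samples past every value <= x, then compare CDFs.
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


def ks_threshold(n, m, alpha=0.05):
    """Approximate critical value for the two-sample KS statistic."""
    c = math.sqrt(-math.log(alpha / 2.0) / 2.0)
    return c * math.sqrt((n + m) / (n * m))


def bhattacharyya_distance(a, b):
    """Bhattacharyya distance between two categorical samples."""
    ca, cb = Counter(a), Counter(b)
    bc = sum(
        math.sqrt((ca[k] / len(a)) * (cb[k] / len(b)))
        for k in ca.keys() | cb.keys()
    )
    return math.inf if bc == 0 else -math.log(bc)


# Continuous feature (hypothetical "age" column), synthetic values.
rng = random.Random(1)
train_age = [rng.gauss(40.0, 10.0) for _ in range(1000)]
prod_age = [rng.gauss(46.0, 10.0) for _ in range(500)]
d = ks_statistic(train_age, prod_age)
print("KS drift:", d > ks_threshold(len(train_age), len(prod_age)))

# Categorical feature (hypothetical "country" column).
train_geo = ["FR"] * 50 + ["DE"] * 25 + ["ES"] * 25
prod_geo = ["FR"] * 20 + ["DE"] * 30 + ["ES"] * 50
print("Bhattacharyya distance:", bhattacharyya_distance(train_geo, prod_geo))
```

The Bhattacharyya distance is 0 for identical category frequencies and grows as they diverge. What value counts as drift is a per-feature choice that this sketch leaves open.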
How can this type of shift in distribution be detected?
Relational Drift
• Drift can happen in the relationship between two variables
Techniques for detecting multi-dimensional drift
• Not much literature yet
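One simple way to illustrate relational drift between two variables is to compare their Pearson correlation in training and in production. This is an assumed heuristic for illustration, not a technique named in the talk. The names `pearson` and `relational_drift` and the tolerance of 0.3 are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (non-constant)."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def relational_drift(train_pairs, prod_pairs, tol=0.3):
    """Flag drift when the correlation changes by more than tol."""
    r_train = pearson(*zip(*train_pairs))
    r_prod = pearson(*zip(*prod_pairs))
    return abs(r_train - r_prod) > tol, r_train, r_prod


# Synthetic data: each variable keeps the same distribution on its own,
# but the sign of the relationship flips in production.
rng = random.Random(2)
xs = [rng.gauss(0.0, 1.0) for _ in range(1000)]
train_pairs = [(x, x + rng.gauss(0.0, 0.5)) for x in xs]
prod_pairs = [(x, -x + rng.gauss(0.0, 0.5)) for x in xs]

drifted, r_train, r_prod = relational_drift(train_pairs, prod_pairs)
print(f"train r={r_train:.2f} prod r={r_prod:.2f} drifted={drifted}")
```

Per-variable checks would miss this case, because the marginal distributions of both variables are unchanged. Correlation only captures linear relationships, so it is a starting point rather than a general multi-dimensional detector.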
In this talk
• Origin of “MLOps”
• MLOps Areas and Evolution
• Two MLOps topics
• End to end pipelines in the Cloud
• ML Health and Drift – suddenly everywhere
• MLOps Future
Model Governance – Current and Emerging
• The process of ensuring that model creation meets an organization or industry’s compliance and other requirements
• For heavily regulated industries, this is a legal issue
• For example – Finance has model governance rules that MUST be followed before models are deployed
• For other industries, it is a risk mitigation issue
• For example - If someone complains of bias – what model was used and how did it come about?
• New laws – EU GDPR, CA CCPA, etc. apply across industries
Governance is a combination of Compliance, Reproducibility, Security and Integrity
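To answer "what model was used and how did it come about?", a team needs some record tying a deployed model to its inputs. The sketch below is one possible shape for such a record. Every field name, the `ModelRecord` class, and the example values are hypothetical; real governance requirements vary by industry and are not specified in the talk.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ModelRecord:
    """Hypothetical lineage record for one deployed model version."""
    model_name: str
    version: str
    training_data_sha256: str
    hyperparameters: dict
    approved_by: str


def fingerprint(data: bytes) -> str:
    """Content hash used to pin the exact training data."""
    return hashlib.sha256(data).hexdigest()


def record_id(record: ModelRecord) -> str:
    """Stable identifier derived from the full record contents."""
    payload = json.dumps(asdict(record), sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


# Example values are made up for illustration.
training_bytes = b"customer_id,age,churned\n1,42,0\n2,37,1\n"
record = ModelRecord(
    model_name="churn-classifier",
    version="1.0.0",
    training_data_sha256=fingerprint(training_bytes),
    hyperparameters={"max_depth": 4, "n_estimators": 100},
    approved_by="model-risk-team",
)
print(record_id(record))
```

Because the identifier is derived from the record contents, any change to the data hash, hyperparameters, or approver produces a different identifier, which supports the reproducibility and integrity aspects of governance.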
Model Security – On the Horizon
• Models can be hacked
• The more advanced the AI, the more likely it can be hacked
• For example – if the AI is self-adapting (such as online algorithms), it can “drift” and a well-placed attack can control the direction of drift
• Other attacks can “probe” the model, understand its behavior and
then exploit it
• Corrupting data is another way to corrupt the (consequent) model
Model Security is an emerging topic. For an overview of security and its relationship to integrity, see
https://www.forbes.com/sites/cognitiveworld/2019/01/29/ml-integrity-four-production-pillars-for-trustworthy-ai/#279ee9ec5e6f
Summary
MLOps is here to stay. From one startup in 2018 to tens or more now.
Started with Deployment, now encompasses the whole lifecycle
Can also be seen in MLOps Engineer, Full Stack Data Scientist and Applied ML Engineer roles and teams
Expect the future to include Validation, Security, Adaptability and Scale
