Presented By:
Sudeep James Tirkey
Vinay Kumar
MLOps
Bridging the gap between Data
Scientists and Ops.
Our Agenda
01 What is MLOps
02 Why MLOps
03 Benefits of MLOps
04 MLOps Lifecycle
Our Agenda
01 Track ML model
Improvements
02 ML model management Life
Cycle
03 Existing Solutions
04 Intro to MLflow
05 Demo
● MLOps (a compound of “machine learning” and “operationalization”) is the
practice of operationalizing and managing the lifecycle of ML in production.
● MLOps establishes a culture and environment where ML technologies can
generate business benefits by optimizing the ML lifecycle to automate and scale
ML initiatives and optimize the business return of ML in production.
● MLOps enables collaboration across diverse users (such as Data Scientists, Data
Engineers, Business Analysts and ITOps) on ML operations and enables
data-driven, continuous optimization of the impact of ML operations.
What is MLOps
Why MLOps
● A tweet sums up the problem: "The story of enterprise machine learning is,
it took me 3 weeks to develop a model and it's been 11 months and it's still not
deployed."
● On one side, IT operations administrators are experts in the deployment
and management of software and services in production.
● On the other side, data scientists are experts in the algorithms and associated
mathematics.
● Operating ML in production and deploying new models requires the combined
skills of both groups and their respective processes.
Expertise Mismatch
● As ML becomes a more critical function in business applications, there is a greater
need to manage and track the process by which new ML models are deployed
and drive outcomes.
● In risk applications, customers need clear-cut reasons for how they were
adversely impacted by a decision. ML is a very useful tool for enhancing risk
scorecards, but explainability is not ML’s strong suit.
● Depending on the industry, there are emerging regulations and practices in this
area as well.
Regulatory and Process needs
● The bottleneck that results from complicated, non-intuitive algorithms eases with
a better division of expertise and closer collaboration between operations and data
teams.
● This complexity requires custom algorithmic knowledge beyond standard
operations for diagnostics, testing and optimization.
Non-Intuitive Complexity
● Periodic retraining of models and algorithms.
● Ease of redeployment and configuration changes in the system.
● Real-time scoring of out-of-sample observations with high performance and
efficiency.
● Monitoring of model performance over time.
● Adaptive ETL and the expertise to manage new data feeds and transactional systems
as data sources for AI and machine learning tools.
● Security and authorized access levels for different areas of the analytical methods.
● Reliable backup and recovery processes/tools.
This is territory traditionally inhabited by DevOps. Data Scientists should
ideally learn to handle part of these requirements themselves, or at least be
informed consultants to classical DevOps personnel.
Productionizing ML Models
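The monitoring and retraining requirements above can be made concrete with a small sketch. It assumes a classification model whose true labels arrive some time after each prediction; the class name `RollingAccuracyMonitor`, the window size and the threshold are all illustrative choices, not part of the original talk.

```python
from collections import deque


class RollingAccuracyMonitor:
    """Track accuracy over the most recent predictions and flag decay."""

    def __init__(self, window_size=100, threshold=0.8):
        # window_size and threshold are illustrative values, not recommendations.
        self.outcomes = deque(maxlen=window_size)
        self.threshold = threshold

    def record(self, predicted, actual):
        """Store whether one prediction matched the label that arrived later."""
        self.outcomes.append(predicted == actual)

    def accuracy(self):
        """Accuracy over the current window, or None before any outcome is seen."""
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def needs_retraining(self):
        """True once the window is full and accuracy is below the threshold."""
        window_full = len(self.outcomes) == self.outcomes.maxlen
        return window_full and self.accuracy() < self.threshold
```

In practice a scheduler or pipeline would poll `needs_retraining()` and start a training job; how that job is launched depends on the platform in use.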
● It helps scale ML initiatives broadly by quickly and reliably converting the
potential of machine learning into existing business processes and systems
across the enterprise.
● It helps manage and maintain the partnership between data science and IT/Ops
teams, so they can work together to deliver ML-powered applications that provide
value.
● It helps minimize risk to the organization by putting robust governance checks
and balances in place, and by enabling the use of best practices for machine
learning models in production.
Benefits of MLOps
● Data/model versioning != code versioning.
● Model reuse is an entirely different case from software reuse, as models need
tuning based on scenarios and data.
● Reusing a model requires fine-tuning, for example through transfer learning,
which leads to a training pipeline.
● Models decay over time, so retraining must be available on demand.
MLOps vs DevOps
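One way to read "data/model versioning != code versioning" is that a model's identity depends on the training data and hyperparameters as well as the code. The sketch below derives a version ID from all three; the function name `fingerprint` and the 12-character ID length are assumptions made for illustration.

```python
import hashlib
import json


def fingerprint(code_version, data_bytes, hyperparams):
    """Derive a model version ID from code, data, and hyperparameters together."""
    digest = hashlib.sha256()
    # Hash each component separately so the three parts cannot blur together.
    digest.update(hashlib.sha256(code_version.encode("utf-8")).digest())
    digest.update(hashlib.sha256(data_bytes).digest())
    # sort_keys makes the result independent of dictionary ordering.
    params_json = json.dumps(hyperparams, sort_keys=True).encode("utf-8")
    digest.update(hashlib.sha256(params_json).digest())
    return digest.hexdigest()[:12]
```

With the same code commit, a change in the data alone yields a different ID, which a Git commit hash by itself would not capture.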
Lifecycle
References
● http://papers.nips.cc/paper/5656-hidden-technical-debt-in-machine-learning-systems.pdf
● https://mlops.org/
Tracking Machine Learning
Experiments
What are we trying to solve?
● Keep track of each experiment and its hyperparameters,
results and artifacts.
● Traditional Ways:
■ Good ol’ pen and paper.
■ Excel Sheets
■ Git
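As a step up from a spreadsheet, the same information can be appended to a machine-readable log. This is a minimal sketch using only the standard library; the function names `log_run` and `best_run` and the JSON Lines layout are illustrative assumptions, not the API of any tool named in this deck.

```python
import json
import time
import uuid
from pathlib import Path


def log_run(log_path, params, metrics, artifacts=()):
    """Append one experiment run to a JSON Lines file and return its run ID."""
    record = {
        "run_id": uuid.uuid4().hex,
        "timestamp": time.time(),
        "params": params,
        "metrics": metrics,
        "artifacts": list(artifacts),  # paths to saved models, plots, etc.
    }
    with Path(log_path).open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(record) + "\n")
    return record["run_id"]


def best_run(log_path, metric, higher_is_better=True):
    """Return the logged run with the best value for the given metric."""
    with Path(log_path).open(encoding="utf-8") as handle:
        runs = [json.loads(line) for line in handle if line.strip()]
    scored = [run for run in runs if metric in run["metrics"]]
    if not scored:
        return None
    pick = max if higher_is_better else min
    return pick(scored, key=lambda run: run["metrics"][metric])
```

A flat file like this works for one person; sharing runs across a team, comparing them in a UI, and storing artifacts are the gaps that dedicated tracking tools aim to fill.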
Model Mgmt. Life Cycle
Existing Tracking Solutions
● studio.ml
● Polyaxon
● Michelangelo
Introducing MLflow
● Integrates with existing code; requires little change.
● Platform independent.
● Can be used by 1 to 1,000 people.
● Scales to big data with Apache Spark.
● Exposes metrics in Prometheus format.
Demo
Predict the delay of a flight by considering:
● Source and Destination.
● Departure and Arrival Time.
● Airline, Flight Number, Air Time.
● Month, Week, Day.
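The inputs listed above could be turned into one flat feature row per flight along the following lines. This is only a sketch: the field names, the ISO 8601 timestamp format, and the reading of "Day" as day of the month are assumptions, since the demo's actual dataset schema is not shown here.

```python
from datetime import datetime


def build_features(flight):
    """Flatten one raw flight record (hypothetical schema) into model features."""
    departure = datetime.fromisoformat(flight["scheduled_departure"])
    arrival = datetime.fromisoformat(flight["scheduled_arrival"])
    return {
        "source": flight["source"],
        "destination": flight["destination"],
        "airline": flight["airline"],
        "flight_number": flight["flight_number"],
        "departure_hour": departure.hour,
        "arrival_hour": arrival.hour,
        "air_time_minutes": flight["air_time_minutes"],
        "month": departure.month,
        "week": departure.isocalendar()[1],  # ISO week number
        "day": departure.day,
    }
```

Categorical fields such as airline and source would still need encoding before most models can use them.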
Thank You!
