UNLOCKING MLOPS POTENTIAL:
STREAMLINING MACHINE LEARNING
LIFECYCLE WITH DATABRICKS
-Abishek Subramanian
VENUE : Sterlings Mac Hotel Bengaluru
Date : 21 March, 2024
ABISHEK SUBRAMANIAN
SENIOR PLATFORM SOLUTION ENGINEER – DATABRICKS
@abishek-subramanian
WHAT IS MLOPS?
MLOps is a set of processes and
automated steps to manage code,
data, and models. It combines
DevOps, DataOps, and ModelOps.
GENERAL RECOMMENDATIONS FOR MLOPS
This section includes some general
recommendations for MLOps on Databricks with
links for more information.
CREATE A SEPARATE ENVIRONMENT FOR EACH STAGE
• An execution environment is the place where models and data are created or
consumed by code. Each execution environment consists of compute instances, their
runtimes and libraries, and automated jobs.
• Databricks recommends creating separate environments for the different stages of
ML code and model development with clearly defined transitions between stages.
The workflow described in this article follows this process, using the common names
for the stages:
• Development
• Staging
• Production
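One way to make the stage separation concrete is a small configuration object per environment. The sketch below is illustrative only: the `Environment` helper, the catalog names (`dev`, `staging`, `prod`), the `dev` branch name, and the `ml.customer_features` table are assumptions, not names taken from the talk or from Databricks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Environment:
    """Hypothetical description of one execution environment."""
    stage: str    # development, staging, or production
    catalog: str  # assumed: one catalog per stage
    branch: str   # assumed: Git branch that feeds this stage

    def table(self, schema: str, name: str) -> str:
        # Build a three-level catalog.schema.table name for this stage.
        return f"{self.catalog}.{schema}.{name}"


# Assumed naming; adapt to your own workspace conventions.
ENVIRONMENTS = {
    "development": Environment("development", "dev", "dev"),
    "staging": Environment("staging", "staging", "staging"),
    "production": Environment("production", "prod", "release"),
}

features_table = ENVIRONMENTS["production"].table("ml", "customer_features")
print(features_table)
```

Keeping the stage-specific names in one place means pipeline code can stay identical across stages and only the selected environment changes.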
ACCESS CONTROL AND VERSIONING
• Use Git for version control.
Pipelines and code should be stored in Git for version control. Moving ML logic between stages can
then be interpreted as moving code from the development branch, to the staging branch, to the
release branch. Use Databricks Git folders to integrate with your Git provider and sync notebooks and
source code with Databricks workspaces. Databricks also provides additional tools for Git integration
and version control; see Developer tools and guidance.
• Store data in a lakehouse architecture using Delta tables.
Data should be stored in a lakehouse architecture in your cloud account. Both raw data and feature
tables should be stored as Delta tables with access controls to determine who can read and modify
them.
• Manage model development with MLflow.
You can use MLflow to track the model development process and save code
snapshots, model parameters, metrics, and other metadata.
• Use Models in Unity Catalog to manage the model lifecycle.
Use Models in Unity Catalog to manage model versioning, governance, and
deployment status.
DEPLOY CODE, NOT MODELS
• In most situations, Databricks recommends that during the ML development
process, you promote code, rather than models, from one environment to the
next. Moving project assets this way ensures that all code in the ML
development process goes through the same code review and integration
testing processes. It also ensures that the production version of the model is
trained on production code. For a more detailed discussion of the options and
trade-offs, see Model deployment patterns.
• URL: https://docs.databricks.com/en/machine-learning/mlops/deployment-patterns.html
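The idea of promoting code rather than models can be sketched in a few lines. In this toy example the "model" is just a mean, and the `train` function and the per-environment data are hypothetical; the point is only that the same reviewed code runs in every environment and each environment trains its own model.

```python
from statistics import mean


def train(rows: list[float]) -> dict:
    """Toy 'training' step: the model is just the mean of the target."""
    return {"prediction": mean(rows), "n_rows": len(rows)}


# Hypothetical data available in each environment.
DATA_BY_ENV = {
    "development": [1.0, 2.0, 3.0],             # small sample for experiments
    "staging": [1.0, 2.0, 3.0, 4.0],            # test-sized data
    "production": [2.0, 4.0, 6.0, 8.0, 10.0],   # full production data
}

# The same function is promoted through every environment; each
# environment produces its own model from its own data.
models = {env: train(rows) for env, rows in DATA_BY_ENV.items()}
print(models["production"])
```

Because the production model is produced by running the promoted code in production, no model artifact has to be copied between environments in this pattern.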
RECOMMENDED MLOPS WORKFLOW
• The following sections describe a typical MLOps workflow, covering each of the
three stages: development, staging, and production.
• This section uses the terms “data scientist” and “ML engineer” as archetypal
personas; specific roles and responsibilities in the MLOps workflow will vary
between teams and organizations.
DEVELOPMENT STAGE
• The focus of the development stage is experimentation. Data scientists develop features and
models and run experiments to optimize model performance. The output of the development
process is ML pipeline code that can include feature computation, model training, inference,
and monitoring.
Ref link: https://docs.databricks.com/en/machine-learning/mlops/mlops-workflow.html
DEVELOPMENT STAGE
• Data sources
• Exploratory data analysis (EDA)
• Code
• Train model (development)
• Validate and deploy model
• Commit code
4/2/24
STAGING STAGE
• The focus of this stage is testing the ML pipeline code to ensure it is ready for production.
All of the ML pipeline code is tested in this stage, including code for model training as well
as feature engineering pipelines, inference code, and so on.
• ML engineers create a CI pipeline to implement the unit and integration tests run in this
stage. The output of the staging process is a release branch that triggers the CI/CD system to
start the production stage.
Ref link: https://docs.databricks.com/en/machine-learning/mlops/mlops-workflow.html
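As a rough illustration of the kind of unit test a CI pipeline might run in this stage, the sketch below tests a small feature function with Python's standard `unittest` module. The `add_ratio_feature` function and its column names are hypothetical and stand in for real feature engineering code.

```python
import unittest


def add_ratio_feature(row: dict) -> dict:
    """Hypothetical feature step: spend per visit, guarding against zero visits."""
    visits = row["visits"]
    ratio = row["spend"] / visits if visits else 0.0
    return {**row, "spend_per_visit": ratio}


class AddRatioFeatureTest(unittest.TestCase):
    def test_ratio_is_computed(self):
        out = add_ratio_feature({"spend": 30.0, "visits": 3})
        self.assertEqual(out["spend_per_visit"], 10.0)

    def test_zero_visits_does_not_raise(self):
        out = add_ratio_feature({"spend": 30.0, "visits": 0})
        self.assertEqual(out["spend_per_visit"], 0.0)


# Run the tests programmatically, as a CI step might, and keep the result.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(AddRatioFeatureTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

A CI job could fail the build whenever `result.wasSuccessful()` is false, which blocks the merge before a release branch is cut.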
STAGING STAGE
• Data
• Merge code
• Integration tests (CI)
• Merge to staging branch
• Create a release branch
PRODUCTION STAGE
• ML engineers own the production environment where ML pipelines are deployed and executed.
These pipelines trigger model training, validate and deploy new model versions, publish
predictions to downstream tables or applications, and monitor the entire process to avoid
performance degradation and instability.
• Data scientists typically do not have write or compute access in the production environment.
However, it is important that they have visibility into test results, logs, model artifacts,
production pipeline status, and monitoring tables. This visibility allows them to identify and
diagnose problems in production and to compare the performance of new models to models
currently in production. You can grant data scientists read-only access to assets in the
production catalog for these purposes.
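Comparing a new model version against the one currently in production can be expressed as a simple validation gate. This is a minimal sketch under stated assumptions: the `should_promote` function, the `accuracy` metric (higher is better), and the `min_gain` threshold are all hypothetical and not part of any Databricks API.

```python
from typing import Optional


def should_promote(challenger: dict, champion: Optional[dict],
                   metric: str = "accuracy", min_gain: float = 0.01) -> bool:
    """Return True if the new model version should replace the current one."""
    if champion is None:
        # Nothing is deployed yet, so the first validated model is promoted.
        return True
    # Assumes a higher metric value is better.
    return challenger[metric] - champion[metric] >= min_gain


current = {"version": 3, "accuracy": 0.91}
candidate = {"version": 4, "accuracy": 0.93}
print(should_promote(candidate, current))
```

Requiring a minimum gain, rather than any improvement at all, is one way to avoid redeploying on differences that are within noise.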
PRODUCTION STAGE
• Train model
• Validate model
• Deploy model
• Model Serving
• Inference: batch or streaming
• Lakehouse Monitoring
• Retraining
MLOPS — END-TO-END PIPELINE DEMO
• This demo covers a full MLOps pipeline. We’ll show you how Databricks
Lakehouse can be leveraged to orchestrate and deploy models in production
while ensuring governance, security and robustness.
• Ingest data and save it in a feature store
• Build ML models with Databricks AutoML
• Set up MLflow hooks to automatically test your models
• Create the model test job
• Automatically move models to production once the tests are validated
• Periodically retrain your model to prevent drift
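To give a feel for the last step, here is a deliberately simplistic drift check that could trigger retraining. It is not what the demo or Lakehouse Monitoring does: the mean-shift heuristic, the `needs_retraining` function, the threshold of 2.0, and the sample values are all assumptions chosen for illustration.

```python
from statistics import mean, pstdev


def needs_retraining(baseline: list[float], recent: list[float],
                     threshold: float = 2.0) -> bool:
    """Flag drift when the recent mean moves more than `threshold`
    baseline standard deviations away from the baseline mean."""
    spread = pstdev(baseline)
    if spread == 0:
        # Constant baseline: any change in the mean counts as drift.
        return mean(recent) != mean(baseline)
    shift = abs(mean(recent) - mean(baseline)) / spread
    return shift > threshold


baseline = [10.0, 11.0, 9.0, 10.0, 10.0]   # feature values seen at training time
stable = [10.0, 10.5, 9.5, 10.0]           # recent values, same distribution
shifted = [14.0, 15.0, 13.0, 14.5]         # recent values after a shift

print(needs_retraining(baseline, stable))
print(needs_retraining(baseline, shifted))
```

A scheduled job could run a check like this on each monitored feature and start the training pipeline only when drift is flagged, instead of retraining on a fixed calendar alone.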
COMMAND
• To install the demo, get a free Databricks workspace and
execute the following two commands in a Python notebook:
• %pip install dbdemos
• import dbdemos
  dbdemos.install('mlops-end2end')
Try Databricks free
https://www.databricks.com/try-databricks?itm_data=demo_center#account
THANK YOU
@abishek-subramanian
https://community.databricks.com/t5/bangalore/gh-p/databricks-community-bangalore-user-grou
https://www.linkedin.com/groups/14275663/
https://docs.databricks.com/en/machine-learning/mlops/mlops-workflow.html
