Data Science as a Service: Intersection of Cloud Computing and Data SciencePouria Amirian
Dr. Pouria Amirian explains data science, steps in a data science workflow and show some experiments in AzureML. He also mentions about big data issues in a data science project and solutions to them.
Data Science as a Service: Intersection of Cloud Computing and Data SciencePouria Amirian
Dr. Pouria Amirian from the University of Oxford explains Data Science and its relationship with Big Data and Cloud Computing. Then he illustrates using AzureML to perform a simple data science analytics.
Keynote presentation from ECBS conference. The talk is about how to use machine learning and AI in improving software engineering. Experiences from our project in Software Center (www.software-center.se).
Crossing the Analytics Chasm and Getting the Models You Developed DeployedRobert Grossman
There are two cultures in data science and analytics - those that develop analytic models and those that deploy analytic models into operational systems. In this talk, we review the life cycle of analytic models and provide an overview of some of the approaches that have been developed for managing analytic models and workflows and for deploying them, including using analytic engines and analytic containers . We give a quick overview of languages for analytic models (PMML) and analytic workflows (PFA). We also describe the emerging discipline of AnalyticOps that has borrowed some of the techniques of DevOps.
The adoption of machine learning techniques for software defect prediction: A...RAKESH RANA
The adoption of machine learning techniques for software defect prediction: An initial industrial validation
Presented at:
11th Joint Conference On Knowledge-Based Software Engineering, JCKBSE, Volgograd, Russia, 2014
Get full text of publication at:
http://rakeshrana.website/index.php/work/publications/
MLOps and Data Quality: Deploying Reliable ML Models in ProductionProvectus
Looking to build a robust machine learning infrastructure to streamline MLOps? Learn from Provectus experts how to ensure the success of your MLOps initiative by implementing Data QA components in your ML infrastructure.
For most organizations, the development of multiple machine learning models, their deployment and maintenance in production are relatively new tasks. Join Provectus as we explain how to build an end-to-end infrastructure for machine learning, with a focus on data quality and metadata management, to standardize and streamline machine learning life cycle management (MLOps).
Agenda
- Data Quality and why it matters
- Challenges and solutions of Data Testing
- Challenges and solutions of Model Testing
- MLOps pipelines and why they matter
- How to expand validation pipelines for Data Quality
Data Science as a Service: Intersection of Cloud Computing and Data SciencePouria Amirian
Dr. Pouria Amirian explains data science, steps in a data science workflow and show some experiments in AzureML. He also mentions about big data issues in a data science project and solutions to them.
Data Science as a Service: Intersection of Cloud Computing and Data SciencePouria Amirian
Dr. Pouria Amirian from the University of Oxford explains Data Science and its relationship with Big Data and Cloud Computing. Then he illustrates using AzureML to perform a simple data science analytics.
Keynote presentation from ECBS conference. The talk is about how to use machine learning and AI in improving software engineering. Experiences from our project in Software Center (www.software-center.se).
Crossing the Analytics Chasm and Getting the Models You Developed DeployedRobert Grossman
There are two cultures in data science and analytics - those that develop analytic models and those that deploy analytic models into operational systems. In this talk, we review the life cycle of analytic models and provide an overview of some of the approaches that have been developed for managing analytic models and workflows and for deploying them, including using analytic engines and analytic containers . We give a quick overview of languages for analytic models (PMML) and analytic workflows (PFA). We also describe the emerging discipline of AnalyticOps that has borrowed some of the techniques of DevOps.
The adoption of machine learning techniques for software defect prediction: A...RAKESH RANA
The adoption of machine learning techniques for software defect prediction: An initial industrial validation
Presented at:
11th Joint Conference On Knowledge-Based Software Engineering, JCKBSE, Volgograd, Russia, 2014
Get full text of publication at:
http://rakeshrana.website/index.php/work/publications/
MLOps and Data Quality: Deploying Reliable ML Models in ProductionProvectus
Looking to build a robust machine learning infrastructure to streamline MLOps? Learn from Provectus experts how to ensure the success of your MLOps initiative by implementing Data QA components in your ML infrastructure.
For most organizations, the development of multiple machine learning models, their deployment and maintenance in production are relatively new tasks. Join Provectus as we explain how to build an end-to-end infrastructure for machine learning, with a focus on data quality and metadata management, to standardize and streamline machine learning life cycle management (MLOps).
Agenda
- Data Quality and why it matters
- Challenges and solutions of Data Testing
- Challenges and solutions of Model Testing
- MLOps pipelines and why they matter
- How to expand validation pipelines for Data Quality
The catalyst for the success of automobiles came not through the invention of the car but rather through the establishment of an innovative assembly line. History shows us that the ability to mass produce and distribute a product is the key to driving adoption of any innovation, and machine learning is no different. MLOps is the assembly line of Machine Learning and in this presentation we will discuss the core capabilities your organization should be focused on to implement a successful MLOps system.
How to re-use existing system models to generate test casesTransWare AG
This tutorial on Model Based Testing gives you a brief overview how to re-use system models to generate test cases with the software BPM-X.
This tutorial covers the following topics:
1. Introduction
2. Use models for agile development
3. From system model to test cases
4. Summary
5. Next webinars
A confluence of events is accelerating the growth of AI in the Enterprise - (i) The COVID pandemic is accelerating the digital transformation of enterprises, (ii) increased digital sales & digital interaction is fueling interest in operationalizing AI to drive revenue and cost efficiencies and (iii) Enterprise databases and enterprise apps are infusing AI to transparently augment predictive capabilities for clients. Enterprise Power Systems are pillars of the global economy hosting our trinity of operating systems
Certification Study Group - NLP & Recommendation Systems on GCP Session 5gdgsurrey
This session features Raghavendra Guttur's exploration of "Atlas," a chatbot powered by Llama2-7b with MiniLM v2 enhancements for IT support. ChengCheng Tan will discuss ML pipeline automation, monitoring, optimization, and maintenance.
MLflow is an MLOps tool that enables data scientist to quickly productionize their Machine Learning projects. To achieve this, MLFlow has four major components which are Tracking, Projects, Models, and Registry. MLflow lets you train, reuse, and deploy models with any library and package them into reproducible steps. MLflow is designed to work with any machine learning library and require minimal changes to integrate into an existing codebase. In this session, we will cover the common pain points of machine learning developers such as tracking experiments, reproducibility, deployment tool and model versioning. Ready to get your hands dirty by doing quick ML project using mlflow and release to production to understand the ML-Ops lifecycle.
Denis Jannot - Towards Data Science Engineering Principles - Codemotion Milan...Codemotion
Over the last half century we have developed and refined the discipline of software engineering in order to accelerate the development and deployment of applications. This has involved a general shift towards DevOps practices that align developer and business objectives and dramatically reduce time-to-delivery. With the recent rise of data science and data analytics, the time has come to apply the principles of DevOps to data science and leverage the lessons from software engineering (and its systematic and repeatable methodology) to the discipline of data science.
End-to-End Machine learning pipelines for Python driven organizations - Nick ...PyData
The recent advances in machine learning and artificial intelligence are amazing! Yet, in order to have real value within a company, data scientists must be able to get their models off of their laptops and deployed within a company’s data pipelines and infrastructure. In this session, I'll demonstrate how one-off experiments can be transformed into scalable ML pipelines with minimal effort.
Demystifying Cognitive Approaches to Predictive Maintenance Part 1Anita Raj
While much has been written about industrial digital transformation, only a few industrial companies have nailed the transformation given the complexity and magnitude. Watch the slides to get industry perspectives based on the key lessons learned about scaling and operationalizing “connected ecosystem” for Industrial IoT. The goal is to accurately paint the picture of where the gaps exist and how companies can take advantage of cognitive approaches for higher machine lifetime and greater productivity.
R Tool for Visual Studio และการทำงานร่วมกันเป็นทีม โดย เฉลิมวงศ์ วิจิตรปิยะกุ...BAINIDA
R Tool for Visual Studio และการทำงานร่วมกันเป็นทีม โดย เฉลิมวงศ์ วิจิตรปิยะกุล MVP, Microsoft Thailand
THE FIRST NIDA BUSINESS ANALYTICS AND DATA SCIENCES CONTEST/CONFERENCE
Scaling & Managing Production Deployments with H2O ModelOpsSri Ambati
This presentation was made on June 30th, 2020.
Recording of the presentation is available here: https://youtu.be/9LajqAL_CU8
As enterprises “make their own AI”, a new set of challenges emerge. Maintaining reproducibility, traceability, and verifiability of machine learning models, as well as recording experiments, tracking insights, and reproducing results, are key. Collaboration between teams is also necessary as “model factories” are created for enterprise-wide model data science efforts. Additionally, monitoring of models ensures that drift or performance degradation is addressed with either retraining or model updates. Finally, data and model lineage in case of rollbacks or addressing regulatory compliance is necessary.
H2O ModelOps delivers centralized catalog and management, deployment, monitoring, collaboration, and administration of machine learning models. In this webinar, we learn how H2O can assist with operationalizing, scaling and managing production deployments.
Speaker's Bio:
Felix is a part of the Customer Success team in Asia Pacific at H2O.ai. An engineer and an IIM alumni, Felix has held prominent positions in the data science industry.
From Model-based to Model and Simulation-based Systems ArchitecturesObeo
Achieving quality engineering through descriptive and analytical models
Systems architecture design is a key activity that affect the
overall systems engineering cost. It is hence fundamental
to ensure that the system architecture reaches a proper quality.
In this paper, we leverage on MBSE approaches and complement them
with simulation techniques, as a prom-ising way to improve the quality of the system architecture definition, and to come up with inno-vative solutions while securing the systems engineering process.
How to apply machine learning into your CI/CD pipelineAlon Weiss
A quick introduction to AIOps, the business reasons why the CI/CD pipeline needs to constantly improve, and how this can be accomplished with data that's already available with existing Machine Learning and other algorithms.
Why do the majority of Data Science projects never make it to production?Itai Yaffe
María de la Fuente (Solutions Architect Manager for IMEA) @ Databricks
While most companies understand the value creation of leveraging data and are taking on board an AI strategy, only 13% of the data science projects make it to production successfully.
Besides the well-known skills gap in the market, we need to level up our end-to-end approach and cover all aspects involved when working with AI.
In this session, we will discuss the main obstacles to overcome and how we can avoid the major pitfalls to ensure our data science journey becomes successful.
The catalyst for the success of automobiles came not through the invention of the car but rather through the establishment of an innovative assembly line. History shows us that the ability to mass produce and distribute a product is the key to driving adoption of any innovation, and machine learning is no different. MLOps is the assembly line of Machine Learning and in this presentation we will discuss the core capabilities your organization should be focused on to implement a successful MLOps system.
How to re-use existing system models to generate test casesTransWare AG
This tutorial on Model Based Testing gives you a brief overview how to re-use system models to generate test cases with the software BPM-X.
This tutorial covers the following topics:
1. Introduction
2. Use models for agile development
3. From system model to test cases
4. Summary
5. Next webinars
A confluence of events is accelerating the growth of AI in the Enterprise - (i) The COVID pandemic is accelerating the digital transformation of enterprises, (ii) increased digital sales & digital interaction is fueling interest in operationalizing AI to drive revenue and cost efficiencies and (iii) Enterprise databases and enterprise apps are infusing AI to transparently augment predictive capabilities for clients. Enterprise Power Systems are pillars of the global economy hosting our trinity of operating systems
Certification Study Group - NLP & Recommendation Systems on GCP Session 5gdgsurrey
This session features Raghavendra Guttur's exploration of "Atlas," a chatbot powered by Llama2-7b with MiniLM v2 enhancements for IT support. ChengCheng Tan will discuss ML pipeline automation, monitoring, optimization, and maintenance.
MLflow is an MLOps tool that enables data scientist to quickly productionize their Machine Learning projects. To achieve this, MLFlow has four major components which are Tracking, Projects, Models, and Registry. MLflow lets you train, reuse, and deploy models with any library and package them into reproducible steps. MLflow is designed to work with any machine learning library and require minimal changes to integrate into an existing codebase. In this session, we will cover the common pain points of machine learning developers such as tracking experiments, reproducibility, deployment tool and model versioning. Ready to get your hands dirty by doing quick ML project using mlflow and release to production to understand the ML-Ops lifecycle.
Denis Jannot - Towards Data Science Engineering Principles - Codemotion Milan...Codemotion
Over the last half century we have developed and refined the discipline of software engineering in order to accelerate the development and deployment of applications. This has involved a general shift towards DevOps practices that align developer and business objectives and dramatically reduce time-to-delivery. With the recent rise of data science and data analytics, the time has come to apply the principles of DevOps to data science and leverage the lessons from software engineering (and its systematic and repeatable methodology) to the discipline of data science.
End-to-End Machine learning pipelines for Python driven organizations - Nick ...PyData
The recent advances in machine learning and artificial intelligence are amazing! Yet, in order to have real value within a company, data scientists must be able to get their models off of their laptops and deployed within a company’s data pipelines and infrastructure. In this session, I'll demonstrate how one-off experiments can be transformed into scalable ML pipelines with minimal effort.
Demystifying Cognitive Approaches to Predictive Maintenance Part 1Anita Raj
While much has been written about industrial digital transformation, only a few industrial companies have nailed the transformation given the complexity and magnitude. Watch the slides to get industry perspectives based on the key lessons learned about scaling and operationalizing “connected ecosystem” for Industrial IoT. The goal is to accurately paint the picture of where the gaps exist and how companies can take advantage of cognitive approaches for higher machine lifetime and greater productivity.
R Tool for Visual Studio และการทำงานร่วมกันเป็นทีม โดย เฉลิมวงศ์ วิจิตรปิยะกุ...BAINIDA
R Tool for Visual Studio และการทำงานร่วมกันเป็นทีม โดย เฉลิมวงศ์ วิจิตรปิยะกุล MVP, Microsoft Thailand
THE FIRST NIDA BUSINESS ANALYTICS AND DATA SCIENCES CONTEST/CONFERENCE
Scaling & Managing Production Deployments with H2O ModelOpsSri Ambati
This presentation was made on June 30th, 2020.
Recording of the presentation is available here: https://youtu.be/9LajqAL_CU8
As enterprises “make their own AI”, a new set of challenges emerge. Maintaining reproducibility, traceability, and verifiability of machine learning models, as well as recording experiments, tracking insights, and reproducing results, are key. Collaboration between teams is also necessary as “model factories” are created for enterprise-wide model data science efforts. Additionally, monitoring of models ensures that drift or performance degradation is addressed with either retraining or model updates. Finally, data and model lineage in case of rollbacks or addressing regulatory compliance is necessary.
H2O ModelOps delivers centralized catalog and management, deployment, monitoring, collaboration, and administration of machine learning models. In this webinar, we learn how H2O can assist with operationalizing, scaling and managing production deployments.
Speaker's Bio:
Felix is a part of the Customer Success team in Asia Pacific at H2O.ai. An engineer and an IIM alumni, Felix has held prominent positions in the data science industry.
From Model-based to Model and Simulation-based Systems ArchitecturesObeo
Achieving quality engineering through descriptive and analytical models
Systems architecture design is a key activity that affect the
overall systems engineering cost. It is hence fundamental
to ensure that the system architecture reaches a proper quality.
In this paper, we leverage on MBSE approaches and complement them
with simulation techniques, as a prom-ising way to improve the quality of the system architecture definition, and to come up with inno-vative solutions while securing the systems engineering process.
How to apply machine learning into your CI/CD pipelineAlon Weiss
A quick introduction to AIOps, the business reasons why the CI/CD pipeline needs to constantly improve, and how this can be accomplished with data that's already available with existing Machine Learning and other algorithms.
Why do the majority of Data Science projects never make it to production?Itai Yaffe
María de la Fuente (Solutions Architect Manager for IMEA) @ Databricks
While most companies understand the value creation of leveraging data and are taking on board an AI strategy, only 13% of the data science projects make it to production successfully.
Besides the well-known skills gap in the market, we need to level up our end-to-end approach and cover all aspects involved when working with AI.
In this session, we will discuss the main obstacles to overcome and how we can avoid the major pitfalls to ensure our data science journey becomes successful.
Similar to Machine learning operations model book mlops (20)
How can I successfully sell my pi coins in Philippines?DOT TECH
Even tho pi not launched globally, crypto whales, holders, investors are looking forward to hold up to 20,000 pi coins before mainnet launch in 2026.
All a miner or pioneer has to do to sell is to get in contact with a legitimate pi vendor ( a person that buys pi coins from miners and resell them to investors)
I will leave the telegram contact of my personal pi vendor:
@Pi_vendor_247
#pi network
#pi 2024
#sell pi
Explore our comprehensive data analysis project presentation on predicting product ad campaign performance. Learn how data-driven insights can optimize your marketing strategies and enhance campaign effectiveness. Perfect for professionals and students looking to understand the power of data analysis in advertising. for more details visit: https://bostoninstituteofanalytics.org/data-science-and-artificial-intelligence/
StarCompliance is a leading firm specializing in the recovery of stolen cryptocurrency. Our comprehensive services are designed to assist individuals and organizations in navigating the complex process of fraud reporting, investigation, and fund recovery. We combine cutting-edge technology with expert legal support to provide a robust solution for victims of crypto theft.
Our Services Include:
Reporting to Tracking Authorities:
We immediately notify all relevant centralized exchanges (CEX), decentralized exchanges (DEX), and wallet providers about the stolen cryptocurrency. This ensures that the stolen assets are flagged as scam transactions, making it impossible for the thief to use them.
Assistance with Filing Police Reports:
We guide you through the process of filing a valid police report. Our support team provides detailed instructions on which police department to contact and helps you complete the necessary paperwork within the critical 72-hour window.
Launching the Refund Process:
Our team of experienced lawyers can initiate lawsuits on your behalf and represent you in various jurisdictions around the world. They work diligently to recover your stolen funds and ensure that justice is served.
At StarCompliance, we understand the urgency and stress involved in dealing with cryptocurrency theft. Our dedicated team works quickly and efficiently to provide you with the support and expertise needed to recover your assets. Trust us to be your partner in navigating the complexities of the crypto world and safeguarding your investments.
1. Prof. Dr. Jan Kirenz
Machine Learning Operations (MLOps)
Usage of Pipelines in the ML Lifecycle with
Tensor Flow Extended (TFX) and Kubeflow
Prof. Dr. Jan Kirenz
HdM Stuttgart
2. Prof. Dr. Jan Kirenz
80-85% PoC Factory
The Proof of Concept Factory
Most companies...
● … conduct AI experiments and pilots
but achieve a low scaling success
rate
● … have significant under investment,
yielding low returns
Source: Accenture (2019) https://www.accenture.com/us-en/insights/artificial-intelligence/ai-investments
3. Prof. Dr. Jan Kirenz
https://www.gartner.com/smarterwithgartner/gartner-top-10-data-and-analytics-trends-for-2021/
Scalable AI
4. Prof. Dr. Jan Kirenz
ML Project Code
The problem with scaling AI: ML code is only a
fraction of a production-ready ML project code
ML
Code 5-10%
5. Prof. Dr. Jan Kirenz
Monitoring
Hidden technical debt in machine learning systems
Sculley, D. et al. (2015). Hidden technical debt in machine learning systems. Advances in neural information processing systems, 28, pp. 2503-2511
Data Collection
Configuration
Feature Engineering
Data
Verification
Metadata Management
Model Analysis
Serving
Infra-
structure
Automation
Process Management
Machine Resource
Management
Testing and Debugging
ML
Code
6. Prof. Dr. Jan Kirenz
Machine learning operations (MLOps)
● ML Engineering culture and practice that
aims at unifying ML System development
(Dev) and ML system operations (Ops)
● Tools and principles to support workflow
standardization and automation through
the ML system lifecycle (e.g. with pipelines)
7. Prof. Dr. Jan Kirenz
Prof. Dr. Jan Kirenz
Machine learning
lifecycle
8. Prof. Dr. Jan Kirenz
Plan
Model
Deployment Data
Business Analyst
Data
Engineer
Software
Developer
Data Scientist
Lifecycle
of an ML System
Plan | Data | Model | Deployment
9. Prof. Dr. Jan Kirenz
Plan
Model
Deployment Data
Identify use
case
Frame
problem
Identify
variables
Define metrics
Business Analyst
Data
Engineer
Software
Developer
Data Scientist
Lifecycle
of an ML System
Plan | Data | Model | Deployment
10. Prof. Dr. Jan Kirenz
Plan
Model
Deployment Data
Identify use
case
Frame
problem
Identify
variables
Define metrics
Business Analyst
Data
Engineer
Software
Developer
Data Scientist
Data ingestion
Analyze &
clean data
Define schema
Feature
engineering
Data splitting
Anomaly
detection
Data
preprocessing
Lifecycle
of an ML System
Plan | Data | Model | Deployment
11. Prof. Dr. Jan Kirenz
Plan
Model
Deployment Data
Identify use
case
Frame
problem
Identify
variables
Define metrics
Business Analyst
Data
Engineer
Software
Developer
Data Scientist
Data ingestion
Analyze &
clean data
Define schema
Feature
engineering
Evaluate
model
Model Training
& tuning
Select
algorithm
Data splitting
Anomaly
detection
Data
preprocessing
Lifecycle
of an ML System
Plan | Data | Model | Deployment
12. Prof. Dr. Jan Kirenz
Plan
Model
Deployment Data
Identify use
case
Frame
problem
Identify
variables
Define metrics
Business Analyst
Data
Engineer
Software
Developer
Data Scientist
Data ingestion
Analyze &
clean data
Define schema
Feature
engineering
Validate model
Deploy model
Serve model
Retrain
triggers
Evaluate
model
Model Training
& tuning
Monitor model
Select
algorithm
Data splitting
Anomaly
detection
Data
preprocessing
Lifecycle
of an ML System
Plan | Data | Model | Deployment
13. Prof. Dr. Jan Kirenz
Plan
Model
Deployment Data
Identify use
case
Frame
problem
Identify
variables
Define metrics
Business Analyst
Data
Engineer
Software
Developer
Data Scientist
Data ingestion
Analyze &
clean data
Define schema
Feature
engineering
Validate model
Deploy model
Serve model
Retrain
triggers
Evaluate
model
Model Training
& tuning
Monitor model
Select
algorithm
Data splitting
Anomaly
detection
Data
preprocessing
Lifecycle
of an ML System
Plan | Data | Model | Deployment
Common issues which lead to a PoC to production gap
● Lack of reuse and duplication
● Inconsistency (data, code, models)
● Manual and slow transition from PoC to production
14. Prof. Dr. Jan Kirenz
Plan
Model
Deployment Data
Identify use
case
Frame
problem
Identify
variables
Define metrics
Business Analyst
Data
Engineer
Software
Developer
Data Scientist
Data ingestion
Analyze &
clean data
Define schema
Feature
engineering
Validate model
Deploy model
Serve model
Retrain
triggers
Evaluate
model
Model Training
& tuning
Monitor model
Select
algorithm
Data splitting
Anomaly
detection
Data
preprocessing
Model
management
Model registry
Data and feature
management
Feature store
Pipeline
management
Pipeline orchestration
Metadata
management
Metadata store
Lifecycle
of an ML System
Plan | Data | Model | Deployment
MLOps components
15. Prof. Dr. Jan Kirenz
What is a pipeline?
● Description of an ML workflow
● A pipeline component is a self-contained
set of user code that performs one step in
the pipeline
● Includes the definition of the configuration
and inputs required to run the pipeline (e.g.
model hyperparameters)
… do this
… than that
Start
...
… the end
The workflow is
also called directed
acyclic graph (DAG)
This is a component
Complete workflow of the ML
system lifecycle
16. Prof. Dr. Jan Kirenz
Source: Baer & Ngahane (2019)
… do this
… than that
Start
… the end
17. Prof. Dr. Jan Kirenz
TensorFlow Extended (TFX)
● Google-production-scale machine learning
(ML) platform based on TensorFlow
● Portable to multiple environments (Azure,
AWS, Google Cloud, IBM, ...)
● Python based toolkit; can be used with
notebooks
● Helps you orchestrate your ML process:
Apache Airflow, Apache Beam or Kubeflow
pipelines
Source: TensorFlow (2021)
18. Prof. Dr. Jan Kirenz
TFX 1.0 (19.05.21)
● Enterprise-grade support
● Security patches and select bug fixes for
up to three years
● Guaranteed API & Artifact backward
compatibility
Source: Google (2021)
19. Prof. Dr. Jan Kirenz
Plan
Model
Deployment Data
Identify use
case
Frame
problem
Identify
variables
Define metrics
Business Analyst
Data
Engineer
Software
Developer
Data Scientist
Data ingestion
Analyze &
clean data
Define schema
Feature
engineering
Validate model
Deploy model
Serve model
Evaluate
model
Model Training
& tuning
Monitor model
ExampleGen
Select
algorithm
StatisticsGen
SchemaGen
Example
Validator
Transform
Data splitting
Trainer
Tuner
Evaluator
InfraValidator
Anomaly
detection
Pusher
HUB / JS / LITE / SERVING
Model Server
TF
Data
Validation
(TFDV)
TFT
TF
KerasTuner
TensorFlow
Model Analysis
(TFMA)
BulkInferrer
Data
preprocessing
Metadata Store: ML Metadata (MLMD)
TFX Options for Pipeline
Orchestration
20. Prof. Dr. Jan Kirenz
Plan
Model
Deployment Data
Identify use
case
Frame
problem
Identify
variables
Define metrics
Business Analyst
Data
Engineer
Software
Developer
Data Scientist
Data ingestion
Analyze &
clean data
Define schema
Feature
engineering
Validate model
Deploy model
Serve model
Evaluate
model
Model Training
& tuning
Monitor model
ExampleGen
Select
algorithm
StatisticsGen
SchemaGen
Example
Validator
Transform
Data splitting
Trainer
Tuner
Evaluator
InfraValidator
Anomaly
detection
Pusher
HUB / JS / LITE / SERVING
Model Server
TF
Data
Validation
(TFDV)
TFT
TF
KerasTuner
TensorFlow
Model Analysis
(TFMA)
BulkInferrer
Metadata Store (ML Metadata)
TFX Options for Pipeline
Orchestration
Data
preprocessing
21. Prof. Dr. Jan Kirenz
Plan
Model
Deployment Data
Identify use
case
Frame
problem
Identify
variables
Define metrics
Business Analyst
Data
Engineer
Software
Developer
Data Scientist
Data ingestion
Analyze &
clean data
Define schema
Feature
engineering
Validate model
Deploy model
Serve model
Evaluate
model
Model Training
& tuning
Monitor model
ExampleGen
Select
algorithm
StatisticsGen
SchemaGen
Example
Validator
Transform
Data splitting
Trainer
Tuner
Evaluator
InfraValidator
Anomaly
detection
Pusher
HUB / JS / LITE / SERVING
Model Server
TF
Data
Validation
(TFDV)
TFT
TF
KerasTuner
TensorFlow
Model Analysis
(TFMA)
BulkInferrer
Data
preprocessing
Metadata Store (ML Metadata)
TFX Options for Pipeline
Orchestration
22. Prof. Dr. Jan Kirenz
Plan
Model
Deployment Data
Identify use
case
Frame
problem
Identify
variables
Define metrics
Business Analyst
Data
Engineer
Software
Developer
Data Scientist
Data ingestion
Analyze &
clean data
Define schema
Feature
engineering
Validate model
Deploy model
Serve model
Evaluate
model
Model Training
& tuning
Monitor model
ExampleGen
Select
algorithm
StatisticsGen
SchemaGen
Example
Validator
Transform
Data splitting
Trainer
Tuner
Evaluator
InfraValidator
Anomaly
detection
Pusher
HUB / JS / LITE / SERVING
Model Server
TF
Data
Validation
(TFDV)
TFT
TF
TensorFlow
Model Analysis
(TFMA)
BulkInferrer
Data
preprocessing
Metadata Store (ML Metadata)
TFX Options for Pipeline
Orchestration
23. Prof. Dr. Jan Kirenz
Plan
Model
Deployment Data
Identify use
case
Frame
problem
Identify
variables
Define metrics
Business Analyst
Data
Engineer
Software
Developer
Data Scientist
Data ingestion
Analyze &
clean data
Define schema
Feature
engineering
Validate model
Deploy model
Serve model
Evaluate
model
Model Training
& tuning
Monitor model
ExampleGen
Select
algorithm
StatisticsGen
SchemaGen
Example
Validator
Transform
Data splitting
Trainer
Tuner
Evaluator
InfraValidator
Anomaly
detection
Pusher
HUB / JS / LITE / SERVING
Model Server
TF
Data
Validation
(TFDV)
TFT
TF
TensorFlow
Model Analysis
(TFMA)
BulkInferrer
Data
preprocessing
Metadata Store (ML Metadata)
TFX Options for Pipeline
Orchestration
24. Prof. Dr. Jan Kirenz
Plan
Model
Deployment Data
Identify use
case
Frame
problem
Identify
variables
Define metrics
Business Analyst
Data
Engineer
Software
Developer
Data Scientist
Data ingestion
Analyze &
clean data
Define schema
Feature
engineering
Validate model
Deploy model
Serve model
Evaluate
model
Model Training
& tuning
Monitor model
ExampleGen
Select
algorithm
StatisticsGen
SchemaGen
Example
Validator
Transform
Data splitting
Trainer
Tuner
Evaluator
InfraValidator
Anomaly
detection
Pusher
HUB / JS / LITE / SERVING
Model Server
TF
Data
Validation
(TFDV)
TFT
TF
KerasTuner
TensorFlow
Model Analysis
(TFMA)
BulkInferrer
Data
preprocessing
Metadata Store (ML Metadata)
TFX Options for Pipeline
Orchestration
25. Prof. Dr. Jan Kirenz
Tuner
Evaluator
InfraValidator
ExampleGen
StatisticsGen
SchemaGen
Example
Validator
Transform
Trainer
Pusher
HUB / JS / LITE / SERVING
Model Server
KerasTuner
BulkInferrer
Metadata Store (ML Metadata)
TF
Data
Validation
(TFDV)
TFT
TF
TFX Options for Pipeline
Orchestration
TensorFlow
Model Analysis
(TFMA)
TensorFlow Lite is a set of tools that enables on-device
machine learning by helping developers run their models
on mobile, embedded, and IoT devices.
26. Prof. Dr. Jan Kirenz
Plan
Model
Deployment Data
Identify use
case
Frame
problem
Identify
variables
Define metrics
Business Analyst
Data
Engineer
Software
Developer
Data Scientist
Data ingestion
Analyze &
clean data
Define schema
Feature
engineering
Validate model
Deploy model
Serve model
Retrain model
Evaluate
model
Model Training
& tuning
Monitor model
ExampleGen
Select
algorithm
StatisticsGen
SchemaGen
Example
Validator
Transform
Data splitting
Trainer
Tuner
Evaluator
InfraValidator
Anomaly
detection
Pusher
HUB / JS / LITE / SERVING
Model Server
TF
Data
Validation
(TFDV)
TFT
TF
KerasTuner
TensorFlow
Model Analysis
(TFMA)
BulkInferrer
Data
preprocessing
Metadata Store (ML Metadata)
TFX Options for Pipeline
Orchestration
Production phase:
automate the execution
of the ML pipeline based
on a schedule or certain
triggering conditions.
Development phase: run the ML experiment, instead of
manually executing each step.
Data preparation
phase:
automatically
ingest, validate
and transform
data and provide
features to models
27. Prof. Dr. Jan Kirenz
Prof. Dr. Jan Kirenz
Pipeline orchestration
28. Prof. Dr. Jan Kirenz
TFX & Apache Airflow
● Programmatically author, schedule and
monitor workflows with Python code.
● User interface to visualize pipelines
running in production, monitor progress,
and troubleshoot issues.
29. Prof. Dr. Jan Kirenz
TFX & Apache Beam
● Provides a framework for running batch
and streaming data processing jobs that
run on a variety of runners (Spark, Flink, ...).
● Beam provides an abstraction layer which
enables TFX to run on any supported
runner without code modifications
● TFX only uses the Beam Python API
30. Prof. Dr. Jan Kirenz
TFX & Kubeflow pipelines
The Kubeflow Pipelines platform consists of:
● An engine for scheduling multi-step ML
workflows (using Kubernetes).
● User interface (UI) for managing and
tracking experiments, jobs, and runs.
● Python SDK for defining and manipulating
pipelines and components.
● Notebooks for interacting with the system
using the SDK
Kubeflow Pipelines is available as a core component of Kubeflow or as
a standalone installation.
36. Prof. Dr. Jan Kirenz
Google’s Vertex AI
Launched in May 2021
37. Prof. Dr. Jan Kirenz
ML Pipelines | wrap-up
Source: TensorFlow (2021)
By using a ML pipeline, you can:
● Automate your ML process, which lets you
regularly retrain, evaluate, and deploy your
model.
● Utilize distributed compute resources for
processing large datasets and workloads.
● Increase the velocity of experimentation by
running a pipeline with different sets of
hyperparameters.
To learn more visit the following tutorials @:
https://kirenz.github.io/
MLOps tutorials on how to:
● Install TF and TFX
● Build your first TFX pipeline
● Install Kubeflow
● Build your first Kubeflow pipeline
Jan Kirenz
www.kirenz.com
38. Prof. Dr. Jan Kirenz
Backup
Continuous Integration
and Continuous Delivery
pipeline for an ML/AI
project in Microsoft Azure
39. Prof. Dr. Jan Kirenz
Continuous Integration
and Continuous Delivery
pipeline for an ML/AI
project in Microsoft Azure
40. Prof. Dr. Jan Kirenz
CI/CD and source code management
41. Prof. Dr. Jan Kirenz
Cloud based machine learning service
42. Prof. Dr. Jan Kirenz
Source: Microsoft (2021)https://github.com/Microsoft/MLOpsPython
43. Prof. Dr. Jan Kirenz
Source: Microsoft (2021) https://docs.microsoft.com/en-us/azure/architecture/reference-architectures/ai/mlops-python
1
2
3
44. Prof. Dr. Jan Kirenz
https://github.com/Microsoft/MLOpsPython