Pythonsevilla2019 - Introduction to MLFlow

In this talk, I give an introduction to MLFlow and show examples of its three components: MLFlow Tracking, MLFlow Projects and MLFlow Models. I also use Databricks as an example of remote tracking.

Introduction to
MLFLOW
Accelerating the Machine
Learning Lifecycle

Who am I?
Fernando Ortega Gallego
Data Engineer @ Plain Concepts UK
• Interests & hobbies:
– Machine Learning and NLP
– Python, Azure and my daughter
– Honda biker
@FernanOrtega
fgallego@plainconcepts.com

Roadmap
Introduction
MLFlow Tracking
MLFlow Projects
MLFlow Models
Conclusions

Machine Learning Lifecycle
Raw data → Data prep → Train → Deploy

Problems
• Reproduce experiments
• Compare experiments
• Fine tune previous experiments
across teams
• Share data, parameters or
metrics
• Deploy trained models

It is difficult to productionize and share

Custom ML Platforms
• Facebook FBLearner, Google
TFX, Uber Michelangelo
• Advantages:
– Standardise the ML loop
• Disadvantages:
– Limited to a few algorithms or
frameworks
– Tied to the company’s
infrastructure

Are there any similar open source solutions?

Introducing MLFlow
• It works with any ML library &
language
• It runs the same way anywhere
• It is designed to be useful for
one person, small teams or big
teams

MLFlow components
Tracking Projects Models

MLFlow Tracking
• Record and query experiments:
– Data
– Code
– Model parameters
– Results (performance metrics)
– Model

Motivation
Centralised repository of useful information to analyse
several runs of training.

Key concepts
• Parameters
• Metrics
• Tags and notes
• Artifacts
• Source
• Version
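
The key concepts above map onto a handful of calls in the MLflow Python API. The following is a minimal sketch, assuming the module-level functions `mlflow.log_param`, `mlflow.log_metric` (with its `step` argument) and `mlflow.set_tag`. The helper `log_training_run` and the `InMemoryTracker` stand-in are hypothetical names introduced here so that the example runs without an MLflow installation.

```python
class InMemoryTracker:
    """Hypothetical stand-in mimicking three functions of the mlflow module."""

    def __init__(self):
        self.params = {}
        self.metrics = {}  # metric name -> list of (step, value)
        self.tags = {}

    def log_param(self, key, value):
        self.params[key] = value

    def log_metric(self, key, value, step=None):
        self.metrics.setdefault(key, []).append((step, value))

    def set_tag(self, key, value):
        self.tags[key] = value


def log_training_run(tracker, params, losses, tags=None):
    """Record one run: parameters once, a loss metric per step, then tags."""
    for name, value in params.items():
        tracker.log_param(name, value)
    for step, loss in enumerate(losses):
        tracker.log_metric("loss", loss, step=step)
    for name, value in (tags or {}).items():
        tracker.set_tag(name, value)


tracker = InMemoryTracker()
log_training_run(
    tracker,
    params={"alpha": 0.5, "l1_ratio": 0.1},
    losses=[0.9, 0.6, 0.4],
    tags={"stage": "demo"},
)
```

With MLflow installed, the same helper should work when the `mlflow` module itself is passed as `tracker` inside a `with mlflow.start_run():` block. Artifacts such as plots or model files would be added with `mlflow.log_artifact(path)`.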

MLFlow Projects
• Packaging format for
reproducible ML runs
• Defines dependencies for
reproducibility
• Execution API for running
projects locally or remotely

Motivation
A diverse set of tools and environments makes it difficult
to productionize and share ML work

Key concepts
• MLproject file
• Entry points
• Environments
– Conda
– Docker
– System
• Run
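
To make the MLproject file, entry points and environments more concrete, here is a minimal sketch that writes an MLproject file to a temporary directory. The YAML keys (`name`, `conda_env`, `entry_points`, `parameters`, `command`) follow the MLproject layout; the project name, the `conda.yaml` file and the `train.py` script are hypothetical placeholders.

```python
import tempfile
from pathlib import Path

# Contents follow the MLproject YAML layout; the project name, conda.yaml
# and train.py are hypothetical placeholders, not files from the talk.
MLPROJECT = """\
name: example_project

conda_env: conda.yaml

entry_points:
  main:
    parameters:
      alpha: {type: float, default: 0.5}
    command: "python train.py --alpha {alpha}"
"""


def write_mlproject(project_dir):
    """Write the MLproject file into project_dir and return its path."""
    path = Path(project_dir) / "MLproject"
    path.write_text(MLPROJECT, encoding="utf-8")
    return path


project_dir = tempfile.mkdtemp()
mlproject_path = write_mlproject(project_dir)
print(mlproject_path.read_text(encoding="utf-8"))
```

For a Docker environment, the `conda_env` line would be replaced by a `docker_env` section naming an image. A run is then one execution of an entry point with concrete parameter values.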

Code example
mlflow run git@github.com:mlflow/mlflow-example.git -P alpha=0.5
mlflow run <uri> -m databricks --cluster-spec <json-cluster-spec>
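
The first command above can also be assembled from a script, which helps when sweeping over several parameter values. The following is a sketch only: `build_mlflow_run` is a hypothetical helper, and it assumes the `-P key=value` and `-e <entry point>` options of the `mlflow run` CLI.

```python
import shlex


def build_mlflow_run(uri, params=None, entry_point=None):
    """Return the argument list for an `mlflow run` invocation."""
    argv = ["mlflow", "run", uri]
    if entry_point is not None:
        argv += ["-e", entry_point]
    for key, value in (params or {}).items():
        # Each project parameter becomes its own -P key=value pair.
        argv += ["-P", f"{key}={value}"]
    return argv


argv = build_mlflow_run(
    "git@github.com:mlflow/mlflow-example.git", params={"alpha": 0.5}
)
print(shlex.join(argv))
```

When the MLflow CLI is installed, the list can be handed to `subprocess.run(argv, check=True)`. Building a list instead of a single string avoids shell quoting problems with parameter values.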

MLFlow Models
• Packaging format for ML Models
• Defines dependencies for
reproducibility
• Deployment APIs

Motivation
ML Frameworks
Batch & Stream Scoring
Serving tools
Inference code

Key concepts
• MLmodel file
• Storage format
• Entry points
• Flavours
• Custom model
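
As an illustration of the MLmodel file and its flavours, the sketch below holds the text of a sample MLmodel file for a scikit-learn model and lists the flavours it declares. The version numbers and file names are illustrative placeholders, and `list_flavours` is a hypothetical helper with deliberately naive parsing so that the example needs no YAML library. Note that the key in the file is spelled `flavors`.

```python
# Sample MLmodel contents for a scikit-learn model saved with two flavours.
# Version numbers and file names are illustrative placeholders.
MLMODEL = """\
time_created: 2018-05-25T17:28:53.35
flavors:
  sklearn:
    sklearn_version: 0.19.1
    pickled_model: model.pkl
  python_function:
    loader_module: mlflow.sklearn
"""


def list_flavours(mlmodel_text):
    """Return the flavour names declared under the top-level `flavors` key.

    Deliberately naive: relies on two-space indentation instead of a
    YAML parser, to keep the sketch dependency-free.
    """
    flavours = []
    in_flavors = False
    for line in mlmodel_text.splitlines():
        if not line.startswith(" "):
            # A top-level key starts or ends the flavors section.
            in_flavors = line.strip() == "flavors:"
        elif in_flavors and line.startswith("  ") and not line.startswith("   "):
            flavours.append(line.strip().rstrip(":"))
    return flavours


print(list_flavours(MLMODEL))
```

A tool that only understands the generic `python_function` flavour can still load such a model, while scikit-learn aware tools can use the `sklearn` flavour directly.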

MLFlow rocks!
• Log important parameters,
metrics, and other data that is
important to the machine
learning model
• Track the environment a model
is run on
• Run any machine learning code
on that environment
• Deploy and export models to
various platforms with multiple
packaging formats

MLFlow 1.0 (4-jun)
• Support for step tracking
• Improved Search features
• Batched logging of metrics
• Support for HDFS
• Windows support for the client
• Build Docker images to deploy
• ONNX model flavour

MLFlow last updates (1.4: 31-oct)
• Windows support
• Tags and descriptions
• Google Cloud run models
• Log directories as artifacts
• CLI command to export to CSV
• Keras compatibility with TF 2.0
• Model Registry in preview

The stack and code presented in this talk will enable fellow data scientists to accelerate their data science development, leading to quicker experimentation and, therefore, to faster innovation of products with machine learning at their core.

Enabling Scalable Data Science Pipeline with Mlflow at Thermo Fisher Scientific

Databricks

Open, Secure & Transparent AI Pipelines

Nick Pentreath

AI algorithms offer great promise in criminal justice, credit scoring, hiring and other domains. However, algorithmic fairness is a legitimate concern. Possible bias and adversarial contamination can come from training data, inappropriate data handling/model selection or incorrect algorithm design. This talk discusses how to build an open, transparent, secure and fair pipeline that fully integrates into the AI lifecycle — leveraging open-source projects such as AI Fairness 360 (AIF360), Adversarial Robustness Toolbox (ART), the Fabric for Deep Learning (FfDL) and the Model Asset eXchange (MAX).

Kubeflow: portable and scalable machine learning using Jupyterhub and Kuberne...

Akash Tandon

ML solutions in production start from data ingestion and extend upto the actual deployment step. We want this workflow to be scalable, portable and simple. Containers and kubernetes are great at the former two but not the latter if you aren't a devops practitioner. We'll explore how you can leverage the Kubeflow project to deploy best-of-breed open-source systems for ML to diverse infrastructures.

Databricks for MLOps Presentation (AI/ML)

Knoldus Inc.

TensorFlow 101

Raghu Rajah

This document provides an overview of Machine Learning with TensorFlow 101. It introduces TensorFlow, describing its programming model and how it uses computational graphs for distributed execution. It then gives a simplified view of machine learning, and provides examples of linear regression and deep learning with TensorFlow. The presenter is an entrepreneur who has been dabbling with machine learning for the past 3 years using tools like Spark, H2O.ai and TensorFlow.

MLflow: Infrastructure for a Complete Machine Learning Life Cycle

Databricks

Dmitry Spodarets: Modern MLOps toolchain 2023

Lviv Startup Club

Team Data Science Process Presentation (TDSP), Aug 29, 2017

Debraj GuhaThakurta

Pythonsevilla2019 - Introduction to MLFlow

1. Introduction to MLFlow: Accelerating the Machine Learning Lifecycle

2. Who am I? Fernando Ortega Gallego, Data Engineer @ Plain Concepts UK • Interests & hobbies: – Machine Learning and NLP – Python, Azure and my daughter – Honda biker • @FernanOrtega • fgallego@plainconcepts.com

3. Roadmap Introduction MLFlow Tracking MLFlow Projects MLFlow Models Conclusions

4. Roadmap Introduction MLFlow Tracking MLFlow Projects MLFlow Models Conclusions

5. Machine Learning Lifecycle (diagram): Raw data, Data prep, Train, Deploy

6. Problems • Reproduce experiments • Compare experiments • Fine-tune previous experiments across teams • Share data, parameters or metrics • Deploy trained models

7. It is difficult to productionize and share

8. Custom ML Platforms • Facebook FBLearner, Google TFX, Uber Michelangelo • Advantages: – Standardise the ML loop • Disadvantages: – Limited to a few algorithms or frameworks – Tied to the company’s infrastructure

9. Are there any similar open solutions?

11. Introducing MLFlow • It works with any ML library & language • It runs the same way anywhere • It is designed to be useful for one person, small teams or big teams

12. MLFlow community

13. Supported Integrations: June ’19

14. MLFlow components: Tracking, Projects, Models

15. Roadmap Introduction MLFlow Tracking MLFlow Projects MLFlow Models Conclusions

16. MLFlow Tracking • Record and query experiments: – Data – Code – Model parameters – Results (performance metrics) – Model

17. Motivation: a centralised repository of useful information for analysing several training runs.

18. How it works

19. Key concepts • Parameters • Metrics • Tags and notes • Artifacts • Source • Version

20. Code example

21. Demo

22. Roadmap Introduction MLFlow Tracking MLFlow Projects MLFlow Models Conclusions

23. MLFlow Projects • Packaging format for reproducible ML runs • Defines dependencies for reproducibility • Execution API for running projects locally or remotely

24. Motivation: a diverse set of tools and environments makes it difficult to productionize and share ML work.

25. How it works

26. Key concepts • MLproject file • Entry points • Environments – Conda – Docker – System • Run

27. Code example
mlflow run git@github.com:mlflow/mlflow-example.git -P alpha=0.5
mlflow run <uri> -m databricks --cluster-spec <json-cluster-spec>
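
To make the MLproject file, entry point and Conda environment concepts more concrete, the sketch below writes a tiny project to disk using only the Python standard library. Everything inside it (the project name, the environment name, the Python version, train.py and its single alpha parameter) is hypothetical and chosen only to mirror the alpha parameter used in the first command on the slide.

```python
import tempfile
from pathlib import Path
from textwrap import dedent

# Hypothetical project definition: one entry point with one typed parameter.
MLPROJECT = dedent("""\
    name: demo-project
    conda_env: conda.yaml
    entry_points:
      main:
        parameters:
          alpha: {type: float, default: 0.5}
        command: "python train.py --alpha {alpha}"
    """)

# Hypothetical Conda environment that the project declares as its dependency set.
CONDA_ENV = dedent("""\
    name: demo-env
    channels:
      - conda-forge
    dependencies:
      - python=3.10
      - pip
    """)

# Placeholder training script: it only parses and echoes the parameter.
TRAIN_SCRIPT = dedent("""\
    import argparse

    parser = argparse.ArgumentParser()
    parser.add_argument("--alpha", type=float, default=0.5)
    args = parser.parse_args()
    print(f"training with alpha={args.alpha}")
    """)


def write_project(target_dir: str) -> Path:
    """Write the three project files into target_dir and return its path."""
    root = Path(target_dir)
    root.mkdir(parents=True, exist_ok=True)
    (root / "MLproject").write_text(MLPROJECT)
    (root / "conda.yaml").write_text(CONDA_ENV)
    (root / "train.py").write_text(TRAIN_SCRIPT)
    return root


if __name__ == "__main__":
    project = write_project(tempfile.mkdtemp())
    # Print the command that would run the project; it is not executed here.
    print(f"mlflow run {project} -P alpha=0.5")
```

With such a directory in place, passing its path to mlflow run together with -P alpha=0.5 should invoke the main entry point with the overridden parameter, in the same way the slide's first command does for a Git URI.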

28. Demo

29. Roadmap Introduction MLFlow Tracking MLFlow Projects MLFlow Models Conclusions

30. MLFlow Models • Packaging format for ML models • Defines dependencies for reproducibility • Deployment APIs

31. Motivation (diagram): ML frameworks, batch & stream scoring, serving tools, inference code

32. How it works

33. Key concepts • MLmodel file • Storage format • Entry points • Flavours • Custom model

34. Code example

35. Demo

36. Roadmap Introduction MLFlow Tracking MLFlow Projects MLFlow Models Conclusions

37. MLFlow rocks! • Log parameters, metrics, and other data that matter to the machine learning model • Track the environment a model runs in • Run any machine learning code in that environment • Deploy and export models to various platforms with multiple packaging formats

38. MLFlow 1.0 (4 June) • Support for step tracking • Improved search features • Batched logging of metrics • Support for HDFS • Windows support for the client • Build Docker images to deploy • ONNX model flavour

39. MLFlow latest updates (1.4: 31 October) • Windows support • Tags and descriptions • Google Cloud run models • Log directories as artifacts • CLI command to export to CSV • Keras compatibility with TF 2.0 • Model Registry in preview

40. Future of MLFlow

Editor's Notes

The backend stores are an entity store (file store or database) and an artifact repository (S3, Azure Blob Storage, Google Cloud Storage, DBFS).
Parameters: key-value inputs to your code • Metrics: numeric values • Tags and notes: additional information about a run • Artifacts: arbitrary files, including data and models • Source: code • Version: version of the code
Conclusion: workflow tools can greatly simplify the ML lifecycle • Improve usability for both data scientists and engineers • Same way software dev lifecycle tools simplify development • Learn more about MLflow at mlflow.org
