Build machine learning pipelines from research to production

Streamlining your machine learning pipeline is critical for enterprise data science teams that want to deliver better business results. Accelerating the loop from data to processing to training to deployment, and back again, helps you get better-performing models faster. Watch the full presentation with audio and video here: https://info.cnvrg.io/build-machine-learning-pipelines

This presentation offers solutions to the common challenges data scientists and data engineers face when building a machine learning pipeline. We dissect each part of the pipeline and offer strategies for designing a more efficient, integrated and automated process. We cover how to connect all your data sources in one unified location, how to create modular ML components for easy reproducibility, and how to automate MLOps for quick model training and hyperparameter optimization. We also show how to streamline frequent model deployment with Kubernetes and, lastly, how to design a monitoring toolkit with Grafana and Kibana for easy CI/CD. Join Solutions Architect Aaron Schneider as he builds an end-to-end machine learning pipeline and explains how to optimize each part for a more efficient workflow.

Key webinar takeaways:
- Set up an efficient machine learning pipeline
- Learn key MLOps solutions for streamlining science and engineering
- Create reusable ML components
- Build a suite of monitoring and visualization tools
- Instantly train and deploy ML models with Kubernetes
- Use CI/CD to design an auto-adaptive machine learning pipeline

whoami
● Aaron Schneider, Solutions Architect @ cnvrg.io
● cnvrg.io = built by data scientists, for data scientists to help teams:
○ Get from data to models to production in the fastest, most efficient way
○ Bridge science and engineering
○ Automate MLOps
○ Streamline every element of their pipelines
aaron@cnvrg.io
LinkedIn: azschneider

def agenda(30 mins):
● KPIs of a well built ML Pipeline
● ML Pipeline overview
● Pipeline stage issue analysis
○ Typical issues
○ Possible solutions
● Live demo

def pipeline():
● The goal behind a pipeline is flow
● Information flows easily and efficiently between parts of the ML stack
● Automate the process
● Each step integrated with the next
● Simply execute and monitor
● Scale easily
● Minimize and automate DevOps tasks -> MLOps
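
The idea of "each step integrated with the next" can be sketched in a few lines. This is a minimal illustration, not the tooling shown in the webinar: the stage functions (load_data, process, train) and the run_pipeline helper are hypothetical names invented for this example, and the list of executed stage names stands in for real monitoring.

```python
from typing import Any, Callable, List, Tuple

# A stage is a (name, function) pair; both are assumptions for this sketch.
Stage = Tuple[str, Callable[[Any], Any]]


def run_pipeline(stages: List[Stage], payload: Any) -> Tuple[Any, List[str]]:
    """Run each stage in order, feeding one stage's output into the next."""
    executed = []
    for name, fn in stages:
        payload = fn(payload)
        executed.append(name)  # stand-in for real monitoring/logging
    return payload, executed


# Hypothetical toy stages
def load_data(_: Any) -> list:
    return [1.0, 2.0, None, 4.0]


def process(rows: list) -> list:
    return [r for r in rows if r is not None]  # drop missing values


def train(rows: list) -> dict:
    return {"mean": sum(rows) / len(rows)}  # a "model" that is just the mean


stages = [("data", load_data), ("processing", process), ("training", train)]
model, executed = run_pipeline(stages, None)
```

Because every stage has the same shape (one input, one output), stages can be swapped or re-run individually, which is what makes the later automation steps possible.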

pipeline.elements()
Data → Processing → Training → Deployment → Monitor

pipeline.elements()
Research:
01 Data
02 Processing
03 Training
04 Deployment
05 Monitor + Retrain

#Ask the right questions!!
● Major steps are similar between use cases
● By asking questions you can discover how YOUR pipeline can be optimized
● Are you doing it the best way under the circumstances?
● 2 types of questions:
○ Data Science related: questions to do with the machine learning task
○ Pipeline related: questions to do with the machine learning pipeline

pipeline.breakdown(data)
Data Science questions:
● Where are we collecting data?
○ Historical?
○ Live?
● Where will the data be stored?
● Streaming or batch or both?
Pipeline questions:
● How will all the data be integrated?
○ API calls?
○ NFS?
○ HDFS?
○ Other solutions?
● Should we, and can we, version our data?
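
One lightweight way to approach the data versioning question is to fingerprint the dataset contents. The sketch below is an assumption for illustration only (the dataset_version helper and the sample records are hypothetical, and it is not how any particular platform versions data): it hashes a canonical JSON form of the records so that identical content always maps to the same version string.

```python
import hashlib
import json


def dataset_version(records):
    """Return a short, deterministic fingerprint of JSON-serializable records."""
    # sort_keys makes the serialization independent of dict key order
    canonical = json.dumps(records, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]


v1 = dataset_version([{"id": 1, "label": "cat"}])
v2 = dataset_version([{"label": "cat", "id": 1}])  # same content, different key order
v3 = dataset_version([{"id": 1, "label": "dog"}])  # changed content
```

Here v1 and v2 match while v3 differs, so a training run can record the fingerprint it was trained on and be reproduced later against the same data.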

pipeline.breakdown(processing)
Data Science questions:
● What features are important?
● What shape should our data be in?
● What needs to be cleaned?
Pipeline questions:
● How will we automate the processing?
● Which compute will we use for processing?
○ Local?
○ Cloud?
○ On-premise?
● Should we use distributed compute (e.g. Spark)?
● How can we easily leverage compute for this step?
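
Automating processing is easier when the step is a pure function that can run unchanged on local, cloud or on-premise compute. A minimal sketch, assuming a hypothetical clean_and_scale step and toy rows (neither comes from the presentation):

```python
def clean_and_scale(rows, feature):
    """Drop rows missing `feature`, then min-max scale it to [0, 1]."""
    # Copy each row so the caller's raw data is left untouched
    kept = [dict(r) for r in rows if r.get(feature) is not None]
    if not kept:
        return []
    values = [r[feature] for r in kept]
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero when all values are equal
    for r in kept:
        r[feature] = (r[feature] - lo) / span
    return kept


raw = [{"age": 20}, {"age": None}, {"age": 40}, {"age": 30}]
processed = clean_and_scale(raw, "age")
```

Since the function has no side effects, the same code can be scheduled as a batch job or mapped over partitions of a larger dataset.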

pipeline.breakdown(training)
Data Science questions:
● What model will we use?
○ Deep learning or classic models?
● Will we be doing HPO?
● How will we compare models?
● What is our accuracy metric?
Pipeline questions:
● How will we parallelize HPO?
● Which compute will we use for training?
○ Local?
○ Cloud?
○ On-premise?
● How can we manage artifacts and models?
● How can we automate the comparison of experiments?
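
Parallel HPO and automated experiment comparison can be sketched with the standard library. Everything here is a stand-in: evaluate is a hypothetical scoring function rather than a real training run, and the grid values are arbitrary. Threads are used only to keep the example short; real CPU- or GPU-bound training would more likely run as separate processes or cluster jobs.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def evaluate(params):
    """Hypothetical stand-in for a train-and-validate run; returns a score."""
    lr, depth = params["lr"], params["depth"]
    return 1.0 - abs(lr - 0.1) - 0.01 * abs(depth - 4)


# Every combination of the two hyperparameters is one experiment
grid = [{"lr": lr, "depth": d} for lr, d in product([0.01, 0.1, 1.0], [2, 4, 8])]

# Run the experiments concurrently; map() keeps results in grid order
with ThreadPoolExecutor(max_workers=4) as pool:
    scores = list(pool.map(evaluate, grid))

# Automated comparison: pick the experiment with the highest score
best_score, best_params = max(zip(scores, grid), key=lambda pair: pair[0])
```

The comparison step is deliberately a single expression over (score, params) pairs, so the chosen metric can change without touching the scheduling code.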

pipeline.breakdown(deployment)
Data Science questions:
● Where are we deploying to?
● Batch or streaming?
● Can we autoscale?
Pipeline questions:
● How can we quickly deploy our best model?
● Can this be automated?
○ Should this be automated?
● How can we track input/output?
Webinar: Deploy your ML models to production with Kubernetes
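
Tracking input/output can start as a thin wrapper around the prediction function. The sketch below is an illustrative assumption: tracked, predict and audit_log are hypothetical names, and an in-memory list stands in for whatever log store a real deployment would use.

```python
import json
import time


def tracked(predict_fn, sink):
    """Wrap a predict function so every call appends an input/output record to `sink`."""
    def wrapper(features):
        output = predict_fn(features)
        record = {"ts": time.time(), "input": features, "output": output}
        sink.append(json.dumps(record))
        return output
    return wrapper


def predict(features):
    """Hypothetical model: a fixed threshold on the feature sum."""
    return 1 if sum(features) > 1.0 else 0


audit_log = []  # stand-in for a real log store
serve = tracked(predict, audit_log)
result = serve([0.7, 0.6])
```

Each record pairs the exact input with the output it produced, which is the raw material for the monitoring and retraining questions on the next slide.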

pipeline.breakdown(monitor+retrain)
Data Science questions:
● What are we monitoring?
○ Logs?
○ Usage?
○ Demand?
● Can we detect changes in accuracy?
Pipeline questions:
● How can we automatically trigger retraining?
● How do we redeploy with zero downtime?
● Can we take input data and export to dataset?
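
An automatic retraining trigger can be as simple as a rolling accuracy check. This is a sketch under stated assumptions: the AccuracyMonitor class, the window size and the 0.8 threshold are all hypothetical choices for illustration, and it presumes ground-truth labels eventually arrive for served predictions.

```python
from collections import deque


class AccuracyMonitor:
    """Track accuracy over the last `window` labeled predictions."""

    def __init__(self, window=5, threshold=0.8):
        self.outcomes = deque(maxlen=window)  # old outcomes fall off automatically
        self.threshold = threshold

    def record(self, prediction, actual):
        self.outcomes.append(prediction == actual)

    def accuracy(self):
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def should_retrain(self):
        # Only decide once the window is full, to avoid reacting to a few samples
        full = len(self.outcomes) == self.outcomes.maxlen
        return full and self.accuracy() < self.threshold


monitor = AccuracyMonitor(window=5, threshold=0.8)
for pred, actual in [(1, 1), (0, 0), (1, 0), (0, 1), (1, 1)]:
    monitor.record(pred, actual)
```

With three of five predictions correct, the rolling accuracy is 0.6 and should_retrain() returns True; a scheduler could poll this flag to kick off the training stage.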

pipeline.build()
● There are many tools that claim to help streamline your pipeline
● Often incomplete solutions
● Some require deep technical knowledge to even set up (defeats the purpose?)
● Make sure to research tools properly
● You may find yourself sinking time into setting up a tool that in the end doesn’t help you

webinar.summary()
● Ask questions throughout the process
● Planning is key!
● Find tools that save you time and simplify your workflow
● Make sure to link each stage of your pipeline for maximum flow
● The more seamless and efficient the pipeline, the quicker you can build and develop models

Thanks!
https://cnvrg.io
info@cnvrg.io
+972506600186
