Streamlining your machine learning pipeline is critical for enterprise data science to deliver better business results. Accelerating the cycle from data to processing to training to deployment, and back again, helps you get better-performing models faster. Watch the full presentation with audio and video here: https://info.cnvrg.io/build-machine-learning-pipelines
This presentation will offer solutions to the common challenges data scientists and data engineers face when building a machine learning pipeline.
We will dissect each part of the pipeline and offer strategies for designing machine learning pipelines that are more efficient, integrated and automated. We’ll cover how to connect all your data sources in one unified location, how to create modular ML components for easy reproducibility, how to automate MLOps for quick model training and hyperparameter optimization, and how to streamline frequent model deployment using Kubernetes. Lastly, you’ll learn to design a monitoring toolkit with Grafana and Kibana for easy CI/CD. Join Solutions Architect Aaron Schneider as he builds an end-to-end machine learning pipeline and explains how to optimize each part for a more efficient workflow.
Key webinar takeaways:
- Set up an efficient machine learning pipeline
- Learn key MLOps solutions that streamline science and engineering
- Create reusable ML components
- Build a suite of monitoring and visualization tools
- Instantly train and deploy ML models with Kubernetes
- Use CI/CD to design an auto-adaptive machine learning pipeline
Build machine learning pipelines from research to production
1.
2. whoami
● Solutions Architect @ cnvrg.io
● cnvrg.io = built by data scientists, for data scientists to help teams:
○ Get from data to models to production as quickly and efficiently as possible
○ Bridge science and engineering
○ Automate MLOps
○ Help teams streamline every element of their pipelines
Aaron Schneider
aaron@cnvrg.io
LinkedIn: azschneider
3. def agenda(30 mins):
● KPIs of a well built ML Pipeline
● ML Pipeline overview
● Pipeline stage issue analysis
○ Typical issues
○ Possible solutions
● Live demo
4. def pipeline():
● The goal behind a pipeline is flow
● Information flows easily and efficiently between parts of the ML stack
● Automate the process
● Each step integrated with the next
● Simply execute and monitor
● Scale easily
● Minimize and automate DevOps tasks -> MLOps
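The idea of each step being integrated with the next can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not how cnvrg.io or any particular tool implements it: `run_pipeline` and the three toy stage functions are hypothetical names introduced here.

```python
from typing import Any, Callable, List, Tuple

# A stage is a (name, function) pair; each function takes the previous
# stage's output and returns the input for the next stage.
Stage = Tuple[str, Callable[[Any], Any]]


def run_pipeline(stages: List[Stage], payload: Any) -> Tuple[Any, List[str]]:
    """Run the stages in order; return the final output and a log of stage names."""
    log: List[str] = []
    for name, step in stages:
        payload = step(payload)
        log.append(name)  # simple monitoring hook: record each finished stage
    return payload, log


# Hypothetical toy stages standing in for data, processing and training.
def load_data(_: Any) -> List[float]:
    return [1.0, 2.0, 3.0, 4.0]


def process(rows: List[float]) -> List[float]:
    largest = max(rows)
    return [r / largest for r in rows]  # scale values into (0, 1]


def train(rows: List[float]) -> float:
    return sum(rows) / len(rows)  # the "model" here is just the mean


stages = [("data", load_data), ("processing", process), ("training", train)]
model, log = run_pipeline(stages, None)
```

Because every stage shares the same call shape, a stage can be swapped or rerun without touching the others, which is the property the bullets above are after.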
7. #Ask the right questions!!
● Major steps are similar between use cases
● By asking questions you can discover how YOUR pipeline can be optimized
● Are you doing it the best way under the circumstances?
● 2 types of questions:
Data Science related:
Questions to do with the Machine Learning task
Pipeline related:
Questions to do with the Machine Learning pipeline
8. pipeline.breakdown(data)
Data Science questions:
● Where are we collecting data?
○ Historical?
○ Live?
● Where will the data be stored?
● Streaming or batch or both?
Pipeline questions:
● How will all the data be integrated?
○ API calls?
○ NFS?
○ HDFS?
○ Other solutions?
● Should we and can we version our data?
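On the data versioning question, one lightweight approach is to derive a version ID from the data content itself, so identical data always gets the same ID and any change produces a new one. The sketch below assumes the dataset fits in memory as JSON-serializable records; `dataset_version` is a hypothetical helper, and dedicated data versioning tools do considerably more than this.

```python
import hashlib
import json
from typing import Any, List


def dataset_version(records: List[Any]) -> str:
    """Return a short, deterministic version ID derived from the data content."""
    # sort_keys makes the serialization independent of dict key order
    canonical = json.dumps(records, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()[:12]


v1 = dataset_version([{"id": 1, "label": "cat"}])
v2 = dataset_version([{"label": "cat", "id": 1}])  # same content, different key order
v3 = dataset_version([{"id": 1, "label": "dog"}])  # changed content
```

Here `v1` and `v2` match while `v3` differs, so the ID can be stored alongside each trained model to record which data it saw.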
9. pipeline.breakdown(processing)
Data Science questions:
● What features are important?
● What shape should our data be in?
● What needs to be cleaned?
Pipeline questions:
● How will we automate the processing?
● Which compute will we use for processing?
○ Local?
○ Cloud?
○ On-premise?
● Should we use distributed compute (e.g. Spark)?
● How can we easily leverage compute for this step?
10. pipeline.breakdown(training)
Data Science questions:
● What model will we use?
○ Deep learning or classic models?
● Will we be doing HPO?
● How will we compare models?
● What is our accuracy metric?
Pipeline questions:
● How will we parallelize HPO?
● Which compute will we use for training?
○ Local?
○ Cloud?
○ On-premise?
● How can we manage artifacts and models?
● How can we automate the comparison of experiments?
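The questions about parallelizing HPO and automating experiment comparison can be illustrated with the standard library alone. In this sketch, `evaluate` is a hypothetical stand-in for "train a model and return its validation score", and a thread pool is used only to keep the example self-contained; real training jobs would more likely run in separate processes or on cluster nodes.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product
from typing import Dict, List, Tuple


def evaluate(params: Dict[str, float]) -> float:
    """Hypothetical stand-in for training a model and scoring it."""
    # Toy objective that peaks at lr=0.1, depth=4.
    return -((params["lr"] - 0.1) ** 2) - ((params["depth"] - 4) ** 2)


def grid_search(
    grid: Dict[str, List[float]], workers: int = 4
) -> Tuple[Dict[str, float], float]:
    """Evaluate every combination in parallel; return the best (params, score)."""
    names = list(grid)
    candidates = [dict(zip(names, values)) for values in product(*grid.values())]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        scores = list(pool.map(evaluate, candidates))  # map preserves input order
    best = max(range(len(candidates)), key=scores.__getitem__)
    return candidates[best], scores[best]


best_params, best_score = grid_search({"lr": [0.01, 0.1, 1.0], "depth": [2, 4, 8]})
```

Comparing experiments is then a matter of agreeing on a single score per run, which is why the accuracy metric question needs an answer before the automation does.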
11. pipeline.breakdown(deployment)
Data Science questions:
● Where are we deploying to?
● Batch or streaming?
● Can we autoscale?
Pipeline questions:
● How can we quickly deploy our best model?
● Can this be automated?
○ Should this be automated?
● How can we track input/output?
Webinar: Deploy your ML models to production with Kubernetes
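For the input/output tracking question on this slide, one simple pattern is to wrap the model's predict function so that every call is recorded. The sketch below is an assumption-laden illustration: `with_io_log` and `toy_model` are hypothetical names, and the in-memory list stands in for whatever log store a real deployment would use.

```python
import json
import time
from typing import Any, Callable, List


def with_io_log(predict: Callable[[Any], Any], log: List[str]) -> Callable[[Any], Any]:
    """Wrap a predict function so every call appends a JSON record to `log`."""

    def wrapped(features: Any) -> Any:
        result = predict(features)
        log.append(json.dumps({"ts": time.time(), "input": features, "output": result}))
        return result

    return wrapped


# Hypothetical model: classify by a fixed threshold on the feature sum.
def toy_model(features: List[float]) -> int:
    return int(sum(features) > 1.0)


records: List[str] = []
predict = with_io_log(toy_model, records)
label = predict([0.7, 0.6])
```

The logged records double as raw material for monitoring and for exporting production inputs back into a training dataset.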
12. pipeline.breakdown(monitor+retrain)
Data Science questions:
● What are we monitoring?
○ Logs?
○ Usage?
○ Demand?
● Can we detect accuracy?
Pipeline questions:
● How can we automatically trigger retraining?
● How do we redeploy with zero downtime?
● Can we take input data and export to dataset?
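An automatic retraining trigger can be as simple as watching accuracy over a rolling window of recent predictions. This is a minimal sketch assuming ground-truth labels eventually arrive for production predictions; `RetrainMonitor`, the window size and the threshold are all hypothetical choices, not values from the presentation.

```python
from collections import deque
from typing import Any, Deque


class RetrainMonitor:
    """Track recent prediction outcomes and flag when accuracy drops too low."""

    def __init__(self, window: int = 100, threshold: float = 0.8) -> None:
        self.outcomes: Deque[bool] = deque(maxlen=window)  # keeps only the latest results
        self.threshold = threshold

    def record(self, prediction: Any, actual: Any) -> None:
        self.outcomes.append(prediction == actual)

    def accuracy(self) -> float:
        return sum(self.outcomes) / len(self.outcomes) if self.outcomes else 1.0

    def should_retrain(self) -> bool:
        # Only decide once the window is full, to avoid reacting to a few samples.
        full = len(self.outcomes) == self.outcomes.maxlen
        return full and self.accuracy() < self.threshold


monitor = RetrainMonitor(window=5, threshold=0.8)
for predicted, actual in [(1, 1), (0, 0), (1, 0), (1, 0), (0, 0)]:
    monitor.record(predicted, actual)
```

With three of five predictions correct, `monitor.should_retrain()` returns `True`; in a real pipeline that signal would kick off the training stage rather than just being read.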
15. pipeline.build()
● There are many tools that claim to help streamline your pipeline
● Often incomplete solutions
● Some require deep technical knowledge to even set up (defeats the purpose?)
● Make sure to research tools properly
● You may find yourself sinking time into setting up a tool that in the end doesn’t help you
17. webinar.summary()
● Ask questions throughout the process
● Planning is key!
● Find tools that save you time and simplify your workflow
● Make sure to link each stage of your pipeline for maximum flow
● The more seamless and efficient the pipeline, the quicker you can build and develop models