Kubeflow.pptx
1.
2. Few of the major challenges in the enterprise machine learning practitioner space
• Data scientists tend to prefer their own unique cluster of libraries and tools.
• Data scientists need to interoperate with the rest of the enterprise for resources, data, and model
deployment.
• Many organizations use machine learning locally but are unable to deploy it to production, or can
deploy models but are unable to manage them effectively.
• Data science work consists largely of scripts, as opposed to full-fledged applications, which makes
deploying them as a production workflow from scratch that much harder.
• Need: operating a platform that supports the different constituents using the system is hard.
A Shared Multitenant Machine Learning Environment
3. Few of the major challenges in the enterprise machine learning practitioner space
• Common developer stack
• Jupyter Notebook
• Python code
• Many of their tools are Python-based, where Python environments prove difficult to manage across a
large organization
• Dependency management with containers, typically Docker
• Many times, data scientists want to build containers and then just “move them around” to take
advantage of more powerful resources beyond their laptop, either in the on-premise datacenter or in
the cloud.
• Wrapping Python machine learning models in containers is how many people start putting
initial models into production.
-> Deploying the container in a pod on Kubernetes is a natural next step.
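A minimal sketch of such a container, assuming the model is served by a small Python HTTP script; `model.pkl` and `serve.py` are placeholder names for illustration:

```dockerfile
# Hypothetical image wrapping a Python model behind a small HTTP server.
FROM python:3.11-slim

WORKDIR /app

# Install pinned dependencies first so this layer is cached between builds.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# model.pkl and serve.py are placeholder names for illustration.
COPY model.pkl serve.py ./

EXPOSE 8080
CMD ["python", "serve.py"]
```

Once built, the same image runs unchanged on a laptop, in the on-premise datacenter, or in the cloud.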
4. Solution
An operational pattern similar to the “lab and the factory” pattern
We want our data scientists to be able to explore new data ideas in “the lab,” and once they find a setup
they like, move it to “the factory” so that the modeling workflow can be run consistently and in a
reproducible manner
• This requires building a multitenant machine learning platform.
5. Kubeflow
• Kubeflow solves the problem of how to take machine learning from research to production.
• Kubeflow is a collection of cloud native tools for all of the stages of the MDLC (model development
life cycle):
• data exploration: Jupyter notebooks
• feature preparation: Apache Spark / TensorFlow Transform
• model training/tuning: a variety of distributed training frameworks (TensorFlow, PyTorch, Apache MXNet,
XGBoost)
• hyperparameter tuning: Katib
• and model versioning
• Two key systems that are great examples of the value Kubeflow provides beyond Kubernetes are
the notebook system in Kubeflow, and Kubeflow Pipelines.
6. Kubeflow
• Composability : Gives the freedom to mix and match the best tools for the job.
• Portability : Write once, reproduce and run everywhere
• Scalability : important as your dataset grows
7. Team alignment
• Line of business : The part of the company that intends to use the results of the machine learning
model to produce revenue for the organization
• DevOps : The group responsible for making sure the platform is operational and secure
• Data engineering : The group responsible for getting data from the system of record (or data
warehouse) and converting it into a form for the data science team to use
• Data science : The team responsible for building and testing the machine learning model
8. Hidden Technical Debt in Machine Learning Systems
D. Sculley, Gary Holt, Daniel Golovin, Eugene Davydov, Todd Phillips
12. Kubeflow Pipelines
• One of the most important components in the Kubeflow platform.
• It converts a notebook into a pipeline.
• A Kubeflow pipeline:
• Write the ML code, dockerize it, then use the Kubeflow DSL to write, compile, upload, and run
the pipeline.
A machine learning platform, designed as a way to run machine learning workflows on Kubernetes.
This diagram comes from a scientific paper published in 2015, illustrating that ML code is only a small part of a much larger and more complex process. Everything around it is what is needed to get to production (monitoring, logging, testing, resource management, ...).
ML tools: at this level, Kubeflow lets you work with any type of ML framework or library, thanks in particular to the Kubernetes operator mechanism.
A Kubernetes operator is an application-specific controller that extends the Kubernetes API to create, configure, and manage instances of complex applications on behalf of a Kubernetes user.
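As an illustration, this is roughly what a custom resource handled by Kubeflow's TensorFlow training operator looks like (abbreviated; the image name is a placeholder):

```yaml
# Abbreviated TFJob custom resource; the training operator watches
# resources of this kind and creates the underlying worker pods.
apiVersion: kubeflow.org/v1
kind: TFJob
metadata:
  name: mnist-train
spec:
  tfReplicaSpecs:
    Worker:
      replicas: 2
      template:
        spec:
          containers:
          - name: tensorflow
            image: registry.example.com/mnist-train:latest  # placeholder image
```

The user only declares the desired training job; the operator handles pod creation, restarts, and cleanup.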
The second layer is really the core of the workflow, and it is divided into two parts:
A first part with capabilities such as:
Kale: converts notebooks into pipelines
Pipelines itself, which orchestrates the workflow
Jupyter Notebooks:
Katib:
Kubeflow UI: visually performs tasks such as creating pipelines, launching hyperparameter optimization jobs with Katib, or launching Jupyter notebook servers.
A second part in the form of add-ons for monitoring and the service mesh.
The last layer is the platform layer: Kubeflow is cloud agnostic and can run on a local machine, on on-prem solutions (KubeOne), or on a CSP.
A high-level overview of what we are going to implement:
We start with a Jupyter notebook (a local environment where you can develop your models or your training algorithms).
Then, use Kale to annotate notebook cells and convert the notebook into a scalable pipeline.
Then spin up a hyperparameter tuning job (using Katib) to run hundreds or thousands of parallel pipelines.
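A Katib tuning job is declared as an Experiment resource; an abbreviated sketch (the `trialTemplate`, which describes the pipeline or job each trial runs, is omitted, and all names are placeholders):

```yaml
# Abbreviated Katib Experiment: random search over a learning rate,
# maximizing a reported "accuracy" metric across parallel trials.
apiVersion: kubeflow.org/v1beta1
kind: Experiment
metadata:
  name: random-search-demo
spec:
  objective:
    type: maximize
    objectiveMetricName: accuracy
  algorithm:
    algorithmName: random
  parallelTrialCount: 3
  maxTrialCount: 12
  parameters:
  - name: lr
    parameterType: double
    feasibleSpace:
      min: "0.001"
      max: "0.1"
  # trialTemplate (what each trial actually runs) omitted for brevity
```

Katib then launches trials in parallel until the trial budget is exhausted, recording each trial's metric.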
Once done, use Kale again to select the best model from the resulting workload.
Then, serve this model for everyone to use.
Each step is backed by a PVC (a Kubernetes-native PersistentVolumeClaim).
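A PVC that could back the pipeline's working data is a standard Kubernetes resource; the name and size here are illustrative:

```yaml
# PersistentVolumeClaim shared by the pipeline steps so that data
# produced by one step is visible to the next.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pipeline-workspace   # placeholder name
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
```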
First step: create a development environment (a new notebook server): a full Jupyter environment ready to use.
Kale will help add annotations at the Jupyter notebook level, for the dependencies and the
because data science and machine learning are inherently pipeline processes:
a set of steps to reproduce each time.
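That "set of reproducible steps" idea can be sketched in plain Python, with no Kubeflow required; the step functions below are hypothetical stand-ins:

```python
# Plain-Python sketch of a reproducible pipeline: each step is a pure
# function, and the pipeline is just their fixed composition.
from functools import reduce


def ingest(_):
    # Stand-in for reading raw data.
    return [3.0, 1.0, 2.0]


def preprocess(data):
    # Stand-in for cleaning/normalizing.
    return sorted(data)


def train(data):
    # Stand-in for fitting a model; here just an average.
    return sum(data) / len(data)


PIPELINE = [ingest, preprocess, train]


def run(pipeline, source=None):
    # Running the same steps on the same input reproduces the same result.
    return reduce(lambda value, step: step(value), pipeline, source)


print(run(PIPELINE))  # -> 2.0
```

Kubeflow Pipelines applies the same idea at cluster scale: each function becomes a container, and the composition becomes an orchestrated workflow.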