Building Notebook-based AI Pipelines with Elyra and Kubeflow


A typical machine learning pipeline begins as a series of preprocessing steps followed by experimentation, optimization and model-tuning, and, finally, deployment. Jupyter notebooks have become a hugely popular tool for data scientists and other machine learning practitioners to explore and experiment as part of this workflow because of the flexibility and interactivity they provide. With notebooks, however, it is often a challenge to move from the experimentation phase to a robust, modular, production-grade end-to-end AI pipeline.
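To make the idea of a modular, notebook-based pipeline more concrete, here is a minimal sketch that treats each notebook as a node in a dependency graph and derives a valid run order. The notebook file names and the `execution_order` helper are hypothetical and chosen only for illustration; this is not Elyra's pipeline format or API, just one way to picture notebooks chained into a workflow.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical pipeline: each key is a notebook, each value is the set of
# notebooks that must finish before it can run. The names are illustrative
# and do not come from the talk.
pipeline = {
    "load_data.ipynb": set(),
    "analyze.ipynb": {"load_data.ipynb"},
    "process.ipynb": {"load_data.ipynb"},
    "train.ipynb": {"process.ipynb"},
    "deploy.ipynb": {"train.ipynb"},
}


def execution_order(dag):
    """Return one valid order in which every notebook follows its dependencies.

    Raises graphlib.CycleError if the dependencies contain a cycle.
    """
    return list(TopologicalSorter(dag).static_order())


order = execution_order(pipeline)
print(order)
```

Keeping each step in its own notebook with explicit dependencies is what lets a scheduler run the steps as separate batch jobs instead of one long interactive session.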


  1. Notebook-based AI Pipelines with Elyra and Kubeflow
     Nick Pentreath, Principal Engineer, IBM (@MLnick)
  2. About (DEG / Nov 18, 2020 / © 2020 IBM Corporation)
     – @MLnick on Twitter, GitHub, LinkedIn
     – Principal Engineer, IBM CODAIT (Center for Open-Source Data & AI Technologies)
     – Machine Learning & AI
     – Apache Spark committer & PMC
     – Author of Machine Learning with Spark
     – Various conferences & meetups
  3. Improving the Enterprise AI Lifecycle in Open Source
     – CODAIT aims to make AI solutions dramatically easier to create, deploy, and manage in the enterprise.
     – We contribute to and advocate for the open-source technologies that are foundational to IBM’s AI offerings.
     – 30+ open-source developers!
     Center for Open Source Data & AI Technologies (CODAIT), Open Source @ IBM, codait.org
  4. Agenda
     – Machine learning workflow
     – JupyterLab & Elyra
     – Demo
     – Conclusion
  5. Machine Learning Workflow
     Data → Analyze → Process → Train → Deploy → Predict & Maintain
  6. Workflow spans teams …
     Stages: Data → Analyze → Process → Train → Deploy → Predict & Maintain
     Teams: Data Engineers; Data Scientists & Researchers; Machine Learning & Production Engineers
  7. … and tools
     Stages: Data → Analyze → Process → Train → Deploy
     Teams: Data Engineers; Data Scientists & Researchers; Machine Learning & Production Engineers
     – Data formats: CSV, SQL; JSON, Parquet, AVRO; Binary (image, audio); …
     – Analysis & data viz: ggplot, dplyr, matplotlib, Pandas, SparkSQL, …
     – Pre-processing & pipelines: dplyr, pandas, scikit-learn, SparkSQL / SparkML, …
     – Frameworks: R, scikit-learn; SparkML; TensorFlow; PyTorch; LightGBM, XGBoost; …
     – Formats & mechanisms: Variety of formats, Containers, …
  8. Iteration & Experimentation
     Stages: Data → Analyze → Process → Train → Deploy
     Data Scientists & Researchers: Load, Clean, Explore, Interpret, Refine
  9. Iteration & Experimentation
     Stages: Data → Analyze → Process → Train → Deploy
     Data Scientists & Researchers: Extract features, Pre-process, Train, Evaluate, Refine
  10. Interactive Notebooks
      Notebooks have become the de-facto standard for content-rich, interactive & iterative work
      * Logos are trademarks of their respective projects
  11. Elyra Overview
      Elyra is a set of AI-centric extensions to JupyterLab Notebooks
      * Logos are trademarks of their respective projects
  12. Elyra Key Features
      – Visual Pipeline Editor: Visual editor for building AI pipelines, enabling the conversion of multiple notebooks into batch jobs or workflows.
      – Notebooks as batch jobs
      – Python script execution
      – Automated Table of Contents
      – Code Snippets
      – Git integration
  13. Elyra Key Features
      – Notebooks as batch jobs: Extends the notebook UI to simplify the submission of notebooks as a batch job for model training
  14. Elyra Key Features
      – Python script execution: Edit and execute Python scripts against local or cloud-based resources
  15. Elyra Key Features
      – Automated Table of Contents: Generate & navigate table of contents from notebooks & Python scripts
  16. Elyra Key Features
      – Code Snippets: Easy creation and insertion of reusable code snippets for various languages
  17. Elyra Key Features
      – Git integration: Track project changes and share among teammates
  18. Getting started with Elyra
      1. Try Elyra from Binder: ibm.biz/elyra-demo
      2. Run Elyra from Docker: ibm.biz/elyra-docker-installation
      3. Install Elyra on your local machine: ibm.biz/elyra-installation
  20. Start using Elyra today!
      – Getting started with Elyra: ibm.biz/elyra-installation
      – Elyra on GitHub: github.com/elyra-ai/elyra
      – Elyra Notebook projects on GitHub: github.com/CODAIT/flight-delay-notebooks, github.com/CODAIT/covid-notebooks
      – Contributing to the projects: star and fork, submit bug reports, suggest improvements, help with code reviews, join our community meetings
      – Links: ibm.biz/elyra-demo, gitter.im/elyra-ai/community
  21. Thank you
      – codait.org
      – twitter.com/codait_org
      – github.com/CODAIT
      – developer.ibm.com
      – Check out the Data Asset Exchange: https://ibm.biz/data-exchange
      – Sign up for IBM Cloud: https://ibm.biz/Bdqkfg
  23. Feedback
      Your feedback is important to us. Don’t forget to rate and review the sessions.
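
As a rough illustration of the "notebooks as batch jobs" feature from slide 13, the sketch below builds a command that executes a notebook non-interactively with the `jupyter nbconvert` CLI. This is a generic way to run a notebook headlessly and is not necessarily how Elyra submits jobs; the helper names and the notebook file names are hypothetical.

```python
import subprocess


def nbconvert_command(notebook_path, output_name):
    """Build a `jupyter nbconvert` command that executes a notebook headlessly."""
    return [
        "jupyter", "nbconvert",
        "--to", "notebook",        # write the result as a notebook
        "--execute", notebook_path,  # run all cells top to bottom
        "--output", output_name,   # base name of the executed copy
    ]


def run_as_batch(notebook_path, output_name):
    """Run the notebook as a batch job; raises CalledProcessError on failure."""
    subprocess.run(nbconvert_command(notebook_path, output_name), check=True)


# Hypothetical file names, for illustration only.
cmd = nbconvert_command("train.ipynb", "train_executed")
print(" ".join(cmd))
```

Calling `run_as_batch("train.ipynb", "train_executed")` would require Jupyter and nbconvert to be installed and the notebook to exist; building the command separately keeps the sketch easy to inspect without running anything.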
