Scaling AI/ML with Containers and Kubernetes Tushar Katarki
AI is popular and yet faces several challenges in the industry: 1) self-service and automation 2) Deployment into production 3) Access to data. These challenges can be addressed with containers and Kubernetes. They help you build AI-as-a-service with open source tools and Kuberentes. Data Scientists can use the service for data, experimentation and to deliver models into production iteratively with self-service and automation. Using Kubernetes, one is able to run massive machine learning pipelines iteratively in an automated fashion that can be repeated.
Data Scientists and Machine Learning practitioners, nowadays, seem to be churning out models by the dozen and they continuously experiment to find ways to improve their accuracies. They also use a variety of ML and DL frameworks & languages , and a typical organization may find that this results in a heterogenous, complicated bunch of assets that require different types of runtimes, resources and sometimes even specialized compute to operate efficiently.
But what does it mean for an enterprise to actually take these models to "production" ? How does an organization scale inference engines out & make them available for real-time applications without significant latencies ? There needs to be different techniques for batch (offline) inferences and instant, online scoring. Data needs to be accessed from various sources and cleansing, transformations of data needs to be enabled prior to any predictions. In many cases, there maybe no substitute for customized data handling with scripting either.
Enterprises also require additional auditing and authorizations built in, approval processes and still support a "continuous delivery" paradigm whereby a data scientist can enable insights faster. Not all models are created equal, nor are consumers of a model - so enterprises require both metering and allocation of compute resources for SLAs.
In this session, we will take a look at how machine learning is operationalized in IBM Data Science Experience (DSX), a Kubernetes based offering for the Private Cloud and optimized for the HortonWorks Hadoop Data Platform. DSX essentially brings in typical software engineering development practices to Data Science, organizing the dev->test->production for machine learning assets in much the same way as typical software deployments. We will also see what it means to deploy, monitor accuracies and even rollback models & custom scorers as well as how API based techniques enable consuming business processes and applications to remain relatively stable amidst all the chaos.
Speaker
Piotr Mierzejewski, Program Director Development IBM DSX Local, IBM
ODSC East 2020 Accelerate ML Lifecycle with Kubernetes and Containerized Da...Abhinav Joshi
This deck provide an overview of containers and Kubernetes, and how these technologies can help solve the challenges faced by data scientists, ML engineers, and application developers. Next, it showcases the key capabilities required in a containers and kubernetes platform to help data scientists easily use technologies like Jupyter Notebooks, ML frameworks, programming languages to innovate faster. Finally it discusses the available platform options (e.g. KubeFlow, Open Data Hub, etc.), and some examples of how data scientists are accelerating their ML initiatives with containers and kubernetes platform.
Architecting an Open Source AI Platform 2018 editionDavid Talby
How to build a scalable AI platform using open source software. The end-to-end architecture covers data integration, interactive queries & visualization, machine learning & deep learning, deploying models to production, and a full 24x7 operations toolset in a high-compliance environment.
Software engineering practices for the data science and machine learning life...DataWorks Summit
With the advent of newer frameworks and toolkits, data scientists are now more productive than ever and starting to prove indispensable to enterprises. Typical organizations have large teams of data scientists who build out key analytics assets that are used on a daily basis and an integral part of live transactions. However, there is also quite a lot of chaos and complexities that get introduced because of the state of the industry. Many packages used by data scientists are from open source, and even if they are well curated, there is a growing tendency to pick out the cutting-edge or unstable packages and frameworks to accelerate analytics. Different data scientists may use different versions of runtimes, different Python or R versions, or even different versions of the same packages. Predominantly data scientists work on their laptops and it becomes difficult to reproduce their environments for use by others. Since data science is now a team sport across multiple personas, involving non-practitioners, traditional application developers, execs, and IT operators, how does an enterprise create a platform for productive cross-role collaboration?
Enterprises need a very reliable and repeatable process, especially when it results in something that affects their production environments. They also require a well managed approach that enables the graduation of an asset from development through a testing and staging process to production. Given the pace of businesses nowadays, the process needs to be quite agile and flexible too—even enabling an easy path to reversing a change. Compliance and audit processes require clear lineage and history as well as approval chains.
In the traditional software engineering world, this lifecycle has been well understood and best practices have been followed for ages. But what does it mean when you have non-programmers or users who are not really trained in software engineering philosophies or who perceive all of this as "big process" roadblocks in their daily work ? How do you we engage them in a productive manner and yet support enterprise requirements for reliability, tracking, and a clear continuous integration and delivery practice? The presenters, in this session, will bring up interesting techniques based on their user research, real life customer interviews, and productized best practices. The presenters also invite the audience to share their stories and best practices to make this a lively conversation.
Speaker
Sriram Srinivasan, Senior Technical Staff Member, Analytics Platform Architect, IBM
The ODAHU project is focused on creating services, extensions for third party systems and tools which help to accelerate building enterprise level systems with automated AI/ML models life cycle.
mlflow: Accelerating the End-to-End ML lifecycleDatabricks
Building and deploying a machine learning model can be difficult to do once. Enabling other data scientists (or yourself, one month later) to reproduce your pipeline, to compare the results of different versions, to track what’s running where, and to redeploy and rollback updated models is much harder.
In this talk, I’ll introduce MLflow, a new open source project from Databricks that simplifies the machine learning lifecycle. MLflow provides APIs for tracking experiment runs between multiple users within a reproducible environment, and for managing the deployment of models to production. MLflow is designed to be an open, modular platform, in the sense that you can use it with any existing ML library and development process. MLflow was launched in June 2018 and has already seen significant community contributions, with over 50 contributors and new features including language APIs, integrations with popular ML libraries, and storage backends. I’ll show how MLflow works and explain how to get started with MLflow.
Scaling AI/ML with Containers and Kubernetes Tushar Katarki
AI is popular and yet faces several challenges in the industry: 1) self-service and automation 2) Deployment into production 3) Access to data. These challenges can be addressed with containers and Kubernetes. They help you build AI-as-a-service with open source tools and Kuberentes. Data Scientists can use the service for data, experimentation and to deliver models into production iteratively with self-service and automation. Using Kubernetes, one is able to run massive machine learning pipelines iteratively in an automated fashion that can be repeated.
Data Scientists and Machine Learning practitioners, nowadays, seem to be churning out models by the dozen and they continuously experiment to find ways to improve their accuracies. They also use a variety of ML and DL frameworks & languages , and a typical organization may find that this results in a heterogenous, complicated bunch of assets that require different types of runtimes, resources and sometimes even specialized compute to operate efficiently.
But what does it mean for an enterprise to actually take these models to "production" ? How does an organization scale inference engines out & make them available for real-time applications without significant latencies ? There needs to be different techniques for batch (offline) inferences and instant, online scoring. Data needs to be accessed from various sources and cleansing, transformations of data needs to be enabled prior to any predictions. In many cases, there maybe no substitute for customized data handling with scripting either.
Enterprises also require additional auditing and authorizations built in, approval processes and still support a "continuous delivery" paradigm whereby a data scientist can enable insights faster. Not all models are created equal, nor are consumers of a model - so enterprises require both metering and allocation of compute resources for SLAs.
In this session, we will take a look at how machine learning is operationalized in IBM Data Science Experience (DSX), a Kubernetes based offering for the Private Cloud and optimized for the HortonWorks Hadoop Data Platform. DSX essentially brings in typical software engineering development practices to Data Science, organizing the dev->test->production for machine learning assets in much the same way as typical software deployments. We will also see what it means to deploy, monitor accuracies and even rollback models & custom scorers as well as how API based techniques enable consuming business processes and applications to remain relatively stable amidst all the chaos.
Speaker
Piotr Mierzejewski, Program Director Development IBM DSX Local, IBM
ODSC East 2020 Accelerate ML Lifecycle with Kubernetes and Containerized Da...Abhinav Joshi
This deck provide an overview of containers and Kubernetes, and how these technologies can help solve the challenges faced by data scientists, ML engineers, and application developers. Next, it showcases the key capabilities required in a containers and kubernetes platform to help data scientists easily use technologies like Jupyter Notebooks, ML frameworks, programming languages to innovate faster. Finally it discusses the available platform options (e.g. KubeFlow, Open Data Hub, etc.), and some examples of how data scientists are accelerating their ML initiatives with containers and kubernetes platform.
Architecting an Open Source AI Platform 2018 editionDavid Talby
How to build a scalable AI platform using open source software. The end-to-end architecture covers data integration, interactive queries & visualization, machine learning & deep learning, deploying models to production, and a full 24x7 operations toolset in a high-compliance environment.
Software engineering practices for the data science and machine learning life...DataWorks Summit
With the advent of newer frameworks and toolkits, data scientists are now more productive than ever and starting to prove indispensable to enterprises. Typical organizations have large teams of data scientists who build out key analytics assets that are used on a daily basis and an integral part of live transactions. However, there is also quite a lot of chaos and complexities that get introduced because of the state of the industry. Many packages used by data scientists are from open source, and even if they are well curated, there is a growing tendency to pick out the cutting-edge or unstable packages and frameworks to accelerate analytics. Different data scientists may use different versions of runtimes, different Python or R versions, or even different versions of the same packages. Predominantly data scientists work on their laptops and it becomes difficult to reproduce their environments for use by others. Since data science is now a team sport across multiple personas, involving non-practitioners, traditional application developers, execs, and IT operators, how does an enterprise create a platform for productive cross-role collaboration?
Enterprises need a very reliable and repeatable process, especially when it results in something that affects their production environments. They also require a well managed approach that enables the graduation of an asset from development through a testing and staging process to production. Given the pace of businesses nowadays, the process needs to be quite agile and flexible too—even enabling an easy path to reversing a change. Compliance and audit processes require clear lineage and history as well as approval chains.
In the traditional software engineering world, this lifecycle has been well understood and best practices have been followed for ages. But what does it mean when you have non-programmers or users who are not really trained in software engineering philosophies or who perceive all of this as "big process" roadblocks in their daily work ? How do you we engage them in a productive manner and yet support enterprise requirements for reliability, tracking, and a clear continuous integration and delivery practice? The presenters, in this session, will bring up interesting techniques based on their user research, real life customer interviews, and productized best practices. The presenters also invite the audience to share their stories and best practices to make this a lively conversation.
Speaker
Sriram Srinivasan, Senior Technical Staff Member, Analytics Platform Architect, IBM
The ODAHU project is focused on creating services, extensions for third party systems and tools which help to accelerate building enterprise level systems with automated AI/ML models life cycle.
mlflow: Accelerating the End-to-End ML lifecycleDatabricks
Building and deploying a machine learning model can be difficult to do once. Enabling other data scientists (or yourself, one month later) to reproduce your pipeline, to compare the results of different versions, to track what’s running where, and to redeploy and rollback updated models is much harder.
In this talk, I’ll introduce MLflow, a new open source project from Databricks that simplifies the machine learning lifecycle. MLflow provides APIs for tracking experiment runs between multiple users within a reproducible environment, and for managing the deployment of models to production. MLflow is designed to be an open, modular platform, in the sense that you can use it with any existing ML library and development process. MLflow was launched in June 2018 and has already seen significant community contributions, with over 50 contributors and new features including language APIs, integrations with popular ML libraries, and storage backends. I’ll show how MLflow works and explain how to get started with MLflow.
Apache ® Spark™ MLlib 2.x: How to Productionize your Machine Learning ModelsAnyscale
Apache Spark has rapidly become a key tool for data scientists to explore, understand and transform massive datasets and to build and train advanced machine learning models. The question then becomes, how do I deploy these model to a production environment? How do I embed what I have learned into customer facing data applications?
In this webinar, we will discuss best practices from Databricks on
how our customers productionize machine learning models
do a deep dive with actual customer case studies,
show live tutorials of a few example architectures and code in Python, Scala, Java and SQL.
Machine Learning operations brings data science to the world of devops. Data scientists create models on their workstations. MLOps adds automation, validation and monitoring to any environment including machine learning on kubernetes. In this session you hear about latest developments and see it in action.
In this deck from the 2019 UK HPC Conference, Glyn Bowden from HPE presents: The Eco-System of AI and How to Use It.
"This presentation walks through HPE's current view on AI applications, where it is driving outcomes and innovation, and where the challenges lay. We look at the eco-system that sits around an AI project and look at ways this can impact the success of the endeavor."
Watch the video: https://wp.me/p3RLHQ-kVS
Learn more: https://www.hpe.com/us/en/solutions/artificial-intelligence.html
and
http://hpcadvisorycouncil.com/events/2019/uk-conference/agenda.php
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
In this session, we will take a deep-dive into the DevOps process that comes with Azure Machine Learning service, a cloud service that you can use to track as you build, train, deploy and manage models. We zoom into how the data science process can be made traceable and deploy the model with Azure DevOps to a Kubernetes cluster.
At the end of this session, you will have a good grasp of the technological building blocks of Azure machine learning services and can bring a machine learning project safely into production.
S8277 - Introducing Krylov: AI Platform that Empowers eBay Data Science and E...Henry Saputra
The Krylov Project is the key component in eBay's AI Platform initiative that provides an easy to use, open, and fast AI orchestration engine that is deployed as managed services in eBay cloud.
Using Krylov, AI scientists can access eBay's massive datasets; build and train AI models; spin up powerful compute (high-memory or GPU instances) on the Krylov compute cluster; and set up machine learning pipelines, such as using declarative constructs that stitch together pipeline lifecycle.
ExtremeEarth: Hopsworks, a data-intensive AI platform for Deep Learning with ...Big Data Value Association
The main goal of the session is to showcase approaches that greatly simplify the work of a data analyst when performing data analytics, or when employing machine learning algorithms, over Big Data. The session will include presentations on
(a) How data analytics workflows can be easily and graphically composed, and then optimized for execution,
(b) How raw data with great variety can be easily queried using SQL interfaces, and
(c) How complex machine learning operations can be performed efficiently in distributed settings.
After these presentations, the speakers will participate in a discussion with the audience, in order to discuss further tools that could make the work of a data analyst more simple.
This slide was used in ISO/IEC JTC1 SC36 Plenary Meeting in June 22, 2015.
Title of this slide is 'Proof of Concept for Learning Analytics Interoperability and subtitle is 'Reference Model based on open source SW'.
The Open Data Lake Platform Brief - Data Sheets | WhitepaperVasu S
An open data lake platform provides a robust and future-proof data management paradigm to support a wide range of data processing needs, including data exploration, ad-hoc analytics, streaming analytics, and machine learning.
Hydrosphere.io Platform for AI/ML Operations AutomationRustem Zakiev
Simple and robust ML models deployment
Automated versioning
Easy models and versions management
Score the model from your app or microservice via REST, gRPC or Kafka stream API.
A/B and Canary testing on production traffic.
Hot-wing bumpless model replacement in production pipeline
MLflow: Infrastructure for a Complete Machine Learning Life Cycle with Mani ...Databricks
ML development brings many new complexities beyond the traditional software development lifecycle. Unlike in traditional software development, ML developers want to try multiple algorithms, tools, and parameters to get the best results, and they need to track this information to reproduce work. In addition, developers need to use many distinct systems to productionize models. To address these problems, many companies are building custom “ML platforms” that automate this lifecycle, but even these platforms are limited to a few supported algorithms and to each company’s internal infrastructure. In this session, we introduce MLflow, a new open source project from Databricks that aims to design an open ML platform where organizations can use any ML library and development tool of their choice to reliably build and share ML applications. MLflow introduces simple abstractions to package reproducible projects, track results, and encapsulate models that can be used with many existing tools, accelerating the ML lifecycle for organizations of any size. In this deep-dive session, through a complete ML model life-cycle example, you will walk away with:
MLflow concepts and abstractions for models, experiments, and projects
How to get started with MLFlow
Understand aspects of MLflow APIs
Using tracking APIs during model training
Using MLflow UI to visually compare and contrast experimental runs with different tuning parameters and evaluate metrics
Package, save, and deploy an MLflow model
Serve it using MLflow REST API
What’s next and how to contribute
Use of standards and related issues in predictive analyticsPaco Nathan
My presentation at KDD 2016 in SF, in the "Special Session on Standards in Predictive Analytics In the Era of Big and Fast Data" morning track about PMML and PFA http://dmg.org/kdd2016.html
Kubeflow: portable and scalable machine learning using Jupyterhub and Kuberne...Akash Tandon
ML solutions in production start from data ingestion and extend upto the actual deployment step. We want this workflow to be scalable, portable and simple. Containers and kubernetes are great at the former two but not the latter if you aren't a devops practitioner. We'll explore how you can leverage the Kubeflow project to deploy best-of-breed open-source systems for ML to diverse infrastructures.
Developing and deploying AI solutions on the cloud using Team Data Science Pr...Debraj GuhaThakurta
Presented at: Global Big AI Conference, Santa Clara, Jan 2018 Developing and deploying AI solutions on the cloud using Team Data Science Process (TDSP) and Azure Machine Learning (AML)
Metaflow: The ML Infrastructure at NetflixBill Liu
Metaflow was started at Netflix to answer a pressing business need: How to enable an organization of data scientists, who are not software engineers by training, build and deploy end-to-end machine learning workflows and applications independently. We wanted to provide the best possible user experience for data scientists, allowing them to focus on parts they like (modeling using their favorite off-the-shelf libraries) while providing robust built-in solutions for the foundational infrastructure: data, compute, orchestration, and versioning.
Today, the open-source Metaflow powers hundreds of business-critical ML projects at Netflix and other companies from bioinformatics to real estate.
In this talk, you will learn about:
- What to expect from a modern ML infrastructure stack.
- Using Metaflow to boost the productivity of your data science organization, based on lessons learned from Netflix.
- Deployment strategies for a full stack of ML infrastructure that plays nicely with your existing systems and policies.
https://www.aicamp.ai/event/eventdetails/W2021080510
Elyra - a set of AI-centric extensions to JupyterLab Notebooks.Luciano Resende
In this session Luciano will explore the different projects that compose the Jupyter ecosystem; including Jupyter Notebooks, JupyterLab, JupyterHub and Jupyter Enterprise Gateway. Jupyter Notebooks are the current open standard for data science and AI model development, and IBM is dedicated to contributing to their success and adoption. Continuing the trend of building out the Jupyter ecosystem, Luciano will introduce Elyra. It's a project built to extend JupyterLab with AI-centric capabilities. He'll showcase the extensions that allow you to build Notebook Pipelines, execute notebooks as batch jobs, navigate and execute Python scripts, and tie neatly into Notebook versioning.
New Explore Careers and College Majors 2024.pdfDr. Mary Askew
Explore Careers and College Majors is a new online, interactive, self-guided career, major and college planning system.
The career system works on all devices!
For more Information, go to https://bit.ly/3SW5w8W
Apache ® Spark™ MLlib 2.x: How to Productionize your Machine Learning ModelsAnyscale
Apache Spark has rapidly become a key tool for data scientists to explore, understand and transform massive datasets and to build and train advanced machine learning models. The question then becomes, how do I deploy these model to a production environment? How do I embed what I have learned into customer facing data applications?
In this webinar, we will discuss best practices from Databricks on
how our customers productionize machine learning models
do a deep dive with actual customer case studies,
show live tutorials of a few example architectures and code in Python, Scala, Java and SQL.
Machine Learning operations brings data science to the world of devops. Data scientists create models on their workstations. MLOps adds automation, validation and monitoring to any environment including machine learning on kubernetes. In this session you hear about latest developments and see it in action.
In this deck from the 2019 UK HPC Conference, Glyn Bowden from HPE presents: The Eco-System of AI and How to Use It.
"This presentation walks through HPE's current view on AI applications, where it is driving outcomes and innovation, and where the challenges lay. We look at the eco-system that sits around an AI project and look at ways this can impact the success of the endeavor."
Watch the video: https://wp.me/p3RLHQ-kVS
Learn more: https://www.hpe.com/us/en/solutions/artificial-intelligence.html
and
http://hpcadvisorycouncil.com/events/2019/uk-conference/agenda.php
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
In this session, we will take a deep-dive into the DevOps process that comes with Azure Machine Learning service, a cloud service that you can use to track as you build, train, deploy and manage models. We zoom into how the data science process can be made traceable and deploy the model with Azure DevOps to a Kubernetes cluster.
At the end of this session, you will have a good grasp of the technological building blocks of Azure machine learning services and can bring a machine learning project safely into production.
S8277 - Introducing Krylov: AI Platform that Empowers eBay Data Science and E...Henry Saputra
The Krylov Project is the key component in eBay's AI Platform initiative that provides an easy to use, open, and fast AI orchestration engine that is deployed as managed services in eBay cloud.
Using Krylov, AI scientists can access eBay's massive datasets; build and train AI models; spin up powerful compute (high-memory or GPU instances) on the Krylov compute cluster; and set up machine learning pipelines, such as using declarative constructs that stitch together pipeline lifecycle.
ExtremeEarth: Hopsworks, a data-intensive AI platform for Deep Learning with ...Big Data Value Association
The main goal of the session is to showcase approaches that greatly simplify the work of a data analyst when performing data analytics, or when employing machine learning algorithms, over Big Data. The session will include presentations on
(a) How data analytics workflows can be easily and graphically composed, and then optimized for execution,
(b) How raw data with great variety can be easily queried using SQL interfaces, and
(c) How complex machine learning operations can be performed efficiently in distributed settings.
After these presentations, the speakers will participate in a discussion with the audience, in order to discuss further tools that could make the work of a data analyst more simple.
This slide was used in ISO/IEC JTC1 SC36 Plenary Meeting in June 22, 2015.
Title of this slide is 'Proof of Concept for Learning Analytics Interoperability and subtitle is 'Reference Model based on open source SW'.
The Open Data Lake Platform Brief - Data Sheets | WhitepaperVasu S
An open data lake platform provides a robust and future-proof data management paradigm to support a wide range of data processing needs, including data exploration, ad-hoc analytics, streaming analytics, and machine learning.
Hydrosphere.io Platform for AI/ML Operations AutomationRustem Zakiev
Simple and robust ML models deployment
Automated versioning
Easy models and versions management
Score the model from your app or microservice via REST, gRPC or Kafka stream API.
A/B and Canary testing on production traffic.
Hot-wing bumpless model replacement in production pipeline
MLflow: Infrastructure for a Complete Machine Learning Life Cycle with Mani ...Databricks
ML development brings many new complexities beyond the traditional software development lifecycle. Unlike in traditional software development, ML developers want to try multiple algorithms, tools, and parameters to get the best results, and they need to track this information to reproduce work. In addition, developers need to use many distinct systems to productionize models. To address these problems, many companies are building custom “ML platforms” that automate this lifecycle, but even these platforms are limited to a few supported algorithms and to each company’s internal infrastructure. In this session, we introduce MLflow, a new open source project from Databricks that aims to design an open ML platform where organizations can use any ML library and development tool of their choice to reliably build and share ML applications. MLflow introduces simple abstractions to package reproducible projects, track results, and encapsulate models that can be used with many existing tools, accelerating the ML lifecycle for organizations of any size. In this deep-dive session, through a complete ML model life-cycle example, you will walk away with:
MLflow concepts and abstractions for models, experiments, and projects
How to get started with MLFlow
Understand aspects of MLflow APIs
Using tracking APIs during model training
Using MLflow UI to visually compare and contrast experimental runs with different tuning parameters and evaluate metrics
Package, save, and deploy an MLflow model
Serve it using MLflow REST API
What’s next and how to contribute
Use of standards and related issues in predictive analyticsPaco Nathan
My presentation at KDD 2016 in SF, in the "Special Session on Standards in Predictive Analytics In the Era of Big and Fast Data" morning track about PMML and PFA http://dmg.org/kdd2016.html
Kubeflow: portable and scalable machine learning using Jupyterhub and Kuberne...Akash Tandon
ML solutions in production start from data ingestion and extend upto the actual deployment step. We want this workflow to be scalable, portable and simple. Containers and kubernetes are great at the former two but not the latter if you aren't a devops practitioner. We'll explore how you can leverage the Kubeflow project to deploy best-of-breed open-source systems for ML to diverse infrastructures.
Developing and deploying AI solutions on the cloud using Team Data Science Pr...Debraj GuhaThakurta
Presented at: Global Big AI Conference, Santa Clara, Jan 2018 Developing and deploying AI solutions on the cloud using Team Data Science Process (TDSP) and Azure Machine Learning (AML)
Metaflow: The ML Infrastructure at NetflixBill Liu
Metaflow was started at Netflix to answer a pressing business need: How to enable an organization of data scientists, who are not software engineers by training, build and deploy end-to-end machine learning workflows and applications independently. We wanted to provide the best possible user experience for data scientists, allowing them to focus on parts they like (modeling using their favorite off-the-shelf libraries) while providing robust built-in solutions for the foundational infrastructure: data, compute, orchestration, and versioning.
Today, the open-source Metaflow powers hundreds of business-critical ML projects at Netflix and other companies from bioinformatics to real estate.
In this talk, you will learn about:
- What to expect from a modern ML infrastructure stack.
- Using Metaflow to boost the productivity of your data science organization, based on lessons learned from Netflix.
- Deployment strategies for a full stack of ML infrastructure that plays nicely with your existing systems and policies.
https://www.aicamp.ai/event/eventdetails/W2021080510
Elyra - a set of AI-centric extensions to JupyterLab Notebooks.Luciano Resende
In this session Luciano will explore the different projects that compose the Jupyter ecosystem; including Jupyter Notebooks, JupyterLab, JupyterHub and Jupyter Enterprise Gateway. Jupyter Notebooks are the current open standard for data science and AI model development, and IBM is dedicated to contributing to their success and adoption. Continuing the trend of building out the Jupyter ecosystem, Luciano will introduce Elyra. It's a project built to extend JupyterLab with AI-centric capabilities. He'll showcase the extensions that allow you to build Notebook Pipelines, execute notebooks as batch jobs, navigate and execute Python scripts, and tie neatly into Notebook versioning.
New Explore Careers and College Majors 2024.pdfDr. Mary Askew
Explore Careers and College Majors is a new online, interactive, self-guided career, major and college planning system.
The career system works on all devices!
For more Information, go to https://bit.ly/3SW5w8W
The Impact of Artificial Intelligence on Modern Society.pdfssuser3e63fc
Just a game Assignment 3
1. What has made Louis Vuitton's business model successful in the Japanese luxury market?
2. What are the opportunities and challenges for Louis Vuitton in Japan?
3. What are the specifics of the Japanese fashion luxury market?
4. How did Louis Vuitton enter into the Japanese market originally? What were the other entry strategies it adopted later to strengthen its presence?
5. Will Louis Vuitton have any new challenges arise due to the global financial crisis? How does it overcome the new challenges?Assignment 3
1. What has made Louis Vuitton's business model successful in the Japanese luxury market?
2. What are the opportunities and challenges for Louis Vuitton in Japan?
3. What are the specifics of the Japanese fashion luxury market?
4. How did Louis Vuitton enter into the Japanese market originally? What were the other entry strategies it adopted later to strengthen its presence?
5. Will Louis Vuitton have any new challenges arise due to the global financial crisis? How does it overcome the new challenges?Assignment 3
1. What has made Louis Vuitton's business model successful in the Japanese luxury market?
2. What are the opportunities and challenges for Louis Vuitton in Japan?
3. What are the specifics of the Japanese fashion luxury market?
4. How did Louis Vuitton enter into the Japanese market originally? What were the other entry strategies it adopted later to strengthen its presence?
5. Will Louis Vuitton have any new challenges arise due to the global financial crisis? How does it overcome the new challenges?
Want to move your career forward? Looking to build your leadership skills while helping others learn, grow, and improve their skills? Seeking someone who can guide you in achieving these goals?
You can accomplish this through a mentoring partnership. Learn more about the PMISSC Mentoring Program, where you’ll discover the incredible benefits of becoming a mentor or mentee. This program is designed to foster professional growth, enhance skills, and build a strong network within the project management community. Whether you're looking to share your expertise or seeking guidance to advance your career, the PMI Mentoring Program offers valuable opportunities for personal and professional development.
Watch this to learn:
* Overview of the PMISSC Mentoring Program: Mission, vision, and objectives.
* Benefits for Volunteer Mentors: Professional development, networking, personal satisfaction, and recognition.
* Advantages for Mentees: Career advancement, skill development, networking, and confidence building.
* Program Structure and Expectations: Mentor-mentee matching process, program phases, and time commitment.
* Success Stories and Testimonials: Inspiring examples from past participants.
* How to Get Involved: Steps to participate and resources available for support throughout the program.
Learn how you can make a difference in the project management community and take the next step in your professional journey.
About Hector Del Castillo
Hector is VP of Professional Development at the PMI Silver Spring Chapter, and CEO of Bold PM. He's a mid-market growth product executive and changemaker. He works with mid-market product-driven software executives to solve their biggest growth problems. He scales product growth, optimizes ops and builds loyal customers. He has reduced customer churn 33%, and boosted sales 47% for clients. He makes a significant impact by building and launching world-changing AI-powered products. If you're looking for an engaging and inspiring speaker to spark creativity and innovation within your organization, set up an appointment to discuss your specific needs and identify a suitable topic to inspire your audience at your next corporate conference, symposium, executive summit, or planning retreat.
About PMI Silver Spring Chapter
We are a branch of the Project Management Institute. We offer a platform for project management professionals in Silver Spring, MD, and the DC/Baltimore metro area. Monthly meetings facilitate networking, knowledge sharing, and professional development. For event details, visit pmissc.org.
This comprehensive program covers essential aspects of performance marketing, growth strategies, and tactics, such as search engine optimization (SEO), pay-per-click (PPC) advertising, content marketing, social media marketing, and more
1. AIOPS
HVORDAN VELGE UT KI PROSJEKTER OG GÅ FRA PROTOTYP TIL DRIFT
Volker Hoffmann <volker.hoffmann@sintef.no>
Energytics Avslutningsseminar, 10 Juni 2020
2. Context: Elements on AI Transformation
2
Cases
& Value
Data
Ecosystems
Techniques
& Tools
Workflow
Integration
Open
Culture
Source: Bughin et al. McKinsey Global Institute, Artificial intelligence, June 2017.
3. Elements on AI Transformation
3
Cases
& Value
Data
Ecosystems
Techniques
& Tools
Workflow
Integration
Open
Culture
4. Framework: Concept to Impact
4
Cases
& Value
Data
Ecosystems
Techniques
& Tools
Workflow
Integration
Open
Culture
5. 5
Business Drivers
Data
Science
Model
Versioning
Model
Deployment
Data
Pipelines
MLOps & DevOps
Business Development & Data Science
Bring together internal and
external stakeholders with
data scientists. Generate
ideas about processes that
can be improved with data-
driven insights.
Set performance indicators
and requirements so that
results can be decision
gated.
Facilitated by frameworks
such as Design Thinking, 10
Types of Innovation, or
types of Business Process
Analysis (LEAN, Six Sigma).
Prepare, process, and
analyze data to assess
validity of ideas generated
together with business
stakeholders.
Log and document
experiments and keep
stakeholders updated using
performance indicators.
This is a heavily iterative
process as hypotheses
become invalidated, are
adjusted, and finally
validated.
Python, R, Julia
For successful applications,
iteratively improve models
by model tuning, feature
engineering, and addition
of new data sources.
Track and archive model
versions and associated
training data (data
provenance).
This ensures reproducibility
and enables automated
rollout (and rollback) when
models improve (or fail).
Make models available for
integration into new and
existing workflows. Serve
models through APIs, so
they are agnostic to to
existing infrastructure.
Automate rollout/rollback
through by continuous
integration and deployment
(CI/CD) with performance
monitoring.
This ensures that best
performing models are
automatically deployed and
that misbehaving models
are rolled back.
Spark, Docker, Kubernetes
To exploit operational
models, build robust data
ingestion pipelines. This
enables real- and near-time
processing and continuous
improvement of models.
Archive all inbound data to
allow data scientists and
business stakeholders to
develop new and iterate on
existing improvements.
Kafka, MQTT, Avro, Thrift
Data Engineering
6. 6
Business Drivers
Data
Science
Model
Versioning
Model
Deployment
Data
Pipelines
MLOps & DevOps
Business Development & Data Science
Connect with stakeholders
Low hanging fruits
High-impact projects
Set performance indicators
Set performance targets
Use Design Thinking, 10
Types of Innovation, LEAN,
Six Sigma, etc
Keep stakeholders updated
Build prototypes
Log experiments
Iterate, iterate, iterate
Python, R, Julia
Iteratively improve models
Feature engineering
Hyperparameters
Track and archive models
Track and archive data
Make it reproducible
Add models to workflows
Serve through APIs
Monitor performance
Automate deployment
Automate rollback
Spark, Docker, Kubernetes
Build robust pipelines
Operate on real-time data
Archive data
Kafka, MQTT, Avro, Thrift
Data Engineering
7. 7
Business Drivers
Data
Science
Model
Versioning
Model
Deployment
Data
Pipelines
MLOps & DevOps
Business Development & Data Science
Connect with stakeholders
Low hanging fruits
High-impact projects
Set performance indicators
Set performance targets
Use Design Thinking, 10
Types of Innovation, LEAN,
Six Sigma, etc
Keep stakeholders updated
Build prototypes
Log experiments
Iterate, iterate, iterate
Python, R, Julia
Iteratively improve models
Feature engineering
Hyperparameters
Track and archive models
Track and archive data
Make it reproducible
Add models to workflows
Serve through APIs
Monitor performance
Automate deployment
Automate rollback
Spark, Docker, Kubernetes
Build robust pipelines
Operate on real-time data
Archive data
Kafka, MQTT, Avro, Thrift
Data Engineering
8. 8
Business Drivers
Data
Science
Model
Versioning
Model
Deployment
Data
Pipelines
MLOps & DevOps
Business Development & Data Science
Connect with stakeholders
Low hanging fruits
High-impact projects
Set performance indicators
Set performance targets
Use Design Thinking, 10
Types of Innovation, LEAN,
Six Sigma, etc
Keep stakeholders updated
Build prototypes
Log experiments
Iterate, iterate, iterate
Python, R, Julia
Iteratively improve models
Feature engineering
Hyperparameters
Track and archive models
Track and archive data
Make it reproducible
Add models to workflows
Serve through APIs
Monitor performance
Automate deployment
Automate rollback
Spark, Docker, Kubernetes
Build robust pipelines
Operate on real-time data
Archive data
Kafka, MQTT, Avro, Thrift
Data Engineering
9. 9
Business Drivers
Data
Science
Model
Versioning
Model
Deployment
Data
Pipelines
MLOps & DevOps
Business Development & Data Science
Connect with stakeholders
Low hanging fruits
High-impact projects
Set performance indicators
Set performance targets
Use Design Thinking, 10
Types of Innovation, LEAN,
Six Sigma, etc
Keep stakeholders updated
Build prototypes
Log experiments
Iterate, iterate, iterate
Python, R, Julia
Iteratively improve models
Feature engineering
Hyperparameters
Track and archive models
Track and archive data
Make it reproducible
Add models to workflows
Serve through APIs
Monitor performance
Automate deployment
Automate rollback
Spark, Docker, Kubernetes
Build robust pipelines
Operate on real-time data
Archive data
Kafka, MQTT, Avro, Thrift
Data Engineering
10. 10
Business Drivers
Data
Science
Model
Versioning
Model
Deployment
Data
Pipelines
MLOps & DevOps
Business Development & Data Science
Connect with stakeholders
Low hanging fruits
High-impact projects
Set performance indicators
Set performance targets
Use Design Thinking, 10
Types of Innovation, LEAN,
Six Sigma, etc
Keep stakeholders updated
Build prototypes
Log experiments
Iterate, iterate, iterate
Python, R, Julia
Iteratively improve models
Feature engineering
Hyperparameters
Track and archive models
Track and archive data
Make it reproducible
Add models to workflows
Serve through APIs
Monitor performance
Automate deployment
Automate rollback
Spark, Docker, Kubernetes
Build robust pipelines
Operate on real-time data
Archive data
Kafka, MQTT, Avro, Thrift
Data Engineering
11. Tools to Speed You Up
11
Cases
& Value
Data
Ecosystems
Techniques
& Tools
Workflow
Integration
Open
Culture
12. Neptune
Neptune is a tracker for experiments with
focus on Python. It is a hosted service.
Experiments are tracked by using library
hooks to register (model) parameters,
evaluation results, and upload artifacts (such
as models, hashes of training data, or even
code). The library can track hardware usage
and experiment progress.
The results can be analyzed and compared on
a website. There are also collaborative
options. Neptune has integrations with
Jupyter notebooks, various ML libraries,
visualizers (HiFlow, TensorBoard), other
trackers (MLFlow), and external offerings
(Amazon Sagemaker).
They provide an API to query results from
experiment. This can be used to feed CI/CD
pipelines for model deployment.
12
$ conda create --name neptune python=3.6
$ conda activate neptune
$ conda install -c conda-forge neptune-client
$ cd /somewhere
---> https://ui.neptune.ai/
---> Create Account, Log In, “Getting Started”
13. MLFlow
MLFlow is an experiment tracker and generic
model server. It can be self-hosted.
Experiments are tracked by using library
hooks to register (model) parameters,
evaluation results, and upload artifacts (such
as models, hashes of training data, or even
code). Artifacts can be logged to local,
remote, or cloud storage (S3, GFS, etc).
Results can be analyzed through a web UI
and CSV export is available. Models are
packaged as a wrapper around the underlying
format (Sklearn, XGBoost, Torch, etc). They
can be pushed to Spark for batch inference
or served through REST.
There are CLI, Python, R, Java, and REST
APIs for further integration with CI/CD
pipelines. Models can be pushed to cloud
services (SageMaker, AzureML, ...).
13
$ conda create --name mlflow python=3.6
$ conda activate mlflow
$ conda install -c conda-forge mlflow
$ cd /somewhere
$ mlflow ui
---> http://localhost:5000
---> https://mlflow.org/docs/latest/quickstart.html
14. Kubeflow
Kubeflow is essentially a self-hosted version
of the Google AI platform. It uses Kubernetes
to abstract away infrastructure.
Kubeflow can deploy Jupyter notebooks, run
pipelines for data processing and model
training (scheduled, on-demand), organize
runs, archive models and other artifacts, and
expose models through endpoints. Pipelines
are compute graphs and are described in
Python with a DSL. Their components are
wrapped as Docker images.
It integrates with GCP so it can elastically
scale to the cloud compute and storage
(e.g., distributed model training). It also
integrates with offerings such as BigQuery or
Dataproc.
The solution is heavy and complex but
enables rapid scale-out. It is especially
applicable if infrastructure is already
managed through Kubernetes.
14
https://www.kubeflow.org/docs/started/workstation/getting-started-minikf/
$ cd /somewhere
$ vagrant init arrikto/minikf
$ vagrant up
---> http://10.10.10.10
15. Pachyderm
Pachyderm is a versioning system and
execution environment for data and
processing pipelines. Hosted and self-hosted
options exist.
At its core is data provenance. Data is
committed to a repository and acted upon by
processors in pipelines. The results (data and
other artefacts like models) are committed
back into a repository. By design, all use of
data is traceable through pipelines. Pipelines
are described in JSON and processors are
packaged as Docker images. Pachyderm can
integrate (but not deploy) Jupyter and can
push/pull data from cloud stores (S3, etc).
Pachyderm is built on top of Kubernetes, so
can easily scale-out and run in various
clouds. Self-hosting comes with the usual
Kubernetes complexity.
15
$ cd /somewhere
$ download from (https://github.com/pachyderm/pachyderm/releases)
$ tar xfvz release_filename_linux_amd64.tar.gz
$ pachctl version --client-only
---> https://docs.pachyderm.com/latest/pachub/pachub_getting_started/
---> https://docs.pachyderm.com/latest/getting_started/beginner_tutorial/
17. 17
Just Getting
Started
ScaleUp on
Google
Containers
and
Provenance
Hosted One-
Stop Shop
Remain
Undecided
Track Experiments
Archive Some Models
Work on Low Hanging Fruits
Try Neptune or MLFlow.
Expect Scaling?
Integrate w/ Google Cloud?
Running Kubernetes?
Try Kubeflow.
!!! Complex !!!
Want pipelines?
Want containers?
Want data provenance?
Try Pachyderm.
Deploy models?
Track data?
Need end-to-end?
Try Dataiku, SageMaker,
AzureML, DataBricks, or
Google AI Platform.
No idea?
Try Neptune or MLFlow.
Recommendations
Bildet illustrerer:
· bredden i SINTEFs ekspertise, fra havrom til verdensrom.
· hvilke områder og bransjer vi jobber innen for å realisere visjonen Teknologi for et bedre samfunn.
Bildestilen er basert på stikkordene fremtidsrettet, teknologi og norsk natur (naturressurser). SINTEFs visuelle univers er utviklet for SINTEF av Headspin Productions AS.