This session is a continuation of “Automated Production Ready ML at Scale” from the last Spark + AI Summit Europe. In this session you will learn how H&M evolves its reference architecture covering the entire MLOps stack, addressing common challenges in AI and machine learning products such as development efficiency, end-to-end traceability, and speed to production.
An Introduction to XAI! Towards Trusting Your ML Models! (Mansour Saffar)
Machine learning (ML) is currently disrupting almost every industry and is being used as the core component in many systems. The decisions made by these systems may have a great impact on society and specific individuals and thus the decision-making process has to be clear and explainable so humans can trust it. Explainable AI (XAI) is a rather new field in ML in which researchers try to develop models that are able to explain the decision-making process behind ML models. In this talk, we'll learn about the fundamentals of XAI and discuss why we need to start to integrate XAI with our ML models!
Presented at the Edmonton DataScience Meetup on October 2nd, 2019. Learn more: https://youtu.be/gEkPXOsDt_w
What MLflow is; what problems it solves for the machine learning lifecycle and how it solves them; how it is used with Databricks; and building a CI/CD pipeline with Databricks.
Unified Approach to Interpret Machine Learning Model: SHAP + LIME (Databricks)
For companies that solve real-world problems and generate revenue from data science products, being able to understand why a model makes a certain prediction can be as crucial as achieving high prediction accuracy in many applications. However, as data scientists pursue higher accuracy by implementing complex algorithms such as ensemble or deep learning models, the algorithm itself becomes a black box, creating a trade-off between the accuracy and the interpretability of a model’s output.
To address this problem, a unified framework, SHAP (SHapley Additive exPlanations), was developed to help users interpret the predictions of complex models. In this session, we will talk about how to apply SHAP to various modeling approaches (GLM, XGBoost, CNN) to explain how each feature contributes and to extract intuitive insights from a particular prediction. This talk is intended to introduce the concept of a general-purpose model explainer, as well as to help practitioners understand SHAP and its applications.
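As a minimal illustration of that workflow (a sketch, not taken from the session itself), the snippet below fits a gradient-boosted regressor and uses SHAP's TreeExplainer to attribute a single prediction to its features; the dataset and hyperparameters are placeholders.

```
# Hedged sketch: explain an XGBoost regressor with SHAP.
import shap
import xgboost
from sklearn.datasets import fetch_california_housing

X, y = fetch_california_housing(return_X_y=True, as_frame=True)
model = xgboost.XGBRegressor(n_estimators=200, max_depth=4).fit(X, y)

explainer = shap.TreeExplainer(model)   # exact SHAP values for tree ensembles
shap_values = explainer.shap_values(X)

# Contribution of each feature to the first prediction.
for name, value in zip(X.columns, shap_values[0]):
    print(f"{name:>12}: {value:+.3f}")

shap.summary_plot(shap_values, X)       # global view of feature importance
```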
Managing the Complete Machine Learning Lifecycle with MLflow (Databricks)
ML development brings many new complexities beyond the traditional software development lifecycle. Unlike in traditional software development, ML developers want to try multiple algorithms, tools and parameters to get the best results, and they need to track this information to reproduce work. In addition, developers need to use many distinct systems to productionize models.
To address these challenges, Databricks last year unveiled MLflow, an open source project that aims to simplify the entire ML lifecycle. MLflow introduces simple abstractions to package reproducible projects, track results, and encapsulate models that can be used with many existing tools, accelerating the ML lifecycle for organizations of any size.
In the past year, the MLflow community has grown quickly: over 120 contributors from over 40 companies have contributed code to the project, and over 200 companies are using MLflow.
In this tutorial, we will show you how using MLflow can help you:
Keep track of experiment runs and results across frameworks.
Execute projects remotely on a Databricks cluster, and quickly reproduce your runs.
Quickly productionize models using Databricks production jobs, Docker containers, Azure ML, or Amazon SageMaker.
We will demo the building blocks of MLflow as well as the most recent additions since the 1.0 release.
What you will learn:
Understand the three main components of open source MLflow (MLflow Tracking, MLflow Projects, MLflow Models) and how each helps address challenges of the ML lifecycle.
How to use MLflow Tracking to record and query experiments: code, data, config, and results.
How to use the MLflow Projects packaging format to reproduce runs on any platform.
How to use the MLflow Models general format to send models to diverse deployment tools.
Prerequisites:
A fully-charged laptop (8-16GB memory) with Chrome or Firefox
Python 3 and pip pre-installed
Pre-Register for a Databricks Standard Trial
Basic knowledge of Python programming language
Basic understanding of Machine Learning Concepts
In this talk, I present an introduction to MLflow and show examples of using MLflow Tracking, MLflow Projects, and MLflow Models, with Databricks as an example of a remote tracking server.
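To make the Tracking component concrete, here is a minimal sketch of logging one run; the model, parameters, and metric are illustrative, and pointing the client at a Databricks workspace only requires setting the tracking URI.

```
import mlflow
import mlflow.sklearn
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# mlflow.set_tracking_uri("databricks")  # uncomment to log to a Databricks workspace

X, y = make_regression(n_samples=1_000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

with mlflow.start_run():
    n_estimators = 200
    model = RandomForestRegressor(n_estimators=n_estimators, random_state=42)
    model.fit(X_train, y_train)
    mse = mean_squared_error(y_test, model.predict(X_test))

    # Parameters, metrics, and the model artifact become queryable in the UI/API.
    mlflow.log_param("n_estimators", n_estimators)
    mlflow.log_metric("mse", mse)
    mlflow.sklearn.log_model(model, "model")
```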
Feature Store as a Data Foundation for Machine Learning (Provectus)
Looking to design and build a centralized, scalable Feature Store for your Data Science & Machine Learning teams to take advantage of? Come and learn how from experts at Provectus and Amazon Web Services (AWS)!
A Feature Store is a key component of the ML stack and data infrastructure that enables feature engineering and management. By having a Feature Store, organizations can save massive amounts of resources, innovate faster, and drive ML processes at scale. In this webinar, you will learn how to build a Feature Store with a data mesh pattern, how to achieve consistency between real-time and training features, and how to improve reproducibility with time travel for data.
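As a rough illustration of that consistency problem (a generic sketch, not Provectus's implementation), a point-in-time join picks, for each training example, the latest feature value that was available at that example's timestamp; the pandas snippet below shows the idea with made-up columns.

```
# Hedged sketch of a point-in-time correct ("time travel") feature join with pandas.
import pandas as pd

events = pd.DataFrame({            # label events we want to train on
    "customer_id": [1, 1, 2],
    "event_time": pd.to_datetime(["2020-01-05", "2020-02-01", "2020-01-20"]),
    "label": [0, 1, 0],
}).sort_values("event_time")

features = pd.DataFrame({          # feature values as they were written over time
    "customer_id": [1, 1, 2],
    "feature_time": pd.to_datetime(["2020-01-01", "2020-01-15", "2020-01-10"]),
    "avg_basket_value": [42.0, 55.0, 13.0],
}).sort_values("feature_time")

# For each event, take the most recent feature value known *at that time*,
# so training sees exactly what online serving would have seen.
training_set = pd.merge_asof(
    events, features,
    left_on="event_time", right_on="feature_time",
    by="customer_id", direction="backward",
)
print(training_set)
```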
Agenda
- Modern Data Lakes & Modern ML Infrastructure
- Existing and Emerging Architectural Shifts
- Feature Store: Overview and Reference Architecture
- AWS Perspective on Feature Store
Intended Audience
Technology executives & decision makers, manager-level tech roles, data architects & analysts, data engineers & data scientists, ML practitioners & ML engineers, and developers
Presenters
- Stepan Pushkarev, Chief Technology Officer, Provectus
- Gandhi Raketla, Senior Solutions Architect, AWS
- German Osin, Senior Solutions Architect, Provectus
Feel free to share this presentation with your colleagues and don't hesitate to reach out to us at info@provectus.com if you have any questions!
REQUEST WEBINAR: https://provectus.com/webinar-feature-store-as-data-foundation-for-ml-nov-2020/
MLOps (a compound of “machine learning” and “operations”) is a practice for collaboration and communication between data scientists and operations professionals to help manage the production machine learning lifecycle. Similar to the DevOps term in the software development world, MLOps looks to increase automation and improve the quality of production ML while also focusing on business and regulatory requirements. MLOps applies to the entire ML lifecycle - from integrating with model generation (software development lifecycle, continuous integration/continuous delivery), orchestration, and deployment, to health, diagnostics, governance, and business metrics.
To watch the full presentation click here: https://info.cnvrg.io/mlopsformachinelearning
In this webinar, we’ll discuss core practices in MLOps that will help data science teams scale to the enterprise level. You’ll learn the primary functions of MLOps, and which tasks are suggested to accelerate your team’s machine learning pipeline. Join us for a discussion with cnvrg.io Solutions Architect Aaron Schneider, and learn how teams use MLOps for more productive machine learning workflows.
- Reduce friction between science and engineering
- Deploy your models to production faster
- Health, diagnostics and governance of ML models
- Kubernetes as a core platform for MLOps
- Support advanced use-cases like continual learning with MLOps
Deep Learning at Extreme Scale (in the Cloud) with the Apache Kafka Open Sou... (Kai Wähner)
How to Build a Machine Learning Infrastructure with Kafka, Connect, Streams, KSQL, etc…
This talk shows how to build Machine Learning models at extreme scale and how to productionize the built models in mission-critical real time applications by leveraging open source components in the public cloud. The session discusses the relation between TensorFlow and the Apache Kafka ecosystem - and why this is a great fit for machine learning at extreme scale.
The Machine Learning architecture includes: Kafka Connect for continuous high volume data ingestion into the public cloud, TensorFlow leveraging Deep Learning algorithms to build an analytic model on powerful GPUs, Kafka Streams for model deployment and inference in real time, and KSQL for real time analytics of predictions, alerts and model accuracy.
Sensor analytics for predictive alerting in real time is used as a real-world example from Internet of Things scenarios. A live demo shows the out-of-the-box integration and dynamic scalability of these components on Google Cloud.
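The Streams and KSQL pieces run on the JVM, but the underlying consume-predict-produce pattern can be sketched in a few lines of Python; the topic names, broker address, and model file below are assumptions, not part of the talk.

```
import json

import numpy as np
import tensorflow as tf
from kafka import KafkaConsumer, KafkaProducer

model = tf.keras.models.load_model("sensor_model.h5")  # assumed pre-trained model file

consumer = KafkaConsumer(
    "sensor-events",                                    # assumed input topic
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
)
producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

for message in consumer:
    features = np.array([message.value["features"]], dtype="float32")
    score = float(model.predict(features)[0][0])        # real-time inference
    producer.send("sensor-alerts", {"sensor_id": message.value["sensor_id"], "score": score})
```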
Key takeaways for the audience
• Learn how to build a Machine Learning infrastructure at extreme scale and how to productionize the built models in mission-critical real time applications
• Understand the benefits of a machine learning platform on the public cloud
• Learn about an extreme scale Machine Learning architecture around the Apache Kafka open source ecosystem including Kafka Connect, Kafka Streams and KSQL
• See a live demo for an Internet of Things use case: Sensor analytics for predictive alerting in real time
Large Language Models, No-Code, and Responsible AI - Trends in Applied NLP in... (David Talby)
An April 2023 presentation to the AMIA working group on natural language processing. The talk focuses on three current trends in NLP and how they apply in healthcare: Large language models, No-code, and Responsible AI.
How to Build a ML Platform Efficiently Using Open-Source (Databricks)
Fast-growing startups usually face a common set of challenges when employing machine learning. Data scientists are expected to work on new products and develop new models as well as iterate on existing ones. Once in production, models should be continuously monitored and regularly maintained as the infrastructure evolves. Before too long, data scientists end up spending most of their time doing maintenance and firefighting of existing models instead of creating new ones.
At GetYourGuide, we faced these challenges and decided to think about machine learning development holistically, which led us to our machine learning platform. Our platform uses MLflow to keep track of our machine learning life-cycle and ease the development experience. To integrate our models into our production environment, we also need to deal with additional requirements like API specification, SLOs and monitoring. To empower our data scientists, we have built a templating system that takes care of the heavy lifting of going to production, leveraging software engineering tools and ML-specific ones like BentoML.
In this talk we will present:
– Our previous approaches for deploying models and their tradeoffs
– Our data science and platform principles
– The main functionalities of our platform
– A live demo to create a new service
– Our learnings in the process
ML development brings many new complexities beyond the traditional software development lifecycle. Unlike in traditional software development, ML developers want to try multiple algorithms, tools and parameters to get the best results, and they need to track this information to reproduce work. In addition, developers need to use many distinct systems to productionize models. To address these problems, many companies are building custom “ML platforms” that automate this lifecycle, but even these platforms are limited to a few supported algorithms and to each company’s internal infrastructure. In this talk, I present MLflow, a new open source project from Databricks that aims to design an open ML platform where organizations can use any ML library and development tool of their choice to reliably build and share ML applications. MLflow introduces simple abstractions to package reproducible projects, track results, and encapsulate models that can be used with many existing tools, accelerating the ML lifecycle for organizations of any size.
mlflow: Accelerating the End-to-End ML lifecycle (Databricks)
Building and deploying a machine learning model can be difficult to do once. Enabling other data scientists (or yourself, one month later) to reproduce your pipeline, to compare the results of different versions, to track what’s running where, and to redeploy and roll back updated models is much harder.
In this talk, I’ll introduce MLflow, a new open source project from Databricks that simplifies the machine learning lifecycle. MLflow provides APIs for tracking experiment runs between multiple users within a reproducible environment, and for managing the deployment of models to production. MLflow is designed to be an open, modular platform, in the sense that you can use it with any existing ML library and development process. MLflow was launched in June 2018 and has already seen significant community contributions, with over 50 contributors and new features including language APIs, integrations with popular ML libraries, and storage backends. I’ll show how MLflow works and explain how to get started with MLflow.
Advanced Natural Language Processing with Apache Spark NLP (Databricks)
This hands-on deep-dive session uses the open-source Apache Spark NLP library to explore advanced NLP in Python. Apache Spark NLP provides state-of-the-art accuracy, speed, and scalability for language understanding by delivering production-grade implementations of some of the most recent research in applied deep learning. Apache Spark NLP is the only open-source NLP library that can natively scale to use any Apache Spark cluster, as well as take advantage of the latest processors from Intel and Nvidia. It’s the most widely used NLP library in the enterprise today.
You’ll edit and run executable Python notebooks as we walk through these common NLP tasks: document classification, named entity recognition, sentiment analysis, spell checking and correction, grammar understanding, question answering, and translation. The discussion of each NLP task includes the latest advances in deep learning and transfer learning used to tackle it – from the hundreds of BERT-based embeddings to models based on the T5 transformer, MarianNMT, multilingual and domain-specific models.
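A minimal way to try several of these tasks at once is a pretrained pipeline; the sketch below assumes the published English explain_document_dl pipeline, which bundles NER, spell checking, and other annotators.

```
import sparknlp
from sparknlp.pretrained import PretrainedPipeline

spark = sparknlp.start()  # starts a Spark session with Spark NLP on the classpath

pipeline = PretrainedPipeline("explain_document_dl", lang="en")
result = pipeline.annotate("John Smith visited Berlin last summre.")  # typo on purpose

print(result["entities"])  # named entities detected in the sentence
print(result["checked"])   # spell-checked tokens
```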
Retrieval Augmented Generation in Practice: Scalable GenAI platforms with k8s... (Mihai Criveti)
Mihai is the Principal Architect for Platform Engineering and Technology Solutions at IBM, responsible for Cloud Native and AI Solutions. He is a Red Hat Certified Architect, CKA/CKS, a leader in the IBM Open Innovation community, and an advocate for open source development. Mihai is driving the development of Retrieval Augmented Generation platforms and solutions for Generative AI at IBM that leverage WatsonX, vector databases, LangChain, HuggingFace, and open source AI models.
Mihai will share lessons learned building Retrieval Augmented Generation, or “Chat with Documents” platforms and APIs that scale, and deploy on Kubernetes. His talk will cover use cases for Generative AI, limitations of Large Language Models, and the use of RAG, Vector Databases and Fine Tuning to overcome model limitations and build solutions that connect to your data and provide content grounding, limit hallucinations and form the basis of explainable AI. In terms of technology, he will cover LLAMA2, HuggingFace TGIS, SentenceTransformers embedding models using Python, LangChain, and the Weaviate and ChromaDB vector databases. He’ll also share tips on writing code using LLMs, including building an agent for Ansible and containers.
Scaling factors for Large Language Model Architectures:
• Vector Database: consider sharding and High Availability
• Fine Tuning: collecting data to be used for fine tuning
• Governance and Model Benchmarking: how are you testing your model performance over time, with different prompts, one-shot, and various parameters
• Chain of Reasoning and Agents
• Caching embeddings and responses
• Personalization and Conversational Memory Database
• Streaming Responses and optimizing performance. A fine-tuned 13B model may perform better than a poor 70B one!
• Calling 3rd party functions or APIs for reasoning or other types of data (e.g. LLMs are terrible at reasoning and prediction, so consider calling other models)
• Fallback techniques: fall back to a different model, or default answers
• API scaling techniques, rate limiting, etc.
• Async, streaming and parallelization, multiprocessing, GPU acceleration (including embeddings), generating your API using OpenAPI, etc.
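To ground the RAG pattern itself, here is a deliberately small sketch using ChromaDB's default embedding function and a placeholder generate() call standing in for whichever LLM you deploy; the documents, IDs, and prompt template are illustrative.

```
import chromadb

client = chromadb.Client()                      # in-memory client; persistence is optional
collection = client.create_collection("docs")

# Index a few documents; a real platform would chunk, batch, and persist these.
collection.add(
    documents=[
        "Our refund policy allows returns within 30 days of purchase.",
        "Support is available 24/7 via chat and email.",
    ],
    ids=["doc-1", "doc-2"],
)

question = "How long do customers have to return a product?"
hits = collection.query(query_texts=[question], n_results=1)
context = "\n".join(hits["documents"][0])

# Content grounding: the retrieved context is put into the prompt so the model
# answers from your data instead of hallucinating.
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
# answer = generate(prompt)  # placeholder: call your LLM of choice (LLAMA2, watsonx, ...)
print(prompt)
```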
MLOps and Data Quality: Deploying Reliable ML Models in Production (Provectus)
Looking to build a robust machine learning infrastructure to streamline MLOps? Learn from Provectus experts how to ensure the success of your MLOps initiative by implementing Data QA components in your ML infrastructure.
For most organizations, the development of multiple machine learning models, their deployment and maintenance in production are relatively new tasks. Join Provectus as we explain how to build an end-to-end infrastructure for machine learning, with a focus on data quality and metadata management, to standardize and streamline machine learning life cycle management (MLOps).
Agenda
- Data Quality and why it matters
- Challenges and solutions of Data Testing
- Challenges and solutions of Model Testing
- MLOps pipelines and why they matter
- How to expand validation pipelines for Data Quality
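A Data QA component ultimately boils down to executable checks that gate the pipeline. The sketch below is a bare-bones, hand-rolled version with pandas (column names and thresholds are made up); a real setup would use a dedicated validation tool and surface the results in the MLOps pipeline.

```
import pandas as pd

def validate_batch(df: pd.DataFrame) -> list:
    """Return a list of human-readable data-quality violations for one batch."""
    errors = []
    if df["user_id"].isnull().any():
        errors.append("user_id contains nulls")
    if not df["age"].between(0, 120).all():
        errors.append("age outside the expected range [0, 120]")
    if df.duplicated(subset=["user_id", "event_time"]).any():
        errors.append("duplicate (user_id, event_time) rows")
    return errors

batch = pd.DataFrame(
    {"user_id": [1, 2, None], "age": [34, 150, 28], "event_time": ["t1", "t2", "t3"]}
)
violations = validate_batch(batch)
if violations:
    raise ValueError("Data quality gate failed: " + "; ".join(violations))
```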
Tutorial for Machine Learning 101 (an all-day tutorial at Strata + Hadoop World, New York City, 2015)
The course is designed to introduce machine learning via real applications, like building a recommender and image analysis using deep learning.
In this talk we cover deployment of machine learning models.
How to Automate your Enterprise Application / ERP Testing (RTTS)
Your organization has a major system that is central to running its business.
- Maybe it’s an ERP system running SAP, Oracle, Lawson or maybe a CRM system running Salesforce or Microsoft Dynamics,
- or it’s a banking or trading system at a bank or other financial institution,
- or an HR system running payroll through PeopleSoft or Workday
Whatever the system is, it is constantly sending or receiving data feeds (generally in XML or flat file formats) to or from a customer, vendor, or another internal system.
These major data interfaces are present in companies across every industry — from Financials to Pharmaceuticals, and Retail to Utilities — and they are handling data that is crucial to each business. As systems become more complex, it becomes more difficult for you to catch bad records or major data defects effectively before they reach their target system.
Catch those "hard-to-find" data defects
Your systems could be sending/receiving hundreds of feeds from different applications or data sources, each with different owners. In these circumstances, you may have little to no control over the format or quality of the data. Now this data needs to be integrated, mapped, and transformed into your systems. Can your existing manual testing process handle this task?
The challenges you’re facing:
Business: You’re working under time and resource constraints, so you need to speed up testing yet still increase coverage of data tested
Technology: There is no easy way to natively test flat files, XML files, databases or Excel against any other data format
Resources: You do not have enough people to test all of the data from the data feeds all of the time
You know that this data needs to be consistently accurate and reliable — and catching any bad data or data defects seems almost impossible.
Solve your Data Interface testing challenges
QuerySurge is built to automate the testing for any movement of data, testing simple or complex transformations (ETL), as well as data movement without any transformation.
- Test across different platforms, whether Big Data, data warehouse, database(s), NoSQL document store, flat files, json, web services or xml.
- Automate the testing effort from the kickoff of tests to the data comparison to auto-emailing the results.
- Speed up data testing and validation by as much as 1,000 times.
- Schedule tests to run immediately, on a recurring schedule (e.g., every Tuesday at 2:00am), or when an event such as an ETL job triggers them.
- Utilize the Data Analytics Dashboard and Data Intelligence Reports to analyze your data testing.
- Get 100% coverage with a dramatic decrease in testing time
It will allow you to quickly compare file to file, file to XML, and XML/files to a database without having to import your files into a database first (it also compares database to database).
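For a sense of what such automated comparison involves under the hood (a generic sketch, not QuerySurge), one common approach is to compare row counts and per-row checksums between the source feed and the target table; the file name, table, and connection below are assumptions.

```
import hashlib
import sqlite3

import pandas as pd

# Assumed inputs: a source flat file and a target table in a warehouse.
source = pd.read_csv("orders_feed.csv")
target = pd.read_sql("SELECT * FROM orders", sqlite3.connect("warehouse.db"))

def row_hashes(df: pd.DataFrame) -> set:
    """Checksum each row over a fixed column order so the two sides are comparable."""
    ordered = df[sorted(df.columns)].astype(str)
    return set(
        ordered.apply(lambda row: hashlib.md5("|".join(row).encode()).hexdigest(), axis=1)
    )

assert len(source) == len(target), "Row counts differ between feed and target"
missing = row_hashes(source) - row_hashes(target)
print(f"{len(missing)} source rows have no matching row in the target")
```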
This presentation on batch process analytics was given at Emerson Exchange, 2010. An overview of batch data analytics is presented and information provided on a field trial of on-line batch data analytics at the Lubrizol plant in Rouen, France.
Describes a model to analyze software systems and determine areas of risk. Discusses limitations of typical test design methods and provides an example of how to use the model to create high volume automated testing framework.
Building functional Quality Gates with ReportPortal (Dmitriy Gumeniuk)
Presented at SeleniumConf 2023, this talk explores the experience of building Quality Gates using ReportPortal.io for a test regression suite with 200,000 test cases. The discussion highlights the distinctions between functional and non-functional quality gates, explaining why Sonarqube's Quality Gates may be insufficient. It also outlines how to break down the regression structure to organize execution sequences controlled by quality gate checks. These checks are based on various factors, including functional application aspects, test failure types, test case priorities, tested components, user flows, and more—providing a comprehensive approach to ensuring software quality.
Speaker: Dmitriy Gumeniuk, CEO of ReportPortal.io, Head of Testing Products at EPAM Systems.
The talk on youtube: https://www.youtube.com/watch?v=At5MEWqf_TI
Deliver Trusted Data by Leveraging ETL Testing (Cognizant)
We explore how extract, transform and load (ETL) testing with SQL scripting is crucial to data validation and show how to test data on a large scale in a streamlined manner with an Informatica ETL testing tool.
FlorenceAI: Reinventing Data Science at Humana (Databricks)
Humana strives to help the communities we serve and our individual members achieve their best health – no small task in the past year! We had the opportunity to rethink our existing operations and reimagine what a collaborative ML platform for hundreds of data scientists might look like. The primary goal of our ML Platform, named FlorenceAI, is to automate and accelerate the delivery lifecycle of data science solutions at scale. In this presentation, we will walk through an end-to-end example of how to build a model at scale on FlorenceAI and deploy it to production. Tools highlighted include Azure Databricks, MLFlow, AppInsights, and Azure Data Factory.
We will employ slides, notebooks and code snippets covering problem framing and design, initial feature selection, model design and experimentation, and a framework of centralized production code to streamline implementation. Hundreds of data scientists now use our feature store that has tens of thousands of features refreshed in daily and monthly cadences across several years of historical data. We already have dozens of models in production and also daily provide fresh insights for our Enterprise Clinical Operating Model. Each day, billions of rows of data are generated to give us timely information.
We already have examples of teams operating orders of magnitude faster and at a scale not within reach using fixed on-premise resources. Given rapid adoption from a dozen pilot users to over 100 MAU in the first 5 months, we will also share some anecdotes about key early wins created by the platform. We want FlorenceAI to enable Humana’s data scientists to focus their efforts where they add the most value so we can continue to deliver high-quality solutions that remain fresh, relevant and fair in an ever-changing world.
GraphRAG is All You need? LLM & Knowledge Graph (Guy Korland)
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf (Paige Cruz)
Monitoring and observability aren’t traditionally found in software curriculums and many of us cobble this knowledge together from whatever vendor or ecosystem we were first introduced to and whatever is a part of your current company’s observability stack.
While the dev and ops silo continues to crumble, many organizations still relegate monitoring & observability to the purview of ops, infra and SRE teams. This is a mistake - achieving a highly observable system requires collaboration up and down the stack.
I, a former op, would like to extend an invitation to all application developers to join the observability party, and will share these foundational concepts to build on:
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -... (DanBrown980551)
Do you want to learn how to model and simulate an electrical network from scratch in under an hour?
Then welcome to this PowSyBl workshop, hosted by Rte, the French Transmission System Operator (TSO)!
During the webinar, you will discover the PowSyBl ecosystem as well as handle and study an electrical network through an interactive Python notebook.
PowSyBl is an open source project hosted by LF Energy, which offers a comprehensive set of features for electrical grid modelling and simulation. Among other advanced features, PowSyBl provides:
- A fully editable and extendable library for grid component modelling;
- Visualization tools to display your network;
- Grid simulation tools, such as power flows, security analyses (with or without remedial actions) and sensitivity analyses;
The framework is mostly written in Java, with a Python binding so that Python developers can access PowSyBl functionalities as well.
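As a taste of the Python binding, the sketch below loads the IEEE 14-bus test case that ships with pypowsybl and runs an AC power flow; the exact columns inspected at the end are just examples.

```
import pypowsybl as pp

network = pp.network.create_ieee14()        # bundled IEEE 14-bus test case
result = pp.loadflow.run_ac(network)        # run an AC power flow
print(result[0].status)                     # convergence status of the main component

buses = network.get_buses()                 # pandas DataFrame of bus results
print(buses[["v_mag", "v_angle"]].head())   # voltage magnitudes and angles
```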
What you will learn during the webinar:
- For beginners: discover PowSyBl's functionalities through a quick general presentation and the notebook, without needing any expert coding skills;
- For advanced developers: master the skills to efficiently apply PowSyBl functionalities to your real-world scenarios.
Elevating Tactical DDD Patterns Through Object Calisthenics (Dorra BARTAGUIZ)
After immersing yourself in the blue book and its red counterpart, attending DDD-focused conferences, and applying tactical patterns, you're left with a crucial question: How do I ensure my design is effective? Tactical patterns within Domain-Driven Design (DDD) serve as guiding principles for creating clear and manageable domain models. However, achieving success with these patterns requires additional guidance. Interestingly, we've observed that a set of constraints initially designed for training purposes remarkably aligns with effective pattern implementation, offering a more ‘mechanical’ approach. Let's explore together how Object Calisthenics can elevate the design of your tactical DDD patterns, offering concrete help for those venturing into DDD for the first time!
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf (91mobiles)
91mobiles recently conducted a Smart TV Buyer Insights Survey in which we asked over 3,000 respondents about the TV they own, aspects they look at on a new TV, and their TV buying preferences.
Communications Mining Series - Zero to Hero - Session 1 (DianaGray10)
This session provides an introduction to UiPath Communications Mining, its importance, and a platform overview. You will acquire a good understanding of the phases in Communications Mining as we go over the platform with you. Topics covered:
• Communication Mining Overview
• Why is it important?
• How can it help today’s business and the benefits
• Phases in Communication Mining
• Demo on Platform overview
• Q/A
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor... (Neo4j)
Leonard Jayamohan, Partner & Generative AI Lead, Deloitte
This keynote will reveal how Deloitte leverages Neo4j’s graph power for groundbreaking digital twin solutions, achieving a staggering 100x performance boost. Discover the essential role knowledge graphs play in successful generative AI implementations. Plus, get an exclusive look at an innovative Neo4j + Generative AI solution Deloitte is developing in-house.
Unlocking Productivity: Leveraging the Potential of Copilot in Microsoft 365, a presentation by Christoforos Vlachos, Senior Solutions Manager – Modern Workplace, Uni Systems
Essentials of Automations: The Art of Triggers and Actions in FME (Safe Software)
In this second installment of our Essentials of Automations webinar series, we’ll explore the landscape of triggers and actions, guiding you through the nuances of authoring and adapting workspaces for seamless automations. Gain an understanding of the full spectrum of triggers and actions available in FME, empowering you to enhance your workspaces for efficient automation.
We’ll kick things off by showcasing the most commonly used event-based triggers, introducing you to various automation workflows like manual triggers, schedules, directory watchers, and more. Plus, see how these elements play out in real scenarios.
Whether you’re tweaking your current setup or building from the ground up, this session will arm you with the tools and insights needed to transform your FME usage into a powerhouse of productivity. Join us to discover effective strategies that simplify complex processes, enhancing your productivity and transforming your data management practices with FME. Let’s turn complexity into clarity and make your workspaces work wonders!
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo... (James Anderson)
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
A tale of scale & speed: How the US Navy is enabling software delivery from l... (sonjaschweigert1)
Rapid and secure feature delivery is a goal across every application team and every branch of the DoD. The Navy’s DevSecOps platform, Party Barge, has achieved:
- Reduction in onboarding time from 5 weeks to 1 day
- Improved developer experience and productivity through actionable findings and reduction of false positives
- Maintenance of superior security standards and inherent policy enforcement with Authorization to Operate (ATO)
Development teams can ship efficiently and ensure applications are cyber ready for Navy Authorizing Officials (AOs). In this webinar, Sigma Defense and Anchore will give attendees a look behind the scenes and demo secure pipeline automation and security artifacts that speed up application ATO and time to production.
We will cover:
- How to remove silos in DevSecOps
- How to build efficient development pipeline roles and component templates
- How to deliver security artifacts that matter for ATO’s (SBOMs, vulnerability reports, and policy evidence)
- How to streamline operations with automated policy checks on container images
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024 (Neo4j)
Neha Bajwa, Vice President of Product Marketing, Neo4j
Join us as we explore breakthrough innovations enabled by interconnected data and AI. Discover firsthand how organizations use relationships in data to uncover contextual insights and solve our most pressing challenges – from optimizing supply chains, detecting fraud, and improving customer experiences to accelerating drug discoveries.
Removing Uninteresting Bytes in Software Fuzzing (Aftab Hussain)
Imagine a world where software fuzzing, the process of mutating bytes in test seeds to uncover hidden and erroneous program behaviors, becomes faster and more effective. A lot depends on the initial seeds, which can significantly dictate the trajectory of a fuzzing campaign, particularly in terms of how long it takes to uncover interesting behaviour in your code. We introduce DIAR, a technique designed to speedup fuzzing campaigns by pinpointing and eliminating those uninteresting bytes in the seeds. Picture this: instead of wasting valuable resources on meaningless mutations in large, bloated seeds, DIAR removes the unnecessary bytes, streamlining the entire process.
In this work, we equipped AFL, a popular fuzzer, with DIAR and examined two critical Linux libraries -- Libxml's xmllint, a tool for parsing xml documents, and Binutils' readelf, an essential debugging and security analysis command-line tool used to display detailed information about ELF (Executable and Linkable Format) files. Our preliminary results show that AFL+DIAR not only discovers new paths more quickly but also achieves higher coverage overall. This work thus showcases how starting with lean and optimized seeds can lead to faster, more comprehensive fuzzing campaigns -- and DIAR helps you find such seeds.
- These are slides of the talk given at IEEE International Conference on Software Testing Verification and Validation Workshop, ICSTW 2022.
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf (Peter Spielvogel)
Building better applications for business users with SAP Fiori.
• What is SAP Fiori and why it matters to you
• How a better user experience drives measurable business benefits
• How to get started with SAP Fiori today
• How SAP Fiori elements accelerates application development
• How SAP Build Code includes SAP Fiori tools and other generative artificial intelligence capabilities
• How SAP Fiori paves the way for using AI in SAP apps
Securing your Kubernetes cluster: a step-by-step guide to success! (KatiaHIMEUR1)
Today, after several years of existence, an extremely active community and an ultra-dynamic ecosystem, Kubernetes has established itself as the de facto standard in container orchestration. Thanks to a wide range of managed services, it has never been so easy to set up a ready-to-use Kubernetes cluster.
However, this ease of use means that the subject of security in Kubernetes is often left for later, or even neglected. This exposes companies to significant risks.
In this talk, I'll show you step-by-step how to secure your Kubernetes cluster for greater peace of mind and reliability.
Testing and Deployment - Full Stack Deep Learning
1. Full Stack Deep Learning
Josh Tobin, Sergey Karayev, Pieter Abbeel
Testing & Deployment
2. Full Stack Deep Learning
[Slide diagram: the ML infrastructure and tooling landscape. Data (labeling, storage, versioning, processing, interchange), development and training/evaluation (software engineering, frameworks, database, experiment management, hyperparameter tuning, distributed training, resource management), and deployment (CI/testing, web, hardware/mobile, monitoring), plus “all-in-one” platforms that span the stack.]
Testing & Deployment - Overview
3. Full Stack Deep Learning
Module Overview
• Concepts
• Project structure
• ML Test Score: a rubric for production readiness
• Infrastructure/Tooling
• CI/Testing
• Docker in more depth
• Deploying to the web
• Monitoring
• Deploying to hardware/mobile
4. Full Stack Deep Learning
Training System
- Process raw data
- Run experiments
- Manage results

Prediction System
- Process input data
- Construct networks with trained weights
- Make predictions

Serving System
- Serve predictions
- Scale to demand

Training System Tests
- Test the full training pipeline, starting from raw data
- Runs in < 1 day (nightly)
- Catches upstream regressions

Validation Tests
- Test the prediction system on the validation set, starting from processed data
- Runs in < 1 hour (on every code push)
- Catches model regressions

Functionality Tests
- Test the prediction system on a few important examples
- Runs in < 5 minutes (during development)
- Catches code regressions

Monitoring
- Alert to downtime, errors, distribution shifts
- Catches service and data regressions

(Training/validation data feeds the training and prediction systems; production data feeds the serving system and monitoring.)
Testing & Deployment - Project Structure
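A functionality test of the kind described above can be a small pytest module that runs the prediction system on a few canonical examples, so code regressions surface within minutes. A hedged sketch; the predictor module, fixture paths, and expected strings are placeholders, not the course's actual code:

```
import pytest

# Placeholder import: substitute your project's prediction-system entry point.
from my_project.predictor import load_predictor  # hypothetical module

@pytest.fixture(scope="module")
def predictor():
    # Load trained weights once for the whole test module (keeps the suite fast).
    return load_predictor(weights_path="fixtures/weights.pt")

@pytest.mark.parametrize(
    "image_path, expected_text",
    [
        ("fixtures/hello.png", "hello"),
        ("fixtures/world.png", "world"),
    ],
)
def test_prediction_on_important_examples(predictor, image_path, expected_text):
    assert predictor.predict(image_path) == expected_text
```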
5. Full Stack Deep Learning
[Slide: first page of the paper “The ML Test Score: A Rubric for ML Production Readiness and Technical Debt Reduction”, Eric Breck, Shanqing Cai, Eric Nielsen, Michael Salib, D. Sculley, Google, Inc., IEEE Big Data 2017.]
An exhaustive framework/checklist from practitioners at Google
Testing & Deployment - ML Test Score
6. Full Stack Deep Learning
6
Follow-up to previous work
from Google
Testing & Deployment - ML Test Score
7. Full Stack Deep Learning
7
[Figure from the paper, shown for "Traditional Software" vs. "Machine Learning Software". Caption: ML systems require extensive testing and monitoring; unlike a manually coded system, their behavior is not easily specified in advance and depends on dynamic qualities of the data and on model configuration. The ML side decomposes into the Training System, Prediction System, Serving System, and Data.]
Testing & Deployment - ML Test Score
8. Full Stack Deep Learning
Data Tests (Table I):
1. Feature expectations are captured in a schema.
2. All features are beneficial.
3. No feature's cost is too much.
4. Features adhere to meta-level requirements.
5. The data pipeline has appropriate privacy controls.
6. New features can be added quickly.
7. All input feature code is tested.
Model Tests (Table II):
1. Model specs are reviewed and submitted.
2. Offline and online metrics correlate.
3. All hyperparameters have been tuned.
4. The impact of model staleness is known.
5. A simpler model is not better.
6. Model quality is sufficient on important data slices.
7. The model is tested for considerations of inclusion.
ML Infrastructure Tests (Table III):
1. Training is reproducible.
2. Model specs are unit tested.
3. The ML pipeline is integration tested.
4. Model quality is validated before serving.
5. The model is debuggable.
6. Models are canaried before serving.
7. Serving models can be rolled back.
Monitoring Tests (Table IV):
1. Dependency changes result in notification.
2. Data invariants hold for inputs.
3. Training and serving are not skewed.
4. Models are not too stale.
5. Models are numerically stable.
6. Computing performance has not regressed.
7. Prediction quality has not regressed.
(The screenshots also reference Facets, https://pair-code.github.io/facets/, as a visualization tool for analyzing the data to produce a schema.)
Testing & Deployment - ML Test Score
9. Full Stack Deep Learning
Scoring the Test
Points Description
0 More of a research project than a productionized system.
(0,1] Not totally untested, but it is worth considering the possibility of serious holes in reliability.
(1,2] There’s been first pass at basic productionization, but additional investment may be needed.
(2,3] Reasonably tested, but it’s possible that more of those tests and procedures may be automated.
(3,5] Strong levels of automated testing and monitoring, appropriate for mission-critical systems.
> 5 Exceptional levels of automated testing and monitoring.
Table V: The ML Test Score. The score is computed by taking the minimum of the scores from each of the four test areas; systems at different points in their development may reasonably aim for different points along this scale.
Testing & Deployment - ML Test Score
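A small worked example of the scoring rule, following the paper's rubric (each test scores half a point if performed manually with the results documented, a full point if automated; the final score is the minimum over the four sections). The per-test scores below are made up for illustration.

```python
# Sketch: computing the ML Test Score described in Table V.
# Each test scores 0.0 (not done), 0.5 (done manually, results documented), or 1.0 (automated).
# The final score is the minimum of the four section totals.
section_scores = {
    "data": [1.0, 0.5, 0.5, 0.0, 1.0, 0.5, 1.0],            # 7 data tests
    "model": [0.5, 0.5, 1.0, 0.0, 0.5, 0.5, 0.0],           # 7 model tests
    "infrastructure": [1.0, 1.0, 0.5, 1.0, 0.5, 0.0, 1.0],  # 7 infrastructure tests
    "monitoring": [0.5, 1.0, 0.5, 0.5, 0.0, 1.0, 1.0],      # 7 monitoring tests
}

totals = {name: sum(scores) for name, scores in section_scores.items()}
ml_test_score = min(totals.values())
print(totals, "->", ml_test_score)  # 3.0 here, i.e. the (2,3] band: "reasonably tested"
```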
10. Full Stack Deep Learning
Figure 4 from the paper: average scores for interviewed teams. The graphs display the average score for each test across the 36 systems examined.
• 36 teams with ML systems in production at Google
Results at Google
Testing & Deployment - ML Test Score
12. Full Stack Deep Learning
[The infrastructure and tooling landscape diagram is shown again, turning to the CI/Testing part of Deployment.]
Testing & Deployment - CI/Testing
13. Full Stack Deep Learning
Testing / CI
• Unit / Integration Tests
• Tests for individual module functionality and for the whole system (a minimal pytest sketch follows this slide)
• Continuous Integration
• Tests are run every time new code is pushed to the repository, before the updated model is deployed
• SaaS for CI
• CircleCI, Travis, Jenkins, Buildkite
• Containerization (via Docker)
• A self-enclosed environment for running the tests
Testing & Deployment - CI/Testing
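To make the CI bullets concrete, here is a minimal pytest sketch of a functionality test of the kind described above; the `predict` stand-in and the example inputs are illustrative, not part of the course code.

```python
# Minimal functionality-test sketch in pytest (names and examples are illustrative).
# In a real project, `predict` would be your prediction system's entry point and the
# examples would be a handful of important, checked-in inputs with known answers.
import pytest


def predict(text: str) -> str:
    """Stand-in for the real prediction system; replace with your own import."""
    return text.strip().lower()


@pytest.mark.parametrize(
    "raw_input,expected",
    [("  Hello ", "hello"), ("WORLD", "world")],
)
def test_predict_on_important_examples(raw_input, expected):
    # Catches code regressions: runs in seconds, on every change during development.
    assert predict(raw_input) == expected
```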
14. Full Stack Deep Learning
[The Training / Prediction / Serving systems and tests diagram from slide 4 is repeated here in the CI/Testing context.]
Testing & Deployment - CI/Testing
15. Full Stack Deep Learning
[The four ML Test Score checklists (Data, Model, ML Infrastructure, and Monitoring tests, Tables I-IV listed on slide 8) are shown again.]
Testing & Deployment - CI/Testing
16. Full Stack Deep Learning
CircleCI / Travis / etc
• SaaS for continuous integration
• Integrate with your repository: every push kicks off a cloud job
• Job can be defined as commands in a Docker container, and can store
results for later review
• No GPUs
• CircleCI has a free plan
Testing & Deployment - CI/Testing
17. Full Stack Deep Learning
Jenkins / Buildkite
• Nice option for running CI on your own hardware, in the cloud, or mixed
• Good option for a scheduled training test (can use your GPUs)
• Very flexible
17
18. Full Stack Deep Learning
Docker
• Before proceeding, we have to understand Docker better.
• Docker vs VM
• Dockerfile and layer-based images
• DockerHub and the ecosystem
• Kubernetes and other orchestrators
Testing & Deployment - Docker
19. Full Stack Deep Learning
19
https://medium.freecodecamp.org/a-beginner-friendly-introduction-to-containers-vms-and-docker-79a9e3e119b
No guest OS -> lightweight
Testing & Deployment - Docker
20. Full Stack Deep Learning
• Spin up a container for
every discrete task
• For example, a web app
might have four containers:
• Web server
• Database
• Job queue
• Worker
20
https://www.docker.com/what-container
Lightweight -> heavy use
Testing & Deployment - Docker
21. Full Stack Deep Learning
Dockerfile
Testing & Deployment - Docker
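As a companion to the Dockerfile slide, a small sketch using the Docker SDK for Python (docker-py) to build and run an image, for example from a test harness. It assumes Docker is running locally and a Dockerfile exists in the working directory; the tag and command are made up.

```python
# Sketch: building and running an image with the Docker SDK for Python (docker-py).
import docker

client = docker.from_env()

# Build the image described by ./Dockerfile and tag it.
image, build_logs = client.images.build(path=".", tag="my-app:latest")

# Run a throwaway container from the image and capture its output.
output = client.containers.run(
    "my-app:latest", command="echo hello from a container", remove=True
)
print(output.decode())
```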
22. Full Stack Deep Learning
Strong Ecosystem
• Images are easy to
find, modify, and
contribute back to
DockerHub
• Private images
easy to store in
same place
22
https://docs.docker.com/engine/docker-overview
Testing & Deployment - Docker
23. Full Stack Deep Learning
23
https://www.docker.com/what-container#/package_software
Testing & Deployment - Docker
24. Full Stack Deep Learning
Container Orchestration
• Distributing containers onto underlying VMs
or bare metal
• Kubernetes is the open-source winner, but
cloud providers have good offerings, too.
Testing & Deployment - Docker
26. Full Stack Deep Learning
“All-in-one”
Deployment
or or
Development Training/Evaluation
or
Data
or
Experiment Management
Labeling
Interchange
Storage
Monitoring
?
Hardware /
Mobile
Software Engineering
Frameworks
Database
Hyperparameter TuningDistributed Training
Web
Versioning
Processing
CI / Testing
?
Resource Management
Testing & Deployment - Web Deployment
27. Full Stack Deep Learning
Terminology
• Inference: making predictions (not training)
• Model Serving: ML-specific deployment, usually focused on optimizing
GPU usage
Testing & Deployment - Web Deployment
28. Full Stack Deep Learning
Web Deployment
• REST API
• Serving predictions in response to canonically-formatted HTTP requests (a minimal Flask sketch follows this slide)
• The web server runs and calls the prediction system
• Options
• Deploy code to VMs, scale by adding instances
• Deploy code as containers, scale via orchestration
• Deploy code as a "serverless function"
• Deploy via a model serving solution
Testing & Deployment - Web Deployment
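A minimal sketch of the REST-API option, using Flask; the route, payload format, and the dummy model are illustrative stand-ins for a real prediction system.

```python
# Minimal REST prediction server sketch (Flask). Names and payload format are illustrative.
from flask import Flask, jsonify, request

app = Flask(__name__)


def load_model():
    """Stand-in for loading trained weights; replace with your framework's loader."""
    return lambda instances: [sum(x) for x in instances]  # dummy "model"


model = load_model()  # load once at startup, not per request


@app.route("/predict", methods=["POST"])
def predict():
    payload = request.get_json()  # e.g. {"instances": [[1.0, 2.0, 3.0], ...]}
    return jsonify({"predictions": model(payload["instances"])})


if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8000)
```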
29. Full Stack Deep Learning
[The Training / Prediction / Serving systems and tests diagram from slide 4 is repeated.]
Testing & Deployment - Web Deployment
30. Full Stack Deep Learning
[The four ML Test Score checklists (Tables I-IV, listed on slide 8) appear again in the web deployment context.]
Testing & Deployment - Web Deployment
31. Full Stack Deep Learning
31
DEPLOY CODE TO YOUR OWN BARE METAL
Testing & Deployment - Web Deployment
32. Full Stack Deep Learning
32
DEPLOY CODE TO YOUR OWN BARE METAL
DEPLOY CODE TO CLOUD INSTANCES
Testing & Deployment - Web Deployment
33. Full Stack Deep Learning
Deploying code to cloud instances
• Each instance has to be provisioned with dependencies and app code
• Number of instances is managed manually or via auto-scaling
• Load balancer sends user traffic to the instances
33
34. Full Stack Deep Learning
Deploying code to cloud instances
• Cons:
• Provisioning can be brittle
• Paying for instances even when not using them (auto-scaling does help)
Testing & Deployment - Web Deployment
35. Full Stack Deep Learning
35
DEPLOY DOCKER CONTAINERS TO
CLOUD INSTANCES
DEPLOY CODE TO YOUR OWN BARE METAL
DEPLOY CODE TO CLOUD INSTANCES
Testing & Deployment - Web Deployment
36. Full Stack Deep Learning
Deploying containers
• App code and dependencies are packaged into Docker containers
• Kubernetes or an alternative (e.g., AWS Fargate) orchestrates the containers (DB / workers / web servers / etc.)
Testing & Deployment - Web Deployment
37. Full Stack Deep Learning
Deploying containers
• Cons:
• Still managing your own servers and paying for uptime, not compute-
time
Testing & Deployment - Web Deployment
38. Full Stack Deep Learning
38
DEPLOY CODE TO YOUR OWN BARE METAL
DEPLOY CODE TO CLOUD INSTANCES
DEPLOY DOCKER CONTAINERS TO
CLOUD INSTANCES
DEPLOY SERVERLESS FUNCTIONS
Testing & Deployment - Web Deployment
39. Full Stack Deep Learning
Deploying code as serverless functions
• App code and dependencies are packaged into .zip files, with a single entry-point function (a minimal handler sketch follows this slide)
• AWS Lambda (or Google Cloud Functions, or Azure Functions) manages
everything else: instant scaling to 10,000+ requests per second, load
balancing, etc.
• Only pay for compute-time.
Testing & Deployment - Web Deployment
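A minimal sketch of a Lambda-style entry-point function, assuming an API Gateway proxy integration; the handler name and payload format are illustrative, and the exact size/memory/timeout limits have changed over time.

```python
# Minimal AWS Lambda handler sketch for serving predictions (illustrative names).
# The uploaded .zip contains this file plus dependencies; "lambda_function.lambda_handler"
# would be configured as the entry point. Check current AWS docs for the actual limits.
import json


def _predict(instances):
    """Stand-in for real inference; replace with your model's predict call."""
    return [sum(x) for x in instances]


def lambda_handler(event, context):
    body = json.loads(event["body"])  # API Gateway proxy integration request format
    predictions = _predict(body["instances"])
    return {"statusCode": 200, "body": json.dumps({"predictions": predictions})}
```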
40. Full Stack Deep Learning
40
Testing & Deployment - Web Deployment
41. Full Stack Deep Learning
Deploying code as serverless functions
• Cons:
• Entire deployment package has to fit within 500MB, <5 min execution,
<3GB memory (on AWS Lambda)
• Only CPU execution
Testing & Deployment - Web Deployment
42. Full Stack Deep Learning
[The four ML Test Score checklists (Tables I-IV, listed on slide 8) are shown once more.]
Testing & Deployment - Web Deployment
43. Full Stack Deep Learning
Deploying code as serverless functions
• Canarying
• Easy to have two versions of Lambda functions in production, and start sending low-volume traffic to one (see the boto3 sketch after this slide)
• Rollback
• Easy to switch back to the previous version of the function.
Testing & Deployment - Web Deployment
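One way to realize the canary/rollback pattern above is weighted alias routing on Lambda; a sketch with boto3, where the function name, alias, and version numbers are made up.

```python
# Sketch: canary a new Lambda version, then roll back, via weighted alias routing (boto3).
import boto3

lam = boto3.client("lambda")

# Canary: send ~5% of "prod" alias traffic to version 2 while version 1 keeps the rest.
lam.update_alias(
    FunctionName="predict-fn",
    Name="prod",
    FunctionVersion="1",
    RoutingConfig={"AdditionalVersionWeights": {"2": 0.05}},
)

# Rollback: point the alias back at version 1 only.
lam.update_alias(
    FunctionName="predict-fn",
    Name="prod",
    FunctionVersion="1",
    RoutingConfig={"AdditionalVersionWeights": {}},
)
```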
44. Full Stack Deep Learning
Model Serving
• Web deployment options specialized for machine learning models
• Tensorflow Serving (Google)
• Model Server for MXNet (Amazon)
• Clipper (Berkeley RISE Lab)
• SaaS solutions like Algorithmia
• The most important feature is batching requests for GPU inference
Testing & Deployment - Web Deployment
45. Full Stack Deep Learning
Tensorflow Serving
• Able to serve Tensorflow predictions at high throughput (a request sketch follows this slide)
• Used by Google and their "all-in-one" machine learning offerings
• Overkill unless you know you need to use GPU for inference
[Screenshot from the TensorFlow-Serving paper: Figure 2, the hosted model serving architecture, and Section 2.2.1 on inter-request batching.]
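A sketch of calling TensorFlow Serving over its REST API, assuming a SavedModel is being served under the name `my_model` on the default REST port 8501; the input shape is illustrative.

```python
# Sketch: querying a TensorFlow Serving container over its REST API.
# Example server:
#   docker run -p 8501:8501 -v /path/to/saved_model:/models/my_model \
#       -e MODEL_NAME=my_model tensorflow/serving
import requests

instances = [[1.0, 2.0, 3.0, 4.0]]  # shape must match the model's serving signature
resp = requests.post(
    "http://localhost:8501/v1/models/my_model:predict",
    json={"instances": instances},
)
print(resp.json()["predictions"])
```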
46. Full Stack Deep Learning
MXNet Model Server
• Amazon’s answer to Tensorflow Serving
• Part of their SageMaker “all-in-one” machine learning offering.
46
47. Full Stack Deep Learning
Clipper
• Open-source model serving via
REST, using Docker.
• Nice-to-haves on top of the basics
• e.g., "state-of-the-art bandit and
ensemble methods to
intelligently select and combine
predictions”
• Probably too early to be actually
useful (unless you know why you
need it)
47
48. Full Stack Deep Learning
Algorithmia
• Some startups promise effortless model serving
48
49. Full Stack Deep Learning
Seldon
49
https://www.seldon.io/tech/products/core/
Testing & Deployment - Web Deployment
50. Full Stack Deep Learning
Takeaway
• If you are doing CPU inference, you can get away with scaling by launching more servers, or going serverless.
• The dream is deploying Docker as easily as Lambda.
• The next best thing is either Lambda or Docker, depending on your model's needs.
• If using GPU inference, things like TF Serving and Clipper become useful thanks to adaptive batching, etc.
Testing & Deployment - Web Deployment
52. Full Stack Deep Learning
[The infrastructure and tooling landscape diagram is shown again, now turning to the Monitoring part of Deployment.]
Testing & Deployment - Monitoring
53. Full Stack Deep Learning
[The Training / Prediction / Serving systems and tests diagram from slide 4 is repeated; monitoring is the piece that watches the serving system on production data.]
Testing & Deployment - Monitoring
54. Full Stack Deep Learning
[The four ML Test Score checklists (Tables I-IV, listed on slide 8) are shown again, with the monitoring tests now in focus.]
Testing & Deployment - Monitoring
55. Full Stack Deep Learning
Monitoring
• Alarms for when things go
wrong, and records for tuning
things.
• Cloud providers have decent
monitoring solutions.
• Anything that can be logged can be monitored (e.g., data skew)
• Will explore in lab
Testing & Deployment - Monitoring
56. Full Stack Deep Learning
Data Distribution Monitoring
• Underserved need! (a simple drift-check sketch follows this slide)
56
Domino Data Lab
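A minimal sketch of the idea: log a reference sample of a feature at training time, then periodically compare recent production inputs against it. The synthetic data, the two-sample KS test, and the alert threshold are illustrative choices, not a prescribed method.

```python
# Sketch: simple data-distribution monitoring for one numeric feature.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
train_reference = rng.normal(loc=0.0, scale=1.0, size=10_000)   # logged at training time
production_window = rng.normal(loc=0.4, scale=1.0, size=2_000)  # recent serving traffic

statistic, p_value = ks_2samp(train_reference, production_window)
if p_value < 0.01:
    print(f"ALERT: possible distribution shift (KS={statistic:.3f}, p={p_value:.2e})")
```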
57. Full Stack Deep Learning
Closing the Flywheel
• Important to monitor the business uses of the model, not just its own
statistics
• Important to be able to contribute failures back to the dataset
Testing & Deployment - Monitoring
58. Full Stack Deep Learning
Closing the Flywheel
[Slides 58-61 show the "Closing the Flywheel" diagrams.]
Testing & Deployment - Monitoring
63. Full Stack Deep Learning
[The infrastructure and tooling landscape diagram one more time, now highlighting deployment to hardware and mobile.]
Testing & Deployment - Hardware/Mobile
64. Full Stack Deep Learning
Problems
• Embedded and mobile frameworks are less fully featured than full
PyTorch/Tensorflow
• Have to be careful with architecture
• Interchange format
• Embedded and mobile devices have little memory and slow/expensive
compute
• Have to reduce network size / quantize weights / distill knowledge
Testing & Deployment - Hardware/Mobile
65. Full Stack Deep Learning
Problems
[The "Problems" list from the previous slide is repeated, leading into the mobile frameworks that address it.]
Testing & Deployment - Hardware/Mobile
66. Full Stack Deep Learning
Tensorflow Lite
• Not all models will work, but it is much better than what "Tensorflow Mobile" used to be (a conversion sketch follows this slide)
Testing & Deployment - Hardware/Mobile
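A minimal conversion sketch with the TensorFlow Lite converter; the SavedModel path is illustrative, and post-training quantization is optional.

```python
# Sketch: converting a SavedModel to TensorFlow Lite (paths are illustrative).
import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_saved_model("exported/saved_model")
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # optional post-training quantization
tflite_model = converter.convert()

with open("model.tflite", "wb") as f:
    f.write(tflite_model)
```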
67. Full Stack Deep Learning
PyTorch Mobile
• Optional quantization
• TorchScript: static execution graph
• Libraries for Android and iOS (a preparation sketch follows this slide)
67
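A minimal sketch of the quantize-then-script workflow for PyTorch Mobile; the tiny model is a stand-in for a trained network.

```python
# Sketch: preparing a PyTorch model for mobile with dynamic quantization and TorchScript.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10)).eval()

# Optional quantization: store Linear weights as int8 for a smaller, faster model.
quantized = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

# TorchScript: a static, serializable graph that the Android/iOS runtimes can load.
scripted = torch.jit.script(quantized)
scripted.save("model_for_mobile.pt")
```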
68. Full Stack Deep Learning
68
https://heartbeat.fritz.ai/core-ml-vs-ml-kit-which-mobile-machine-learning-framework-is-right-for-you-e25c5d34c765
CoreML
- Released at Apple WWDC 2017
- Inference only
- https://coreml.store/
- Supposed to work with both CoreML and MLKit
ML Kit
- Announced at Google I/O 2018
- Either via API or on-device
- Offers pre-trained models, or can upload a Tensorflow Lite model
Testing & Deployment - Hardware/Mobile
69. Full Stack Deep Learning
Problems
[The "Problems" list is repeated, now turning to the interchange format.]
Testing & Deployment - Hardware/Mobile
70. Full Stack Deep Learning
• Open Neural Network Exchange: an open-source format for deep learning models
• The dream is to mix different frameworks, so that frameworks that are good for development (PyTorch) don't also have to be good at inference (Caffe2); an export sketch follows this slide
Testing & Deployment - Hardware/Mobile
Interchange Format
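A minimal ONNX export sketch from PyTorch; the model, shapes, and tensor names are placeholders.

```python
# Sketch: exporting a PyTorch model to ONNX so another runtime can serve it.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(3, 8, 3), nn.ReLU(), nn.Flatten(), nn.Linear(8 * 30 * 30, 10)
).eval()
dummy_input = torch.randn(1, 3, 32, 32)  # example input defines the exported graph's shapes

torch.onnx.export(
    model,
    dummy_input,
    "model.onnx",
    input_names=["image"],
    output_names=["logits"],
    dynamic_axes={"image": {0: "batch"}},  # allow variable batch size at inference time
)
```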
71. Full Stack Deep Learning
Testing & Deployment - Hardware/Mobile
Interchange Format
https://docs.microsoft.com/en-us/azure/machine-learning/service/concept-onnx
72. Full Stack Deep Learning
Problems
[The "Problems" list is repeated, now turning to the memory and compute constraints of embedded devices.]
Testing & Deployment - Hardware/Mobile
73. Full Stack Deep Learning
NVIDIA for Embedded
• Todo
Testing & Deployment - Hardware/Mobile
74. Full Stack Deep Learning
Problems
[The "Problems" list again, now focusing on reducing network size.]
Testing & Deployment - Hardware/Mobile
75. Full Stack Deep Learning
[Figure from the linked article comparing standard, 1x1, and Inception-style convolutions with the MobileNet building block.]
https://medium.com/@yu4u/why-mobilenet-and-its-variants-e-g-shufflenet-are-fast-1c7048b9618d
Testing & Deployment - Hardware/Mobile
MobileNets
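A sketch of the depthwise-separable convolution block that MobileNets are built from, with a quick parameter-count comparison against a standard convolution; layer sizes are illustrative.

```python
# Sketch: a depthwise-separable convolution block, the building block behind MobileNets.
# It replaces one standard KxK convolution with a per-channel KxK convolution followed
# by a 1x1 convolution, cutting parameters and FLOPs dramatically.
import torch
import torch.nn as nn


class DepthwiseSeparableConv(nn.Module):
    def __init__(self, in_ch: int, out_ch: int, kernel_size: int = 3):
        super().__init__()
        # groups=in_ch -> each input channel gets its own spatial filter (depthwise)
        self.depthwise = nn.Conv2d(
            in_ch, in_ch, kernel_size, padding=kernel_size // 2, groups=in_ch
        )
        # 1x1 convolution mixes information across channels (pointwise)
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.pointwise(self.depthwise(x))


standard = nn.Conv2d(64, 128, 3, padding=1)
separable = DepthwiseSeparableConv(64, 128)


def count(m):
    return sum(p.numel() for p in m.parameters())


print(count(standard), count(separable))  # the separable block has far fewer parameters
```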
76. Full Stack Deep Learning
76
https://medium.com/@yu4u/why-mobilenet-and-its-variants-e-g-shufflenet-are-fast-1c7048b9618d
Testing & Deployment - Hardware/Mobile
77. Full Stack Deep Learning
Problems
[The "Problems" list one more time, closing with knowledge distillation.]
Testing & Deployment - Hardware/Mobile
78. Full Stack Deep Learning
Knowledge distillation
• First, train a large "teacher" network on the data directly
• Next, train a smaller "student" network to match the "teacher" network's prediction distribution (a loss sketch follows this slide)
78
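A minimal sketch of the distillation objective: soften both networks' logits with a temperature, match them with a KL term, and mix in the ordinary cross-entropy on the true labels. The logits here are random placeholders, and the temperature and weighting are illustrative.

```python
# Sketch: a knowledge-distillation loss for training a small "student" network.
import torch
import torch.nn.functional as F

temperature = 4.0  # soften the distributions so small logit differences still carry signal
alpha = 0.9        # weight on matching the teacher vs. fitting the true labels

student_logits = torch.randn(8, 10, requires_grad=True)  # placeholder for student outputs
teacher_logits = torch.randn(8, 10)                      # placeholder for teacher outputs
labels = torch.randint(0, 10, (8,))

soft_targets = F.softmax(teacher_logits / temperature, dim=-1)
log_student = F.log_softmax(student_logits / temperature, dim=-1)

distill_loss = F.kl_div(log_student, soft_targets, reduction="batchmean") * temperature ** 2
hard_loss = F.cross_entropy(student_logits, labels)
loss = alpha * distill_loss + (1 - alpha) * hard_loss
loss.backward()
```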
79. Full Stack Deep Learning
Recommended case study: DistilBERT
79
https://medium.com/huggingface/distilbert-8cf3380435b5