These are the slides for Data Engineering Track Module 2, prepared for the University of Toronto in March 2022. Watch the playlist at https://www.youtube.com/playlist?list=PLWoneCyhdP1DWijBQo7zj2uJbuEXaE6E2
C19013010 the tutorial to build shared ai services session 1 (Bill Liu)
This document provides an agenda and overview for a tutorial on building shared AI services. The tutorial consists of two modules: the first module discusses a case study of AI as a service and challenges of traditional machine learning, and how deep learning can help address these challenges. The second module introduces Keras and options for running Keras on Spark, including a use case, code lab, and prerequisites for running the code lab in Docker containers.
This presentation gives an overview of analytics and machine learning. It also covers Microsoft's contributions to the machine learning space, including Azure ML Studio, a SaaS-based portal for creating, experimenting with, and sharing machine learning solutions with the external world.
Introduction to Automatic Machine Learning (Sri Ambati)
How can you bring machine learning to the masses? Machine learning projects struggle with finding talent, the time it takes to build and deploy models, and trusting the models that are built.
How can you enable multiple teams in your organization to build accurate ML models without being experts in data science or machine learning?
Are you wondering about the different flavors of AutoML?
H2O Driverless AI employs the techniques of expert data scientists in an easy-to-use application that helps scale your data science efforts. Driverless AI lets data scientists work on projects faster, using automation and the state-of-the-art computing power of GPUs to accomplish in minutes tasks that used to take months.
With H2O Driverless AI, everyone, including expert and junior data scientists, domain scientists, and data engineers, can develop trusted machine learning models. This next-generation machine learning platform offers unique and advanced functionality for data visualization, feature engineering, model interpretability, and low-latency deployment.
H2O Driverless AI does:
* Automatic data visualization
* Grandmaster-level automatic feature engineering
* Automatic model selection
* Automatic model tuning and training
* Automatic parallelization using multiple CPUs or GPUs
* Automatic model ensembling
* Automatic machine learning interpretability (MLI)
* Automatic generation of scoring code
Want to try it yourself? You can get a free trial here: H2O Driverless AI trial.
Come to this session and find out how to get started with automatic machine learning using H2O Driverless AI, and build powerful models with just a few clicks.
See you soon!
About H2O.ai
H2O.ai is a visionary open-source software company from Silicon Valley that created and reimagined what is possible. We are a company of makers who brought new platforms and technologies to market to drive the artificial intelligence movement. We are the creators of H2O, the leading open-source data science and machine learning platform, used by nearly half of the Fortune 500 and trusted by more than 14,000 organizations and hundreds of thousands of data scientists around the world.
Microsoft DevOps for AI with GoDataDriven (GoDataDriven)
Artificial Intelligence (AI) and machine learning (ML) technologies extend the capabilities of software applications that are now found throughout our daily life: digital assistants, facial recognition, photo captioning, banking services, and product recommendations. The difficult part about integrating AI or ML into an application is not the technology, or the math, or the science or the algorithms. The challenge is getting the model deployed into a production environment and keeping it operational and supportable. Software development teams know how to deliver business applications and cloud services. AI/ML teams know how to develop models that can transform a business. But when it comes to putting the two together to implement an application pipeline specific to AI/ML — to automate it and wrap it around good deployment practices — the process needs some effort to be successful.
One of the most popular buzzwords in the technology world today is “Machine Learning (ML).” Most economists and business experts foresee Machine Learning changing every aspect of our lives in the next 10 years by automating and optimizing processes. This is leading many organizations to seek experts who can implement Machine Learning in their businesses.
The paper is written for statistical programmers who want to explore a Machine Learning career, add Machine Learning skills to their experience, or enter the Machine Learning field. It discusses my personal journey from statistical programmer to Machine Learning Engineer: what motivated me to start a Machine Learning career, how I started it, and what I have learned and done to become a Machine Learning Engineer. In addition, the paper discusses the future of Machine Learning in the pharmaceutical industry, especially in biometrics departments.
Keynote presentation from the ECBS conference. The talk is about how to use machine learning and AI to improve software engineering, drawing on experiences from our project in Software Center (www.software-center.se).
TechEvent 2019: Artificial Intelligence in Dev & Ops; Martin Luckow - Trivadis (Trivadis)
This document provides an overview of artificial intelligence trends and applications in development and operations. It discusses how AI is being used for rapid prototyping, intelligent programming assistants, automatic error handling and code refactoring, and strategic decision making. Examples are given of AI tools from Microsoft, Facebook, and Codota. The document also discusses challenges like interpretability of neural networks and outlines a vision of "Software 2.0" where programs are generated automatically to satisfy goals. It emphasizes that AI will transform software development over the next 10 years.
201906 02 Introduction to AutoML with ML.NET 1.0 (Mark Tabladillo)
ML.NET 1.0 release is the first major milestone of a great journey that started in May 2018 when we released ML.NET 0.1 as open source. ML.NET is an open-source and cross-platform machine learning framework for .NET developers. Using ML.NET, developers can leverage their existing tools and skillsets to develop and infuse custom AI into their applications by creating custom machine learning models for common scenarios like Sentiment Analysis, Recommendation, Image Classification and more.
“Automated ML” is a collection of new technologies from Microsoft to enhance the data science development process. Still in preview, Auto ML for ML.NET 1.0 will be demonstrated in a Deep Learning Virtual Machine running Windows Server 2016. Code examples are in C# and run in Visual Studio Community 2019.
This presentation is the second of four related to ML.NET and Automated ML. The presentation will be recorded with video posted to this YouTube Channel: http://bit.ly/2ZybKwI
This document provides an introduction to deep learning with Microsoft's Cognitive Toolkit (CNTK). It discusses key deep learning concepts and how they are implemented in CNTK, including neural networks, backpropagation, loss functions, and common network architectures like convolutional neural networks. It also outlines several of Microsoft's products that use deep learning like Cortana, Bing, and Skype Translator. Examples of training deep learning models with CNTK on datasets like MNIST using logistic regression, multi-layer perceptrons, and CNNs are also presented.
Machine Learning is often discussed in the context of data science, but little attention is given to the complexities of engineering production ready ML systems. This talk will explore some of the important challenges and provide advice on solutions to these problems.
Design Patterns for Machine Learning in Production - Sergei Izrailev, Chief D... (Sri Ambati)
Presented at #H2OWorld 2017 in Mountain View, CA.
Enjoy the video: https://youtu.be/-rGRHrED94Y.
Learn more about H2O.ai: https://www.h2o.ai/.
Follow @h2oai: https://twitter.com/h2oai.
- - -
Abstract:
Most machine learning systems enable two essential processes: creating a model and applying the model in a repeatable and controlled fashion. These two processes are interrelated and pose technological and organizational challenges as they evolve from research to prototype to production. This presentation outlines common design patterns for tackling such challenges while implementing machine learning in a production environment.
Sergei's Bio:
Dr. Sergei Izrailev is Chief Data Scientist at BeeswaxIO, where he is responsible for data strategy and building AI applications powering the next generation of real-time bidding technology. Before Beeswax, Sergei led data science teams at Integral Ad Science and Collective, where he focused on architecture, development and scaling of data science based advertising technology products. Prior to advertising, Sergei was a quant/trader and developed trading strategies and portfolio optimization methodologies. Previously, he worked as a senior scientist at Johnson & Johnson, where he developed intelligent tools for structure-based drug discovery. Sergei holds a Ph.D. in Physics and Master of Computer Science degrees from the University of Illinois at Urbana-Champaign.
AI algorithms offer great promise in criminal justice, credit scoring, hiring and other domains. However, algorithmic fairness is a legitimate concern. Possible bias and adversarial contamination can come from training data, inappropriate data handling/model selection or incorrect algorithm design. This talk discusses how to build an open, transparent, secure and fair pipeline that fully integrates into the AI lifecycle — leveraging open-source projects such as AI Fairness 360 (AIF360), Adversarial Robustness Toolbox (ART), the Fabric for Deep Learning (FfDL) and the Model Asset eXchange (MAX).
Apache® Spark™ MLlib 2.x: How to Productionize your Machine Learning Models (Anyscale)
Apache Spark has rapidly become a key tool for data scientists to explore, understand and transform massive datasets and to build and train advanced machine learning models. The question then becomes: how do I deploy these models to a production environment? How do I embed what I have learned into customer-facing data applications?
In this webinar, we will:
- discuss best practices from Databricks on how our customers productionize machine learning models,
- do a deep dive with actual customer case studies,
- show live tutorials of a few example architectures and code in Python, Scala, Java and SQL.
Bring Your Own Recipes Hands-On Session (Sri Ambati)
1. Driverless AI can be used across many industries like banking, healthcare, telecom, and marketing to save time and money through tasks like fraud detection, customer churn prediction, and personalized recommendations.
2. The document highlights new features in Driverless AI 1.7.1 including improved time series recipes, natural language processing features, automatic visualization, and machine learning interpretability tools.
3. Driverless AI provides fully automated machine learning through techniques such as automatic feature engineering, model tuning, standalone scoring pipelines, and massively parallel processing to find optimal solutions.
Monitoring AI applications with AI
The best-performing offline algorithm can lose in production. The most accurate model does not always improve business metrics. Environment misconfiguration or upstream data pipeline inconsistency can silently kill model performance. Neither prodops, data science, nor engineering teams are equipped to detect, monitor, and debug these types of incidents.
Could Microsoft have tested the Tay chatbot in advance and then monitored and adjusted it continuously in production to prevent its unexpected behaviour? Real mission-critical AI systems require an advanced monitoring and testing ecosystem that enables continuous and reliable delivery of machine learning models and data pipelines into production. Common production incidents include:
- Data drifts, new data, wrong features
- Vulnerability issues, malicious users
- Concept drifts
- Model degradation
- Biased training set / training issue
- Performance issue
In this demo-based talk we discuss a solution, tooling and architecture that allow machine learning engineers to be involved in the delivery phase and take ownership of the deployment and monitoring of machine learning pipelines.
It allows data scientists to safely deploy early results as end-to-end AI applications in self-serve mode, without assistance from engineering and operations teams. It shifts experimentation and even training phases from offline datasets to live production and closes the feedback loop between research and production.
The technical part of the talk will cover the following topics:
- Automatic data profiling
- Anomaly detection
- Clustering of inputs and outputs of the model
- A/B testing
- Service mesh, Envoy Proxy, traffic shadowing
- Stateless and stateful models
- Monitoring of regression, classification and prediction models
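As a rough illustration of the data-drift monitoring mentioned in the talk description above, the sketch below compares a live feature sample against a reference sample using the two-sample Kolmogorov-Smirnov statistic. This is not the tooling presented in the talk; the function names `ks_statistic` and `has_drifted` and the `0.3` threshold are assumptions chosen for the example.

```python
from bisect import bisect_right


def ks_statistic(reference, live):
    """Two-sample Kolmogorov-Smirnov statistic.

    Returns the largest absolute gap between the empirical CDFs of the
    two samples (0.0 means identical distributions, 1.0 means disjoint).
    """
    ref = sorted(reference)
    cur = sorted(live)
    max_gap = 0.0
    # The gap between two step functions peaks at one of the sample points.
    for x in ref + cur:
        cdf_ref = bisect_right(ref, x) / len(ref)
        cdf_cur = bisect_right(cur, x) / len(cur)
        max_gap = max(max_gap, abs(cdf_ref - cdf_cur))
    return max_gap


def has_drifted(reference, live, threshold=0.3):
    """Flag drift when the KS statistic exceeds a hypothetical threshold."""
    return ks_statistic(reference, live) > threshold


if __name__ == "__main__":
    training_ages = [21, 25, 30, 34, 41, 48, 52, 60]
    production_ages = [61, 64, 66, 70, 72, 75, 79, 83]
    print(has_drifted(training_ages, production_ages))
```

In practice the threshold would be tuned per feature, and a library routine with a proper p-value would likely replace this hand-rolled statistic.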
Data Summer Conf 2018, “Monitoring AI with AI (RUS)” — Stepan Pushkarev, CTO ... (Provectus)
In this demo-based talk we discuss a solution, tooling and architecture that allow machine learning engineers to be involved in the delivery phase and take ownership of the deployment and monitoring of machine learning pipelines. It allows data scientists to safely deploy early results as end-to-end AI applications in self-serve mode, without assistance from engineering and operations teams. It shifts experimentation and even training phases from offline datasets to live production and closes the feedback loop between research and production.
1) The document summarizes a research update presentation on software engineering and artificial intelligence given by Assistant Professor Nacha Chondamrongkul.
2) It discusses how software engineering research tackles different stages of software production to minimize costs, efforts, and failures. It also examines how AI can be applied to enhance software engineering processes and how software engineering principles are needed to develop AI systems.
3) Key challenges discussed include how to specify requirements for intelligent systems, test AI systems given their unpredictability, and address issues around reliability, fairness, and deployment when integrating machine learning models into complex software.
Machine learning drove massive growth at consumer internet companies over the last decade, and this was enabled by open software, datasets, and AI research. For many problems, machine learning will produce better, faster, and more repeatable decisions at scale. Unfortunately, building and maintaining these systems is still extremely difficult and expensive. As more machine learning software moves to production, many of our traditional tools and best practices in software development will change.
Pete Skomoroch walks you through what you need to know as we shift from a world of deterministic programs to systems that give unpredictable results on ever-changing training data. To navigate this world powered by nondeterministic data-dependent programs, we’ll also need a new development stack to help us write, test, deploy, and monitor machine learning software.
Presented at OSCON Portland July 18, 2019
Webinar: Machine Learning for Microcontrollers (Embarcados)
This webinar presents artificial intelligence concepts, the development tools available with MPLAB X and Harmony 3 integration, and a demonstration of an anomaly detection system using a microcontroller from the ATSAMD21 family (ARM Cortex M0+).
Bridging the Gap: from Data Science to Production (Florian Wilhelm)
A recent but quite common observation in industry is that although the overall adoption of data science is high, many companies struggle to get it into production. Huge teams of well-paid data scientists often present one fancy model after another to their managers, but their proofs of concept never turn into something business relevant. The frustration grows on both sides, among managers and data scientists.
In my talk I elaborate on the many reasons why taking data science to production is such a hard nut to crack. I start with a taxonomy of data use cases to make it easier to assess technical requirements. Building on that, I focus on overcoming the two-language problem: Python/R, loved by data scientists, vs. the enterprise-established Java/Scala. From my project experience I present three different solutions, namely 1) migrating to a single language, 2) reimplementation and 3) usage of a framework. The advantages and disadvantages of each approach are presented, and general advice based on the introduced taxonomy is given.
Additionally, my talk addresses organisational problems as well as problems in quality assurance and deployment. Best practices and further references are presented at a high level in order to cover all facets of data science to production.
With my talk I hope to convey the message that breakdowns on the road from data science to production are rather the rule than the exception, so you are not alone. At the end of my talk, you will have a better understanding of why your team and you are struggling and what to do about it.
Since its beginning, the Performance Advisory Council has aimed to promote engagement between experts from around the world and to create relevant, value-added content sharing between members. For Neotys, the aim is also to strengthen our position as a thought leader in load & performance testing. During this event, 12 participants convened in Chamonix (France) to explore several topics on the minds of today’s performance testers, such as DevOps, Shift Left/Right, Test Automation, Blockchain and Artificial Intelligence.
1) The document discusses how systems engineering methods can be integrated with the AI/ML lifecycle to engineer intelligent systems. It identifies 10 major challenges for this integration, including describing AI/ML model needs and capabilities, integrating AI/ML into specification, verification, and other systems engineering processes.
2) The document proposes concepts for tackling each challenge, such as using standards to describe AI/ML model lifecycles and digital twin environments for verification. It also discusses opportunities like reusing existing AI/ML models and the need to educate new professionals.
3) Key points are that research is active in integrating systems engineering and AI/ML to build safer, more cost-effective cyber-physical systems, and
MLOps and Data Quality: Deploying Reliable ML Models in Production (Provectus)
Looking to build a robust machine learning infrastructure to streamline MLOps? Learn from Provectus experts how to ensure the success of your MLOps initiative by implementing Data QA components in your ML infrastructure.
For most organizations, the development of multiple machine learning models, their deployment and maintenance in production are relatively new tasks. Join Provectus as we explain how to build an end-to-end infrastructure for machine learning, with a focus on data quality and metadata management, to standardize and streamline machine learning life cycle management (MLOps).
Agenda
- Data Quality and why it matters
- Challenges and solutions of Data Testing
- Challenges and solutions of Model Testing
- MLOps pipelines and why they matter
- How to expand validation pipelines for Data Quality
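To make the data-testing agenda items above a little more concrete, here is a minimal sketch of a rule-based check that a validation pipeline might run on each incoming batch. It is not the Provectus implementation; `ColumnRule`, `validate_rows`, and the example columns are hypothetical names invented for this illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ColumnRule:
    """Hypothetical expectation for one column of a batch."""
    name: str
    dtype: type
    min_value: Optional[float] = None
    max_value: Optional[float] = None
    nullable: bool = False


def validate_rows(rows, rules):
    """Return a list of violation messages; an empty list means the batch passed."""
    violations = []
    for i, row in enumerate(rows):
        for rule in rules:
            value = row.get(rule.name)
            if value is None:
                if not rule.nullable:
                    violations.append(f"row {i}: {rule.name} is missing")
                continue
            if not isinstance(value, rule.dtype):
                violations.append(
                    f"row {i}: {rule.name} has type {type(value).__name__}"
                )
                continue
            if rule.min_value is not None and value < rule.min_value:
                violations.append(f"row {i}: {rule.name} below minimum")
            if rule.max_value is not None and value > rule.max_value:
                violations.append(f"row {i}: {rule.name} above maximum")
    return violations


if __name__ == "__main__":
    rules = [ColumnRule("age", int, 0, 120), ColumnRule("country", str)]
    batch = [{"age": 30, "country": "CA"}, {"age": -5, "country": "CA"}]
    for message in validate_rows(batch, rules):
        print(message)
```

A pipeline could refuse to train or score on a batch whenever the returned list is non-empty, which keeps bad data from silently reaching the model.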
Automated machine learning (automated ML) automates feature engineering, algorithm and hyperparameter selection to find the best model for your data. The mission: Enable automated building of machine learning with the goal of accelerating, democratizing and scaling AI. This presentation covers some recent announcements of technologies related to Automated ML, and especially for Azure. The demonstrations focus on Python with Azure ML Service and Azure Databricks.
There are so many external APIs (OpenAI, Bard, ...) and open-source models (LLAMA, Mistral, ...) that building a user-facing application must be easy! What could go wrong? What do we have to think about before creating experiences?
Here is a short glimpse of some of the things you need to think about when building your own application:
- Fine-tuning or using pre-trained models
- Token optimizations: every word costs time and money
- Building small ML models vs using prompts for all tasks
- Prompt engineering
- Prompt versioning
- Building an evaluation framework
- Engineering challenges for streaming data
- Moderation & safety of LLMs
... and the list goes on.
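One item from the list above, prompt versioning, can be sketched in a few lines. The example below is only one possible approach, assuming prompts are plain Python format strings and that a content hash is an acceptable version identifier; `PromptRegistry` and its methods are hypothetical names, not part of any LLM vendor's API.

```python
import hashlib


class PromptRegistry:
    """Keeps every registered version of a named prompt template."""

    def __init__(self):
        # name -> list of (version_id, template), oldest first
        self._versions = {}

    def register(self, name, template):
        """Store a template and return a short content-derived version id."""
        version_id = hashlib.sha256(template.encode("utf-8")).hexdigest()[:12]
        history = self._versions.setdefault(name, [])
        # Re-registering the current template does not create a new entry.
        if not history or history[-1][0] != version_id:
            history.append((version_id, template))
        return version_id

    def render(self, name, version_id=None, **variables):
        """Fill in a template; defaults to the most recently registered version."""
        history = self._versions[name]
        if version_id is None:
            template = history[-1][1]
        else:
            template = dict(history)[version_id]
        return template.format(**variables)


if __name__ == "__main__":
    registry = PromptRegistry()
    v1 = registry.register("summarize", "Summarize: {text}")
    registry.register("summarize", "Summarize in one sentence: {text}")
    print(registry.render("summarize", text="hello"))
    print(registry.render("summarize", version_id=v1, text="hello"))
```

Logging the version id next to each model response makes it possible to tie an evaluation result back to the exact prompt that produced it.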
In this webinar, data science expert and CEO of cnvrg.io Yochay Ettun discusses continual learning in production. The webinar examines continual learning and will help you apply it to your production models using tools like TensorFlow, Kubernetes, and cnvrg.io. Aimed at professional data scientists, it goes over how to monitor models in production and how to set up automatically adaptive machine learning.
Key webinar takeaways:
- Understanding of continual learning
- Optimizing your models for accuracy with continual learning
- How to use TensorFlow, Kubernetes and cnvrg.io to apply CL to your models
- How you can build automatically adaptive machine learning
- Adapting to shifting data distributions
- Coping with outliers
- Retraining in production
- Adapting to new tasks
- A/B test your models
- Deploying your machine learning pipeline to production
Watch all our webinars at https://cnvrg.io/webinars-and-workshops/
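The continual-learning takeaways above mention monitoring models and retraining in production. One simple way to connect the two is a rolling-accuracy trigger, sketched below. This is an assumption-laden illustration rather than the cnvrg.io workflow: `RetrainTrigger`, the window size, and the accuracy floor are all hypothetical.

```python
from collections import deque


class RetrainTrigger:
    """Signals retraining when accuracy over a sliding window drops too low."""

    def __init__(self, window=100, min_accuracy=0.8):
        # Only the most recent `window` outcomes are kept.
        self.outcomes = deque(maxlen=window)
        self.min_accuracy = min_accuracy

    def record(self, prediction, actual):
        """Store whether a prediction matched the label that arrived later."""
        self.outcomes.append(prediction == actual)

    def accuracy(self):
        """Accuracy over the current window, or None if nothing is recorded."""
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def should_retrain(self):
        """True once the window is full and accuracy is below the floor."""
        window_full = len(self.outcomes) == self.outcomes.maxlen
        return window_full and self.accuracy() < self.min_accuracy


if __name__ == "__main__":
    trigger = RetrainTrigger(window=4, min_accuracy=0.75)
    for predicted, actual in [(1, 1), (0, 0), (1, 0), (0, 1)]:
        trigger.record(predicted, actual)
    print(trigger.should_retrain())
```

Waiting for a full window avoids retraining on a handful of early mistakes; a production setup would typically also rate-limit how often retraining can fire.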
Pranav Prakash is a VP of engineering who has worked on projects involving machine learning, computer vision, and recommendations. The document discusses fundamental concepts in artificial intelligence including intelligent search algorithms. It covers categories of machine learning such as supervised, unsupervised, and reinforcement learning. Popular machine learning techniques like classification, clustering, and regression are described. Real-life applications of machine learning like recommender systems, sentiment analysis, and object recognition are also mentioned.
201906 02 Introduction to AutoML with ML.NET 1.0Mark Tabladillo
ML.NET 1.0 release is the first major milestone of a great journey that started in May 2018 when we released ML.NET 0.1 as open source. ML.NET is an open-source and cross-platform machine learning framework for .NET developers. Using ML.NET, developers can leverage their existing tools and skillsets to develop and infuse custom AI into their applications by creating custom machine learning models for common scenarios like Sentiment Analysis, Recommendation, Image Classification and more.
“Automated ML” is a collection of new technologies from Microsoft to enhance the data science development process. Still in preview, Auto ML for ML.NET 1.0 will be demonstrated in a Deep Learning Virtual Machine running Windows Server 2016. Code examples are in C# and run in Visual Studio Community 2019.
This presentation is the second of four related to ML.NET and Automated ML. The presentation will be recorded with video posted to this YouTube Channel: http://bit.ly/2ZybKwI
This document provides an introduction to deep learning with Microsoft's Cognitive Toolkit (CNTK). It discusses key deep learning concepts and how they are implemented in CNTK, including neural networks, backpropagation, loss functions, and common network architectures like convolutional neural networks. It also outlines several of Microsoft's products that use deep learning like Cortana, Bing, and Skype Translator. Examples of training deep learning models with CNTK on datasets like MNIST using logistic regression, multi-layer perceptrons, and CNNs are also presented.
Machine Learning is often discussed in the context of data science, but little attention is given to the complexities of engineering production ready ML systems. This talk will explore some of the important challenges and provide advice on solutions to these problems.
Design Patterns for Machine Learning in Production - Sergei Izrailev, Chief D...Sri Ambati
Presented at #H2OWorld 2017 in Mountain View, CA.
Enjoy the video: https://youtu.be/-rGRHrED94Y.
Learn more about H2O.ai: https://www.h2o.ai/.
Follow @h2oai: https://twitter.com/h2oai.
- - -
Abstract:
Most machine learning systems enable two essential processes: creating a model and applying the model in a repeatable and controlled fashion. These two processes are interrelated and pose technological and organizational challenges as they evolve from research to prototype to production. This presentation outlines common design patterns for tackling such challenges while implementing machine learning in a production environment.
Sergei's Bio:
Dr. Sergei Izrailev is Chief Data Scientist at BeeswaxIO, where he is responsible for data strategy and building AI applications powering the next generation of real-time bidding technology. Before Beeswax, Sergei led data science teams at Integral Ad Science and Collective, where he focused on architecture, development and scaling of data science based advertising technology products. Prior to advertising, Sergei was a quant/trader and developed trading strategies and portfolio optimization methodologies. Previously, he worked as a senior scientist at Johnson & Johnson, where he developed intelligent tools for structure-based drug discovery. Sergei holds a Ph.D. in Physics and Master of Computer Science degrees from the University of Illinois at Urbana-Champaign.
AI algorithms offer great promise in criminal justice, credit scoring, hiring and other domains. However, algorithmic fairness is a legitimate concern. Possible bias and adversarial contamination can come from training data, inappropriate data handling/model selection or incorrect algorithm design. This talk discusses how to build an open, transparent, secure and fair pipeline that fully integrates into the AI lifecycle — leveraging open-source projects such as AI Fairness 360 (AIF360), Adversarial Robustness Toolbox (ART), the Fabric for Deep Learning (FfDL) and the Model Asset eXchange (MAX).
Apache ® Spark™ MLlib 2.x: How to Productionize your Machine Learning ModelsAnyscale
Apache Spark has rapidly become a key tool for data scientists to explore, understand and transform massive datasets and to build and train advanced machine learning models. The question then becomes, how do I deploy these models to a production environment? How do I embed what I have learned into customer facing data applications?
In this webinar, we will:
- discuss best practices from Databricks on how our customers productionize machine learning models,
- do a deep dive with actual customer case studies,
- show live tutorials of a few example architectures and code in Python, Scala, Java and SQL.
Bring Your Own Recipes Hands-On Session Sri Ambati
1. Driverless AI can be used across many industries like banking, healthcare, telecom, and marketing to save time and money through tasks like fraud detection, customer churn prediction, and personalized recommendations.
2. The document highlights new features in Driverless AI 1.7.1 including improved time series recipes, natural language processing features, automatic visualization, and machine learning interpretability tools.
3. Driverless AI provides fully automated machine learning through techniques such as automatic feature engineering, model tuning, standalone scoring pipelines, and massively parallel processing to find optimal solutions.
Monitoring AI applications with AI
The best performing offline algorithm can lose in production. The most accurate model does not always improve business metrics. Environment misconfiguration or upstream data pipeline inconsistency can silently kill the model performance. Neither prodops, data science nor engineering teams are equipped to detect, monitor and debug such incidents.
Could Microsoft have tested the Tay chatbot in advance and then monitored and adjusted it continuously in production to prevent its unexpected behaviour? Real mission critical AI systems require an advanced monitoring and testing ecosystem that enables continuous and reliable delivery of machine learning models and data pipelines into production. Common production incidents include:
Data drifts, new data, wrong features
Vulnerability issues, malicious users
Concept drifts
Model Degradation
Biased Training set / training issue
Performance issue
In this demo based talk we discuss a solution, tooling and architecture that allows machine learning engineers to be involved in the delivery phase and take ownership of the deployment and monitoring of machine learning pipelines.
It allows data scientists to safely deploy early results as end-to-end AI applications in a self serve mode without assistance from engineering and operations teams. It shifts experimentation and even training phases from offline datasets to live production and closes a feedback loop between research and production.
The technical part of the talk will cover the following topics:
Automatic Data Profiling
Anomaly Detection
Clustering of inputs and outputs of the model
A/B Testing
Service Mesh, Envoy Proxy, traffic shadowing
Stateless and stateful models
Monitoring of regression, classification and prediction models
Data Summer Conf 2018, “Monitoring AI with AI (RUS)” — Stepan Pushkarev, CTO ...Provectus
In this demo based talk we discuss a solution, tooling and architecture that allows machine learning engineers to be involved in the delivery phase and take ownership of the deployment and monitoring of machine learning pipelines. It allows data scientists to safely deploy early results as end-to-end AI applications in a self serve mode without assistance from engineering and operations teams. It shifts experimentation and even training phases from offline datasets to live production and closes a feedback loop between research and production.
1) The document summarizes a research update presentation on software engineering and artificial intelligence given by Assistant Professor Nacha Chondamrongkul.
2) It discusses how software engineering research tackles different stages of software production to minimize costs, efforts, and failures. It also examines how AI can be applied to enhance software engineering processes and how software engineering principles are needed to develop AI systems.
3) Key challenges discussed include how to specify requirements for intelligent systems, test AI systems given their unpredictability, and address issues around reliability, fairness, and deployment when integrating machine learning models into complex software.
Machine learning drove massive growth at consumer internet companies over the last decade, and this was enabled by open software, datasets, and AI research. For many problems, machine learning will produce better, faster, and more repeatable decisions at scale. Unfortunately, building and maintaining these systems is still extremely difficult and expensive. As more machine learning software moves to production, many of our traditional tools and best practices in software development will change.
Pete Skomoroch walks you through what you need to know as we shift from a world of deterministic programs to systems that give unpredictable results on ever-changing training data. To navigate this world powered by nondeterministic data-dependent programs, we’ll also need a new development stack to help us write, test, deploy, and monitor machine learning software.
Presented at OSCON Portland July 18, 2019
Webinar: Machine Learning para MicrocontroladoresEmbarcados
In this webinar, we will present artificial intelligence concepts, the development tools available with MPLAB X and Harmony 3 integration, and a demonstration of an anomaly detection system using a microcontroller from the ATSAMD21 family (ARM Cortex M0+).
Bridging the Gap: from Data Science to Production (Florian Wilhelm)
A recent but quite common observation in industry is that although there is an overall high adoption of data science, many companies struggle to get it into production. Huge teams of well-paid data scientists often present one fancy model after the other to their managers but their proofs of concept never turn into something business relevant. The frustration grows on both sides, managers and data scientists.
In my talk I elaborate on the many reasons why data science to production is such a hard nut to crack. I start with a taxonomy of data use cases in order to assess technical requirements more easily. Based on this, my focus lies on overcoming the two-language problem, which is Python/R loved by data scientists vs. the enterprise-established Java/Scala. From my project experience I present three different solutions, namely 1) migrating to a single language, 2) reimplementation and 3) usage of a framework. The advantages and disadvantages of each approach are presented and general advice based on the introduced taxonomy is given.
Additionally, my talk also addresses organisational problems as well as problems in quality assurance and deployment. Best practices and further references are presented at a high level in order to cover all facets of data science to production.
With my talk I hope to convey the message that breakdowns on the road from data science to production are rather the rule than the exception, so you are not alone. At the end of my talk, you will have a better understanding of why your team and you are struggling and what to do about it.
Since its beginning, the Performance Advisory Council has aimed to promote engagement between experts from around the world and to create relevant, value-added content sharing between members. For Neotys, it also strengthens our position as a thought leader in load & performance testing. During this event, 12 participants convened in Chamonix (France) to explore several topics on the minds of today’s performance testers, such as DevOps, Shift Left/Right, Test Automation, Blockchain and Artificial Intelligence.
1) The document discusses how systems engineering methods can be integrated with the AI/ML lifecycle to engineer intelligent systems. It identifies 10 major challenges for this integration, including describing AI/ML model needs and capabilities, integrating AI/ML into specification, verification, and other systems engineering processes.
2) The document proposes concepts for tackling each challenge, such as using standards to describe AI/ML model lifecycles and digital twin environments for verification. It also discusses opportunities like reusing existing AI/ML models and the need to educate new professionals.
3) Key points are that research is active in integrating systems engineering and AI/ML to build safer, more cost-effective cyber-physical systems, and
MLOps and Data Quality: Deploying Reliable ML Models in ProductionProvectus
Looking to build a robust machine learning infrastructure to streamline MLOps? Learn from Provectus experts how to ensure the success of your MLOps initiative by implementing Data QA components in your ML infrastructure.
For most organizations, the development of multiple machine learning models, their deployment and maintenance in production are relatively new tasks. Join Provectus as we explain how to build an end-to-end infrastructure for machine learning, with a focus on data quality and metadata management, to standardize and streamline machine learning life cycle management (MLOps).
Agenda
- Data Quality and why it matters
- Challenges and solutions of Data Testing
- Challenges and solutions of Model Testing
- MLOps pipelines and why they matter
- How to expand validation pipelines for Data Quality
Automated machine learning (automated ML) automates feature engineering, algorithm and hyperparameter selection to find the best model for your data. The mission: Enable automated building of machine learning with the goal of accelerating, democratizing and scaling AI. This presentation covers some recent announcements of technologies related to Automated ML, and especially for Azure. The demonstrations focus on Python with Azure ML Service and Azure Databricks.
With so many external APIs (OpenAI, Bard, ...) and open source models (LLAMA, Mistral, ...), building a user-facing application must be easy! What could go wrong? What do we have to think about before creating experiences?
Here is a short glimpse of some of the things you need to think about when building your own application:
Finetuning or using pre-trained models
Token optimizations: every word costs time and money
Building small ML models vs using prompts for all tasks
Prompt Engineering
Prompt versioning
Building an evaluation framework
Engineering challenges for streaming data
Moderation & safety of LLMs
.... and the list goes on.
In this webinar, data science expert and CEO of cnvrg.io Yochay Ettun discusses continual learning in production. This webinar examines continual learning, and will help you apply continual learning to your production models using tools like TensorFlow, Kubernetes, and cnvrg.io. This webinar for professional data scientists will go over how to monitor models in production, and how to set up automatically adaptive machine learning.
Key webinar takeaways:
Understanding of continual learning
Optimizing your models for accuracy with continual learning
How to use TensorFlow, Kubernetes and cnvrg.io to apply CL to your models
How you can build automatically adaptive machine learning
Adapting to shifting data distributions
Coping with outliers
Retraining in production
Adapting to new tasks
A/B test your models
Deploying your machine learning pipeline to production
Watch all our webinars at https://cnvrg.io/webinars-and-workshops/
Pranav Prakash is a VP of engineering who has worked on projects involving machine learning, computer vision, and recommendations. The document discusses fundamental concepts in artificial intelligence including intelligent search algorithms. It covers categories of machine learning such as supervised, unsupervised, and reinforcement learning. Popular machine learning techniques like classification, clustering, and regression are described. Real-life applications of machine learning like recommender systems, sentiment analysis, and object recognition are also mentioned.
This short document does not contain enough information to summarize in 3 sentences or less. It only contains the word "test" and does not provide any meaningful context or details that could be extracted to create an informative summary.
Uniform Resource Locators (URLs) are standardized addresses used to locate resources on the Internet. A URL contains the protocol or scheme being used (such as http or ftp), the domain name or IP address of the server, and the path to the specific file or resource. Well-formed URLs follow a general syntax of <scheme>://<domain>/<path>. They allow both humans and software programs to directly access electronic resources anywhere on the Internet or on private networks.
The story of how solving one problem the OpenSource way opened doors to so much more. Talk presented by Pranav Prakash and Hari Prasanna at OSDConf 2014, New Delhi.
The document discusses using Twitter during a live presentation to engage the audience and get feedback. It notes that everyone has a mobile device and loves to tweet, but there is a problem. It then demonstrates how tweeting can provide relevancy for the audience, allow authors to get feedback and track engagement, and help the audience feel connected through a single platform by leaving comments. The document encourages asking any questions using the hashtag #XCRT.
This document summarizes experiments comparing the open source search engine Lucene to a custom search engine called Juru on TREC data. The authors investigated differences in search quality between the two engines. They found that Lucene's default scoring was inferior to Juru's. They modified Lucene's scoring function by changing the document length normalization and term frequency normalization. Evaluations showed the modified Lucene performed comparably to Juru and other top systems in the TREC 1-Million Queries track, demonstrating the robustness of the modifications and the new evaluation measures.
This very short document contains two fruit names but provides no other context or information. It mentions both "banana" and "oranges" but does not explain their relationship or relevance to each other. The intended meaning or purpose is unclear from the limited content provided.
This very short document lists three types of fruit: banana, oranges, and peaches. It does not provide any other details about the fruits or context around them. The document simply names three different fruits in a list.
This very short document lists three types of fruit: apples, oranges, and peaches. It does not provide any other details about the fruits or context around them. The document simply names three common fruits in a list.
This very short document contains a list of three fruits: apple, banana, and peaches. It does not provide any additional context or details about the fruits.
ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...PECB
Denis is a dynamic and results-driven Chief Information Officer (CIO) with a distinguished career spanning information systems analysis and technical project management. With a proven track record of spearheading the design and delivery of cutting-edge Information Management solutions, he has consistently elevated business operations, streamlined reporting functions, and maximized process efficiency.
Certified as an ISO/IEC 27001: Information Security Management Systems (ISMS) Lead Implementer, Data Protection Officer, and Cyber Risks Analyst, Denis brings a heightened focus on data security, privacy, and cyber resilience to every endeavor.
His expertise extends across a diverse spectrum of reporting, database, and web development applications, underpinned by an exceptional grasp of data storage and virtualization technologies. His proficiency in application testing, database administration, and data cleansing ensures seamless execution of complex projects.
What sets Denis apart is his comprehensive understanding of Business and Systems Analysis technologies, honed through involvement in all phases of the Software Development Lifecycle (SDLC). From meticulous requirements gathering to precise analysis, innovative design, rigorous development, thorough testing, and successful implementation, he has consistently delivered exceptional results.
Throughout his career, he has taken on multifaceted roles, from leading technical project management teams to owning solutions that drive operational excellence. His conscientious and proactive approach is unwavering, whether he is working independently or collaboratively within a team. His ability to connect with colleagues on a personal level underscores his commitment to fostering a harmonious and productive workplace environment.
Date: May 29, 2024
Tags: Information Security, ISO/IEC 27001, ISO/IEC 42001, Artificial Intelligence, GDPR
-------------------------------------------------------------------------------
Find out more about ISO training and certification services
Training: ISO/IEC 27001 Information Security Management System - EN | PECB
ISO/IEC 42001 Artificial Intelligence Management System - EN | PECB
General Data Protection Regulation (GDPR) - Training Courses - EN | PECB
Webinars: https://pecb.com/webinars
Article: https://pecb.com/article
-------------------------------------------------------------------------------
For more information about PECB:
Website: https://pecb.com/
LinkedIn: https://www.linkedin.com/company/pecb/
Facebook: https://www.facebook.com/PECBInternational/
Slideshare: http://www.slideshare.net/PECBCERTIFICATION
Chapter wise All Notes of First year Basic Civil Engineering.pptxDenish Jangid
Chapter wise All Notes of First year Basic Civil Engineering
Syllabus
Chapter-1
Introduction to objective, scope and outcome of the subject
Chapter 2
Introduction: Scope and Specialization of Civil Engineering, Role of civil Engineer in Society, Impact of infrastructural development on economy of country.
Chapter 3
Surveying: Object Principles & Types of Surveying; Site Plans, Plans & Maps; Scales & Unit of different Measurements.
Linear Measurements: Instruments used. Linear Measurement by Tape, Ranging out Survey Lines and overcoming Obstructions; Measurements on sloping ground; Tape corrections, conventional symbols. Angular Measurements: Instruments used; Introduction to Compass Surveying, Bearings and Longitude & Latitude of a Line, Introduction to total station.
Levelling: Instruments used, Object of levelling, Methods of levelling in brief, and Contour maps.
Chapter 4
Buildings: Selection of site for Buildings, Layout of Building Plan, Types of buildings, Plinth area, carpet area, floor space index, Introduction to building byelaws, concept of sun light & ventilation. Components of Buildings & their functions, Basic concept of R.C.C., Introduction to types of foundation
Chapter 5
Transportation: Introduction to Transportation Engineering; Traffic and Road Safety: Types and Characteristics of Various Modes of Transportation; Various Road Traffic Signs, Causes of Accidents and Road Safety Measures.
Chapter 6
Environmental Engineering: Environmental Pollution, Environmental Acts and Regulations, Functional Concepts of Ecology, Basics of Species, Biodiversity, Ecosystem, Hydrological Cycle; Chemical Cycles: Carbon, Nitrogen & Phosphorus; Energy Flow in Ecosystems.
Water Pollution: Water Quality standards, Introduction to Treatment & Disposal of Waste Water. Reuse and Saving of Water, Rain Water Harvesting. Solid Waste Management: Classification of Solid Waste, Collection, Transportation and Disposal of Solid Waste. Recycling of Solid Waste: Energy Recovery, Sanitary Landfill, On-Site Sanitation. Air & Noise Pollution: Primary and Secondary air pollutants, Harmful effects of Air Pollution, Control of Air Pollution. Noise Pollution: Harmful Effects of noise pollution, control of noise pollution. Global warming & Climate Change, Ozone depletion, Greenhouse effect
Text Books:
1. Palancharmy, Basic Civil Engineering, McGraw Hill publishers.
2. Satheesh Gopi, Basic Civil Engineering, Pearson Publishers.
3. Ketki Rangwala Dalal, Essentials of Civil Engineering, Charotar Publishing House.
4. BCP, Surveying volume 1
How to Setup Warehouse & Location in Odoo 17 InventoryCeline George
In this slide, we'll explore how to set up warehouses and locations in Odoo 17 Inventory. This will help us manage our stock effectively, track inventory levels, and streamline warehouse operations.
This presentation covers the basics of PCOS, its pathology and treatment, the Ayurvedic correlation of PCOS, and the Ayurvedic line of treatment mentioned in the classics.
Exploiting Artificial Intelligence for Empowering Researchers and Faculty, In...Dr. Vinod Kumar Kanvaria
Exploiting Artificial Intelligence for Empowering Researchers and Faculty,
International FDP on Fundamentals of Research in Social Sciences
at Integral University, Lucknow, 06.06.2024
By Dr. Vinod Kumar Kanvaria
Bangladesh Economic Review 2024 [Bangladesh Economic Review 2024 Bangla.pdf]: the complete Bangla e-book (PDF), with versions for computer, tablet and smartphone, and a table of contents with a bookmark menu 🔖 and a hyperlink menu 📝👆.
A very important book for all of us. It is a very important subject for BCS, bank, university admission and any other competitive examination. In addition, you will find recent data and information on Bangladesh in this book.
So, as a citizen, you need to know this information.
Useful for the BCS and bank written examinations, and also very helpful for secondary and higher secondary students.
This slide deck is intended for master's students (MIBS & MIFB) at UUM. It is also useful for readers who are interested in the topic of contemporary Islamic banking.
7. Mobile Devices
• iOS - CoreML, coremltools python package
• Android - TF Lite, PyTorch
• ML Kit - On Device ML from Google
• Top Considerations
• Energy
• Resources
• Real Time
• Internet/Connectivity
9. AB Testing
• Controlled randomised experiment to establish causality
• How does a model contribute to “business objective”?
• Define “Overall Evaluation Criterion”
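As a rough illustration of how an Overall Evaluation Criterion could be compared between the baseline and the experimental model, the sketch below assumes the OEC is a simple success rate (for example, the share of search sessions with a click) and applies a two-proportion z-test. The function name and the choice of test are assumptions made for illustration; the slides do not say which statistic was used.

```python
import math


def two_proportion_z_test(success_a, n_a, success_b, n_b):
    """Two-sided z-test for a difference between two success rates.

    Group A is the baseline and group B the experiment (hypothetical
    naming). Returns the z statistic and the two-sided p-value.
    """
    p_a = success_a / n_a
    p_b = success_b / n_b
    # Pooled rate under the null hypothesis that both groups are equal.
    pooled = (success_a + success_b) / (n_a + n_b)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / standard_error
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    z, p_value = two_proportion_z_test(100, 1000, 150, 1000)
    print(f"z = {z:.2f}, p = {p_value:.4f}")
```

A small p-value suggests the difference in the OEC is unlikely to be due to chance alone; whether that difference matters for the business objective is a separate judgment.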
10. AB Testing
Architecture
• Define experiment parameters (sample size, duration)
• Power (probability of detecting a true effect, i.e. 1 minus the probability of a false negative), level (probability of a false positive)
• Search Session = <User, Query, (Time Window)>
• Associate UUID with each session
• Split sessions between baseline (85-95%) and experimental models.
• Capture feedback
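The steps on this slide can be sketched roughly as follows. This is a minimal illustration, not the architecture from the talk: the function names, the salt, the 10% experiment share and the normal-approximation sample-size formula for a rate-type metric are all assumptions.

```python
import hashlib
import math
import uuid
from statistics import NormalDist


def new_session_id() -> str:
    """Associate a UUID with a search session <User, Query, Time Window>."""
    return str(uuid.uuid4())


def assign_variant(session_id: str, experiment_share: float = 0.10,
                   salt: str = "exp-1") -> str:
    """Deterministically route a session to 'baseline' or 'experiment'.

    Hashing the session id (plus a hypothetical per-experiment salt)
    gives a stable bucket in [0, 1), so a session always sees the same
    model.
    """
    digest = hashlib.sha256(f"{salt}:{session_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return "experiment" if bucket < experiment_share else "baseline"


def experiment_sample_size(p_baseline: float, min_detectable_diff: float,
                           level: float = 0.05, power: float = 0.80,
                           baseline_ratio: float = 1.0) -> int:
    """Sessions needed in the experiment group (normal approximation).

    baseline_ratio is the baseline group size divided by the experiment
    group size, e.g. 9.0 for a 90% / 10% split.
    """
    z_alpha = NormalDist().inv_cdf(1 - level / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_exp = p_baseline + min_detectable_diff
    variance = (p_baseline * (1 - p_baseline) / baseline_ratio
                + p_exp * (1 - p_exp))
    return math.ceil((z_alpha + z_beta) ** 2 * variance
                     / min_detectable_diff ** 2)


if __name__ == "__main__":
    session_id = new_session_id()
    print(session_id, assign_variant(session_id))
    print(experiment_sample_size(0.10, 0.02, baseline_ratio=9.0))
```

Hashing rather than drawing a random number at request time keeps the assignment reproducible, which makes it easier to join captured feedback back to the variant that served the session.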
12. Lessons learned
• More data or better data
• Simple models are better than complex, but complex models are sometimes needed
• Biases in data
• Evaluation approach
• ML Engineering & Data Science