If you have any device or source that generates values over time (even a log from a service), you want to determine whether, in a given time frame, the time series is behaving normally or whether you can detect anomalies. What can you do as a developer (not a Data Scientist) with .NET or Azure? Let's see how in this session.
How can you handle defects? A factory's production line can produce defective objects, or sensor values can show over time that some readings are not "normal". What can you do as a developer (not a Data Scientist) with .NET or Azure to detect these anomalies? Let's see how in this session.
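The session's own .NET/Azure approach is not reproduced here, but as a minimal, hypothetical sketch of the underlying idea, a developer can flag readings that deviate sharply from a rolling baseline:

```python
from statistics import mean, stdev

def rolling_zscore_anomalies(values, window=10, threshold=3.0):
    """Flag indices whose value deviates more than `threshold`
    standard deviations from the mean of the preceding window."""
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu, sigma = mean(history), stdev(history)
        if sigma > 0 and abs(values[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies

# A roughly flat signal with one spike: only the spike should be flagged.
signal = [10.0, 10.1, 9.9, 10.0, 10.2, 9.8, 10.1, 9.9, 10.0, 10.1, 50.0, 10.0]
print(rolling_zscore_anomalies(signal))  # [10]
```

The `window` and `threshold` values are illustrative; real deployments would tune them to the signal's noise level.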
Time Series Anomaly Detection with .NET and Azure (Marco Parenzan)
Get ready to lock and load through this quick overview of some of the newest, most innovative tools around. Source: http://www.takipiblog.com/7-new-tools-java-developers-should-know/
Getting Started with Keras and TensorFlow - StampedeCon AI Summit 2017 (StampedeCon)
This technical session provides a hands-on introduction to TensorFlow using Keras in the Python programming language. TensorFlow is Google’s scalable, distributed, GPU-powered compute graph engine that machine learning practitioners use for deep learning. Keras provides a Python-based API that makes it easy to create well-known types of neural networks in TensorFlow. Deep learning is a group of exciting new technologies for neural networks. Through a combination of advanced training techniques and neural network architectural components, it is now possible to train neural networks of much greater complexity. Deep learning allows a model to learn hierarchies of information in a way that is similar to the function of the human brain.
Information security is a big problem today. With more attacks happening all the time, and increasingly sophisticated attacks beyond the script-kiddies of yesterday, patrolling the borders of our networks and controlling threats both from outside and within is becoming harder. We cannot rely on endpoint protection for a few thousand PCs and servers anymore; as connected cars, the Internet of Things, and mobile devices become more common, the attack surface broadens. To face these problems, we need technologies that go beyond the traditional SIEM, with human operators writing rules. We need to use the power of the Hadoop ecosystem to find new patterns, machine learning to uncover subtle signals, and big data tools to help human analysts work better and faster to meet these new threats. Apache Metron is a platform on top of Hadoop that meets these needs. Here we will look at the platform in action, how to use it to trace a real-world complex threat, and how it compares to traditional approaches. Come and see how to make your SOC more effective with automated evidence gathering, Hadoop-powered integration, and real-time detection.
Creating an end-to-end Recommender System with Apache Spark and Elasticsearch (sparktc)
At the sold-out Spark & Machine Learning Meetup in Brussels on October 27, 2016, Nick Pentreath of the Spark Technology Center teamed up with Jean-François Puget of IBM Analytics to deliver a talk called Creating an end-to-end Recommender System with Apache Spark and Elasticsearch.
Jean-François and Nick started with a look at the workflow for recommender systems and machine learning, then moved on to data modeling and using Spark ML for collaborative filtering. They closed with a discussion of deploying and scoring the recommender models, including a demo.
Flink Forward SF 2017: Trevor Grant - Introduction to Online Machine Learning... (Flink Forward)
Online algorithms are an increasingly popular yet often misunderstood branch of machine learning, where model parameter estimates are updated for each new piece of information received. While mini-batch methods have often been mislabeled as 'streaming-machine learning', true online methods have different implementations and goals. This talk will explain key differences between online and offline machine learning, an introduction to many common online algorithms, and how online algorithms can be analyzed. An example using Apache Flink to detect trends on Twitter will be presented. Attendees will come away from this talk with a better understanding of the challenges and opportunities from working with online algorithms and how they can begin implementing their own algorithms in Apache Flink.
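The talk's examples use Apache Flink; purely as an illustration of the "update the model for each new observation" idea, here is a language-neutral sketch (in Python) of Welford's online algorithm, a classic way to maintain running statistics one value at a time:

```python
class OnlineStats:
    """Welford's online algorithm: update mean and variance one
    observation at a time, as a true online method would."""
    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    @property
    def variance(self):
        # sample variance; undefined for fewer than two observations
        return self.m2 / (self.n - 1) if self.n > 1 else 0.0

stats = OnlineStats()
for x in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
    stats.update(x)
print(stats.mean)      # 5.0
print(stats.variance)  # ~4.571
```

Unlike a mini-batch recomputation, nothing is stored except three numbers, which is what makes true online methods suitable for unbounded streams.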
Embracing a Taxonomy of Types to Simplify Machine Learning with Leah McGuire (Databricks)
Salesforce has created a machine learning framework on top of Spark ML that builds personalized models for businesses across a range of applications. Hear how expanding type information about features has allowed them to deal with custom datasets with good results.
By building a platform that automatically does feature engineering on rich types (e.g. Currency and Percentages rather than Doubles; Phone Numbers and Email Addresses rather than Strings), they have automated much of the work that consumes most data scientists’ time. Learn how you can do the same by building a single model outline based on the application, and then having the framework customize it for each customer.
Urs Köster and Yinyin Liu present at ODSC West. Deep learning has had a major impact in the last three years. Imperfect interactions with machines, such as speech, natural language, or image processing, have been made robust by deep learning, and deep learning holds promise for finding usable structure in large datasets. The training process is lengthy and has proven difficult to scale due to constraints of existing compute architectures, and there is a need for standardized tools for building and scaling deep learning solutions. Urs will outline some of these challenges and how fundamental changes to the organization of computation and communication can lead to large advances in capabilities. Urs will dive deep into the field of deep learning and focus on Convolutional and Recurrent Neural Networks. The talk will be followed by a workshop highlighting neon™, an open-source, Python-based deep learning framework that has been built from the ground up for speed and ease of use. This session is targeted at data scientists and researchers interested in taking deep learning to the next level of speed and scalability. The tutorial covers how to use neon™ to build and train Recurrent Neural Networks to generate text, and Convolutional Networks to perform image classification.
A Fast Decision Rule Engine for Anomaly Detection (Databricks)
Description: We present a supervised anomaly detection approach that is scalable and interpretable. It works with tabular data and searches over all decision rules for the anomaly class involving one or two features. It creates a classifier out of all rules meeting user-specified precision and recall constraints, classifying a test example as an anomaly if any of the rules fire. Overlapping decision rules can be pruned to reduce model complexity, leaving a small number of simple rules that a user can easily understand. Our system operates on Pandas DataFrames and has a high-performance C++ backend with experimental GPU and FPGA acceleration available. It is available open-source at https://github.com/jjthomas/rule_engine
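The actual engine has a high-performance C++ backend and operates on Pandas DataFrames; the following toy sketch (plain Python, with made-up data and an invented `find_rules` helper) only illustrates the core idea of enumerating single-feature threshold rules and keeping those that satisfy user-specified precision and recall constraints:

```python
def find_rules(rows, labels, min_precision=0.9, min_recall=0.5):
    """Enumerate rules of the form 'feature >= t' and keep those whose
    precision and recall on the labeled data meet the constraints."""
    kept = []
    for f in rows[0].keys():
        for t in sorted({r[f] for r in rows}):
            preds = [r[f] >= t for r in rows]
            tp = sum(p and y for p, y in zip(preds, labels))
            fp = sum(p and not y for p, y in zip(preds, labels))
            if tp == 0:
                continue
            precision = tp / (tp + fp)
            recall = tp / sum(labels)
            if precision >= min_precision and recall >= min_recall:
                kept.append((f, t, precision, recall))
    return kept

rows = [{"temp": 20}, {"temp": 21}, {"temp": 22}, {"temp": 80}, {"temp": 85}]
labels = [0, 0, 0, 1, 1]  # 1 = anomaly
print(find_rules(rows, labels))  # keeps the thresholds at 80 and 85
```

The real system also searches two-feature rules and prunes overlapping ones; this sketch stops at the simplest case to keep the mechanics visible.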
The term "machine learning" is increasingly bandied about in corporate settings and cocktail parties, but what is it, really? In this session we'll answer that question, providing an approachable overview of machine learning concepts, technologies, and use cases. We'll then take a deeper dive into machine learning topics such as supervised learning, unsupervised learning, and deep learning. We'll also survey various machine learning APIs and platforms. Technologies including Spring and Cloud Foundry will be leveraged in the demos. You'll be the hit of your next party when you're able to express the near-magical inner-workings of artificial neural networks!
Georgia Tech cse6242 - Intro to Deep Learning and DL4J (Josh Patterson)
Introduction to deep learning and DL4J - http://deeplearning4j.org/ - a guest lecture by Josh Patterson at Georgia Tech for the cse6242 graduate class.
At Databricks, we manage Spark clusters for customers to run various production workloads. In this talk, we share our experiences in building a real-time monitoring system for thousands of Spark nodes, including the lessons we learned and the value we’ve seen from our efforts so far.
This was part of a talk presented at the #monitorSF Meetup held at Databricks HQ in SF.
CyberMLToolkit: Anomaly Detection as a Scalable Generic Service Over Apache Spark (Databricks)
Cybercrime is one of the greatest threats to every company in the world today and a major problem for mankind in general. The damage due to cybercrime is estimated to reach around $6 trillion by 2021. Security professionals are struggling to cope with the threat; as a result, powerful and easy-to-use tools are necessary to aid in this battle. For this purpose we created an anomaly detection framework focused on security which can identify anomalous access patterns. It is built on top of Apache Spark and can be applied in parallel over multiple tenants. This allows the model to be trained over the data of thousands of customers on a Databricks cluster in less than an hour. The model leverages proven technologies from recommendation engines to produce high-quality anomalies. We thoroughly evaluated the model's ability to identify actual anomalies by using synthetically generated data, and also by staging an actual attack and showing that the model clearly identifies it as anomalous behavior. We plan to open source this library as part of a cyber-ML toolkit we will be offering.
A full Machine Learning pipeline in scikit-learn vs in Scala-Spark: pros and cons (Jose Quesada)
The machine learning libraries in Apache Spark are an impressive piece of software engineering, and are maturing rapidly. What advantages does Spark.ml offer over scikit-learn? At Data Science Retreat we've taken a real-world dataset and worked through the stages of building a predictive model -- exploration, data cleaning, feature engineering, and model fitting; which would you use in production?
The machine learning libraries in Apache Spark are an impressive piece of software engineering, and are maturing rapidly. What advantages does Spark.ml offer over scikit-learn?
At Data Science Retreat we've taken a real-world dataset and worked through the stages of building a predictive model -- exploration, data cleaning, feature engineering, and model fitting -- in several different frameworks. We'll show what it's like to work with native Spark.ml, and compare it to scikit-learn along several dimensions: ease of use, productivity, feature set, and performance.
In some ways Spark.ml is still rather immature, but it also conveys new superpowers to those who know how to use it.
ROCm and Distributed Deep Learning on Spark and TensorFlow (Databricks)
ROCm, the Radeon Open Ecosystem, is an open-source software foundation for GPU computing on Linux. ROCm supports TensorFlow and PyTorch using MIOpen, a library of highly optimized GPU routines for deep learning. In this talk, we describe how Apache Spark is a key enabling platform for distributed deep learning on ROCm, as it enables different deep learning frameworks to be embedded in Spark workflows in a secure end-to-end machine learning pipeline. We will analyse the different frameworks for integrating Spark with TensorFlow on ROCm, from Horovod to HopsML to Databricks' Project Hydrogen. We will also examine the surprising places where bottlenecks can surface when training models (everything from object stores to the Data Scientists themselves), and we will investigate ways to get around these bottlenecks. The talk will include a live demonstration of training and inference for a TensorFlow application embedded in a Spark pipeline, written in a Jupyter notebook on Hopsworks with ROCm.
Anomaly Detection and Automatic Labeling with Deep Learning (Adam Gibson)
Adam Gibson demonstrates how to use variational autoencoders to automatically label time series location data. You'll explore the challenge of imbalanced classes and anomaly detection, learn how to leverage deep learning for automatically labeling (and the pitfalls of this), and discover how you can deploy these techniques in your organization.
Startup.ML: Using neon for NLP and Localization Applications (Intel Nervana)
Speaker: Arjun Bansal, co-founder of Nervana Systems
Arjun Bansal’s workshop focused on neon, an open-source, Python-based deep learning framework that has been built from the ground up for speed and ease of use. The workshop highlights how to use neon, build Recurrent Neural Networks to generate and analyze text, and build Convolutional Autoencoders to generate images and to localize objects. Arjun also demoed the integration of neon with the Nervana cloud (in private beta) for multi-GPU training of deep networks.
Time Series Anomaly Detection for .NET and Azure (Marco Parenzan)
If you have any device or source that generates values over time (even a log from a service), you want to determine whether, in a given time frame, the time series is behaving normally or whether you can detect anomalies. What can you do as a developer (not a Data Scientist) with .NET and Azure?
Slides from the webinar "Five Ways to Leverage AI and Tableau". Full webinar recording: https://starschema.com/kb/five-ways-to-leverage-ai-and-tableau
Sources & Workbooks: https://github.com/starschema/tableau-ai-use-cases
Automating Speed: A Proven Approach to Preventing Performance Regressions in Kafka Streams (HostedbyConfluent)
Regular performance testing is one of the pillars of Kafka Streams’ reliability and efficiency. Beyond ensuring dependable releases, regular performance testing supports engineers in new feature development with the ability to easily test the performance impact of their features, compare different approaches, etc.
In this session, Alex and John share their experience from developing, using, and maintaining a performance testing framework for Kafka Streams that has prevented multiple performance regressions over the last 5 years. They cover guiding principles and architecture, how to ensure statistical significance and stability of results, and how to automate regression detection for actionable notifications.
This talk sheds light on how Apache Kafka is able to foster a vibrant open-source community while maintaining a high performance bar across many years and releases. It also empowers performance-minded engineers to avoid common pitfalls and bring high-quality performance testing to their own systems.
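The framework described in the talk is not reproduced here; as a deliberately crude sketch of automated regression detection, one can compare a candidate build's benchmark results against a baseline and flag differences that exceed the baseline's measured noise (the numbers and the three-sigma rule below are illustrative assumptions, not the talk's actual methodology):

```python
from statistics import mean, stdev

def is_regression(baseline, candidate, sigmas=3.0):
    """Flag a regression when the candidate's mean latency exceeds the
    baseline's mean by more than `sigmas` standard errors of the
    baseline. A crude stand-in for a proper statistical test."""
    se = stdev(baseline) / len(baseline) ** 0.5
    return mean(candidate) > mean(baseline) + sigmas * se

baseline  = [101, 99, 100, 102, 98, 100, 101, 99]   # ms per operation
candidate = [111, 109, 110, 112, 108, 110, 111, 109]
print(is_regression(baseline, candidate))  # True: ~10 ms slower
```

A production framework would repeat runs to stabilize variance and use a real significance test, which is exactly the kind of rigor the session discusses.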
Discovering signal in financial time series - where and how to start (NicholasSherman11)
In Part 1 of our webinar series on discovering signal in financial time-series data, Stanley walked you through constructing a state-of-the-art AI-driven trading strategy by forecasting commodity time-series data and constructing a three-asset portfolio adjusted for predictive risk.
Now, in Part 2, Stanley Speel describes how and why generating meaningful signals and relationships in time-series data is often the key to building accurate forecasting models.
Watch Stanley walk you through:
Preparing time-series historical data for a regression model
Methods for isolating and selecting relevant and independent signals
Network-based approaches for identifying and modelling relationships between multiple time-varying signals
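The webinar's exact pipeline is not shown, but the first step above (preparing time-series data for a regression model) is commonly implemented by turning the series into a supervised table of lagged values. A minimal sketch, with an invented `make_lagged_features` helper:

```python
def make_lagged_features(series, n_lags=3):
    """Turn a univariate series into (features, target) pairs, where the
    features are the previous n_lags values: a common way to prepare
    time-series data for an ordinary regression model."""
    X, y = [], []
    for i in range(n_lags, len(series)):
        X.append(series[i - n_lags:i])
        y.append(series[i])
    return X, y

prices = [100, 102, 101, 105, 107, 106]
X, y = make_lagged_features(prices, n_lags=3)
print(X[0], y[0])  # [100, 102, 101] 105
```

Each row of `X` can then feed any regressor; signal selection and cross-asset relationships, as covered in the webinar, build on top of a table like this.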
Performance doesn’t mean the same thing to system administrators, developers and business teams. What is performance? High CPU usage, a web site that doesn’t scale, a low business-transaction rate per second, slow response time, … This presentation is about maths, code performance, load testing, web performance, best practices, and more. Working on performance optimization is a very broad topic. It’s important to really understand the main concepts and to have a clean and strong methodology, because it can be a very time-consuming activity. Happy reading!
This presentation includes a step-by-step tutorial, with screen recordings, for learning RapidMiner. It also includes the step-by-step procedure for using its most interesting features: Turbo Prep and Auto Model.
Grails has great performance characteristics but as with all full stack frameworks, attention must be paid to optimize performance. In this talk Lari will discuss common missteps that can easily be avoided and share tips and tricks which help profile and tune Grails applications.
Auto-Train a Time-Series Forecast Model With AML + ADB (Databricks)
Supply chain, healthcare, insurance, and finance often require highly accurate forecasting models at enterprise scale. With Azure Machine Learning on Azure Databricks, the scale and speed needed for large-scale many-models scenarios can be achieved, and time-to-product decreases drastically. This better-together story offers an enterprise approach to AI/ML.
Azure AutoML offers an elegant solution to efficiently build forecasting models on Azure Databricks compute, solving sophisticated business problems. The presentation covers the Azure Machine Learning + Azure Databricks approach (see slides attached), while the demo covers a hands-on business problem: building a forecasting model in Azure Databricks using Azure Machine Learning. The AI/ML better-together story is elevated as MLflow for data science lifecycle management and Hyperopt for distributed model execution complete AI/ML enterprise readiness for industry problems.
Time series analysis allows Data Scientists to recognize trends, seasonality, and correlations within past data related to an organization to make predictions on which business decisions are based.
Let’s take a look at how various industries use Time series analysis to make crucial decisions.
~ The airline industry can optimize travel routes by predicting future weather patterns, seasonal demands, or unexpected events.
~ Stockbrokers use it to predict correlations within stocks & market conditions to decide where to invest.
~ Supply Chain companies can predict weather conditions, traffic patterns, expected delivery times to optimize routes.
But did you know that you can build time series models with minimal coding knowledge? The KNIME Analytics Platform makes this possible: its graphical user interface allows data scientists who are just starting out, and who do not have extensive coding experience, to build models.
In this webinar, our Machine Learning expert will help you build time series models using the KNIME Analytics Platform. Business leaders and Data scientists must not miss this opportunity to arrive at smart, data-driven business decisions with the help of this platform.
Anomaly Detection - Real World Scenarios, Approaches and Live ImplementationImpetus Technologies
Detecting anomalous patterns in data can lead to significant actionable insights in a wide variety of application domains, such as fraud detection, network traffic management, predictive healthcare, energy monitoring and many more.
However, detecting anomalies accurately can be difficult. What qualifies as an anomaly is continuously changing and anomalous patterns are unexpected. An effective anomaly detection system needs to continuously self-learn without relying on pre-programmed thresholds.
Join our speakers Ravishankar Rao Vallabhajosyula, Senior Data Scientist, Impetus Technologies and Saurabh Dutta, Technical Product Manager - StreamAnalytix, in a discussion on:
Importance of anomaly detection in enterprise data, types of anomalies, and challenges
Prominent real-time application areas
Approaches, techniques and algorithms for anomaly detection
Sample use-case implementation on the StreamAnalytix platform
Mining System Logs to Learn Error Predictors, Universität Stuttgart, Stuttgar...Barbara Russo
Predicting system failures can be of great benefit to managers that get a better command over system performance.
Data that systems generate in the form of logs is a valuable source of information to predict system reliability. As such, there is an increasing demand for
tools to mine logs and provide accurate predictions. However, interpreting information in logs poses some challenges. This talk
presents how to effectively mine sequences of logs and provide correct predictions.
The approach integrates different machine learning techniques to control for data brittleness, provide accuracy of model selection and validation,
and increase robustness of classification results. We apply the proposed approach to log sequences of 25 different applications of a software system for
telemetry of cars.
We usually talk about and present Azure IoT (Central) with a bit of a "maker" slant. In this session, instead, let's talk to the SCADA engineer. How do you configure Azure IoT Central for the industrial world? Where is OPC/UA? What does IoT Plug & Play have to do with all this? And Azure IoT Central...what advantages does it give us? We'll try to answer these and other questions in this session...
Azure developers like PaaS services because they are "ready to use". But when we propose our solutions to companies, we clash with IT departments that appreciate infrastructure elements, IaaS. Why not (re)discover them, adding a pinch of Hybrid too, which with the recent Azure Kubernetes Service Edge Essentials can even run on hardware you can keep at home? So in this session we will explore, among other things, VNETs, S2S VPNs, Azure Arc, Private Endpoints, and AKS EE.
Static abstract members nelle interfacce di C# 11 e dintorni di .NET 7.pptxMarco Parenzan
Did interfaces in C# need to evolve? Maybe yes. Are they violating some fundamental principles? We'll see. Are we jumping through some hoops? Let's explore all this by telling a story (of code, of course)
Azure Synapse Analytics for your IoT SolutionsMarco Parenzan
Let's find out in this session how Azure Synapse Analytics, with its SQL Serverless Pool, ADX, Data Factory, Notebooks, Spark can be useful for managing data analysis in an IoT solution.
Power BI Streaming Data Flow e Azure IoT CentralMarco Parenzan
Since 2015, Power BI users have been able to analyze data in real-time thanks to the integration with other Microsoft products and services. With streaming dataflow, you'll bring real-time analytics completely within Power BI, removing most of the restrictions we had, while integrating key analytics features like streaming data preparation and no coding. To see it in action, we will study a specific case of streaming such as IoT with Azure IoT Central.
What are the actors? What are they used for? And how can we develop them? And how are they published and used on Azure? Let's see how it's done in this session
Generic Math, a feature now scheduled for .NET 7, and Azure IoT PnP reawakened a topic from my past that led me, thanks to the University of Trieste, on two or three trips to Cambridge (around 2006/2007) and to Seattle (2010, when I spoke publicly about Azure for the first time :) and where I got to meet the legendary Don Box!), to talk about .NET code dealing with mathematics and physics: units of measure and matrices. The arrival of Notebooks in the .NET world and an old dream tied to the ANTLR library (and all my code generation exercises) lead me to put this jumble of ideas in order...or at least I'll try (I'm not sure it all holds together).
.NET is better every year for a developer who still dreams of developing a video game. Without pretensions and without talking about Unity or any other framework, just "barebones" .NET code, we will try to write a game (or parts of it) in the 80's style (because I was a kid in those years). In Christmas style.
Building IoT infrastructure on edge with .net, Raspberry PI and ESP32 to conn...Marco Parenzan
IoT scenarios necessarily pass through the Edge component, and the Raspberry PI is a great way to explore this world. If we need to receive IoT events from sensors, how do I implement an MQTT endpoint? Kafka is a clever way to do this. And how do I process the data? Kafka? Spark? Rabbit? How do we write custom code for these environments? .NET, now in version 6, is another clever way to do it! And maybe we can also communicate with Azure. We'll see in this session if we can make it all work!
What advantages does Azure give us? From a software development point of view, one of them is certainly the variety of data management services. This allows us to stop being SQL-centric and to use the right service for the right problem, up to applying a Polyglot Persistence strategy (and we will see what that means), while respecting proper IT resource management and DevOps practices.
There is still distrust of the Internet of Things, and the cost of custom solutions doesn't help. Azure IoT Central is a customizable SaaS service that makes IoT accessible at sustainable costs. Let's see what the peculiarities of this service are.
How can you handle defects? If you are in a factory, production can produce objects with defects. Or values from sensors can tell you over time that some values are not "normal". What can you do as a developer (not a Data Scientist) with .NET or Azure to detect these anomalies? Let's see how in this session.
It happens that we have to develop several services and deploy them in Azure. They are small and repetitive, yet different, often not very different. Why not use code generation techniques to simplify the development and implementation of these services? Let's see how .NET comes to meet us and helps us deploy to Azure.
Running Kafka and Spark on Raspberry PI with Azure and some .net magicMarco Parenzan
IoT scenarios necessarily pass through the Edge component, and the Raspberry PI is a great way to explore this world. If we need to receive IoT events from sensors, how do I implement an MQTT endpoint? Kafka is a clever way to do this. And how do I process the data in Kafka? Spark is another clever way of doing this. How do we write custom code for these environments? .NET, now in version 6, is another clever way to do it! And maybe we can also communicate with Azure. We'll see in this session if we can make it all work!
Time Series Anomaly Detection with Azure and .NETMarco Parenzan
If you have any device or source that generates values over time (also a log from a service), you want to determine if, in a time frame, the time series is correct or you can detect some anomalies. What can you do as a developer (not a Data Scientist) with .NET or Azure? Let's see how in this session.
It happens that we have to develop several services and deploy them in Azure. They are small and repetitive, yet different, often not very different. Why not use code generation techniques to simplify the development and implementation of these services? Let's see how .NET comes to meet us and helps us deploy to Azure.
What is .NET Interactive? What does it have to do with .NET? What is it for? And if you use Azure, how can it help you? Let's clear things up in this session.
Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...Globus
The U.S. Geological Survey (USGS) has made substantial investments in meeting evolving scientific, technical, and policy driven demands on storing, managing, and delivering data. As these demands continue to grow in complexity and scale, the USGS must continue to explore innovative solutions to improve its management, curation, sharing, delivering, and preservation approaches for large-scale research data. Supporting these needs, the USGS has partnered with the University of Chicago-Globus to research and develop advanced repository components and workflows leveraging its current investment in Globus. The primary outcome of this partnership includes the development of a prototype enterprise repository, driven by USGS Data Release requirements, through exploration and implementation of the entire suite of the Globus platform offerings, including Globus Flow, Globus Auth, Globus Transfer, and Globus Search. This presentation will provide insights into this research partnership, introduce the unique requirements and challenges being addressed and provide relevant project progress.
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...Globus
The Earth System Grid Federation (ESGF) is a global network of data servers that archives and distributes the planet’s largest collection of Earth system model output for thousands of climate and environmental scientists worldwide. Many of these petabyte-scale data archives are located in proximity to large high-performance computing (HPC) or cloud computing resources, but the primary workflow for data users consists of transferring data, and applying computations on a different system. As a part of the ESGF 2.0 US project (funded by the United States Department of Energy Office of Science), we developed pre-defined data workflows, which can be run on-demand, capable of applying many data reduction and data analysis to the large ESGF data archives, transferring only the resultant analysis (ex. visualizations, smaller data files). In this talk, we will showcase a few of these workflows, highlighting how Globus Flows can be used for petabyte-scale climate analysis.
Paketo Buildpacks : la meilleure façon de construire des images OCI? DevopsDa...Anthony Dahanne
Buildpacks have existed for more than 10 years! At first, they were used to detect and build an application before deploying it on certain PaaS platforms. Then, with their latest generation, the Cloud Native Buildpacks (a CNCF incubating project), we became able to create Docker (OCI) images. Are they a good alternative to a Dockerfile? What are the Paketo buildpacks? Which communities support them, and how?
Come find out in this ignite session
Why React Native as a Strategic Advantage for Startup Innovation.pdfayushiqss
Do you know that React Native is being increasingly adopted by startups as well as big companies in the mobile app development industry? Big names like Facebook, Instagram, and Pinterest have already integrated this robust open-source framework.
In fact, according to a report by Statista, the number of React Native developers has been steadily increasing over the years, reaching an estimated 1.9 million by the end of 2024. This means that the demand for this framework in the job market has been growing making it a valuable skill.
But what makes React Native so popular for mobile application development? It offers excellent cross-platform capabilities among other benefits. This way, with React Native, developers can write code once and run it on both iOS and Android devices thus saving time and resources leading to shorter development cycles hence faster time-to-market for your app.
Let’s take the example of a startup that wanted to release their app on both iOS and Android at once. By using React Native, they managed to create an app and bring it to market within a very short period. This gave them an advantage over their competitors because they reached a large user base early and were able to generate revenue quickly.
First Steps with Globus Compute Multi-User EndpointsGlobus
In this presentation we will share our experiences around getting started with the Globus Compute multi-user endpoint. Working with the Pharmacology group at the University of Auckland, we have previously written an application using Globus Compute that can offload computationally expensive steps in the researcher's workflows, which they wish to manage from their familiar Windows environments, onto the NeSI (New Zealand eScience Infrastructure) cluster. Some of the challenges we have encountered were that each researcher had to set up and manage their own single-user globus compute endpoint and that the workloads had varying resource requirements (CPUs, memory and wall time) between different runs. We hope that the multi-user endpoint will help to address these challenges and share an update on our progress here.
Designing for Privacy in Amazon Web ServicesKrzysztofKkol1
Data privacy is one of the most critical issues that businesses face. This presentation shares insights on the principles and best practices for ensuring the resilience and security of your workload.
Drawing on a real-life project from the HR industry, the various challenges will be demonstrated: data protection, self-healing, business continuity, security, and transparency of data processing. This systematized approach allowed to create a secure AWS cloud infrastructure that not only met strict compliance rules but also exceeded the client's expectations.
Developing Distributed High-performance Computing Capabilities of an Open Sci...Globus
COVID-19 had an unprecedented impact on scientific collaboration. The pandemic and its broad response from the scientific community has forged new relationships among public health practitioners, mathematical modelers, and scientific computing specialists, while revealing critical gaps in exploiting advanced computing systems to support urgent decision making. Informed by our team’s work in applying high-performance computing in support of public health decision makers during the COVID-19 pandemic, we present how Globus technologies are enabling the development of an open science platform for robust epidemic analysis, with the goal of collaborative, secure, distributed, on-demand, and fast time-to-solution analyses to support public health.
Unleash Unlimited Potential with One-Time Purchase
BoxLang is more than just a language; it's a community. By choosing a Visionary License, you're not just investing in your success, you're actively contributing to the ongoing development and support of BoxLang.
Enhancing Research Orchestration Capabilities at ORNL.pdfGlobus
Cross-facility research orchestration comes with ever-changing constraints regarding the availability and suitability of various compute and data resources. In short, a flexible data and processing fabric is needed to enable the dynamic redirection of data and compute tasks throughout the lifecycle of an experiment. In this talk, we illustrate how we easily leveraged Globus services to instrument the ACE research testbed at the Oak Ridge Leadership Computing Facility with flexible data and task orchestration capabilities.
Code reviews are vital for ensuring good code quality. They serve as one of our last lines of defense against bugs and subpar code reaching production.
Yet, they often turn into annoying tasks riddled with frustration, hostility, unclear feedback and lack of standards. How can we improve this crucial process?
In this session we will cover:
- The Art of Effective Code Reviews
- Streamlining the Review Process
- Elevating Reviews with Automated Tools
By the end of this presentation, you'll have the knowledge on how to organize and improve your code review process
Check out the webinar slides to learn more about how XfilesPro transforms Salesforce document management by leveraging its world-class applications. For more details, please connect with sales@xfilespro.com
If you want to watch the on-demand webinar, please click here: https://www.xfilespro.com/webinars/salesforce-document-management-2-0-smarter-faster-better/
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoamtakuyayamamoto1800
In this slide, we show the simulation example and the way to compile this solver.
In this solver, the Helmholtz equation can be solved by helmholtzFoam. Also, the Helmholtz equation with uniformly dispersed bubbles can be simulated by helmholtzBubbleFoam.
A Comprehensive Look at Generative AI in Retail App Testing.pdfkalichargn70th171
Traditional software testing methods are being challenged in retail, where customer expectations and technological advancements continually shape the landscape. Enter generative AI—a transformative subset of artificial intelligence technologies poised to revolutionize software testing.
Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...Shahin Sheidaei
Games are powerful teaching tools, fostering hands-on engagement and fun. But they require careful consideration to succeed. Join me to explore factors in running and selecting games, ensuring they serve as effective teaching tools. Learn to maintain focus on learning objectives while playing, and how to measure the ROI of gaming in education. Discover strategies for pitching gaming to leadership. This session offers insights, tips, and examples for coaches, team leads, and enterprise leaders seeking to teach from simple to complex concepts.
How Does XfilesPro Ensure Security While Sharing Documents in Salesforce?XfilesPro
Worried about document security while sharing them in Salesforce? Fret no more! Here are the top-notch security standards XfilesPro upholds to ensure strong security for your Salesforce documents while sharing with internal or external people.
To learn more, read the blog: https://www.xfilespro.com/how-does-xfilespro-make-document-sharing-secure-and-seamless-in-salesforce/
Cyaniclab : Software Development Agency Portfolio.pdfCyanic lab
CyanicLab, an offshore custom software development company based in Sweden,India, Finland, is your go-to partner for startup development and innovative web design solutions. Our expert team specializes in crafting cutting-edge software tailored to meet the unique needs of startups and established enterprises alike. From conceptualization to execution, we offer comprehensive services including web and mobile app development, UI/UX design, and ongoing software maintenance. Ready to elevate your business? Contact CyanicLab today and let us propel your vision to success with our top-notch IT solutions.
Advanced Flow Concepts Every Developer Should KnowPeter Caitens
Tim Combridge from Sensible Giraffe and Salesforce Ben presents some important tips that all developers should know when dealing with Flows in Salesforce.
Advanced Flow Concepts Every Developer Should Know
Time Series Anomaly Detection with .net and Azure
1. Supported by the Russian MVP Community
Time Series Anomaly Detection
with .net and Azure
Marco Parenzan
Solution Sales Specialist @ Insight
Microsoft Azure MVP
1nn0va Community Lead
2.
Marco Parenzan
• Solution Sales Specialist @ Insight
• 1nn0va Community Lead (Pordenone)
• Microsoft Azure MVP
• Profiles
• Linkedin: https://www.linkedin.com/in/marcoparenzan/
• Slideshare: https://www.slideshare.net/marco.parenzan
• GitHub: https://github.com/marcoparenzan
3.
Agenda
• Scenario
• Anomaly Detection in Time Series
• Data Science for the .NET developer
• How Data Scientists work
• Bring ML.NET to Azure
• Anomaly Detection As A Service in Azure
• Conclusions
5.
• In an industrial fridge, you
monitor temperatures not to check
the temperature «per se», but
to check the health of the plant
Scenario
From real industrial fridges
6.
[Diagram] Devices → Events → IoT Hub (Ingest) → Storage Account
The starting point...
7.
[Diagram] Devices → Events → Azure IoT Central (Ingest) → Storage Account
The starting point...better...
8.
Current data path
Data collecting
Data Analysis
Data Report...?
Threshold alarms
9.
• Threshold Anomalies for a time window
• Slow changing damages
• Fridge is no more efficient
• Threshold alarms are not enough
• Anomalies cannot be just «over a threshold
for some time»...
• Condenser or Evaporator with difficulties
starting
• Distinguish them from opening a door (which is
also an anomaly)
• Or also count the number of times that
there are peaks (too many times)
• You can consider each of these events as
an anomaly that alters the temperature you
measure in different parts of the fridge
Threshold anomalies?
10.
With no specific request...what happens in
production?
[Diagram] Efficiency / Anomalies × Batch / Streaming
11.
How we can evolve...
[Diagram] Devices → Events → Azure IoT Central (Ingest) → Storage Account → Function App (Process) → Logic App (Notification)
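As a hedged sketch of the "Process" and "Notification" steps in this evolved pipeline: an Azure Function could receive the exported telemetry and call a Logic App HTTP endpoint when a reading looks anomalous. The hub name, connection setting, and Logic App URL below are illustrative assumptions, not values from the talk, and the detection itself is left as a placeholder.

```csharp
// Sketch only: requires the Microsoft.NET.Sdk.Functions and
// Microsoft.Azure.WebJobs.Extensions.EventHubs NuGet packages.
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;
using Microsoft.Extensions.Logging;

public static class ProcessTelemetry
{
    static readonly HttpClient http = new HttpClient();

    [FunctionName("ProcessTelemetry")]
    public static async Task Run(
        // "telemetry" and "IngestConnection" are hypothetical names.
        [EventHubTrigger("telemetry", Connection = "IngestConnection")] string message,
        ILogger log)
    {
        log.LogInformation("Reading received: {message}", message);

        // Placeholder: run your anomaly detection on the reading here.
        bool isAnomaly = false;

        if (isAnomaly)
            // Hypothetical Logic App HTTP-trigger URL handling the notification.
            await http.PostAsync("https://<logic-app-url>",
                new StringContent(message, Encoding.UTF8, "application/json"));
    }
}
```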
12.
A bit of theory for Anomaly Detection
in Time Series
13.
Anomaly Detection
• Anomaly detection is the process of identifying unexpected items or
events in data sets, which differ from the norm.
• Anomaly detection is often applied to unlabeled data, which is
known as unsupervised anomaly detection.
14.
Time Series
• Definition
• Time series is a sequence of data points recorded in time order, often taken at successive
equally spaced points in time.
• Examples
• Stock prices, Sales demand, website traffic, daily temperatures, quarterly sales
• Time series is different from regression analysis because of its time-dependent
nature.
• Auto-correlation: Regression analysis requires that there is little or no autocorrelation in the
data. It occurs when the observations are not independent of each other. For example, in stock
prices, the current price is not independent of the previous price. [The observations have to be
dependent on time]
• Seasonality, a characteristic which we will discuss below.
15.
Components of a Time Series
• Trend
• is a general direction in which something is developing or changing. A trend can be
upward(uptrend) or downward(downtrend). It is not always necessary that the increase or
decrease is consistently in the same direction in a given period.
• Seasonality
• Predictable pattern that recurs or repeats over regular intervals. Seasonality is often observed
within a year or less.
• Irregular fluctuation
• These are variations that occur due to sudden causes and are unpredictable. For example, the
rise in food prices due to war, floods, earthquakes, farmers' strikes, etc.
16.
Anomaly Detection in Time Series
• In time series data, an anomaly or outlier can be termed as a data point
which is not following the common collective trend or seasonal or cyclic
pattern of the entire data and is significantly distinct from the rest of the
data. By significant, most data scientists mean statistical significance,
which, in other words, signifies that the statistical properties of the data
point are not in alignment with the rest of the series.
• Anomaly detection has two basic assumptions:
• Anomalies only occur very rarely in the data.
• Their features differ from the normal instances significantly.
17.
How to do Time Series Anomaly Detection?
• Statistical Profiling Approach
• This can be done by calculating statistical values like mean or median moving
average of the historical data and using a standard deviation to come up with a
band of statistical values which can define the uppermost bound and the
lowermost bound, and anything falling beyond these ranges can be an anomaly.
• By Predictive Confidence Level Approach
• One way of doing anomaly detection with time series data is by building a
predictive model using the historical data to estimate and get a sense of the
overall common trend, seasonal or cyclic pattern of the time series data.
• Clustering Based Unsupervised Approach
• Unsupervised approaches are extremely useful for anomaly detection as they do
not require any labelled data marking a particular data point as an
anomaly.
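As an illustration of the statistical profiling approach, here is a minimal, self-contained C# sketch that flags any point falling outside a moving-average band of mean ± k·stddev. The window size and the multiplier k are illustrative choices, not values prescribed by the talk.

```csharp
using System;
using System.Linq;

class StatisticalProfiling
{
    // Flag values outside the band [mean - k*std, mean + k*std]
    // computed over a sliding window of the preceding points.
    static bool[] DetectAnomalies(double[] series, int windowSize = 10, double k = 3.0)
    {
        var flags = new bool[series.Length];
        for (int i = windowSize; i < series.Length; i++)
        {
            var window = series.Skip(i - windowSize).Take(windowSize).ToArray();
            var mean = window.Average();
            var std = Math.Sqrt(window.Select(v => (v - mean) * (v - mean)).Average());
            flags[i] = Math.Abs(series[i] - mean) > k * std;
        }
        return flags;
    }

    static void Main()
    {
        // Stable fridge temperatures with one injected spike (e.g. a door opening).
        var temps = Enumerable.Repeat(4.0, 30).ToArray();
        temps[20] = 12.0;
        var flags = DetectAnomalies(temps);
        Console.WriteLine(flags[20]); // prints True: the spike falls outside the band
    }
}
```

Note that with a perfectly constant window the band collapses to zero width, so any deviation is flagged; real telemetry has noise that widens the band.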
18.
• Everything described so far is “univariate” anomaly
detection, on a single time series
• The multivariate anomaly detection
allows detecting anomalies from
groups of metrics
• Dependencies and inter-correlations
between different signals
• New capabilities have already been announced in this
area, but they are not yet available
Multivariate anomaly detection
#GLOBALAZURE2021
19.
Data Science for the .NET developer
20.
• ML.NET is first and foremost a framework that you can use to
create your own custom ML models. This custom approach
contrasts with “pre-built AI,” where you use pre-designed
general AI services from the cloud (like many of the offerings
from Azure Cognitive Services). This can work great for many
scenarios, but it might not always fit your specific business needs
due to the nature of the machine learning problem or to the
deployment context (cloud vs. on-premises).
• ML.NET enables developers to use their existing .NET skills to
easily integrate machine learning into almost any .NET
application. This means that if C# (or F# or VB) is your
programming language of choice, you no longer have to learn a
new programming language, like Python or R, in order to
develop your own ML models and infuse custom machine
learning into your .NET apps.
Data Science and AI for the .NET developer
21.
ML.NET Components
Anomaly Detection
23.
Independent Identically Distributed (iid)
• Data points collected in the time series are independently sampled from
the same distribution (independent identically distributed). Thus, the
value at the current timestamp can be viewed as the value at the next
timestamp in expectation.
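A hedged sketch of how this looks with ML.NET's IID spike detector. It requires the Microsoft.ML and Microsoft.ML.TimeSeries NuGet packages; the class names, sample values, confidence level, and history length below are illustrative assumptions.

```csharp
// Sketch only: requires the Microsoft.ML and Microsoft.ML.TimeSeries NuGet packages.
using System;
using System.Collections.Generic;
using Microsoft.ML;
using Microsoft.ML.Data;

public class TempReading { public float Value; }

public class SpikePrediction
{
    // Output vector: [alert flag, raw score, p-value]
    [VectorType(3)] public double[] Prediction;
}

class IidSpikeDemo
{
    static void Main()
    {
        var ml = new MLContext();
        var readings = new List<TempReading>();
        foreach (var v in new float[] { 4, 4, 4, 4, 4, 12, 4, 4, 4, 4 })
            readings.Add(new TempReading { Value = v });
        var data = ml.Data.LoadFromEnumerable(readings);

        // DetectIidSpike assumes points are independent and identically distributed.
        var pipeline = ml.Transforms.DetectIidSpike(
            outputColumnName: nameof(SpikePrediction.Prediction),
            inputColumnName: nameof(TempReading.Value),
            confidence: 95,
            pvalueHistoryLength: 5);

        var transformed = pipeline.Fit(data).Transform(data);
        foreach (var p in ml.Data.CreateEnumerable<SpikePrediction>(transformed, reuseRowObject: false))
            Console.WriteLine($"Alert={p.Prediction[0]} Score={p.Prediction[1]} P-Value={p.Prediction[2]}");
    }
}
```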
24.
Singular Spectrum Analysis (SSA)
• This class implements the general anomaly detection transform based on
Singular Spectrum Analysis (SSA). SSA is a powerful framework for
decomposing the time-series into trend, seasonality and noise
components as well as forecasting the future values of the time-series.
• In principle, SSA performs spectral analysis on the input time-series
where each component in the spectrum corresponds to a trend, seasonal
or noise component in the time-series
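In ML.NET this transform is exposed as `DetectSpikeBySsa` (Microsoft.ML.TimeSeries NuGet package). A hedged fragment, assuming an existing `MLContext ml` and an `IDataView data` with a float `Value` column; the window sizes are illustrative, not tuned values.

```csharp
// Fragment only: requires the Microsoft.ML and Microsoft.ML.TimeSeries NuGet packages.
// Assumes an MLContext `ml` and an IDataView `data` with a float "Value" column.
var pipeline = ml.Transforms.DetectSpikeBySsa(
    outputColumnName: "Prediction",
    inputColumnName: "Value",
    confidence: 98,
    pvalueHistoryLength: 30,
    trainingWindowSize: 90,      // points used to learn trend and seasonality
    seasonalityWindowSize: 30);  // upper bound on the expected seasonal period
var transformed = pipeline.Fit(data).Transform(data);
```

Unlike the IID detector, SSA first learns the trend/seasonality structure over the training window, so it can spot spikes relative to the pattern rather than to a flat distribution.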
25.
Spectrum Residual Cnn (SrCnn)
• To monitor the time-series continuously and alert for potential incidents on time
• The algorithm first computes the Fourier Transform of the original data. Then it computes the spectral
residual of the log amplitude of the transformed signal before applying the Inverse Fourier Transform
to map the sequence back from the frequency to the time domain. This sequence is called the saliency
map. The anomaly score is then computed as the relative difference between the saliency map values
and their moving averages. If the score is above a threshold, the value at a specific timestep is flagged
as an outlier.
• There are several parameters for the SR algorithm. To obtain a model with good performance, we suggest
tuning windowSize and threshold first; these are the most important parameters for SR. Then you
could search for an appropriate judgementWindowSize which is no larger than windowSize. And for
the remaining parameters, you could use the default values directly.
• Time-Series Anomaly Detection Service at Microsoft [https://arxiv.org/pdf/1906.03821.pdf]
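ML.NET exposes this algorithm as `DetectAnomalyBySrCnn` (Microsoft.ML.TimeSeries NuGet package). A hedged fragment, assuming an existing `MLContext ml` and an `IDataView data` with a float `Value` column; the parameter values follow the tuning advice above (windowSize and threshold first) but are illustrative.

```csharp
// Fragment only: requires the Microsoft.ML and Microsoft.ML.TimeSeries NuGet packages.
// Positional arguments: output column, input column, then the SR parameters.
var pipeline = ml.Transforms.DetectAnomalyBySrCnn(
    "Prediction", "Value",
    64,    // windowSize: tune this first
    5,     // back-add window size
    5,     // lookahead window size
    3,     // averaging window size
    21,    // judgementWindowSize: no larger than windowSize
    0.35); // threshold: tune this together with windowSize
var transformed = pipeline.Fit(data).Transform(data);
```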
26.
• Unsupervised Machine
Learning → no labelling
• Automated Training Set for
Anomaly Detection Algorithms
• the algorithm automatically
generates a simulated training set
based on your input data
• Auto(mated) ML → finds the best
tuning for you with parameters
and algorithms
Helping non-data-scientist developers (all of us!)
https://docs.microsoft.com/en-us/azure/machine-learning/algorithm-cheat-sheet
27.
Some tools required
• .NET 5 + WPF + ML.NET
• Mandatory: the platform where we run our experiments
• XPlot.Plotly (soon you will understand why I use this) https://fslab.org/XPlot/
• XPlot is a cross-platform data visualization package for the F# programming language
powered by popular JavaScript charting libraries Plotly and Google Charts. The library
provides a complete mapping for the configuration options of the underlying libraries and so
you get a nice F# interface that gives you access to the full power of Plotly and Google
Charts. The XPlot library can be used interactively from F# Interactive, but charts can equally
easy be embedded in F# applications and in HTML reports.
• WebView2 https://docs.microsoft.com/en-us/microsoft-edge/webview2/gettingstarted/wpf
• The Microsoft Edge WebView2 control enables you to embed web technologies (HTML, CSS,
and JavaScript) in your native apps. The WebView2 control uses Microsoft Edge (Chromium)
as the rendering engine to display the web content in native apps. With WebView2, you may
embed web code in different parts of your native app. Build all of the native app within a
single WebView instance.
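A hedged sketch of how these tools combine: plot the series with XPlot.Plotly, then render the generated HTML inside a WPF WebView2 control. It assumes the XPlot.Plotly and Microsoft.Web.WebView2 NuGet packages and a WebView2 control named `webView` in the window's XAML; the sample values are invented.

```csharp
// Fragment only: inside the WPF window's code-behind (an async event handler).
using XPlot.Plotly;

var temperatures = new double[] { 4.1, 4.0, 4.2, 12.7, 4.1, 4.0 };
var chart = Chart.Plot(new Graph.Scatter { y = temperatures, name = "fridge °C" });

// XPlot emits a self-contained HTML page; WebView2 (Chromium) renders it in the app.
await webView.EnsureCoreWebView2Async();
webView.NavigateToString(chart.GetHtml());
```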
28.
Experimenting with .NET and WPF
30.
Jupyter
• Evolution and generalization of the seminal role of Mathematica
• In web standards way
• Web (HTTP+Markdown)
• Python adoption (ipynb)
• Written in Java
• Python has an interop bridge...not native (if ever important)
• Python is a kernel for Jupyter
31.
.NET Interactive, Jupyter, and Visual Studio Code
• .NET Interactive gives C# and F# kernels to Jupyter
• .NET Interactive also gives you all the tools to create your own hosting application, independently of Jupyter
• In Visual Studio Code, there are two different notebook experiences (they look similar but are developed in parallel by different teams)
• .NET Interactive Notebooks (by the .NET Interactive team), which can also run Python
• Jupyter Notebooks (probably by the Azure Data Studio team), which can also run C# and F#
• There is a little confusion around this
• .NET Interactive has a strong C#/F# kernel...
• ...but a less mature infrastructure (compared to Jupyter)
32.
Experimenting ML.NET
with .NET Interactive
34.
.NET (5) hosting in Azure
(diagram) Migration paths from on-premises to Azure:
• Existing apps (on-premises): .NET web apps, monolithic / N-tier architectures, relational database
• Cloud Infrastructure-Ready (Azure): IaaS (Infrastructure as a Service), VMs, monolithic / N-tier architectures
• Cloud-Optimized (Azure): PaaS and Windows containers, managed services, monolithic / N-tier architectures
• Cloud-Native (Azure): PaaS for containerized microservices + serverless computing + managed services, microservices and serverless architectures
35.
Functions everywhere
(diagram) Azure Functions is portable across hosts, layered as platform / app delivery / OS:
• In Azure: the Azure Functions hosted service
• On-premises: App Service on Azure Stack (Windows)
• On non-Azure hosts: your code + the Azure Functions host runtime, delivered via the Azure Functions Core Tools or the Azure Functions Docker images (base, .NET, Node, ...)
36.
Logic Apps
• Visually design workflows in the cloud
• Express logic through powerful control flow
• Connect disparate functions and APIs
• Utilize a declarative definition to work with CI/CD
38.
Anomaly Detection As A Service in Azure
39.
Azure Cognitive Services
• Cognitive Services brings AI within reach of every developer—without requiring
machine-learning expertise. All it takes is an API call to embed the ability to see, hear,
speak, search, understand, and accelerate decision-making into your apps. Enable
developers of all skill levels to easily add AI capabilities to their apps.
• Five areas:
• Decision
• Language
• Speech
• Vision
• Web search
Within the Decision area:
• Anomaly Detector: identify potential problems early on
• Content Moderator: detect potentially offensive or unwanted content
• Metrics Advisor (preview): monitor metrics and diagnose issues
• Personalizer: create rich, personalized experiences for every user
40.
Anomaly Detector
• Through an API, Anomaly Detector ingests time-series data of all types
and selects the best-fitting detection model for your data to ensure high
accuracy. Customize the service to detect any level of anomaly and
deploy it where you need it most -- from the cloud to the intelligent
edge with containers. Azure is the only major cloud provider that offers
anomaly detection as an AI service.
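As a sketch of what "just an API call" means in practice, the fragment below builds the payload for the batch ("entire series") detection endpoint and posts it with only the Python standard library. The resource name and key are placeholders you substitute with your own; the payload shape follows the v1.0 REST API, but treat field names here as something to verify against the official reference.

```python
import json
import urllib.request

# Placeholders: replace with your own Cognitive Services resource and key.
ENDPOINT = "https://<your-resource>.cognitiveservices.azure.com"
API_PATH = "/anomalydetector/v1.0/timeseries/entire/detect"

def build_request(points, granularity="daily"):
    """Build the JSON payload for batch anomaly detection.
    `points` is a list of (iso_timestamp, value) tuples."""
    return {
        "series": [{"timestamp": t, "value": v} for t, v in points],
        "granularity": granularity,
    }

def detect(endpoint, key, payload):
    """POST the payload to the Anomaly Detector REST API and return the
    parsed response (it carries per-point anomaly flags)."""
    req = urllib.request.Request(
        endpoint + API_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Ocp-Apim-Subscription-Key": key,
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

The response pairs one entry per input point, so a caller can zip the returned flags back against the original series to mark the anomalous timestamps.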
41.
Anomaly Detection As A Service
42.
Anomaly Detector
It seems too simple…
45.
Conclusions
• Start simple and in bulk: you already have the data
• If you only have daily data, aggregate over a longer period (a month?) to get a training set
• Take the time to define a correct Data Lake strategy
• There will be time for real time later
• The right algorithm is the one that shows you what you need to see
• Professionals do the same (real data scientists aside)
• But if you know statistics, so much the better
• Azure Cognitive Services will become more important
• New Metrics Advisor service!
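The "aggregate daily data" point above can be sketched in a few lines of pure Python; `monthly_means` is a hypothetical helper name, not part of any library:

```python
from collections import defaultdict
from datetime import date

def monthly_means(daily):
    """Aggregate daily (date, value) points into per-month averages,
    a bulk starting point before any real-time pipeline exists."""
    buckets = defaultdict(list)
    for d, v in daily:
        buckets[(d.year, d.month)].append(v)
    return {ym: sum(vs) / len(vs) for ym, vs in sorted(buckets.items())}
```

The coarser monthly series is more stable, and usually long enough to serve as a first training set for a detector.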
46.
Thank you!
Marco Parenzan
Solution Sales Specialist @ Insight
Microsoft Azure MVP
1nn0va Community Lead
• https://docs.microsoft.com/en-us/azure/cognitive-services/anomaly-detector/
• https://docs.microsoft.com/en-us/dotnet/machine-learning/tutorials/sales-anomaly-detection
• https://github.com/dotnet/interactive
• https://docs.microsoft.com/en-us/dotnet/machine-learning/how-to-guides/serve-model-serverless-azure-functions-ml-net
• https://azure.microsoft.com/en-us/services/cognitive-services/metrics-advisor/
Anomaly detection is the process of identifying unexpected items or events in data sets, items that differ from the norm. It is often applied to unlabeled data, in which case it is known as unsupervised anomaly detection.
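The simplest unsupervised baseline is the classic three-sigma rule: the "normal" band is estimated from the series itself, with no labels. This Python sketch is illustrative and is not the algorithm any of the services above use:

```python
import math

def three_sigma_anomalies(series):
    """Flag points more than three standard deviations from the mean.
    No labelling needed: the normal band comes from the data itself."""
    n = len(series)
    mean = sum(series) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in series) / n)
    return [i for i, v in enumerate(series) if abs(v - mean) > 3 * std]
```

It breaks down on trends and seasonality, which is exactly why the SSA and Spectral Residual approaches described below exist.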
https://towardsdatascience.com/effective-approaches-for-time-series-anomaly-detection-9485b40077f1
Effective Approaches for Time Series Anomaly Detection | by Aditya Bhattacharya | Towards Data Science
SSA works by decomposing a time-series into a set of principal components. These components can be interpreted as the parts of a signal that correspond to trends, noise, seasonality, and many other factors. Then, these components are reconstructed and used to forecast values some time in the future.
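The SSA decomposition described above can be sketched compactly. This illustrative pure-Python version (not ML.NET's implementation; `window`, `rank`, and the power-iteration scheme are all simplifying choices) keeps only the top principal components of the trajectory matrix and scores each point by its reconstruction error:

```python
import math

def ssa_scores(series, window=12, rank=2, iters=200):
    """Score each point by its reconstruction error after keeping only
    the top `rank` principal components of the trajectory matrix."""
    n = len(series)
    k = n - window + 1
    # Trajectory (Hankel) matrix: column j is the lagged window at j.
    cols = [series[j:j + window] for j in range(k)]
    # C = X X^T, a small window x window matrix.
    C = [[sum(c[a] * c[b] for c in cols) for b in range(window)]
         for a in range(window)]
    recon_cols = [[0.0] * window for _ in range(k)]
    for _ in range(rank):
        # Power iteration for the current dominant eigenvector of C.
        u = [1.0] * window
        for _ in range(iters):
            v = [sum(C[a][b] * u[b] for b in range(window)) for a in range(window)]
            norm = math.sqrt(sum(t * t for t in v))
            u = [t / norm for t in v]
        lam = sum(u[a] * sum(C[a][b] * u[b] for b in range(window))
                  for a in range(window))
        # Add this component's contribution to each lagged vector.
        for j, c in enumerate(cols):
            proj = sum(ui * ci for ui, ci in zip(u, c))
            for i in range(window):
                recon_cols[j][i] += proj * u[i]
        # Deflate C so the next pass finds the next component.
        for a in range(window):
            for b in range(window):
                C[a][b] -= lam * u[a] * u[b]
    # Diagonal averaging: fold reconstructed columns back into a series.
    recon, count = [0.0] * n, [0] * n
    for j in range(k):
        for i in range(window):
            recon[j + i] += recon_cols[j][i]
            count[j + i] += 1
    return [abs(x - r / c) for x, r, c in zip(series, recon, count)]
```

Points the low-rank components reconstruct well (trend, seasonality) get scores near zero; a spike the components cannot explain stands out.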
The Spectral Residual outlier detector is based on the paper Time-Series Anomaly Detection Service at Microsoft and is suitable for unsupervised online anomaly detection in univariate time series data. The algorithm first computes the Fourier Transform of the original data. Then it computes the spectral residual of the log amplitude of the transformed signal before applying the Inverse Fourier Transform to map the sequence back from the frequency to the time domain. This sequence is called the saliency map. The anomaly score is then computed as the relative difference between the saliency map values and their moving averages. If the score is above a threshold, the value at a specific timestep is flagged as an outlier. For more details, please check out the paper.
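The steps in the paragraph above map almost line by line onto code. This pure-Python sketch uses a naive O(n²) DFT, so it only suits short series, and the window sizes `q` and `z` are illustrative choices rather than the paper's tuned values:

```python
import cmath
import math

def dft(xs):
    # Naive discrete Fourier transform (fine for short series).
    n = len(xs)
    return [sum(xs[k] * cmath.exp(-2j * math.pi * j * k / n) for k in range(n))
            for j in range(n)]

def idft(Xs):
    # Inverse transform, mapping the spectrum back to the time domain.
    n = len(Xs)
    return [sum(Xs[j] * cmath.exp(2j * math.pi * j * k / n) for j in range(n)) / n
            for k in range(n)]

def trailing_mean(xs, w):
    # Moving average over the last w points (shorter at the start).
    return [sum(xs[max(0, i - w + 1):i + 1]) / (i + 1 - max(0, i - w + 1))
            for i in range(len(xs))]

def spectral_residual(series, q=3, z=21):
    spectrum = dft(series)
    amp = [abs(a) for a in spectrum]
    phase = [cmath.phase(a) for a in spectrum]
    log_amp = [math.log(a + 1e-8) for a in amp]
    # Spectral residual: log amplitude minus its local average.
    residual = [l - m for l, m in zip(log_amp, trailing_mean(log_amp, q))]
    # Back to the time domain with the original phase: the saliency map.
    whitened = [cmath.exp(r + 1j * p) for r, p in zip(residual, phase)]
    saliency = [abs(v) for v in idft(whitened)]
    # Score: relative difference from the local saliency average.
    return [(s - m) / (m + 1e-8)
            for s, m in zip(saliency, trailing_mean(saliency, z))]
```

Thresholding the returned scores flags the outliers, exactly as the paragraph describes.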
What’s next?
Modernize applications with .NET Core
Today we focused on Cloud-optimized .NET Framework apps. However, many applications will benefit from modern architecture built on .NET Core – a much faster, modular, cross-platform, open source .NET. Websites can be modernized with ASP.NET Core to bring in better security, compliance, and much better performance than ASP.NET on .NET Framework. .NET Core also provides code patterns for building resilient, high-performance microservices on Linux and Windows.
Metrics Advisor, a new platform-as-a-service, provides an out-of-the-box intelligent metrics-monitoring platform.
It simplifies the monitoring lifecycle with a built-in web-based workspace where you can set up time-series monitoring, alerting, and diagnostics through a simple user interface.
A rich set of REST APIs and SDK libraries helps developers build custom solutions easily. Because Metrics Advisor provides an end-to-end monitoring pipeline, time to value is accelerated.