Subutai Ahmad, VP Research, presents NAB and discusses the need for evaluating real-time anomaly detection algorithms. This presentation was delivered at MLConf (Machine Learning Conference) in San Francisco in 2015.
Banking Circle: Money Laundering Beware: A Modern Approach to AML with Machin...Neo4j
by Ruben Menke, Lead Data Scientist at Banking Circle
In this talk, Banking Circle will show how a modern computational method is essential in the fight against money laundering.
Fraud Analytics with Machine Learning and Big Data Engineering for TelecomSudarson Roy Pratihar
Presentation of a successful telecom fraud analytics project at the 3rd International Conference for Business Analytics and Intelligence, Indian Institute of Management Bangalore.
Anomaly Detection and Spark Implementation - Meetup Presentation.pptxImpetus Technologies
StreamAnalytix sponsored a meetup on “Anomaly Detection Techniques and Implementation using Apache Spark”, which took place on Tuesday, December 5, 2017 at Larkspur Landing Milpitas Hotel, Milpitas, CA. The meetup was led by Maxim Shkarayev, Lead Data Scientist, Impetus Technologies, along with Punit Shah, Solution Architect, StreamAnalytix, and Anand Venugopal, Product Head & AVP, StreamAnalytix, who introduced and summarized the vast field of anomaly detection and its applications to various industry problems. The speakers also offered a structured approach to choosing the right anomaly detection techniques based on specific use cases and data characteristics, followed by a demonstration of some real-world anomaly detection use cases on an Apache Spark-based analytics platform.
Build an ensemble classifier that can detect fraudulent credit card transactions. The classifier was implemented using machine learning algorithms such as Decision Trees, Logistic Regression, Artificial Neural Networks and Gradient Boosting.
Anomaly detection is a topic with many different applications. From social media tracking, to cybersecurity, anomaly detection (or outlier detection) algorithms can have a huge impact in your organisation.
For the video please visit: https://www.youtube.com/watch?v=XEM2bYYxkTU
This slideshare has been produced by the Tesseract Academy (http://tesseract.academy), a company that educates decision makers in deep technical topics such as data science, analytics, machine learning and blockchain.
If you are interested in data science and related topics, make sure to also visit The Data Scientist: http://thedatascientist.com.
Building the Artificially Intelligent EnterpriseDatabricks
This session looks at where we are today with data and analytics and what is needed to transition to the Artificially Intelligent Enterprise.
How do you mobilise developers to exploit what data scientists and business analysts have built? How do you align it all with business strategy to maximise business outcomes? How do you combine BI, predictive and prescriptive analytics, automation and reinforcement learning to get maximum value across the enterprise? What is the blueprint for building the artificially intelligent enterprise?
•Data and analytics – Where are we?
•Why is the journey only half-way done?
•2021 and beyond – The new era of AI usage and not just build
•The requirement – event-driven, on-demand and automated analytics
•Operationalising what you build – DataOps, MLOps and RPA
•Mobilising the masses to integrate AI into processes – what needs to be done?
•Business strategy alignment – the guiding light to AI utilisation for high reward
•Agility step change – the shift to no-code integration of AI by citizen developers
•Recording decisions, and analysing business impact
•Reinforcement-learning – transitioning to continuous reward
Anomaly detection (or outlier analysis) is the identification of items, events or observations which do not conform to an expected pattern or to other items in a dataset. It is used in applications such as intrusion detection, fraud detection, fault detection and process monitoring in various domains including energy, healthcare and finance. In this talk, we will introduce anomaly detection and discuss the various analytical and machine learning techniques used in this field. Through a case study, we will discuss how anomaly detection techniques could be applied to energy data sets. We will also demonstrate, using R and Apache Spark, an application to help reinforce concepts in anomaly detection and best practices in analyzing and reviewing results.
This session will go into best practices and detail on how to architect a near real-time application on Hadoop using an end-to-end fraud detection case study as an example. It will discuss various options available for ingest, schema design, processing frameworks, storage handlers and others, available for architecting this fraud detection application and walk through each of the architectural decisions among those choices.
Unsupervised Anomaly Detection with Isolation Forest - Elena SharovaPyData
PyData London 2018
This talk will focus on the importance of correctly defining an anomaly when conducting anomaly detection using unsupervised machine learning. It will include a review of Isolation Forest algorithm (Liu et al. 2008), and a demonstration of how this algorithm can be applied to transaction monitoring, specifically to detect money laundering.
---
www.pydata.org
PyData is an educational program of NumFOCUS, a 501(c)3 non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R.
PyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.
Wealth management is facing significant disruption on two fronts – customer experience and digital transformation. To effectively succeed within these turbulent times, understanding client demographics and expectations is essential. Firms can leverage deep customer insights to grasp their clients’ changing ethos and develop solutions accordingly. Improved customer satisfaction often drives competitive advantage. As firms prioritize superior customer experience, they are adopting intelligent solutions such as analysis of consumer sentiments to deliver hyper-personalized services. Firms are also leveraging artificial intelligence (AI) and machine learning (ML) techniques to improve client-advisor relationships. To innovate, especially within legacy infrastructures, organizations must embrace open APIs to scale technology capability with support from WealthTech newcomers and third-party vendors that offer generic and customizable API-based platforms. Regulations such as the EU’s General Data Protection Regulation (GDPR) and know your customer (KYC) mandates are pushing firms to ramp up cybersecurity and automate cumbersome client onboarding processes, in a data-driven compliance scenario.
Big Data Helsinki v 3 | "Federated Learning and Privacy-preserving AI" - Oguz...Dataconomy Media
"Machine learning algorithms require significant amounts of training data which has been centralized on one machine or in a datacenter so far. For numerous applications, such need of collecting data can be extremely privacy-invasive. Recent advancements in AI research approach this issue by a new paradigm of training AI models, i.e., Federated Learning.
In federated learning, edge devices (phones, computers, cars etc.) collaboratively learn a shared AI model while keeping all the training data on device, decoupling the ability to do machine learning from the need to store the data in the cloud. From personal data perspective, this paradigm enables a way of training a model on the device without directly inspecting users’ data on a server. This talk will pinpoint several examples of AI applications benefiting from federated learning and the likely future of privacy-aware systems."
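The idea described in the abstract above can be sketched in a few lines. This is a minimal illustration of federated averaging on a one-parameter linear model, not the method presented in the talk; the function names (`local_update`, `federated_round`, `train`), the learning rate and the toy data layout are all assumptions made for the example. Each device trains on its own data, and the server only ever receives a weight and a sample count.

```python
def local_update(weight, data, lr=0.01, epochs=5):
    """Run a few gradient-descent steps on one device's private (x, y) pairs."""
    w = weight
    for _ in range(epochs):
        # Gradient of mean squared error for the model y = w * x
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    # Only the trained weight and the sample count leave the device
    return w, len(data)


def federated_round(global_weight, devices):
    """One round: every device trains locally, the server averages the results."""
    updates = [local_update(global_weight, data) for data in devices]
    total = sum(n for _, n in updates)
    # Weighted average, so devices with more samples count for more
    return sum(w * n for w, n in updates) / total


def train(devices, rounds=50):
    """Repeat federated rounds starting from a zero weight (hypothetical default)."""
    w = 0.0
    for _ in range(rounds):
        w = federated_round(w, devices)
    return w
```

Calling `train` with a list of per-device datasets returns the shared weight. A real deployment would typically add secure aggregation and device sampling, which this sketch leaves out.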
MLOps - Getting Machine Learning Into ProductionMichael Pearce
Creating autonomy and self-sufficiency by giving people what they need in order to do the things they need to do! What gets in the way, and how can we overcome those barriers? How do we get started quickly, effectively and safely? We'll come together to look at what MLOps entails, some of the tools available and what common MLOps pipelines look like.
Part 1
- Introduction
- Application for Anomaly Detection
- AIOps
- GraphDB
Part 2
- Type Of Anomaly Detection
- How to Identify Outliers in your Data
Part 3
- Anomaly Detection for Timeseries Technique
Data deduplication, or entity resolution, is a common problem for anyone working with data, especially public data sets. Many real-world datasets do not contain unique IDs; instead, we often use a combination of fields to identify unique entities across records by linking and grouping. This talk will show how we can use active learning techniques to train learnable similarity functions that outperform standard similarity metrics (such as edit or cosine distance) for deduplicating data in a graph database. Further, we show how these techniques can be enhanced by inspecting the structure of the graph to inform the linking and grouping processes. We will demonstrate how to use open source tools to perform entity resolution on a dataset of campaign finance contributions loaded into the Neo4j graph database.
A recurring pattern we see in centralized data platforms is to extract data from various operational systems, then clean or process it, and only at the end work out how to obtain value from it. The problem is that data is ubiquitous and changes constantly over time, and this type of centralized architecture is decomposed by technical capability and simply does not scale. This talk will explain the theory behind Data Mesh and the trials that have been run at ThoughtWorks. Data Mesh is a paradigm based on modern distributed architecture that considers decomposition into domains, platform thinking to create a self-service data infrastructure, and the treatment of data as a product.
Brains, Data, and Machine Intelligence (2014 04 14 London Meetup)Numenta
Jeff will discuss brains, data and machine intelligence, the Cortical Learning Algorithm he developed, and the Numenta Platform for Intelligent Computing (NuPIC).
Why Neurons have thousands of synapses? A model of sequence memory in the brainNumenta
Presentation given by Yuwei Cui, Numenta Research Engineer at Beijing Normal University. December 2015.
Collaborators: Jeff Hawkins, Subutai Ahmad, Chetan Surpur
Abstract:
There’s no question that we are seeing an increase in the availability of streaming, time-series data. Largely driven by the rise of the Internet of Things (IoT) and connected real-time data sources, we now have an enormous number of applications with sensors that produce important data that changes over time. This data presents a challenge and opportunity for businesses across every industry. How do they handle the onslaught of streaming data? How can they exploit it to make decisions in real-time? One way is to detect, in real time, when something unusual occurs. Early anomaly detection in streaming data has significant implications, yet can be very difficult to execute. It requires detectors to process data in real-time, not batches, and learn while simultaneously making predictions. In this talk, we’ll look at algorithms designed for such data and analyze the components that lead to optimal performance. We’ll also discuss a new benchmark with a labeled, real-world data set, designed to provide a controlled and repeatable environment of open-source tools to test and measure anomaly detection algorithms on streaming data. How do we score in a way that rewards algorithms that detect all anomalies as soon as possible, triggers no false alarms, works with real-world time-series data across a variety of domains, and automatically adapts to changing statistics?
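The requirement that a detector process records one at a time and learn while it predicts can be illustrated with a small sketch. It is not the algorithm from the talk and not the NAB scoring method; it swaps in a simple running z-score (using Welford's online mean and variance), and the class name, `threshold` and `warmup` values are hypothetical choices for the example.

```python
import math


class StreamingZScoreDetector:
    """Online detector: scores each value first, then learns from it."""

    def __init__(self, threshold=3.0, warmup=10):
        self.threshold = threshold  # hypothetical cut-off, in standard deviations
        self.warmup = warmup        # records to observe before flagging anything
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0               # running sum of squared deviations from the mean

    def process(self, value):
        """Return True if `value` looks anomalous, then update the statistics."""
        is_anomaly = False
        if self.count >= self.warmup:
            std = math.sqrt(self.m2 / (self.count - 1))
            if std > 0 and abs(value - self.mean) / std > self.threshold:
                is_anomaly = True
        # Welford's update: no batch and no stored history are needed
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (value - self.mean)
        return is_anomaly


def detect(stream, **kwargs):
    """Return the indices of records flagged as anomalous in `stream`."""
    detector = StreamingZScoreDetector(**kwargs)
    return [i for i, v in enumerate(stream) if detector.process(v)]
```

Because every record updates the statistics, the detector adapts when the data drifts, but a large anomaly also inflates the variance and can mask later ones. That trade-off is one reason a benchmark with labeled data is useful for comparing detectors.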
Anomaly Detection - Real World Scenarios, Approaches and Live ImplementationImpetus Technologies
Detecting anomalous patterns in data can lead to significant actionable insights in a wide variety of application domains, such as fraud detection, network traffic management, predictive healthcare, energy monitoring and many more.
However, detecting anomalies accurately can be difficult. What qualifies as an anomaly is continuously changing and anomalous patterns are unexpected. An effective anomaly detection system needs to continuously self-learn without relying on pre-programmed thresholds.
Join our speakers Ravishankar Rao Vallabhajosyula, Senior Data Scientist, Impetus Technologies and Saurabh Dutta, Technical Product Manager - StreamAnalytix, in a discussion on:
- Importance of anomaly detection in enterprise data, types of anomalies, and challenges
- Prominent real-time application areas
- Approaches, techniques and algorithms for anomaly detection
- Sample use-case implementation on the StreamAnalytix platform
How the Big Data of APM can Supercharge DevOpsCA Technologies
In an age where applications reign supreme, your organization must be agile in application performance management and app development in order to meet market demands and stay competitive. Even with mature APM solutions, developer, test and operations teams are strained by operational complexity, accelerated release schedules, and big data challenges as they try to quickly find the root cause of issues affecting end-user experience.
The power of advanced analytics and data science can help us make the most of the vast cache of APM data we collect and help our DevOps teams supercharge user experience. It’s time to take some of the load off of our humans and let technology make it easier to focus on meaningful changes in user, application and system behavior. Analytics are becoming a valuable component of APM solutions to redefine triage, improve application quality, and delight the end-user.
In a webcast on August 7th, 2014, Ken Godskind, Chief blogger and Analyst, APMExaminer.com shared how the big data of APM can supercharge your DevOps transformation. Chris Kline, Senior Director, CA Technologies followed Ken and discussed how the Advanced Behavior Analytics capability of CA APM can assist in this journey.
Ken and Chris used this slide set during the webcast which can be viewed at http://goo.gl/TZYEuq
Grails has great performance characteristics but as with all full stack frameworks, attention must be paid to optimize performance. In this talk Lari will discuss common missteps that can easily be avoided and share tips and tricks which help profile and tune Grails applications.
In this talk I will review several real-world applications and tools developed at the University of Waikato over the past 15 years. The early applications focused on agricultural problems such as cow culling, venison bruising and grass grubs. Following this we looked at the use of near infrared spectroscopy coupled with data mining as an alternative laboratory technique for predicting compound concentrations in soil and plant samples. Our latest application is in the area of gas chromatography mass spectrometry (GCMS), a technique used in environmental applications to determine, for example, the petroleum content in soil and water samples.
(Mike Graham + Dan Carroll, Comcast) Kafka Summit SF 2018
Comcast manages over 2 million miles of fiber and coax, and over 40 million in-home devices. This “outside plant” is subject to adverse conditions from severe weather to power grid outages to construction-related disruptions. Maintaining the health of this large and important infrastructure requires a distributed, scalable, reliable and fast information system capable of real-time processing and rapid analysis and response. Using Apache Kafka and the Kafka Streams Processor API, Comcast built an innovative new system for monitoring, problem analysis, metrics reporting and action response for the outside plant.
In this talk, you’ll learn how topic partitions, state stores, key mapping, source and sink topics and processors from the Kafka Streams Processor API work together to build a powerful dynamic system. We will dive into the details about the inner workings of the state store—how it is backed by a Kafka “changelog” topic, how it is scaled horizontally by partition and how the instances are rebuilt on startup or on processor failure. We will discuss how these state stores essentially become like materialized views in a SQL database but are updated incrementally as data flows through the system, and how this allows the developers to maintain the data in the optimal structures for performing the processing. The best part is that the data is readily available when needed by the processors. You will see how a REST API using Kafka Streams “interactive queries” can be used to retrieve the data in the state stores. We will explore the deployment and monitoring mechanisms used to deliver this system as a set of independently deployed components.
Customers always have the right to request a meter test.
Some utilities and some jurisdictions allow for testing at the customer site, others require a test in a laboratory environment.
Some allow the customer to witness the test and others require the utility commission to witness the test.
Utilities must show that the meter tests well and must demonstrate that they have a test program in place to ensure the meters in service are performing well.
This presentation will demonstrate:
Why do we test?
How do we test?
What types of meter tests are there?
How do utility tests differ from customer request tests?
What is In-Service Testing?
How do we know meter tests are good?
What do we do with the test data?
HBaseCon 2015: HBase as an IoT Stream Analytics Platform for Parkinson's Dise...HBaseCon
In this session, you will learn about a solution developed in partnership between Intel and the Michael J. Fox Foundation to enable breakthroughs in Parkinson's disease (PD) research by leveraging wearable sensors and smartphones to monitor PD patients' motor movements 24/7. We'll elaborate on how we're using HBase for time-series data storage and integrating it with various stream, batch, and interactive technologies. We'll also review our efforts to create an interactive querying solution over HBase.
SmartData Webinar: Applying Neocortical Research to Streaming AnalyticsDATAVERSITY
We are witnessing an explosion of sensors and machine generated data. Every server, every building, and every device generates a continuous stream of information that is ever changing and potentially valuable. The existing big data paradigm requires storing data for batch analysis, and extensive modeling by a human expert, prior to deployment. This is incredibly inefficient and cannot scale.
In this webinar, Ahmad will describe a new paradigm for streaming data algorithms, based on recent neuroscience findings and on the computational properties of the neocortex. These systems are highly automated, adapt to changing statistics, and naturally deal with temporal data streams. Many of the core ideas have been implemented in the open source project NuPIC, and validated in commercial anomaly detection and predictive maintenance applications. Given the massive increase in the number of data sources, a general-purpose automated approach is the only scalable way to effectively analyze and act on continuously streaming information.
The use of machine intelligence in the manufacturing industry poses a special challenge due to the wide range of use cases, the inherent complexity of data collection, the limited availability of information and the disconnect between information islands in different manufacturing steps.
In our talk we present several machine intelligence projects we carried out in the manufacturing industry, which helped our customers improve product quality, reduce costs and manage assets better. We will talk about the methodologies used, the results achieved and the lessons learned from these projects. We will specifically focus on the importance of process and business knowledge for the successful implementation of any industrial project.
5. 5
YET ANOTHER BENCHMARK?
• A benchmark consists of:
  • Labeled data sets
  • Scoring mechanism
  • Versioning system
• Most existing benchmarks are designed for batch data, not streaming data
• Hard to find benchmarks containing real-world data labeled with anomalies
• We saw a need for a benchmark that is designed to test anomaly detection algorithms on real-time, streaming data
• A standard community benchmark could spur innovation in real-time anomaly detection algorithms
9. 9
NUMENTA ANOMALY BENCHMARK (NAB)
• NAB: a rigorous benchmark for anomaly detection in streaming applications
• Real-world benchmark data set
  • 58 labeled data streams (47 real-world, 11 artificial streams)
  • Total of 365,551 data points
• Scoring mechanism
  • Reward early detection
  • Anomaly windows
  • Scoring function
  • Different “application profiles”
• Open resource
  • AGPL repository contains data, source code, and documentation
  • github.com/numenta/NAB
14. 14
HOW SHOULD WE SCORE ANOMALIES?
• The perfect detector
  • Detects every anomaly
  • Detects anomalies as soon as possible
  • Provides detections in real time
  • Triggers no false alarms
  • Requires no parameter tuning
  • Automatically adapts to changing statistics
• Scoring methods in traditional benchmarks are insufficient
  • Precision/recall does not incorporate importance of early detection
  • Artificial separation into training and test sets does not handle continuous learning
  • Batch data files allow look-ahead and multiple passes through the data
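The point about precision/recall can be illustrated with a minimal Python sketch. This is not part of NAB; the function name `precision_recall`, the anomaly window, and the timestamps are all hypothetical. It computes window-level precision and recall for two detectors that catch the same anomaly, one near the start of the window and one near the end.

```python
def precision_recall(detections, windows):
    """Window-level precision and recall (hypothetical helper, not NAB code).

    detections: list of timestamps at which a detector fired.
    windows: list of (start, end) labeled anomaly windows.
    The position of a detection inside its window plays no role here.
    """
    inside = [t for t in detections
              if any(start <= t <= end for start, end in windows)]
    hit = [w for w in windows
           if any(w[0] <= t <= w[1] for t in detections)]
    precision = len(inside) / len(detections) if detections else 1.0
    recall = len(hit) / len(windows) if windows else 1.0
    return precision, recall


if __name__ == "__main__":
    windows = [(100, 200)]          # one labeled anomaly window
    early_detector = [105]          # fires right after the window opens
    late_detector = [195]           # fires just before the window closes
    print(precision_recall(early_detector, windows))  # (1.0, 1.0)
    print(precision_recall(late_detector, windows))   # (1.0, 1.0)
```

Both detectors get identical numbers, even though the early one would give an operator far more time to react.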
18. 18
SCORING FUNCTION
• Effect of each detection is scaled relative to position within window:
  • Detections outside window are false positives (scored low)
  • Multiple detections within window are ignored (use earliest one)
• Total score is sum of scaled detections + weighted sum of missed detections
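The formulas on this slide did not survive in the transcript, so the following Python sketch is only one plausible reading of the bullets. It assumes a scaled sigmoid over the relative position in the window, which is my understanding of the paper listed under NAB Resources. The steepness constant 5, the cut-off at 3.0, the profile weights, and every name (`Profile`, `PROFILES`, `scaled_sigmoid`, `score_stream`) are illustrative assumptions, not the NAB implementation.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    """Hypothetical container for one application profile's weights."""
    tp_weight: float
    fp_weight: float
    fn_weight: float


# Illustrative weights only (assumed); check the NAB repository for real values.
PROFILES = {
    "standard": Profile(1.0, 0.11, 1.0),
    "reward_low_fp": Profile(1.0, 0.22, 1.0),
    "reward_low_fn": Profile(1.0, 0.11, 2.0),
}


def scaled_sigmoid(y):
    """Map relative position y into [-1, 1).

    y = -1 is the window start, y = 0 the window end, y > 0 is past the window.
    """
    if y > 3.0:  # far past the window: saturate and avoid overflow in exp()
        return -1.0
    return 2.0 / (1.0 + math.exp(5.0 * y)) - 1.0


def score_stream(detections, windows, profile):
    """Sum scaled detections, then subtract a weighted count of missed windows.

    Assumes non-overlapping (start, end) windows with end > start.
    """
    total = 0.0
    detected = set()
    for t in sorted(detections):
        hit = next((i for i, (s, e) in enumerate(windows) if s <= t <= e), None)
        if hit is not None:
            if hit in detected:
                continue  # only the earliest detection in a window counts
            detected.add(hit)
            s, e = windows[hit]
            total += profile.tp_weight * scaled_sigmoid((t - e) / (e - s))
        else:
            earlier = [(s, e) for s, e in windows if e < t]
            if earlier:
                # False positive after a window: penalty grows with distance.
                s, e = max(earlier, key=lambda w: w[1])
                total += profile.fp_weight * scaled_sigmoid((t - e) / (e - s))
            else:
                # False positive before any window: full penalty.
                total -= profile.fp_weight
    missed = len(windows) - len(detected)
    return total - profile.fn_weight * missed
```

With these assumptions, a detection near the start of a window earns almost the full true-positive weight, a detection at the very end earns roughly zero, and each undetected window costs `fn_weight`.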
20. 20
OTHER DETAILS
• Application profiles
  • Three application profiles assign different weightings based on the tradeoff between false positives and false negatives.
  • EKG data on a cardiac patient tolerates false positives, since a missed anomaly is the costlier error.
  • IT / DevOps professionals hate false positives.
  • Three application profiles: standard, favor low false positives, favor low false negatives.
• NAB emulates practical real-time scenarios
  • Look-ahead is not allowed for algorithms. Detections must be made on the fly.
  • No separation between training and test files. Invoke model, start streaming, and go.
  • No batch, per-dataset parameter tuning. Must be fully automated with a single set of parameters across datasets. Any further parameter tuning must be done on the fly.
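The real-time constraints above can be sketched in Python as a detector that sees one record at a time and never looks ahead. This is a hypothetical rolling z-score detector, not one of the algorithms in the benchmark and not NAB's detector interface. The class name, the `handle_record` method, and the default window and threshold are assumptions chosen for illustration.

```python
import math
from collections import deque


class RollingZScoreDetector:
    """Hypothetical streaming detector: one record in, one flag out."""

    def __init__(self, window=50, threshold=3.0):
        # A single fixed parameter set, reused unchanged for every stream.
        self.history = deque(maxlen=window)
        self.threshold = threshold

    def handle_record(self, value):
        """Score the new value against past data only, then learn from it."""
        flagged = False
        n = len(self.history)
        if n >= 2:
            mean = sum(self.history) / n
            var = sum((x - mean) ** 2 for x in self.history) / (n - 1)
            std = math.sqrt(var)
            if std > 0 and abs(value - mean) / std > self.threshold:
                flagged = True
        self.history.append(value)  # continuous learning, no train/test split
        return flagged


def run_stream(detector, values):
    """Feed values in order and return the indices that were flagged."""
    return [i for i, v in enumerate(values) if detector.handle_record(v)]


if __name__ == "__main__":
    stream = [10.0, 10.1, 9.9, 10.0, 10.2, 9.8] * 5 + [50.0] + [10.0] * 3
    print(run_stream(RollingZScoreDetector(), stream))  # [30]
```

The decision for each record is made before the record is added to the history, so no future data can influence a detection, and the same defaults are used regardless of the data set.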
22. 22
TESTING ALGORITHMS WITH NAB
• NAB is a community effort
  • The goal is to have researchers independently evaluate a large number of algorithms
  • Very easy to plug in and test new algorithms
• Seed results with three algorithms:
  • Hierarchical Temporal Memory
    • Numenta’s open source streaming anomaly detection algorithm
    • Models temporal sequences in data, continuously learning
  • Etsy Skyline
    • Popular open source anomaly detection technique
    • Mixture of statistical experts, continuously learning
  • Twitter ADVec
    • Open source anomaly detection released earlier this year
    • Robust outlier statistics + piecewise approximation
24. 24
DETECTION RESULTS: CPU USAGE ON PRODUCTION SERVER
[Figure: detections from Etsy Skyline, Numenta HTM, and Twitter ADVec on the CPU usage stream; red denotes a false positive. Annotations: “Simple spike, all 3 algorithms detect”; “Shift in usage”.]
25. 25
DETECTION RESULTS: MACHINE TEMPERATURE READINGS
[Figure: detections from Etsy Skyline, Numenta HTM, and Twitter ADVec on the machine temperature stream; red denotes a false positive. Annotations: “HTM detects purely temporal anomaly”; “All 3 detect catastrophic failure”.]
26. 26
DETECTION RESULTS: TEMPORAL CHANGES IN BEHAVIOR OFTEN PRECEDE A LARGER SHIFT
[Figure: detections from Etsy Skyline, Numenta HTM, and Twitter ADVec; red denotes a false positive. Annotation: “HTM detects anomaly 3 hours earlier”.]
28. 28
SUMMARY
• Anomaly detection is the most common application for streaming analytics
• NAB is a community benchmark for streaming anomaly detection
  • Includes a labeled dataset with real data
  • Scoring methodology designed for practical real-time applications
  • Fully open source codebase
• What’s next for NAB?
  • We hope to see researchers test additional algorithms
  • We hope to spark improved algorithms for streaming
  • More data sets!
    • Could incorporate UC Irvine dataset, Yahoo Labs dataset (not open source)
    • Would love to get more labeled streaming datasets from you
  • Add support for multivariate anomaly detection
29. 29
NAB RESOURCES
Table 12 at MLConf
Repository: https://github.com/numenta/NAB
Paper:
A. Lavin and S. Ahmad, “Evaluating Real-time Anomaly Detection Algorithms – the Numenta Anomaly Benchmark,” to appear in 14th International Conference on Machine Learning and Applications (IEEE ICMLA’15), 2015.
Preprint available: http://arxiv.org/abs/1510.03336
Contact info:
sahmad@numenta.com, alavin@numenta.com