Anomaly Detection At The Edge

Statistical Learning Principal at Machine Zone, Inc.
Jun. 25, 2020


  1. ANOMALY DETECTION AT THE EDGE. Arun Kejariwal (@arun_kejariwal), Ira Cohen (@irairacohen)
  2. EDGE COMPUTING. An Overview: What, Drivers, Example use cases. SUPPLEMENT CLOUD COMPUTING: computing closer to the data source, reduce network traffic. NEW TECHNOLOGIES: 5G, 1.1B connections by 2023, 8.9% of all mobile device connections*^. Internet of Things: sensors, actuators, distributed computing topology. Improve Accuracy of Events: network outages, machine downtime, and weather change. * (Dec 2019) ^ (Feb 2020)
  3. BUSINESS OPPORTUNITY: CAGR of 26.5%, USD 9B by 2024*. Real-Time: low-latency processing, automated decision making. INCREASING DATA VOLUME: low half-life or low value. Efficiency: reduction in network traffic and compute needed. * (Aug 2019)
  4. EDGE COMPUTING: Bridging the Processing ⟷ Data Gap. * Image borrowed from … (May 2019)
  5. DECENTRALIZATION. COMPUTE: federated learning. STORAGE: high density, ultra-low read/write latency.
  6. DECENTRALIZATION. DEPENDABILITY: ensuring continuous operation even in the wake of a network outage. DATA SECURITY: limited capability of authentication and encryption.
  7. INDUSTRIES. Autonomous Vehicles: connected cars, increase situational awareness. TELECOMMUNICATIONS: Quality-of-Service (QoS). HEALTHCARE and LIFE SCIENCES: personalization, monitoring. MANUFACTURING: predictive maintenance. RETAIL AND CONSUMER GOODS: hypertargeting.
  8. #COVID-19. Remote health sensing to identify clusters of suspected illnesses. Monitoring of early onset of the disease in mild patients (to quickly identify if hospitalization is needed). Monitoring of lab testing devices and results (to check for false positives and false negatives).
  9. INDUSTRIES. Energy and Utilities: improve efficiency. Agriculture: precision farming to improve yield. Datacenters: PUE, real-time monitoring for work-site safety conditions. FINANCE: detecting fraud.
  10. USE CASES. * Image borrowed from … (May 2019)
  11. USE CASES. Video Analytics: real-time traffic monitoring. Security: real-time surveillance. Productivity: voice-based digital assistants. Virtual reality: multiplayer gaming. Augmented reality: shopping.
  12. USE CASES. SMART HOMES: energy efficiency, smart meters. SMART TRANSPORTATION: optimize route planning; stores and restaurants nearby. DATA REPORTING: preprocessing to improve latency and reduce bandwidth requirement. Environmental Monitoring: air quality, water quality in lakes and rivers.
  13. USE CASES. AUTOMATION: voice control, conversational interfaces, robots, drones. HEALTH & SAFETY: real-time monitoring. PRIVACY: data filtering.
  14. USE CASES. LOGISTICS: last-mile tracking. SMART CITIES: smart parking (dynamic pricing), structural monitoring (streetlights and bridges). RAILWAYS: condition-based maintenance (trains, tracks, navigation systems). INSURANCE: reduction of collision and theft.
  15. IMPLICATIONS ON HARDWARE. * Image borrowed from …
  16. AI AT THE EDGE. AUDITORY: personal assistants. VISUAL: images, video and live video. Tactile: safety.
  17. AI AT THE EDGE: On Device. Enabling disconnected, local interactions. Productivity: language translation. Authentication: facial recognition. Vision: no-checkout stores. Security Cameras: motion detection. AR/VR: metaverses for the home, workplace, amusement park, school and social settings. Personalization: recommendations.
  18. ON-DEVICE SPEECH RECOGNITION/SYNTHESIS. Text-to-Speech: news (#COVID-19), driving directions; Text → Acoustic features → Waveforms. Lightweight vocoders: SqueezeWave*. Small Model Size: tens of MB. Speed-Accuracy trade-off: pooling, chunk-wise attention. Optimizations: hyper low-rank approximation^, mixed low-precision quantization#. Parallel algorithms: leverage hardware accelerators, e.g. DSPs. Visual Speech Recognition: language recognition, speech reconstruction (lip reading). * “SqueezeWave: Extremely Lightweight Vocoders for On-device Speech Synthesis”, Zhai et al. 2020. ^ “Attention based on-device streaming speech recognition with large speech corpus”, Kim et al. 2020. # “Memory-Driven Mixed Low Precision Quantization For Enabling Deep Network Inference On Microcontrollers”, Rusci et al. 2019.
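The quantization item on slide 18 can be illustrated with a minimal sketch of uniform affine 8-bit quantization in pure Python. This is a deliberately simpler scheme than the mixed low-precision method of Rusci et al., which assigns different bit widths to different tensors; the helper names `quantize` and `dequantize` are hypothetical and not taken from the slides.

```python
def quantize(values, num_bits=8):
    """Map floats to unsigned integers that share one scale and zero point."""
    qmax = (1 << num_bits) - 1
    lo, hi = min(values), max(values)
    # Guard against a constant input, where hi == lo would give a zero scale.
    scale = (hi - lo) / qmax if hi > lo else 1.0
    zero_point = round(-lo / scale)
    # Clip to the representable integer range [0, qmax].
    q = [min(qmax, max(0, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point


def dequantize(q, scale, zero_point):
    """Recover approximate floats from the integer codes."""
    return [(x - zero_point) * scale for x in q]


weights = [-1.0, -0.5, 0.0, 0.25, 1.0]
codes, scale, zero_point = quantize(weights)
restored = dequantize(codes, scale, zero_point)
```

Each weight is stored in one byte instead of four or eight, at the cost of a reconstruction error on the order of half a quantization step for values inside the observed range.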
  19. CHALLENGES. COMPUTE: low power. SPACE: small form factor. BATTERY LIFE: intermittent power. Withstand rugged environments (weather, vibration and connectivity). Sensors: poor and noisy.
  20. INFERENCE, TRAINING. Example: Pedestrian navigation. FEATURES: Inertial* (accelerometer, gyroscope, magnetometer), temperature. MODELS: MobileNets^, EdgeCNN#, ApproxNet, IONet, Fire SSD. INFERENCE: velocity, orientation, trajectory, activity; compute and memory constrained. Environmental changes: concept drift, non-stationarity. Online (Re-)Training: network unreliability, low power, convergence rate. Distributed Learning: fully decentralized / peer-to-peer, multi-agent optimization, federated learning. * Deep Learning based Pedestrian Inertial Navigation: Methods, Dataset and On-Device Inference, Chen et al. 2020. ^ … # EdgeCNN: Convolutional Neural Network Classification Model with small inputs for Edge Computing, Yang et al. 2019.
  21. FEDERATED LEARNING*^. Share the weights, not the raw data. Applications: mobile keyboard, vocal classifiers, next word prediction; predicting future hospitalization, patient similarity learning. MODELS: single hidden-layer FF networks, autoencoders, federated momentum. Improving Efficiency and Effectiveness#: handling unbalanced and non-IID data, neural architecture search, multi-task learning, domain adaptation, meta learning. Challenges: convergence time, communication between devices, bias in training data, compliance (HIPAA, FERPA). * “Towards Federated Learning at Scale: System Design”, Bonawitz et al. 2019. ^ “A Survey on Federated Learning Systems: Vision, Hype and Reality for Data Privacy and Protection”, Li et al. 2019. # “Advances and Open Problems in Federated Learning”, Kairouz et al. 2019.
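"Share the weights, not the raw data" can be sketched as a server-side aggregation step in the style of federated averaging: each device trains locally and reports only its weight vector and its example count. The function name `federated_average` and the flat-list weight representation are assumptions made for illustration; the slides do not prescribe a specific aggregation rule.

```python
def federated_average(client_updates):
    """Combine client weight vectors, weighting each by its example count.

    client_updates: list of (weights, num_examples) pairs, where weights is a
    list of floats with the same length for every client.
    """
    total = sum(n for _, n in client_updates)
    if total == 0:
        raise ValueError("no training examples reported")
    dim = len(client_updates[0][0])
    averaged = [0.0] * dim
    for weights, n in client_updates:
        share = n / total  # clients with more data contribute more
        for i, w in enumerate(weights):
            averaged[i] += w * share
    return averaged


# Two hypothetical devices: the raw training data never leaves either one.
updates = [([1.0, 2.0], 1), ([3.0, 4.0], 3)]
global_weights = federated_average(updates)
```

Weighting by example count is one common choice; the unbalanced and non-IID data listed on the slide are exactly the cases where this simple rule may need refinement.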
  22. DISTRIBUTED ONE-PASS ALGORITHMS*. Low latency: emerging applications require <10 ms end-to-end latency, guarantee freshness of insights. Data Velocity: increasingly support extraction of insights on high-velocity data streams. Descriptive Statistics: moments of univariate/multivariate data, susceptibility to anomalies. Serverless architectures. * Numerically stable, scalable formulas for parallel and online computation of higher-order multivariate central moments with arbitrary weights, Pébay et al. 2016.
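A distributed one-pass computation of descriptive statistics can be sketched with mergeable summaries: each edge node keeps a count, a mean, and a sum of squared deviations, and two summaries combine without revisiting the raw data. The sketch below covers only the unweighted, univariate, second-order case (the well-known pairwise update for mean and variance); the paper cited on the slide generalizes this to higher-order, multivariate, weighted moments. The class name `Moments` and the helper `summarize` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Moments:
    n: int = 0
    mean: float = 0.0
    m2: float = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        """Fold one observation into the summary (single pass, O(1) memory)."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def merge(self, other):
        """Combine two summaries computed on disjoint data partitions."""
        if other.n == 0:
            return Moments(self.n, self.mean, self.m2)
        if self.n == 0:
            return Moments(other.n, other.mean, other.m2)
        n = self.n + other.n
        delta = other.mean - self.mean
        mean = self.mean + delta * other.n / n
        m2 = self.m2 + other.m2 + delta * delta * self.n * other.n / n
        return Moments(n, mean, m2)

    @property
    def variance(self):
        """Sample variance; defined as 0.0 for fewer than two observations."""
        return self.m2 / (self.n - 1) if self.n > 1 else 0.0


def summarize(values):
    m = Moments()
    for v in values:
        m.update(v)
    return m


# Two hypothetical edge nodes summarize locally; only the summaries are merged.
node_a = summarize([2.0, 4.0, 4.0, 4.0])
node_b = summarize([5.0, 5.0, 7.0, 9.0])
combined = node_a.merge(node_b)
```

Only three numbers per node cross the network, which is the property that makes this kind of summary attractive when bandwidth is the bottleneck.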
  23. Online Algorithms. Incremental. Numerical Stability: the “textbook algorithm” can yield a negative variance even with small data sets. Communication Costs. High Dimensions: tensor manipulation. Dynamic Graphs: spatially localized models. Unbounded Data Streams: continuously evolving.
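The numerical-stability point on slide 23 can be reproduced with a small experiment. The sketch below compares the "textbook" sum-of-squares formula with Welford's incremental update on four values that share a large offset; the function names and the sample data are illustrative choices, not taken from the slides.

```python
def textbook_variance(xs):
    """Sample variance via sum of squares: prone to catastrophic cancellation."""
    n = len(xs)
    s = sum(xs)
    sq = sum(x * x for x in xs)
    return (sq - s * s / n) / (n - 1)


def welford_variance(xs):
    """Sample variance via Welford's incremental update (one pass, stable)."""
    mean, m2 = 0.0, 0.0
    for k, x in enumerate(xs, start=1):
        delta = x - mean
        mean += delta / k
        m2 += delta * (x - mean)
    return m2 / (len(xs) - 1)


# Four readings with a large common offset; the true sample variance is 30.
data = [1e9 + 4, 1e9 + 7, 1e9 + 13, 1e9 + 16]
naive = textbook_variance(data)
stable = welford_variance(data)
```

In IEEE 754 double precision the squares are near 1e18, where adjacent representable numbers are more than 100 apart, so the subtraction in the textbook formula cannot resolve a difference of 90 and the result lands far from 30 (it can even come out negative). The incremental update works with small deviations from the running mean and does not lose the signal.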
  26. LONG HISTORY: studied for over 125 years
  27. STATISTICAL, DL/RL TECHNIQUES. Time Series Analysis, PCA: ARIMA and variants; Kernel PCA, Robust PCA, Sparse PCA. Autoencoder: Variational AE, Adversarial AE. Transformer: Transformer-XL, Reformer. LSTM, GRU, GLU, GHN: BRNN, Stochastic RNN, Convolutional LSTM. GANs: BiGAN, GANomaly. Attention: Sparse Attentive Backpropagation.
  28. LIMITATIONS. DATA VERACITY: wide spectrum of edge devices. ITERATIVE: unsuitable for low latency. CONCEPT DRIFT: dynamic environmental conditions, behavioral changes. Communication bottleneck: DL/RL models trained on the cloud.
  29. SKETCHING. Filter & Count Sketches: Bloom filter [Bloom 1970] and variants (Neural Bloom Filter); Count-Min [Cormode and Muthukrishnan 2005] and variants. Graph Sketching: Dolha [Zhang et al. 2019], Spotlight [Eswaran et al. 2018], TCM [Tang et al. 2016], gMatrix [Khan and Aggarwal 2016]. Matrix Sketching: Random Sampling/Projections [Mahoney 2011], Frequent Directions [Ghashami et al. 2016].
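As a concrete instance of the count sketches on slide 29, here is a minimal Count-Min sketch in pure Python. It keeps a small fixed-size table of counters, so memory does not grow with the number of distinct items, and its estimates can overcount but never undercount. The class name, the default `width` and `depth`, and the use of SHA-256 with a row prefix as the hash family are assumptions made for this sketch.

```python
import hashlib


class CountMinSketch:
    def __init__(self, width=256, depth=4):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, item, row):
        # One hash function per row, derived by prefixing the row number.
        digest = hashlib.sha256(f"{row}:{item}".encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "little") % self.width

    def add(self, item, count=1):
        for row in range(self.depth):
            self.table[row][self._index(item, row)] += count

    def estimate(self, item):
        # Collisions only inflate counters, so the minimum is the best guess.
        return min(
            self.table[row][self._index(item, row)] for row in range(self.depth)
        )


# Hypothetical stream of device events observed at an edge gateway.
events = ["sensor-1"] * 50 + ["sensor-2"] * 5 + ["sensor-3"]
sketch = CountMinSketch()
for event in events:
    sketch.add(event)
```

Sketches built with the same `width`, `depth`, and hash functions can be combined by adding their tables cell by cell, which suits aggregation across edge nodes.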
  30. IoT Malware: VPNFilter, IoTReaper. Trade-offs: cost vs. security (DDoS attacks, cryptocurrency mining), real-time requirement vs. accuracy. Whitelisting approach: only programs that are known to run on an “uninfected” off-the-shelf device are allowed to run; monitor process spawning via system call interception. TAMPER-PROOF RESISTANCE: loadable kernel module, resistant to tampering by an attacker (with superuser privileges). Deployment: the solution must not be dependent on a manufacturer and should not require recompilation of the kernel of the IoT device’s OS. * Image borrowed from “HADES-IoT: A Practical Host-Based Anomaly Detection System for IoT Devices”, Breitenbacher et al. 2019.
  31. MODELS for Anomaly Detection at the Edge. [Kim et al. 2017] CNN-Variational Autoencoder; [Kim et al. 2017] Squeezed Convolutional Variational Autoencoder; [Lu and Lysecky, 2019] One Class SVM; [Lin et al. 2019] Edge-Based RNN; [Nguyen et al. 2019] GRU; [Yang et al. 2019] Federated XGBoost. FEATURES: subcomponent timing.
  32. The makeup of an anomaly detection product and edge use cases
  33. Anodot’s Anomaly Detection Steps
  34. Analyze
  35. 5 PATENT US10061632B2 PATENT US10061677B2 PATENT US20160210556A1 PATENT PENDING ANOMALY SCORE SEASONALITY LEADING DIMENSIONS HD BASELINE AT SCALE API requests for service 123 Drop in Play for app123 , source-promo Traffic for partnerAccount, partnerName Login Errors
  36. Correlate. What: drop in payments success rate. Where: Card: Visa; Type: Online. Why: spike in API errors; new version release. Payment success rate; Payment API Errors.
  37. False Positive Reduction Mechanisms. Problem: How do I reduce the alerts I receive without decreasing the quality of detection? Spot-on alerts: Alert Simulation; Influencing Factors: Context and Correlation (Patented); Anomaly Attributes: Duration | Delta | Score (Patented); Alert Feedback Loop.
  38. Who is using it? Enterprise, Telecom, Gaming, Internet, Fintech, eCommerce, Adtech
  39. Why are they using it? Customer Experience Monitoring: DAU, MAU, retention, usage, flows, funnels. Partner Monitoring: APIs, partners, affiliates, customers (b2b), 3rd party services. Revenue and Cost Monitoring: purchase/sales funnels, price & promo glitches, payment gateways, cloud costs, ad costs.
  40. But what about the edge? COVID-19 is pushing this along...
  41. The Problem: Volume of Monitored Patients. ● Monitoring confirmed cases continuously, even if they are at home. ● Monitoring hospitalized and ventilated patients at scale. ● Alerts using static thresholds on health indicators (e.g., SpO2 < 90%) are too noisy, create alert fatigue, and are often late. ● Requirements: ○ Remote monitoring ○ Early warning score to identify deteriorating conditions without the need for a physical check.
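The contrast between a static threshold and a per-patient baseline can be sketched as follows. This is an illustration only, not the presenters' product logic and not clinical guidance: the function names, the exponentially weighted baseline, and the parameters `alpha`, `drop`, and `min_duration` are all assumptions chosen to make the idea concrete.

```python
def static_alerts(readings, threshold=90.0):
    """Indices of readings below a fixed threshold (e.g. SpO2 < 90%)."""
    return [i for i, r in enumerate(readings) if r < threshold]


def baseline_alerts(readings, alpha=0.1, drop=3.0, min_duration=3):
    """Alert once when readings stay `drop` points below a learned baseline
    for `min_duration` consecutive samples."""
    alerts = []
    baseline = readings[0]
    run = 0
    for i, r in enumerate(readings):
        if baseline - r >= drop:
            run += 1
            if run == min_duration:
                alerts.append(i)
        else:
            run = 0
            # Learn only from normal-looking readings, so a sustained decline
            # does not drag the baseline down with it.
            baseline = alpha * r + (1 - alpha) * baseline
    return alerts


# Hypothetical patient A: normally near 97, slowly deteriorating.
declining = [97, 97, 96, 97, 94, 93, 93, 92, 92, 91]
# Hypothetical patient B: chronically near 89-90, stable but noisy.
stable_low = [89, 90, 89, 91, 89, 90]
```

On these made-up series the fixed threshold never fires for patient A, whose readings fall steadily but stay above 90, and fires repeatedly for patient B, who is stable. The baseline rule raises a single alert for A and none for B, which is the kind of behavior the slide's "too noisy ... and often late" complaint is asking for.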
  42. Real-life examples: detecting respiratory degradation from health watches. Respiratory Rate for patient xxxx
  43. Real life examples: ICU patient monitoring
  44. Early Warning Score
  45. Benefits. Scale: reduce load on medical staff by using an autonomous monitoring approach. Early Detection: improved outcomes for patients; the system constantly monitors them and alerts early on deterioration of condition. Staff protection: reduce risk of exposure.
  46. EDGE COMPUTING: We have come A LONG WAY! Difference Engine No. 2, 1847-49, Charles Babbage. * Image borrowed from …
  48. QUESTIONS?
  50. READINGS. [Mattia et al. 2019] A Survey on GANs for Anomaly Detection; [Fadhel and Nyarko, 2019] GAN Augmented Text Anomaly Detection with Sequences of Deep Statistics; [Wen and Keyes 2019] Time Series Anomaly Detection Using Convolutional Neural Networks and Transfer Learning; [Pol et al. 2019] Anomaly Detection with Conditional Variational Autoencoders; [Wang et al. 2019] adVAE: A self-adversarial variational autoencoder with Gaussian anomaly prior knowledge for anomaly detection; [Akçay et al. 2018] GANomaly: Semi-Supervised Anomaly Detection via Adversarial Training; [Tsukada et al. 2019] A Neural Network-Based On-device Learning Anomaly Detector for Edge Devices; [Bhatia et al. 2019] MIDAS: Microcluster-Based Detector of Anomalies in Edge Streams; [Choudhary et al. 2017] On the Runtime-Efficacy Trade-off of Anomaly Detection Techniques for Real-Time Streaming Data
  51. READINGS. [Huan et al. 2019] Active anomaly detection in heterogeneous processes; [Puzanov and Cohen 2019] Deep reinforcement one-shot learning for change point detection; [Hannon et al. 2019] Real-time Anomaly Detection and Classification in Streaming PMU Data; [Maciąg et al. 2019] Unsupervised Anomaly Detection in Stream Data with Online Evolving Spiking Neural Networks; [Calikus et al. 2019] No Free Lunch But A Cheaper Supper: A General Framework for Streaming Anomaly Detection; [Zhong et al. 2019] Deep Actor-Critic Reinforcement Learning for Anomaly Detection
  52. READINGS. [Nolle et al. 2020] DeepAlign: Alignment-based Process Anomaly Correction Using Recurrent Neural Networks; [Li et al. 2020] RCC-Dual-GAN: An Efficient Approach for Outlier Detection with Few Identified Anomalies; [Ngo et al. 2019] Fence GAN: Towards Better Anomaly Detection; [Ngo et al. 2020] Adaptive Anomaly Detection for IoT Data in Hierarchical Edge Computing; [Gao et al. 2020] RobustTAD: Robust Time Series Anomaly Detection via Decomposition and Convolutional Neural Networks
  53. RESOURCES. Understanding Anomaly Detection: An Exploration of Anomaly Detection's History, Applications, and State-of-the-Art Techniques; Edge Computing Market by Component, Application, Organization Size; Edge Computing Market worth $28.07 billion by 2027 (Exclusive Report by Meticulous Research); New demand, new markets: What edge computing means for hardware companies; Edge Computing Market Size, Share & Trends Analysis Report; Exploring the Edge: 12 Frontiers of Edge Computing
  54. RESOURCES. Technical capabilities of an edge computing solution; Why edge computing for IoT?; The 5G era: New horizons for advanced-electronics and industrial companies; IDC's Worldwide Core and Edge Computing Platforms Taxonomy, 2020; Connected world: An evolution in connectivity beyond the 5G revolution