Arun	Kejariwal													Ira	Cohen	
@arun_kejariwal									@irairacohen
ANOMALY DETECTION AT THE EDGE
Computing closer to the data source, reducing network traffic
SUPPLEMENT CLOUD COMPUTING
5G	-	1.1B	connections	by	2023,	8.9%	of	all	mobile	device	connections*^
NEW TECHNOLOGIES
Sensors,	actuators	-	distributed	computing	topology	
Internet of Things
Network	outages,	machine	downtime,	and	weather	change
Improve Accuracy of Events
EDGE COMPUTING
An	Overview:	What,	Drivers,	Example	use	cases
* https://www.idc.com/getdoc.jsp?containerId=prUS45740019		(Dec	2019)		
^	https://www.mckinsey.com/industries/advanced-electronics/our-insights/the-5g-era-new-horizons-for-advanced-electronics-and-industrial-companies		(Feb	2020)
BUSINESS OPPORTUNITY
CAGR	OF	26.5%
USD 9B by 2024 *
Low-latency	processing
Real-Time Automated Decision Making
Low	half-life	or	low	value	
INCREASING DATA VOLUME
Reduction	in	network	traffic	and	compute	needed
EFFICIENCY
*	https://www.marketsandmarkets.com/Market-Reports/edge-computing-market-133384090.html		(Aug	2019)
EDGE COMPUTING
Bridging	the	Processing	⟷	Data	Gap
* Image borrowed from https://www.gartner.com/doc/reprints?id=1-1OIB65YZ&ct=190917&st=sb (May 2019)
DECENTRALIZATION
Federated	learning
COMPUTE
High	density,	ultra-low	read	write	latency	
STORAGE
Ensuring continuous operation even in the wake of a network outage
DEPENDABILITY
Limited capability of authentication and encryption
DATA SECURITY
DECENTRALIZATION
Connected	Cars	-	increase	situational	awareness	
Autonomous Vehicles
Quality-of-Service	(QoS)	
TELECOMMUNICATIONS
Personalization	-	monitoring	
HEALTHCARE and LIFE SCIENCES
Predictive	Maintenance
MANUFACTURING
Hypertargeting
RETAIL AND CONSUMER GOODS
INDUSTRIES
Remote health sensing to identify clusters of suspected illnesses
Monitoring of early onset of the disease in mild patients (to quickly identify if hospitalization is needed)
#COVID-19
Monitoring of lab testing devices and results (to check for false positives and false negatives)
Improve	Efficiency	
Energy and Utilities
Precision Farming to Improve Yield
Agriculture
PUE, Real-time monitoring for work-site safety conditions
DATACENTERS
Detecting	Fraud
FINANCE
INDUSTRIES
* Image borrowed from https://www.gartner.com/doc/reprints?id=1-1OIB65YZ&ct=190917&st=sb (May 2019)
USE CASES
Real-time	Traffic	Monitoring
Video Analytics
Real-time	Surveillance
Security
Voice-based	Digital	Assistants	
Productivity
Multiplayer	gaming
virtual reality
Shopping
Augmented reality
USE CASES
Energy	Efficiency,	Smart	Meters
SMART HOMES
Optimize	route	planning	
Stores	and	Restaurants	Nearby
SMART TRANSPORTATION
Preprocessing	-	improve	latency,	reduce	bandwidth	requirement	
DATA REPORTING
Air quality, water quality in lakes, rivers
Environmental Monitoring
USE CASES
Voice	control
Conversational interfaces
Robots,	Drones
AUTOMATION
Real-time	Monitoring	
HEALTH & SAFETY
Data	Filtering
PRIVACY
USE CASES
Last	mile	tracking	
LOGISTICS
Smart	parking	(dynamic	pricing)	
Structural monitoring (streetlights and bridges)
SMART CITIES
Condition-based maintenance (trains, tracks, navigation systems)
RAILWAYS
Reduction	of	collision	and	theft
INSURANCE
USE CASES
*	Image	borrowed	from	https://www.mckinsey.com/industries/technology-media-and-telecommunications/our-insights/new-demand-new-markets-what-edge-computing-means-for-hardware-companies	
IMPLICATIONS ON HARDWARE
Personal	Assistants
AUDITORY
Images, Video and Live Video
VISUAL
Safety
Tactile
AI AT THE EDGE
AI AT THE EDGE
On	Device
Productivity
LANGUAGE TRANSLATION
Authentication
Facial Recognition
No	checkout	stores
Vision
Security	Cameras
Motion Detection
Metaverses for the home, workplace, amusement park, school and social settings
AR/VR
Personalization	
Recommendations
Enabling	disconnected,	local	interactions
Pooling,	Chunk-wise	attention	
Speed-Accuracy Trade-off
Lightweight	vocoders:	SqueezeWave*	
Text → Acoustic features → Waveforms
Tens	of	MB
Small Model Size
News	(#COVID-19),	Driving	directions
Text-to-Speech
Hyper	low-rank	approximation^		
Mixed Low Precision Quantization#
Optimizations
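To make the quantization idea concrete, here is a minimal sketch of uniform affine quantization applied at two bit widths. It is a generic illustration only, not the memory-driven mixed-precision method of Rusci et al.; the function names and the sample weights are assumptions made for this example.

```python
def quantize(values, num_bits):
    """Map floats to unsigned integer codes in [0, 2**num_bits - 1]."""
    qmax = (1 << num_bits) - 1
    lo, hi = min(values), max(values)
    # Guard against a constant input, where the range would be zero.
    scale = (hi - lo) / qmax if hi > lo else 1.0
    codes = [round((v - lo) / scale) for v in values]
    return codes, scale, lo


def dequantize(codes, scale, lo):
    """Recover approximate floats from integer codes."""
    return [c * scale + lo for c in codes]


def max_error(values, num_bits):
    """Largest absolute reconstruction error at the given bit width."""
    codes, scale, lo = quantize(values, num_bits)
    restored = dequantize(codes, scale, lo)
    return max(abs(v - r) for v, r in zip(values, restored))


weights = [-0.82, -0.11, 0.0, 0.37, 1.25]  # made-up layer weights
err8 = max_error(weights, 8)
err4 = max_error(weights, 4)
print(f"8-bit max error: {err8:.5f}, 4-bit max error: {err4:.5f}")
```

The reconstruction error is bounded by half the step size, so fewer bits save memory at the cost of a coarser step. A mixed-precision scheme would choose the bit width per layer or per tensor under a memory budget.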
Leverage	hardware	accelerators,	e.g.	DSPs
Parallel algorithms
ON-DEVICE SPEECH RECOGNITION/SYNTHESIS
*	“SqueezeWave:	Extremely	Lightweight	Vocoders	for	On-device	Speech	Synthesis”,	Zhai	et	al.	2020.	
^	“Attention	based	on-device	streaming	speech	recognition	with	large	speech	corpus”,	Kim	et	al.	2020.	
# “Memory-Driven Mixed Low Precision Quantization For Enabling Deep Network Inference On Microcontrollers”, Rusci et al. 2019.
Language	recognition	
Speech	reconstruction	(Lip	Reading)
Visual Speech Recognition
COMPUTE
CHALLENGES
SPACE
BATTERY LIFE
INTERMITTENT POWER
Small	form	factor	
Withstand rugged environments (weather, vibration and connectivity)
Poor sensors and noise
Low	power
Concept	drift,	Non-stationarity	
Environmental Changes
MobileNets^, EdgeCNN#, ApproxNet, IONet, Fire SSD
Inertial* - Accelerometer, Gyroscope, Magnetometer
Temperature
MODELS, FEATURES
Fully	Decentralized	/	Peer-to-Peer	Distributed	Learning	
Multi-agent	optimization	
Federated learning
INFERENCE, TRAINING
* https://www.bosch-sensortec.com/products/motion-sensors/imus/
* Deep Learning based Pedestrian Inertial Navigation: Methods, Dataset and On-Device Inference, Chen et al. 2020
^	https://arxiv.org/abs/1905.02244	
#	EdgeCNN:	Convolutional	Neural	Network	Classification	Model	with	small	inputs	for	Edge	Computing,	Yang	et	al.	2019
Compute	and	memory	constrained	
Network	unreliability,	Low	power		
Convergence	rate		
Online (Re-) Training
INFERENCE
Velocity,	Orientation,	Trajectory,	Activity	
Example:	Pedestrian	navigation
FEDERATED LEARNING*^
Mobile	keyboard,	vocal	classifiers,	next	word	prediction	
Predicting	future	hospitalization,	patient	similarity	learning			
Applications
Share	the	weights,	not	the	raw	data	
Single	hidden-layer	FF	networks,	Autoencoders,	Federated	Momentum
MODELS
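"Share the weights, not the raw data" can be sketched as a weighted federated average. The example below is a toy illustration under stated assumptions: a two-parameter linear model, two hypothetical clients with noise-free data drawn from y = 2x + 1, and plain gradient descent on each client. It is not the system design described by Bonawitz et al.

```python
def local_update(weights, data, lr=0.1, epochs=5):
    """Train on one client's private data; only the weights leave the device."""
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        # Gradients of the mean squared error for the model y = w * x + b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b), n


def federated_average(updates):
    """Average client weights, weighted by each client's sample count."""
    total = sum(n for _, n in updates)
    dim = len(updates[0][0])
    return tuple(sum(wts[i] * n for wts, n in updates) / total for i in range(dim))


# Hypothetical clients; raw samples never leave these lists.
clients = [
    [(-1.0, -1.0), (1.0, 3.0)],
    [(-2.0, -3.0), (0.0, 1.0), (2.0, 5.0)],
]

global_weights = (0.0, 0.0)
for _ in range(20):  # communication rounds
    updates = [local_update(global_weights, data) for data in clients]
    global_weights = federated_average(updates)

print(global_weights)  # approaches (2.0, 1.0)
```

The server only ever sees `(weights, sample_count)` pairs. Real deployments add client sampling, secure aggregation, and handling for non-IID data, which this sketch leaves out.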
Handling	unbalanced	and	non-IID	data,	Neural	architecture	search	
Multi-task learning, Domain adaptation, Meta learning
Improving Efficiency and Effectiveness#
Convergence	time,	Communication	between	devices		
Bias	in	Training	Data,	Compliance	(HIPAA,	FERPA)	
Challenges
*	“Towards	Federated	Learning	at	Scale:	System	Design”,	Bonawitz	et	al.	2019.	
^	“A	Survey	on	Federated	Learning	Systems:	Vision,	Hype	and	Reality	for	Data	Privacy	and	Protection”,	Li	et	al.	2019.		
# “Advances and Open Problems in Federated Learning”, Kairouz et al. 2019.
Emerging applications require <10 ms end-to-end latency and guaranteed freshness of insights
Low latency
Growing need to extract insights from high-velocity data streams
Data Velocity
Moments	of	univariate/multivariate	data,	susceptibility	to	anomalies		
Descriptive Statistics *
Serverless	architectures		
DISTRIBUTED
ONE-PASS ALGORITHMS
*	Numerically	stable,	scalable	formulas	for	parallel	and	online	computation	of	higher-order	multivariate	central	moments	with	arbitrary	weights,	Pébay	et	al.	2016.
Incremental
Numerical	Stability	
“Textbook algorithm” can result in negative variance even with small data sets
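The stability point can be illustrated with a one-pass variance computation. The sketch below uses Welford's incremental update and a pairwise merge in the style of Chan et al., which covers only the univariate second-moment case (Pébay et al. generalize to higher-order multivariate moments). The class and function names are assumptions for this example.

```python
class RunningStats:
    """One-pass mean and variance using Welford's update."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # sum of squared deviations from the current mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def merge(self, other):
        """Combine two partial summaries, e.g. from two edge nodes."""
        merged = RunningStats()
        merged.n = self.n + other.n
        if merged.n == 0:
            return merged
        delta = other.mean - self.mean
        merged.mean = self.mean + delta * other.n / merged.n
        merged.m2 = self.m2 + other.m2 + delta * delta * self.n * other.n / merged.n
        return merged

    @property
    def variance(self):
        return self.m2 / (self.n - 1) if self.n > 1 else 0.0


def textbook_variance(xs):
    """Sum-of-squares formula; prone to catastrophic cancellation."""
    n = len(xs)
    total = sum(xs)
    total_sq = sum(x * x for x in xs)
    return (total_sq - total * total / n) / (n - 1)


readings = [1e9 + 4, 1e9 + 7, 1e9 + 13, 1e9 + 16]  # true sample variance is 30

stats = RunningStats()
for r in readings:
    stats.update(r)

left, right = RunningStats(), RunningStats()
for r in readings[:2]:
    left.update(r)
for r in readings[2:]:
    right.update(r)

print("one-pass:", stats.variance)
print("merged:  ", left.merge(right).variance)
print("textbook:", textbook_variance(readings))  # may be far from 30
```

The merge step is what makes the summary usable in a distributed setting: each device keeps three numbers and ships only those, not the stream.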
Online	Algorithms	
Communication	Costs
High	Dimensions	
Tensor	manipulation	
Spatially localized models
Dynamic	Graphs	
Continuously	evolving
Unbounded Data Streams
DATA FIDELITY
ANOMALY DETECTION
LONG HISTORY - Studied for over 125 years
Variational	AE,	Adversarial	AE
Autoencoder
ARIMA	and		variants	
Kernel	PCA,	Robust	PCA,	Sparse	PCA
Time Series Analysis, PCA
Transformer-XL,	Reformer
Transformer
BRNN,	Stochastic	RNN,	Convolutional	LSTM	
LSTM, GRU, GLU, GHN
BiGAN,	GANomaly
GANs
Sparse	Attentive	Backpropagation	
Attention
STATISTICAL, DL/RL TECHNIQUES
Wide	spectrum	of	edge	devices
DATA VERACITY
Unsuitable	for	low	latency	
ITERATIVE
Dynamic	environmental	conditions,	behavioral	changes
CONCEPT DRIFT
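One lightweight way to notice concept drift on a device is a sequential change test. The sketch below uses the Page-Hinkley test for an upward shift in the mean. This technique is swapped in for illustration and is not named in the slides; `delta`, `threshold`, and the synthetic stream are assumed values.

```python
class PageHinkley:
    """Page-Hinkley test for an upward shift in the mean of a stream."""

    def __init__(self, delta=0.05, threshold=5.0):
        self.delta = delta          # tolerated drift magnitude
        self.threshold = threshold  # alarm level (lambda)
        self.n = 0
        self.mean = 0.0
        self.cum = 0.0      # cumulative deviation from the running mean
        self.min_cum = 0.0  # smallest cumulative deviation seen so far

    def update(self, x):
        """Feed one observation; return True if drift is signalled."""
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cum += x - self.mean - self.delta
        self.min_cum = min(self.min_cum, self.cum)
        return self.cum - self.min_cum > self.threshold


def first_alarm(stream, detector):
    """Index of the first drift alarm, or None if no alarm fires."""
    for i, x in enumerate(stream):
        if detector.update(x):
            return i
    return None


# Synthetic stream: level 0.0 for 50 samples, then a shift to 2.0.
stream = [0.0] * 50 + [2.0] * 20
alarm_index = first_alarm(stream, PageHinkley())
print("first alarm at index:", alarm_index)
```

The detector keeps four numbers of state, so it fits constrained devices. An alarm can then trigger model retraining or a request for a refreshed model from the cloud.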
DL/RL	models	trained	on	the	cloud
Communication bottleneck
LIMITATIONS
Bloom filter [Bloom 1970] and variants (Neural Bloom Filter)
Count-Min [Cormode and Muthukrishnan 2005] and variants
Filter & Count Sketches
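As a concrete instance of a count sketch, here is a minimal Count-Min sketch. It is a sketch under stated assumptions: the width, depth, salted BLAKE2b hashing, and the sample event stream are choices made for this example, not parameters from Cormode and Muthukrishnan.

```python
import hashlib


class CountMinSketch:
    """Approximate frequency counts in fixed memory (never underestimates)."""

    def __init__(self, width=256, depth=4):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, item, row):
        # One independent hash per row, derived by salting BLAKE2b with the row id.
        digest = hashlib.blake2b(
            item.encode("utf-8"),
            digest_size=8,
            salt=row.to_bytes(8, "little"),
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def add(self, item, count=1):
        for row in range(self.depth):
            self.table[row][self._index(item, row)] += count

    def estimate(self, item):
        # Collisions only inflate counters, so the minimum is the best estimate.
        return min(self.table[row][self._index(item, row)] for row in range(self.depth))


sketch = CountMinSketch()
events = ["login_error"] * 40 + ["timeout"] * 7 + ["ok"] * 500
for event in events:
    sketch.add(event)

print("login_error ~", sketch.estimate("login_error"))
print("timeout     ~", sketch.estimate("timeout"))
```

Memory stays at `width * depth` counters regardless of stream length, which suits unbounded streams on constrained devices. A sudden jump in an item's estimated count relative to its history can then be treated as a candidate anomaly.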
Dolha [Zhang et al. 2019], Spotlight [Eswaran et al. 2018]
TCM [Tang et al. 2016], gMatrix [Khan and Aggarwal 2016]
Graph Sketching
Random Sampling/Projections [Mahoney 2011]
Frequent Directions [Ghashami et al. 2016]
Matrix Sketching
SKETCHING
Cost	vs.	security	(DDoS	attacks,	cryptocurrency	mining),	real-time	requirement	vs.	accuracy	
Trade-offs
VPNFilter,	IoTReaper	
IoT	Malware
Loadable kernel module - tamper-resistant against an attacker (with superuser privileges)
Monitor	process	spawning	-	system	call	interception	
Only programs that are known to run on an “uninfected” off-the-shelf device are allowed to run
Whitelisting	approach	
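The whitelisting idea can be sketched in user space as a lookup of program fingerprints against a profile collected from a clean device. This is an illustration only: the system in Breitenbacher et al. enforces the check inside a loadable kernel module at system-call interception, and the function names and byte strings below are hypothetical.

```python
import hashlib


def fingerprint(binary: bytes) -> str:
    """SHA-256 digest of a program image."""
    return hashlib.sha256(binary).hexdigest()


def build_allowlist(clean_binaries):
    """Profile an uninfected device: record the fingerprint of every known program."""
    return {fingerprint(b) for b in clean_binaries}


def may_execute(binary: bytes, allowlist) -> bool:
    """Allow a process to spawn only if its image was seen during profiling."""
    return fingerprint(binary) in allowlist


# Stand-ins for program images found on a clean, off-the-shelf device.
clean_device = [b"busybox-image", b"httpd-image", b"camera-daemon-image"]
allowlist = build_allowlist(clean_device)

print(may_execute(b"httpd-image", allowlist))        # known program
print(may_execute(b"dropped-malware", allowlist))    # unknown program
```

Anything outside the profile is treated as anomalous, so no malware signatures are needed. The trade-off is that legitimate firmware updates require the profile to be rebuilt.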
TAMPER-PROOF RESISTANCE
*	Image	borrowed	from	“HADES-IoT:	A	Practical	Host-Based	Anomaly	Detection	System	for	IoT	Devices”,	Breitenbacher	et	al.	2019.	
Solution must not be dependent on a manufacturer, should not require recompilation of the kernel of the IoT device’s OS
Deployment	
[Kim	et	al.	2017]
CNN-Variational Autoencoder
[Kim	et	al.	2017]
Squeezed Convolutional Variational Autoencoder
[Lu	and	Lysecky,	2019]
One Class SVM
[Lin	et	al.	2019]
Edge-Based RNN
[Nguyen	et	al.	2019]
GRU
[Yang	et	al.	2019]
Federated XGBoost
MODELS FOR ANOMALY DETECTION AT THE EDGE
Subcomponent	timing
FEATURES
The makeup of an anomaly detection product and edge use cases
Anodot’s Anomaly Detection Steps
Analyze
PATENT US10061632B2 PATENT US10061677B2 PATENT US20160210556A1 PATENT PENDING
ANOMALY SCORE SEASONALITY LEADING DIMENSIONS HD BASELINE AT SCALE
API requests for service 123
Drop in Play for app123 , source-promo
Traffic for partnerAccount, partnerName
Login Errors
Correlate
What
Drop in payments success rate
Where
Card: Visa
Type: Online
Why
+ Spike in API errors
+ New version release
Payment	success	rate
Payment	API	Errors
False Positive Reduction Mechanisms
Problem: How do I reduce the alerts I receive without decreasing the quality of detection?
Alert Simulation
Influencing Factors
Context and Correlation (Patented)
Spot on alerts
Anomaly Attributes
Duration | Delta | Score (Patented)
Alert Feedback Loop
Who is using it?
Enterprise Telecom
Gaming
Internet
Fintech eCommerce Adtech
Why are they using it?
Partner
Monitoring
Revenue and Cost
Monitoring
Customer Experience
Monitoring
DAU
MAU
Retention
Usage Flows
Funnels
APIs
Partners
Affiliates
Customers (b2b)
3rd party services
Purchase/sales funnels
Price & promo glitches
Payment gateways
Cloud costs
Ad costs
But what about the edge?
COVID-19 is pushing this along...
● Monitoring confirmed cases continuously, even if they are at home.
● Monitoring hospitalized and ventilated patients at scale.
● Alerting with static thresholds on health indicators (e.g., SpO2 < 90%) is too noisy, creates alert fatigue, and is often late.
● Requirements:
○ Remote monitoring
○ Early warning score to identify deteriorating conditions without the need for a physical check
The Problem - Volume of Monitored Patients
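The gap between a static threshold and a per-patient baseline can be sketched with an exponentially weighted mean and variance. This is a generic illustration, not Anodot's method and not a clinical early warning score. The smoothing factor, band width, warm-up length, standard-deviation floor, and the SpO2 readings are all assumed values.

```python
import math


class EwmaDetector:
    """Flag readings that deviate from a per-patient adaptive baseline."""

    def __init__(self, alpha=0.1, k=3.0, warmup=10, min_std=0.5):
        self.alpha = alpha      # smoothing factor for the baseline
        self.k = k              # width of the normal band in standard deviations
        self.warmup = warmup    # readings to observe before alerting
        self.min_std = min_std  # floor so a flat baseline does not alert on tiny changes
        self.n = 0
        self.mean = 0.0
        self.var = 0.0

    def update(self, x):
        """Feed one reading; return True if it is anomalous."""
        if self.n == 0:
            self.mean = x
            self.n = 1
            return False
        diff = x - self.mean
        std = max(math.sqrt(self.var), self.min_std)
        is_anomaly = self.n >= self.warmup and abs(diff) > self.k * std
        # Update the exponentially weighted mean and variance.
        incr = self.alpha * diff
        self.mean += incr
        self.var = (1 - self.alpha) * (self.var + diff * incr)
        self.n += 1
        return is_anomaly


# Made-up SpO2 readings (%): a stable baseline, then a drop that stays above 90.
spo2 = [98, 97, 98] * 10 + [93, 92, 92]

static_alerts = [i for i, v in enumerate(spo2) if v < 90]

detector = EwmaDetector()
adaptive_alerts = [i for i, v in enumerate(spo2) if detector.update(v)]

print("static threshold alerts:", static_alerts)
print("adaptive baseline alerts:", adaptive_alerts)
```

In this made-up stream the static rule stays silent because the readings never fall below 90, while the adaptive baseline reacts to the drop relative to this patient's own normal range. The state is three numbers per signal, small enough to run on a watch or bedside device.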
Real life examples: Detecting respiratory degradation from health watches
Respiratory Rate for patient xxxx
Real life examples: ICU patient monitoring
Early Warning Score
Benefits
Scale
Reduce load on medical staff using an autonomous monitoring approach.
Early Detection
Improved outcomes for patients: the system monitors them constantly and alerts early on deterioration of their condition
Reduce risk of exposure
Staff protection
Difference	Engine	No.	2	1847-49	
Charles	Babbage
* Image borrowed from https://spectrum.ieee.org/tech-talk/tech-history/dawn-of-electronics/untold-history-of-ai-charles-babbage-and-the-turk
We have come A LONG WAY!
EDGE COMPUTING
A LONG ROAD AHEAD!
QUESTIONS?
THANK YOU
[Mattia et al. 2019]
A Survey on GANs for Anomaly Detection
[Fadhel and Nyarko, 2019]
GAN Augmented Text Anomaly Detection with Sequences of Deep Statistics
[Wen and Keyes 2019]
Time Series Anomaly Detection Using Convolutional Neural Networks and Transfer Learning
[Pol et al. 2019]
Anomaly Detection with Conditional Variational Autoencoders
[Wang et al. 2019]
adVAE: A self-adversarial variational autoencoder with Gaussian anomaly prior knowledge for anomaly detection
[Akçay et al. 2018]
GANomaly: Semi-Supervised Anomaly Detection via Adversarial Training
[Tsukada et al. 2019]
A Neural Network-Based On-device Learning Anomaly Detector for Edge Devices
[Bhatia et al. 2019]
MIDAS: Microcluster-Based Detector of Anomalies in Edge Streams
READINGS
[Choudhary	et	al.	2017]
On	the	Runtime-Efficacy	Trade-off	of	Anomaly	Detection	Techniques	for	Real-Time	Streaming	Data
[Huan et al. 2019]
Active anomaly detection in heterogeneous processes
[Puzanov and Cohen 2019]
Deep reinforcement one-shot learning for change point detection
[Hannon et al. 2019]
Real-time Anomaly Detection and Classification in Streaming PMU Data
[Maciąg et al. 2019]
Unsupervised Anomaly Detection in Stream Data with Online Evolving Spiking Neural Networks
[Calikus et al. 2019]
No Free Lunch But A Cheaper Supper: A General Framework for Streaming Anomaly Detection
[Zhong et al. 2019]
Deep Actor-Critic Reinforcement Learning for Anomaly Detection
READINGS
READINGS
[Nolle et al. 2020]
DeepAlign: Alignment-based Process Anomaly Correction Using Recurrent Neural Networks
[Li et al. 2020]
RCC-Dual-GAN: An Efficient Approach for Outlier Detection with Few Identified Anomalies
[Ngo et al. 2019]
Fence GAN: Towards Better Anomaly Detection
[Ngo et al. 2020]
Adaptive Anomaly Detection for IoT Data in Hierarchical Edge Computing
[Gao	et	al.	2020]
RobustTAD: Robust Time Series Anomaly Detection via Decomposition and Convolutional Neural Networks
Understanding Anomaly Detection: An Exploration of Anomaly Detection's History, Applications, and State-of-the-Art Techniques
https://www.marketsandmarkets.com/Market-Reports/edge-computing-market-133384090.html
Edge Computing Market by Component, Application, Organization Size
https://www.globenewswire.com/news-release/2019/12/10/1958380/0/en/Edge-Computing-Market-worth-28-07-billion-by-2027-Exclusive-Report-by-Meticulous-Research.html
Edge Computing Market worth $28.07 billion by 2027
https://www.mckinsey.com/industries/technology-media-and-telecommunications/our-insights/new-demand-new-markets-what-edge-computing-means-for-hardware-companies
New demand, new markets: What edge computing means for hardware companies
https://www.grandviewresearch.com/industry-analysis/edge-computing-market
Edge Computing Market Size, Share & Trends Analysis Report
https://www.gartner.com/doc/reprints?id=1-1OIB65YZ&ct=190917&st=sb
Exploring the Edge: 12 Frontiers of Edge Computing
http://shop.oreilly.com/product/0636920072904.do
RESOURCES
RESOURCES
https://blog.bosch-si.com/bosch-iot-suite/technical-capabilities-of-edge-computing-solution/
Technical capabilities of an edge computing solution
https://blog.bosch-si.com/internetofthings/why-edge-computing-for-iot/	
Why edge computing for IoT?
https://www.mckinsey.com/industries/advanced-electronics/our-insights/the-5g-era-new-horizons-for-advanced-electronics-and-industrial-companies	
The 5G era: New horizons for advanced-electronics and industrial companies
https://www.idc.com/research/viewtoc.jsp?containerId=US46054020
IDC's Worldwide Core and Edge Computing Platforms Taxonomy, 2020
https://www.mckinsey.com/industries/technology-media-and-telecommunications/our-insights/connected-world-an-evolution-in-connectivity-beyond-the-5g-revolution
Connected world: An evolution in connectivity beyond the 5G revolution

Anomaly Detection At The Edge