-- Real Time Anomaly Detection
-- IoT Device Intelligence
-- Univariate and Multivariate Anomaly Detection
-- Unsupervised Learning Classification from Anomaly Detection
IoT Device Intelligence & Real Time Anomaly Detection
1. IoT Device Intelligence and Real Time Anomaly Detection 1
IoT Device Intelligence
Real Time Anomaly Detection
Braja Das
bkd_108@yahoo.com
04/01/2018
Abstract
IoT devices send real-time telemetry data as event streams. These event streams are
ingested and processed by downstream analytics pipelines. Real-time data enables
real-time analytics and performance monitoring of IoT devices, production processes
and controls. Defects can also be quantified from the measurements.
Anomaly detection is used when troubleshooting deviations in sensor measurements.
If sensors read incorrect data, or if a device or any of its parts has system health
issues, anomalous measurements can be observed. If these occur on a consistent basis,
action can be taken and preventive maintenance can be performed on the device.
Machine intelligence can give insight for remote monitoring through different KPIs
and can support preventive and planned maintenance. If a remote agent works as
first-level support with machine intelligence KPIs and data across the fleet, the cost
of touch labor can be optimized.
An extension of anomaly detection can also support predictive maintenance: defects
can be quantified in advance and corrective action can be taken.
Closed-loop feedback from anomaly detection to controllers can help adjust different
controls of IoT devices and help the system perform at its best.
Machine Intelligence & Real Time Anomaly Detection
Quality control is an important aspect of product manufacturing. Brands care about
precision and quality because both are inherent to customer satisfaction.
Statistical process control has been used since the Industrial Age as a technique for
monitoring process variation. Limitations in its use are observed in the control logic
implementations of most embedded systems.
Today's distributed cloud and edge computing make the technique easier for customers
to apply.
AI and machine learning have given direction for extending this technique further:
the focus is no longer only on quantifying defects in a single measurement, but also
on multivariate defect analysis and identifying the root cause of problems.
As a further extension, it allows a company to consider event-triggered preventive
maintenance in place of the earlier time-triggered maintenance. It also gives the
company the flexibility to consider predictive maintenance.
Key Quality Management Principles
1. The core of total quality management principles is the process approach.
2. Improvement of production processes is key. Cycles of continuous
improvement in the production process ensure top product quality.
3. Gathering customer feedback and engaging customers in all phases of the product
development life cycle helps maximize customer satisfaction.
4. Evidence-based decision making is important for quality improvement in
product development.
Fig: Key Quality Management Principles
Data Exploration: Machine Intelligence and Anomaly Detection
IoT devices send real-time time series telemetry. It is important to capture process
variations from their measurements. A subject matter expert has a clear understanding
of the acceptable limit of process variation, called the specification limit; it is also
important to control process variation based on real-time telemetry.
Univariate analysis can be used to identify outliers in process variations. This
outlier detection can be applied to every telemetry measurement. Identifying missing
values is also important at this stage.
Multivariate analysis is important for complex systems in which process variables or
measurements depend on each other. Linear regression and correlation techniques are
widely used as part of multivariate analysis.
The result of multivariate analysis can create new variables, and new time series
behavior can be found as part of this process. Once a new variable is identified, the
same cycle can be repeated for anomaly or outlier detection.
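As a minimal sketch of the missing-value step, the Python below looks for two kinds of gaps in a telemetry stream: events that arrived without a reading, and stretches where no event arrived at all. The function names, the one-minute reporting interval and the 1.5x tolerance are illustrative assumptions, not values from the source.

```python
from datetime import datetime, timedelta
from typing import List, Optional, Tuple

Event = Tuple[datetime, Optional[float]]

def find_null_readings(events: List[Event]) -> List[datetime]:
    """Timestamps of events that arrived without a measurement value."""
    return [ts for ts, value in events if value is None]

def find_gaps(events: List[Event],
              expected_interval: timedelta,
              tolerance: float = 1.5) -> List[Tuple[datetime, datetime]]:
    """Pairs of consecutive timestamps spaced further apart than
    tolerance * expected_interval (an assumed, tunable threshold)."""
    gaps = []
    for (prev_ts, _), (next_ts, _) in zip(events, events[1:]):
        if next_ts - prev_ts > expected_interval * tolerance:
            gaps.append((prev_ts, next_ts))
    return gaps

# Hypothetical telemetry: one null reading and one three-minute gap.
t0 = datetime(2018, 1, 1, 10, 0)
events = [
    (t0, 22.1),
    (t0 + timedelta(minutes=1), None),
    (t0 + timedelta(minutes=2), 22.3),
    (t0 + timedelta(minutes=5), 22.0),
]
print(find_null_readings(events))
print(find_gaps(events, timedelta(minutes=1)))
```

Separating null readings from arrival gaps keeps sensor faults (a value is missing) distinct from connectivity faults (the event is missing), which may call for different follow-up.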
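One way a derived variable might be produced is sketched below: an ordinary least-squares line is fitted between two measurements, and the residuals form a new time series that can go through the same outlier cycle. This is an illustration under assumptions; the function names and sample values are hypothetical, and the source does not prescribe this particular derived variable.

```python
from typing import List, Tuple

def fit_line(x: List[float], y: List[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("x has no variation; the slope is undefined")
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def residuals(x: List[float], y: List[float]) -> List[float]:
    """Derived variable: the part of y not explained by the fitted line."""
    slope, intercept = fit_line(x, y)
    return [b - (slope * a + intercept) for a, b in zip(x, y)]

# Hypothetical paired measurements, for illustration only.
supply = [14.0, 14.5, 15.0, 15.5, 16.0]
returned = [22.0, 22.4, 23.1, 23.3, 24.0]
print(fit_line(supply, returned))
print(residuals(supply, returned))
```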
HVAC Air Handling Unit
Three sensors in the HVAC air handling unit shown above send the following telemetry:
1. Outside air or return air temperature
2. Supply air temperature
3. Relative humidity
The return air temperature is expected to stay within 20 deg C to 25 deg C. This can
be considered the specification limit for return air temperature. As time series data
comes in, outliers can be detected based on these limits. This anomaly detection is
based on expert judgement rather than evidence-based decision making.
Univariate statistical process control (SPC) analysis can be performed for
evidence-based decision making. The same technique can be used on other
measurements, such as supply air temperature or relative humidity.
Return air temperature depends on room conditions, air filtration inside the rooms and
supply air temperature. Multivariate analysis can therefore be applied to return air
temperature together with the other measures. This helps relate outliers in return air
temperature telemetry to the other variables.
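The specification-limit check described above could be sketched as follows. The 20 to 25 deg C limits come from the text; the function name and the sample readings are hypothetical.

```python
from typing import List, Tuple

# Specification limits for return air temperature, taken from the text (deg C).
LOWER_SPEC_C = 20.0
UPPER_SPEC_C = 25.0

def out_of_spec(readings: List[Tuple[str, float]],
                lower: float = LOWER_SPEC_C,
                upper: float = UPPER_SPEC_C) -> List[Tuple[str, float]]:
    """Return the (timestamp, value) pairs that fall outside [lower, upper]."""
    return [(ts, value) for ts, value in readings if value < lower or value > upper]

# Hypothetical telemetry samples, for illustration only.
sample = [("10:00", 22.4), ("10:05", 25.6), ("10:10", 19.2), ("10:15", 24.9)]
print(out_of_spec(sample))
```

Because the limits are fixed by an expert, this check needs no history and can run on each event as it arrives.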
Statistical Process Control Chart: Return Air Temperature
Statistical process control (SPC) is a technique that can be used when telemetry
samples follow a normal distribution. Subgroup selection in SPC is very important,
and frequency distributions should be examined before applying SPC rules. SPC is an
important pattern recognition technique in anomaly detection.
Subgroup
Subgroups are data samples that are homogeneous and can follow a normal distribution.
Store traffic during the day is an important factor when collecting homogeneous
subgroups of data. Subgroups can also be selected across multiple stores or a fleet,
especially if the data are assumed to be normally distributed.
Common Causes of Variation
Store traffic can differ at different times of day; such variations are common and
easy to identify.
Special Causes of Variation
Restarting the air handling unit, or a sudden power outage that causes the system to
run on generator backup before it resumes supplying air, can be considered special
causes of variation. These variations can be treated separately and would not be
considered part of the data samples.
Variable Control Chart
A control chart captures process variations from time series data.
Control Limits: the Upper Control Limit (UCL) and Lower Control Limit (LCL) are the
two limits between which data samples are expected to fall.
In a Range Control Chart, ranges are defined as zones. There are 3 zones:
Zone A: 3 Sigma Zone (99.7%)
Zone B: 2 Sigma Zone (95.5%)
Zone C: 1 Sigma Zone (68.3%)
Control Line: Temporal or Spatio-Temporal Average.
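The limits and zones above can be sketched in Python as follows. The sketch assumes the center line is the mean of a baseline sample, sigma is that sample's standard deviation, and the control limits sit at three sigma from the center. It also assumes the zones are read as bands (C within one sigma, B between one and two, A between two and three). The function names are illustrative.

```python
from statistics import mean, stdev
from typing import Dict, List

def control_limits(baseline: List[float]) -> Dict[str, float]:
    """Center line and 3-sigma control limits estimated from baseline data."""
    center = mean(baseline)
    sigma = stdev(baseline)  # sample standard deviation (assumed estimator)
    return {
        "center": center,
        "sigma": sigma,
        "ucl": center + 3 * sigma,
        "lcl": center - 3 * sigma,
    }

def zone_of(value: float, center: float, sigma: float) -> str:
    """Classify a point by its distance from the center line, in sigma units."""
    distance = abs(value - center) / sigma
    if distance <= 1:
        return "C"
    if distance <= 2:
        return "B"
    if distance <= 3:
        return "A"
    return "beyond"

# Hypothetical baseline of return air temperatures (deg C).
limits = control_limits([22.0, 22.5, 21.5, 22.2, 21.8])
print(limits)
print(zone_of(23.4, limits["center"], limits["sigma"]))
```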
Telemetry time series data are plotted within these ranges.
Zone Test Rules
Rule  Rule Name       Pattern
1     Beyond Limits   One or more points beyond the control limits (UCL, LCL)
2     Zone A          2 out of 3 consecutive points in Zone A or beyond
                      (probability of 66.67% in 3 consecutive events)
3     Zone B          4 out of 5 consecutive points in Zone B or beyond
                      (probability of 80% in 5 consecutive events)
4     Zone C          7 or more consecutive points on one side of the average
                      (in Zone C)
5     Trend           7 consecutive points trending up or trending down
6     Mixture         8 consecutive points with no points in Zone C
7     Stratification  15 consecutive points in Zone C
8     Over-control    14 consecutive points alternating up and down
The core algorithm for anomaly or defect detection lies in the zone test rules. The
eight rules above can be applied to the range chart results to quantify the defects
that violate each rule.
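The eight zone test rules could be sketched in Python as below. Several details are assumptions rather than statements from the source. "Zone A or beyond" is read as more than two sigma from the center and "Zone B or beyond" as more than one sigma. Rules 2 and 3 require the points to lie on the same side of the center line. A trend of 7 points is read as six successive moves in one direction. The function name and return shape are illustrative.

```python
from typing import Dict, List, Optional, Sequence

def zone_rule_violations(values: Sequence[float], center: float,
                         sigma: float) -> Dict[int, List[int]]:
    """Map each rule number (1-8) to the indices where a violating window ends."""
    z = [(v - center) / sigma for v in values]           # distance in sigma units
    diffs = [b - a for a, b in zip(values, values[1:])]  # point-to-point moves
    hits: Dict[int, List[int]] = {rule: [] for rule in range(1, 9)}

    def window(seq: Sequence[float], end: int, size: int) -> Optional[Sequence[float]]:
        """The `size` items ending at index `end`, or None without enough history."""
        start = end - size + 1
        return seq[start:end + 1] if start >= 0 else None

    for i in range(len(z)):
        if abs(z[i]) > 3:                                 # Rule 1: beyond limits
            hits[1].append(i)
        w = window(z, i, 3)                               # Rule 2: 2 of 3 in Zone A+
        if w and (sum(x > 2 for x in w) >= 2 or sum(x < -2 for x in w) >= 2):
            hits[2].append(i)
        w = window(z, i, 5)                               # Rule 3: 4 of 5 in Zone B+
        if w and (sum(x > 1 for x in w) >= 4 or sum(x < -1 for x in w) >= 4):
            hits[3].append(i)
        w = window(z, i, 7)                               # Rule 4: 7 on one side
        if w and (all(x > 0 for x in w) or all(x < 0 for x in w)):
            hits[4].append(i)
        d = window(diffs, i - 1, 6)                       # Rule 5: 7-point trend
        if d and (all(x > 0 for x in d) or all(x < 0 for x in d)):
            hits[5].append(i)
        w = window(z, i, 8)                               # Rule 6: none in Zone C
        if w and all(abs(x) > 1 for x in w):
            hits[6].append(i)
        w = window(z, i, 15)                              # Rule 7: 15 in Zone C
        if w and all(abs(x) < 1 for x in w):
            hits[7].append(i)
        d = window(diffs, i - 1, 13)                      # Rule 8: 14 alternating
        if d and all(a * b < 0 for a, b in zip(d, d[1:])):
            hits[8].append(i)
    return hits

# Hypothetical readings with a single spike, for illustration only.
readings = [22.1, 22.4, 22.0, 26.5, 22.3]
print(zone_rule_violations(readings, center=22.0, sigma=1.0)[1])
```

Counting the entries per rule gives a defect count for each rule, which is one way to quantify violations per measurement.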
X-bar Control Chart
The X-bar control chart measures the process mean in different subgroups. Variations
in the process mean can be observed at different temporal or spatio-temporal intervals.
S-bar Control Chart
The S-bar control chart measures the standard deviation and the variation in a
lower-level subgroup compared to its higher-level subgroup.
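A sketch of how limits for the two charts above might be computed from equally sized subgroups. It uses the standard X-bar/S chart constants, derived here from the c4 bias-correction factor rather than looked up in a table. Treating each subgroup as a list of readings from one time interval is an assumption, as are the function names.

```python
import math
from statistics import mean, stdev
from typing import Dict, List, Tuple

def c4(n: int) -> float:
    """Bias-correction factor for the sample standard deviation (n >= 2)."""
    return math.sqrt(2.0 / (n - 1)) * math.gamma(n / 2.0) / math.gamma((n - 1) / 2.0)

def xbar_s_limits(subgroups: List[List[float]]) -> Dict[str, Tuple[float, float, float]]:
    """Return (LCL, center, UCL) for the X-bar chart and for the S chart."""
    n = len(subgroups[0])
    if n < 2 or any(len(g) != n for g in subgroups):
        raise ValueError("subgroups must share one size of at least 2")
    grand_mean = mean(mean(g) for g in subgroups)  # center of the X-bar chart
    s_bar = mean(stdev(g) for g in subgroups)      # center of the S chart
    k = c4(n)
    a3 = 3.0 / (k * math.sqrt(n))
    spread = 3.0 * math.sqrt(1.0 - k * k) / k
    b3, b4 = max(0.0, 1.0 - spread), 1.0 + spread
    return {
        "xbar": (grand_mean - a3 * s_bar, grand_mean, grand_mean + a3 * s_bar),
        "s": (b3 * s_bar, s_bar, b4 * s_bar),
    }

# Hypothetical subgroups: five return air readings per interval.
groups = [
    [22.0, 22.3, 21.8, 22.1, 22.4],
    [22.2, 22.6, 22.0, 21.9, 22.3],
    [21.7, 22.1, 22.2, 22.0, 22.5],
]
print(xbar_s_limits(groups))
```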
Attribute Control Chart
Attribute control charts plot quality characteristics that are not numerical in nature,
especially when defects need to be quantified from measurements.
The zone test summary helps quantify anomalies in time series data. The zone test
rules can be applied to every measurement in a given sample and help detect anomalies
in each one separately.
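The source does not name a specific attribute chart. As one common example, the sketch below computes limits for a p-chart, which tracks the fraction of defective events per sample. The constant sample size, the function name and the counts are assumptions made for illustration.

```python
import math
from typing import List, Tuple

def p_chart_limits(defect_counts: List[int],
                   sample_size: int) -> Tuple[float, float, float]:
    """Return (LCL, center, UCL) for the fraction defective, assuming every
    sample contains `sample_size` events."""
    p_bar = sum(defect_counts) / (len(defect_counts) * sample_size)
    half_width = 3.0 * math.sqrt(p_bar * (1.0 - p_bar) / sample_size)
    # A proportion cannot leave [0, 1], so the limits are clipped.
    return max(0.0, p_bar - half_width), p_bar, min(1.0, p_bar + half_width)

# Hypothetical counts of defective events in samples of 100 events each.
print(p_chart_limits([4, 6, 5, 7, 3], sample_size=100))
```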
Multivariate Analysis – Correlation and Linear Regression
Time series pattern recognition can be applied across multiple variables. A
correlation coefficient can be calculated between two variables.
In the diagram below, the Pearson correlation coefficient (CC) is calculated between
supply air temperature and return air temperature and plotted for every telemetry
event. This shows the relationship between the two variables.
Return air temperature can depend on many variables, including supply air temperature.
Variation in the correlation coefficient (CC) indicates that new variables should be
taken into consideration.
The control chart anomaly detection technique can be applied to the new variable
(CC).
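A minimal sketch of how the CC series might be produced: a Pearson coefficient computed over a sliding window of paired readings, giving one value per event once the window is full. The window length, function names and sample values are assumptions; the source does not say how the coefficient was windowed.

```python
import math
from typing import List, Sequence

def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient; NaN if either series is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        return float("nan")
    return cov / (norm_x * norm_y)

def rolling_correlation(x: Sequence[float], y: Sequence[float],
                        window: int) -> List[float]:
    """CC over the `window` most recent paired readings, one value per event."""
    return [
        pearson(x[i - window + 1:i + 1], y[i - window + 1:i + 1])
        for i in range(window - 1, len(x))
    ]

# Hypothetical supply and return air temperatures (deg C).
supply_air = [14.0, 14.2, 14.1, 14.6, 15.0, 14.8]
return_air = [22.0, 22.3, 22.1, 22.9, 23.4, 22.2]
print(rolling_correlation(supply_air, return_air, window=4))
```

The resulting list is itself a time series, so it can be passed to the same control chart checks as a base measurement.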
Unsupervised Learning Classification
The results of attribute control charts on base variables and derived variables give
quantitative measures and help in binary classification (Good, Bad) of every event.
Association rules can be applied and a decision tree can be created based on these
classifications.
In the example above, air quality can be classified as good air or bad air. One of
the objectives of a BMS (Building Management System) is to help ensure comfort and
climate control. Quantifying bad air quality at different times is very important in
this process for preventive maintenance of the different parts of the air handling
unit.
Bad air quality can be further classified as due to air infiltration or excessive
room temperature, or due to malfunction of the chiller or heater.
Air infiltration or excessive room temperature can be the cause when, based on
anomaly detection, supply air temperature is good but return air temperature is
bad.
If supply air temperature and return air temperature are both classified as bad in
an event, the result could indicate issues in the heating coil, cooling coil, heater or
chiller.
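The two classification rules above can be written as a small decision function. The label strings and the function name are hypothetical. The case where supply air is bad but return air is good is not covered by the text, so it is left unclassified here.

```python
def classify_event(supply_ok: bool, return_ok: bool) -> str:
    """Classify one telemetry event from the per-variable anomaly results
    (True means the variable passed its anomaly checks)."""
    if supply_ok and return_ok:
        return "good air"
    if supply_ok and not return_ok:
        # Supply air is fine, so the cause is likely on the room side.
        return "bad air: air infiltration or excessive room temperature"
    if not supply_ok and not return_ok:
        # Both are off, which points at the conditioning equipment.
        return "bad air: heating coil, cooling coil, heater or chiller"
    return "unclassified"  # supply bad, return good: not described in the text

# Hypothetical anomaly results for three events.
for supply_ok, return_ok in [(True, True), (True, False), (False, False)]:
    print(classify_event(supply_ok, return_ok))
```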
Monitoring KPI, Dashboard, Alerts
Measurement KPIs can be monitored, with alerts based on defect counts. Shifts
(moving average, moving standard deviation or variation) can also be measured and
monitored over a given spatio-temporal data range.
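The shift measures mentioned above could be computed as in the sketch below, which assumes a fixed-length window over the most recent readings. The window length and the function name are illustrative choices.

```python
from collections import deque
from statistics import mean, stdev
from typing import Deque, List, Sequence, Tuple

def moving_stats(values: Sequence[float], window: int) -> List[Tuple[float, float]]:
    """(moving average, moving standard deviation) for each full window.
    `window` must be at least 2 so a sample standard deviation exists."""
    if window < 2:
        raise ValueError("window must be at least 2")
    recent: Deque[float] = deque(maxlen=window)  # drops the oldest value itself
    stats = []
    for value in values:
        recent.append(value)
        if len(recent) == window:
            stats.append((mean(recent), stdev(recent)))
    return stats

# Hypothetical return air temperatures (deg C) with a late upward shift.
temps = [22.0, 22.1, 21.9, 22.0, 23.0, 23.2, 23.1]
for avg, sd in moving_stats(temps, window=3):
    print(round(avg, 2), round(sd, 2))
```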
Preventive Maintenance
Event-triggered preventive maintenance can be scheduled for different parts of the
system based on the result of this classification. Remote alert monitoring and
notification is another application of this classification.
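As a rough sketch of an event trigger, the function below counts "bad" classifications per component and returns the components that reach a threshold. The component names, the labels and the threshold value are all hypothetical; a real deployment would choose its own criteria.

```python
from collections import Counter
from typing import List, Tuple

def maintenance_candidates(events: List[Tuple[str, str]],
                           threshold: int) -> List[str]:
    """Components whose count of 'bad' events reaches `threshold`.
    Each event is a (component, label) pair; labels are 'good' or 'bad'."""
    bad_counts = Counter(component for component, label in events if label == "bad")
    return sorted(name for name, count in bad_counts.items() if count >= threshold)

# Hypothetical classified events from one monitoring window.
window_events = [
    ("cooling coil", "bad"), ("cooling coil", "bad"), ("cooling coil", "bad"),
    ("air filter", "bad"), ("air filter", "good"), ("heater", "good"),
]
print(maintenance_candidates(window_events, threshold=3))
```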