The document discusses data-driven production optimization using machine learning techniques. It describes how machine learning can be applied at different stages - on-site and on-line for real-time scoring of production data, or off-line for model training. Machine learning algorithms like time series analysis, clustering, and anomaly detection can be used to provide insights into production processes and identify opportunities for improvement in key metrics like overall equipment efficiency. The techniques can be implemented using an OPC UA interface to access industrial equipment data and provide analytics.
Back again, and guess what: it's cluster Friday again, so let's cluster around a bit :slightly_smiling_face:
11:25
I was not so happy with using DBSCAN for anomalies, so I put it aside for now and looked into LOF instead. LOF (Local Outlier Factor) is an algorithm that measures the outlier-ness of a point by looking at the local density of points from the learning phase:
11:26
To measure the outlierness of a point P, we take 1) the k-nearest neighbors of that point P and measure the distances to those points
11:26
2) the k-nearest neighbors of those neighbors and measure their distances as well
11:27
3) we compare the two distance distributions
11:27
This gives us a measure that compares the local surroundings of a point with the local surroundings of that point's neighbors
11:29
Long story short: if a point is far away from a dense cloud of points, it will have large distances to its nearest neighbors (which are e.g. the outer members of that dense cloud); but those neighbors themselves have many nearby neighbors in their cloud, so the point's local density is much lower than theirs and it ends up with a high outlier score
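For reference, the standard textbook LOF definitions behind these steps (general formulas, nothing specific to our setup): with N_k(p) the k-nearest-neighbor set of p,

$$
\mathrm{reach\mbox{-}dist}_k(p,o) = \max\{k\mbox{-}\mathrm{dist}(o),\ d(p,o)\}, \qquad
\mathrm{lrd}_k(p) = \frac{|N_k(p)|}{\sum_{o \in N_k(p)} \mathrm{reach\mbox{-}dist}_k(p,o)}, \qquad
\mathrm{LOF}_k(p) = \frac{1}{|N_k(p)|} \sum_{o \in N_k(p)} \frac{\mathrm{lrd}_k(o)}{\mathrm{lrd}_k(p)}
$$

LOF around 1 means the point is about as dense as its neighbors; values well above 1 mean it sits in a much sparser region, i.e. an outlier.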
11:30
So, LOF is used not for clustering but only for density-based outlier detection, which is fine for my use case
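For the record, roughly what this looks like in scikit-learn (a minimal sketch; the data here is just a synthetic stand-in, not our Prinovis data):

```
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.default_rng(0)

# "Learning phase": reference data, here a synthetic stand-in for normal operation
X_train = rng.normal(loc=0.0, scale=1.0, size=(3000, 4))

# novelty=True lets us score new, unseen points after fitting
lof = LocalOutlierFactor(n_neighbors=200, novelty=True)
lof.fit(X_train)

# "Scoring phase": in sklearn, more negative = more anomalous,
# which is why high anomaly scores show up as negative numbers in the plots
X_new = np.vstack([
    rng.normal(0.0, 1.0, size=(10, 4)),   # looks like the learned cloud
    rng.normal(8.0, 0.5, size=(5, 4)),    # far away from it
])
print(lof.score_samples(X_new))   # much more negative for the far-away points
print(lof.predict(X_new))         # +1 = inlier, -1 = outlier
```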
11:30
Let's take a first look at the Prinovis data and see how the algorithm behaves:
I scored with LOF on a paper roll change (the scores are the colored lines, with high anomaly scores giving negative numbers), and here I compare two learning phases: one with 3,000 points and one with 10,000 points. The learning phases also include only normal operation and paper roll changes, so basically the same events I am scoring on. We would expect a "low" outlier value, as we have seen the corresponding events during the learning phase
11:33
For the LOF we need to select k for the k-NN, and I set k to 50, 100, ..., 1000 as you see in the plots.
11:35
Let's look at the green curve (approx. @5300): it's very different between left and right. Why? The algorithm is looking for 200 neighbors of a point to measure outlierness. In the left case with less learning data, we simply don't have that many occurrences of paper roll changes, and that's why the algorithm gives it a higher outlier score, whereas on the right we don't get an outlier score, because in the reference (learning phase) we have seen so many events of this kind that it is no longer anomalous to the LOF
11:36
=> so we have to be careful in selecting the number of neighbors for the density measurement relative to the size of our learning set
Data preparation was PCA => 4 dimensions, but raw data works just the same, only slower
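As a sketch of the whole preparation + k-sweep (assuming the raw data is already a samples-by-sensors matrix; sizes and variable names here are made up):

```
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.default_rng(1)
raw_train = rng.normal(size=(10_000, 30))   # stand-in for the raw learning phase
raw_score = rng.normal(size=(2_000, 30))    # stand-in for the window we score

# Data preparation: scale, then PCA down to 4 dimensions
scaler = StandardScaler().fit(raw_train)
pca = PCA(n_components=4).fit(scaler.transform(raw_train))
X_train = pca.transform(scaler.transform(raw_train))
X_score = pca.transform(scaler.transform(raw_score))

# Sweep k and keep one score curve per k, like the colored lines in the plots;
# k has to stay well below the number of occurrences of an event in the learning set,
# otherwise the event's own points cannot act as each other's neighbors
scores = {}
for k in (50, 100, 200, 500, 1000):
    lof = LocalOutlierFactor(n_neighbors=k, novelty=True).fit(X_train)
    scores[k] = lof.score_samples(X_score)
```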
In the picture above you can clearly see the capability of learning a behaviour: I scored a certain area but used different learning areas, on the left learning with paper rips and on the right without.
2:13
The scores (for k-NN with k=16, 32) are very different during the paper rip (~@1500). The left model has learned that area and does not score such heavy anomalies, whereas on the right basically the whole paper-rip process, until the system is back up and running, is an anomaly
al
3:27 PM
Now, I did a trial with several contamination levels during learning on the same data: setting the contamination to 1% is not a good idea (orange curve), as many of the good areas are also marked as anomalies. Basically 1% of the area :slightly_smiling_face:
3:30
So going with 1 ppm as contamination gives me a good result again: the whole learned area is considered "good" and only the new stuff (the paper rip) is considered anomalous. (Background color blue)
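The contamination trial in code terms (again a sketch with stand-in data; with novelty=True the contamination only moves the decision threshold, the raw scores stay the same):

```
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.default_rng(2)
X_train = rng.normal(size=(10_000, 4))   # stand-in for the prepared learning phase
X_score = rng.normal(size=(2_000, 4))    # stand-in for the scored window

for contamination in (0.01, 1e-6):       # 1% vs. 1 ppm
    lof = LocalOutlierFactor(n_neighbors=200, novelty=True,
                             contamination=contamination).fit(X_train)
    # offset_ is the score threshold derived from the contamination level:
    # with 1%, roughly 1% of the learned area falls below it (spurious alarms),
    # with 1 ppm it sits at about the worst score seen during learning
    print(contamination, lof.offset_)
    labels = lof.predict(X_score)        # -1 wherever score_samples < offset_
```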
al
4:01 PM
Last trial for today: setting the contamination level during learning very high, but reducing the learning area further, we still see that we get spurious anomaly alarms. The threshold from a contamination parameter of 1 ppm will basically take the highest anomaly seen during learning as the level, but a similar area with a little bit more noise will then trigger an alarm. We probably have to add a safety margin
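One way to put such a margin on top (my own sketch; the margin value is just an example and not validated on our data):

```
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.default_rng(3)
X_train = rng.normal(size=(10_000, 4))   # stand-in for the learning phase
X_live = rng.normal(size=(500, 4))       # stand-in for the window being scored live

lof = LocalOutlierFactor(n_neighbors=200, novelty=True).fit(X_train)

# Worst (most negative) score the model assigned to its own learning data,
# widened by a relative margin so that a similar area with a bit more noise
# does not immediately trigger an alarm
worst = lof.negative_outlier_factor_.min()
margin = 0.2                             # example value, needs tuning
threshold = worst - margin * abs(worst)

live_scores = lof.score_samples(X_live)
alarms = live_scores < threshold         # True wherever we would raise an anomaly alarm
```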
In the upper picture we see on the right side a zoom-in to an area where we get spurious anomalies. I'll have to run a few more experiments to get a robust automatic setting of the algorithm.... Have a nice weekend!