The evolution of machine learning and IoT has made it possible for manufacturers to build more effective applications for predictive maintenance than ever before. Despite the huge potential that machine learning offers for predictive maintenance, it's challenging to build solutions that can handle the speed of IoT data streams and the massively large datasets required to train models that can forecast rare events like mechanical failures. Solving these challenges requires knowledge of state-of-the-art dataware, such as MapR, and cluster computing frameworks, such as Spark, which give developers foundational APIs for consuming and transforming data into feature tables useful for machine learning.
Other titles:
Ways and Means of Predictive Maintenance with Machine Learning
Demystifying Predictive Maintenance with Hands-On Machine Learning
https://www.meetup.com/Portland-Machine-Learning-Meetup/
And predictive maintenance.
This, from a seminal study by the DoE.
Note the date. This was way back in 2010!
The opportunities for IoT are even bigger now.
What makes ML difficult?
There’s a lot of specialized software needed to put ML into prod.
ML has a very different lifecycle too.
I’m excited to talk about this because I have pretty good ideas about how to build PdM, how to generate data, and how to play with Keras.
I work for MapR, but that’s not a big part of the story here.
My intent is to draw on personal experiences building PdM to help you learn what’s involved in building PdM regardless of your tool choice.
Advanced PdM involves not only time-series IIoT data, but also historical maintenance records, error logs, and machine and operator features.
No matter what you plan to do with the data, it must persist somewhere.
This is the point at which I need to mention MapR. MapR is dataware.
MapR facilitates data science, model dev, and ML in prod.
Scaling storage, scaling analytics, doing ML, putting analytical products into prod.
If you struggle with data storage, iterative data analysis, or using datasets for production apps, check out MapR.
Lots of tools to clean and analyze data.
But data cleansing and feature engineering require lots of trial and error.
So, any friction (e.g. data movement, schema discovery, proprietary query languages, etc) in data access is bad.
MapR reduces the barriers to saving data, analyzing it, augmenting it, and operationalizing it.
Now let’s talk about data flows. What do the processes that pull from MQTT or REST look like? Where do they run?
You can write them custom, or use a data pipeline tool like StreamSets.
“In the face of drift, in the face of change, in the face of unexpected data, changing business needs and logic, changing infrastructure, you're able to minimize the amount of downtime of the system and kind of keep it always on”
Demo script:
Now we go from talking about data collection to data transformation.
I’ll describe PdM feature engineering concepts and show how to implement them.
There may be some properties which correlate to failures.
Those properties may be calculated on-the-fly or derived by joining other datasets.
This requires data to be stored with flexible schemas.
Derived features can make analysis much easier.
Grouping sensors by system can also make analysis much easier.
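As a sketch of both ideas — a derived feature computed on-the-fly, and sensors grouped by system — here is what that can look like in plain Python. The field names (`voltage`, `current`, `power`, the system names) are hypothetical, for illustration only:

```python
from collections import defaultdict

# Hypothetical raw sensor readings; field names are made up for illustration.
readings = [
    {"system": "cooling", "sensor": "pump_1", "voltage": 11.9, "current": 2.1},
    {"system": "cooling", "sensor": "pump_2", "voltage": 12.1, "current": 1.9},
    {"system": "drive", "sensor": "motor_1", "voltage": 48.0, "current": 5.5},
]

# Derived feature: instantaneous power, computed on-the-fly from two raw fields.
for r in readings:
    r["power"] = r["voltage"] * r["current"]

# Group sensors by the system they belong to, to analyze each system as a unit.
by_system = defaultdict(list)
for r in readings:
    by_system[r["system"]].append(r["sensor"])
```

Analysis then works per system (e.g. "is total cooling power drifting up?") instead of per raw sensor stream.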
What if the data is sampled frequently?
What if failures are rare?
Here’s an example of MapR-DB being used from Spark to update lagging features.
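The MapR-DB and Spark specifics aren't reproduced here, but the core idea of a lagging feature — attaching recent history to each row — can be sketched in plain Python. The early-history `None` values mirror the nulls a Spark window function (e.g. `pyspark.sql.functions.lag`) would produce:

```python
def add_lag_features(values, lags=(1, 2)):
    """For each point in a time series, attach the values from `lags` steps back.

    Rows too early to have history get None, like the nulls produced by
    a lag() window function in Spark SQL.
    """
    rows = []
    for i, v in enumerate(values):
        row = {"value": v}
        for k in lags:
            row[f"lag_{k}"] = values[i - k] if i >= k else None
        rows.append(row)
    return rows

temps = [70.1, 70.3, 71.0, 74.2]
features = add_lag_features(temps)
```

Each output row now carries its own recent past, which is what lets a model learn "temperature rising fast" rather than just "temperature high."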
Unusual vibrations give you the first clue that a machine is nearing the end of its useful life,
so it's very important to detect those anomalies.
Vibration sensors measure the displacement or velocity of motion thousands of times per second.
Acoustic sensors work the same way.
Two things could go wrong:
Could have too much data. E.g. too many motors / data sources
Spark SQL filtering and FFT computation could be too slow.
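As a rough illustration of the FFT step (my own sketch, assuming NumPy and a synthetic signal), extracting the dominant vibration frequency from one window of samples looks like this:

```python
import numpy as np

fs = 1000                                  # sampling rate in Hz
t = np.arange(fs) / fs                     # one second of samples
signal = np.sin(2 * np.pi * 50 * t)        # synthetic 50 Hz vibration

spectrum = np.abs(np.fft.rfft(signal))     # magnitude spectrum
freqs = np.fft.rfftfreq(len(signal), d=1 / fs)
dominant = freqs[np.argmax(spectrum)]      # frequency carrying the most energy
```

An anomaly shows up as energy shifting to new frequencies relative to a healthy baseline. This per-window computation is cheap for one motor, but repeated across thousands of motors at thousands of samples per second it becomes exactly the filtering/FFT bottleneck described above.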
Now we go from talking about industry trends to more of a how-to guide.
There is no rule of thumb for the number of hidden nodes you should use.
It is something you have to figure out through trial and error.
Dropout forces better generalization
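To see why dropout forces generalization, here is the mechanism itself — "inverted" dropout, sketched with NumPy rather than Keras's `Dropout` layer:

```python
import numpy as np

def dropout(activations, rate, rng):
    """Randomly zero a fraction `rate` of activations during training,
    scaling the survivors by 1/(1 - rate) so the expected sum is unchanged."""
    keep = rng.random(activations.shape) >= rate
    return activations * keep / (1.0 - rate)

rng = np.random.default_rng(0)
acts = np.ones(10_000)
out = dropout(acts, rate=0.5, rng=rng)
```

Because every forward pass sees a different random subnetwork, no neuron can rely on any other specific neuron being present — the redundancy this forces is the generalization. In Keras this is simply a `Dropout(0.5)` layer between layers.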
We must specify a loss function and an optimizer function when compiling the model.
The loss function penalizes the model for inaccurate predictions. We use binary cross entropy because we have just two classes (1 and 0).
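Concretely, binary cross entropy for a true label y and predicted probability p is -[y·log(p) + (1-y)·log(1-p)]. A quick NumPy check (my own illustration, not Keras internals):

```python
import numpy as np

def binary_crossentropy(y_true, p_pred):
    # Clip to avoid log(0) when predictions saturate at exactly 0 or 1.
    p = np.clip(p_pred, 1e-7, 1 - 1e-7)
    return -(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))

confident_right = binary_crossentropy(1, 0.99)  # small loss
coin_flip = binary_crossentropy(1, 0.5)         # loss = ln 2
confident_wrong = binary_crossentropy(1, 0.01)  # large loss
```

The loss grows without bound as a prediction becomes confidently wrong, which is exactly the penalty structure we want for rare failure events.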
The optimizer defines how to adjust neuron weights in response to inaccurate predictions. The Adam optimizer makes sense, because I’ve read that Adam learns fast, is stable over a wide range of learning rates, and has comparatively low memory requirements. Keras uses a default learning rate of 0.001.
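For intuition, here is a minimal sketch of the Adam update rule — the same per-weight math Keras applies, shown minimizing a toy quadratic, with hyperparameters set to Keras's defaults:

```python
import math

def adam_minimize(grad, x0, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-7, steps=5000):
    """Adam: adapt each step using running moments of the gradient."""
    x, m, v = x0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1 - beta1) * g        # first moment (mean of gradients)
        v = beta2 * v + (1 - beta2) * g * g    # second moment (uncentered variance)
        m_hat = m / (1 - beta1 ** t)           # bias correction for early steps
        v_hat = v / (1 - beta2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x

# Minimize f(x) = x^2, whose gradient is 2x; the minimum is at 0.
x_min = adam_minimize(lambda x: 2 * x, x0=3.0)
```

The per-parameter normalization by the second moment is why Adam is stable across a wide range of learning rates, and the two scalars of state per weight are why its memory overhead is modest.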
I like to think of LSTM as doing an exponential rolling average.
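The analogy: an exponential rolling average keeps a decayed memory of past inputs, loosely the way an LSTM's forget gate scales down its cell state each step. A sketch of the average (this is intuition, not the actual LSTM equations):

```python
def ema(xs, alpha=0.3):
    """Exponentially weighted rolling average: each new value blends in with
    weight alpha, and the running memory decays by (1 - alpha) per step --
    loosely analogous to a forget gate scaling the cell state."""
    avg = xs[0]
    history = [avg]
    for x in xs[1:]:
        avg = alpha * x + (1 - alpha) * avg
        history.append(avg)
    return history

smoothed = ema([0.0, 0.0, 10.0, 10.0, 10.0])
```

The key difference is that an LSTM learns its decay (and what to write into memory) from data, where the EMA's alpha is fixed.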
Training:
I fit the network with 5 epochs and a batch size of 10.
An epoch is when you go over the complete training data once.
A batch size of 10 means we expose the network to 10 input sequences before updating the weights. Batches also ensure we don’t try to load the entire training data into memory at once.
The fit function returns a history object that provides a summary of model accuracy recorded at each epoch.
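To make epochs, batches, and the history object concrete, here is the loop they describe, sketched in plain Python (Keras runs the equivalent inside `model.fit`; the loss here is a placeholder, not a real computation):

```python
def fit_sketch(data, epochs=5, batch_size=10):
    """Mimic the shape of model.fit: several full passes (epochs) over the
    data, one weight update per batch, one recorded metric per epoch."""
    history = {"loss": []}
    for epoch in range(epochs):
        epoch_loss = 0.0
        # One epoch = one complete pass over the training data, in batches,
        # so we never need the whole dataset in memory at once.
        for start in range(0, len(data), batch_size):
            batch = data[start:start + batch_size]
            # ... forward pass, loss, and a weight update would happen here ...
            epoch_loss += sum(batch)  # placeholder stand-in for a real loss
        history["loss"].append(epoch_loss / len(data))
    return history

hist = fit_sketch(list(range(100)), epochs=5, batch_size=10)
```

The returned `history` dict has one entry per epoch, which is what you'd plot to check whether training has converged or begun to overfit.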