A Machine Learning Practitioner’s Guide to
Predictive Maintenance
IAN DOWNARD
idownard@mapr.com
© 2018 MapR Technologies 2
About Me
Sr. Solution Architect at MapR
Email: idownard@mapr.com
Blog: bigendiandata.com
Predictive maintenance can
reduce…
• equipment failures by 75%
• downtime by 45%
• maintenance costs by 30%
US Department of Energy, August 2010
https://www.energy.gov/eere/femp/downloads/operations-and-maintenance-best-practices-guide
“Manufacturers’ adoption of
ML/AI will increase 38% in the
next five years.”
Digital Factories 2020: Shaping the future of manufacturing
PricewaterhouseCoopers, 2017
If ML is so effective, why
aren’t more people using it?
https://papers.nips.cc/paper/5656-hidden-technical-debt-in-machine-learning-systems.pdf
Hidden Technical Debt in Machine Learning Systems
Google, 2015
“Only a small fraction of real-world ML systems is composed of the ML code…
The required surrounding infrastructure is vast and complex.”
This talk is about Data Collection, Feature Engineering, and time-series forecasting.
(Diagram labels: Data Collection, Feature Extraction, ML Code)
Data Collection
Predictive Maintenance
1. Collect data wherever possible (sensors, cameras, operator logs, weather, etc.).
2. Store that data for a long time.
3. Look for patterns using tools that can spot subtle trends.
4. Deploy monitoring agents to watch for anomalies or known failure modes.
Photo: “Ein Besuch bei Ford in Köln” (“A visit to Ford in Cologne”), source: Gilly on Flickr
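The fourth step, watching for anomalies, can start out as a rolling z-score check. The following is a minimal sketch, assuming a single numeric sensor channel. The function name `find_anomalies`, the window size, the threshold, and the sample readings are illustrative assumptions, not something the talk prescribes.

```python
from collections import deque
from statistics import mean, pstdev


def find_anomalies(readings, window=5, threshold=3.0):
    """Return indexes of readings that deviate from the rolling baseline.

    A reading is flagged when it lies more than `threshold` standard
    deviations from the mean of the previous `window` readings.
    """
    history = deque(maxlen=window)
    anomalies = []
    for i, value in enumerate(readings):
        # Only judge a reading once a full baseline window is available.
        if len(history) == window:
            mu, sigma = mean(history), pstdev(history)
            if sigma > 0 and abs(value - mu) > threshold * sigma:
                anomalies.append(i)
        history.append(value)
    return anomalies


# Hypothetical readings with one obvious spike.
readings = [0.43, 0.44, 0.42, 0.43, 0.45, 0.44, 0.95, 0.43, 0.44]
print(find_anomalies(readings))
```

Note that a flagged reading still enters the baseline window, so a production monitor would probably exclude or down-weight it.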
Data Pipelines
Ingest Persist Analyze / Operationalize
IDEs, notebooks, and AI platforms
?
Data Flow
Data Platform(s)
Industry’s leading data storage platform for Big Data and Machine Learning
Data Storage
Data Exploration, Feature Engineering, and AI
IDEs, notebooks, platforms
Programming libraries
Dataware: Files, Tables, Streams
Data Flow
Dataflow Management
• IDE with integrated monitoring and debugging tools
• CI/CD capabilities built in
• Containerized approach that scales elastically, e.g. in Kubernetes
• Data validation on the fly
• Dataflows that can span edge, on-prem, and cloud infrastructure with guaranteed privacy and delivery
StreamSets Demo
Demo Steps:
https://github.com/mapr-demos/predictive-maintenance#streamsets-demonstration
Feature Engineering
What does Factory IoT data look like?
Sensor Data
Device ID time x y z
1 8:00:00 .431 .123 .145
1 8:00:01 .735 .112 .672
1 8:00:02 .932 .141 .431
1 8:00:03 .988 .241 .625
Very important
machinery
Device ID time x y z _operator _weather
1 11:59:58 .431 .123 .145 Joe sunny
1 11:59:59 .735 .112 .672 Joe rain
1 12:00:00 .932 .141 .431 Moe sunny
1 12:00:01 .988 .241 .625 Moe sunny
Feature engineering helps AI find correlations.
Feature tables need to have flexible schemas.
Device ID time x y z _subsystem _weekend
1 11:59:58 .431 .123 .145 Boiler False
2 11:59:59 .735 .112 .672 Chiller False
1 12:00:00 .932 .141 .431 Boiler True
3 12:00:01 .988 .241 .625 Fuel Supply True
Derived features also make analysis easier.
How would you write SQL logic for _weekend?
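The slide asks about SQL. In a general-purpose language the rule is a one-line test on the day of the week. The sketch below assumes each row carries a full timestamp (the table above shows only the time of day); the dates and the helper name `is_weekend` are made up for illustration.

```python
from datetime import datetime


def is_weekend(timestamp):
    """True when the timestamp falls on a Saturday or Sunday."""
    # datetime.weekday() numbers Monday as 0, so Saturday is 5 and Sunday is 6.
    return timestamp.weekday() >= 5


# Hypothetical rows straddling a Friday-to-Saturday midnight.
rows = [
    {"device_id": 1, "time": datetime(2018, 6, 1, 23, 59, 58)},  # a Friday
    {"device_id": 1, "time": datetime(2018, 6, 2, 0, 0, 0)},     # a Saturday
]

# Derive the _weekend feature for every row.
for row in rows:
    row["_weekend"] = is_weekend(row["time"])
```

Storing the result as a column means later queries can filter on `_weekend` directly instead of repeating the date arithmetic.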
How would you filter on _subsystem without a full table scan?
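One common answer is a secondary index on the derived column. The talk does not say which answer the speaker had in mind, so treat the following as an illustrative sketch: it builds an in-memory index from `_subsystem` values to row positions, using the rows from the table above. The names `subsystem_index` and `rows_for_subsystem` are hypothetical.

```python
from collections import defaultdict

rows = [
    {"device_id": 1, "time": "11:59:58", "_subsystem": "Boiler"},
    {"device_id": 2, "time": "11:59:59", "_subsystem": "Chiller"},
    {"device_id": 1, "time": "12:00:00", "_subsystem": "Boiler"},
    {"device_id": 3, "time": "12:00:01", "_subsystem": "Fuel Supply"},
]

# Build the index once: subsystem name -> positions of the matching rows.
subsystem_index = defaultdict(list)
for position, row in enumerate(rows):
    subsystem_index[row["_subsystem"]].append(position)


def rows_for_subsystem(name):
    """Look up rows through the index instead of scanning every row."""
    return [rows[position] for position in subsystem_index.get(name, [])]


boiler_rows = rows_for_subsystem("Boiler")
```

A database does the same thing at scale: the index costs extra writes and storage, but a filter on `_subsystem` then touches only the matching rows.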
Device ID time x y z Remaining Life 30s to failure
1 8:00:00 .431 .123 .145
1 8:00:01 .735 .112 .672
1 8:00:02 .932 .141 .431
1 8:00:03 .988 .241 .625
Some features may be “lagging”
Values can only be calculated once a future event occurs.
Device ID time x y z Remaining Life 30s to failure
1 8:00:00 .431 .123 .145 3 true
1 8:00:01 .735 .112 .672 2 true
1 8:00:02 .932 .141 .431 1 true
1 8:00:03 --- --- --- 0 true
Lagging features work like this:
When a failure happens…
…then lagging features get labeled.
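The back-filling step can be sketched in a few lines. This is a minimal illustration, assuming timestamps are plain seconds and that "about to fail" means within 30 seconds of the failure, as in the table's column name. The function and field names are hypothetical and this is not the Spark code from the demo.

```python
def label_lagging_features(rows, failure_time, warning_window=30):
    """Fill in remaining life and the about-to-fail flag after a failure.

    `rows` are dicts with a `time` field in seconds. Rows recorded after
    the failure are left untouched.
    """
    for row in rows:
        remaining = failure_time - row["time"]
        if remaining < 0:
            continue  # recorded after this failure, nothing to label
        row["remaining_life"] = remaining
        row["about_to_fail"] = remaining <= warning_window
    return rows


# Four readings one second apart; the failure is observed at t = 3.
rows = [{"device_id": 1, "time": t} for t in (0, 1, 2, 3)]
label_lagging_features(rows, failure_time=3)
```

With these inputs the remaining-life column counts down 3, 2, 1, 0 and every row is flagged, which matches the table above.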
Spark database connectors
Key Features:
• Operate on data in Spark without data movement.
• DB pushdown enables fast filtering and sorting.
DB connectors enable you to use feature tables
without massive ETL into Spark executors.
(Architecture diagram labels: sensor data and failure events; ingest stream, Kafka API; feature engineering in Spark; feature storage in NoSQL DB, CRUD API; SQL analytics in Drill; ODBC connection to data science tools.)
Feature Engineering Demo:
github.com/mapr-demos/predictive-maintenance
Feature engineering with Spark
Define a case class for incoming metrics.
Subscribe to stream
Read from stream
Feature engineering with Spark
Create lagging variables
Derive _weekend feature
Save feature table to DB
Labeling lagging features with Spark
Subscribe to the stream containing failure notifications
When there’s a failure, open the feature table
Calculate the timestamps for when we consider failure “imminent”
Labeling lagging features with Spark
Label “AboutToFail”
Label “RUL”
Combine the lagging feature updates into one df
Save to DB
Feature table size
Number of lagging variable records to update
Alert sent to Grafana
Listening to stream for failure events
Architecting for Fast Data
• Continuous-time signals require high-speed sampling.
• Full resolution is required.
– Aggregation hides important detail.
– High fidelity makes AI more effective.
• Challenges:
– Stream throughput congestion?
– Stream transformation backlog?
Detecting Vibrations with FFTs
• Vibrations give the first clue that a machine is failing.
• Vibration sensors measure physical displacement.
• Capturing a 10 kHz vibration requires more than 20k samples per second (the sampling rate must exceed twice the signal frequency).
Detecting vibration anomalies requires continuously processing high-speed data streams.
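The sampling-rate requirement can be checked numerically. The sketch below samples a 10 kHz sine wave at two rates and reports the strongest frequency seen in the spectrum. It uses a naive discrete Fourier transform from the standard library as a stand-in for a real FFT routine; the function name and the chosen sample counts are illustrative assumptions.

```python
import cmath
import math


def dominant_frequency(signal_hz, sample_rate_hz, n_samples):
    """Sample a sine wave and return the strongest frequency in its spectrum."""
    samples = [math.sin(2 * math.pi * signal_hz * n / sample_rate_hz)
               for n in range(n_samples)]
    best_bin, best_magnitude = 0, 0.0
    # Only bins up to half the sample rate carry independent information.
    for k in range(n_samples // 2 + 1):
        coefficient = sum(
            s * cmath.exp(-2j * math.pi * k * n / n_samples)
            for n, s in enumerate(samples)
        )
        if abs(coefficient) > best_magnitude:
            best_bin, best_magnitude = k, abs(coefficient)
    # Convert the winning bin index back to a frequency in Hz.
    return best_bin * sample_rate_hz / n_samples


# Sampled above 20k samples/sec, the 10 kHz tone is recovered.
fast = dominant_frequency(10_000, 25_000, 250)
# Sampled below 20k samples/sec, the tone shows up at the wrong frequency.
slow = dominant_frequency(10_000, 15_000, 150)
print(fast, slow)
```

At 25k samples per second the peak lands at 10 kHz. At 15k samples per second the same tone aliases to 5 kHz, so an anomaly detector would be looking at the wrong frequency.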
Can Spark distill fast data streams?
(Diagram labels: input stream at >20k samples/sec; Spark output at 1 record/sec; Anomaly notifications; Feature Store)
As long as Spark can process signals fast enough, this will work. What two things could go wrong?
Spark scales via parallel compute.
If Spark computes FFTs too slowly, run more Spark jobs.
What if there’s too much data for one stream?
Stream bottlenecks can be avoided by distributing data across topics and/or partitions.
Tip: make sure each producer has its own topic. This can significantly improve throughput (msgs/sec).
Tip: Spark consumers can subscribe to multiple topics, so subscribe to all.
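Spreading data across topics and partitions comes down to a deterministic mapping from the sender to a destination. The sketch below shows one way to express that mapping, assuming a stable hash of the device ID for partitions and a per-producer topic name. The naming scheme `sensors.<producer>` and both function names are hypothetical, and no real streaming client is involved.

```python
import zlib


def partition_for(device_id, num_partitions):
    """Map a device ID to a partition using a stable hash."""
    # crc32 gives the same value on every run, unlike Python's built-in
    # hash(), which is salted per process for strings.
    return zlib.crc32(str(device_id).encode("utf-8")) % num_partitions


def topic_for(producer_name):
    """Give each producer its own topic, following the first tip above."""
    return f"sensors.{producer_name}"


# All readings from one device land on the same partition, so their
# relative order is preserved for the consumer.
assignments = {device: partition_for(device, 4) for device in (1, 2, 3)}
```

Because the mapping is deterministic, adding producers adds topics without any coordination, and a consumer can subscribe to every `sensors.*` topic it cares about.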
Machine Learning
Three Types of Models for PdM
1. Regression: Predict the Remaining Useful Life (RUL).
2. Binary classification: Predict if an asset will fail within a certain time frame (e.g. 50 days).
3. Multi-class classification: Predict if an asset will fail in different time windows (e.g. tomorrow, next week…).
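The three model types differ mainly in how the training label is derived from the same remaining-useful-life value. A minimal sketch, assuming RUL is measured in days; the 50-day horizon comes from the slide, while the class boundaries of 1 and 7 days and the function names are illustrative assumptions.

```python
def regression_label(rul_days):
    """Regression target: the remaining useful life itself."""
    return rul_days


def binary_label(rul_days, horizon_days=50):
    """1 if the asset fails within the horizon, otherwise 0."""
    return 1 if rul_days <= horizon_days else 0


def multiclass_label(rul_days, boundaries=(1, 7)):
    """Class 0 = fails within 1 day, class 1 = within 7 days, class 2 = later."""
    for class_id, boundary in enumerate(boundaries):
        if rul_days <= boundary:
            return class_id
    return len(boundaries)


# The same asset, 5 days from failure, labeled three ways.
labels = (regression_label(5), binary_label(5), multiclass_label(5))
```

Choosing between them is a modeling decision: regression gives the most detail, while the classification labels are usually easier to learn and to act on.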
Terminology: “units” vs “cell”
http://colah.github.io/posts/2015-08-Understanding-LSTMs/
One LSTM cell
3 LSTM cells
3 memory cells
3 hidden nodes
3 hidden layers
3 neurons
3 units
1 visible layer
Determining units is a process of trial and error.
Model Input
• Input shape: [samples, time steps, features]
• Samples: the input sequences built from the rows of our time-series training data.
• Time steps: the look-back window, or sequence length.
– LSTMs work better on windows with fewer than a couple hundred time steps.
• Features: the signal columns we want the model to generalize from.
(Diagram: Input → LSTM 1 (100 units, 0.2 dropout) → LSTM 2 (50 units, 0.3 dropout) → Dense (sigmoid) → 0 or 1 result)
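The three-dimensional input shape is easiest to see by building it. The sketch below slices a short table of sensor rows into overlapping look-back windows using plain lists. The helper name `make_windows` is hypothetical, and a real pipeline would normally produce a NumPy array rather than nested lists.

```python
def make_windows(rows, time_steps):
    """Slice per-timestep feature rows into overlapping look-back windows.

    Returns nested lists shaped [samples][time_steps][features].
    """
    return [rows[start:start + time_steps]
            for start in range(len(rows) - time_steps + 1)]


# Four readings of three features (x, y, z), as in the sensor table.
rows = [
    [0.431, 0.123, 0.145],
    [0.735, 0.112, 0.672],
    [0.932, 0.141, 0.431],
    [0.988, 0.241, 0.625],
]

windows = make_windows(rows, time_steps=3)
shape = (len(windows), len(windows[0]), len(windows[0][0]))
print(shape)
```

Four rows with a look-back of three give two samples, each with three time steps of three features.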
Model Structure
Sequential: a linear stack of layers.
Units: the number of neurons in a layer.
Dropout: reduces overfitting by randomly dropping neurons during training.
Dense: a fully connected output layer; here it uses a sigmoid activation.
Sigmoid: squashes the output to a value between 0 and 1, which is thresholded to give a 0 or 1 result.
Binary Cross Entropy: loss function for when you have just two classes (1 and 0).
Adam Optimizer: learns fast, uses little memory, and is stable over a wide range of learning rates.
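The sigmoid output and the binary cross-entropy loss are small enough to write out by hand. This is a plain-Python sketch of the two formulas, not the Keras implementation; the 0.5 decision threshold and the function names are assumptions for illustration.

```python
import math


def sigmoid(x):
    """Squash any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def binary_cross_entropy(y_true, p):
    """Loss for one example with label y_true (0 or 1) and predicted probability p."""
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1 - p))


def predict_class(x, threshold=0.5):
    """Turn the sigmoid probability into a hard 0 or 1 result."""
    return 1 if sigmoid(x) >= threshold else 0


# A confident correct prediction costs little; a confident wrong one costs a lot.
good = binary_cross_entropy(1, 0.9)
bad = binary_cross_entropy(1, 0.1)
```

The loss grows sharply as the predicted probability moves away from the true label, which is what pushes the network toward confident, correct outputs.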
Training
• An epoch is when you go over the complete training data once.
• A batch size of 10 means we expose the network to 10 input sequences before
updating the weights.
• Batches also ensure we don’t try to load the entire training data into memory at
once.
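The relationship between epochs, batches, and weight updates can be made concrete with a small generator. This sketch only iterates over the data and counts updates; it does not train anything, and the function names are hypothetical.

```python
import math


def iter_batches(samples, batch_size):
    """Yield consecutive slices so only one batch is handled at a time."""
    for start in range(0, len(samples), batch_size):
        yield samples[start:start + batch_size]


def updates_per_epoch(num_samples, batch_size):
    """Number of weight updates in one pass over the training data."""
    return math.ceil(num_samples / batch_size)


# 25 input sequences with a batch size of 10: two full batches and one of 5.
samples = list(range(25))
batch_sizes = [len(batch) for batch in iter_batches(samples, 10)]
print(batch_sizes, updates_per_epoch(len(samples), 10))
```

One epoch over 25 sequences with a batch size of 10 therefore triggers three weight updates.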
Generating data with logsynth
https://github.com/tdunning/log-synth
References
• Demo code: https://github.com/mapr-demos/predictive-maintenance
• Awesome LSTM implementation for predictive maintenance on aircraft engines, by Fidan Boylu Uz, PhD:
https://github.com/Azure/lstms_for_predictive_maintenance/blob/master/Deep%20Learning%20Basics%20for%20Predictive%20Maintenance.ipynb
• Good LSTM overview, by Jason Brownlee, PhD:
https://machinelearningmastery.com/5-step-life-cycle-long-short-term-memory-models-keras/
Check out my webinars:
bit.ly/iot_webinar_part1 bit.ly/iot_webinar_part2
Questions?
https://mapr.com/ebooks
IAN DOWNARD
idownard@mapr.com

BigBuy dropshipping via API with DroFx.pptx
 
Customer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxCustomer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptx
 
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
 
Brighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingBrighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data Storytelling
 
Unveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystUnveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data Analyst
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptx
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFx
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptx
 
Dubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls DubaiDubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls Dubai
 
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
 

Predictive Maintenance - Portland Machine Learning Meetup

  • 1. A Machine Learning Practitioner’s Guide to Predictive Maintenance IAN DOWNARD idownard@mapr.com
  • 2. © 2018 MapR Technologies 2 About Me Sr. Solution Architect at MapR Email: idownard@mapr.com Blog: bigendiandata.com
  • 3. Predictive maintenance can reduce… • equipment failures by 75% • downtime by 45% • maintenance costs by 30% US Department of Energy, August 2010 https://www.energy.gov/eere/femp/downloads/operations-and-maintenance-best-practices-guide
  • 4. “Manufacturers’ adoption of ML/AI will increase 38% in the next five years.” Digital Factories 2020: Shaping the future of manufacturing PricewaterhouseCoopers, 2017 If ML is so effective, why aren’t more people using it?
  • 5. Hidden Technical Debt in Machine Learning Systems, Google, 2015 https://papers.nips.cc/paper/5656-hidden-technical-debt-in-machine-learning-systems.pdf “Only a small fraction of real-world ML systems is composed of the ML code… The required surrounding infrastructure is vast and complex.”
  • 6. Data Collection, Feature Extraction, ML Code. This talk is about Data Collection, Feature Engineering, and time-series forecasting.
  • 8. Predictive Maintenance: 1. Collect data wherever possible (sensors, cameras, operator logs, weather, etc.) 2. Store that data for a long time. 3. Look for patterns using tools that can spot subtle trends. 4. Deploy monitoring agents to watch for anomalies or known failure modes. Ein Besuch bei Ford in Köln (source: Gilly on Flickr)
  • 9. Data Collection
  • 10. Data Pipelines Ingest Persist Analyze / Operationalize IDEs, notebooks, and AI platforms ? Data Flow Data Platform(s)
  • 11. Data Pipelines Ingest Persist Analyze / Operationalize Data Flow IDEs, notebooks, and AI platforms
  • 12. Industry’s leading data storage platform for Big Data and Machine Learning Data Storage
  • 13. Data Exploration, Feature Engineering, and AI IDEs, notebooks platforms Programming Libraries Dataware Files, Tables, Streams Data Flow
  • 14. Dataflow Management • IDE with integrated monitoring and debugging tools • CI/CD capabilities built-in • Containerized approach that scales elastically, e.g. in Kubernetes. • Data validation on-the-fly • Dataflows that can span edge/on-prem/cloud infrastructure with guaranteed privacy and delivery.
  • 15. StreamSets Demo Demo Steps: https://github.com/mapr-demos/predictive-maintenance#streamsets-demonstration
  • 17. What does Factory IoT data look like? Sensor data from very important machinery:
      Device ID  time     x     y     z
      1          8:00:00  .431  .123  .145
      1          8:00:01  .735  .112  .672
      1          8:00:02  .932  .141  .431
      1          8:00:03  .988  .241  .625
  • 18. Feature engineering helps AI find correlations. Feature tables need to have flexible schemas.
      Device ID  time      x     y     z     _operator  _weather
      1          11:59:58  .431  .123  .145  Joe        sunny
      1          11:59:59  .735  .112  .672  Joe        rain
      1          12:00:00  .932  .141  .431  Moe        sunny
      1          12:00:01  .988  .241  .625  Moe        sunny
  • 19. Derived features also make analysis easier. How would you write SQL logic for _weekend?
      Device ID  time      x     y     z     _subsystem   _weekend
      1          11:59:58  .431  .123  .145  Boiler       False
      2          11:59:59  .735  .112  .672  Chiller      False
      1          12:00:00  .932  .141  .431  Boiler       True
      3          12:00:01  .988  .241  .625  Fuel Supply  True
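A derived feature like _weekend is often easier to compute in code than in SQL. The following is a minimal sketch in plain Python, not the demo's Spark code; the row layout, the dates, and the helper name `is_weekend` are assumptions made for illustration, since the slide shows times without dates.

```python
from datetime import datetime

def is_weekend(ts: datetime) -> bool:
    """Return True when the timestamp falls on a Saturday or Sunday."""
    # datetime.weekday() numbers Monday as 0, so Saturday is 5 and Sunday is 6.
    return ts.weekday() >= 5

# Hypothetical sensor rows that straddle midnight between a Friday and a Saturday.
rows = [
    {"device_id": 1, "time": datetime(2018, 6, 1, 23, 59, 58), "x": 0.431},
    {"device_id": 1, "time": datetime(2018, 6, 2, 0, 0, 0), "x": 0.932},
]

# Add the derived column to every row.
for row in rows:
    row["_weekend"] = is_weekend(row["time"])
```

Because the feature is just an extra key on each row, a table with a flexible schema can accept it without a migration.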
  • 20. Derived features also make analysis easier. How would you filter on _subsystem without a full table scan?
      Device ID  time      x     y     z     _subsystem   _weekend
      1          11:59:58  .431  .123  .145  Boiler       False
      2          11:59:59  .735  .112  .672  Chiller      False
      1          12:00:00  .932  .141  .431  Boiler       True
      3          12:00:01  .988  .241  .625  Fuel Supply  True
  • 21. Some features may be “lagging”: values can only be calculated once a future event occurs.
      Device ID  time     x     y     z     Remaining Life  30s to failure
      1          8:00:00  .431  .123  .145
      1          8:00:01  .735  .112  .672
      1          8:00:02  .932  .141  .431
      1          8:00:03  .988  .241  .625
  • 22. Lagging features work like this: when a failure happens, the lagging features get labeled.
      Device ID  time     x     y     z     Remaining Life  30s to failure
      1          8:00:00  .431  .123  .145  3               true
      1          8:00:01  .735  .112  .672  2               true
      1          8:00:02  .932  .141  .431  1               true
      1          8:00:03  ---   ---   ---   0               true
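The labeling step can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the demo's Spark job: times are integer seconds, and the function name `label_lagging`, the field names, and the 30-second window parameter are hypothetical.

```python
def label_lagging(rows, failure_time, window=30):
    """Fill in the lagging features for unlabeled rows at or before a failure.

    remaining_life is the number of seconds until the failure, and
    about_to_fail is True when the failure is at most `window` seconds away.
    """
    for row in rows:
        if row["remaining_life"] is None and row["time"] <= failure_time:
            remaining = failure_time - row["time"]
            row["remaining_life"] = remaining
            row["about_to_fail"] = remaining <= window
    return rows

# Hypothetical feature table: lagging columns stay empty until a failure occurs.
rows = [
    {"device_id": 1, "time": t, "remaining_life": None, "about_to_fail": None}
    for t in (100, 160, 161, 162, 163, 164)
]

# A failure notification arrives for time 163; the row at 164 stays unlabeled.
label_lagging(rows, failure_time=163)
```

Rows recorded after the failure keep their empty labels until the next failure event arrives.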
  • 23. Spark database connectors. Key Features: • Operate on data in Spark without data movement. • DB pushdown enables fast filtering and sorting. DB connectors enable you to use feature tables without massive ETL into Spark executors.
  • 24. Ingest Stream Feature Engineering in Spark Feature storage in NoSQL DB SQL analytics in Drill ODBC connect to data science tools. Kafka API CRUD API Failure events Sensor data Feature Engineering Demo: github.com/mapr-demos/predictive-maintenance
  • 25. Feature engineering with Spark: Define case class for incoming metrics. Subscribe to stream. Read from stream.
  • 26. Feature engineering with Spark: Create lagging variables. Derive _weekend feature. Save feature table to DB.
  • 27. Labeling lagging features with Spark: Subscribe to stream containing failure notifications. When there’s a failure, open the feature table. Calculate the timestamps for when we consider failure “imminent”.
  • 28. Labeling lagging features with Spark: Label “AboutToFail”. Label “RUL”. Combine the lagging feature updates into one df. Save to DB.
  • 29. Feature table size. Number of lagging variable records to update. Alert sent to Grafana. Listening to stream for failure events.
  • 30. Architecting for Fast Data • Continuous time signals require high speed sampling. • Full resolution is required. – Aggregation hides important things. – High fidelity makes AI more effective. • Challenges: – Stream throughput congestion? – Stream transformation backlog?
  • 31. Detecting Vibrations with FFTs • Vibrations give the first clue that a machine is failing • Vibration sensors measure physical displacement • Capturing a 10kHz vibration requires > 20k samples / second. Detecting vibration anomalies requires continuously processing high speed data streams.
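The sampling requirement above follows from the Nyquist criterion: a signal must be sampled at more than twice its frequency, or it shows up at the wrong frequency. The sketch below illustrates this with a naive O(n²) discrete Fourier transform written in plain Python. It is not an FFT and not the demo's code; the function name `dominant_frequency` and the 25 kHz and 12 kHz sample rates are assumptions chosen for illustration.

```python
import cmath
import math

def dominant_frequency(samples, sample_rate):
    """Return the frequency (Hz) of the strongest DFT bin below Nyquist."""
    n = len(samples)
    best_bin, best_mag = 0, 0.0
    for k in range(1, n // 2):  # skip the DC bin, stop below the Nyquist bin
        acc = sum(
            samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        if abs(acc) > best_mag:
            best_bin, best_mag = k, abs(acc)
    return best_bin * sample_rate / n

def sine(freq_hz, sample_rate, n):
    """Generate n samples of a unit-amplitude sine wave."""
    return [math.sin(2 * math.pi * freq_hz * t / sample_rate) for t in range(n)]

# Sampled above 20k samples/sec, the 10 kHz vibration is found where it belongs.
peak = dominant_frequency(sine(10_000, 25_000, 250), 25_000)

# Sampled at only 12k samples/sec, the same vibration aliases down to 2 kHz.
aliased = dominant_frequency(sine(10_000, 12_000, 120), 12_000)
```

A production pipeline would use an FFT library instead of this quadratic loop, but the sampling constraint is the same.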
  • 32. Can Spark distill fast data streams? (1 record / sec) >20k samples/sec Anomaly notifications Feature Store As long as Spark can process signals fast enough, this will work. What two things could go wrong?
  • 33. Spark scales via parallel compute. Anomaly notifications Feature Store If Spark computes FFTs too slowly, then just run more Spark jobs. What if there’s too much data for one stream?
  • 34. Stream bottlenecks can be avoided by distributing data across topics and/or partitions. Anomaly notifications Feature Store Tip: make sure each producer has its own topic. This can significantly improve throughput (msgs/sec). Tip: Spark consumers can subscribe to multiple topics, so subscribe to all.
  • 36. Three Types of Models for PdM 1. Regression: Predict the Remaining Useful Life (RUL). 2. Binary classification: Predict if an asset will fail within a certain time frame (e.g. 50 days). 3. Multi-class classification: Predict if an asset will fail in different time windows (e.g. tomorrow, next week…)
  • 37. Terminology: “units” vs “cell” http://colah.github.io/posts/2015-08-Understanding-LSTMs/ One LSTM cell 3 LSTM cells 3 memory cells 3 hidden nodes 3 hidden layers 3 neurons 3 units 1 visible layer Determining units is a process of trial and error.
  • 38. Model Input • Input shape: [samples, time steps, features] • Samples: these are all the rows in our time-series training data • Time steps: this is the look-back window, or sequence length. – LSTMs work better on windows with fewer than a couple hundred time steps. • Features: the columns holding the signals we want to generalize. Layers: Input → LSTM 1 (100 units, 0.2 dropout) → LSTM 2 (50 units, 0.3 dropout) → Dense (sigmoid) → 0 or 1 result
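One common way to build the [samples, time steps, features] input is to slide a fixed-length window over the feature table. The sketch below uses plain Python lists and is only an illustration: the function name `make_windows`, the last data row, and the labels are hypothetical, and a real pipeline would typically build a NumPy array of the same shape.

```python
def make_windows(series, labels, time_steps):
    """Slice a time series into overlapping look-back windows.

    series is a list of rows, each row a list of feature values.
    Each window is labeled with the label of its last row.
    """
    windows, targets = [], []
    for end in range(time_steps, len(series) + 1):
        windows.append(series[end - time_steps:end])
        targets.append(labels[end - 1])
    return windows, targets

# Five rows of (x, y, z) readings; the labels mark "about to fail".
series = [
    [0.431, 0.123, 0.145],
    [0.735, 0.112, 0.672],
    [0.932, 0.141, 0.431],
    [0.988, 0.241, 0.625],
    [0.990, 0.250, 0.640],
]
labels = [0, 0, 1, 1, 1]

X, y = make_windows(series, labels, time_steps=2)

# (samples, time steps, features)
shape = (len(X), len(X[0]), len(X[0][0]))
```

With windowing, the sample count is the number of windows rather than the number of raw rows, which is why five rows yield four samples here.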
  • 39. Model Structure. Sequential: linear stack of layers. Units: number of neurons. Dropout: reduces overfitting by randomly dropping neurons during training. Dense: fully connected output layer, here with a sigmoid activation. Sigmoid: squashes the output to a value between 0 and 1, which is read as a 0 or 1 result. Binary Cross Entropy: loss function for when you have just two classes (1 and 0). Adam Optimizer: learns fast, has low memory usage, and is stable over a wide range of learning rates. Layers: Input → LSTM 1 (100 units, 0.2 dropout) → LSTM 2 (50 units, 0.3 dropout) → Dense (sigmoid) → 0 or 1 result
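To make the output stage concrete, here is a small sketch of the sigmoid activation and the binary cross entropy loss in plain Python. It is not Keras code; the function names, the example logits, and the 0.5 decision threshold are assumptions made for illustration.

```python
import math

def sigmoid(z):
    """Squash any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def binary_cross_entropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross entropy between 0/1 labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # clip so log() never sees 0 or 1
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

# Hypothetical raw outputs of the final dense layer for three sequences.
logits = [-2.0, 0.0, 3.0]
probs = [sigmoid(z) for z in logits]

# Thresholding the probability yields the 0 or 1 result.
preds = [1 if p >= 0.5 else 0 for p in probs]

# The loss is small when labels agree with the predictions, large when they do not.
loss_good = binary_cross_entropy([0, 1, 1], probs)
loss_bad = binary_cross_entropy([1, 0, 0], probs)
```

The loss penalizes confident wrong answers most heavily, which is what pushes the network toward well-separated probabilities.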
  • 40. Training • An epoch is when you go over the complete training data once. • A batch size of 10 means we expose the network to 10 input sequences before updating the weights. • Batches also ensure we don’t try to load the entire training data into memory at once.
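The relationship between epochs, batches, and weight updates can be shown with a short counting loop. This sketch trains nothing; the helper name `batches` and the figure of 95 training sequences are hypothetical, while the batch size of 10 and the 5 epochs follow the values mentioned in the talk.

```python
import math

def batches(data, batch_size):
    """Yield consecutive slices of at most batch_size items."""
    for start in range(0, len(data), batch_size):
        yield data[start:start + batch_size]

n_sequences = 95  # hypothetical number of training sequences
batch_size = 10
epochs = 5

training_data = list(range(n_sequences))
updates = 0
for epoch in range(epochs):
    for batch in batches(training_data, batch_size):
        updates += 1  # a real training loop would update the weights here

# The last batch of each epoch holds only the 5 leftover sequences.
updates_per_epoch = math.ceil(n_sequences / batch_size)
```

Only one batch needs to be held in memory at a time, which is the practical reason batching matters for large training sets.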
  • 41. Generating data with log-synth https://github.com/tdunning/log-synth
  • 42. References https://github.com/mapr-demos/predictive-maintenance
  • 43. References • Awesome LSTM implementation for predictive maintenance on aircraft engines, by Fidan Boylu Uz, PhD: https://github.com/Azure/lstms_for_predictive_maintenance/blob/master/Deep%20Learning%20Basics%20for%20Predictive%20Maintenance.ipynb • Good LSTM overview, by Jason Brownlee, PhD: https://machinelearningmastery.com/5-step-life-cycle-long-short-term-memory-models-keras/
  • 44. Check out my webinars: bit.ly/iot_webinar_part1 bit.ly/iot_webinar_part2
  • 45. Questions? https://mapr.com/ebooks IAN DOWNARD idownard@mapr.com

Editor's Notes

  1. Other titles: Ways and Means of Predictive Maintenance with Machine Learning Demystifying Predictive Maintenance with Hands-On Machine Learning https://www.meetup.com/Portland-Machine-Learning-Meetup/
  2. And predictive maintenance. This, from a seminal study by the DoE. Note the date. This was way back in 2010! The opportunities for IoT are even bigger now.
  3. What makes ML difficult? There’s a lot of specialized software needed to put ML into prod. ML has a very different lifecycle too.
  4. I’m excited to talk about this because I think I have pretty good ideas about how to do it, and how to generate data and play with Keras. I work for MapR, but that’s not a big part of the story here. My intent is to draw on personal experiences building PdM to help you learn what’s involved in building PdM regardless of your tool choice.
  5. Advanced PdM not only involves time-series IIoT data, but also historical maintenance records, error logs, and machine and operator features. No matter what you plan to do with the data, it must persist somewhere.
  6. No matter what you plan to do with the data, it must persist somewhere. This is the point at which I need to mention MapR. MapR is dataware.
  7. MapR facilitates data science, model dev, and ML in prod.
  8. Scale storage, scale analytics, doing ML, putting analytical products into prod. If you struggle with data storage, and iterative data analysis, using datasets for production apps, check out MapR.
  9. Lots of tools to clean and analyze data. But data cleansing and feature engineering require lots of trial and error. So, any friction (e.g. data movement, schema discovery, proprietary query languages, etc.) in data access is bad. MapR reduces the barriers to saving data, analyzing it, augmenting it, and operationalizing it. Now let’s talk about data flows. What do the processes that pull from MQTT or REST look like? Where do they run? You can write them custom, or use a data pipeline tool, like StreamSets.
  10. “In the face of drift, in the face of change, in the face of unexpected data, changing business needs and logic, changing infrastructure, you're able to minimize the amount of downtime of the system and kind of keep it always on”
  11. Demo script:
  12. Now we go from talking about data collection to data transformation. I’ll describe PdM feature engineering concepts and show how to implement them.
  13. There may be some properties which correlate to failures. Those properties may be calculated on-the-fly or derived by joining other datasets. This requires data to be stored with flexible schemas.
  14. Derived features can make analysis much easier.
  15. Grouping sensors by system can also make analysis much easier.
  16. What if the data is sampled frequently? What if failures are rare?
  17. There may be some properties which correlate to failures. Those properties may be calculated on-the-fly or derived by joining other datasets. This requires data to be stored with flexible schemas.
  18. There may be some properties which correlate to failures. Those properties may be calculated on-the-fly or derived by joining other datasets. This requires data to be stored with flexible schemas.
  19. There may be some properties which correlate to failures. Those properties may be calculated on-the-fly or derived by joining other datasets. This requires data to be stored with flexible schemas.
  20. There may be some properties which correlate to failures. Those properties may be calculated on-the-fly or derived by joining other datasets. This requires data to be stored with flexible schemas.
  21. Here’s an example of MapR-DB being used from Spark to update lagging features.
  22. Unusual vibrations give you the first clue that a machine is nearing the end of its useful life, so it's very important to detect those anomalies. Vibration sensors measure the displacement or velocity of motion thousands of times per second. Acoustic sensors work the same way.
  23. Two things could go wrong: (1) there could be too much data, e.g. too many motors or data sources; (2) Spark SQL filtering and FFT computation could be too slow.
  24. Now we go from talking about industry trends to more of a how-to guide.
  25. There is no rule of thumb for the number of hidden nodes you should use. It is something you have to figure out through trial and error.
  26. Dropout forces better generalization
  27. Dropout forces better generalization. We must specify a loss function and an optimizer function when compiling the model. The loss function is a way of penalizing the model for low accuracy scores. We use binary cross entropy because we have just two classes (1 and 0). The optimizer defines how to adjust neuron weights in response to inaccurate predictions. The Adam optimizer makes sense, because I’ve read that Adam learns fast, is stable over a wide range of learning rates, and has comparatively low memory requirements. Keras uses a default learning rate of 0.001.
  28. I like to think of LSTM as doing an exponential rolling average. Training: I fit the network with 5 epochs and a batch size of 10. An epoch is when you go over the complete training data once. A batch size of 10 means we expose the network to 10 input sequences before updating the weights. Batches also ensure we don’t try to load the entire training data into memory at once. The fit function returns a history object that provides a summary of model accuracy recorded at each epoch.
  29. http://bit.ly/iot_webinar_part2