
Machine learning for predictive maintenance external

Machine Learning Practice Lead, Google. Authored AI chapter in NOAA book, research paper, and SQL Server book.
Dec. 4, 2018

  1. © 2018 Google LLC. All rights reserved. Prashant Dhingra: pkdhingra@google.com ML for Predictive Maintenance
  2. Agenda ● Predictive maintenance using machine learning - Common use cases and hypotheses - Demo and notebook - Schema and dataset - Techniques for creating labels - Techniques for creating features - Selecting an algorithm - Defining metrics for predictive maintenance ● Summary
  3. Why Predictive Maintenance. Save Money: Unforeseen downtime in production can cost millions of dollars in losses annually per region and can create environmental and life hazards. Increase Productivity: You can make your operations and monitoring teams more productive with predictive maintenance.
  4. Common use cases and hypotheses
  5. Imagine if you could… [ predict which vehicle will fail ]
  6. Imagine if you could… [ predict the remaining life of devices ]
  7. Imagine if you could… [ optimize energy consumption ]
  8. Imagine if you could… [ identify patterns and anomalies ] Create benchmarks and identify exceptions…
  9. Imagine if you could… [ predict the value of water flow even if the sensor is broken ]
  10. Define Hypothesis: What counts as a breakdown? Which signals show patterns of degradation or breakdown? How often are signals collected? How much normal and failure data do you have? How will you use the output of predictive maintenance?
  11. DEMO
  12. Three key challenges: (1) Collect data from equipment; (2) Create features; (3) Build a model and get actionable insights
  14. Forming the hypothesis ● Predict if equipment will fail in the next ‘N’ periods ● Predict if equipment will fail in the next ‘N1’, ‘N2’, … periods ● Predict if equipment will fail in the next ‘N’ periods due to a fault in part ‘X’ ● Predict time to failure or remaining life of equipment ● Find anomalies
  15. Techniques for creating labels
  16. Find degradation patterns in data [Figure: two plots against equipment age, one for heat, noise, and vibration, the other for speed, efficiency, pressure, and load]
  17. Example – sensor data
  18. Example – dataset for predictive maintenance
      Equipment details: make, model, configuration
      Usage history: age in days/hours, # of starts, dwell time
      Maintenance details: part replaced, date of replacement, service history
      IoT signal: heat, noise, vibration, voltage, images
      Also listed: # of people in building, vehicle load, CPU, memory; weather, load on equipment; manual, business rules, capacity; past breakdowns
      Data types: static data, time series data, frequently updated data, occasionally updated data
  19. Techniques for creating labels - Convert failure signals into a label - Convert degradation signals into a label
  20. Choose label for classification
      Signal value on Day 1 to Day 5: Normal (create negative label)
      Signal value on Day 6 to Day 9: About to Fail (create positive label)
      Signal value on Day 10: Fail
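A minimal sketch of this labeling scheme in Python. The function name, the `window` parameter, and the choice to label the failure day itself as positive are assumptions made for illustration; the slide does not say how the failure day is labeled.

```python
def classification_labels(num_days, failure_day, window):
    """Return one 0/1 label per day (days are numbered from 1).

    A day is positive (1) if the failure happens on that day or within
    the next `window` days; otherwise it is negative (0).
    Assumption: the failure day itself is treated as positive.
    """
    labels = []
    for day in range(1, num_days + 1):
        days_to_failure = failure_day - day
        labels.append(1 if 0 <= days_to_failure <= window else 0)
    return labels


# Ten days of signal, failure on day 10, "about to fail" window of 4 days.
labels = classification_labels(num_days=10, failure_day=10, window=4)
print(labels)  # [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
```

The window length is a modeling choice: a longer window gives more positive examples and earlier warnings, at the cost of labeling healthier-looking data as "about to fail".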
  21. Create label for regression: From failure data, calculate how much life was left at 1 day, 2 days, … 7 days before failure. [Figure: timeline ending at Fail, marking life left at periods ‘N-7’, ‘N-5’, and ‘N-1’]
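A short sketch of the regression label, assuming one record per day and a known failure day. The function and variable names are hypothetical.

```python
def remaining_life_labels(observation_days, failure_day):
    """Return remaining life (in days) for each observed day up to failure.

    Days after the failure are skipped because there is no life left to label.
    """
    return [failure_day - day for day in observation_days if day <= failure_day]


# Records from day 3 through day 10, with the failure on day 10.
days = list(range(3, 11))
life_left = remaining_life_labels(days, failure_day=10)
print(life_left)  # [7, 6, 5, 4, 3, 2, 1, 0]
```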
  22. Craft new features. Features: ● Count ● Sum ● Minimum ● Maximum ● Average ● Median ● Standard deviation ● Derivative ● 2nd derivative ● Rolling average ● Tumbling average
  23. Capture trend changes ● Tumbling aggregates: Select a tumbling aggregate period ‘P’ of size ‘N’ time units. Split the records into consecutive, non-overlapping periods and compute the aggregate for each period. ● Rolling aggregates: Select a rolling aggregate period ‘P’ of size ‘N’ time units. For each record, compute the aggregate over the period ‘P’ leading up to that record.
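The difference between the two aggregates can be sketched in plain Python. This is an illustration under assumptions: the aggregate is a mean, records are evenly spaced, the rolling window includes the current record, and windows at the start of the series are allowed to be shorter than ‘N’.

```python
def rolling_mean(values, n):
    """Mean of the last n values up to and including each position."""
    out = []
    for i in range(len(values)):
        window = values[max(0, i - n + 1): i + 1]
        out.append(sum(window) / len(window))
    return out


def tumbling_mean(values, n):
    """Mean of each consecutive, non-overlapping block of n values."""
    out = []
    for start in range(0, len(values), n):
        block = values[start:start + n]
        out.append(sum(block) / len(block))
    return out


signal = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
rolling = rolling_mean(signal, 3)    # one value per record
tumbling = tumbling_mean(signal, 3)  # one value per period
print(rolling)   # [1.0, 1.5, 2.0, 3.0, 4.0, 5.0]
print(tumbling)  # [2.0, 5.0]
```

Rolling aggregates keep one feature value per record, which suits per-record prediction; tumbling aggregates reduce the data to one value per period.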
  24. Algorithms
  25. Algorithms for predictive maintenance
      Problem types and the questions they answer: Classification (Will it fail?); Multiclass classification (Will it fail for reason X?); Regression (After how long will it fail?); Anomaly detection (Is the behavior anomalous?)
      Algorithms listed: traditional ML (e.g. Random Forest, SVM, Decision Trees; RF regression); Deep Neural Network (DNN) classification and DNN regression; Recurrent Neural Network (RNN), Long Short Term Memory (LSTM); AutoEncoder, Conditional AutoEncoder; Hidden Markov chain; MASF
  26. Anomaly detection ● An anomaly is a pattern in the data that does not conform to normal data ● An anomaly detection system estimates what is normal behavior and what is abnormal behavior
      Techniques and when to use them:
      Point anomalies: individual data points are out of range
      Contextual anomalies: individual instances are out of range in a specific context
      Collective/sequence anomalies: the sequence is not usual, or the combination is not usual
      Challenges: Abnormal behavior is skewed and evolving.
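As an illustration of the simplest case, point anomalies, the sketch below estimates a normal range from known-good readings and flags values outside it. Using mean ± k standard deviations is an assumption chosen for this example, not a method named in the deck, and the readings are made-up values.

```python
import statistics


def fit_normal_range(normal_values, k=3.0):
    """Estimate a normal range as mean +/- k population standard deviations."""
    mean = statistics.mean(normal_values)
    std = statistics.pstdev(normal_values)
    return mean - k * std, mean + k * std


def point_anomalies(values, low, high):
    """Return the indices of values that fall outside [low, high]."""
    return [i for i, v in enumerate(values) if v < low or v > high]


# Hypothetical sensor readings collected during normal operation.
normal = [20.0, 21.0, 19.0, 20.5, 19.5, 20.0]
low, high = fit_normal_range(normal)

# New readings: the second and fourth are far outside the normal range.
flagged = point_anomalies([20.2, 35.0, 19.8, 5.0], low, high)
print(flagged)  # [1, 3]
```

A fixed range like this cannot catch contextual or sequence anomalies, and it needs refitting as normal behavior evolves.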
  27. TensorFlow helps you train models at scale [Diagram: Input Feature → Model → Predicted Value; Predicted Value and True Value → Cost; update model based on cost]
  28. Metrics
  29. F1 score – Will equipment fail? ● Precision answers the question: out of the equipment classified “will fail,” what fraction were correct? ● Recall answers the question: out of the equipment that actually failed, what fraction did the classifier pick up? ● F1 score is the harmonic mean of precision and recall
  30. F1 score – precision and recall
      Confusion matrix (tp = 5, tn = 5, fp = 2, fn = 1):
                                    fail    not fail
      Classified “will fail”        tp      fp
      Classified “will not fail”    fn      tn
      Precision = tp/(tp+fp) = 5/(5+2)
      Recall = tp/(tp+fn) = 5/(5+1). Also called sensitivity and true positive rate
      Accuracy = (tp+tn)/(tp+tn+fp+fn)
      F1 = 2*(precision*recall)/(precision+recall) = 2*(5/7*5/6)/(5/7+5/6)
      * Reduce fp to avoid maintenance cost. Improve precision
      * Reduce fn to avoid failure cost
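The arithmetic on this slide can be checked with a few lines of Python. Exact fractions are used here so the results are not blurred by rounding; the variable names are chosen for this example.

```python
from fractions import Fraction

# Counts from the slide's confusion matrix.
tp, tn, fp, fn = 5, 5, 2, 1

precision = Fraction(tp, tp + fp)            # 5/7
recall = Fraction(tp, tp + fn)               # 5/6
accuracy = Fraction(tp + tn, tp + tn + fp + fn)
f1 = 2 * (precision * recall) / (precision + recall)

print(precision, recall, accuracy, f1)  # 5/7 5/6 10/13 10/13
print(round(float(f1), 3))              # 0.769
```

F1 and accuracy coincide here only because tp equals tn in this example; in general they differ.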
  31. An ROC curve [Figure: TP rate (0 to 1) plotted against FP rate (0 to 1); each point on the curve is the TP vs. FP rate at one decision threshold]
  32. Evaluation metrics: AUC ● AUC = area under the ROC curve ● Intuition: Gives an aggregate measure of performance across all possible classification thresholds ● Interpretation: If we pick a random positive and a random negative, what is the probability that the model scores them in the correct relative order?
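The ranking interpretation can be computed directly by comparing every positive with every negative. This is a sketch for small datasets, with made-up scores; counting tied scores as half correct is a common convention assumed here.

```python
def auc_by_pairs(scores, labels):
    """Fraction of (positive, negative) pairs the model ranks correctly."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    correct = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5  # assumption: ties count as half correct
    return correct / (len(positives) * len(negatives))


# Hypothetical model scores; label 1 means the equipment failed.
scores = [0.9, 0.8, 0.4, 0.3, 0.2]
labels = [1, 0, 1, 0, 0]
auc = auc_by_pairs(scores, labels)
print(auc)  # 5 of 6 pairs are ordered correctly
```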
  33. Metrics for imbalanced datasets. Accuracy is not a good measure: if 0.5% of examples are failures, classifying everything as good gives 99.5% accuracy. Consider these instead: ● TPrate = tp/(tp+fn) gives the fraction of failure instances correctly classified ● TNrate = tn/(tn+fp) gives the fraction of normal instances correctly classified ● FPrate = fp/(fp+tn) gives the fraction of normal instances misclassified as failures ● FNrate = fn/(fn+tp) gives the fraction of failure instances misclassified as normal ● ROC curve
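A small sketch of why accuracy misleads here, assuming 1,000 examples with a 0.5% failure rate and a classifier that labels everything as good. The function name and dictionary keys are chosen for this example.

```python
def rates(tp, tn, fp, fn):
    """Compute accuracy and the four per-class rates from a confusion matrix."""
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "tp_rate": tp / (tp + fn),
        "tn_rate": tn / (tn + fp),
        "fp_rate": fp / (fp + tn),
        "fn_rate": fn / (fn + tp),
    }


# 1,000 examples, 5 failures; the classifier never predicts "will fail".
always_good = rates(tp=0, tn=995, fp=0, fn=5)
for name, value in always_good.items():
    print(f"{name}: {value:.3f}")
```

Accuracy comes out at 0.995 while the TP rate is 0 and the FN rate is 1: every failure is missed.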
  34. Metrics for multiclass classification
      ● Micro average method: sum up the individual tp, fp, and fn of each class
        Micro precision = (tp1 + tp2 + … + tpn) / (tp1 + tp2 + … + tpn + fp1 + fp2 + … + fpn)
        Micro recall = (tp1 + tp2 + … + tpn) / (tp1 + tp2 + … + tpn + fn1 + fn2 + … + fnn)
      ● Macro average method: compute the metric independently for each class and then take the average
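The two averaging methods can give quite different numbers when class sizes differ. A minimal sketch for precision, with hypothetical failure classes and counts:

```python
from fractions import Fraction


def micro_precision(counts):
    """Pool tp and fp over all classes, then compute one precision."""
    tp = sum(c["tp"] for c in counts.values())
    fp = sum(c["fp"] for c in counts.values())
    return Fraction(tp, tp + fp)


def macro_precision(counts):
    """Compute precision per class, then take the unweighted average."""
    per_class = [Fraction(c["tp"], c["tp"] + c["fp"]) for c in counts.values()]
    return sum(per_class) / len(per_class)


# Hypothetical per-class counts: a common failure type and a rare one.
counts = {
    "bearing": {"tp": 10, "fp": 10},
    "motor": {"tp": 1, "fp": 0},
}
micro = micro_precision(counts)
macro = macro_precision(counts)
print(micro, macro)  # 11/21 3/4
```

Micro averaging is dominated by the frequent class, while macro averaging weights each class equally regardless of how often it occurs.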
  35. RMSE metric – What is remaining life?
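RMSE (root mean squared error) is the square root of the average squared difference between predicted and actual values. A minimal sketch for remaining-life predictions, with made-up numbers in days:

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical actual vs. predicted remaining life, in days.
actual_life = [10, 5, 2]
predicted_life = [8, 5, 4]
error = rmse(actual_life, predicted_life)
print(round(error, 3))  # 1.633
```

Because errors are squared before averaging, a few large misses raise RMSE more than many small ones, and the result is in the same unit as the label (days here).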
  36. Predictive Maintenance on Edge
  37. Three key challenges in deploying on Cloud: Latency (high performance); Network (offline, low bandwidth); Resource constraint (power, computation, memory)
  38. TensorFlow Lite: A lightweight machine learning library for mobile and embedded devices. Easier; optimized set of core kernels; Faster; low overhead, static execution; Smaller binaries; quantized kernels
  39. ML lifecycle with Cloud IoT Edge [Diagram: build and train ML model in the cloud (device data, aggregated data, operational data; build or bring your own model); deploy trained ML model; predict & act at the edge (edge device running Cloud IoT Edge with Edge ML, Edge IoT Core, and Edge TPU; real-time analytics & ML)]
  40. Google Cloud IoT platform: Accelerate business agility and decision making with IoT data [Diagram components: Edge Device (Linux OS; CPU, GPU, Edge TPU) running Cloud IoT Edge (Edge ML, Edge IoT Core; real-time analytics & ML); Cloud IoT Core; Cloud Pub/Sub; Cloud Functions; Cloud Dataflow; Cloud Bigtable; BigQuery; Cloud Machine Learning; Cloud Datalab. Flows: data, control, training data, serving, insights, update device config, update config & deploy ML model. Also shown: device partners, application partners, service partners]
  41. Thank you
  42. [Figure: timeline from start of equipment life to the end of life of equipment 1, 2, 3, and 4]