Failure prediction for
APU's on a Metro
System
Aaryadev Ghosalkar
Agenda
• Introduction
• Problem Statement
• Literature Review
• Existing Systems
• Modelling
• Results
• Conclusion
• Future Scope
Introduction
The air production unit (APU) is a component on most modern metro systems
that circulates compressed air through the metro. Our research presents a
comprehensive comparison of predictive maintenance (PdM) models for APU
failure detection. Our goal is to detect failures at least 2 hours in advance.
The particular challenge with APUs is that failures are rare events: the
dataset contains only 3 failures within 6 months of operation.
Problem Statement
• Detect failures at least 2 hours in advance
• Create a system that is customizable and can expand the warning to
more than 2 hours if required (keeping in mind the side effects that
may arise)
• Provide a way to deploy the model for real-time inference
Literature
Review
Taking a look at what other
researchers have done
Literature Review
• Veloso et al. performed the initial data collection, which
involved converting data from the metro company
database into a CSV file
• We used some of the ideas for data preprocessing
presented by the AzureML team at Microsoft
• A study by Davari et al. motivated many of the algorithms
that we used; their work has been a great resource in
our study
Literature Review continued
• Chaudhuri et al. used SVM to classify vehicles into 3
distinct risk levels, focusing on model interpretability,
that is, how easy it is to understand the choices made by
the model and its results
• Various other researchers have used Gramian angular
fields (GAF) to convert time-series data into images and
then used CNNs for classification; the most notable
study, by Silva, reported an accuracy of 93%
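As a rough illustration of the GAF idea mentioned above, the sketch below builds a Gramian angular summation field from a short series using only the standard library. The function name and the toy series are hypothetical, and the slides do not say whether the cited work used the summation or the difference variant, so the summation form here is an assumption.

```python
import math

def gramian_angular_summation_field(series):
    """Encode a 1-D series as a 2-D image G[i][j] = cos(phi_i + phi_j)."""
    lo, hi = min(series), max(series)
    if hi == lo:
        raise ValueError("series must contain at least two distinct values")
    # Rescale to [-1, 1] so that arccos is defined for every point.
    scaled = [2 * (x - lo) / (hi - lo) - 1 for x in series]
    # Clamp to guard against tiny floating-point overshoot before arccos.
    phi = [math.acos(max(-1.0, min(1.0, s))) for s in scaled]
    return [[math.cos(a + b) for b in phi] for a in phi]

# Toy series (hypothetical values, not taken from the APU dataset).
image = gramian_angular_summation_field([0.0, 1.0, 2.0])
for row in image:
    print([round(v, 3) for v in row])
```

The resulting square matrix can be treated as a single-channel image and passed to a CNN, which is the step the cited studies rely on.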
Literature Review continued
• Nguyen and Medjaher used an LSTM to predict the
probability of failure in a given time window; we drew a lot
of inspiration from their work in terms of balancing the
dataset and the model used
• Most early research into remaining useful life (RUL)
estimation revolves around statistical models or models
that assume a linear degradation pattern; it was only
after 2016 that deep learning models were used in this
field
Existing
Systems
Predictive maintenance has been
used in many domains, such as
elevators and jet engines.
The Infrastructure abroad
London
• Uses NLP on text fields where engineers describe the problem and
how the problem was solved.
• Also predicts if a failure is about to happen and which component
will fail
• TfL expects to save £3 million a year
Australia and New Zealand
• TrainDNA collects and analyzes real-time data from more than 200
trains across Australia using IBM Maximo
• 51% increase in reliability after introducing the TrainDNA system
• Considers multiple failures
• Large-scale system able to process 30 million messages every hour
Modelling
Taking a look at how the data
was processed and what ML
models were tested
Data Preprocessing
Collection
Most of the data collection and cleaning work on this data set was
done by the researchers who created it.
Thus this data set did not have any null values.
EDA
• Linear models are a bad idea
• There is no obvious pattern that distinguishes each class, which
further motivated the use of deep learning
• Balance is key!
Preparation
• Discretizing the data to make it more suitable for classification
• Generalizing the model using a parameter to control the number of
hours before warning.
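The preparation step, turning the data into classification labels with a tunable number of warning hours, could look roughly like the sketch below. The function name `label_rows`, the timestamps, and the failure interval are all hypothetical; treating rows inside the failure itself as positive is also an assumption, since the slides do not spell out the labelling rule.

```python
from datetime import datetime, timedelta

def label_rows(timestamps, failures, warning_hours=2):
    """Return 1 for rows inside a failure or within `warning_hours` before it, else 0."""
    horizon = timedelta(hours=warning_hours)
    labels = []
    for ts in timestamps:
        # A row is positive if it falls in [failure start - horizon, failure end].
        positive = any(start - horizon <= ts <= end for start, end in failures)
        labels.append(1 if positive else 0)
    return labels

# Hypothetical hourly readings from 08:00 to 14:00 and one failure from 12:00 to 13:00.
day = datetime(2020, 1, 1)
timestamps = [day + timedelta(hours=h) for h in range(8, 15)]
failures = [(day + timedelta(hours=12), day + timedelta(hours=13))]

labels_2h = label_rows(timestamps, failures, warning_hours=2)
labels_4h = label_rows(timestamps, failures, warning_hours=4)
print(labels_2h)
print(labels_4h)
```

Widening the warning window increases the number of positive rows, which is one of the side effects to keep in mind when expanding beyond 2 hours.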
Challenges in data processing
• The sheer volume of data presented a significant
challenge: there were more than 10 000 000 (1 crore) rows
in the dataset. To solve this, we carefully changed the
data types of the features while minimizing the loss of
information. This significantly reduced the size of the
data, with a 91% decrease in storage space and a 77%
decrease in RAM usage
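The idea of shrinking data types can be sketched with the standard library `array` module. This is only an illustration of the principle: the columns below are hypothetical, the helper `smallest_int_typecode` is not from the original work, and the savings it shows are unrelated to the 91% and 77% figures reported above.

```python
from array import array

def smallest_int_typecode(values):
    """Return the narrowest signed array typecode that can hold every value."""
    lo, hi = min(values), max(values)
    for code in ("b", "h", "i", "q"):
        bits = 8 * array(code).itemsize
        if -(2 ** (bits - 1)) <= lo and hi <= 2 ** (bits - 1) - 1:
            return code
    raise OverflowError("values do not fit in a 64-bit signed integer")

# Hypothetical on/off sensor flags stored as 64-bit integers by default.
flags = [0, 1, 1, 0] * 1000
wide = array("q", flags)
narrow = array(smallest_int_typecode(flags), flags)
saving = 1 - (narrow.itemsize * len(narrow)) / (wide.itemsize * len(wide))
print(f"integer column: {saving:.1%} smaller, values unchanged: {list(narrow) == flags}")

# Hypothetical analogue readings: 32-bit floats halve the size at a small precision cost.
readings = [9.358, 9.34, 8.248]
as_f32 = array("f", readings)
max_err = max(abs(a - b) for a, b in zip(readings, as_f32))
print(f"float column: max rounding error {max_err:.2e}")
```

Checking the rounding error before committing to a narrower float type is one way to keep the loss of information under control.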
Challenges in Data processing
continued
• The highly imbalanced nature of the data also presented
a huge challenge when training the models. APU failures
are rare events and only 3% of the data corresponds to
APU failure, so a model that predicts every data point as
normal would have an accuracy of 97%. To combat this
we used NearMiss undersampling and a convenience
sampling-like strategy for the LSTM, inspired by Chen et al.
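The two points in this slide can be sketched in plain Python: why accuracy is misleading at a 97/3 split, and how a NearMiss-style undersampler picks majority points. The NearMiss variant used in the study is not stated, so the version-1 rule below (keep the majority points closest on average to their nearest minority points) is an assumption, and all data points are hypothetical.

```python
import math

def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def recall(y_true, y_pred, positive=1):
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    return tp / (tp + fn) if tp + fn else 0.0

def near_miss_1(majority, minority, k=3):
    """Keep len(minority) majority points with the smallest mean distance
    to their k nearest minority points (NearMiss version 1 style)."""
    def score(point):
        dists = sorted(math.dist(point, m) for m in minority)
        return sum(dists[:k]) / min(k, len(dists))
    return sorted(majority, key=score)[: len(minority)]

# A classifier that always predicts "normal" on a 97/3 split.
y_true = [0] * 97 + [1] * 3
y_pred = [0] * 100
print(accuracy(y_true, y_pred), recall(y_true, y_pred))

# Hypothetical one-feature points: five normal readings, two failure readings.
normal = [(0.0,), (1.0,), (5.0,), (9.0,), (10.0,)]
failure = [(9.5,), (10.5,)]
kept = near_miss_1(normal, failure, k=2)
print(kept)
```

The always-normal classifier reaches high accuracy while detecting no failures at all, which is why class 1 precision and recall are the more informative metrics for this problem.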
Machine Learning Models
SVM
• SVMs take very long to train
• Results are not satisfactory and most other models perform better
Decision Trees
• Very quick training time
• The hyperparameters, such as the criterion, do not make any
difference to the accuracy
Random Forests
• Provide greater accuracy than decision trees due to this being
an ensemble model
• Training time is similar to decision trees
Deep Learning Models
Neural Network
• Used a small neural network of 25K parameters with the AdamW
optimizer
• Best results of all tested models
• Does not take into account the time series nature of the data
• Requires CUDA for efficient deployment.
LSTM Network
• Can incorporate temporal patterns
• Large model of almost 184K parameters, which makes training
difficult.
• Requires 3rd order tensors as input for training, thus balancing
the data is hard
• Also requires CUDA for efficient training and deployment
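The 3rd order tensor requirement for the LSTM means reshaping the flat table of readings into (samples, timesteps, features). A minimal sketch of that reshaping with nested lists follows; the helper name `make_windows`, the toy rows, and the choice to label each window by its last row are assumptions made for illustration.

```python
def make_windows(rows, labels, timesteps):
    """Slice a 2-D series (rows x features) into overlapping windows.

    Returns X with shape (samples, timesteps, features) and y with one
    label per window, taken from the window's last row (an assumption).
    """
    X, y = [], []
    for end in range(timesteps, len(rows) + 1):
        X.append(rows[end - timesteps:end])
        y.append(labels[end - 1])
    return X, y

# Hypothetical series: 5 time steps, 2 features each.
rows = [[i, i * 10] for i in range(5)]
labels = [0, 0, 0, 1, 1]
X, y = make_windows(rows, labels, timesteps=3)
shape = (len(X), len(X[0]), len(X[0][0]))
print(shape, y)
```

Because consecutive windows share most of their rows, dropping or duplicating individual rows to balance the classes is no longer straightforward, which is the difficulty noted in the LSTM column.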
Results
Model                    Accuracy   Precision (Class 1)   Recall (Class 1)
SVM                      0.57       0.60                  0.62
Decision Tree            0.66       0.69                  0.70
Random Forest            0.70       0.69                  0.78
Neural Network (Adam)    0.72       0.67                  0.91
Neural Network (AdamW)   0.76       0.75                  0.87
LSTM Network             0.76       0.64                  0.85
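Precision and recall pull in different directions in this table, so a single combined figure can make the comparison easier. The sketch below derives an F1 score from the reported class 1 precision and recall; the F1 values are not part of the original results and are computed only from the two reported columns.

```python
# Class 1 precision and recall as reported in the results table.
reported = {
    "SVM": (0.60, 0.62),
    "Decision Tree": (0.69, 0.70),
    "Random Forest": (0.69, 0.78),
    "Neural Network (Adam)": (0.67, 0.91),
    "Neural Network (AdamW)": (0.75, 0.87),
    "LSTM Network": (0.64, 0.85),
}

def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

f1_by_model = {name: f1_score(p, r) for name, (p, r) in reported.items()}
best_model = max(f1_by_model, key=f1_by_model.get)

for name, f1 in f1_by_model.items():
    print(f"{name:<24} F1 = {f1:.2f}")
print("highest derived F1:", best_model)
```

Since the inputs are already rounded to two decimals, the derived scores should be read as approximate.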
Conclusion
• Neural networks produce satisfactory results; however, the
random forest model can be used when deploying on
low-end hardware or considering an edge solution.
LSTMs perform nearly identically to neural networks with
less data
• Microservices can be considered when deploying, as this
will allow different parts of the model to be deployed
separately
• Multicollinearity is present in the data, which can make
it hard to fit any kind of linear model
Future Scope
• Transformer-based architectures can be considered, as
they have shown promising results with sequential data in
NLP applications
• Given the imbalanced nature of the data and the rarity of
APU failures on a real metro system, an anomaly
detection approach may be better, as it would not
need the data to be balanced and a lot more of the
existing data could be used
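To make the anomaly detection suggestion concrete, here is a deliberately simple sketch: a profile is fitted on normal readings only and new readings are flagged by their distance from it. The z-score rule, the threshold of 3, and the pressure-like values are all assumptions chosen for illustration; a real system would likely use a richer model.

```python
from statistics import mean, stdev

def fit_normal_profile(normal_readings):
    """Summarise normal operation as (mean, sample standard deviation)."""
    return mean(normal_readings), stdev(normal_readings)

def is_anomaly(value, profile, threshold=3.0):
    """Flag a reading more than `threshold` standard deviations from the mean."""
    mu, sigma = profile
    return abs(value - mu) > threshold * sigma

# Hypothetical readings collected during normal operation only.
normal_readings = [9.0, 9.1, 8.9, 9.2, 8.8, 9.0]
profile = fit_normal_profile(normal_readings)

print(is_anomaly(9.1, profile))
print(is_anomaly(7.5, profile))
```

No failure examples are needed to fit the profile, so the abundant normal data can be used in full without any rebalancing.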
Question and
Answer
Aaryadev Ghosalkar
aaryadevg@gmail.com
https://github.com/aaryadevg
