A Machine Learning Project by
Pranov Mishra
Preventive Maintenance
Recommendations
Executive Summary
• A thorough analysis was done to identify whether there are ways of knowing which machines have
a higher probability of breaking down. The ultimate goal of the management is to improve
the productivity of the company by ensuring minimal or no stoppage of work at any point
in time.
• The idea of reviewing the data is to come up with an implementable framework and establish
protocols that give visibility into machine health status and allow remedial steps to be
taken proactively, before an actual breakdown. The post-analysis summary and recommendations are
given below:
I. Machines delivered by Provider3 break down much earlier, as early as 60 months.
Management needs to discuss whether to continue with Provider3
and/or initiate discussions with them to improve the quality of their delivered
products.
II. In the interim, mandate a monthly review of all Provider3 machines aged more than 60
months.
III. Mandate a monthly review of all machines older than 72.5 months that are provided by
Providers 1, 2 and 4.
IV. Essentially all machines older than 72.5 months will need monthly preventative
Data set Summary
• The data-set has 90,000 observations.
• The data-set contains historical information on whether a machine has broken
down or not, along with the predictor variables that supposedly play a role in
deciding the overall health and longevity of the machines in use.
• There are 7 variables; the variable “broken” indicates whether the machine
has broken down or not.
• The variables and the key initial insights summarizing them are given below.
Variable Name    Data Type     Max      Min     Levels
lifetime         Numeric       93       1       NA
broken           Numeric*      1        0       *Needs to be converted to a factor variable with 2 levels: 0 and 1
pressureInd_1    Numeric       173.28   33.48   NA
pressureInd_2    Numeric       128.60   58.55   NA
pressureInd_3    Numeric       172.54   42.28   NA
team             Categorical   NA       NA      3: TeamA, TeamB and TeamC
provider         Categorical   NA       NA      4: Provider1, Provider2, Provider3 and Provider4
Approach to Solution
• All machines undergo wear and tear over their lifetime and require constant monitoring to
ensure that their breakdown thresholds are not breached, thereby extending their longevity.
• The goal here is to analyze the data to identify the variables that actually contribute to the wear and
tear of machines and thereby shorten the lifetime of a machine.
• The next goal is to assess and calculate the thresholds that will work as early warning
indicators, triggering timely repair and preventing an early breakdown.
• The approach involves thorough exploratory analysis and building a predictive model
to call out early warning indicators:
I. Identify whether there are distinct patterns that point to what specifically contributes to a
breakdown.
II. Check whether the distribution of the number of machines, broken down or otherwise, is the
same or different across all levels of teams and providers.
III. Check the lifetime of machines, both broken and otherwise, across all combinations
of teams and providers.
IV. Partition the data into training and testing datasets, to build the model on the former and
test it on the latter.
V. Build a model to identify which factors are statistically significant in terms of
contributing to machine breakdown.
VI. Identify the thresholds of the individual factors, and/or combinations of them, that will trigger
inspection and appropriate work prior to a breakdown.
Approach to Solution – Initial Insights
The initial data exploration suggests that no machine with a lifetime of less than 60 months has broken down
(see below). Hence one approach is to select all observations with a lifetime greater than 60 months and
explore them further to identify any significant factor contributing towards machine breakdown.
Variable Profiling – Continuous Variables
• Upon binning the lifetime, it was found that the highest percentage of machine breakdowns happens
in the “lifetime” range of 88-93. However, the minimum age at which a
machine breaks down is 60, as seen in the previous slide. Breakdown is more than
50% in every grouping after a machine has crossed 60 months.
• A similar analysis of the pressure indicators showed no pattern. As the
average pressure increases across the groups, the breakdown percentage does not exhibit any
distinct trend.
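The binning step described above can be sketched in R as follows. This is a minimal illustration on synthetic data, not the deck's actual code: the data frame `toy`, the bin edges and the 75% breakdown rate are assumptions chosen only to make the example run.

```r
# Synthetic stand-in for the real data (all values are illustrative)
set.seed(1)
n <- 1000
lifetime <- sample(1:93, n, replace = TRUE)
# Assumption for the toy data: nothing breaks before 60 months
broken <- ifelse(lifetime >= 60, rbinom(n, 1, 0.75), 0)
toy <- data.frame(lifetime, broken)

# Bin lifetime into groups; these break points are a hypothetical choice
toy$life_bin <- cut(toy$lifetime, breaks = c(0, 59, 66, 73, 80, 87, 93))

# Percentage of broken machines within each lifetime bin
break_pct <- round(100 * tapply(toy$broken, toy$life_bin, mean), 1)
print(break_pct)
```

The same `cut()` and `tapply()` pattern could be applied to each pressure indicator to check whether the breakdown percentage moves with pressure.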
Variable Profiling – Categorical Variables
Analysis of the categorical variables individually with the target variable is shown below.
Machines from Providers 1 and 3 seem to contribute more to breakdowns.
Machines used by TeamB seem to be experiencing many more breakdowns than the
machines used by other teams.
Data Analysis - Exploration & Visualization
Lifetime Comparison of Machines
The average lifetime of machines that are broken is almost double that of
machines that are not broken. This is expected, and a good sign, since the machines that
are broken served the company for a long time before breaking down; the newer
machines would be expected to serve for close to 78 months on average before breaking
down.
Data Analysis - Exploration & Visualization
Comparisons - Pressure Versus Machine Health Status
There does not seem to be any significant difference in the average pressures at
any of the pressure indicator points for machines that have broken down versus
machines that have not broken down. There needs to be further multivariate
analysis to understand if interaction of the pressure with other variables plays a
role or not.
Data Analysis - Exploration & Visualization
Defect Proportion comparisons by absolute Numbers
Data Analysis - Exploration & Visualization
Pressure Indicator1 V Lifetime
Pressure Indicator1 does not give any major insight, as pressure values are consistent
across all combinations of providers and teams. A similar pattern is seen for both broken and
non-broken machines. However, what we can infer is that machines with TeamC tend to break down
earlier than machines with TeamA and TeamB.
Data Analysis - Exploration & Visualization
Pressure Indicator1 V Lifetime – Filtered by Machines = Broken
On sub-setting the data to only the machines that have broken
down, we see that the pressure indicator is consistent throughout, but TeamC machines break down
much earlier and the lifetime values differ across the providers. Lifetime values are lowest
for Provider3, followed by Provider1 and Provider4.
Data Analysis - Exploration & Visualization
Pressure Indicators(2 & 3) V Lifetime – Filtered by Machines = Broken
Exactly the same observation was made for the pressures at indicator points 2 and 3. The graphs
below demonstrate this.
Data Analysis - Exploration & Visualization
Data Split for further analysis
• For further analysis the data is subset by filtering out all machines with an age of less than 60
months, since all such machines are found to be in good health across all
variables. After sub-setting the data, the attempt is to identify the significant factors
contributing to a machine breakdown and work towards developing a strategy to use this
information to improve the longevity of the machines.
• After sub-setting we have 47,550 observations with the same 7 variables: 35,522
machines that have broken down and 12,028 machines that are in good health (about 25%
of the total). The new distribution is shown below. The new insights suggest
that Provider4 seems to be providing the best machines and TeamA seems to be handling the
machines best.
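The sub-setting step can be sketched as below. The name `MydataNew` appears later in the deck; the input name `Mydata`, the synthetic values and the inclusive `>= 60` boundary are assumptions made for illustration, since the slides do not show this code.

```r
# Synthetic stand-in for the full data set (illustrative values only)
set.seed(2)
n <- 2000
Mydata <- data.frame(
  lifetime = sample(1:93, n, replace = TRUE),
  provider = factor(sample(paste0("Provider", 1:4), n, replace = TRUE)),
  team     = factor(sample(c("TeamA", "TeamB", "TeamC"), n, replace = TRUE))
)
# Assumption for the toy data: no breakdowns before 60 months
Mydata$broken <- ifelse(Mydata$lifetime >= 60, rbinom(n, 1, 0.75), 0)

# Youngest machine that has broken down
min_broken_age <- min(Mydata$lifetime[Mydata$broken == 1])

# Keep only machines aged 60 months or more (boundary choice is an assumption)
MydataNew <- subset(Mydata, lifetime >= 60)
# Convert the target to a two-level factor, as the data summary suggests
MydataNew$broken <- factor(MydataNew$broken, levels = c(0, 1))

# Share of good (0) and broken (1) machines within each provider
provider_share <- prop.table(table(MydataNew$provider, MydataNew$broken), margin = 1)
print(round(provider_share, 3))
```

Swapping `provider` for `team` in the last step gives the corresponding comparison across teams.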
Data Analysis - Exploration & Visualization
Outlier Analysis of numeric variables
The box plots of all the numeric variables are shown below. The pressures at
indicator points 1, 2 and 3 have outliers. An analysis of these outliers
found that about two-thirds of them belong to machines that are broken and about one-third
(32.85%) to machines that are not broken. The total number of
outlier observations is 1385, which is less than 3% of total observations. The
client needs to be asked whether the outlier values are plausible or whether we need to cap them
at cut-offs. In the current scenario, assuming the outlier values are possible (since the outliers
do not follow a specific pattern for broken status), we will build the model without further
treatment of them.
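One way to reproduce this kind of outlier check is sketched below, assuming the usual box-plot rule (points beyond 1.5 times the interquartile range from the hinges). The deck does not state which rule was used, and the data here is synthetic with a few planted extreme values.

```r
# Synthetic pressure readings with a few planted extremes (illustrative only)
set.seed(3)
n <- 500
toy <- data.frame(
  pressureInd_1 = rnorm(n, mean = 100, sd = 20),
  broken = rbinom(n, 1, 0.75)
)
toy$pressureInd_1[1:5] <- c(250, 260, -50, 270, -60)

# boxplot.stats() returns the points a box plot would draw as outliers
out_vals <- boxplot.stats(toy$pressureInd_1)$out
is_out <- toy$pressureInd_1 %in% out_vals

# Outlier count and its share of all observations, in percent
n_out <- sum(is_out)
share_of_data <- 100 * n_out / nrow(toy)

# Split of the outliers by broken status, in percent
out_by_status <- round(100 * prop.table(table(toy$broken[is_out])), 2)
print(out_by_status)
```

If the client confirms that such values are not plausible, the flagged rows could be capped at the whisker ends instead of being left untreated.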
Model Building
As mentioned earlier, for the model building effort the observations with a machine
lifetime of less than 60 months are eliminated. The new data has 47,550
observations of 7 variables.
The data is further split into training and testing datasets in the ratio of 75:25. The code
for the split is below.
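# Number of rows in the filtered data and the 75:25 cut point
count.rows <- nrow(MydataNew)
train.end.row <- round(count.rows * 0.75)
test.start.row <- train.end.row + 1
# Fix the seed so the random shuffle is reproducible
set.seed(1234)
# Shuffle the rows by ordering on uniform random numbers
Mydata_Random <- MydataNew[order(runif(count.rows)), ]
# First 75% of the shuffled rows for training, the rest for testing
Train <- Mydata_Random[1:train.end.row, ]
Test <- Mydata_Random[test.start.row:count.rows, ]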
Model Building
A model is built using the CART technique to predict the likelihood of a machine breaking down.
The model is built on the training dataset. The first model is built with a reasonable
restriction: a node must contain at least 3000 observations to qualify for splitting.
The tree is allowed to grow fully, with the complexity parameter set to zero. The tree is shown
below.
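A fully grown classification tree of this kind could be fitted as sketched below. This assumes the `rpart` package, which the deck does not name explicitly; the object name `ModelFull`, the synthetic training data and the scaled-down `minsplit = 300` are illustrative assumptions (the slide's setting is 3000 on the real training set).

```r
library(rpart)

# Synthetic training data (illustrative only)
set.seed(4)
n <- 4000
Train <- data.frame(
  lifetime = sample(60:93, n, replace = TRUE),
  pressureInd_1 = rnorm(n, 100, 20),
  provider = factor(sample(paste0("Provider", 1:4), n, replace = TRUE))
)
# Toy rule: older machines and Provider3 machines break more often
p_break <- ifelse(Train$lifetime > 78 | Train$provider == "Provider3", 0.95, 0.10)
Train$broken <- factor(rbinom(n, 1, p_break), levels = c(0, 1))

# Grow the tree fully: cp = 0, with a minimum node size required for a split.
# minsplit is scaled down here because the toy data is much smaller.
ModelFull <- rpart(broken ~ ., data = Train, method = "class",
                   control = rpart.control(minsplit = 300, cp = 0))

# Complexity table used later to choose a pruning level
printcp(ModelFull)
```

Setting `cp = 0` removes the complexity penalty, so `minsplit` is the only thing stopping the tree from splitting further.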
Model Building
The model was then pruned by increasing the complexity parameter to 0.04000147. The
optimal cp is arrived at by looking at the cptable of the fully grown model. The cptable shows
that the optimal number of terminal nodes is 5, which is reinforced by the scree plot below. The cp
value that corresponds to 5 terminal nodes is 0.04000147. See below.
  CP          nsplit  rel error  xerror     xstd
1 0.25237359  0       1.0000000  1.0000000  0.009075164
2 0.09295650  2       0.4952528  0.4952528  0.006913607
3 0.04000147  4       0.3093398  0.3093398  0.005609610
4 0.00000000  7       0.1893354  0.3093398  0.005609610
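The choice of cp from the table above can be reproduced with a few lines of base R. The sketch below re-enters the reported cptable by hand and picks the smallest tree whose cross-validated error (`xerror`) is at the minimum; this selection rule is an assumption about how the value was chosen, though it does yield the cp quoted in the deck.

```r
# The cptable values as reported in the deck
cptable <- matrix(
  c(0.25237359, 0, 1.0000000, 1.0000000, 0.009075164,
    0.09295650, 2, 0.4952528, 0.4952528, 0.006913607,
    0.04000147, 4, 0.3093398, 0.3093398, 0.005609610,
    0.00000000, 7, 0.1893354, 0.3093398, 0.005609610),
  ncol = 5, byrow = TRUE,
  dimnames = list(NULL, c("CP", "nsplit", "rel error", "xerror", "xstd"))
)

# which.min() returns the first row at the minimum, i.e. the smallest such tree
best_row <- which.min(cptable[, "xerror"])
best_cp <- unname(cptable[best_row, "CP"])

# A binary tree with k splits has k + 1 terminal nodes
n_leaves <- unname(cptable[best_row, "nsplit"]) + 1

print(c(cp = best_cp, terminal_nodes = n_leaves))
```

Rows 3 and 4 tie on `xerror`, so the extra three splits of the full tree bring no cross-validated gain. With a fitted `rpart` object, the pruned tree would then typically be obtained with `prune(fit, cp = best_cp)`, where `fit` is a placeholder name.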
Final Model
The final model is shown below. There are 5 terminal nodes, but the nodes of greatest interest
are the 3 that predict which machines will break down. Before extracting the rules, let's
assess the model performance (next slide).
Model Performance Assessment
The final model is tested by applying it to unseen data and measuring the prediction accuracy. The
accuracy of the model is found to be 97.60%. Code below:
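# Predict the class (0/1) for each row of the held-out test set
Pred <- predict(ModelFinal, newdata = Test, type = "class")
# Confusion matrix of actual versus predicted status
CT_Cart <- table(Actual = Test$broken, Predicted = Pred)
# Accuracy = correctly classified (diagonal) / all test observations
Acc_Cart <- sum(diag(CT_Cart)) / sum(CT_Cart)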
The ROC curve for the model is shown below; it has an AUC of 98.45%, which is a very strong result.
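The slide does not show how the AUC was computed. As a hedged illustration, the AUC can be obtained in base R from the ranks of the predicted probabilities (the Mann-Whitney formulation), which is a swapped-in technique rather than the author's code. The function name `auc_rank` and the small example vectors are hypothetical.

```r
# AUC as the probability that a randomly chosen broken machine is scored
# higher than a randomly chosen healthy one (rank-based formula)
auc_rank <- function(score, label) {
  # label: 1 = broken, 0 = not broken; score: predicted probability of "broken"
  r <- rank(score)                 # average ranks are used for ties
  n_pos <- sum(label == 1)
  n_neg <- sum(label == 0)
  (sum(r[label == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Tiny worked example with hypothetical scores
label <- c(0, 0, 1, 1)
score <- c(0.1, 0.4, 0.35, 0.8)
auc_example <- auc_rank(score, label)
print(auc_example)  # 0.75
```

For a fitted tree, the scores would be the predicted probabilities of the "broken" class on the test set, for example the second column of `predict(ModelFinal, newdata = Test, type = "prob")`.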
Rules from the Model to Identify the Machines that may require Preventive care
The rules extracted from the model for the terminal nodes are given below. The focus
is mainly on the first 3 nodes below, as they have a very high proportion of machines that break down.
The first 3 nodes constitute 82% of the total observations and contain all the machines that have a
high probability of breaking down. Hence this is a very good split: once the 17.542% of the total
population that constitutes the last 2 nodes is identified, 100% of the machines with a high
probability of breaking down are isolated in the remaining nodes. These machines can then be looked
into for preventive maintenance to prevent a breakdown.
Recommendations
• Create a framework that mandates a monthly preventative maintenance review for all
machines that reach 78.5 months of age (lifetime). The average age of machines that have
crossed 78.5 months and have broken down is 85.3 months, and there are instances of
machines breaking down at 79, 80 and 81 months. Hence preventive maintenance can either
increase longevity or prompt the management to replace the machine in case it is likely to
break down. Either way it's a proactive measure to prevent sudden stoppage of work due to
lack of knowledge of when a machine will break down.
• The same framework needs to be applied to all machines that are provided by Provider3
and are older than 60 months (and less than 78.5).
• For all machines that are in the range of 72.5 months to 78.5 months where the provider is
not Provider3, consistent monitoring as mentioned above is required. This essentially
combines Rule 3 and Rule 4 from the previous slide. Though Rule 4 states that there is no need
to monitor machines aged between 75 and 78.5 months if they are not from Provider3, it is sensible
to monitor them, as machines with a lifetime below 75 months have broken down, as seen
in Rule 2.
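Taken together, the three recommendations reduce to a simple flagging rule, sketched below. The function name `needs_monthly_review`, the example machines and the strict treatment of the boundary ages (60 and 72.5 months) are assumptions made for illustration.

```r
# Flag machines for a monthly preventive maintenance review:
#  - any machine older than 72.5 months (covers the 72.5-78.5 band and beyond)
#  - any Provider3 machine older than 60 months
needs_monthly_review <- function(lifetime, provider) {
  lifetime > 72.5 | (provider == "Provider3" & lifetime > 60)
}

# Hypothetical machines used only to exercise the rule
machines <- data.frame(
  lifetime = c(45, 65, 65, 74, 80),
  provider = c("Provider3", "Provider3", "Provider1", "Provider2", "Provider4")
)
machines$review <- needs_monthly_review(machines$lifetime, machines$provider)
print(machines)
```

Here the 65-month Provider3 machine is flagged while the 65-month Provider1 machine is not, which mirrors the earlier threshold recommended for Provider3.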

Sampling (random) method and Non random.ppt
 
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxBPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
 
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signals
 
ALSO dropshipping via API with DroFx.pptx
ALSO dropshipping via API with DroFx.pptxALSO dropshipping via API with DroFx.pptx
ALSO dropshipping via API with DroFx.pptx
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
 
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
 
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service OnlineCALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysis
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
 
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightCheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFx
 
Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...
 

Recommendations for Preventive Maintenance - A Machine Learning Project

  • 1. Preventive Maintenance Recommendations: A Machine Learning Project by Pranov Mishra
  • 2. Executive Summary
      A thorough analysis was done to identify whether there are ways of knowing which machines have a higher probability of breaking down. Management's ultimate goal is to improve the company's productivity by ensuring minimal or no stoppage of work at any time.
      The purpose of reviewing the data is to arrive at an implementable framework and to establish protocols that give visibility into machine health and allow remedial steps to be taken before an actual breakdown. The summary and recommendations from the analysis are given below:
      I. Machines delivered by Provider3 break down much earlier, as early as 60 months. Management needs to discuss whether to continue with Provider3 and/or ask them to improve the quality of their delivered products.
      II. In the interim, mandate a monthly review of all Provider3 machines older than 60 months.
      III. Mandate a monthly review of all machines older than 72.5 months that are supplied by Providers 1, 2 and 4.
      IV. Essentially all machines older than 72.5 months will need monthly preventative
  • 3. Data set Summary
      The data set has 90,000 observations.
      The data set consists of historical information on whether a machine has broken down or not, together with the predictor variables that supposedly play a role in the overall health and longevity of the machines in use.
      There are 7 variables, with the variable "broken" indicating whether the machine had broken down or not.
      The variables and the key initial insights summarizing them are given below.

      Variable Name   Data Type     Max      Min     Levels
      lifetime        Numeric       93       1       NA
      broken          Numeric*      1        0       Needs to be converted to a factor variable with 2 levels (0 and 1)
      pressureInd_1   Numeric       173.28   33.48   NA
      pressureInd_2   Numeric       128.60   58.55   NA
      pressureInd_3   Numeric       172.54   42.28   NA
      team            Categorical   NA       NA      3 (TeamA, TeamB and TeamC)
      provider        Categorical   NA       NA      4 (Provider1, Provider2, Provider3 and Provider4)
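The data summary notes that `broken` is stored as a number and needs to be converted to a two-level factor. A minimal sketch of that conversion in R is given here; the data frame name `Mydata` and all of its values are assumptions made up for illustration and are not the project data.

```r
# Hypothetical stand-in for the project data; only the column names
# follow the variable table, the values are invented.
Mydata <- data.frame(
  lifetime = c(12, 65, 80, 93),
  broken   = c(0, 1, 1, 0),
  team     = c("TeamA", "TeamB", "TeamC", "TeamA"),
  provider = c("Provider1", "Provider3", "Provider2", "Provider4"),
  stringsAsFactors = FALSE
)

# Convert the 0/1 indicator to a factor with explicit levels,
# and treat team and provider as categorical as well.
Mydata$broken   <- factor(Mydata$broken, levels = c(0, 1))
Mydata$team     <- factor(Mydata$team)
Mydata$provider <- factor(Mydata$provider)

str(Mydata)
```

Fixing the levels explicitly keeps both classes present even if a later subset happens to contain only one of them.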
  • 4. Approach to Solution
      All machines undergo wear and tear over their lifetime and require constant monitoring to ensure that their breakdown thresholds are not breached, thereby extending their longevity.
      The first goal is to analyze the data to identify the variables that actually contribute to wear and tear and so shorten the lifetime of a machine.
      The next goal is to calculate the thresholds that will work as early warning indicators, triggering timely repair and preventing an early breakdown.
      The approach involves thorough exploratory analysis and building a predictive model to call out early warning indicators:
      I. Identify whether there are distinct patterns that point to what specifically contributes to a breakdown.
      II. Check whether the distribution of the number of machines, broken down or otherwise, is the same or different across all levels of teams and providers.
      III. Check the lifetime of machines, both broken and otherwise, across all combinations of teams and providers.
      IV. Partition the data into training and testing datasets, to build the model on the former and test it on the latter.
      V. Build a model to identify which factors are statistically significant contributors to machine breakdown.
      VI. Identify the thresholds, for individual factors and/or combinations of factors, that will trigger inspection and appropriate work prior to a breakdown.
  • 5. Approach to Solution – Initial Insights The initial data exploration suggests that no machine with a lifetime of less than 60 months has broken down. See below. Hence one approach is to select all observations with a lifetime of at least 60 months and explore them further to identify any significant factor contributing to machine breakdown.
  • 6. Variable Profiling – Continuous Variables
      Upon binning the lifetime, it was found that the highest percentage of machine breakdowns happens in the "lifetime" range of 88-93. However, the minimum age at which a machine breaks down is 60, as was seen in the previous slide. Breakdown is more than 50% in every grouping after a machine has crossed 60 months.
      A similar analysis of the pressure indicators showed no pattern. As the average pressure increases across the groups, the breakdown percentage does not exhibit any distinct trend.
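The binning described for lifetime can be sketched as follows. This is an illustration only: the data are simulated, the bin edges are an assumption (the slide only names the 88-93 range), and the variable names are hypothetical.

```r
# Simulated data: lifetime in months and a 0/1 breakdown flag.
# By construction, no machine younger than 60 months is broken.
set.seed(1)
lifetime <- sample(1:93, 500, replace = TRUE)
broken   <- ifelse(lifetime >= 60, rbinom(500, 1, 0.7), 0)

# Bin lifetime; these break points are chosen for illustration only.
bins <- cut(lifetime, breaks = c(0, 59, 65, 71, 77, 83, 87, 93))

# Percentage of broken machines within each lifetime bin.
break_pct <- round(100 * tapply(broken, bins, mean), 1)
print(break_pct)
```

The same `cut()` plus `tapply()` pattern can be applied to each pressure indicator to check whether the breakdown percentage moves with pressure.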
  • 7. Variable Profiling – Categorical Variables Analysis of each categorical variable against the target variable is shown below. Providers 1 and 3 seem to contribute more to machine breakdowns. Machines used by TeamB seem to experience far more breakdowns than the machines used by other teams.
  • 8. Data Analysis - Exploration & Visualization Lifetime Comparison of Machines The average life of machines that are broken is almost double that of machines that are not broken. This is expected and a good sign, since the broken machines served the company for a long time before breaking down, and the newer machines would be expected to serve for close to 78 months on average before breaking down.
  • 9. Data Analysis - Exploration & Visualization Comparisons - Pressure Versus Machine Health Status There does not seem to be any significant difference in the average pressure at any of the pressure indicator points between machines that have broken down and machines that have not. Further multivariate analysis is needed to understand whether the interaction of pressure with other variables plays a role.
  • 10. Data Analysis - Exploration & Visualization Defect Proportion comparisons by absolute Numbers There does not seem to be any significant difference in the average pressures at any of the pressure indicator points for machines that have broken down versus machines that have not broken down. There needs to be further multivariate analysis to understand if interaction of the pressure with other variables plays a role or not.
  • 11. Data Analysis - Exploration & Visualization Pressure Indicator1 vs Lifetime Pressure Indicator1 does not give any major insight, as pressure values are consistent across all combinations of providers and teams. A similar pattern is seen for both broken and non-broken machines. What we can infer, however, is that machines with TeamC tend to break down earlier than machines with TeamA and TeamB.
  • 12. Data Analysis - Exploration & Visualization Pressure Indicator1 vs Lifetime – Filtered by Machines = Broken When the data is subset to only the machines that have broken down, the pressure indicator is consistent throughout, but TeamC machines break down much earlier and the lifetime values differ across providers. Lifetime values are lowest for Provider3, followed by Provider1 and Provider4.
  • 13. Data Analysis - Exploration & Visualization Pressure Indicators (2 & 3) vs Lifetime – Filtered by Machines = Broken Exactly the same observation was made for the pressures at indicator points 2 and 3. The graphs below demonstrate this.
  • 14. Data Analysis - Exploration & Visualization Data Split for further analysis
      For further analysis the data is subset by filtering out all machines with a lifetime of less than 60 months, since all such machines are found to be in good health across all variables. After subsetting, the aim is to identify the significant factors contributing to machine breakdown and to develop a strategy that uses this information to improve the longevity of the machines.
      After the split there are 47,550 observations with the same 7 variables: 35,522 machines that have broken down and 12,028 machines that are in good health, so about 25% of the subset is in good health. The new distribution is shown below. The new insights suggest that Provider4 seems to be providing the best machines and TeamA seems to be handling the machines best.
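The subsetting step and the quoted proportion can be checked with a few lines of R. The counts are taken from the slide; the toy data frame `Mydata` is an assumption used only to show the filter, and `MydataNew` follows the name used later in the model-building code.

```r
# Toy data frame (made-up values) to illustrate the filter.
Mydata <- data.frame(lifetime = c(10, 59, 60, 75, 93),
                     broken   = c(0, 0, 1, 0, 1))

# Keep machines with a lifetime of at least 60 months.
MydataNew <- Mydata[Mydata$lifetime >= 60, ]
print(nrow(MydataNew))

# Counts reported on the slide for the real subset.
n_subset  <- 47550
n_broken  <- 35522
n_healthy <- 12028

# The two classes should add up to the subset size.
stopifnot(n_broken + n_healthy == n_subset)

# Share of machines still in good health, in percent.
pct_healthy <- round(100 * n_healthy / n_subset, 1)
print(pct_healthy)
```

The healthy share comes out at roughly a quarter of the subset, in line with the "about 25%" stated on the slide.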
  • 15. Data Analysis - Exploration & Visualization Outlier Analysis of numeric variables The box plots of all the numeric variables are shown below. The pressures at indicator points 1, 2 and 3 have outliers. An analysis of these outliers found that about two thirds of them align with machines that are broken, while about one third (32.85%) align with machines that are not broken. The total number of outlier observations is 1,385, which is less than 3% of total observations. The client needs to be asked whether the outlier values are plausible or whether they should be imputed with cut-offs. For now, assuming the outlier values are possible (since the outliers do not show a specific pattern for broken status), the model is built without further treatment of them.
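One way to reproduce this kind of outlier check is with `boxplot.stats()`, which applies the same 1.5 × IQR rule that a default box plot uses. The sketch below runs on simulated readings; the variable names and values are assumptions and the resulting percentages are not the project's figures.

```r
# Simulated pressure readings with a few extreme values appended,
# plus a made-up 0/1 breakdown flag for each reading.
set.seed(42)
pressure <- c(rnorm(200, mean = 100, sd = 10), 30, 35, 170, 175)
broken   <- rbinom(length(pressure), 1, 0.75)

# Values beyond 1.5 * IQR from the hinges are reported in $out.
out_vals   <- boxplot.stats(pressure)$out
is_outlier <- pressure %in% out_vals

# Number of outliers and their split by breakdown status.
outlier_share <- prop.table(table(broken = broken[is_outlier]))
print(sum(is_outlier))
print(round(100 * outlier_share, 2))
```

If the outliers were concentrated in one class, that would argue for treating them; a split close to the overall class balance supports leaving them in, which is the choice made on the slide.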
  • 16. Model Building As mentioned earlier, observations with a machine lifetime of less than 60 months are eliminated for the model-building effort. The dimensions of the new data are 47,550 obs. of 7 variables. The data is further split into training and testing datasets in the ratio 75:25. The code for the split is below.

      # Row counts for a 75:25 split
      count.rows <- nrow(MydataNew)
      train.end.row <- round(count.rows * 0.75)
      test.start.row <- train.end.row + 1

      # Shuffle the rows reproducibly, then cut at the 75% mark
      set.seed(1234)
      Mydata_Random <- MydataNew[order(runif(count.rows)), ]
      Train <- Mydata_Random[1:train.end.row, ]
      Test <- Mydata_Random[test.start.row:count.rows, ]
  • 17. Model Building A model is built using the CART technique to predict the likelihood of a machine breaking down. The model is built on the training dataset. The first model is built with a reasonable restriction: a node must contain at least 3,000 observations to qualify for splitting. The tree is allowed to grow fully, with the complexity parameter set to zero. The tree is shown below.
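The slides do not show the call that fits the tree. A plausible sketch, assuming the `rpart` package was used (the later `cptable` output suggests so), is given here on simulated data. The name `ModelFull`, the formula `broken ~ .` and the simulated breakdown rule are all assumptions for illustration.

```r
library(rpart)

# Simulated training data shaped loosely like the project data.
set.seed(1234)
n <- 20000
Train <- data.frame(
  lifetime      = sample(60:93, n, replace = TRUE),
  pressureInd_1 = rnorm(n, mean = 100, sd = 20),
  provider      = factor(sample(paste0("Provider", 1:4), n, replace = TRUE))
)

# Made-up rule: old machines and Provider3 machines break more often.
p_break <- ifelse(Train$lifetime > 78 | Train$provider == "Provider3", 0.95, 0.10)
Train$broken <- factor(rbinom(n, 1, p_break), levels = c(0, 1))

# Classification tree: a node needs at least 3000 observations before a
# split is attempted, and cp = 0 applies no complexity penalty while growing.
ModelFull <- rpart(
  broken ~ .,
  data    = Train,
  method  = "class",
  control = rpart.control(minsplit = 3000, cp = 0)
)

print(ModelFull$cptable)
```

Growing the tree with `cp = 0` first and pruning afterwards lets the cross-validated error in the `cptable` guide the final tree size.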
  • 18. Model Building The model was then pruned by increasing the complexity parameter to 0.04000147. The optimal cp is arrived at by looking at the cptable of the fully grown model. The cptable shows that the optimal number of terminal nodes is 5, which is reinforced by the scree plot below. The cp value that corresponds to 5 terminal nodes is 0.04000147. See below.

              CP  nsplit  rel error     xerror         xstd
      1  0.25237359       0  1.0000000  1.0000000  0.009075164
      2  0.09295650       2  0.4952528  0.4952528  0.006913607
      3  0.04000147       4  0.3093398  0.3093398  0.005609610
      4  0.00000000       7  0.1893354  0.3093398  0.005609610
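The choice of cp can be reproduced from the cptable values alone. The sketch below copies the numbers from the slide into a data frame and picks the smallest tree that reaches the minimum cross-validated error; the column name `rel_error` and the object names are assumptions.

```r
# cptable values copied from the slide.
cptable <- data.frame(
  CP        = c(0.25237359, 0.09295650, 0.04000147, 0.00000000),
  nsplit    = c(0, 2, 4, 7),
  rel_error = c(1.0000000, 0.4952528, 0.3093398, 0.1893354),
  xerror    = c(1.0000000, 0.4952528, 0.3093398, 0.3093398)
)

# A binary tree with k splits has k + 1 terminal nodes.
cptable$terminal_nodes <- cptable$nsplit + 1

# which.min() returns the first minimum, i.e. the smallest tree
# among those tied for the lowest cross-validated error.
best_row <- which.min(cptable$xerror)
best_cp  <- cptable$CP[best_row]
print(cptable[best_row, ])

# With a fitted tree available, pruning would then look like this
# (ModelFull is a hypothetical name for the fully grown tree):
# ModelFinal <- rpart::prune(ModelFull, cp = best_cp)
```

Rows 3 and 4 have the same cross-validated error, so the extra three splits in row 4 reduce training error without improving out-of-sample error, which is why the 5-node tree is preferred.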
  • 19. Final Model The final model is shown below. There are 5 terminal nodes, but the nodes of greatest interest are the 3 that predict which machines will break down. Before extracting the rules, let's assess the model performance (next slide).
  • 20. Model Performance Assessment The final model is tested by applying it to unseen data and measuring its prediction accuracy. The accuracy of the model is found to be 97.60%. Code below:

      # Predict classes on the held-out data and build the confusion table
      Pred <- predict(ModelFinal, newdata = Test, type = "class")
      CT_Cart <- table(Actual = Test$broken, Predicted = Pred)

      # Accuracy = correctly classified / all observations
      Acc_Cart <- sum(diag(CT_Cart)) / sum(CT_Cart)

      The ROC curve for the model is shown below; it has an AUC of 98.45%, which is an excellent result.
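The slide reports the AUC but not how it was computed. As an illustration, the AUC can be obtained in base R from predicted scores using the rank-sum (Mann-Whitney) identity. The helper `auc_rank` and the scores and labels are hypothetical; the ROCR package's `prediction()` and `performance()` functions are a common alternative.

```r
# AUC via the rank-sum identity: the probability that a randomly chosen
# positive case receives a higher score than a randomly chosen negative one.
auc_rank <- function(score, label) {
  r     <- rank(score)            # average ranks are used for ties
  n_pos <- sum(label == 1)
  n_neg <- sum(label == 0)
  (sum(r[label == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Made-up predicted probabilities of breakdown and true 0/1 labels.
score <- c(0.95, 0.90, 0.80, 0.60, 0.40, 0.30, 0.20, 0.10)
label <- c(1,    1,    1,    0,    1,    0,    0,    0)

print(auc_rank(score, label))
```

For a fitted classification tree, the scores would be the predicted probability of the "broken" class on the test set rather than the predicted class labels.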
  • 21. Rules from the Model to Identify the Machines that may require Preventive care The rules extracted from the model for the terminal nodes are given below. The focus is mainly on the first 3 nodes, as they have a very high proportion of machines that break down. The first 3 nodes account for about 82% of the total observations and contain all the machines that have a high probability of breaking down. This is a very good split: once the 17.542% of the population that makes up the last 2 nodes is set aside, 100% of the machines with a high probability of breaking down are identified. These machines can then be reviewed for preventive maintenance to avert a breakdown.
  • 22. Recommendations
      Create a framework that mandates a monthly preventative maintenance review for all machines that reach 78.5 months of age (lifetime). The average age of machines that have crossed 78.5 months and have broken down is 85.3 months, and there are instances of machines breaking down at 79, 80 and 81 months. Preventive maintenance can therefore either increase longevity or prompt management to replace a machine that is likely to break down. Either way, it is a proactive measure to prevent sudden stoppage of work caused by not knowing when a machine will break down.
      The same framework needs to be applied to all machines that are provided by Provider3 and are older than 60 months (and less than 78.5 months).
      For all machines between 72.5 and 78.5 months old whose provider is not Provider3, consistent monitoring as described above is required. This essentially combines Rule 3 and Rule 4 from the previous slide. Though Rule 4 states that there is no need to monitor machines aged between 75 and 78.5 months if they are not from Provider3, it is sensible to monitor them, as machines with a lifetime of less than 75 months have broken down, as seen in Rule 2.
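Taken together, the recommendations reduce to a simple screening rule: review Provider3 machines once they pass 60 months, and all other machines once they pass 72.5 months. A minimal sketch of that rule in R follows; the function name, the treatment of the exact boundary values and the sample machines are assumptions for illustration.

```r
# Hypothetical helper: TRUE when a machine falls under monthly review.
# Strict inequalities at 60 and 72.5 months are an assumption.
needs_monthly_review <- function(lifetime, provider) {
  (provider == "Provider3" & lifetime > 60) | (lifetime > 72.5)
}

# Made-up machines to exercise the rule.
machines <- data.frame(
  lifetime = c(55, 61, 61, 73, 80),
  provider = c("Provider3", "Provider3", "Provider1", "Provider2", "Provider4")
)
machines$review <- needs_monthly_review(machines$lifetime, machines$provider)
print(machines)
```

Because the function is vectorized, it can be applied to the whole machine inventory at once to produce the monthly review list.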