Fault Prediction using
Logistic Regression
Using Python 3.5
17-Mar-17 binayakdutta@gmail.com 1
Preface
• This deck illustrates the considerations and method for using logistic regression, and analytics in general
• For the illustration, a hypothetical wind-turbine-based electricity generation system is considered, along with its associated IT systems
• The learning and testing data is mocked up. Range-bound random weights are added to mimic the natural randomness in data
• The statistics behind the models and the overall data/analytics architecture are not the focus and are saved for later
• Python 3.5 with its libraries (pandas, statsmodels and matplotlib) is used to implement the regression models
Approach to Modeling
• Acquire, Prepare, Analyze
• Select & Develop Model
• Train, Test & Evaluate Model
• Implement Model in Solution
• Re-evaluate & Re-train Model
Problem Statement & Solution Overview
[Diagram labels: Wind Turbines; Sensor Data; Asset & Work Mgmt. System; Weather Data; Asset & Incident Data; Data Prep. for Analytics; Fault Prediction Analytics; MIS Data; Preventive Alerts; Preventive Maintenance; Incident & Action Log]
Problem: For wind turbines, a duration-based maintenance policy does not factor in the dynamic operational stats of the assets or their ambient conditions. Hence interventions are sometimes too early and sometimes too late
Solution: Preserve data on each asset, its condition and its fault history, as well as weather conditions. Build an analytic model that learns from past data and predicts failures
Acquire, Prepare, Analyze Data
In this context, let's assume that the following information is available or can be made available:
• Equipment Identifier: uniquely identifies an instance of a wind turbine
• Equipment Type: classifies whether the wind turbine transmission is Geared or Direct Drive
• Tower Height: height of the wind turbine tower (in meters)
• Historical Daily Status of Equipment (Operational/Failed) [to be predicted]
• Weather Data (Temperature and Wind Speed) for each equipment location, which can be made available from external sources
Further, we will assume that:
• All such data sets are available for historical dates and will be available for future dates
For our exercise, we will take two sets of data:
• January data, which will be used to train the model (file name = Logit_Wind_Data_Train_File.csv)
• February data, which will be used to test the model (file name = Logit_Wind_Data_Test_File.csv)
Acquire, Prepare, Analyze Data
Read and analyze available data points
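The code for this step is not part of the extracted text, so the following is only a minimal sketch of what reading and profiling the data could look like. Because the CSV files are not available here, it builds a mocked-up frame instead of calling `pd.read_csv`. The column names (`EQUIP_ID`, `EQUIP_TYPE`, `TOWER_HT`, `BLADE_RPM`, `TEMPERATURE`, `WIND_SPEED`), the row count and the relationship used to generate `OPS_STATUS` are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd


def make_mock_turbine_data(n_rows=500, seed=0):
    """Build a mocked-up daily status table (hypothetical column names)."""
    rng = np.random.default_rng(seed)
    df = pd.DataFrame({
        "EQUIP_ID": rng.integers(1, 21, size=n_rows),        # 20 hypothetical turbines
        "EQUIP_TYPE": rng.choice(["GEARED", "DIRECT_DRIVE"], size=n_rows),
        "TOWER_HT": rng.choice([50, 60, 70], size=n_rows),   # meters
        "BLADE_RPM": rng.uniform(0, 30, size=n_rows),        # revolutions per minute
        "TEMPERATURE": rng.uniform(0, 60, size=n_rows),      # degrees Celsius
        "WIND_SPEED": rng.uniform(0, 30, size=n_rows),       # meters per second
    })
    # Assumed relationship, for illustration only: hotter, windier days and
    # geared transmissions make a DOWN status (0) somewhat more likely.
    is_geared = (df["EQUIP_TYPE"] == "GEARED").astype(float)
    log_odds = 4.0 - 0.5 * is_geared - 0.05 * df["TEMPERATURE"] - 0.05 * df["WIND_SPEED"]
    prob_up = 1.0 / (1.0 + np.exp(-log_odds))
    df["OPS_STATUS"] = (rng.random(n_rows) < prob_up).astype(int)
    return df


# With the real file one would use, for example:
#   train = pd.read_csv("Logit_Wind_Data_Train_File.csv")
train = make_mock_turbine_data()

print(train.dtypes)                                       # column types
print(train.describe())                                   # ranges of the numeric columns
print(train["OPS_STATUS"].value_counts(normalize=True))   # share of UP vs DOWN records
```

`describe()` gives the value ranges and `value_counts(normalize=True)` shows how unbalanced the UP/DOWN outcome is.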
Acquire, Prepare, Analyze Data
• OPS_STATUS, our dependent variable, can take only binary values. For most records it is UP (1), while for a small portion it is DOWN (0)
• TOWER HEIGHT takes the values 50, 60 and 70 meters
• BLADE RPM ranges from 0 to 30 revolutions per minute
• TEMPERATURE fluctuates between 0 and 60 degrees Celsius
• WIND speed varies between 0 and 30 meters per second for the given sample
Prepare Data & Develop Model
• Create a dummy variable for the categorical attribute
• Set up the predictor variables for prediction
• Set up the logistic regression model
Train, Test & Evaluate Model with Training and Test Data Sets
• A lower z-statistic and higher p-value for [TOWER_HT] mean the null hypothesis cannot be rejected (suggesting that this predictor does not have a significant influence on the outcome)
• A higher standard error for [INTERCEPT] indicates high variability in that coefficient (possibly suggesting other, unidentified predictor variables)
• The standard error, z-statistic and p-value of the other predictors look healthy
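For reference, the quantities discussed above are related as follows. The summary table itself is not part of the extracted text, so this assumes it is the standard statsmodels Logit summary, in which the z column is the Wald statistic and the p-value is two-sided. Here \hat{\beta}_j is the estimated coefficient of predictor j, SE is its standard error and \Phi is the standard normal CDF.

```latex
H_0:\ \beta_j = 0, \qquad
z_j = \frac{\hat{\beta}_j}{\mathrm{SE}(\hat{\beta}_j)}, \qquad
p_j = 2\,\bigl(1 - \Phi(|z_j|)\bigr)
```

A coefficient that is small relative to its standard error gives a z-statistic near zero and a large p-value, which is the pattern described for [TOWER_HT].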
Train, Test & Evaluate Model with Training and Test Data Sets
[Annotated prediction example; inputs labeled: TYPE = GEARED, RPM of BLADE, TEMPERATURE, WIND SPEED, INTERCEPT]
• OPS_STATUS close to 1: predicting Up & Running
• OPS_STATUS close to 0: predicting potential failure
• OPS_STATUS close to 0.5: indicating an indecisive prediction
As the evaluation suggested that [TOWER_HT] is inconsequential for [OPS_STATUS], we will drop it from the "predictor" list and recalibrate the model
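To make the "close to 1 / close to 0" reading concrete, here is a small sketch of how a fitted logistic model turns predictor values into a probability, using the predictors that remain once [TOWER_HT] is dropped. The coefficient values and the two sample days are hypothetical; they are not the deck's fitted values.

```python
import math


def predict_ops_status(features, coefficients, intercept):
    """Return the modeled probability that OPS_STATUS = 1 (Up & Running)."""
    log_odds = intercept + sum(coefficients[name] * value
                               for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-log_odds))   # logistic (sigmoid) function


# Hypothetical coefficients, NOT taken from the deck's fitted model
coefficients = {"TYPE_GEARED": -0.5, "BLADE_RPM": -0.03,
                "TEMPERATURE": -0.04, "WIND_SPEED": -0.05}
intercept = 4.0

calm_day = {"TYPE_GEARED": 1, "BLADE_RPM": 10, "TEMPERATURE": 20, "WIND_SPEED": 5}
harsh_day = {"TYPE_GEARED": 1, "BLADE_RPM": 30, "TEMPERATURE": 60, "WIND_SPEED": 30}

p_calm = predict_ops_status(calm_day, coefficients, intercept)
p_harsh = predict_ops_status(harsh_day, coefficients, intercept)
print(round(p_calm, 3), round(p_harsh, 3))
```

The output is always strictly between 0 and 1, and a log-odds of exactly zero maps to 0.5, the point of maximum indecision.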
Alternate Test with Test Data
• As it stands, our model is evaluated and tuned with the help of the training data (Jan-17 data)
• The model seems to behave well on a couple of sample test records
• As the next step, let's see how our model (trained on Jan-17 data) predicts faults for Feb-17
• Let's devise our own simple method to score the predictions. Since the outcome is binary (Up/Down or 1/0), we will use the rules below
  • Predictions above 0.75 will be considered 1 (or Up/Running)
  • Predictions below 0.25 will be considered 0 (or Down/Fault)
  • Predictions between 0.25 and 0.75 will be considered indecisive
• If the prediction matches the actual OPS_STATUS, award one point (+1)
• If the prediction does not match the actual OPS_STATUS, deduct one point (-1)
• If the prediction is indecisive, no points are given or taken
• Finally, sum the points and divide by the sample size to obtain the mean score
• The worst-case mean will be -1 (all predictions wrong) and the best case will be +1 (all predictions correct)
• Let us see how close our model comes to the best case (or worst case)
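The scoring rule described on this slide can be sketched in a few lines of plain Python. The function names and sample values are hypothetical, and treating predictions of exactly 0.25 or 0.75 as indecisive is an assumption, since the slide does not say how boundary values are handled.

```python
def classify(probability, upper=0.75, lower=0.25):
    """Map a predicted probability to 1 (Up), 0 (Down) or None (indecisive)."""
    if probability > upper:
        return 1
    if probability < lower:
        return 0
    return None   # between the thresholds (boundaries included, by assumption)


def mean_score(probabilities, actuals):
    """Average of +1 (match), -1 (mismatch) and 0 (indecisive) over all rows."""
    if len(probabilities) != len(actuals) or len(actuals) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    total = 0
    for probability, actual in zip(probabilities, actuals):
        predicted = classify(probability)
        if predicted is None:
            continue                      # indecisive: no points given or taken
        total += 1 if predicted == actual else -1
    return total / len(actuals)


# Made-up predictions and outcomes, for illustration only
predicted_probabilities = [0.92, 0.10, 0.60, 0.81, 0.20]
actual_status = [1, 0, 1, 1, 1]
score = mean_score(predicted_probabilities, actual_status)
print(score)
```

Because indecisive rows add nothing but still count in the denominator, a model that hedges often is pulled toward a score of 0 rather than rewarded.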
Alternate Test with Test Data
A mean score of 0.78 on our alternate test (on the -1 to +1 scale) is not too bad ☺
Implement & Set Up Learning, Tuning and Re-evaluation Process
Analytics is about continuous learning, and IS processes should support this philosophy
• So while our model appears to be a good fit on the training and test data, it should be regularly fine-tuned throughout its lifecycle by
  • Learning from bigger and newer data sets: fit the model with Jan-17 plus Feb-17 data (as the learning set) and test against Mar-17 data
  • Looking out for new predictor variables that may reduce the standard error of [INTERCEPT]
  • Comparing its predictions with those of other analytic models
  • Accommodating changing business needs
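The idea of learning from a growing data set (fit on Jan-17, test on Feb-17; then fit on Jan-17 plus Feb-17, test on Mar-17) can be sketched as an expanding window over period labels. The helper name is hypothetical, and the fitting and scoring steps are left out; this only shows how the train/test periods pair up.

```python
def expanding_windows(periods):
    """Yield (training_periods, test_period) pairs, growing the training set each round."""
    for i in range(1, len(periods)):
        yield list(periods[:i]), periods[i]


for train_periods, test_period in expanding_windows(["Jan-17", "Feb-17", "Mar-17"]):
    # In a real pipeline, the model would be refit on train_periods here
    # and scored against test_period.
    print("fit on", train_periods, "-> test on", test_period)
```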
THANK YOU
As we close, below are some teasers to keep up the momentum
• Can Multiple Linear Regression be used instead?
• What is the difference between MLE (Maximum Likelihood Estimation) and OLS (Ordinary Least Squares) regression?
• Instead of Logit, can Probit be used?

Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
 
Unveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystUnveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data Analyst
 
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
 
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
 

Fault prediction using logistic regression (Python)

  • 1. Fault Prediction using Logistic Regression, Using Python 3.5 (17-Mar-17, binayakdutta@gmail.com)
  • 2. Preface
      • This deck illustrates the considerations and method for using Logistic Regression, and analytics in general.
      • For the illustration, a hypothetical wind-turbine-based electricity generation system is considered, along with its associated IT systems.
      • The learning and testing data is mocked up. Range-bound random weights are added to mimic natural randomness in the data.
      • Detailing the statistics behind the models, or the overall data/analytics architecture, is not the focus and is saved for later.
      • Python 3.5, along with its libraries (pandas, statsmodels and matplotlib), is used to implement the regression models.
  • 3. Approach to Modeling
      • Acquire, Prepare, Analyze
      • Select & Develop Model
      • Train, Test & Evaluate Model
      • Implement Model in Solution
      • Re-evaluate & Re-train Model
  • 4. Problem Statement & Solution Overview
      • [Diagram components: Wind Turbines, Sensor Data, Asset & Work Mgmt. System, Weather Data, Asset & Incident Data, Data Prep. for Analytics, Fault Prediction Analytics, MIS Data, Preventive Alerts, Preventive Maintenance, Incident & Action Log]
      • Problem: For wind turbines, a duration-based maintenance policy does not factor in the dynamic operational statistics of assets and their ambient conditions. Hence interventions are sometimes too early and sometimes too late.
      • Solution: Preserve data on each asset, its condition and fault history, as well as weather conditions. Build an analytic model that learns from past data and predicts failures.
  • 5. Acquire, Prepare, Analyze Data
      In this context, let's assume that the following information is available or can be made available:
      • Equipment Identifier: uniquely identifies an instance of a wind turbine
      • Equipment Type: classifies whether the wind turbine transmission is Geared or Direct Drive
      • Tower Height: height of the wind turbine's tower (in meters)
      • Historical Daily Status of Equipment (Operational/Failed) [to be predicted]
      • Weather Data (Temperature and Wind Speed) for each equipment location, which can be obtained from external sources
      Further, we will assume that:
      • All such data sets are available for historical dates and will be available for future dates
      For our exercise, we will take two sets of data:
      • January data, used to train the model (file name = Logit_Wind_Data_Train_File.csv)
      • February data, used to test the model (file name = Logit_Wind_Data_Test_File.csv)
  • 6. Acquire, Prepare, Analyze Data: Read and analyze available data points
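The code shown on this slide is not preserved in the transcript. As a rough sketch only, reading and summarizing the data with pandas might look like the following. The column names and the four sample rows are assumptions made for illustration, and an in-memory CSV stands in for Logit_Wind_Data_Train_File.csv.

```python
import io

import pandas as pd

# Hypothetical sample rows; the real training file's columns are not shown in the deck.
csv_text = """EQUIP_ID,TYPE,TOWER_HT,BLADE_RPM,TEMPERATURE,WIND_SPEED,OPS_STATUS
WT001,GEARED,50,12.5,21.0,8.2,1
WT002,DIRECT_DRIVE,60,0.0,44.5,27.9,0
WT003,GEARED,70,18.3,15.2,11.4,1
WT004,DIRECT_DRIVE,50,22.1,30.8,14.0,1
"""

# With a real file this would be pd.read_csv("Logit_Wind_Data_Train_File.csv").
df = pd.read_csv(io.StringIO(csv_text))

# Count, mean, min, max and quartiles for every numeric column.
summary = df.describe()

# How many records are UP (1) versus DOWN (0).
status_counts = df["OPS_STATUS"].value_counts()

print(summary)
print(status_counts)
```

`describe()` gives a quick view of the value ranges, and `value_counts()` shows how unbalanced the UP/DOWN classes are.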
  • 7. Acquire, Prepare, Analyze Data
      • OPS_STATUS, our dependent variable, can take only binary values. For most records it is UP (1), while for a small portion it is DOWN (0)
      • TOWER HEIGHT takes the values 50, 60 and 70 meters
      • BLADE RPM ranges from 0 to 30 revolutions per minute
      • TEMPERATURE can fluctuate between 0 and 60 degrees Celsius
      • WIND speed varies between 0 and 30 meters per second for the given sample
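The deck states that its data is mocked up with range-bound randomness, but does not show how. One way such data could be generated, using the value ranges listed above, is sketched here. The column names, the "stress" rule that decides OPS_STATUS, and the 1.6 cut-off are all assumptions for illustration, not the author's method.

```python
import random


def make_mock_rows(n, seed=0):
    """Generate n mock turbine records within the ranges listed on the slide."""
    rng = random.Random(seed)
    rows = []
    for i in range(n):
        temperature = rng.uniform(0, 60)   # degrees Celsius
        wind_speed = rng.uniform(0, 30)    # meters per second
        # Hypothetical rule: hot, windy days are more likely to produce a fault.
        # The uniform(-0.3, 0.3) term is the range-bound random noise.
        stress = temperature / 60 + wind_speed / 30 + rng.uniform(-0.3, 0.3)
        rows.append({
            "EQUIP_ID": "WT{:03d}".format(i + 1),
            "TYPE": rng.choice(["GEARED", "DIRECT_DRIVE"]),
            "TOWER_HT": rng.choice([50, 60, 70]),
            "BLADE_RPM": round(rng.uniform(0, 30), 1),
            "TEMPERATURE": round(temperature, 1),
            "WIND_SPEED": round(wind_speed, 1),
            "OPS_STATUS": 0 if stress > 1.6 else 1,  # 0 = DOWN, 1 = UP
        })
    return rows


rows = make_mock_rows(500)
down = sum(1 for r in rows if r["OPS_STATUS"] == 0)
print("DOWN records:", down, "of", len(rows))
```

The cut-off is chosen so that DOWN records stay a small minority, matching the description of OPS_STATUS above.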
  • 8. Prepare Data & Develop Model
      • Create dummy variable for categorical attribute
      • Set up the predictor variables for prediction
      • Set up Logistic Regression Model
  • 9. Train, Test & Evaluate Model with Training and Test Data Sets
      • A low z-statistic and high p-value for [TOWER-HT] mean that the null hypothesis cannot be rejected (suggesting that this predictor does not have a significant influence on the outcome)
      • A high standard error for [INTERCEPT] indicates high variability in the coefficient (possibly suggesting other, unidentified predictor variables)
      • The standard error, z-statistic and p-value of the other predictors look healthy
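To make the link between coefficient, standard error, z-statistic and p-value concrete, here is a small sketch of the standard Wald test, which is what a logistic regression summary table typically reports. The coefficient and standard-error values are hypothetical, since the deck's actual output is not preserved.

```python
import math


def wald_z_and_p(coef, std_err):
    """Return the z-statistic and two-sided p-value for one coefficient."""
    z = coef / std_err
    # Two-sided p-value under a standard normal distribution:
    # 2 * (1 - Phi(|z|)), which equals erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


# Hypothetical predictor with a small coefficient relative to its standard error.
z_weak, p_weak = wald_z_and_p(0.004, 0.010)

# Hypothetical predictor with a large coefficient relative to its standard error.
z_strong, p_strong = wald_z_and_p(-0.060, 0.012)

print("weak:   z = {:.2f}, p = {:.3f}".format(z_weak, p_weak))
print("strong: z = {:.2f}, p = {:.2e}".format(z_strong, p_strong))
```

A z-statistic near zero gives a p-value well above the usual 0.05 threshold, which is the pattern the slide describes for the tower height predictor.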
  • 10. Train, Test & Evaluate Model with Training and Test Data Sets
      • [Screenshot annotations: TYPE = GEARED, RPM of BLADE, TEMPERATURE, WIND SPEED, INTERCEPT]
      • OPS_STATUS close to 1: predicting Up & Running
      • OPS_STATUS close to 0: predicting potential failure
      • OPS_STATUS close to 0.5: indicating an indecisive prediction
      • As the evaluation suggested that [TOWER_HT] is inconsequential for [OPS_STATUS], we drop it from the "predictor" list and recalibrate the model
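A fitted logistic model turns the predictors into a value between 0 and 1 by passing a weighted sum through the logistic function. The sketch below shows that calculation for the recalibrated predictor list. The coefficient values and the two sample records are hypothetical, chosen only to produce one "up" and one "at risk" outcome.

```python
import math


def predict_ops_status(coefficients, features):
    """Logistic prediction: 1 / (1 + exp(-(intercept + sum of coef * value)))."""
    linear = coefficients["INTERCEPT"]
    for name, value in features.items():
        linear += coefficients[name] * value
    return 1.0 / (1.0 + math.exp(-linear))


# Hypothetical coefficients; TOWER_HT is no longer among the predictors.
coefficients = {
    "INTERCEPT": 6.0,
    "TYPE_GEARED": 0.5,
    "BLADE_RPM": 0.05,
    "TEMPERATURE": -0.08,
    "WIND_SPEED": -0.12,
}

# Two hypothetical daily records for a geared turbine.
calm_day = {"TYPE_GEARED": 1, "BLADE_RPM": 15, "TEMPERATURE": 20, "WIND_SPEED": 8}
harsh_day = {"TYPE_GEARED": 1, "BLADE_RPM": 2, "TEMPERATURE": 55, "WIND_SPEED": 28}

p_calm = predict_ops_status(coefficients, calm_day)
p_harsh = predict_ops_status(coefficients, harsh_day)

print("calm day:  {:.3f}".format(p_calm))
print("harsh day: {:.3f}".format(p_harsh))
```

With these made-up numbers the calm day lands close to 1 (up and running) and the harsh day falls below 0.5, toward a potential failure.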
  • 11. Alternate Test with Test Data
      • As it stands, our model is evaluated and tuned with the help of the training data (Jan-17 data)
      • The model seems to behave well on a couple of sample test records
      • As the next step, let's see how our model (trained on Jan-17 data) predicts faults for Feb-17
      • Let's devise our own simple method to score the predictions. Since the outcome is binary (Up/Down or 1/0), we will use the rules below:
          • Predictions above 0.75 will be considered 1 (Up/Running)
          • Predictions below 0.25 will be considered 0 (Down/Fault)
          • Predictions between 0.25 and 0.75 will be considered indecisive
          • If the prediction matches the actual OPS_STATUS, award one point (+1)
          • If the prediction does not match the actual, penalize one point (-1)
          • If the prediction is indecisive, no points are given or taken
          • Finally, sum the points and divide by the sample size to obtain the mean
      • The worst-case mean is -1 (all predictions wrong) and the best case is +1 (all predictions correct)
      • Let us see how close our model comes to the best case (or worst case)
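The scoring rules above translate directly into a few lines of Python. This is a minimal sketch rather than the deck's own code: the function names and sample values are made up, and treating predictions of exactly 0.25 or 0.75 as indecisive is an assumption, since the slide does not say how the boundaries are handled.

```python
def score_prediction(predicted, actual, low=0.25, high=0.75):
    """Return +1 for a correct call, -1 for a wrong call, 0 if indecisive."""
    if predicted > high:
        label = 1          # Up/Running
    elif predicted < low:
        label = 0          # Down/Fault
    else:
        return 0           # indecisive (boundaries included, by assumption)
    return 1 if label == actual else -1


def mean_score(predictions, actuals):
    """Sum the points and divide by the sample size."""
    points = [score_prediction(p, a) for p, a in zip(predictions, actuals)]
    return sum(points) / len(points)


# Hypothetical predictions and actual OPS_STATUS values.
predictions = [0.92, 0.10, 0.55, 0.81, 0.20]
actuals = [1, 0, 1, 0, 0]

# Points per record: +1, +1, 0, -1, +1, so the mean is 2 / 5.
print(mean_score(predictions, actuals))
```

Because indecisive predictions score zero, a model that hedges on everything ends up with a mean of 0, midway between the worst and best cases.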
  • 12. Alternate Test with Test Data: An accuracy score of 0.78 on our alternate test is not too bad ☺
  • 13. Implement & Set Up Learning, Tuning and Re-evaluation Process
      • Analytics is about continuous learning, and IS processes should support this philosophy
      • So while our model appears to be a good fit on the train and test data, throughout its lifecycle it should be regularly fine-tuned by:
          • Learning from bigger and newer data sets. Fit the model with Jan-17 plus Feb-17 data (as the learning set) and test against Mar-17 data
          • Looking out for new predictor variables that may reduce the standard error of [INTERCEPT]
          • Comparing predictions with those of other analytic models
          • Accommodating changing business needs
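The retraining schedule described above (fit on Jan-17, test on Feb-17, then fit on Jan-17 plus Feb-17 and test on Mar-17) is an expanding-window split. A minimal sketch of generating such a schedule follows; the function name is hypothetical and the month labels stand in for the actual monthly data files.

```python
def expanding_window_splits(months):
    """Pair each month with all earlier months as its training set."""
    splits = []
    for i in range(1, len(months)):
        train_months = months[:i]   # everything seen so far
        test_month = months[i]      # the next, unseen month
        splits.append((train_months, test_month))
    return splits


schedule = expanding_window_splits(["Jan-17", "Feb-17", "Mar-17"])
for train_months, test_month in schedule:
    print("train on", train_months, "-> test on", test_month)
```

Each new month is first used as unseen test data and then folded into the training set for the next round.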
  • 14. Thank You
      As we close, below are some teasers to keep up the momentum:
      • Can Multiple Linear Regression be used instead?
      • What is the difference between MLE (Maximum Likelihood Estimation) and OLS (Ordinary Least Squares) regression?
      • Instead of Logit, can Probit be used?