Machine Learning, Linear and
Bayesian Models for Logistic
Regression in Failure Detection
Problems
B. Pavlyshenko (Ph.D.)
SoftServe, Inc.,
Ivan Franko National University of Lviv,
Lviv, Ukraine
MACHINE LEARNING MODEL
The most important features and their gain values:
Matthews correlation coefficient (MCC):
ROC curve for
classification results
AUC=0.753
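For readers who want to reproduce an AUC figure like the one above, the following is a minimal sketch, not the author's code. It uses the rank interpretation of AUC: the probability that a randomly chosen positive example scores higher than a randomly chosen negative one. The function name `roc_auc` and the toy data are illustrative assumptions.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # a tie counts as half a correctly ranked pair
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = failure, 0 = no failure
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(labels, scores))
```

The pairwise loop is quadratic in the sample size, so it only suits small examples; a sort-based version would be preferable on real data.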
Matthews correlation
coefficient for logistic regression
for different values of probability
threshold.
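The dependence of MCC on the probability threshold can be sketched as follows. This is an illustrative example under assumed toy data, not the code behind the plot: predicted probabilities are cut at a threshold, the confusion matrix is counted, and MCC = (TP·TN − FP·FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)) is evaluated.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # common convention when a row or column of the matrix is empty
    return (tp * tn - fp * fn) / denom


def mcc_at_threshold(labels, probs, threshold):
    """Classify as positive when prob >= threshold, then compute MCC."""
    tp = tn = fp = fn = 0
    for y, p in zip(labels, probs):
        predicted = 1 if p >= threshold else 0
        if predicted == 1 and y == 1:
            tp += 1
        elif predicted == 0 and y == 0:
            tn += 1
        elif predicted == 1 and y == 0:
            fp += 1
        else:
            fn += 1
    return mcc(tp, tn, fp, fn)


# Toy data (hypothetical): failures are the rare class
labels = [0, 0, 0, 0, 1, 1]
probs = [0.05, 0.2, 0.3, 0.6, 0.4, 0.9]
for t in (0.25, 0.5, 0.75):
    print(t, round(mcc_at_threshold(labels, probs, t), 3))
```

Scanning a grid of thresholds and keeping the one with the highest MCC is a common way to choose an operating point on imbalanced failure data.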
Matthews correlation coefficient for different sample sets
ROC curve and Matthews correlation coefficient for different sets of features
Feature set 1:
AUC=0.75
Feature set 2:
AUC=0.91
MULTILEVEL MODEL
GENERALIZED LINEAR MODEL
Dependence of the total within-cluster sum of
squares on the number of clusters.
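A curve of this kind can be sketched as below. The slide does not name the clustering algorithm; this example assumes plain k-means (Lloyd's algorithm) and uses hypothetical 2-D points. The function `kmeans_wcss` is an illustrative name, not taken from the presentation.

```python
import random


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_wcss(points, k, n_iter=50, seed=0):
    """Run Lloyd's k-means and return the total within-cluster sum of squares."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initialise centers from the data
    for _ in range(n_iter):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: squared_distance(p, centers[j]))
            clusters[nearest].append(p)
        new_centers = []
        for j, members in enumerate(clusters):
            if members:
                dim = len(members[0])
                new_centers.append(
                    tuple(sum(m[d] for m in members) / len(members) for d in range(dim))
                )
            else:
                new_centers.append(centers[j])  # keep the center of an empty cluster
        if new_centers == centers:
            break  # assignments can no longer change
        centers = new_centers
    return sum(min(squared_distance(p, c) for c in centers) for p in points)


# Two well-separated groups of hypothetical points
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
for k in (1, 2, 3):
    print(k, round(kmeans_wcss(points, k), 3))
```

The number of clusters is usually read off at the "elbow", where adding another cluster stops reducing the sum of squares substantially.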
Dependence of AUC on the Lambda value.
Coefficients of the generalized linear
model for logistic regression
(Lambda=0.03)
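The role of Lambda can be illustrated with a penalized logistic regression. The slides do not state which penalty or solver was used; the sketch below assumes an L1 (lasso) penalty with objective mean log-loss + lam * sum(|b_j|) and fits it by proximal gradient descent. All names and the toy data are hypothetical.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(v, t):
    """Proximal operator of the L1 norm: shrink v toward zero by t."""
    return math.copysign(max(abs(v) - t, 0.0), v)


def fit_l1_logistic(X, y, lam, step=0.1, n_iter=2000):
    """Minimise mean log-loss + lam * ||b||_1; the intercept b0 is not penalised."""
    n, d = len(X), len(X[0])
    b0, b = 0.0, [0.0] * d
    for _ in range(n_iter):
        residuals = [
            sigmoid(b0 + sum(bj * xj for bj, xj in zip(b, row))) - yi
            for row, yi in zip(X, y)
        ]
        grad0 = sum(residuals) / n
        grad = [sum(r * row[j] for r, row in zip(residuals, X)) / n for j in range(d)]
        b0 -= step * grad0
        # Gradient step on the smooth part, then shrinkage for the L1 part
        b = [soft_threshold(bj - step * gj, step * lam) for bj, gj in zip(b, grad)]
    return b0, b


# Toy data (hypothetical): the first feature carries the signal
X = [[0.0, 1.0], [1.0, 0.0], [2.0, 1.0], [3.0, 0.0]]
y = [0, 0, 1, 1]
print(fit_l1_logistic(X, y, lam=0.03))
```

Refitting over a grid of Lambda values and recording the validation AUC for each gives a curve of the kind described on this slide; larger Lambda drives more coefficients to exactly zero.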
Histograms, correlation coefficients, and pairwise scatterplots for the features.
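The correlation coefficients in such a pairs plot are typically Pearson coefficients; that is an assumption here, since the slide does not say. A minimal sketch with hypothetical feature names:

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Pairwise Pearson correlations for a dict of name -> column values."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical feature columns
features = {
    "f1": [1.0, 2.0, 3.0, 4.0],
    "f2": [2.1, 3.9, 6.2, 8.0],
    "f3": [4.0, 1.0, 3.0, 2.0],
}
for name, row in correlation_matrix(features).items():
    print(name, {k: round(v, 2) for k, v in row.items()})
```

Strongly correlated feature pairs are worth noting before fitting a linear model, because they make individual coefficients harder to interpret.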
BAYESIAN MODEL
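model {
  # Likelihood: each outcome y[i] is Bernoulli with success probability p[i]
  for (i in 1:n) {
    y[i] ~ dbern(p[i])
    logit(p[i]) <- b0 + inprod(b[], x[i,])
  }
  # Vague priors; dnorm takes (mean, precision), so 0.0001 means variance 10000
  b0 ~ dnorm(0, 0.0001)
  for (j in 1:nfeat) {
    b[j] ~ dnorm(0, 0.0001)
  }
}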
Probabilistic model for logistic regression using BUGS syntax
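To make the model above concrete, the sketch below writes out its log-posterior in Python and draws samples with a random-walk Metropolis sampler. This is a stand-in for illustration only: BUGS-family tools choose their own samplers, and the step size, seed, and toy data here are assumptions.

```python
import math
import random

PRIOR_PRECISION = 1e-4  # matches dnorm(0, 0.0001): precision, i.e. variance 10000


def log1pexp(z):
    """Stable log(1 + exp(z))."""
    return z + math.log1p(math.exp(-z)) if z > 0 else math.log1p(math.exp(z))


def log_posterior(b0, b, X, y):
    """Log-posterior (up to a constant) of the Bernoulli-logit model with normal priors."""
    lp = -0.5 * PRIOR_PRECISION * (b0 * b0 + sum(bj * bj for bj in b))
    for row, yi in zip(X, y):
        eta = b0 + sum(bj * xj for bj, xj in zip(b, row))
        lp += yi * eta - log1pexp(eta)  # Bernoulli log-likelihood with logit link
    return lp


def metropolis(X, y, n_samples=5000, step=0.5, seed=1):
    """Random-walk Metropolis over (b0, b[0], ..., b[nfeat-1])."""
    rng = random.Random(seed)
    theta = [0.0] * (len(X[0]) + 1)
    current = log_posterior(theta[0], theta[1:], X, y)
    samples = []
    for _ in range(n_samples):
        proposal = [t + rng.gauss(0.0, step) for t in theta]
        candidate = log_posterior(proposal[0], proposal[1:], X, y)
        # Accept with probability min(1, posterior ratio); 1 - random() avoids log(0)
        if math.log(1.0 - rng.random()) < candidate - current:
            theta, current = proposal, candidate
        samples.append(list(theta))
    return samples


# Toy data (hypothetical): one feature, outcome more likely as x grows
X = [[-2.0], [-1.0], [-1.0], [0.0], [0.0], [1.0], [1.0], [2.0]]
y = [0, 0, 1, 0, 1, 0, 1, 1]
draws = metropolis(X, y)
kept = draws[1000:]  # discard burn-in
print("posterior mean of b0:", sum(d[0] for d in kept) / len(kept))
print("posterior mean of b[0]:", sum(d[1] for d in kept) / len(kept))
```

Plotting the first column of `draws` against the iteration index gives a trace plot for the intercept, and a histogram of the same column approximates its posterior density.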
Trace plot for the intercept parameter. Probability density function for the
intercept parameter.
Box plots for logistic regression coefficients.
Combining Machine Learning with
Linear and Bayesian Models
Combining Machine Learning with Linear Model
Parameters set 1:
max.depth = 15,
colsample_bytree = 0.7
Parameters set 2:
max.depth = 5,
colsample_bytree = 0.7
Parameters set 3:
max.depth = 15,
colsample_bytree = 0.3
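The three parameter sets above can be written as plain configuration dictionaries. The slides use the R-style spelling `max.depth`; the Python xgboost package spells the same parameter `max_depth`. The `objective` entry is an assumption based on the talk's title, and no other training settings are implied.

```python
# Shared settings (assumed, not stated on the slide)
BASE_PARAMS = {"objective": "binary:logistic"}

# The three parameter sets from the slide, in Python xgboost spelling
PARAM_SETS = {
    "set_1": {**BASE_PARAMS, "max_depth": 15, "colsample_bytree": 0.7},
    "set_2": {**BASE_PARAMS, "max_depth": 5, "colsample_bytree": 0.7},
    "set_3": {**BASE_PARAMS, "max_depth": 15, "colsample_bytree": 0.3},
}

for name, params in PARAM_SETS.items():
    print(name, params)
```

Sets 1 and 2 differ only in tree depth, and sets 1 and 3 differ only in the fraction of columns sampled per tree, so comparing their MCC values isolates the effect of each setting.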
Matthews correlation coefficient for different
XGBoost parameter sets (feature set 2):
Matthews correlation coefficient for different
XGBoost parameter sets (feature set 1):
Combining Machine Learning with Bayesian Model
Study of Reliability of Parts
Weibull distribution
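For reference, the standard two-parameter Weibull quantities used in reliability studies can be sketched as below. The parameter values are hypothetical; the slide does not give fitted values.

```python
import math


def weibull_reliability(t, shape, scale):
    """Survival function R(t) = exp(-(t / scale) ** shape), for t > 0."""
    return math.exp(-((t / scale) ** shape))


def weibull_hazard(t, shape, scale):
    """Failure rate h(t) = (shape / scale) * (t / scale) ** (shape - 1), for t > 0."""
    return (shape / scale) * (t / scale) ** (shape - 1)


def weibull_pdf(t, shape, scale):
    """Density f(t) = h(t) * R(t)."""
    return weibull_hazard(t, shape, scale) * weibull_reliability(t, shape, scale)


# Hypothetical part with a wear-out failure mode (shape > 1)
shape, scale = 2.0, 1000.0
for hours in (100.0, 500.0, 1000.0):
    print(hours, round(weibull_reliability(hours, shape, scale), 4))
```

A shape parameter below 1 corresponds to a decreasing failure rate (early failures), exactly 1 to a constant rate (the exponential distribution), and above 1 to an increasing rate (wear-out).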
Thank you for your attention!
Special thanks to the Bosch company
for awarding me the travel grant for
attending the IEEE BigData 2016
conference!
