Master the Art of Analytics
A Simplistic Explainer Series For Citizen Data Scientists
Journey Towards Augmented Analytics
Binary Logistic Regression
Basic Terminologies
• Target variable, usually denoted by Y, is the variable being predicted. It is
also called the dependent variable, output variable, response variable or
outcome variable (Ex : the Default column in the table below)
• Predictor, sometimes called an independent variable, is a variable that is
used to predict the target variable (Ex : the Age, Marital Status and Loan
Status columns in the table below)

Age  Marital Status  Loan Status  Default
58   married         no           yes
44   single          no           no
33   married         yes          yes
47   married         no           yes
33   single          no           no
35   married         no           yes
28   single          yes          no
Introduction
• Objective :
• Logistic regression measures the relationship between a categorical target
variable and one or more independent variables
• Binary logistic regression deals with situations in which the target variable
can take only two possible values
• Thus, it uses one or more predictor variables, which may be either
continuous or categorical, to predict the class of the target variable
• Benefit :
• The model output helps identify the important factors (Xi)
impacting the target variable (Y) and the nature of the relationship between
each of these factors and the target variable
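The slides do not write out the model equation. In its standard form, logistic regression passes a weighted sum of the predictors through the logistic (sigmoid) function to get a probability between 0 and 1, and then cuts that probability at a threshold to pick one of the two classes. A minimal sketch of that mechanism follows; the function names, the 0.5 threshold and the example weights are illustrative assumptions, not values from the deck.

```python
import math

def sigmoid(z):
    """Logistic function: maps any real number to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_probability(intercept, weights, features):
    """P(Y = 1 | features) for a binary logistic regression with given weights."""
    z = intercept + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)

def predict_class(probability, threshold=0.5):
    """Assign class 1 when the probability reaches the threshold (0.5 is a common default)."""
    return 1 if probability >= threshold else 0

# Hypothetical weights and inputs, chosen only to show the mechanics:
# z = -1.0 + 2.0 * 1.0 + (-0.5) * 2.0 = 0.0, so the probability is exactly 0.5.
p = predict_probability(intercept=-1.0, weights=[2.0, -0.5], features=[1.0, 2.0])
print(p, predict_class(p))  # prints: 0.5 1
```

Lowering the threshold flags more records as class 1, which may be preferable when missing a true default is costlier than a false alarm.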
Example : Binary Logistic Regression : Input
Let’s conduct the Binary Logistic Regression analysis on the following variables :
Target variable (Y) : Default Status
Independent variables (Xi) : Age, Marital Status, Existing Loan Status, Income

Default Status  Age  Marital Status  Existing Loan Status  Income
Defaulted       58   married         no                    46,399
Not Defaulted   44   single          no                    47,971
Defaulted       33   married         yes                   52,618
Defaulted       47   married         no                    28,717
Not Defaulted   33   single          no                    41,216
Defaulted       35   married         no                    34,372
Not Defaulted   28   single          yes                   64,811
Not Defaulted   42   divorced        no                    53,000
Defaulted       58   married         no                    41,375
Not Defaulted   43   single          no                    53,778
Not Defaulted   41   divorced        no                    44,440
Not Defaulted   29   single          no                    51,026
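To make the input concrete, here is a rough sketch of how a table like this could be fitted with plain batch gradient descent. The hand encoding of the rows (married = 1, existing loan = 1, Defaulted = 1), the standardization step, the learning rate and the step count are all assumptions of this sketch, and it is not the method the deck's tool uses. Note also that in these 12 rows every married applicant defaulted and nobody else did, so the sample is perfectly separable; the fitted weights are for illustration only and will not match the coefficients reported in the deck.

```python
import math

# Sample rows encoded by hand: (defaulted, age, married, existing_loan, income)
ROWS = [
    (1, 58, 1, 0, 46399), (0, 44, 0, 0, 47971), (1, 33, 1, 1, 52618),
    (1, 47, 1, 0, 28717), (0, 33, 0, 0, 41216), (1, 35, 1, 0, 34372),
    (0, 28, 0, 1, 64811), (0, 42, 0, 0, 53000), (1, 58, 1, 0, 41375),
    (0, 43, 0, 0, 53778), (0, 41, 0, 0, 44440), (0, 29, 0, 0, 51026),
]

def standardize(column):
    """Rescale a column to mean 0 and standard deviation 1."""
    mean = sum(column) / len(column)
    sd = math.sqrt(sum((v - mean) ** 2 for v in column) / len(column))
    return [(v - mean) / sd for v in column]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, row):
    """Predicted probability of class 1 for one feature row."""
    return sigmoid(sum(w * v for w, v in zip(weights, row)))

def log_loss(weights, X, y):
    """Average negative log-likelihood; lower means a better fit."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for row, label in zip(X, y):
        p = min(max(predict(weights, row), eps), 1 - eps)
        total -= label * math.log(p) + (1 - label) * math.log(1 - p)
    return total / len(y)

def fit(X, y, learning_rate=0.1, steps=500):
    """Plain batch gradient descent on the log-loss, starting from zero weights."""
    weights = [0.0] * len(X[0])
    for _ in range(steps):
        errors = [predict(weights, row) - label for row, label in zip(X, y)]
        for j in range(len(weights)):
            gradient = sum(e * row[j] for e, row in zip(errors, X)) / len(y)
            weights[j] -= learning_rate * gradient
    return weights

y = [r[0] for r in ROWS]
features = [standardize([r[j] for r in ROWS]) for j in range(1, 5)]
# Leading 1.0 in each row is the intercept term.
X = [[1.0] + [col[i] for col in features] for i in range(len(ROWS))]

weights = fit(X, y)
probabilities = [predict(weights, row) for row in X]
print("log-loss before fitting:", round(log_loss([0.0] * 5, X, y), 3))
print("log-loss after fitting: ", round(log_loss(weights, X, y), 3))
```

In practice a statistics package would be used instead, since it also reports the p values discussed in the output slides.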
Example : Binary Logistic Regression : Output
COEFFICIENTS
                          Coefficients  P value
(Intercept)               -2.34         0.00
Age                       0.01          0.07
Marital Status (Married)  0.5           0.04
Income                    0.1           0.04
Existing loan (Yes)       0.3           0.03
• The p value for marital status, income and existing loan is < 0.05;
hence these variables are important factors for predicting the default/non
default class
• The p value for Age is > 0.05, which means Age does not impact the prediction
significantly
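A logistic regression coefficient is a change in the log-odds of the target class, so exponentiating it gives an odds ratio, which is often easier to read. The sketch below applies that standard conversion to the coefficients shown in the table. The deck does not state the unit in which Income was measured, so its odds ratio is "per one unit of whatever scale was used"; the variable names are this sketch's own.

```python
import math

# Coefficients as reported in the output table.
COEFFICIENTS = {
    "Age": 0.01,
    "Marital Status (Married)": 0.5,
    "Income": 0.1,
    "Existing loan (Yes)": 0.3,
}

def odds_ratio(coefficient):
    """exp(coefficient): multiplicative change in the odds per one-unit increase."""
    return math.exp(coefficient)

ODDS_RATIOS = {name: round(odds_ratio(c), 2) for name, c in COEFFICIENTS.items()}

for name, ratio in ODDS_RATIOS.items():
    print(f"{name}: odds ratio {ratio}")
```

Read this way, a coefficient of 0.5 for Married corresponds to an odds ratio of about 1.65, that is, roughly 65% higher odds of default than the reference marital status, holding the other variables fixed.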
Example : Binary Logistic Regression : Output
ACTUAL VERSUS PREDICTED
                          Predicted
Actual         Defaulted  Not defaulted
Defaulted      35         4
Not defaulted  4          70
CLASSIFICATION ACCURACY : (35 + 70) / (35 + 70 + 4 + 4) = 105 / 113 ≈ 92.9%
• Prediction accuracy is a useful criterion for assessing model performance
• A model with prediction accuracy >= 70% is considered useful
CLASSIFICATION ERROR = 100% - Accuracy ≈ 7.1%
About 7% of the records were misclassified
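The accuracy and error figures can be reproduced from the four counts in the actual-versus-predicted table. A small sketch, treating "Defaulted" as the positive class (a naming choice made here, not in the deck):

```python
def accuracy_from_confusion(tp, fn, fp, tn):
    """Share of records whose predicted class matches the actual class."""
    total = tp + fn + fp + tn
    return (tp + tn) / total

# Counts from the table: rows are actual classes, columns are predicted classes.
accuracy = accuracy_from_confusion(tp=35, fn=4, fp=4, tn=70)
error = 1 - accuracy
print(f"accuracy = {accuracy:.3f}, error = {error:.3f}")  # accuracy = 0.929, error = 0.071
```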
Standard input parameters & Sample UI
SAMPLE OUTPUT 1 : MODEL SUMMARY
COEFFICIENT MATRIX :
                          Coefficients  P value
(Intercept)               -2.34         0.00
Age                       0.01          0.07
Marital Status (Married)  0.5           0.04
Income                    0.1           0.04
Existing loan (Yes)       0.3           0.03
ACTUAL VERSUS PREDICTED :
                          Predicted
Actual         Defaulted  Not defaulted
Defaulted      35         4
Not defaulted  4          70
SAMPLE OUTPUT 2 : PREDICTED CLASS & PROBABILITY
Age  Marital Status  Existing Loan Status  Income  Default Status  Predicted class  Probability
58   married         no                    46,399  Defaulted       Defaulted        0.7
44   single          no                    47,971  Not Defaulted   Not Defaulted    0.9
33   married         yes                   52,618  Defaulted       Defaulted        0.8
47   married         no                    28,717  Defaulted       Defaulted        0.7
33   single          no                    41,216  Not Defaulted   Not Defaulted    0.6
35   married         no                    34,372  Defaulted       Not Defaulted    0.5
28   single          yes                   64,811  Not Defaulted   Defaulted        0.4
42   divorced        no                    53,000  Not Defaulted   Not Defaulted    0.3
58   married         no                    41,375  Defaulted       Defaulted        0.2
43   single          no                    53,778  Not Defaulted   Defaulted        0.1
SAMPLE OUTPUT 3 : CLASSIFICATION PLOT
• The smaller the overlap between the two classes in the classification plot, the
better the classification done by the model
Thus, the output will contain a predicted class column, a confusion matrix and a classification plot
INTERPRETATION OF IMPORTANT MODEL SUMMARY STATISTICS
Accuracy:
• If Accuracy >= 70% : the model fits the provided data well and the predicted classes
are reasonably accurate
• If Accuracy < 70% : the model does not fit the provided data well and the predicted classes
are likely to contain a high rate of error
Coefficients and p value :
• If the coefficient is positive and the p value < 0.05, the variable has a positive
relationship with the target variable (higher values make the target class more likely)
• If the coefficient is negative and the p value < 0.05, the variable has a negative
relationship with the target variable (higher values make the target class less likely)
• If the p value > 0.05, the variable is not a significant predictor of the target variable
classes
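These rules of thumb translate directly into a pair of small helper functions. The sketch below is one way to write them. The function names and returned labels are this sketch's own, and treating a p value of exactly 0.05 as not significant is an assumption, since the rules above leave that case open.

```python
def interpret_accuracy(accuracy, cutoff=0.70):
    """Apply the 70% rule of thumb to an accuracy given as a fraction (0 to 1)."""
    return "well fit" if accuracy >= cutoff else "not well fit"

def interpret_coefficient(coefficient, p_value, alpha=0.05):
    """Classify a predictor by the sign of its coefficient and its p value."""
    if p_value >= alpha:
        return "not significant"
    if coefficient > 0:
        return "positive relationship"
    if coefficient < 0:
        return "negative relationship"
    return "no effect"

# Values taken from the model summary in the example.
print(interpret_accuracy(105 / 113))         # well fit
print(interpret_coefficient(0.5, 0.04))      # positive relationship
print(interpret_coefficient(0.01, 0.07))     # not significant
```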
Limitations
• It is applicable only when the target variable is categorical
• Sample size must be at least 1000 in order to get reliable predictions
• Binary logistic regression is not suitable when the number of classes > 2
• Level 1 of the target variable should represent the desired outcome,
i.e. if the desired class is Yes in a response/non response target variable,
then Yes has to be recoded into 1 and No into 0
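The recoding step and the two-class restriction can be checked before any model is fitted. A minimal sketch, in which the function name and the case-insensitive matching are assumptions made for illustration:

```python
def recode_target(values, desired="Yes"):
    """Recode a two-class target so the desired class becomes 1 and the other 0."""
    cleaned = [str(v).strip().lower() for v in values]
    classes = set(cleaned)
    if len(classes) > 2:
        # Binary logistic regression is not suitable for more than two classes.
        raise ValueError(f"expected at most 2 classes, found {len(classes)}")
    target = desired.strip().lower()
    return [1 if v == target else 0 for v in cleaned]

print(recode_target(["No", "No", "Yes", "No"]))  # [0, 0, 1, 0]
```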
General applications
Credit/loan approval analysis
• Given a list of a client’s transactional attributes, predict whether the client will
default or not on a bank loan
Medical diagnosis
• Given a list of symptoms, predict if a patient has disease X or not
Rain forecasting
• Based on temperature, humidity, pressure etc., predict if it will rain or not
Treatment effectiveness analysis
• Based on a patient’s body attributes such as blood pressure, sugar, hemoglobin,
name of a drug taken, type of treatment taken etc., check the likelihood of a disease
being cured
Fraud analysis
• Based on various bills submitted by an employee for reimbursement of
food, travel, medical expenses etc., predict the likelihood of the employee
committing fraud
Use case 1
Business problem :
• A bank loans officer wants to predict whether a loan applicant will be a
defaulter or non defaulter based on attributes such as Loan amount,
Monthly installment, Employment tenure, Times delinquent, Annual
income, Debt to income ratio etc.
• Here the target variable would be ‘past default status’ and the predicted class
would contain the values ‘yes’ or ‘no’, representing ‘likely to default’ and
‘unlikely to default’ respectively.
Business benefit :
• Once classes are assigned, the bank will have a loan applicants’ dataset with
each applicant labeled as “likely/unlikely to default”.
• Based on these labels, the bank can easily decide whether to give a
loan to an applicant and, if yes, what credit limit and interest rate
the applicant is eligible for given the amount of risk involved.
Use case 1 : Input Dataset
Customer ID  Loan amount  Monthly installment  Annual income  Debt to income ratio  Times delinquent  Employment tenure  Past default status
1039153      21000        701.73               105000         9                     5                 4                  No
1069697      15000        483.38               92000          11                    5                 2                  No
1068120      25600        824.96               110000         10                    9                 2                  No
563175       23000        534.94               80000          9                     2                 12                 No
562842       19750        483.65               57228          11                    3                 21                 Yes
562681       25000        571.78               113000         10                    0                 9                  No
562404       21250        471.2                31008          12                    1                 12                 Yes
700159       14400        448.99               82000          20                    6                 6                  No
696484       10000        241.33               45000          18                    8                 2                  Yes
702598       11700        381.61               45192          20                    7                 3                  Yes
702470       10000        243.29               38000          17                    9                 7                  Yes
702373       4800         144.77               54000          19                    8                 2                  Yes
Use case 1 : Output : Predicted Class
Output : Each record will have the predicted class assigned as shown below (Column : Likelihood to default) :
Customer ID  Loan amount  Monthly installment  Annual income  Debt to income ratio  Times delinquent  Employment tenure  Past default status  Likelihood to default
1039153      21000        701.73               105000         9                     5                 4                  No                   No
1069697      15000        483.38               92000          11                    5                 2                  No                   No
1068120      25600        824.96               110000         10                    9                 2                  No                   No
563175       23000        534.94               80000          9                     2                 12                 No                   No
562842       19750        483.65               57228          11                    3                 21                 Yes                  No
562681       25000        571.78               113000         10                    0                 9                  No                   No
562404       21250        471.2                31008          12                    1                 12                 Yes                  Yes
700159       14400        448.99               82000          20                    6                 6                  No                   No
696484       10000        241.33               45000          18                    8                 2                  Yes                  Yes
702598       11700        381.61               45192          20                    7                 3                  Yes                  Yes
702470       10000        243.29               38000          17                    9                 7                  Yes                  Yes
702373       4800         144.77               54000          19                    8                 2                  Yes                  No
Use case 1 : Output : Class profile
• As can be seen in the table below, defaulters (Class : Yes) and
non defaulters (Class : No) have distinctive characteristics.
• Defaulters tend to be delinquent more often and to have a higher debt to income ratio
and a lower employment tenure than non defaulters
• Hence, delinquency, employment tenure and debt to income ratio are the determining factors when
it comes to classifying loan applicants into likely defaulters/non defaulters

Class (Likely to default)  Average loan amount  Average monthly installment  Average annual income  Average debt to income ratio  Average times delinquent  Average employment tenure
No                         10447.30             304.87                       66467.74               9.58                          1.69                      16.82
Yes                        7521.32              227.43                       60935.28               16.55                         6.91                      4.01
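A class profile of this kind is simply a per-class average of each attribute. The sketch below computes it for the 12 records displayed in the predicted-class output, grouped by the "Likelihood to default" column. The averages in the profile table presumably come from the full dataset rather than these 12 rows (an assumption), so the numbers differ, although the direction of each difference is the same.

```python
from collections import defaultdict

# (debt_to_income_ratio, times_delinquent, employment_tenure, likelihood_to_default)
# taken from the 12 records shown in the predicted-class output.
RECORDS = [
    (9, 5, 4, "No"), (11, 5, 2, "No"), (10, 9, 2, "No"), (9, 2, 12, "No"),
    (11, 3, 21, "No"), (10, 0, 9, "No"), (12, 1, 12, "Yes"), (20, 6, 6, "No"),
    (18, 8, 2, "Yes"), (20, 7, 3, "Yes"), (17, 9, 7, "Yes"), (19, 8, 2, "No"),
]

def class_profile(records):
    """Average each numeric attribute separately for every predicted class."""
    groups = defaultdict(list)
    for *values, label in records:
        groups[label].append(values)
    return {
        label: [sum(column) / len(members) for column in zip(*members)]
        for label, members in groups.items()
    }

profile = class_profile(RECORDS)
for label in ("No", "Yes"):
    dti, delinquent, tenure = profile[label]
    print(f"{label}: debt to income {dti}, times delinquent {delinquent}, tenure {tenure}")
```

On these rows the "Yes" class averages a debt to income ratio of 16.75 against 12.375, 6.25 delinquencies against 4.75, and a tenure of 6.0 against 7.25.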
Use case 2
Business problem :
• A doctor/pharmacist wants to predict the likelihood of a new patient’s disease
being cured/not cured based on various attributes of the patient such as blood
pressure, hemoglobin level, sugar level, name of the drug given to the patient, name of
the treatment given to the patient etc.
• Here the target variable would be ‘past cure status’ and the predicted class would
contain the values ‘yes’ or ‘no’, meaning ‘prone to cure’ and ‘not prone to cure’ respectively.
Business benefit :
• Given the body profile of a patient and the recent treatments and drugs taken by
him/her, the probability of a cure can be predicted and changes in treatment/drug
can be suggested if required.
Use case 3
Business problem :
• An accountant/human resource manager wants to predict the
likelihood of an employee defrauding the company based on the various bills
submitted by him/her so far, such as food bills, travel bills and medical bills.
• The target variable in this case would be ‘past fraud status’ and the predicted
class would contain the values ‘yes’ or ‘no’,
representing likely fraud and no fraud respectively.
Business benefit :
• Such classification can prevent a company from spending unreasonably
on any employee and can in turn save the company budget by detecting such
fraud beforehand.
Want to Learn More?
Get in touch with us at support@Smarten.com
and do check out the Learning section on Smarten.com
June 2018

Customer Targeting Augmented Analytics Use Case - Smarten
 
What is Multiple Linear Regression and How Can it be Helpful for Business Ana...
What is Multiple Linear Regression and How Can it be Helpful for Business Ana...What is Multiple Linear Regression and How Can it be Helpful for Business Ana...
What is Multiple Linear Regression and How Can it be Helpful for Business Ana...
 
What is the Independent Samples T Test Method of Analysis and How Can it Bene...
What is the Independent Samples T Test Method of Analysis and How Can it Bene...What is the Independent Samples T Test Method of Analysis and How Can it Bene...
What is the Independent Samples T Test Method of Analysis and How Can it Bene...
 
What Are Simple Random Sampling and Stratified Random Sampling Analytical Tec...
What Are Simple Random Sampling and Stratified Random Sampling Analytical Tec...What Are Simple Random Sampling and Stratified Random Sampling Analytical Tec...
What Are Simple Random Sampling and Stratified Random Sampling Analytical Tec...
 
What is Simple Linear Regression and How Can an Enterprise Use this Technique...
What is Simple Linear Regression and How Can an Enterprise Use this Technique...What is Simple Linear Regression and How Can an Enterprise Use this Technique...
What is Simple Linear Regression and How Can an Enterprise Use this Technique...
 

What is Binary Logistic Regression Classification and How is it Used in Analysis?

  • 1. Master the Art of Analytics: A Simplistic Explainer Series for Citizen Data Scientists. Journey Towards Augmented Analytics
  • 3. Basic Terminologies
      • Target variable: usually denoted by Y, this is the variable being predicted. It is also called the dependent, output, response, or outcome variable (e.g., the Default column in the table below, highlighted in a red box on the slide).
      • Predictor: sometimes called an independent variable, this is a variable used to predict the target variable (e.g., the Age, Marital Status, and Loan Status columns in the table below, highlighted in a green box on the slide).

      Age  Marital Status  Loan Status  Default
      58   married         no           yes
      44   single          no           no
      33   married         yes          yes
      47   married         no           yes
      33   single          no           no
      35   married         no           yes
      28   single          yes          no
  • 4. Introduction
      • Objective:
          • Logistic regression measures the relationship between a categorical target variable and one or more independent variables.
          • Binary logistic regression deals with situations in which the target variable can take only two possible outcomes.
          • It uses one or more predictor variables, which may be continuous or categorical, to predict the class of the target variable.
      • Benefit:
          • The model output helps identify the important factors (Xi) that affect the target variable (Y), and the nature of the relationship between each factor and the dependent variable.
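The slides do not write out the model itself. For reference, the standard form of binary logistic regression is given below; the symbols \beta_0, ..., \beta_k (intercept and coefficients) and k (number of predictors) are notation assumed here, not taken from the deck.

```latex
\[
P(Y = 1 \mid X_1, \dots, X_k)
  = \frac{1}{1 + e^{-(\beta_0 + \beta_1 X_1 + \dots + \beta_k X_k)}},
\qquad
\log\frac{P(Y = 1)}{1 - P(Y = 1)} = \beta_0 + \sum_{i=1}^{k} \beta_i X_i
\]
```

The second expression is the log-odds (logit) form: each coefficient is the change in log-odds of the class coded 1 for a one-unit increase in the corresponding predictor, holding the others fixed.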
  • 5. Example: Binary Logistic Regression: Input
      Let's conduct the binary logistic regression analysis on the following variables:

      Default Status  Age  Marital Status  Existing Loan Status  Income
      Defaulted       58   married         no                    46,399
      Not Defaulted   44   single          no                    47,971
      Defaulted       33   married         yes                   52,618
      Defaulted       47   married         no                    28,717
      Not Defaulted   33   single          no                    41,216
      Defaulted       35   married         no                    34,372
      Not Defaulted   28   single          yes                   64,811
      Not Defaulted   42   divorced        no                    53,000
      Defaulted       58   married         no                    41,375
      Not Defaulted   43   single          no                    53,778
      Not Defaulted   41   divorced        no                    44,440
      Not Defaulted   29   single          no                    51,026

      Target variable (Y): Default Status. Independent variables (Xi): Age, Marital Status, Existing Loan Status, Income.
  • 6. Example: Binary Logistic Regression: Output
      COEFFICIENTS

      Variable                  Coefficient  P value
      (Intercept)               -2.34        0.00
      Age                        0.01        0.07
      Marital Status (Married)   0.5         0.04
      Income                     0.1         0.04
      Existing loan (Yes)        0.3         0.03

      • The p value for marital status, income, and existing loan is < 0.05; hence these variables are important factors for predicting the likely default / non-default class.
      • The p value for Age is > 0.05, which means Age does not affect the prediction significantly.
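As a rough illustration of how the coefficient table can be read, the sketch below converts each reported coefficient into an odds ratio (exp of the coefficient) and applies the 0.05 p-value cut-off from the slide. The variable names and the `ALPHA` constant are choices made here; the slide does not give the units of Income, so its odds ratio is per whatever unit the model used.

```python
import math

# Coefficients and p values as reported on the slide (log-odds scale).
coefficients = {
    "Age": 0.01,
    "Marital Status (Married)": 0.5,
    "Income": 0.1,
    "Existing loan (Yes)": 0.3,
}
p_values = {
    "Age": 0.07,
    "Marital Status (Married)": 0.04,
    "Income": 0.04,
    "Existing loan (Yes)": 0.03,
}

ALPHA = 0.05  # significance cut-off used in the slides

# exp(coefficient) is the multiplicative change in the odds of the class
# coded 1 for a one-unit increase in the predictor.
odds_ratios = {name: math.exp(coef) for name, coef in coefficients.items()}

# Predictors whose p value falls below the cut-off.
significant = [name for name, p in p_values.items() if p < ALPHA]

for name, ratio in odds_ratios.items():
    flag = "significant" if name in significant else "not significant"
    print(f"{name}: odds ratio {ratio:.2f} ({flag})")
```

Read this way, a coefficient of 0.5 for Marital Status (Married) corresponds to odds roughly 1.65 times those of the reference category, all else equal.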
  • 7. Example: Binary Logistic Regression: Output
      ACTUAL VERSUS PREDICTED

                             Predicted: Defaulted  Predicted: Not defaulted
      Actual: Defaulted      35                    4
      Actual: Not defaulted  4                     70

      CLASSIFICATION ACCURACY: (35 + 70) / (35 + 70 + 4 + 4) = 105 / 113 ≈ 93%
      • Prediction accuracy is a useful criterion for assessing model performance.
      • As a rule of thumb, a model with prediction accuracy >= 70% is considered useful.
      CLASSIFICATION ERROR = 100% - accuracy ≈ 7%
      • About 7% of the records are misclassified.
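The accuracy arithmetic can be reproduced directly from the four confusion-matrix counts. This is a minimal sketch; the dictionary layout keyed by (actual, predicted) is an assumption made for illustration, and because both off-diagonal counts are 4 the result does not depend on which axis is "actual".

```python
# Confusion matrix counts from the slide, keyed by (actual, predicted).
confusion = {
    ("Defaulted", "Defaulted"): 35,
    ("Defaulted", "Not defaulted"): 4,
    ("Not defaulted", "Defaulted"): 4,
    ("Not defaulted", "Not defaulted"): 70,
}

total = sum(confusion.values())
# Correct predictions sit on the diagonal, where actual == predicted.
correct = sum(n for (actual, predicted), n in confusion.items() if actual == predicted)

accuracy = correct / total
error_rate = 1 - accuracy

print(f"accuracy   = {correct}/{total} = {accuracy:.1%}")
print(f"error rate = {error_rate:.1%}")
```

With these counts the accuracy is 105/113, or about 92.9%.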
  • 9. SAMPLE OUTPUT 1: MODEL SUMMARY
      COEFFICIENT MATRIX:

      Variable                  Coefficient  P value
      (Intercept)               -2.34        0.00
      Age                        0.01        0.07
      Marital Status (Married)   0.5         0.04
      Income                     0.1         0.04
      Existing loan (Yes)        0.3         0.03

      ACTUAL VERSUS PREDICTED:

                             Predicted: Defaulted  Predicted: Not defaulted
      Actual: Defaulted      35                    4
      Actual: Not defaulted  4                     70
  • 10. SAMPLE OUTPUT 2: PREDICTED CLASS & PROBABILITY

      Age  Marital Status  Existing Loan Status  Income  Default Status  Predicted class  Probability
      58   married         no                    46,399  Defaulted       Defaulted        0.7
      44   single          no                    47,971  Not Defaulted   Not Defaulted    0.9
      33   married         yes                   52,618  Defaulted       Defaulted        0.8
      47   married         no                    28,717  Defaulted       Defaulted        0.7
      33   single          no                    41,216  Not Defaulted   Not Defaulted    0.6
      35   married         no                    34,372  Defaulted       Not Defaulted    0.5
      28   single          yes                   64,811  Not Defaulted   Defaulted        0.4
      42   divorced        no                    53,000  Not Defaulted   Not Defaulted    0.3
      58   married         no                    41,375  Defaulted       Defaulted        0.2
      43   single          no                    53,778  Not Defaulted   Defaulted        0.1

      Thus, the output will contain a predicted class column, a confusion matrix, and a classification plot.
  • 11. SAMPLE OUTPUT 3: CLASSIFICATION PLOT
      • The less the two classes overlap in the classification plot, the better the model separates them.
  • 12. INTERPRETATION OF IMPORTANT MODEL SUMMARY STATISTICS
      Accuracy:
      • If accuracy >= 70%: the model fits the provided data well and the predicted classes are reasonably accurate.
      • If accuracy < 70%: the model does not fit the provided data well and the predicted classes are likely to contain a high rate of errors.
      Coefficients and p value:
      • If the coefficient is positive and the p value is < 0.05, the variable is positively associated with the target variable (higher values raise the predicted probability of the class coded 1).
      • If the coefficient is negative and the p value is < 0.05, the variable is negatively associated with the target variable (higher values lower that probability).
      • If the p value is > 0.05, the variable is unimportant for predicting the target variable classes.
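The interpretation rules above can be written as two small helper functions. This is only a sketch: the function names and return strings are hypothetical, and the slide does not say how a p value of exactly 0.05 should be treated, so it is grouped with "not significant" here as an assumption.

```python
def interpret_predictor(coef: float, p_value: float, alpha: float = 0.05) -> str:
    """Apply the slide's coefficient / p-value rules to one predictor."""
    # Assumption: p == alpha is treated as not significant (the slide is silent).
    if p_value >= alpha:
        return "not significant"
    if coef > 0:
        return "positive"
    if coef < 0:
        return "negative"
    return "no effect"  # coefficient of exactly zero


def interpret_accuracy(accuracy: float, threshold: float = 0.70) -> str:
    """Apply the slide's 70% accuracy rule of thumb (accuracy as a fraction)."""
    return "well fit" if accuracy >= threshold else "not well fit"


# Usage with values from the example output.
print(interpret_predictor(0.5, 0.04))   # Marital Status (Married)
print(interpret_predictor(0.01, 0.07))  # Age
print(interpret_accuracy(105 / 113))
```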
  • 13. Limitations
      • It is applicable only when the target variable is categorical.
      • Sample size must be at least 1000 in order to get reliable predictions.
      • Binary logistic regression is not suitable when the number of classes is > 2.
      • Level 1 of the target variable should represent the desired outcome; i.e., if the desired class is Yes in a response / non-response target variable, then Yes has to be recoded as 1 and No as 0.
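The recoding step in the last limitation (desired class to 1, the other class to 0) might look like the following sketch. The function name `recode_target` and its error handling are hypothetical choices; the check for more than two classes mirrors the binary-only limitation.

```python
def recode_target(labels, positive_label):
    """Recode a two-class target so that positive_label becomes 1 and the other class 0."""
    classes = set(labels)
    if len(classes) > 2:
        raise ValueError(f"binary target expected, found {len(classes)} classes")
    if positive_label not in classes:
        raise ValueError(f"{positive_label!r} does not occur in the target")
    return [1 if label == positive_label else 0 for label in labels]


# Usage: 'Yes' is the desired outcome, so it becomes level 1.
responses = ["Yes", "No", "No", "Yes"]
print(recode_target(responses, positive_label="Yes"))
```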
  • 14. General applications
      • Credit/loan approval analysis: given a list of a client's transactional attributes, predict whether the client will default on a bank loan.
      • Medical diagnosis: given a list of symptoms, predict whether a patient has disease X.
      • Rain forecasting: based on temperature, humidity, pressure, etc., predict whether it will rain.
      • Treatment effectiveness analysis: based on a patient's attributes such as blood pressure, sugar, hemoglobin, name of the drug taken, type of treatment taken, etc., check the likelihood of a disease being cured.
      • Fraud analysis: based on the bills submitted by an employee for reimbursement of food, travel, medical expenses, etc., predict the likelihood of the employee committing fraud.
  • 15. Use case 1
      Business problem:
      • A bank loans officer wants to predict whether a loan applicant will be a defaulter or non-defaulter based on attributes such as loan amount, monthly installment, employment tenure, times delinquent, annual income, debt to income ratio, etc.
      • Here the target variable would be 'past default status', and the predicted class would contain the values 'yes' or 'no', representing the 'likely to default' and 'unlikely to default' classes respectively.
      Business benefit:
      • Once classes are assigned, the bank will have a dataset of loan applicants with each applicant labeled as "likely" or "unlikely" to default.
      • Based on these labels, the bank can easily decide whether to give a loan to an applicant and, if so, what credit limit and interest rate the applicant is eligible for given the amount of risk involved.
  • 16. Use case 1: Input Dataset

      Customer ID  Loan amount  Monthly installment  Annual income  Debt to income ratio  Times delinquent  Employment tenure  Past default status
      1039153      21000        701.73               105000         9                     5                 4                  No
      1069697      15000        483.38               92000          11                    5                 2                  No
      1068120      25600        824.96               110000         10                    9                 2                  No
      563175       23000        534.94               80000          9                     2                 12                 No
      562842       19750        483.65               57228          11                    3                 21                 Yes
      562681       25000        571.78               113000         10                    0                 9                  No
      562404       21250        471.2                31008          12                    1                 12                 Yes
      700159       14400        448.99               82000          20                    6                 6                  No
      696484       10000        241.33               45000          18                    8                 2                  Yes
      702598       11700        381.61               45192          20                    7                 3                  Yes
      702470       10000        243.29               38000          17                    9                 7                  Yes
      702373       4800         144.77               54000          19                    8                 2                  Yes
  • 17. Use case 1: Output: Predicted Class
      Each record will have the predicted class assigned, as shown in the 'Likelihood to default' column:

      Customer ID  Loan amount  Monthly installment  Annual income  Debt to income ratio  Times delinquent  Employment tenure  Past default status  Likelihood to default
      1039153      21000        701.73               105000         9                     5                 4                  No                   No
      1069697      15000        483.38               92000          11                    5                 2                  No                   No
      1068120      25600        824.96               110000         10                    9                 2                  No                   No
      563175       23000        534.94               80000          9                     2                 12                 No                   No
      562842       19750        483.65               57228          11                    3                 21                 Yes                  No
      562681       25000        571.78               113000         10                    0                 9                  No                   No
      562404       21250        471.2                31008          12                    1                 12                 Yes                  Yes
      700159       14400        448.99               82000          20                    6                 6                  No                   No
      696484       10000        241.33               45000          18                    8                 2                  Yes                  Yes
      702598       11700        381.61               45192          20                    7                 3                  Yes                  Yes
      702470       10000        243.29               38000          17                    9                 7                  Yes                  Yes
      702373       4800         144.77               54000          19                    8                 2                  Yes                  No
  • 18. Use case 1: Output: Class profile

      Class (likely to default)  Average loan amount  Average monthly installment  Average annual income  Average debt to income ratio  Average times delinquent  Average employment tenure
      No                         10447.30             304.87                       66467.74               9.58                          1.69                      16.82
      Yes                        7521.32              227.43                       60935.28               16.55                         6.91                      4.01

      • As the class profile table shows, defaulters (class: Yes) and non-defaulters (class: No) have distinctive characteristics.
      • Compared with non-defaulters, defaulters tend to be delinquent more often, have a higher debt to income ratio, and have a shorter employment tenure.
      • Hence delinquency, employment tenure, and debt to income ratio are the determining factors when classifying loan applicants into likely defaulters and non-defaulters.
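A class profile of this kind is just a per-class average of each attribute. The sketch below computes one from the twelve sample rows shown for Use case 1, using three of the attributes and the 'Likelihood to default' label. Because those twelve rows are only an excerpt, the averages it prints will not match the class profile table, which was presumably computed on the full dataset; the field names are choices made here.

```python
from collections import defaultdict
from statistics import mean

FIELDS = ("debt_to_income", "times_delinquent", "employment_tenure")

# (debt to income ratio, times delinquent, employment tenure, likelihood to default)
# taken from the twelve sample rows of Use case 1.
rows = [
    (9, 5, 4, "No"), (11, 5, 2, "No"), (10, 9, 2, "No"), (9, 2, 12, "No"),
    (11, 3, 21, "No"), (10, 0, 9, "No"), (12, 1, 12, "Yes"), (20, 6, 6, "No"),
    (18, 8, 2, "Yes"), (20, 7, 3, "Yes"), (17, 9, 7, "Yes"), (19, 8, 2, "No"),
]

# Group the attribute values by predicted class.
groups = defaultdict(list)
for *values, label in rows:
    groups[label].append(values)

# Average each attribute within each class.
profile = {
    label: {field: mean(column) for field, column in zip(FIELDS, zip(*members))}
    for label, members in groups.items()
}

for label, averages in profile.items():
    print(label, averages)
```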
  • 19. Use case 2
      Business problem:
      • A doctor or pharmacist wants to predict the likelihood of a new patient's disease being cured or not cured based on attributes of the patient such as blood pressure, hemoglobin level, sugar level, name of the drug given to the patient, name of the treatment given to the patient, etc.
      • Here the target variable would be 'past cure status', and the predicted class would contain the values 'yes' or 'no', meaning 'likely to be cured' and 'unlikely to be cured' respectively.
      Business benefit:
      • Given the body profile of a patient and the recent treatments and drugs taken by him/her, the probability of a cure can be predicted, and changes in treatment or drug can be suggested if required.
  • 20. Use case 3
      Business problem:
      • An accountant or human resources manager wants to predict the likelihood of an employee committing fraud against the company based on the bills he/she has submitted so far, such as food, travel, and medical bills.
      • The target variable in this case would be 'past fraud status', and the predicted class would contain the values 'yes' or 'no', representing likely fraud and no fraud respectively.
      Business benefit:
      • Such classification can prevent the company from spending unreasonably on any employee and can in turn protect the company budget by detecting such fraud beforehand.
  • 21. Want to Learn More? Get in touch with us at support@Smarten.com, and do check out the Learning section on Smarten.com. June 2018