Machine Learning with Binary Logistic Regression - APAC

© 2020 Minitab, LLC.
Machine Learning with Binary Logistic Regression

Meet the Presenter:
Cheryl Pammer
Senior Advisory Statistician
• 25+ years of experience
• Minitab Trainer
• Statistical Consultant
• Minitab Software Designer
• Master's in Statistics

Learning Objectives
►Use Binary Logistic Regression in a Machine Learning Environment
►Discuss Methods for Model Selection Using:
 P-values
 Area Under the ROC Curve
 Information Criteria

Basic Supervised Machine Learning Algorithms
►Continuous Y: Regression, CART Regression Trees
►Categorical Y: Logistic Regression, CART Classification Trees

What Has a Machine Learned?
[Figure: data divided into Training Data and Test Data]
Data is split into a training set and a test set:
►Training (or learn) = creates model
►Test = assesses model performance

Why is Model Testing (Validation) Important?
►Assessing a model with the same data used to fit it gives an overly
optimistic picture of performance and rewards overfitting.
►Overfit models do not predict new data well.

Bias-Variance Trade-Off
[Figure: Model with High Bias vs. Model with High Variance]

Validation
Validation helps find the best balance between too simple (high bias)
and too complex (high variance)

Example
►Hospital system needs to estimate
the probability that a patient will need
to be readmitted within 30 days.
►Administrators use patient data for
the past year to determine the key
drivers of readmission and predict
readmission probability for new
patients.

Binary Logistic Regression
Models relationship between
binary response (Y) and multiple
features (X).

Binary Logistic Regression
Relationship can be expressed as an equation:
$$\ln\left[\frac{p}{1-p}\right] = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k$$
$$P(\text{Event}) = p = \frac{\exp(\beta_0 + \beta_1 x_1 + \dots + \beta_k x_k)}{1 + \exp(\beta_0 + \beta_1 x_1 + \dots + \beta_k x_k)}$$
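The two forms of the logistic equation are inverses of each other: the first gives the log-odds as a linear function of the features, the second turns that value back into a probability. A minimal Python sketch of the conversion follows. The coefficient and feature values are made up for illustration and are not taken from the readmission example.

```python
import math

def log_odds(intercept, coefs, x):
    """Linear predictor: b0 + b1*x1 + ... + bk*xk."""
    return intercept + sum(b * xi for b, xi in zip(coefs, x))

def event_probability(intercept, coefs, x):
    """Invert the logit: p = exp(eta) / (1 + exp(eta))."""
    eta = log_odds(intercept, coefs, x)
    if eta >= 0:
        # Equivalent form that avoids overflow for large positive eta
        return 1.0 / (1.0 + math.exp(-eta))
    e = math.exp(eta)
    return e / (1.0 + e)

# Hypothetical coefficients and feature values (illustration only)
b0 = -2.0
b = [0.3, 0.1]
x = [2, 5]

p = event_probability(b0, b, x)
# Taking the log-odds of p recovers the linear predictor
recovered = math.log(p / (1 - p))
print(round(p, 4), round(recovered, 4))
```

When every coefficient is zero the log-odds are zero and the probability is 0.5, which is a quick sanity check on any implementation.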

Baseline Rate

Potential Predictors
 Number of Hospital Days
 Number of Lab Procedures
 Number of Procedures
 Number of Medications
 Number of Outpatient Visits
 Number of Emergency Visits
 Number of Diagnoses
 Race
 Gender
 Age
 Admission
 Discharged To
 Diabetes

Potential Terms Up To Order 2

Stepwise Regression Using P-values
Automatically select regression or logistic regression models by
adding or removing terms, one step at a time:
►Backward Elimination: Start with full model (all terms) and
remove term with the highest p-value until everything left is
significant.
►Forward Selection: Start with an empty model (intercept only) and
add term with lowest p-value until no significant terms remain.
►Stepwise: Start with an empty model (intercept only), add term
with lowest p-value. At each step, add or remove terms based on
p-values.
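The forward selection loop described above can be sketched in a few lines of Python. This is only a skeleton under stated assumptions: `p_value_if_added` is a hypothetical callback standing in for "refit the model with this term added and return the term's p-value", the `alpha` threshold of 0.05 is an arbitrary choice, and the toy p-values are invented so the loop can run without a statistics library.

```python
def forward_select(candidates, p_value_if_added, alpha=0.05):
    """Forward selection: start empty, repeatedly add the term with the
    lowest p-value, stop when no remaining term is significant."""
    selected = []
    remaining = list(candidates)
    while remaining:
        # p-value each remaining term would have if added to the current model
        pvals = {t: p_value_if_added(selected, t) for t in remaining}
        best = min(pvals, key=pvals.get)
        if pvals[best] >= alpha:
            break  # nothing significant is left to add
        selected.append(best)
        remaining.remove(best)
    return selected

# Hypothetical stand-in for a real model refit (made-up p-values)
TOY_P_VALUES = {"x1": 0.001, "x2": 0.03, "x3": 0.40}

def toy_p_value(selected, term):
    # A real implementation would refit the logistic model here;
    # p-values generally change as other terms enter the model.
    return TOY_P_VALUES[term]

chosen = forward_select(["x1", "x2", "x3"], toy_p_value)
print(chosen)
```

Backward elimination uses the same loop in reverse: start from all terms and drop the one with the highest p-value until every remaining term is below the threshold.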

Validation With a Test Set
Hold out a random X% of the data when fitting the model.
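A random holdout can be sketched with the Python standard library alone. The function name, the 30% test fraction, and the seed below are illustrative assumptions, not settings from the presentation.

```python
import random

def train_test_split(rows, test_fraction=0.3, seed=0):
    """Shuffle a copy of the rows and hold out a fraction as the test set."""
    rng = random.Random(seed)      # fixed seed makes the split reproducible
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

rows = list(range(100))            # stand-in for 100 data rows
train, test = train_test_split(rows, test_fraction=0.3, seed=1)
print(len(train), len(test))
```

The model is fit on `train` only; `test` is used solely to assess how well the fitted model predicts rows it has not seen.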

Receiver Operating Characteristic (ROC) Curve
►Plot of True Positive Rate (Sensitivity) vs False Positive Rate (1 − Specificity)
►For a random classifier, True Positive Rate = False Positive Rate
              True Yes   True No
Model = Yes   #TP        #FP
Model = No    #FN        #TN
Sensitivity = TP/(TP + FN)
Specificity = TN/(FP + TN)
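The confusion-matrix counts and the area under the ROC curve can be computed directly from true labels and predicted probabilities. The sketch below uses a 0.5 classification threshold and six made-up observations, both assumptions for illustration. The AUC is computed through its rank interpretation: the chance that a randomly chosen event case receives a higher predicted probability than a randomly chosen non-event case, with ties counted as one half.

```python
def confusion_counts(y_true, y_prob, threshold=0.5):
    """Return (TP, FP, FN, TN) when predicting Yes if prob >= threshold."""
    tp = fp = fn = tn = 0
    for y, p in zip(y_true, y_prob):
        predicted_yes = p >= threshold
        if predicted_yes and y:
            tp += 1
        elif predicted_yes and not y:
            fp += 1
        elif y:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn

def sensitivity(tp, fn):
    return tp / (tp + fn)

def specificity(tn, fp):
    return tn / (fp + tn)

def roc_auc(y_true, y_prob):
    """Area under the ROC curve via pairwise comparison of events vs non-events."""
    pos = [p for y, p in zip(y_true, y_prob) if y]
    neg = [p for y, p in zip(y_true, y_prob) if not y]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

# Made-up labels (1 = Yes) and predicted probabilities
y_true = [1, 1, 1, 0, 0, 0]
y_prob = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]

tp, fp, fn, tn = confusion_counts(y_true, y_prob)
print(sensitivity(tp, fn), specificity(tn, fp), roc_auc(y_true, y_prob))
```

A classifier that scores at random has an AUC near 0.5, and one that ranks every event above every non-event has an AUC of 1.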

Stepwise Model Selection
Step  Predictors                                     Test ROC
1     Discharged To                                  0.5655
2     Discharged To, #Emergency Visits               0.5830
3     Discharged To, #Emergency Visits, #Diagnoses   0.6007
…
8     Final (8 Terms)                                0.6021

Visualizing Model Results

Predicting the Probability of Readmission

Problems With P-Value-Based Variable Selection
►With larger data sets, tests can be
too powerful and almost everything
is significant.
►With many potential terms, some
will be significant by chance.
►Because individual p-values are
dependent on other terms in
model, finding the correct subset of
terms is challenging.

Model Selection Strategies
►Need criterion to compare models.
 Categorical Y: Area under ROC curve (Test ROC)
 Continuous Y: Test R²
 Information criteria: AIC, BIC
►Given a criterion, need a search strategy.
 Look at all possible models (Best Subsets)
 Stepwise procedures
Many automated model fitting techniques exist. These vary
depending on type of model.

Example: Furniture Delivery
►Furniture manufacturer investigates
12 potential key drivers of defects
in setup and delivery process.
►Data represent individual deliveries
over several months.
►Response:
 Yes = Damaged
 No = Not Damaged

Information Criteria
►When a model is used to represent a population, information is lost.
►Akaike Information Criterion (AIC) and Bayesian Information Criterion
(BIC) estimate relative information loss. Less is better.

Information Criteria
Goal: Find a model that is neither underfit nor overfit.
►Assess goodness of fit using likelihood function – discourages
underfitting.
►Penalize overfitting by considering the number of model terms.
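Both ideas, rewarding fit through the likelihood and penalizing the number of terms, appear in the standard textbook definitions AIC = −2·ln(L) + 2k and BIC = −2·ln(L) + k·ln(n), where L is the maximized likelihood, k the number of estimated parameters, and n the sample size. The Python sketch below assumes those definitions; statistical software may report variants such as a small-sample correction. The log-likelihood values are hypothetical.

```python
import math

def log_likelihood(y_true, y_prob):
    """Bernoulli log-likelihood of observed 0/1 outcomes given fitted probabilities."""
    return sum(math.log(p) if y else math.log(1.0 - p)
               for y, p in zip(y_true, y_prob))

def aic(loglik, k):
    """Akaike Information Criterion: fit term plus 2 per parameter."""
    return -2.0 * loglik + 2.0 * k

def bic(loglik, k, n):
    """Bayesian Information Criterion: fit term plus ln(n) per parameter."""
    return -2.0 * loglik + k * math.log(n)

# Hypothetical comparison on n = 1000 rows (made-up log-likelihoods)
n = 1000
small = {"loglik": -400.0, "k": 3}
large = {"loglik": -395.0, "k": 5}

for name, m in (("small", small), ("large", large)):
    print(name, aic(m["loglik"], m["k"]), round(bic(m["loglik"], m["k"], n), 2))
```

In this made-up comparison AIC is lower for the larger model while BIC is lower for the smaller one, because the BIC penalty per parameter, ln(n), exceeds 2 once n is larger than about 7.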

Properties of AIC and BIC
►BIC will typically choose a model
as small or smaller than AIC
►As sample size grows to infinity
it can be shown that:
 AIC will always choose a
model that contains the true
model; it won’t leave any
variables out
 BIC will choose exactly the
right model

Forward Information Criteria Model Selection
Step  Predictors        BIC    Test ROC
1     Warehouse Time    1348   0.9977
2     + Tech Lead       1290   0.9989
3     + Team Lift       1077   0.9993
4     + Load Type       877    0.9997
5     + Stop Number     810    0.9998
6     + Driver          1122   0.9998
BIC typically results in smaller models unless n is small.

Receiver Operating Characteristic (ROC) Curve
Sensitivity = TP/(TP + FN)
Specificity = TN/(FP + TN)
              True Yes   True No
Model = Yes   #TP        #FP
Model = No    #FN        #TN

One Look at the Data…

Highly Significant, But…

The Real Key Result

Take-aways
You have learned to use:
►Binary Logistic Regression in a Machine Learning Environment
►Old and New Methods for Model Selection:
 P-values
 Area Under the ROC Curve
 Information Criteria
Questions?
Cheryl Pammer
cpammer@minitab.com

Upcoming Webinars and Virtual Events
• Machine Learning with Classification & Regression Trees (CART®)
Time: Wednesday 15 July, 12PM AEST (10AM HKT / 2PM NZST)
See all the details and sign up at:
https://info.minitab.com/resources/webinars/webinar-wednesdays

Upcoming Webinars and Virtual Events
• Online/Virtual Training
Minitab is now offering virtual training taught by
Minitab experts – perfect for remote/home workers.
Visit www.minitab.com/training/training for more info.
• Talk to Minitab
Complimentary resources to help you deal quickly with today's challenges and changing environment.
Visit www.minitab.com and click on the Talk to Minitab button and a Minitab representative will be in touch!

Our Approach: More Than Business Analytics… Solutions Analytics
Solutions analytics is our integrated approach to providing software and services that enable organizations to make better decisions that drive business excellence.
Software
• Data Analysis: Powerful statistical software everyone can use.
• Predictive Modeling: Machine learning and predictive analytics software.
• Visual Business Tools: Visual tools to process and product excellence.
• Project Oversight: Start, track, manage and execute improvement projects with real-time dashboards.
• Online Training: Master statistics and Minitab anywhere with online training.
Services
• Training: Learn first-hand by attending public or customized trainings in your facilities according to your requirements.
• Statistical Consulting: Personalized help with statistical challenges, from collecting the right data to interpreting analysis and more.
• Support: Assistance with installation, implementation, version updates and license management.
