Logistic Regression III
SIT095
The Collection and Analysis of Quantitative Data II
Week 9
Luke Sloan
Introduction
• Recap – Last Week
• Workshop Feedback
• Multinomial Logistic Regression in SPSS
• Model Interpretation
• In Class Exercise
• Writing-Up
• Summary
Recap – Last Week
• Variable selection
• Binary logistic regression in SPSS
• Model interpretation
• Intuitive results?
Workshop Feedback
TASK: To run and interpret a binary logistic regression model with ‘Sex’ as the dependent variable using your own choice of independent variables
Were your models successful?
Did you have any problems or issues?
Did you find anything interesting (interpretation of odds ratios)?
Did you have difficulty in interpretation?
TODAY: I will show you how to run and interpret a multinomial logistic model in SPSS. I will use a different dependent variable (‘edlev7’) and the same dataset.
Multinomial Logistic Regression in SPSS I
• Very similar to binary logistic regression
• For a categorical dependent variable with more than two categories
• ‘edlev7’ asks for the highest educational qualification of a respondent and
has three categories: ‘Higher Education’, ‘Other Qualification’ and ‘None’
• One of these categories has to be designated a ‘reference category’ to
which the others will be compared
• E.g. if ‘None’ is the ‘reference category’…
– Respondents who had Higher Education qualifications were more likely to be female (odds ratio of 2.3) than respondents with no qualifications
– Respondents who had other qualifications were less likely to be female (odds ratio of 0.45) than respondents with no qualifications
It is not possible to compare groups that are not the ‘reference category’ i.e. we cannot draw comparisons between ‘Higher Education’ and ‘Other Qualification’ directly
Multinomial Logistic Regression in SPSS II
Education Level - 2000 (3 groups)
                        Frequency   Percent   Valid Percent   Cumulative Percent
Valid     HIGHER EDUCAT 2015        24.5      31.2            31.2
          OTHER QUAL    2826        34.4      43.8            75.0
          NONE          1614        19.6      25.0            100.0
          Total         6455        78.5      100.0
Missing   NEV WENT SCH  16          .2
          NA            4           .0
          AGEOUT,MSPR   1745        21.2
          System        1           .0
          Total         1766        21.5
Total                   8221        100.0
Deciding on a ‘reference category’ should be an informed decision – what
do we want to compare?
As a rule of thumb, the ‘reference category’ should be the most populated response (highest frequency), but this can be overruled by your research agenda
In this case I am going to use ‘Other Qualification’ for several reasons: largest group, median point and interesting from a theoretical perspective (the difference between ‘Other Qual’ and ‘Higher Education’ might question the value of studying at university…)
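The rule of thumb above can be checked quickly outside SPSS. The sketch below is only an illustration: it copies the counts from the frequency table on this slide, recomputes the valid percentages and picks out the most populated category. The variable names are made up for the example.

```python
# Frequencies copied from the 'Education Level - 2000 (3 groups)' table.
valid_counts = {"HIGHER EDUCAT": 2015, "OTHER QUAL": 2826, "NONE": 1614}
missing_total = 1766

valid_total = sum(valid_counts.values())
grand_total = valid_total + missing_total

# 'Valid Percent' uses only the valid cases as the denominator.
valid_percent = {
    name: round(100 * count / valid_total, 1)
    for name, count in valid_counts.items()
}

# Rule of thumb: the most populated response is the default candidate
# for the reference category (the research agenda can overrule it).
largest = max(valid_counts, key=valid_counts.get)

for name, count in valid_counts.items():
    print(f"{name:<14} {count:>5} {valid_percent[name]:>6.1f}%")
print("Most populated category:", largest)
```

The frequency check only suggests a candidate; the final choice still depends on which comparisons you want to make.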
Multinomial Logistic Regression in SPSS III
• You still need to select your variables carefully
• Consider hypotheses, frequencies, recoding, relationships and multicollinearity
• My variables (including recodes):
– ‘manual2’ (non-manual/manual)
– ‘ethnic2’ (white/non-white)
– ‘marital2’ (married/cohabiting/single/widowed/divorced or separated)
– ‘seefrnd2’ (weekly/monthly/less than monthly/not in last year)
– ‘cntctmp’ (yes/no)
– ‘age’ (in years)
– ‘alcdrug2’ (very big problem/fairly big problem/minor problem/not a problem/happens but is not a problem)
– ‘influence2’ (yes/no)
Excluded due to multicollinearity – could be interesting…
Multinomial Logistic Regression in SPSS IV
1) To begin, go to ‘Analyze’, ‘Regression’ and select ‘Multinomial Logistic…’
2) Your dependent goes here
3) Click on ‘Reference Category…’
By default SPSS will use the last category in your independent categorical variables as the ‘reference category’
Multinomial Logistic Regression in SPSS V
You need to tell SPSS which response for the dependent variable you want to be used as the ‘reference category’
4) Because ‘Other Qualification’ is coded as ‘2’ in our dataset and we want to use this as the ‘reference category’ we select ‘Custom’ and type the value (‘2’)
‘Category Order’ is important when specifying ‘First Category’ or ‘Last Category’ – always a good idea to specify a custom value manually
5) Click ‘Continue’
Multinomial Logistic Regression in SPSS VI
Notice that the dependent is now followed by ‘(Custom)’
6) Your categorical independent variables (factors) go here
7) Your interval independent variables (covariates) go here
8) Click on ‘Statistics…’
Multinomial Logistic Regression in SPSS VII
9) Select ‘Information Criteria’, ‘Cell probabilities’, ‘Classification table’ and ‘Goodness-of-fit’
Note that some options are already selected – leave them as they are
10) Click ‘Continue’
Multinomial Logistic Regression in SPSS VIII
11) Click ‘Save…’
Multinomial Logistic Regression in SPSS IX
12) Select ‘Estimated response probabilities’, ‘Predicted category’, ‘Predicted category probability’ and ‘Actual category probability’
These values will be saved as variables on the datasheet for later analysis
Ignore this option as we are not interested in exporting the model
13) Click ‘Continue’
Multinomial Logistic Regression in SPSS X
14) Click ‘OK’ to run the model
Model Interpretation I
Case Processing Summary
Variable                            Category            N       Marginal Percentage
Education Level - 2000 (3 groups)   HIGHER EDUCAT       1942    32.2%
                                    OTHER QUAL          2575    42.7%
                                    NONE                1515    25.1%
Manual or non manual                Non-Manual          3558    59.0%
                                    Manual              2474    41.0%
Ethnicity                           White               5760    95.5%
                                    Non-White           272     4.5%
Marital status                      married             3043    50.4%
                                    cohabiting&SSC      547     9.1%
                                    single              1291    21.4%
                                    widowed             277     4.6%
                                    div/sep             874     14.5%
See friends                         Weekly              4620    76.6%
                                    Monthly             871     14.4%
                                    Less Than Monthly   429     7.1%
                                    Not In Last Year    112     1.9%
contacted MP                        no                  5344    88.6%
                                    yes                 688     11.4%
Valid                                                   6032    100.0%
Missing                                                 2189
Total                                                   8221
Subpopulation                                           1511(a)
a. The dependent variable has only one value observed in 846 (56.0%) subpopulations.
This table tells us the frequencies and percentages of respondents from the dataset that fall into each category for all the categorical variables (including the dependent)
We need to look out for low frequencies – but this shouldn’t be a problem if you’ve chosen your variables rigorously!
Notice the number of valid cases – i.e. cases without missing data (remember the assumptions!)
Model Interpretation II
Model Fitting Information
                 Model Fitting Criteria                      Likelihood Ratio Tests
Model            AIC        BIC        -2 Log Likelihood     Chi-Square   df   Sig.
Intercept Only   6820.102   6833.512   6816.102
Final            5074.633   5235.549   5026.633              1789.468     22   .000
This table tells us whether our model is a significant improvement on the ‘intercept only’ (null) model
p<0.05 means rejecting the null hypothesis that there is no difference between the ‘intercept only’ model and the final model (the one containing our predictors)
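To see where the numbers in the Model Fitting Information table come from, here is a minimal sketch that rebuilds the chi-square, AIC and BIC from the two -2 log likelihood values. The parameter counts (2 for the intercept-only model, 24 for the final model) are not stated on the slides; they are an assumption inferred from the 22 degrees of freedom and the two non-reference categories, and they happen to reproduce the SPSS figures.

```python
import math

# Values copied from the Model Fitting Information and
# Case Processing Summary tables.
N_VALID = 6032            # valid cases
NEG2LL_NULL = 6816.102    # -2 log likelihood, intercept-only model
NEG2LL_FINAL = 5026.633   # -2 log likelihood, final model

# Assumption: parameter counts inferred from the output, not given on the slides.
K_NULL = 2                # one intercept per non-reference outcome category
K_FINAL = K_NULL + 22     # intercepts plus the 22 df of the likelihood ratio test


def aic(neg2ll, k):
    """Akaike information criterion: -2LL plus 2 per parameter."""
    return neg2ll + 2 * k


def bic(neg2ll, k, n):
    """Bayesian information criterion: -2LL plus ln(n) per parameter."""
    return neg2ll + k * math.log(n)


# The likelihood ratio chi-square is the drop in -2LL between the two models.
lr_chi_square = NEG2LL_NULL - NEG2LL_FINAL

aic_null, aic_final = aic(NEG2LL_NULL, K_NULL), aic(NEG2LL_FINAL, K_FINAL)
bic_null = bic(NEG2LL_NULL, K_NULL, N_VALID)
bic_final = bic(NEG2LL_FINAL, K_FINAL, N_VALID)

print(f"LR chi-square: {lr_chi_square:.3f} on {K_FINAL - K_NULL} df")
print(f"AIC: {aic_null:.3f} (null) vs {aic_final:.3f} (final)")
print(f"BIC: {bic_null:.3f} (null) vs {bic_final:.3f} (final)")
```

Lower AIC and BIC for the final model point the same way as the significant chi-square: the predictors improve on the null model.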
Model Interpretation III
Goodness-of-Fit
           Chi-Square   df     Sig.
Pearson    3211.136     2998   .003
Deviance   3114.276     2998   .068
Pseudo R-Square
Cox and Snell   .257
Nagelkerke      .291
McFadden        .138
The pseudo R-square gives a rough indication of how much of the variation in the dependent variable is explained by the model – low values are normal in logistic regression (think about variance in dependent!)
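For anyone curious how the three pseudo R-square values relate to each other, the sketch below recomputes them with the standard textbook formulas. It rests on one assumption that is not on the slides: the null log likelihood is rebuilt from the marginal frequencies of the dependent variable in the Case Processing Summary. Under that assumption the results agree with the SPSS table to three decimals.

```python
import math

# Marginal frequencies of the dependent variable (Case Processing Summary).
counts = {"HIGHER EDUCAT": 1942, "OTHER QUAL": 2575, "NONE": 1515}
n = sum(counts.values())

# Likelihood ratio chi-square from the Model Fitting Information table.
lr_chi_square = 1789.468

# Assumption: case-level log likelihood of the intercept-only model,
# which depends only on the observed category proportions.
ll_null = sum(c * math.log(c / n) for c in counts.values())
ll_final = ll_null + lr_chi_square / 2

# Standard definitions of the three statistics.
cox_snell = 1 - math.exp(-lr_chi_square / n)
nagelkerke = cox_snell / (1 - math.exp(2 * ll_null / n))
mcfadden = 1 - ll_final / ll_null

print(f"Cox and Snell: {cox_snell:.3f}")
print(f"Nagelkerke:    {nagelkerke:.3f}")
print(f"McFadden:      {mcfadden:.3f}")
```

Nagelkerke simply rescales Cox and Snell so that its maximum is 1, which is why it is always the larger of the two.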
Both of these statistics test how well the model fits the data (expected and actual values) and p<0.05 means that there is a significant difference between the two i.e. the model is not a good fit!
According to the Pearson statistic the model is a bad fit, but the Deviance statistic suggests otherwise (but not by much!)
This could be due to low frequencies in crosstabs or ‘overdispersion’ (see Field 2009:308) – subjective judgment…
Model Interpretation V
Likelihood Ratio Tests
            Model Fitting Criteria                                                              Likelihood Ratio Tests
Effect      AIC of Reduced Model   BIC of Reduced Model   -2 Log Likelihood of Reduced Model   Chi-Square   df   Sig.
Intercept   5074.633               5235.549               5026.633                             .000         0    .
age         5605.268               5752.774               5561.268                             534.634      2    .000
manual2     6018.795               6166.302               5974.795                             948.162      2    .000
Ethnic2     5074.901               5222.408               5030.901                             4.268        2    .118
marital2    5087.697               5194.974               5055.697                             29.064       8    .000
seefrnd2    5075.437               5196.124               5039.437                             12.804       6    .046
cntctmp     5096.844               5244.350               5052.844                             26.210       2    .000
The chi-square statistic is the difference in -2 log-likelihoods between the final model and a reduced model. The reduced model is formed by omitting an effect from the final model. The null hypothesis is that all parameters of that effect are 0.
This table tells us which independent variables had a significant effect in our model
Ethnicity (‘Ethnic2’) is the only predictor that does not significantly affect the highest educational qualification of a respondent in the model
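The Likelihood Ratio Tests table can be rebuilt by hand, which helps to show what each column means. The sketch below subtracts the final model's -2 log likelihood from each reduced model's value and converts the difference into a p-value. The helper `chi2_sf_even_df` is written for this example only; it uses the closed-form upper tail of the chi-square distribution, which is valid only for even degrees of freedom (all the effects here have even df).

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; closed form, even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this shortcut only works for positive even df")
    half = x / 2
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


NEG2LL_FINAL = 5026.633

# (-2 log likelihood of the reduced model, df), copied from the table.
reduced_models = {
    "age": (5561.268, 2),
    "manual2": (5974.795, 2),
    "Ethnic2": (5030.901, 2),
    "marital2": (5055.697, 8),
    "seefrnd2": (5039.437, 6),
    "cntctmp": (5052.844, 2),
}

results = {}
for effect, (neg2ll_reduced, df) in reduced_models.items():
    chi_square = neg2ll_reduced - NEG2LL_FINAL
    results[effect] = (chi_square, df, chi2_sf_even_df(chi_square, df))

not_significant = [name for name, (_, _, p) in results.items() if p >= 0.05]

for effect, (chi_square, df, p) in results.items():
    print(f"{effect:<9} chi-square={chi_square:8.3f} df={df} p={p:.3f}")
print("Not significant at 0.05:", not_significant)
```

Small differences in the third decimal of the chi-square values come from rounding in the printed SPSS output.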
Model Interpretation VI
Parameter Estimates
Education Level - 2000 (3 groups)(a): HIGHER EDUCAT
                                                                      95% Confidence Interval for Exp(B)
Parameter         B       Std. Error  Wald      df   Sig.   Exp(B)   Lower Bound   Upper Bound
Intercept         -.988   .372        7.063     1    .008
age               .000    .003        .028      1    .867   1.000    .994          1.005
[manual2=1.00]    1.282   .073        309.342   1    .000   3.602    3.123         4.156
[manual2=2.00]    0(b)    .           .         0    .      .        .             .
[Ethnic2=1.00]    -.298   .146        4.181     1    .041   .742     .558          .988
[Ethnic2=2.00]    0(b)    .           .         0    .      .        .             .
[marital2=1.00]   .113    .098        1.340     1    .247   1.120    .925          1.356
[marital2=2.00]   .268    .134        3.992     1    .046   1.307    1.005         1.701
[marital2=3.00]   .123    .114        1.156     1    .282   1.130    .904          1.413
[marital2=4.00]   -.310   .207        2.242     1    .134   .734     .489          1.100
[marital2=5.00]   0(b)    .           .         0    .      .        .             .
[seefrnd2=1.00]   .204    .301        .461      1    .497   1.226    .680          2.211
[seefrnd2=2.00]   .193    .309        .391      1    .532   1.213    .662          2.222
[seefrnd2=3.00]   .305    .321        .906      1    .341   1.357    .724          2.543
[seefrnd2=4.00]   0(b)    .           .         0    .      .        .             .
[cntctmp=0]       -.249   .094        6.993     1    .008   .780     .649          .938
[cntctmp=1]       0(b)    .           .         0    .      .        .             .
Because we are comparing both ‘Higher Education’ and ‘No Qualification’ with the reference category ‘Other Qualification’ we are given two parameter estimate tables
This is the parameter estimates table comparing respondents with a ‘Higher Education Qualification’ with respondents with an ‘Other Qualification’
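The columns of the parameter estimates table are linked by simple arithmetic, and it can help to check one row by hand. The sketch below takes B and its standard error for [manual2=1.00] in the ‘Higher Education’ table and recomputes Exp(B), the Wald statistic and the 95% confidence interval. The function name is invented for the example, and 1.96 is the usual normal critical value for a 95% interval. Because the printed B and standard error are rounded to three decimals, the results match the SPSS output only approximately (the Wald statistic in particular).

```python
import math


def summarise_coefficient(b, std_error, z=1.96):
    """Odds ratio, Wald statistic and confidence interval for one coefficient."""
    return {
        "exp_b": math.exp(b),                    # odds ratio
        "wald": (b / std_error) ** 2,            # Wald chi-square, 1 df
        "lower": math.exp(b - z * std_error),    # lower bound for Exp(B)
        "upper": math.exp(b + z * std_error),    # upper bound for Exp(B)
    }


# [manual2=1.00] in the HIGHER EDUCAT table: B = 1.282, Std. Error = .073
row = summarise_coefficient(1.282, 0.073)

for name, value in row.items():
    print(f"{name:<6} {value:.3f}")

# A confidence interval that excludes 1 goes with a significant coefficient.
interval_excludes_one = row["lower"] > 1 or row["upper"] < 1
print("Interval excludes 1:", interval_excludes_one)
```

The same function can be applied to any other row, for example [Ethnic2=1.00], to see why a coefficient with an interval close to 1 is only marginally significant.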
Model Interpretation VII
Education Level - 2000 (3 groups)(a): NONE
                                                                      95% Confidence Interval for Exp(B)
Parameter         B       Std. Error  Wald      df   Sig.   Exp(B)   Lower Bound   Upper Bound
Intercept         -2.705  .357        57.555    1    .000
age               .065    .003        428.739   1    .000   1.068    1.061         1.074
[manual2=1.00]    -1.184  .074        255.802   1    .000   .306     .265          .354
[manual2=2.00]    0(b)    .           .         0    .      .        .             .
[Ethnic2=1.00]    -.164   .182        .806      1    .369   .849     .594          1.214
[Ethnic2=2.00]    0(b)    .           .         0    .      .        .             .
[marital2=1.00]   -.215   .100        4.618     1    .032   .806     .663          .981
[marital2=2.00]   -.195   .165        1.384     1    .239   .823     .595          1.138
[marital2=3.00]   .093    .125        .550      1    .458   1.097    .859          1.401
[marital2=4.00]   .062    .174        .128      1    .721   1.064    .757          1.496
[marital2=5.00]   0(b)    .           .         0    .      .        .             .
[seefrnd2=1.00]   -.468   .240        3.811     1    .051   .627     .392          1.002
[seefrnd2=2.00]   -.664   .255        6.781     1    .009   .515     .312          .848
[seefrnd2=3.00]   -.273   .270        1.018     1    .313   .761     .448          1.293
[seefrnd2=4.00]   0(b)    .           .         0    .      .        .             .
[cntctmp=0]       .392    .121        10.525    1    .001   1.480    1.168         1.875
[cntctmp=1]       0(b)    .           .         0    .      .        .             .
a. The reference category is: OTHER QUAL.
b. This parameter is set to zero because it is redundant.
This is the parameter estimates table comparing respondents with ‘No Qualification’ with respondents with an ‘Other Qualification’
The interpretation of results is exactly the same as for binary logistic regression – SPSS doesn’t provide a parameter coding table, so you need to work this out manually
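The two tables together define the ‘Estimated response probabilities’ that SPSS saves to the datasheet. As a rough sketch of how that works, the code below combines the B coefficients from both tables into predicted probabilities for one hypothetical respondent. The respondent is invented for illustration. The codings for ‘Ethnic2’, ‘marital2’ and ‘cntctmp’ (1 = White, 1 = married, 0 = no) are assumptions read off the category order in the Case Processing Summary, so check them against your own dataset.

```python
import math

REFERENCE = "OTHER QUAL"

# B coefficients copied from the two Parameter Estimates tables.
# Redundant (reference) levels have B = 0 and are left out.
COEFS = {
    "HIGHER EDUCAT": {
        "Intercept": -0.988, "age": 0.000,
        "manual2=1": 1.282, "Ethnic2=1": -0.298,
        "marital2=1": 0.113, "marital2=2": 0.268,
        "marital2=3": 0.123, "marital2=4": -0.310,
        "seefrnd2=1": 0.204, "seefrnd2=2": 0.193, "seefrnd2=3": 0.305,
        "cntctmp=0": -0.249,
    },
    "NONE": {
        "Intercept": -2.705, "age": 0.065,
        "manual2=1": -1.184, "Ethnic2=1": -0.164,
        "marital2=1": -0.215, "marital2=2": -0.195,
        "marital2=3": 0.093, "marital2=4": 0.062,
        "seefrnd2=1": -0.468, "seefrnd2=2": -0.664, "seefrnd2=3": -0.273,
        "cntctmp=0": 0.392,
    },
}


def predict(age, levels):
    """Return the predicted probability of each education category."""
    logits = {REFERENCE: 0.0}  # the reference category is fixed at zero
    for category, b in COEFS.items():
        logits[category] = (
            b["Intercept"]
            + b["age"] * age
            + sum(b.get(level, 0.0) for level in levels)
        )
    denominator = sum(math.exp(value) for value in logits.values())
    return {cat: math.exp(value) / denominator for cat, value in logits.items()}


# Hypothetical respondent: aged 40, non-manual, White, married,
# sees friends weekly, has not contacted an MP (codings assumed as above).
respondent = {"manual2=1", "Ethnic2=1", "marital2=1", "seefrnd2=1", "cntctmp=0"}
probabilities = predict(40, respondent)
predicted_category = max(probabilities, key=probabilities.get)

for category, p in probabilities.items():
    print(f"{category:<14} {p:.3f}")
print("Predicted category:", predicted_category)
```

The ‘Predicted category’ SPSS saves is simply the category with the highest of these probabilities, which is also what the classification table is built from.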
Model Interpretation VIII
Classification
                     Predicted
Observed             HIGHER EDUCAT   OTHER QUAL   NONE    Percent Correct
HIGHER EDUCAT        1405            402          135     72.3%
OTHER QUAL           1217            943          415     36.6%
NONE                 319             428          768     50.7%
Overall Percentage   48.8%           29.4%        21.9%   51.7%
Finally you are given a classification table that tells you how well the predictive model
performed – look for misclassifications and ask yourself why… you can always run a
new and improved model!
The model has trouble with ‘Other Qualification’ respondents – it tries to assign many of them to ‘Higher Education’
51.7% correctly predicted is okay – but the model is best at predicting respondents
with ‘Higher Education’ qualifications… can you do better?
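The percentages in the classification table can be recomputed from the counts, which also gives a benchmark for "can you do better?". The sketch below copies the counts from the table; the baseline figure (always guessing the most common observed category) is an addition for comparison and is not part of the SPSS output.

```python
labels = ["HIGHER EDUCAT", "OTHER QUAL", "NONE"]

# Rows are observed categories, columns are predicted categories.
table = [
    [1405, 402, 135],
    [1217, 943, 415],
    [319, 428, 768],
]

total = sum(sum(row) for row in table)

# Percent correct per observed category: diagonal cell over row total.
row_correct = {
    labels[i]: 100 * table[i][i] / sum(table[i]) for i in range(len(labels))
}

# Overall percent correct: sum of the diagonal over all cases.
overall = 100 * sum(table[i][i] for i in range(len(labels))) / total

# Share of all cases the model assigns to each category (bottom row).
predicted_share = {
    labels[j]: 100 * sum(row[j] for row in table) / total
    for j in range(len(labels))
}

# Benchmark (not SPSS output): always predict the largest observed category.
baseline = 100 * max(sum(row) for row in table) / total

for name in labels:
    print(f"{name:<14} correct: {row_correct[name]:.1f}%  "
          f"predicted share: {predicted_share[name]:.1f}%")
print(f"Overall: {overall:.1f}%  (baseline: {baseline:.1f}%)")
```

Comparing the overall figure with the baseline shows how much the predictors add beyond simply guessing the largest group.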
In Class Exercise
• Work in small groups to interpret the results of my model
(the odds ratios) for ‘manual2’ and ‘seefrnd2’
• Remember to…
– Look for significance
– Negative or positive coefficient?
– Interpret the Exp(B) (odds ratio)
– We are not comparing ‘No Qual’ with ‘HE Qual’
You need to know that…
[‘manual2’ = 1.00] refers to non-manual respondent
[‘manual2’ = 2.00] refers to manual respondent (reference category)
[‘seefrnd2’ = 1.00] refers to seeing friends weekly
[‘seefrnd2’ = 2.00] refers to seeing friends monthly
[‘seefrnd2’ = 3.00] refers to seeing friends less than monthly
[‘seefrnd2’ = 4.00] refers to seeing friends not in the last year (reference category)
Writing-Up I
• Report the test results from the output – always give the test statistic, degrees of
freedom (if appropriate) and the p-value
• Always explain what the test result means for your model
• Remember – if your model doesn’t fit then there’s no point in writing about it!
• Report which coefficients are not significant – offer an explanation as to why (why
were your hypotheses and bivariate tests wrong?... complexity of interactions?)
• Regarding reporting odds ratios:
– Report whether the odds increase or decrease
– Give the odds ratio (or the percentage change in the odds if you prefer)
– Give the degrees of freedom
– Give the Wald statistic
• Remember to say ‘all other things being equal’ every now and again!
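If you prefer to report a percentage change in the odds rather than the odds ratio itself, the conversion is (odds ratio - 1) x 100. The helper below is a small sketch of that conversion; the function name and wording of the output are made up for the example, and the two odds ratios are the ‘manual2’ values from the parameter estimates tables.

```python
def describe_odds_ratio(odds_ratio):
    """Turn an odds ratio into a percentage change in the odds."""
    change = (odds_ratio - 1) * 100
    if change == 0:
        return "odds unchanged"
    direction = "increase" if change > 0 else "decrease"
    return f"odds {direction} by {abs(change):.1f}%"


# Exp(B) for [manual2=1.00] in each comparison with 'Other Qualification'.
print("Higher Education vs Other:", describe_odds_ratio(3.602))
print("None vs Other:            ", describe_odds_ratio(0.306))
```

Note that this is a change in the odds, not in the probability, so avoid phrases like "percentage points" when writing it up.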
Writing-Up II
EXAMPLE:
The coefficient for the variable ‘manual2’ (whether a respondent has a manual or non-manual occupation) was significant in both comparisons: higher education versus ‘other’ qualification, and no qualification versus ‘other’ qualification.
Non-manual respondents were much more likely than manual respondents to have a higher education qualification rather than an ‘other’ qualification (odds ratio = 3.6, 1 d.f., Wald = 309.34, p < 0.001), all other things being equal.
Also, non-manual respondents were much less likely than manual respondents to have no qualifications rather than an ‘other’ qualification (odds ratio = 0.31, 1 d.f., Wald = 255.80, p < 0.001), all other things being equal.
Although the language is awkward we can summarise by saying that respondents with
higher education qualifications are more likely to have non-manual jobs than
respondents with ‘other’ qualifications. Also, respondents with no qualifications are
less likely to have non-manual jobs than respondents with ‘other’ qualifications. Both
of these statements are made in reference to respondents who have manual
occupations (the dummy ref cat.) and with ‘other’ qualifications (DV ref cat.)
Summary
• Binary and multinomial models are very
similar, but notice the subtle differences
• Again, interpretation of the coefficients and Exp(B) is the tricky bit
• The models are very powerful, even when
saying ‘more likely’ or ‘less likely’
Workshop Task
• Run a multinomial logistic regression model with the dependent variable
‘edlev7’
• See if you can get a better prediction rate than me!
• Use everything you’ve learnt over the past weeks, starting with the proper
procedure for variable selection
• Use these slides to check that the model works (follow my step-by-step
guide to operation and interpretation)
• Interpret the odds ratios and draw some conclusions about your model
• If your model doesn’t work then work in pairs
• This technique is advanced, so ask for help if you are unsure

More Related Content

Similar to SIT095_Lecture_9_Logistic_Regression_Part_3.pptx

Welcome to International Journal of Engineering Research and Development (IJERD)
Welcome to International Journal of Engineering Research and Development (IJERD)Welcome to International Journal of Engineering Research and Development (IJERD)
Welcome to International Journal of Engineering Research and Development (IJERD)
IJERD Editor
 
Ch17 lab r_verdu103: Entry level statistics exercise (descriptives)
Ch17 lab r_verdu103: Entry level statistics exercise (descriptives)Ch17 lab r_verdu103: Entry level statistics exercise (descriptives)
Ch17 lab r_verdu103: Entry level statistics exercise (descriptives)
Sherri Gunder
 
Mb0050 research methodology
Mb0050   research methodologyMb0050   research methodology
Mb0050 research methodology
smumbahelp
 

Similar to SIT095_Lecture_9_Logistic_Regression_Part_3.pptx (20)

Visual Tools for explaining Machine Learning Models
Visual Tools for explaining Machine Learning ModelsVisual Tools for explaining Machine Learning Models
Visual Tools for explaining Machine Learning Models
 
Bivariate Regression
Bivariate RegressionBivariate Regression
Bivariate Regression
 
analysis part 02.pptx
analysis part 02.pptxanalysis part 02.pptx
analysis part 02.pptx
 
PSYCH 625 MENTOR Become Exceptional--psych625mentor.com
PSYCH 625 MENTOR Become Exceptional--psych625mentor.comPSYCH 625 MENTOR Become Exceptional--psych625mentor.com
PSYCH 625 MENTOR Become Exceptional--psych625mentor.com
 
Welcome to International Journal of Engineering Research and Development (IJERD)
Welcome to International Journal of Engineering Research and Development (IJERD)Welcome to International Journal of Engineering Research and Development (IJERD)
Welcome to International Journal of Engineering Research and Development (IJERD)
 
Ch17 lab r_verdu103: Entry level statistics exercise (descriptives)
Ch17 lab r_verdu103: Entry level statistics exercise (descriptives)Ch17 lab r_verdu103: Entry level statistics exercise (descriptives)
Ch17 lab r_verdu103: Entry level statistics exercise (descriptives)
 
PSYCH 625 Exceptional Education - snaptutorial.com
PSYCH 625   Exceptional Education - snaptutorial.comPSYCH 625   Exceptional Education - snaptutorial.com
PSYCH 625 Exceptional Education - snaptutorial.com
 
Descriptive Statistics, Numerical Description
Descriptive Statistics, Numerical DescriptionDescriptive Statistics, Numerical Description
Descriptive Statistics, Numerical Description
 
Eco550 Assignment 1
Eco550 Assignment 1Eco550 Assignment 1
Eco550 Assignment 1
 
PSYCH 625 MENTOR Education Counseling -- psych625mentor.com
PSYCH 625 MENTOR Education Counseling -- psych625mentor.comPSYCH 625 MENTOR Education Counseling -- psych625mentor.com
PSYCH 625 MENTOR Education Counseling -- psych625mentor.com
 
PSYCH 625 MENTOR Redefined Education--psych625mentor.com
PSYCH 625 MENTOR Redefined Education--psych625mentor.comPSYCH 625 MENTOR Redefined Education--psych625mentor.com
PSYCH 625 MENTOR Redefined Education--psych625mentor.com
 
PSYCH 625 MENTOR Education for Service-- psych625mentor.com
PSYCH 625 MENTOR Education for Service-- psych625mentor.comPSYCH 625 MENTOR Education for Service-- psych625mentor.com
PSYCH 625 MENTOR Education for Service-- psych625mentor.com
 
PSYCH 625 MENTOR Inspiring Innovation--psych625mentor.com
PSYCH 625 MENTOR Inspiring Innovation--psych625mentor.comPSYCH 625 MENTOR Inspiring Innovation--psych625mentor.com
PSYCH 625 MENTOR Inspiring Innovation--psych625mentor.com
 
PSYCH 625 MENTOR Knowledge is divine--psych625mentor.com
PSYCH 625 MENTOR Knowledge is divine--psych625mentor.comPSYCH 625 MENTOR Knowledge is divine--psych625mentor.com
PSYCH 625 MENTOR Knowledge is divine--psych625mentor.com
 
PSYCH 625 MENTOR Education Planning--psych625mentor.com
PSYCH 625 MENTOR Education Planning--psych625mentor.comPSYCH 625 MENTOR Education Planning--psych625mentor.com
PSYCH 625 MENTOR Education Planning--psych625mentor.com
 
PSYCH 625 MENTOR Achievement Education / psych625mentor.com
PSYCH 625 MENTOR Achievement Education / psych625mentor.comPSYCH 625 MENTOR Achievement Education / psych625mentor.com
PSYCH 625 MENTOR Achievement Education / psych625mentor.com
 
Statistics
StatisticsStatistics
Statistics
 
Mb0050 research methodology
Mb0050   research methodologyMb0050   research methodology
Mb0050 research methodology
 
Basics of Stats (2).pptx
Basics of Stats (2).pptxBasics of Stats (2).pptx
Basics of Stats (2).pptx
 
Module 3 statistics
Module 3   statisticsModule 3   statistics
Module 3 statistics
 

More from dawitg2

unit1_practical-applications-of-biotechnology.ppt
unit1_practical-applications-of-biotechnology.pptunit1_practical-applications-of-biotechnology.ppt
unit1_practical-applications-of-biotechnology.ppt
dawitg2
 
fertilizers-150111082613-conversion-gate01.pdf
fertilizers-150111082613-conversion-gate01.pdffertilizers-150111082613-conversion-gate01.pdf
fertilizers-150111082613-conversion-gate01.pdf
dawitg2
 
sac-301-manuresfertilizersandsoilfertilitymanagement-210105122952.pdf
sac-301-manuresfertilizersandsoilfertilitymanagement-210105122952.pdfsac-301-manuresfertilizersandsoilfertilitymanagement-210105122952.pdf
sac-301-manuresfertilizersandsoilfertilitymanagement-210105122952.pdf
dawitg2
 
davindergill 135021014 -170426133338.pdf
davindergill 135021014 -170426133338.pdfdavindergill 135021014 -170426133338.pdf
davindergill 135021014 -170426133338.pdf
dawitg2
 
preparationofdifferentagro-chemicaldosesforfield-140203005533-phpapp02.pdf
preparationofdifferentagro-chemicaldosesforfield-140203005533-phpapp02.pdfpreparationofdifferentagro-chemicaldosesforfield-140203005533-phpapp02.pdf
preparationofdifferentagro-chemicaldosesforfield-140203005533-phpapp02.pdf
dawitg2
 
-pre-cautions--on--seed-storage--1-.pptx
-pre-cautions--on--seed-storage--1-.pptx-pre-cautions--on--seed-storage--1-.pptx
-pre-cautions--on--seed-storage--1-.pptx
dawitg2
 
Intgrated pest management 02 _ 06_05.pdf
Intgrated pest management 02 _ 06_05.pdfIntgrated pest management 02 _ 06_05.pdf
Intgrated pest management 02 _ 06_05.pdf
dawitg2
 
pesticide formulation pp602lec1to2-211207073233.pdf
pesticide formulation pp602lec1to2-211207073233.pdfpesticide formulation pp602lec1to2-211207073233.pdf
pesticide formulation pp602lec1to2-211207073233.pdf
dawitg2
 
diseasespests-2013-130708184617-phpapp02.pptx
diseasespests-2013-130708184617-phpapp02.pptxdiseasespests-2013-130708184617-phpapp02.pptx
diseasespests-2013-130708184617-phpapp02.pptx
dawitg2
 
agricultaraly important agrochemicals.ppt
agricultaraly important agrochemicals.pptagricultaraly important agrochemicals.ppt
agricultaraly important agrochemicals.ppt
dawitg2
 
agri pesticide chemistry-180722174627.pdf
agri pesticide chemistry-180722174627.pdfagri pesticide chemistry-180722174627.pdf
agri pesticide chemistry-180722174627.pdf
dawitg2
 
372922285 -important Fungal-Nutrition.ppt
372922285 -important Fungal-Nutrition.ppt372922285 -important Fungal-Nutrition.ppt
372922285 -important Fungal-Nutrition.ppt
dawitg2
 
disease development and pathogenesis-201118142432.pptx
disease development and pathogenesis-201118142432.pptxdisease development and pathogenesis-201118142432.pptx
disease development and pathogenesis-201118142432.pptx
dawitg2
 

More from dawitg2 (20)

bacterial plant pathogen jeyarajesh-190413122916.pptx
bacterial plant pathogen jeyarajesh-190413122916.pptxbacterial plant pathogen jeyarajesh-190413122916.pptx
bacterial plant pathogen jeyarajesh-190413122916.pptx
 
unit1_practical-applications-of-biotechnology.ppt
unit1_practical-applications-of-biotechnology.pptunit1_practical-applications-of-biotechnology.ppt
unit1_practical-applications-of-biotechnology.ppt
 
Introduction-to-Plant-Cell-Culture-lec1.ppt
Introduction-to-Plant-Cell-Culture-lec1.pptIntroduction-to-Plant-Cell-Culture-lec1.ppt
Introduction-to-Plant-Cell-Culture-lec1.ppt
 
fertilizers-150111082613-conversion-gate01.pdf
fertilizers-150111082613-conversion-gate01.pdffertilizers-150111082613-conversion-gate01.pdf
fertilizers-150111082613-conversion-gate01.pdf
 
sac-301-manuresfertilizersandsoilfertilitymanagement-210105122952.pdf
sac-301-manuresfertilizersandsoilfertilitymanagement-210105122952.pdfsac-301-manuresfertilizersandsoilfertilitymanagement-210105122952.pdf
sac-301-manuresfertilizersandsoilfertilitymanagement-210105122952.pdf
 
davindergill 135021014 -170426133338.pdf
davindergill 135021014 -170426133338.pdfdavindergill 135021014 -170426133338.pdf
davindergill 135021014 -170426133338.pdf
 
preparationofdifferentagro-chemicaldosesforfield-140203005533-phpapp02.pdf
preparationofdifferentagro-chemicaldosesforfield-140203005533-phpapp02.pdfpreparationofdifferentagro-chemicaldosesforfield-140203005533-phpapp02.pdf
preparationofdifferentagro-chemicaldosesforfield-140203005533-phpapp02.pdf
 
pesticide stoeage storage ppwscript.ppt
pesticide stoeage storage  ppwscript.pptpesticide stoeage storage  ppwscript.ppt
pesticide stoeage storage ppwscript.ppt
 
-pre-cautions--on--seed-storage--1-.pptx
-pre-cautions--on--seed-storage--1-.pptx-pre-cautions--on--seed-storage--1-.pptx
-pre-cautions--on--seed-storage--1-.pptx
 
Intgrated pest management 02 _ 06_05.pdf
Intgrated pest management 02 _ 06_05.pdfIntgrated pest management 02 _ 06_05.pdf
Intgrated pest management 02 _ 06_05.pdf
 
major potato and tomato disease in ethiopia .pptx
major potato and tomato disease in ethiopia .pptxmajor potato and tomato disease in ethiopia .pptx
major potato and tomato disease in ethiopia .pptx
 
pesticide formulation pp602lec1to2-211207073233.pdf
pesticide formulation pp602lec1to2-211207073233.pdfpesticide formulation pp602lec1to2-211207073233.pdf
pesticide formulation pp602lec1to2-211207073233.pdf
 
diseasespests-2013-130708184617-phpapp02.pptx
diseasespests-2013-130708184617-phpapp02.pptxdiseasespests-2013-130708184617-phpapp02.pptx
diseasespests-2013-130708184617-phpapp02.pptx
 
pesticide pp602lec1to2-211207073233.pptx
pesticide pp602lec1to2-211207073233.pptxpesticide pp602lec1to2-211207073233.pptx
pesticide pp602lec1to2-211207073233.pptx
 
agricultaraly important agrochemicals.ppt
agricultaraly important agrochemicals.pptagricultaraly important agrochemicals.ppt
agricultaraly important agrochemicals.ppt
 
agri pesticide chemistry-180722174627.pdf
agri pesticide chemistry-180722174627.pdfagri pesticide chemistry-180722174627.pdf
agri pesticide chemistry-180722174627.pdf
 
372922285 -important Fungal-Nutrition.ppt
372922285 -important Fungal-Nutrition.ppt372922285 -important Fungal-Nutrition.ppt
372922285 -important Fungal-Nutrition.ppt
 
disease development and pathogenesis-201118142432.pptx
disease development and pathogenesis-201118142432.pptxdisease development and pathogenesis-201118142432.pptx
disease development and pathogenesis-201118142432.pptx
 
ppp211lecture8-221211055228-824cf9da.pptx
ppp211lecture8-221211055228-824cf9da.pptxppp211lecture8-221211055228-824cf9da.pptx
ppp211lecture8-221211055228-824cf9da.pptx
 
defensemechanismsinplants-180308104711.pptx
defensemechanismsinplants-180308104711.pptxdefensemechanismsinplants-180308104711.pptx
defensemechanismsinplants-180308104711.pptx
 

Recently uploaded

1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdf
QucHHunhnh
 
Spellings Wk 3 English CAPS CARES Please Practise
Spellings Wk 3 English CAPS CARES Please PractiseSpellings Wk 3 English CAPS CARES Please Practise
Spellings Wk 3 English CAPS CARES Please Practise
AnaAcapella
 
The basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxThe basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptx
heathfieldcps1
 

Recently uploaded (20)

1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdf
 
Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)
 
Micro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdfMicro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdf
 
Python Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docxPython Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docx
 
Application orientated numerical on hev.ppt
Application orientated numerical on hev.pptApplication orientated numerical on hev.ppt
Application orientated numerical on hev.ppt
 
Third Battle of Panipat detailed notes.pptx
Third Battle of Panipat detailed notes.pptxThird Battle of Panipat detailed notes.pptx
Third Battle of Panipat detailed notes.pptx
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdf
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The Basics
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy Consulting
 
Spellings Wk 3 English CAPS CARES Please Practise
Spellings Wk 3 English CAPS CARES Please PractiseSpellings Wk 3 English CAPS CARES Please Practise
Spellings Wk 3 English CAPS CARES Please Practise
 
On National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan FellowsOn National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan Fellows
 
How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17
 
The basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxThe basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptx
 
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
 
Sociology 101 Demonstration of Learning Exhibit
Sociology 101 Demonstration of Learning ExhibitSociology 101 Demonstration of Learning Exhibit
Sociology 101 Demonstration of Learning Exhibit
 
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdfUGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
 
General Principles of Intellectual Property: Concepts of Intellectual Proper...
General Principles of Intellectual Property: Concepts of Intellectual  Proper...General Principles of Intellectual Property: Concepts of Intellectual  Proper...
General Principles of Intellectual Property: Concepts of Intellectual Proper...
 
How to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POSHow to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POS
 
Making communications land - Are they received and understood as intended? we...
Making communications land - Are they received and understood as intended? we...Making communications land - Are they received and understood as intended? we...
Making communications land - Are they received and understood as intended? we...
 
Food safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdfFood safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdf
 

SIT095_Lecture_9_Logistic_Regression_Part_3.pptx

  • 1. Logistic Regression III SIT095 The Collection and Analysis of Quantitative Data II Week 9 Luke Sloan
  • 2. Introduction • Recap – Last Week • Workshop Feedback • Multinomial Logistic Regression in SPSS • Model Interpretation • In Class Exercise • Writing-Up • Summary
  • 3. Recap – Last Week • Variable selection • Binary logistic regression in SPSS • Model interpretation • Intuitive results?
  • 4. Workshop Feedback TASK: To run and interpret a binary logistic regression model with ‘Sex’ as the dependent variable using your own choice of independent variables Were your models successful? Did you have any problems or issues? TODAY: I will show you how to run and interpret a multinomial logistic model in SPSS. I will use a different dependent variable (‘edlev7’) and the same dataset. Did you find anything interesting (interpretation of odds ratios)? Did you have difficulty in interpretation?
  • 5. Multinomial Logistic Regression in SPSS I • Very similar to binary logistic regression • For a categorical dependent variable with more than two categories • ‘edlev7’ asks for the highest educational qualification of a respondent and has three categories: ‘Higher Education’, ‘Other Qualification’ and ‘None’ • One of these categories has to be designated a ‘reference category’ to which the others will be compared • E.g. if ‘None’ is the ‘reference category’… – respondents who had Higher Education qualifications were more likely to be female (odds increase of 2.3) than respondents with no qualifications – Respondents who had other qualifications were less likely to be female (odds decrease of 0.45) than respondent with no qualifications It is not possible to compare groups that are not the ‘reference category’ i.e. we cannot draw comparisons between ‘Higher Education’ and ‘Other Qualification’ directly
  • 6. Multinomial Logistic Regression in SPSS II
      Education Level - 2000 (3 groups)
                              Frequency  Percent  Valid Percent  Cumulative Percent
      Valid    HIGHER EDUCAT  2015       24.5     31.2           31.2
               OTHER QUAL     2826       34.4     43.8           75.0
               NONE           1614       19.6     25.0           100.0
               Total          6455       78.5     100.0
      Missing  NEV WENT SCH   16         .2
               NA             4          .0
               AGEOUT,MSPR    1745       21.2
               System         1          .0
               Total          1766       21.5
      Total                   8221       100.0
    Deciding on a ‘reference category’ should be an informed decision – what do we want to compare? As a rule of thumb, the ‘reference category’ should be the most populated response (highest frequency), but this can be overruled by your research agenda. In this case I am going to use ‘Other Qualification’ for several reasons: largest group, median point and interesting from a theoretical perspective (the difference between ‘Other Qual’ and ‘Higher Education’ might question the value of studying at university…)
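The rule of thumb on this slide can be checked with a few lines of Python. This is only a sketch: the counts are copied from the frequency table, and the variable names are made up for illustration. It recomputes the valid percentages and picks the most populated category, which is the statistical default that the research agenda may still overrule.

```python
# Valid frequencies for 'edlev7', copied from the frequency table on this slide.
valid_counts = {"HIGHER EDUCAT": 2015, "OTHER QUAL": 2826, "NONE": 1614}

total_valid = sum(valid_counts.values())

# Valid percent = share of non-missing cases falling in each category.
valid_percent = {cat: 100 * n / total_valid for cat, n in valid_counts.items()}

# Rule of thumb: the most populated category is the default reference category.
reference = max(valid_counts, key=valid_counts.get)

for cat, pct in valid_percent.items():
    print(f"{cat:<14} {pct:5.1f}%")
print("Most populated category:", reference)
```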
  • 7. Multinomial Logistic Regression in SPSS III • You still need to select your variables carefully • Consider hypotheses, frequencies, recoding, relationships and multicollinearity • My variables (including recodes): – ‘manual2’ (non-manual/manual) – ‘ethnic2’ (white/non-white) – ‘marital2’ (married/cohabiting/single/widowed/divorced or separated) – ‘seefrnd2’ (weekly/monthly/less than monthly/not in last year) – ‘cntctmp’ (yes/no) – ‘age’ (in years) – ‘alcdrug2’ (very big problem/fairly big problem/minor problem/not a problem/happens but is not a problem) – ‘influence2’ (yes/no) Excluded due to multicollinearity – could be interesting…
  • 8. Multinomial Logistic Regression in SPSS IV 1) To begin, go to ‘Analyze’, ‘Regression’ and select ‘Multinomial Logistic…’ 2) Your dependent goes here 3) Click on ‘Reference Category…’ By default SPSS will use the last category in your independent categorical variables as the ‘reference category’
  • 9. Multinomial Logistic Regression in SPSS V You need to tell SPSS which response for the dependent variable you want to be used as the ‘reference category’ 4) Because ‘Other Qualification’ is coded as ‘2’ in our dataset and we want to use this as the ‘reference category’ we select ‘Custom’ and type the value (‘2’) ‘Category Order’ is important when specifying ‘First Category’ or ‘Last Category’ – always a good idea to specify a custom value manually 5) Click ‘Continue’
  • 10. Multinomial Logistic Regression in SPSS VI Notice that the dependent is now followed by ‘(Custom)’ 6) Your categorical independent variables (factors) go here 7) Your interval independent variables (covariates) go here 8) Click on ‘Statistics…’
  • 11. Multinomial Logistic Regression in SPSS VII 9) Select ‘Information Criteria’, ‘Cell probabilities’, ‘Classification table’ and ‘Goodness-of-fit’ Note that some options are already selected – leave them as they are 10) Click ‘Continue’
  • 12. Multinomial Logistic Regression in SPSS VIII 11) Click ‘Save…’
  • 13. Multinomial Logistic Regression in SPSS IX 12) Select ‘Estimated response probabilities’, ‘Predicted category’, ‘Predicted category probability’ and ‘Actual category probability’ These values will be saved as variables on the datasheet for later analysis Ignore this option as we are not interested in exporting the model 13) Click ‘Continue’
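To see where the saved ‘Estimated response probabilities’ and ‘Predicted category’ come from, the standard multinomial logit formula can be applied by hand. The sketch below uses the B coefficients from the parameter estimates reported later in this deck, with ‘OTHER QUAL’ as the reference category (linear predictor fixed at 0). The respondent is hypothetical, and the coding assumptions (Ethnic2=1 is white, marital2=1 is married, cntctmp=0 is ‘no’) are inferred from the order of categories in the output rather than stated in the slides.

```python
import math

# B coefficients copied from the parameter estimate tables in this deck.
# Only the dummy levels that apply to the hypothetical respondent are listed;
# every other level would contribute 0 for this person.
coef = {
    "HIGHER EDUCAT": {"Intercept": -0.988, "age": 0.000, "manual2=1": 1.282,
                      "Ethnic2=1": -0.298, "marital2=1": 0.113,
                      "seefrnd2=1": 0.204, "cntctmp=0": -0.249},
    "NONE": {"Intercept": -2.705, "age": 0.065, "manual2=1": -1.184,
             "Ethnic2=1": -0.164, "marital2=1": -0.215,
             "seefrnd2=1": -0.468, "cntctmp=0": 0.392},
}

# Hypothetical respondent (an assumption for illustration): aged 40, non-manual,
# white, married, sees friends weekly, has not contacted an MP.
respondent = {"Intercept": 1, "age": 40, "manual2=1": 1, "Ethnic2=1": 1,
              "marital2=1": 1, "seefrnd2=1": 1, "cntctmp=0": 1}


def response_probabilities(coef, x, reference="OTHER QUAL"):
    """Multinomial logit probabilities with the reference predictor fixed at 0."""
    eta = {cat: sum(b[name] * x[name] for name in b) for cat, b in coef.items()}
    eta[reference] = 0.0
    denom = sum(math.exp(v) for v in eta.values())
    return {cat: math.exp(v) / denom for cat, v in eta.items()}


probs = response_probabilities(coef, respondent)
predicted = max(probs, key=probs.get)  # category with the highest probability

for cat, p in probs.items():
    print(f"{cat:<14} {p:.3f}")
print("Predicted category:", predicted)
```

The ‘Predicted category probability’ saved by SPSS corresponds to the largest of these values, and the ‘Actual category probability’ to the value for the category the respondent really belongs to.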
  • 14. Multinomial Logistic Regression in SPSS X 14) Click ‘OK’ to run the model
  • 15. Model Interpretation I
      Case Processing Summary
                                                            N        Marginal Percentage
      Education Level - 2000 (3 groups)  HIGHER EDUCAT      1942     32.2%
                                         OTHER QUAL         2575     42.7%
                                         NONE               1515     25.1%
      Manual or non manual               Non-Manual         3558     59.0%
                                         Manual             2474     41.0%
      Ethnicity                          White              5760     95.5%
                                         Non-White          272      4.5%
      Marital status                     married            3043     50.4%
                                         cohabiting&SSC     547      9.1%
                                         single             1291     21.4%
                                         widowed            277      4.6%
                                         div/sep            874      14.5%
      See friends                        Weekly             4620     76.6%
                                         Monthly            871      14.4%
                                         Less Than Monthly  429      7.1%
                                         Not In Last Year   112      1.9%
      contacted MP                       no                 5344     88.6%
                                         yes                688      11.4%
      Valid                                                 6032     100.0%
      Missing                                               2189
      Total                                                 8221
      Subpopulation                                         1511(a)
      a. The dependent variable has only one value observed in 846 (56.0%) subpopulations.
    This table tells us the frequencies and percentages of respondents from the dataset that fall into each category for all the categorical variables (including the dependent). We need to look out for low frequencies – but this shouldn’t be a problem if you’ve chosen your variables rigorously! Notice the number of valid cases – i.e. cases without missing data (remember the assumptions!)
  • 16. Model Interpretation II
      Model Fitting Information
                      Model Fitting Criteria               Likelihood Ratio Tests
      Model           AIC       BIC       -2 Log Likelihood  Chi-Square  df  Sig.
      Intercept Only  6820.102  6833.512  6816.102
      Final           5074.633  5235.549  5026.633           1789.468    22  .000
    This table tells us whether our model is a significant improvement on the ‘intercept only’ (null) model. p<0.05 means rejecting the null hypothesis that there is no difference between the ‘intercept only’ and populated model
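The figures in the Model Fitting Information table are linked by simple arithmetic, which the sketch below reproduces. It assumes the usual definitions AIC = -2LL + 2k and BIC = -2LL + k·ln(n), with n = 6032 valid cases, k = 2 parameters for the intercept-only model (one intercept per non-reference category) and k = 24 for the final model (inferred from the 22 degrees of freedom). These parameter counts are an inference from the output, not something the slides state.

```python
import math

n = 6032  # valid cases, from the Case Processing Summary

# -2 log-likelihoods copied from the table; parameter counts are inferred
# (2 intercepts, plus 22 slope parameters in the final model).
models = {
    "Intercept Only": {"neg2ll": 6816.102, "params": 2},
    "Final": {"neg2ll": 5026.633, "params": 24},
}

for name, m in models.items():
    m["aic"] = m["neg2ll"] + 2 * m["params"]
    m["bic"] = m["neg2ll"] + m["params"] * math.log(n)
    print(f"{name:<15} AIC={m['aic']:.3f}  BIC={m['bic']:.3f}")

# Likelihood ratio chi-square: drop in -2LL from the null to the final model.
chi_square = models["Intercept Only"]["neg2ll"] - models["Final"]["neg2ll"]
df = models["Final"]["params"] - models["Intercept Only"]["params"]
print(f"Chi-square = {chi_square:.3f} on {df} df")
```

The results agree with the SPSS output to within rounding of the printed -2 log-likelihoods.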
  • 17. Model Interpretation III
      Goodness-of-Fit
                Chi-Square  df    Sig.
      Pearson   3211.136    2998  .003
      Deviance  3114.276    2998  .068

      Pseudo R-Square
      Cox and Snell  .257
      Nagelkerke     .291
      McFadden       .138
    The pseudo R-square gives an approximate indication of how much of the variation in the dependent variable is explained by the model – low values are normal in logistic regression (think about variance in dependent!) The Pearson and Deviance statistics both test how well the model fits the data (expected and actual values) and p<0.05 means that there is a significant difference between the two i.e. the model is not a good fit! According to the Pearson statistic the model is a bad fit, but the Deviance statistic suggests otherwise (but not by much!) This could be due to low frequencies in crosstabs or ‘overdispersion’ (see Field 2009:308) – subjective judgment…
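The three pseudo R-square values can be approximately reproduced from the standard textbook formulas. The sketch below makes two assumptions that the slides do not confirm: that SPSS uses these standard definitions, and that the null log-likelihood can be rebuilt from the marginal counts of the dependent variable in the Case Processing Summary (the -2 log-likelihoods printed in the Model Fitting Information table are not used for this, only their difference, the likelihood ratio chi-square).

```python
import math

# Valid cases per category of the dependent variable (Case Processing Summary).
counts = {"HIGHER EDUCAT": 1942, "OTHER QUAL": 2575, "NONE": 1515}
lr_chi_square = 1789.468  # from the Model Fitting Information table

n = sum(counts.values())

# -2 log-likelihood of a model that predicts the marginal proportions for everyone.
null_neg2ll = -2 * sum(c * math.log(c / n) for c in counts.values())

cox_snell = 1 - math.exp(-lr_chi_square / n)
# Nagelkerke rescales Cox and Snell by its maximum attainable value.
nagelkerke = cox_snell / (1 - math.exp(-null_neg2ll / n))
# McFadden: proportional reduction in the log-likelihood.
mcfadden = lr_chi_square / null_neg2ll

print(f"Cox and Snell = {cox_snell:.3f}")
print(f"Nagelkerke    = {nagelkerke:.3f}")
print(f"McFadden      = {mcfadden:.3f}")
```

Under these assumptions the values land close to the .257, .291 and .138 shown in the output.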
  • 18. Model Interpretation V
      Likelihood Ratio Tests
                 Model Fitting Criteria                                  Likelihood Ratio Tests
      Effect     AIC of Reduced  BIC of Reduced  -2 Log Likelihood       Chi-Square  df  Sig.
                 Model           Model           of Reduced Model
      Intercept  5074.633        5235.549        5026.633                .000        0   .
      age        5605.268        5752.774        5561.268                534.634     2   .000
      manual2    6018.795        6166.302        5974.795                948.162     2   .000
      Ethnic2    5074.901        5222.408        5030.901                4.268       2   .118
      marital2   5087.697        5194.974        5055.697                29.064      8   .000
      seefrnd2   5075.437        5196.124        5039.437                12.804      6   .046
      cntctmp    5096.844        5244.350        5052.844                26.210      2   .000
      The chi-square statistic is the difference in -2 log-likelihoods between the final model and a reduced model. The reduced model is formed by omitting an effect from the final model. The null hypothesis is that all parameters of that effect are 0.
    This table tells us which independent variables had a significant effect in our model. Ethnicity (‘Ethnic2’) is the only predictor that does not significantly affect the highest educational qualification of a respondent in the model
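The table footnote describes each chi-square as a difference in -2 log-likelihoods, and that can be verified directly. In the sketch below the chi-square values are recomputed from the reduced-model figures, and the p-values come from a closed-form expression for the chi-square upper tail that is only valid for even degrees of freedom (which covers every effect in this table). The helper name `chi2_sf_even_df` is made up for illustration.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable (closed form, even df only)."""
    if df <= 0 or df % 2:
        raise ValueError("closed form requires a positive even df")
    half = x / 2
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i  # builds (x/2)^i / i!
        total += term
    return math.exp(-half) * total


final_neg2ll = 5026.633

# (-2 log-likelihood of the reduced model, df), copied from the table.
reduced = {
    "age": (5561.268, 2),
    "manual2": (5974.795, 2),
    "Ethnic2": (5030.901, 2),
    "marital2": (5055.697, 8),
    "seefrnd2": (5039.437, 6),
    "cntctmp": (5052.844, 2),
}

results = {}
for effect, (neg2ll, df) in reduced.items():
    chi_square = neg2ll - final_neg2ll
    results[effect] = {"chi_square": chi_square, "p": chi2_sf_even_df(chi_square, df)}
    print(f"{effect:<9} chi-square={chi_square:8.3f}  df={df}  p={results[effect]['p']:.3f}")

not_significant = [e for e, r in results.items() if r["p"] >= 0.05]
print("Not significant at p<0.05:", not_significant)
```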
  • 19. Model Interpretation VI
      Parameter Estimates: Education Level - 2000 (3 groups)(a) = HIGHER EDUCAT
                                                                       95% Confidence Interval for Exp(B)
                       B      Std. Error  Wald     df  Sig.  Exp(B)  Lower Bound  Upper Bound
      Intercept        -.988  .372        7.063    1   .008
      age              .000   .003        .028     1   .867  1.000   .994         1.005
      [manual2=1.00]   1.282  .073        309.342  1   .000  3.602   3.123        4.156
      [manual2=2.00]   0(b)   .           .        0   .     .       .            .
      [Ethnic2=1.00]   -.298  .146        4.181    1   .041  .742    .558         .988
      [Ethnic2=2.00]   0(b)   .           .        0   .     .       .            .
      [marital2=1.00]  .113   .098        1.340    1   .247  1.120   .925         1.356
      [marital2=2.00]  .268   .134        3.992    1   .046  1.307   1.005        1.701
      [marital2=3.00]  .123   .114        1.156    1   .282  1.130   .904         1.413
      [marital2=4.00]  -.310  .207        2.242    1   .134  .734    .489         1.100
      [marital2=5.00]  0(b)   .           .        0   .     .       .            .
      [seefrnd2=1.00]  .204   .301        .461     1   .497  1.226   .680         2.211
      [seefrnd2=2.00]  .193   .309        .391     1   .532  1.213   .662         2.222
      [seefrnd2=3.00]  .305   .321        .906     1   .341  1.357   .724         2.543
      [seefrnd2=4.00]  0(b)   .           .        0   .     .       .            .
      [cntctmp=0]      -.249  .094        6.993    1   .008  .780    .649         .938
      [cntctmp=1]      0(b)   .           .        0   .     .       .            .
    Because we are comparing both ‘Higher Education’ and ‘No Qualification’ with the reference category ‘Other Qualification’ we are given two parameter estimate tables. This is the parameter estimates table comparing respondents with a ‘Higher Education Qualification’ with respondents with an ‘Other Qualification’
  • 20. Model Interpretation VII
      Parameter Estimates: Education Level - 2000 (3 groups)(a) = NONE
                                                                        95% Confidence Interval for Exp(B)
                       B       Std. Error  Wald     df  Sig.  Exp(B)  Lower Bound  Upper Bound
      Intercept        -2.705  .357        57.555   1   .000
      age              .065    .003        428.739  1   .000  1.068   1.061        1.074
      [manual2=1.00]   -1.184  .074        255.802  1   .000  .306    .265         .354
      [manual2=2.00]   0(b)    .           .        0   .     .       .            .
      [Ethnic2=1.00]   -.164   .182        .806     1   .369  .849    .594         1.214
      [Ethnic2=2.00]   0(b)    .           .        0   .     .       .            .
      [marital2=1.00]  -.215   .100        4.618    1   .032  .806    .663         .981
      [marital2=2.00]  -.195   .165        1.384    1   .239  .823    .595         1.138
      [marital2=3.00]  .093    .125        .550     1   .458  1.097   .859         1.401
      [marital2=4.00]  .062    .174        .128     1   .721  1.064   .757         1.496
      [marital2=5.00]  0(b)    .           .        0   .     .       .            .
      [seefrnd2=1.00]  -.468   .240        3.811    1   .051  .627    .392         1.002
      [seefrnd2=2.00]  -.664   .255        6.781    1   .009  .515    .312         .848
      [seefrnd2=3.00]  -.273   .270        1.018    1   .313  .761    .448         1.293
      [seefrnd2=4.00]  0(b)    .           .        0   .     .       .            .
      [cntctmp=0]      .392    .121        10.525   1   .001  1.480   1.168        1.875
      [cntctmp=1]      0(b)    .           .        0   .     .       .            .
      a. The reference category is: OTHER QUAL.
      b. This parameter is set to zero because it is redundant.
    This is the parameter estimates table comparing respondents with ‘No Qualification’ with respondents with an ‘Other Qualification’. The interpretation of results is exactly the same as for binary logistic regression – SPSS doesn’t provide a parameter coding table, so you need to work this out manually
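Each row of a parameter estimates table is internally consistent: the Wald statistic is (B / Std. Error) squared, Exp(B) is the exponential of B, and the confidence interval is the exponential of B plus or minus 1.96 standard errors. The sketch below checks this for the [manual2=1.00] row of the ‘NONE’ table. Because B and the standard error are printed to three decimals, the results match the SPSS figures only to within rounding.

```python
import math

# [manual2=1.00] row of the 'NONE' versus 'OTHER QUAL' comparison.
b = -1.184
std_error = 0.074
z_95 = 1.96  # normal critical value for a 95% confidence interval

wald = (b / std_error) ** 2
odds_ratio = math.exp(b)
ci_lower = math.exp(b - z_95 * std_error)
ci_upper = math.exp(b + z_95 * std_error)

print(f"Wald   = {wald:.1f}")        # table reports 255.802
print(f"Exp(B) = {odds_ratio:.3f}")  # table reports .306
print(f"95% CI = {ci_lower:.3f} to {ci_upper:.3f}")  # table reports .265 to .354

# An interval that excludes 1 agrees with a significant Wald test.
significant = not (ci_lower <= 1 <= ci_upper)
print("Significant at p<0.05:", significant)
```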
  • 21. Model Interpretation VIII
      Classification
                          Predicted
      Observed            HIGHER EDUCAT  OTHER QUAL  NONE   Percent Correct
      HIGHER EDUCAT       1405           402         135    72.3%
      OTHER QUAL          1217           943         415    36.6%
      NONE                319            428         768    50.7%
      Overall Percentage  48.8%          29.4%       21.9%  51.7%
    Finally you are given a classification table that tells you how well the predictive model performed – look for misclassifications and ask yourself why… you can always run a new and improved model! The model has trouble with ‘Other Qualification’ respondents – it tries to assign many of them to ‘Higher Education’. 51.7% correctly predicted is okay – but the model is best at predicting respondents with ‘Higher Education’ qualifications… can you do better?
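The percentages in the classification table follow directly from the counts, as this short sketch shows. Rows are observed categories and columns are predicted categories, both in the order used in the table; the variable names are illustrative only.

```python
categories = ["HIGHER EDUCAT", "OTHER QUAL", "NONE"]

# Rows = observed category, columns = predicted category (same order).
table = {
    "HIGHER EDUCAT": [1405, 402, 135],
    "OTHER QUAL": [1217, 943, 415],
    "NONE": [319, 428, 768],
}

total = sum(sum(row) for row in table.values())

# Percent correct per observed category: diagonal cell over the row total.
percent_correct = {
    cat: 100 * table[cat][i] / sum(table[cat]) for i, cat in enumerate(categories)
}

# Overall percent correct: sum of the diagonal over all cases.
correct = sum(table[cat][i] for i, cat in enumerate(categories))
overall = 100 * correct / total

# Share of all cases assigned to each predicted category (bottom row of the table).
predicted_share = {
    cat: 100 * sum(table[obs][i] for obs in categories) / total
    for i, cat in enumerate(categories)
}

for cat in categories:
    print(f"{cat:<14} correct={percent_correct[cat]:.1f}%  predicted share={predicted_share[cat]:.1f}%")
print(f"Overall correct = {overall:.1f}%")
```

The predicted shares make the problem on this slide visible: 48.8% of cases are assigned to ‘Higher Education’ although only about a third of respondents actually hold such a qualification.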
  • 22. In Class Exercise • Work in small groups to interpret the results of my model (the odds ratios) for ‘manual2’ and ‘seefrnd2’ • Remember to… – Look for significance – Negative or positive coefficient? – Interpret the Exp(B) (odds ratio) – We are not comparing ‘No Qual’ with ‘HE Qual’ You need to know that… [‘manual2’ = 1.00] refers to non-manual respondent [‘manual2’ = 2.00] refers to manual respondent (reference category) [‘seefrnd2’ = 1.00] refers to seeing friends weekly [‘seefrnd2’ = 2.00] refers to seeing friends monthly [‘seefrnd2’ = 3.00] refers to seeing friends less than monthly [‘seefrnd2’ = 4.00] refers to seeing friends not in the last year (reference category)
  • 23. Writing-Up I • Report the test results from the output – always give the test statistic, degrees of freedom (if appropriate) and the p-value • Always explain what the test result means for your model • Remember – if your model doesn’t fit then there’s no point in writing about it! • Report which coefficients are not significant – offer an explanation as to why (why were your hypotheses and bivariate tests wrong?... complexity of interactions?) • Regarding reporting odds ratios: – Report whether the odds increase or decrease – Give the odds ratio (or the percentage change in the odds if you prefer) – Give the degrees of freedom – Give the Wald statistic • Remember to say ‘all other things being equal’ every now and again!
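When reporting odds ratios, the direction and the percentage change in the odds can be derived mechanically from Exp(B): the change is (odds ratio - 1) × 100. The helper below is a hypothetical illustration, not part of SPSS, applied to the two ‘manual2’ odds ratios from the parameter estimates.

```python
def describe_odds_ratio(odds_ratio):
    """Turn an odds ratio into a short phrase for a write-up."""
    change = (odds_ratio - 1) * 100  # percentage change in the odds
    if change > 0:
        direction = "increase"
    elif change < 0:
        direction = "decrease"
    else:
        return f"odds do not change (odds ratio = {odds_ratio:.2f})"
    return f"odds {direction} by {abs(change):.1f}% (odds ratio = {odds_ratio:.2f})"


# Non-manual versus manual respondents, from the two parameter estimate tables.
print("Higher education vs other qualification:", describe_odds_ratio(3.602))
print("No qualification vs other qualification:", describe_odds_ratio(0.306))
```

Note that this is a percentage change in the odds, not a change in probability, so it should still be accompanied by ‘all other things being equal’.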
  • 24. Writing-Up II EXAMPLE: The coefficient for the variable ‘manual2’ (whether a respondent has a manual or non-manual occupation) was significant in both comparisons: higher education versus ‘other’ qualification, and no qualification versus ‘other’ qualification. Compared with manual respondents, non-manual respondents were much more likely to have a higher education qualification than an ‘other’ qualification (odds ratio = 3.6, 1 d.f., Wald = 309.34), all other things being equal. Also, compared with manual respondents, non-manual respondents were much less likely to have no qualifications than to have an ‘other’ qualification (odds ratio = 0.31, 1 d.f., Wald = 255.80), all other things being equal. Although the language is awkward we can summarise by saying that respondents with higher education qualifications are more likely to have non-manual jobs than respondents with ‘other’ qualifications. Also, respondents with no qualifications are less likely to have non-manual jobs than respondents with ‘other’ qualifications. Both of these statements are made in reference to respondents who have manual occupations (the dummy ref cat.) and with ‘other’ qualifications (DV ref cat.)
  • 25. Summary • Binary and multinomial models are very similar, but notice the subtle differences • Again, interpreting the coefficients and Exp(B) is the tricky bit • The models are very powerful, even when saying ‘more likely’ or ‘less likely’
  • 26. Workshop Task • Run a multinomial logistic regression model with the dependent variable ‘edlev7’ • See if you can get a better prediction rate than me! • Use everything you’ve learnt over the past weeks, starting with the proper procedure for variable selection • Use these slides to check that the model works (follow my step-by-step guide to operation and interpretation) • Interpret the odds ratios and draw some conclusions about your model • If your model doesn’t work then work in pairs • This technique is advanced, so ask for help if you are unsure