SW388R7
Data Analysis &
Computers II
Slide 1

Multinomial Logistic Regression
Basic Relationships

Multinomial Logistic Regression
Describing Relationships
Classification Accuracy
Sample Problems

Multinomial logistic regression

Slide 2


Multinomial logistic regression is used to analyze relationships
between a non-metric dependent variable and metric or
dichotomous independent variables.



Multinomial logistic regression compares multiple groups
through a combination of binary logistic regressions.



The group comparisons are equivalent to the comparisons for a
dummy-coded dependent variable, with the group with the
highest numeric score used as the reference group.



For example, if we wanted to study differences in BSW, MSW,
and PhD students using multinomial logistic regression, the
analysis would compare BSW students to PhD students and MSW
students to PhD students. For each independent variable, there
would be two comparisons.

What multinomial logistic regression predicts

Slide 3


Multinomial logistic regression provides a set of coefficients for
each of the two comparisons. The coefficients for the
reference group are all zeros, similar to the coefficients for the
reference group for a dummy-coded variable.



Thus, there are three equations, one for each of the groups
defined by the dependent variable.



The three equations can be used to compute the probability
that a subject is a member of each of the three groups. A case
is predicted to belong to the group associated with the highest
probability.



Predicted group membership can be compared to actual group
membership to obtain a measure of classification accuracy.
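
As a rough illustration of how the three equations turn into probabilities, the sketch below fixes the reference group's equation at zero and converts the values to probabilities. The function name and the two input values are hypothetical and are not taken from any SPSS output.

```python
import math

def group_probabilities(logits_vs_reference):
    """Convert equation values into group membership probabilities.

    logits_vs_reference holds one value (intercept + sum of B * x) for each
    non-reference group. The reference group's value is fixed at 0 because
    all of its coefficients are zeros.
    """
    scores = list(logits_vs_reference) + [0.0]  # reference group goes last
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical equation values for BSW vs. PhD and MSW vs. PhD
groups = ["BSW", "MSW", "PhD"]
probs = group_probabilities([0.8, 1.5])

# A case is predicted to belong to the group with the highest probability
predicted_group = groups[probs.index(max(probs))]
print([round(p, 3) for p in probs], predicted_group)
```

The probabilities always sum to 1, so the predicted group is simply the one with the largest equation value.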

Level of measurement requirements

Slide 4


Multinomial logistic regression analysis requires that the
dependent variable be non-metric. Dichotomous, nominal, and
ordinal variables satisfy the level of measurement requirement.



Multinomial logistic regression analysis requires that the
independent variables be metric or dichotomous. Since SPSS
will automatically dummy-code nominal level variables, they
can be included since they will be dichotomized in the analysis.



In SPSS, non-metric independent variables are included as
“factors.” SPSS will dummy-code non-metric IVs.



In SPSS, metric independent variables are included as
“covariates.” If an independent variable is ordinal, we will
attach the usual caution.

Assumptions and outliers

Slide 5


Multinomial logistic regression does not make any assumptions
of normality, linearity, and homogeneity of variance for the
independent variables.



Because it does not impose these requirements, it is preferred
to discriminant analysis when the data does not satisfy these
assumptions.



SPSS does not compute any diagnostic statistics for outliers. To
evaluate outliers, the advice is to run multiple binary logistic
regressions and use those results to test the exclusion of
outliers or influential cases.

Sample size requirements

Slide 6


The minimum number of cases per independent variable is 10,
using a guideline provided by Hosmer and Lemeshow, authors of
Applied Logistic Regression, one of the main resources for
Logistic Regression.



For preferred case-to-variable ratios, we will use 20 to 1.
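
The two ratios can be checked with a few lines of arithmetic. This is only a sketch: the function names are made up for illustration, and the 167 valid cases and 3 independent variables are borrowed from the highways and bridges example used elsewhere in this deck.

```python
def cases_per_variable(n_valid_cases, n_independent_variables):
    """Ratio of valid cases to independent variables."""
    return n_valid_cases / n_independent_variables

def sample_size_rating(ratio, minimum=10, preferred=20):
    """Compare the ratio to the 10-to-1 minimum and 20-to-1 preferred guidelines."""
    if ratio < minimum:
        return "below minimum"
    if ratio < preferred:
        return "meets minimum"
    return "meets preferred"

# 167 valid cases, 3 independent variables (age, educ, conlegis)
ratio = cases_per_variable(167, 3)
rating = sample_size_rating(ratio)
print(round(ratio, 1), rating)
```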

Methods for including variables

Slide 7


The only method for selecting independent variables in SPSS is
simultaneous or direct entry.

Overall test of relationship - 1

Slide 8


The overall test of relationship among the independent
variables and groups defined by the dependent variable is based
on the reduction in the -2 log likelihood value from a model
which does not contain any independent variables to the model
that contains the independent variables.



This difference in -2 log likelihood follows a chi-square
distribution, and is referred to as the model chi-square.



The significance test for the final model chi-square (after the
independent variables have been added) is our statistical
evidence of the presence of a relationship between the
dependent variable and the combination of the independent
variables.
Slide 9

Overall test of relationship - 2

Model Fitting Information

Model             -2 Log Likelihood   Chi-Square   df   Sig.
Intercept Only         284.429
Final                  265.972          18.457      6   .005


The presence of a relationship between the dependent
variable and combination of independent variables is
based on the statistical significance of the final model
chi-square in the SPSS table titled "Model Fitting
Information".
In this analysis, the probability of the model chi-square
(18.457) was 0.005, less than or equal to the level of
significance of 0.05. The null hypothesis that there was
no difference between the model without independent
variables and the model with independent variables
was rejected. The existence of a relationship between
the independent variables and the dependent variable
was supported.
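
The model chi-square and its probability can be reproduced from the two -2 log likelihood values in the table. The sketch below uses only the Python standard library; the helper name is hypothetical, and the closed form it uses for the chi-square tail probability is valid only when the degrees of freedom are even (true here, df = 6).

```python
import math

def chi_square_sf_even_df(x, df):
    """Upper-tail probability of a chi-square value (closed form, even df only)."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only handles positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i      # builds (x/2)^i / i!
        total += term
    return math.exp(-half) * total

neg2ll_intercept_only = 284.429   # from the Model Fitting Information table
neg2ll_final = 265.972

model_chi_square = neg2ll_intercept_only - neg2ll_final
p_value = chi_square_sf_even_df(model_chi_square, df=6)
print(round(model_chi_square, 3), round(p_value, 3))
```

With odd degrees of freedom a statistics library would be needed instead of this shortcut.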

Strength of multinomial logistic regression relationship

Slide 10


While multinomial logistic regression does compute correlation
measures to estimate the strength of the relationship (pseudo R
square measures, such as Nagelkerke's R²), these correlation
measures do not really tell us much about the accuracy or
errors associated with the model.



A more useful measure to assess the utility of a multinomial
logistic regression model is classification accuracy, which
compares predicted group membership based on the logistic
model to the actual, known group membership, which is the
value for the dependent variable.
Slide 11

Evaluating usefulness for logistic models


The benchmark that we will use to characterize a multinomial
logistic regression model as useful is a 25% improvement over
the rate of accuracy achievable by chance alone.



Even if the independent variables had no relationship to the
groups defined by the dependent variable, we would still
expect to be correct in our predictions of group membership
some percentage of the time. This is referred to as by chance
accuracy.



The estimate of by chance accuracy that we will use is the
proportional by chance accuracy rate, computed by summing
the squared proportion of cases in each group. The only
difference between by chance accuracy for binary logistic
models and by chance accuracy for multinomial logistic models
is the number of groups defined by the dependent variable.
Slide 12

Computing by chance accuracy
The percentage of cases in each group defined by the dependent
variable is found in the ‘Case Processing Summary’ table.
Case Processing Summary

                                    N      Marginal Percentage
HIGHWAYS AND BRIDGES    1           62           37.1%
                        2           93           55.7%
                        3           12            7.2%
Valid                              167          100.0%
Missing                            103
Total                              270
Subpopulation                      153a

a. The dependent variable has only one value observed
in 146 (95.4%) subpopulations.

The proportional by chance accuracy rate was
computed by calculating the proportion of cases for
each group based on the number of cases in each
group in the 'Case Processing Summary', and then
squaring and summing the proportion of cases in each
group (0.371² + 0.557² + 0.072² = 0.453).
The proportional by chance accuracy criterion is 56.6%
(1.25 x 45.3% = 56.6%).
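
The same computation can be sketched in a few lines, starting from the group counts in the 'Case Processing Summary' table. The variable names are chosen for illustration only.

```python
# Number of valid cases in each group of the dependent variable
group_counts = {"1": 62, "2": 93, "3": 12}

total_cases = sum(group_counts.values())
proportions = {g: n / total_cases for g, n in group_counts.items()}

# Proportional by chance accuracy: sum of the squared group proportions
by_chance_accuracy = sum(p ** 2 for p in proportions.values())

# Benchmark for a useful model: 25% improvement over by chance accuracy
accuracy_criterion = 1.25 * by_chance_accuracy

print(round(by_chance_accuracy, 3), round(accuracy_criterion, 3))
```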
Slide 13

Comparing accuracy rates


To characterize our model as useful, we compare the overall
percentage accuracy rate produced by SPSS at the last step in which
variables are entered to 25% more than the proportional by chance
accuracy. (Note: SPSS does not compute a cross-validated accuracy
rate for multinomial logistic regression.)
Classification

                                 Predicted
Observed                  1         2         3      Percent Correct
1                        15        47         0           24.2%
2                         7        86         0           92.5%
3                         5         7         0             .0%
Overall Percentage     16.2%     83.8%       .0%          60.5%

The classification accuracy rate was 60.5%
which was greater than or equal to the
proportional by chance accuracy criterion of
56.6% (1.25 x 45.3% = 56.6%).
The criterion for classification accuracy is
satisfied in this example.
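
The overall accuracy rate can be recomputed from the counts in the Classification table and compared to the criterion. This is a sketch with illustrative variable names; the 0.566 criterion is the value computed on the previous slide.

```python
# Rows are observed groups 1-3, columns are predicted groups 1-3
classification = [
    [15, 47, 0],
    [7, 86, 0],
    [5, 7, 0],
]

# Correct predictions sit on the diagonal of the table
correct = sum(classification[i][i] for i in range(len(classification)))
total = sum(sum(row) for row in classification)
accuracy = correct / total

accuracy_criterion = 0.566  # 1.25 x proportional by chance accuracy
model_is_useful = accuracy >= accuracy_criterion

print(round(accuracy, 3), model_is_useful)
```

Note that the model never predicts group 3, yet still passes the criterion because groups 1 and 2 contain most of the cases.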
Slide 14

Numerical problems








The maximum likelihood method used to calculate multinomial
logistic regression is an iterative fitting process that attempts
to cycle through repetitions to find an answer.
Sometimes, the method will break down and not be able to
converge or find an answer.
Sometimes the method will produce wildly improbable results,
reporting that a one-unit change in an independent variable
increases the odds of the modeled event by hundreds of
thousands or millions. These implausible results can be
produced by multicollinearity, categories of predictors having
no cases or zero cells, and complete separation whereby the
two groups are perfectly separated by the scores on one or
more independent variables.
The clue that we have numerical problems and should not
interpret the results is a standard error larger than 2.0 for
one or more of the independent variables.
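
A quick screen for this clue might look like the sketch below. The standard errors are the ones reported for the highways and bridges example in this deck; the function name is hypothetical, and skipping the intercepts is an assumption based on the wording "for some independent variables."

```python
SE_THRESHOLD = 2.0

# (dependent variable group, effect) -> standard error
standard_errors = {
    ("1", "Intercept"): 2.478, ("1", "AGE"): 0.020,
    ("1", "EDUC"): 0.108, ("1", "CONLEGIS"): 0.620,
    ("2", "Intercept"): 2.456, ("2", "AGE"): 0.020,
    ("2", "EDUC"): 0.110, ("2", "CONLEGIS"): 0.613,
}

def flag_numerical_problems(ses, threshold=SE_THRESHOLD):
    """Return independent-variable entries whose standard error exceeds the threshold."""
    return sorted(
        key for key, se in ses.items()
        if key[1] != "Intercept" and se > threshold  # assumption: intercepts are not screened
    )

flagged = flag_numerical_problems(standard_errors)
print(flagged or "no standard errors above 2.0 for the independent variables")
```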

Relationship of individual independent
variables and the dependent variable

Slide 15


There are two types of tests for individual independent
variables:
• The likelihood ratio test evaluates the overall relationship
between an independent variable and the dependent
variable
• The Wald test evaluates whether or not the independent
variable is statistically significant in differentiating between
the two groups in each of the embedded binary logistic
comparisons.



If an independent variable has an overall relationship to the
dependent variable, it might or might not be statistically
significant in differentiating between pairs of groups defined by
the dependent variable.
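
For a single coefficient, the Wald statistic is the squared ratio of the coefficient to its standard error, tested against a chi-square distribution with 1 degree of freedom. The sketch below applies that to the CONLEGIS coefficients from the highways and bridges example in this deck. The function name is illustrative, and because the inputs are rounded to three decimals the results differ slightly from the SPSS values (4.913 and 7.298).

```python
import math

def wald_test(b, std_error):
    """Return the Wald statistic (B / SE)^2 and its p-value for 1 degree of freedom."""
    wald = (b / std_error) ** 2
    # Upper tail of chi-square with df = 1, expressed with the complementary error function
    p_value = math.erfc(math.sqrt(wald / 2.0))
    return wald, p_value

# CONLEGIS: group 1 vs. group 3, then group 2 vs. group 3
wald_1, p_1 = wald_test(-1.373, 0.620)
wald_2, p_2 = wald_test(-1.657, 0.613)

print(round(wald_1, 3), round(p_1, 3))
print(round(wald_2, 3), round(p_2, 3))
```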

Relationship of individual independent
variables and the dependent variable

Slide 16


The interpretation for an independent variable focuses on its
ability to distinguish between pairs of groups and the
contribution which it makes to changing the odds of being in
one dependent variable group rather than the other.



We should not interpret the significance of an independent
variable’s role in distinguishing between pairs of groups unless
the independent variable also has an overall relationship to the
dependent variable in the likelihood ratio test.



The interpretation of an independent variable’s role in
differentiating dependent variable groups is the same as we
used in binary logistic regression. The difference in
multinomial logistic regression is that we can have multiple
interpretations for an independent variable in relation to
different pairs of groups.
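
The rule of checking the likelihood ratio test before interpreting any Wald test can be written as a small decision function. This is a sketch only: the names are hypothetical, and the significance values are the ones reported for the highways and bridges example in this deck.

```python
ALPHA = 0.05

# Sig. from the Likelihood Ratio Tests table (overall relationship)
likelihood_ratio_sig = {"AGE": 0.265, "EDUC": 0.110, "CONLEGIS": 0.010}

# Sig. from the Parameter Estimates table (Wald test for each comparison)
wald_sig = {
    "AGE": {"1 vs 3": 0.341, "2 vs 3": 0.897},
    "EDUC": {"1 vs 3": 0.514, "2 vs 3": 0.117},
    "CONLEGIS": {"1 vs 3": 0.027, "2 vs 3": 0.007},
}

def interpretable_comparisons(variable):
    """Return the group comparisons we may interpret for one independent variable."""
    if likelihood_ratio_sig[variable] > ALPHA:
        return []  # no overall relationship, so no pairwise interpretation
    return [pair for pair, sig in wald_sig[variable].items() if sig <= ALPHA]

for name in likelihood_ratio_sig:
    print(name, interpretable_comparisons(name))
```

Only CONLEGIS passes the overall test here, and it is significant for both comparisons.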

Relationship of individual independent
variables and the dependent variable

Slide 17

SPSS identifies the comparisons it makes for groups defined by
the dependent variable in the table of 'Parameter Estimates,'
using either the value codes or the value labels, depending on
the options settings for pivot table labeling.

The reference category is identified in the footnote to the table.

In this analysis, two comparisons will be made:
• the TOO LITTLE group (coded 1) will be compared to the
TOO MUCH group (coded 3)
• the ABOUT RIGHT group (coded 2) will be compared to the
TOO MUCH group (coded 3).

The reference category plays the same role in multinomial
logistic regression that it plays in the dummy-coding of a
nominal variable: it is the category that would be coded with
zeros for all of the dummy-coded variables that all other
categories are interpreted against.

Parameter Estimates

HIGHWAYS AND                                                               95% CI for Exp(B)
BRIDGES(a)         Effect        B      Std. Error   Wald    df   Sig.   Exp(B)   Lower Bound
1 (TOO LITTLE)     Intercept    3.240     2.478      1.709    1   .191
                   AGE           .019      .020       .906    1   .341   1.019       .980
                   EDUC          .071      .108       .427    1   .514   1.073       .868
                   CONLEGIS    -1.373      .620      4.913    1   .027    .253       .075
2 (ABOUT RIGHT)    Intercept    3.639     2.456      2.195    1   .138
                   AGE           .003      .020       .017    1   .897   1.003       .963
                   EDUC          .172      .110      2.463    1   .117   1.188       .958
                   CONLEGIS    -1.657      .613      7.298    1   .007    .191       .057

a. The reference category is: 3 (TOO MUCH).

Relationship of individual independent
variables and the dependent variable

Slide 18

Likelihood Ratio Tests

             -2 Log Likelihood
Effect       of Reduced Model     Chi-Square   df   Sig.
Intercept        268.323             2.350      2   .309
AGE              268.625             2.652      2   .265
EDUC             270.395             4.423      2   .110
CONLEGIS         275.194             9.221      2   .010

The chi-square statistic is the difference in -2 log-likelihoods
between the final model and a reduced model. The reduced model is
formed by omitting an effect from the final model. The null hypothesis
is that all parameters of that effect are 0.

In this example, there is a statistically significant
relationship between the independent variable CONLEGIS and the
dependent variable. (0.010 < 0.05)

As well, in the Parameter Estimates table, the independent
variable CONLEGIS is significant in distinguishing category 1 of
the dependent variable from category 3 of the dependent
variable. (0.027 < 0.05)

And the independent variable CONLEGIS is significant in
distinguishing category 2 of the dependent variable from
category 3 of the dependent variable. (0.007 < 0.05)
Interpreting relationship of individual independent
variables to the dependent variable

Slide 19

Survey respondents who had less confidence in Congress (higher
values correspond to lower confidence) were less likely to be in the
group of survey respondents who thought we spend too little money
on highways and bridges (DV category 1), rather than the group of
survey respondents who thought we spend too much money on
highways and bridges (DV category 3).

For each one-unit increase in the confidence in Congress score
[conlegis] (that is, each step toward lower confidence), the odds of
being in the group of survey respondents who thought we spend too
little money on highways and bridges decreased by 74.7%.
(Exp(B) - 1.0 = 0.253 - 1.0 = -0.747)
Interpreting relationship of individual independent
variables to the dependent variable

Slide 20

Survey respondents who had less confidence in Congress (higher
values correspond to lower confidence) were less likely to be in the
group of survey respondents who thought we spend about the right
amount of money on highways and bridges (DV category 2), rather
than the group of survey respondents who thought we spend too
much money on highways and bridges (DV category 3).

For each one-unit increase in the confidence in Congress score
[conlegis] (that is, each step toward lower confidence), the odds of
being in the group of survey respondents who thought we spend about
the right amount of money on highways and bridges decreased by
80.9%. (Exp(B) - 1.0 = 0.191 - 1.0 = -0.809)
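
The percentage change in the odds follows directly from the B coefficient, since Exp(B) is the odds ratio for a one-unit increase in the independent variable. A minimal sketch, with an illustrative function name and the CONLEGIS coefficients from the Parameter Estimates table:

```python
import math

def percent_change_in_odds(b):
    """Percentage change in the odds for a one-unit increase: (Exp(B) - 1) x 100."""
    return (math.exp(b) - 1.0) * 100.0

# CONLEGIS coefficients for the two comparisons against the reference group
too_little_vs_too_much = percent_change_in_odds(-1.373)
about_right_vs_too_much = percent_change_in_odds(-1.657)

print(round(too_little_vs_too_much, 1))
print(round(about_right_vs_too_much, 1))
```

A negative B gives an Exp(B) below 1 and therefore a decrease in the odds; a positive B gives an increase.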

Relationship of individual independent
variables and the dependent variable

Slide 21

Likelihood Ratio Tests

             -2 Log Likelihood
Effect       of Reduced Model     Chi-Square   df   Sig.
Intercept        327.463a             .000      0   .
AGE              333.440             5.976      2   .050
EDUC             329.606             2.143      2   .343
POLVIEWS         334.636             7.173      2   .028
SEX              338.985            11.521      2   .003

The chi-square statistic is the difference in -2 log-likelihoods
between the final model and a reduced model. The reduced model
is formed by omitting an effect from the final model. The null
hypothesis is that all parameters of that effect are 0.
a. This reduced model is equivalent to the final model because
omitting the effect does not increase the degrees of freedom.

Parameter Estimates

                                                                           95% CI for Exp(B)
NATCHLD(a)       Effect        B      Std. Error    Wald    df   Sig.   Exp(B)   Lower Bound
TOO LITTLE       Intercept    8.434     2.233      14.261    1   .000
                 AGE          -.023      .017       1.756    1   .185    .977       .944
                 EDUC         -.066      .102        .414    1   .520    .936       .766
                 POLVIEWS     -.575      .251       5.234    1   .022    .563       .344
                 [SEX=1]     -2.167      .805       7.242    1   .007    .115       .024
                 [SEX=2]       0b        .           .       0   .       .          .
ABOUT RIGHT      Intercept    4.485     2.255       3.955    1   .047
                 AGE          -.001      .018        .003    1   .955    .999       .965
                 EDUC          .011      .104        .011    1   .916   1.011       .824
                 POLVIEWS     -.397      .257       2.375    1   .123    .673       .406
                 [SEX=1]     -1.606      .824       3.800    1   .051    .201       .040
                 [SEX=2]       0b        .           .       0   .       .          .

a. The reference category is: TOO MUCH.

In this example, there is a statistically significant
relationship between SEX and the dependent variable, spending on
childcare assistance.

As well, SEX plays a statistically significant role in
differentiating the TOO LITTLE group from the TOO MUCH
(reference) group. (0.007 < 0.05)

However, SEX does not differentiate the ABOUT RIGHT group from
the TOO MUCH (reference) group. (0.051 > 0.05)
Slide 22

Interpreting relationship of individual independent
variables and the dependent variable

Survey respondents who were male (code 1 for sex) were less likely
to be in the group of survey respondents who thought we spend too
little money on childcare assistance (DV category 1), rather than the
group of survey respondents who thought we spend too much
money on childcare assistance (DV category 3).

For survey respondents who were male, the odds of being in the
group of survey respondents who thought we spend too little money
on childcare assistance were 88.5% lower.
(Exp(B) - 1.0 = 0.115 - 1.0 = -0.885)

Interpreting relationships for independent
variable in problems

Slide 23


In the multinomial logistic regression problems, the problem
statement will ask about only one of the independent variables.
The answer will be true or false based on only the relationship
between the specified independent variable and the dependent
variable. The individual relationships between other
independent variables and the dependent variable are not used
in determining whether or not the answer is true or false.
Slide 24

Problem 1
11. In the dataset GSS2000, is the following statement true, false, or an incorrect application
of a statistic? Assume that there is no problem with missing data, outliers, or influential cases,
and that the validation analysis will confirm the generalizability of the results. Use a level of
significance of 0.05 for evaluating the statistical relationships.
The variables "age" [age], "highest year of school completed" [educ] and "confidence in
Congress" [conlegis] were useful predictors for distinguishing between groups based on
responses to "opinion about spending on highways and bridges" [natroad]. These predictors
differentiate survey respondents who thought we spend too little money on highways and
bridges from survey respondents who thought we spend too much money on highways and
bridges and survey respondents who thought we spend about the right amount of money on
highways and bridges from survey respondents who thought we spend too much money on
highways and bridges.
Among this set of predictors, confidence in Congress was helpful in distinguishing among the
groups defined by responses to opinion about spending on highways and bridges. Survey
respondents who had less confidence in congress were less likely to be in the group of survey
respondents who thought we spend too little money on highways and bridges, rather than the
group of survey respondents who thought we spend too much money on highways and bridges.
For each unit increase in confidence in Congress, the odds of being in the group of survey
respondents who thought we spend too little money on highways and bridges decreased by
74.7%. Survey respondents who had less confidence in congress were less likely to be in the
group of survey respondents who thought we spend about the right amount of money on
highways and bridges, rather than the group of survey respondents who thought we spend too
much money on highways and bridges. For each unit increase in confidence in Congress, the
odds of being in the group of survey respondents who thought we spend about the right amount
of money on highways and bridges decreased by 80.9%.
1. True
2. True with caution
3. False
4. Inappropriate application of a statistic
Slide 25

Dissecting problem 1 - 1
11. In the dataset GSS2000, is the following statement true, false, or an incorrect application
of a statistic? Assume that there is no problem with missing data, outliers, or influential cases,
and that the validation analysis will confirm the generalizability of the results. Use a level of
significance of 0.05 for evaluating the statistical relationships.

For these problems, we will assume that there is no problem
with missing data, outliers, or influential cases, and that the
validation analysis will confirm the generalizability of the
results.

In this problem, we are told to use 0.05 as alpha for the
multinomial logistic regression.
Slide 26

Dissecting problem 1 - 2
The variables listed first in the problem statement are the
independent variables (IVs): "age" [age], "highest year of school
completed" [educ] and "confidence in Congress" [conlegis].

The variable used to define groups is the dependent variable
(DV): "opinion about spending on highways and bridges" [natroad].

SPSS only supports direct or simultaneous entry of independent
variables in multinomial logistic regression, so we have no choice
of method for entering variables.
Slide 27

Dissecting problem 1 - 3
SPSS multinomial logistic regression models the relationship by
comparing each of the groups defined by the dependent variable to the
group with the highest code value.

The responses to opinion about spending on highways and bridges were:
1 = Too little, 2 = About right, and 3 = Too much.

The analysis will result in two comparisons:
• survey respondents who thought we spend too little money
versus survey respondents who thought we spend too much
money on highways and bridges
• survey respondents who thought we spend about the right
amount of money versus survey respondents who thought we
spend too much money on highways and bridges.
Slide 28

Dissecting problem 1 - 4

Each problem includes a statement about the relationship between
one independent variable and the dependent variable. The answer
to the problem is based on the stated relationship, ignoring the
relationships between the other independent variables and the
dependent variable.

This problem identifies a difference for both of the comparisons
among groups modeled by the multinomial logistic regression.

Among this set of predictors, confidence in Congress was helpful in distinguishing among the
groups defined by responses to opinion about spending on highways and bridges. Survey
respondents who had less confidence in congress were less likely to be in the group of
survey respondents who thought we spend too little money on highways and bridges, rather
than the group of survey respondents who thought we spend too much money on highways
and bridges. For each unit increase in confidence in Congress, the odds of being in the
group of survey respondents who thought we spend too little money on highways and
bridges decreased by 74.7%. Survey respondents who had less confidence in congress were
less likely to be in the group of survey respondents who thought we spend about the right
amount of money on highways and bridges, rather than the group of survey respondents
who thought we spend too much money on highways and bridges. For each unit increase in
confidence in Congress, the odds of being in the group of survey respondents who thought
we spend about the right amount of money on highways and bridges decreased by 80.9%.
Slide
29

Dissecting problem 1 - 5
11. In the dataset GSS2000, is the following statement true, false, or an incorrect application
of a statistic? Assume that there is no problem with missing data, outliers, or influential cases,
and that the validation analysis will confirm the generalizability of the results. Use a level of
significance of 0.05 for evaluating the statistical relationships.
The variables "age" [age], "highest year of school completed" [educ] and "confidence in
Congress" [conlegis] were useful predictors for distinguishing between groups based on
responses to "opinion about spending on highways and bridges" [natroad]. These predictors
differentiate survey respondents who thought we spend too little money on highways and
bridges from survey respondents who thought we spend too much money on highways and
bridges and survey respondents who thought we spend about the right amount of money on
highways and bridges from survey respondents who thought we spend too much money on
highways and bridges.
Among this set of predictors, confidence in Congress was helpful in distinguishing among the
groups defined by responses to opinion about spending on highways and bridges. Survey
respondents who had less confidence in congress were less likely to be in the group of survey
respondents who thought we spend too little money on highways and bridges, rather than the
group of survey respondents who thought we spend too much money on highways and bridges.
For each unit increase in confidence in Congress, the odds of being in the group of survey
respondents who thought we spend too little money on highways and bridges decreased by
74.7%. Survey respondents who had less confidence in congress were less likely to be in the
group of survey respondents who thought we spend about the right amount of money on
highways and bridges, rather than the group of survey respondents who thought we spend too
much money on highways and bridges. For each unit increase in confidence in Congress, the
odds of being in the group of survey respondents who thought we spend about the right amount
of money on highways and bridges decreased by 80.9%.

In order for the multinomial logistic regression question to be true, the overall
relationship must be statistically significant, there must be no evidence of numerical
problems, the classification accuracy rate must be substantially better than could be
obtained by chance alone, and the stated individual relationship must be statistically
significant and interpreted correctly.
Slide
30

Request multinomial logistic regression

Select the Regression |
Multinomial Logistic…
command from the
Analyze menu.
Slide
31

Selecting the dependent variable

First, highlight the
dependent variable
natroad in the list
of variables.

Second, click on the right
arrow button to move the
dependent variable to the
Dependent text box.
Slide
32

Selecting metric independent variables
Metric independent variables are specified as covariates
in multinomial logistic regression. Metric variables can
be either interval or, by convention, ordinal.

Move the metric
independent variables,
age, educ and conlegis to
the Covariate(s) list box.

In this analysis, there are no nonmetric independent variables. Nonmetric independent variables would be
moved to the Factor(s) list box.
Slide
33

Specifying statistics to include in the output

While we will accept most of
the SPSS defaults for the
analysis, we need to specifically
request the classification table.
Click on the Statistics… button
to make a request.
Slide
34

Requesting the classification table

First, keep the SPSS
defaults for Summary
statistics, Likelihood
ratio test, and
Parameter estimates.

Second, mark the
checkbox for the
Classification table.

Third, click
on the
Continue
button to
complete the
request.
Slide
35

Completing the multinomial
logistic regression request

Click on the OK
button to request
the output for the
multinomial logistic
regression.

The multinomial logistic procedure supports
additional commands to specify the model
computed for the relationships (we will use the
default main effects model), additional
specifications for computing the regression,
and saving classification results. We will not
make use of these options.
Slide
36

LEVEL OF MEASUREMENT - 1
11. In the dataset GSS2000, is the following statement true, false, or an incorrect application
of a statistic? Assume that there is no problem with missing data, outliers, or influential cases,
and that the validation analysis will confirm the generalizability of the results. Use a level of
significance of 0.05 for evaluating the statistical relationships.
The variables "age" [age], "highest year of school completed" [educ] and "confidence in
Congress" [conlegis] were useful predictors for distinguishing between groups based on
responses to "opinion about spending on highways and bridges" [natroad]. These predictors
differentiate survey respondents who thought we spend too little money on highways and
bridges from survey respondents who thought we spend too much money on highways and
bridges and survey respondents who thought we spend about the right amount of money on
highways and bridges from survey respondents who thought we spend too much money on
highways and bridges.
Among this set of predictors, confidence in Congress was helpful in distinguishing among the
groups defined by responses to opinion about spending on highways and bridges. Survey
respondents who had less confidence in congress were less likely to be in the group of survey
respondents who thought we spend too little money on highways and bridges, rather than the
group of survey respondents who thought we spend too much money on highways and bridges.
For each unit increase in confidence in Congress, the odds of being in the group of survey
respondents who thought we spend too little money on highways and bridges decreased by
74.7%. Survey respondents who had less confidence in congress were less likely to be in the
group of survey respondents who thought we spend about the right amount of money on
highways and bridges, rather than the group of survey respondents who thought we spend too
much money on highways and bridges. For each unit increase in confidence in Congress, the
odds of being in the group of survey respondents who thought we spend about the right amount
of money on highways and bridges decreased by 80.9%.
1. True
2. True with caution

Multinomial logistic regression requires that the dependent variable be non-metric and the
independent variables be metric or dichotomous.

"Opinion about spending on highways and bridges" [natroad] is ordinal, satisfying the
nonmetric level of measurement requirement for the dependent variable.

It contains three categories: survey respondents who thought we spend too little money,
about the right amount of money, and too much money on highways and bridges.
Slide
37

LEVEL OF MEASUREMENT - 2
11. In the dataset GSS2000, is the following statement true, false, or an incorrect application
of a statistic? Assume that there is no problem with missing data, outliers, or influential cases,
and that the validation analysis will confirm the generalizability of the results. Use a level of
significance of 0.05 for evaluating the statistical relationships.
The variables "age" [age], "highest year of school completed" [educ] and "confidence in
Congress" [conlegis] were useful predictors for distinguishing between groups based on
responses to "opinion about spending on highways and bridges" [natroad]. These predictors
differentiate survey respondents who thought we spend too little money on highways and
bridges from survey respondents who thought we spend too much money on highways and
bridges and survey respondents who thought we spend about the right amount of money on
highways and bridges from survey respondents who thought we spend too much money on
highways and bridges.
Among this set of predictors, confidence in Congress was helpful in distinguishing among the
groups defined by responses to opinion about spending on highways and bridges. Survey
respondents who had less confidence in congress were less likely to be in the group of survey
respondents who thought we spend too little money on highways and bridges, rather than the
group of survey respondents who thought we spend too much money on highways and bridges.
For each unit increase in confidence in Congress, the odds of being in the group of survey
respondents who thought we spend too little money on highways and bridges decreased by
74.7%. Survey respondents who had less confidence in congress were less likely to be in the
group of survey respondents who thought we spend about the right amount of money on
highways and bridges, rather than the group of survey respondents who thought we spend too
much money on highways and bridges. For each unit increase in confidence in Congress, the
odds of being in the group of survey respondents who thought we spend about the right amount
of money on highways and bridges decreased by 80.9%.

"Age" [age] and "highest year of school completed" [educ] are interval, satisfying the
metric or dichotomous level of measurement requirement for independent variables.

"Confidence in Congress" [conlegis] is ordinal, satisfying the metric or dichotomous level of
measurement requirement for independent variables. If we follow the convention of treating
ordinal level variables as metric variables, the level of measurement requirement for the
analysis is satisfied. Since some data analysts do not agree with this convention, a note of
caution should be included in our interpretation.
Slide
38

Sample size – ratio of cases to variables
Case Processing Summary

                                 N     Marginal Percentage
HIGHWAYS AND BRIDGES   1        62      37.1%
                       2        93      55.7%
                       3        12       7.2%
Valid                          167     100.0%
Missing                        103
Total                          270
Subpopulation                  153 (a)

a. The dependent variable has only one value observed in 146 (95.4%) subpopulations.

Multinomial logistic regression requires that the minimum ratio
of valid cases to independent variables be at least 10 to 1. The
ratio of valid cases (167) to number of independent variables
(3) was 55.7 to 1, which was equal to or greater than the
minimum ratio. The requirement for a minimum ratio of cases
to independent variables was satisfied.
The preferred ratio of valid cases to independent variables is
20 to 1. The ratio of 55.7 to 1 was equal to or greater than the
preferred ratio. The preferred ratio of cases to independent
variables was satisfied.
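
The ratio check above is simple arithmetic. A minimal Python sketch, using the counts from the Case Processing Summary (the function and variable names are illustrative, not part of SPSS):

```python
def cases_per_variable(valid_cases: int, n_independent: int) -> float:
    """Ratio of valid cases to independent variables."""
    return valid_cases / n_independent

# 167 valid cases and 3 independent variables (age, educ, conlegis)
ratio = cases_per_variable(167, 3)
meets_minimum = ratio >= 10    # minimum ratio of 10 to 1
meets_preferred = ratio >= 20  # preferred ratio of 20 to 1

print(f"{ratio:.1f} to 1; minimum met: {meets_minimum}; preferred met: {meets_preferred}")
```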
Slide
39

OVERALL RELATIONSHIP BETWEEN
INDEPENDENT AND DEPENDENT VARIABLES
Model Fitting Information

Model             -2 Log Likelihood   Chi-Square   df   Sig.
Intercept Only    284.429
Final             265.972             18.457       6    .005

The presence of a relationship between the dependent
variable and combination of independent variables is
based on the statistical significance of the final model
chi-square in the SPSS table titled "Model Fitting
Information".
In this analysis, the probability of the model chi-square
(18.457) was 0.005, less than or equal to the level of
significance of 0.05. The null hypothesis that there was
no difference between the model without independent
variables and the model with independent variables
was rejected. The existence of a relationship between
the independent variables and the dependent variable
was supported.
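
As a rough check on the table, the model chi-square is the difference between the two -2 log likelihood values, and its probability is the upper tail of a chi-square distribution with 6 degrees of freedom. The sketch below uses only the Python standard library and relies on a closed form for the chi-square tail that holds for even degrees of freedom; the helper name `chi2_sf_even_df` is an assumption made for illustration.

```python
import math

def chi2_sf_even_df(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square statistic (closed form, even df only)."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    return math.exp(-half) * sum(half**k / math.factorial(k) for k in range(df // 2))

neg2ll_intercept_only = 284.429  # from Model Fitting Information
neg2ll_final = 265.972

model_chi_square = neg2ll_intercept_only - neg2ll_final
p_value = chi2_sf_even_df(model_chi_square, df=6)

print(round(model_chi_square, 3), round(p_value, 3))
```

With the values from the slide this reproduces the chi-square of 18.457 and a probability that rounds to .005.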
Slide
40

NUMERICAL PROBLEMS
Parameter Estimates

HIGHWAYS AND BRIDGES (a)
                       B   Std. Error    Wald   df   Sig.   Exp(B)   95% CI for Exp(B), Lower Bound
1   Intercept      3.240        2.478   1.709    1   .191
    AGE             .019         .020    .906    1   .341    1.019    .980
    EDUC            .071         .108    .427    1   .514    1.073    .868
    CONLEGIS      -1.373         .620   4.913    1   .027     .253    .075
2   Intercept      3.639        2.456   2.195    1   .138
    AGE             .003         .020    .017    1   .897    1.003    .963
    EDUC            .172         .110   2.463    1   .117    1.188    .958
    CONLEGIS      -1.657         .613   7.298    1   .007     .191    .057

a. The reference category is: 3.

Multicollinearity in the multinomial logistic regression solution is
detected by examining the standard errors for the b coefficients. A
standard error larger than 2.0 indicates numerical problems, such
as multicollinearity among the independent variables, zero cells for
a dummy-coded independent variable because all of the subjects
have the same value for the variable, and 'complete separation'
whereby the two groups in the dependent event variable can be
perfectly separated by scores on one of the independent variables.
Analyses that indicate numerical problems should not be interpreted.

None of the independent variables
in this analysis had a standard error
larger than 2.0. (We are not
interested in the standard errors
associated with the intercept.)
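
The screening rule described above can be sketched in a few lines of Python. The standard errors are copied from the Parameter Estimates table; the dictionary layout and the name `SE_THRESHOLD` are illustrative choices, and the 2.0 cutoff is the rule of thumb stated on the slide.

```python
# Standard errors from the Parameter Estimates table, keyed by (group, term)
standard_errors = {
    (1, "Intercept"): 2.478, (1, "AGE"): 0.020, (1, "EDUC"): 0.108, (1, "CONLEGIS"): 0.620,
    (2, "Intercept"): 2.456, (2, "AGE"): 0.020, (2, "EDUC"): 0.110, (2, "CONLEGIS"): 0.613,
}
SE_THRESHOLD = 2.0  # rule of thumb from the slide

# Intercepts are skipped, as on the slide; only independent variables are screened
flagged = [
    key for key, se in standard_errors.items()
    if key[1] != "Intercept" and se > SE_THRESHOLD
]

print(flagged if flagged else "No evidence of numerical problems")
```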
Slide
41

RELATIONSHIP OF INDIVIDUAL INDEPENDENT
VARIABLES TO DEPENDENT VARIABLE - 1
Likelihood Ratio Tests

              -2 Log Likelihood
Effect        of Reduced Model     Chi-Square   df   Sig.
Intercept     268.323              2.350        2    .309
AGE           268.625              2.652        2    .265
EDUC          270.395              4.423        2    .110
CONLEGIS      275.194              9.221        2    .010

The chi-square statistic is the difference in -2 log-likelihoods
between the final model and a reduced model. The reduced model is
formed by omitting an effect from the final model. The null hypothesis
is that all parameters of that effect are 0.

The statistical significance of the relationship between
confidence in Congress and opinion about spending on
highways and bridges is based on the statistical significance of
the chi-square statistic in the SPSS table titled "Likelihood
Ratio Tests".
For this relationship, the probability of the chi-square statistic
(9.221) was 0.010, less than or equal to the level of
significance of 0.05. The null hypothesis that all of the b
coefficients associated with confidence in Congress were equal
to zero was rejected. The existence of a relationship between
confidence in Congress and opinion about spending on
highways and bridges was supported.
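
A small Python sketch of the likelihood ratio test for confidence in Congress, using the -2 log likelihood of the final model (265.972, from the Model Fitting Information table) and of the reduced model without CONLEGIS. For 2 degrees of freedom the chi-square upper tail reduces to exp(-x/2), so no statistics library is needed. Because the inputs are rounded to three decimals, the difference comes out as 9.222 rather than the 9.221 SPSS reports.

```python
import math

neg2ll_final = 265.972             # final model
neg2ll_without_conlegis = 275.194  # reduced model omitting CONLEGIS

lr_chi_square = neg2ll_without_conlegis - neg2ll_final
# Chi-square upper tail for df = 2 (one CONLEGIS coefficient per comparison)
p_value = math.exp(-lr_chi_square / 2.0)

print(round(lr_chi_square, 3), round(p_value, 3))
```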

RELATIONSHIP OF INDIVIDUAL INDEPENDENT
VARIABLES TO DEPENDENT VARIABLE - 2

Slide
42

Parameter Estimates

HIGHWAYS AND BRIDGES (a)
                       B   Std. Error    Wald   df   Sig.   Exp(B)   95% CI for Exp(B), Lower Bound
1   Intercept      3.240        2.478   1.709    1   .191
    AGE             .019         .020    .906    1   .341    1.019    .980
    EDUC            .071         .108    .427    1   .514    1.073    .868
    CONLEGIS      -1.373         .620   4.913    1   .027     .253    .075
2   Intercept      3.639        2.456   2.195    1   .138
    AGE             .003         .020    .017    1   .897    1.003    .963
    EDUC            .172         .110   2.463    1   .117    1.188    .958
    CONLEGIS      -1.657         .613   7.298    1   .007     .191    .057

a. The reference category is: 3.

In the comparison of survey respondents who thought we spend
too little money on highways and bridges to survey respondents
who thought we spend too much money on highways and
bridges, the probability of the Wald statistic (4.913) for the
variable confidence in Congress [conlegis] was 0.027. Since the
probability was less than or equal to the level of significance of
0.05, the null hypothesis that the b coefficient for confidence in
Congress was equal to zero for this comparison was rejected.
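
The Wald statistic is the squared ratio of a coefficient to its standard error, evaluated against a chi-square distribution with 1 degree of freedom. A hedged Python sketch follows; the function name `wald_test` is illustrative. Because B and the standard error are rounded in the table, the recomputed statistic (about 4.90) differs slightly from the 4.913 that SPSS reports.

```python
import math

def wald_test(b: float, std_error: float) -> tuple[float, float]:
    """Return the Wald chi-square (1 df) and its p-value."""
    wald = (b / std_error) ** 2
    # For 1 df, the chi-square upper tail equals erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(wald / 2.0))
    return wald, p_value

# CONLEGIS, comparison of group 1 (too little) with group 3 (too much)
wald, p = wald_test(b=-1.373, std_error=0.620)
print(round(wald, 3), round(p, 3))
```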

RELATIONSHIP OF INDIVIDUAL INDEPENDENT
VARIABLES TO DEPENDENT VARIABLE - 3

Slide
43

Parameter Estimates

HIGHWAYS AND BRIDGES (a)
                       B   Std. Error    Wald   df   Sig.   Exp(B)   95% CI for Exp(B), Lower Bound
1   Intercept      3.240        2.478   1.709    1   .191
    AGE             .019         .020    .906    1   .341    1.019    .980
    EDUC            .071         .108    .427    1   .514    1.073    .868
    CONLEGIS      -1.373         .620   4.913    1   .027     .253    .075
2   Intercept      3.639        2.456   2.195    1   .138
    AGE             .003         .020    .017    1   .897    1.003    .963
    EDUC            .172         .110   2.463    1   .117    1.188    .958
    CONLEGIS      -1.657         .613   7.298    1   .007     .191    .057

a. The reference category is: 3.

The value of Exp(B) was 0.253 which implies that for each unit
increase in confidence in Congress the odds decreased by 74.7%
(0.253 - 1.0 = -0.747).
The relationship stated in the problem is supported. Survey
respondents who had less confidence in congress were less likely
to be in the group of survey respondents who thought we spend
too little money on highways and bridges, rather than the group of
survey respondents who thought we spend too much money on
highways and bridges. For each unit increase in confidence in
Congress, the odds of being in the group of survey respondents
who thought we spend too little money on highways and bridges
decreased by 74.7%.
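
The conversion from a b coefficient to a percentage change in odds can be sketched as follows. Exp(B) is the exponential of the coefficient, and subtracting 1 gives the proportional change in the odds for a one-unit increase in the predictor. The function name is illustrative; the coefficients are the two CONLEGIS values from the Parameter Estimates table.

```python
import math

def percent_change_in_odds(b: float) -> float:
    """Percentage change in the odds for a one-unit increase in the predictor."""
    return (math.exp(b) - 1.0) * 100.0

b_too_little = -1.373   # CONLEGIS, group 1 versus group 3
b_about_right = -1.657  # CONLEGIS, group 2 versus group 3

exp_b = math.exp(b_too_little)
change_too_little = percent_change_in_odds(b_too_little)
change_about_right = percent_change_in_odds(b_about_right)

print(round(exp_b, 3), round(change_too_little, 1), round(change_about_right, 1))
```

A negative result is a decrease in the odds, so the two values correspond to the 74.7% and 80.9% decreases quoted in the problem statement.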

RELATIONSHIP OF INDIVIDUAL INDEPENDENT
VARIABLES TO DEPENDENT VARIABLE - 4

Slide
44

Parameter Estimates

HIGHWAYS AND BRIDGES (a)
                       B   Std. Error    Wald   df   Sig.   Exp(B)   95% CI for Exp(B), Lower Bound
1   Intercept      3.240        2.478   1.709    1   .191
    AGE             .019         .020    .906    1   .341    1.019    .980
    EDUC            .071         .108    .427    1   .514    1.073    .868
    CONLEGIS      -1.373         .620   4.913    1   .027     .253    .075
2   Intercept      3.639        2.456   2.195    1   .138
    AGE             .003         .020    .017    1   .897    1.003    .963
    EDUC            .172         .110   2.463    1   .117    1.188    .958
    CONLEGIS      -1.657         .613   7.298    1   .007     .191    .057

a. The reference category is: 3.

In the comparison of survey respondents who thought we spend
about the right amount of money on highways and bridges to
survey respondents who thought we spend too much money on
highways and bridges, the probability of the Wald statistic
(7.298) for the variable confidence in Congress [conlegis] was
0.007. Since the probability was less than or equal to the level
of significance of 0.05, the null hypothesis that the b coefficient
for confidence in Congress was equal to zero for this comparison
was rejected.
Slide
45

RELATIONSHIP OF INDIVIDUAL INDEPENDENT
VARIABLES TO DEPENDENT VARIABLE - 5
Parameter Estimates

HIGHWAYS AND BRIDGES (a)
                       B   Std. Error    Wald   df   Sig.   Exp(B)   95% CI for Exp(B), Lower Bound
1   Intercept      3.240        2.478   1.709    1   .191
    AGE             .019         .020    .906    1   .341    1.019    .980
    EDUC            .071         .108    .427    1   .514    1.073    .868
    CONLEGIS      -1.373         .620   4.913    1   .027     .253    .075
2   Intercept      3.639        2.456   2.195    1   .138
    AGE             .003         .020    .017    1   .897    1.003    .963
    EDUC            .172         .110   2.463    1   .117    1.188    .958
    CONLEGIS      -1.657         .613   7.298    1   .007     .191    .057

a. The reference category is: 3.

The value of Exp(B) was 0.191 which implies that for each unit increase in
confidence in Congress the odds decreased by 80.9% (0.191 - 1.0 = -0.809).
The relationship stated in the problem is supported. Survey respondents
who had less confidence in congress were less likely to be in the group of
survey respondents who thought we spend about the right amount of
money on highways and bridges, rather than the group of survey
respondents who thought we spend too much money on highways and
bridges. For each unit increase in confidence in Congress, the odds of
being in the group of survey respondents who thought we spend about the
right amount of money on highways and bridges decreased by 80.9%.
Slide
46

CLASSIFICATION USING THE MULTINOMIAL LOGISTIC
REGRESSION MODEL: BY CHANCE ACCURACY RATE
The independent variables could be characterized as useful
predictors distinguishing survey respondents who thought we
spend too little money on highways and bridges, survey
respondents who thought we spend about the right amount
of money on highways and bridges and survey respondents
who thought we spend too much money on highways and
bridges if the classification accuracy rate was substantially
higher than the accuracy attainable by chance alone.
Operationally, the classification accuracy rate should be at
least 25% higher than the proportional by chance accuracy
rate.

Case Processing Summary

                                 N     Marginal Percentage
HIGHWAYS AND BRIDGES   1        62      37.1%
                       2        93      55.7%
                       3        12       7.2%
Valid                          167     100.0%
Missing                        103
Total                          270
Subpopulation                  153 (a)

a. The dependent variable has only one value observed in 146 (95.4%) subpopulations.

The proportional by chance accuracy rate was computed by
calculating the proportion of cases for each group based on
the number of cases in each group in the 'Case Processing
Summary', and then squaring and summing the proportion of
cases in each group (0.371² + 0.557² + 0.072² = 0.453).
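
The by chance accuracy computation can be reproduced with a short Python sketch. The group counts come from the Case Processing Summary; the 1.25 multiplier is the 25% improvement criterion described on this slide, and the variable names are illustrative.

```python
# Valid cases per group of the dependent variable (natroad)
group_counts = {1: 62, 2: 93, 3: 12}
total_valid = sum(group_counts.values())

# Proportion of cases in each group, squared and summed
proportions = [count / total_valid for count in group_counts.values()]
by_chance_accuracy = sum(p ** 2 for p in proportions)

# The model should beat the by chance rate by at least 25%
accuracy_criterion = 1.25 * by_chance_accuracy

print(round(by_chance_accuracy, 3), round(accuracy_criterion, 3))
```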
Slide
47

CLASSIFICATION USING THE MULTINOMIAL LOGISTIC
REGRESSION MODEL: CLASSIFICATION ACCURACY

Classification

                                 Predicted
Observed                  1        2        3     Percent Correct
1                        15       47        0     24.2%
2                         7       86        0     92.5%
3                         5        7        0       .0%
Overall Percentage    16.2%    83.8%      .0%     60.5%

The classification accuracy rate was 60.5%, which was greater
than or equal to the proportional by chance accuracy criterion
of 56.6% (1.25 x 45.3% = 56.6%).
The criterion for classification accuracy is satisfied.
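
The overall accuracy rate is the sum of the diagonal of the classification table divided by the number of valid cases. A minimal Python sketch, with rows as observed groups and columns as predicted groups; the criterion of 1.25 x 0.453 is taken from the by chance accuracy slide, and the variable names are illustrative.

```python
# Rows: observed group 1, 2, 3. Columns: predicted group 1, 2, 3.
classification = [
    [15, 47, 0],
    [7, 86, 0],
    [5, 7, 0],
]

correct = sum(classification[i][i] for i in range(len(classification)))
total = sum(sum(row) for row in classification)
accuracy = correct / total

criterion = 1.25 * 0.453  # proportional by chance accuracy criterion
meets_criterion = accuracy >= criterion

print(f"{accuracy:.1%} versus criterion {criterion:.1%}: {meets_criterion}")
```

Note that the model never predicts group 3, so the overall rate is carried entirely by groups 1 and 2.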
Slide
48

Answering the question in problem 1 - 1
11. In the dataset GSS2000, is the following statement true, false, or an incorrect application
of a statistic? Assume that there is no problem with missing data, outliers, or influential cases,
and that the validation analysis will confirm the generalizability of the results. Use a level of
significance of 0.05 for evaluating the statistical relationships.
The variables "age" [age], "highest year of school completed" [educ] and "confidence in
Congress" [conlegis] were useful predictors for distinguishing between groups based on
responses to "opinion about spending on highways and bridges" [natroad]. These predictors
differentiate survey respondents who thought we spend too little money on highways and
bridges from survey respondents who thought we spend too much money on highways and
bridges and survey respondents who thought we spend about the right amount of money on
highways and bridges from survey respondents who thought we spend too much money on
highways and bridges.
Among this set of predictors, confidence in Congress was helpful in distinguishing among the
groups defined by responses to opinion about spending on highways and bridges. Survey
respondents who had less confidence in congress were less likely to be in the group of survey
respondents who thought we spend too little money on highways and bridges, rather than the
group of survey respondents who thought we spend too much money on highways and bridges.
For each unit increase in confidence in Congress, the odds of being in the group of survey
respondents who thought we spend too little money on highways and bridges decreased by
74.7%. Survey respondents who had less confidence in congress were less likely to be in the
group of survey respondents who thought we spend about the right amount of money on
highways and bridges, rather than the group of survey respondents who thought we spend too
much money on highways and bridges. For each unit increase in confidence in Congress, the
odds of being in the group of survey respondents who thought we spend about the right amount
of money on highways and bridges decreased by 80.9%.
1. True
2. True with caution
3. False

We found a statistically significant overall relationship between the combination of
independent variables and the dependent variable.

There was no evidence of numerical problems in the solution.

Moreover, the classification accuracy surpassed the proportional by chance accuracy
criteria, supporting the utility of the model.
Slide
49

Answering the question in problem 1 - 2
We verified that each statement about the relationship between an independent variable and
the dependent variable was correct in both direction of the relationship and the change in
likelihood associated with a one-unit change of the independent variable, for both of the
comparisons between groups stated in the problem.

The variables "age" [age], "highest year of school completed" [educ] and "confidence in
Congress" [conlegis] were useful predictors for distinguishing between groups based on
responses to "opinion about spending on highways and bridges" [natroad]. These predictors
differentiate survey respondents who thought we spend too little money on highways and
bridges from survey respondents who thought we spend too much money on highways and
bridges and survey respondents who thought we spend about the right amount of money on
highways and bridges from survey respondents who thought we spend too much money on
highways and bridges.

Among this set of predictors, confidence in Congress was helpful in distinguishing among the
groups defined by responses to opinion about spending on highways and bridges. Survey
respondents who had less confidence in congress were less likely to be in the group of survey
respondents who thought we spend too little money on highways and bridges, rather than the
group of survey respondents who thought we spend too much money on highways and bridges.
For each unit increase in confidence in Congress, the odds of being in the group of survey
respondents who thought we spend too little money on highways and bridges decreased by
74.7%. Survey respondents who had less confidence in congress were less likely to be in the
group of survey respondents who thought we spend about the right amount of money on
highways and bridges, rather than the group of survey respondents who thought we spend too
much money on highways and bridges. For each unit increase in confidence in Congress, the
odds of being in the group of survey respondents who thought we spend about the right amount
of money on highways and bridges decreased by 80.9%.
1. True
2. True with caution
3. False
4. Inappropriate application of a statistic

The answer to the question is true
with caution.
A caution is added because of the
inclusion of ordinal level variables.
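
The way the individual checks combine into an answer can be sketched as a small decision function. This is only an illustration of the logic described for problem 1, under the assumption that all four checks must pass for the statement to be true and that treating an ordinal variable as metric downgrades the answer to "True with caution". The conditions for "Inappropriate application of a statistic" are not covered here, and the function and parameter names are hypothetical.

```python
def evaluate_problem(overall_significant: bool,
                     no_numerical_problems: bool,
                     accuracy_meets_criterion: bool,
                     stated_relationships_supported: bool,
                     ordinal_treated_as_metric: bool) -> str:
    """Combine the checks from the worked solution into an answer option."""
    checks_pass = (overall_significant and no_numerical_problems
                   and accuracy_meets_criterion and stated_relationships_supported)
    if not checks_pass:
        return "False"
    return "True with caution" if ordinal_treated_as_metric else "True"

# Values from problem 1
answer = evaluate_problem(
    overall_significant=0.005 <= 0.05,            # model chi-square
    no_numerical_problems=True,                   # no standard error above 2.0
    accuracy_meets_criterion=60.5 >= 56.6,        # classification accuracy
    stated_relationships_supported=(0.027 <= 0.05) and (0.007 <= 0.05),  # Wald tests
    ordinal_treated_as_metric=True,               # conlegis is ordinal
)
print(answer)
```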
Slide
50

Problem 2
1. In the dataset GSS2000, is the following statement true, false, or an incorrect application of
a statistic? Assume that there is no problem with missing data, outliers, or influential cases,
and that the validation analysis will confirm the generalizability of the results. Use a level of
significance of 0.05 for evaluating the statistical relationships.
The variables "highest year of school completed" [educ], "sex" [sex] and "total family income"
[income98] were useful predictors for distinguishing between groups based on responses to
"opinion about spending on space exploration" [natspac]. These predictors differentiate survey
respondents who thought we spend too little money on space exploration from survey
respondents who thought we spend too much money on space exploration and survey
respondents who thought we spend about the right amount of money on space exploration from
survey respondents who thought we spend too much money on space exploration.
Among this set of predictors, total family income was helpful in distinguishing among the
groups defined by responses to opinion about spending on space exploration. Survey
respondents who had higher total family incomes were more likely to be in the group of survey
respondents who thought we spend about the right amount of money on space exploration,
rather than the group of survey respondents who thought we spend too much money on space
exploration. For each unit increase in total family income, the odds of being in the group of
survey respondents who thought we spend about the right amount of money on space
exploration increased by 6.0%.
1. True
2. True with caution
3. False
4. Inappropriate application of a statistic
Slide
51

Dissecting problem 2 - 1
1. In the dataset GSS2000, is the following statement true, false, or an incorrect
application of a statistic? Assume that there is no problem with missing data, outliers, or
influential cases, and that the validation analysis will confirm the generalizability of the
results. Use a level of significance of 0.05 for evaluating the statistical relationships.
The variables "highest year of school completed" [educ], "sex" [sex] and "total family income"
[income98] were useful predictors for distinguishing between groups based on responses to
"opinion about spending on space exploration" [natspac]. These predictors differentiate survey
respondents who thought we spend too little money on space exploration from survey
respondents who thought we spend too much money on space exploration and survey
respondents who thought we spend about the right amount of money on space exploration from
survey respondents who thought we spend too much money on space exploration.
Among this set of predictors, total family income was helpful in distinguishing among the
groups defined by responses to opinion about spending on space exploration. Survey
respondents who had higher total family incomes were more likely to be in the group of survey
respondents who thought we spend about the right amount of money on space exploration,
rather than the group of survey respondents who thought we spend too much money on space
exploration. For each unit increase in total family income, the odds of being in the group of
survey respondents who thought we spend about the right amount of money on space
exploration increased by 6.0%.
1. True
2. True with caution
3. False
4. Inappropriate application of a statistic

For these problems, we will assume that there is no problem with missing data, outliers, or
influential cases, and that the validation analysis will confirm the generalizability of the
results.

In this problem, we are told to use 0.05 as alpha for the multinomial logistic regression.
Slide
52

Dissecting problem 2 - 2
1. In the dataset GSS2000, is the following statement true, false, or an incorrect application of
a statistic? Assume that there is no problem with missing data, outliers, or influential cases,
and that the validation analysis will confirm the generalizability of the results. Use a level of
significance of 0.05 for evaluating the statistical relationships.
The variables "highest year of school completed" [educ], "sex" [sex] and "total family
income" [income98] were useful predictors for distinguishing between groups based on
responses to "opinion about spending on space exploration" [natspac]. These predictors
differentiate survey respondents who thought we spend too little money on space exploration
from survey respondents who thought we spend too much money on space exploration and
survey respondents who thought we spend about the right amount of money on space
exploration from survey respondents who thought we spend too much money on space
exploration.
Among this set of predictors, total family income was helpful in distinguishing among the
groups defined by responses to opinion about spending on space exploration. Survey
respondents who had higher total family incomes were more likely to be in the group of survey
respondents who thought we spend about the right amount of money on space exploration,
rather than the group of survey respondents who thought we spend too much money on space
exploration. For each unit increase in total family income, the odds of being in the group of
survey respondents who thought we spend about the right amount of money on space
exploration increased by 6.0%.
1. True
2. True with caution
3. False

The variables listed first in the problem statement are the independent variables (IVs):
"highest year of school completed" [educ], "sex" [sex] and "total family income" [income98].

The variable used to define groups is the dependent variable (DV): "opinion about spending
on space exploration" [natspac].

SPSS only supports direct or simultaneous entry of independent variables in multinomial
logistic regression, so we have no choice of method for entering variables.
Slide 53

Dissecting problem 2 - 3

1. In the dataset GSS2000, is the following statement true, false, or an incorrect application of
a statistic? Assume that there is no problem with missing data, outliers, or influential cases,
and that the validation analysis will confirm the generalizability of the results. Use a level of
significance of 0.05 for evaluating the statistical relationships.
The variables "highest year of school completed" [educ], "sex" [sex] and "total family income"
[income98] were useful predictors for distinguishing between groups based on responses to
"opinion about spending on space exploration" [natspac]. These predictors differentiate
survey respondents who thought we spend too little money on space exploration from
survey respondents who thought we spend too much money on space exploration and
survey respondents who thought we spend about the right amount of money on space
exploration from survey respondents who thought we spend too much money on space
exploration.
Among this set of predictors, total family income was helpful in distinguishing among the
groups defined by responses to opinion about spending on space exploration. Survey
respondents who had higher total family incomes were more likely to be in the group of survey
respondents who thought we spend about the right amount of money on space exploration,
rather than the group of survey respondents who thought we spend too much money on space
exploration. For each unit increase in total family income, the odds of being in the group of
survey respondents who thought we spend about the right amount of money on space
exploration increased by 6.0%.
1. True

SPSS multinomial logistic regression models the relationship
by comparing each of the groups defined by the dependent
variable to the group with the highest code value.

The responses to opinion about spending on the space
program were: 1 = Too little, 2 = About right, and 3 = Too much.

The analysis will result in two comparisons:
• survey respondents who thought we spend too little money
versus survey respondents who thought we spend too much
money on space exploration
• survey respondents who thought we spend about the right
amount of money versus survey respondents who thought we
spend too much money on space exploration.
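
The reference-group coding described on this slide can be sketched in Python. This is a minimal illustration, not the SPSS procedure itself: the B coefficients are copied from the Parameter Estimates table shown later in this deck, the reference group (3 = Too much) is given all-zero coefficients, and the respondent's values are hypothetical. The name group_probabilities is an assumption introduced only for this sketch.

```python
import math

# B coefficients from the Parameter Estimates table in this deck.
# Group 3 (Too much) is the reference group, so its coefficients are all zero.
COEFFICIENTS = {
    1: {"intercept": -4.136, "educ": 0.101, "income98": 0.097, "sex1": 0.672},  # Too little
    2: {"intercept": -2.487, "educ": 0.108, "income98": 0.058, "sex1": 0.501},  # About right
    3: {"intercept": 0.0, "educ": 0.0, "income98": 0.0, "sex1": 0.0},           # Too much
}


def group_probabilities(educ, income98, sex):
    """Return the predicted probability of each group for one respondent."""
    sex1 = 1.0 if sex == 1 else 0.0  # [SEX=2] is the redundant category (B = 0)
    logits = {
        group: b["intercept"] + b["educ"] * educ + b["income98"] * income98 + b["sex1"] * sex1
        for group, b in COEFFICIENTS.items()
    }
    denominator = sum(math.exp(value) for value in logits.values())
    return {group: math.exp(value) / denominator for group, value in logits.items()}


# Hypothetical respondent: 12 years of school, income category 15, sex coded 2.
probs = group_probabilities(educ=12, income98=15, sex=2)
predicted_group = max(probs, key=probs.get)  # group with the highest probability
print(probs, predicted_group)
```

Because the reference group's equation is fixed at zero, each of the other two equations expresses the log odds of that group relative to "Too much", which matches the two comparisons listed above.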
Slide 54

Dissecting problem 2 - 4

The variables "highest year of school completed" [educ], "sex" [sex] and "total family income"
[income98] were useful predictors for distinguishing between groups based on responses to
"opinion about spending on space exploration" [natspac]. These predictors differentiate survey
respondents who thought we spend too little money on space exploration from survey
respondents who thought we spend too much money on space exploration and survey
respondents who thought we spend about the right amount of money on space exploration from
survey respondents who thought we spend too much money on space exploration.

Among this set of predictors, total family income was helpful in distinguishing among the
groups defined by responses to opinion about spending on space exploration. Survey
respondents who had higher total family incomes were more likely to be in the group of
survey respondents who thought we spend about the right amount of money on space
exploration, rather than the group of survey respondents who thought we spend too much
money on space exploration. For each unit increase in total family income, the odds of
being in the group of survey respondents who thought we spend about the right amount of
money on space exploration increased by 6.0%.
1. True
2. True with caution
3. False
4. Inappropriate application of a statistic

Each problem includes a statement about the relationship between one independent
variable and the dependent variable. The answer to the problem is based on the stated
relationship, ignoring the relationships between the other independent variables and
the dependent variable.

This problem identifies a difference for only one of the two comparisons based on the
three values of the dependent variable. Other problems will specify both of the possible
comparisons.
Slide 55

Dissecting problem 2 - 5
The variables "highest year of school completed" [educ], "sex" [sex] and "total family income"
[income98] were useful predictors for distinguishing between groups based on responses to
"opinion about spending on space exploration" [natspac]. These predictors differentiate survey
respondents who thought we spend too little money on space exploration from survey
respondents who thought we spend too much money on space exploration and survey
respondents who thought we spend about the right amount of money on space exploration from
survey respondents who thought we spend too much money on space exploration.
Among this set of predictors, total family income was helpful in distinguishing among the
groups defined by responses to opinion about spending on space exploration. Survey
respondents who had higher total family incomes were more likely to be in the group of survey
respondents who thought we spend about the right amount of money on space exploration,
rather than the group of survey respondents who thought we spend too much money on space
exploration. For each unit increase in total family income, the odds of being in the group of
survey respondents who thought we spend about the right amount of money on space
exploration increased by 6.0%.
1. True
2. True with caution
3. False
4. Inappropriate application of a statistic

In order for the multinomial logistic regression question to be true, the overall
relationship must be statistically significant, there must be no evidence of numerical
problems, the classification accuracy rate must be substantially better than could be
obtained by chance alone, and the stated individual relationship must be statistically
significant and interpreted correctly.
Slide 56

LEVEL OF MEASUREMENT - 1
1. In the dataset GSS2000, is the following statement true, false, or an incorrect application of
a statistic? Assume that there is no problem with missing data, outliers, or influential cases,
and that the validation analysis will confirm the generalizability of the results. Use a level of
significance of 0.05 for evaluating the statistical relationships.
The variables "highest year of school completed" [educ], "sex" [sex] and "total family income"
[income98] were useful predictors for distinguishing between groups based on responses to
"opinion about spending on space exploration" [natspac]. These predictors differentiate
survey respondents who thought we spend too little money on space exploration from
survey respondents who thought we spend too much money on space exploration and
survey respondents who thought we spend about the right amount of money on space
exploration from survey respondents who thought we spend too much money on space
exploration.
Among this set of predictors, total family income was helpful in distinguishing among the
groups defined by responses to opinion about spending on space exploration. Survey
respondents who had higher total family incomes were more likely to be in the group of survey
respondents who thought we spend about the right amount of money on space exploration,
rather than the group of survey respondents who thought we spend too much money on space
exploration. For each unit increase in total family income, the odds of being in the group of
survey respondents who thought we spend about the right amount of money on space
exploration increased by 6.0%.
1. True
2. True with caution
3. False
4. Inappropriate application of a statistic

Multinomial logistic regression requires that the dependent variable be non-metric and
the independent variables be metric or dichotomous.

"Opinion about spending on space exploration" [natspac] is ordinal, satisfying the
non-metric level of measurement requirement for the dependent variable.

It contains three categories: survey respondents who thought we spend too little money,
about the right amount of money, and too much money on space exploration.
Slide 57

LEVEL OF MEASUREMENT - 2

1. In the dataset GSS2000, is the following statement true, false, or an incorrect application of
a statistic? Assume that there is no problem with missing data, outliers, or influential cases,
and that the validation analysis will confirm the generalizability of the results. Use a level of
significance of 0.05 for evaluating the statistical relationships.

The variables "highest year of school completed" [educ], "sex" [sex] and "total family
income" [income98] were useful predictors for distinguishing between groups based on
responses to "opinion about spending on space exploration" [natspac]. These predictors
differentiate survey respondents who thought we spend too little money on space exploration
from survey respondents who thought we spend too much money on space exploration and
survey respondents who thought we spend about the right amount of money on space
exploration from survey respondents who thought we spend too much money on space
exploration.

Among this set of predictors, total family income was helpful in distinguishing among the
groups defined by responses to opinion about spending on space exploration. Survey
respondents who had higher total family incomes were more likely to be in the group of survey
respondents who thought we spend about the right amount of money on space exploration,
rather than the group of survey respondents who thought we spend too much money on
space exploration. For each unit increase in total family income, the odds of being in
the group of survey respondents who thought we spend about the right amount of money on
space exploration increased by 6.0%.
1. True
2. True with caution

"Highest year of school completed" [educ] is interval, satisfying the metric or
dichotomous level of measurement requirement for independent variables.

"Sex" [sex] is dichotomous, satisfying the metric or dichotomous level of measurement
requirement for independent variables.

"Total family income" [income98] is ordinal, satisfying the metric or dichotomous level of
measurement requirement for independent variables. If we follow the convention of treating
ordinal level variables as metric variables, the level of measurement requirement for the
analysis is satisfied. Since some data analysts do not agree with this convention, a note of
caution should be included in our interpretation.
Slide 58

Request multinomial logistic regression

Select the Regression | Multinomial Logistic… command from the Analyze menu.

Slide 59

Selecting the dependent variable

First, highlight the dependent variable natspac in the list of variables.

Second, click on the right arrow button to move the dependent variable to the
Dependent text box.
Slide 60

Selecting non-metric independent variables

Non-metric independent variables are specified as factors in multinomial logistic
regression. Non-metric variables can be dichotomous, nominal, or ordinal.

These variables will be dummy coded as needed and each value will be listed separately
in the output.

Select the dichotomous variable sex.

Move the non-metric independent variables listed in the problem to the Factor(s) list box.
Slide 61

Selecting metric independent variables

Metric independent variables are specified as covariates in multinomial logistic
regression. Metric variables can be either interval or, by convention, ordinal.

Move the metric independent variables, educ and income98, to the Covariate(s) list box.

Slide 62

Specifying statistics to include in the output

While we will accept most of the SPSS defaults for the analysis, we need to specifically
request the classification table.

Click on the Statistics… button to make a request.

Slide 63

Requesting the classification table

First, keep the SPSS defaults for Summary statistics, Likelihood ratio test, and
Parameter estimates.

Second, mark the checkbox for the Classification table.

Third, click on the Continue button to complete the request.

Slide 64

Completing the multinomial logistic regression request

Click on the OK button to request the output for the multinomial logistic regression.

The multinomial logistic procedure supports additional commands to specify the model
computed for the relationships (we will use the default main effects model), additional
specifications for computing the regression, and saving classification results. We will not
make use of these options.
Slide 65

Sample size – ratio of cases to variables

Case Processing Summary

                                   N       Marginal Percentage
SPACE EXPLORATION PROGRAM    1     33      15.9%
                             2     90      43.3%
                             3     85      40.9%
RESPONDENTS SEX              1     94      45.2%
                             2     114     54.8%
Valid                              208     100.0%
Missing                            62
Total                              270
Subpopulation                      138a

a. The dependent variable has only one value observed in 112 (81.2%) subpopulations.

Multinomial logistic regression requires that the minimum ratio of valid cases to
independent variables be at least 10 to 1. The ratio of valid cases (208) to number of
independent variables (3) was 69.3 to 1, which was equal to or greater than the minimum
ratio. The requirement for a minimum ratio of cases to independent variables was satisfied.

The preferred ratio of valid cases to independent variables is 20 to 1. The ratio of
69.3 to 1 was equal to or greater than the preferred ratio. The preferred ratio of cases
to independent variables was satisfied.
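
The ratio check on this slide amounts to a single division. A minimal Python sketch, using the counts reported in the Case Processing Summary (the variable names are illustrative only):

```python
valid_cases = 208          # "Valid" row of the Case Processing Summary
independent_variables = 3  # educ, sex, income98

MINIMUM_RATIO = 10    # minimum cases per independent variable
PREFERRED_RATIO = 20  # preferred cases per independent variable

ratio = valid_cases / independent_variables
meets_minimum = ratio >= MINIMUM_RATIO
meets_preferred = ratio >= PREFERRED_RATIO

print(f"{ratio:.1f} to 1; minimum met: {meets_minimum}; preferred met: {meets_preferred}")
```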
Slide 66

OVERALL RELATIONSHIP BETWEEN
INDEPENDENT AND DEPENDENT VARIABLES

Model Fitting Information

Model             -2 Log Likelihood    Chi-Square    df    Sig.
Intercept Only    354.268
Final             334.967              19.301        6     .004

The presence of a relationship between the dependent variable and the combination of
independent variables is based on the statistical significance of the final model
chi-square in the SPSS table titled "Model Fitting Information".

In this analysis, the probability of the model chi-square (19.301) was 0.004, less than
or equal to the level of significance of 0.05. The null hypothesis that there was no
difference between the model without independent variables and the model with
independent variables was rejected. The existence of a relationship between the
independent variables and the dependent variable was supported.
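
The model chi-square in this table is the difference between the two -2 log likelihood values. The sketch below reproduces it and its probability in plain Python. It assumes the closed-form upper-tail formula for a chi-square distribution with even degrees of freedom, which applies here because df = 6. The helper name chi_square_sf_even_df is introduced only for this illustration and is not an SPSS function.

```python
import math


def chi_square_sf_even_df(x, df):
    """Upper-tail probability of a chi-square statistic for even df (closed form)."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only applies to positive, even df")
    half = x / 2.0
    terms = sum(half ** i / math.factorial(i) for i in range(df // 2))
    return math.exp(-half) * terms


neg2ll_intercept_only = 354.268  # -2 log likelihood, intercept-only model
neg2ll_final = 334.967           # -2 log likelihood, final model
df = 6                           # 3 coefficients in each of the 2 comparisons

model_chi_square = neg2ll_intercept_only - neg2ll_final
p_value = chi_square_sf_even_df(model_chi_square, df)
print(round(model_chi_square, 3), round(p_value, 3))
```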
Slide 67

NUMERICAL PROBLEMS

Parameter Estimates
(Lower Bound is the lower limit of the 95% Confidence Interval for Exp(B).)

SPACE EXPLORATION
PROGRAM (a)   Term        B        Std. Error   Wald     df   Sig.   Exp(B)   Lower Bound
1             Intercept   -4.136   1.157        12.779   1    .000
              EDUC        .101     .089         1.276    1    .259   1.106    .929
              INCOME98    .097     .050         3.701    1    .054   1.102    .998
              [SEX=1]     .672     .426         2.488    1    .115   1.959    .850
              [SEX=2]     0 (b)    .            .        0    .      .        .
2             Intercept   -2.487   .840         8.774    1    .003
              EDUC        .108     .068         2.521    1    .112   1.114    .975
              INCOME98    .058     .034         2.932    1    .087   1.060    .992
              [SEX=1]     .501     .317         2.492    1    .114   1.650    .886
              [SEX=2]     0 (b)    .            .        0    .      .        .

a. The reference category is: 3.
b. This parameter is set to zero because it is redundant.

Multicollinearity in the multinomial logistic regression solution is detected by
examining the standard errors for the b coefficients. A standard error larger than 2.0
indicates numerical problems, such as multicollinearity among the independent variables,
zero cells for a dummy-coded independent variable because all of the subjects have the
same value for the variable, and 'complete separation' whereby the two groups in the
dependent event variable can be perfectly separated by scores on one of the independent
variables. Analyses that indicate numerical problems should not be interpreted.

None of the independent variables in this analysis had a standard error larger than 2.0.
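
The standard-error check, the Wald statistic, and Exp(B) can all be recomputed from the B and Std. Error columns. The following Python sketch does so under two assumptions: the Wald statistic is taken as (B / SE) squared, and the percent change in odds as (Exp(B) - 1) x 100. Because the table values are rounded to three decimals, the recomputed Wald values differ slightly from those SPSS printed.

```python
import math

# (group, term): (B, standard error) from the Parameter Estimates table.
ESTIMATES = {
    (1, "EDUC"): (0.101, 0.089),
    (1, "INCOME98"): (0.097, 0.050),
    (1, "[SEX=1]"): (0.672, 0.426),
    (2, "EDUC"): (0.108, 0.068),
    (2, "INCOME98"): (0.058, 0.034),
    (2, "[SEX=1]"): (0.501, 0.317),
}

SE_THRESHOLD = 2.0  # standard errors above this suggest numerical problems

# Terms whose standard error exceeds the threshold (empty list means none).
flagged = [key for key, (_, se) in ESTIMATES.items() if se > SE_THRESHOLD]

summary = {}
for key, (b, se) in ESTIMATES.items():
    wald = (b / se) ** 2                          # Wald chi-square with 1 df
    odds_ratio = math.exp(b)                      # Exp(B)
    percent_change = (odds_ratio - 1.0) * 100.0   # change in odds per unit increase
    summary[key] = (wald, odds_ratio, percent_change)

print("flagged:", flagged)
for key, (wald, odds_ratio, pct) in summary.items():
    print(key, round(wald, 3), round(odds_ratio, 3), round(pct, 1))
```

The entry for (2, "INCOME98") is the one behind the problem statement's "increased by 6.0%": it compares the "about right" group with the "too much" reference group.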
Slide 68

RELATIONSHIP OF INDIVIDUAL INDEPENDENT
VARIABLES TO DEPENDENT VARIABLE - 1

Likelihood Ratio Tests

Effect       -2 Log Likelihood of Reduced Model    Chi-Square    df    Sig.
Intercept    334.967a                              .000          0     .
EDUC         337.788                               2.821         2     .244
INCOME98     340.154                               5.187         2     .075
SEX          338.511                               3.544         2     .170

The chi-square statistic is the difference in -2 log-likelihoods
between the final model and a reduced model. The reduced model
is formed by omitting an effect from the final model. The null
hypothesis is that all parameters of that effect are 0.
a. This reduced model is equivalent to the final model because
omitting the effect does not increase the degrees of freedom.

The statistical significance of the relationship between
total family income and opinion about spending on space
exploration is based on the statistical significance of the
chi-square statistic in the SPSS table titled "Likelihood
Ratio Tests".

For this relationship, the probability of the chi-square
statistic (5.187) was 0.075, greater than the level of
significance of 0.05. The null hypothesis that all of the b
coefficients associated with total family income were
equal to zero was not rejected. The existence of a
relationship between total family income and opinion
about spending on space exploration was not supported.
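
Each likelihood ratio chi-square above is the reduced model's -2 log likelihood minus that of the final model. A short Python sketch reproduces the values and their probabilities. It relies on the fact that, for df = 2, the chi-square upper-tail probability reduces to exp(-x / 2); the dictionary and variable names are illustrative only.

```python
import math

NEG2LL_FINAL = 334.967
NEG2LL_REDUCED = {"EDUC": 337.788, "INCOME98": 340.154, "SEX": 338.511}
ALPHA = 0.05

results = {}
for effect, neg2ll_reduced in NEG2LL_REDUCED.items():
    chi_square = neg2ll_reduced - NEG2LL_FINAL
    p_value = math.exp(-chi_square / 2.0)  # chi-square upper tail for df = 2
    results[effect] = (chi_square, p_value, p_value <= ALPHA)

for effect, (chi_square, p_value, significant) in results.items():
    print(effect, round(chi_square, 3), round(p_value, 3), significant)
```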
Slide 69

Answering the question in problem 2
1. In the dataset GSS2000, is the following statement true, false, or an incorrect application of
a statistic? Assume that there is no problem with missing data, outliers, or influential cases,
and that the validation analysis will confirm the generalizability of the results. Use a level of
significance of 0.05 for evaluating the statistical relationships.
The variables "highest year of school completed" [educ], "sex" [sex] and "total family income"
[income98] were useful predictors for distinguishing between groups based on responses to
"opinion about spending on space exploration" [natspac]. These predictors differentiate survey
respondents who thought we spend too little money on space exploration from survey
respondents who thought we spend too much money on space exploration and survey
respondents who thought we spend about the right amount of money on space exploration from
survey respondents who thought we spend too much money on space exploration.
Among this set of predictors, total family income was helpful in distinguishing among the
groups defined by responses to opinion about spending on space exploration. Survey
respondents who had higher total family incomes were more likely to be in the group of survey
respondents who thought we spend about the right amount of money on space exploration,
rather than the group of survey respondents who thought we spend too much money on space
exploration. For each unit increase in total family income, the odds of being in the group of
survey respondents who thought we spend about the right amount of money on space
exploration increased by 6.0%.
1. True
2. True with caution
3. False
4. Inappropriate application of a statistic

We found a statistically significant overall relationship between the combination of
independent variables and the dependent variable.

There was no evidence of numerical problems in the solution.

However, the individual relationship between total family income and spending on space
was not statistically significant.

The answer to the question is false.
Slide 70

Steps in multinomial logistic regression:
level of measurement and initial sample size

The following is a guide to the decision process for answering
problems about the basic relationships in multinomial logistic
regression:

Dependent non-metric? Independent variables metric or dichotomous?
  No: Inappropriate application of a statistic
  Yes: continue to the next step

Ratio of cases to independent variables at least 10 to 1?
  No: Inappropriate application of a statistic
  Yes: Run multinomial logistic regression
Slide 71

Steps in multinomial logistic regression:
overall relationship and numerical problems

Overall relationship statistically significant? (model chi-square test)
  No: False
  Yes: continue to the next step

Standard errors of coefficients indicate no numerical problems (s.e. <= 2.0)?
  No: False
  Yes: continue to the next step
Slide 72

Steps in multinomial logistic regression:
relationships between IV's and DV

Overall relationship between specific IV and DV is statistically significant?
(likelihood ratio test)
  No: False
  Yes: continue to the next step

Role of specific IV and DV groups statistically significant and interpreted correctly?
(Wald test and Exp(B))
  No: False
  Yes: continue to the next step
Slide 73

Steps in multinomial logistic regression:
classification accuracy and adding cautions

Overall accuracy rate is at least 25% higher than the proportional by chance accuracy rate?
  No: False
  Yes: continue to the next step

Satisfies preferred ratio of cases to IV's of 20 to 1?
  No: True with caution
  Yes: continue to the next step

One or more IV's are ordinal level treated as metric?
  No: True
  Yes: True with caution
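
The decision steps on the last four slides can be read as one ordered sequence of checks. The Python sketch below is a hypothetical encoding of that sequence, not part of the original course materials. It assumes that the proportional by chance accuracy rate is the sum of the squared group proportions, a definition this section does not spell out, and all function and parameter names are introduced here for illustration.

```python
def proportional_by_chance(group_proportions):
    """Sum of squared group proportions (assumed definition of the by-chance rate)."""
    return sum(p ** 2 for p in group_proportions)


def answer_problem(levels_ok, valid_cases, n_ivs, model_sig, max_std_error,
                   lr_sig, wald_sig, interpreted_correctly, accuracy,
                   group_proportions, ordinal_as_metric, alpha=0.05):
    """Walk the decision steps in order and return the answer label."""
    ratio = valid_cases / n_ivs
    if not levels_ok or ratio < 10:
        return "Inappropriate application of a statistic"
    if model_sig > alpha:        # overall relationship (model chi-square test)
        return "False"
    if max_std_error > 2.0:      # numerical problems
        return "False"
    if lr_sig > alpha:           # specific IV overall (likelihood ratio test)
        return "False"
    if wald_sig > alpha or not interpreted_correctly:  # specific comparison
        return "False"
    if accuracy < 1.25 * proportional_by_chance(group_proportions):
        return "False"
    if ratio < 20 or ordinal_as_metric:
        return "True with caution"
    return "True"


# Problem 2: the likelihood ratio test for income98 (Sig. = .075) fails, so the
# function returns before the accuracy rate, which this section does not report.
answer = answer_problem(
    levels_ok=True, valid_cases=208, n_ivs=3, model_sig=0.004,
    max_std_error=1.157, lr_sig=0.075, wald_sig=0.087,
    interpreted_correctly=True, accuracy=None,
    group_proportions=[0.159, 0.433, 0.409], ordinal_as_metric=True,
)
print(answer)
```

Ordering the checks this way mirrors the flowcharts: the first failed check determines the answer, and the cautions are considered only after every test of the relationship has passed.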

5Linear regression presentation note ppdf
 
Quantitative Data Analysis: Hypothesis Testing
Quantitative Data Analysis: Hypothesis TestingQuantitative Data Analysis: Hypothesis Testing
Quantitative Data Analysis: Hypothesis Testing
 
7. logistics regression using spss
7. logistics regression using spss7. logistics regression using spss
7. logistics regression using spss
 
Review Parameters Model Building & Interpretation and Model Tunin.docx
Review Parameters Model Building & Interpretation and Model Tunin.docxReview Parameters Model Building & Interpretation and Model Tunin.docx
Review Parameters Model Building & Interpretation and Model Tunin.docx
 
Introduction to Econometrics for under gruadute class.pptx
Introduction to Econometrics for under gruadute class.pptxIntroduction to Econometrics for under gruadute class.pptx
Introduction to Econometrics for under gruadute class.pptx
 
CH3.pdf
CH3.pdfCH3.pdf
CH3.pdf
 
ders 8 Quantile-Regression.ppt
ders 8 Quantile-Regression.pptders 8 Quantile-Regression.ppt
ders 8 Quantile-Regression.ppt
 

Recently uploaded

BEST Call Girls In Greater Noida ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,
BEST Call Girls In Greater Noida ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,BEST Call Girls In Greater Noida ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,
BEST Call Girls In Greater Noida ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,noida100girls
 
Call Girls In ⇛⇛Chhatarpur⇚⇚. Brings Offer Delhi Contact Us 8377877756
Call Girls In ⇛⇛Chhatarpur⇚⇚. Brings Offer Delhi Contact Us 8377877756Call Girls In ⇛⇛Chhatarpur⇚⇚. Brings Offer Delhi Contact Us 8377877756
Call Girls In ⇛⇛Chhatarpur⇚⇚. Brings Offer Delhi Contact Us 8377877756dollysharma2066
 
CATALOG cáp điện Goldcup (bảng giá) 1.4.2024.PDF
CATALOG cáp điện Goldcup (bảng giá) 1.4.2024.PDFCATALOG cáp điện Goldcup (bảng giá) 1.4.2024.PDF
CATALOG cáp điện Goldcup (bảng giá) 1.4.2024.PDFOrient Homes
 
(8264348440) 🔝 Call Girls In Hauz Khas 🔝 Delhi NCR
(8264348440) 🔝 Call Girls In Hauz Khas 🔝 Delhi NCR(8264348440) 🔝 Call Girls In Hauz Khas 🔝 Delhi NCR
(8264348440) 🔝 Call Girls In Hauz Khas 🔝 Delhi NCRsoniya singh
 
Lean: From Theory to Practice — One City’s (and Library’s) Lean Story… Abridged
Lean: From Theory to Practice — One City’s (and Library’s) Lean Story… AbridgedLean: From Theory to Practice — One City’s (and Library’s) Lean Story… Abridged
Lean: From Theory to Practice — One City’s (and Library’s) Lean Story… AbridgedKaiNexus
 
The CMO Survey - Highlights and Insights Report - Spring 2024
The CMO Survey - Highlights and Insights Report - Spring 2024The CMO Survey - Highlights and Insights Report - Spring 2024
The CMO Survey - Highlights and Insights Report - Spring 2024christinemoorman
 
Call Girls Miyapur 7001305949 all area service COD available Any Time
Call Girls Miyapur 7001305949 all area service COD available Any TimeCall Girls Miyapur 7001305949 all area service COD available Any Time
Call Girls Miyapur 7001305949 all area service COD available Any Timedelhimodelshub1
 
Vip Female Escorts Noida 9711199171 Greater Noida Escorts Service
Vip Female Escorts Noida 9711199171 Greater Noida Escorts ServiceVip Female Escorts Noida 9711199171 Greater Noida Escorts Service
Vip Female Escorts Noida 9711199171 Greater Noida Escorts Serviceankitnayak356677
 
VIP Kolkata Call Girl Howrah 👉 8250192130 Available With Room
VIP Kolkata Call Girl Howrah 👉 8250192130  Available With RoomVIP Kolkata Call Girl Howrah 👉 8250192130  Available With Room
VIP Kolkata Call Girl Howrah 👉 8250192130 Available With Roomdivyansh0kumar0
 
Call Girls In Sikandarpur Gurgaon ❤️8860477959_Russian 100% Genuine Escorts I...
Call Girls In Sikandarpur Gurgaon ❤️8860477959_Russian 100% Genuine Escorts I...Call Girls In Sikandarpur Gurgaon ❤️8860477959_Russian 100% Genuine Escorts I...
Call Girls In Sikandarpur Gurgaon ❤️8860477959_Russian 100% Genuine Escorts I...lizamodels9
 
Investment analysis and portfolio management
Investment analysis and portfolio managementInvestment analysis and portfolio management
Investment analysis and portfolio managementJunaidKhan750825
 
/:Call Girls In Indirapuram Ghaziabad ➥9990211544 Independent Best Escorts In...
/:Call Girls In Indirapuram Ghaziabad ➥9990211544 Independent Best Escorts In.../:Call Girls In Indirapuram Ghaziabad ➥9990211544 Independent Best Escorts In...
/:Call Girls In Indirapuram Ghaziabad ➥9990211544 Independent Best Escorts In...lizamodels9
 
(8264348440) 🔝 Call Girls In Keshav Puram 🔝 Delhi NCR
(8264348440) 🔝 Call Girls In Keshav Puram 🔝 Delhi NCR(8264348440) 🔝 Call Girls In Keshav Puram 🔝 Delhi NCR
(8264348440) 🔝 Call Girls In Keshav Puram 🔝 Delhi NCRsoniya singh
 
(8264348440) 🔝 Call Girls In Mahipalpur 🔝 Delhi NCR
(8264348440) 🔝 Call Girls In Mahipalpur 🔝 Delhi NCR(8264348440) 🔝 Call Girls In Mahipalpur 🔝 Delhi NCR
(8264348440) 🔝 Call Girls In Mahipalpur 🔝 Delhi NCRsoniya singh
 
Keppel Ltd. 1Q 2024 Business Update Presentation Slides
Keppel Ltd. 1Q 2024 Business Update  Presentation SlidesKeppel Ltd. 1Q 2024 Business Update  Presentation Slides
Keppel Ltd. 1Q 2024 Business Update Presentation SlidesKeppelCorporation
 
Call Girls In Kishangarh Delhi ❤️8860477959 Good Looking Escorts In 24/7 Delh...
Call Girls In Kishangarh Delhi ❤️8860477959 Good Looking Escorts In 24/7 Delh...Call Girls In Kishangarh Delhi ❤️8860477959 Good Looking Escorts In 24/7 Delh...
Call Girls In Kishangarh Delhi ❤️8860477959 Good Looking Escorts In 24/7 Delh...lizamodels9
 
Pitch Deck Teardown: NOQX's $200k Pre-seed deck
Pitch Deck Teardown: NOQX's $200k Pre-seed deckPitch Deck Teardown: NOQX's $200k Pre-seed deck
Pitch Deck Teardown: NOQX's $200k Pre-seed deckHajeJanKamps
 
Sales & Marketing Alignment: How to Synergize for Success
Sales & Marketing Alignment: How to Synergize for SuccessSales & Marketing Alignment: How to Synergize for Success
Sales & Marketing Alignment: How to Synergize for SuccessAggregage
 
Lowrate Call Girls In Laxmi Nagar Delhi ❤️8860477959 Escorts 100% Genuine Ser...
Lowrate Call Girls In Laxmi Nagar Delhi ❤️8860477959 Escorts 100% Genuine Ser...Lowrate Call Girls In Laxmi Nagar Delhi ❤️8860477959 Escorts 100% Genuine Ser...
Lowrate Call Girls In Laxmi Nagar Delhi ❤️8860477959 Escorts 100% Genuine Ser...lizamodels9
 

Recently uploaded (20)

BEST Call Girls In Greater Noida ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,
BEST Call Girls In Greater Noida ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,BEST Call Girls In Greater Noida ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,
BEST Call Girls In Greater Noida ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,
 
Best Practices for Implementing an External Recruiting Partnership
Best Practices for Implementing an External Recruiting PartnershipBest Practices for Implementing an External Recruiting Partnership
Best Practices for Implementing an External Recruiting Partnership
 
Call Girls In ⇛⇛Chhatarpur⇚⇚. Brings Offer Delhi Contact Us 8377877756
Call Girls In ⇛⇛Chhatarpur⇚⇚. Brings Offer Delhi Contact Us 8377877756Call Girls In ⇛⇛Chhatarpur⇚⇚. Brings Offer Delhi Contact Us 8377877756
Call Girls In ⇛⇛Chhatarpur⇚⇚. Brings Offer Delhi Contact Us 8377877756
 
CATALOG cáp điện Goldcup (bảng giá) 1.4.2024.PDF
CATALOG cáp điện Goldcup (bảng giá) 1.4.2024.PDFCATALOG cáp điện Goldcup (bảng giá) 1.4.2024.PDF
CATALOG cáp điện Goldcup (bảng giá) 1.4.2024.PDF
 
(8264348440) 🔝 Call Girls In Hauz Khas 🔝 Delhi NCR
(8264348440) 🔝 Call Girls In Hauz Khas 🔝 Delhi NCR(8264348440) 🔝 Call Girls In Hauz Khas 🔝 Delhi NCR
(8264348440) 🔝 Call Girls In Hauz Khas 🔝 Delhi NCR
 
Lean: From Theory to Practice — One City’s (and Library’s) Lean Story… Abridged
Lean: From Theory to Practice — One City’s (and Library’s) Lean Story… AbridgedLean: From Theory to Practice — One City’s (and Library’s) Lean Story… Abridged
Lean: From Theory to Practice — One City’s (and Library’s) Lean Story… Abridged
 
The CMO Survey - Highlights and Insights Report - Spring 2024
The CMO Survey - Highlights and Insights Report - Spring 2024The CMO Survey - Highlights and Insights Report - Spring 2024
The CMO Survey - Highlights and Insights Report - Spring 2024
 
Call Girls Miyapur 7001305949 all area service COD available Any Time
Call Girls Miyapur 7001305949 all area service COD available Any TimeCall Girls Miyapur 7001305949 all area service COD available Any Time
Call Girls Miyapur 7001305949 all area service COD available Any Time
 
Vip Female Escorts Noida 9711199171 Greater Noida Escorts Service
Vip Female Escorts Noida 9711199171 Greater Noida Escorts ServiceVip Female Escorts Noida 9711199171 Greater Noida Escorts Service
Vip Female Escorts Noida 9711199171 Greater Noida Escorts Service
 
VIP Kolkata Call Girl Howrah 👉 8250192130 Available With Room
VIP Kolkata Call Girl Howrah 👉 8250192130  Available With RoomVIP Kolkata Call Girl Howrah 👉 8250192130  Available With Room
VIP Kolkata Call Girl Howrah 👉 8250192130 Available With Room
 
Call Girls In Sikandarpur Gurgaon ❤️8860477959_Russian 100% Genuine Escorts I...
Call Girls In Sikandarpur Gurgaon ❤️8860477959_Russian 100% Genuine Escorts I...Call Girls In Sikandarpur Gurgaon ❤️8860477959_Russian 100% Genuine Escorts I...
Call Girls In Sikandarpur Gurgaon ❤️8860477959_Russian 100% Genuine Escorts I...
 
Investment analysis and portfolio management
Investment analysis and portfolio managementInvestment analysis and portfolio management
Investment analysis and portfolio management
 
/:Call Girls In Indirapuram Ghaziabad ➥9990211544 Independent Best Escorts In...
/:Call Girls In Indirapuram Ghaziabad ➥9990211544 Independent Best Escorts In.../:Call Girls In Indirapuram Ghaziabad ➥9990211544 Independent Best Escorts In...
/:Call Girls In Indirapuram Ghaziabad ➥9990211544 Independent Best Escorts In...
 
(8264348440) 🔝 Call Girls In Keshav Puram 🔝 Delhi NCR
(8264348440) 🔝 Call Girls In Keshav Puram 🔝 Delhi NCR(8264348440) 🔝 Call Girls In Keshav Puram 🔝 Delhi NCR
(8264348440) 🔝 Call Girls In Keshav Puram 🔝 Delhi NCR
 
(8264348440) 🔝 Call Girls In Mahipalpur 🔝 Delhi NCR
(8264348440) 🔝 Call Girls In Mahipalpur 🔝 Delhi NCR(8264348440) 🔝 Call Girls In Mahipalpur 🔝 Delhi NCR
(8264348440) 🔝 Call Girls In Mahipalpur 🔝 Delhi NCR
 
Keppel Ltd. 1Q 2024 Business Update Presentation Slides
Keppel Ltd. 1Q 2024 Business Update  Presentation SlidesKeppel Ltd. 1Q 2024 Business Update  Presentation Slides
Keppel Ltd. 1Q 2024 Business Update Presentation Slides
 
Call Girls In Kishangarh Delhi ❤️8860477959 Good Looking Escorts In 24/7 Delh...
Call Girls In Kishangarh Delhi ❤️8860477959 Good Looking Escorts In 24/7 Delh...Call Girls In Kishangarh Delhi ❤️8860477959 Good Looking Escorts In 24/7 Delh...
Call Girls In Kishangarh Delhi ❤️8860477959 Good Looking Escorts In 24/7 Delh...
 
Pitch Deck Teardown: NOQX's $200k Pre-seed deck
Pitch Deck Teardown: NOQX's $200k Pre-seed deckPitch Deck Teardown: NOQX's $200k Pre-seed deck
Pitch Deck Teardown: NOQX's $200k Pre-seed deck
 
Sales & Marketing Alignment: How to Synergize for Success
Sales & Marketing Alignment: How to Synergize for SuccessSales & Marketing Alignment: How to Synergize for Success
Sales & Marketing Alignment: How to Synergize for Success
 
Lowrate Call Girls In Laxmi Nagar Delhi ❤️8860477959 Escorts 100% Genuine Ser...
Lowrate Call Girls In Laxmi Nagar Delhi ❤️8860477959 Escorts 100% Genuine Ser...Lowrate Call Girls In Laxmi Nagar Delhi ❤️8860477959 Escorts 100% Genuine Ser...
Lowrate Call Girls In Laxmi Nagar Delhi ❤️8860477959 Escorts 100% Genuine Ser...
 

Multinomial Logistic Regression: Basic Relationships

  • 1. SW388R7 Data Analysis & Computers II. Multinomial Logistic Regression: Basic Relationships
      - Multinomial Logistic Regression
      - Describing Relationships
      - Classification Accuracy
      - Sample Problems
  • 2. Multinomial logistic regression
      - Multinomial logistic regression is used to analyze relationships between a non-metric dependent variable and metric or dichotomous independent variables.
      - Multinomial logistic regression compares multiple groups through a combination of binary logistic regressions.
      - The group comparisons are equivalent to the comparisons for a dummy-coded dependent variable, with the group with the highest numeric score used as the reference group.
      - For example, if we wanted to study differences in BSW, MSW, and PhD students using multinomial logistic regression, the analysis would compare BSW students to PhD students and MSW students to PhD students. For each independent variable, there would be two comparisons.
  • 3. What multinomial logistic regression predicts
      - Multinomial logistic regression provides a set of coefficients for each of the two comparisons. The coefficients for the reference group are all zeros, similar to the coefficients for the reference group for a dummy-coded variable.
      - Thus, there are three equations, one for each of the groups defined by the dependent variable.
      - The three equations can be used to compute the probability that a subject is a member of each of the three groups. A case is predicted to belong to the group associated with the highest probability.
      - Predicted group membership can be compared to actual group membership to obtain a measure of classification accuracy.
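As a rough illustration of how three equations become three probabilities, the sketch below fixes the reference group's score at zero (its coefficients are all zeros) and normalizes the exponentiated scores. The function name `group_probabilities` and the two example scores are hypothetical, chosen only for illustration; they do not come from the slides.

```python
import math

def group_probabilities(scores_vs_reference):
    """Turn the linear scores of the non-reference groups into probabilities.

    The reference group is appended last with a score of 0, because all of
    its coefficients are zero.
    """
    scores = list(scores_vs_reference) + [0.0]
    largest = max(scores)                      # subtract the max for numerical stability
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for one case: group 1 vs. reference, group 2 vs. reference
probs = group_probabilities([0.5, 1.2])

# The case is assigned to the group with the highest probability
predicted = max(range(len(probs)), key=probs.__getitem__)
print([round(p, 3) for p in probs], "predicted group index:", predicted)
```

The index returned here is zero-based, so index 2 would correspond to the reference group.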
  • 4. Level of measurement requirements
      - Multinomial logistic regression analysis requires that the dependent variable be non-metric. Dichotomous, nominal, and ordinal variables satisfy the level of measurement requirement.
      - Multinomial logistic regression analysis requires that the independent variables be metric or dichotomous. Since SPSS will automatically dummy-code nominal level variables, they can be included since they will be dichotomized in the analysis.
      - In SPSS, non-metric independent variables are included as "factors." SPSS will dummy-code non-metric IVs.
      - In SPSS, metric independent variables are included as "covariates." If an independent variable is ordinal, we will attach the usual caution.
  • 5. Assumptions and outliers
      - Multinomial logistic regression does not make any assumptions of normality, linearity, or homogeneity of variance for the independent variables.
      - Because it does not impose these requirements, it is preferred to discriminant analysis when the data do not satisfy these assumptions.
      - SPSS does not compute any diagnostic statistics for outliers. To evaluate outliers, the advice is to run multiple binary logistic regressions and use those results to test the exclusion of outliers or influential cases.
  • 6. Sample size requirements
      - The minimum number of cases per independent variable is 10, using a guideline provided by Hosmer and Lemeshow, authors of Applied Logistic Regression, one of the main resources for logistic regression.
      - For preferred case-to-variable ratios, we will use 20 to 1.
  • 7. Methods for including variables
      - The only method for selecting independent variables in SPSS is simultaneous or direct entry.
  • 8. Overall test of relationship - 1
      - The overall test of relationship among the independent variables and groups defined by the dependent variable is based on the reduction in the -2 log likelihood value from a model which does not contain any independent variables to the model that contains the independent variables.
      - This difference in -2 log likelihood follows a chi-square distribution, and is referred to as the model chi-square.
      - The significance test for the final model chi-square (after the independent variables have been added) is our statistical evidence of the presence of a relationship between the dependent variable and the combination of the independent variables.
  • 9. Overall test of relationship - 2

      Model Fitting Information
      Model            -2 Log Likelihood   Chi-Square   df   Sig.
      Intercept Only   284.429
      Final            265.972             18.457       6    .005

      The presence of a relationship between the dependent variable and combination of independent variables is based on the statistical significance of the final model chi-square in the SPSS table titled "Model Fitting Information".
      In this analysis, the probability of the model chi-square (18.457) was 0.005, less than or equal to the level of significance of 0.05. The null hypothesis that there was no difference between the model without independent variables and the model with independent variables was rejected. The existence of a relationship between the independent variables and the dependent variable was supported.
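The model chi-square in the "Model Fitting Information" table can be checked by hand. The sketch below subtracts the two -2 log likelihood values and computes the upper-tail probability with a closed form that holds only for even degrees of freedom, which is enough here (df = 6). The helper name `chi_square_sf_even_df` is an assumption made for this illustration; a statistics library would normally be used instead.

```python
import math

def chi_square_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variate (closed form, even df only)."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive, even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k          # builds (x/2)^k / k!
        total += term
    return math.exp(-half) * total

# Values from the Model Fitting Information table
neg2ll_intercept_only = 284.429
neg2ll_final = 265.972

model_chi_square = neg2ll_intercept_only - neg2ll_final
p_value = chi_square_sf_even_df(model_chi_square, df=6)

print(round(model_chi_square, 3), round(p_value, 3))
```

Under these assumptions the result agrees with the chi-square of 18.457 and the significance of .005 reported by SPSS.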
  • 10. Strength of multinomial logistic regression relationship
      - While multinomial logistic regression does compute correlation measures to estimate the strength of the relationship (pseudo R square measures, such as Nagelkerke's R²), these correlation measures do not really tell us much about the accuracy or errors associated with the model.
      - A more useful measure to assess the utility of a multinomial logistic regression model is classification accuracy, which compares predicted group membership based on the logistic model to the actual, known group membership, which is the value for the dependent variable.
  • 11. Evaluating usefulness for logistic models
      - The benchmark that we will use to characterize a multinomial logistic regression model as useful is a 25% improvement over the rate of accuracy achievable by chance alone.
      - Even if the independent variables had no relationship to the groups defined by the dependent variable, we would still expect to be correct in our predictions of group membership some percentage of the time. This is referred to as by chance accuracy.
      - The estimate of by chance accuracy that we will use is the proportional by chance accuracy rate, computed by summing the squared proportion of cases in each group. The only difference between by chance accuracy for binary logistic models and by chance accuracy for multinomial logistic models is the number of groups defined by the dependent variable.
  • 12. Computing by chance accuracy
      The percentage of cases in each group defined by the dependent variable is found in the 'Case Processing Summary' table.

      Case Processing Summary
                                  N        Marginal Percentage
      HIGHWAYS AND BRIDGES   1    62       37.1%
                             2    93       55.7%
                             3    12       7.2%
      Valid                       167      100.0%
      Missing                     103
      Total                       270
      Subpopulation               153(a)
      a. The dependent variable has only one value observed in 146 (95.4%) subpopulations.

      The proportional by chance accuracy rate was computed by calculating the proportion of cases for each group based on the number of cases in each group in the 'Case Processing Summary', and then squaring and summing the proportion of cases in each group (0.371² + 0.557² + 0.072² = 0.453).
      The proportional by chance accuracy criterion is 56.6% (1.25 x 45.3% = 56.6%).
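The by chance accuracy arithmetic can be reproduced directly from the group counts in the 'Case Processing Summary' (62, 93, and 12 valid cases). This is a minimal sketch; the variable names are illustrative, and it uses unrounded proportions, so intermediate digits may differ slightly from the rounded figures on the slide.

```python
# Valid cases per group, from the Case Processing Summary
group_counts = {1: 62, 2: 93, 3: 12}

n_valid = sum(group_counts.values())
proportions = {g: n / n_valid for g, n in group_counts.items()}

# Proportional by chance accuracy: sum of squared group proportions
by_chance_accuracy = sum(p ** 2 for p in proportions.values())

# Benchmark for a useful model: 25% improvement over chance
by_chance_criterion = 1.25 * by_chance_accuracy

print(f"by chance accuracy:  {by_chance_accuracy:.1%}")
print(f"by chance criterion: {by_chance_criterion:.1%}")
```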
  • 13. Comparing accuracy rates
      - To characterize our model as useful, we compare the overall percentage accuracy rate produced by SPSS at the last step in which variables are entered to 25% more than the proportional by chance accuracy. (Note: SPSS does not compute a cross-validated accuracy rate for multinomial logistic regression.)

      Classification
                            Predicted
      Observed              1        2        3       Percent Correct
      1                     15       47       0       24.2%
      2                     7        86       0       92.5%
      3                     5        7        0       .0%
      Overall Percentage    16.2%    83.8%    .0%     60.5%

      The classification accuracy rate was 60.5%, which was greater than or equal to the proportional by chance accuracy criterion of 56.6% (1.25 x 45.3% = 56.6%).
      The criterion for classification accuracy is satisfied in this example.
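The overall accuracy rate is the share of cases on the diagonal of the classification table. The sketch below recomputes it from the cell counts (rows are observed groups, columns are predicted groups) and compares it with the 56.6% criterion quoted on the slide. Variable names are illustrative only.

```python
# Rows: observed groups 1-3; columns: predicted groups 1-3
classification = [
    [15, 47, 0],
    [7, 86, 0],
    [5, 7, 0],
]

n_cases = sum(sum(row) for row in classification)
n_correct = sum(classification[i][i] for i in range(len(classification)))
accuracy = n_correct / n_cases

# Percent correct within each observed group (the right-hand column of the table)
percent_correct = [row[i] / sum(row) for i, row in enumerate(classification)]

criterion = 0.566                      # 1.25 x 45.3%, taken from the slide
is_useful = accuracy >= criterion

print(f"overall accuracy: {accuracy:.1%}, meets criterion: {is_useful}")
```

Note that the model never predicts group 3, so its overall accuracy comes entirely from groups 1 and 2.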
  • 14. Numerical problems
      - The maximum likelihood method used to calculate multinomial logistic regression is an iterative fitting process that attempts to cycle through repetitions to find an answer. Sometimes, the method will break down and not be able to converge or find an answer.
      - Sometimes the method will produce wildly improbable results, reporting that a one-unit change in an independent variable increases the odds of the modeled event by hundreds of thousands or millions.
      - These implausible results can be produced by multicollinearity, categories of predictors having no cases (zero cells), and complete separation, whereby two groups are perfectly separated by the scores on one or more independent variables.
      - The clue that we have numerical problems and should not interpret the results is a standard error larger than 2.0 for one or more independent variables.
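The standard error check can be expressed as a simple filter. The sketch below applies the 2.0 threshold to the standard errors reported for the independent variables in the highways and bridges 'Parameter Estimates' table; intercepts are left out because the rule is stated for independent variables. The dictionary layout and names are assumptions made for illustration.

```python
SE_THRESHOLD = 2.0

# Standard errors of the independent variables, keyed by (comparison, variable)
standard_errors = {
    ("1 vs 3", "AGE"): 0.020,
    ("1 vs 3", "EDUC"): 0.108,
    ("1 vs 3", "CONLEGIS"): 0.620,
    ("2 vs 3", "AGE"): 0.020,
    ("2 vs 3", "EDUC"): 0.110,
    ("2 vs 3", "CONLEGIS"): 0.613,
}

# Any entry above the threshold signals a possible numerical problem
flagged = [key for key, se in standard_errors.items() if se > SE_THRESHOLD]

if flagged:
    print("Possible numerical problems:", flagged)
else:
    print("No standard error exceeds", SE_THRESHOLD)
```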
  • 15. Relationship of individual independent variables and the dependent variable
      - There are two types of tests for individual independent variables:
          - The likelihood ratio test evaluates the overall relationship between an independent variable and the dependent variable.
          - The Wald test evaluates whether or not the independent variable is statistically significant in differentiating between the two groups in each of the embedded binary logistic comparisons.
      - If an independent variable has an overall relationship to the dependent variable, it might or might not be statistically significant in differentiating between pairs of groups defined by the dependent variable.
  • 16. Relationship of individual independent variables and the dependent variable
      - The interpretation for an independent variable focuses on its ability to distinguish between pairs of groups and the contribution which it makes to changing the odds of being in one dependent variable group rather than the other.
      - We should not interpret the significance of an independent variable's role in distinguishing between pairs of groups unless the independent variable also has an overall relationship to the dependent variable in the likelihood ratio test.
      - The interpretation of an independent variable's role in differentiating dependent variable groups is the same as we used in binary logistic regression. The difference in multinomial logistic regression is that we can have multiple interpretations for an independent variable in relation to different pairs of groups.
  • 17. Relationship of individual independent variables and the dependent variable

      Parameter Estimates
      HIGHWAYS AND BRIDGES(a)   Effect      B        Std. Error   Wald    df   Sig.   Exp(B)   95% CI for Exp(B), Lower Bound
      1 (TOO LITTLE)            Intercept   3.240    2.478        1.709   1    .191
                                AGE         .019     .020         .906    1    .341   1.019    .980
                                EDUC        .071     .108         .427    1    .514   1.073    .868
                                CONLEGIS    -1.373   .620         4.913   1    .027   .253     .075
      2 (ABOUT RIGHT)           Intercept   3.639    2.456        2.195   1    .138
                                AGE         .003     .020         .017    1    .897   1.003    .963
                                EDUC        .172     .110         2.463   1    .117   1.188    .958
                                CONLEGIS    -1.657   .613         7.298   1    .007   .191     .057
      a. The reference category is: 3 (TOO MUCH).

      SPSS identifies the comparisons it makes for groups defined by the dependent variable in the table of 'Parameter Estimates,' using either the value codes or the value labels, depending on the options settings for pivot table labeling. The reference category is identified in the footnote to the table.
      In this analysis, two comparisons will be made:
      - the TOO LITTLE group (coded 1) will be compared to the TOO MUCH group (coded 3)
      - the ABOUT RIGHT group (coded 2) will be compared to the TOO MUCH group (coded 3).
      The reference category plays the same role in multinomial logistic regression that it plays in the dummy-coding of a nominal variable: it is the category that would be coded with zeros for all of the dummy-coded variables, and the category against which all other categories are interpreted.
  • 18. Relationship of individual independent variables and the dependent variable

      Likelihood Ratio Tests
      Effect      -2 Log Likelihood of Reduced Model   Chi-Square   df   Sig.
      Intercept   268.323                              2.350        2    .309
      AGE         268.625                              2.652        2    .265
      EDUC        270.395                              4.423        2    .110
      CONLEGIS    275.194                              9.221        2    .010
      The chi-square statistic is the difference in -2 log-likelihoods between the final model and a reduced model. The reduced model is formed by omitting an effect from the final model. The null hypothesis is that all parameters of that effect are 0.

      Parameter Estimates (reference category: 3)
      HIGHWAYS AND BRIDGES   Effect      B        Std. Error   Wald    df   Sig.   Exp(B)
      1                      Intercept   3.240    2.478        1.709   1    .191
                             AGE         .019     .020         .906    1    .341   1.019
                             EDUC        .071     .108         .427    1    .514   1.073
                             CONLEGIS    -1.373   .620         4.913   1    .027   .253
      2                      Intercept   3.639    2.456        2.195   1    .138
                             AGE         .003     .020         .017    1    .897   1.003
                             EDUC        .172     .110         2.463   1    .117   1.188
                             CONLEGIS    -1.657   .613         7.298   1    .007   .191

      In this example, there is a statistically significant relationship between the independent variable CONLEGIS and the dependent variable (0.010 < 0.05).
      As well, the independent variable CONLEGIS is significant in distinguishing category 1 of the dependent variable from category 3 of the dependent variable (0.027 < 0.05).
      And the independent variable CONLEGIS is significant in distinguishing category 2 of the dependent variable from category 3 of the dependent variable (0.007 < 0.05).
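The likelihood ratio chi-square for CONLEGIS can be recomputed as the difference between the reduced model's -2 log likelihood (275.194) and the final model's (265.972, from the 'Model Fitting Information' table). With 2 degrees of freedom the chi-square upper-tail probability reduces to exp(-x/2), which the sketch below uses; that shortcut is specific to df = 2 and the variable names are illustrative.

```python
import math

neg2ll_final = 265.972               # final model, Model Fitting Information
neg2ll_without_conlegis = 275.194    # reduced model omitting CONLEGIS

lr_chi_square = neg2ll_without_conlegis - neg2ll_final

# For df = 2 the chi-square survival function is exactly exp(-x / 2)
p_value = math.exp(-lr_chi_square / 2.0)

alpha = 0.05
print(round(lr_chi_square, 3), round(p_value, 3), p_value < alpha)
```

The recomputed chi-square differs from the tabled 9.221 only in the last digit, which is consistent with rounding of the displayed -2 log likelihood values.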
  • 19. Interpreting relationship of individual independent variables to the dependent variable
      (The Likelihood Ratio Tests and Parameter Estimates tables are the same as on Slide 18.)
      Survey respondents who had less confidence in Congress (higher values correspond to lower confidence) were less likely to be in the group of survey respondents who thought we spend too little money on highways and bridges (DV category 1), rather than the group of survey respondents who thought we spend too much money on highways and bridges (DV category 3).
      For each unit increase in the CONLEGIS score (that is, each step toward lower confidence in Congress), the odds of being in the group of survey respondents who thought we spend too little money on highways and bridges decreased by 74.7% (0.253 - 1.0 = -0.747).
  • 20. Interpreting relationship of individual independent variables to the dependent variable
      (The Likelihood Ratio Tests and Parameter Estimates tables are the same as on Slide 18.)
      Survey respondents who had less confidence in Congress (higher values correspond to lower confidence) were less likely to be in the group of survey respondents who thought we spend about the right amount of money on highways and bridges (DV category 2), rather than the group of survey respondents who thought we spend too much money on highways and bridges (DV category 3).
      For each unit increase in the CONLEGIS score (that is, each step toward lower confidence in Congress), the odds of being in the group of survey respondents who thought we spend about the right amount of money on highways and bridges decreased by 80.9% (0.191 - 1.0 = -0.809).
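The percentage changes in odds quoted for CONLEGIS follow from exponentiating the B coefficients in the 'Parameter Estimates' table. The sketch below shows that conversion for both comparisons; the helper name `percent_change_in_odds` is hypothetical.

```python
import math

def percent_change_in_odds(b):
    """Percent change in the odds for a one-unit increase in the predictor."""
    odds_ratio = math.exp(b)           # this is the Exp(B) column in SPSS
    return (odds_ratio - 1.0) * 100.0

# B coefficients for CONLEGIS from the Parameter Estimates table
b_too_little_vs_too_much = -1.373
b_about_right_vs_too_much = -1.657

for label, b in [("TOO LITTLE vs TOO MUCH", b_too_little_vs_too_much),
                 ("ABOUT RIGHT vs TOO MUCH", b_about_right_vs_too_much)]:
    print(f"{label}: Exp(B) = {math.exp(b):.3f}, "
          f"change in odds = {percent_change_in_odds(b):.1f}%")
```

A negative B always yields an Exp(B) below 1, and therefore a decrease in the odds relative to the reference group.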
  • 21. Relationship of individual independent variables and the dependent variable

      Likelihood Ratio Tests
      Effect      -2 Log Likelihood of Reduced Model   Chi-Square   df   Sig.
      Intercept   327.463(a)                           .000         0    .
      AGE         333.440                              5.976        2    .050
      EDUC        329.606                              2.143        2    .343
      POLVIEWS    334.636                              7.173        2    .028
      SEX         338.985                              11.521       2    .003
      The chi-square statistic is the difference in -2 log-likelihoods between the final model and a reduced model. The reduced model is formed by omitting an effect from the final model. The null hypothesis is that all parameters of that effect are 0.
      a. This reduced model is equivalent to the final model because omitting the effect does not increase the degrees of freedom.

      Parameter Estimates
      NATCHLD(a)     Effect      B        Std. Error   Wald     df   Sig.   Exp(B)   95% CI for Exp(B), Lower Bound
      TOO LITTLE     Intercept   8.434    2.233        14.261   1    .000
                     AGE         -.023    .017         1.756    1    .185   .977     .944
                     EDUC        -.066    .102         .414     1    .520   .936     .766
                     POLVIEWS    -.575    .251         5.234    1    .022   .563     .344
                     [SEX=1]     -2.167   .805         7.242    1    .007   .115     .024
                     [SEX=2]     0(b)     .            .        0    .      .        .
      ABOUT RIGHT    Intercept   4.485    2.255        3.955    1    .047
                     AGE         -.001    .018         .003     1    .955   .999     .965
                     EDUC        .011     .104         .011     1    .916   1.011    .824
                     POLVIEWS    -.397    .257         2.375    1    .123   .673     .406
                     [SEX=1]     -1.606   .824         3.800    1    .051   .201     .040
                     [SEX=2]     0(b)     .            .        0    .      .        .
      a. The reference category is: TOO MUCH.

      In this example, there is a statistically significant relationship between SEX and the dependent variable, spending on childcare assistance (0.003 < 0.05).
      As well, SEX plays a statistically significant role in differentiating the TOO LITTLE group from the TOO MUCH (reference) group (0.007 < 0.05).
      However, SEX does not differentiate the ABOUT RIGHT group from the TOO MUCH (reference) group (0.051 > 0.05).
  • 22. Slide 22: Interpreting relationship of individual independent variables and the dependent variable
    Likelihood Ratio Tests, SEX: -2 Log Likelihood of Reduced Model = 338.985, Chi-Square = 11.521, df = 2, Sig. = .003.
    Parameter Estimates (NATCHLD; the reference category is TOO MUCH):
      TOO LITTLE, [SEX=1]: B = -2.167, Std. Error = .805, Wald = 7.242, df = 1, Sig. = .007, Exp(B) = .115
      ABOUT RIGHT, [SEX=1]: B = -1.606, Std. Error = .824, Wald = 3.800, df = 1, Sig. = .051, Exp(B) = .201
    Survey respondents who were male (code 1 for sex) were less likely to be in the group of survey respondents who thought we spend too little money on childcare assistance (DV category 1), rather than the group of survey respondents who thought we spend too much money on childcare assistance (DV category 3).
    Survey respondents who were male were 88.5% less likely (0.115 - 1.0 = -0.885) to be in the group of survey respondents who thought we spend too little money on childcare assistance.
  • 23. Slide 23: Interpreting relationships for independent variable in problems
    In the multinomial logistic regression problems, the problem statement will ask about only one of the independent variables. The answer will be true or false based only on the relationship between the specified independent variable and the dependent variable. The individual relationships between the other independent variables and the dependent variable are not used in determining whether the answer is true or false.
  • 24. Slide 24: Problem 1
    11. In the dataset GSS2000, is the following statement true, false, or an incorrect application of a statistic? Assume that there is no problem with missing data, outliers, or influential cases, and that the validation analysis will confirm the generalizability of the results. Use a level of significance of 0.05 for evaluating the statistical relationships.
    The variables "age" [age], "highest year of school completed" [educ] and "confidence in Congress" [conlegis] were useful predictors for distinguishing between groups based on responses to "opinion about spending on highways and bridges" [natroad]. These predictors differentiate survey respondents who thought we spend too little money on highways and bridges from survey respondents who thought we spend too much money on highways and bridges, and survey respondents who thought we spend about the right amount of money on highways and bridges from survey respondents who thought we spend too much money on highways and bridges.
    Among this set of predictors, confidence in Congress was helpful in distinguishing among the groups defined by responses to opinion about spending on highways and bridges. Survey respondents who had less confidence in Congress were less likely to be in the group of survey respondents who thought we spend too little money on highways and bridges, rather than the group of survey respondents who thought we spend too much money on highways and bridges. For each unit increase in confidence in Congress, the odds of being in the group of survey respondents who thought we spend too little money on highways and bridges decreased by 74.7%. Survey respondents who had less confidence in Congress were less likely to be in the group of survey respondents who thought we spend about the right amount of money on highways and bridges, rather than the group of survey respondents who thought we spend too much money on highways and bridges. For each unit increase in confidence in Congress, the odds of being in the group of survey respondents who thought we spend about the right amount of money on highways and bridges decreased by 80.9%.
    1. True
    2. True with caution
    3. False
    4. Inappropriate application of a statistic
  • 25. Slide 25: Dissecting problem 1 - 1
    For these problems, we will assume that there is no problem with missing data, outliers, or influential cases, and that the validation analysis will confirm the generalizability of the results.
    In this problem, we are told to use 0.05 as alpha for the multinomial logistic regression.
  • 26. Slide 26: Dissecting problem 1 - 2
    The variables listed first in the problem statement are the independent variables (IVs): "age" [age], "highest year of school completed" [educ] and "confidence in Congress" [conlegis].
    The variable used to define groups is the dependent variable (DV): "opinion about spending on highways and bridges" [natroad].
    SPSS only supports direct or simultaneous entry of independent variables in multinomial logistic regression, so we have no choice of method for entering variables.
  • 27. Slide 27: Dissecting problem 1 - 3
    SPSS multinomial logistic regression models the relationship by comparing each of the groups defined by the dependent variable to the group with the highest code value.
    The responses to opinion about spending on highways and bridges were: 1 = Too little, 2 = About right, and 3 = Too much.
    The analysis will result in two comparisons:
      • survey respondents who thought we spend too little money versus survey respondents who thought we spend too much money on highways and bridges
      • survey respondents who thought we spend about the right amount of money versus survey respondents who thought we spend too much money on highways and bridges.
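    The two comparisons against the reference group are enough to compute a predicted probability for all three groups, because the reference group's coefficients are all zero. The sketch below illustrates this with the coefficients from the Parameter Estimates table; the function name and the respondent's values (age 40, 12 years of school, conlegis score 2) are hypothetical and chosen only for illustration.

```python
import math

# Coefficients copied from the Parameter Estimates table. Category 3
# ("too much") is the reference group, so its coefficients are all zero.
COEFS = {
    1: {"intercept": 3.240, "age": 0.019, "educ": 0.071, "conlegis": -1.373},
    2: {"intercept": 3.639, "age": 0.003, "educ": 0.172, "conlegis": -1.657},
    3: {"intercept": 0.0, "age": 0.0, "educ": 0.0, "conlegis": 0.0},
}

def group_probabilities(age, educ, conlegis):
    """Return the predicted probability of membership in each DV group."""
    logits = {
        group: c["intercept"] + c["age"] * age
        + c["educ"] * educ + c["conlegis"] * conlegis
        for group, c in COEFS.items()
    }
    denominator = sum(math.exp(z) for z in logits.values())
    return {group: math.exp(z) / denominator for group, z in logits.items()}

# Hypothetical respondent, not taken from the GSS2000 data.
probs = group_probabilities(age=40, educ=12, conlegis=2)
predicted_group = max(probs, key=probs.get)  # group with the highest probability
print(probs, predicted_group)
```

    The case is assigned to the group with the highest probability, which is how predicted group membership is obtained for the classification table.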
  • 28. Slide 28: Dissecting problem 1 - 4
    Each problem includes a statement about the relationship between one independent variable and the dependent variable. The answer to the problem is based on the stated relationship, ignoring the relationships between the other independent variables and the dependent variable.
    This problem identifies a difference for both of the comparisons among groups modeled by the multinomial logistic regression.
  • 29. Slide 29: Dissecting problem 1 - 5
    In order for the multinomial logistic regression question to be true, the overall relationship must be statistically significant, there must be no evidence of numerical problems, the classification accuracy rate must be substantially better than could be obtained by chance alone, and the stated individual relationship must be statistically significant and interpreted correctly.
  • 30. Slide 30: Request multinomial logistic regression
    Select the Regression | Multinomial Logistic… command from the Analyze menu.
  • 31. Slide 31: Selecting the dependent variable
    First, highlight the dependent variable natroad in the list of variables. Second, click on the right arrow button to move the dependent variable to the Dependent text box.
  • 32. Slide 32: Selecting metric independent variables
    Metric independent variables are specified as covariates in multinomial logistic regression. Metric variables can be either interval or, by convention, ordinal. Move the metric independent variables age, educ and conlegis to the Covariate(s) list box. In this analysis, there are no nonmetric independent variables. Nonmetric independent variables would be moved to the Factor(s) list box.
  • 33. Slide 33: Specifying statistics to include in the output
    While we will accept most of the SPSS defaults for the analysis, we need to specifically request the classification table. Click on the Statistics… button to make a request.
  • 34. Slide 34: Requesting the classification table
    First, keep the SPSS defaults for Summary statistics, Likelihood ratio test, and Parameter estimates. Second, mark the checkbox for the Classification table. Third, click on the Continue button to complete the request.
  • 35. Slide 35: Completing the multinomial logistic regression request
    Click on the OK button to request the output for the multinomial logistic regression. The multinomial logistic procedure supports additional commands to specify the model computed for the relationships (we will use the default main effects model), additional specifications for computing the regression, and saving classification results. We will not make use of these options.
  • 36. Slide 36: LEVEL OF MEASUREMENT - 1
    Multinomial logistic regression requires that the dependent variable be non-metric and the independent variables be metric or dichotomous.
    "Opinion about spending on highways and bridges" [natroad] is ordinal, satisfying the nonmetric level of measurement requirement for the dependent variable.
    It contains three categories: survey respondents who thought we spend too little money, about the right amount of money, and too much money on highways and bridges.
  • 37. Slide 37: LEVEL OF MEASUREMENT - 2
    "Age" [age] and "highest year of school completed" [educ] are interval, satisfying the metric or dichotomous level of measurement requirement for independent variables.
    "Confidence in Congress" [conlegis] is ordinal. If we follow the convention of treating ordinal level variables as metric variables, the metric or dichotomous level of measurement requirement for independent variables is satisfied. Since some data analysts do not agree with this convention, a note of caution should be included in our interpretation.
  • 38. Slide 38: Sample size – ratio of cases to variables
    Case Processing Summary
                                  N        Marginal Percentage
      HIGHWAYS AND BRIDGES  1     62       37.1%
                            2     93       55.7%
                            3     12       7.2%
      Valid                       167      100.0%
      Missing                     103
      Total                       270
      Subpopulation               153(a)
    a. The dependent variable has only one value observed in 146 (95.4%) subpopulations.
    Multinomial logistic regression requires that the minimum ratio of valid cases to independent variables be at least 10 to 1. The ratio of valid cases (167) to number of independent variables (3) was 55.7 to 1, which was equal to or greater than the minimum ratio. The requirement for a minimum ratio of cases to independent variables was satisfied.
    The preferred ratio of valid cases to independent variables is 20 to 1. The ratio of 55.7 to 1 was equal to or greater than the preferred ratio. The preferred ratio of cases to independent variables was satisfied.
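    The ratio check on this slide is simple arithmetic; a minimal sketch follows, with a hypothetical helper name and the counts taken from the Case Processing Summary.

```python
def cases_per_variable(valid_cases, n_independent_variables):
    """Ratio of valid cases to independent variables."""
    return valid_cases / n_independent_variables

# 167 valid cases and 3 independent variables (age, educ, conlegis).
ratio = cases_per_variable(167, 3)
meets_minimum = ratio >= 10    # minimum ratio of 10 to 1
meets_preferred = ratio >= 20  # preferred ratio of 20 to 1
print(f"{ratio:.1f} to 1, minimum met: {meets_minimum}, preferred met: {meets_preferred}")
```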
  • 39. Slide 39: OVERALL RELATIONSHIP BETWEEN INDEPENDENT AND DEPENDENT VARIABLES
    Model Fitting Information
      Model            -2 Log Likelihood   Chi-Square   df   Sig.
      Intercept Only   284.429
      Final            265.972             18.457       6    .005
    The presence of a relationship between the dependent variable and the combination of independent variables is based on the statistical significance of the final model chi-square in the SPSS table titled "Model Fitting Information".
    In this analysis, the probability of the model chi-square (18.457) was 0.005, less than or equal to the level of significance of 0.05. The null hypothesis that there was no difference between the model without independent variables and the model with independent variables was rejected. The existence of a relationship between the independent variables and the dependent variable was supported.
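    The model chi-square and its probability can be reproduced from the two -2 log likelihood values in the table. The sketch below uses only the Python standard library and relies on the closed-form upper-tail probability of the chi-square distribution that holds when the degrees of freedom are even; the function name is hypothetical, and this is an illustration rather than the SPSS computation itself.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square statistic (even df only)."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive, even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    # Sum (x/2)^i / i! for i = 0 .. df/2 - 1.
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total

# -2 log likelihoods from the Model Fitting Information table.
model_chi_square = 284.429 - 265.972
p_value = chi2_sf_even_df(model_chi_square, df=6)
print(f"chi-square = {model_chi_square:.3f}, p = {p_value:.3f}")
```

    The degrees of freedom equal the number of estimated slope coefficients: 3 independent variables times 2 comparisons.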
  • 40. Slide 40: NUMERICAL PROBLEMS
    Parameter Estimates (HIGHWAYS AND BRIDGES; a. The reference category is: 3.)
      Category  Effect      B        Std. Error  Wald    df  Sig.   Exp(B)  95% CI for Exp(B), Lower Bound
      1         Intercept   3.240    2.478       1.709   1   .191
      1         AGE         .019     .020        .906    1   .341   1.019   .980
      1         EDUC        .071     .108        .427    1   .514   1.073   .868
      1         CONLEGIS    -1.373   .620        4.913   1   .027   .253    .075
      2         Intercept   3.639    2.456       2.195   1   .138
      2         AGE         .003     .020        .017    1   .897   1.003   .963
      2         EDUC        .172     .110        2.463   1   .117   1.188   .958
      2         CONLEGIS    -1.657   .613        7.298   1   .007   .191    .057
    Multicollinearity in the multinomial logistic regression solution is detected by examining the standard errors for the b coefficients. A standard error larger than 2.0 indicates numerical problems, such as multicollinearity among the independent variables, zero cells for a dummy-coded independent variable because all of the subjects have the same value for the variable, and 'complete separation' whereby the two groups in the dependent event variable can be perfectly separated by scores on one of the independent variables. Analyses that indicate numerical problems should not be interpreted.
    None of the independent variables in this analysis had a standard error larger than 2.0. (We are not interested in the standard errors associated with the intercept.)
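    The screening rule above amounts to scanning the standard errors of the non-intercept coefficients for values above 2.0. A minimal sketch, with a hypothetical function name and the standard errors copied from the Parameter Estimates table:

```python
# Standard errors of the b coefficients, keyed by (DV category, effect).
# Intercepts are left out because the rule does not apply to them.
STANDARD_ERRORS = {
    (1, "AGE"): 0.020, (1, "EDUC"): 0.108, (1, "CONLEGIS"): 0.620,
    (2, "AGE"): 0.020, (2, "EDUC"): 0.110, (2, "CONLEGIS"): 0.613,
}

def flag_numerical_problems(standard_errors, threshold=2.0):
    """Return the coefficients whose standard error exceeds the threshold."""
    return [key for key, se in standard_errors.items() if se > threshold]

flagged = flag_numerical_problems(STANDARD_ERRORS)
print("Coefficients suggesting numerical problems:", flagged)
```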
  • 41. Slide 41: RELATIONSHIP OF INDIVIDUAL INDEPENDENT VARIABLES TO DEPENDENT VARIABLE - 1
    Likelihood Ratio Tests
      Effect      -2 Log Likelihood of Reduced Model   Chi-Square   df   Sig.
      Intercept   268.323                              2.350        2    .309
      AGE         268.625                              2.652        2    .265
      EDUC        270.395                              4.423        2    .110
      CONLEGIS    275.194                              9.221        2    .010
    The chi-square statistic is the difference in -2 log-likelihoods between the final model and a reduced model. The reduced model is formed by omitting an effect from the final model. The null hypothesis is that all parameters of that effect are 0.
    The statistical significance of the relationship between confidence in Congress and opinion about spending on highways and bridges is based on the statistical significance of the chi-square statistic in the SPSS table titled "Likelihood Ratio Tests".
    For this relationship, the probability of the chi-square statistic (9.221) was 0.010, less than or equal to the level of significance of 0.05. The null hypothesis that all of the b coefficients associated with confidence in Congress were equal to zero was rejected. The existence of a relationship between confidence in Congress and opinion about spending on highways and bridges was supported.
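    Each row of the Likelihood Ratio Tests table can be rebuilt from the reduced model's -2 log likelihood and the final model's value of 265.972 reported under Model Fitting Information. The sketch below is illustrative: the function name is hypothetical, and it uses the fact that with 2 degrees of freedom the chi-square upper-tail probability is exactly exp(-x/2). Because the table values are rounded, the recomputed figures can differ from the printed ones in the last digit.

```python
import math

FINAL_MODEL_NEG2LL = 265.972  # -2 log likelihood of the final model

# -2 log likelihood of each reduced model, from the Likelihood Ratio Tests table.
REDUCED_NEG2LL = {
    "Intercept": 268.323,
    "AGE": 268.625,
    "EDUC": 270.395,
    "CONLEGIS": 275.194,
}

def likelihood_ratio_test_df2(reduced_neg2ll, final_neg2ll):
    """Chi-square and p-value for an effect with 2 degrees of freedom."""
    chi_square = reduced_neg2ll - final_neg2ll
    p_value = math.exp(-chi_square / 2.0)  # exact only for df = 2
    return chi_square, p_value

for effect, neg2ll in REDUCED_NEG2LL.items():
    chi_square, p_value = likelihood_ratio_test_df2(neg2ll, FINAL_MODEL_NEG2LL)
    print(f"{effect}: chi-square = {chi_square:.3f}, p = {p_value:.3f}")
```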
  • 42. Slide 42: RELATIONSHIP OF INDIVIDUAL INDEPENDENT VARIABLES TO DEPENDENT VARIABLE - 2
    Parameter Estimates (HIGHWAYS AND BRIDGES; the reference category is 3), category 1, CONLEGIS: B = -1.373, Std. Error = .620, Wald = 4.913, df = 1, Sig. = .027, Exp(B) = .253, 95% CI for Exp(B) Lower Bound = .075.
    In the comparison of survey respondents who thought we spend too little money on highways and bridges to survey respondents who thought we spend too much money on highways and bridges, the probability of the Wald statistic (4.913) for the variable confidence in Congress [conlegis] was 0.027. Since the probability was less than or equal to the level of significance of 0.05, the null hypothesis that the b coefficient for confidence in Congress was equal to zero for this comparison was rejected.
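    The Wald statistic in the table is the squared ratio of the coefficient to its standard error, evaluated against a chi-square distribution with 1 degree of freedom. The sketch below (hypothetical function name, standard library only) recomputes it from the rounded B and Std. Error values, so the statistic differs slightly from the printed 4.913 and 7.298 while the probabilities round to the same values.

```python
import math

def wald_test(b, standard_error):
    """Wald chi-square (df = 1) and its upper-tail probability."""
    wald = (b / standard_error) ** 2
    # For df = 1, the chi-square upper tail equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(wald / 2.0))
    return wald, p_value

# CONLEGIS rows from the Parameter Estimates table.
wald_1, p_1 = wald_test(-1.373, 0.620)  # too little vs. too much
wald_2, p_2 = wald_test(-1.657, 0.613)  # about right vs. too much
print(f"category 1: Wald = {wald_1:.3f}, p = {p_1:.3f}")
print(f"category 2: Wald = {wald_2:.3f}, p = {p_2:.3f}")
```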
  • 43. Slide 43: RELATIONSHIP OF INDIVIDUAL INDEPENDENT VARIABLES TO DEPENDENT VARIABLE - 3
    Parameter Estimates (HIGHWAYS AND BRIDGES; the reference category is 3), category 1, CONLEGIS: B = -1.373, Std. Error = .620, Wald = 4.913, df = 1, Sig. = .027, Exp(B) = .253, 95% CI for Exp(B) Lower Bound = .075.
    The value of Exp(B) was 0.253, which implies that for each unit increase in confidence in Congress the odds decreased by 74.7% (0.253 - 1.0 = -0.747).
    The relationship stated in the problem is supported. Survey respondents who had less confidence in Congress were less likely to be in the group of survey respondents who thought we spend too little money on highways and bridges, rather than the group of survey respondents who thought we spend too much money on highways and bridges. For each unit increase in confidence in Congress, the odds of being in the group of survey respondents who thought we spend too little money on highways and bridges decreased by 74.7%.
  • 44. Slide 44: RELATIONSHIP OF INDIVIDUAL INDEPENDENT VARIABLES TO DEPENDENT VARIABLE - 4
    Parameter Estimates (HIGHWAYS AND BRIDGES; the reference category is 3), category 2, CONLEGIS: B = -1.657, Std. Error = .613, Wald = 7.298, df = 1, Sig. = .007, Exp(B) = .191, 95% CI for Exp(B) Lower Bound = .057.
    In the comparison of survey respondents who thought we spend about the right amount of money on highways and bridges to survey respondents who thought we spend too much money on highways and bridges, the probability of the Wald statistic (7.298) for the variable confidence in Congress [conlegis] was 0.007. Since the probability was less than or equal to the level of significance of 0.05, the null hypothesis that the b coefficient for confidence in Congress was equal to zero for this comparison was rejected.
  • 45. Slide 45 RELATIONSHIP OF INDIVIDUAL INDEPENDENT VARIABLES TO DEPENDENT VARIABLE - 5
    (The Parameter Estimates table is repeated from Slide 43; the reference category is 3.)
    The value of Exp(B) was 0.191, which implies that for each unit increase in confidence in Congress the odds decreased by 80.9% (0.191 - 1.0 = -0.809). The relationship stated in the problem is supported.
    Survey respondents who had less confidence in Congress were less likely to be in the group of survey respondents who thought we spend about the right amount of money on highways and bridges, rather than the group of survey respondents who thought we spend too much money on highways and bridges. For each unit increase in confidence in Congress, the odds of being in the group of survey respondents who thought we spend about the right amount of money on highways and bridges decreased by 80.9%.
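The step from a b coefficient to a percentage change in the odds can be sketched as follows. This is an illustration only, assuming the standard relationship Exp(B) = e^B; the helper name percent_change_in_odds is hypothetical.

```python
import math

def percent_change_in_odds(b):
    """Percent change in the odds for a one-unit increase in the predictor."""
    return (math.exp(b) - 1.0) * 100.0

# CONLEGIS coefficients from the Parameter Estimates table (reference category 3).
comparisons = [
    ("too little vs. too much", -1.373),
    ("about right vs. too much", -1.657),
]

for label, b in comparisons:
    print(label, round(math.exp(b), 3), round(percent_change_in_odds(b), 1))
```

A negative result is a decrease in the odds and a positive result is an increase, which is the Exp(B) - 1.0 arithmetic used on the slides.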
  • 46. Slide 46 CLASSIFICATION USING THE MULTINOMIAL LOGISTIC REGRESSION MODEL: BY CHANCE ACCURACY RATE
    The independent variables could be characterized as useful predictors distinguishing survey respondents who thought we spend too little money on highways and bridges, survey respondents who thought we spend about the right amount of money on highways and bridges, and survey respondents who thought we spend too much money on highways and bridges if the classification accuracy rate was substantially higher than the accuracy attainable by chance alone. Operationally, the classification accuracy rate should be 25% or more higher than the proportional by chance accuracy rate.
    Case Processing Summary
    HIGHWAYS AND BRIDGES   N     Marginal Percentage
    1                      62    37.1%
    2                      93    55.7%
    3                      12    7.2%
    Valid                  167   100.0%
    Missing                103
    Total                  270
    Subpopulation          153a
    a. The dependent variable has only one value observed in 146 (95.4%) subpopulations.
    The proportional by chance accuracy rate was computed by calculating the proportion of cases for each group based on the number of cases in each group in the 'Case Processing Summary', and then squaring and summing the proportion of cases in each group (0.371² + 0.557² + 0.072² = 0.453).
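The by chance accuracy calculation on this slide can be reproduced from the group counts. This is a minimal sketch; the function name is illustrative, and the 1.25 multiplier is the "25% or more higher" rule stated on the slide.

```python
def proportional_by_chance_accuracy(counts):
    """Sum of squared group proportions."""
    total = sum(counts)
    return sum((n / total) ** 2 for n in counts)

# Valid cases per group from the Case Processing Summary.
group_counts = [62, 93, 12]

chance_rate = proportional_by_chance_accuracy(group_counts)
criterion = 1.25 * chance_rate  # accuracy must be at least 25% higher than chance

print(round(chance_rate, 3), round(criterion, 3))
```

Working from the counts rather than the rounded percentages gives the same 0.453 to three decimals.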
  • 47. Slide 47 CLASSIFICATION USING THE MULTINOMIAL LOGISTIC REGRESSION MODEL: CLASSIFICATION ACCURACY
    Classification
    Observed             Predicted 1   Predicted 2   Predicted 3   Percent Correct
    1                    15            47            0             24.2%
    2                    7             86            0             92.5%
    3                    5             7             0             .0%
    Overall Percentage   16.2%         83.8%         .0%           60.5%
    The classification accuracy rate was 60.5%, which was greater than or equal to the proportional by chance accuracy criterion of 56.6% (1.25 x 45.3% = 56.6%). The criterion for classification accuracy is satisfied.
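The overall and per-group accuracy rates can be recomputed from the cell counts of the Classification table. This is a sketch that assumes rows are observed groups and columns are predicted groups, a layout consistent with the group sizes of 62, 93, and 12 valid cases; the function names are illustrative.

```python
# Rows = observed group, columns = predicted group (counts from the Classification table).
classification = [
    [15, 47, 0],  # observed 1: too little
    [7, 86, 0],   # observed 2: about right
    [5, 7, 0],    # observed 3: too much
]

def overall_accuracy(table):
    """Share of cases on the diagonal, i.e. predicted group equals observed group."""
    correct = sum(table[i][i] for i in range(len(table)))
    total = sum(sum(row) for row in table)
    return correct / total

def percent_correct_by_group(table):
    """Share of each observed group that was classified correctly."""
    return [row[i] / sum(row) for i, row in enumerate(table)]

accuracy = overall_accuracy(classification)
by_group = percent_correct_by_group(classification)

# Criterion from the slide: 1.25 x 45.3% = 56.6%.
meets_criterion = accuracy >= 1.25 * 0.453

print(round(accuracy, 3), [round(p, 3) for p in by_group], meets_criterion)
```

Note that the model never predicts group 3, so the overall rate is carried entirely by the two larger groups.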
  • 48. Slide 48 Answering the question in problem 1 - 1
    11. In the dataset GSS2000, is the following statement true, false, or an incorrect application of a statistic? Assume that there is no problem with missing data, outliers, or influential cases, and that the validation analysis will confirm the generalizability of the results. Use a level of significance of 0.05 for evaluating the statistical relationships.
    The variables "age" [age], "highest year of school completed" [educ] and "confidence in Congress" [conlegis] were useful predictors for distinguishing between groups based on responses to "opinion about spending on highways and bridges" [natroad]. These predictors differentiate survey respondents who thought we spend too little money on highways and bridges from survey respondents who thought we spend too much money on highways and bridges and survey respondents who thought we spend about the right amount of money on highways and bridges from survey respondents who thought we spend too much money on highways and bridges.
    Among this set of predictors, confidence in Congress was helpful in distinguishing among the groups defined by responses to opinion about spending on highways and bridges. Survey respondents who had less confidence in Congress were less likely to be in the group of survey respondents who thought we spend too little money on highways and bridges, rather than the group of survey respondents who thought we spend too much money on highways and bridges. For each unit increase in confidence in Congress, the odds of being in the group of survey respondents who thought we spend too little money on highways and bridges decreased by 74.7%. Survey respondents who had less confidence in Congress were less likely to be in the group of survey respondents who thought we spend about the right amount of money on highways and bridges, rather than the group of survey respondents who thought we spend too much money on highways and bridges. For each unit increase in confidence in Congress, the odds of being in the group of survey respondents who thought we spend about the right amount of money on highways and bridges decreased by 80.9%.
    1. True 2. True with caution 3. False
    We found a statistically significant overall relationship between the combination of independent variables and the dependent variable.
    There was no evidence of numerical problems in the solution.
    Moreover, the classification accuracy surpassed the proportional by chance accuracy criterion, supporting the utility of the model.
  • 49. Slide 49 Answering the question in problem 1 - 2
    (The problem statement is repeated from Slide 48.)
    We verified that each statement about the relationship between an independent variable and the dependent variable was correct in both the direction of the relationship and the change in likelihood associated with a one-unit change of the independent variable, for both of the comparisons between groups stated in the problem.
    1. True 2. True with caution 3. False 4. Inappropriate application of a statistic
    The answer to the question is true with caution. A caution is added because of the inclusion of ordinal level variables.
  • 50. Slide 50 Problem 2
    1. In the dataset GSS2000, is the following statement true, false, or an incorrect application of a statistic? Assume that there is no problem with missing data, outliers, or influential cases, and that the validation analysis will confirm the generalizability of the results. Use a level of significance of 0.05 for evaluating the statistical relationships.
    The variables "highest year of school completed" [educ], "sex" [sex] and "total family income" [income98] were useful predictors for distinguishing between groups based on responses to "opinion about spending on space exploration" [natspac]. These predictors differentiate survey respondents who thought we spend too little money on space exploration from survey respondents who thought we spend too much money on space exploration and survey respondents who thought we spend about the right amount of money on space exploration from survey respondents who thought we spend too much money on space exploration.
    Among this set of predictors, total family income was helpful in distinguishing among the groups defined by responses to opinion about spending on space exploration. Survey respondents who had higher total family incomes were more likely to be in the group of survey respondents who thought we spend about the right amount of money on space exploration, rather than the group of survey respondents who thought we spend too much money on space exploration. For each unit increase in total family income, the odds of being in the group of survey respondents who thought we spend about the right amount of money on space exploration increased by 6.0%.
    1. True 2. True with caution 3. False 4. Inappropriate application of a statistic
  • 51. Slide 51 Dissecting problem 2 - 1
    (The problem statement is repeated from Slide 50.)
    For these problems, we will assume that there is no problem with missing data, outliers, or influential cases, and that the validation analysis will confirm the generalizability of the results.
    In this problem, we are told to use 0.05 as alpha for the multinomial logistic regression.
  • 52. Slide 52 Dissecting problem 2 - 2
    (The problem statement is repeated from Slide 50.)
    The variables listed first in the problem statement are the independent variables (IVs): "highest year of school completed" [educ], "sex" [sex] and "total family income" [income98].
    The variable used to define groups is the dependent variable (DV): "opinion about spending on space exploration" [natspac].
    SPSS only supports direct or simultaneous entry of independent variables in multinomial logistic regression, so we have no choice of method for entering variables.
  • 53. Slide 53 Dissecting problem 2 - 3
    (The problem statement is repeated from Slide 50.)
    SPSS multinomial logistic regression models the relationship by comparing each of the groups defined by the dependent variable to the group with the highest code value.
    The responses to opinion about spending on the space program were: 1 = Too little, 2 = About right, and 3 = Too much.
    The analysis will result in two comparisons:
    - survey respondents who thought we spend too little money versus survey respondents who thought we spend too much money on space exploration
    - survey respondents who thought we spend about the right amount of money versus survey respondents who thought we spend too much money on space exploration.
  • 54. Slide 54 Dissecting problem 2 - 4
    (The problem statement is repeated from Slide 50.)
    Each problem includes a statement about the relationship between one independent variable and the dependent variable. The answer to the problem is based on the stated relationship, ignoring the relationships between the other independent variables and the dependent variable.
    This problem identifies a difference for only one of the two comparisons based on the three values of the dependent variable. Other problems will specify both of the possible comparisons.
  • 55. Slide 55 Dissecting problem 2 - 5
    (The problem statement is repeated from Slide 50.)
    In order for the multinomial logistic regression question to be true, the overall relationship must be statistically significant, there must be no evidence of numerical problems, the classification accuracy rate must be substantially better than could be obtained by chance alone, and the stated individual relationship must be statistically significant and interpreted correctly.
  • 56. Slide 56 LEVEL OF MEASUREMENT - 1
    (The problem statement is repeated from Slide 50.)
    Multinomial logistic regression requires that the dependent variable be non-metric and the independent variables be metric or dichotomous.
    "Opinion about spending on space exploration" [natspac] is ordinal, satisfying the non-metric level of measurement requirement for the dependent variable.
    It contains three categories: survey respondents who thought we spend too little money, about the right amount of money, and too much money on space exploration.
  • 57. Slide 57 LEVEL OF MEASUREMENT - 2
    (The problem statement is repeated from Slide 50.)
    "Highest year of school completed" [educ] is interval, satisfying the metric or dichotomous level of measurement requirement for independent variables.
    "Sex" [sex] is dichotomous, satisfying the metric or dichotomous level of measurement requirement for independent variables.
    "Total family income" [income98] is ordinal. If we follow the convention of treating ordinal level variables as metric variables, the metric or dichotomous level of measurement requirement for independent variables is satisfied. Since some data analysts do not agree with this convention, a note of caution should be included in our interpretation.
  • 58. Slide 58 Request multinomial logistic regression Select the Regression | Multinomial Logistic… command from the Analyze menu.
  • 59. Slide 59 Selecting the dependent variable First, highlight the dependent variable natspac in the list of variables. Second, click on the right arrow button to move the dependent variable to the Dependent text box.
  • 60. Slide 60 Selecting non-metric independent variables Non-metric independent variables are specified as factors in multinomial logistic regression. Non-metric variables can be dichotomous, nominal, or ordinal. These variables will be dummy coded as needed and each value will be listed separately in the output. Select the dichotomous variable sex. Move the non-metric independent variables listed in the problem to the Factor(s) list box.
  • 61. Slide 61 Selecting metric independent variables Metric independent variables are specified as covariates in multinomial logistic regression. Metric variables can be either interval or, by convention, ordinal. Move the metric independent variables, educ and income98, to the Covariate(s) list box.
  • 62. Slide 62 Specifying statistics to include in the output While we will accept most of the SPSS defaults for the analysis, we need to specifically request the classification table. Click on the Statistics… button to make a request.
  • 63. Slide 63 Requesting the classification table First, keep the SPSS defaults for Summary statistics, Likelihood ratio test, and Parameter estimates. Second, mark the checkbox for the Classification table. Third, click on the Continue button to complete the request.
  • 64. Slide 64 Completing the multinomial logistic regression request Click on the OK button to request the output for the multinomial logistic regression. The multinomial logistic procedure supports additional commands to specify the model computed for the relationships (we will use the default main effects model), additional specifications for computing the regression, and saving classification results. We will not make use of these options.
  • 65. Slide 65 Sample size – ratio of cases to variables
    Case Processing Summary
    Variable                    Value   N      Marginal Percentage
    SPACE EXPLORATION PROGRAM   1       33     15.9%
    SPACE EXPLORATION PROGRAM   2       90     43.3%
    SPACE EXPLORATION PROGRAM   3       85     40.9%
    RESPONDENTS SEX             1       94     45.2%
    RESPONDENTS SEX             2       114    54.8%
    Valid                               208    100.0%
    Missing                             62
    Total                               270
    Subpopulation                       138a
    a. The dependent variable has only one value observed in 112 (81.2%) subpopulations.
    Multinomial logistic regression requires that the minimum ratio of valid cases to independent variables be at least 10 to 1. The ratio of valid cases (208) to number of independent variables (3) was 69.3 to 1, which was equal to or greater than the minimum ratio. The requirement for a minimum ratio of cases to independent variables was satisfied.
    The preferred ratio of valid cases to independent variables is 20 to 1. The ratio of 69.3 to 1 was equal to or greater than the preferred ratio. The preferred ratio of cases to independent variables was satisfied.
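The sample size check is simple arithmetic and can be sketched as follows. The function name is hypothetical; the thresholds of 10 to 1 and 20 to 1 are the ones stated on the slide.

```python
def cases_to_variables_ratio(valid_cases, n_independent_variables):
    """Ratio of valid cases to the number of independent variables."""
    return valid_cases / n_independent_variables

# Problem 2: 208 valid cases and 3 independent variables (educ, sex, income98).
ratio = cases_to_variables_ratio(208, 3)

meets_minimum = ratio >= 10    # required: at least 10 to 1
meets_preferred = ratio >= 20  # preferred: at least 20 to 1

print(round(ratio, 1), meets_minimum, meets_preferred)
```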
  • 66. Slide 66 OVERALL RELATIONSHIP BETWEEN INDEPENDENT AND DEPENDENT VARIABLES
    Model Fitting Information
    Model            -2 Log Likelihood   Chi-Square   df   Sig.
    Intercept Only   354.268
    Final            334.967             19.301       6    .004
    The presence of a relationship between the dependent variable and combination of independent variables is based on the statistical significance of the final model chi-square in the SPSS table titled "Model Fitting Information".
    In this analysis, the probability of the model chi-square (19.301) was 0.004, less than or equal to the level of significance of 0.05. The null hypothesis that there was no difference between the model without independent variables and the model with independent variables was rejected. The existence of a relationship between the independent variables and the dependent variable was supported.
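The model chi-square in the Model Fitting Information table is the drop in -2 log likelihood from the intercept-only model to the final model. A minimal sketch of that arithmetic and the associated p-value follows; it uses only the standard library, and the helper chi2_sf_even_df is an illustrative closed form that is valid only for an even number of degrees of freedom (here df = 6, three predictors times two comparisons).

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; closed form valid only for even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    series = sum(half ** i / math.factorial(i) for i in range(df // 2))
    return math.exp(-half) * series

# -2 log likelihoods from the Model Fitting Information table.
intercept_only = 354.268
final_model = 334.967

model_chi_square = intercept_only - final_model
p_value = chi2_sf_even_df(model_chi_square, 6)

print(round(model_chi_square, 3), round(p_value, 3))
```

With a general-purpose statistics library the same tail probability would normally come from its chi-square distribution functions; the closed form is used here only to keep the sketch dependency-free.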
  • 67. Slide 67 NUMERICAL PROBLEMS
    Parameter Estimates for SPACE EXPLORATION PROGRAM (a. The reference category is: 3. b. This parameter is set to zero because it is redundant. Lower Bound is the lower bound of the 95% confidence interval for Exp(B); the upper bound is cut off in the slide.)
    Group  Effect     B       Std. Error  Wald    df  Sig.  Exp(B)  Lower Bound
    1      Intercept  -4.136  1.157       12.779  1   .000
    1      EDUC         .101   .089        1.276  1   .259  1.106   .929
    1      INCOME98     .097   .050        3.701  1   .054  1.102   .998
    1      [SEX=1]      .672   .426        2.488  1   .115  1.959   .850
    1      [SEX=2]     0b      .            .     0   .     .       .
    2      Intercept  -2.487   .840        8.774  1   .003
    2      EDUC         .108   .068        2.521  1   .112  1.114   .975
    2      INCOME98     .058   .034        2.932  1   .087  1.060   .992
    2      [SEX=1]      .501   .317        2.492  1   .114  1.650   .886
    2      [SEX=2]     0b      .            .     0   .     .       .
    Multicollinearity in the multinomial logistic regression solution is detected by examining the standard errors for the b coefficients. A standard error larger than 2.0 indicates numerical problems, such as multicollinearity among the independent variables, zero cells for a dummy-coded independent variable because all of the subjects have the same value for the variable, and 'complete separation' whereby the two groups in the dependent event variable can be perfectly separated by scores on one of the independent variables. Analyses that indicate numerical problems should not be interpreted.
    None of the independent variables in this analysis had a standard error larger than 2.0.
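The standard error screen described on this slide amounts to scanning the Std. Error column for values above 2.0. A small sketch, assuming the check applies to the independent variables only (intercepts and the redundant [SEX=2] parameter are left out); the dictionary layout is an illustrative choice.

```python
# Standard errors of the independent variables from the Parameter Estimates table,
# keyed by (comparison group, effect).
standard_errors = {
    (1, "EDUC"): 0.089,
    (1, "INCOME98"): 0.050,
    (1, "[SEX=1]"): 0.426,
    (2, "EDUC"): 0.068,
    (2, "INCOME98"): 0.034,
    (2, "[SEX=1]"): 0.317,
}

THRESHOLD = 2.0  # rule of thumb from the slide

flagged = {key: se for key, se in standard_errors.items() if se > THRESHOLD}
numerical_problems = bool(flagged)

print(flagged, numerical_problems)
```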
  • 68. Slide 68 RELATIONSHIP OF INDIVIDUAL INDEPENDENT VARIABLES TO DEPENDENT VARIABLE - 1
    Likelihood Ratio Tests
    Effect      -2 Log Likelihood of Reduced Model   Chi-Square   df   Sig.
    Intercept   334.967a                             .000         0    .
    EDUC        337.788                              2.821        2    .244
    INCOME98    340.154                              5.187        2    .075
    SEX         338.511                              3.544        2    .170
    The chi-square statistic is the difference in -2 log-likelihoods between the final model and a reduced model. The reduced model is formed by omitting an effect from the final model. The null hypothesis is that all parameters of that effect are 0.
    a. This reduced model is equivalent to the final model because omitting the effect does not increase the degrees of freedom.
    The statistical significance of the relationship between total family income and opinion about spending on space exploration is based on the statistical significance of the chi-square statistic in the SPSS table titled "Likelihood Ratio Tests".
    For this relationship, the probability of the chi-square statistic (5.187) was 0.075, greater than the level of significance of 0.05. The null hypothesis that all of the b coefficients associated with total family income were equal to zero was not rejected. The existence of a relationship between total family income and opinion about spending on space exploration was not supported.
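Each row of the Likelihood Ratio Tests table can be recomputed from the -2 log likelihoods. The sketch below is an illustration only: each effect has 2 degrees of freedom (one coefficient in each of the two comparisons), and for 2 degrees of freedom the chi-square upper-tail probability reduces to exp(-x / 2). The names are illustrative.

```python
import math

def chi2_sf_df2(x):
    """Upper-tail probability of a chi-square variable with 2 degrees of freedom."""
    return math.exp(-x / 2.0)

final_minus2ll = 334.967  # final model
reduced_minus2ll = {      # model with the named effect omitted
    "EDUC": 337.788,
    "INCOME98": 340.154,
    "SEX": 338.511,
}

results = {}
for effect, value in reduced_minus2ll.items():
    chi_square = value - final_minus2ll
    results[effect] = (chi_square, chi2_sf_df2(chi_square))

for effect, (chi_square, p) in results.items():
    print(effect, round(chi_square, 3), round(p, 3), p <= 0.05)
```

None of the three effects reaches the 0.05 level on its own, even though the overall model chi-square is significant.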
  • 69. Slide 69 Answering the question in problem 2
    (The problem statement is repeated from Slide 50.)
    We found a statistically significant overall relationship between the combination of independent variables and the dependent variable.
    There was no evidence of numerical problems in the solution.
    However, the individual relationship between total family income and spending on space was not statistically significant.
    1. True 2. True with caution 3. False 4. Inappropriate application of a statistic
    The answer to the question is false.
  • 70. Slide 70 Steps in multinomial logistic regression: level of measurement and initial sample size
    The following is a guide to the decision process for answering problems about the basic relationships in multinomial logistic regression:
    1. Dependent non-metric? Independent variables metric or dichotomous? No -> Inappropriate application of a statistic. Yes -> go to the next step.
    2. Ratio of cases to independent variables at least 10 to 1? No -> Inappropriate application of a statistic. Yes -> Run multinomial logistic regression.
  • 71. Slide 71 Steps in multinomial logistic regression: overall relationship and numerical problems
    1. Overall relationship statistically significant? (model chi-square test) No -> False. Yes -> go to the next step.
    2. Standard errors of coefficients indicate no numerical problems (s.e. <= 2.0)? No -> False. Yes -> continue.
  • 72. Slide 72 Steps in multinomial logistic regression: relationships between IV's and DV
    1. Overall relationship between specific IV and DV is statistically significant? (likelihood ratio test) No -> False. Yes -> go to the next step.
    2. Role of specific IV and DV groups statistically significant and interpreted correctly? (Wald test and Exp(B)) No -> False. Yes -> continue.
  • 73. Slide 73 Steps in multinomial logistic regression: classification accuracy and adding cautions
    1. Overall accuracy rate is at least 25% higher than the proportional by chance accuracy rate? No -> False. Yes -> go to the next step.
    2. Satisfies preferred ratio of cases to IV's of 20 to 1? No -> True with caution. Yes -> go to the next step.
    3. One or more IV's are ordinal level treated as metric? Yes -> True with caution. No -> True.
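The decision process on Slides 70 through 73 can be sketched as a single function. This is one illustrative encoding rather than anything provided by SPSS: the function and parameter names are hypothetical, and each yes/no judgment (significance tests, correct interpretation) is passed in as a value the analyst has already determined from the output.

```python
def answer_problem(level_of_measurement_ok, valid_cases, n_ivs,
                   overall_significant, max_iv_standard_error,
                   iv_lr_significant, stated_relationship_correct=None,
                   accuracy_rate=None, chance_rate=None,
                   ordinal_iv_treated_as_metric=False):
    """Walk the decision steps in slide order and return the answer label."""
    inappropriate = "Inappropriate application of a statistic"
    # Slide 70: level of measurement and minimum sample size
    if not level_of_measurement_ok:
        return inappropriate
    ratio = valid_cases / n_ivs
    if ratio < 10:
        return inappropriate
    # Slide 71: overall relationship and numerical problems
    if not overall_significant or max_iv_standard_error > 2.0:
        return "False"
    # Slide 72: individual relationship (likelihood ratio, then Wald and Exp(B))
    if not iv_lr_significant or not stated_relationship_correct:
        return "False"
    # Slide 73: classification accuracy, then cautions
    if accuracy_rate < 1.25 * chance_rate:
        return "False"
    if ratio < 20 or ordinal_iv_treated_as_metric:
        return "True with caution"
    return "True"

# Problem 1: 167 valid cases, 3 IVs, accuracy 60.5% against a chance rate of 45.3%,
# with an ordinal IV (conlegis) treated as metric.
problem_1 = answer_problem(True, 167, 3, True, 0.620, True, True,
                           accuracy_rate=0.605, chance_rate=0.453,
                           ordinal_iv_treated_as_metric=True)

# Problem 2: the likelihood ratio test for income98 was not significant (p = .075),
# so the later steps are never reached and their inputs are left unset.
problem_2 = answer_problem(True, 208, 3, True, 0.426, False)

print(problem_1, "|", problem_2)
```

The order of the checks matters: a failure at an earlier step returns immediately, which is why problem 2 can be answered without its classification accuracy.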