SlideShare a Scribd company logo
1 of 64
Statistica Sinica 16(2006), 847-860
PSEUDO-R
2
IN LOGISTIC REGRESSION MODEL
Bo Hu, Jun Shao and Mari Palta
University of Wisconsin-Madison
Abstract: Logistic regression with binary and multinomial
outcomes is commonly
used, and researchers have long searched for an interpretable
measure of the strength
of a particular logistic model. This article describes the large
sample properties
of some pseudo-R2 statistics for assessing the predictive
strength of the logistic
regression model. We present theoretical results regarding the
convergence and
asymptotic normality of pseudo-R2s. Simulation results and an
example are also
presented. The behavior of the pseudo-R2s is investigated
numerically across a
range of conditions to aid in practical interpretation.
Key words and phrases: Entropy, logistic regression, pseudo-R2
1. Introduction
Logistic regression for binary and multinomial outcomes is
commonly used
in health research. Researchers often desire a statistic ranging
from zero to one
to summarize the overall strength of a given model, with zero
indicating a model
with no predictive value and one indicating a perfect fit. The
coefficient of deter-
mination R2 for the linear regression model serves as a standard
for such measures
(Draper and Smith (1998)). Statisticians have searched for a
corresponding in-
dicator for models with binary/multinomial outcome. Many
different R2 statis-
tics have been proposed in the past three decades (see, e.g.,
McFadden (1973),
McKelvey and Zavoina (1975), Maddala (1983), Agresti (1986),
Nagelkerke
(1991), Cox and Wermuch (1992), Ash and Shwartz (1999),
Zheng and Agresti
(2000)). These statistics, which are usually identical to the
standard R2 when
applied to a linear model, generally fall into categories of
entropy-based and
variance-based (Mittlböck and Schemper (1996)). Entropy-
based R2 statistics,
also called pseudo-R2s, have gained some popularity in the
social sciences (Mad-
dala (1983), Laitila (1993) and Long (1997)). McKelvey and
Zavoina (1975)
proposed a pseudo-R2 based on a latent model structure, where
the binary/
multinomial outcome results from discretizing a continuous
latent variable that
is related to the predictors through a linear model. Their
pseudo-R2 is defined
as the proportion of the variance of the latent variable that is
explained by the
848 BO HU, JUN SHAO AND MARI PALTA
covariate. McFadden (1973) suggested an alternative, known as
“likelihood-
ratio index”, comparing a model without any predictor to a
model including all
predictors. It is defined as one minus the ratio of the log
likelihood with inter-
cepts only, and the log likelihood with all predictors. If the
slope parameters
are all 0, McFadden’s R2 is 0, but it is never 1. Maddala (1983)
developed
another pseudo-R2 that can be applied to any model estimated
by the maximum
likelihood method. This popular and widely used measure is
expressed as
R2M = 1 −
(
L(θ̃)
L(θ̂)
)
2
n
, (1)
where L(θ̃) is the maximized likelihood for the model without
any predictor and
L(θ̂) is the maximized likelihood for the model with all
predictors. In terms of
the likelihood ratio statistic λ = −2 log(L(θ̃)/L(θ̂)), R2M = 1 −
e−λ/n. Maddala
proved that R2M has an upper bound of 1 − (L(θ̃))2/n and, thus,
suggested a
normed measure based on a general principle of Cragg and
Uhler (1970):
R2N =
1 −
(
L(θ̃)
L(θ̂)
)
2
n
1 − (L(θ̃))
2
n
. (2)
While the statistics in (1) and (2) are widely used, their
statistical properties
have not been fully investigated. Mittlböck and Schemper
(1996) reviewed R2M
and R2N along with other measures, but their results are mainly
empirical and
numerical. The R2 for the linear model is interpreted as the
proportion of the
variation in the response that can explained by the regressors.
However, there is
no clear interpretation of the pseudo-R2s in terms of variance of
the outcome in
logistic regression. Note that both R2M and R
2
N are statistics and thus random.
In linear regression, the standard R2 converges almost surely to
the ratio of the
variability due to the covariates over the total variability as the
sample size in-
creases to infinity. Once we know the limiting values of R2M
and R
2
N , these limits
can be similarly used to understand how the pseudo-R2s capture
the predictive
strength of the model. The pseudo-R2s for a given data set are
point estimators
for the limiting values that are unknown. To account for the
variability in esti-
mation, it is desirable to study the asymptotic sampling
distributions of R2M and
R2N , which can be used to obtain asymptotic confidence
intervals for the limiting
values of pseudo-R2s. Helland (1987) studied the sampling
distributions of R2
statistics in linear regression.
In this article we study the behavior of R2M and R
2
N under the logistic re-
gression model. In Section 2, we derive the limits of R2M and R
2
N and provide
PSEUDO-R2 IN LOGISTIC REGRESSION MODEL 849
interpretations of them. We also present some graphs describing
the behavior of
R2N across a range of practical situations. The asymptotic
distributions of R
2
M
and R2N are derived in Section 3 and some simulation results
are presented. An
example is given in Section 4.
2. What Does Pseudo-R2 Measure
In this section we explore the issue of what R2M in (1) and R
2
N in (2) measure
in the setting of binary or multinomial outcomes.
2.1. Limits of pseudo-R2s
Consider a study of n subjects whose outcomes fall in one of m
categories.
Let Yi = (Yi1, . . . , Yim)
′ be the outcome vector associated with the ith subject,
where Yij = 1 if the outcome falls in the jth category, and Yij =
0 otherwise.
We assume that Y1, . . . , Yn are independent and that Yi is
associated with a p-
dimensional vector Xi of predictors (covariates) through the
multinomial logit
model
Pij = E(Yij|Xi) =
exp(αj + X
′
iβj )
∑m
k=1 exp(αk + X
′
iβk)
, j = 1, . . . , m, (3)
where αm = βm = 0, α1, . . . , αm−1 are unknown scalar
parameters, and β1, . . .,
βm−1 are unknown p-vectors of parameters. Let θ be the (p +
1)(m − 1) dimen-
sional parameter (α1, β
′
1, . . . , αm−1, β
′
m−1). Then the likelihood function under
the multinomial logit model can be written as
L(θ) =
n
∏
i=1
P
Yi1
i1 P
Yi2
i2 · · · P
Yim
im . (4)
Procedures for obtaining the maximum likelihood estimator θ̂ of
θ are available
in most statistical software packages. The following theorem
provides the asymp-
totic limits of the pseudo-R2s defined in (1) and (2). Its proof is
given in the
Appendix.
Theorem 1. Assume that covariates Xi, i = 1, . . . , n, are
independent and
identically distributed random p-vectors with finite second
moment. If
H1 = −
m
∑
j=1
E(Pij ) log E(Pij ), (5)
H2 = −
m
∑
j=1
E(Pij log Pij ), (6)
850 BO HU, JUN SHAO AND MARI PALTA
then, as n → ∞, R2M →p 1−e2(H2−H1) and R2N →p (1 −
e2(H2−H1))/(1 − e−2H1 ),
where →p denotes convergence in probability.
2.2. Interpretation of the limits of pseudo-R2s
It is useful to consider whether the limits of pseudo-R2 can be
interpreted
much as R2 can be for linear regression analysis.
Theorem 1 reveals that both R2M and R
2
N converge to limits that can be
described in terms of entropy. If the covariates Xis are i.i.d., Yi
= (Yi1, . . . , Yim)
′,
i = 1, . . . , n, are also i.i.d. multinomial distributed with
probability vector (E(Pi1),
. . . , E(Pim)) where the expectation is taken over Xi. Then H1
given in (5) is
exactly the entropy measuring the marginal variation of Yi.
Similarly, −
∑m
j=1 Pij
log Pij corresponds to the conditional entropy measuring the
variation of Yi given
Xi and H2 can be considered as the average conditional entropy.
Therefore
H1 − H2 measures the difference in entropy explained by the
covariate X, which
is always greater than 0 by Jensen’s inequality, and is 0 if and
only if the covariates
and outcomes are independent. For example, when (Xi, Yi) is
bivariate normal,
H1 − H2 = log(
√
1 − ρ2)−1 where ρ is the correlation coefficient, and the limit
of R2M is ρ
2.
The limit of R2M is 1 − e−2(H1−H2) monotone in increasing
H1 − H2. Then
we can write the limit of R2N as the limit of R
2
M divided by its upper bound:
R2N →p
1 − e−2(H1−H2)
1 − e−2H1
=
e2H1 − e2H2
e2H1 − 1
.
When both H1 and H2 are small, 1 − e−2(H1−H2) ≈ 2(H1 −
H2), 1 − e−2H1 ≈ 2H1
and the limit of R2N is approximately (H1 − H2)/H1, the
entropy explained by
the covariates relative to the marginal entropy H1.
2.3. Limits of R2
M
and R2
N
relative to model parameters
For illustration, we examine the magnitude of the limits of RM
and RN
under different parameter settings when the Xis are i.i.d.
standard normal and
the outcome is binary. Figures 1 and 2 show the relationship
between the limits
of R2M and R
2
N and the parameters α and β. In these figures, profile lines of
the
limits are given for different levels of the response probability
eα/(1 + eα) at the
mean of Xi and odds ratio e
β per standard deviation of the covariate. The limits
tend to increase as the absolute value of β increases with other
parameters fixed,
which is consistent with the behavior of the usual R2 in linear
regression models.
However, we note that the limits tend to be low, even in models
where the
parameters indicate a rather strong association with the
outcome. For example,
PSEUDO-R2 IN LOGISTIC REGRESSION MODEL 851
a moderate size odds ratio of 2 per standard deviation of Xi is
associated with
the limit of R2N at most 0.10. As the pseudo-R
2 measures do not correspond
in magnitude to what is familiar from R2 for ordinary
regression, judgments
about the strength of the logistic model should refer to profiles
such as those
provided in Figures 1 and 2. Knowing what odds ratio for a
single predictor
model produces the same pseudo-R2 as a given multiple
predictor model greatly
facilitates subject matter relevance assessment.
PSfrag replacements
0
2
4
6
0.0
0.2 0.4 0.6 0.8
e
α
β
e
α
1+eα
e
β
Figure 1. Contour plot of limits of R2M against e
α/(1 + eα) and odds ratio eβ .
852 BO HU, JUN SHAO AND MARI PALTA
PSfrag replacements
0
2
4
6
0.0
0.2 0.4 0.6 0.8
e
α
β
e
α
1+eα
e
β
Figure 2. Contour plot of limits of R2N against e
α/(1 + eα) and odds ratio eβ.
It may be noted that neither R2N nor R
2
M can equal 1, except in degenerate
models. This property is a logical consequence of the nature of
binary outcomes.
The denominator, 1 − (L(θ̃))2/n, equals the numerator when
L(θ̂) equals 1, which
occurs only for a degenerate outcome that is always 0 or 1. In
fact, any perfectly
fitting model for binary data would predict probabilities that are
only 0 or 1. This
constitutes a degenerate logistic model, which cannot be fit. In
comparison to
PSEUDO-R2 IN LOGISTIC REGRESSION MODEL 853
the R2 for a linear model, R2 of 1 implies residual variance of
0. As the variance
and entropy of binomial and multinomial data depend on the
mean, this again
can occur only when the predicted probabilities are 0 and 1. The
mean-entropy
dependence influences the size of the pseudo-R2s and tends to
keep them away
from 1 even when the mean probabilities are strongly dependent
on the covariate.
For ease of model interpretation, investigators often categorize
a continuous
variable, which leads to a loss of information. Consider a
standard normally
distributed covariate. We calculate the limit of R2N when
cutting the normal
covariate into two, three, five or six categories. The threshold
points we choose
are 0 for two categories, ±1 for three categories, ±0.5, ±1 for
five categories, and
0, ±0.5, ±1 for six categories. In Figures 3 and 4, we plot the
corresponding limits
of R2N against e
α/(1 + eα) by fixing β at 1, and against eβ by setting α = 1. The
fewer the number of categories we use for the covariate, the
more information we
lose, i.e., the smaller the limit of R2N . In this example, we
note that using five or
six categories retains most of the information provided by the
original continuous
covariate.
PSfrag replacements
0
2
4
6
0.0
0.2
0.4
0.6
0.8
e
α
β
e
α
1+eα
eβ
0
.0
0
0
.0
5
0
.1
0
0
.1
5
0
.2
0
0
.2
5
0
.3
0
0.0
0.1
0.2
0.3
0.4
0.5
0.6 0.8
Continuous
Two
Three
Five
Six
Figure 3. Limit of R2N with covariate N (0, 1) dichotomized
into K categories
against eα/(1 + eα), K = 2, 3, 5, 6
854 BO HU, JUN SHAO AND MARI PALTA
PSfrag replacements
0 2 4 6
0.0
0.2
0.4
0.6
0.8
e
α
β
e
α
1+eα
eβ
0.00
0.05
0.10
0.15
0.20
0.25
0.30
0
.0
0
.1
0
.2
0
.3
0
.4
0
.5
0
.6
0.8
Continuous
Two
Three
Five
Six
Figure 4. Limit of R2N with covariate N (0, 1) dichotomized
into K categories
against odds ratio eβ , K = 2, 3, 5, 6
3. Sampling Distributions of Pseudo-R2s
The result in the previous section indicates that the limit of a
pseudo-R2 is
a measure of the predictive strength of a model relating the
logistic responses to
some predictors (covariates). The quantities R2M and R
2
N are statistics and are
random. They should be treated as estimators of their limiting
values in assessing
the model strength. In this section, we derive the asymptotic
distributions of R2M
and R2N that are useful for deriving large sample confidence
intervals.
3.1. Asymptotic distributions of pseudo-R2s
Theorem 2. Under the conditions of Theorem 1,
√
n
[
R2M − (1 − e2(H2−H1))
]
→d N (0, σ21 ) (7)
√
n
[
R2N −
1 − e2(H2−H1)
1 − e−2H1
]
→d N (0, σ22 ), (8)
PSEUDO-R2 IN LOGISTIC REGRESSION MODEL 855
where H1 and H2 are given by (5) and (6), σ
2
1 = g
′
1Σg1 and σ
2
2 = g
′
2Σg2 with
g1 = −2e2(H2−H1) (1 + log γ1, . . . , 1 + log γm, −1) , (9)
g2 =
e−2H1 (1 − e2H2 )
(1 − e−2H1 )2
(
1 + log γ1, . . . , 1 + log γm, e
2H2
1 − e2H1
1 − e2H2
)
, (10)
Σ =
(
Cov(Yi) η
η′ �
)
. (11)
Here γj = Ex(Pij ), j = 1, . . . , m, is the expected probability
that the outcome
falls in jth category, the jth element of η is ηj = Ex(Pij log Pij )
+ γj H2, and
� =
∑m
j=1 Ex
(
Pij (log Pij )
2
)
− H22 .
When all the slope parameters βj are 0 (i.e., Xi and Yi are
uncorrelated),
both σ21 and σ
2
2 are zero. g1, g2 and Σ can be estimated by replacing the un-
known quantities, which are related to the covariate
distribution, with consistent
estimators. For example, γ can be estimated by (
∑
P̂ i1/n, . . . ,
∑
P̂ im/n)
′.
Suppose gk, k = 1, 2, and Σ are estimated by ĝk and Σ̂,
respectively. Theorem
2 leads to the following asymptotic 100(1 − α)% confidence
interval for the limit
of R2M :
(
R2M − Z α
2
ĝ′1Σ̂ĝ1, R
2
M + Z α
2
ĝ′1Σ̂ĝ1
)
, (12)
where Zα is the 1 − α quantile of the standard normal
distribution. A confidence
interval for the limit of R2N can be obtained by replacing R
2
M and ĝ1 in (12) with
R2N and ĝ2, respectively. If the resulting lower limit of the
confidence interval is
below 0 or the upper limit is above 1, it is conventional to use
the margin value
of 0 or 1.
3.2. Simulation results
In this section, we examine by simulation the finite sample
performance of
the confidence intervals based on the asymptotic results derived
in Section 3.1.
Our simulation experiments consider the logistic regression
model with binary
outcome and a single normal covariate with mean 0 and
standard deviation 1.
All the simulations were run with 3,000 replications of an
artificially gener-
ated data set. In each replication, we simulated a sample of size
200 or 1,000
from the standard normal distribution as covariate vectors X ,
and simulated
200 or 1,000 binary outcomes according to success probability
exp(α + βX)/(1+
exp(α + βX)). Tables 1 and 2 show the results for different
values of α and β.
In all the simulations with sample size 1,000, the estimated
confidence inter-
vals derived by Theorem 2 displayed coverage probability close
to the expected
level of 0.95. Coverage probability is less satisfactory with
sample size 200 when
the model is weak.
856 BO HU, JUN SHAO AND MARI PALTA
Table 1. Simulation average of pseudo-R2s and 95% confidence
intervals in
the logit model with normal covariate (sample size=1,000).
α β R2M (limit) CI (coverage
∗ ) R2N (limit) CI (coverage
∗ )
2 0.5 0.028 (0.027) (0.008,0.047) (0.930) 0.052 (0.050)
(0.015,0.088)(0.930)
1 0.108 (0.103) (0.069,0.137) (0.937) 0.178 (0.178)
(0.121,0.236) (0.940)
2 0.298 (0.298) (0.258,0.338) (0.929) 0.455 (0.454)
(0.396,0.513) (0.927)
1 0.5 0.047 (0.046) (0.022,0.071) (0.937) 0.067 (0.066)
(0.032,0.103) (0.940)
1 0.151 (0.150) (0.113,0.189) (0.943) 0.213 (0.213)
(0.160,0.267) (0.945)
2 0.351 (0.350) (0.312,0.390) (0.922) 0.483 (0.483)
(0.430,0.536) (0.929)
0.5 0.5 0.054 (0.053) (0.028,0.080) (0.940) 0.073 (0.072)
(0.038,0.109) (0.941)
1 0.166 (0.166) (0.127,0.204) (0.948) 0.224 (0.226)
(0.172,0.276) (0.948)
2 0.367 (0.365) (0.327,0.404) (0.931) 0.492 (0.491)
(0.443,0.546) (0.933)
0 0.5 0.056 (0.055) (0.030,0.083) (0.938) 0.075 (0.074)
(0.039,0.111) (0.938)
1 0.171 (0.171) (0.133,0.210) (0.931) 0.229 (0.228)
(0.177,0.281) (0.933)
2 0.371 (0.370) (0.332,0.409) (0.925) 0.490 (0.494)
(0.443,0.546) (0.929)
∗ The relative frequency with which the intervals contain the
true limit
Table 2. Simulation average of pseudo-R2s and 95% confidence
intervals in
the logit model with normal covariate (sample size=200).
α β R2M (limit) CI (coverage) R
2
N (limit) CI (coverage)
2 0.5 0.031 (0.027) (0, 0.074) (0.912) 0.058 (0.050) (0, 0.138)
(0.922)
1 0.107 (0.103) (0.0310, 0.182) (0.918) 0.185 (0.178) (0.0580,
0.312) (0.920)
2 0.299 (0.298) (0.2110, 0.387) (0.910) 0.458 (0.454) (0.3290,
0.588) (0.913)
1 0.5 0.050 (0.046) (0, 0.105) (0.915) 0.073 (0.066) (0, 0.151)
(0.914)
1 0.152 (0.150) (0.0680, 0.235) (0.928) 0.215 (0.213) (0.0980,
0.332) (0.925)
2 0.351 (0.350) (0.2650, 0.437) (0.930) 0.484 (0.483) (0.3660,
0.601) (0.932)
0.5 0.5 0.058 (0.053) (0, 0.116) (0.919) 0.078 (0.072) (0, 0.157)
(0.920)
1 0.168 (0.166) (0.0820, 0.253) (0.927) 0.227 (0.226) (0.1120,
0.343) (0.930)
2 0.367 (0.365) (0.2820, 0.452) (0.925) 0.494 (0.491) (0.3790,
0.608) (0.925)
0 0.5 0.059 (0.055) (0.0010, 0.118) (0.911) 0.079 (0.074)
(0.0010, 0.158) (0.912)
1 0.174 (0.171) (0.0880, 0.260) (0.928) 0.233 (0.228) (0.1180,
0.347) (0.930)
2 0.372 (0.370) (0.2870, 0.457) (0.925) 0.496 (0.494) (0.3830,
0.610) (0.922)
4. Example
We now turn to an example of logistic regression from Fox’s
(2001) text on
fitting generalized linear models. This example draws on data
from the 1976 U.S.
Panel Study of Income Dynamics. There are 753 families in the
data set with
8 variables. The variables are defined in Table 3. The logarithm
of the wife’s
estimated wage rate is based on her actual earnings if she is in
the labor force;
otherwise this variable is imputed from other predictors. The
definition of other
variables is straightforward.
PSEUDO-R2 IN LOGISTIC REGRESSION MODEL 857
Table 3. Variables in the women labor force dataset.
Variable Description Remarks
lfp wife’s labor-force participation factor: no,yes
k5 number of children ages 5 and younger 0-3, few 3’s
k618 number of children ages 6 to 18 0-8, few > 5
age wife’s age in years 30-60, single years
wc wife’s college attendance factor: no,yes
hc husband’s college attendance factor: no,yes
lwg log of wife’s estimated wage rate see text
inc family income excluding wife’s income $1, 000s
We assume a binary logit model with no labor force
participation as the
baseline category. Other variables are treated as predictors in
the model. The
estimated model with all the predictors has the following form:
log
P
1−P
=
3.18−1.47k5−0.07k618−0.06age+0.81wc+0.11hc+0.61lwg−0.03i
nc,
where P is the probability that the wife in the family is in the
labor force. The
variables k618 and hc are not statistically significant based on
the likelihood-ratio
test. Table 4 shows the values of R2M and R
2
N , as well as 95% confidence intervals
of limits of R2M and R
2
N , for the model containg all the predictors, and models
excluding certain predictors.
Table 4. R2M and R
2
N with 95% confidence intervals of models for women
labor force data.
Model R2M (95% CI.) R
2
N (95% CI.)
Use all predictors 0.152 ( 0.109, 0.195) 0.205 ( 0.147, 0.262)
Exclude k5 0.074 ( 0.040, 0.108) 0.100 ( 0.054, 0.145)
Exclude age 0.123 ( 0.083, 0.164) 0.165 ( 0.111, 0.219)
Exclude wc 0.138 ( 0.096, 0.180) 0.185 ( 0.129, 0.241)
Exclude lwg 0.133 ( 0.092, 0.174) 0.179 ( 0.123, 0.234)
Exclude inc 0.130 ( 0.087, 0.172) 0.175 ( 0.119, 0.230)
Exclude k618 0.151 ( 0.108, 0.194) 0.203 ( 0.145, 0.261)
Exclude hc 0.152 ( 0.109, 0.195) 0.204 ( 0.146, 0.262)
Use k618, hc only 0.003 (-0.005, 0.010) 0.004 (-0.006, 0.013)
For the model with all the covariates, R2M and R
2
N are around 0.15 and 0.20,
respectively. The results imply a moderately strong model when
referencing the
odds ratio scale equivalents in Figure 1. Dropping a significant
covariate results
in a notable decrease in the values of pseudo-R2s, while no
significant change
occurs if we drop the insignificant covariates. R2M and R
2
N are near zero when we
858 BO HU, JUN SHAO AND MARI PALTA
exclude all the significant covariates. However, model selection
procedures using
pseudo-R2 need further research.
Acknowledgements
The research work is supported by Grant CA-53786 from the
National Cancer
Institute. The authors thank the referees and an editor for
helpful comments.
Appendix
For the proof of results in Section 3, we begin with a lemma and
then sketch
the main steps for Theorem 1 and 2.
Lemma 1. Assume that covariates Xi, i = 1, . . . , n, are i.i.d.
random p-vectors
with finite second moment, then (log L(θ̂) − log L(θ))/
√
n →p 0, where θ̂ is the
maximum likelihood estimator of θ.
Proof of Lemma 1. We first prove that ∂2 log L(θ)/∂θ∂θ
′
= Op(n). The score
function is
∂ log L(θ)
∂θ
=
(
n
∑
i=1
(Yi1 − Pi1),
n
∑
i=1
(Yi1 − Pi1)X′i, . . . ,
n
∑
i=1
(Yim − Pim)X′i
)
′
.
Let ηk = (αk, β
′
k)
′ ∈ Rp+1 for k = 1, . . . , m, and Ui = (1, X′i )′. Then
∂2 log L(θ)
∂ηk∂η
′
k
= −
n
∑
i=1
Pik(1 − Pik)UiU ′i , k = 1, 2, . . . , m,
∂2 log L(θ)
∂ηk∂η
′
l
= −
n
∑
i=1
PikPilUiU
′
i , k 6= l.
Since UiU
′
i =
(
1 X′i
Xi XiX
′
i
)
, each element in the second derivative matrix
∂2 log L(θ)/∂θ∂θ
′
is Op(n) by assumption. For simplicity, we write this as
∂2 log L(θ)/∂θ∂θ
′
= Op(n). Let Sn(θ) = ∂ log L(θ)/∂θ, Jn(θ) =−∂2 log L(θ)/∂θ∂θ
′
and In(θ) = E(Jn(θ)), where the expectation is taken over
covariates. It follows
that the cumulative information matrix In(θ) = Op(n). By a
second-order Taylor
expansion,
log L(θ̂) − log L(θ)√
n
=
Sn(θ̂)√
n
′
(θ̂ − θ) −
1
2
√
n
(θ̂ − θ)′Jn(θ∗)(θ̂ − θ)
= (θ̂ − θ)′In(θ)
1
2 In(θ)
−
1
2
√
n
Jn(θ
∗ )
2n
3
2
√
nIn(θ)
−
1
2 In(θ)
1
2 (θ̂ − θ),
PSEUDO-R2 IN LOGISTIC REGRESSION MODEL 859
where θ∗ is a vector between θ and θ̂. The asymptotic normality
results of the
MLE gives In(θ)
1/2(θ̂ −θ) → N (0, 1). The lemma then follows from the fact that
In(θ)
−1/2
√
n = Op(1) and Jn(θ
∗ )/n = Op(1).
Proof of Theorem 1. Let f (x) = log(1 − x)/2, then
f (R2) =
1
n
log L(θ̃) − 1
n
log L(θ̂)
=
m
∑
j=1
nj
n
log(
nj
n
) − 1
n
log L(θ) +
1
n
(
log L(θ) − log L(θ̂)
)
.
The convergence of
∑m
j=1(nj /n) log(nj/n) and log L(θ)/n come from the Law
of Large Numbers. The results of the theorem follow from the
lemma and the
Continuous Mapping Theorem.
Proof of Theorem 2. Let S2M = 1 − (L(θ̃)/L(θ))2/n and S2N =
(1−
(L(θ̃)/L(θ))2/n)/(1 − (L(θ̃))2/n). It follows from the lemma that
S2M and S2N
have the same asymptotic distribution as R2M and R
2
N , respectively, in the sense
that
√
n(S2M − R2M ) →p 0 and
√
n(S2N − R2N ) →p 0.
Define Zi = (Yi1, . . . , Yim, Wi) where Wi =
∑m
j=1 Yij log Pij . Then Zi’s form
a i.i.d. random sequence with µ = E(Zi) = (γ
′,
∑m
j=1 E(P1j log P1j )) = (γ
′, −E2),
Cov (Zi) = Σ, γ and Σ as defined in Section 3. By the
Multidimensional Central
Limit Theorem,
√
n
(
Z
̄ − µ
)
→ N (0, Σ). (13)
Let φ1(x1, . . . , xm) = 1 − e2(
∑
m
j=1
xj log xj−xm+1) and φ2(x1, . . . , xm) = (1−
e2(
∑
m
j=1
xj log xj −xm+1))/(1 − e2
∑
m
j=1
xj log xj ). Applying the delta-method with φ1
and φ2 to (13), respectively, leads to the asymptotic normality
results in
Theorem 2.
References
Agresti, A. (1986). Applying R2 type measures to ordered
categorical data. Technometrics. 28,
133-138.
Ash, A. and Shwartz, M. (1999). R2: a useful measure of model
performance when predicting
a dichotomous outcome. Statist. Medicine 18, 375-384.
Cox, D. R. and Wermuch, N. (1992). A comment on the
coefficient of determination for binary
responses. Amer. Statist. 46, 1-4.
Cragg, J. G. and Uhler, R. S. (1970). The demand for
automobiles. Canad. J. Economics 3,
386-406.
Draper, N. R. and Smith, H. (1998). Applied Rregression
Analysis. 3rd edition. Wiley, New
York.
Fox, J. (2001). An R and S-Plus Companion to Applied
Regression. Sage Publications.
860 BO HU, JUN SHAO AND MARI PALTA
Helland, I. S. (1987). On the interpretation and use of R2 in
regression analysis, Biometrics 43,
61-69.
Laitila, T. (1993). A pseudo-R2 measure for limited and
qualitative dependent variable models.
J. Econometrics 56, 341-356.
Long, J. S. (1997). Regression Models for Categorical and
Limited Dependent Variables. Sage
Publications.
Maddala, G. S. (1983). Limited-Dependent and Qualitative
Variables in Econometrics. Cam-
bridge University Press, Cambridge.
McFadden, D. (1973). Conditional logit analysis of qualitative
choice behavior. In Frontiers in
Econometrics (Edited by P. Zarembka), 105-42. Academic
Press, New York.
McKelvey, R. D. and Zavoina, W. (1975). A statistical model
for the analysis of ordinal level
dependent variables. J. Math. Soc. 4, 103-120.
Mittlböck M. and Schemper, M. (1996). Explained variation for
logistic regression. Statist.
Medicine 15, 1987-1997.
Nagelkerke, N. J. D. (1991). A note on a general definition of
the coefficient of determination.
Biometrika 78, 691-693.
Zheng B. Y. and Agresti, A. (2000). Summarizing the predictive
power of a generalized liner
model. Statist. Medicine 19, 1771-1781.
Department of Statistics, University of Wisconsin-Madison,
Madison, WI, 53706, U.S.A.
E-mail: [email protected]
Department of Statistics, University of Wisconsin-Madison,
Madison, WI, 53706, U.S.A.
E-mail: [email protected]
Department of Population Health Sciences, University of
Wisconsin-Madison, Madison, WI,
53706, U.S.A.
E-mail: [email protected]
(Received August 2004; accepted July 2005)
1. Introduction2. What Does Pseudo-R2 Measure2.1. Limits of
pseudo-R2s2.2. Interpretation of the limits of pseudo2.3. Limits
of R2M and R2N relative to model parameters3. Sampling
Distributions of Pseudo-R2s3.1. Asymptotic distributions of
pseudo3.2. Simulation results4. ExampleAppendix
A note about the References tool in Word
On a PC/Windows system (based on Office 2010)
When you need to create a citation (giving credit for work that
you are referencing), you
click on References, then on Insert Citation. The next step is to
add a new source.
When you get to the "Create Source" window, it is suggested
that you click on the
"Show All Bibliography Fields." Here is a sample Source
screen.
Once you have entered all the source information, click on
Bibliography and then Insert
Bibliography.
This is the citation:
(Joseph, 2000)
This is how the source is entered into the References list:
Joseph, J. (2000, October). Ethics in the Workplace. Retrieved
August 3, 2015, from asae-The
Center for Association Leadership:
http://www.asaecenter.org/Resources/articledetail.cfm?ItemNum
ber=13073
Other fields on the source page would be used for a journal
article or an article from a
periodical.
On a Mac/OS system (based on Office 2013)
From the MAC Help files:
To add a citation, a works cited list, or a bibliography to your
document, you first
add a list of the sources that you used.
Add a source by using the Source Manager
The Source Manager lists every source ever entered on your
computer so that
you can reuse them in any other document. This is useful, for
example, if you
write research papers that use many of the same sources. If you
open a
document that includes citations, the sources for those citations
appear under
Current list. All the sources that you have cited, either in
previous documents or
in the current document, appear under Master list.
1. Open up your Word document.
2. On the Document Elements tab , under References ,
click Manage.
3. At the bottom of the Citations tool, click , and then click
Citation Source Manager .
1 2
3
4
1 2
3
4
4. Click New.
5. On the Type of Source pop-up menu, select a source type.
6. Complete as many of the fields as you want. The required
fields are
marked with an asterisk (*). These fields provide the minimum
information
that you must have for a citation.
7.
Note You can insert citations even when you do not have all
the publishing details.
If publishing details are omitted, citations are inserted as
numbered placeholders.
Then you can edit the sources later. You must enter all the
required information for a
source before you can create a bibliography.
8. When you are finished, click OK.The source information that
you entered
appears in the Current list and Master list of the Source
Manager.
9. To add additional sources, repeat steps 3 through 6.
10. Click Close.The source information that you entered appears
in the
Citations List in the Citations tool.
Edit a source in the Citations tool
You can edit a source directly in the document or in the
Citations tool. When you
change the source, the changes apply to all instances of that
citation throughout
the document. However, if you make a manual change to a
particular citation
within the document, those changes apply only to that particular
citation. Also,
that particular citation is not updated or overridden when you
update the citations
and bibliography.
1. On the Document Elements tab, under References, click
Manage.
2. In the Citations List, select the citation that you want to edit.
3. At the bottom of the Citations tool, click , and then click Edit
Source.
4. Make the changes that you want, and then click OK. If you
see a message
that asks whether you want to save changes in both the Master
list and the
Current list, click No to change only the current document, or
click Yes to
apply changes to the source of the citation and use it in other
documents.
Remove a source from the Citations List
Before you can remove a source from the Citations List, you
must delete all
related citations.
1. In the document, delete all the citations associated with the
source that
you want to remove.
2. Tip You can use the search field to locate citations. In the
search field , enter part of the citation.
3. On the Document Elements tab, under References, click
Manage.
4. At the bottom of the Citations tool, click , and then click
Citation Source
Manager.
5. In the Current list, select the source that you want to remove,
and then
click Delete. The source now appears only in the Master list.
6.
Note If the Delete button is unavailable, or if you see a check
mark next to the source in the list, there is still at least one
related citation in the document. Delete all remaining related
citations in the document, and then try deleting
the source again.
7. Click Close. The source that you removed no longer
appears in the Citations
List.
Step 2. Insert, edit, or delete a citation (optional)
Insert a citation
1. In your document, click where you want to insert the citation.
2. On the Document Elements tab, under References, click
Manage.
3. In the Citations List, double-click the source that you want to
cite. The
citation appears in the document.
Add page numbers or suppress author, year, or title for a
specific citation
Use this option to make custom changes to a citation and keep
the ability to
update the citation automatically.
Note The changes that you make by using this method apply
only to this citation.
1. Click anywhere between the parentheses of the citation. A
frame appears
around the citation.
2. Click the arrow on the frame, and then click Edit this
Citation.
3. Add page numbers, or select the Author, Year, or Title check
box to keep that
information from showing in the citation.
Make manual changes to a specific citation
If you want to change a specific citation manually, you can
make the citation text
static and edit the citation in any way that you want. After you
make the text
static, the citation will no longer update automatically. If you
want to make
changes later, you must make the changes manually.
1. Click anywhere between the parentheses of the citation. A
frame appears
around the citation.
2. Click the arrow on the frame, and then click Convert Citation
to Static Text.
3. In the document, make the changes to the citation.
Delete a single citation from the document
1. In the document, find the citation that you want to delete.
2.
Tip You can use the search field to locate citations. In the
search field , enter part of the citation.
3. Select the whole citation, including the parentheses, and then
press DELETE.
Step 3. Insert or edit a works cited list or a bibliography
A works cited list is a list of all works you referred to (or
"cited") in your
document, and is typically used when you cite sources using the
MLA style. A
works cited list differs from a bibliography, which is a list of
all works that you
consulted when your researched and wrote your document.
Insert a works cited list or a bibliography
1. In your document, click where you want the works cited list
or bibliography to
appear (usually at the very end of the document, following a
page break).
2. On the Document Elements tab, under References, click
Bibliography, and
then click Bibliography or Works Cited.
Change a works cited list or a bibliography style
You can change the style of all the citations contained in a
document's works
cited list or bibliography without manually editing the style of
the citations
themselves. For example, you can change the citations from the
APA style to the
MLA style.
1. On the View menu, click Draft or Print Layout.
2. On the Document Elements tab, under References, click the
Bibliography
Style pop-up menu, and then click the style that you want to
change the
bibliography's references to. All
references in your document's bibliography change to the new
style.
Update a works cited list or a bibliography
If you add new sources to the document after you inserted the
works cited list or
bibliography, you can update the works cited list or
bibliography to include the
new sources.
1. Click the works cited list or bibliography. A frame appears
around it.
2. Click the arrow on the frame, and then click Update Citations
and
Bibliography.
Research Paper Using Word
This assignment has two goals: 1) have students, via research,
increase their understanding of impacts of information
technology on current world issues, and 2) learn to correctly use
the tools and techniques within Word to format a research
paper, including use of available References and citation tools.
These skills will be valuable throughout a student’s
academic career.
The paper will require a title page, NO abstract, three to five
full pages of content with incorporation of a minimum of 3
external resources from credible sources and a Works
Cited/References page. Wikipedia and similar general
information
sites, blogs or discussion groups are not considered creditable
sources for a research project. No more than 10% of the
paper may be in the form of a direct citation from an external
source. Choose your topic from the list of topics that follow
these organization steps.
Paper organization
Open Word and save a blank document with the following
name:
“Student’s LastNameFirstInitial Research Paper”
The paper should be organized in the following way:
1. Title page:
a. Center in the middle of the page (horizontally and vertically)
the title (subject) of the paper and below that
your name
2. Body of the paper:
a. Use 12-point Arial font
b. Set the margins at 1”
c. Entire paper should be double-spaced
d. Length – 3-5 full pages, not counting the title page or the
References page.
e. Include a minimum of 3 APA-formatted citations and related
References page. Every reference must be cited
at least once, and every citation have an entry in the References
list. If you are not familiar with APA format,
it is recommended that you use the References feature in Word
for your citations and Reference List or refer
to the "Citing and Writing" option under the
Resources/Library/Get Help area in the LEO classroom. It is
important to review the final format for APA-style correctness
even if generated by Word.
f. Include at least two (2) informational footnotes. Footnotes are
not used to list a reference! Footnotes contain
information about the topic to which the footnote has been
attached.
g. Place the references on a separate page following the body of
the paper. Note: Use a hard return (CTRL
Enter) after the end of your paper body and the start of the
References page.
3. Organization of the content of the paper:
Include the following sections in the paper (include, in bold,
the headings identified here):
a. Introduction - Identify the issue or idea. Explain why the
topic was selected and what you are trying to
achieve (what is your end goal). The introduction should not be
more than half a page; details will be
discussed in the follow-on areas.
b. Areas of interest, activity or issue – Define the issue or idea
in greater detail. Define the specific problem
or problems or new idea. Identify other underlining or related
issues as well as dependencies. Explain what
impacts will result if not addressed.
c. Research Findings – Summarize your research findings and
what they contribute to the study of the issue
or idea. You must identify (cite) the sources of the research or
class material related to your topic that you
include in the findings.
d. Proposed solution(s), idea(s), courses of action(s). List
solutions, ideas or courses of action with an
analysis of its effectiveness (how will your suggestions affect
or change the current situation). If more than
one idea is suggested, provide an analysis that covers all
proposed suggestions.
e. Conclusion – Summarize the conclusions of your paper.
A list of topics from which students can choose is provided
here:
Topics for Research Paper
The focus of the paper should be on one of the following:
1. How has information technology led to the struggle between
online and brick-and-mortar stores? What do the
next 5-10 years look like?
2. How has information technology opened up the potential for
5G networks? Are there any downsides to
the implementation of this technology?
3. How has information technology impacted the use of robots
in your local stores?
4. How has information technology supported the development
of monopolies – Amazon, Microsoft, telecom
companies? Will these monopolies survive?
5. How has information technology supported the development
of facial recognition software and the current issues
related to its use?
6. How has information technology led to the use of biometrics
and the potential for rise of an International “Big
Brother”?
7. How has information technology led to the development of
the Internet of Things and the concern about the
impact of privacy laws (or lack thereof) on the IOT?
8. How has information technology supported the development
of Facebook and other social media sites? Should
social media sites be regulated?
9. Who/what is Huawei and what are the issues the U.S. and
other countries are having with Huawei?
10. How has information technology changed the political
process within the past 5 years?
Writing Quality for the Research Paper
• All Grammar, Verb Tenses, Pronouns, Spelling, Punctuation,
and Writing Competency should be without error.
• Be particularly careful about mis-matching a noun and
pronoun. For example, if you say "A person does this…" then
do
not use "their" or "they" when referring to that person. "Person"
is singular; "their" or "they" is plural.
• Remember: there is not their, your is not you're, its is not it's,
too is not to or two, site is not cite, and who should be
used after an individual, not that. For example, "the person
WHO made the speech" not "the person THAT made the
speech."
• in the previous sentence. It is more business-like to say "In a
professional paper one should not use contractions,"
rather than saying, "In a professional paper you don't use
contractions."
• In a professional paper one does not use contractions (doesn't,
don't, etc.) and one
does not use the personal I, you or your. Use the impersonal as
in the previous
sentence. It is more business-like to say "In a professional paper
one should not use
contractions," rather than saying, "In a professional paper you
don't use contractions."
• Remember: spell-check, then proofread. Better yet, have a
friend or colleague read it before submitting it. Read it out
loud to yourself. Read it as if you are submitting it to your boss.
Grading Criteria
Paper Mechanics
Format- title pg,
font, margins, paper
length
0.5 Title page included: Arial 12-point font used; margins set at
1";
body of the paper is 3-5 pages, double spaced. The title page
and References page are not counted as part of the 3-5 pages of
text.
APA work -
citations and
references
0.5 A minimum of 3 correctly formatted citations matched to
references; both citations and references in APA format.
Footnotes 0.5 A minimum of 2 footnotes that contain additional
information
but are NOT references.
Mechanics-
grammar, spelling,
etc.
1.5 Grammar, spellings, and punctuation correct throughout the
paper.
Content
Introduction 2 This is a summary of the topic. Simply identify
the issue
without going into great detail, explain why was the topic
selected and what the you are trying to achieve (what is your
end goal). The introduction should not be more than half a
page; details will be discussed in the follow-on areas.
Issue 2.0 Define the issue or idea. Define the specific problem
or
problems or new idea. Identify other underlying or related
issues as well as dependencies. Explain what impacts will
result if not addressed.
Findings 2.0 Identify research or class material related to your
topic.
Summarize your findings and what they contribute to the
study of the issue or idea. Sources must be identified in
citations and the related References list.
Solution
s/actions 3.0 List solution, idea or courses of action with an
analysis of its
effectiveness (how will your suggestions affect or change the
current situation). If more than one idea is suggested,
provide an analysis that covers all proposed suggestions.
Conclusion 2 Summarize the conclusions of your paper. In a
paragraph
briefly identify the issue, the findings, your proposed
solution/actions. However, do not simply repeat the words in
the previous sections.
You can find instructions on how to use the References tool in
Word on a PC or on a Mac in a file included in the
Assignment link.
Pseudo-R2 in Logistic Regression

More Related Content

Similar to Pseudo-R2 in Logistic Regression

Mixed Model Analysis for Overdispersion
Mixed Model Analysis for OverdispersionMixed Model Analysis for Overdispersion
Mixed Model Analysis for Overdispersiontheijes
 
Lecture 1 maximum likelihood
Lecture 1 maximum likelihoodLecture 1 maximum likelihood
Lecture 1 maximum likelihoodAnant Dashpute
 
Exploring bivariate data
Exploring bivariate dataExploring bivariate data
Exploring bivariate dataUlster BOCES
 
Research Methodology Module-06
Research Methodology Module-06Research Methodology Module-06
Research Methodology Module-06Kishor Ade
 
Linear Regression
Linear Regression Linear Regression
Linear Regression Rupak Roy
 
Correlation and regression
Correlation and regressionCorrelation and regression
Correlation and regressionMohit Asija
 
Get Multiple Regression Assignment Help
Get Multiple Regression Assignment Help Get Multiple Regression Assignment Help
Get Multiple Regression Assignment Help HelpWithAssignment.com
 
Chapter 2 part3-Least-Squares Regression
Chapter 2 part3-Least-Squares RegressionChapter 2 part3-Least-Squares Regression
Chapter 2 part3-Least-Squares Regressionnszakir
 

Similar to Pseudo-R2 in Logistic Regression (20)

Notes Ch8
Notes Ch8Notes Ch8
Notes Ch8
 
Mixed Model Analysis for Overdispersion
Mixed Model Analysis for OverdispersionMixed Model Analysis for Overdispersion
Mixed Model Analysis for Overdispersion
 
Simple linear regressionn and Correlation
Simple linear regressionn and CorrelationSimple linear regressionn and Correlation
Simple linear regressionn and Correlation
 
Lecture 1 maximum likelihood
Lecture 1 maximum likelihoodLecture 1 maximum likelihood
Lecture 1 maximum likelihood
 
Exploring bivariate data
Exploring bivariate dataExploring bivariate data
Exploring bivariate data
 
Regression -Linear.pptx
Regression -Linear.pptxRegression -Linear.pptx
Regression -Linear.pptx
 
Multiple Linear Regression
Multiple Linear Regression Multiple Linear Regression
Multiple Linear Regression
 
Research Methodology Module-06
Research Methodology Module-06Research Methodology Module-06
Research Methodology Module-06
 
Regression
RegressionRegression
Regression
 
Linear Regression
Linear Regression Linear Regression
Linear Regression
 
2. diagnostics, collinearity, transformation, and missing data
2. diagnostics, collinearity, transformation, and missing data 2. diagnostics, collinearity, transformation, and missing data
2. diagnostics, collinearity, transformation, and missing data
 
Correlation and regression
Correlation and regressionCorrelation and regression
Correlation and regression
 
Get Multiple Regression Assignment Help
Get Multiple Regression Assignment Help Get Multiple Regression Assignment Help
Get Multiple Regression Assignment Help
 
Regression
RegressionRegression
Regression
 
Chapter 2 part3-Least-Squares Regression
Chapter 2 part3-Least-Squares RegressionChapter 2 part3-Least-Squares Regression
Chapter 2 part3-Least-Squares Regression
 
gamdependence_revision1
gamdependence_revision1gamdependence_revision1
gamdependence_revision1
 
12943625.ppt
12943625.ppt12943625.ppt
12943625.ppt
 
Chap04 01
Chap04 01Chap04 01
Chap04 01
 
Math n Statistic
Math n StatisticMath n Statistic
Math n Statistic
 
Corr And Regress
Corr And RegressCorr And Regress
Corr And Regress
 

More from susanschei

Src TemplateStandard Recipe CardName of dishSpanish Vegie Tray Ba.docx
Src TemplateStandard Recipe CardName of dishSpanish Vegie Tray Ba.docxSrc TemplateStandard Recipe CardName of dishSpanish Vegie Tray Ba.docx
Src TemplateStandard Recipe CardName of dishSpanish Vegie Tray Ba.docxsusanschei
 
SPT 208 Final Project Guidelines and Rubric Overview .docx
SPT 208 Final Project Guidelines and Rubric  Overview .docxSPT 208 Final Project Guidelines and Rubric  Overview .docx
SPT 208 Final Project Guidelines and Rubric Overview .docxsusanschei
 
Ssalinas_ThreeMountainsRegionalHospitalCodeofEthics73119.docxR.docx
Ssalinas_ThreeMountainsRegionalHospitalCodeofEthics73119.docxR.docxSsalinas_ThreeMountainsRegionalHospitalCodeofEthics73119.docxR.docx
Ssalinas_ThreeMountainsRegionalHospitalCodeofEthics73119.docxR.docxsusanschei
 
Spring 2020Professor Tim SmithE mail [email protected]Teach.docx
Spring 2020Professor Tim SmithE mail [email protected]Teach.docxSpring 2020Professor Tim SmithE mail [email protected]Teach.docx
Spring 2020Professor Tim SmithE mail [email protected]Teach.docxsusanschei
 
Spring 2020 – Business Continuity & Disaster R.docx
Spring 2020 – Business Continuity & Disaster R.docxSpring 2020 – Business Continuity & Disaster R.docx
Spring 2020 – Business Continuity & Disaster R.docxsusanschei
 
Sports Business Landscape Graphic OrganizerContent.docx
Sports Business Landscape Graphic OrganizerContent.docxSports Business Landscape Graphic OrganizerContent.docx
Sports Business Landscape Graphic OrganizerContent.docxsusanschei
 
Spring 2020Carlow University Department of Psychology & Co.docx
Spring 2020Carlow University Department of Psychology & Co.docxSpring 2020Carlow University Department of Psychology & Co.docx
Spring 2020Carlow University Department of Psychology & Co.docxsusanschei
 
SPOTLIGHT ON STRATEGY FOR TURBULENT TIMESSpotlight ARTWORK.docx
SPOTLIGHT ON STRATEGY FOR TURBULENT TIMESSpotlight ARTWORK.docxSPOTLIGHT ON STRATEGY FOR TURBULENT TIMESSpotlight ARTWORK.docx
SPOTLIGHT ON STRATEGY FOR TURBULENT TIMESSpotlight ARTWORK.docxsusanschei
 
Sport Ticket sales staff trainingChapter 4Sales .docx
Sport Ticket sales staff trainingChapter 4Sales .docxSport Ticket sales staff trainingChapter 4Sales .docx
Sport Ticket sales staff trainingChapter 4Sales .docxsusanschei
 
SPOTLIGHT ARTWORK Do Ho Suh, Floor, 1997–2000, PVC figures, gl.docx
SPOTLIGHT ARTWORK Do Ho Suh, Floor, 1997–2000, PVC figures, gl.docxSPOTLIGHT ARTWORK Do Ho Suh, Floor, 1997–2000, PVC figures, gl.docx
SPOTLIGHT ARTWORK Do Ho Suh, Floor, 1997–2000, PVC figures, gl.docxsusanschei
 
Sponsorship Works 2018 8PROJECT DETAILSSponsorship tit.docx
Sponsorship Works 2018 8PROJECT DETAILSSponsorship tit.docxSponsorship Works 2018 8PROJECT DETAILSSponsorship tit.docx
Sponsorship Works 2018 8PROJECT DETAILSSponsorship tit.docxsusanschei
 
SPM 4723 Annotated Bibliography You second major proje.docx
SPM 4723 Annotated Bibliography You second major proje.docxSPM 4723 Annotated Bibliography You second major proje.docx
SPM 4723 Annotated Bibliography You second major proje.docxsusanschei
 
Speech Environment and Recording Requirements• You must have a.docx
Speech Environment and Recording Requirements• You must have a.docxSpeech Environment and Recording Requirements• You must have a.docx
Speech Environment and Recording Requirements• You must have a.docxsusanschei
 
Sped4 Interview 2.10.17 Audio.m4aJodee [000008] And we are .docx
Sped4 Interview 2.10.17 Audio.m4aJodee [000008] And we are .docxSped4 Interview 2.10.17 Audio.m4aJodee [000008] And we are .docx
Sped4 Interview 2.10.17 Audio.m4aJodee [000008] And we are .docxsusanschei
 
Speech Recognition in the Electronic Health Record (2013 u.docx
Speech Recognition in the Electronic Health Record (2013 u.docxSpeech Recognition in the Electronic Health Record (2013 u.docx
Speech Recognition in the Electronic Health Record (2013 u.docxsusanschei
 
Sped Focus Group.m4aJodee [000001] This is a focus group wi.docx
Sped Focus Group.m4aJodee [000001] This is a focus group wi.docxSped Focus Group.m4aJodee [000001] This is a focus group wi.docx
Sped Focus Group.m4aJodee [000001] This is a focus group wi.docxsusanschei
 
Specialized Terms 20.0 Definitions and examples of specialized.docx
Specialized Terms 20.0 Definitions and examples of specialized.docxSpecialized Terms 20.0 Definitions and examples of specialized.docx
Specialized Terms 20.0 Definitions and examples of specialized.docxsusanschei
 
Special notes Media and the media are plural and take plural verb.docx
Special notes Media and the media are plural and take plural verb.docxSpecial notes Media and the media are plural and take plural verb.docx
Special notes Media and the media are plural and take plural verb.docxsusanschei
 
SPECIAL ISSUE ON POLITICAL VIOLENCEResearch on Social Move.docx
SPECIAL ISSUE ON POLITICAL VIOLENCEResearch on Social Move.docxSPECIAL ISSUE ON POLITICAL VIOLENCEResearch on Social Move.docx
SPECIAL ISSUE ON POLITICAL VIOLENCEResearch on Social Move.docxsusanschei
 
SPECIAL ISSUE CRITICAL REALISM IN IS RESEARCHCRITICAL RE.docx
SPECIAL ISSUE  CRITICAL REALISM IN IS RESEARCHCRITICAL RE.docxSPECIAL ISSUE  CRITICAL REALISM IN IS RESEARCHCRITICAL RE.docx
SPECIAL ISSUE CRITICAL REALISM IN IS RESEARCHCRITICAL RE.docxsusanschei
 

More from susanschei (20)

Src TemplateStandard Recipe CardName of dishSpanish Vegie Tray Ba.docx
Src TemplateStandard Recipe CardName of dishSpanish Vegie Tray Ba.docxSrc TemplateStandard Recipe CardName of dishSpanish Vegie Tray Ba.docx
Src TemplateStandard Recipe CardName of dishSpanish Vegie Tray Ba.docx
 
SPT 208 Final Project Guidelines and Rubric Overview .docx
SPT 208 Final Project Guidelines and Rubric  Overview .docxSPT 208 Final Project Guidelines and Rubric  Overview .docx
SPT 208 Final Project Guidelines and Rubric Overview .docx
 
Ssalinas_ThreeMountainsRegionalHospitalCodeofEthics73119.docxR.docx
Ssalinas_ThreeMountainsRegionalHospitalCodeofEthics73119.docxR.docxSsalinas_ThreeMountainsRegionalHospitalCodeofEthics73119.docxR.docx
Ssalinas_ThreeMountainsRegionalHospitalCodeofEthics73119.docxR.docx
 
Spring 2020Professor Tim SmithE mail [email protected]Teach.docx
Spring 2020Professor Tim SmithE mail [email protected]Teach.docxSpring 2020Professor Tim SmithE mail [email protected]Teach.docx
Spring 2020Professor Tim SmithE mail [email protected]Teach.docx
 
Spring 2020 – Business Continuity & Disaster R.docx
Spring 2020 – Business Continuity & Disaster R.docxSpring 2020 – Business Continuity & Disaster R.docx
Spring 2020 – Business Continuity & Disaster R.docx
 
Sports Business Landscape Graphic OrganizerContent.docx
Sports Business Landscape Graphic OrganizerContent.docxSports Business Landscape Graphic OrganizerContent.docx
Sports Business Landscape Graphic OrganizerContent.docx
 
Spring 2020Carlow University Department of Psychology & Co.docx
Spring 2020Carlow University Department of Psychology & Co.docxSpring 2020Carlow University Department of Psychology & Co.docx
Spring 2020Carlow University Department of Psychology & Co.docx
 
SPOTLIGHT ON STRATEGY FOR TURBULENT TIMESSpotlight ARTWORK.docx
SPOTLIGHT ON STRATEGY FOR TURBULENT TIMESSpotlight ARTWORK.docxSPOTLIGHT ON STRATEGY FOR TURBULENT TIMESSpotlight ARTWORK.docx
SPOTLIGHT ON STRATEGY FOR TURBULENT TIMESSpotlight ARTWORK.docx
 
Sport Ticket sales staff trainingChapter 4Sales .docx
Sport Ticket sales staff trainingChapter 4Sales .docxSport Ticket sales staff trainingChapter 4Sales .docx
Sport Ticket sales staff trainingChapter 4Sales .docx
 
SPOTLIGHT ARTWORK Do Ho Suh, Floor, 1997–2000, PVC figures, gl.docx
SPOTLIGHT ARTWORK Do Ho Suh, Floor, 1997–2000, PVC figures, gl.docxSPOTLIGHT ARTWORK Do Ho Suh, Floor, 1997–2000, PVC figures, gl.docx
SPOTLIGHT ARTWORK Do Ho Suh, Floor, 1997–2000, PVC figures, gl.docx
 
Sponsorship Works 2018 8PROJECT DETAILSSponsorship tit.docx
Sponsorship Works 2018 8PROJECT DETAILSSponsorship tit.docxSponsorship Works 2018 8PROJECT DETAILSSponsorship tit.docx
Sponsorship Works 2018 8PROJECT DETAILSSponsorship tit.docx
 
SPM 4723 Annotated Bibliography You second major proje.docx
SPM 4723 Annotated Bibliography You second major proje.docxSPM 4723 Annotated Bibliography You second major proje.docx
SPM 4723 Annotated Bibliography You second major proje.docx
 
Speech Environment and Recording Requirements• You must have a.docx
Speech Environment and Recording Requirements• You must have a.docxSpeech Environment and Recording Requirements• You must have a.docx
Speech Environment and Recording Requirements• You must have a.docx
 
Sped4 Interview 2.10.17 Audio.m4aJodee [000008] And we are .docx
Sped4 Interview 2.10.17 Audio.m4aJodee [000008] And we are .docxSped4 Interview 2.10.17 Audio.m4aJodee [000008] And we are .docx
Sped4 Interview 2.10.17 Audio.m4aJodee [000008] And we are .docx
 
Speech Recognition in the Electronic Health Record (2013 u.docx
Speech Recognition in the Electronic Health Record (2013 u.docxSpeech Recognition in the Electronic Health Record (2013 u.docx
Speech Recognition in the Electronic Health Record (2013 u.docx
 
Sped Focus Group.m4aJodee [000001] This is a focus group wi.docx
Sped Focus Group.m4aJodee [000001] This is a focus group wi.docxSped Focus Group.m4aJodee [000001] This is a focus group wi.docx
Sped Focus Group.m4aJodee [000001] This is a focus group wi.docx
 
Specialized Terms 20.0 Definitions and examples of specialized.docx
Specialized Terms 20.0 Definitions and examples of specialized.docxSpecialized Terms 20.0 Definitions and examples of specialized.docx
Specialized Terms 20.0 Definitions and examples of specialized.docx
 
Special notes Media and the media are plural and take plural verb.docx
Special notes Media and the media are plural and take plural verb.docxSpecial notes Media and the media are plural and take plural verb.docx
Special notes Media and the media are plural and take plural verb.docx
 
SPECIAL ISSUE ON POLITICAL VIOLENCEResearch on Social Move.docx
SPECIAL ISSUE ON POLITICAL VIOLENCEResearch on Social Move.docxSPECIAL ISSUE ON POLITICAL VIOLENCEResearch on Social Move.docx
SPECIAL ISSUE ON POLITICAL VIOLENCEResearch on Social Move.docx
 
SPECIAL ISSUE CRITICAL REALISM IN IS RESEARCHCRITICAL RE.docx
SPECIAL ISSUE  CRITICAL REALISM IN IS RESEARCHCRITICAL RE.docxSPECIAL ISSUE  CRITICAL REALISM IN IS RESEARCHCRITICAL RE.docx
SPECIAL ISSUE CRITICAL REALISM IN IS RESEARCHCRITICAL RE.docx
 

Recently uploaded

Presiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsPresiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsanshu789521
 
internship ppt on smartinternz platform as salesforce developer
internship ppt on smartinternz platform as salesforce developerinternship ppt on smartinternz platform as salesforce developer
internship ppt on smartinternz platform as salesforce developerunnathinaik
 
DATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersDATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersSabitha Banu
 
Types of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptxTypes of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptxEyham Joco
 
CARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxCARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxGaneshChakor2
 
Full Stack Web Development Course for Beginners
Full Stack Web Development Course  for BeginnersFull Stack Web Development Course  for Beginners
Full Stack Web Development Course for BeginnersSabitha Banu
 
MARGINALIZATION (Different learners in Marginalized Group
MARGINALIZATION (Different learners in Marginalized GroupMARGINALIZATION (Different learners in Marginalized Group
MARGINALIZATION (Different learners in Marginalized GroupJonathanParaisoCruz
 
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17Celine George
 
History Class XII Ch. 3 Kinship, Caste and Class (1).pptx
History Class XII Ch. 3 Kinship, Caste and Class (1).pptxHistory Class XII Ch. 3 Kinship, Caste and Class (1).pptx
History Class XII Ch. 3 Kinship, Caste and Class (1).pptxsocialsciencegdgrohi
 
Employee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxEmployee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxNirmalaLoungPoorunde1
 
भारत-रोम व्यापार.pptx, Indo-Roman Trade,
भारत-रोम व्यापार.pptx, Indo-Roman Trade,भारत-रोम व्यापार.pptx, Indo-Roman Trade,
भारत-रोम व्यापार.pptx, Indo-Roman Trade,Virag Sontakke
 
How to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxHow to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxmanuelaromero2013
 
Proudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxProudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxthorishapillay1
 
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
Solving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptxSolving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptxOH TEIK BIN
 
Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17Celine George
 
EPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptxEPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptxRaymartEstabillo3
 

Recently uploaded (20)

Presiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsPresiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha elections
 
Model Call Girl in Bikash Puri Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Bikash Puri  Delhi reach out to us at 🔝9953056974🔝Model Call Girl in Bikash Puri  Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Bikash Puri Delhi reach out to us at 🔝9953056974🔝
 
internship ppt on smartinternz platform as salesforce developer
internship ppt on smartinternz platform as salesforce developerinternship ppt on smartinternz platform as salesforce developer
internship ppt on smartinternz platform as salesforce developer
 
DATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersDATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginners
 
Types of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptxTypes of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptx
 
CARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxCARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptx
 
Full Stack Web Development Course for Beginners
Full Stack Web Development Course  for BeginnersFull Stack Web Development Course  for Beginners
Full Stack Web Development Course for Beginners
 
MARGINALIZATION (Different learners in Marginalized Group
MARGINALIZATION (Different learners in Marginalized GroupMARGINALIZATION (Different learners in Marginalized Group
MARGINALIZATION (Different learners in Marginalized Group
 
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
 
OS-operating systems- ch04 (Threads) ...
OS-operating systems- ch04 (Threads) ...OS-operating systems- ch04 (Threads) ...
OS-operating systems- ch04 (Threads) ...
 
History Class XII Ch. 3 Kinship, Caste and Class (1).pptx
History Class XII Ch. 3 Kinship, Caste and Class (1).pptxHistory Class XII Ch. 3 Kinship, Caste and Class (1).pptx
History Class XII Ch. 3 Kinship, Caste and Class (1).pptx
 
Employee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxEmployee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptx
 
भारत-रोम व्यापार.pptx, Indo-Roman Trade,
भारत-रोम व्यापार.pptx, Indo-Roman Trade,भारत-रोम व्यापार.pptx, Indo-Roman Trade,
भारत-रोम व्यापार.pptx, Indo-Roman Trade,
 
How to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxHow to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptx
 
Proudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxProudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptx
 
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
 
Solving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptxSolving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptx
 
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdfTataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
 
Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17
 
EPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptxEPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptx
 

Pseudo-R2 in Logistic Regression

  • 1. Statistica Sinica 16(2006), 847-860 PSEUDO-R 2 IN LOGISTIC REGRESSION MODEL Bo Hu, Jun Shao and Mari Palta University of Wisconsin-Madison Abstract: Logistic regression with binary and multinomial outcomes is commonly used, and researchers have long searched for an interpretable measure of the strength of a particular logistic model. This article describes the large sample properties of some pseudo-R2 statistics for assessing the predictive strength of the logistic regression model. We present theoretical results regarding the convergence and asymptotic normality of pseudo-R2s. Simulation results and an example are also presented. The behavior of the pseudo-R2s is investigated numerically across a
  • 2. range of conditions to aid in practical interpretation. Key words and phrases: Entropy, logistic regression, pseudo-R2 1. Introduction Logistic regression for binary and multinomial outcomes is commonly used in health research. Researchers often desire a statistic ranging from zero to one to summarize the overall strength of a given model, with zero indicating a model with no predictive value and one indicating a perfect fit. The coefficient of deter- mination R2 for the linear regression model serves as a standard for such measures (Draper and Smith (1998)). Statisticians have searched for a corresponding in- dicator for models with binary/multinomial outcome. Many different R2 statis- tics have been proposed in the past three decades (see, e.g., McFadden (1973), McKelvey and Zavoina (1975), Maddala (1983), Agresti (1986), Nagelkerke (1991), Cox and Wermuch (1992), Ash and Shwartz (1999), Zheng and Agresti
  • 3. (2000)). These statistics, which are usually identical to the standard R2 when applied to a linear model, generally fall into categories of entropy-based and variance-based (Mittlböck and Schemper (1996)). Entropy- based R2 statistics, also called pseudo-R2s, have gained some popularity in the social sciences (Mad- dala (1983), Laitila (1993) and Long (1997)). McKelvey and Zavoina (1975) proposed a pseudo-R2 based on a latent model structure, where the binary/ multinomial outcome results from discretizing a continuous latent variable that is related to the predictors through a linear model. Their pseudo-R2 is defined as the proportion of the variance of the latent variable that is explained by the 848 BO HU, JUN SHAO AND MARI PALTA covariate. McFadden (1973) suggested an alternative, known as “likelihood- ratio index”, comparing a model without any predictor to a model including all
  • 4. predictors. It is defined as one minus the ratio of the log likelihood with inter- cepts only, and the log likelihood with all predictors. If the slope parameters are all 0, McFadden’s R2 is 0, but it is never 1. Maddala (1983) developed another pseudo-R2 that can be applied to any model estimated by the maximum likelihood method. This popular and widely used measure is expressed as R2M = 1 − ( L(θ̃) L(θ̂) ) 2 n , (1) where L(θ̃) is the maximized likelihood for the model without any predictor and L(θ̂) is the maximized likelihood for the model with all predictors. In terms of
  • 5. the likelihood ratio statistic λ = −2 log(L(θ̃)/L(θ̂)), R2M = 1 − e−λ/n. Maddala proved that R2M has an upper bound of 1 − (L(θ̃))2/n and, thus, suggested a normed measure based on a general principle of Cragg and Uhler (1970): R2N = 1 − ( L(θ̃) L(θ̂) ) 2 n 1 − (L(θ̃)) 2 n . (2) While the statistics in (1) and (2) are widely used, their statistical properties have not been fully investigated. Mittlböck and Schemper (1996) reviewed R2M and R2N along with other measures, but their results are mainly empirical and
  • 6. numerical. The R2 for the linear model is interpreted as the proportion of the variation in the response that can explained by the regressors. However, there is no clear interpretation of the pseudo-R2s in terms of variance of the outcome in logistic regression. Note that both R2M and R 2 N are statistics and thus random. In linear regression, the standard R2 converges almost surely to the ratio of the variability due to the covariates over the total variability as the sample size in- creases to infinity. Once we know the limiting values of R2M and R 2 N , these limits can be similarly used to understand how the pseudo-R2s capture the predictive strength of the model. The pseudo-R2s for a given data set are point estimators for the limiting values that are unknown. To account for the variability in esti- mation, it is desirable to study the asymptotic sampling distributions of R2M and
  • 7. R2N , which can be used to obtain asymptotic confidence intervals for the limiting values of pseudo-R2s. Helland (1987) studied the sampling distributions of R2 statistics in linear regression. In this article we study the behavior of R2M and R 2 N under the logistic re- gression model. In Section 2, we derive the limits of R2M and R 2 N and provide PSEUDO-R2 IN LOGISTIC REGRESSION MODEL 849 interpretations of them. We also present some graphs describing the behavior of R2N across a range of practical situations. The asymptotic distributions of R 2 M and R2N are derived in Section 3 and some simulation results are presented. An example is given in Section 4. 2. What Does Pseudo-R2 Measure In this section we explore the issue of what R2M in (1) and R
  • 8. 2 N in (2) measure in the setting of binary or multinomial outcomes. 2.1. Limits of pseudo-R2s Consider a study of n subjects whose outcomes fall in one of m categories. Let Yi = (Yi1, . . . , Yim) ′ be the outcome vector associated with the ith subject, where Yij = 1 if the outcome falls in the jth category, and Yij = 0 otherwise. We assume that Y1, . . . , Yn are independent and that Yi is associated with a p- dimensional vector Xi of predictors (covariates) through the multinomial logit model Pij = E(Yij|Xi) = exp(αj + X ′ iβj ) ∑m k=1 exp(αk + X ′ iβk)
  • 9. , j = 1, . . . , m, (3) where αm = βm = 0, α1, . . . , αm−1 are unknown scalar parameters, and β1, . . ., βm−1 are unknown p-vectors of parameters. Let θ be the (p + 1)(m − 1) dimen- sional parameter (α1, β ′ 1, . . . , αm−1, β ′ m−1). Then the likelihood function under the multinomial logit model can be written as L(θ) = n ∏ i=1 P Yi1 i1 P Yi2 i2 · · · P Yim im . (4) Procedures for obtaining the maximum likelihood estimator θ̂ of
  • 10. θ are available in most statistical software packages. The following theorem provides the asymp- totic limits of the pseudo-R2s defined in (1) and (2). Its proof is given in the Appendix. Theorem 1. Assume that covariates Xi, i = 1, . . . , n, are independent and identically distributed random p-vectors with finite second moment. If H1 = − m ∑ j=1 E(Pij ) log E(Pij ), (5) H2 = − m ∑ j=1 E(Pij log Pij ), (6) 850 BO HU, JUN SHAO AND MARI PALTA
  • 11. then, as n → ∞, R2M →p 1−e2(H2−H1) and R2N →p (1 − e2(H2−H1))/(1 − e−2H1 ), where →p denotes convergence in probability. 2.2. Interpretation of the limits of pseudo-R2s It is useful to consider whether the limits of pseudo-R2 can be interpreted much as R2 can be for linear regression analysis. Theorem 1 reveals that both R2M and R 2 N converge to limits that can be described in terms of entropy. If the covariates Xis are i.i.d., Yi = (Yi1, . . . , Yim) ′, i = 1, . . . , n, are also i.i.d. multinomial distributed with probability vector (E(Pi1), . . . , E(Pim)) where the expectation is taken over Xi. Then H1 given in (5) is exactly the entropy measuring the marginal variation of Yi. Similarly, − ∑m j=1 Pij log Pij corresponds to the conditional entropy measuring the variation of Yi given Xi and H2 can be considered as the average conditional entropy. Therefore
  • 12. H1 − H2 measures the difference in entropy explained by the covariate X, which is always greater than 0 by Jensen’s inequality, and is 0 if and only if the covariates and outcomes are independent. For example, when (Xi, Yi) is bivariate normal, H1 − H2 = log( √ 1 − ρ2)−1 where ρ is the correlation coefficient, and the limit of R2M is ρ 2. The limit of R2M is 1 − e−2(H1−H2) monotone in increasing H1 − H2. Then we can write the limit of R2N as the limit of R 2 M divided by its upper bound: R2N →p 1 − e−2(H1−H2) 1 − e−2H1 = e2H1 − e2H2 e2H1 − 1 . When both H1 and H2 are small, 1 − e−2(H1−H2) ≈ 2(H1 − H2), 1 − e−2H1 ≈ 2H1
  • 13. and the limit of R2N is approximately (H1 − H2)/H1, the entropy explained by the covariates relative to the marginal entropy H1. 2.3. Limits of R2 M and R2 N relative to model parameters For illustration, we examine the magnitude of the limits of RM and RN under different parameter settings when the Xis are i.i.d. standard normal and the outcome is binary. Figures 1 and 2 show the relationship between the limits of R2M and R 2 N and the parameters α and β. In these figures, profile lines of the limits are given for different levels of the response probability eα/(1 + eα) at the mean of Xi and odds ratio e β per standard deviation of the covariate. The limits tend to increase as the absolute value of β increases with other parameters fixed, which is consistent with the behavior of the usual R2 in linear regression models.
  • 14. However, we note that the limits tend to be low, even in models where the parameters indicate a rather strong association with the outcome. For example, PSEUDO-R2 IN LOGISTIC REGRESSION MODEL 851 a moderate size odds ratio of 2 per standard deviation of Xi is associated with the limit of R2N at most 0.10. As the pseudo-R 2 measures do not correspond in magnitude to what is familiar from R2 for ordinary regression, judgments about the strength of the logistic model should refer to profiles such as those provided in Figures 1 and 2. Knowing what odds ratio for a single predictor model produces the same pseudo-R2 as a given multiple predictor model greatly facilitates subject matter relevance assessment. PSfrag replacements 0 2
  • 15. 4 6 0.0 0.2 0.4 0.6 0.8 e α β e α 1+eα e β Figure 1. Contour plot of limits of R2M against e α/(1 + eα) and odds ratio eβ . 852 BO HU, JUN SHAO AND MARI PALTA PSfrag replacements 0 2 4 6 0.0
  • 16. 0.2 0.4 0.6 0.8 e α β e α 1+eα e β Figure 2. Contour plot of limits of R2N against e α/(1 + eα) and odds ratio eβ. It may be noted that neither R2N nor R 2 M can equal 1, except in degenerate models. This property is a logical consequence of the nature of binary outcomes. The denominator, 1 − (L(θ̃))2/n, equals the numerator when L(θ̂) equals 1, which occurs only for a degenerate outcome that is always 0 or 1. In fact, any perfectly fitting model for binary data would predict probabilities that are only 0 or 1. This constitutes a degenerate logistic model, which cannot be fit. In
  • 17. comparison to PSEUDO-R2 IN LOGISTIC REGRESSION MODEL 853 the R2 for a linear model, R2 of 1 implies residual variance of 0. As the variance and entropy of binomial and multinomial data depend on the mean, this again can occur only when the predicted probabilities are 0 and 1. The mean-entropy dependence influences the size of the pseudo-R2s and tends to keep them away from 1 even when the mean probabilities are strongly dependent on the covariate. For ease of model interpretation, investigators often categorize a continuous variable, which leads to a loss of information. Consider a standard normally distributed covariate. We calculate the limit of R2N when cutting the normal covariate into two, three, five or six categories. The threshold points we choose are 0 for two categories, ±1 for three categories, ±0.5, ±1 for five categories, and 0, ±0.5, ±1 for six categories. In Figures 3 and 4, we plot the
  • 18. corresponding limits of R2N against e α/(1 + eα) by fixing β at 1, and against eβ by setting α = 1. The fewer the number of categories we use for the covariate, the more information we lose, i.e., the smaller the limit of R2N . In this example, we note that using five or six categories retains most of the information provided by the original continuous covariate. PSfrag replacements 0 2 4 6 0.0 0.2 0.4 0.6 0.8 e α
  • 20. 0 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.8 Continuous Two Three Five Six Figure 3. Limit of R2N with covariate N (0, 1) dichotomized into K categories against eα/(1 + eα), K = 2, 3, 5, 6 854 BO HU, JUN SHAO AND MARI PALTA PSfrag replacements
  • 21. 0 2 4 6 0.0 0.2 0.4 0.6 0.8 e α β e α 1+eα eβ 0.00 0.05 0.10 0.15 0.20 0.25
  • 23. into K categories against odds ratio eβ , K = 2, 3, 5, 6 3. Sampling Distributions of Pseudo-R2s The result in the previous section indicates that the limit of a pseudo-R2 is a measure of the predictive strength of a model relating the logistic responses to some predictors (covariates). The quantities R2M and R 2 N are statistics and are random. They should be treated as estimators of their limiting values in assessing the model strength. In this section, we derive the asymptotic distributions of R2M and R2N that are useful for deriving large sample confidence intervals. 3.1. Asymptotic distributions of pseudo-R2s Theorem 2. Under the conditions of Theorem 1, √ n [ R2M − (1 − e2(H2−H1)) ] →d N (0, σ21 ) (7)
  • 24. √ n [ R2N − 1 − e2(H2−H1) 1 − e−2H1 ] →d N (0, σ22 ), (8) PSEUDO-R2 IN LOGISTIC REGRESSION MODEL 855 where H1 and H2 are given by (5) and (6), σ 2 1 = g ′ 1Σg1 and σ 2 2 = g ′ 2Σg2 with g1 = −2e2(H2−H1) (1 + log γ1, . . . , 1 + log γm, −1) , (9) g2 =
  • 25. e−2H1 (1 − e2H2 ) (1 − e−2H1 )2 ( 1 + log γ1, . . . , 1 + log γm, e 2H2 1 − e2H1 1 − e2H2 ) , (10) Σ = ( Cov(Yi) η η′ � ) . (11) Here γj = Ex(Pij ), j = 1, . . . , m, is the expected probability that the outcome falls in jth category, the jth element of η is ηj = Ex(Pij log Pij ) + γj H2, and � = ∑m
  • 26. j=1 Ex ( Pij (log Pij ) 2 ) − H22 . When all the slope parameters βj are 0 (i.e., Xi and Yi are uncorrelated), both σ21 and σ 2 2 are zero. g1, g2 and Σ can be estimated by replacing the un- known quantities, which are related to the covariate distribution, with consistent estimators. For example, γ can be estimated by ( ∑ P̂ i1/n, . . . , ∑ P̂ im/n) ′. Suppose gk, k = 1, 2, and Σ are estimated by ĝk and Σ̂, respectively. Theorem 2 leads to the following asymptotic 100(1 − α)% confidence interval for the limit of R2M : (
  • 27. R2M − Z α 2 ĝ′1Σ̂ĝ1, R 2 M + Z α 2 ĝ′1Σ̂ĝ1 ) , (12) where Zα is the 1 − α quantile of the standard normal distribution. A confidence interval for the limit of R2N can be obtained by replacing R 2 M and ĝ1 in (12) with R2N and ĝ2, respectively. If the resulting lower limit of the confidence interval is below 0 or the upper limit is above 1, it is conventional to use the margin value of 0 or 1. 3.2. Simulation results In this section, we examine by simulation the finite sample performance of the confidence intervals based on the asymptotic results derived in Section 3.1.
  • 28. Our simulation experiments consider the logistic regression model with binary outcome and a single normal covariate with mean 0 and standard deviation 1. All the simulations were run with 3,000 replications of an artificially gener- ated data set. In each replication, we simulated a sample of size 200 or 1,000 from the standard normal distribution as covariate vectors X , and simulated 200 or 1,000 binary outcomes according to success probability exp(α + βX)/(1+ exp(α + βX)). Tables 1 and 2 show the results for different values of α and β. In all the simulations with sample size 1,000, the estimated confidence inter- vals derived by Theorem 2 displayed coverage probability close to the expected level of 0.95. Coverage probability is less satisfactory with sample size 200 when the model is weak. 856 BO HU, JUN SHAO AND MARI PALTA
  • 29. Table 1. Simulation average of pseudo-R2s and 95% confidence intervals in the logit model with normal covariate (sample size=1,000). α β R2M (limit) CI (coverage ∗ ) R2N (limit) CI (coverage ∗ ) 2 0.5 0.028 (0.027) (0.008,0.047) (0.930) 0.052 (0.050) (0.015,0.088)(0.930) 1 0.108 (0.103) (0.069,0.137) (0.937) 0.178 (0.178) (0.121,0.236) (0.940) 2 0.298 (0.298) (0.258,0.338) (0.929) 0.455 (0.454) (0.396,0.513) (0.927) 1 0.5 0.047 (0.046) (0.022,0.071) (0.937) 0.067 (0.066) (0.032,0.103) (0.940) 1 0.151 (0.150) (0.113,0.189) (0.943) 0.213 (0.213) (0.160,0.267) (0.945) 2 0.351 (0.350) (0.312,0.390) (0.922) 0.483 (0.483) (0.430,0.536) (0.929) 0.5 0.5 0.054 (0.053) (0.028,0.080) (0.940) 0.073 (0.072) (0.038,0.109) (0.941) 1 0.166 (0.166) (0.127,0.204) (0.948) 0.224 (0.226) (0.172,0.276) (0.948) 2 0.367 (0.365) (0.327,0.404) (0.931) 0.492 (0.491) (0.443,0.546) (0.933)
  • 30. 0 0.5 0.056 (0.055) (0.030,0.083) (0.938) 0.075 (0.074) (0.039,0.111) (0.938) 1 0.171 (0.171) (0.133,0.210) (0.931) 0.229 (0.228) (0.177,0.281) (0.933) 2 0.371 (0.370) (0.332,0.409) (0.925) 0.490 (0.494) (0.443,0.546) (0.929) ∗ The relative frequency with which the intervals contain the true limit Table 2. Simulation average of pseudo-R2s and 95% confidence intervals in the logit model with normal covariate (sample size=200). α β R2M (limit) CI (coverage) R 2 N (limit) CI (coverage) 2 0.5 0.031 (0.027) (0, 0.074) (0.912) 0.058 (0.050) (0, 0.138) (0.922) 1 0.107 (0.103) (0.0310, 0.182) (0.918) 0.185 (0.178) (0.0580, 0.312) (0.920) 2 0.299 (0.298) (0.2110, 0.387) (0.910) 0.458 (0.454) (0.3290, 0.588) (0.913) 1 0.5 0.050 (0.046) (0, 0.105) (0.915) 0.073 (0.066) (0, 0.151) (0.914) 1 0.152 (0.150) (0.0680, 0.235) (0.928) 0.215 (0.213) (0.0980, 0.332) (0.925) 2 0.351 (0.350) (0.2650, 0.437) (0.930) 0.484 (0.483) (0.3660,
  • 31. 0.601) (0.932) 0.5 0.5 0.058 (0.053) (0, 0.116) (0.919) 0.078 (0.072) (0, 0.157) (0.920) 1 0.168 (0.166) (0.0820, 0.253) (0.927) 0.227 (0.226) (0.1120, 0.343) (0.930) 2 0.367 (0.365) (0.2820, 0.452) (0.925) 0.494 (0.491) (0.3790, 0.608) (0.925) 0 0.5 0.059 (0.055) (0.0010, 0.118) (0.911) 0.079 (0.074) (0.0010, 0.158) (0.912) 1 0.174 (0.171) (0.0880, 0.260) (0.928) 0.233 (0.228) (0.1180, 0.347) (0.930) 2 0.372 (0.370) (0.2870, 0.457) (0.925) 0.496 (0.494) (0.3830, 0.610) (0.922) 4. Example We now turn to an example of logistic regression from Fox’s (2001) text on fitting generalized linear models. This example draws on data from the 1976 U.S. Panel Study of Income Dynamics. There are 753 families in the data set with 8 variables. The variables are defined in Table 3. The logarithm of the wife’s estimated wage rate is based on her actual earnings if she is in the labor force;
  • 32. otherwise this variable is imputed from other predictors. The definition of other variables is straightforward. PSEUDO-R2 IN LOGISTIC REGRESSION MODEL 857 Table 3. Variables in the women labor force dataset. Variable Description Remarks lfp wife’s labor-force participation factor: no,yes k5 number of children ages 5 and younger 0-3, few 3’s k618 number of children ages 6 to 18 0-8, few > 5 age wife’s age in years 30-60, single years wc wife’s college attendance factor: no,yes hc husband’s college attendance factor: no,yes lwg log of wife’s estimated wage rate see text inc family income excluding wife’s income $1, 000s We assume a binary logit model with no labor force participation as the baseline category. Other variables are treated as predictors in the model. The estimated model with all the predictors has the following form: log
  • 33. P 1−P = 3.18−1.47k5−0.07k618−0.06age+0.81wc+0.11hc+0.61lwg−0.03i nc, where P is the probability that the wife in the family is in the labor force. The variables k618 and hc are not statistically significant based on the likelihood-ratio test. Table 4 shows the values of R2M and R 2 N , as well as 95% confidence intervals of limits of R2M and R 2 N , for the model containg all the predictors, and models excluding certain predictors. Table 4. R2M and R 2 N with 95% confidence intervals of models for women labor force data. Model R2M (95% CI.) R 2 N (95% CI.) Use all predictors 0.152 ( 0.109, 0.195) 0.205 ( 0.147, 0.262) Exclude k5 0.074 ( 0.040, 0.108) 0.100 ( 0.054, 0.145)
  • 34. Exclude age 0.123 ( 0.083, 0.164) 0.165 ( 0.111, 0.219) Exclude wc 0.138 ( 0.096, 0.180) 0.185 ( 0.129, 0.241) Exclude lwg 0.133 ( 0.092, 0.174) 0.179 ( 0.123, 0.234) Exclude inc 0.130 ( 0.087, 0.172) 0.175 ( 0.119, 0.230) Exclude k618 0.151 ( 0.108, 0.194) 0.203 ( 0.145, 0.261) Exclude hc 0.152 ( 0.109, 0.195) 0.204 ( 0.146, 0.262) Use k618, hc only 0.003 (-0.005, 0.010) 0.004 (-0.006, 0.013) For the model with all the covariates, R2M and R 2 N are around 0.15 and 0.20, respectively. The results imply a moderately strong model when referencing the odds ratio scale equivalents in Figure 1. Dropping a significant covariate results in a notable decrease in the values of pseudo-R2s, while no significant change occurs if we drop the insignificant covariates. R2M and R 2 N are near zero when we 858 BO HU, JUN SHAO AND MARI PALTA exclude all the significant covariates. However, model selection
  • 35. procedures using pseudo-R2 need further research. Acknowledgements The research work is supported by Grant CA-53786 from the National Cancer Institute. The authors thank the referees and an editor for helpful comments. Appendix For the proof of results in Section 3, we begin with a lemma and then sketch the main steps for Theorem 1 and 2. Lemma 1. Assume that covariates Xi, i = 1, . . . , n, are i.i.d. random p-vectors with finite second moment, then (log L(θ̂) − log L(θ))/ √ n →p 0, where θ̂ is the maximum likelihood estimator of θ. Proof of Lemma 1. We first prove that ∂2 log L(θ)/∂θ∂θ ′ = Op(n). The score function is ∂ log L(θ)
  • 36. ∂θ = ( n ∑ i=1 (Yi1 − Pi1), n ∑ i=1 (Yi1 − Pi1)X′i, . . . , n ∑ i=1 (Yim − Pim)X′i ) ′ . Let ηk = (αk, β ′ k) ′ ∈ Rp+1 for k = 1, . . . , m, and Ui = (1, X′i )′. Then
  • 37. ∂2 log L(θ) ∂ηk∂η ′ k = − n ∑ i=1 Pik(1 − Pik)UiU ′i , k = 1, 2, . . . , m, ∂2 log L(θ) ∂ηk∂η ′ l = − n ∑ i=1 PikPilUiU ′ i , k 6= l. Since UiU ′
  • 38. i = ( 1 X′i Xi XiX ′ i ) , each element in the second derivative matrix ∂2 log L(θ)/∂θ∂θ ′ is Op(n) by assumption. For simplicity, we write this as ∂2 log L(θ)/∂θ∂θ ′ = Op(n). Let Sn(θ) = ∂ log L(θ)/∂θ, Jn(θ) =−∂2 log L(θ)/∂θ∂θ ′ and In(θ) = E(Jn(θ)), where the expectation is taken over covariates. It follows that the cumulative information matrix In(θ) = Op(n). By a second-order Taylor expansion, log L(θ̂) − log L(θ)√
  • 39. n = Sn(θ̂)√ n ′ (θ̂ − θ) − 1 2 √ n (θ̂ − θ)′Jn(θ∗)(θ̂ − θ) = (θ̂ − θ)′In(θ) 1 2 In(θ) − 1 2 √ n Jn(θ ∗ ) 2n 3
  • 40. 2 √ nIn(θ) − 1 2 In(θ) 1 2 (θ̂ − θ), PSEUDO-R2 IN LOGISTIC REGRESSION MODEL 859 where θ∗ is a vector between θ and θ̂. The asymptotic normality results of the MLE gives In(θ) 1/2(θ̂ −θ) → N (0, 1). The lemma then follows from the fact that In(θ) −1/2 √ n = Op(1) and Jn(θ ∗ )/n = Op(1). Proof of Theorem 1. Let f (x) = log(1 − x)/2, then f (R2) = 1
  • 41. n log L(θ̃) − 1 n log L(θ̂) = m ∑ j=1 nj n log( nj n ) − 1 n log L(θ) + 1 n ( log L(θ) − log L(θ̂) ) . The convergence of
  • 42. ∑m j=1(nj /n) log(nj/n) and log L(θ)/n come from the Law of Large Numbers. The results of the theorem follow from the lemma and the Continuous Mapping Theorem. Proof of Theorem 2. Let S2M = 1 − (L(θ̃)/L(θ))2/n and S2N = (1− (L(θ̃)/L(θ))2/n)/(1 − (L(θ̃))2/n). It follows from the lemma that S2M and S2N have the same asymptotic distribution as R2M and R 2 N , respectively, in the sense that √ n(S2M − R2M ) →p 0 and √ n(S2N − R2N ) →p 0. Define Zi = (Yi1, . . . , Yim, Wi) where Wi = ∑m j=1 Yij log Pij . Then Zi’s form a i.i.d. random sequence with µ = E(Zi) = (γ ′, ∑m j=1 E(P1j log P1j )) = (γ ′, −E2),
  • 43. Cov (Zi) = Σ, γ and Σ as defined in Section 3. By the Multidimensional Central Limit Theorem, √ n ( Z ̄ − µ ) → N (0, Σ). (13) Let φ1(x1, . . . , xm) = 1 − e2( ∑ m j=1 xj log xj−xm+1) and φ2(x1, . . . , xm) = (1− e2( ∑ m j=1 xj log xj −xm+1))/(1 − e2 ∑ m j=1 xj log xj ). Applying the delta-method with φ1 and φ2 to (13), respectively, leads to the asymptotic normality
  • 44. results in Theorem 2. References Agresti, A. (1986). Applying R2 type measures to ordered categorical data. Technometrics. 28, 133-138. Ash, A. and Shwartz, M. (1999). R2: a useful measure of model performance when predicting a dichotomous outcome. Statist. Medicine 18, 375-384. Cox, D. R. and Wermuch, N. (1992). A comment on the coefficient of determination for binary responses. Amer. Statist. 46, 1-4. Cragg, J. G. and Uhler, R. S. (1970). The demand for automobiles. Canad. J. Economics 3, 386-406. Draper, N. R. and Smith, H. (1998). Applied Rregression Analysis. 3rd edition. Wiley, New York. Fox, J. (2001). An R and S-Plus Companion to Applied Regression. Sage Publications.
  • 45. 860 BO HU, JUN SHAO AND MARI PALTA Helland, I. S. (1987). On the interpretation and use of R2 in regression analysis, Biometrics 43, 61-69. Laitila, T. (1993). A pseudo-R2 measure for limited and qualitative dependent variable models. J. Econometrics 56, 341-356. Long, J. S. (1997). Regression Models for Categorical and Limited Dependent Variables. Sage Publications. Maddala, G. S. (1983). Limited-Dependent and Qualitative Variables in Econometrics. Cam- bridge University Press, Cambridge. McFadden, D. (1973). Conditional logit analysis of qualitative choice behavior. In Frontiers in Econometrics (Edited by P. Zarembka), 105-42. Academic Press, New York. McKelvey, R. D. and Zavoina, W. (1975). A statistical model for the analysis of ordinal level dependent variables. J. Math. Soc. 4, 103-120. Mittlböck M. and Schemper, M. (1996). Explained variation for logistic regression. Statist.
  • 46. Medicine 15, 1987-1997. Nagelkerke, N. J. D. (1991). A note on a general definition of the coefficient of determination. Biometrika 78, 691-693. Zheng B. Y. and Agresti, A. (2000). Summarizing the predictive power of a generalized liner model. Statist. Medicine 19, 1771-1781. Department of Statistics, University of Wisconsin-Madison, Madison, WI, 53706, U.S.A. E-mail: [email protected] Department of Statistics, University of Wisconsin-Madison, Madison, WI, 53706, U.S.A. E-mail: [email protected] Department of Population Health Sciences, University of Wisconsin-Madison, Madison, WI, 53706, U.S.A. E-mail: [email protected] (Received August 2004; accepted July 2005) 1. Introduction2. What Does Pseudo-R2 Measure2.1. Limits of pseudo-R2s2.2. Interpretation of the limits of pseudo2.3. Limits of R2M and R2N relative to model parameters3. Sampling Distributions of Pseudo-R2s3.1. Asymptotic distributions of pseudo3.2. Simulation results4. ExampleAppendix A note about the References tool in Word
  • 47. On a PC/Windows system (based on Office 2010) When you need to create a citation (giving credit for work that you are referencing), you click on References, then on Insert Citation. The next step is to add a new source. When you get to the "Create Source" window, it is suggested that you click on the "Show All Bibliography Fields." Here is a sample Source screen. Once you have entered all the source information, click on Bibliography and then Insert Bibliography. This is the citation: (Joseph, 2000) This is how the source is entered into the References list: Joseph, J. (2000, October). Ethics in the Workplace. Retrieved August 3, 2015, from asae-The Center for Association Leadership: http://www.asaecenter.org/Resources/articledetail.cfm?ItemNum ber=13073 Other fields on the source page would be used for a journal article or an article from a periodical.
  • 48. On a Mac/OS system (based on Office 2013) From the MAC Help files: To add a citation, a works cited list, or a bibliography to your document, you first add a list of the sources that you used. Add a source by using the Source Manager The Source Manager lists every source ever entered on your computer so that you can reuse them in any other document. This is useful, for example, if you write research papers that use many of the same sources. If you open a document that includes citations, the sources for those citations appear under Current list. All the sources that you have cited, either in previous documents or in the current document, appear under Master list. 1. Open up your Word document. 2. On the Document Elements tab , under References , click Manage. 3. At the bottom of the Citations tool, click , and then click Citation Source Manager .
  • 49. 1 2 3 4 1 2 3 4 4. Click New. 5. On the Type of Source pop-up menu, select a source type. 6. Complete as many of the fields as you want. The required fields are marked with an asterisk (*). These fields provide the minimum information that you must have for a citation. 7. Note You can insert citations even when you do not have all the publishing details. If publishing details are omitted, citations are inserted as numbered placeholders. Then you can edit the sources later. You must enter all the required information for a
  • 50. source before you can create a bibliography. 8. When you are finished, click OK.The source information that you entered appears in the Current list and Master list of the Source Manager. 9. To add additional sources, repeat steps 3 through 6. 10. Click Close.The source information that you entered appears in the Citations List in the Citations tool. Edit a source in the Citations tool You can edit a source directly in the document or in the Citations tool. When you change the source, the changes apply to all instances of that citation throughout the document. However, if you make a manual change to a particular citation within the document, those changes apply only to that particular citation. Also, that particular citation is not updated or overridden when you update the citations and bibliography. 1. On the Document Elements tab, under References, click Manage. 2. In the Citations List, select the citation that you want to edit. 3. At the bottom of the Citations tool, click , and then click Edit Source.
  • 51. 4. Make the changes that you want, and then click OK. If you see a message that asks whether you want to save changes in both the Master list and the Current list, click No to change only the current document, or click Yes to apply changes to the source of the citation and use it in other documents. Remove a source from the Citations List Before you can remove a source from the Citations List, you must delete all related citations. 1. In the document, delete all the citations associated with the source that you want to remove. 2. Tip You can use the search field to locate citations. In the search field , enter part of the citation. 3. On the Document Elements tab, under References, click Manage. 4. At the bottom of the Citations tool, click , and then click Citation Source Manager. 5. In the Current list, select the source that you want to remove, and then click Delete. The source now appears only in the Master list.
  • 52. 6. Note If the Delete button is unavailable, or if you see a check mark next to the source in the list, there is still at least one related citation in the document. Delete all remaining related citations in the document, and then try deleting the source again. 7. Click Close. The source that you removed no longer appears in the Citations List. Step 2. Insert, edit, or delete a citation (optional) Insert a citation 1. In your document, click where you want to insert the citation. 2. On the Document Elements tab, under References, click Manage. 3. In the Citations List, double-click the source that you want to cite. The citation appears in the document. Add page numbers or suppress author, year, or title for a specific citation Use this option to make custom changes to a citation and keep the ability to update the citation automatically.
  • 53. Note The changes that you make by using this method apply only to this citation. 1. Click anywhere between the parentheses of the citation. A frame appears around the citation. 2. Click the arrow on the frame, and then click Edit this Citation. 3. Add page numbers, or select the Author, Year, or Title check box to keep that information from showing in the citation. Make manual changes to a specific citation If you want to change a specific citation manually, you can make the citation text static and edit the citation in any way that you want. After you make the text static, the citation will no longer update automatically. If you want to make changes later, you must make the changes manually. 1. Click anywhere between the parentheses of the citation. A frame appears around the citation. 2. Click the arrow on the frame, and then click Convert Citation to Static Text.
  • 54. 3. In the document, make the changes to the citation. Delete a single citation from the document 1. In the document, find the citation that you want to delete. 2. Tip You can use the search field to locate citations. In the search field , enter part of the citation. 3. Select the whole citation, including the parentheses, and then press DELETE. Step 3. Insert or edit a works cited list or a bibliography A works cited list is a list of all works you referred to (or "cited") in your document, and is typically used when you cite sources using the MLA style. A works cited list differs from a bibliography, which is a list of all works that you consulted when your researched and wrote your document. Insert a works cited list or a bibliography 1. In your document, click where you want the works cited list or bibliography to appear (usually at the very end of the document, following a page break). 2. On the Document Elements tab, under References, click Bibliography, and
  • 55. then click Bibliography or Works Cited. Change a works cited list or a bibliography style You can change the style of all the citations contained in a document's works cited list or bibliography without manually editing the style of the citations themselves. For example, you can change the citations from the APA style to the MLA style. 1. On the View menu, click Draft or Print Layout. 2. On the Document Elements tab, under References, click the Bibliography Style pop-up menu, and then click the style that you want to change the bibliography's references to. All references in your document's bibliography change to the new style. Update a works cited list or a bibliography If you add new sources to the document after you inserted the works cited list or bibliography, you can update the works cited list or bibliography to include the new sources. 1. Click the works cited list or bibliography. A frame appears around it. 2. Click the arrow on the frame, and then click Update Citations
  • 56. and Bibliography. Research Paper Using Word This assignment has two goals: 1) have students, via research, increase their understanding of impacts of information technology on current world issues, and 2) learn to correctly use the tools and techniques within Word to format a research paper, including use of available References and citation tools. These skills will be valuable throughout a student’s academic career. The paper will require a title page, NO abstract, three to five full pages of content with incorporation of a minimum of 3 external resources from credible sources and a Works Cited/References page. Wikipedia and similar general information sites, blogs or discussion groups are not considered creditable sources for a research project. No more than 10% of the paper may be in the form of a direct citation from an external source. Choose your topic from the list of topics that follow these organization steps. Paper organization Open Word and save a blank document with the following name:
  • 57. “Student’s LastNameFirstInitial Research Paper” The paper should be organized in the following way: 1. Title page: a. Center in the middle of the page (horizontally and vertically) the title (subject) of the paper and below that your name 2. Body of the paper: a. Use 12-point Arial font b. Set the margins at 1” c. Entire paper should be double-spaced d. Length – 3-5 full pages, not counting the title page or the References page. e. Include a minimum of 3 APA-formatted citations and related References page. Every reference must be cited at least once, and every citation have an entry in the References list. If you are not familiar with APA format, it is recommended that you use the References feature in Word for your citations and Reference List or refer to the "Citing and Writing" option under the Resources/Library/Get Help area in the LEO classroom. It is important to review the final format for APA-style correctness even if generated by Word. f. Include at least two (2) informational footnotes. Footnotes are not used to list a reference! Footnotes contain information about the topic to which the footnote has been attached. g. Place the references on a separate page following the body of the paper. Note: Use a hard return (CTRL Enter) after the end of your paper body and the start of the
  • 58. References page. 3. Organization of the content of the paper: Include the following sections in the paper (include, in bold, the headings identified here): a. Introduction - Identify the issue or idea. Explain why the topic was selected and what you are trying to achieve (what is your end goal). The introduction should not be more than half a page; details will be discussed in the follow-on areas. b. Areas of interest, activity or issue – Define the issue or idea in greater detail. Define the specific problem or problems or new idea. Identify other underlining or related issues as well as dependencies. Explain what impacts will result if not addressed. c. Research Findings – Summarize your research findings and what they contribute to the study of the issue or idea. You must identify (cite) the sources of the research or class material related to your topic that you include in the findings. d. Proposed solution(s), idea(s), courses of action(s). List solutions, ideas or courses of action with an analysis of its effectiveness (how will your suggestions affect or change the current situation). If more than one idea is suggested, provide an analysis that covers all proposed suggestions. e. Conclusion – Summarize the conclusions of your paper.
  • 59. A list of topics from which students can choose is provided here: Topics for Research Paper The focus of the paper should be on one of the following: 1. How has information technology led to the struggle between online and brick-and-mortar stores? What do the next 5-10 years look like? 2. How has information technology opened up the potential for 5G networks? Are there any downsides to the implementation of this technology? 3. How has information technology impacted the use of robots in your local stores? 4. How has information technology supported the development of monopolies – Amazon, Microsoft, telecom companies? Will these monopolies survive? 5. How has information technology supported the development of facial recognition software and the current issues related to its use? 6. How has information technology led to the use of biometrics and the potential for rise of an International “Big Brother”? 7. How has information technology led to the development of the Internet of Things and the concern about the impact of privacy laws (or lack thereof) on the IOT?
  • 60. 8. How has information technology supported the development of Facebook and other social media sites? Should social media sites be regulated? 9. Who/what is Huawei and what are the issues the U.S. and other countries are having with Huawei? 10. How has information technology changed the political process within the past 5 years? Writing Quality for the Research Paper • All Grammar, Verb Tenses, Pronouns, Spelling, Punctuation, and Writing Competency should be without error. • Be particularly careful about mis-matching a noun and pronoun. For example, if you say "A person does this…" then do not use "their" or "they" when referring to that person. "Person" is singular; "their" or "they" is plural. • Remember: there is not their, your is not you're, its is not it's, too is not to or two, site is not cite, and who should be used after an individual, not that. For example, "the person WHO made the speech" not "the person THAT made the speech." • in the previous sentence. It is more business-like to say "In a professional paper one should not use contractions," rather than saying, "In a professional paper you don't use contractions." • In a professional paper one does not use contractions (doesn't, don't, etc.) and one
  • 61. does not use the personal I, you or your. Use the impersonal as in the previous sentence. It is more business-like to say "In a professional paper one should not use contractions," rather than saying, "In a professional paper you don't use contractions." • Remember: spell-check, then proofread. Better yet, have a friend or colleague read it before submitting it. Read it out loud to yourself. Read it as if you are submitting it to your boss. Grading Criteria Paper Mechanics Format- title pg, font, margins, paper length 0.5 Title page included: Arial 12-point font used; margins set at 1"; body of the paper is 3-5 pages, double spaced. The title page and References page are not counted as part of the 3-5 pages of text. APA work - citations and references 0.5 A minimum of 3 correctly formatted citations matched to
  • 62. references; both citations and references in APA format. Footnotes 0.5 A minimum of 2 footnotes that contain additional information but are NOT references. Mechanics- grammar, spelling, etc. 1.5 Grammar, spellings, and punctuation correct throughout the paper. Content Introduction 2 This is a summary of the topic. Simply identify the issue without going into great detail, explain why was the topic selected and what the you are trying to achieve (what is your end goal). The introduction should not be more than half a page; details will be discussed in the follow-on areas. Issue 2.0 Define the issue or idea. Define the specific problem or problems or new idea. Identify other underlying or related issues as well as dependencies. Explain what impacts will result if not addressed. Findings 2.0 Identify research or class material related to your topic.
  • 63. Summarize your findings and what they contribute to the study of the issue or idea. Sources must be identified in citations and the related References list. Solution s/actions 3.0 List solution, idea or courses of action with an analysis of its effectiveness (how will your suggestions affect or change the current situation). If more than one idea is suggested, provide an analysis that covers all proposed suggestions. Conclusion 2 Summarize the conclusions of your paper. In a paragraph briefly identify the issue, the findings, your proposed solution/actions. However, do not simply repeat the words in the previous sections. You can find instructions on how to use the References tool in Word on a PC or on a Mac in a file included in the Assignment link.