Logistic regression is used to model the probability of outcomes in binary and multiclass classification problems. It assumes a linear relationship between the predictors and the log-odds of the target variable. The regression coefficients are estimated by maximum likelihood using an iterative process. Model fit is assessed with measures such as deviance and likelihood ratio tests rather than R^2, with smaller deviance indicating better fit. The predictive ability of a logistic regression model can be evaluated using accuracy from a confusion matrix, cross-validation, and the area under the ROC curve (AUC).
Logistic Regression Model Fitting & Assessment
1. Logistic Regression
The logistic model (or logit model) is used to model the probability of an event with two
possible outcomes, such as alive/dead or healthy/sick. It can be extended to several classes of
events, such as determining whether an image contains a cat, a dog, a lion, etc. Each object
that may be detected in the image is assigned a probability between 0 and 1, and these
probabilities sum to one.
Consider a logistic model with two predictors x1 and x2, and one binary response variable Y,
and write p = P(Y = 1) for the probability of the event. We assume a linear relationship between
the predictor variables and the log-odds of the event. This relationship can be expressed as,
$$\log_b \frac{p}{1 - p} = \beta_0 + \beta_1 x_1 + \beta_2 x_2$$
We can recover the odds by exponentiating the log-odds.
$$\frac{p}{1 - p} = b^{\beta_0 + \beta_1 x_1 + \beta_2 x_2}$$
By simple algebraic manipulation, the probability that Y=1 is,
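$$p = \frac{b^{\beta_0 + \beta_1 x_1 + \beta_2 x_2}}{b^{\beta_0 + \beta_1 x_1 + \beta_2 x_2} + 1} = \frac{1}{1 + b^{-(\beta_0 + \beta_1 x_1 + \beta_2 x_2)}}$$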
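For readers who want the intermediate steps, the manipulation can be written out as follows. The symbol η is a shorthand introduced here for the linear predictor β0 + β1 x1 + β2 x2, and b is the base of the logarithm; neither abbreviation changes the result.

$$
\begin{aligned}
\frac{p}{1 - p} = b^{\eta}
&\;\Longrightarrow\; p = b^{\eta}\,(1 - p) \\
&\;\Longrightarrow\; p\,(1 + b^{\eta}) = b^{\eta} \\
&\;\Longrightarrow\; p = \frac{b^{\eta}}{b^{\eta} + 1} = \frac{1}{1 + b^{-\eta}}
\end{aligned}
$$

The last equality follows from dividing the numerator and the denominator by b^η.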
The above formula shows that once the coefficients β are fixed, we can compute either the
log-odds that Y=1 or the probability that Y=1 for a given observation. The main use case of a
logistic model is to obtain, for a given observation (x1, x2), an estimate of the probability that
Y=1. In most applications, the base b of the logarithm is taken to be e. However, in some cases
it can be easier to communicate the results by working in base 2 or base 10. The bivariate case
above extends directly to a model with more than two predictors,
$$\log_b \frac{p}{1 - p} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_m x_m$$
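As a minimal sketch of the use case described above, the following Python function turns a set of coefficients and one observation into a probability. The function name `predict_probability`, its arguments, and the coefficient values are all hypothetical, chosen only for illustration; the base defaults to e.

```python
import math

def predict_probability(betas, xs, base=math.e):
    """Return p = P(Y = 1) for one observation.

    betas: [beta_0, beta_1, ..., beta_m], intercept first.
    xs:    [x_1, ..., x_m], the predictor values.
    base:  base b of the logarithm used for the log-odds.
    """
    if len(betas) != len(xs) + 1:
        raise ValueError("expected one intercept plus one coefficient per predictor")
    # Linear predictor: the log-odds (in the given base) that Y = 1.
    log_odds = betas[0] + sum(b * x for b, x in zip(betas[1:], xs))
    # Invert the logit: p = 1 / (1 + b^(-log_odds)).
    return 1.0 / (1.0 + base ** (-log_odds))

# Hypothetical coefficients for a two-predictor model (not estimated from data).
betas = [-1.0, 0.5, 0.25]
p = predict_probability(betas, [2.0, 4.0])
odds = p / (1.0 - p)
print(round(p, 4), round(math.log(odds), 4))
```

Here the log-odds are -1 + 0.5·2 + 0.25·4 = 1, so taking the natural logarithm of the recovered odds returns 1 again, which is a quick check that the two forms of the model agree.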
Model fitting
The regression coefficients in a logistic regression model are usually estimated using maximum
likelihood estimation. Unlike linear regression with normally distributed residuals, there is no
closed-form expression for the coefficient values that maximize the likelihood function, so an
iterative process such as Newton's method must be used instead. This process begins with a
tentative solution, revises it slightly to see if it can be improved, and repeats this revision
until no further improvement is made, at which point the process is said to have converged. In
some instances, the model may not reach convergence. Non-convergence indicates that the
coefficients are not meaningful, because the iterative process was unable to find appropriate
solutions. A failure to converge may occur for a number of reasons, such as a large ratio of
predictors to cases, multicollinearity, sparseness, or complete separation.
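The iterative process can be sketched in plain Python as below. This is an illustrative implementation of Newton's method for the logistic log-likelihood with natural-log odds, not the routine of any particular statistics package; the function names and the small data set are assumptions made for the example.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular Hessian (for example, collinear predictors)")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_logistic_newton(rows, ys, max_iter=25, tol=1e-8):
    """Estimate [beta_0, ..., beta_m] by Newton's method.

    rows: list of predictor lists; ys: list of 0/1 responses.
    Returns (betas, converged).
    """
    design = [[1.0] + list(r) for r in rows]  # prepend the intercept column
    k = len(design[0])
    betas = [0.0] * k                         # tentative starting solution
    for _ in range(max_iter):
        grad = [0.0] * k
        hess = [[0.0] * k for _ in range(k)]
        for x, y in zip(design, ys):
            p = sigmoid(sum(b * xi for b, xi in zip(betas, x)))
            w = p * (1.0 - p)
            for i in range(k):
                grad[i] += (y - p) * x[i]          # gradient of the log-likelihood
                for j in range(k):
                    hess[i][j] += w * x[i] * x[j]  # negative Hessian
        step = solve_linear_system(hess, grad)
        betas = [b + s for b, s in zip(betas, step)]
        if max(abs(s) for s in step) < tol:        # no further improvement
            return betas, True
    return betas, False                            # did not converge

# Made-up data: four (x1, x2) cells with 4 cases each and a chosen number of events.
cells = {(0, 0): 1, (1, 0): 2, (0, 1): 2, (1, 1): 3}
rows, ys = [], []
for (x1, x2), events in cells.items():
    for i in range(4):
        rows.append([x1, x2])
        ys.append(1 if i < events else 0)

betas, converged = fit_logistic_newton(rows, ys)
print(converged, [round(b, 4) for b in betas])
```

The example data were chosen so that the answer can be checked by hand: the observed proportions 1/4, 2/4, 2/4 and 3/4 have log-odds -ln 3, 0, 0 and ln 3, which an additive model reproduces exactly with β0 = -ln 3 and β1 = β2 = ln 3. On completely separated data the same loop would either exhaust `max_iter` or hit a numerically singular Hessian, which corresponds to the non-convergence described above.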
Goodness of fit in linear regression models is generally measured using R^2. Since this has no
direct analogue in logistic regression, alternative measures such as deviance and likelihood
ratio tests are used. Deviance is analogous to the sum of squares calculations in linear
regression and is a measure of the lack of fit to the data in a logistic regression model. When a
"saturated" model is available (a model with a theoretically perfect fit), deviance is calculated
by comparing a given model with the saturated model. This computation gives the likelihood-ratio
test statistic:
$$D = -2 \ln \frac{\text{likelihood of the fitted model}}{\text{likelihood of the saturated model}}$$
In the above equation, D represents the deviance and ln the natural logarithm. Because the
likelihood of the fitted model cannot exceed that of the saturated model, the log of this
likelihood ratio is negative or zero, hence the need for a negative sign. D can be shown to
follow an approximate chi-squared distribution. Smaller values indicate better fit, as the fitted
model deviates less from the saturated model. When assessed against a chi-squared distribution,
nonsignificant chi-squared values indicate very little unexplained variance and thus good model
fit. Conversely, a significant chi-squared value indicates that a significant amount of the
variance is unexplained.
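A small sketch of the deviance calculation follows. It assumes ungrouped 0/1 responses, in which case the saturated model assigns probability 1 to every observed outcome and its log-likelihood is 0. The fitted probabilities are hypothetical stand-ins rather than the output of an actual fit, and the chi-squared tail probability is written only for one degree of freedom, as if the fitted model added a single predictor to an intercept-only model.

```python
import math

def log_likelihood(ys, probs):
    """Bernoulli log-likelihood of 0/1 responses given fitted probabilities."""
    return sum(math.log(p) if y == 1 else math.log(1.0 - p)
               for y, p in zip(ys, probs))

def deviance(ys, probs):
    """D = -2 ln(L_fitted / L_saturated).

    Assumption: ungrouped binary data, so ln(L_saturated) = 0.
    """
    return -2.0 * log_likelihood(ys, probs)

def chi2_upper_tail_1df(x):
    """P(X > x) for a chi-squared variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))

ys = [0, 0, 1, 0, 1, 1, 0, 1]
fitted = [0.1, 0.2, 0.6, 0.4, 0.7, 0.8, 0.3, 0.9]  # hypothetical model output
null = [sum(ys) / len(ys)] * len(ys)               # intercept-only probabilities

d_fitted = deviance(ys, fitted)
d_null = deviance(ys, null)
lr_stat = d_null - d_fitted            # drop in deviance relative to the null model
p_value = chi2_upper_tail_1df(lr_stat)  # illustrative only, see the assumptions above
print(round(d_null, 3), round(d_fitted, 3), round(lr_stat, 3))
```

Comparing the deviance of a model with that of the intercept-only model is a common way to apply the likelihood-ratio idea in practice, since the drop in deviance shows how much the predictors reduce the lack of fit.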
Assessing the predictive ability of the model
One straightforward way to assess predictive ability is to compute the confusion matrix and
evaluate the accuracy. One can also resort to k-fold cross-validation. As a last step, one may
plot the ROC (receiver operating characteristic) curve and calculate the AUC (area under the
curve), which are typical performance measures for a binary classifier. The ROC curve is
generated by plotting the true positive rate (TPR) against the false positive rate (FPR) at
various threshold settings, and the AUC is the area under that curve. As a rule of thumb, a
model with good predictive ability should have an AUC closer to 1 (the ideal value) than to 0.5
(the value expected from random guessing).
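These measures can be computed with a few lines of plain Python, as in the sketch below. The function names, the labels and the predicted probabilities are invented for the example; the AUC is obtained with the trapezoidal rule over the ROC points.

```python
def confusion_counts(ys, probs, threshold=0.5):
    """Return (tp, fp, tn, fn), predicting 1 when prob >= threshold."""
    tp = fp = tn = fn = 0
    for y, p in zip(ys, probs):
        predicted = 1 if p >= threshold else 0
        if predicted == 1 and y == 1:
            tp += 1
        elif predicted == 1 and y == 0:
            fp += 1
        elif predicted == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn

def roc_points(ys, probs):
    """(FPR, TPR) pairs obtained by sweeping the threshold over the scores."""
    positives = sum(ys)
    negatives = len(ys) - positives
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    for t in sorted(set(probs), reverse=True):
        tp, fp, tn, fn = confusion_counts(ys, probs, t)
        points.append((fp / negatives, tp / positives))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Made-up labels and predicted probabilities.
ys = [0, 0, 1, 0, 1, 1, 0, 1]
probs = [0.1, 0.6, 0.4, 0.3, 0.7, 0.8, 0.2, 0.9]

tp, fp, tn, fn = confusion_counts(ys, probs)
accuracy = (tp + tn) / len(ys)
area = auc(roc_points(ys, probs))
print((tp, fp, tn, fn), accuracy, area)
```

Accuracy depends on the chosen threshold (0.5 here), whereas the AUC summarizes performance over all thresholds, which is one reason the two are usually reported together.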