Logistic Ordinal Regression

Wendy C Wong

Michal K and Nidhi M

Table of Contents
• Ordinal Regression

• Building Linear Models for Ordinal Regression

• Linear models used

• Model parameter updates

• Model predictions

• H2O implementation

• Example and results

What is Ordinal Regression?
• Ordinal regression (also called ordinal classification or
ranking learning) is a regression analysis used to predict an
ordinal variable (a variable where the relative ordering
between different values is significant).

• Ordinal regression is used most often in the social sciences
to model human levels of preference or satisfaction (for
example, levels 1-5 for very poor, poor, average, good, excellent).

Linear Models used for Ordinal Regression
• Let $x_i$ be our predictor of size p and $y_i$ be the associated
ordinal response. Note: $y_i$ takes values from 1 to K.

• A GLM is used to fit ONE coefficient vector $\beta$ for all classes of
the ordinal response variable and a set of thresholds $\theta_j$ to a data
set.

• Model the CUMULATIVE PROBABILITY as the logistic function:

$$P(y_i \le j \mid x_i) = \sigma(\beta^T x_i + \theta_j) = \frac{1}{1 + \exp(-\beta^T x_i - \theta_j)} = \gamma_{ij}$$

• Note that the separating hyperplanes are parallel for all
classes. The increasing vector of thresholds
$\theta_1 < \theta_2 < \ldots < \theta_{K-1}$ is
used to separate all the classes.

• Other choices for the cumulative probability are the ordered
probit (standard normal distribution) and proportional hazards:

$$P(y_i \le j \mid x_i) = 1 - \exp(-\exp(\beta^T x_i + \theta_j))$$
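The cumulative logistic model above can be sketched in a few lines of NumPy. This is only an illustration, not the H2O code: the function names (`sigmoid`, `cumulative_probs`, `class_probs`) and the toy values of `X`, `beta` and `theta` are made up for this example. Per-class probabilities are obtained as differences of consecutive cumulative probabilities, with the cumulative probability taken as 0 below class 1 and 1 at class K.

```python
import numpy as np

def sigmoid(z):
    """Logistic function sigma(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

def cumulative_probs(X, beta, theta):
    """Return gamma with gamma[i, j-1] = P(y_i <= j | x_i) for j = 1..K-1."""
    eta = X @ beta                                  # beta^T x_i, shape (N,)
    return sigmoid(eta[:, None] + theta[None, :])   # shape (N, K-1)

def class_probs(X, beta, theta):
    """Return P(y_i = j | x_i) for j = 1..K as an (N, K) array."""
    gamma = cumulative_probs(X, beta, theta)
    n_rows = X.shape[0]
    # Pad with P(y <= 0) = 0 and P(y <= K) = 1, then take differences.
    padded = np.hstack([np.zeros((n_rows, 1)), gamma, np.ones((n_rows, 1))])
    return np.diff(padded, axis=1)

# Toy values (hypothetical): 3 rows, p = 2 predictors, K = 4 classes.
X = np.array([[0.5, -1.0], [2.0, 0.3], [-1.5, 0.8]])
beta = np.array([-1.0, 0.5])
theta = np.array([-1.0, 0.0, 1.5])   # theta_1 < theta_2 < theta_3

gamma = cumulative_probs(X, beta, theta)
probs = class_probs(X, beta, theta)
print(gamma)
print(probs)
```

Because one `beta` is shared by all classes, each row of `gamma` is increasing in j whenever `theta` is increasing, which is what keeps the class probabilities non-negative.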

Model Parameters Updates
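• The likelihood function:

$$\prod_{i=0}^{N-1} \mathrm{pdf}(y_i = \mathrm{yresp}_i)$$

• The log-likelihood function is

$$\sum_{i=0}^{N-1} \log\left(\sigma(\beta^T x_i + \theta_{\mathrm{yresp}_i}) - \sigma(\beta^T x_i + \theta_{\mathrm{yresp}_i - 1})\right)$$

• The pdfs for the two end classes are:

• for j = 1: $\mathrm{pdf}(y_i = 1) = \sigma(\beta^T x_i + \theta_1)$

• for j = K: $\mathrm{pdf}(y_i = K) = 1 - P(y_i \le K-1 \mid x_i) = 1 - \sigma(\beta^T x_i + \theta_{K-1})$

• To find the model parameters, maximize the log-likelihood
function minus your favorite regularization penalties. Take
the derivative of this objective with respect to each model
parameter and update that parameter by adding the learning
rate times its derivative (gradient ascent).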
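The update rule can be illustrated with a small NumPy sketch of gradient ascent on the log-likelihood. It is a sketch under several assumptions, not H2O's solver: regularization penalties are left out, the gradient is averaged over rows, the learning rate and step count are arbitrary, the data are simulated from the cumulative logistic model, and all names (`log_likelihood`, `gradients`, `learning_rate`) are invented here. It uses the identity $\partial \log(u - l) / \partial(\beta^T x_i) = 1 - u - l$, where u and l are the logistic values at the upper and lower thresholds of the observed class.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bounds(X, y, beta, theta):
    """sigma at the upper and lower threshold of each row's observed class."""
    # theta_0 = -inf and theta_K = +inf give sigma = 0 and sigma = 1.
    padded = np.concatenate(([-np.inf], theta, [np.inf]))
    eta = X @ beta
    upper = sigmoid(eta + padded[y])       # sigma(beta^T x_i + theta_{y_i})
    lower = sigmoid(eta + padded[y - 1])   # sigma(beta^T x_i + theta_{y_i - 1})
    return upper, lower

def log_likelihood(X, y, beta, theta):
    upper, lower = bounds(X, y, beta, theta)
    return np.sum(np.log(upper - lower))

def gradients(X, y, beta, theta):
    """Derivatives of the log-likelihood with respect to beta and theta."""
    upper, lower = bounds(X, y, beta, theta)
    p = upper - lower
    grad_beta = X.T @ (1.0 - upper - lower)
    grad_theta = np.zeros_like(theta)
    for j in range(1, theta.size + 1):
        is_upper = y == j        # theta_j is the upper threshold of class j
        is_lower = y == j + 1    # theta_j is the lower threshold of class j+1
        grad_theta[j - 1] = (
            np.sum(upper[is_upper] * (1 - upper[is_upper]) / p[is_upper])
            - np.sum(lower[is_lower] * (1 - lower[is_lower]) / p[is_lower])
        )
    return grad_beta, grad_theta

# Simulated data (hypothetical): N = 200 rows, p = 2, K = 4 classes.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
true_beta = np.array([1.0, -0.5])
true_theta = np.array([-1.0, 0.0, 1.0])
gamma = sigmoid((X @ true_beta)[:, None] + true_theta[None, :])
y = 1 + np.sum(rng.uniform(size=(200, 1)) > gamma, axis=1)   # values in 1..K

beta = np.zeros(2)
theta = np.array([-0.5, 0.0, 0.5])
learning_rate = 0.1
ll_start = log_likelihood(X, y, beta, theta)
for _ in range(300):
    grad_beta, grad_theta = gradients(X, y, beta, theta)
    beta = beta + learning_rate * grad_beta / len(y)     # ascent step
    theta = theta + learning_rate * grad_theta / len(y)
ll_end = log_likelihood(X, y, beta, theta)
print(ll_start, ll_end, beta, theta)
```

Nothing in this sketch forces the thresholds to stay ordered; it relies on a small step size, since the log-likelihood drops sharply as two thresholds that bracket an observed class approach each other.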

Model Predictions
• The log proportional odds is:

$$\log\frac{\gamma_{ij}}{1 - \gamma_{ij}} = \log\frac{\frac{1}{1 + \exp(-\beta^T x_i - \theta_j)}}{1 - \frac{1}{1 + \exp(-\beta^T x_i - \theta_j)}} = \beta^T x_i + \theta_j$$

• When the proportional odds > 1 (log(.) > 0), it implies that
it is more probable that the data point $x_i$ belongs to class
j or lower than to classes j+1 and beyond.

• This implies that a data point $x_i$ is classified as:

• class K: $\beta^T x_i + \theta_{K-1} \le 0$

• class j (>= 1 and <= K-1): $\beta^T x_i + \theta_j > 0$ and,
for j > 1, $\beta^T x_i + \theta_{j-1} \le 0$
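With increasing thresholds, the log odds $\beta^T x_i + \theta_j$ increase with j, so the predicted class is the smallest j whose log odds are positive, and class K when none are. A minimal NumPy sketch of that rule follows; the function name `predict_classes` and the toy values are assumptions made for illustration.

```python
import numpy as np

def predict_classes(X, beta, theta):
    """Predict classes 1..K from the signs of beta^T x_i + theta_j."""
    logits = (X @ beta)[:, None] + theta[None, :]    # shape (N, K-1)
    positive = logits > 0
    n_classes = theta.size + 1
    # argmax returns the index of the first True in each row (0-based).
    first_positive = np.argmax(positive, axis=1) + 1
    # Rows with no positive log odds belong to the last class K.
    return np.where(positive.any(axis=1), first_positive, n_classes)

# Toy values (hypothetical): one predictor, K = 4 classes.
X = np.array([[2.0], [0.5], [-0.5], [-2.0]])
beta = np.array([1.0])
theta = np.array([-1.0, 0.0, 1.0])

predictions = predict_classes(X, beta, theta)
print(predictions)   # [1 2 3 4]
```

For the second row, for example, the log odds are (-0.5, 0.5, 1.5): the first positive value is at j = 2, so the row is assigned to class 2.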

Alternate Model Parameters Optimization
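• I decided to modify the model parameters to directly
increase the probability of correct predictions.

• Hence, I will minimize the error function

$$\sum_{i=0}^{N-1} L(\beta, \theta, x_i, \mathrm{yresp}_i)$$

where

• for correct prediction, $L(\beta, \theta, x_i, \mathrm{yresp}_i) = 0$. This is the case when
$\beta^T x_i + \theta_j \le 0$ for $j < \mathrm{yresp}_i$ and
$\beta^T x_i + \theta_j > 0$ for $j \ge \mathrm{yresp}_i$

• for incorrect prediction, $L(\beta, \theta, x_i, \mathrm{yresp}_i) = (\beta^T x_i + \theta_j)^2$. This is the case when
$\beta^T x_i + \theta_j > 0$ for $j < \mathrm{yresp}_i$, or
$\beta^T x_i + \theta_j \le 0$ for $j \ge \mathrm{yresp}_i$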
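A short NumPy sketch of this error function is given below. The slide does not say how a row with several wrong-signed thresholds is handled, so the sketch assumes the squared terms are summed over every threshold j with the wrong sign. That choice, the function name `sqerr_loss`, and the toy data are assumptions and may differ from what H2O does.

```python
import numpy as np

def sqerr_loss(X, y, beta, theta):
    """Sum of (beta^T x_i + theta_j)^2 over thresholds with the wrong sign."""
    logits = (X @ beta)[:, None] + theta[None, :]     # shape (N, K-1)
    j = np.arange(1, theta.size + 1)[None, :]         # threshold indices 1..K-1
    should_be_positive = j >= y[:, None]              # correct sign is > 0
    # Wrong sign: <= 0 where it should be > 0, or > 0 where it should be <= 0.
    violated = np.where(should_be_positive, logits <= 0, logits > 0)
    return np.sum(np.where(violated, logits ** 2, 0.0))

# Toy values (hypothetical): one predictor, K = 4 classes.
X = np.array([[2.0], [0.5], [-0.5], [-2.0]])
beta = np.array([1.0])
theta = np.array([-1.0, 0.0, 1.0])

loss_all_correct = sqerr_loss(X, np.array([1, 2, 3, 4]), beta, theta)
loss_one_wrong = sqerr_loss(X, np.array([4, 2, 3, 4]), beta, theta)
print(loss_all_correct, loss_one_wrong)   # 0.0 14.0
```

In the second call the first row has log odds (1, 2, 3) but its response is class 4, so all three thresholds have the wrong sign and contribute 1 + 4 + 9 = 14.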

H2O Implementation
• To use ordinal regression, set family="ordinal".

• To update model parameters using the likelihood function, do not set solver or
set solver to "GRADIENT_DESCENT_LH".

• To update model parameters using the other (squared error) loss function, set solver to
"GRADIENT_DESCENT_SQERR".

• Gradient descent is a first-order method; use grid search to find a good learning rate
and regularization values (lambda, alpha).

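• In R:
# Y: response column, X: predictor columns, Dtrain: training H2OFrame
ordinal.fit <- h2o.glm(y=Y, x=X, training_frame=Dtrain,
                       family="ordinal",
                       solver="GRADIENT_DESCENT_SQERR")

• In Python:
from h2o.estimators.glm import H2OGeneralizedLinearEstimator

# Y: response column, X: predictor columns, Dtrain: training H2OFrame
ordinal_fit = H2OGeneralizedLinearEstimator(family="ordinal",
                                            solver="GRADIENT_DESCENT_LH")
ordinal_fit.train(y=Y, x=X, training_frame=Dtrain)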

Summary/Results
Table 1

Dataset                  LH performance   SQERR performance   R ordinal
5 columns with enum      0.9959           0.99751
5 numerical columns      0.99968          0.999445
20 columns with enums    0.998            0.999155
Multinomial dataset      0.47372          0.45527
nidhi dataset            0.5675           0.58                0.5775

References
• Peter McCullagh, "Regression Models for Ordinal Data", J.
R. Statist. Soc. B (1980), 42, No. 2, pp. 109-142

• Wikipedia, "Ordinal Regression"

• Alan Agresti, "Analysis of Ordinal Categorical Data", John
Wiley & Sons, Inc., July 2012
