Discriminant analysis is a multivariate
statistical technique used for classifying a
set of observations into predefined groups.
It is used to understand group differences and to
predict the likelihood that a particular entity
will belong to a particular class or group
based on independent variables.
1) The main purpose is to classify a subject into
one of the two groups on the basis of some
independent variables.
2) A second purpose of the discriminant analysis
is to study the relationship between group
membership and the variables used to
predict the group membership.
Situations for its use
The dependent variable is
dichotomous or multichotomous.
The independent variables are metric, i.e.
interval or ratio.
Application of discriminant analysis
To identify the characteristics on the basis
of which one can classify an individual as:
1. Basketballer or volleyballer on the basis
of anthropometric variables.
2. High or low performer on the basis of skill.
3. Juniors or seniors category on the basis
of the maturity parameters.
What we do in discriminant analysis
It is also known as discriminant function analysis.
In discriminant analysis, the dependent variable is
a categorical variable, whereas independent
variables are metric.
After developing the discriminant model, for a
given set of new observations the discriminant
function Z is computed, and the subject/object is
assigned to the first group if the value of Z is less than 0
and to the second group if it is more than 0. This criterion
holds true if an equal number of observations are
taken in both the groups for developing the model.
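The assignment rule can be sketched in Python as below. The function names are hypothetical. The weighted cutoff for unequal group sizes is an assumption taken from common textbook practice, not from these slides.

```python
def assign_group(z, cutoff=0.0):
    """Return 1 (first group) if z is below the cutoff, else 2 (second group)."""
    return 1 if z < cutoff else 2


def weighted_cutoff(n1, n2, centroid1, centroid2):
    """Cutoff for unequal group sizes (assumed textbook formula).

    n1, n2: number of observations in each group.
    centroid1, centroid2: mean Z score of each group.
    With n1 == n2 this reduces to the midpoint of the two centroids.
    """
    return (n1 * centroid2 + n2 * centroid1) / (n1 + n2)


# Equal group sizes with symmetric centroids: the cutoff is 0, as in the rule above.
equal_cutoff = weighted_cutoff(10, 10, -1.0, 1.0)
first = assign_group(-0.5, equal_cutoff)   # falls in the first group
second = assign_group(0.7, equal_cutoff)   # falls in the second group
```

With unequal groups the cutoff moves away from 0, which is why the Z < 0 rule is only stated for equal group sizes.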
1. Sample size
Group sizes of the dependent variable should not be
grossly different, e.g. 80:20; in such cases logistic
regression may be preferred.
Sample size should be at least five times the number of
independent variables.
2. Normal distribution
Each of the independent variables is normally
distributed.
3. Homogeneity of variances / covariances
All variables have linear and homoscedastic
relationships.
4. Outliers
Outliers should not be present in the data.
DA is highly sensitive to the inclusion of
outliers.
5. Non-multicollinearity
There should not be any correlation among the
independent variables.
6. Mutually exclusive
The groups must be mutually exclusive, with
every subject or case belonging to only one
group.
7. Classification
Each of the allocations for the dependent
categories in the initial classification is
correctly classified.
8. Variability
No independent variable should have zero
variability in either of the groups formed by
the dependent variable.
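Some of these assumptions can be screened before running the analysis. The sketch below is only an illustration: the function name, the data layout, and the 4:1 threshold used for "grossly different" (taken from the 80:20 example) are assumptions.

```python
from statistics import pvariance


def check_basic_assumptions(groups, min_ratio=5, max_imbalance=4.0):
    """Screen sample size, group balance and zero variability.

    groups: dict mapping a group label to a list of rows,
            each row being a list of predictor values.
    Returns a list of plain-text warnings (empty if nothing is flagged).
    """
    sizes = {label: len(rows) for label, rows in groups.items()}
    n_total = sum(sizes.values())
    n_predictors = len(next(iter(groups.values()))[0])
    issues = []

    # Sample size: at least five times the number of predictors.
    if n_total < min_ratio * n_predictors:
        issues.append("sample size below five times the number of predictors")

    # Group sizes: flag splits as uneven as 80:20 (assumed threshold).
    if max(sizes.values()) / min(sizes.values()) >= max_imbalance:
        issues.append("group sizes grossly different")

    # Variability: no predictor may be constant within a group.
    for label, rows in groups.items():
        for j in range(n_predictors):
            if pvariance([row[j] for row in rows]) == 0:
                issues.append(f"predictor {j} has zero variability in group {label}")
    return issues
```

Normality, homogeneity of covariances and outliers need proper statistical tests and are not covered by this sketch.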
1) Variables in the analysis
2) Discriminant function
A discriminant function is a latent variable
which is constructed as a linear combination
of independent variables, such that
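Z = c + b1X1 + b2X2 + … + bnXn
where c is a constant, b1, …, bn are the discriminant
coefficients and X1, …, Xn are the independent variables.
The discriminant function is also known as
canonical root. This discriminant function is
used to classify the subjects/cases into one of
the two groups on the basis of the observed
values of the predictor variables.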
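A minimal sketch of how c and the b coefficients could be obtained for two groups and two predictors is given below. It assumes Fisher's linear discriminant with a pooled covariance matrix; the function names are hypothetical, and the coefficients are unstandardized, so they may be scaled differently from the output of a statistics package.

```python
def mean_vector(rows):
    """Mean of each of the two predictors."""
    n = len(rows)
    return [sum(row[j] for row in rows) / n for j in range(2)]


def scatter_matrix(rows, mean):
    """Sums of squares and cross-products around the group mean (2 x 2)."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for row in rows:
        d = [row[0] - mean[0], row[1] - mean[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def fit_two_group_function(group1, group2):
    """Return (c, [b1, b2]) so that Z < 0 points to group1 and Z > 0 to group2."""
    m1, m2 = mean_vector(group1), mean_vector(group2)
    s1, s2 = scatter_matrix(group1, m1), scatter_matrix(group2, m2)
    dof = len(group1) + len(group2) - 2
    pooled = [[(s1[i][j] + s2[i][j]) / dof for j in range(2)] for i in range(2)]

    # Invert the 2 x 2 pooled covariance matrix explicitly.
    det = pooled[0][0] * pooled[1][1] - pooled[0][1] * pooled[1][0]
    inv = [[pooled[1][1] / det, -pooled[0][1] / det],
           [-pooled[1][0] / det, pooled[0][0] / det]]

    # b = inverse(pooled covariance) * (mean2 - mean1)
    diff = [m2[0] - m1[0], m2[1] - m1[1]]
    b = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]

    # c places Z = 0 halfway between the two group means.
    mid = [(m1[0] + m2[0]) / 2, (m1[1] + m2[1]) / 2]
    c = -(b[0] * mid[0] + b[1] * mid[1])
    return c, b


def discriminant_score(c, b, x):
    """Z = c + b1*X1 + b2*X2 for one observation x."""
    return c + sum(bi * xi for bi, xi in zip(b, x))
```

Calling fit_two_group_function on two lists of [X1, X2] rows and then discriminant_score on a new observation gives a Z value whose sign indicates the predicted group.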
3) Classification matrix
In DA, it serves as a yardstick in measuring the
accuracy of a model in classifying an individual/case
into one of the two groups. It is also known
as the confusion matrix, assignment matrix, or
prediction matrix. It tells us what percentage
of the existing data points are correctly classified
by the model developed in DA.
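The classification matrix and the percentage of correct classification can be computed as in the sketch below. The function names and the small example data are illustrative only.

```python
def classification_matrix(actual, predicted, labels):
    """Count cases by actual group (rows) and predicted group (columns)."""
    matrix = {a: {p: 0 for p in labels} for a in labels}
    for a, p in zip(actual, predicted):
        matrix[a][p] += 1
    return matrix


def percent_correct(matrix):
    """Percentage of cases on the diagonal, i.e. correctly classified."""
    total = sum(sum(row.values()) for row in matrix.values())
    hits = sum(matrix[label][label] for label in matrix)
    return 100.0 * hits / total


# Made-up group memberships for eight cases.
actual = [1, 1, 1, 2, 2, 2, 2, 1]
predicted = [1, 1, 2, 2, 2, 1, 2, 1]
matrix = classification_matrix(actual, predicted, labels=[1, 2])
accuracy = percent_correct(matrix)  # 6 of 8 cases correct, i.e. 75.0
```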
4) Stepwise method of discriminant analysis
The discriminant function can be developed either by
entering all independent variables together or
stepwise, depending upon whether the study is
confirmatory or exploratory, respectively.
5) Power of discriminatory variables
After developing the model in the discriminant
analysis based on the selected independent
variables, it is important to know the relative
importance of the variables so selected.
6) Box’s M Test
By using Box’s M test, we test the null hypothesis that
the covariance matrices do not differ between
groups formed by the dependent variable. If
Box’s M test is not significant, it indicates that the
assumption of equal covariance matrices required for DA holds true.
7) Eigenvalues
The eigenvalue is an index of overall model fit; the larger
it is, the better the function discriminates between the groups.
8) Wilks’ lambda
It measures the efficiency of the discriminant
function in the model.
Its value shows what proportion of the
variability in the dependent variable is not
explained by the independent variables.
9) Canonical correlation
The canonical correlation is the multiple
correlation between the predictors and the
discriminant function. With only one function it
provides an index of overall model fit, and its
square (R2) is interpreted as the proportion of
variance explained.
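For a single discriminant function, the eigenvalue, the canonical correlation and Wilks’ lambda are tied together by standard identities. The sketch below illustrates them; the function names and the eigenvalue of 3.0 are made up for the example.

```python
import math


def canonical_correlation(eigenvalue):
    """Canonical correlation for one function: sqrt(eigenvalue / (1 + eigenvalue))."""
    return math.sqrt(eigenvalue / (1.0 + eigenvalue))


def wilks_lambda(eigenvalue):
    """Wilks' lambda for one function: 1 / (1 + eigenvalue)."""
    return 1.0 / (1.0 + eigenvalue)


eigenvalue = 3.0  # illustrative value, not from any data set
rc = canonical_correlation(eigenvalue)
lam = wilks_lambda(eigenvalue)

# Explained and unexplained proportions of variance add up to 1.
explained = rc ** 2
unexplained = lam
```

A larger eigenvalue therefore means a higher canonical correlation and a smaller Wilks’ lambda.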
STEPS IN ANALYSIS :
In step one the independent variables
which have the
discriminating power are selected.
In step two a discriminant
function model is
developed by using
the coefficients of
the independent variables.
In step three
Wilks’ lambda is
computed to test the significance of the discriminant function.
In step four the
variables most important
in discriminating the
groups are being
identified.
In step five classification of subjects to their
respective group is being made.
APPLICATION OF SPSS
E.g. to classify players into different categories
during the selection process.