YOU
are WELCOME to the session on
MULTIPLE DISCRIMINANT
ANALYSIS
To begin with…
Let’s start with…
• Bachche Ki Dua is a
1902 Urdu poem
by Allama
Muhammad Iqbal.
• It is a (child's) prayer
to God seeking
benevolent qualities of
character and a life
lived serving
humanity.
We start the session with a
strong belief…
“I believe if we
had a larger
conception of
our possibilities, a
larger faith in
ourselves,
we could
accomplish
infinitely more.”
A Faith…
All power is within you;
you can do anything
and everything.
Let’s start with two
thoughts…
Thought #1:
Thought #2:
You are
missing the
excitement of
life!
This Workshop is a GREAT
OPPORTUNITY to learn!!!
STATISTICS is a journey from
data to WISDOM!
We have to LEARN
STATISTICS to build …
•… our own capabilities and also to
build our students' capabilities!
…this workshop will prove
to be a milestone in our
learning experience!!!
Trust that …
TIME TO TALK ABOUT
RESEARCH
Research is …
……….MEDITATION
Research is …
ISOLATED COOPERATIVE
NARROW WIDE
Research is …
UNSTRUCTURED STRUCTURED
FRUSTRATING SATISFYING
Research is
a journey
from known
to
unknown.
What comes to your mind when you read
such a news item?
Research is about IDEAS!
Again, look at it and tell me what IDEA
comes to your mind?
Now, let’s look at another piece
of information…
What IDEA comes
to your mind?
Is there any commonality in
the ideas about these two?
Are you not interested in
finding out…
• …what makes a loan an NPA (non-performing asset)?
• …what keeps Indians from going for equity
investment?
Also, are you not interested in
predicting…
• …when a loan will become NPA?
• …whether a particular investor is likely to
invest in equity?
How many of you
are interested in it?
Focusing on precise issues…
Do we have a ‘classification’ issue?
Are we looking for a prediction tool?
Can we determine factors discriminating
among certain items or objects?
Do we have a tool
that could help us in
this regard?
Yes, we have a tool that
could help us –
• In finding out which factors discriminate among groups;
• In classification of items and objects; and
• In predicting to which group a particular item will
belong.
And, it is …
Understanding the concept of
Multiple Discriminant Analysis…
Multiple Discriminant Analysis
• It is a technique for generating a linear combination of
independent variables (called a discriminant function)
in such a way that values on it (called discriminant
scores) are similar for sample units within pre-
specified subgroups and are maximally separated
across the subgroups.
• The dependent variable in discriminant analysis is
categorical, and independent variables are interval or
ratio scale variables.
Multiple Discriminant Analysis
(continued)
• Discriminant Analysis is used to analyze relationships
between a non-metric dependent variable and metric
or dichotomous independent variables.
• Discriminant Analysis attempts to use the
independent variables to distinguish among the
groups or categories of the dependent variable.
Understanding how Multiple
Discriminant Analysis works…
Multiple Discriminant Analysis –
how does it work?
• It works by creating a new variable called the discriminant
function score which is used to predict to which group a case
belongs.
• In it, one computes the coefficients for the independent
variables that maximize the measure of distance between the
groups defined by the dependent variable.
• The discriminant function is similar to a regression equation
in which the independent variables are multiplied by
coefficients and summed to produce a score, called the Z score.
Understand the following:
• DISCRIMINANT FUNCTION - is a mathematical
function representing linear combination of
independent variables employed in a discriminant
analysis model.
• DISCRIMINANT WEIGHT (COEFFICIENT) - is a
coefficient of an independent variable in a
discriminant function.
• DISCRIMINANT SCORE - is the value of the
discriminant function obtained for a unit by inserting
the unit's independent variable values into the
discriminant function. It is also known as the Z score.
Multiple Discriminant Analysis
Model…
Z = α + β₁X₁ + β₂X₂ + β₃X₃ + ⋯ + βₖXₖ
• WHERE -
Z = Discriminant Score;
α = Constant (intercept);
Xᵢ = ith Independent Variable; and
βᵢ = Discriminant Coefficient/Weight of the ith Independent Variable
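To make the model concrete, here is a minimal sketch of how a discriminant score is computed for one case. The function name `discriminant_score` and all coefficient and case values are hypothetical, chosen only for illustration.

```python
def discriminant_score(alpha, betas, xs):
    """Return Z = alpha + beta_1*x_1 + ... + beta_k*x_k for one case."""
    if len(betas) != len(xs):
        raise ValueError("need exactly one coefficient per independent variable")
    return alpha + sum(b * x for b, x in zip(betas, xs))


# Hypothetical constant, weights, and case values (not from any real data set).
alpha = -2.0
betas = [0.5, 1.5]
case = [2.0, 1.0]

z = discriminant_score(alpha, betas, case)
print(z)  # -2.0 + 0.5*2.0 + 1.5*1.0
```

The same function works for any number of independent variables, as long as one coefficient is supplied per variable.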
Visualizing how Multiple
Discriminant Analysis works…
• Conceptually, we can visualize the discriminant
function or equation as defining the boundary between
groups.
• In the two-group case, discriminant scores are
standardized, so that if the score falls on one side of the
boundary (standard score less than zero), the case is
predicted to be a member of one group, and if the score
falls on the other side of the boundary (positive standard
score), it is predicted to be a member of the other group.
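The boundary idea can be sketched in a few lines. This is an illustration only: it assumes two groups of equal size, takes the cutoff as the midpoint between the two group centroids of the scores, and uses made-up scores and group labels. The names `midpoint_cutoff` and `predict_group` are hypothetical.

```python
from statistics import mean


def midpoint_cutoff(scores_low, scores_high):
    """Cutoff halfway between the two group centroids (assumes equal group sizes)."""
    return (mean(scores_low) + mean(scores_high)) / 2


def predict_group(z, cutoff, low_label, high_label):
    """Scores below the cutoff go to the low-centroid group, all others to the high one."""
    return low_label if z < cutoff else high_label


# Hypothetical discriminant scores for two known groups.
scores_standard = [-1.5, -0.5, -1.0]   # group with the lower centroid
scores_npa = [1.0, 0.5, 1.5]           # group with the higher centroid

cutoff = midpoint_cutoff(scores_standard, scores_npa)
print(cutoff)
print(predict_group(-0.3, cutoff, "Standard", "NPA"))
print(predict_group(0.8, cutoff, "Standard", "NPA"))
```

With unequal group sizes or unequal prior probabilities the cutoff would shift away from the simple midpoint.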
Look at the following graph…
What do you
say about it?
Understanding assumptions of
Discriminant Analysis…
Assumptions of Multiple Discriminant
Analysis…
1. Two or more groups;
2. At least two cases per group;
3. Any number of discriminating variables, provided that it is
less than the total number of cases minus two;
4. Discriminating or independent variables are measured at
least at the interval level;
5. The covariance matrices for each group must be equal;
6. No discriminating variable may be a linear combination of other
discriminating variables.
7. Each group has been drawn from a population with a
multivariate normal distribution on the discriminating
variables.
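The first three requirements are simple counts and can be checked before running any analysis. The sketch below is one hedged way to do so; the function name `check_basic_requirements` and the sample labels are hypothetical, and the remaining assumptions (equal covariance matrices, no linear dependence, multivariate normality) need separate tests.

```python
from collections import Counter


def check_basic_requirements(labels, n_variables):
    """Check assumptions 1 to 3 from the group labels and the variable count."""
    counts = Counter(labels)
    n_cases = len(labels)
    return {
        "two_or_more_groups": len(counts) >= 2,
        "two_or_more_cases_per_group": all(c >= 2 for c in counts.values()),
        "variables_below_cases_minus_two": n_variables < n_cases - 2,
    }


# Hypothetical group labels for six loan accounts.
labels = ["NPA", "NPA", "Standard", "Standard", "Standard", "NPA"]

ok_report = check_basic_requirements(labels, n_variables=3)
too_many = check_basic_requirements(labels, n_variables=4)
print(ok_report)
print(too_many)
```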
Understanding how a Discriminant
Function is derived…
How to derive the Discriminating
coefficients for a Discriminating Function?
• The discriminant function is derived by using the following rule -
– Determine the values of the discriminating coefficients
in such a manner that the variation between the groups
is maximized and the variation within the
groups is minimized. This rule is known as Fisher’s Criterion.
• For classification in the two-group case, it is also commonly assumed
that each case has an equal (50%) prior probability of belonging to either group.
• Discriminating Coefficients are of two types - one is
Unstandardized Coefficients and another is Standardized
Coefficients.
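For two groups, maximizing between-group variation relative to within-group variation has a closed-form solution: the weights are proportional to the inverse of the pooled within-group scatter matrix times the difference between the group mean vectors. The sketch below works this out for exactly two variables using only the standard library. The data and the names `fisher_direction` and `scatter_2x2` are hypothetical, and statistical packages may scale the weights differently.

```python
from statistics import mean


def group_means(rows):
    """Mean of each of the two variables for one group."""
    return [mean(col) for col in zip(*rows)]


def scatter_2x2(rows, m):
    """Within-group sums of squares and cross-products (s11, s12, s22)."""
    s11 = sum((x - m[0]) ** 2 for x, _ in rows)
    s22 = sum((y - m[1]) ** 2 for _, y in rows)
    s12 = sum((x - m[0]) * (y - m[1]) for x, y in rows)
    return s11, s12, s22


def fisher_direction(group1, group2):
    """Weights proportional to inverse(pooled within scatter) * (mean1 - mean2)."""
    m1, m2 = group_means(group1), group_means(group2)
    a1, b1, c1 = scatter_2x2(group1, m1)
    a2, b2, c2 = scatter_2x2(group2, m2)
    a, b, c = a1 + a2, b1 + b2, c1 + c2          # pooled within-group scatter
    det = a * c - b * b
    if det == 0:
        raise ValueError("within-group scatter is singular (linearly dependent variables)")
    d0, d1 = m1[0] - m2[0], m1[1] - m2[1]
    # Inverse of [[a, b], [b, c]] is (1/det) * [[c, -b], [-b, a]].
    return [(c * d0 - b * d1) / det, (a * d1 - b * d0) / det]


# Hypothetical two-variable data for two groups.
group1 = [(1.0, 1.0), (2.0, 1.0), (1.0, 2.0), (2.0, 2.0)]
group2 = [(4.0, 3.0), (5.0, 3.0), (4.0, 4.0), (5.0, 4.0)]

w = fisher_direction(group1, group2)
scores1 = [w[0] * x + w[1] * y for x, y in group1]
scores2 = [w[0] * x + w[1] * y for x, y in group2]
print(w)
print(scores1, scores2)
```

On these made-up data the projected scores of the two groups do not overlap, which is the separation the rule aims for.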
Unstandardized Vs Standardized
Coefficients
• UNSTANDARDIZED COEFFICIENTS are those applied to
the variables in their original units; the variables are
not rescaled to unit variance. They are used to calculate the
discriminating score. But, they do not provide a
meaningful comparison of the variables' discriminating
power.
• STANDARDIZED COEFFICIENTS provide a meaningful
comparison of the discriminating power of the
independent variables.
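One common way to obtain a standardized coefficient is to multiply the unstandardized coefficient by the pooled within-group standard deviation of its variable. The sketch below assumes that convention; the variable names, coefficients, and data are hypothetical.

```python
from math import sqrt
from statistics import mean


def pooled_within_sd(groups):
    """Pooled within-group standard deviation of one variable.

    groups: one list of values per group.
    """
    ss_within = 0.0
    for g in groups:
        m = mean(g)
        ss_within += sum((x - m) ** 2 for x in g)
    df = sum(len(g) for g in groups) - len(groups)
    return sqrt(ss_within / df)


def standardized_coefficient(b, groups):
    """Unstandardized coefficient times the pooled within-group SD (assumed convention)."""
    return b * pooled_within_sd(groups)


# Hypothetical data: income has a much larger spread than age within groups.
income = [[10.0, 12.0, 14.0], [20.0, 22.0, 24.0]]
age = [[1.0, 2.0, 3.0], [5.0, 6.0, 7.0]]

std_income = standardized_coefficient(0.25, income)  # unstandardized 0.25
std_age = standardized_coefficient(0.4, age)         # unstandardized 0.4
print(std_income, std_age)
```

Here the unstandardized weight for income is the smaller of the two, yet its standardized weight is the larger, which is why only standardized coefficients are compared across variables.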
How many discriminating functions
shall we get?
• If the dependent variable defines two groups, one statistically
significant discriminant function is required to distinguish the
groups; if the dependent variable defines three groups, two
statistically significant discriminant functions are required to
distinguish among the three groups; etc.
• The number of possible discriminant functions in an analysis
is limited to the smaller of the number of independent
variables or one less than the number of groups defined by the
dependent variable.
• If a discriminant function is able to distinguish among groups,
it must have a strong relationship to at least one of the
independent variables.
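The limit on the number of functions is a one-line calculation. A minimal sketch, with a hypothetical function name:

```python
def max_discriminant_functions(n_variables, n_groups):
    """Smaller of (number of independent variables) and (number of groups - 1)."""
    if n_variables < 1 or n_groups < 2:
        raise ValueError("need at least one variable and at least two groups")
    return min(n_variables, n_groups - 1)


print(max_discriminant_functions(5, 2))  # two groups, five variables
print(max_discriminant_functions(5, 3))  # three groups, five variables
print(max_discriminant_functions(2, 4))  # four groups, but only two variables
```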
How to decide the membership of a
particular case: to which group does it
belong?
• There are three methods to classify group membership:
1. Maximum Likelihood Method: Assign a case to group k if its probability
of membership in group k is greater than in any other group (for a 2-group
problem, this means the probability is more than 50%).
2. Fisher (Linear) Classification Functions: Assign a case to
group k if its score on the classification function for group k is greater than
its score on any other group's function.
3. Distance Function: Assign a case to group k if its distance to the
centroid of group k is the smallest.
– Note: SPSS uses the Maximum Likelihood method.
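The distance method is the easiest of the three to sketch. The example below assumes cases and centroids are expressed as discriminant scores on two functions and uses plain Euclidean distance; software may instead use a covariance-adjusted (Mahalanobis) distance on the original variables. The centroid values, labels, and the name `nearest_centroid` are hypothetical.

```python
from math import dist


def nearest_centroid(point, centroids):
    """Return the label of the group whose centroid is closest to the point."""
    return min(centroids, key=lambda label: dist(point, centroids[label]))


# Hypothetical group centroids in the space of two discriminant functions.
centroids = {
    "Standard": (-1.0, 0.0),
    "NPA": (1.5, 0.5),
}

first = nearest_centroid((1.0, 0.0), centroids)
second = nearest_centroid((-0.5, 0.2), centroids)
print(first, second)
```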
Testing the results
statistically…
Testing equality of covariance…
• Box’s M test (also called Box’s Test) is a test for
Equivalence of Covariance Matrices.
• It is a parametric test used to test if two or
more covariance matrices are equal (homogeneous).
• The null hypothesis for this test is that the
covariance matrices of the discriminating variables are
equal across groups. In other words, a non-significant
test result (i.e. one with a large p-value) gives no
evidence against equal covariance matrices.
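For intuition, the Box's M statistic itself can be computed by hand: M = (N − g)·ln|S_pooled| − Σ (nᵢ − 1)·ln|Sᵢ|, where Sᵢ are the group covariance matrices and S_pooled is their pooled estimate. The sketch below is restricted to two variables so the determinant stays simple, uses hypothetical data, and stops at the statistic; the significance test (a chi-square or F approximation) is left to the statistical package.

```python
from math import log
from statistics import mean


def cov_2x2(rows):
    """Sample covariance matrix of two variables as (sxx, sxy, syy)."""
    n = len(rows)
    mx = mean([x for x, _ in rows])
    my = mean([y for _, y in rows])
    sxx = sum((x - mx) ** 2 for x, _ in rows) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in rows) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in rows) / (n - 1)
    return sxx, sxy, syy


def det_2x2(c):
    """Determinant of the symmetric matrix [[sxx, sxy], [sxy, syy]]."""
    return c[0] * c[2] - c[1] ** 2


def box_m(groups):
    """Box's M statistic for two variables measured in several groups."""
    n_total = sum(len(g) for g in groups)
    k = len(groups)
    covs = [cov_2x2(g) for g in groups]
    pooled = tuple(
        sum((len(g) - 1) * c[j] for g, c in zip(groups, covs)) / (n_total - k)
        for j in range(3)
    )
    return (n_total - k) * log(det_2x2(pooled)) - sum(
        (len(g) - 1) * log(det_2x2(c)) for g, c in zip(groups, covs)
    )


# Hypothetical data: g2 has the same spread as g1, g3 is much more spread out.
g1 = [(1.0, 1.0), (2.0, 1.0), (1.0, 2.0), (2.0, 2.0)]
g2 = [(4.0, 3.0), (5.0, 3.0), (4.0, 4.0), (5.0, 4.0)]
g3 = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0), (4.0, 4.0)]

m_equal = box_m([g1, g2])
m_unequal = box_m([g1, g3])
print(m_equal, m_unequal)
```

M is zero (up to rounding) when the group covariance matrices are identical and grows as they diverge.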
Unexplained Variance and
Wilks' lambda
• Wilks' lambda is an index of unexplained variance; the
lower it is, the better the groups are separated.
• Testing of Wilks' lambda: it is done through the
chi-square statistic.
• The associated chi-square statistic tests the
hypothesis that the means of the functions listed are
equal across groups. A small significance value
indicates that the discriminant function does better than
chance at separating the groups.
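For a single discriminant function, Wilks' lambda reduces to the within-group sum of squares of the scores divided by their total sum of squares. The sketch below computes it that way and then applies Bartlett's chi-square approximation, −(N − 1 − (p + g)/2)·ln Λ with p(g − 1) degrees of freedom. The scores and the choice of p = 2 discriminating variables are hypothetical, and the p-value lookup is left to the statistical package.

```python
from math import log
from statistics import mean


def wilks_lambda(score_groups):
    """Within-group SS divided by total SS of the discriminant scores."""
    all_scores = [z for g in score_groups for z in g]
    grand_mean = mean(all_scores)
    ss_total = sum((z - grand_mean) ** 2 for z in all_scores)
    ss_within = 0.0
    for g in score_groups:
        m = mean(g)
        ss_within += sum((z - m) ** 2 for z in g)
    return ss_within / ss_total


def bartlett_chi_square(lam, n_cases, n_variables, n_groups):
    """Bartlett's chi-square approximation and its degrees of freedom."""
    stat = -(n_cases - 1 - (n_variables + n_groups) / 2) * log(lam)
    df = n_variables * (n_groups - 1)
    return stat, df


# Hypothetical discriminant scores for two groups of three cases each.
scores = [[-1.5, -0.5, -1.0], [1.0, 0.5, 1.5]]

lam = wilks_lambda(scores)
chi2, df = bartlett_chi_square(lam, n_cases=6, n_variables=2, n_groups=2)
print(lam, chi2, df)
```

A lambda near 0 means most of the variation in the scores lies between the groups; a lambda near 1 means the groups barely differ.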
Structure Matrix…
• The structure matrix table shows the correlations of
each variable with each discriminant function.
• By identifying the largest absolute correlations
associated with each discriminant function we can see
which variables contribute most to that function and
so interpret what the function measures.
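A structure-matrix entry can be sketched as the pooled within-group correlation between one variable and the discriminant scores: both are centered on their own group means before correlating. This is an assumption about the convention (it is the one SPSS documents for its structure matrix); the data and function names are hypothetical.

```python
from math import sqrt
from statistics import mean


def center_within_groups(groups):
    """Subtract each group's own mean and return one flat list."""
    centered = []
    for g in groups:
        m = mean(g)
        centered.extend(v - m for v in g)
    return centered


def within_group_correlation(var_groups, score_groups):
    """Pooled within-group correlation between a variable and the scores."""
    x = center_within_groups(var_groups)
    z = center_within_groups(score_groups)
    sxz = sum(a * b for a, b in zip(x, z))
    return sxz / sqrt(sum(a * a for a in x) * sum(b * b for b in z))


# Hypothetical values of two variables and the discriminant scores, by group.
x1 = [[1.0, 2.0, 3.0], [5.0, 6.0, 7.0]]
x2 = [[3.0, 2.0, 1.0], [4.0, 4.0, 4.0]]
scores = [[-1.5, -1.0, -0.5], [0.5, 1.0, 1.5]]

r1 = within_group_correlation(x1, scores)
r2 = within_group_correlation(x2, scores)
print(r1, r2)
```

The variable with the larger absolute correlation (here the first one) is the one most closely tied to the function.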
Having understood all
these…
What Next?
Practice…
Go to SPSS
Any question?
That’s all from my side
Trust that you must be feeling
comfortable with Multiple
Discriminant Analysis!!!!!
It is a humble beginning for all of
us……
But, still a long way to go…