Seminar by
LIVI WILSON
DFK 1305
Discriminant analysis
• Discriminant analysis is a branch of multivariate
statistics. It is a regression-based statistical
technique used to determine which
classification or group an item of data or
an object belongs to, on the basis of
its characteristics or essential features.
• Discriminant Analysis is a multivariate statistical
technique used when the dependent variable is
categorical and the independent variables are
quantitative.
• Discriminant Function Analysis (DA) undertakes the
same task as multiple linear regression: predicting an
outcome.
• In many cases, the dependent variable consists of two groups or
classifications, for example, male versus female, high versus low or
good credit risk versus bad credit risk. To classify between
two groups we use Linear Discriminant Analysis (LDA).
• When there are three or more classifications, for example
naturally occurring groups such as low, medium and high, or
different locations, the technique is referred to as Multiple
Discriminant Analysis (MDA).
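The two-group case can be sketched in a few lines of Python. This is only a minimal illustration, assuming two quantitative predictors, equal prior probabilities and a pooled covariance matrix; the function names (`fit_lda`, `classify`) and the data values are made up for the example and do not come from the seminar.

```python
# Minimal two-group linear discriminant for two predictors (standard library only).
# Data values and names are hypothetical, chosen only for illustration.

def mean_vector(rows):
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]

def scatter_matrix(rows, centre):
    # 2x2 within-group sums of squares and cross-products
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - centre[0], r[1] - centre[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s

def fit_lda(group_a, group_b):
    ma, mb = mean_vector(group_a), mean_vector(group_b)
    sa, sb = scatter_matrix(group_a, ma), scatter_matrix(group_b, mb)
    dof = len(group_a) + len(group_b) - 2
    # Pooled within-group covariance matrix
    p = [[(sa[i][j] + sb[i][j]) / dof for j in range(2)] for i in range(2)]
    det = p[0][0] * p[1][1] - p[0][1] * p[1][0]
    inv = [[p[1][1] / det, -p[0][1] / det],
           [-p[1][0] / det, p[0][0] / det]]
    diff = [ma[0] - mb[0], ma[1] - mb[1]]
    # Discriminant weights: inverse pooled covariance times centroid difference
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    # Cut-off: discriminant score of the midpoint between the two centroids
    mid = [(ma[0] + mb[0]) / 2, (ma[1] + mb[1]) / 2]
    cutoff = w[0] * mid[0] + w[1] * mid[1]
    return w, cutoff

def classify(x, w, cutoff, labels=("A", "B")):
    score = w[0] * x[0] + w[1] * x[1]
    return labels[0] if score > cutoff else labels[1]

good_risk = [(5.0, 2.0), (6.0, 3.0), (7.0, 2.5), (6.5, 3.5)]
bad_risk = [(2.0, 5.0), (3.0, 6.0), (2.5, 7.0), (3.5, 6.5)]
weights, cutoff = fit_lda(good_risk, bad_risk)
print(classify((6.0, 2.0), weights, cutoff, labels=("good", "bad")))
```

A new case is assigned to whichever group's side of the cut-off its discriminant score falls on. With three or more groups (MDA) more than one discriminant function is generally needed, which this sketch does not cover.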
Objective
• Its primary objective is to predict an individual's inclusion in
a group from a set of observations about that individual, using
cases whose group inclusion is already known. This happens through
a process of discriminating one observed variable against the others.
• The second objective is to determine the quality of the
observed variables in the set. Determining variable quality
helps reduce the analysis's margin of error.
Group Inclusion
• For example, to predict the probability that someone will be
included in the group of successful college graduates at a
university, use a known set of observations: scores, high
school grade point average, whether the person has an older
sibling in college, and class rank.
Variable Quality
• The secondary objective of discriminant analysis is to
determine the quality of the variables used in your primary
prediction. In our example, once time has passed and you can
observe the group of successful college graduates, you can
compare your prediction with the actual outcome. You
can then determine which variables were best at predicting an
individual's successful inclusion.
Controlling Predictive Error
• Analyzing the prediction against the actual
results creates a third objective: improving the
predictive model for the future. Using your analysis,
you can show which variables were most relevant to
success.
Advantages
• Discriminates between different groups
• The accuracy of the classification into groups can be determined
• Helps with categorical regression analysis
• Graphical output gives a clear picture of the two or more
categories, backed by computational logic.
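The point about determining classification accuracy can be illustrated by cross-tabulating actual against predicted group labels. The sketch below is an assumption-laden example: the labels, the helper names (`confusion_counts`, `hit_ratio`) and the data are hypothetical, and the predictions are simply typed in rather than produced by a fitted model.

```python
from collections import Counter

def confusion_counts(actual, predicted):
    # Count each (actual group, predicted group) pair
    return Counter(zip(actual, predicted))

def hit_ratio(actual, predicted):
    # Share of cases assigned to their true group
    hits = sum(1 for a, p in zip(actual, predicted) if a == p)
    return hits / len(actual)

# Hypothetical labels for six cases
actual = ["low", "low", "medium", "medium", "high", "high"]
predicted = ["low", "medium", "medium", "medium", "high", "high"]

table = confusion_counts(actual, predicted)
accuracy = hit_ratio(actual, predicted)

for (a, p), count in sorted(table.items()):
    print(f"actual={a:<6} predicted={p:<6} n={count}")
print(f"hit ratio = {accuracy:.3f}")
```

Off-diagonal pairs such as ("low", "medium") show which groups the function confuses, which is often more informative than the overall hit ratio alone.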
Disadvantages
• Linear discrimination cannot be used when subgroups are
stronger.
• The selection of the predictor variables is not strong until
a strong classification exists.
• It cannot be used when there is insufficient data to define
sample means.
• If the number of observations is small, the discrimination
method cannot be used.
Applications
• Widespread application in situations where the primary
objective is identifying the group to which an object belongs
• Agriculture: fisheries, crop studies, yield studies,
geoinformatics, bioinformatics
• Socio-economic and behavioral studies of fishing
communities
• Morphometric analysis and taxonomic investigation
• Dynamics of the marine plankton, algae and
nekton on a spatial and temporal scale
• Stock structure studies in fish populations
• Hydrological and physico-chemical studies in
different water resources
• Bankruptcy prediction based on accounting ratios and other
financial variables (LDA)
• Face recognition (Computerized)
• Marketing: distinguishing different types of customers and
products based on surveys.
Assumptions
• The observations should come from a random sample.
• Each predictor variable is normally distributed.
• There must be at least two groups or categories.
• Each group or category must be well defined and clearly
differentiated from any other group(s).
• It handles large data sets only, and the sample size must not vary.
Hypothesis
• Discriminant analysis tests the following hypotheses:
H0: The group means of a set of independent variables
for two or more groups are equal.
Against
H1: The group means for two or more groups are not
equal
• The mean of a group is referred to as its centroid.
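In symbols, the hypotheses above can be written as follows. The notation is an assumption introduced here for illustration: g groups, with centroid (mean vector) \mu_k for group k. Wilks' lambda is a statistic commonly used for this test; the slides do not name it, so it is shown only as one standard option, with W the within-group and B the between-group sums-of-squares-and-cross-products matrices.

```latex
H_0:\ \boldsymbol{\mu}_1 = \boldsymbol{\mu}_2 = \dots = \boldsymbol{\mu}_g
\qquad \text{against} \qquad
H_1:\ \boldsymbol{\mu}_i \neq \boldsymbol{\mu}_j \ \text{for at least one pair } i \neq j

\Lambda = \frac{\lvert \mathbf{W} \rvert}{\lvert \mathbf{B} + \mathbf{W} \rvert},
\qquad 0 \le \Lambda \le 1
```

Values of \Lambda close to 0 indicate that the centroids differ (evidence against H0), while values close to 1 indicate that the groups are hardly separated.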
Contd…
• If the overlap in the distribution is small, the
discriminant function separates the groups well.
• If the overlap is large, the function is a poor
discriminator between the groups.
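One rough way to put a number on "small" versus "large" overlap is the distance between the two groups' mean discriminant scores, measured in pooled standard deviations. This is a sketch under assumptions: the scores are invented, the helper name `separation` is hypothetical, and this standardized distance is just one possible summary, not a measure taken from the seminar.

```python
from math import sqrt
from statistics import mean, variance

def separation(scores_a, scores_b):
    # Distance between group mean scores in pooled standard deviation units
    na, nb = len(scores_a), len(scores_b)
    pooled_var = ((na - 1) * variance(scores_a)
                  + (nb - 1) * variance(scores_b)) / (na + nb - 2)
    return abs(mean(scores_a) - mean(scores_b)) / sqrt(pooled_var)

# Hypothetical discriminant scores for two pairs of groups
well_separated = separation([1.0, 1.5, 2.0], [6.0, 6.5, 7.0])
overlapping = separation([1.0, 3.0, 5.0], [2.0, 4.0, 6.0])

print(f"well separated: {well_separated:.2f}")
print(f"overlapping:    {overlapping:.2f}")
```

A larger value means the score distributions overlap less, so the function discriminates better; a value near zero means the groups are almost indistinguishable on that function.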
Fig: Discriminant analysis between variables 1, 2 and 3
Editor's Notes

  • #3 Regression-based statistical technique used to determine which classification or group (such as 'ill' or 'healthy') an item of data or an object (such as a patient) belongs to, on the basis of its characteristics or essential features. It differs from group-building techniques such as cluster analysis in that the classifications or groups to choose from must be known in advance. Read more: http://www.businessdictionary.com/definition/discriminant-analysis.html#ixzz3JgMcS9dz