Tayyaba Latif
ANOVA
Sir Ronald A. Fisher developed the procedure known as analysis of variance (ANOVA).
An analysis of variance (ANOVA) is used to compare the means of two or more independent samples and to test whether the differences between the means are statistically significant.
• One-way ANOVA is also called single-factor analysis of variance.
• It is an extension of the t-test for independent samples.
• It is used when there are two or more independent groups.
HYPOTHESES FOR THE ONE-WAY ANOVA
The tests are non-directional: the null hypothesis specifies that
H0: all means are equal (μ1 = μ2 = ... = μK)
and the alternative hypothesis simply states that
Ha: at least one mean is different.
In other words, H0 states that the means of the populations from which the K samples are selected are equal, i.e., that each of the group means is equal.
K = number of levels of the independent variable
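The test of H0 rests on the F ratio of between-group to within-group variability. The following is a minimal sketch of that calculation using only the Python standard library; the function name `one_way_f` and the three small groups are illustrative assumptions, not data from these slides.

```python
from statistics import mean

def one_way_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    `groups` is a list of lists, one list of DV scores per level of the IV.
    """
    k = len(groups)                               # number of levels (K)
    n_total = sum(len(g) for g in groups)         # total sample size (N)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Between-groups sum of squares: group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-groups (error) sum of squares: scores around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Made-up illustrative scores for three groups (not from the slides)
example_groups = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
f_stat, df_b, df_w = one_way_f(example_groups)
print(f_stat, df_b, df_w)  # 3.0 2 6
```

A large F relative to the critical value of the F distribution with (K - 1, N - K) degrees of freedom leads to rejecting H0; the sketch stops at the statistic and does not compute a p-value.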
• IV: Nominal/Categorical
o Measured at the nominal scale
o 2 or more categorical, independent groups
o Defines the groups that are compared
Examples
• Political preference
• Place of residence
• Profession
• DV: Interval/Ratio
o One dependent variable
o Measured at the interval or ratio scale (i.e., it is continuous)
Examples
• Temperature in degrees Celsius
• Exam performance (measured from 0 to 100)
• Weight (measured in kg)
• Salary (measured in rupees)
Usama and Co. is a leading manufacturer of cookies. The company has launched a new product in four major cities of Pakistan: Karachi, Gujranwala, Multan, and Lahore. After one month, the company realized that the price per pack of cookies differs across cities. To make a quick inference, the company collected price data from six randomly selected stores across the four cities.
Independent Variable: City (Karachi, Gujranwala, Multan, Lahore)
Dependent Variable: Price
• Cases are randomly sampled or assigned
• Each group is a simple random sample from its population.
• Each individual in the population has an equal probability of being selected for the sample.
• This reduces the chance that differences in materials or conditions strongly bias the results.
• Random samples are more likely to be representative of the population, so you can be more confident in statistical inferences drawn from a random sample.
Here are some common approaches to making sure a sample is randomly created:
• Systematic selection (every nth unit, or at specific times during the day).
• Avoiding the use of judgment or convenience to select samples.
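The two selection approaches can be sketched in a few lines of Python. This is only an illustration: the store identifiers, the sampling frame of 30 stores, and the helper names `simple_random_sample` and `systematic_sample` are all hypothetical.

```python
import random

def simple_random_sample(units, n, seed=None):
    """Draw n units without replacement; every unit is equally likely to be chosen."""
    rng = random.Random(seed)  # a seed makes the draw reproducible
    return rng.sample(units, n)

def systematic_sample(units, step, start=0):
    """Take every `step`-th unit, beginning at index `start`."""
    return units[start::step]

# Hypothetical sampling frame of 30 stores in one city
stores = [f"store_{i:02d}" for i in range(1, 31)]

picked = simple_random_sample(stores, 6, seed=42)   # no judgment involved
every_fifth = systematic_sample(stores, 5)          # every 5th store in the list
print(picked)
print(every_fifth)
```

In both cases the selection is driven by a rule or a random number generator rather than by the researcher's judgment or convenience.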
• Cases are independent of each other
• Commonly referred to as the assumption of independence.
• The observations are random, independent samples from the populations.
• Independence means that the value of one observation does not influence or affect the value of other observations.
• There is no relationship between the observations in each group or between the groups themselves.
Example:
• One person's score should not provide any clue as to how any of the other people will score.
• That is, one event does not depend on another.
• Normality of DV for each level of IV
• Commonly referred to as the assumption of normality.
• The populations from which the samples are selected are normally distributed.
• Your dependent variable should be approximately normally distributed for each category of the IV being compared.
• Test for normality using the Shapiro-Wilk test.
• Normality of DV for each level of IV
• Cell sizes are relatively equal
• Sample size is sufficient (rule of thumb: error df > 20)
Example
• Usama and Co. is a leading manufacturer of cookies. The company has launched a new product in four major cities of Pakistan: Karachi, Gujranwala, Multan, and Lahore. After one month, the company realized that the price per pack of cookies differs across cities. To make a quick inference, the company collected price data from six randomly selected stores across the four cities.
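As a rough check of the sample-size rule of thumb for this example, the error degrees of freedom can be worked out by hand. The calculation below assumes that six stores were sampled in each of the four cities, so that N = 24 and K = 4; the slides do not state this split explicitly.

```latex
df_{\text{error}} = N - K = 24 - 4 = 20
```

Under that assumption the error df equals 20 rather than exceeding it, so the rule of thumb (error df > 20) is at best marginally satisfied.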
Check the violations
• According to Donaldson (1968), the violation can be ignored because the one-way ANOVA test is robust.
• Homogeneity of Variance
• Population variances at the different levels of the IV are equal
• Cell sizes are relatively equal
• F-max < 10
F-Max Calculation
• F-max = larger variance / smaller variance
• F-max = 0.6416650816 / 0.27499536
• F-max = 2.33, which is less than 10, so this assumption is also met.
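The same ratio can be reproduced with a short Python sketch. The two variances are the ones quoted in the calculation above; the helper name `f_max` and the default threshold argument are illustrative choices.

```python
def f_max(variances):
    """F-max: ratio of the largest to the smallest group variance."""
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be positive")
    return max(variances) / min(variances)

def homogeneity_ok(variances, threshold=10):
    """Apply the F-max < 10 rule of thumb quoted in the slides."""
    return f_max(variances) < threshold

# Largest and smallest group variances from the cookie-price example
ratio = f_max([0.6416650816, 0.27499536])
print(round(ratio, 2))                               # 2.33
print(homogeneity_ok([0.6416650816, 0.27499536]))    # True
```

If all four city variances were available, they could be passed in together; only the largest and smallest affect the result.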
• Homogeneity of Variance
• Referred to as the assumption of homogeneity of variance.
• This means that the population variances in each group are equal.
• Test this assumption in SPSS Statistics using Levene's test for homogeneity of variances.
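Outside SPSS, the Levene statistic can be sketched by hand: it is a one-way ANOVA F ratio computed on the absolute deviations of each score from its group mean. The code below is a minimal standard-library sketch of the mean-centered version; the function name `levene_w` and the example scores are made up for illustration, and no p-value is computed.

```python
from statistics import mean

def levene_w(groups):
    """Mean-centered Levene statistic W for a list of groups of scores."""
    # Step 1: absolute deviation of every score from its own group mean
    z = [[abs(x - mean(g)) for x in g] for g in groups]

    # Step 2: one-way ANOVA F ratio on those deviations
    k = len(z)
    n_total = sum(len(g) for g in z)
    grand = sum(sum(g) for g in z) / n_total
    between = sum(len(g) * (mean(g) - grand) ** 2 for g in z)
    within = sum((v - mean(g)) ** 2 for g in z for v in g)
    return ((n_total - k) / (k - 1)) * between / within

# Made-up scores: the second group is clearly more spread out than the others
w = levene_w([[1, 2, 3], [0, 3, 6], [2, 3, 4]])
print(w)
```

W is compared against the F distribution with (K - 1, N - K) degrees of freedom; a significant result suggests that the group variances differ.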
Two tests are applicable when the assumption of homogeneity of variances has been violated:
(1) Welch ANOVA, which also uses a different post-hoc test (i.e., Games-Howell).
(2) Brown-Forsythe test.
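To show how the Welch approach differs from the classical F test, here is a minimal standard-library sketch of the Welch statistic. Each group is weighted by its sample size divided by its own variance, so no pooled variance is needed. The function name `welch_anova` and the example scores are illustrative assumptions, and the p-value step is omitted.

```python
from statistics import mean, variance

def welch_anova(groups):
    """Return (F, df1, df2) for Welch's one-way ANOVA."""
    k = len(groups)
    n = [len(g) for g in groups]
    m = [mean(g) for g in groups]
    # Weight each group by n_i / s_i^2 (sample variance of that group)
    w = [ni / variance(g) for ni, g in zip(n, groups)]
    w_sum = sum(w)
    weighted_mean = sum(wi * mi for wi, mi in zip(w, m)) / w_sum

    # Correction term shared by the denominator and the error df
    lam = sum((1 - wi / w_sum) ** 2 / (ni - 1) for wi, ni in zip(w, n))

    numerator = sum(wi * (mi - weighted_mean) ** 2 for wi, mi in zip(w, m)) / (k - 1)
    denominator = 1 + 2 * (k - 2) / (k ** 2 - 1) * lam
    f_stat = numerator / denominator
    df1 = k - 1
    df2 = (k ** 2 - 1) / (3 * lam)   # usually not a whole number
    return f_stat, df1, df2

# Made-up scores with unequal spread across three groups
f_w, df1, df2 = welch_anova([[1, 2, 3], [0, 4, 8], [4.5, 5, 5.5]])
print(f_w, df1, df2)
```

The statistic is referred to the F distribution with (df1, df2) degrees of freedom; because df2 depends on the group variances, it shrinks as the variances become more unequal.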
Assumptions underlying the one-way ANOVA
