CHI-SQUARE DISTRIBUTION
Bipul Kumar Sarker
Lecturer
BBA Professional
Habibullah Bahar University College
Chapter-07, Part-02
Introduction:
• The chi-square test is one of the most commonly used non-parametric tests. The sampling distribution of its test statistic is a chi-square distribution when the null hypothesis is true.
• It was introduced by Karl Pearson as a test of association. The Greek letter χ² is used to denote this test.
• It can be applied when there are few or no assumptions about the population parameter.
• It can be applied to categorical or qualitative data using a contingency table.
• It is used to evaluate unpaired (unrelated) samples and proportions.
Bipul Kumar Sarker, Lecturer (BBA Professional), HBUC
Definition:
The chi-square (χ²) test is one of the simplest and most widely used non-parametric tests in statistical work.
The χ² test was first used by Karl Pearson in the year 1900. The quantity χ² describes the magnitude of the discrepancy between theory and observation.
It is defined as:
$$\chi^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i}$$
Where,
$O_i$ refers to the observed frequencies
and $E_i$ refers to the expected frequencies.
Note:
• If χ² is zero, the observed and expected frequencies coincide exactly.
• The greater the discrepancy between the observed and expected frequencies, the greater the value of χ².
Chi-square Distribution:
The square of a standard normal variate is a chi-square variate with 1 degree of freedom, i.e. if X is normally distributed with mean μ and standard deviation σ, then
$$\left(\frac{X - \mu}{\sigma}\right)^2$$
is a chi-square variate (χ²) with 1 d.f. The distribution of chi-square depends on the degrees of freedom: there is a different distribution for each number of degrees of freedom.
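The relation between a squared standard normal variate and χ² with 1 d.f. can be checked numerically. The sketch below is only an illustration: the function names `standardized_square` and `chi2_1df_cdf` are made up for this example, and it relies on the identity P(Z² ≤ x) = P(|Z| ≤ √x) = erf(√(x/2)) for a standard normal Z.

```python
import math


def standardized_square(x: float, mu: float, sigma: float) -> float:
    """Return ((x - mu) / sigma) ** 2, a chi-square variate with 1 d.f. when X ~ N(mu, sigma)."""
    return ((x - mu) / sigma) ** 2


def chi2_1df_cdf(x: float) -> float:
    """P(Z**2 <= x) for a standard normal Z, i.e. the chi-square CDF with 1 d.f."""
    if x <= 0:
        return 0.0
    # Z**2 <= x is the same event as |Z| <= sqrt(x),
    # and P(|Z| <= a) = erf(a / sqrt(2)).
    return math.erf(math.sqrt(x / 2.0))


# An observation 2 units above a mean of 10 with sigma = 2 is one standard deviation away.
print(standardized_square(12, 10, 2))
# Probability that a chi-square variate with 1 d.f. stays below 3.84.
print(round(chi2_1df_cdf(3.84), 3))
```

The second value comes out very close to 0.95, which is why 3.84 serves as the 5% critical value for 1 d.f.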
Properties of Chi-square distribution:
1. The mean of the χ² distribution is equal to the number of degrees of freedom (n).
2. The variance of the χ² distribution is equal to twice the number of degrees of freedom, i.e. 2n.
3. The median of the χ² distribution divides the area under the curve into two equal parts, each of area 0.5.
4. The mode of the χ² distribution is equal to (n − 2), for n ≥ 2.
5. Since chi-square values are never negative and have no upper limit, the chi-square curve is always positively skewed.
6. Since chi-square values tend to increase as the degrees of freedom increase, there is a new chi-square distribution for every increase in the number of degrees of freedom.
7. The lowest value of chi-square is zero and there is no upper limit, i.e. 0 ≤ χ² < ∞.
#5. The χ² distribution is not symmetrical and all its values are non-negative. The distribution is described by its degrees of freedom: each number of degrees of freedom gives a different asymmetric curve.
8. As the degrees of freedom increase, the chi-square curve approaches a normal
distribution.
9. When $\chi_1^2$ and $\chi_2^2$ are independent χ² variates with $n_1$ and $n_2$ degrees of freedom, their sum $\chi_1^2 + \chi_2^2$ follows a χ² distribution with $(n_1 + n_2)$ degrees of freedom.
10. When n (d.f.) > 30, the distribution of $\sqrt{2\chi^2}$ approximately follows a normal distribution with mean $\sqrt{2n - 1}$ and standard deviation 1.
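The large-sample property can be turned into a quick approximation for critical values when n > 30. The following is a minimal sketch, assuming that √(2χ²) is treated as exactly normal with mean √(2n − 1) and standard deviation 1; the function name `approx_chi2_quantile` is hypothetical.

```python
import math
from statistics import NormalDist


def approx_chi2_quantile(p: float, n: int) -> float:
    """Approximate the p-quantile of chi-square with n d.f. (intended for n > 30).

    If sqrt(2 * chi2) ~ Normal(sqrt(2n - 1), 1), then
    chi2_p is roughly 0.5 * (z_p + sqrt(2n - 1)) ** 2.
    """
    z = NormalDist().inv_cdf(p)  # standard normal quantile
    return 0.5 * (z + math.sqrt(2 * n - 1)) ** 2


# Upper 5% point for 40 degrees of freedom.
approx = approx_chi2_quantile(0.95, 40)
print(round(approx, 2))
```

For 40 d.f. this gives roughly 55.5, compared with about 55.76 in standard χ² tables, so the approximation is close but not exact.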
Applications of a chi-square test:
This test can be used for:
i. Goodness of fit of distributions
ii. Test of independence of attributes
iii. Test of homogeneity
Conditions for applying the χ² test:
1. The data must be in the form of frequencies.
2. All the items in the sample must be independent.
3. N, the total frequency, should be reasonably large, say greater than 50.
4. No theoretical (expected) cell frequency should be less than 5. If any is less than 5, it should be pooled with adjacent frequencies so that the pooled value is 5 or more.
5. The χ² test is wholly dependent on the degrees of freedom.
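Condition 4 (pooling small expected frequencies) can be sketched in code. This is one possible pooling rule, not a prescribed one: adjacent classes are merged from left to right until each pooled expected frequency reaches the minimum, and any leftover tail is merged into the last pooled class. The function name and the data are hypothetical.

```python
def pool_small_cells(observed, expected, minimum=5.0):
    """Merge adjacent classes until every pooled expected frequency is >= minimum."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    pooled_o, pooled_e = [], []
    acc_o = acc_e = 0.0
    for o, e in zip(observed, expected):
        acc_o += o
        acc_e += e
        if acc_e >= minimum:
            # The running class is now large enough, so close it.
            pooled_o.append(acc_o)
            pooled_e.append(acc_e)
            acc_o = acc_e = 0.0
    if acc_e > 0 or acc_o > 0:
        # A tail that never reached the minimum joins the last closed class.
        if pooled_e:
            pooled_o[-1] += acc_o
            pooled_e[-1] += acc_e
        else:
            pooled_o.append(acc_o)
            pooled_e.append(acc_e)
    return pooled_o, pooled_e


# Hypothetical frequencies: the first and last expected values are below 5.
obs = [2, 6, 20, 15, 5, 2]
exp = [3, 7, 18, 14, 6, 2]
pooled_obs, pooled_exp = pool_small_cells(obs, exp)
print(pooled_obs, pooled_exp)
```

After pooling, the degrees of freedom should be based on the number of pooled classes (four in this illustration), not the original six.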
Chi Square formula:
The chi-squared test is used to determine whether there is a significant difference between the expected frequencies and the observed frequencies in one or more categories.
The value of χ² is calculated as:
$$\chi^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i}$$
The observed frequencies are the frequencies obtained from observation, i.e. the sample frequencies.
The expected frequencies are the frequencies calculated under the null hypothesis.
Steps in solving problems related to Chi-Square test
STEP 1
• Calculate the expected frequencies.
STEP 2
• Take the difference between the observed and expected frequencies and obtain the squares of these differences, (O − E)².
STEP 3
• Divide the values obtained in Step 2 by the respective expected frequency E, and add all the values to get χ² according to the formula:
$$\chi^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i}$$
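The three steps above can be sketched as a short function. The die-roll counts below are hypothetical data chosen only for illustration, and the expected frequencies assume a fair die (equal frequencies in all six classes).

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all classes."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    total = 0.0
    for o, e in zip(observed, expected):
        if e <= 0:
            raise ValueError("expected frequencies must be positive")
        squared_difference = (o - e) ** 2  # Step 2
        total += squared_difference / e    # Step 3
    return total


# Hypothetical counts of faces 1..6 in 120 rolls of a die.
observed_rolls = [22, 17, 20, 26, 22, 13]
# Step 1: under the fair-die assumption every face is expected 120 / 6 = 20 times.
expected_rolls = [sum(observed_rolls) / 6] * 6

stat = chi_square_statistic(observed_rolls, expected_rolls)
print(round(stat, 2))
```

With six frequencies and one constraint (the fixed total), this statistic would be compared with the tabulated χ² value for 5 degrees of freedom.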
Degree of Freedom:
It denotes the extent of independence (freedom) enjoyed by a given set of observed frequencies. Suppose we are given a set of n observed frequencies which are subjected to k independent constraints (restrictions); then,
$$\text{d.f.} = (\text{number of frequencies}) - (\text{number of independent constraints on them}) = n - k$$
In other terms, for a contingency table,
$$\text{d.f.} = (r - 1)(c - 1)$$
Where,
r = the number of rows
c = the number of columns
Example:
$$\text{d.f.} = (\text{number of frequencies}) - (\text{number of independent constraints on them})$$
$$\text{d.f.} = 6 - 1 = 5$$
Example:
In other terms,
$$\text{d.f.} = (r - 1)(c - 1) = (2 - 1)(2 - 1) = 1$$
Where,
r = the number of rows
c = the number of columns
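Both degrees-of-freedom rules can be written as small helper functions. The names `df_from_constraints` and `df_contingency` are made up for this sketch; the two calls reproduce the examples above.

```python
def df_from_constraints(num_frequencies: int, num_constraints: int = 1) -> int:
    """d.f. = number of frequencies - number of independent constraints."""
    return num_frequencies - num_constraints


def df_contingency(rows: int, cols: int) -> int:
    """d.f. = (r - 1)(c - 1) for an r x c contingency table."""
    return (rows - 1) * (cols - 1)


print(df_from_constraints(6, 1))  # six frequencies, one constraint
print(df_contingency(2, 2))       # 2 x 2 table
```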
Contingency table
A table prepared by enumerating qualitative data and entering the actual frequencies, where the table represents the occurrence of two sets of events, is called a contingency table. It is also called an association table.
• A contingency table is a table in matrix format that displays the frequency distribution of the variables.
• It provides a basic picture of the interrelation between two variables and can help find interactions between them.
• The chi-square statistic compares the observed count in each table cell to the count that would be expected under the assumption of no association between the row and column classifications.
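Building a contingency table by enumeration can be illustrated with a few lines of code. The records below are hypothetical (income level, school type) pairs invented for this sketch, and `contingency_table` is an illustrative helper, not a library function.

```python
from collections import Counter


def contingency_table(pairs, row_levels, col_levels):
    """Count how often each (row level, column level) pair occurs."""
    counts = Counter(pairs)
    return [[counts[(r, c)] for c in col_levels] for r in row_levels]


# Hypothetical raw records: one (income, school) pair per family.
records = [
    ("Low", "Private"), ("Low", "Government"), ("High", "Private"),
    ("Low", "Government"), ("High", "Private"), ("High", "Government"),
]

table = contingency_table(records, ["Low", "High"], ["Private", "Government"])
row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]

print(table)       # rows: Low, High; columns: Private, Government
print(row_totals)
print(col_totals)
```

The row and column totals produced here are the marginal totals that the expected counts under no association are computed from.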
Table/Critical values of χ²
Testing the ratio of variance
Exercise-01:
Two random samples drawn from two normal populations are:
Sample-I:  20 16 26 27 22 23 18 24 19 25
Sample-II: 27 33 42 35 32 34 38 28 41 43 30 37
Obtain the estimates of the variance of the population and test at the 5% level of significance whether the two populations have the same variance.
Solution:
We want to test the null hypothesis,
H0: The two samples are drawn from two populations having the same variance, i.e. $\sigma_1^2 = \sigma_2^2$
vs. H1: The two samples are drawn from two populations having different variances, i.e. $\sigma_1^2 \neq \sigma_2^2$
Table for necessary calculation
X_i    X_i − X̄_i    (X_i − X̄_i)²    X_j    X_j − X̄_j    (X_j − X̄_j)²
20     -2           4               27     -8           64
16     -6           36              33     -2           4
26     4            16              42     7            49
27     5            25              35     0            0
22     0            0               32     -3           9
23     1            1               34     -1           1
18     -4           16              38     3            9
24     2            4               28     -7           49
19     -3           9               41     6            36
25     3            9               43     8            64
                                    30     -5           25
                                    37     2            4
Totals: ΣX_i = 220, Σ(X_i − X̄_i)² = 120, ΣX_j = 420, Σ(X_j − X̄_j)² = 314
Now,
$$\bar{x}_i = \frac{\sum x_i}{n_1} = \frac{220}{10} = 22$$
$$\bar{x}_j = \frac{\sum x_j}{n_2} = \frac{420}{12} = 35$$
We know the test statistic:
The statistic F is defined by the ratio
$$F = \frac{S_2^2}{S_1^2}$$
which follows the F-distribution with d.f. = $(v_2 - 1,\ v_1 - 1)$.
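Level of Significance:
Let us consider the level of significance α = 5% = 0.05.
Calculated Value:
From the table, $S_1^2 = 120/9 \approx 13.33$ and $S_2^2 = 314/11 \approx 28.55$, so $F_{cal} = S_2^2 / S_1^2 \approx 2.14$.
Critical Value (Tabulated Value):
At the 5% level of significance, with $v_1 = 10$ and $v_2 = 12$, the critical value of the F-distribution is $F_{v_2-1,\ v_1-1} = F_{11,9} \approx 3.10$.
Comment:
Since $F_{cal} < F_{tab}$, we fail to reject (accept) the null hypothesis at the 5% level of significance and conclude that the two samples may be regarded as drawn from populations having the same variance.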
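The variance-ratio calculation for Exercise-01 can be reproduced with the standard library. This sketch assumes Sample-I contains the value 19 that appears in the calculation table, and it follows the convention of placing the larger sample variance in the numerator; the function name `variance_ratio` is hypothetical. The critical value still has to be read from an F table.

```python
from statistics import variance

# Sample-I as used in the calculation table (assumed to include 19).
sample_1 = [20, 16, 26, 27, 22, 23, 18, 24, 19, 25]
sample_2 = [27, 33, 42, 35, 32, 34, 38, 28, 41, 43, 30, 37]


def variance_ratio(a, b):
    """Return (F, numerator d.f., denominator d.f.) with the larger variance on top."""
    var_a = variance(a)  # unbiased estimate: sum of squared deviations / (n - 1)
    var_b = variance(b)
    if var_a >= var_b:
        return var_a / var_b, len(a) - 1, len(b) - 1
    return var_b / var_a, len(b) - 1, len(a) - 1


f_cal, df_num, df_den = variance_ratio(sample_1, sample_2)
print(round(variance(sample_1), 2), round(variance(sample_2), 2))
print(round(f_cal, 2), df_num, df_den)
```

The two variance estimates are 120/9 and 314/11, giving an F ratio of about 2.14 on (11, 9) degrees of freedom.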
Exercise-02: (BBA Professional - 2016)
Two random samples drawn from normal populations are:
1st Sample: 22 24 34 36 45 18
2nd Sample: 27 28 33 24 47 17 16 20
Test whether the two populations have the same variance.
Exercise-03:
Solution:
We want to test the null hypothesis,
H0: There is no difference in the variance of the yield of wheat, i.e. $\sigma_1^2 = \sigma_2^2$
vs. H1: There is a difference in the variance of the yield of wheat, i.e. $\sigma_1^2 \neq \sigma_2^2$
We know the test statistic:
The statistic F is defined by the ratio
$$F_0 = \frac{S_1^2}{S_2^2}$$
which follows the F-distribution with d.f. = $(v_1 - 1,\ v_2 - 1)$.
Level of Significance:
Let us consider the level of significance α = 5% = 0.05.
Critical Value (Tabulated Value):
At the 5% level of significance, the critical value of the F-distribution is $F_{v_1-1,\ v_2-1} = F_{7,5} = 4.88$.
Comment:
Since $F_{cal} < F_{tab}$, we fail to reject (accept) the null hypothesis at the 5% level of significance and conclude that there is no difference in the variances of the yield of wheat.
Example-04: BBA Professional - 2004
By random sampling, one thousand families from Dhaka city were selected to test the belief that high-income families usually send their children to private schools and low-income families send their children to government schools. The following results were obtained:
Income    School: Private    School: Government    Total
Low       370                430                   800
High      130                70                    200
Total     500                500                   1000
Test whether income and type of school are independent.
Solution:
We want to test the null hypothesis,
H0: Income and type of school are independent
H1: Income and type of school are not independent
Test statistic:
We define our test statistic as follows:
$$\chi^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i}$$
Where,
O refers to the observed frequencies
E refers to the expected frequencies.
The expected frequencies are computed as follows:
$$E = \frac{\text{Row Total} \times \text{Column Total}}{\text{Total Number of Observations}}$$
$$E_{11} = \frac{R_1 \times C_1}{N} = \frac{800 \times 500}{1000} = 400$$
$$E_{12} = \frac{R_1 \times C_2}{N} = \frac{800 \times 500}{1000} = 400$$
$$E_{21} = \frac{R_2 \times C_1}{N} = \frac{200 \times 500}{1000} = 100$$
$$E_{22} = \frac{R_2 \times C_2}{N} = \frac{200 \times 500}{1000} = 100$$
Income        Private (C1)    Government (C2)    Total
Low (R1)      370             430                800
High (R2)     130             70                 200
Total         500             500                1000
Table for calculation of χ²
Observed Frequencies (O)    Expected Frequencies (E)    (O − E)²    (O − E)²/E
370                         400                         900         2.25
430                         400                         900         2.25
130                         100                         900         9.00
70                          100                         900         9.00
Total                                                               Σ(O − E)²/E = 22.50
Now,
$$\chi^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i} = 22.50$$
Significance Level:
Let us consider the significance level α = 5% = 0.05.
Critical Value:
At the 5% level of significance, the critical value of χ² is 3.84, where d.f. = (2 − 1)(2 − 1) = 1.
Comment:
Since the calculated value χ² = 22.5 is greater than the tabulated value χ² = 3.84, i.e. $\chi^2_{cal} > \chi^2_{tab}$, we reject the null hypothesis at the 5% level of significance.
Therefore, we conclude that there is an association between income and type of school.
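The whole of Example-04 can be retraced in a few lines, from expected frequencies to the decision. This is a minimal sketch using only the observed table and the tabulated critical value 3.84 quoted above; the helper names are illustrative.

```python
# Observed frequencies: rows = Low, High income; columns = Private, Government.
observed = [[370, 430], [130, 70]]


def expected_counts(table):
    """E = (row total * column total) / grand total for every cell."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_square(table):
    """Sum of (O - E)^2 / E over all cells of the table."""
    expected = expected_counts(table)
    return sum(
        (o - e) ** 2 / e
        for row_o, row_e in zip(table, expected)
        for o, e in zip(row_o, row_e)
    )


expected = expected_counts(observed)
stat = chi_square(observed)
df = (len(observed) - 1) * (len(observed[0]) - 1)
CRITICAL_VALUE = 3.84  # tabulated chi-square value for 1 d.f. at the 5% level
reject_h0 = stat > CRITICAL_VALUE

print(expected)
print(stat, df, reject_h0)
```

The output matches the hand calculation: expected counts of 400, 400, 100 and 100, χ² = 22.5 on 1 d.f., and rejection of the hypothesis of independence.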