Statistical Hypothesis Testing Guide

Testing of Hypothesis
One of the important applications of statistical inference is the ‘test of
hypothesis’. In testing a hypothesis, a decision about the characteristics of
the population is made on the basis of a sample, so it always involves the risk
of taking a wrong decision. For example, we may have to decide whether a given
foodstuff is really effective in increasing weight, or which of two brands of
fertilizer is more effective. In such cases the theory of probability plays a
vital role in decision-making. The branch of statistics that helps in arriving
at the criterion for such decisions is called Test of Hypothesis, Hypothesis
Testing, Test of Significance or Statistical Decision Making. A hypothesis is
an assumption or statement about a population, which may or may not be true and
which is under test.
A test of hypothesis is a procedure for testing the significance of a statement
about a population parameter on the basis of a sample drawn from the population.
General Procedure of Testing a Hypothesis:
Formulating the Hypothesis: A hypothesis is a tentative statement about a
population parameter, to be examined on the basis of a sample study. For example, (a) The
average height of the students in a class is 160 cms. (b) A given drug cures
90% of the patients taking it. (c) A given detergent cleans better than any
washing soap etc. All these hypotheses may be verified on the basis of certain
sample tests. A common way of stating a hypothesis is that there is no
difference between the population mean and sample mean. The term ‘no
difference’ implies that the difference, if any, is merely due to sampling
fluctuations. The statistical hypothesis may be divided into two types- Null
hypothesis and Alternative hypothesis.
Null Hypothesis: The null hypothesis is the hypothesis of no difference, which
is denoted by H0. In example (a) above, the null hypothesis may be
expressed symbolically as
H0: µ = 160 cms.
While formulating a null hypothesis we should take care of the following two
points.
i) If we want to test the significance of the difference between statistic and
the parameter or between two sample statistics then we formulate a null
hypothesis that the difference is not significant. This implies that the
difference is just due to fluctuations of sampling. In symbol,
$H_0: \mu = \bar{x}$

ii) If we want to test any statement about the population we formulate the
null hypothesis that it is true. For example if we want to find whether the
population mean has specified value µ0, then we formulate the null
hypothesis
H0: µ = µ0
Alternative Hypothesis: Any hypothesis that is complementary to the null
hypothesis is called an alternative hypothesis and is denoted by H1. For
example, if we want to test the null hypothesis that the average height of the
students is 160 cms, i.e. H0: µ = 160 cms = µ0 (say), then the alternative
hypothesis could be any one of the following:
i) H1: µ ≠ µ0 (i.e. µ > µ0 or µ < µ0). It is called a two-tailed alternative
hypothesis.
ii) H1: µ > µ0. It is called a right-tailed alternative hypothesis.
iii) H1: µ < µ0. It is called a left-tailed alternative hypothesis.
Computing the Test Statistic: After formulating a hypothesis, the next step is
to calculate an appropriate test statistic. The choice of a test statistic will be
based on an appropriate probability distribution. For testing whether the null
hypothesis should be accepted or rejected, we use Z-distribution under normal
curve for large sample ( n ≥ 30 ) and t-distribution for small sample ( n < 30 ).
Defining Type I and Type II Error: A decision to accept or reject H0 is made
on the basis of the sample data and there is always a chance of making an
error. There are two types of errors in testing hypothesis viz.
Type I Error: Reject a null hypothesis when it is true.
Type II Error: Accept a null hypothesis when it is false.
Thus four situations may arise in testing a hypothesis, as given in the
following table.
                 Decision
                 Accept H0               Reject H0
-------------    --------------------    --------------------
H0 is true       No Error                Type I Error
                 Correct Decision        Wrong Decision
                 Probability = 1 - α     Probability = α
H0 is false      Type II Error           No Error
                 Wrong Decision          Correct Decision
                 Probability = β         Probability = 1 - β

In deciding whether to accept or reject H0, we try to keep α, the probability
of making a Type I error, small. When H0 is true, the probability of making the
correct decision is therefore 1 - α. In the above table, β denotes the
probability of making a Type II error. A Type II error is generally regarded as
more serious than a Type I error, because accepting a false hypothesis is
considered riskier than rejecting a true one. For a fixed sample size, reducing
α tends to increase β, so the two risks have to be balanced.
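The two error probabilities can be computed numerically once a specific
alternative is fixed. The following is a minimal sketch, assuming a
right-tailed Z test with known σ; the function name and the numbers in the
usage lines (µ0 = 100, true mean 103, σ = 10, n = 50) are hypothetical and are
not taken from the text.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error_right_tailed(mu0, mu_true, sigma, n, alpha=0.05):
    """Probability of accepting H0: mu = mu0 when the true mean is mu_true.

    Assumes a right-tailed Z test (H1: mu > mu0) with known sigma.
    """
    std_normal = NormalDist()  # standard normal, mean 0 and s.d. 1
    z_alpha = std_normal.inv_cdf(1 - alpha)          # critical value
    shift = (mu_true - mu0) / (sigma / sqrt(n))      # true mean in S.E. units
    # H0 is accepted when Z < z_alpha; under the true mean Z ~ N(shift, 1)
    return std_normal.cdf(z_alpha - shift)


# Hypothetical illustration: beta for two sample sizes at alpha = 5%
beta_small_n = type_ii_error_right_tailed(100, 103, 10, 50)
beta_large_n = type_ii_error_right_tailed(100, 103, 10, 200)
print(round(beta_small_n, 3), round(beta_large_n, 3))
```

When the true mean equals µ0 the function returns 1 - α, the probability of
correctly accepting a true H0, and β shrinks as the sample size grows.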
Fixing the level of significance: The level of significance, usually denoted by
α, is the maximum probability of making a Type I error. The commonly used
levels of significance are 5% (0.05) and 1% (0.01). Using 5% implies that in
about 5 cases out of 100 we are likely to reject H0 when it is true. In other
words, when H0 is true we will correctly accept it in about 95 cases out of 100.
The level of significance should be fixed in advance, before applying the test.
Fixing critical region (or rejection region): In hypothesis testing, the level of
significance is set up in order to know the probability of making Type I error.
The region of the standard normal curve corresponding to a pre-determined
level of significance should be known, because when the computed test
statistic lies in this region, it is reasonable to reject the hypothesis as it is
believed to be probably false. The region of the standard normal curve
corresponding to a pre-determined level of significance that is fixed for
knowing the probability of making Type I error is called rejection region or
critical region. The region of standard normal curve, which is not occupied by
the rejection region, is called the acceptance region. When the computed test
statistic lies in the acceptance region, it is reasonable to accept the hypothesis
as it is believed to be probably true. The value of the test statistic, which
separates the rejection region and acceptance region, is known as critical value.
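The critical value can be read from tables of the normal curve or computed
from the inverse of the standard normal distribution function. A minimal
sketch using only Python's standard library follows; the function name and the
tail labels "two", "right" and "left" are choices made here for illustration.

```python
from statistics import NormalDist


def z_critical(alpha, tail="two"):
    """Critical value of Z for level of significance alpha.

    tail is "two", "right" or "left" (labels chosen for this sketch).
    """
    std_normal = NormalDist()  # standard normal, mean 0 and s.d. 1
    if tail == "two":
        # alpha is split equally between the two tails
        return std_normal.inv_cdf(1 - alpha / 2)
    if tail == "right":
        return std_normal.inv_cdf(1 - alpha)
    if tail == "left":
        return std_normal.inv_cdf(alpha)
    raise ValueError("tail must be 'two', 'right' or 'left'")


# Critical values at the 1% and 5% levels of significance
for alpha in (0.01, 0.05):
    print(alpha,
          round(z_critical(alpha, "two"), 3),
          round(z_critical(alpha, "right"), 3),
          round(z_critical(alpha, "left"), 3))
```

For a two-tailed test the rejection region is |Z| greater than the returned
value; for a left-tailed test the returned value is negative.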
Deciding two tailed or one tailed test: The critical region may be represented
by a portion of area under the normal curve in two ways:
a. Two ‘tails’ or two sides under the curve
b. One ‘tail’ or one side under the curve, which is either the left tail
or right tail.
The test of hypothesis, which is used on critical region represented by
both the tails under the normal curve, is called two-tailed test or two sided test.
If the critical region is represented by only one tail, the test is called one-tailed
test or one sided test.
[Figures: critical regions under the normal curve for the two-tailed test, the
right-tailed test and the left-tailed test, each at level of significance α.
Note that RR = Rejection Region and AR = Acceptance Region.]
Decision Making: If the computed value of the test statistic lies in the
acceptance region (for a two-tailed test, if its absolute value is less than
the critical value), H0 is accepted. On the other hand, if the computed value
lies in the rejection region, H0 is rejected. In testing a hypothesis we
generally use the 5% (0.05) level of significance unless otherwise stated.
Test of Hypothesis for Large Samples:

Test of significance of a Single Mean: The steps in testing the significance of
a sample mean of a large sample (n ≥ 30) are as follows:
Step 1. Formulate the null hypothesis (H0) in any one of the following forms
i) H0: µ = µ0 i.e. the population mean has the specified value µ0
ii) H0: x̄ = µ i.e. there is no significant difference between the
sample mean x̄ and the population mean µ
iii) H0: The sample has been drawn from the given large
population with mean µ0 and standard deviation σ
Step 2. Compute the test statistic,
$$Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}, \qquad Z \sim N(0, 1)$$
i.e. Z follows the normal distribution with mean 0 and standard deviation 1.
If σ is not known, then it is estimated by the sample standard deviation
$$S = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$$
where S is used as the estimate of the population standard deviation σ, and
S² is an unbiased estimate of the population variance σ².
Step 3: Select the level of significance α. We usually fix α at 5% (α = 0.05)
unless otherwise stated.
Decide whether a two-tailed test or a one-tailed test has to be applied.
Step 4: Write down the critical or tabulated value of z at the pre-determined
level of significance α.
Critical values Zα         Level of significance α
                           1%           5%
-----------------------    ---------------------------
Two-tailed test            2.58         1.96
One-tailed test: Right     +2.326       +1.645
                 Left      -2.326       -1.645
Step 5: Make the decision by comparing the computed and critical (or tabulated)
values of z. If the computed value of |z| is less than the critical value Zα, it
lies in the acceptance region. Hence H0 is accepted at the level of significance
α and we conclude that there is no significant difference between the sample
mean and the population mean, or that the sample has been drawn from the given
population.
If the computed value of |z| is greater than Zα, it lies in the rejection
region and we reject H0. It may be concluded that there is a significant
difference between the sample mean and the population mean, or that the
sample has not been drawn from the parent population.
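The five steps can be sketched as a short routine. This is an illustrative
sketch only, assuming σ is known (or estimated from a large sample); the
function name, argument names and tail labels are not from the text. The usage
lines apply it to the string-strength data of Example 1 (x̄ = 14.5, µ = 15.6,
σ = 2.2, n = 50).

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z_test(sample_mean, mu0, sigma, n, alpha=0.05, tail="two"):
    """Large-sample Z test of H0: mu = mu0.

    Returns (z, critical_value, reject_h0). tail is "two", "right" or "left".
    """
    # Step 2: test statistic Z = (x-bar - mu0) / (sigma / sqrt(n))
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    std_normal = NormalDist()
    # Steps 3-5: critical value for the chosen tail, then the decision
    if tail == "two":
        critical = std_normal.inv_cdf(1 - alpha / 2)
        reject = abs(z) > critical
    elif tail == "right":
        critical = std_normal.inv_cdf(1 - alpha)
        reject = z > critical
    elif tail == "left":
        critical = std_normal.inv_cdf(alpha)
        reject = z < critical
    else:
        raise ValueError("tail must be 'two', 'right' or 'left'")
    return z, critical, reject


# String-strength data: two-tailed test at the 5% level
z, critical, reject = one_sample_z_test(14.5, 15.6, 2.2, 50)
print(round(z, 2), round(critical, 2), reject)
```

Without intermediate rounding the statistic comes out at about -3.54, so H0 is
rejected at the 5% level.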
Example 1: A sample of 50 pieces of a certain type of string was tested. The
mean breaking strength turned out to be 14.5 kgs. Test whether the sample is
from a batch of strings having a mean breaking strength of 15.6 kgs and
standard deviation of 2.2 kgs.
Here, Sample size, n = 50
Sample mean x̄ = 14.5 kgs.
Population mean µ = 15.6 kgs
Standard deviation σ = 2.2 kgs
Null Hypothesis: H0: µ = 15.6 kgs i.e. the mean breaking strength of the
strings is 15.6 kgs.
Alternative Hypothesis: H1: µ ≠ 15.6 kgs i.e. the mean breaking strength is not
equal to 15.6 kgs (two-tailed test).
We have,
$$Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{14.5 - 15.6}{2.2 / \sqrt{50}} = \frac{-1.1}{0.311} = -3.54$$
∴ |Z| = |-3.54| = 3.54
Level of significance α = 5%
Tabulated value of Z at 5% level of significance for a two-tailed test = 1.96
Since |Z|cal > Ztab, H0 is rejected i.e. H1 is accepted. Hence we conclude that
the sample has not been drawn from the population with mean 15.6 kgs
and standard deviation 2.2 kgs.
Example 2: In the past a blending process has produced an average of 5 kg
of waste material for every batch with standard deviation 5 kg. From a sample
of 100 batches an average of 7 kg of waste per batch is obtained. At 5% level
of significance is it reasonable to believe that the average has increased?
Solution: We have, n = 100, x̄ = 7 kg, µ = 5 kg, σ = 5 kg
H0: µ = 5 kg, i.e. the average waste material has not increased
H1: µ > 5 kg, i.e. the average has increased (right-tailed test)
$$Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{7 - 5}{5 / \sqrt{100}} = \frac{2}{0.5} = 4$$
Level of significance, α = 5% = 0.05
Tabulated value of Z for a right-tailed test at 5% level of significance is 1.645.
Since the computed value of Z is greater than its tabulated value, H0 is rejected
i.e. H1 is accepted. Hence we conclude that the average waste material has
increased.
Example 3: An insurance agent has claimed that the average age of the policy
holders who have insured through him is less than the average age for all the
agents, which is 30.5 years. A random sample of 100 policy holders who had
insured through him gave the following age distribution
Age: 16 – 20 21 – 25 26 – 30 31 – 35 36 – 40
No. of persons: 12 22 20 30 16
Calculate the arithmetic mean and standard deviation of this distribution
and use these values to test his claim at 5% level of significance.
Solution:
Age            f      x      d’ = (x - 28)/5    fd’      fd’²
-----------    ---    ---    ---------------    -----    -----
15.5 – 20.5    12     18     -2                 -24      48
20.5 – 25.5    22     23     -1                 -22      22
25.5 – 30.5    20     28      0                   0       0
30.5 – 35.5    30     33      1                  30      30
35.5 – 40.5    16     38      2                  32      64
-----------    ---    ---    ---------------    -----    -----
               N = 100                          Σfd’ = 16    Σfd’² = 164
Here A = 28 is the assumed mean and h = 5 is the class width.
$$\bar{x} = A + \frac{\sum fd'}{N} \times h = 28 + \frac{16}{100} \times 5 = 28 + 0.8 = 28.8 \text{ years}$$
$$s = h \times \sqrt{\frac{\sum fd'^2}{N} - \left(\frac{\sum fd'}{N}\right)^2} = 5 \times \sqrt{\frac{164}{100} - \left(\frac{16}{100}\right)^2} = 5 \times \sqrt{1.64 - 0.0256} = 6.353 \text{ years}$$
H0: µ = 30.5 years i.e. the average age of the policy holders for all agents is
30.5 years.
H1: µ < 30.5 years (left-tailed test) i.e. the average age of the policy holders
who have insured through him is less than the average age for all the agents.
Test statistic: Under H0, the Z-statistic is given by
$$Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{\bar{x} - \mu}{s / \sqrt{n}} = \frac{28.8 - 30.5}{6.353 / \sqrt{100}} = \frac{-1.7}{6.353} \times 10 = -2.68$$
∴ |Z| = 2.68
The tabulated value of Z at 5% level of significance for a one-tailed test is 1.645.
Since the calculated value of |Z| is greater than the tabulated value of Z at
5% level of significance, H0 is rejected. Hence we conclude that the average
age of the policy holders who have insured through him is less than the
average age of the policy holders of all the agents.
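The step-deviation arithmetic of this example can be checked with a few lines
of code. The sketch below uses the class mid-points and frequencies from the
age distribution, with assumed mean A = 28 and class width h = 5 as in the
solution; the variable names are chosen here for illustration.

```python
from math import sqrt

# Class mid-points and frequencies from the age distribution
midpoints = [18, 23, 28, 33, 38]
freqs = [12, 22, 20, 30, 16]
A, h = 28, 5          # assumed mean and class width
mu0 = 30.5            # average age for all agents under H0

N = sum(freqs)
d = [(x - A) / h for x in midpoints]                      # step deviations d'
sum_fd = sum(f * di for f, di in zip(freqs, d))           # sum of f*d'
sum_fd2 = sum(f * di ** 2 for f, di in zip(freqs, d))     # sum of f*d'^2

mean = A + (sum_fd / N) * h
s = h * sqrt(sum_fd2 / N - (sum_fd / N) ** 2)
z = (mean - mu0) / (s / sqrt(N))

print(round(mean, 2), round(s, 3), round(z, 2))
```

The computed Z is negative because the sample mean is below 30.5; for the
left-tailed test it is compared with -1.645.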
Example 4: A company produces automobile tires. The life of these tires is
normally distributed with an average of 40,000 miles and a standard
deviation of 3000 miles. The plant has been updated with new machinery and
the new production process is believed to produce better tires. To test this, a
sample of 64 new tires was taken and a test run showed an average life of
41000 miles. Can we conclude at 95% confidence level that the new product is
significantly better than the previous one?
Solution: Given, µ = 40000 miles, σ = 3000 miles, n = 64, x̄ = 41000 miles
1 - α = 95% ⇒ α = 5% = 0.05
H0: µ = 40000 miles i.e. the average life of tires is 40000 miles. In other
words there is no significant difference between sample mean and population
mean.
H1: µ > 40000 miles (right-tailed test) i.e. the average life of new tires is
greater than 40000 miles. In other words, the average life of the new
product is significantly better than the previous one.
Test statistic: Under H0, the Z-statistic is given by
$$Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{41000 - 40000}{3000 / \sqrt{64}} = \frac{1000}{3000} \times 8 = 2.67$$
The tabulated value of Z at 5% level of significance for a one-tailed test is 1.645.
Decision: Since the calculated value of Z is greater than the tabulated value of
Z, H0 is rejected. Hence we conclude that the new product is
significantly better than the previous one.
Ex. An insurance agent has claimed that the average age of policy holders who
insure through him is less than the average for all agents, which is 30.5 years.
The mean age obtained from a random sample of 100 policy holders who have
insured through him is 28.8 years with standard deviation of 6.35 years. Test
his claim at 5% level of significance.
Hints: H0: µ = 30.5 years i.e. the average age of the policy holders for all the
agents is 30.5 years.
H1: µ < 30.5 years (left-tailed test) i.e. the average age of the policy
holders who have insured through him is less than the average
age for all the agents.
Ex: A potential buyer of light bulbs bought 50 bulbs each of two brands. Upon
testing these bulbs, he found that brand A had a mean life of 1282 hours with
s.d. of 80 hours, whereas brand B had a mean life of 1208 hours with s.d. of
94 hours. Can the buyer be quite certain that the two brands do differ in
quality?
Example: A pharmaceutical company wants to estimate the mean life of a
particular drug under typical weather conditions. Following results were
obtained from a simple random sample of 64 bottles of the drug: Sample mean:
20 months, Population s.d.: 3 months, Sample size: 64.
Find an interval estimate with a confidence level of (i) 95% and (ii) 99%.
Solution: In usual notations, it is given: n = 64, x̄ = 20 and σ = 3
(i) 95% confidence limits for the population mean µ are given by:
$$\bar{x} \pm 1.96 \frac{\sigma}{\sqrt{n}} = 20 \pm 1.96 \times \frac{3}{\sqrt{64}} = 20 \pm 0.735 = (19.265, 20.735)$$
(ii) ……………………………………………………….
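The interval in part (i) can be reproduced with a small helper. This is a
sketch under the assumption that σ is known and the sample is large; the
function name and the `confidence` parameter are illustrative choices, not
notation from the text.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(sample_mean, sigma, n, confidence=0.95):
    """Confidence limits for the population mean when sigma is known."""
    # Two-sided multiplier, e.g. about 1.96 for 95% confidence
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / sqrt(n)
    return sample_mean - margin, sample_mean + margin


# Part (i): 95% limits for the drug data (x-bar = 20, sigma = 3, n = 64)
low, high = z_confidence_interval(20, 3, 64, confidence=0.95)
print(round(low, 3), round(high, 3))
```

Passing a different value of `confidence` gives the limits for any other
confidence level in the same way.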
Ex. An automobile manufacturer claims that a particular model gets 28 miles
to the gallon. The Environmental Protection Agency, using a sample of 49
automobiles of this model, finds the sample mean to be 26.8 miles per gallon.
From previous studies, the population standard deviation is known to be 5
miles per gallon. Could we reasonably expect (within 2 standard errors) that
we select such a sample if indeed the population mean is actually 28 miles per
gallon?
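σ = 5, n = 49, x̄ = 26.8, µ = 28
$$\mu \pm 2\sigma_{\bar{x}} = \mu \pm 2\frac{\sigma}{\sqrt{n}} = 28 \pm 2 \times \frac{5}{\sqrt{49}} = 28 \pm 1.429 = (26.571, 29.429)$$
Because x̄ = 26.8 lies within this interval (26.8 > 26.571), it is not
unreasonable to see such sample results if µ really is 28 mpg.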
Test of Significance of difference between two means:
Let two independent random samples of sizes n1 and n2 be drawn from
two different normal populations with means µ1 and µ2 and variances σ1² and
σ2² respectively, and let x̄1 and x̄2 be the means of the two samples. If we
want to test whether there is a significant difference between the two
populations, we set up the null hypothesis as
H0: µ1 = µ2 i.e. the two population means are equal, or the two samples have
been drawn from the same parent population, or there is no significant
difference between the sample means, against
H1: µ1 ≠ µ2 i.e. the two population means are not equal, or the two samples
have been drawn from different populations, or there is a significant
difference between the two sample means.
Under H0, the Z-statistic is given by
$$Z = \frac{\bar{x}_1 - \bar{x}_2}{SE(\bar{x}_1 - \bar{x}_2)} = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$$
where $\sigma_1^2$ and $\sigma_2^2$ are the population variances. If they are
unknown, then their estimates are provided by the corresponding sample
variances $s_1^2$ and $s_2^2$ respectively. Then the Z-statistic is given by
$$Z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$
When $\sigma_1^2 = \sigma_2^2 = \sigma^2$ (say), or if the two samples are
drawn from the same population, then
$$Z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\sigma^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}}$$
If the common variance $\sigma^2$ is unknown, it is estimated by using the
combined sample variances i.e.
$$\hat{\sigma}^2 = \frac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2}$$
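The statistics for the difference of two means translate directly into code.
The following is a minimal sketch; the function names are hypothetical, and the
variances passed in may be the population variances or, for large samples,
their sample estimates. The usage lines use the machine productivity data
(61.4 and 59.5 items per hour, s.d. 3.1 and 2.8, n = 40 and 50).

```python
from math import sqrt


def two_sample_z(mean1, mean2, var1, var2, n1, n2):
    """Z statistic for H0: mu1 = mu2 with variances var1 and var2."""
    standard_error = sqrt(var1 / n1 + var2 / n2)
    return (mean1 - mean2) / standard_error


def combined_variance(s1, s2, n1, n2):
    """Estimate of a common variance from two sample standard deviations."""
    return (n1 * s1 ** 2 + n2 * s2 ** 2) / (n1 + n2)


# Machine productivity data, using the sample variances as estimates
z = two_sample_z(61.4, 59.5, 3.1 ** 2, 2.8 ** 2, 40, 50)
print(round(z, 2))

# Common-variance form: pass the same variance for both samples
z_common = two_sample_z(67.5, 68.0, 2.5 ** 2, 2.5 ** 2, 100, 200)
print(round(z_common, 2))
```

Passing the same variance twice reduces the first formula to the
common-variance form, so one function covers both cases.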
Example 1: Electric bulbs manufactured by X and Y companies gave the
following results.
Company X Company Y
No. of bulbs used 100 100
Mean life in hours 1300 1248
Standard deviation 80 93
Test whether there is any significant difference between the mean life of
two companies.
Solution: With the usual notations, we are given
Company X            Company Y
n1 = 100             n2 = 100
x̄1 = 1300 hrs.       x̄2 = 1248 hrs.
s1 = 80 hrs.         s2 = 93 hrs.
H0: µ1 = µ2 i.e. there is no significant difference in the mean life of bulbs
of the two companies. In other words, the population means are the same.
H1: µ1 ≠ µ2 i.e. there is a significant difference in the mean life of bulbs of
the two companies. In other words, the two population means are not the same.
$$Z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}} = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{1300 - 1248}{\sqrt{\frac{(80)^2}{100} + \frac{(93)^2}{100}}} = \frac{52}{\sqrt{150.49}} = \frac{52}{12.27} = 4.24$$
Tabulated value of Z at 5% level of significance for a two-tailed test is 1.96.
Since the calculated value of Z is greater than the tabulated value of
Z at 5% level of significance, H0 is rejected i.e. there is a significant
difference in the mean life of bulbs of the two companies. In other words, the
two population means are not the same.
Example 2: An experiment has been conducted to compare the productivity of
two machines. Machine 1 was observed for 40 hours and machine 2 for 50
hours. The average productivity of item per hour and standard deviation for
each machine is recorded below:
Machine 1 Machine 2
Average: 61.4 59.5
Standard Deviation 3.1 2.8
At α = 0.10, do the samples provide sufficient evidence to conclude that the
productivity of machine 1 is better than the productivity of machine 2?
Solution:
For machine 1        For machine 2
x̄1 = 61.4            x̄2 = 59.5
s1 = 3.1             s2 = 2.8
n1 = 40              n2 = 50
H0: µ1 = µ2 i.e. there is no significant difference in the productivity between
machine 1 and machine 2.
H1: µ1 > µ2 (right-tailed test) i.e. the average productivity of machine 1 is
greater than that of machine 2.
Test statistic: Under H0 the Z statistic is:
$$Z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{61.4 - 59.5}{\sqrt{\frac{(3.1)^2}{40} + \frac{(2.8)^2}{50}}} = \frac{1.9}{\sqrt{0.3971}} = \frac{1.9}{0.6301} = 3.02$$
The tabulated value of Z at 10% level of significance for a one-tailed test is 1.28.
Since the calculated value of Z is greater than the tabulated value of
Z at 10% level of significance for a one-tailed test, the null hypothesis is
rejected. Hence, we conclude that the average productivity of machine 1 is
better than that of machine 2.

Example 3: The means of two samples of 100 and 200 individuals are 67.5”
and 68.0” respectively. Can the samples be regarded as drawn from the same
population with standard deviation of 2.5” (Take α =0.05)
Solution: We are given:
Sample I             Sample II
n1 = 100             n2 = 200
x̄1 = 67.5”           x̄2 = 68.0”
σ = 2.5”
H0: µ1 = µ2 i.e. there is no significant difference between the two population
means or the samples can be regarded as drawn from the same population.
H1: µ1 ≠ µ2 i.e. there is a significant difference between the two population
means. In other words, the two samples cannot be regarded as drawn from the
same population.
Under H0, the Z-statistic is given by
$$Z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\sigma^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}} = \frac{67.5 - 68}{\sqrt{(2.5)^2 \left( \frac{1}{100} + \frac{1}{200} \right)}} = \frac{-0.5}{\sqrt{6.25 \times (0.01 + 0.005)}} = \frac{-0.5}{\sqrt{6.25 \times 0.015}} = \frac{-0.5}{0.3062} = -1.63$$
∴ |Z| = 1.63
Critical Value: The tabulated value of Z at 5% level of significance for two tail
test is 1.96
Decision: Since the calculated value of Z is less than the tabulated value of Z,
null hypothesis is accepted. Therefore, we conclude that the two samples can
be regarded as drawn from the same population.
Test of Significance of a Sample Proportion: In the test of significance of a
sample proportion, the population is divided into two mutually disjoint classes
according to a qualitative characteristic (attribute), in such a way that one
class possesses the attribute (i.e. success) and the other does not possess it
(i.e. failure).
To test whether there is any significant difference between the sample
proportion and the population proportion, we set up the null hypothesis as
H0: P = P0 i.e. the population proportion has the specified value P0, or the
sample has been drawn from the population with proportion P0, or there is no
significant difference between the sample proportion (p) and the population
proportion (P).
H1: P ≠ P0 i.e. the population proportion does not have the specified value P0,
or the sample has not been drawn from the population with proportion P0, or
there is a significant difference between the sample proportion and the
population proportion (two-tailed test).
Or H1: P > P0 (right-tailed test)
Or H1: P < P0 (left-tailed test)
Test statistic: Under H0 the test statistic is
$$Z = \frac{\text{Difference}}{S.E.(p)} = \frac{p - E(p)}{S.E.(p)} = \frac{p - P}{\sqrt{\frac{PQ}{n}}} = \frac{p - P}{\sqrt{\frac{P(1 - P)}{n}}}$$
Where p = observed sample proportion of successes = x/n
x = number of successes, i.e. units possessing the given attribute
n = sample size or number of trials
P = population proportion of success and Q is the population
proportion of failure such that P + Q = 1
Note that if the sample is drawn from a finite population of size N, then
$$S.E.(p) = \sqrt{\left( \frac{N - n}{N - 1} \right) \frac{P(1 - P)}{n}}$$
Remarks:
1. If the population proportion of success P is not known, then for large
samples its estimate is provided by the sample proportion ‘p’ (with
q = 1 - p) and an unbiased estimate of SE(p) is given by
$$\text{Est}[SE(p)] = \sqrt{\frac{(N - n)\,pq}{N(n - 1)}}$$
(taking p as the estimate of P for large samples)
2. If the sample proportion is less than the population proportion i.e. if
p < P, we assume a left-tailed test.
3. If the sample proportion is greater than the population proportion i.e. if
p > P, we assume a right-tailed test.
Example 1: A coin is tossed 900 times and heads appear 490 times. Does this
support the hypothesis that coin is unbiased?
Solution: In the usual notation, n = 900
p = the observed proportion of successes in 900 throws of the coin
= 490/900 = 49/90 = 0.544
H0: P = 0.5, i.e. the coin is unbiased.
H1: P ≠ 0.5 i.e. the coin is biased (two-tailed test)
Level of significance, α = 0.05
Test Statistic: Under H0, the test statistic is:
$$Z = \frac{p - P}{\sqrt{\frac{PQ}{n}}} = \frac{0.544 - 0.5}{\sqrt{\frac{0.5 \times 0.5}{900}}} = \frac{0.044 \times 30}{0.5} = 2.64$$
Decision: Since the computed value of Z is greater than 1.96, it is significant
at 5% level of significance. Hence, the null hypothesis is rejected and we
conclude that the coin is biased. In other words, the given data are not
consistent with the hypothesis that the coin is unbiased.
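The test for a single proportion can be sketched as follows. The function name
and the optional `population_size` argument (which applies the
finite-population factor given in the note on S.E.(p)) are illustrative
assumptions. Because the code does not round p to 0.544, the coin data give
about 2.67 rather than 2.64; the conclusion is unchanged.

```python
from math import sqrt


def proportion_z(successes, n, p0, population_size=None):
    """Z statistic for H0: P = p0 from x successes in n trials."""
    p = successes / n                      # observed sample proportion
    standard_error = sqrt(p0 * (1 - p0) / n)
    if population_size is not None:
        # Finite-population factor sqrt((N - n) / (N - 1))
        standard_error *= sqrt((population_size - n) / (population_size - 1))
    return (p - p0) / standard_error


# Coin data: 490 heads in 900 tosses, H0: P = 0.5
z = proportion_z(490, 900, 0.5)
print(round(z, 2))
```

The value is compared with 1.96 for a two-tailed test at the 5% level, exactly
as in the worked solution.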
Example 2: A stock broker claims that he can predict with 80% accuracy
whether in NSE stock values will rise or fall during the coming week. As a test
he predicts the outcomes of 40 stocks and is correct in 28 of the predictions.
Does this evidence support the stock broker’s claim at 5% level of
significance?
Solution: H0: P=0.8, i.e. the claim of the stock broker is valid.
H1: P≠ 0.8 i.e. the claim of the stock broker is not valid.
In usual notations, p = 28/40 = 0.7, n = 40, and α = 0.05
Using the Z statistic,
$$Z = \frac{p - P}{\sqrt{\frac{PQ}{n}}} = \frac{0.7 - 0.8}{\sqrt{\frac{0.8 \times 0.2}{40}}} = \frac{-0.1}{0.0632} = -1.58$$
Since the calculated value of |Z| (= 1.58) is less than its critical value Zα
(= 1.96), it is not significant (i.e. it falls in the acceptance region). The
null hypothesis is accepted and consequently we conclude that the claim of the
stock broker may be regarded as valid.
Example 3: A manufacturer of ladies' dresses is planning a direct mail order by
sending catalogues to potential customers obtained from a mailing list. He has
determined that a response of 20% would be needed to consider the campaign
successful. In a pilot study involving 400 potential customers selected
randomly from the mailing list, the actual number responding was 74. Can
the campaign be judged a success at 95% confidence?
Solution: Given P = 20% = 0.20, Q = 1 – P = 1 – 0.20 = 0.80,
n = 400
Sample proportion of success, p = 74/400 = 0.185
H0: P = 0.20 i.e. the population proportion is 0.20. In other words, there is no
significant difference between the population proportion and sample
proportion.
H1: P ≠ 0.20 (two-tailed test) i.e. the population proportion is not equal to 0.20.
We have,
$$Z = \frac{p - P}{\sqrt{\frac{PQ}{n}}} = \frac{0.185 - 0.20}{\sqrt{\frac{0.20 \times 0.80}{400}}} = \frac{-0.015}{0.02} = -0.75$$
∴ |Z| = 0.75
The tabulated value of z at 5% level of significance = 1.96.
Decision: Since the calculated value of |z| is less than the tabulated value of
z at 5% level of significance for a two-tailed test, the null hypothesis is
accepted. Therefore, we conclude that there is no significant difference between
the sample proportion and the population proportion.
Example 4: The mayor of the city claims that only 20% of the families are
below poverty level. A concerned citizen’s group believes that the percentage
is much higher. A random sample of 100 families is taken and 30 families are
found to be below poverty level. At 95% confidence level, is there sufficient
evidence to reject the mayor’s claim?
Population proportion of success, P = 20% = 0.20
Population proportion of failure, Q = 1 – P = 1 – 0.20 = 0.80
Sample size, n = 100
α =1 – 0.95 = 0.05
H0: P = 0.20 i.e. 20% of the families in the city are below poverty level.
H1: ……………………………………………
Example 5: The manufacturer of patent medicine claimed that it was 90%
effective in relieving an allergy for a period of 8 hours. In a sample of 200
people who had the allergy, the medicine provided relief for 160 people.
Determine whether the manufacturer’s claim is legitimate, if a one percent
level of significance is used.
Test of significance for difference of two proportions:
If we want to test whether two samples could have been drawn from the same
parent population, or whether two sample proportions differ significantly, we
use the z-test for the difference of two proportions. Let n1 and n2 denote the
sizes of large samples drawn from two populations possessing an attribute, and
let X1 and X2 denote the observed numbers of successes. Then,
p1 = observed sample proportion of success from the first population = X1/n1
p2 = observed sample proportion of success from the second population = X2/n2
such that E(p1) = P1 and E(p2) = P2
and Var(p1) = P1Q1/n1 and Var(p2) = P2Q2/n2
Since for large samples p1 and p2 are approximately normally distributed, their
difference (p1 – p2) is also approximately normally distributed with mean
(P1 – P2) and the standard error of the difference of two sample proportions,
$$S.E.(p_1 - p_2) = \sqrt{\frac{P_1 Q_1}{n_1} + \frac{P_2 Q_2}{n_2}}$$
Then the test procedure is performed as follows.
Null hypothesis: H0: P1 = P2 i.e. the two population proportions are the same,
or the two samples have been drawn from the same population, or there is no
significant difference between the two sample proportions p1 and p2.
Alternative hypothesis: H1: P1 ≠ P2 i.e. the two population proportions are not
equal, or the samples have not been drawn from the same population, or there is
a significant difference between the two population proportions (two-tailed
test).
or H1: P1 > P2 i.e. the population proportion of the first group is greater
than that of the second group (right-tailed test).
or H1: P1 < P2 i.e. the population proportion of the first group is less than
that of the second group (left-tailed test).
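Test Statistic: Under H0, the test statistic is
$$Z = \frac{p_1 - p_2}{\sqrt{\frac{P_1 Q_1}{n_1} + \frac{P_2 Q_2}{n_2}}}$$
When the common population proportion of success P is unknown, we use its
pooled estimate
$$\hat{P} = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2} = \frac{X_1 + X_2}{n_1 + n_2} \quad \text{and} \quad \hat{Q} = 1 - \hat{P}$$
and the test statistic becomes
$$Z = \frac{p_1 - p_2}{\sqrt{\hat{P}\hat{Q} \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}}$$
The other steps are the same as in the previous tests.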
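The pooled form of the statistic, used when the common proportion P is unknown,
can be sketched as below. The function name is hypothetical. The usage lines
take the smokers data of the example that follows (450 of 600 men in city A,
450 of 900 men in city B).

```python
from math import sqrt


def two_proportion_z(x1, n1, x2, n2):
    """Z statistic for H0: P1 = P2 using the pooled estimate of P."""
    p1, p2 = x1 / n1, x2 / n2
    p_hat = (x1 + x2) / (n1 + n2)          # pooled estimate of P
    q_hat = 1 - p_hat
    standard_error = sqrt(p_hat * q_hat * (1 / n1 + 1 / n2))
    return (p1 - p2) / standard_error


# Smokers in city A (450 of 600) and city B (450 of 900)
z = two_proportion_z(450, 600, 450, 900)
print(round(z, 2))
```

The sign of Z shows the direction of the difference, which is what a one-tailed
alternative such as H1: P1 > P2 relies on.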
Example 3: In a sample of 600 men from city A, 450 men are found to be
smokers and in another sample of 900 men from city B, 450 are found to be
smokers. Do the data indicate that the two cities are significantly different
with respect to the prevalence of smoking among men?
Solution: With the usual notations, we have,
City A               City B
n1 = 600             n2 = 900
X1 = 450             X2 = 450
Observed sample proportion of smokers from city A = p1 = X1/n1 = 450/600 = 0.75
Observed sample proportion of smokers from city B = p2 = X2/n2 = 450/900 = 0.50
H0: P1 = P2 = P (say) i.e. the proportions of smokers in the two cities are the
same. In other words, there is no significant difference between the
proportions of smokers in the two cities.
H1: P1 ≠ P2 i.e. the proportions of smokers in the two cities are not the same.
In other words, there is a significant difference between the proportions of
smokers in the two cities.
Test statistic: Under H0, the test statistic is
$$Z = \frac{p_1 - p_2}{\sqrt{\frac{P_1 Q_1}{n_1} + \frac{P_2 Q_2}{n_2}}}$$
where P, the common population proportion, is unknown and we use its estimate
provided by both the samples taken together.
We have,
$$\hat{P} = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2} = \frac{X_1 + X_2}{n_1 + n_2} = \frac{450 + 450}{600 + 900} = \frac{900}{1500} = \frac{3}{5} = 0.60$$
∴ $\hat{Q} = 1 - \hat{P}$ = 1 – 0.60 = 0.40
Hence,
$$Z = \frac{p_1 - p_2}{\sqrt{\hat{P}\hat{Q} \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}} = \frac{0.75 - 0.50}{\sqrt{0.60 \times 0.40 \left( \frac{1}{600} + \frac{1}{900} \right)}} = \frac{0.25}{\sqrt{0.24 \times 0.002778}} = \frac{0.25}{\sqrt{0.000667}} = \frac{0.25}{0.02582} = 9.68$$
Tabulated value of Z at 5% level of significance for a two-tailed test is 1.96.
Conclusion: Since the calculated value of Z is greater than the tabulated value
of Z, it is significant and H0 is rejected i.e. there is a significant
difference between the proportions of smokers in the two cities.
Example 4: Before an increase in excise duty on coffee, 800 out of a sample
of 1000 persons were known to be taking coffee. After the increase in the duty
800 persons are now found to be taking coffee in a sample of 1200. Do you
think that there has been a significant decrease in the consumption of coffee
after the increase in the excise duty?
Before increasing excise duty After increasing excise duty
n1 = 1000 n2 = 1200
X1 = 800 X2 = 800
Sample proportion of taking coffee before increasing excise duty
= p1 =
1
1
800
0.8
1000
X
n
= =
Sample proportion of taking coffee after increasing excise duty
= p2 =
2
2
800
0.67
1200
X
n
= =
Null hypothesis: H0: P1 = P2 = P (say), i.e. there is no significant difference in
the consumption of coffee before and after the increase in excise duty.
Alternative hypothesis: H1: P1 > P2 (right tailed test), i.e. there has been a
significant decrease in the consumption of coffee after the increase in excise
duty.
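Test statistic: Under H0,
$$Z = \frac{p_1 - p_2}{\sqrt{\hat{P}\hat{Q}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$
with $P = \hat{P}$ and $Q = \hat{Q}$ for unknown P, where
$$\hat{P} = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2} = \frac{x_1 + x_2}{n_1 + n_2} = \frac{800 + 800}{1000 + 1200} = \frac{1600}{2200} = 0.73$$
and $\hat{Q} = 1 - \hat{P} = 1 - 0.73 = 0.27$
$$\therefore Z = \frac{0.8 - 0.67}{\sqrt{0.73 \times 0.27 \left(\frac{1}{1000} + \frac{1}{1200}\right)}} = \frac{0.13}{\sqrt{0.1971 \left(\frac{6 + 5}{6000}\right)}} = \frac{0.13}{\sqrt{0.1971 \times \frac{11}{6000}}} = \frac{0.13}{\sqrt{0.00036}} = \frac{0.13}{0.019} = 6.84$$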
Tabulated value of Z at 5% level of significance for right tailed test is 1.645
i.e. Z0.05 = 1.645.
Conclusion: Since the calculated value of Z is greater than the tabulated value
of Z, it is significant and H0 is rejected i.e. H1 is accepted. It means that there
has been a significant decrease in the consumption of coffee.
Exercise: A marketing company claims that it received 8% responses from its
mailing for a certain product to be produced in the near future. It seems that the
company is giving an exaggerated picture and the said percentage is less. To
test this claim, a random sample of 500 was surveyed, from which there were
30 responses. Test the company’s claim at the alpha = 5% significance level
against that the company is giving an exaggerated picture.
Student’s ‘t’ test:
If x1, x2, x3, ..., xn is a random sample of size n from a normal population
with mean µ and variance σ², then the Student's t statistic is defined as
$$t = \frac{\bar{x} - \mu}{S / \sqrt{n}}$$
where $\bar{x} = \frac{\sum x}{n}$ is the sample mean and
S² = an unbiased estimate of the population variance σ², computed by
(i) Actual Mean Method: $S^2 = \frac{1}{n-1} \sum (x - \bar{x})^2$. This formula is
suitable when the mean value is a whole number.
(ii) Direct Method: $S^2 = \frac{1}{n-1} \left[ \sum x^2 - \frac{(\sum x)^2}{n} \right]$
(iii) Short cut or assumed mean method: $S^2 = \frac{1}{n-1} \left[ \sum d^2 - \frac{(\sum d)^2}{n} \right]$,
where d = x – A and A is the assumed mean.
When the biased estimate of the population variance (σ²), i.e. s², is given, then
the value of t is computed by
$$t = \frac{\bar{x} - \mu}{s / \sqrt{n-1}}$$
where the sample variance $s^2 = \frac{1}{n} \sum (x - \bar{x})^2$, or $n s^2 = \sum (x - \bar{x})^2$.
Also, $S^2 = \frac{1}{n-1} \sum (x - \bar{x})^2$, or $(n-1) S^2 = \sum (x - \bar{x})^2$.
∴ $(n-1) S^2 = n s^2$, or $\frac{s^2}{n-1} = \frac{S^2}{n}$ ⇒ $\frac{s}{\sqrt{n-1}} = \frac{S}{\sqrt{n}}$
The statistic t defined above follows t-distribution with n – 1 degrees of
freedom (i.e. ν = n – 1).
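The identity s/√(n − 1) = S/√n can be confirmed numerically. The sketch below is illustrative only: the helper name `standard_errors` and the data values are assumptions, while `statistics.variance` (divisor n − 1) and `statistics.pvariance` (divisor n) are the standard-library functions for the unbiased and biased estimates.

```python
import math
import statistics

def standard_errors(data):
    """Return (S / sqrt(n), s / sqrt(n - 1)) for the same sample."""
    n = len(data)
    S2 = statistics.variance(data)     # unbiased estimate, divisor n - 1
    s2 = statistics.pvariance(data)    # biased estimate, divisor n
    return math.sqrt(S2 / n), math.sqrt(s2 / (n - 1))

# Illustrative data; any sample with n >= 2 gives matching values.
from_S, from_s = standard_errors([16, 11, 14, 20, 15, 13, 14, 17, 10, 18])
print(math.isclose(from_S, from_s))
```

Either form of the denominator can therefore be used in the t statistic, depending on which variance estimate is supplied.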
Degree of Freedom: The degree of freedom is the number of values that can be
chosen freely. For example, if we deal with two observations x1 and x2 whose mean
is 18 (suppose) and if we assign a value to any one of them, the other value can
be determined immediately,
i.e. if x1 = 10 (say), then $\frac{x_1 + x_2}{2} = 18$ will give x2 = 2 × 18 – 10 = 26.
In this case we are free to specify only one of the observations. Thus we have
only one degree of freedom (= 2 – 1) when the sample size is 2. Similarly, for n
observations with a fixed mean there are n – 1 degrees of freedom.
Test of Significance of a Single Mean: The steps used in the test of significance
of a single mean for a small sample (n < 30) are same as those for large sample
(n ≥ 30 ) except the use of test statistic. We use t- values instead of z- values as
test statistic.
Step I: Formulating hypothesis:
Ho: µ = µ0
H1: µ ≠ µ0 (i.e. µ > µ0 or µ < µ0) (two-tailed alternative hypothesis), or
H1: µ > µ0 (right tailed alternative hypothesis), or
H1: µ < µ0 (left tailed alternative hypothesis).
Step II: Compute the test statistic given by
$$t = \frac{\bar{x} - \mu}{S / \sqrt{n}}$$
where $\bar{x}$ = sample mean; and $S = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$
Step III: Fix the level of significance α (usually α = 0.05) and decide whether
to apply a one tailed or two tailed test on the basis of H1.
Step IV: Make decision on the basis of the decision rule given as follows:
i) If |tcal| < ttab, accept Ho
ii) If |tcal| > ttab, reject Ho
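Confidence Limits for Population Mean µ:
If $t_\alpha$ is the critical value of t for n – 1 degrees of freedom at α level of
significance, then the 100(1 – α)% confidence limits of µ are given by
$\bar{x} \pm t_\alpha \frac{S}{\sqrt{n}}$, i.e. $\bar{x} - t_\alpha \frac{S}{\sqrt{n}} < \mu < \bar{x} + t_\alpha \frac{S}{\sqrt{n}}$
Thus, for example, the 95% confidence limits for µ are $\bar{x} \pm t_{0.05} \frac{S}{\sqrt{n}}$ and
the 99% confidence limits for µ are $\bar{x} \pm t_{0.01} \frac{S}{\sqrt{n}}$,
i.e. $\bar{x} - t_{0.01} \frac{S}{\sqrt{n}} < \mu < \bar{x} + t_{0.01} \frac{S}{\sqrt{n}}$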
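As a rough illustration of the limits above, the following sketch computes x̄ ± t·S/√n for a sample and a critical value read from a t-table. The function name `t_confidence_limits` is hypothetical; the sample is the ten apple weights from Example 1 of this section, and 2.262 is the two-tailed 5% table value for 9 d.f. quoted there.

```python
import math
import statistics

def t_confidence_limits(data, t_crit):
    """Return (lower, upper) limits: mean -/+ t_crit * S / sqrt(n)."""
    n = len(data)
    mean = statistics.mean(data)
    S = statistics.stdev(data)                 # unbiased S, divisor n - 1
    half_width = t_crit * S / math.sqrt(n)
    return mean - half_width, mean + half_width

weights = [16, 11, 14, 20, 15, 13, 14, 17, 10, 18]
low, high = t_confidence_limits(weights, 2.262)   # 95% limits, 9 d.f.
print(round(low, 2), round(high, 2))
```

The critical value is passed in by hand because the standard library has no t-distribution table; a statistics package could supply it instead.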
Example 1: A sample of 10 apples from a truck load shipment yields the
following weights in ounces: 16, 11, 14, 20, 15, 13, 14, 17, 10, 18. The owner of
the orchard where the apples were grown claims that the mean weight of the
apples in the truck is 14 ounces. Do you think that these data support the
owner’s claim?
Given, µ = 14 ounces, sample size, n = 10
Calculation of sample mean and standard deviation
Weight in ounce
x d= x – 14 d2
----------------- ------------ -----
16 2 4
11 - 3 9
14 0 0
20 6 36
15 1 1
13 - 1 1
14 0 0
17 3 9
10 - 4 16
18 4 16
-------------- --------- ------
8 92
$\bar{x} = A + \frac{\sum d}{n} = 14 + \frac{8}{10} = 14 + 0.8 = 14.8$ ounces
$s = \sqrt{\frac{\sum d^2}{n} - \left(\frac{\sum d}{n}\right)^2} = \sqrt{\frac{92}{10} - \left(\frac{8}{10}\right)^2} = \sqrt{9.2 - 0.64} = \sqrt{8.56} = 2.93$ ounces
H0:µ =14 ounces i.e. the mean weight of apples in the truck is 14 ounces.
H1:µ ≠ 14 ounces i.e. the mean weight of apples in the truck is not 14 ounces.
Test statistic:
$$t = \frac{\bar{x} - \mu}{s / \sqrt{n - 1}} = \frac{14.8 - 14}{2.93 / \sqrt{10 - 1}} = \frac{0.8 \times 3}{2.93} = 0.82$$
d.f. = n – 1 = 10 – 1 = 9
Critical value: The tabulated value of t at 5% level of significance for 9 d.f.
for two tailed test is 2.262.
Decision: Since the calculated value of t is less than the tabulated value of t at
5% level of significance, null hypothesis is accepted. Hence we conclude that
the mean weight of the apples in the truck is 14 ounces i.e. the data support the
owner’s claim.
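The whole procedure for this example fits in a few lines. This is a sketch under stated assumptions: `one_sample_t` is an illustrative name, and the critical value 2.262 is taken from the t-table as quoted above rather than computed.

```python
import math
import statistics

def one_sample_t(data, mu0):
    """Return (t, degrees of freedom) for H0: population mean = mu0."""
    n = len(data)
    S = statistics.stdev(data)                       # divisor n - 1
    t = (statistics.mean(data) - mu0) / (S / math.sqrt(n))
    return t, n - 1

weights = [16, 11, 14, 20, 15, 13, 14, 17, 10, 18]
t, df = one_sample_t(weights, 14)
t_tab = 2.262                                        # 5%, two-tailed, 9 d.f.
print(df, round(t, 2), "accept H0" if abs(t) < t_tab else "reject H0")
```

The function uses S/√n, which equals the s/√(n − 1) form used in the hand calculation.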
Example 2: A random sample of size 20 from a normal population gives a
sample mean of 42 and sample standard deviation of 6. Test the hypothesis that
the population mean is 44.
Solution: n=20, x = 42, s = 6, µ = 44
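H0: µ = 44 i.e. the population mean is equal to 44.
H1: µ ≠ 44 (two tailed test) i.e. the population mean is not equal to 44.
$$t = \frac{\bar{x} - \mu}{S / \sqrt{n}} = \frac{\bar{x} - \mu}{s / \sqrt{n - 1}} = \frac{42 - 44}{6 / \sqrt{20 - 1}} = \frac{-2}{1.376} = -1.45$$
∴ |t| = 1.45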
Degree of freedom = n – 1 = 20 – 1 = 19
Level of significance = 5%
Tabulated value of t at 5% level of significance for 19 d.f for two tailed test is
2.093.
Since the calculated value of |t| is less than the tabulated value of t at 5%
level of significance, the null hypothesis is accepted. Hence, we conclude that
the population mean is 44. In other words, there is no significant difference
between the sample mean and population mean.
Example 3: A sample of size 10 drawn from a normal population has a mean
31 and a variance 2.25. Is it reasonable to assume that the mean of the
population is 30? Use 1% level of significance.
Solution: In usual notations, it is given: n = 10, $\bar{x}$ = 31, s² = 2.25
Hypothesis: Ho: µ = 30 i.e. the mean of the population is 30
against H1: µ ≠ 30 i.e. the mean of the population is not 30.
Under Ho, the test statistic is:
$$t = \frac{\bar{x} - \mu}{S / \sqrt{n}} = \frac{\bar{x} - \mu}{s / \sqrt{n - 1}} = \frac{31 - 30}{\sqrt{2.25 / 9}} = \frac{3}{1.5} = 2$$
d.f. = 10 – 1 = 9
Tabulated value of t at 1% level of significance for 9 d.f. (two tailed test) = 3.25
Decision: Since the computed value of t is less than the critical (tabulated)
value of t, it is not significant i.e. null hypothesis is accepted at 1% level of
significance. Hence in the light of the data given, it may be concluded that the
population mean can be taken as 30.
t-test for difference between two independent means (the pooled t-test):
This test is used when we want to test whether two independent samples have
been drawn from two normal populations having the same mean, under the
assumption that the population variances are equal. This
method does not require that we know the standard deviations of the
populations. Because we assume that the two populations have equal
standard deviations, the best estimate we can make of that value is to combine
or pool all the sample information we have about the population standard
deviation.
We set up the null and alternative hypotheses as follows:
Ho: µx = µy, i.e. the samples have been drawn from normal populations
with the same mean. In other words, the means $\bar{x}$ and $\bar{y}$ do not differ
significantly, against the alternative hypothesis
H1: µx ≠ µy, i.e. the samples have not been drawn from normal
populations with the same mean. In other words, there is a significant
difference between the two means.
Under the assumption that $\sigma_x^2 = \sigma_y^2 = \sigma^2$, i.e. the population variances are equal
but unknown, the test statistic is given by
$$t = \frac{\bar{x} - \bar{y}}{\sqrt{S^2 \left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$
where $\bar{x} = \frac{\sum x}{n_1}$, $\bar{y} = \frac{\sum y}{n_2}$ and
$$S^2 = \frac{1}{n_1 + n_2 - 2} \left[ \sum (x - \bar{x})^2 + \sum (y - \bar{y})^2 \right]$$
when the actual mean method is used,
$$S^2 = \frac{1}{n_1 + n_2 - 2} \left[ \sum x^2 - \frac{(\sum x)^2}{n_1} + \sum y^2 - \frac{(\sum y)^2}{n_2} \right]$$
when the direct method is used, and
$$S^2 = \frac{1}{n_1 + n_2 - 2} \left[ \sum d_x^2 - \frac{(\sum d_x)^2}{n_1} + \sum d_y^2 - \frac{(\sum d_y)^2}{n_2} \right]$$
when the short cut method is used, where $d_x = x - A$ and $d_y = y - B$.
S² is an unbiased estimate of the common population variance σ², and the
statistic t follows t-distribution with n1 + n2 – 2 d.f.
When the biased estimates of the sample variances (or standard deviations) are
given, then the unbiased estimate of the common population variance is
computed as:
$$S^2 = \frac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2 - 2}$$
Similarly, when the unbiased estimates $S_1^2$ and $S_2^2$ are given,
$$S^2 = \frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{n_1 + n_2 - 2}$$
Example 1: The incomes of a random sample of engineers in industry I are:
Rs. 630, 650, 680, 690, 720 and 710 per month. The incomes of similar
sample from industry II are Rs. 610, 620, 650, 660, 690, 690, 700, 710, 720
and 730 per month. Discuss the validity of the suggestion that industry I pays
its engineers better than industry II.
H0: µ1 = µ2, i.e. there is no significant difference between the incomes of
engineers of the two industries.
H1: µ1 > µ2, i.e. industry I pays its engineers better than industry II (right
tailed test).
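Test Statistic:
$$t = \frac{\bar{x} - \bar{y}}{\sqrt{S^2 \left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$
where $\bar{x} = \frac{\sum x}{n_1}$, $\bar{y} = \frac{\sum y}{n_2}$ and
$S^2 = \frac{1}{n_1 + n_2 - 2} \left[ \sum (x - \bar{x})^2 + \sum (y - \bar{y})^2 \right]$
$\bar{x} = \frac{\sum x}{n_1} = \frac{4080}{6} = 680$
$\bar{y} = \frac{\sum y}{n_2} = \frac{6780}{10} = 678$
$S^2 = \frac{1}{6 + 10 - 2} \left[ 6000 + 15360 \right] = \frac{1}{14} \times 21360 = 1525.71$
          Industry I                        Industry II
   x     (x − x̄)   (x − x̄)²          y     (y − ȳ)   (y − ȳ)²
 ------  -------   --------        ------  -------   --------
  630     -50        2500           610     -68        4624
  650     -30         900           620     -58        3364
  680       0           0           650     -28         784
  690      10         100           660     -18         324
  720      40        1600           690      12         144
  710      30         900           690      12         144
                                    700      22         484
                                    710      32        1024
                                    720      42        1764
                                    730      52        2704
 ------  -------   --------        ------  -------   --------
  4080               6000           6780              15360
Hence,
$$t = \frac{\bar{x} - \bar{y}}{\sqrt{S^2 \left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \frac{680 - 678}{\sqrt{1525.71 \left(\frac{1}{6} + \frac{1}{10}\right)}} = \frac{2}{\sqrt{1525.71 \left(\frac{5 + 3}{30}\right)}} = \frac{2}{\sqrt{1525.71 \times \frac{8}{30}}} = \frac{2}{20.17} = 0.099$$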
Tabulated value of t at 5% level of significance for 14 (6+10-2) d.f. is 1.76.
Decision: Since the calculated value of t is less than its tabulated value, the
null hypothesis is accepted, i.e. there is no significant difference between the
incomes of engineers in industry I and industry II. Hence the suggestion is not
valid.
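The pooled calculation for this example can be reproduced as follows. It is a minimal sketch with an illustrative function name (`pooled_t`); it applies the actual mean method for S² and assumes equal population variances, as the test itself does.

```python
import math
import statistics

def pooled_t(x, y):
    """Return (t, pooled S^2, d.f.) for two independent samples."""
    n1, n2 = len(x), len(y)
    mx, my = statistics.mean(x), statistics.mean(y)
    ss_x = sum((v - mx) ** 2 for v in x)       # sum of (x - x_bar)^2
    ss_y = sum((v - my) ** 2 for v in y)       # sum of (y - y_bar)^2
    S2 = (ss_x + ss_y) / (n1 + n2 - 2)
    t = (mx - my) / math.sqrt(S2 * (1 / n1 + 1 / n2))
    return t, S2, n1 + n2 - 2

industry_1 = [630, 650, 680, 690, 720, 710]
industry_2 = [610, 620, 650, 660, 690, 690, 700, 710, 720, 730]
t, S2, df = pooled_t(industry_1, industry_2)
print(df, round(S2, 2), round(t, 3))
```

Since t is far below the one-tailed table value 1.76 for 14 d.f., the script leads to the same decision as the hand calculation.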
Exercise: The nicotine content in milligrams of two samples of tobacco was found
to be as follows:
Sample A: 24 27 26 21 25
Sample B: 27 30 28 31 22 36
Can it be said that the two samples came from normal populations having the
same mean?
Example 2: Two salesmen A and B are working in a certain district. From a
sample survey conducted by the head office, the following results were
obtained. State whether there is any significant difference in the average sales
between the two salesmen:
Salesman A Salesman B
Number of sales: 20 18
Average sales (in Rs.): 170 205
Standard deviation (in Rs.): 20 25
For A, n1 = 20, $\bar{x}_1$ = 170, s1 = 20
For B, n2 = 18, $\bar{x}_2$ = 205, s2 = 25
Null hypothesis H0: µ1 = µ2, i.e. there is no significant difference between the
average sales of the two salesmen.
Alternative hypothesis H1: µ1 ≠ µ2, i.e. there is a significant difference between
the average sales of the two salesmen.
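$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{S^2 \left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$
where
$$S^2 = \frac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2 - 2} = \frac{20 \times (20)^2 + 18 \times (25)^2}{20 + 18 - 2} = \frac{20 \times 400 + 18 \times 625}{36} = \frac{8000 + 11250}{36} = \frac{19250}{36} = 534.72$$
$$\therefore t = \frac{170 - 205}{\sqrt{534.72 \left(\frac{1}{20} + \frac{1}{18}\right)}} = \frac{-35}{\sqrt{534.72\,(0.05 + 0.06)}} = \frac{-35}{\sqrt{534.72 \times 0.11}} = \frac{-35}{\sqrt{58.8192}} = \frac{-35}{7.67} = -4.56$$
∴ |t| = 4.56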
Degree of freedom = n1 + n2 – 2 = 20 + 18 – 2 = 36
Since d.f. = 36 > 30, the t-distribution is closely approximated by the normal
distribution. Hence the tabulated value of t at 5% level of significance for two
tailed test is taken as 1.96.
Decision: Since the calculated value of t is greater than the tabulated value of t,
it is significant, hence the null hypothesis is rejected i.e. alternative hypothesis
is accepted, which means that there is a significant difference between the
average sales of two salesmen.
Hypothesis Testing for Difference of two Means with Dependent Samples
(Paired t-test)
In test of significance for difference of two means, the comparison between
two population means was made under the assumption of independent samples
drawn from normal populations. Two samples are said to be independent of
each other if the elements in one are not related to those in the other in
any meaningful manner. Two samples are dependent if they are paired in the sense
that each observation in one is associated with some particular observation in
the other. For example, if we are testing sales of goods before and after the
advertisement campaign, or testing the productivity level of workers before
and after the training program and so on, then these values are related to each
other. In each situation, we are concerned with the difference between the pair
of related observations instead of the value of the individual observations. If
two samples are dependent, they must contain the same number of elementary
units.
Let us now consider a particular situation where
(i) The sample sizes are equal, i.e. n1 = n2 = n (say), and
(ii) The sample observations (x1, x2, ..., xn) and (y1, y2, ..., yn) are
not completely independent, but they are dependent in pairs, i.e. the pairs
of observations (x1, y1), (x2, y2), ..., (xn, yn) correspond to the 1st,
2nd, ..., nth unit respectively.
The steps in the paired t-test for difference of means are as follows:
Step 1: Set up the null hypothesis and alternative hypothesis as follows:
Null hypothesis: H0: µx = µy or $\mu_d = 0$, i.e. there is no significant difference
in the observations before and after treatment.
Alternative hypothesis: H1: µx ≠ µy, i.e. there is a significant difference in the
observations before and after treatment. (Other procedures are the same.)
Under H0 the test statistic
$$t = \frac{\bar{d}}{\sqrt{S^2 / n}} = \frac{\bar{d}}{S / \sqrt{n}}$$
follows t-distribution with (n – 1) d.f.,
where d = x – y, $\bar{d} = \frac{\sum d}{n}$ and
$S^2 = \frac{1}{n-1} \sum (d - \bar{d})^2$ or $S^2 = \frac{1}{n-1} \left[ \sum d^2 - \frac{(\sum d)^2}{n} \right]$
Example: A company has reorganized its sales department. The following data
show the weekly sales figures (in Rs. lakhs) before and after the
reorganization. Can we say that sales have significantly improved due to the
reorganization?
Week : 1 2 3 4 5 6 7 8 9 10
Sales before: 12 15 13 11 17 15 10 11 18 19
Sales after : 16 17 14 13 15 14 12 11 17 22
Solution: In this problem, the sales before reorganization (x) and after
reorganization (y) are not independent, but are paired together. Hence we
shall apply paired t-test.
H0: µx = µy, i.e. there is no significant change in sales after the reorganization.
H1: µx ≠ µy, i.e. there is a significant change in sales after the reorganization
(two-tailed test).
Under H0, the test statistic is:
$$t = \frac{\bar{d}}{\sqrt{S^2 / n}}$$
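Computation of mean and standard deviation:
Week    Sales before         Sales after          d = y - x    d2
        reorganization (x)   reorganization (y)
-----   ------------------   ------------------   ---------   ----
  1            12                   16                 4        16
  2            15                   17                 2         4
  3            13                   14                 1         1
  4            11                   13                 2         4
  5            17                   15                -2         4
  6            15                   14                -1         1
  7            10                   12                 2         4
  8            11                   11                 0         0
  9            18                   17                -1         1
 10            19                   22                 3         9
-----   ------------------   ------------------   ---------   ----
Total                                             ∑d = 10     ∑d2 = 44
$\bar{d} = \frac{\sum d}{n} = \frac{10}{10} = 1$
$S^2 = \frac{1}{n-1} \left[ \sum d^2 - \frac{(\sum d)^2}{n} \right] = \frac{1}{9} \left[ 44 - \frac{(10)^2}{10} \right] = \frac{1}{9} \times 34 = 3.778$
∴ $t = \frac{1}{\sqrt{3.778 / 10}} = \frac{1}{0.615} = 1.627$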
The critical (tabulated) value of t for 9 d.f. at 5% level of significance for
two tailed test is 2.262. Since the calculated value of t is less than the
critical value of t, it is not significant at 5% level of significance. Hence,
the data do not provide any evidence against the null hypothesis, which may be
accepted. It may therefore be concluded that reorganization has no effect on
sales.
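The paired computation can be sketched in code as below. The function name `paired_t` is illustrative; differences are taken as after minus before, matching the d = y − x column of the worked table, and S² uses the direct formula with divisor n − 1.

```python
import math

def paired_t(before, after):
    """Return (t, d.f.) for paired samples, with d = after - before."""
    d = [a - b for b, a in zip(before, after)]
    n = len(d)
    d_bar = sum(d) / n
    # S^2 = [sum(d^2) - (sum d)^2 / n] / (n - 1)
    S2 = (sum(v * v for v in d) - sum(d) ** 2 / n) / (n - 1)
    return d_bar / math.sqrt(S2 / n), n - 1

before = [12, 15, 13, 11, 17, 15, 10, 11, 18, 19]
after = [16, 17, 14, 13, 15, 14, 12, 11, 17, 22]
t, df = paired_t(before, after)
print(df, round(t, 3))
```

Swapping the order of subtraction only changes the sign of t, so a two-tailed decision based on |t| is unaffected.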
Exercises:
1. A machine put out 20 imperfect items in a sample of 500. After the
machine was overhauled, it put out 5 imperfect items in a batch of 150. Has
the machine been improved after overhauling? (z= 0.39)
2. A company is considering two different advertisements for promotion of
a new product. Management believes that advertisement A is used in one
area and advertisement B in other area. In a random sample of 60
customers who saw advertisement A, 18 had tried the product. In a
random sample of 100 customers who saw advertisement B, 22 had tried
the product. Does this indicate that advertisement A is more effective
than advertisement B, if a 5% level of significance is used?
(z = 1.1315)
3. The production manager at Bullevue Steel, a manufacturer of wheelchairs,
wants to compare the number of defective wheelchairs produced on the
day shift with the number on the afternoon shift. A sample of the
production from 6 day shifts and 8 afternoon shifts revealed the
following number of defects.
Day: 5 8 7 6 9 7
Afternoon: 8 10 7 11 9 12 14 9
At the 0.05 significance level, is there a difference in the mean number
of defects per shift?
(a) State the null hypothesis and the alternative hypothesis.
(b) What is the decision rule?
(c) What is the value of the test statistic?
(d) What is your decision regarding the null hypothesis?
(e) What is the p-value?
(f) Interpret the result
(g) What are the assumptions necessary for this test?
4. For a certain sample of 10 pigs fed on diet A, the increase in weight (in
lbs) in a certain period were 10, 17, 13, 12, 9, 8, 14, 15, 6, 16. For another
sample of 12 pigs fed on diet B, the increase in weight in the same period
were 14, 18, 8, 21, 23, 10, 17, 12, 22, 15, 7, 13. Test whether diet A and B
differ significantly as regards their effect on increase in weight. (t=1.52)
5. A certain diet newly introduced to each of 12 pigs resulted in the
following increases in body weight:
6, 3, 8, -2, 3, 0, -1, 1, 6, 0, 5 and 4
Can you conclude that the diet is effective in increasing the weight of pigs?
6. Memory capacity of 10 students was tested before and after training. State
whether the training was effective or not from the following scores.
Roll no. : 1 2 3 4 5 6 7 8 9 10
Before training: 12 14 11 8 7 10 3 0 5 6
After training: 15 16 10 7 5 12 10 2 3 8
7. The sales data of an item in six shops before and after a special promotional
campaign are:
Shops: A B C D E F
Before: 53 28 31 48 50 42
After: 58 29 30 55 56 45
Can the campaign be judged to be a success?
χ²-test: The tests of significance such as the z-test, t-test etc. are based on
the assumption that the samples are drawn from a normal population, i.e. we
make assumptions about the population parameters. Such tests are called
parametric tests. However, in many situations it is not possible to make
dependable assumptions about the parent population from which the samples
are drawn. In such situations non-parametric tests are used, which do not
require any assumptions about the parameters. The χ² (chi-square) test is one
of the most important non-parametric tests.
Definition: A measure of discrepancy between observed (experimental) and
expected (theoretical) frequencies is known as the χ² statistic, which is
defined as
$$\chi^2 = \sum \frac{(O - E)^2}{E} = \frac{(O_1 - E_1)^2}{E_1} + \frac{(O_2 - E_2)^2}{E_2} + \dots + \frac{(O_n - E_n)^2}{E_n}$$
It follows the χ² distribution with n – 1 d.f., where O1, O2, ..., On are the
observed frequencies and E1, E2, ..., En are the corresponding expected or
theoretical frequencies.
If N is the total frequency, ∑O = ∑E = N.
When χ² = 0, the observed and expected frequencies agree exactly. It is
clear that the greater the value of χ², the greater is the discrepancy between
the observed and expected frequencies.
Conditions for using χ²-test: The following are the conditions for using the
χ²-test.
(a) The frequencies used in the χ² test must be absolute.
(b) The total number of observations must be at least 50.
(c) The observations must be independent of each other.
(d) The expected frequency of any item or cell must not be less
than 5. If it is less than 5, the frequencies of adjacent items or cells should
be pooled together in order to make it 5 or more.
(e) The χ²-test cannot be used for estimating the value of a population
parameter.
Application of χ²-test: The χ²-test is used broadly
(i) As a test of goodness of fit
(ii) As a test of independence
χ² test as a Test of Goodness of Fit: If a set of observed frequencies from
some experiment is given and we want to test whether the experimental
results support a particular hypothesis, we use the test of significance
developed by Karl Pearson, called the χ²-test of goodness of fit. It is used to
test whether there is a difference between observed (experimental) and expected
(theoretical) values.
Under the null hypothesis Ho that there is no significant difference
between observed and expected values (or there is a good compatibility
between theory and experiment), Karl Pearson defined chi square as
$$\chi^2 = \sum \frac{(O - E)^2}{E} = \frac{(O_1 - E_1)^2}{E_1} + \frac{(O_2 - E_2)^2}{E_2} + \dots + \frac{(O_n - E_n)^2}{E_n}$$
which follows the χ² distribution with n – 1 d.f., where O1, O2, ..., On are the
observed frequencies and E1, E2, ..., En are the corresponding expected or
theoretical frequencies.
Example 1. The automobile accidents per week in a certain community were
as follows:
12 8 20 2 14 10 15 6 9 4
Are these frequencies in agreement with the belief that accident conditions
were the same during the 10 week period?
Null Hypothesis: Given frequencies (or number of accidents per week in a
certain community) are consistent with the belief that the accident conditions
were same during the 10 week period.
Observed no.       Expected no.
of accidents (O)   of accidents (E)   O – E   (O – E)2   (O – E)2 / E
----------------   ----------------   -----   --------   ------------
      12                 10              2         4          0.4
       8                 10             -2         4          0.4
      20                 10             10       100         10
       2                 10             -8        64          6.4
      14                 10              4        16          1.6
      10                 10              0         0          0
      15                 10              5        25          2.5
       6                 10             -4        16          1.6
       9                 10             -1         1          0.1
       4                 10             -6        36          3.6
----------------   ----------------   -----   --------   ------------
     100                100                                   26.6
We have, $\chi^2 = \sum \frac{(O - E)^2}{E} = 26.6$
d.f. = 10 – 1 = 9
χ² tabulated for 9 d.f. at 5% level of significance = 16.9
Since the calculated value of χ² is greater than the tabulated value, the
null hypothesis is rejected, i.e. the accident conditions were not the same
during the period of 10 weeks.
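The goodness-of-fit statistic is a single sum, which the sketch below computes. The function name `chi_square_gof` is illustrative; when no expected frequencies are passed it assumes equal expected frequencies in every class, which is the hypothesis tested in this example.

```python
def chi_square_gof(observed, expected=None):
    """Return (chi-square, d.f.) for observed vs expected frequencies."""
    if expected is None:
        # Assumption: uniform expectation, N / k in each of the k classes.
        expected = [sum(observed) / len(observed)] * len(observed)
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, len(observed) - 1

accidents = [12, 8, 20, 2, 14, 10, 15, 6, 9, 4]
chi2, df = chi_square_gof(accidents)
chi2_tab = 16.9                       # table value, 9 d.f., 5% level
print(df, round(chi2, 1), "reject H0" if chi2 > chi2_tab else "accept H0")
```

Passing an explicit `expected` list covers cases where the hypothesised frequencies are not equal, such as frequencies in a given ratio.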
Example 2: In a set of random numbers the digits 0, 1, ..., 9 were
found to have the following frequencies:
Digit: 0 1 2 3 4 5 6 7 8 9 Total
f : 43 32 38 27 38 52 36 31 39 24 360
Test whether they are significantly different from those expected on the
hypothesis of uniform distribution.
Example 3: A sample analysis of examination results of 200 students was
made. It was found that 46 students had failed, 68 secured a third division, 62
secured a second division and the rest were placed in the first division. Are
these figures commensurate with the general examination result, which is in the
ratio 2 : 3 : 3 : 2 for the various categories respectively? (The table value
of χ² for 3 d.f. at 5% level of significance is 7.8)
Solution: Null Hypothesis H0: The observed figures do not differ significantly
from the hypothetical frequencies, which are in the ratio of 2 : 3 : 3 : 2. In other
words, there is a good correspondence between the given data and the general
examination result.
Under the null hypothesis, the expected frequencies for various
categories are calculated as follows:
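We have, the ratio of failed : third division : second division : first division
= 2 : 3 : 3 : 2, and the sum of the ratio terms is 2 + 3 + 3 + 2 = 10.
Hence, the expected frequency of failed students = $\frac{2}{10} \times 200 = 40$
the expected frequency of third division students = $\frac{3}{10} \times 200 = 60$
the expected frequency of second division students = $\frac{3}{10} \times 200 = 60$
the expected frequency of first division students = $\frac{2}{10} \times 200 = 40$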
Computation of the value of χ²
Category          Observed        Expected        O - E   (O - E)2   (O - E)2 / E
                  frequency (O)   frequency (E)
---------------   -------------   -------------   -----   --------   ------------
Failed                 46              40            6        36         0.90
IIIrd division         68              60            8        64         1.07
IInd division          62              60            2         4         0.07
Ist division           24              40          -16       256         6.40
---------------   -------------   -------------   -----   --------   ------------
Total                 200             200            0                   8.44
$\chi^2 = \sum \frac{(O - E)^2}{E} = 8.44$
Here, d.f. = 4 - 1 = 3
Tabulated value of χ² at 5% level of significance for 3 d.f. = 7.81
Since the calculated value of χ² is greater than the tabulated value, it is
significant. Thus, the null hypothesis is rejected at 5% level of significance.
Hence we conclude that the data are not commensurate with the general
examination result.
χ² test as a test of Independence of Attributes: In the χ² test of goodness of
fit we used a one way classification table of observed frequencies in a single
row or column. When the observed frequencies occupy r rows and c columns, a two
way classification table is formed and such a table is known as a contingency
table.
For testing the independence of two attributes, say A and B with r×c
contingency table, we set up the null hypothesis as:
H0: two attributes A and B are independent i.e. there is no association between
the two attributes A and B against the alternative hypothesis
H1: the attributes A and B are not independent (i.e. there is an association
between the two attributes A and B).
Then, we compute the expected cell frequencies by using the relation
$$E = \frac{\text{Row total} \times \text{Column total}}{\text{Total no. of observations}} = \frac{RT \times CT}{N}$$
Under Ho, we compute the test statistic
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
which follows the χ² distribution with (r-1)(c-1) degrees of freedom, where
r×c = total number of cells in the contingency table.
Now comparing this calculated value with the tabulated value for (r-1)×(c-1)
degrees of freedom at a certain level of significance, we reject or accept the
null hypothesis of independence of attributes at that level of significance.
Example 4: Two factories, using materials purchased from the same suppliers
and closely controlled to an agreed specification, produce output for a given
period classified into three quality grades as follows.
              Quality Grades (outputs in tons)
Factory       A      B      C      Total
   X          42     13     33      88
   Y          20      8     25      53
Total         62     21     58     141
Test whether the outputs produced by the two factories are significantly
different at 5% level.
Ho: The outputs produced by the two factories are not significantly different.
H1: The outputs produced by the two factories are significantly different.
Computation of expected frequencies:
Expected frequency of a11 = $\frac{88 \times 62}{141}$ = 38.69
Expected frequency of a12 = $\frac{88 \times 21}{141}$ = 13.11
Expected frequency of a13 = $\frac{88 \times 58}{141}$ = 36.20
Expected frequency of a21 = $\frac{53 \times 62}{141}$ = 23.30
Expected frequency of a22 = $\frac{53 \times 21}{141}$ = 7.89
Expected frequency of a23 = $\frac{53 \times 58}{141}$ = 21.80
Computation of χ²:
  O       E       O - E    (O – E)2   (O – E)2 / E
---------------------------------------------------
 42     38.69     3.31     10.923       0.283
 13     13.11    -0.11      0.011       0.001
 33     36.20    -3.2      10.227       0.283
 20     23.30    -3.3      10.923       0.467
  8      7.89     0.11      0.011       0.001
 25     21.80     3.2      10.234       0.470
---------------------------------------------------
                                        1.505
∴ $\chi^2 = \sum \frac{(O - E)^2}{E} = 1.505$
d.f. = (r – 1)(c – 1) = (2 - 1)(3 - 1) = 2
Tabulated value of χ² at 5% level of significance for 2 d.f. is 5.991.
Decision: Since the computed value of χ² is less than its tabulated value at 5%
level of significance for 2 d.f., Ho is accepted, i.e. we conclude that the
outputs produced by the two factories are of almost the same quality.
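For an r×c table the expected frequencies and the statistic can be computed together. The following is a minimal sketch with an illustrative function name (`chi_square_independence`); the table is passed as a list of rows of observed frequencies.

```python
def chi_square_independence(table):
    """Return (chi-square, d.f.) for an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    N = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / N   # E = RT x CT / N
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

factories = [[42, 13, 33],    # factory X, grades A, B, C
             [20, 8, 25]]     # factory Y
chi2, df = chi_square_independence(factories)
print(df, round(chi2, 3), "accept H0" if chi2 < 5.991 else "reject H0")
```

The result agrees with the hand calculation to about three decimal places; any remaining difference is due to rounding the expected frequencies to two decimals in the table.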
2×2 Contingency Table:
We can write a 2×2 contingency table as
   a        b        a + b
   c        d        c + d
 a + c    b + d    N = a + b + c + d
Under the null hypothesis of independence of attributes, the value of χ² for a
2×2 contingency table can easily be obtained by using the formula
$$\chi^2 = \frac{N (ad - bc)^2}{(a + c)(b + d)(a + b)(c + d)}$$
Example 5:. Do the following data provide evidence of the effectiveness of
inoculation?
Attacked Not attacked Total
Inoculated 20 300 320
Not inoculated 80 600 680
Total 100 900 1000
(χ² for 3 d.f. at 5% = 7.81, for 1 d.f. at 5% = 3.841)
Ho: Inoculation is not effective
H1: Inoculation is effective
$$\chi^2 = \frac{N (ad - bc)^2}{(a + c)(b + d)(a + b)(c + d)} = \frac{1000\,(20 \times 600 - 80 \times 300)^2}{320 \times 680 \times 100 \times 900} = \frac{1000\,(12000 - 24000)^2}{320 \times 680 \times 100 \times 900} = \frac{1000 \times 144000000}{19584000000} = \frac{144000}{19584} = 7.35$$
d.f. = (r-1)(c-1) = (2-1)(2-1) = 1
Tabulated value of χ² at 5% level of significance for 1 d.f. is 3.84.
Decision: Since the computed value of χ² is greater than its tabulated value,
the null hypothesis is rejected, i.e. the inoculation is effective.
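The 2×2 shortcut formula is easy to check directly. In this sketch the function name `chi_square_2x2` is illustrative, and the arguments follow the a, b, c, d layout of the 2×2 table given earlier.

```python
def chi_square_2x2(a, b, c, d):
    """Chi-square for a 2 x 2 table via N(ad - bc)^2 / product of margins."""
    N = a + b + c + d
    return N * (a * d - b * c) ** 2 / ((a + c) * (b + d) * (a + b) * (c + d))

# Inoculation example: a = 20, b = 300, c = 80, d = 600.
chi2 = chi_square_2x2(20, 300, 80, 600)
print(round(chi2, 2), "reject H0" if chi2 > 3.841 else "accept H0")
```

The shortcut gives the same value as summing (O − E)²/E over the four cells, so it is only a computational convenience for tables with 1 d.f.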
Example 6: The following table gives the classification of 100 workers
according to sex and nature of work. Test whether the nature of work is
independent of the sex of the worker.
            Skilled    Unskilled
Males:        40          20
Females:      10          30
(χ² = 16.67, reject the null hypothesis.)
Example 7: In a survey of 200 boys, of which 75 were intelligent, 40 had
skilled fathers, while 85 of the unintelligent boys had unskilled fathers. Do
these figures support the hypothesis that skilled fathers have intelligent boys?
Use the χ² test.
Solution: H0: The two attributes viz. 'skill of fathers' and 'intelligence of boys'
are independent. In other words, skilled fathers do not have intelligent boys.
The observed frequencies are tabulated in the following table:
                    Intelligent boys   Unintelligent boys   Total
Skilled fathers     40                 125 - 85 = 40        40 + 40 = 80
Unskilled fathers   120 - 85 = 35      85                   200 - 80 = 120
Total               75                 200 - 75 = 125       200
Computation of Expected frequencies:
Exp. freq. of intelligent boys who had skilled fathers (a11) = $\frac{80 \times 75}{200}$ = 30
Exp. freq. of unintelligent boys who had skilled fathers (a12) = $\frac{80 \times 125}{200}$ = 50
Exp. freq. of intelligent boys who had unskilled fathers (a21) = $\frac{120 \times 75}{200}$ = 45
Exp. freq. of unintelligent boys who had unskilled fathers (a22) = $\frac{120 \times 125}{200}$ = 75
Computation of the value of χ²
Observed        Expected        O - E   (O - E)2   (O - E)2 / E
frequency (O)   frequency (E)
-------------   -------------   -----   --------   ------------
     40              30           10       100        3.333
     40              50          -10       100        2.000
     35              45          -10       100        2.222
     85              75           10       100        1.333
-------------   -------------   -----   --------   ------------
                                                      8.888
∴ $\chi^2 = \sum \frac{(O - E)^2}{E} = 8.888$
d.f. = (r – 1)(c – 1) = (2 - 1)(2 - 1) = 1
Tabulated value of χ² at 5% level of significance for 1 d.f. = 3.841
Since the calculated value of χ² is greater than the tabulated value of χ² at 5%
level of significance, it is significant and the null hypothesis is rejected.
Hence we conclude that the skill of the fathers has a significant effect on the
intelligence of the boys.
Yates' Correction for Continuity for 2×2 contingency table: If any cell
frequency in a contingency table is less than 5, we normally use the technique
of pooling, which consists of adding the frequencies that are less than 5 to
the preceding or succeeding frequency to get a sum of 5 or more.
In a 2×2 contingency table, the d.f. is (r – 1)(c – 1) = (2 – 1)(2 – 1) = 1.
But 1 d.f. is lost in pooling. So the number of d.f. for a 2×2 contingency table
after pooling = 1 – 1 = 0, which is meaningless, since χ² must have at least 1
d.f. In this case we apply a correction given by F. Yates, which is usually
known as ‘Yates' correction for continuity’. χ² is a continuous distribution,
and it fails to maintain its character of continuity if any expected frequency
or observed frequency is less than 5; hence the name correction for continuity.
The working rule for the application of the correction is to add 0.5 to the cell
frequency which is less than 5 and adjust the remaining frequencies
accordingly, keeping the row and column totals fixed. Then apply the χ²-test
without pooling.
Let us write the 2×2 contingency table as
   a        b        a + b
   c        d        c + d
 a + c    b + d    N = a + b + c + d
Then the χ² after Yates' correction is
$$\chi^2 = \frac{N \left( |ad - bc| - \frac{N}{2} \right)^2}{(a + b)(c + d)(a + c)(b + d)}$$
Example 8: In an experiment on the immunization of goats from Anthrax,
the following results were obtained. Derive your inference on the efficiency
of the vaccine.
                          Died of Anthrax   Survived   Total
Inoculated with vaccine         10             26        36
Not inoculated                  16              4        20
Total                           26             30        56
Solution: Since one cell frequency is 4, which is less than 5, we should apply
Yates' correction for calculating χ². For this, add 0.5 to the cell frequency
which is less than 5 and adjust the remaining frequencies, keeping the row
totals and column totals fixed. Thus, the adjusted 2×2 contingency table is
presented in the following table
 a = 10.5       b = 25.5       a + b = 36
 c = 15.5       d = 4.5        c + d = 20
 a + c = 26     b + d = 30     N = a + b + c + d = 56
H0 : The vaccine is not effective in preventing from Anthrax
H1 : The vaccine is effective in preventing from Anthrax.
  O      E = RT × CT / N    O – E    (O – E)2   (O – E)2 / E
------   ---------------   -------   --------   ------------
 10.5        16.71          -6.21     38.56        2.31
 25.5        19.29           6.21     38.56        2.00
 15.5         9.29           6.21     38.56        4.15
  4.5        10.71          -6.21     38.56        3.60
------   ---------------   -------   --------   ------------
                                                  12.06
Test Statistic: Under H0, $\chi^2 = \sum \frac{(O - E)^2}{E} = 12.06$
d.f. = (r-1)(c-1) = (2-1)(2-1) = 1
Tabulated value of χ² at 5% level of significance for 1 d.f. is 3.84.
Decision: Since the calculated value of χ² is greater than the tabulated value
of χ², the null hypothesis is rejected, i.e. the alternative hypothesis is
accepted. Hence, we conclude that the inoculation is effective.
Alternative Method:
                          Died of Anthrax   Survived    Total
Inoculated with vaccine       a = 10         b = 26     a + b = 36
Not inoculated                c = 16         d = 4      c + d = 20
Total                       a + c = 26     b + d = 30   N = 56
Applying Yates' correction,
$$\chi^2 = \frac{N \left( |ad - bc| - \frac{N}{2} \right)^2}{(a + b)(c + d)(a + c)(b + d)} = \dots = 12.06$$
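The corrected formula can be evaluated directly from the unadjusted cell frequencies. This is a minimal sketch with an illustrative function name (`chi_square_yates`), not a reference implementation.

```python
def chi_square_yates(a, b, c, d):
    """Chi-square for a 2 x 2 table with Yates' correction for continuity."""
    N = a + b + c + d
    numerator = N * (abs(a * d - b * c) - N / 2) ** 2
    return numerator / ((a + b) * (c + d) * (a + c) * (b + d))

# Goat immunization example: a = 10, b = 26, c = 16, d = 4.
chi2 = chi_square_yates(10, 26, 16, 4)
print(round(chi2, 2), "reject H0" if chi2 > 3.84 else "accept H0")
```

Evaluated without intermediate rounding, the formula gives about 12.08; the value 12.06 in the table method comes from rounding the expected frequencies and differences to two decimals, and the decision is the same.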