Tests of significance
Z & Chi square Test
Dr. Shaikh B.M.
Dept. of Community Medicine
Basic terms (recap)……….
Level of significance and confidence,
P value, power of the test, degree of freedom
Tests of significance and type
Selection of the test of significance
Steps in hypothesis testing
Z test (SEDP)
Chi Square test
Limitations of the tests of significance
• Research done on samples and not on populations.
• Variability of observations occur among different samples.
• This complicates whether the observed difference is due to
biological or sampling variation from true variation.
• To conclude actual difference, we use tests of significance
Test of significance
• The test which is done for testing the research hypothesis
against the null hypothesis.
Why it is done?
To assist administrations and clinicians in making decision.
The difference is real ? or
Has it happened by chance ?
Null Hypothesis (HO)
• 1st step in testing any hypothesis.
• Set up such that it conveys a meaning that there exists no
difference between the different samples.
• Eg: Null Hypothesis – The mean pulse rate among the two
groups are same (or) there is no significant difference between
their pulse rates.
• By using various tests of significance we either:
–Reject the Null Hypothesis
–Accept the Null Hypothesis
• Rejecting null hypothesis → difference is significant.
• Accepting null hypothesis → difference is not significant.
Level of significance and confidence
• Significance means the percentage risk to reject a null
hypothesis when it is true and it is denoted by 𝛼.Generally
• (1 − 𝛼) is the confidencelevelin which the null hypothesis will
exist when it is true.
Level of Significance – “P” Value
• p-value is a function of the observed sample results
(a statistic) that is used for testing a statistical hypothesis.
• It is the probability of null hypothesis being true. It can accept
or reject the null hypothesis based on P value.
• Practically, P < 0.05 (5%) is considered significant.
• P = 0.05 implies,
– We may go wrong 5 out of 100 times by rejecting null
– Or, We can attribute significance with 95% confidence.
5% Significance level & 95% confidence level
Acceptance and Rejection
(𝛼 = 0.025 𝑜𝑟 2.5%)
(𝛼 = 0.025 𝑜𝑟 2.5%)
𝑇𝑜𝑡𝑎𝑙 𝐴𝑐𝑐𝑒𝑝𝑡𝑎𝑛𝑐𝑒 𝑟𝑒𝑔𝑖𝑜𝑛
(1 − 𝛼) = 95%
Power of the test &
Degree of freedom
Power of the test :
Type II error is β and 1−β is called power of the test.
Probability of rejecting False H0, i. e. taking correct decision
Degree of freedom:
• Number of independent observations used in statistics (d. f.)
Various tests of significance
1. Parametric – Data is or assumed to be normal distributed.
2. Non parametric – Data is not normal distributed.
Parametric tests :
For Qualitative data:-
1. Z test
2. 2. Chi-square test or X2
For Quantitative data:-
1. Unpaired ‘t’ test
2. Paired ‘t’ test
Selection of the test
1. Type of data
2. Size of sample
3. Number of samples
Selection of the test
n > 30
1, 2 samples
n </> 30
Small / large
More than 2
n </> 30
Small / large
test / Paired t-
More than 2
Steps in Testing a Hypothesis
• General procedure in testing a hypothesis
1. Set up a null hypothesis (HO).
2. Define alternative hypothesis (HA).
3. Calculate the test statistic (Z, X2, t etc.).
4. Find out the corresponding probability level (P Value) for
the calculated test statistic from relevant tables.
5. Accept or reject the Null hypothesis depending on P
P> 0.05 HO accepted
P < 0.05 HO rejected
[Standard error of difference between proportions (SEp1-p2)]
• For comparing qualitative data between 2 groups.
• Used for large samples only. (> 30 in each group).
• Standard error of difference between proportions (SEp1-p2) is
calculated by using the formula
• And then the test statistic ‘Z Score’ value is calculated by
using the formula:
• If Z > 1.96 P < 0.05 SIGNIFICANT - H0 rejected
• If Z < 1.96 P > 0.05 NOT SIGNIFICANT- H0 accepted
• Finding p value from Z table
• Go to the row that represents the ones digit and the first digit
after the decimal point (the tenths digit) of your z-value.
• Go to the column that represents the second digit after the
decimal point (the hundredths digit) of your z-value.
• Intersect the row and column from Steps 1 and 2.
1.0000 - 0.9772
• Consider a hypothetical study where cure rate of Typhoid fever
after treatment with Ciprofloxacin and Ceftriaxone were
recorded to be 90% and 80% among 100 patients treated with
each of the drug.
• How can we determine whether cure rate of Ciprofloxacin is
better than Ceftriaxone?
Step – 1: Set up a null hypothesis – H0:
– “There is no significant difference between cure rates of
Ciprofloxacin and Ceftriaxone.”
Step – 2: Define alternative hypothesis – Ha:
– “Ciprofloxacin is 1.125 times better in curing typhoid fever
Step – 3: Calculate the test statistic – ‘Z Score’
Step – 3: Calculating Z score
– Here, P1 = 90, P2 = 80
– SEP1-P2 will be given by the formula:
So, Z = 10/5 = 2
Step – 4: Find out the corresponding P Value
– Since Z = 2 i.e., > 1.96, hence, P < 0.05
Step – 5: Accept or reject the Null hypothesis
– Since P < 0.05, So we reject the null hypothesis(H0)
There is no significant difference between cure rates of
Ciprofloxacin and Ceftriaxone
– And we accept the alternate hypothesis (Ha)
that, Ciprofloxacin is 1.125 times better in curing typhoid
fever than Ceftriaxone
Chi square (X2) test
• This test is also for testing qualitative data.
• Its advantage over Z test is:
– Can be applied for smaller samples as well as for large
• Prerequisites for Chi square (X2) test to be applied:
– The sample must be a random sample
– None of the observed values must be zero.
– Adequate cell size
Steps in Calculating (X2) value
1. Make a contingency table mentioning the frequencies in
2. Determine the expected value (E) in each cell.
3. Calculate the difference between observed and expected
values in each cell (O-E).
4. Calculate X2 value for each cell
5. Sum up X2 value of each cell to get X2 value of the table.
6. Find out p value from x2 table
7. If p > 0.05 , difference is not significant, null hypothesis
If p < 0.05 , difference is significant , null hypothesis rejected.
• Consider a study done in a hospital where cases of breast
cancer were compared against controls from normal
population against with a family history of Ca Breast.
• 100 in each group were studied for presence of family history.
• 25 of cases and 15 among controls had a positive family
• Comment on the significance of family history in breast
• From the numbers, it suggests that family history is 1.66
(25/15) times more common in Ca breast.
• So is it a risk factor in population?
• We need to test for the significance of this difference.
• We shall apply X2 test.
Step – 1: Set up a null hypothesis
– H0: “There is no significant difference between incidence of
family history among cases and controls.”
Step – 2: Define alternative hypothesis
– Ha: “Family history is 1.66 times more common in Ca
Step – 3: Calculating X2
1. Make a contingency table mentioning the frequencies in all
Step – 3: Cont..
2. Determine the expected value (E) for each cell.
O – observed values , E – expected value
Step – 4: Determine degree of freedom.
• DoF is determined by the formula:
DoF = (r-1) x (c-1)
where r and c are the number of rows and columns
• Here, r = c = 2.
• Hence, DoF = (2-1) x (2-1) = 1
Step – 5: Find out the corresponding P Value
– P values can be derived by using the X2 distribution
Step – 6: Accept or reject the Null hypothesis
• In given scenario,
X2 = 3.125
• This is less than 3.84 (for P = 0.05 at dof =1)
• Hence Null hypothesis is Accepted, i.e.,
“There is no significant difference between incidence of
family history among cases and controls”
Limitations of tests of significance
Testing of hypothesis is not decision making itself; but it helps
for decision making
Test does not explain the reasons as why the difference
exists, tests do not tell about the reason causing the difference.
Tests are based on the probabilities and as such can not be
expressed with full certainty.
Statistical inferences based on the significance tests can
not be said to be entirely correct evidences concerning
the truth of the hypothesis.
Formal statement or an assumption made about relationships between variables
2 tail test 1 tail test
Z 1.96 – 0.05 , z 1.64 – 0.05
Single mean 𝑋 − µ
Z = ---------- 3. single proportion z = 𝑝 −P 𝑃(1−𝑃) 𝑛
Two mean Z = (𝑋1−𝑋2) 𝑆1 𝑛 2 2 − 𝑠2 𝑛 2 2
P value for z score 2 1.0000 - 0.9772 =0.0228