Historical aspect.
Basis of statistical inference.
Hypothesis and its testing.
Characteristics of Hypothesis.
Null Hypothesis.
Alternative Hypothesis.
Interpreting the result of Hypothesis.
Type I error and Type II error.
One-tailed test, Two-tailed test.
Effect of sample size.
Test of significance.
Parametric vs Non-Parametric test.
Parametric test.
Non-Parametric test.
References.
The term statistical significance was coined by Ronald Fisher (1890-1962).
Student’s t-test: William Sealy Gosset.
Statistical inference is the branch of statistics concerned with using probability concepts to deal with uncertainty in decision making.
It refers to the process of selecting and using a sample to draw inferences about the population from which the sample is drawn.
Statistical Inference
A. Estimation of population value
• Point estimation: mean, proportion estimation
• Range estimation: confidence interval estimation
B. Testing of hypothesis
• During an investigation, assumptions and presumptions are made which the study must subsequently prove or disprove.
• A hypothesis is a supposition made from observation. On the basis of the hypothesis we collect data.
• A hypothesis is a tentative justification, the validity of which remains to be tested.
Two hypotheses are made to draw inferences from sample values:
A. Null Hypothesis, or hypothesis of no difference.
B. Alternative Hypothesis, or hypothesis of significant difference.
The Null Hypothesis is symbolized as H0 and the Alternative Hypothesis as H1 or HA.
In hypothesis testing we proceed on the basis of the Null Hypothesis, while always keeping the Alternative Hypothesis in mind.
The Null Hypothesis and the Alternative Hypothesis are chosen before the sample is drawn.
1. Hypothesis should be clear and precise.
2. Hypothesis should be capable of being tested.
3. It should state relationship between variables.
4. It must be specific.
5. It should be stated as simply as possible.
6. It should be amenable to testing within a reasonable time.
7. It should be consistent with known facts.
A Null Hypothesis, or hypothesis of no difference (H0), between a sample statistic and a population parameter, or between the statistics of two samples, nullifies the claim that the experimental result is different from or better than the one already observed. In other words, the Null Hypothesis states that the observed difference is entirely due to sampling error, that is, it has occurred purely by chance.
There is no difference between the operational
procedures of open prostatectomy and TURP.
There is no difference between open operation and
transsphenoidal approach.
There is no difference in the incidence of measles
between vaccinated and non-vaccinated children.
The drug chloramphenicol is as good as the drug cotrimoxazole in treating enteric fever.
The Alternative Hypothesis, or hypothesis of significant difference, states that the sample result is different from, that is, greater or smaller than, the hypothetical value of the population.
A test of significance, such as the Z-test, t-test or chi-square test, is performed to decide whether to accept the Null Hypothesis or to reject it and accept the Alternative Hypothesis.
When the result falls within the zone of acceptance at the 5% level of significance, our test accepts H0 as true.
When the estimate falls in the zone of rejection, the test rejects H0 as false.
Zone of acceptance: if the result of a sample falls in the plain area, i.e. within the mean +/- 1.96 standard errors, the Null Hypothesis is accepted. This area is called the zone of acceptance.
Zone of rejection: if the result of a sample falls outside the plain area, i.e. beyond the mean +/- 1.96 standard errors, it is significantly different from the population value, so the Null Hypothesis is rejected and the Alternative Hypothesis is accepted. This area is called the zone of rejection.
When a Null Hypothesis is tested, there are four possible outcomes:
i. The Null Hypothesis is true but our test rejects it.
ii. The Null Hypothesis is false but our test accepts it.
iii. The Null Hypothesis is true and our test accepts it.
iv. The Null Hypothesis is false and our test rejects it.
Type I error: rejecting the Null Hypothesis when it is true. It is called the ‘α error’.
Type II error: accepting the Null Hypothesis when it is false. It is called the ‘β error’.
Decision      Accept H0          Reject H0
H0 true       Correct decision   Type I error
H0 false      Type II error      Correct decision
                           Decision
                           Same effect        More effective
New regime is not better   Correct decision   Error
New regime is better       Error              Correct decision
                                                             The Null Hypothesis is
Decision                                                     True                     False
Accept if p >= 0.05 (non-significant), conclusion negative   1-α (confidence level)   β (Type II error)
Reject if p < 0.05 (significant), conclusion positive        α (Type I error)         1-β (power of the test)
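The p-value is the probability, calculated assuming the Null Hypothesis is true, of obtaining a difference at least as large as the one observed. A small p-value therefore means that such a difference would rarely arise by chance when actually there is none. The cut-off below which the Null Hypothesis is rejected is the level of significance α, the probability of committing a Type I error.
When the p-value is between 0.05 and 0.01, the result is usually called significant.
When the p-value is less than 0.01, the result is often called highly significant.
When the p-value is less than 0.001, the result is taken as very highly significant.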
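As a rough illustration of how a p-value is obtained and labelled, the sketch below converts a Z statistic into a two-tailed p-value using the standard normal distribution from Python's standard library, then applies the conventional cut-offs of 0.05, 0.01 and 0.001. The function names and the example value z = 1.96 are assumptions made for illustration only.

```python
from statistics import NormalDist


def two_tailed_p_value(z):
    """Probability of a standard normal value at least as extreme as |z|."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def describe(p):
    """Label a p-value with the conventional cut-offs 0.05, 0.01 and 0.001."""
    if p < 0.001:
        return "very highly significant"
    if p < 0.01:
        return "highly significant"
    if p < 0.05:
        return "significant"
    return "not significant"


p = two_tailed_p_value(1.96)  # hypothetical test statistic
print(round(p, 3), describe(p))
```

A Z value right at the 5% boundary gives a p-value of about 0.05, which is why 1.96 appears as the limit of the zone of acceptance.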
The statistical power of a test is the probability that a study or trial will be able to detect a specified difference. It is calculated as 1 minus the probability of a Type II error, i.e. the probability of correctly concluding that a difference exists when it is indeed present. Thus, power = 1-β.
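To make power = 1-β concrete, here is a minimal sketch for one simple case: a right-tailed Z-test of a mean with a known standard deviation. The function name, the assumption of a known SD, and the numbers used (a true difference of 5 units, SD 15) are all hypothetical and not taken from the slides.

```python
from math import sqrt
from statistics import NormalDist


def power_one_tailed_z(delta, sigma, n, alpha=0.05):
    """Power to detect a true mean shift `delta` with a right-tailed Z-test."""
    z_crit = NormalDist().inv_cdf(1.0 - alpha)  # critical value for alpha
    shift = delta * sqrt(n) / sigma             # true shift in standard-error units
    beta = NormalDist().cdf(z_crit - shift)     # P(fail to reject H0 | H0 false)
    return 1.0 - beta


# Hypothetical values: true difference of 5 units, SD of 15
for n in (36, 100):
    print(n, round(power_one_tailed_z(5, 15, n), 3))
```

In this setting the power rises as n grows, because the standard error shrinks while the specified difference stays the same.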
Confidence Interval: the interval within which a parameter value is expected to lie with a certain confidence level, as could be revealed by repeated samples.
Confidence Level: the degree of assurance that an interval contains the value of a parameter (1-α).
ONE-TAILED TEST
If HA states that the parameter is < some value, the critical region occupies the left tail.
If HA states that the parameter is > some value, the critical region occupies the right tail.
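RIGHT-TAILED TEST
H0: μ = 100
H1: μ > 100 (the inequality points right)
[Figure: sampling distribution centred at 100. To the left of Zcrit: fail to reject H0. To the right of Zcrit: reject H0; this tail has area alpha and holds the values that differ “significantly” from 100.]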
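LEFT-TAILED TEST
H0: μ = 100
H1: μ < 100 (the inequality points left)
[Figure: sampling distribution centred at 100. To the right of Zcrit: fail to reject H0. To the left of Zcrit: reject H0; this tail has area alpha and holds the values that differ “significantly” from 100.]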
TWO-TAILED HYPOTHESIS TESTING
• HA is that μ is either greater or less than μH0
HA: μ ≠ μH0
• α is divided equally between the two tails of the critical region.
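The effect of splitting α between two tails can be seen by computing the critical Z values directly. This is a small sketch using Python's standard library; the variable names and the choice of α = 0.05 are assumptions for illustration.

```python
from statistics import NormalDist

alpha = 0.05
std_normal = NormalDist()  # standard normal: mean 0, SD 1

z_right = std_normal.inv_cdf(1 - alpha)    # right-tailed: reject H0 if Z > z_right
z_left = std_normal.inv_cdf(alpha)         # left-tailed: reject H0 if Z < z_left
z_two = std_normal.inv_cdf(1 - alpha / 2)  # two-tailed: reject H0 if |Z| > z_two

print(round(z_right, 3), round(z_left, 3), round(z_two, 3))
```

With α/2 in each tail, the two-tailed critical value is further from the centre than the one-tailed value, so a given result needs to be more extreme to be declared significant.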
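TWO-TAILED HYPOTHESIS TESTING
H0: μ = 100
H1: μ ≠ 100
[Figure: sampling distribution centred at 100 with Zcrit marked in each tail. Between the two Zcrit values: fail to reject H0. Beyond either Zcrit: reject H0; the two tails together have area alpha and hold the means less than or greater than 100 that differ significantly from 100.]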
With large n (say, n > 30), the assumption of a normal population distribution is not important.
For a given observed sample mean and standard deviation, the larger the sample size n, the larger the test statistic (because the standard error in its denominator is smaller) and the smaller the P-value.
We are more likely to reject a false H0 when we have a larger sample size (the test then has more “power”).
With large n, “statistical significance” is not the same as “practical significance.”
Test of significance is a formal procedure for
comparing observed data with a claim (also called a
hypothesis) whose truth we want to assess.
Test of significance is used to test a claim about an
unknown population parameter.
A significance test uses data to evaluate a hypothesis
by comparing sample point estimates of parameters
to values predicted by the hypothesis.
We answer a question such as, “If the hypothesis
were true, would it be unlikely to get data such as
we obtained?”
Parametric test:
• Based on a specific distribution, such as the Gaussian.
Non-Parametric test:
• Not based on any particular parameter, such as the mean.
• Does not require that the means follow a particular distribution, such as the Gaussian.
• Used when the underlying distribution is far from Gaussian (applicable to almost all types of distribution) and when the sample size is small.
Parametric Tests
• Student’s t-test (one sample, two sample, and paired)
• Z test
• ANOVA F-test
• ANCOVA
• Pearson’s correlation (r)
Non-Parametric Tests
• Sign test (for paired data)
• Wilcoxon Signed-Rank test (for matched pairs)
• Wilcoxon Rank Sum test (for unpaired data)
• Chi-square test
• Spearman’s Rank Correlation (ρ)
• Kruskal-Wallis test
Purpose of application                                   Parametric test                  Non-Parametric test
Comparison of two independent groups                     t-test for independent samples   Wilcoxon rank sum test
Test the difference between paired observations          t-test for paired observations   Wilcoxon signed-rank test
Comparison of several groups                             ANOVA                            Kruskal-Wallis test
Quantify linear relationship between two variables       Pearson’s correlation            Spearman’s rank correlation
Test the association between two qualitative variables   _                                Chi-square test
Student’s t-test: a statistical criterion to test the hypothesis that a mean is a specified value, or that a specified difference, or no difference, exists between two means. It requires a Gaussian distribution of the values, but is used when the SD is not known.
Proportion test (Z-test): a statistical test of hypothesis based on the Gaussian distribution, generally used to compare two means or two proportions in large samples, particularly when the SD is known.
ANOVA F-test: used when three or more groups are compared and the objective is to compare the means of a quantitative variable.
One sample: only one group is studied and an externally determined claim is examined.
Two sample: there are two groups to compare.
Paired: used when two sets of measurements are available and the measurements are paired.
1. Find the difference between the actually observed mean and the claimed mean.
2. Estimate the standard error (SE) of the mean by s/√n, where s is the standard deviation and n is the number of subjects in the actually studied sample. The SE measures the inter-sample variability.
3. Check whether the difference obtained in step 1 is sufficiently large relative to the SE. For this, calculate Student’s t = (observed mean - claimed mean)/SE. This is called the test criterion. Rejection or non-rejection of the null depends on the value of this t.
4. Reject the null hypothesis if the t-value so calculated is more than the critical value corresponding to the pre-fixed alpha level of significance and the appropriate df.
There are 10 patients with arthritis. Suppose the reduction in pain after using newspirin is as follows on a 10-point visual analog scale:
0 3 6 1 1 4 0 2 1 5
Mean reduction, x̄ = 2.3 points and SD, s = 2.11.
By using the formula:
t = (2.3-3.0)/(2.11/10^½)
= -1.049
n = 10, df = 10-1 = 9
For one-tailed α = 0.05 and df = 9, the critical value of t is 1.833.
Since the absolute calculated value of t, 1.049, is less than the critical value 1.833, the Null Hypothesis that the mean reduction in pain is 3 points cannot be rejected.
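The arithmetic of this one-sample example can be checked with a few lines of Python. This is only a sketch using the standard library: the variable names are illustrative, and the critical value 1.833 is copied from the text rather than computed.

```python
import math
import statistics

# Pain reduction scores of the 10 patients, as listed in the example
reductions = [0, 3, 6, 1, 1, 4, 0, 2, 1, 5]
claimed_mean = 3.0  # value stated by the Null Hypothesis

n = len(reductions)
mean = statistics.mean(reductions)
sd = statistics.stdev(reductions)  # sample SD with n-1 in the denominator
se = sd / math.sqrt(n)             # standard error of the mean
t = (mean - claimed_mean) / se
df = n - 1

critical_t = 1.833                 # one-tailed, alpha = 0.05, df = 9 (from the text)
reject_null = abs(t) > critical_t

print(round(mean, 2), round(sd, 2), round(t, 3), df, reject_null)
```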
A study on 24-hour creatinine excretion in healthy male and female adults examines whether a difference exists. For our illustration, we give the values obtained for 15 subjects in each group in the table:
Men     16.6 19.8 17.1 15.6 20.3 24.7 18.5 17.6 22.0 24.9 18.4 16.9 21.1 17.0 23.3
Women   23.2 22.0 21.9 14.2 23.2 24.8 25.5 28.1 21.8 20.9 18.0 19.5 20.6 16.7 17.3
df = n1+n2-2 = 15+15-2 = 28
In men, mean y1 = 19.59 and s1 = 3.03
In women, mean y2 = 21.18 and s2 = 3.65
sp = [((15-1)x(3.03)² + (15-1)x(3.65)²)/(15+15-2)]^½
= 3.35
Thus,
t = (19.59-21.18)/(3.35 x (1/15+1/15)^½)
= -1.59/1.2232 = -1.30
The critical value of t for df = 28 at the 5% level of significance (two-tailed) is 2.048. The absolute calculated value, 1.30, is less than the critical value. Thus the Null Hypothesis of equality cannot be rejected.
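The pooled two-sample calculation can be reproduced as follows. This is a sketch under stated assumptions: the 15 values per group are as read from the creatinine table, the variable names are illustrative, and the critical value 2.048 is taken from the text rather than computed.

```python
import math
import statistics

# 24-hour creatinine excretion values as read from the table (15 per group)
men = [16.6, 19.8, 17.1, 15.6, 20.3, 24.7, 18.5, 17.6,
       22.0, 24.9, 18.4, 16.9, 21.1, 17.0, 23.3]
women = [23.2, 22.0, 21.9, 14.2, 23.2, 24.8, 25.5, 28.1,
         21.8, 20.9, 18.0, 19.5, 20.6, 16.7, 17.3]

n1, n2 = len(men), len(women)
mean1, mean2 = statistics.mean(men), statistics.mean(women)
s1, s2 = statistics.stdev(men), statistics.stdev(women)

# Pooled SD: weighted average of the two sample variances
df = n1 + n2 - 2
sp = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df)

t = (mean1 - mean2) / (sp * math.sqrt(1 / n1 + 1 / n2))

critical_t = 2.048  # critical value for df = 28 quoted in the text
reject_null = abs(t) > critical_t

print(round(mean1, 2), round(mean2, 2), round(sp, 2), round(t, 2), reject_null)
```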
For the paired t-test, obtain the difference for each pair and test the null hypothesis that the mean of these differences is zero (this null hypothesis is the same as saying that the means before and after are equal).
For paired samples: t = d̄/(sd/√n)
d̄: the sample mean of the differences
sd: the standard deviation of the differences
n: the number of pairs
Consider the serum albumin level of 8 randomly chosen patients of dengue haemorrhagic fever before and after treatment. The values have been tabulated:
Before treatment   5.1  3.8  4.0  4.7  4.5   4.8   4.1  3.6
After treatment    4.8  3.7  3.8  4.7  4.6   5.0   4.0  3.4
Difference (d)     0.3  0.1  0.2  0    -0.1  -0.2  0.1  0.2
Mean difference, d̄ = 0.6/8 = 0.075 g/dl, and SD of the differences, sd = 0.17.
t = 0.075/(0.17/(8)^½) = 1.25
df = 8-1 = 7
The critical value of t is 2.365. Since the calculated value is less, the null hypothesis of no difference cannot be rejected.
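The paired calculation can be sketched in Python as below. The variable names are illustrative and the critical value 2.365 is taken from the text. Note that carrying the unrounded SD (about 0.167) through the calculation gives t of about 1.27 rather than the 1.25 obtained with the rounded SD of 0.17; the conclusion is unchanged.

```python
import math
import statistics

# Serum albumin (g/dl) of the 8 patients, from the table in the example
before = [5.1, 3.8, 4.0, 4.7, 4.5, 4.8, 4.1, 3.6]
after = [4.8, 3.7, 3.8, 4.7, 4.6, 5.0, 4.0, 3.4]

differences = [b - a for b, a in zip(before, after)]
n = len(differences)
mean_d = statistics.mean(differences)
sd_d = statistics.stdev(differences)  # SD of the differences (n-1 denominator)

t = mean_d / (sd_d / math.sqrt(n))
df = n - 1

critical_t = 2.365                    # critical value for df = 7 quoted in the text
reject_null = abs(t) > critical_t

print(round(mean_d, 3), round(sd_d, 3), round(t, 2), df, reject_null)
```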
The Z-test is used for large quantitative data (i.e. n > 30).
Application: to find the standard error of the difference between two sample means, i.e. S.E. (X̄1 - X̄2),
e.g. to find out whether there is a significant difference between two groups, such as the efficacy of two drugs.
State the Null Hypothesis, H0, and its Alternative Hypothesis, H1.
Find the value of the test statistic, Z, as follows:
$$Z = \frac{\bar{X}_1 - \bar{X}_2}{SE(\bar{X}_1 - \bar{X}_2)}$$
where
$$SE(\bar{X}_1 - \bar{X}_2) = \sqrt{\frac{SD_1^2}{n_1} + \frac{SD_2^2}{n_2}}$$
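The Z statistic for two sample means can be computed as in the following sketch. The function name and the summary figures (two groups of 100 with means 52 and 50) are hypothetical and chosen only to show the calculation; 1.96 is the two-tailed 5% limit used elsewhere in these slides.

```python
from math import sqrt


def z_two_means(mean1, sd1, n1, mean2, sd2, n2):
    """Z statistic for the difference between two large-sample means."""
    se_diff = sqrt(sd1**2 / n1 + sd2**2 / n2)  # SE of (mean1 - mean2)
    return (mean1 - mean2) / se_diff


# Hypothetical summary data for two groups of 100 subjects each
z = z_two_means(52.0, 5.0, 100, 50.0, 6.0, 100)
significant = abs(z) > 1.96  # two-tailed test at the 5% level

print(round(z, 2), significant)
```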
Situations where the ANOVA F-test is used:
1. In the two-sample situation.
2. In the paired set-up.
3. In repeated measures, when the same subject is measured at different time points, such as after 5 minutes, 15 minutes, 30 minutes, 60 minutes, etc.
4. Removing the effect of a covariate.
5. Regression.
Correlation is the relationship between two or more paired factors or two or more sets of scores. The degree of relationship is usually measured and represented by a correlation coefficient.
A correlation coefficient is a numerical measure of the linear relationship between two factors or sets of scores. The coefficient can be identified by the letter r, the Greek letter rho, or other symbols, depending on the manner in which the coefficient has been computed.
An obtained correlation coefficient can range from -1.00 to +1.00. The sign of the coefficient indicates the direction of the relationship and the numerical value its strength.
Correlation Coefficient Degree of Relationship
.00 - .20 Negligible
.21 - .40 Low
.41 - .60 Moderate
.61 - .80 Substantial
.81 – 1.00 High to Very High
Types of Correlation:
Type 1: Positive, Negative, No correlation, Perfect
Type 2: Linear, Non-linear
Type 3: Simple, Multiple, Partial
$$r = \frac{N\sum XY - (\sum X)(\sum Y)}{\sqrt{[N\sum X^2 - (\sum X)^2][N\sum Y^2 - (\sum Y)^2]}}$$
Where:
N = number of paired observations
ΣXY = sum of the cross products of X and Y
ΣX = sum of the scores under variable X
ΣY = sum of the scores under variable Y
(ΣX)² = square of the sum of the X scores
(ΣY)² = square of the sum of the Y scores
ΣX² = sum of the squared X scores
ΣY² = sum of the squared Y scores
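The raw-score formula for Pearson's r translates directly into code. In the sketch below the function name and the five paired scores are hypothetical, used only to show each sum in the formula being formed.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's r from the raw-score sums of paired observations."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))  # sum of cross products
    sum_x2 = sum(a * a for a in x)             # sum of squared X scores
    sum_y2 = sum(b * b for b in y)             # sum of squared Y scores
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x**2) * (n * sum_y2 - sum_y**2))
    return numerator / denominator


# Hypothetical paired scores
r = pearson_r([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(round(r, 4))
```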
The chi-square test is an alternative to the test of significance of the difference between two proportions.
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
O: observed frequencies.
E: expected frequencies.
Is the prevalence of cataract higher in males or in females? Consider a study on the prevalence of cataract in males and females aged 50 years and above. The results are as follows. Number of males examined (n1) = 60; found with cataract, 37. Number of females examined (n2) = 40; found with cataract, 30.
This is stated in table format (cataract present: Yes/No):
Gender Yes No Total
Male 37 23 60
Female 30 10 40
Total 67 33 100
Expected frequency = (corresponding row total x corresponding column total)/grand total
For example, for males without cataract: 60x33/100 = 19.8
Applying the formula:
χ² = (37-40.2)²/40.2 + (23-19.8)²/19.8 + (30-26.8)²/26.8 + (10-13.2)²/13.2
= 0.2547+0.5172+0.3821+0.7758
= 1.93
The critical value of chi-square is 3.84 at the 5% level of significance (df = 1). Since the calculated value is less than the critical value, the Null Hypothesis cannot be rejected.
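The expected frequencies and the chi-square value for the cataract table can be reproduced with a short loop. This is a sketch with illustrative variable names; the critical value 3.84 is copied from the text rather than computed.

```python
# Observed counts from the cataract example: rows are male, female;
# columns are cataract yes, cataract no
observed = [[37, 23],
            [30, 10]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

chi_square = 0.0
expected = []
for i, row in enumerate(observed):
    expected_row = []
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / grand_total  # expected frequency
        expected_row.append(e)
        chi_square += (o - e) ** 2 / e
    expected.append(expected_row)

critical_value = 3.84  # 5% level of significance, as quoted in the text
reject_null = chi_square > critical_value

print(expected, round(chi_square, 2), reject_null)
```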
The sign test is used for paired data.
It is a non-parametric test based on the signs (positive and negative) of the differences in the levels seen before and after therapy.
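As a rough sketch of how a sign test can be carried out, the code below counts positive and negative differences and computes an exact two-tailed binomial p-value. It reuses the eight serum albumin differences from the paired example. Dropping zero differences and using the exact binomial distribution with probability 0.5 are common conventions assumed here; the slides do not spell them out.

```python
from math import comb

# Differences (before - after) from the serum albumin example
differences = [0.3, 0.1, 0.2, 0, -0.1, -0.2, 0.1, 0.2]

n_pos = sum(1 for d in differences if d > 0)
n_neg = sum(1 for d in differences if d < 0)
n = n_pos + n_neg  # zero differences are dropped (assumed convention)

# Under H0 each sign is equally likely, so the count of the rarer sign
# follows a Binomial(n, 0.5) distribution.
k = min(n_pos, n_neg)
one_tail = sum(comb(n, i) for i in range(k + 1)) / 2**n
p_value = min(1.0, 2 * one_tail)

print(n_pos, n_neg, round(p_value, 3))
```

Only the signs enter the calculation, which is why the test needs no assumption about the shape of the distribution of the differences.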
The Wilcoxon signed-rank test is used for matched pairs.
It is a better test than the sign test: it assigns ranks to the differences of the n pairs after ignoring the + or - signs. The smallest difference gets rank 1 and the largest gets rank n.
The sum of only those ranks that are associated with positive differences is obtained (the Wilcoxon signed-rank criterion).
Its counterpart for unpaired samples, the Wilcoxon rank sum test, is equivalent to the Mann-Whitney test.
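The signed-rank criterion can be sketched as follows, again using the serum albumin differences from the paired example. Two conventions are assumed that the slides do not state: zero differences are discarded, and tied absolute differences share the average of their ranks. The helper name `average_ranks` is hypothetical.

```python
def average_ranks(values):
    """Rank values from 1 upward, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


# Differences (before - after) from the serum albumin example
differences = [0.3, 0.1, 0.2, 0, -0.1, -0.2, 0.1, 0.2]
nonzero = [d for d in differences if d != 0]        # drop zero differences
ranks = average_ranks([abs(d) for d in nonzero])    # rank ignoring the sign

w_plus = sum(r for d, r in zip(nonzero, ranks) if d > 0)   # signed-rank criterion
w_minus = sum(r for d, r in zip(nonzero, ranks) if d < 0)

print(w_plus, w_minus)
```

The two sums always add up to n(n+1)/2 for the n non-zero pairs, which gives a quick check on the ranking.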
The Wilcoxon rank sum test is used for the unpaired two-sample situation.
If there are n1 subjects in the first sample and n2 in the second sample, these (n1+n2) values are jointly ranked from 1 to (n1+n2). The sum of these ranks is obtained only for those subjects who are in the smaller group.
Spearman’s correlation is designed to measure the
relationship between variables measured on an ordinal
scale of measurement.
Similar to Pearson’s Correlation, however it uses ranks
as opposed to actual values.
1. Convert the observed values to ranks (accounting for
ties)
2. Find the difference between the ranks, square them and
sum the squared differences.
3. Set up Hypothesis, carry out test and conclude based on
findings.
4. If the Null is rejected then calculate the Spearman
correlation coefficient to measure the strength of the
relationship between the variables.
$$\rho = 1 - \frac{6\sum_{i=1}^{n} d_i^2}{n(n^2 - 1)}$$
Where,
di is the difference between the paired ranks
n is the number of pairs.
The Spearman rank correlation coefficient may lie between -1 and +1. Values close to +/-1 indicate a high correlation; values close to zero indicate lack of relationship.
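The Spearman coefficient can be computed from paired ranks with the standard sum-of-squared-differences formula, ρ = 1 - 6Σd²/(n(n²-1)). In this minimal sketch the function name and the two rank lists are hypothetical, and the ranks are assumed to contain no ties.

```python
def spearman_rho(ranks_x, ranks_y):
    """Spearman rank correlation from paired ranks (assumes no tied ranks)."""
    n = len(ranks_x)
    sum_d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks_x, ranks_y))
    return 1 - 6 * sum_d_squared / (n * (n**2 - 1))


# Hypothetical ranks of five subjects on two variables
rho = spearman_rho([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])
print(round(rho, 2))
```

Identical rankings give +1 and exactly reversed rankings give -1, matching the limits stated above.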
A Indrayan and L Satyanarayana. Biostatistics, 2006 ed. Prentice-Hall of India.
MSN Rao, NS Murthy. Applied Statistics in Health Sciences, 2nd ed, 2010. Jaypee.
B Antonisamy, Solomon Christopher, P Prasanna Samuel. Biostatistics: Principles and Practice.
www.wikipedia.org