The document discusses the history and procedure of the Mann-Whitney U test. The test originated in the work of Frank Wilcoxon (1945), extended by Henry Mann and Donald Whitney in 1947. It is a nonparametric alternative to the independent-samples t-test and can be used to compare two independent groups when the dependent variable is ordinal or continuous. It works by ranking the data from both groups together and comparing the rank sums of the two groups; a significant difference indicates a difference in central tendency. Worked examples show how to calculate the test statistic U and draw statistical inferences from it.
2. History of Mann Whitney U test
• 1892-1965: Frank Wilcoxon (American chemist and statistician). He proposed the Wilcoxon rank-sum test for equal sample sizes.
• 1947: the test was extended to arbitrary sample sizes by Mann and Whitney.
• 1905-2000: Henry Berthold Mann (Austrian-born American mathematician).
• 1915-2001: Donald Ransom Whitney (American statistician).
3. Assumption
• The two samples have been independently and randomly drawn from their respective populations.
• The scale of measurement is at least ordinal.
• The variable of interest is continuous.
• The two population distributions have the same shape, and each sample is properly representative of its population.
4. Procedure for small samples (less than 8)
Ex-1 - The haemoglobin levels (in grams per decilitre) for two groups are given below. Determine whether the difference is statistically significant. The table value of the Wilcoxon rank sum test at the 5% level of significance is 17.
Hb level in 3rd trimester    Hb level 3 months after delivery
13.5                         11
11.5                         13
12.5                         12
10.5                         10
14.5                         14
1. Formulate the hypotheses.
Null hypothesis: the two independent samples come from identical continuous distributions with equal medians.
Alternative hypothesis: the null hypothesis is not true.
5. Procedure for small samples (less than 8)
2. Arrange all the observations from the two samples in a single series, in ascending or descending order of magnitude.
3. Rank the observations. If two or more observations are tied, assign each of them the arithmetic mean of the ranks that they jointly occupy.
Rank   Observation      Rank   Observation
1      10               6      12.5*
2      10.5*            7      13
3      11               8      13.5*
4      11.5*            9      14
5      12               10     14.5*
* observation from sample 1 (3rd trimester)
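The ranking step can be sketched in Python as follows. This is a minimal illustration, not part of the original slides: the helper name `midranks` and the three-value tie example are assumptions chosen for demonstration. It ranks the pooled Example 1 data in ascending order and gives tied values the mean of the ranks they jointly occupy.

```python
def midranks(values):
    """Rank values in ascending order; tied values share the mean of their ranks."""
    # Indices of the values, ordered from smallest to largest value.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of equal values starting at i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = ((i + 1) + (j + 1)) / 2  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


third_trimester = [13.5, 11.5, 12.5, 10.5, 14.5]  # sample 1
after_delivery = [11, 13, 12, 10, 14]              # sample 2

ranks = midranks(third_trimester + after_delivery)
print(ranks[:5])            # [8.0, 4.0, 6.0, 2.0, 10.0]  (sample 1)
print(ranks[5:])            # [3.0, 7.0, 5.0, 1.0, 9.0]   (sample 2)

# Hypothetical data with a tie: the two 3s occupy ranks 2 and 3, so each gets 2.5.
print(midranks([3, 1, 3]))  # [2.5, 1.0, 2.5]
```

Example 1 contains no ties, so every rank is a whole number; the last line shows how a tie would be handled.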
6. Procedure for small samples (less than 8)
4. Sum the ranks separately for each of the two samples.
5. Determine the smaller of the two rank sums.
Rank for sample 1    Rank for sample 2
2                    1
4                    3
6                    5
8                    7
10                   9
Total: 30            Total: 25
7. Procedure for small samples (less than 8)
6. Compare the smaller of the two rank sums with the table value of the Wilcoxon rank sum statistic at the pre-determined level of significance (here, 17).
7. If the smaller of the two rank sums is greater than the table value of the Wilcoxon rank sum statistic at the pre-determined level of significance, the null hypothesis is not rejected.
Conclusion: In our example the smaller rank sum (25) is greater than the table value (17), so we do not reject the null hypothesis.
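Steps 4 to 7 for Example 1 can be checked with a short Python sketch. The variable names are illustrative assumptions; the data and the table value of 17 are taken from the example. Because this data set has no ties, each rank is simply the position of the value in the sorted pooled series.

```python
sample_1 = [13.5, 11.5, 12.5, 10.5, 14.5]  # Hb in 3rd trimester
sample_2 = [11, 13, 12, 10, 14]            # Hb 3 months after delivery
TABLE_VALUE = 17                           # Wilcoxon rank sum table value at the 5% level

# No ties in this example, so rank = 1-based position in the sorted pooled series.
pooled = sorted(sample_1 + sample_2)
rank_of = {value: position for position, value in enumerate(pooled, start=1)}

rank_sum_1 = sum(rank_of[v] for v in sample_1)
rank_sum_2 = sum(rank_of[v] for v in sample_2)
smaller = min(rank_sum_1, rank_sum_2)

# Decision rule from the slides: the null hypothesis stands
# when the smaller rank sum exceeds the table value.
null_not_rejected = smaller > TABLE_VALUE

print(rank_sum_1, rank_sum_2)  # 30 25
print(null_not_rejected)       # True
```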
8. Example 2
• A researcher designed an experiment to assess the effects of prolonged inhalation of cadmium oxide. Fifteen laboratory animals served as experimental subjects, while 10 similar animals served as controls. The variable of interest was haemoglobin level following the experiment. We wish to know whether we can conclude that prolonged inhalation of cadmium oxide reduces haemoglobin level.
10. Example 2
1. Data. See the table of haemoglobin determinations for the 25 laboratory animals.
2. Assumptions. We assume that the assumptions of the Mann-Whitney test are met.
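3. Hypotheses. The null and alternative hypotheses are as follows:
H0: the median haemoglobin level of exposed animals is not lower than that of unexposed animals
HA: the median haemoglobin level of exposed animals is lower than that of unexposed animals
Let α = 0.05.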
4. Test statistic. To compute the test statistic, we combine the two samples and rank all
observations from smallest to largest while keeping track of the sample to which each
observation belongs. Tied observations are assigned a rank equal to the mean of the rank
positions for which they are tied.
Haemoglobin determinations (in grams) for 25 laboratory animals
Exposed animals (X)    Unexposed animals (Y)
14.4                   17.4
14.2                   16.2
13.8                   17.1
16.5                   17.5
14.1                   15.0
16.6                   16.0
15.9                   16.9
15.6                   15.0
14.1                   16.3
15.3                   16.8
15.7
16.7
13.7
15.3
14.0
11. Example 2
The test statistic is
$U = S - \frac{n(n+1)}{2}$
where n = number of sample X observations and S = sum of the ranks assigned to the sample X observations.
5. Distribution of the test statistic. Critical values from the distribution of the test statistic are given in Appendix Table L for various levels of α.
12. Example 2
6. Decision rule. For this example, reject the null hypothesis if the computed value of U is smaller than 45, the critical value of the test statistic for n = 15, m = 10 and α = 0.05 found in Table L.
7. Calculation of the test statistic.
$U = 145 - \frac{15(15+1)}{2} = 25$
8. Statistical decision. When we enter Table L with n = 15, m = 10, and α = 0.05, we find the critical value $w_\alpha$ to be 45. Since 25 < 45, we reject the null hypothesis.
9. Conclusion. This leads to the conclusion that prolonged inhalation of cadmium
oxide does reduce the haemoglobin level.
10. p value. Since 22 < 25 < 30, we have 0.005 > p > 0.001 for this test.
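The arithmetic of Example 2 can be reproduced with the following Python sketch. The function name `rank_sum_of_first` is an illustrative assumption; the data and the critical value of 45 come from the example. The sketch pools the two samples, assigns mean ranks to ties (14.1, 15.0 and 15.3 each occur twice), sums the ranks of the exposed animals to get S, and then applies U = S - n(n+1)/2.

```python
exposed = [14.4, 14.2, 13.8, 16.5, 14.1, 16.6, 15.9, 15.6,
           14.1, 15.3, 15.7, 16.7, 13.7, 15.3, 14.0]             # sample X, n = 15
unexposed = [17.4, 16.2, 17.1, 17.5, 15.0, 16.0, 16.9, 15.0,
             16.3, 16.8]                                          # sample Y, m = 10
CRITICAL_VALUE = 45  # from Table L for n = 15, m = 10, alpha = 0.05


def rank_sum_of_first(x, y):
    """Sum of the pooled ranks of the x observations, using mean ranks for ties."""
    pooled = sorted(x + y)

    def midrank(value):
        first = pooled.index(value) + 1  # first 1-based position of the value
        count = pooled.count(value)      # number of tied observations
        return first + (count - 1) / 2   # mean of the positions they occupy

    return sum(midrank(v) for v in x)


S = rank_sum_of_first(exposed, unexposed)
n = len(exposed)
U = S - n * (n + 1) / 2
reject_null = U < CRITICAL_VALUE

print(S)            # 145.0
print(U)            # 25.0
print(reject_null)  # True
```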
13. Procedure for large samples
Ex - The body weights (in kg) of the persons in sample 1 are 63, 48, 88, 70, 83, 84, 58, 83, 49, 56, 67 and 79. The body weights (in kg) of the persons in sample 2 are 80, 51, 77, 82, 63, 82, 54, 50, 71, 62, 42 and 54. Using the U test, test at the 5% level of significance the hypothesis that the two samples come from the same population with equal means.
15. Procedure for large samples
1. Data.
2. Assumptions. We assume that the assumptions of the Mann-Whitney test are met.
3. Hypotheses. The null and alternative hypotheses are as follows:
H0: µ1 = µ2
HA: µ1 ≠ µ2
Let α = 0.05.
4. Test statistic. To compute the test statistic, we combine the two samples and rank
all observations from smallest to largest while keeping track of the sample to which
each observation belongs. Tied observations are assigned a rank equal to the mean of
the rank positions for which they are tied.
16. Calculating U statistics
$U = n_1 n_2 + \frac{n_1(n_1+1)}{2} - \sum R_1$
where $\sum R_1$ is the sum of the ranks of sample 1. Here
$U = 12 \times 12 + \frac{12(12+1)}{2} - 167.5 = 54.5$
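The rank sum of 167.5 and the resulting U can be verified with a brief Python sketch. Names such as `midrank` and `u_other` are illustrative assumptions; the data are those of the body-weight example. As an additional check not shown on the slides, the sketch also computes the same statistic from sample 2 and uses the standard identity that the two U values add up to n1 × n2.

```python
sample_1 = [63, 48, 88, 70, 83, 84, 58, 83, 49, 56, 67, 79]
sample_2 = [80, 51, 77, 82, 63, 82, 54, 50, 71, 62, 42, 54]
n1, n2 = len(sample_1), len(sample_2)

pooled = sorted(sample_1 + sample_2)


def midrank(value):
    """Mean of the 1-based positions that a (possibly tied) value occupies."""
    first = pooled.index(value) + 1
    count = pooled.count(value)
    return first + (count - 1) / 2


r1 = sum(midrank(v) for v in sample_1)  # sum of ranks of sample 1
r2 = sum(midrank(v) for v in sample_2)  # sum of ranks of sample 2

u = n1 * n2 + n1 * (n1 + 1) / 2 - r1        # formula used on the slide
u_other = n1 * n2 + n2 * (n2 + 1) / 2 - r2  # same formula applied to sample 2

print(r1, r2)       # 167.5 132.5
print(u, u_other)   # 54.5 89.5
print(u + u_other)  # 144.0, which equals n1 * n2
```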
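Calculate the mean of the sampling distribution:
$\mu_U = \frac{n_1 n_2}{2} = 72$
Calculate the standard deviation of the sampling distribution:
$\sigma_U = \sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}} = \sqrt{\frac{144 \times 25}{12}} = 17.32$
Determine the limits of the acceptance region:
Upper limit = $\mu_U + 1.96\,\sigma_U$ = 105.95
Lower limit = $\mu_U - 1.96\,\sigma_U$ = 38.05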
Inference: Since the U statistic (54.5) lies within the acceptance region, the null hypothesis is not rejected at the 5% level of significance. The data are consistent with the two samples coming from the same population with equal means.
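The large-sample decision can be sketched in Python as below. This is a minimal sketch under the same simplifications as the slides: it uses the normal approximation with no correction for ties or continuity. The variable names are illustrative assumptions, and the z score is an equivalent way of expressing the same comparison rather than something the slides compute.

```python
import math

n1, n2 = 12, 12
u = 54.5        # U statistic from the body-weight example
Z_CRIT = 1.96   # two-sided critical value at the 5% level

mean_u = n1 * n2 / 2
sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)

lower = mean_u - Z_CRIT * sd_u
upper = mean_u + Z_CRIT * sd_u
within_acceptance_region = lower <= u <= upper

# Equivalent check: standardise U and compare |z| with 1.96.
z = (u - mean_u) / sd_u

print(mean_u)                              # 72.0
print(round(sd_u, 2))                      # 17.32
print(round(lower, 2), round(upper, 2))    # 38.05 105.95
print(round(z, 2))                         # -1.01
print(within_acceptance_region)            # True
```

Since |z| is about 1.01, which is below 1.96, the z-score check agrees with the acceptance-region check.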
17. How to make a table for results
Comparison of median SGOT across MI order
MI status    Median (IQR)        Mean rank
First        50.4 (31.1-80.6)    90.68
Recurrent                        111.02
Statistical results: Mann-Whitney U value: ______, Z value: ______, p value: ______