NONPARAMETRIC RANK-BASED BIOSTATISTICAL
METHODS
by
H.E. MISIRI
(LECTURER)
14.DECEMBER 2000
Department Of Community Health
College Of Medicine
NONPARAMETRIC STATISTICAL TESTS
Introduction
When we use t-tests, we assume that the data are normally distributed, that is,
that the data are a random sample (or random samples) from a
normal population. The normal distribution has the following probability density
function:
$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}\right], \qquad -\infty < x < \infty.$
Therefore, if we assume that a data set is normally distributed, we imply that it has
this density function. Tests based on the assumption that the population has a certain
functional form are called parametric tests. Examples are t-tests and z-tests.
It is still possible to test hypotheses without assuming that the data have a certain
functional form. Tests conducted without the assumption of a certain functional form
are called nonparametric tests. Their use is pragmatic because the
assumption that the data follow a particular distribution may not be valid. Besides
this, the sample size may not be big enough to justify the use of the normal
approximation (Central Limit Theorem). Some nonparametric tests also have higher
power to detect alternatives under some circumstances.
RANKING DATA
The table below (Table 1.1) shows measurements of blood pressure taken from 10
subjects: 4 received treatment and 6 were controls. Suppose we pool the data and
arrange them in ascending order. If we then assign numbers to the ordered values
so that the numbers reflect that ordering, we say that we are ranking the
data. The numbers, which serve as indices of the location of the data points, are called
ranks. The first two columns of Table 1.1 show the original data. Column 3 contains the
pooled values in ascending order, column 4 the group label of each ordered value
and column 5 the ranks.
Table 1.1: Measurements of Blood Pressure
         Pressure   Ranked Values   Group of
Group    (mmHg)     of Pressure     Ranked Value   Rank
1        94         78              2              1
1        108        80              2              2
1        110        85              2              3
1        90         88              2              4
2        80         90              1              5
2        85         94              1              6
2        94         94              2              7
2        78         105             2              8
2        105        108             1              9
2        88         110             1              10
TIES IN A DATA SET
Suppose that two or more values in a data set are equal. We call this a tie. When ties
occur, some remedial measures have to be taken. When several values are equal,
each receives the average of the ranks they would otherwise occupy. Consider the following example.
Measurements on some attribute were taken and are shown below:
94,108,110,90,80,94,85,90,90,90,108,94,78,105,88
Several values occur more than once, namely 90, 94 and 108. The rank of each
90 is (5 + 6 + 7 + 8)/4 = 6.5. The table below (Table 1.2) shows the ranked data.
Table 1.2: Ranked Data
Value 78 80 85 88 90 90 90 90 94 94 94 105 108 108 110
Rank 1 2 3 4 6.5 6.5 6.5 6.5 10 10 10 12 13.5 13.5 15
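The averaging rule used in Table 1.2 can be sketched in a few lines of Python. This is only an illustration, not part of the original notes: the function name midranks is an assumption, and only the standard library is used.

```python
def midranks(values):
    """Return the rank of each value, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        # Positions start+1 .. end+1 are shared, so each member gets their mean.
        average = (start + 1 + end + 1) / 2
        for k in order[start:end + 1]:
            ranks[k] = average
        start = end + 1
    return ranks


data = [94, 108, 110, 90, 80, 94, 85, 90, 90, 90, 108, 94, 78, 105, 88]
ranks = midranks(data)
for value, rank in sorted(zip(data, ranks)):
    print(value, rank)
```

Each 90 receives 6.5, each 94 receives 10 and each 108 receives 13.5, as in Table 1.2.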
When ties occur in a data set, the test statistics calculated from the data have to be
adjusted to take account of them. This adjustment is called correction for ties. It is
sometimes hard to do manually. However, some statistical packages have built-in
mechanisms for doing this.
Some nonparametric tests are based on ranks. Statistical tests based on ranks are
called rank tests. Some parametric tests have nonparametric rank test analogues.
Examples are in the following table (Table 1.3).
Table 1.3: Nonparametric Analogues of Some Parametric Methods
Parametric Method                 Nonparametric Analogue
T-test for Paired Samples         Wilcoxon Signed Rank Test, Sign Test
One Way Anova                     Kruskal-Wallis Test
Two Way Anova                     Friedman Test
T-test for Independent Samples    Rank Sum Test (Wilcoxon-Mann-Whitney U Test)
T-test for One Sample             Sign Test
1. ONE SAMPLE TEST: THE SIGN TEST
One sample tests are based on data collected from one population. The parametric test one
can use is the t-test for small samples. If the sample size is large, one can use the z-test.
The sign test is based on the binomial distribution. Suppose we have a random
sample of observations from a certain population. Suppose also that our null
hypothesis is that the mean is equal to some constant, say µ0. Using statistical
notation, this can be written as Ho: µ = µ0. A strong assumption we will have to make
is that the distribution of the source population is symmetric, so that the population
mean is equal to the median. Then, without having to assume that the data are
normally distributed, we can state the same null hypothesis using the median as
follows:
H0: M = M0, where M0 is the hypothesized median.
Let Di = Xi − M0, where Xi is observation i. Each nonzero difference is either
positive or negative. Let Yi be defined as follows:
Yi = 1, if Di > 0
   = 0, if Di < 0
Let S be the sum of the Yi's. Under the null hypothesis, the probability of a positive
difference is equal to the probability of a negative difference, namely 0.5. If we take
a positive difference as a success, then S has a binomial distribution with parameters n and π = 0.5.
Suppose that there are n0 ties, that is, n0 differences equal to zero. These are discarded, and S then
has a binomial distribution with parameters n − n0 and π = 0.5. S is, in essence, the
number of positive differences.
Example:
The following are measurements of weight, in kilograms, for 6 individuals:
62, 63, 64.5, 65, 72 and 60. We strongly believe that the mean is greater than 64 kg.
Find the deviations ( Di's ) and S, the sum of the Yi's.
Solution:
Table 1.4: Deviations From The Mean
Obs No(i) 1 2 3 4 5 6
Weight(kg) 62 63 64.5 65 72 60
Difference -2 -1 0.5 1 8 -4
Yi 0 0 1 1 1 1
Sign - - + + + +
Thus, S= 1+1+1+1 = 4. There are 4 positive differences.
S has a binomial distribution with parameters n = 6 and π = 0.5. Here n = 6 because
there are 6 nonzero differences, and π = 0.5 because, under the null hypothesis, the probability of
a positive difference is 0.5. We can thus use the binomial distribution to test the
following hypotheses:
1. H0: M = M0 vs Ha: M ≠ M0
2. H0: M = M0 vs Ha: M > M0
3. H0: M = M0 vs Ha: M < M0
Implication of The Null Hypothesis
Under the null, the proportion of positive differences (π) is equal to the proportion of
negative differences. We thus assume that the data has symmetric distribution (not
necessarily bell-shaped). If a distribution is symmetric, µ =M.
Note that the null hypothesis has an equivalent expression and this is H0: π = 0.5.
Thus the above hypotheses can be rewritten as follows:
1. H0: π = π0 vs Ha: π ≠ π0
2. H0: π = π0 vs Ha: π > π0
3. H0: π = π0 vs Ha: π < π0
where π0 = 0.5.
USE OF THE BINOMIAL DISTRIBUTION
To test the above hypotheses, we will use a cumulative binomial distribution. Recall
that the pmf of a binomial distribution is
$\Pr(R = r) = \binom{n}{r} \pi^{r} (1-\pi)^{n-r}, \qquad r = 0, 1, \ldots, n.$
The cumulative binomial distribution function is thus
$\Pr(R \le r_0) = \sum_{r=0}^{r_0} \binom{n}{r} \pi^{r} (1-\pi)^{n-r}, \qquad r_0 = 0, 1, \ldots, n.$
Suppose that we have a one-tailed test (upper tail) and that our level of significance is
0.05. Then we can find the value of R that gives a cumulative probability of
approximately 0.95. The table below (Table 1.5) shows the cumulative probabilities of a
Bin(n = 10, π = 0.5) distribution.
Table 1.5: Bin(10, 0.5) Distribution
n R=r p Pr(R ≤ r)
10 0 0.5 0.0010
10 1 0.5 0.0107
10 2 0.5 0.0547
10 3 0.5 0.1719
10 4 0.5 0.3770
10 5 0.5 0.6230
10 6 0.5 0.8281
10 7 0.5 0.9453
10 8 0.5 0.9893
10 9 0.5 0.9990
10 10 0.5 1.0000
From Table 1.5 above, R = 7 gives a cumulative probability of 0.9453, which is approximately 0.95. This is our
cut-off (critical) point. We will therefore reject H0 if the observed value of R is greater
than 7. The critical region for our test consists of 8, 9 and 10, and its actual size is 1 − 0.9453 = 0.0547.
Let us go back to our previous example about measurements of weight in Table
1.4. There we have n = 6 and π = 0.5 (as usual). Our null hypothesis is H0: π = 0.5
and our alternative is Ha: π > 0.5.
The table below (Table 1.6) shows a Bin(6, 0.5) distribution.
Table 1.6: A Bin(6, 0.5) Distribution
n R=r p Pr(R ≤ r)
6 0 0.5 0.0156
6 1 0.5 0.1094
6 2 0.5 0.3438
6 3 0.5 0.6563
6 4 0.5 0.8906
6 5 0.5 0.9844
6 6 0.5 1.0000
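The cumulative probabilities in the two binomial tables can be reproduced directly from the pmf. The following is a minimal sketch, assuming Python 3.8 or later for math.comb; the function name binom_cdf is hypothetical and not taken from any package mentioned in these notes.

```python
from math import comb


def binom_cdf(r0, n, pi=0.5):
    """Pr(R <= r0) for R ~ Bin(n, pi), summing the pmf from 0 to r0."""
    return sum(comb(n, r) * pi**r * (1 - pi)**(n - r) for r in range(r0 + 1))


# Print both tables of cumulative probabilities, rounded to four decimals.
for n in (10, 6):
    print(f"Bin({n}, 0.5)")
    for r in range(n + 1):
        print(f"  r = {r:2d}   Pr(R <= r) = {binom_cdf(r, n):.4f}")

# Upper-tail probability Pr(R >= 4) for n = 6, obtained as 1 - Pr(R <= 3).
upper_tail_6 = 1 - binom_cdf(3, 6)
print(f"Pr(R >= 4 | n = 6) = {upper_tail_6:.4f}")
```

An upper-tail probability Pr(R ≥ r) is always 1 − Pr(R ≤ r − 1), which is why the last line uses r0 = 3 for the event R ≥ 4.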
We will test at the 2.5% level of significance. Our critical (cut-off) point is the value
of R which gives a cumulative probability of approximately 0.975. R = 5 will do
because it gives a cumulative probability of 0.9844, the closest tabulated value not below 0.975.
We will therefore reject H0 if R > 5 (i.e. if R = 6). Here R = S, the sum of the Yi's.
From Table 1.4 above, S = 4, which implies that R = 4 for this data set. We therefore do not reject H0:
the data give no evidence that the mean is greater than 64 kg. There are special tables for the
cumulative binomial distribution. You can also use a computer to get the same
results. Statistical packages which can be used include StatXact and SPSS.
Below is output from both StatXact and SPSS.
StatXact Output Panel
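In the StatXact panel, the one-sided p-value is reported as 0.0000. Note, however, that the
panel was produced for the null value PI = 0.0250 rather than PI = 0.5, so this p-value does
not test H0: π = 0.5. Under π = 0.5, Table 1.6 gives Pr(S ≥ 4) = 1 − Pr(S ≤ 3) = 1 − 0.6563 = 0.3438,
so we do not reject Ho: there is no evidence that the mean weight for this group is greater than 64 kg.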
ESTIMATION OF BINOMIAL PARAMETER (PI)
Number of Trials =6
Number of Successes =4
Point Estimation of PI = 0.6667
97.50% Confidence Interval for PI = ( 0.1839 , 0.9699)
Exact P-values for testing PI = 0.0250
One-sided : Pr { T .GE. 4 } = 0.0000
Two-sided : 2 * One-sided = 0.0000
SPSS Output Panel
The SPSS panel reports only a two-sided p-value (0.6875). Because the Bin(6, 0.5)
distribution is symmetric, halving it gives the one-sided p-value, 0.3438.
- - - - - Binomial Test
D
Cases
Test Prop. = .5000
4 LE .00 Obs. Prop. = .6667
2 GT .00
- Exact Binomial
6 Total 2-Tailed P = .6875
2. TWO SAMPLE TESTS
2.1 INDEPENDENT SAMPLES TESTS
2.1.1 WILCOXON RANK SUM TEST/WILCOXON-MANN-WHITNEY
TEST
Suppose we are 'given' a group of N subjects to be used in a study. Let us take the
example of a clinical trial (an experimental study). If we want to compare a new drug
to a placebo (see footnote 1), we will divide the subjects into two groups. These groups are called
arms of the trial. Arms of a trial may have different numbers of subjects. Let n denote
the number of subjects in the treatment group and m the number of subjects in the
control group. Clearly, n + m = N.
The subjects are randomly allocated to the treatment and control groups. In some
trials, patients are not allowed to know which drug they are receiving; only the clinicians
in the study know. Administering a drug without telling the patient which
drug he is getting is called blinding: the patient is 'blinded'. When the clinician
knows the drug and the patient does not, the blinding is
called single blinding. Single blinding reduces the psychological effects that might
result if a patient knew which drug he was receiving. For example, a patient can fake
improvement! Sometimes neither the clinician nor the patient knows which drug
is being administered. This is called double blinding. It reduces bias, since a
clinician might otherwise favour a relative or friend in the study by giving him or her
the new drug being tested. Blood is thicker than water!
In this example, chance enters the study through the allocation of subjects to the arms. We
cannot claim that the group of N subjects is a sample from some population. This
scenario is very common for physicians. Here is a hospital ward. There are only 6
patients. Four are on Treatment A and two are on Treatment B. Measurements of
their blood pressure are taken. The main objective of the trial is to see if the two
treatments have the same effect on the blood pressure of patients. Here, the sample
size is very small. We cannot even speak of asymptotics (the normal approximation)!
The data may be skewed, which makes the use of Student's t-test questionable. A
very practical approach to this problem is to use exact nonparametric methods.
Since subjects are allocated to the two groups at random, the
two samples can be said to be independent. There is a rank test which can be used to
compare the two groups: the Wilcoxon Rank Sum Test.
Suppose that the responses for the two groups are pooled and ranked. Let Si denote
the rank of the ith subject from the treatment group and Rj denote the rank of the jth
subject from the control group, where i = 1,…,n and j = 1,…,m. Our statistic is
the sum of the ranks from the treatment group. This is usually denoted by either Ws or
Tx. Similarly, the sum of the ranks for the control group is Wr. To carry out this test,
one must be very careful not to lose track of the group labels for the observations.
Footnote 1: A placebo is a harmless pill containing no active ingredients.
The test statistic, Ws has a distribution and hypotheses can be tested using this
distribution. There are tables for this. This test is available in many statistical
packages e.g. StatXact.
DETERMINING THE ALTERNATIVE HYPOTHESIS.
Suppose that we are comparing two groups based on measurements of weight in
kilograms. Group I was on Diet A and Group II was on Diet B. We want to know if
Diet A is more effective in increasing body weight than Diet B. We will therefore be
testing the following hypotheses:
H0: µ1 = µ2 vs Ha: µ1 > µ2
Higher values of weight in the first group will be in favour of Ha. Since we rank the
pooled weights, if Group I has higher weights on average than Group II then the rank
sum statistic Ws will be large. Diet A will be said to be more effective than Diet
B in increasing body weight if Ws ≥ c. The value of c is chosen so that Pr(Ws ≥ c) = α
or, because Ws is discrete, as close to α as possible without exceeding it.
Alternatively, one can test the same hypotheses using the significance probability,
also called the p-value. This is Pr(Ws ≥ ws), where ws is the value of Ws calculated
from the sample data, and it can be read from special tables (see footnote 2). The
p-value is then compared to the level of significance α (0.05, say): H0 is rejected if
the p-value is less than or equal to α.
Example 1:
A mental hospital wishes to test the effectiveness of a new drug that is believed to
have a beneficial effect on some mental or emotional disorder. There are 5 patients in
the hospital suffering from this disorder. Three are selected at random to receive the
new drug and the other 2 serve as controls. The ranks of the treatment subjects are
2,3 and 5 and those of the control subjects are 1 and 4. Does the drug have a
beneficial effect? Test at α =0.05.
Solution:
i 1 2 3
Treatment ranks, Si 2 3 5
j 1 2
Control ranks, Rj 1 4
To carry out the hypothesis testing, the exact distribution of Ws must be known. This
can be worked out when N is small. For N = 5, the distribution of Ws is as below:
Footnote 2: See the appendix.
Table 2.1: The Distribution of Ws Using Permutation
ws Pr(Ws = ws) Pr(Ws ≤ ws)
6 0.100 0.100
7 0.100 0.200
8 0.200 0.400
9 0.200 0.600
10 0.200 0.800
11 0.100 0.900
12 0.100 1.000
The distribution above is obtained by listing all 10 equally likely ways of choosing the
3 treatment ranks from the ranks 1 to 5.
Our observed value of Ws is ws = 2 + 3 + 5 = 10. From the above table (Table 2.1),
Pr(Ws ≥ 10) = 1 − Pr(Ws < 10) = 1 − Pr(Ws ≤ 9) = 1 − 0.6 = 0.4. The p-value is 0.4. At
the 5% level of significance, we do not reject H0: there is no evidence that the drug has a beneficial effect.
Most statistical packages have an equivalent form of the above test called the
Wilcoxon-Mann-Whitney Test or just the Mann-Whitney Test.
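For small N, the exact distribution of Ws in Table 2.1 can be generated by brute-force enumeration. The sketch below is illustrative only; the function name rank_sum_distribution is an assumption, and the approach is practical only when the number of combinations is small.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations


def rank_sum_distribution(n, N):
    """Exact null distribution of Ws when n treatment ranks are drawn from 1..N."""
    # Under H0 every set of n ranks is equally likely to be the treatment set.
    counts = Counter(sum(c) for c in combinations(range(1, N + 1), n))
    total = sum(counts.values())
    return {w: Fraction(k, total) for w, k in sorted(counts.items())}


dist = rank_sum_distribution(n=3, N=5)
for w, p in dist.items():
    print(w, float(p))

observed = 2 + 3 + 5  # treatment ranks from Example 1
p_upper = sum(p for w, p in dist.items() if w >= observed)
print("Pr(Ws >= 10) =", float(p_upper))
```

Using Fraction keeps the probabilities exact, so the upper-tail value is exactly 2/5.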
USING THE NORMAL APPROXIMATION
When both m and n are less than 10, critical values and significance probabilities of
the Wilcoxon Rank Sum statistic can be obtained from the table in the appendix (see footnote 3).
However, when m or n is larger than this, we can use the normal approximation. To do
this, we need the mean and variance of Ws.
MEAN AND VARIANCE OF Ws
The mean of Ws is
$E(W_s) = \frac{1}{2} n (N+1)$,
where N is the total number of subjects in the study and n is the number of subjects in
the treatment group.
The variance of Ws is
$\mathrm{Var}(W_s) = \frac{1}{12} m n (N+1)$,
where m, n, and N are as before.
Similarly,
$E(W_r) = \frac{1}{2} m (N+1)$,
where N is the total number of subjects in the study and m is the number of subjects in
the control group,
and
$\mathrm{Var}(W_r) = \frac{1}{12} m n (N+1)$, where m, n, and N are as before.
Footnote 3: See the Appendix.
Thus, using the normal approximation,
$Z = \dfrac{W_s - \frac{1}{2} n (N+1)}{\sqrt{\frac{1}{12} m n (N+1)}}$
has an approximate standard normal distribution by the Central Limit Theorem.
Similarly,
$Z = \dfrac{W_r - \frac{1}{2} m (N+1)}{\sqrt{\frac{1}{12} m n (N+1)}}$
has an approximate standard normal distribution
by the same theorem. Therefore the p-value for a one-sided test, Pr(Ws ≤ ws) or
Pr(Ws ≥ ws), can be calculated using this approximation.
Example 2:
Suppose that m = 10, n = 10 and Ws = 79. Calculate Pr(Ws ≤ 79) using the normal
approximation.
Solution:
With N = 20, E(Ws) = (1/2)(10)(21) = 105 and Var(Ws) = (1/12)(10)(10)(21) = 175.
Thus
$\Pr(W_s \le 79) \approx \Pr\left(Z \le \frac{79 - 105}{\sqrt{175}}\right) = \Pr\left(Z \le \frac{-26}{13.23}\right) = \Pr(Z \le -1.965) = 0.025.$
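The calculation in Example 2 can be checked numerically. This is a minimal sketch using only the standard library; the helper names normal_cdf and rank_sum_lower_p are hypothetical, and no continuity correction is applied, matching the hand calculation.

```python
from math import erf, sqrt


def normal_cdf(z):
    """Standard normal cumulative distribution function, via the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))


def rank_sum_lower_p(ws, n, m):
    """Normal approximation to Pr(Ws <= ws) for n treatment and m control subjects."""
    N = n + m
    mean = n * (N + 1) / 2
    var = m * n * (N + 1) / 12
    z = (ws - mean) / sqrt(var)
    return mean, var, z, normal_cdf(z)


mean, var, z, p = rank_sum_lower_p(ws=79, n=10, m=10)
print(f"E(Ws) = {mean}, Var(Ws) = {var}, z = {z:.3f}, p = {p:.4f}")
```

For an upper-tail probability Pr(Ws ≥ ws), one would use 1 − normal_cdf(z) instead.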
CORRECTION FOR TIES
Suppose that there are ties in a data set such that the smallest distinct value
occurs with frequency f1, the second smallest with frequency f2, and so on. Let Ws*
denote the rank sum computed from the average ranks. Then
the mean does not change. In other words, E(Ws*) = E(Ws).
However, the variance changes and is equal to
$\mathrm{Var}(W_s^*) = \frac{m n (N+1)}{12} - \frac{m n \sum_{i=1}^{e} (f_i^{3} - f_i)}{12 N (N-1)}$,
where e is the number of distinct values in the pooled data.
Example 3: Psychological counselling
In a test of the effect of psychological counselling, 80 boys are divided at random into
a control group of 40 to whom only the normal counselling facilities are available,
and a treatment group of 40 who receive special counselling. At the end of the study,
a careful assessment is made of each boy who is then classified as having made a
good, fairly good, fairly poor, or poor adjustment, with the following results.
Table 2.2 : Psychological Counselling Data
Poor Fairly Poor Fairly Good Good Total
Treatment 5 7 16 12 40
Control 7 9 15 9 40
Does counselling have a positive effect on adjustment? Test at α = 0.05.
Solution:
We will assign ranks to the four categories of adjustment thus:
Poor = 1, Fairly Poor = 2, Fairly Good = 3, Good = 4. We can see from the above
table that we have ties in each category of adjustment. We have 5 + 7 = 12
observations tied at rank 1, 16 at rank 2, 31 at rank 3 and 21 at rank 4.
Below is a table of the ranks and the number of observations tied at each rank.
Table 2.3: Ranks and Number of Observations Tied at Rank i
Rank i                                       1      2      3      4
Number of observations tied at rank i, fi    12     16     31     21
Average rank, Si*                            6.5    20.5   44     70
These average ranks are obtained as follows:
The 12 observations in the first category share the ranks 1 to 12. The mean of these ranks is
(1 + 2 + … + 12)/12 = 6.5. The 16 observations in the second category share the ranks 13 to 28, so their mean
rank is (13 + 14 + 15 + … + 28)/16 = 20.5, and so on.
Since our Wilcoxon Rank Sum statistic is the sum of the treatment ranks, each average rank
is multiplied by the number of treatment subjects in that category: Ws* = (6.5 x 5)
+ (20.5 x 7) + (44 x 16) + (70 x 12) = 1,720. Since N = 80, m = n = 40, e = 4 and f1 = 12,
f2 = 16, f3 = 31, f4 = 21, we get E(Ws*) = 1,620 and Var(Ws*) = 9,854.94, so the standard
deviation of Ws* is 99.27.
Our hypotheses are H0: Counselling has no positive effect
Ha: Counselling has a positive effect
Thus we will reject H0 if Pr(Ws* ≥ 1,720) < α = 0.05.
Now Pr(Ws* ≥ 1,720) ≈ Pr(Z ≥ (1,720 − 1,620)/99.27) = 1 − Pr(Z ≤ 1.01)
= 0.16.
We do not reject H0: there is no evidence that counselling has a positive effect on adjustment.
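The arithmetic of the counselling example can be traced step by step in code. The sketch below is an illustration under the tie-corrected variance formula given earlier; variable names are assumptions, and the normal tail is computed with math.erf rather than read from tables.

```python
from math import erf, sqrt

# Counts per category: Poor, Fairly Poor, Fairly Good, Good (Table 2.2).
treatment = [5, 7, 16, 12]
control = [7, 9, 15, 9]
n, m = sum(treatment), sum(control)
N = n + m
ties = [t + c for t, c in zip(treatment, control)]  # f1..f4

# Average rank of each category: the f tied observations occupy
# ranks used+1 .. used+f, whose mean is used + (f + 1) / 2.
midranks = []
used = 0
for f in ties:
    midranks.append(used + (f + 1) / 2)
    used += f

# Rank sum of the treatment group, using treatment counts as weights.
ws = sum(r * t for r, t in zip(midranks, treatment))

mean = n * (N + 1) / 2
var = m * n * (N + 1) / 12 - m * n * sum(f**3 - f for f in ties) / (12 * N * (N - 1))
z = (ws - mean) / sqrt(var)
p_upper = 1 - 0.5 * (1 + erf(z / sqrt(2)))  # Pr(Z >= z)

print(midranks, ws, mean, round(var, 2), round(sqrt(var), 2), round(z, 3), round(p_upper, 3))
```

The standard deviation printed is about 99.27 and the upper-tail probability about 0.16, in line with the conclusion above.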
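2.1.2 MANN-WHITNEY TEST
Recall that Ws is the sum of the treatment ranks. Let
$W_{XY} = W_s - \frac{1}{2} n (n+1)$, where n is
defined as before. This statistic is the Mann-Whitney statistic. In the absence of ties, it
equals the number of (treatment, control) pairs in which the treatment observation is the larger.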
MEAN AND VARIANCE OF THE MANN WHITNEY STATISTIC
The mean of WXY is $E(W_{XY}) = \frac{1}{2} m n$ and its variance is $\mathrm{Var}(W_{XY}) = \frac{1}{12} m n (N+1)$.
CORRECTION FOR TIES
In the presence of ties, we use the same correction used for Ws*. In such cases, the
mean of WXY* is equal to the mean of WXY when there are no ties. However, the
variance of WXY* is equal to the variance of Ws*.
USING THE NORMAL APPROXIMATION
Using the mean and variance of WXY*, p-value computations can be made using the
normal approximation.
Example 4:
1. Diastolic pressure was measured on 3 subjects in a treatment group and 4 subjects
in a control group. The data are displayed in Table 1.1 below.
Table 1.1: Blood Pressure of Subjects In A Clinical Trial
Treated Group    Control Group
95               80
101              78
110              85
                 90
Does the drug have an effect on blood pressure? Test at α = 0.05.
Solution:
(i) H0: The drug has no effect on blood pressure
H1: The drug has an effect on blood pressure
Note:
Since we do not say explicitly what effect the drug might have on pressure, this is a
two-tailed test.
(ii) Test statistic: the Wilcoxon Rank Sum Test statistic, $W_s = \sum_{i=1}^{n} S_i$,
where Si is the rank of subject i in the treatment group.
(iii) Pool the data and rank it. You can use Microsoft Excel(TM) to do this. When that
is done you get the results presented in the following table, Table 2.4.
Table 2.4 Data On Blood Pressure Pooled And Ranked
Ranks for
Blood Pressure Group Rank Treatment Group
78 C 1
80 C 2
85 C 3
90 C 4
95 T 5 5
101 T 6 6
110 T 7 7
Sum 18
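(iv) From the above table, the observed value of Ws is 18 (Ws = 18).
The Wilcoxon-Mann-Whitney statistic is
$W_{XY} = W_s - \frac{1}{2} n (n+1) = 18 - \frac{1}{2}(3)(3+1) = 12$.
(v) Both Ws = 18 and WXY = 12 are the largest values these statistics can take, and only
1 of the 35 equally likely ways of choosing 3 treatment ranks from 7 produces them.
The one-sided significance probability is therefore
Pr(WXY ≥ 12) = 1/35 = 0.0286,
and the two-sided p-value is
2 x Pr(WXY ≥ 12) = 2/35 = 0.0571.
Since the p-value is greater than 0.05, we do not reject Ho at the 5% level: the evidence
that the drug has an effect on the blood pressure of subjects falls just short of significance.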
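As a cross-check on this example, the rank sum, the Mann-Whitney statistic and the exact p-values can be computed by enumeration. This is a sketch under the assumption of no ties (true for these data); the variable names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

treated = [95, 101, 110]
control = [80, 78, 85, 90]
n, m = len(treated), len(control)

# Rank the pooled data (no ties here, so ranks are simply positions 1..N).
pooled = sorted(treated + control)
rank = {value: position + 1 for position, value in enumerate(pooled)}
ws = sum(rank[v] for v in treated)

# Mann-Whitney statistic two ways: from Ws, and by counting pairs.
w_xy = ws - n * (n + 1) // 2
pairs = sum(1 for t in treated for c in control if t > c)

# Exact null distribution of Ws: all ways of choosing n ranks out of N.
all_sums = [sum(c) for c in combinations(range(1, n + m + 1), n)]
p_one = Fraction(sum(1 for s in all_sums if s >= ws), len(all_sums))
mean = Fraction(n * (n + m + 1), 2)
p_two = Fraction(
    sum(1 for s in all_sums if abs(s - mean) >= abs(ws - mean)), len(all_sums)
)

print(ws, w_xy, pairs, float(p_one), float(p_two))
```

Counting pairs gives 12, the value reported as the Mann-Whitney Statistic in the StatXact panel, and the exact one-sided and two-sided p-values are 1/35 and 2/35.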
Using StatXact , one gets the following output,
StatXact Output Panel
Explanation of The Above Output Panel
The mean is 12, which agrees with theory since
$E(W_s) = \frac{1}{2} n (N+1) = \frac{1}{2} \times 3 \times (7+1) = 12$.
The standard deviation is 2.828. We know that the variance of Ws is
$\mathrm{Var}(W_s) = \frac{1}{12} m n (N+1) = \frac{1}{12} \times 4 \times 3 \times 8 = 8$.
Hence the standard deviation of Ws is $\sqrt{8} = 2.828$.
The standardized value is 2.121. We know that, using the normal approximation,
$Z = \dfrac{W_s - E(W_s)}{\sqrt{\frac{1}{12} m n (N+1)}}$
has an approximate normal distribution with parameters µ = 0
and σ² = 1 (i.e. Z ~ appr N(0, 1)). Hence
$Z = \frac{18 - 12}{2.828} = \frac{6}{2.828} = 2.121$.
StatXact gives approximate as well as exact p-values. When the sample size is large,
it is advisable to use the approximate p-value because then the normal approximation
holds (recall the Central Limit Theorem). For very large samples,
exact computation of the p-value may not be feasible. Since our test is two-tailed, we will use the two-sided p-values.
Using the approximate two-sided p-value in the output panel (0.0339), we would reject Ho.
With a sample as small as this one, however, the normal approximation is unreliable and it is
appropriate to use exact methods.
WILCOXON-MANN-WHITNEY TEST
[ Sum of scores from population < 1 > ]
Summary of Exact distribution of WILCOXON-MANN-WHITNEY statistic:
Min Max Mean Std-dev Observed Standardized
6.000 18.00 12.00 2.828 18.00 2.121
Mann-Whitney Statistic = 12.00
Asymptotic Inference:
One-sided p-value: Pr { Test Statistic .GE. Observed } = 0.0169
Two-sided p-value: 2 * One-sided = 0.0339
Exact Inference:
One-sided p-value: Pr { Test Statistic .GE. Observed } = 0.0286
Pr { Test Statistic .EQ. Observed } = 0.0286
Two-sided p-value: Pr { | Test Statistic - Mean |
.GE. | Observed - Mean | = 0.0571
Two-sided p-value: 2*One-Sided = 0.0571
Using the exact two-sided p-value from the same output panel (0.0571 > 0.05), we do not
reject Ho; there is insufficient evidence that the treatment has an effect.
Example 3 (continued):
We found the mean and standard deviation of Ws* to be 1,620 and 99.27
respectively. The observed Mann-Whitney statistic is WXY* = 1,720 − (1/2)(40)(41) = 900,
with mean (1/2)(40)(40) = 800 and the same standard deviation, 99.27.
Hence Pr(WXY* ≥ 900) ≈ Pr(Z ≥ 1.007) = 0.16, as before. The
conclusion does not change. We still do not reject H0.
PAIRED SAMPLES T TEST.
1. THE SIGN TEST
In Section 1, we encountered the sign test for the first time as a nonparametric test of
location for one sample. Suppose that we have two paired samples. Let Xi denote the
response (outcome) to some treatment A for subject i in the first group and Yi the
response of subject i in the second group to treatment B.
Using the paired samples t-test, we can test whether treatment A is better than
treatment B. We use the differences between the paired measurements Xi and
Yi. Let Di = Yi − Xi. Regardless of its size, each nonzero difference is
either positive or negative. Our interest is in the positive differences. If Treatment A
is indeed better than treatment B, then, on average, we expect Di to be positive. As
in Section 1, define an indicator for each difference (written Ii here, since Yi now denotes a response):
Ii = 1, if Di > 0
   = 0, if Di < 0
The sum of the Ii's, S say, is the total number of positive differences, which is
our statistic. S has a binomial distribution with parameters n (the sample size) and π = 0.5.
When there are ties, the difference Di is equal to zero. One simple
solution is to reduce n by the number of zeros. For instance, if there are 6 zeros
and the sample size is 20, the value of n to use in testing will be 14.
DISTRIBUTION OF S
Since S has a binomial distribution with parameters n and π = 0.5,
the mean of S is $E(S) = n\pi = \frac{n}{2}$ and its variance is
$\mathrm{Var}(S) = n\pi(1-\pi) = n \times \frac{1}{2} \times \frac{1}{2} = \frac{n}{4}$.
If n is large, one can use the normal approximation.
Example:
The following data from Makutch and Parks (1988) document the response of
serum antigen level to AZT in 20 AIDS patients. Two sets of antigen levels are
provided for each patient; pre-treatment and post-treatment. The differences are also
displayed, along with two sets of signed ranks.
Table:2.5 AZT and Serum Antigen Trial
Sign
Patient Serum Antigen Level(pg/ml) Test
Id Pre-AZT Post-AZT Difference Scores
1 149 0 -149 0
2 0 51 51 1
3 0 0 0 -
4 259 385 126 1
5 106 0 -106 0
6 255 235 -20 0
7 0 0 0 -
8 52 0 -52 0
9 340 48 -292 0
10 65 65 0 -
11 180 77 -103 0
12 0 0 0 -
13 84 0 -84 0
14 89 0 -89 0
15 212 53 -159 0
16 554 150 -404 0
17 500 0 -500 0
18 424 165 -259 0
19 112 98 -14 0
20 2600 0 -2600 0
Sum 2
Does AZT have an effect on serum antigen levels? Test at α = 0.05.
Solution:
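There are 20 differences, 4 of which are zeros. Therefore, for our sign test n = 16.
Two differences are positive. Therefore S = 2. Calculation of a p-value will depend on
our alternative hypothesis. If our alternative is that AZT increases serum antigen
level, large values of S favour the alternative and the p-value is calculated as follows:
$p = \Pr(S \ge 2) = \sum_{s=2}^{16} \Pr(S = s) = 1 - \sum_{s=0}^{1} \Pr(S = s)$
$= 1 - \left[\Pr(S=0) + \Pr(S=1)\right] = 1 - \left[\binom{16}{0}(0.5)^{0}(0.5)^{16} + \binom{16}{1}(0.5)^{1}(0.5)^{15}\right]$
= 1 − (0.00001526 + 0.0002441)
= 0.99974
This p-value is far above 0.05, so H0 is not rejected in favour of an increase. If instead
our alternative is that AZT decreases serum antigen level, small values of S favour the
alternative and
$p = \Pr(S \le 2) = \left[\binom{16}{0} + \binom{16}{1} + \binom{16}{2}\right](0.5)^{16} = \frac{137}{65536} = 0.0021.$
Hence we reject H0 against this alternative: AZT decreases serum antigen levels in AIDS patients.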
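The sign test on the AZT data can be sketched as follows. The pre- and post-treatment values are copied from Table 2.5; the variable and function names are assumptions made for illustration, and the differences are recomputed as post minus pre rather than taken from the table.

```python
from math import comb

pre = [149, 0, 0, 259, 106, 255, 0, 52, 340, 65,
       180, 0, 84, 89, 212, 554, 500, 424, 112, 2600]
post = [0, 51, 0, 385, 0, 235, 0, 0, 48, 65,
        77, 0, 0, 0, 53, 150, 0, 165, 98, 0]

diffs = [b - a for a, b in zip(pre, post)]
nonzero = [d for d in diffs if d != 0]   # zero differences (ties) are dropped
n = len(nonzero)
s = sum(1 for d in nonzero if d > 0)     # number of positive differences


def binom_pmf(r, n):
    """Pr(S = r) for S ~ Bin(n, 0.5)."""
    return comb(n, r) * 0.5**n


p_lower = sum(binom_pmf(r, n) for r in range(0, s + 1))   # Pr(S <= s)
p_upper = sum(binom_pmf(r, n) for r in range(s, n + 1))   # Pr(S >= s)

print(f"n = {n}, S = {s}, Pr(S <= S_obs) = {p_lower:.4f}, Pr(S >= S_obs) = {p_upper:.5f}")
```

Which tail is reported as the p-value depends on the direction of the alternative hypothesis; for a two-sided alternative, twice the smaller tail is a common choice.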
Using StatXact, we get the following output in the panel below.
Explanation Of Output:
Manually, we found S equal to 2. In the output panel, under the column 'Observed',
we have 2, which is the observed value of S. If our alternative is non-directional, that is,
if we do not say what effect AZT has, we use the exact two-sided p-value,
which is equal to 0.0042. We reject H0: AZT has an effect on serum antigen
levels in AIDS patients.
The Sign Test for paired samples is good and simple to use. However, there is a loss of
information since the actual magnitude of the differences is not taken into
consideration. As such, it has less power than a test which takes
into consideration the actual size of the differences. One such method is the
Wilcoxon Signed Rank Test.
2. THE WILCOXON SIGNED RANK TEST
This method is for paired samples and is similar to the Sign Test for paired
differences. The difference is that the magnitudes of the differences are
taken into account in testing. The method is as follows:
1. Find the difference, Di, between paired values. Discard differences equal to zero.
2. Ignoring the sign of the differences (i.e. whether positive or negative), arrange them
in ascending order of absolute value and rank them.
3. For each difference, attach the sign of the difference to the rank of that difference.
4. Add all ranks with a positive sign. Let the sum of these positive ranks be TWS.
This is your statistic.
SIGN TEST
Summary of Exact distribution of SIGN statistic:
Min Max Mean Std-dev Observed Standardized
0.0000 16.00 8.000 2.000 2.000 -3.000
Asymptotic Inference:
One-sided p-value: Pr { Test Statistic .LE. Observed } = 0.0013
Two-sided p-value: 2 * One-sided = 0.0027
Exact Inference:
One-sided p-value: Pr { Test Statistic .LE. Observed } = 0.0021
Pr { Test Statistic .EQ. Observed } = 0.0018
Two-sided p-value: 2*One-Sided = 0.0042
THE DISTRIBUTION OF TWS.
The mean of Tws is $E(T_{ws}) = \frac{N(N+1)}{4}$ and its variance is
$\mathrm{Var}(T_{ws}) = \frac{N(N+1)(2N+1)}{24}$, where N is the number of nonzero
differences. One can use these two for the normal approximation.
Example:
We will revisit the AZT example. The data is reproduced below with another column
added.
Patient Signed
Id Pre-AZT Post AZT Difference Ranks
1 149 0 -149 -10
2 0 51 51 3
3 0 0 0 .
4 259 385 126 9
5 106 0 -106 -8
6 255 235 -20 -2
7 0 0 0 .
8 52 0 -52 -4
9 340 48 -292 -13
10 65 65 0 .
11 180 77 -103 -7
12 0 0 0 .
13 84 0 -84 -5
14 89 0 -89 -6
15 212 53 -159 -11
16 554 150 -404 -14
17 500 0 -500 -15
18 424 165 -259 -12
19 112 98 -14 -1
20 2600 0 -2600 -16
Sum 12
Column 5 contains the ranks with signs attached to them and the total of positive
ranks is 9 + 3 =12.
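The signed rank calculation can also be sketched in code. This is an illustration only: names such as t_plus and ways are assumptions, the exact lower-tail probability is obtained by counting subsets of the ranks 1 to N, and that counting step is valid only when there are no tied absolute differences (as in these data).

```python
pre = [149, 0, 0, 259, 106, 255, 0, 52, 340, 65,
       180, 0, 84, 89, 212, 554, 500, 424, 112, 2600]
post = [0, 51, 0, 385, 0, 235, 0, 0, 48, 65,
        77, 0, 0, 0, 53, 150, 0, 165, 98, 0]

# Nonzero differences (post minus pre); zero differences are discarded.
diffs = [b - a for a, b in zip(pre, post) if b != a]
N = len(diffs)

# Rank the absolute differences; midrank() would average ranks if ties occurred.
abs_sorted = sorted(abs(d) for d in diffs)


def midrank(x):
    first = abs_sorted.index(x) + 1
    count = abs_sorted.count(x)
    return first + (count - 1) / 2


# Sum of the ranks attached to positive differences.
t_plus = sum(midrank(abs(d)) for d in diffs if d > 0)

mean = N * (N + 1) / 4
sd = (N * (N + 1) * (2 * N + 1) / 24) ** 0.5

# ways[t] = number of subsets of {1, ..., N} whose elements sum to t.
# Under H0 each of the 2**N sign patterns is equally likely.
ways = [1] + [0] * (N * (N + 1) // 2)
for r in range(1, N + 1):
    for total in range(len(ways) - 1, r - 1, -1):
        ways[total] += ways[total - r]
p_lower = sum(ways[: int(t_plus) + 1]) / 2**N   # Pr(Tws <= observed)

print(N, t_plus, mean, round(sd, 2), round(p_lower, 4))
```

The mean (68) and standard deviation (about 19.34) agree with the summary line of the StatXact panel for this test.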
Using StatXact:
On the next page is an output panel from StatXact.
Since our test is two-tailed we will use the two-sided p-value. This p-value is equal to
0.0021 which is less than 0.05, our level of significance. We still reject H0. AZT has
an effect on Serum Antigen levels in AIDS patients. Note that the observed value is
given in the output panel under the column observed as 12, just what we have
calculated at the bottom of column 5 of the table above.
GENERAL NOTE:
WHEN TO USE TWO SIDED ALTERNATIVES.
1. When trying to decide which of two treatments is better and both are new, there is
no reason to state an alternative hypothesis which says that one is better than the
other.
2. If the problem is not deciding which of two treatments or procedures is better but
that one wants to know whether they differ at all then the alternative must be two-
sided. For example, in a study comparing a surgical approach to a medical
problem with a more conservative treatment, it may be desirable to examine first
whether it is necessary to distinguish between two surgical techniques used by
different surgeons.
WILCOXON SIGNED RANK TEST
Summary of Exact distribution of WILCOXON SIGNED RANK statistic:
Min Max Mean Std-dev Observed Standardized
0.0000 136.0 68.00 19.34 12.00 -2.896
Asymptotic Inference:
One-sided p-value: Pr { Test Statistic .LE. Observed } = 0.0019
Two-sided p-value: 2 * One-sided = 0.0038
Exact Inference:
One-sided p-value: Pr { Test Statistic .LE. Observed } = 0.0011
Pr { Test Statistic .EQ. Observed } = 0.0002
Two-sided p-value: Pr { | Test Statistic - Mean |
.GE. | Observed - Mean | = 0.0021
Two-sided p-value: 2*One-Sided = 0.0021
EXERCISES
1. From a group of nine rats available for a study of learning, five were selected at
random and were trained to imitate the leader rat in a maze. They were then
placed together with four untrained rats in a situation where imitation of the
leaders enabled them to avoid receiving an electric shock. The results (the
number of trials required to obtain ten correct responses in ten consecutive trials)
were as follows:
Trained rats 74 64 75 45 82
Control 110 70 53 51
Find the significance probability of these results when the Wilcoxon Rank Sum Test
is used. What do you conclude? (Test at α = 0.05).
2. The effectiveness of vitamin C in orange juice and in synthetic ascorbic acid was
compared in 20 guinea pigs (divided at random into two groups of 10) in terms of
the length of the odontoblasts after 6 weeks, with the following results:
Orange juice:  8.2 9.4 9.6 9.7 10.0 14.5 15.2 16.1 17.6 21.5
Ascorbic acid: 4.2 5.2 5.8 6.4 7.0 7.3 10.1 11.2 11.3 11.5
Test the hypothesis of no difference against the hypothesis that the orange juice tends
to give larger values. Use α = 0.05.
(a) Use the Rank Sum Test
(b) Sign Test
(c) Wilcoxon Signed Rank Test.
3. Suppose that a new postsurgical treatment is being compared with a standard
treatment by observing the recovery times of 9 treatment subjects and 9 controls.
The data are in the table below:
Standard Treatment: 20 21 24 30 32 36 40 48 54
New Treatment:      19 22 25 26 28 29 34 37 38
(a) Find the p-value if the Wilcoxon Rank Sum Test is used.
(b) Use the Wilcoxon Signed Rank Test to test the null hypothesis that the new
treatment is no better than the standard one. Use α = 0.05.
4. Suppose that 20 treatment patients are being compared with 20 controls, and that
the progress of each patient is classified as very poor, poor, indifferent, good, or
very good. The data are given in the following table:
            Very Poor   Poor   Indifferent   Good   Very Good
Control         2         2        11          4        1
Treatment       0         1         9          7        3
Is the treatment effective? Test at α = 0.05.
5. In an investigation of a new drug for postoperative pain relief, it was desired to
determine (among other things) whether the relief brought by 3 mg of the drug is
significantly greater than that resulting from a dose of 1 mg. In one phase of the
study, 15 freshly operated-upon patients were assigned at random, 7 to the lower
dose (T1) and the remaining 8 to the higher dose (T3). The responses (number of
hours of pain relief) were T1: 2, 0, 3, 3, 0, 0, 8 and T3: 6, 4, 4, 0, 1, 8, 2, 8. How
significant are these results? Use any appropriate nonparametric test (α = 0.05).
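For checking hand calculations in the rank sum exercises above, the following brute-force sketch may help. It is not taken from the text: the function names are hypothetical, tied observations receive midranks, and the exact null distribution is obtained by enumerating every way of choosing the first sample's ranks from the pooled ranks. This is feasible only for small samples such as those in Exercises 1 and 5. With 20 subjects per group, as in Exercise 4, enumeration is impractical and the normal approximation with a correction for ties is the usual route.

```python
from itertools import combinations


def midranks(values):
    """Rank the values in ascending order, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def rank_sum_exact(sample, other):
    """Return (observed rank sum of `sample`, lower-tail p, upper-tail p).

    The tails are Pr{W <= observed} and Pr{W >= observed} under the null
    hypothesis that every assignment of pooled ranks to `sample` is
    equally likely.
    """
    pooled = list(sample) + list(other)
    ranks = midranks(pooled)
    n = len(sample)
    observed = sum(ranks[:n])
    sums = [sum(chosen) for chosen in combinations(ranks, n)]
    lower = sum(s <= observed for s in sums) / len(sums)
    upper = sum(s >= observed for s in sums) / len(sums)
    return observed, lower, upper


if __name__ == "__main__":
    # Data from Exercise 1.
    trained = [74, 64, 75, 45, 82]
    control = [110, 70, 53, 51]
    observed, lower, upper = rank_sum_exact(trained, control)
    print("Rank sum of trained rats:", observed)  # 26.0
    print("Pr{W <= observed}:", lower)
    print("Pr{W >= observed}:", upper)
```

Which tail to report depends on the alternative hypothesis. In Exercise 1, fewer trials indicate better learning, so small rank sums for the trained rats favour the alternative.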
REFERENCES
Lehmann, E. L. (1975). Nonparametrics: Statistical Methods Based on Ranks. San
Francisco: Holden-Day Inc.
Gajjar, Y., Mehta, C., Patel, N., Senchaudhuri, P. StatXact: User's Manual.
PREFACE
It is not easy to produce a text which is strictly non-technical. Of course, when
designing course material one has to keep the audience at heart. How user-friendly a
text is is very subjective and depends on one's background. I have tried to make the
material attractive, user-friendly and straightforward without losing too much rigour.
I have also tried my best to make it simple. I could not have made it any simpler.
After all, Albert Einstein said that we can make things simple but not simpler!
Nonparametric statistics is seen as difficult by some because they are not exposed to
it at an early stage. However, it is very useful in biomedical research and is like any
other statistical method. Its importance in inference cannot be overemphasized.
The methods explained in this brochure can be tried out using statistical packages.
One such package is StatXact, which is chiefly for nonparametric statistics. Other
statistical packages also have routines for carrying out nonparametric inferential
procedures; SPSS is one example.
A reader who is looking for a more rigorous treatment should read the books
mentioned in the bibliography.
Any comments, corrections and necessary additions should be addressed to the
author. Constructive criticisms are welcome.
I hope you will enjoy this brochure.
Blantyre,
14 December, 2000
H. E. Misiri
HEMisiri@yahoo.co.uk
TABLE OF CONTENTS
Preface
Introduction
Ranking Data
Ties in a Data Set
Nonparametric Analogues of Some Parametric Methods
Sign Test for One Sample
Using the Binomial Distribution
Two-Sample Tests
Wilcoxon Rank Sum Test
Determining the Alternative
Using the Normal Approximation to the Distribution of Ws
Correcting for Ties When Using Ws
Wilcoxon-Mann-Whitney Test
Nonparametric Rank-Based Paired Samples Tests
The Wilcoxon Signed Rank Test
The Distribution of the Wilcoxon Signed Rank Test
When to Use Two-Sided Alternatives
Exercises
Bibliography
Appendix

Hmisiri nonparametrics book

  • 1. i NONPARAMETRIC RANK-BASED BIOSTATISTICAL METHODS by H.E. MISIRI (LECTURER) 14.DECEMBER 2000 Department Of Community Health College Of Medicine
  • 2. NONPARAMETRIC STATISTICAL TESTS

Introduction

When we use t-tests, we assume that the data are normally distributed, that is, that the data are a random sample (or random samples) from a normal population. The normal distribution has the probability density function

\[ f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}\right], \qquad -\infty < x < \infty. \]

Therefore, if we assume that a data set is normally distributed, we imply that it has this density function. Tests based on the assumption that the population has a certain functional form are called parametric tests. Examples are t-tests and z-tests.

It is still possible to test hypotheses without assuming that the data have a certain functional form. Tests conducted without such an assumption are called nonparametric tests. Their use is pragmatic: the assumption that the data follow a particular distribution may not be valid, and the sample size may not be big enough to justify the normal approximation (Central Limit Theorem). Some nonparametric tests also have higher power to detect alternatives under some circumstances.

RANKING DATA

The table below (Table 1.1) shows measurements of blood pressure taken from 10 subjects; 4 received treatment (group 1) and 6 were controls (group 2). Suppose we pool the data and arrange them in ascending order. If we then assign numbers to the ordered values so that the numbers denote the ordering, we say that we are ranking the data. The numbers, which index the position of each data point in the ordered sample, are called ranks. The first two columns of Table 1.1 show the data as collected. Column 3 contains the pooled values in ascending order, column 4 their group labels and column 5 their ranks.

Table 1.1: Measurements of Blood Pressure

    Group   Pressure (mmHg)   Ranked Values of Pressure   Group   Rank
    1        94                78                          2        1
    1       108                80                          2        2
    1       110                85                          2        3
    1        90                88                          2        4
    2        80                90                          1        5
    2        85                94                          1        6
    2        94                94                          2        7
    2        78               105                          2        8
    2       105               108                          1        9
    2        88               110                          1       10
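The ranking of the pooled blood pressure data can be sketched in a few lines of Python. This is an illustration only: the function name `midranks` is hypothetical, and the sketch assumes that equal values should share the average of the ranks they occupy, which is the tie-handling rule described in the section on ties. Under that rule the two values of 94 each get rank 6.5 instead of the ranks 6 and 7 shown in Table 1.1.

```python
def midranks(values):
    """Return the rank of each value; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        average = (start + 1 + end + 1) / 2  # mean of positions start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = average
        start = end + 1
    return ranks


treatment = [94, 108, 110, 90]        # group 1 in Table 1.1
control = [80, 85, 94, 78, 105, 88]   # group 2 in Table 1.1

pooled_ranks = midranks(treatment + control)
treatment_ranks = pooled_ranks[:len(treatment)]
control_ranks = pooled_ranks[len(treatment):]

print("treatment ranks:", treatment_ranks)
print("control ranks:  ", control_ranks)
```

Keeping the two groups in one pooled list and slicing afterwards is one simple way not to lose track of the group labels while ranking.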
  • 3. TIES IN A DATA SET

Suppose that two or more values in a data set are equal. We call this a tie. When ties occur, some remedial measures have to be taken. When several values are equal, each of them is given the average of the ranks they would otherwise occupy. Consider the following example. Measurements on some attribute were taken and are shown below:

    94, 108, 110, 90, 80, 94, 85, 90, 90, 90, 108, 94, 78, 105, 88

Several values occur more than once, namely 90, 94 and 108. The four values of 90 occupy positions 5 to 8, so the rank of each 90 is (5 + 6 + 7 + 8)/4 = 6.5. The table below (Table 1.2) shows the ranked data.

Table 1.2: Ranked Data

    Value   78   80   85   88   90    90    90    90    94   94   94   105   108    108    110
    Rank     1    2    3    4   6.5   6.5   6.5   6.5   10   10   10    12   13.5   13.5    15

When ties occur in a data set, the test statistics calculated from the data have to be adjusted. This adjustment is called correction for ties. It is sometimes hard to do manually, but some statistical packages have built-in mechanisms for doing it.

Some nonparametric tests are based on ranks. Statistical tests based on ranks are called rank tests. Some parametric tests have nonparametric rank test analogues. Examples are given in the following table (Table 1.3).

Table 1.3: Nonparametric Analogues of Some Parametric Methods

    Parametric Method                Nonparametric Analogue
    T-test for Paired Samples        Wilcoxon Signed Rank Test, Sign Test
    One Way Anova                    Kruskal-Wallis Test
    Two Way Anova                    Friedman Test
    T-test for Independent Samples   Rank Sum Test (Wilcoxon-Mann-Whitney U Test)
    T-test for One Sample            Sign Test

1. ONE SAMPLE TEST: THE SIGN TEST

These are tests based on data collected from one population. The parametric test one can use is the t-test for small samples. If the sample size is large, one can use the z-test. The sign test is based on the binomial distribution. Suppose we have a random sample of observations from a certain population. Suppose also that our null hypothesis is that the mean is equal to some constant, say µ0. Using statistical notation, this can be written as H0: µ = µ0.
A strong assumption we will have to make is that the distribution of the source population is symmetric, so that the population mean is equal to the median. Then, without having to assume that the data are normally distributed, we can state the same null hypothesis in terms of the median as follows:
  • 4. H0: M = M0, where M0 is the hypothesized median.

Let Di = Xi - M0, where Xi is observation i. Each nonzero difference is either positive or negative. Let Yi be defined as follows:

    Yi = 1, if Di > 0
       = 0, if Di < 0

Let S be the sum of the Yi's, that is, the number of positive differences. Under H0 the probability of a positive difference is equal to the probability of a negative difference, namely 0.5. If we take a positive difference as a success, then S has a binomial distribution with parameters n and π = 0.5. Suppose that there are n0 ties, that is, n0 differences equal to zero. These are discarded, and S then has a binomial distribution with parameters n - n0 and π = 0.5.

Example: The following are measurements of weight, in kilograms, for 6 individuals: 62, 63, 64.5, 65, 72 and 60. We strongly believe that the mean is greater than 64 kg. Find the deviations (Di's) and S, the sum of the Yi's.

Solution:

Table 1.4: Deviations From The Mean

    Obs No (i)    1     2     3      4    5    6
    Weight (kg)   62    63    64.5   65   72   60
    Difference    -2    -1    0.5    1    8    -4
    Yi            0     0     1      1    1    1
    Sign          -     -     +      +    +    +

Thus, S = 1 + 1 + 1 + 1 = 4. There are 4 positive differences. S has a binomial distribution with parameters n = 6 and π = 0.5: n = 6 because there are 6 nonzero differences (no ties), and π = 0.5 because under H0 the probability of a positive difference is 0.5. We can thus use the binomial distribution to test the following hypotheses:

    1. H0: M = M0 vs Ha: M ≠ M0
    2. H0: M = M0 vs Ha: M > M0
    3. H0: M = M0 vs Ha: M < M0

Implication of The Null Hypothesis

Under the null hypothesis, the proportion of positive differences (π) is equal to the proportion of negative differences. We also assume that the data have a symmetric distribution (not necessarily bell-shaped); if a distribution is symmetric, µ = M. Note that the null hypothesis has the equivalent expression H0: π = 0.5. Thus the above hypotheses can be rewritten as follows:
  • 5.
    1. H0: π = π0 vs Ha: π ≠ π0
    2. H0: π = π0 vs Ha: π > π0
    3. H0: π = π0 vs Ha: π < π0

where π0 = 0.5.

USE OF THE BINOMIAL DISTRIBUTION

To test the above hypotheses, we will use the cumulative binomial distribution. Recall that the pmf of a binomial distribution is

\[ \Pr(R = r) = \binom{n}{r}\pi^{r}(1-\pi)^{n-r}, \qquad r = 0, 1, \ldots, n. \]

The cumulative binomial distribution function is thus

\[ \Pr(R \le r_0) = \sum_{r=0}^{r_0} \binom{n}{r}\pi^{r}(1-\pi)^{n-r}, \qquad r_0 = 0, 1, \ldots, n. \]

Suppose that we have a one-tailed test (upper tail) and that our level of significance is 0.05. Then we look for the value of R which gives a cumulative probability of approximately 0.95. The table below (Table 1.5) shows the cumulative probabilities of a Bin(n = 10, π = 0.5) distribution.

Table 1.5: Bin(10, 0.5) Distribution

    n    R = r   π     Pr(R ≤ r)
    10    0      0.5   0.0010
    10    1      0.5   0.0107
    10    2      0.5   0.0547
    10    3      0.5   0.1719
    10    4      0.5   0.3770
    10    5      0.5   0.6230
    10    6      0.5   0.8281
    10    7      0.5   0.9453
    10    8      0.5   0.9893
    10    9      0.5   0.9990
    10   10      0.5   1.0000

From Table 1.5 above, R = 7 gives a cumulative probability of approximately 0.95. This is our cut-off (critical) point. We will therefore reject H0 if the observed value of R is greater than 7. The critical region for our test is composed of 8, 9 and 10; the exact size of this test is 1 - 0.9453 = 0.0547.

Let us go back to our previous example about measurements of weight in Table 1.4. There we have n = 6 and π = 0.5 (as usual). Our null hypothesis is H0: π = 0.5 and our alternative is Ha: π > 0.5.
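The cumulative probabilities in the Bin(10, 0.5) table and the cut-off point can be checked with a short Python sketch. The function names `binom_cdf` and `upper_critical_point` are hypothetical, and the cut-off rule used here (the smallest r whose cumulative probability is at least 1 - α) is an assumption that is stricter than the "approximately 0.95" rule used in the text: for n = 10 and α = 0.05 it returns 8 rather than 7, because rejecting for R > 7 has size 0.0547, slightly above 0.05.

```python
from math import comb


def binom_cdf(r0, n, pi=0.5):
    """Pr(R <= r0) for R ~ Bin(n, pi)."""
    return sum(comb(n, r) * pi**r * (1 - pi) ** (n - r) for r in range(r0 + 1))


def upper_critical_point(n, alpha, pi=0.5):
    """Smallest r with Pr(R <= r) >= 1 - alpha; H0 is rejected when R > r."""
    for r in range(n + 1):
        if binom_cdf(r, n, pi) >= 1 - alpha:
            return r
    return n


# Reproduce the cumulative column of the Bin(10, 0.5) table.
for r in range(11):
    print(f"r = {r:2d}   Pr(R <= r) = {binom_cdf(r, 10):.4f}")

size_if_cutoff_7 = 1 - binom_cdf(7, 10)   # size of the test that rejects for R > 7
print("size when rejecting for R > 7:", round(size_if_cutoff_7, 4))
print("strict cut-off for n = 10, alpha = 0.05:", upper_critical_point(10, 0.05))
print("strict cut-off for n = 6, alpha = 0.025:", upper_critical_point(6, 0.025))
```

Whether to prefer the approximate or the strict cut-off is a judgment call; the strict rule guarantees that the size of the test does not exceed α, at the cost of some power.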
  • 6. The table below (Table 1.6) shows a Bin(6, 0.5) distribution.

Table 1.6: A Bin(6, 0.5) Distribution

    n   R = r   π     Pr(R ≤ r)
    6   0       0.5   0.0156
    6   1       0.5   0.1094
    6   2       0.5   0.3438
    6   3       0.5   0.6563
    6   4       0.5   0.8906
    6   5       0.5   0.9844
    6   6       0.5   1.0000

We will test at the 2.5% level of significance. Our critical (cut-off) point is the value of R which gives a cumulative probability of approximately 0.975. R = 5 will do because it gives a cumulative probability of 0.9844, which is close to 0.975. We will therefore reject H0 if R > 5 (i.e. if R = 6); the exact size of this test is 1 - 0.9844 = 0.0156. Here R = S, the sum of the Yi's. From Table 1.4 above, S = 4, which implies that R = 4 for this data set. We therefore do not reject H0: the data give no evidence that the mean is greater than 64 kg.

There are special tables for the cumulative binomial distribution. You can also use a computer to get the same results. Some of the statistical packages which can be used are StatXact and SPSS. Below is output from both StatXact and SPSS.

StatXact Output Panel

    ESTIMATION OF BINOMIAL PARAMETER (PI)
    Number of Trials                     = 6
    Number of Successes                  = 4
    Point Estimation of PI               = 0.6667
    97.50% Confidence Interval for PI    = ( 0.1839 , 0.9699 )
    Exact P-values for testing PI = 0.0250
      One-sided : Pr { T .GE. 4 }        = 0.0000
      Two-sided : 2 * One-sided          = 0.0000

The panel above reports a one-sided p-value of 0.0000, but this p-value is for testing PI = 0.0250, not our null value π = 0.5, so it cannot be used to reject H0: π = 0.5. For our null hypothesis, Table 1.6 gives the one-sided p-value Pr(R ≥ 4) = 1 - Pr(R ≤ 3) = 1 - 0.6563 = 0.3437, which is larger than 0.025. This agrees with the manual test: we do not reject H0. The confidence interval in the panel tells the same story, since it contains 0.5.
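  • 7. SPSS Output Panel

The SPSS panel is of limited direct use here because the p-value supplied is two-sided (0.6875). Because the Bin(6, 0.5) distribution is symmetric, the one-sided p-value is half of this, 0.3438, which again does not lead to rejection of H0.

    - - - - -  Binomial Test
       D
       Cases                    Test Prop. = .5000
           4   LE .00           Obs. Prop. = .6667
           2   GT .00
           -                    Exact Binomial
           6   Total            2-Tailed P = .6875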
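The exact sign test p-values can also be computed directly from the binomial distribution. The following Python sketch is an illustration, not the routine used by either package; the function name `sign_test` is hypothetical. It takes the number of positive differences and the number of nonzero differences and returns the upper-tail, lower-tail and two-sided p-values under H0: π = 0.5.

```python
from math import comb


def sign_test(n_positive, n_nonzero):
    """Exact sign test p-values under H0: pi = 0.5 (zero differences already removed)."""
    total = 2**n_nonzero
    upper = sum(comb(n_nonzero, s) for s in range(n_positive, n_nonzero + 1)) / total
    lower = sum(comb(n_nonzero, s) for s in range(0, n_positive + 1)) / total
    two_sided = min(1.0, 2 * min(upper, lower))
    return upper, lower, two_sided


# Weight example as reported in the text: S = 4 positive differences out of n = 6.
upper, lower, two_sided = sign_test(4, 6)
print("Pr(S >= 4) =", upper)
print("Pr(S <= 4) =", lower)
print("two-sided  =", two_sided)
```

With S = 4 and n = 6 the two-sided value equals the 2-tailed P of 0.6875 in the SPSS panel, and the upper-tail value is the one-sided p-value relevant to Ha: π > 0.5.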
  • 8. 2. TWO SAMPLE TESTS

2.1 INDEPENDENT SAMPLES TESTS

2.1.1 WILCOXON RANK SUM TEST/WILCOXON-MANN-WHITNEY TEST

Suppose we are 'given' a group of N subjects to be used in a study. Let us take the example of a clinical trial (an experimental study). If we want to compare a new drug to a placebo (a harmless pill containing no active ingredients), we divide the subjects into two groups. These groups are called the arms of the trial. Arms of a trial may have different numbers of subjects. Let n denote the number of subjects in the treatment group and m the number of subjects in the control group. Clearly, n + m = N. The subjects are randomly allocated to the treatment and control groups.

In some trials, patients are not allowed to know which drug they are receiving; only the clinicians in the study know. Administering a drug without telling the patient which drug he is getting is called blinding. When the clinician knows the drug and the patient does not, the blinding is called single blinding. Single blinding eliminates the psychological effects that might result if a patient knows which drug he is receiving. For example, a patient can fake improvement! Sometimes neither the clinician nor the patient knows which drug is being administered. This is called double blinding. It eliminates the bias that can arise when a clinician favours a relative or friend in the study by giving him or her the new drug being tested. Blood is thicker than water!

In this example, chance enters the study through the allocation of subjects to the arms. We cannot claim that the group of N subjects is a sample from some population. This scenario is very common to physicians. Here is a hospital ward. There are only 6 patients. Four are on Treatment A and two are on Treatment B. Measurements of their blood pressure are taken.

The main objective of the trial is to see if the two treatments have the same effect on the blood pressure of patients. Here, the sample size is very small. We cannot even speak of asymptotics (the normal approximation)! The data may be skewed, which may make it implausible to use the Student's t-test. A very practical approach to this problem is to use exact nonparametric methods. Since subjects are allocated to the two groups at random, the two samples can be treated as independent. There is a rank test which can be used to compare the two groups. It is called the Wilcoxon Rank Sum Test.

Suppose that the responses for the two groups are pooled and ranked. Let Si denote the rank of the ith subject from the treatment group and Rj denote the rank of the jth subject from the control group, where i = 1,…,n and j = 1,…,m. Our statistic is the sum of the ranks from the treatment group, S1 + … + Sn. This is usually denoted by either Ws or Tx. Similarly, the sum of the ranks for the control group is Wr. To carry out this test, one must be very careful not to lose track of the group labels of the observations.
  • 9. The test statistic Ws has a known distribution under the null hypothesis, and hypotheses can be tested using this distribution. There are tables for this. The test is available in many statistical packages, e.g. StatXact.

DETERMINING THE ALTERNATIVE HYPOTHESIS

Suppose that we are comparing two groups on the basis of measurements of weight in kilograms. Group I was on Diet A and Group II was on Diet B. We want to know if Diet A is more effective in increasing body weight than Diet B. We will therefore be testing the following hypotheses:

    H0: µ1 = µ2 vs Ha: µ1 > µ2

Higher values of weight in the first group will be in favour of Ha. Since we rank the pooled weights, if Group I has higher weights on average than Group II, then the rank sum statistic Ws will be large. Diet A will be said to be more effective than Diet B in increasing body weight if Ws ≥ c, where the value of c is chosen so that Pr(Ws ≥ c) = α under H0.

Alternatively, one can test the same hypotheses using the significance probability (another name for which is the p-value). This is Pr(Ws ≥ ws), where ws is the value of Ws calculated from the sample data. For small samples it can be read from special tables (see the appendix). The p-value is then compared with the level of significance, α = 0.05 say, and H0 is rejected if the p-value is less than α.

Example 1: A mental hospital wishes to test the effectiveness of a new drug that is believed to have a beneficial effect on some mental or emotional disorder. There are 5 patients in the hospital suffering from this disorder. Three are selected at random to receive the new drug and the other 2 serve as controls. The ranks of the treatment subjects are 2, 3 and 5 and those of the control subjects are 1 and 4. Does the drug have a beneficial effect? Test at α = 0.05.

Solution:

    i                      1   2   3
    Treatment ranks, Si    2   3   5

    j                      1   2
    Control ranks, Rj      1   4

To carry out the hypothesis test, the exact distribution of Ws must be known. This can be worked out when N is small. For N = 5 and n = 3, the distribution of Ws is as below:
  • 10. Table 2.1: The Distribution of Ws Using Permutation

    ws    Pr(Ws = ws)   Pr(Ws ≤ ws)
     6    0.100         0.100
     7    0.100         0.200
     8    0.200         0.400
     9    0.200         0.600
    10    0.200         0.800
    11    0.100         0.900
    12    0.100         1.000

The distribution is obtained by listing all 10 possible sets of three treatment ranks out of the ranks 1 to 5, each of which is equally likely under H0. Our observed value of Ws is ws = 2 + 3 + 5 = 10. From the above table (Table 2.1), Pr(Ws ≥ 10) = 1 - Pr(Ws < 10) = 1 - Pr(Ws ≤ 9) = 1 - 0.6 = 0.4. The p-value is 0.4. At the 5% level of significance, we do not reject H0: there is no evidence that the drug has a beneficial effect. Most statistical packages have an equivalent form of the above test called the Wilcoxon-Mann-Whitney Test or just the Mann-Whitney Test.

USING THE NORMAL APPROXIMATION

When both m and n are less than 10, critical values and significance probabilities of the Wilcoxon Rank Sum statistic can be obtained from the table in the appendix. When the sample sizes are larger, we can use the normal approximation. To do this, we need the mean and variance of Ws.

MEAN AND VARIANCE OF Ws

The mean of Ws is

\[ E(W_s) = \tfrac{1}{2}\, n (N+1), \]

where N is the total number of subjects in the study and n is the number of subjects in the treatment group. The variance of Ws is

\[ \mathrm{Var}(W_s) = \tfrac{1}{12}\, m n (N+1), \]

where m, n and N are as before. Similarly,

\[ E(W_r) = \tfrac{1}{2}\, m (N+1), \]

where N is the total number of subjects in the study and m is the number of subjects in the control group, and
  • 11.
\[ \mathrm{Var}(W_r) = \tfrac{1}{12}\, m n (N+1), \]

where m, n and N are as before. Thus, using the normal approximation,

\[ Z = \frac{W_s - \tfrac{1}{2}\, n (N+1)}{\sqrt{\tfrac{1}{12}\, m n (N+1)}} \]

has an approximate standard normal distribution by the Central Limit Theorem. Similarly,

\[ Z = \frac{W_r - \tfrac{1}{2}\, m (N+1)}{\sqrt{\tfrac{1}{12}\, m n (N+1)}} \]

has an approximate standard normal distribution by the same theorem. Therefore the p-value for a one-sided test, Pr(Ws ≤ ws) or Pr(Ws ≥ ws), can be calculated using this approximation.

Example 2: Suppose that m = 10, n = 10 and Ws = 79. Calculate Pr(Ws ≤ 79) using the normal approximation.

Solution: With N = 20, E(Ws) = ½(10)(21) = 105 and Var(Ws) = (10)(10)(21)/12 = 175. Thus

\[ \Pr(W_s \le 79) \approx \Pr\left(Z \le \frac{79 - 105}{\sqrt{175}}\right) = \Pr\left(Z \le \frac{-26}{13.23}\right) = \Pr(Z \le -1.965) = 0.025. \]

CORRECTION FOR TIES

Suppose that there are ties in a data set such that the smallest distinct value occurs with frequency f1, the second smallest with frequency f2, and so on. Write Ws* for the rank sum statistic computed from the average ranks. Its mean does not change; in other words, E(Ws*) = E(Ws). However, the variance changes and is equal to

\[ \mathrm{Var}(W_s^{*}) = \frac{m n (N+1)}{12} - \frac{m n \sum_{i=1}^{e} \left(f_i^{3} - f_i\right)}{12\, N (N-1)}, \]

where e is the number of distinct values in the pooled data, so that f_e is the frequency of the largest observation.
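The arithmetic of Example 2 can be checked with a few lines of Python. This is a minimal sketch assuming no ties and no continuity correction, as in the text; the function names `normal_cdf` and `rank_sum_z` are hypothetical, and the standard normal distribution function is obtained from the error function in the standard library.

```python
from math import erf, sqrt


def normal_cdf(z):
    """Standard normal distribution function, Pr(Z <= z)."""
    return 0.5 * (1 + erf(z / sqrt(2)))


def rank_sum_z(ws, n, m):
    """Standardized rank sum: (Ws - n(N+1)/2) / sqrt(mn(N+1)/12), no ties assumed."""
    N = n + m
    mean = n * (N + 1) / 2
    variance = m * n * (N + 1) / 12
    return (ws - mean) / sqrt(variance)


z = rank_sum_z(79, n=10, m=10)
p_lower = normal_cdf(z)   # approximates Pr(Ws <= 79)
print("z =", round(z, 3))
print("Pr(Ws <= 79) is approximately", round(p_lower, 4))
```

For an upper-tail p-value, Pr(Ws ≥ ws), one would use `1 - normal_cdf(z)` instead.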
  • 12. Example 3: Psychological counselling

In a test of the effect of psychological counselling, 80 boys are divided at random into a control group of 40, to whom only the normal counselling facilities are available, and a treatment group of 40, who receive special counselling. At the end of the study, a careful assessment is made of each boy, who is then classified as having made a good, fairly good, fairly poor, or poor adjustment, with the following results.

Table 2.2: Psychological Counselling Data

                 Poor   Fairly Poor   Fairly Good   Good   Total
    Treatment    5      7             16            12     40
    Control      7      9             15            9      40

Does counselling have a positive effect on adjustment? Test at α = 0.05.

Solution: We assign ranks to the four categories of adjustment thus: Poor = 1, Fairly Poor = 2, Fairly Good = 3, Good = 4. We can see from the above table that we have ties in each category of adjustment. We have 5 + 7 = 12 observations tied at rank 1, 16 at rank 2, 31 at rank 3 and 21 at rank 4. Below is a table of the ranks and the number of observations tied at each rank.

Table 2.3: Ranks And Number Of Observations Tied At Rank i

    Rank, i                                      1     2      3    4
    Number of observations tied at rank i, fi    12    16     31   21
    Average rank, Si*                            6.5   20.5   44   70

These average ranks are obtained as follows. At the first position we have 12 observations tied. The mean of their ranks is (1 + 2 + … + 12)/12 = 6.5. At the second position we have 16 observations tied. Their mean rank is therefore (13 + 14 + … + 28)/16 = 20.5, and so on.

Since our Wilcoxon Rank Sum statistic uses the treatment ranks, and the treatment group has 5, 7, 16 and 12 boys in the four categories,

    Ws* = (6.5 x 5) + (20.5 x 7) + (44 x 16) + (70 x 12) = 1,720.

Since N = 80, m = n = 40 and f1 = 12, f2 = 16, f3 = 31, f4 = 21 (so that e = 4), E(Ws*) = 1,620 and Var(Ws*) = 10,800 - 945.06 = 9,854.94, which gives a standard deviation of 99.27.

Our hypotheses are

    H0: Counselling has no positive effect
    Ha: Counselling has a positive effect

Thus we will reject H0 if Pr(Ws* ≥ 1,720) < α = 0.05. Now Pr(Ws* ≥ 1,720) ≈ Pr(Z ≥ (1,720 - 1,620)/99.27) = 1 - Pr(Z ≤ 1.01) = 0.16.

Since 0.16 > 0.05, we do not reject H0: the data give no evidence that counselling has a positive effect on adjustment.
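The counselling example can be verified step by step in Python. The sketch below is illustrative and assumes the tie-corrected variance formula given in the section on correction for ties; variable names such as `ties` and `midranks` are hypothetical. It recomputes the average ranks, the statistic Ws*, its mean and standard deviation, and the upper-tail normal approximation.

```python
from math import erf, sqrt

# Counts per category: Poor, Fairly Poor, Fairly Good, Good (Table 2.2).
treatment = [5, 7, 16, 12]
control = [7, 9, 15, 9]

n, m = sum(treatment), sum(control)
N = n + m
ties = [t + c for t, c in zip(treatment, control)]   # f_i: 12, 16, 31, 21

# Average rank of each category: positions used+1 .. used+f have mean used + (f+1)/2.
midranks = []
used = 0
for f in ties:
    midranks.append(used + (f + 1) / 2)
    used += f

ws = sum(t * r for t, r in zip(treatment, midranks))
mean = n * (N + 1) / 2
variance = m * n * (N + 1) / 12 - m * n * sum(f**3 - f for f in ties) / (12 * N * (N - 1))
sd = sqrt(variance)
z = (ws - mean) / sd
p_upper = 1 - 0.5 * (1 + erf(z / sqrt(2)))   # approximates Pr(Ws* >= ws)

print("average ranks:", midranks)
print("Ws* =", ws, " mean =", mean, " sd =", round(sd, 2))
print("z =", round(z, 3), " p =", round(p_upper, 3))
```

The same code works for any ordered categorical table with two rows, since only the category counts are needed.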
  • 13. 2.1.2 MANN-WHITNEY TEST

Recall that Ws is the sum of the treatment ranks. Let

\[ W_{XY} = W_s - \tfrac{1}{2}\, n (n+1), \]

where n is defined as before. This statistic is the Mann-Whitney statistic.

MEAN AND VARIANCE OF THE MANN-WHITNEY STATISTIC

The mean of WXY is

\[ E(W_{XY}) = \tfrac{1}{2}\, m n \]

and its variance is

\[ \mathrm{Var}(W_{XY}) = \tfrac{1}{12}\, m n (N+1). \]

CORRECTION FOR TIES

In the presence of ties, we use the same correction as for Ws*. The mean of WXY* is equal to the mean of WXY when there are no ties, and the variance of WXY* is equal to the variance of Ws*.

USING THE NORMAL APPROXIMATION

Using the mean and variance of WXY*, p-value computations can be made using the normal approximation.

Example 4: Diastolic pressure was measured on 3 subjects in a treatment group and 4 subjects in a control group. The data are displayed in Table 1.1 below.

Table 1.1: Blood Pressure of Subjects In A Clinical Trial

    Treated Group   Control Group
     95             80
    101             78
    110             85
                    90

Does the drug have an effect on blood pressure? Test at α = 0.05.

Solution:

(i) H0: The drug has no effect on blood pressure
    H1: The drug has an effect on blood pressure

Note: Since we do not say explicitly what effect the drug might have on pressure, this is a two-tailed test.
  • 14. (ii) The test statistic is the Wilcoxon Rank Sum statistic,

\[ W_s = \sum_{i=1}^{n} S_i, \]

where Si is the rank of subject i in the treatment group.

(iii) Pool the data and rank them. You can use Microsoft Excel to do this. When that is done you get the results presented in the following table, Table 2.4.

Table 2.4: Data On Blood Pressure Pooled And Ranked

    Blood Pressure   Group   Rank   Ranks for Treatment Group
     78              C       1
     80              C       2
     85              C       3
     90              C       4
     95              T       5      5
    101              T       6      6
    110              T       7      7
    Sum                             18

(iv) From the above table, the observed value of Ws is 18. The Mann-Whitney statistic is WXY = Ws - ½ n(n + 1) = 18 - ½(3)(3 + 1) = 12.

(v) WXY = 12 is the largest value the statistic can take, since its maximum is mn = 12. Under H0 each of the 35 possible sets of three treatment ranks is equally likely and only one of them gives WXY = 12, so Pr(WXY ≥ 12) = 1/35 = 0.0286. The two-sided significance probability (p-value) is 2 x 0.0286 = 0.0571. Since the p-value is greater than 0.05, we do not reject H0: the data do not give sufficient evidence that the drug has an effect on the blood pressure of subjects.

Using StatXact, one gets the following output,
  • 15. StatXact Output Panel

    WILCOXON-MANN-WHITNEY TEST
    [ Sum of scores from population < 1 > ]
    Summary of Exact distribution of WILCOXON-MANN-WHITNEY statistic:
      Min     Max     Mean    Std-dev   Observed   Standardized
      6.000   18.00   12.00   2.828     18.00      2.121
    Mann-Whitney Statistic = 12.00
    Asymptotic Inference:
      One-sided p-value: Pr { Test Statistic .GE. Observed } = 0.0169
      Two-sided p-value: 2 * One-sided                       = 0.0339
    Exact Inference:
      One-sided p-value: Pr { Test Statistic .GE. Observed } = 0.0286
                         Pr { Test Statistic .EQ. Observed } = 0.0286
      Two-sided p-value: Pr { | Test Statistic - Mean | .GE. | Observed - Mean | = 0.0571
      Two-sided p-value: 2*One-Sided                         = 0.0571

Explanation of The Above Output Panel

The mean is 12, which agrees with theory since

\[ E(W_s) = \tfrac{1}{2}\, n (N+1) = \tfrac{1}{2} \times 3 \times 8 = 12. \]

The standard deviation is 2.828. We know that the variance of Ws is

\[ \mathrm{Var}(W_s) = \tfrac{1}{12}\, m n (N+1) = \tfrac{1}{12} \times 4 \times 3 \times 8 = 8. \]

Hence the standard deviation of Ws is √8 = 2.828.

The standardized value is 2.121. We know that, using the normal approximation,

\[ Z = \frac{W_s - E(W_s)}{\sqrt{\tfrac{1}{12}\, m n (N+1)}} \]

has an approximate normal distribution with parameters µ = 0 and σ² = 1 (i.e. Z is approximately N(0, 1)). Hence

\[ Z = \frac{18 - 12}{2.828} = \frac{6}{2.828} = 2.121. \]

StatXact gives approximate as well as exact p-values. When the sample size is large, it is advisable to use the approximate p-value because then the normal approximation holds (recall the Central Limit Theorem). For a very big sample, exact computation of the p-value may simply not be feasible.

Since our test is two-tailed, we use the two-sided p-values. From the approximate two-sided p-value in the above output panel (0.0339), we would reject H0. However, with only 3 and 4 subjects in the two groups the normal approximation is not reliable, and in this case of a small sample size it is appropriate to use exact methods.
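The exact figures in the StatXact panel can be reproduced by brute-force enumeration, which is feasible because the samples are so small. The Python sketch below is an illustration of the permutation idea, not a description of how StatXact computes its results; the function name `rank_sum_distribution` is hypothetical, and the sketch assumes there are no ties, so that the pooled ranks are simply 1 to N.

```python
from itertools import combinations
from math import sqrt


def rank_sum_distribution(n, m):
    """Rank sums of every possible set of n treatment ranks out of 1..n+m (no ties)."""
    N = n + m
    return [sum(c) for c in combinations(range(1, N + 1), n)]


n, m, observed = 3, 4, 18
sums = rank_sum_distribution(n, m)   # all equally likely under H0

mean = sum(sums) / len(sums)
sd = sqrt(sum((w - mean) ** 2 for w in sums) / len(sums))
p_upper = sum(w >= observed for w in sums) / len(sums)
p_two_sided = sum(abs(w - mean) >= abs(observed - mean) for w in sums) / len(sums)
mann_whitney = observed - n * (n + 1) // 2

print("allocations:", len(sums), " min:", min(sums), " max:", max(sums))
print("mean:", mean, " sd:", round(sd, 3))
print("exact one-sided p:", round(p_upper, 4), " exact two-sided p:", round(p_two_sided, 4))
print("Mann-Whitney statistic:", mann_whitney)
```

Calling `rank_sum_distribution(3, 2)` and tabulating the result gives the permutation distribution of Ws for the five-patient drug example (Table 2.1). The number of allocations grows very quickly with N, which is why enumeration of this kind is only practical for small samples.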
  • 16. Using the exact two-sided p-value from the same output panel (0.0571, which is greater than 0.05), we do not reject H0; there is no evidence that the treatment has an effect.

Example 3 (continued): We found the mean and standard deviation of Ws* to be 1,620 and 99.27 respectively. The corresponding Mann-Whitney statistic is WXY* = 1,720 - ½(40)(41) = 900. Its mean is ½ mn = 800 and its standard deviation is again 99.27. Hence Pr(WXY* ≥ 900) ≈ Pr(Z ≥ 1.007) = 0.16 as before. The conclusion does not change: we still do not reject H0.

PAIRED SAMPLES TESTS

1. THE SIGN TEST

In Section 1, we encountered the sign test for the first time as a nonparametric test of location for one sample. Suppose now that we have two paired samples. Let Xi denote the response (outcome) of subject i to some treatment A and Yi the response of the matched subject i to treatment B. Using the paired samples t-test, we can test whether one treatment is better than the other. We use the differences between the paired measurements Xi and Yi. Let Di = Yi - Xi. Regardless of their size, the nonzero differences will be either positive or negative. Our interest is in the positive differences. If responses to treatment B tend to be higher than responses to treatment A, then we expect most of the Di to be positive.

As before, define an indicator for each difference (written Ii here, since Yi now denotes a response):

    Ii = 1, if Di > 0
       = 0, if Di < 0

The sum of the Ii's, S say, is the total number of positive differences, which is our statistic. S has a binomial distribution with parameters n (the sample size) and π = 0.5. When there are ties, the difference Di is equal to zero. One simple solution is to reduce n by the number of zeros. For instance, if there are 6 zeros and the sample size is 20, the value of n to use in testing will be 14.

DISTRIBUTION OF S

Since S has a binomial distribution with parameters n and π = 0.5, the mean of S is

\[ E(S) = n\pi = \frac{n}{2} \]

and its variance is

\[ \mathrm{Var}(S) = n\pi(1-\pi) = n \times \tfrac{1}{2} \times \tfrac{1}{2} = \frac{n}{4}. \]

If n is large, one can use the normal approximation.

Example: The following data from Makutch and Parks (1988) document the response of serum antigen level to AZT in 20 AIDS patients.
Two antigen levels are provided for each patient: pre-treatment and post-treatment. The differences are also displayed, along with the sign test scores.
  • 17. Table 2.5: AZT and Serum Antigen Trial

    Patient   Serum Antigen Level (pg/ml)                Sign Test
    Id        Pre-AZT    Post-AZT    Difference          Scores
     1         149          0         -149               0
     2           0         51           51               1
     3           0          0            0               -
     4         259        385          126               1
     5         106          0         -106               0
     6         255        235          -20               0
     7           0          0            0               -
     8          52          0          -52               0
     9         340         48         -392               0
    10          65         65            0               -
    11         180         77         -103               0
    12           0          0            0               -
    13          84          0          -84               0
    14          89          0          -89               0
    15         212         53         -159               0
    16         554        150         -404               0
    17         500          0         -500               0
    18         424        165         -259               0
    19         112         98          -14               0
    20        2600          0        -2600               0
    Sum                                                  2

Does AZT have an effect on serum antigen levels? Test at α = 0.05.

Solution: There are 20 differences, 4 of which are zeros. Therefore, for our sign test n = 16. Two differences are positive. Therefore S = 2. The calculation of a p-value depends on our alternative hypothesis. If our alternative is that AZT increases the serum antigen level, then large values of S support the alternative and the p-value is

\[ p = \Pr(S \ge 2) = \sum_{s=2}^{16} \Pr(S = s) = 1 - \sum_{s=0}^{1} \Pr(S = s) = 1 - \left[\binom{16}{0}(0.5)^{0}(0.5)^{16} + \binom{16}{1}(0.5)^{1}(0.5)^{15}\right] \]

= 1 - (0.00001526 + 0.0002441) = 0.99974.

Against this alternative we do not reject H0; there is no evidence that AZT increases serum antigen levels. Since only 2 of the 16 nonzero differences are positive, the data point in the opposite direction. If the alternative is that AZT decreases the serum antigen level, small values of S support it and the p-value is Pr(S ≤ 2) = (1 + 16 + 120)/2^16 = 0.0021, which is less than 0.05, so H0 is rejected in favour of a decrease. Using StatXact, we get the following output in the panel below.
  • 18.
    SIGN TEST
    Summary of Exact distribution of SIGN statistic:
      Min      Max     Mean    Std-dev   Observed   Standardized
      0.0000   16.00   8.000   2.000     2.000      -3.000
    Asymptotic Inference:
      One-sided p-value: Pr { Test Statistic .LE. Observed } = 0.0013
      Two-sided p-value: 2 * One-sided                       = 0.0027
    Exact Inference:
      One-sided p-value: Pr { Test Statistic .LE. Observed } = 0.0021
                         Pr { Test Statistic .EQ. Observed } = 0.0018
      Two-sided p-value: 2*One-Sided                         = 0.0042

Explanation Of Output: Manually, we found S equal to 2. In the above output, under the column 'Observed', we have 2, which is the observed value of S. Our alternative is non-directional since we do not say what effect AZT has. Therefore, we use the exact two-sided p-value, which is equal to 0.0042. We reject H0. AZT has an effect on serum antigen levels in AIDS patients.

The Sign Test for paired samples is good and simple to use. However, there is a loss of information since the actual magnitude of the differences is not taken into consideration. As such, it has less power than a test which takes the actual size of the differences into consideration. One such method is the Wilcoxon Signed Rank Test.

2. THE WILCOXON SIGNED RANK TEST

This method is for paired samples and is similar to the Sign Test for paired differences. The main difference is that the magnitudes of the differences are considered in testing. The method is as follows:

    1. Find the difference, Di, between the paired values, and discard the zero differences.
    2. Ignoring the sign of the differences (i.e. whether positive or negative), arrange them in ascending order and rank them.
    3. For each difference, attach the sign of the difference to the rank of that difference.
    4. Add all ranks with a positive sign. Let the sum of these positive ranks be TWS. This is your statistic.
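The steps of the signed rank method listed above can be sketched in Python using the differences of the AZT data. This is a minimal illustration under two stated assumptions: zero differences are dropped before ranking, and no two nonzero differences have the same magnitude (true for these data), so no average ranks are needed. The differences are entered exactly as they appear in the Difference column of the AZT table, and all variable names are hypothetical.

```python
# Differences (post-AZT minus pre-AZT) as listed in the AZT table.
differences = [-149, 51, 0, 126, -106, -20, 0, -52, -392, 0,
               -103, 0, -84, -89, -159, -404, -500, -259, -14, -2600]

# Step 1: keep the nonzero differences only.
nonzero = [d for d in differences if d != 0]

# Step 2: order by magnitude, ignoring the sign.
ordered = sorted(nonzero, key=abs)

# Step 3: attach the sign of each difference to its rank.
signed_ranks = [(d, rank if d > 0 else -rank)
                for rank, d in enumerate(ordered, start=1)]

# Step 4: add up the ranks that carry a positive sign.
t_plus = sum(rank for _, rank in signed_ranks if rank > 0)

for d, rank in signed_ranks:
    print(f"difference {d:6d}   signed rank {rank:3d}")
print("sum of positive ranks:", t_plus)
```

If tied magnitudes were present, step 2 would need average ranks instead of consecutive positions.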
  • 19. THE DISTRIBUTION OF TWS

Let n be the number of nonzero differences. The mean of TWS is

\[ E(T_{WS}) = \frac{n(n+1)}{4} \]

and its variance is

\[ \mathrm{Var}(T_{WS}) = \frac{n(n+1)(2n+1)}{24}. \]

One can use these two for the normal approximation.

Example: We will revisit the AZT example. The data are reproduced below with another column added.

    Patient                                       Signed
    Id        Pre-AZT   Post-AZT   Difference     Ranks
     1         149         0        -149          -10
     2           0        51          51            3
     3           0         0           0            .
     4         259       385         126            9
     5         106         0        -106           -8
     6         255       235         -20           -2
     7           0         0           0            .
     8          52         0         -52           -4
     9         340        48        -392          -13
    10          65        65           0            .
    11         180        77        -103           -7
    12           0         0           0            .
    13          84         0         -84           -5
    14          89         0         -89           -6
    15         212        53        -159          -11
    16         554       150        -404          -14
    17         500         0        -500          -15
    18         424       165        -259          -12
    19         112        98         -14           -1
    20        2600         0       -2600          -16
    Sum of positive ranks                          12

Column 5 contains the ranks with signs attached to them, and the total of the positive ranks is 9 + 3 = 12. With n = 16 nonzero differences, E(TWS) = 16 x 17/4 = 68 and Var(TWS) = 16 x 17 x 33/24 = 374, so the standard deviation is 19.34.

Using StatXact: Next is an output panel from StatXact.
  • 20.
    WILCOXON SIGNED RANK TEST
    Summary of Exact distribution of WILCOXON SIGNED RANK statistic:
      Min      Max     Mean    Std-dev   Observed   Standardized
      0.0000   136.0   68.00   19.34     12.00      -2.896
    Asymptotic Inference:
      One-sided p-value: Pr { Test Statistic .LE. Observed } = 0.0019
      Two-sided p-value: 2 * One-sided                       = 0.0038
    Exact Inference:
      One-sided p-value: Pr { Test Statistic .LE. Observed } = 0.0011
                         Pr { Test Statistic .EQ. Observed } = 0.0002
      Two-sided p-value: Pr { | Test Statistic - Mean | .GE. | Observed - Mean | = 0.0021
      Two-sided p-value: 2*One-Sided                         = 0.0021

Since our test is two-tailed, we use the two-sided p-value. This p-value is equal to 0.0021, which is less than 0.05, our level of significance. We still reject H0. AZT has an effect on serum antigen levels in AIDS patients. Note that the observed value is given in the output panel under the column 'Observed' as 12, just what we calculated at the bottom of column 5 of the table above.

GENERAL NOTE: WHEN TO USE TWO-SIDED ALTERNATIVES

    1. When trying to decide which of two treatments is better and both are new, there is no reason to state an alternative hypothesis which says that one is better than the other.
    2. If the problem is not to decide which of two treatments or procedures is better, but to find out whether they differ at all, then the alternative must be two-sided. For example, in a study comparing a surgical approach to a medical problem with a more conservative treatment, it may be desirable to examine first whether it is necessary to distinguish between two surgical techniques used by different surgeons.
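The exact figures reported for the Wilcoxon Signed Rank statistic in the AZT example can be reproduced by counting, for each possible total, how many of the 2^n equally likely sign assignments give that sum of positive ranks. The Python sketch below does this with a small dynamic-programming table. It is an illustration rather than StatXact's own algorithm, it assumes no tied magnitudes, and the function name `signed_rank_counts` is hypothetical.

```python
def signed_rank_counts(n):
    """counts[t] = number of sign assignments of ranks 1..n whose positive ranks sum to t."""
    max_sum = n * (n + 1) // 2
    counts = [0] * (max_sum + 1)
    counts[0] = 1
    for rank in range(1, n + 1):
        # Go downwards so that each rank is used at most once.
        for t in range(max_sum, rank - 1, -1):
            counts[t] += counts[t - rank]
    return counts


n, observed = 16, 12          # 16 nonzero differences, sum of positive ranks = 12
counts = signed_rank_counts(n)
total = 2**n

p_lower = sum(counts[: observed + 1]) / total   # Pr(T <= 12)
p_point = counts[observed] / total              # Pr(T = 12)
mean = n * (n + 1) / 4
sd = (n * (n + 1) * (2 * n + 1) / 24) ** 0.5

print("mean:", mean, " sd:", round(sd, 2))
print("Pr(T <= 12) =", round(p_lower, 4), " Pr(T = 12) =", round(p_point, 4))
print("two-sided p (2 x one-sided) =", round(2 * p_lower, 4))
```

The mean and standard deviation printed here come from the formulas n(n + 1)/4 and n(n + 1)(2n + 1)/24 and can be compared with the Mean and Std-dev columns of the panel.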
  • 21. xxi EXERCISES 1. From a group of nine rats available for a study of learning, five were selected at random and were trained to imitate the leader rat in a maze. They were then placed together with four untrained rats in a situation where imitation of the leaders enabled them to avoid receiving an electric shock. The results (the number of trials required to obtain ten correct responses in ten consecutive trials) were as follows: Trained rats 74 64 75 45 82 Control 110 70 53 51 Find the significance probability of these results when the Wilcoxon Rank Sum Test is used. What do you conclude? (Test at α = 0.05). 2. The effectiveness of vitamin C in orange juice and in synthetic ascorbic acid was compared in 20 guinea pigs (divided at random into two groups of 10) in terms of the length of the odontoblasts after 6 weeks, with the following results: Orange juice 8.2 9.4 9.6 9.7 10.0 14.5 15.2 16.1 17.6 21.5 Ascorbic acid 4.2 5.2 5.8 6.4 7.0 7.3 10.1 11.2 11.3 11.5 Test the hypothesis of no difference against the hypothesis that the orange juice tends to give larger values. Use α = 0.05. (a) Use the Rank sum test (b) Sign Test (c) Wilcoxocon Signed Rank Test. 3. Suppose that a new postsurgical treatment is being compared with a standard treatment by observing the recovry times of 9 treatment subjects and 9 controls. The data is in the table below: Standard Treatment 20 21 24 30 32 36 40 48 54 New Treatment 19 22 25 26 28 29 34 37 38 (a) Find the p-value if the Wilcoxon Rank Sum Test is used. (b) Using the Wilcoxon Signed Rank Test to test the null hypothesis that the new treatment is no better than the standard one. Use α = 0.05. 4. Suppose that 20 treatment patients are being compared with 20 controls, and that the progress of each patient is classified as very poor, poor, indifferent, good, or very good. The data are given in the following table:
  • 22. xxii
                     Very Poor   Poor   Indifferent   Good   Very Good
         Control         2         2        11          4        1
         Treatment       0         1         9          7        3
       Is the treatment effective? Test at α = 0.05.
    5. In an investigation of a new drug for postoperative pain relief, it was desired to determine (among other things) whether the relief brought by 3 mg of the drug is significantly higher than that resulting from a dose of 1 mg. In one phase of the study, 15 freshly operated-upon patients were assigned at random, 7 to the lower dose (T1) and the remaining 8 to the higher dose (T3). The responses (number of hours of pain relief) were:
         T1:   2, 0, 3, 3, 0, 0, 8
         T3:   6, 4, 4, 0, 1, 8, 2, 8
       How significant are these results? Use any appropriate nonparametric test (α = 0.05).
  • 23. xxiii REFERENCES
    Lehmann, E.L. (1975). Nonparametrics: Statistical Methods Based on Ranks. San Francisco: Holden-Day Inc.
    Gajjar, Y., Mehta, C., Patel, N., Senchaudhuri, P. StatXact: User's Manual.
  • 24. xxiv PREFACE
    It is not very easy to produce a text which is strictly non-technical. Of course, when designing course material the audience has to be kept at heart. How user-friendly a text is is very subjective and depends on one's background. I have tried to make the material attractive, user-friendly and straightforward without losing too much rigour. I have also tried my best to make it simple. I could not have made it any simpler. After all, Albert Einstein said that we can make things simple but not simpler!
    Nonparametric statistics is seen as difficult by some because they are not exposed to it at an early stage. However, it is very useful in biomedical research and is like any other statistical method. Its importance in inference cannot be over-emphasized.
    The methods explained in this brochure can be tried out using some statistical packages. One such package is StatXact, which is chiefly for nonparametric statistics. Other statistical packages also have routines for carrying out nonparametric inferential procedures; an example is SPSS. A reader who is looking for more rigorous material should read the books mentioned in the bibliography.
    Any comments, corrections and necessary additions should be addressed to the author. Constructive criticisms are welcome. I hope you will enjoy this brochure.
    Blantyre, 14 December, 2000
    H. E. Misiri
    HEMisiri@yahoo.co.uk
  • 25. xxv TABLE OF CONTENTS
                                                                     Page
    Preface                                                           ii
    Introduction                                                      1
    Ranking Data                                                      1
    Ties In A Data Set                                                2
    Nonparametric Analogues of Some Parametric Methods                2
    Sign Test for One Sample                                          2
    Using The Binomial Distribution                                   4
    Two Sample Tests                                                  7
    Wilcoxon Rank Sum Test                                            7
    Determining The Alternative                                       8
    Using The Normal Approximation to The Distribution of Ws          9
    Correcting for Ties When Using Ws                                 10
    Wilcoxon-Mann-Whitney Test                                        12
    Nonparametric Rank-Based Paired Samples Tests                     15
    The Wilcoxon Signed Rank Test                                     17
    The Distribution of The Wilcoxon Signed Rank Test                 18
    When To Use Two-Sided Alternatives                                19
    Exercises                                                         20
    Bibliography                                                      22
    Appendix                                                          23