Non Parametric Tests
Mean / Median
• The mean is a good measure of center when the data is bell-shaped,
but it is sensitive to outliers and extreme values.
• When the data is skewed, however, a better measure of center would
be the median.
• The median is a resistant measure: it is largely unaffected by outliers.
• In other words, for skewed data we may want to consider a test for the
median and not the mean.
• In a skewed distribution, the population median, typically denoted
as η, is a better typical value than the population mean, μ.
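A minimal sketch of why the median is called resistant, using made-up numbers (the values below are hypothetical and not from the slides): adding a single extreme value shifts the mean substantially while the median barely moves.

```python
from statistics import mean, median

# Hypothetical sample: five similar values, then the same sample plus one outlier.
values = [21, 22, 23, 24, 25]
with_outlier = values + [250]

mean_before, median_before = mean(values), median(values)
mean_after, median_after = mean(with_outlier), median(with_outlier)

# The mean is pulled toward the outlier; the median stays near the bulk of the data.
print("without outlier:", mean_before, median_before)
print("with outlier:   ", round(mean_after, 2), median_after)
```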
Sign test
• It is a non-parametric or “distribution free” test, which means the test
doesn’t assume the data comes from a particular distribution.
• The sign test compares the number of positive and negative differences,
either between a sample and a hypothesized median or between two paired groups.
• The sign test is an alternative to a one sample t test or a paired t test.
• It can also be used for ordered (ranked) categorical data.
• The null hypothesis for the sign test is that the difference
between medians is zero.
• This test is used when we are interested in testing the population
median and not the mean.
One sample median test
• The one sample median test checks whether or not there is
a significant difference between our hypothesized median and the
median of the population from which the sample was drawn.
• We learned how to use a t-test for the difference between means of
dependent samples. That test required both populations to be
normally distributed.
• If the condition of normality cannot be satisfied, we can use the
paired-sample sign test to test the difference between two
population medians. The following conditions must be met:
• 1. A sample must be randomly selected from each population.
• 2. The samples must be dependent (paired).
• We find the difference between corresponding data entries by
subtracting the entry representing the second variable from the entry
representing the first variable, and record the sign of the difference.
• Then compare the number of + and – signs (the 0s are ignored).
Steps:-
• State the hypothesis
• Specify alpha
• Specify sample size
• Find critical value – from the sign-test (binomial) table when n ≤ 25, or from the z-table when n > 25
• Find test statistic
• Make decision
• Interpret
Test statistic
• When n ≤ 25, the test statistic x is the smaller of the number of positive
and negative signs.
• When n > 25, the test statistic is calculated from the formula:
• $z = \dfrac{(x + 0.5) - 0.5n}{\sqrt{n}/2}$
• where x = the smaller number of signs and n = the total number of positive
and negative signs.
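The rule above can be sketched in a few lines of Python. This is only an illustration: the function name `sign_test_statistic` and the counts used at the end are hypothetical, and the large-sample branch uses the continuity-corrected form z = ((x + 0.5) − 0.5n) / (√n / 2).

```python
from math import sqrt

def sign_test_statistic(n_plus, n_minus):
    """Return (n, x, z); z is None when n <= 25, where x itself is the statistic."""
    n = n_plus + n_minus       # zeros are assumed to have been dropped already
    x = min(n_plus, n_minus)   # the smaller number of signs
    if n <= 25:
        return n, x, None
    z = ((x + 0.5) - 0.5 * n) / (sqrt(n) / 2)
    return n, x, z

# Hypothetical counts, not taken from the slides: 10 plus signs and 20 minus signs.
n, x, z = sign_test_statistic(10, 20)
print(n, x, round(z, 3))
```

For small samples the function returns x unchanged, to be compared with a sign-test table value; for larger samples z is compared with a standard normal critical value.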
Example :- S and C represent two tasks: S, the spelling of 25 words presented separately, and C, the
spelling of 25 words of equal difficulty presented as an integral part of a sentence (i.e., in context). A
teacher wants to know which condition is favorable to higher scores. Test the hypothesis that C is better
than S.
• Of the 10 differences, 7 are plus (C higher than S), 2 are minus (S
higher than C) and one is zero. Excluding the 0 as being neither + nor
–, we have 9 differences, of which 7 are plus.
• Let alpha = 0.05 and n = 9. It is a one-tailed test. Critical value = 1
(from the sign-test table, i.e. the binomial distribution with p = 0.5).
• Test statistic = 2 (the smaller number of signs).
• Since the test statistic is greater than the critical value, we fail to reject the
null hypothesis.
Example:-
• A college statistics professor claims that the median test score for his
students is 58. The scores of 18 randomly selected tests are listed
below. At alpha = 0.01, can you reject the professor's claim?
• 58 62 55 55 53 52 52 59 55 55 60 56 57 61 58 63 63 55
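One way to work this exercise is sketched below, assuming a two-sided test and an exact binomial p-value with p = 0.5 instead of a table lookup. The helper name `sign_test_p_value` is hypothetical. With the scores above, the counts come out as 6 plus signs and 10 minus signs (the two scores equal to 58 are dropped).

```python
from math import comb

scores = [58, 62, 55, 55, 53, 52, 52, 59, 55, 55, 60, 56, 57, 61, 58, 63, 63, 55]
claimed_median = 58

def sign_test_p_value(n_plus, n_minus):
    """Two-sided exact p-value: 2 * P(X <= x) for X ~ Binomial(n, 0.5), capped at 1."""
    n = n_plus + n_minus
    x = min(n_plus, n_minus)
    lower_tail = sum(comb(n, k) for k in range(x + 1)) / 2 ** n
    return min(1.0, 2 * lower_tail)

n_plus = sum(s > claimed_median for s in scores)    # scores above the claimed median
n_minus = sum(s < claimed_median for s in scores)   # scores below it; ties are ignored
p_value = sign_test_p_value(n_plus, n_minus)
print(n_plus, n_minus, round(p_value, 4))
```

Under these assumptions the p-value is far above 0.01, so the sample gives no grounds to reject the claimed median.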
Paired/Matched sample Sign test
• Assumptions for the test (your data should meet these requirements
before running the test) are:
• The data should be from two samples.
• The two dependent samples should be paired or matched. For example,
depression scores from before a medical procedure and after.
• Example:-
• This set of data represents test scores at the end of Spring and the
beginning of the Fall semesters.
• The hypothesis is that the summer break means a significant drop in test
scores.
• H0: No difference in median of the signed differences.
• H1: Median of the signed differences is less than zero.
• Count the number of positives and negatives.
• 4 positives.
• 12 negatives.
• Add up the number of items in your sample and subtract any you had
a difference of zero for (in column 3). The sample size in this question
was 17, with one zero, so n = 16.
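• Let alpha = 0.05 and n = 16. H1 is one-sided, so this is a one-tailed test. Critical value = 4
(from the sign-test table).
• Test statistic = 4 (the smaller number of signs).
• Since the test statistic is less than or equal to the critical value, we reject the
null hypothesis: the data support a drop in test scores over the summer break.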
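As a cross-check on the table lookup, the exact one-tailed probability of seeing 4 or fewer positive signs out of 16 can be worked out from the binomial distribution with p = 0.5, assuming the 16 non-zero differences are independent:

$$P(X \le 4) = \frac{1}{2^{16}} \sum_{k=0}^{4} \binom{16}{k} = \frac{1 + 16 + 120 + 560 + 1820}{65536} = \frac{2517}{65536} \approx 0.0384$$

This probability is below 0.05 for a one-tailed test; for a two-tailed test it would be doubled to about 0.0768.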
Example:
A new chemotherapy treatment is proposed for patients with breast cancer. Investigators are
concerned with patients' ability to tolerate the treatment and assess their quality of life both before
and after receiving the new chemotherapy treatment. Quality of life (QOL) is measured on an
ordinal scale and, for analysis purposes, numbers are assigned to each response category as follows:
1 = Poor, 2 = Fair, 3 = Good, 4 = Very Good, 5 = Excellent. The data are shown below.
Patient   QOL Before     QOL After      Difference         Sign
          Chemotherapy   Chemotherapy   (Before – After)
          Treatment      Treatment
1         3              2               1                 +
2         2              3              -1                 -
3         3              4              -1                 -
4         2              4              -2                 -
5         1              1               0                 NA
6         3              4              -1                 -
7         2              4              -2                 -
8         3              3               0                 NA
9         2              1               1                 +
10        1              3              -2                 -
11        3              4              -1                 -
12        2              3              -1                 -
H0: there is no difference in the median QOL before and after treatment.
Ha: there is a difference in the median QOL before and after treatment.
Number of + signs = 2
Number of – signs = 8
n = 10
Alpha = 0.05
Test statistic = 2
Critical value = 1 (two-tailed, from the sign-test table)
Conclusion: test statistic > critical value, so we fail to reject the
null hypothesis that there is no difference in the medians.
There was no significant change in quality of life after the
chemotherapy treatment compared with before.
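The counting in this example can be reproduced with a short sketch. It assumes a two-sided test and an exact binomial p-value with p = 0.5; the variable names are illustrative, and the before and after scores are copied from the table above.

```python
from math import comb

# QOL scores for patients 1..12, taken from the table above.
before = [3, 2, 3, 2, 1, 3, 2, 3, 2, 1, 3, 2]
after = [2, 3, 4, 4, 1, 4, 4, 3, 1, 3, 4, 3]

differences = [b - a for b, a in zip(before, after)]   # Before minus After
n_plus = sum(d > 0 for d in differences)
n_minus = sum(d < 0 for d in differences)
n = n_plus + n_minus                                   # zero differences are dropped
x = min(n_plus, n_minus)                               # test statistic

# Exact two-sided p-value under H0, where each sign is + or - with probability 0.5.
lower_tail = sum(comb(n, k) for k in range(x + 1)) / 2 ** n
p_two_sided = min(1.0, 2 * lower_tail)
print(n_plus, n_minus, n, x, round(p_two_sided, 4))
```

The p-value of roughly 0.11 exceeds 0.05, which agrees with the decision not to reject the null hypothesis.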
Mood’s Median Test
• Mood’s median test is used to compare the medians for two samples to
find out if they are different.
• For example, you might want to compare the median number
of positive calls to a hotline vs. the median number of negative comment
calls to find out if you’re getting significantly more negative comments than
positive comments (or vice versa).
• This test is the nonparametric alternative to a one way ANOVA.
Nonparametric means that you don’t have to know what distribution your
sample came from (e.g. a normal distribution) before running the test. That
said, your samples should have been drawn from distributions with the
same shape.
• Use this test instead of the sign test when you have two independent
samples. The test is a particular case of the chi-square test of independence.
• The null hypothesis for this test is that the medians are the same for both
groups.
• The alternate hypothesis for the test is that the medians are different for
both groups.
• Step 1: Make a 2 x k contingency table, where k is the number of
samples.
• Step 2: Find M, the overall median for all the data in your samples. To
do this, list all of your data (from all samples) in a single set. Sort in
ascending order and then find the middle number.
• Step 3: List each individual sample’s data in ascending order. Count
how many data points are greater than M (from Step 2) and then
count how many data points are smaller than or equal to M. List
the counts greater than M in the first row of the contingency table and
the counts smaller than or equal to M in the second row.
• Step 4: Perform a chi-square test on the completed contingency table.
• Step 5: Compare the chi-square statistic to the table value
with: degrees of freedom = (number of rows – 1) * (number of
columns – 1).
Example
• Non parametric test - Mood's Median test for the following two sets of
data :-
Sample A: (11, 15, 9, 4, 34, 17, 18, 14, 12, 13, 26, 31)
Sample B: (34, 31, 35, 29, 28, 12, 18, 30, 14, 22, 10, 29)
• Significance Level α = 0.05 (the chi-square test uses a right-tailed rejection region)
• Sol:- Step-1: Calculate the overall median of the two samples combined.
Combined samples, sorted:
4, 9, 10, 11, 12, 12, 13, 14, 14, 15, 17, 18, 18, 22, 26, 28, 29, 29, 30, 31, 31, 34, 34, 35
n = 24
Median = (12th term + 13th term)/2 = (18 + 18)/2 = 18
• Step-2: Create a 2×2 contingency table whose first row consists of the number of elements in each
sample that are greater than the median and whose second row consists of the number of elements in each
sample that are less than or equal to the median.
             Sample A   Sample B   Total
> Median     3          8          11
<= Median    9          4          13
Total        12         12         24
Step-3: Perform a chi-square test of independence.
State the hypotheses
H0: the two categorical variables are independent.
H1: the two categorical variables are not independent.
Observed Frequencies
         B1     B2     Total
A1       3      8      11
A2       9      4      13
Total    12     12     24
• Expected Frequencies (each expected count = row total × column total / grand total, e.g. 11 × 12 / 24 = 5.5)
         B1     B2     Total
A1       5.5    5.5    11
A2       6.5    6.5    13
Total    12     12     24
• Compute Chi-square
• $\chi^2 = \sum (O_{ij} - E_{ij})^2 / E_{ij}$
= (3 − 5.5)²/5.5 + (8 − 5.5)²/5.5 + (9 − 6.5)²/6.5 + (4 − 6.5)²/6.5
= 6.25/5.5 + 6.25/5.5 + 6.25/6.5 + 6.25/6.5
= 1.1364 + 1.1364 + 0.9615 + 0.9615
= 4.1958
• Compute the degrees of freedom (df).
df = (2 − 1)⋅(2 − 1) = 1
• For 1 df, p(χ² ≥ 4.1958) = 0.0405. Test statistic = 4.1958. Critical value = 3.841 (chi-square table, df = 1, α = 0.05).
Since the test statistic > critical value (equivalently, p < 0.05), we reject the null hypothesis H0: the two medians differ.
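The steps of this example can be sketched in plain Python as a check on the arithmetic. The function names are hypothetical, and the p-value line relies on the identity P(χ² ≥ x) = erfc(√(x/2)), which holds only for 1 degree of freedom.

```python
from math import erfc, sqrt
from statistics import median

sample_a = [11, 15, 9, 4, 34, 17, 18, 14, 12, 13, 26, 31]
sample_b = [34, 31, 35, 29, 28, 12, 18, 30, 14, 22, 10, 29]

def median_table(*samples):
    """Rows: counts above the overall median, counts at or below it (one column per sample)."""
    overall = median([x for s in samples for x in s])
    above = [sum(x > overall for x in s) for s in samples]
    at_or_below = [sum(x <= overall for x in s) for s in samples]
    return overall, [above, at_or_below]

def chi_square(table):
    """Pearson chi-square statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    return sum(
        (obs - r * c / grand) ** 2 / (r * c / grand)
        for row, r in zip(table, row_totals)
        for obs, c in zip(row, col_totals)
    )

overall_median, table = median_table(sample_a, sample_b)
chi2 = chi_square(table)
p_value = erfc(sqrt(chi2 / 2))   # upper-tail probability, valid for df = 1 only
print(overall_median, table, round(chi2, 4), round(p_value, 4))
```

For tables with more than two samples the statistic is computed the same way, but the p-value would need a general chi-square distribution function instead of the df = 1 shortcut.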
Example
• A major wheat supplier from Texas is analyzing the yields of various
crop methods. He randomly assigned two different wheat crop
methods to a large number of acres of farm land and
recorded the production rate (yield per acre) for each plot. We need
to find out whether there is a difference between the two wheat crop methods.
Kruskal Wallis Test
• The Kruskal Wallis test is the non parametric alternative to the One Way ANOVA.
• The test determines whether the medians of two or more groups are different. Like most
statistical tests, you calculate a test statistic and compare it to a distribution cut-off point.
The test statistic used in this test is called the H statistic.
• The hypotheses for the test are:
• H0: population medians are equal.
• H1: population medians are not equal.
• The Kruskal Wallis test will tell you if there is a significant difference between groups.
However, it won’t tell you which groups are different.
• Example: you want to find out how test anxiety affects actual test scores. The independent
variable “test anxiety” has three levels: no anxiety, low-medium anxiety and high anxiety.
The dependent variable is the exam score, rated from 0 to 100%.
• Example: you want to find out how socioeconomic status affects attitude towards sales tax
increases. Your independent variable is “socioeconomic status” with three levels:
working class, middle class and wealthy. The dependent variable is measured on a 5-
point scale from strongly agree to strongly disagree.
• The H test is used when the assumptions for ANOVA aren’t met (like
the assumption of normality). It is sometimes called the one-way
ANOVA on ranks, as the ranks of the data values are used in the test
rather than the actual data points.
• Assumptions:-
• One independent variable with two or more levels (independent
groups). The test is more commonly used when you have three or
more levels.
• Ordinal scale, Ratio Scale or Interval scale dependent variables.
• Your observations should be independent. In other words, there
should be no relationship between the members in each group or
between groups.
• All groups should have the same shape distributions.
• It is used for comparing two or more independent samples of equal
or different sample sizes.
• The Kruskal-Wallis H Test is a nonparametric procedure that can be
used to compare more than two populations in a completely
randomized design.
• All n = n1+n2+…+nk measurements are jointly ranked (i.e. treated as
one large sample).
• We use the sums of the ranks of the k samples to compare the
distributions.
• Rank the total measurements in all k samples from 1 to n. Tied
observations are assigned the average of the ranks they would have gotten if
not tied.
• Calculate $T_i$ = rank sum for the $i$th sample, $i = 1, 2, \ldots, k$,
and the test statistic
$$H = \frac{12}{n(n+1)} \sum_{i=1}^{k} \frac{T_i^2}{n_i} - 3(n+1)$$
H0: the k distributions are identical, versus
Ha: at least one distribution is different.
Test statistic: Kruskal-Wallis H.
When H0 is true, the test statistic H has an approximate chi-square
distribution with df = k − 1.
Use a right-tailed rejection region or p-value based on the chi-square
distribution.
Example
• A shoe company wants to know if three groups of workers have different salaries:
Women: 23K, 41K, 54K, 66K, 90K.
Men: 45K, 55K, 60K, 70K, 72K.
Minorities: 20K, 30K, 34K, 40K, 44K.
• Sol:- Null Hypothesis H0 : All groups are equal
Alternative Hypothesis H1 : At least one group is not equal
• Step 1: Sort the data for all groups/samples into ascending order in one combined set.
20K, 23K, 30K, 34K, 40K, 41K, 44K, 45K, 54K, 55K, 60K, 66K, 70K, 72K, 90K
• Step 2: Assign ranks to the sorted data points. Give tied values the average
rank.
20K 1
23K 2
30K 3
34K 4
40K 5
41K 6
44K 7
45K 8
54K 9
55K 10
60K 11
66K 12
70K 13
72K 14
90K 15
• Step 3: Add up the different ranks for each group/sample.
Women: 23K, 41K, 54K, 66K, 90K = 2 + 6 + 9 + 12 + 15 = 44.
Men: 45K, 55K, 60K, 70K, 72K = 8 + 10 + 11 + 13 + 14 = 56.
Minorities: 20K, 30K, 34K, 40K, 44K = 1 + 3 + 4 + 5 + 7 = 20.
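• Step 4: Calculate the H statistic:
$$H = \left[\frac{12}{n(n+1)} \sum_{j=1}^{c} \frac{T_j^2}{n_j}\right] - 3(n+1)$$
Where:
• n = sum of sample sizes for all samples,
• c = number of samples,
• Tj = sum of ranks in the jth sample,
• nj = size of the jth sample.
$$H = \frac{12}{15(16)} \left[\frac{44^2}{5} + \frac{56^2}{5} + \frac{20^2}{5}\right] - 3(16) = 6.72$$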
Step 5: Find the critical chi-square value, with c-1 degrees of freedom. For 3 – 1
degrees of freedom and an alpha level of .05, the critical chi square value is
5.9915.
Step 6: Compare the H value from Step 4 to the critical chi-square value from
Step 5.
If the critical chi-square value is less than the H statistic, reject the null
hypothesis that the medians are equal.
If the chi-square value is not less than the H statistic, there is not enough
evidence to suggest that the medians are unequal.
In this case, 5.9915 is less than 6.72, so we can reject the null hypothesis.
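The calculation in Steps 1 to 4 can be checked with a short sketch. It uses the salaries as listed in Step 3 (in thousands), gives tied values their average rank, and applies no tie correction. The function names are hypothetical, and the p-value line uses the identity P(χ² ≥ x) = exp(−x/2), which holds only for 2 degrees of freedom.

```python
from math import exp

def average_ranks(values):
    """Map each distinct value to the average of the 1-based ranks it occupies."""
    positions = {}
    for rank, value in enumerate(sorted(values), start=1):
        positions.setdefault(value, []).append(rank)
    return {value: sum(ranks) / len(ranks) for value, ranks in positions.items()}

def kruskal_wallis_h(groups):
    """Return (rank sums per group, H statistic) without any tie correction."""
    pooled = [x for g in groups for x in g]
    n = len(pooled)
    rank_of = average_ranks(pooled)
    rank_sums = [sum(rank_of[x] for x in g) for g in groups]
    weighted = sum(t * t / len(g) for t, g in zip(rank_sums, groups))
    h = 12 / (n * (n + 1)) * weighted - 3 * (n + 1)
    return rank_sums, h

women = [23, 41, 54, 66, 90]
men = [45, 55, 60, 70, 72]
minorities = [20, 30, 34, 40, 44]

rank_sums, h = kruskal_wallis_h([women, men, minorities])
p_value = exp(-h / 2)   # chi-square upper tail, valid for df = 2 only
print(rank_sums, round(h, 2), round(p_value, 4))
```

The same function accepts groups of unequal size and any number of groups; only the p-value shortcut is specific to three groups.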
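• Perform the Kruskal Wallis test for the following data (the rank sum of each sample is shown after the dash):-
Sample 1: 8, 5, 7, 11, 9, 6 – 25.5
Sample 2: 10, 12, 11, 9, 13, 12 – 64
Sample 3: 11, 14, 10, 16, 17, 12 – 87.5
Sample 4: 18, 20, 16, 15, 14, 22 – 123
• Significance Level α = 0.05 (right-tailed chi-square test).
• $$H = \frac{12}{24(25)} \left[\frac{25.5^2 + 64^2 + 87.5^2 + 123^2}{6}\right] - 3(24+1)$$
• H ≈ 16.77
• Critical value = 7.815 (chi-square table, df = 4 − 1 = 3)
• Since H is greater than the critical value, we reject the null hypothesis.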
Mann Whitney U Test
• The Mann-Whitney U test is the nonparametric equivalent of the two
sample t-test.
• The Mann Whitney U test is sometimes called the Mann Whitney Wilcoxon
Test or the Wilcoxon Rank Sum Test.
• While the t-test makes an assumption about the distribution of
a population, the Mann Whitney U Test makes no such assumption.
• The test compares two populations.
• The null hypothesis is that the two samples come from the same
population (i.e. that they both have the same median).
• This test is often performed as a two-sided test and, thus, the research
hypothesis indicates that the populations are not equal as opposed to
specifying directionality.
• A one-sided research hypothesis is used if interest lies in detecting a
positive or negative shift in one population as compared to the other.
• Assumptions for the Mann Whitney U Test
• The dependent variable should be measured on an ordinal scale or a
continuous scale.
• The independent variable should be two independent, categorical
groups.
• Observations should be independent. In other words, there should be
no relationship between the two groups or within each group.
• Observations need not be normally distributed. However, the two
distributions should have the same shape (e.g. both skewed left).
• The result of performing a Mann Whitney U Test is a U Statistic.
• For small samples, use the direct method (see below) to find the U
statistic;
• For larger samples, a formula is necessary.
Formula
Either of these two formulas is valid for the Mann
Whitney U Test:
$$U_1 = n_1 n_2 + \frac{n_1(n_1+1)}{2} - R_1, \qquad U_2 = n_1 n_2 + \frac{n_2(n_2+1)}{2} - R_2$$
R is the sum of ranks in the sample, and n is the number
of items in the sample.
Consider a Phase II clinical trial designed to investigate the effectiveness of a new drug to reduce symptoms
of asthma in children. A total of n=10 participants are randomized to receive either the new drug or a
placebo. Participants are asked to record the number of episodes of shortness of breath over a 1 week
period following receipt of the assigned treatment. The data are shown below.
Placebo     7   5   6   4   12
New Drug    3   6   4   2   1
Is there a difference in the number of episodes of shortness of breath over a 1 week period in
participants receiving the new drug as compared to those receiving the placebo?
SOL:- In this example, the outcome is a count and in this sample the data do not follow a normal
distribution. In addition, the sample size is small (n1=n2=5), so a nonparametric test is appropriate. The
hypothesis is given below, and we run the test at the 5% level of significance (i.e., α=0.05).
H0: The two populations are equal versus
H1: The two populations are not equal.
The first step is to assign ranks and to do so we order the data from smallest to largest. This is done on the
combined or total sample (i.e., pooling the data from the two treatment groups (n=10)), and assigning ranks
from 1 to 10, as follows.
                       Total Sample (Ordered
                       Smallest to Largest)      Ranks
Placebo   New Drug     Placebo   New Drug        Placebo   New Drug
7         3                      1                         1
5         6                      2                         2
6         4                      3                         3
4         2            4         4               4.5       4.5
12        1            5                         6
                       6         6               7.5       7.5
                       7                         9
                       12                        10
• We produce a test statistic based on the ranks.
• First, we sum the ranks in each group. In the placebo group, the sum
of the ranks is 37; in the new drug group, the sum of the ranks is 18.
Recall that the sum of the ranks will always equal n(n+1)/2. As a check
on our assignment of ranks, we have n(n+1)/2 = 10(11)/2=55 which is
equal to 37+18 = 55.
• For the test, we call the placebo group 1 and the new drug group 2
• We let R1 denote the sum of the ranks in group 1 (i.e., R1 = 37), and
R2 denote the sum of the ranks in group 2 (i.e., R2 = 18).
• The test statistic for the Mann Whitney U Test is denoted U and is
the smaller of U1 and U2. Here U1 = 5(5) + 5(6)/2 − 37 = 3 and
U2 = 5(5) + 5(6)/2 − 18 = 22, so U = 3.
• In every test, we must determine whether the observed U supports
the null or research hypothesis.
• We determine a critical value of U such that if the observed value of U
is less than or equal to the critical value, we reject H0 in favor of
H1 and if the observed value of U exceeds the critical value we do not
reject H0.
• To determine the appropriate critical value we need the sample sizes (in this
example, n1 = n2 = 5) and our two-sided level of significance (α = 0.05).
• The critical value is 2, and the decision rule is to reject H0 if U ≤ 2. We
do not reject H0 because 3 > 2. We do not have statistically significant
evidence at α = 0.05 to show that the two populations of numbers of
episodes of shortness of breath are not equal.
• To be significant, our obtained U has to be equal to or LESS than this
critical value.
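The rank sums and the U statistic for the asthma example can be reproduced with the sketch below. It assumes the rank-sum form U1 = n1·n2 + n1(n1+1)/2 − R1 (and likewise for U2), with tied values given their average rank; the function names are hypothetical.

```python
def average_ranks(values):
    """Map each distinct value to the average of the 1-based ranks it occupies."""
    positions = {}
    for rank, value in enumerate(sorted(values), start=1):
        positions.setdefault(value, []).append(rank)
    return {value: sum(ranks) / len(ranks) for value, ranks in positions.items()}

def mann_whitney_u(group1, group2):
    """Return (R1, R2, U1, U2) computed from the pooled, tie-averaged ranks."""
    rank_of = average_ranks(group1 + group2)
    n1, n2 = len(group1), len(group2)
    r1 = sum(rank_of[x] for x in group1)
    r2 = sum(rank_of[x] for x in group2)
    u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
    u2 = n1 * n2 + n2 * (n2 + 1) / 2 - r2
    return r1, r2, u1, u2

placebo = [7, 5, 6, 4, 12]
new_drug = [3, 6, 4, 2, 1]

r1, r2, u1, u2 = mann_whitney_u(placebo, new_drug)
u = min(u1, u2)   # the test statistic is the smaller of U1 and U2
print(r1, r2, u1, u2, u)
```

A useful sanity check is that U1 + U2 always equals n1·n2, here 25.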
• A new approach to prenatal care is proposed for pregnant women living in a rural
community. The new program involves in-home visits during the course of
pregnancy in addition to the usual or regularly scheduled visits. A pilot
randomized trial with 15 pregnant women is designed to evaluate whether
women who participate in the program deliver healthier babies than women
receiving usual care. The outcome is the APGAR score measured 5 minutes after
birth. Recall that APGAR scores range from 0 to 10 with scores of 7 or higher
considered normal (healthy), 4-6 low and 0-3 critically low. The data are shown
below.
Usual Care     8   7   6   2   5   8   7   3
New Program    9   9   7   8   10  9   6
Is there statistical evidence of a difference in APGAR scores in women receiving
the new and enhanced versus usual prenatal care?
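For samples this small, U can also be obtained by counting pairs directly instead of ranking. The sketch below assumes that convention: for each pair made of one observation from each group, count 1 when the first group's value is smaller and 0.5 when the two are tied. The function name `u_by_direct_count` is hypothetical, and the resulting U still has to be compared with a tabulated critical value for n1 = 8 and n2 = 7.

```python
def u_by_direct_count(first, second):
    """Count pairs (x, y), x from `first` and y from `second`, with x < y; ties count 0.5."""
    return sum(
        1.0 if x < y else 0.5 if x == y else 0.0
        for x in first
        for y in second
    )

usual_care = [8, 7, 6, 2, 5, 8, 7, 3]
new_program = [9, 9, 7, 8, 10, 9, 6]

u_usual = u_by_direct_count(usual_care, new_program)
u_new = u_by_direct_count(new_program, usual_care)
u = min(u_usual, u_new)   # the test statistic is the smaller of the two counts
print(u_usual, u_new, u)
```

Counting in both directions covers every pair exactly once, so the two counts must add up to n1·n2 = 56, which gives a quick check on the tally.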

Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Subhajit Sahu
 
tapal brand analysis PPT slide for comptetive data
tapal brand analysis PPT slide for comptetive datatapal brand analysis PPT slide for comptetive data
tapal brand analysis PPT slide for comptetive data
theahmadsaood
 
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
yhkoc
 

Non parametric-tests

  • 2. Mean / Median • The mean is a good measure of center when the data is bell-shaped, but it is sensitive to outliers and extreme values. • When the data is skewed, however, a better measure of center would be the median. • The median is a resistant measure. • In other words, we may want to consider a test for the median and not the mean. • In a skewed distribution, the population median, typically denoted as η, is a better typical value than the population mean, μ.
  • 3. Sign test • It is a non-parametric or “distribution free” test, which means the test doesn’t assume the data comes from a particular distribution. • The sign test compares the sizes of two groups. • The sign test is an alternative to a one sample t test or a paired t test. • It can also be used for ordered (ranked) categorical data. • The null hypothesis for the sign test is that the difference between medians is zero. • This test is used when we are interested in testing the population median and not the mean.
  • 4. One sample median test • The one sample median test checks whether or not there is a significant difference between our hypothesized median and the real median of a sample. • We learned how to use a t-test for the difference between means of dependent samples. That test required both populations to be normally distributed. • If the condition of normality cannot be satisfied, we can use the paired-sample sign test to test the difference between two population medians. The following conditions must be met: • 1. A sample must be randomly selected from each population. • 2. The samples must be dependent (paired).
  • 5. • We find the difference between corresponding data entries by subtracting the entry representing the second variable from the entry representing the first variable, and record the sign of the difference. • Then compare the number of + and – signs. (Differences of zero are ignored.)
  • 6. Steps:- • State the hypothesis • Specify alpha • Specify sample size • Find critical value, from the sign-test (binomial) table when n <= 25 or from the z-table when n > 25 • Find test statistic • Make decision • Interpret
  • 7. Test statistic • When n <= 25, the test statistic is the smaller of the number of positive signs and the number of negative signs. • When n > 25, the test statistic is calculated from the formula :- • $z = \dfrac{(x + 0.5) - 0.5n}{\sqrt{n}/2}$ • where x = the smaller number of signs and n = the total number of positive and negative signs.
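The large-sample statistic can be sketched in a few lines of Python. This is a minimal illustration, assuming the continuity-corrected normal approximation centred on the null mean 0.5n; the function name `sign_test_z` and the counts used below are hypothetical and do not come from the slides.

```python
import math


def sign_test_z(x: int, n: int) -> float:
    """Normal-approximation statistic for the sign test (intended for n > 25).

    x is the smaller of the plus/minus counts; n is the number of nonzero signs.
    """
    return ((x + 0.5) - 0.5 * n) / (math.sqrt(n) / 2)


# Hypothetical counts (not from the slides): 10 minus signs among 30 nonzero differences.
z = sign_test_z(10, 30)
print(round(z, 4))
```

The resulting z is then compared with the critical value from the standard normal table for the chosen alpha.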
  • 8. Example :- S and C represent two tasks: S is the spelling of 25 words presented separately, and C is the spelling of 25 words of equal difficulty presented as an integral part of a sentence (i.e., in context). A teacher wants to know which condition is favorable to higher scores. Test the hypothesis that C is better than S.
  • 9. • Of the 10 differences, 7 are plus (C higher than S), 2 are minus (S higher than C) and one is zero. Excluding the 0 as being neither + nor -, we have 9 differences of which 7 are plus. • Let alpha = 0.05 and n = 9. It’s a left-tailed test. Critical value = 1 (from the sign-test table; with n <= 25 the count of signs is compared with a binomial critical value, not a t value). • Test statistic = 2 (the smaller count). • Since the test statistic is greater than the critical value, we fail to reject the null hypothesis.
  • 10. Ex • A college statistics professor claims that the median test score for his students is 58. The scores of 18 randomly selected tests are listed below. At alpha = 0.01, can you reject the professor’s claim? • 58 62 55 55 53 52 52 59 55 55 60 56 57 61 58 63 63 55
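The slide leaves this exercise unsolved. A minimal sketch of how it could be worked in Python follows, assuming an exact two-sided binomial p-value is acceptable in place of a table lookup; the helper name `sign_test` is hypothetical.

```python
from math import comb


def sign_test(values, hypothesized_median):
    """One-sample sign test: count signs and return an exact two-sided p-value."""
    plus = sum(v > hypothesized_median for v in values)
    minus = sum(v < hypothesized_median for v in values)
    n = plus + minus              # scores equal to the hypothesized median are dropped
    x = min(plus, minus)          # test statistic: the smaller count
    # P(X <= x) for X ~ Binomial(n, 0.5), doubled for a two-sided test and capped at 1
    lower_tail = sum(comb(n, k) for k in range(x + 1)) / 2 ** n
    return plus, minus, n, x, min(1.0, 2 * lower_tail)


scores = [58, 62, 55, 55, 53, 52, 52, 59, 55, 55, 60, 56, 57, 61, 58, 63, 63, 55]
plus, minus, n, x, p_value = sign_test(scores, 58)
print(plus, minus, n, x, round(p_value, 4))
```

With 6 plus signs and 10 minus signs among 16 nonzero differences, the p-value is far above 0.01, so under these assumptions the professor’s claim would not be rejected.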
  • 11. Paired/Matched sample Sign test • Assumptions for the test (your data should meet these requirements before running the test) are: • The data should be from two samples. • The two dependent samples should be paired or matched. For example, depression scores from before a medical procedure and after. • Example:- • This set of data represents test scores at the end of Spring and the beginning of the Fall semesters. • The hypothesis is that the summer break means a significant drop in test scores.
  • 12. • H0: No difference in median of the signed differences. • H1: Median of the signed differences is less than zero.
  • 13. • H0: No difference in median of the signed differences. • H1: Median of the signed differences is less than zero. • Count the number of positives and negatives. • 4 positives. • 12 negatives. • Add up the number of items in your sample and subtract any that had a difference of zero (in column 3). The sample size in this question was 17, with one zero, so n = 16. • Let alpha = 0.05 and n = 16. Critical value = 4 (from the sign-test table, one-tailed; a t-table value does not apply to a count of signs). • Test statistic = 4 (the smaller count). • Since the test statistic is less than or equal to the critical value, we reject the null hypothesis: the data support a drop in test scores over the summer break.
  • 14. Example: A new chemotherapy treatment is proposed for patients with breast cancer. Investigators are concerned with patients' ability to tolerate the treatment and assess their quality of life both before and after receiving the new chemotherapy treatment. Quality of life (QOL) is measured on an ordinal scale and, for analysis purposes, numbers are assigned to each response category as follows: 1 = Poor, 2 = Fair, 3 = Good, 4 = Very Good, 5 = Excellent. The data are shown below.
    Patient | QOL before treatment | QOL after treatment | Difference | Sign
    1 | 3 | 2 | 1 | +
    2 | 2 | 3 | -1 | -
    3 | 3 | 4 | -1 | -
    4 | 2 | 4 | -2 | -
    5 | 1 | 1 | 0 | NA
    6 | 3 | 4 | -1 | -
    7 | 2 | 4 | -2 | -
    8 | 3 | 3 | 0 | NA
    9 | 2 | 1 | 1 | +
    10 | 1 | 3 | -2 | -
    11 | 3 | 4 | -1 | -
    12 | 2 | 3 | -1 | -
    H0: there is no difference in the medians of the two sets of values. Ha: there is a difference in the medians of the two sets of values. Number of + signs = 2. Number of - signs = 8. n = 10. Alpha = 0.05. Test statistic = 2. Critical value = 1 (sign-test table, two-tailed). Conclusion:- test statistic > critical value, so we fail to reject the hypothesis that there is no difference in the medians. There was no significant change in quality of life between before and after the chemotherapy treatment.
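The sign counts in this example can be reproduced with a short Python sketch. It assumes differences are taken as before minus after, as in the table, and it reports an exact two-sided binomial p-value rather than a table lookup; the variable names are illustrative only.

```python
from math import comb

# QOL scores for the 12 patients, in patient order
before = [3, 2, 3, 2, 1, 3, 2, 3, 2, 1, 3, 2]
after = [2, 3, 4, 4, 1, 4, 4, 3, 1, 3, 4, 3]

diffs = [b - a for b, a in zip(before, after)]
plus = sum(d > 0 for d in diffs)
minus = sum(d < 0 for d in diffs)
n = plus + minus                  # the two zero differences are excluded
x = min(plus, minus)              # test statistic

# Exact two-sided p-value under Binomial(n, 0.5)
p_value = min(1.0, 2 * sum(comb(n, k) for k in range(x + 1)) / 2 ** n)
print(plus, minus, n, x, round(p_value, 4))
```

The p-value exceeds 0.05, which agrees with the conclusion that there is no significant change in quality of life.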
  • 15. Mood’s Median Test • Mood’s median test is used to compare the medians of two or more samples to find out if they are different. • For example, you might want to compare the median number of positive calls to a hotline vs. the median number of negative comment calls to find out if you’re getting significantly more negative comments than positive comments (or vice versa). • This test is the nonparametric alternative to a one way ANOVA; nonparametric means that you don’t have to know what distribution your sample came from (e.g. a normal distribution) before running the test. That said, your samples should have been drawn from distributions with the same shape. • Use this test instead of the sign test when you have two independent samples. The test is a particular case of the chi-square test of independence. • The null hypothesis for this test is that the medians are the same for both groups. • The alternate hypothesis for the test is that the medians are different for both groups.
  • 16. • Step 1: Make a 2 x k contingency table, where k is the number of samples. • Step 2: Find M, the overall median for all the data in your samples. To do this, list all of your data (from all samples) in a single set. Sort in ascending order and then find the middle number. • Step 3: List each individual sample’s data in ascending order. Count how many data points are greater than M (from Step 2) and then count how many data points are smaller than or equal to M. Enter the counts greater than M in the first row of the contingency table and the counts smaller than or equal to M in the second row. • Step 4: Perform a chi-square test on the completed contingency table. • Step 5: Compare the chi-square statistic to the table value with: degrees of freedom = (number of rows – 1) * (number of columns – 1).
  • 17. Example • Non parametric test - Mood's Median test for the following sets of data :- Sample A: (11, 15, 9, 4, 34, 17, 18, 14, 12, 13, 26, 31) Sample B: (34, 31, 35, 29, 28, 12, 18, 30, 14, 22, 10, 29) • Significance Level α = 0.05 and One-tailed test • Sol:- Step-1: Calculate the overall median of the two samples combined. Sorted combined sample: 4, 9, 10, 11, 12, 12, 13, 14, 14, 15, 17, 18, 18, 22, 26, 28, 29, 29, 30, 31, 31, 34, 34, 35. n = 24. Median = (12th term + 13th term)/2 = (18 + 18)/2 = 18
  • 18. • Step-2: Create a 2×2 contingency table whose first row consists of the number of elements in each sample that are greater than the median and whose second row consists of the number of elements in each sample that are less than or equal to the median.
    | Sample A | Sample B | Total
    > Median | 3 | 8 | 11
    <= Median | 9 | 4 | 13
    Total | 12 | 12 | 24
  • Step-3: Perform a chi-square test of independence. State the hypotheses. H0: the two categorical variables are independent. H1: the two categorical variables are not independent. Observed frequencies:
    | B1 | B2 | Total
    A1 | 3 | 8 | 11
    A2 | 9 | 4 | 13
    Total | 12 | 12 | 24
  • 19. • Expected frequencies:
    | B1 | B2 | Total
    A1 | 5.5 | 5.5 | 11
    A2 | 6.5 | 6.5 | 13
    Total | 12 | 12 | 24
  • Compute chi-square: $\chi^2 = \sum (O_{ij} - E_{ij})^2 / E_{ij}$ = (3-5.5)²/5.5 + (8-5.5)²/5.5 + (9-6.5)²/6.5 + (4-6.5)²/6.5 = 6.25/5.5 + 6.25/5.5 + 6.25/6.5 + 6.25/6.5 = 1.1364 + 1.1364 + 0.9615 + 0.9615 = 4.1958 • Compute the degrees of freedom (df): df = (2-1)⋅(2-1) = 1 • For 1 df, P(χ² ≥ 4.1958) = 0.0405. Test statistic = 4.1958. Critical value = 3.841 (chi-square table, df = 1, α = 0.05). Since the test statistic is greater than the critical value (equivalently, p < 0.05), we reject the null hypothesis H0.
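The whole Mood's median calculation for this example can be sketched in plain Python. This is an illustration under stated assumptions: no continuity correction is applied (statistical libraries may apply one by default for 2×2 tables), the p-value shortcut via `math.erfc` is valid only for 1 degree of freedom, and the function name `moods_median_test` is hypothetical.

```python
import math
from statistics import median


def moods_median_test(*samples):
    """Return (grand median, 2 x k observed table, chi-square statistic)."""
    grand_median = median([v for s in samples for v in s])
    above = [sum(v > grand_median for v in s) for s in samples]
    at_or_below = [len(s) - a for s, a in zip(samples, above)]
    observed = [above, at_or_below]
    total = sum(len(s) for s in samples)

    chi2 = 0.0
    for row in observed:
        row_total = sum(row)
        for count, s in zip(row, samples):
            expected = row_total * len(s) / total   # row total * column total / grand total
            chi2 += (count - expected) ** 2 / expected
    return grand_median, observed, chi2


sample_a = [11, 15, 9, 4, 34, 17, 18, 14, 12, 13, 26, 31]
sample_b = [34, 31, 35, 29, 28, 12, 18, 30, 14, 22, 10, 29]

m, table, chi2 = moods_median_test(sample_a, sample_b)
p_value = math.erfc(math.sqrt(chi2 / 2))   # upper tail of chi-square with df = 1 only
print(m, table, round(chi2, 4), round(p_value, 4))
```

Because the p-value is below α = 0.05, the sketch leads to rejecting the hypothesis of equal medians.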
  • 20. Example • A major wheat supplier from Texas is analyzing the yields of various crop methods. He randomly assigned two different wheat crop methods to a very high number of different acres of farm land and recorded the production rate (yield per acre) for each plot. We need to find out whether there is a difference between the two wheat crop methods.
  • 21. Kruskal Wallis Test • The Kruskal Wallis test is the non parametric alternative to the One Way ANOVA. • The test determines whether the medians of two or more groups are different. Like most statistical tests, you calculate a test statistic and compare it to a distribution cut-off point. The test statistic used in this test is called the H statistic. • The hypotheses for the test are: • H0: population medians are equal. • H1: population medians are not equal. • The Kruskal Wallis test will tell you if there is a significant difference between groups. However, it won’t tell you which groups are different. • Examples of when the test applies: • You want to find out how test anxiety affects actual test scores. The independent variable “test anxiety” has three levels: no anxiety, low-medium anxiety and high anxiety. The dependent variable is the exam score, rated from 0 to 100%. • You want to find out how socioeconomic status affects attitude towards sales tax increases. Your independent variable is “socioeconomic status” with three levels: working class, middle class and wealthy. The dependent variable is measured on a 5-point scale from strongly agree to strongly disagree.
  • 22. • The H test is used when the assumptions for ANOVA aren’t met (like the assumption of normality). It is sometimes called the one-way ANOVA on ranks, as the ranks of the data values are used in the test rather than the actual data points. • Assumptions:- • One independent variable with two or more levels (independent groups). The test is more commonly used when you have three or more levels. • Ordinal scale, Ratio Scale or Interval scale dependent variables. • Your observations should be independent. In other words, there should be no relationship between the members in each group or between groups. • All groups should have the same shape distributions. • It is used for comparing two or more independent samples of equal or different sample sizes.
  • 23. • The Kruskal-Wallis H Test is a nonparametric procedure that can be used to compare more than two populations in a completely randomized design. • All n = n1+n2+…+nk measurements are jointly ranked (i.e. treated as one large sample). • We use the sums of the ranks of the k samples to compare the distributions.
  • 24. • Rank the total measurements in all k samples from 1 to n. Tied observations are assigned the average of the ranks they would have gotten if not tied. • Calculate $T_i$ = rank sum for the ith sample, i = 1, 2, …, k, and the test statistic $H = \dfrac{12}{n(n+1)} \sum_{i=1}^{k} \dfrac{T_i^2}{n_i} - 3(n+1)$
  • 25. H0: the k distributions are identical versus Ha: at least one distribution is different Test statistic: Kruskal-Wallis H When H0 is true, the test statistic H has an approximate chi-square distribution with df = k-1. Use a right-tailed rejection region or p- value based on the Chi-square distribution.
  • 26. Example • A shoe company wants to know if three groups of workers have different salaries: Women: 23K, 41K, 54K, 66K, 90K. Men: 45K, 55K, 60K, 70K, 72K. Minorities: 20K, 30K, 34K, 40K, 44K. • Sol:- Null Hypothesis H0 : All groups are equal. Alternative Hypothesis H1 : At least one group is not equal. • Step 1: Sort the data for all groups/samples into ascending order in one combined set. 20K 23K 30K 34K 40K 41K 44K 45K 54K 55K 60K 66K 70K 72K 90K
  • 27. • Step 2: Assign ranks to the sorted data points. Give tied values the average rank. 20K = 1, 23K = 2, 30K = 3, 34K = 4, 40K = 5, 41K = 6, 44K = 7, 45K = 8, 54K = 9, 55K = 10, 60K = 11, 66K = 12, 70K = 13, 72K = 14, 90K = 15
  • 28. • Step 3: Add up the different ranks for each group/sample. Women: 23K, 41K, 54K, 66K, 90K = 2 + 6 + 9 + 12 + 15 = 44. Men: 45K, 55K, 60K, 70K, 72K = 8 + 10 + 11 + 13 + 14 = 56. Minorities: 20K, 30K, 34K, 40K, 44K = 1 + 3 + 4 + 5 + 7 = 20. • Step 4: Calculate the H statistic: $H = \left[\dfrac{12}{n(n+1)} \sum_{j=1}^{c} \dfrac{T_j^2}{n_j}\right] - 3(n+1)$ Where: • n = sum of sample sizes for all samples, • c = number of samples, • Tj = sum of ranks in the jth sample, • nj = size of the jth sample.
  • 29. H = 6.72 Step 5: Find the critical chi-square value, with c-1 degrees of freedom. For 3 – 1 degrees of freedom and an alpha level of .05, the critical chi square value is 5.9915. Step 6: Compare the H value from Step 4 to the critical chi-square value from Step 5. If the critical chi-square value is less than the H statistic, reject the null hypothesis that the medians are equal. If the chi-square value is not less than the H statistic, there is not enough evidence to suggest that the medians are unequal. In this case, 5.9915 is less than 6.72, so we can reject the null hypothesis.
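The arithmetic behind H = 6.72 can be checked with a short Python sketch that plugs the rank sums from Step 3 into the H formula. The function name `kruskal_wallis_h` is hypothetical, and no tie correction is included (none is needed here because the salaries contain no ties).

```python
def kruskal_wallis_h(rank_sums, sizes):
    """H = 12 / (n (n + 1)) * sum(T_j^2 / n_j) - 3 (n + 1), without tie correction."""
    n = sum(sizes)
    weighted = sum(t * t / m for t, m in zip(rank_sums, sizes))
    return 12 / (n * (n + 1)) * weighted - 3 * (n + 1)


# Rank sums for women, men and minorities from Step 3; five workers per group
h = kruskal_wallis_h([44, 56, 20], [5, 5, 5])
critical = 5.9915                    # chi-square critical value, df = 2, alpha = 0.05
print(round(h, 2), h > critical)
```

Since H exceeds the critical value, the sketch agrees with rejecting the null hypothesis.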
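  • 30. • Perform the Kruskal-Wallis test for the following data (the rank sum of each group is given in parentheses):- Group 1: 8, 5, 7, 11, 9, 6 (25.5) Group 2: 10, 12, 11, 9, 13, 12 (64) Group 3: 11, 14, 10, 16, 17, 12 (87.5) Group 4: 18, 20, 16, 15, 14, 22 (123) • Significance Level α = 0.05 and One-tailed test. • $H = \dfrac{12}{24 \cdot 25}\left[\dfrac{25.5^2 + 64^2 + 87.5^2 + 123^2}{6}\right] - 3(24+1)$ • H ≈ 16.77 • Critical value = 7.815 (chi-square, df = 3)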
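Because this data set contains ties, it is a useful check on the average-rank rule. The sketch below ranks the pooled data, sums the ranks per group and evaluates the H formula without a tie correction; the helper name `average_ranks` is hypothetical.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1            # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


groups = [
    [8, 5, 7, 11, 9, 6],
    [10, 12, 11, 9, 13, 12],
    [11, 14, 10, 16, 17, 12],
    [18, 20, 16, 15, 14, 22],
]
pooled = [v for g in groups for v in g]
ranks = average_ranks(pooled)

# Split the pooled ranks back into groups and sum them
rank_sums, start = [], 0
for g in groups:
    rank_sums.append(sum(ranks[start:start + len(g)]))
    start += len(g)

n = len(pooled)
h = 12 / (n * (n + 1)) * sum(t * t / len(g) for t, g in zip(rank_sums, groups)) - 3 * (n + 1)
print(rank_sums, round(h, 4))
```

The rank sums come out as 25.5, 64, 87.5 and 123, and the uncorrected H is well above the critical value of 7.815, so the null hypothesis of identical distributions would be rejected.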
  • 31. Mann Whitney U Test • The Mann-Whitney U test is the nonparametric equivalent of the two sample t-test. • The Mann Whitney U test is sometimes called the Mann Whitney Wilcoxon Test or the Wilcoxon Rank Sum Test. • While the t-test makes an assumption about the distribution of a population, the Mann Whitney U Test makes no such assumption. • The test compares two populations. • The null hypothesis is that the two samples come from the same population (i.e. that they both have the same median). • This test is often performed as a two-sided test and, thus, the research hypothesis indicates that the populations are not equal as opposed to specifying directionality. • A one-sided research hypothesis is used if interest lies in detecting a positive or negative shift in one population as compared to the other.
  • 32. • Assumptions for the Mann Whitney U Test • The dependent variable should be measured on an ordinal scale or a continuous scale. • The independent variable should be two independent, categorical groups. • Observations should be independent. In other words, there should be no relationship between the two groups or within each group. • Observations are not normally distributed. However, they should follow the same shape (e.g. both are bell-shaped or both are skewed left). • The result of performing a Mann Whitney U Test is a U Statistic. • For small samples, use the direct method (see below) to find the U statistic; • For larger samples, a formula is necessary.
  • 33. Formula Either of these two formulas is valid for the Mann Whitney U Test. R is the sum of ranks in the sample, and n is the number of items in the sample.
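One common way to write the two statistics is given here as an assumption, since the slide's own notation is not shown. $R_1$ and $R_2$ are the rank sums of the two samples and $n_1$ and $n_2$ are their sizes:

```latex
U_1 = n_1 n_2 + \frac{n_1 (n_1 + 1)}{2} - R_1, \qquad
U_2 = n_1 n_2 + \frac{n_2 (n_2 + 1)}{2} - R_2, \qquad
U = \min(U_1, U_2)
```

Under this convention $U_1 + U_2 = n_1 n_2$, which gives a quick check on the arithmetic. Some texts instead define $U_i = R_i - n_i(n_i+1)/2$; that swaps the two values but leaves the smaller one, and therefore the test statistic, unchanged.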
  • 34. Consider a Phase II clinical trial designed to investigate the effectiveness of a new drug to reduce symptoms of asthma in children. A total of n=10 participants are randomized to receive either the new drug or a placebo. Participants are asked to record the number of episodes of shortness of breath over a 1-week period following receipt of the assigned treatment. The data are shown below.
    Placebo: 7, 5, 6, 4, 12
    New Drug: 3, 6, 4, 2, 1
    Is there a difference in the number of episodes of shortness of breath over a 1-week period in participants receiving the new drug as compared to those receiving the placebo? SOL:- In this example, the outcome is a count and in this sample the data do not follow a normal distribution. In addition, the sample size is small (n1=n2=5), so a nonparametric test is appropriate. The hypothesis is given below, and we run the test at the 5% level of significance (i.e., α=0.05). H0: The two populations are equal versus H1: The two populations are not equal. The first step is to assign ranks, and to do so we order the data from smallest to largest. This is done on the combined or total sample (i.e., pooling the data from the two treatment groups (n=10)), assigning ranks from 1 to 10, as follows.
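  • 35. Total sample, the same values ordered smallest to largest, and their ranks (tied values share the average rank):
    Total sample: Placebo | Total sample: New Drug | Ordered: Placebo | Ordered: New Drug | Rank: Placebo | Rank: New Drug
    7 | 3 | | 1 | | 1
    5 | 6 | | 2 | | 2
    6 | 4 | | 3 | | 3
    4 | 2 | 4 | 4 | 4.5 | 4.5
    12 | 1 | 5 | | 6 |
    | | 6 | 6 | 7.5 | 7.5
    | | 7 | | 9 |
    | | 12 | | 10 |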
  • 36. • We produce a test statistic based on the ranks. • First, we sum the ranks in each group. In the placebo group, the sum of the ranks is 37; in the new drug group, the sum of the ranks is 18. Recall that the sum of the ranks will always equal n(n+1)/2. As a check on our assignment of ranks, we have n(n+1)/2 = 10(11)/2 = 55, which is equal to 37 + 18 = 55. • For the test, we call the placebo group 1 and the new drug group 2. • We let R1 denote the sum of the ranks in group 1 (i.e., R1=37), and R2 denote the sum of the ranks in group 2 (i.e., R2=18). • The test statistic for the Mann Whitney U Test is denoted U and is the smaller of U1 and U2.
  • 37. • In every test, we must determine whether the observed U supports the null or research hypothesis. • We determine a critical value of U such that if the observed value of U is less than or equal to the critical value, we reject H0 in favor of H1, and if the observed value of U exceeds the critical value we do not reject H0. • To determine the appropriate critical value we need the sample sizes (for example, n1=n2=5) and our two-sided level of significance (α=0.05). • The critical value is 2, and the decision rule is to reject H0 if U ≤ 2. Here the observed U is 3, so we do not reject H0 because 3 > 2. We do not have statistically significant evidence at α=0.05 to show that the two populations of numbers of episodes of shortness of breath are not equal. • To be significant, our obtained U has to be equal to or LESS than this critical value.
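The rank sums and the observed U for the asthma example can be reproduced with a small Python sketch. It assumes the convention U_i = n1·n2 + n_i(n_i+1)/2 − R_i and takes the smaller value as the test statistic; the function name `mann_whitney_u` is hypothetical.

```python
def mann_whitney_u(group1, group2):
    """Return (R1, R2, U) using average ranks for ties on the pooled sample."""
    pooled = sorted(group1 + group2)

    # Collect the 1-based positions of each distinct value, then average them
    positions = {}
    for idx, value in enumerate(pooled, start=1):
        positions.setdefault(value, []).append(idx)
    rank = {value: sum(p) / len(p) for value, p in positions.items()}

    r1 = sum(rank[v] for v in group1)
    r2 = sum(rank[v] for v in group2)
    n1, n2 = len(group1), len(group2)
    u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
    u2 = n1 * n2 + n2 * (n2 + 1) / 2 - r2
    return r1, r2, min(u1, u2)


placebo = [7, 5, 6, 4, 12]
new_drug = [3, 6, 4, 2, 1]

r1, r2, u = mann_whitney_u(placebo, new_drug)
critical = 2                          # table value for n1 = n2 = 5, two-sided alpha = 0.05
print(r1, r2, u, u <= critical)
```

The sketch gives R1 = 37, R2 = 18 and U = 3; since 3 is greater than the critical value of 2, H0 is not rejected.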
  • 38. • A new approach to prenatal care is proposed for pregnant women living in a rural community. The new program involves in-home visits during the course of pregnancy in addition to the usual or regularly scheduled visits. A pilot randomized trial with 15 pregnant women is designed to evaluate whether women who participate in the program deliver healthier babies than women receiving usual care. The outcome is the APGAR score measured 5 minutes after birth. Recall that APGAR scores range from 0 to 10, with scores of 7 or higher considered normal (healthy), 4-6 low and 0-3 critically low. The data are shown below.
    Usual Care: 8, 7, 6, 2, 5, 8, 7, 3
    New Program: 9, 9, 7, 8, 10, 9, 6
    Is there statistical evidence of a difference in APGAR scores in women receiving the new and enhanced versus usual prenatal care?