Kruskal-Wallis test


Published in: Education, Technology, Business

  1. KRUSKAL-WALLIS TEST

     In statistics, the Kruskal-Wallis one-way analysis of variance by ranks (named after William Kruskal and W. Allen Wallis) is a non-parametric method for testing whether samples originate from the same distribution. It is used for comparing more than two samples that are independent, or not related. The parametric equivalent of the Kruskal-Wallis test is the one-way analysis of variance (ANOVA). The null hypothesis is that the populations from which the samples originate have the same median. When the Kruskal-Wallis test leads to significant results, then at least one of the samples is different from the other samples. The test does not identify where the differences occur or how many differences actually occur. It is an extension of the Mann-Whitney U test to 3 or more groups. The Mann-Whitney U test can then be used to analyze specific sample pairs for significant differences.

     Since it is a non-parametric method, the Kruskal-Wallis test does not assume a normal distribution, unlike the analogous one-way analysis of variance. However, the test does assume an identically shaped and scaled distribution for each group, except for any difference in medians. Kruskal-Wallis is also used when the examined groups are of unequal size (different number of participants).[1]

     Method:

     1. Rank all data from all groups together; i.e., rank the data from 1 to N ignoring group membership. Assign any tied values the average of the ranks they would have received had they not been tied.
     2. The test statistic is given by:

        $$K = (N-1)\,\frac{\sum_{i=1}^{g} n_i\,(\bar{r}_{i\cdot} - \bar{r})^2}{\sum_{i=1}^{g}\sum_{j=1}^{n_i} (r_{ij} - \bar{r})^2}$$

        where:
        o $g$ is the number of groups
        o $n_i$ is the number of observations in group $i$
        o $r_{ij}$ is the rank (among all observations) of observation $j$ from group $i$
        o $N$ is the total number of observations across all groups
        o $\bar{r}_{i\cdot} = \frac{1}{n_i}\sum_{j=1}^{n_i} r_{ij}$ is the average rank in group $i$
        o $\bar{r} = \frac{1}{2}(N+1)$ is the average of all the $r_{ij}$.
  2. Method (continued):

     3. Notice that the denominator of the expression for $K$ is exactly $(N-1)N(N+1)/12$ and $\bar{r} = (N+1)/2$. Thus

        $$K = \frac{12}{N(N+1)} \sum_{i=1}^{g} n_i \left(\bar{r}_{i\cdot} - \frac{N+1}{2}\right)^2 = \frac{12}{N(N+1)} \sum_{i=1}^{g} n_i\,\bar{r}_{i\cdot}^{\,2} - 3(N+1).$$

        Notice that the last formula only contains the squares of the average ranks $\bar{r}_{i\cdot}$.
     4. A correction for ties can be made by dividing $K$ by $1 - \frac{\sum_{i=1}^{G} (t_i^3 - t_i)}{N^3 - N}$, where $G$ is the number of groupings of different tied ranks, and $t_i$ is the number of tied values within grouping $i$ that are tied at a particular value. This correction usually makes little difference in the value of $K$ unless there are a large number of ties.
     5. Finally, the p-value is approximated by $\Pr(\chi^2_{g-1} \ge K)$. If some $n_i$ values are small (i.e., less than 5) the probability distribution of $K$ can be quite different from this chi-squared distribution. If a table of the chi-squared probability distribution is available, the critical value of chi-squared, $\chi^2_{\alpha:g-1}$, can be found by entering the table at $g - 1$ degrees of freedom and looking under the desired significance or alpha level. The null hypothesis of equal population medians would then be rejected if $K \ge \chi^2_{\alpha:g-1}$. Appropriate multiple comparisons would then be performed on the group medians.
     6. If the statistic is not significant, then there is no evidence of differences between the samples. However, if the test is significant then a difference exists between at least two of the samples. Therefore, a researcher might use sample contrasts between individual sample pairs, or post hoc tests, to determine which of the sample pairs are significantly different. When performing multiple sample contrasts, the Type I error rate tends to become inflated.
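The method steps above can be sketched in plain Python. This is only an illustrative sketch, not a reference implementation: the function names (`average_ranks`, `chi2_sf`, `kruskal_wallis`) and the sample data are made up for this example, and the chi-squared tail probability is computed with a simple series expansion rather than a statistics library.

```python
import math
from collections import Counter


def average_ranks(values):
    """Rank values from 1 to N; tied values get the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value is tied with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end) / 2 + 1  # mean of the 1-based positions pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = avg
        pos = end + 1
    return ranks


def chi2_sf(x, df):
    """Upper-tail chi-squared probability Pr(chi2_df >= x), via the series
    for the regularized lower incomplete gamma function."""
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 1
    while abs(term) > 1e-15 * abs(total):
        term *= z / (a + n)
        total += term
        n += 1
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return max(0.0, 1.0 - lower)


def kruskal_wallis(groups):
    """Return (K, degrees of freedom, approximate p-value) for a list of groups."""
    data = [v for g in groups for v in g]
    n_total = len(data)
    ranks = average_ranks(data)  # step 1: rank all data together

    # Step 3 formula: K = 12 / (N (N + 1)) * sum(n_i * rbar_i^2) - 3 (N + 1)
    weighted = 0.0
    start = 0
    for g in groups:
        group_ranks = ranks[start:start + len(g)]
        weighted += sum(group_ranks) ** 2 / len(g)  # equals n_i * rbar_i^2
        start += len(g)
    k = 12.0 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)

    # Step 4: tie correction, dividing by 1 - sum(t^3 - t) / (N^3 - N)
    ties = sum(t ** 3 - t for t in Counter(data).values())
    correction = 1 - ties / (n_total ** 3 - n_total)
    if correction == 0:
        raise ValueError("all observations are tied; K is undefined")
    k /= correction

    # Step 5: chi-squared approximation with g - 1 degrees of freedom
    df = len(groups) - 1
    return k, df, chi2_sf(k, df)


# Made-up data: three groups of three observations, no ties.
groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
k, df, p = kruskal_wallis(groups)
print(k, df, p)  # K = 7.2 (up to rounding), df = 2, p is about 0.027
```

With groups this small (fewer than 5 observations each), the chi-squared approximation in step 5 may be poor, so the printed p-value should be read as illustrative only.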
  3. This material is from another website.

     The Kruskal-Wallis test evaluates whether the population medians on a dependent variable are the same across all levels of a factor. To conduct the Kruskal-Wallis test, using the K independent samples procedure, cases must have scores on an independent or grouping variable and on a dependent variable. The independent or grouping variable divides individuals into two or more groups, and the dependent variable assesses individuals on at least an ordinal scale. If the independent variable has only two levels, no additional significance tests need to be conducted beyond the Kruskal-Wallis test. However, if a factor has more than two levels and the overall test is significant, follow-up tests are usually conducted. These follow-up tests most frequently involve comparisons between pairs of group medians. For the Kruskal-Wallis test, we could use the Mann-Whitney U test to examine unique pairs.

     ASSUMPTIONS UNDERLYING A KRUSKAL-WALLIS TEST

     Because the analysis for the Kruskal-Wallis test is conducted on ranked scores, the population distributions for the test variable (the scores that the ranks are based on) do not have to be of any particular form (e.g., normal). However, these distributions should be continuous and have identical form.

     Assumption 1: The continuous distributions for the test variable are exactly the same (except their medians) for the different populations.

     Assumption 2: The cases represent random samples from the populations, and the scores on the test variable are independent of each other.

     Assumption 3: The chi-square statistic for the Kruskal-Wallis test is only approximate and becomes more accurate with larger sample sizes. The p value for the chi-square approximation test is fairly accurate if the number of cases is greater than or equal to 30.

     EFFECT SIZE STATISTICS FOR THE KRUSKAL-WALLIS TEST

     SPSS does not report an effect size index for the Kruskal-Wallis test. However, simple indices can be computed to communicate the size of the effect. For the Kruskal-Wallis test, the median and the mean rank for each of the groups can be reported. Another possibility for the Kruskal-Wallis test is to compute an index that is usually associated with a one-way ANOVA, such as eta squared (η²), except η² in this case would be computed on the ranked data. To do so, transform the scores to ranks, conduct an ANOVA, and compute an eta squared on the ranked scores. Eta squared can also be computed directly from the reported chi-square value for the Kruskal-Wallis test with the use of the following equation:
  4. 4. n2=X2/N-1Where N is the total number of casesTHE RESEARCH QUESTIONSThe research questions used in this example can be asked to reflect differences in mediansbetween groups or a relationship between two variables.1. Differences between the medians: Do the medians for change in the number of days of coldsymptoms differ among those who take a placebo, those who take low doses of vitamin C, andthose who take high doses of vitamin C?2. Relationship between two variables: Is there a relationship between the amount of vitamin Ctaken and the change in the number of days that individuals show cold symptoms?