Nonparametric statistics ppt @ bec doms


Published in: Business, Technology

  1. 1. Nonparametric Statistics
  2. 2. Chapter Goals <ul><li>After completing this chapter, you should be able to: </li></ul><ul><li>Recognize when and how to use the Wilcoxon signed rank test for a population median </li></ul><ul><li>Recognize the situations for which the Wilcoxon matched-pairs signed rank test applies and be able to use it for decision-making </li></ul><ul><li>Know when and how to perform a Mann-Whitney U-test </li></ul><ul><li>Perform nonparametric analysis of variance using the Kruskal-Wallis one-way ANOVA </li></ul>
  3. 3. Nonparametric Statistics <ul><li>Nonparametric Statistics </li></ul><ul><ul><li>Fewer restrictive assumptions about data levels and underlying probability distributions </li></ul></ul><ul><ul><ul><li>Population distributions may be skewed </li></ul></ul></ul><ul><ul><ul><li>The level of data measurement may only be ordinal or nominal </li></ul></ul></ul>
  4. 4. Wilcoxon Signed Rank Test <ul><li>Used to test a hypothesis about one population median </li></ul><ul><ul><li>the median is the midpoint of the distribution: 50% below, 50% above </li></ul></ul><ul><li>A hypothesized median is rejected if sample results vary too much from expectations </li></ul><ul><ul><li>no highly restrictive assumptions about the shape of the population distribution are needed </li></ul></ul>
  5. 5. The W Test Statistic <ul><li>Performing the Wilcoxon Signed Rank Test </li></ul><ul><li>Calculate the test statistic W using these steps: </li></ul><ul><li>Step 1: collect sample data </li></ul><ul><li>Step 2: compute d<sub>i</sub> = difference between each value and the hypothesized median </li></ul><ul><li>Step 3: convert d<sub>i</sub> values to absolute differences |d<sub>i</sub>| </li></ul>
  6. 6. The W Test Statistic (continued) <ul><li>Performing the Wilcoxon Signed Rank Test </li></ul><ul><li>Step 4: determine the ranks of the absolute differences |d<sub>i</sub>| </li></ul><ul><ul><li>eliminate zero d<sub>i</sub> values </li></ul></ul><ul><ul><li>Lowest |d<sub>i</sub>| value gets rank 1 </li></ul></ul><ul><ul><li>For ties, assign each the average rank of the tied observations </li></ul></ul>
  7. 7. The W Test Statistic (continued) <ul><li>Performing the Wilcoxon Signed Rank Test </li></ul><ul><li>Step 5: Create R+ and R- columns </li></ul><ul><ul><li>for data values greater than the hypothesized median, put the rank in an R+ column </li></ul></ul><ul><ul><li>for data values less than the hypothesized median, put the rank in an R- column </li></ul></ul>
  8. 8. The W Test Statistic (continued) <ul><li>Performing the Wilcoxon Signed Rank Test </li></ul><ul><li>Step 6: the test statistic W is the sum of the ranks in the R+ column </li></ul><ul><li>Test the hypothesis by comparing the calculated W to the critical value from the table in appendix P </li></ul><ul><ul><li>Note that n = the number of non-zero d<sub>i</sub> values </li></ul></ul>
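  9. 9. Example <ul><li>The median class size is claimed to be 40 </li></ul><ul><li>Sample data for 8 classes are randomly obtained </li></ul><ul><li>Compare each value to the hypothesized median to find the difference </li></ul>
      <table>
      <tr><th>Class size x<sub>i</sub></th><th>Difference d<sub>i</sub> = x<sub>i</sub> – 40</th><th>|d<sub>i</sub>|</th></tr>
      <tr><td>23</td><td>-17</td><td>17</td></tr>
      <tr><td>45</td><td>5</td><td>5</td></tr>
      <tr><td>34</td><td>-6</td><td>6</td></tr>
      <tr><td>78</td><td>38</td><td>38</td></tr>
      <tr><td>34</td><td>-6</td><td>6</td></tr>
      <tr><td>66</td><td>26</td><td>26</td></tr>
      <tr><td>61</td><td>21</td><td>21</td></tr>
      <tr><td>95</td><td>55</td><td>55</td></tr>
      </table>
  10. 10. Example (continued) <ul><li>Rank the absolute differences: </li></ul>
      <table>
      <tr><th>|d<sub>i</sub>|</th><th>Rank</th></tr>
      <tr><td>5</td><td>1</td></tr>
      <tr><td>6</td><td>2.5 (tied)</td></tr>
      <tr><td>6</td><td>2.5 (tied)</td></tr>
      <tr><td>17</td><td>4</td></tr>
      <tr><td>21</td><td>5</td></tr>
      <tr><td>26</td><td>6</td></tr>
      <tr><td>38</td><td>7</td></tr>
      <tr><td>55</td><td>8</td></tr>
      </table>
  11. 11. Example (continued) <ul><li>Put ranks in R+ and R- columns </li></ul><ul><li>and find sums: </li></ul>
      <table>
      <tr><th>Class size x<sub>i</sub></th><th>Difference d<sub>i</sub> = x<sub>i</sub> – 40</th><th>|d<sub>i</sub>|</th><th>Rank</th><th>R+</th><th>R-</th></tr>
      <tr><td>23</td><td>-17</td><td>17</td><td>4</td><td></td><td>4</td></tr>
      <tr><td>45</td><td>5</td><td>5</td><td>1</td><td>1</td><td></td></tr>
      <tr><td>34</td><td>-6</td><td>6</td><td>2.5</td><td></td><td>2.5</td></tr>
      <tr><td>78</td><td>38</td><td>38</td><td>7</td><td>7</td><td></td></tr>
      <tr><td>34</td><td>-6</td><td>6</td><td>2.5</td><td></td><td>2.5</td></tr>
      <tr><td>66</td><td>26</td><td>26</td><td>6</td><td>6</td><td></td></tr>
      <tr><td>61</td><td>21</td><td>21</td><td>5</td><td>5</td><td></td></tr>
      <tr><td>95</td><td>55</td><td>55</td><td>8</td><td>8</td><td></td></tr>
      <tr><td></td><td></td><td></td><td></td><td>Σ = 27</td><td>Σ = 9</td></tr>
      </table>
      The three values in the R- column are below the claimed median; the others are above.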
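The six steps above can be sketched in a few lines of Python. This is a minimal illustration, not code from the slides: the helper names `average_ranks` and `wilcoxon_signed_rank` are made up for this example, and the data are the eight class sizes from the example.

```python
def average_ranks(values):
    """Rank values from low to high (1-based); tied values share the average rank."""
    ordered = sorted(values)
    # First 1-based position of v, plus half the number of extra tied copies.
    return [ordered.index(v) + 1 + (ordered.count(v) - 1) / 2 for v in values]


def wilcoxon_signed_rank(sample, hypothesized_median):
    """Return (sum of R+ ranks, sum of R- ranks, number of non-zero differences)."""
    # Steps 2-4: differences, with zero differences eliminated, then rank |d_i|.
    diffs = [x - hypothesized_median for x in sample if x != hypothesized_median]
    ranks = average_ranks([abs(d) for d in diffs])
    # Steps 5-6: split the ranks by the sign of the difference and sum each column.
    r_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    r_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return r_plus, r_minus, len(diffs)


class_sizes = [23, 45, 34, 78, 34, 66, 61, 95]
w, r_minus, n = wilcoxon_signed_rank(class_sizes, 40)
print(w, r_minus, n)  # 27.0 9.0 8
```

The two sums always add up to n(n+1)/2 (here 36), which is a quick check on the ranking.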
  12. 12. Completing the Test <ul><li>H<sub>0</sub>: Median = 40 </li></ul><ul><li>H<sub>A</sub>: Median ≠ 40 </li></ul>Test at the α = .05 level: this is a two-tailed test and n = 8, so find W<sub>L</sub> and W<sub>U</sub> in appendix P: W<sub>L</sub> = 3 and W<sub>U</sub> = 33. The calculated test statistic is W = ΣR+ = 27.
  13. 13. Completing the Test (continued) <ul><li>H<sub>0</sub>: Median = 40 </li></ul><ul><li>H<sub>A</sub>: Median ≠ 40 </li></ul>W<sub>L</sub> = 3 and W<sub>U</sub> = 33. W<sub>L</sub> < W < W<sub>U</sub>, so do not reject H<sub>0</sub> (there is not sufficient evidence to conclude that the median class size is different from 40). [Figure: number line with "reject H<sub>0</sub>" regions below W<sub>L</sub> = 3 and above W<sub>U</sub> = 33; W = ΣR+ = 27 falls in the "do not reject H<sub>0</sub>" region between them.]
  14. 14. If the Sample Size is Large <ul><li>The W test statistic approaches a normal distribution as n increases </li></ul><ul><li>For n > 20, the test can be based on the normal approximation </li></ul>$$z = \frac{W - \dfrac{n(n+1)}{4}}{\sqrt{\dfrac{n(n+1)(2n+1)}{24}}}$$ where W = sum of the R+ ranks and n = number of non-zero d<sub>i</sub> values
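A minimal sketch of the large-sample approximation, assuming the usual null mean n(n+1)/4 and standard deviation sqrt(n(n+1)(2n+1)/24) for W. The function name `wilcoxon_z` and the input values W = 150, n = 30 are hypothetical, chosen only to show the arithmetic.

```python
import math


def wilcoxon_z(w, n):
    """Standardize the signed-rank sum W for n non-zero differences."""
    mean = n * (n + 1) / 4
    std_dev = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return (w - mean) / std_dev


# Hypothetical inputs (not from the slides): W = 150 with n = 30 non-zero differences.
z = wilcoxon_z(w=150, n=30)
print(round(z, 3))
```

The resulting z is compared with standard normal critical values in the usual way.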
  15. 15. Nonparametric Tests for Two Population Centers <ul><li>Wilcoxon Matched-Pairs Signed Rank Test <ul><li>Large Samples </li><li>Small Samples </li></ul></li><li>Mann-Whitney U-test <ul><li>Large Samples </li><li>Small Samples </li></ul></li></ul>
  16. 16. Mann-Whitney U-Test <ul><li>Used to compare two samples from two populations </li></ul><ul><li>Assumptions: </li></ul><ul><ul><li>The two samples are independent and random </li></ul></ul><ul><ul><li>The value measured is a continuous variable </li></ul></ul><ul><ul><li>The measurement scale used is at least ordinal </li></ul></ul><ul><ul><li>If they differ, the distributions of the two populations will differ only with respect to the central location </li></ul></ul>
  17. 17. Mann-Whitney U-Test (continued) <ul><li>Consider two samples </li></ul><ul><ul><li>combine into a single list, but keep track of which sample each value came from </li></ul></ul><ul><ul><li>rank the values in the combined list from low to high </li></ul></ul><ul><ul><ul><li>For ties, assign each the average rank of the tied values </li></ul></ul></ul><ul><ul><li>separate back into two samples, each value keeping its assigned ranking </li></ul></ul><ul><ul><li>sum the rankings for each sample </li></ul></ul>
  18. 18. Mann-Whitney U-Test (continued) <ul><li>If the sum of rankings from one sample differs enough from the sum of rankings from the other sample, we conclude there is a difference in the population medians </li></ul>
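  19. 19. Mann-Whitney U-Test (continued) Mann-Whitney U-Statistics: $$U_1 = n_1 n_2 + \frac{n_1(n_1+1)}{2} - \sum R_1$$ $$U_2 = n_1 n_2 + \frac{n_2(n_2+1)}{2} - \sum R_2$$ where n<sub>1</sub> and n<sub>2</sub> are the two sample sizes, and ΣR<sub>1</sub> and ΣR<sub>2</sub> are the sums of ranks for samples 1 and 2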
  20. 20. Mann-Whitney U-Test (continued) <ul><li>Claim: Median class size for Math is larger than the median class size for English </li></ul><ul><li>A random sample of 9 Math and 9 English classes is selected (samples do not have to be of equal size) </li></ul><ul><li>Rank the combined values and then split them back into the separate samples </li></ul>
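  21. 21. Mann-Whitney U-Test (continued) <ul><li>Suppose the results are: </li></ul>
      <table>
      <tr><th>Class size (Math, M)</th><th>Class size (English, E)</th></tr>
      <tr><td>23</td><td>30</td></tr>
      <tr><td>45</td><td>47</td></tr>
      <tr><td>34</td><td>18</td></tr>
      <tr><td>78</td><td>34</td></tr>
      <tr><td>34</td><td>44</td></tr>
      <tr><td>66</td><td>61</td></tr>
      <tr><td>62</td><td>54</td></tr>
      <tr><td>95</td><td>28</td></tr>
      <tr><td>81</td><td>40</td></tr>
      </table>
  22. 22. Mann-Whitney U-Test (continued) Ranking for combined samples:
      <table>
      <tr><th>Size</th><th>Rank</th></tr>
      <tr><td>18</td><td>1</td></tr>
      <tr><td>23</td><td>2</td></tr>
      <tr><td>28</td><td>3</td></tr>
      <tr><td>30</td><td>4</td></tr>
      <tr><td>34</td><td>6 (tied)</td></tr>
      <tr><td>34</td><td>6 (tied)</td></tr>
      <tr><td>34</td><td>6 (tied)</td></tr>
      <tr><td>40</td><td>8</td></tr>
      <tr><td>44</td><td>9</td></tr>
      <tr><td>45</td><td>10</td></tr>
      <tr><td>47</td><td>11</td></tr>
      <tr><td>54</td><td>12</td></tr>
      <tr><td>61</td><td>13</td></tr>
      <tr><td>62</td><td>14</td></tr>
      <tr><td>66</td><td>15</td></tr>
      <tr><td>78</td><td>16</td></tr>
      <tr><td>81</td><td>17</td></tr>
      <tr><td>95</td><td>18</td></tr>
      </table>
  23. 23. Mann-Whitney U-Test (continued) <ul><li>Split back into the original samples: </li></ul>
      <table>
      <tr><th>Class size (Math, M)</th><th>Rank</th><th>Class size (English, E)</th><th>Rank</th></tr>
      <tr><td>23</td><td>2</td><td>30</td><td>4</td></tr>
      <tr><td>45</td><td>10</td><td>47</td><td>11</td></tr>
      <tr><td>34</td><td>6</td><td>18</td><td>1</td></tr>
      <tr><td>78</td><td>16</td><td>34</td><td>6</td></tr>
      <tr><td>34</td><td>6</td><td>44</td><td>9</td></tr>
      <tr><td>66</td><td>15</td><td>61</td><td>13</td></tr>
      <tr><td>62</td><td>14</td><td>54</td><td>12</td></tr>
      <tr><td>95</td><td>18</td><td>28</td><td>3</td></tr>
      <tr><td>81</td><td>17</td><td>40</td><td>8</td></tr>
      <tr><td></td><td>Σ = 104</td><td></td><td>Σ = 67</td></tr>
      </table>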
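  24. 24. Mann-Whitney U-Test (continued) Claim: Median class size for Math is larger than the median class size for English. H<sub>0</sub>: Median<sub>M</sub> ≤ Median<sub>E</sub>; H<sub>A</sub>: Median<sub>M</sub> > Median<sub>E</sub>. Math: $$U_1 = (9)(9) + \frac{(9)(10)}{2} - 104 = 22$$ English: $$U_2 = (9)(9) + \frac{(9)(10)}{2} - 67 = 59$$ Note: U<sub>1</sub> + U<sub>2</sub> = n<sub>1</sub>n<sub>2</sub>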
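The ranking and the two U statistics for the class-size example can be reproduced with a short Python sketch. The helper names `average_ranks` and `mann_whitney_u` are illustrative, not from the slides. The sketch assumes the usual form U = n1·n2 + n(n+1)/2 − ΣR for each sample, which agrees with U = 22 and with the check U1 + U2 = n1·n2.

```python
def average_ranks(values):
    """Rank values from low to high (1-based); tied values share the average rank."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 + (ordered.count(v) - 1) / 2 for v in values]


def mann_whitney_u(sample1, sample2):
    """Return (rank sum 1, rank sum 2, U1, U2) for two independent samples."""
    n1, n2 = len(sample1), len(sample2)
    # Rank the combined list, then split the ranks back into the two samples.
    ranks = average_ranks(sample1 + sample2)
    r1, r2 = sum(ranks[:n1]), sum(ranks[n1:])
    u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
    u2 = n1 * n2 + n2 * (n2 + 1) / 2 - r2
    return r1, r2, u1, u2


math_sizes = [23, 45, 34, 78, 34, 66, 62, 95, 81]
english_sizes = [30, 47, 18, 34, 44, 61, 54, 28, 40]
r1, r2, u1, u2 = mann_whitney_u(math_sizes, english_sizes)
print(r1, r2, u1, u2)  # 104.0 67.0 22.0 59.0
```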
  25. 25. Mann-Whitney U-Test (continued) <ul><li>The Mann-Whitney U tables in Appendices L and M give the lower tail of the U-distribution </li></ul><ul><li>For one-tailed tests like this one, check the alternative hypothesis to see if U<sub>1</sub> or U<sub>2</sub> should be used as the test statistic </li></ul><ul><li>Since the alternative hypothesis indicates that population 1 (Math) has a higher median, use U<sub>1</sub> as the test statistic </li></ul>
  26. 26. Mann-Whitney U-Test (continued) <ul><li>Use U<sub>1</sub> as the test statistic: U = 22 </li></ul><ul><li>Compare U = 22 to the critical value U<sub>α</sub> from the appropriate table </li></ul><ul><ul><li>For sample sizes less than 9, use Appendix L </li></ul></ul><ul><ul><li>For sample sizes from 9 to 20, use Appendix M </li></ul></ul><ul><li>If U < U<sub>α</sub>, reject H<sub>0</sub> </li></ul>
  27. 27. Mann-Whitney U-Test (continued) <ul><li>Use U<sub>1</sub> as the test statistic: U = 22 </li></ul><ul><li>U<sub>α</sub> from Appendix M for α = .05, n<sub>1</sub> = 9 and n<sub>2</sub> = 9 is U<sub>α</sub> = 7 </li></ul>Since U ≥ U<sub>α</sub>, do not reject H<sub>0</sub>. [Figure: number line with the "reject H<sub>0</sub>" region below U<sub>α</sub> = 7; U = 22 falls in the "do not reject H<sub>0</sub>" region.]
  28. 28. Mann-Whitney U-Test for Large Samples <ul><li>The table in Appendix M includes U<sub>α</sub> values only for sample sizes between 9 and 20 </li></ul><ul><li>The U statistic approaches a normal distribution as sample sizes increase </li></ul><ul><li>If samples are larger than 20, a normal approximation can be used </li></ul>
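  29. 29. Mann-Whitney U-Test for Large Samples (continued) <ul><li>The mean and standard deviation for Mann-Whitney U Test Statistic: </li></ul>$$\mu = \frac{n_1 n_2}{2}$$ $$\sigma = \sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}}$$ where n<sub>1</sub> and n<sub>2</sub> are sample sizes from populations 1 and 2
  30. 30. Mann-Whitney U-Test for Large Samples (continued) <ul><li>Normal approximation for Mann-Whitney U Test Statistic: </li></ul>$$z = \frac{U - \dfrac{n_1 n_2}{2}}{\sqrt{\dfrac{n_1 n_2 (n_1 + n_2 + 1)}{12}}}$$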
  31. 31. Large Sample Example <ul><li>We wish to test H<sub>0</sub>: Median<sub>1</sub> ≥ Median<sub>2</sub> against H<sub>A</sub>: Median<sub>1</sub> < Median<sub>2</sub> </li></ul><ul><li>Suppose two samples are obtained: </li></ul><ul><li>n<sub>1</sub> = 40, n<sub>2</sub> = 50 </li></ul><ul><li>When rankings are completed, the sum of ranks for sample 1 is ΣR<sub>1</sub> = 1475 </li></ul><ul><li>When rankings are completed, the sum of ranks for sample 2 is ΣR<sub>2</sub> = 2620 </li></ul>
  32. 32. Large Sample Example (continued) Compute the U statistics: $$U_1 = (40)(50) + \frac{(40)(41)}{2} - 1475 = 1345$$ $$U_2 = (40)(50) + \frac{(50)(51)}{2} - 2620 = 655$$ Since the alternative hypothesis indicates that population 2 has a higher median, use U<sub>2</sub> as the test statistic <ul><li>U statistic is found to be U = 655 </li></ul>
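  33. 33. Large Sample Example (continued) $$z = \frac{U - \dfrac{n_1 n_2}{2}}{\sqrt{\dfrac{n_1 n_2 (n_1 + n_2 + 1)}{12}}} = \frac{655 - 1000}{123.15} = -2.80$$ Since z = -2.80 < -1.645, we reject H<sub>0</sub>. [Figure: normal curve centered at 0 with a lower-tail "reject H<sub>0</sub>" region of area α = .05 and a "do not reject H<sub>0</sub>" region to its right.]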
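The large-sample arithmetic can be checked with a short Python sketch. The function name `mann_whitney_z` is illustrative, and the sketch assumes the standard mean n1·n2/2 and standard deviation sqrt(n1·n2·(n1+n2+1)/12) for U. The inputs are the sample sizes and rank sum given in the example.

```python
import math


def mann_whitney_z(u, n1, n2):
    """Standardize a Mann-Whitney U statistic using its null mean and std. deviation."""
    mean = n1 * n2 / 2
    std_dev = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return (u - mean) / std_dev


n1, n2 = 40, 50
sum_r2 = 2620
# U2 from the rank sum of sample 2, the sample the alternative says is higher.
u2 = n1 * n2 + n2 * (n2 + 1) / 2 - sum_r2
z = mann_whitney_z(u2, n1, n2)
print(u2, round(z, 2))  # 655.0 -2.8
```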
  34. 34. Wilcoxon Matched-Pairs Signed Rank Test <ul><li>The Mann-Whitney U-Test is used when samples from two populations are independent </li></ul><ul><li>If samples are paired, they are not independent </li></ul><ul><li>Use Wilcoxon Matched-Pairs Signed Rank Test with paired samples </li></ul>
  35. 35. The Wilcoxon T Test Statistic <ul><li>Performing the Small-Sample Wilcoxon Matched Pairs Test (for n ≤ 25) </li></ul><ul><li>Calculate the test statistic T using these steps: </li></ul><ul><li>Step 1: collect sample data </li></ul><ul><li>Step 2: compute d<sub>i</sub> = difference between the sample 1 value and its paired sample 2 value </li></ul><ul><li>Step 3: rank the absolute values of the differences from lowest to highest, and give each rank the same sign as its difference value </li></ul>
  36. 36. The Wilcoxon T Test Statistic (continued) <ul><li>Performing the Small-Sample Wilcoxon Matched Pairs Test (for n ≤ 25) </li></ul><ul><li>Step 4: The test statistic is the sum of the absolute values of the ranks for the group with the smaller expected sum </li></ul><ul><ul><li>Look at the alternative hypothesis to determine the group with the smaller expected sum </li></ul></ul><ul><ul><li>For two-tailed tests, just choose the smaller sum </li></ul></ul>
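  37. 37. Small Sample Example <ul><li>Paired samples, n = 9: </li></ul>Claim: Median value is smaller after than before
      <table>
      <tr><th>Value (before)</th><th>Value (after)</th></tr>
      <tr><td>36</td><td>30</td></tr>
      <tr><td>45</td><td>47</td></tr>
      <tr><td>34</td><td>18</td></tr>
      <tr><td>58</td><td>54</td></tr>
      <tr><td>30</td><td>38</td></tr>
      <tr><td>46</td><td>31</td></tr>
      <tr><td>42</td><td>24</td></tr>
      <tr><td>55</td><td>62</td></tr>
      <tr><td>41</td><td>40</td></tr>
      </table>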
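  38. 38. Small Sample Example (continued) <ul><li>Paired samples, n = 9: </li></ul>
      <table>
      <tr><th>Value (before)</th><th>Value (after)</th><th>Difference d</th><th>Rank of d</th><th>Ranks with smaller expected sum</th></tr>
      <tr><td>36</td><td>30</td><td>6</td><td>4</td><td></td></tr>
      <tr><td>45</td><td>47</td><td>-2</td><td>-2</td><td>2</td></tr>
      <tr><td>34</td><td>18</td><td>16</td><td>8</td><td></td></tr>
      <tr><td>58</td><td>54</td><td>4</td><td>3</td><td></td></tr>
      <tr><td>30</td><td>38</td><td>-8</td><td>-6</td><td>6</td></tr>
      <tr><td>46</td><td>31</td><td>15</td><td>7</td><td></td></tr>
      <tr><td>42</td><td>24</td><td>18</td><td>9</td><td></td></tr>
      <tr><td>55</td><td>62</td><td>-7</td><td>-5</td><td>5</td></tr>
      <tr><td>41</td><td>40</td><td>1</td><td>1</td><td></td></tr>
      <tr><td></td><td></td><td></td><td></td><td>Σ = T = 13</td></tr>
      </table>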
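The T statistic for the paired data can be recomputed with a short Python sketch. The helper name `average_ranks` is illustrative. Dropping zero differences is an assumption carried over from the one-sample procedure, and none occur in this example.

```python
def average_ranks(values):
    """Rank values from low to high (1-based); tied values share the average rank."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 + (ordered.count(v) - 1) / 2 for v in values]


before = [36, 45, 34, 58, 30, 46, 42, 55, 41]
after = [30, 47, 18, 54, 38, 31, 24, 62, 40]

# d = before - after; zero differences would be dropped (assumption, none here).
diffs = [b - a for b, a in zip(before, after) if b != a]
ranks = average_ranks([abs(d) for d in diffs])

sum_positive = sum(r for d, r in zip(diffs, ranks) if d > 0)
sum_negative = sum(r for d, r in zip(diffs, ranks) if d < 0)

# The claim is "smaller after than before", so positive differences are expected
# and the negative ranks form the group with the smaller expected sum.
t_statistic = sum_negative
print(sum_positive, t_statistic)  # 32.0 13.0
```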
  39. 39. Small Sample Example (continued) <ul><li>The calculated T value is T = 13 </li></ul><ul><li>Complete the test by comparing the calculated T value to the critical T-value from Appendix N </li></ul><ul><li>For n = 9 and α = .025 for a one-tailed test, </li></ul><ul><li>T<sub>α</sub> = 6 </li></ul>Since T ≥ T<sub>α</sub>, do not reject H<sub>0</sub>. [Figure: number line with the "reject H<sub>0</sub>" region below T<sub>α</sub> = 6; T = 13 falls in the "do not reject H<sub>0</sub>" region.]
  40. 40. Wilcoxon Matched Pairs Test for Large Samples <ul><li>The table in Appendix N includes T<sub>α</sub> values only for sample sizes from 6 to 25 </li></ul><ul><li>The T statistic approaches a normal distribution as sample size increases </li></ul><ul><li>If the number of paired values is larger than 25, a normal approximation can be used </li></ul>
  41. 41. Wilcoxon Matched Pairs Test for Large Samples (continued) <ul><li>The mean and standard deviation for Wilcoxon T: </li></ul>$$\mu = \frac{n(n+1)}{4}$$ $$\sigma = \sqrt{\frac{n(n+1)(2n+1)}{24}}$$ where n is the number of paired values
  42. 42. Wilcoxon Matched Pairs Test for Large Samples (continued) <ul><li>Normal approximation for the Wilcoxon T Test Statistic: </li></ul>$$z = \frac{T - \dfrac{n(n+1)}{4}}{\sqrt{\dfrac{n(n+1)(2n+1)}{24}}}$$
  43. 43. Kruskal-Wallis One-Way ANOVA <ul><li>Tests the equality of more than 2 population medians </li></ul><ul><li>Assumptions: </li></ul><ul><ul><li>variables have a continuous distribution. </li></ul></ul><ul><ul><li>the data are at least ordinal. </li></ul></ul><ul><ul><li>samples are independent. </li></ul></ul><ul><ul><li>samples come from populations whose only possible difference is that at least one may have a different central location than the others. </li></ul></ul>
  44. 44. Kruskal-Wallis Test Procedure <ul><li>Obtain relative rankings for each value </li></ul><ul><ul><li>In event of tie, each of the tied values gets the average rank </li></ul></ul><ul><li>Sum the rankings for data from each of the k groups </li></ul><ul><ul><li>Compute the H test statistic </li></ul></ul>
  45. 45. Kruskal-Wallis Test Procedure (continued) <ul><li>The Kruskal-Wallis H test statistic: </li></ul><ul><li>$$H = \frac{12}{N(N+1)} \sum_{i=1}^{k} \frac{R_i^2}{n_i} - 3(N+1)$$ (with k – 1 degrees of freedom) </li></ul>where: N = Sum of sample sizes in all samples; k = Number of samples; R<sub>i</sub> = Sum of ranks in the i<sup>th</sup> sample; n<sub>i</sub> = Size of the i<sup>th</sup> sample
  46. 46. Kruskal-Wallis Test Procedure (continued) <ul><li>Complete the test by comparing the calculated H value to a critical χ<sup>2</sup> value from the chi-square distribution with k – 1 degrees of freedom </li></ul><ul><li>(The chi-square distribution is Appendix G) </li></ul><ul><li>Decision rule </li></ul><ul><ul><li>Reject H<sub>0</sub> if test statistic H > χ<sup>2</sup><sub>α</sub> </li></ul></ul><ul><ul><li>Otherwise do not reject H<sub>0</sub> </li></ul></ul>
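  47. 47. Kruskal-Wallis Example <ul><li>Do different departments have different class sizes? </li></ul>
      <table>
      <tr><th>Class size (Math, M)</th><th>Class size (English, E)</th><th>Class size (History, H)</th></tr>
      <tr><td>23</td><td>55</td><td>30</td></tr>
      <tr><td>41</td><td>60</td><td>40</td></tr>
      <tr><td>54</td><td>72</td><td>18</td></tr>
      <tr><td>78</td><td>45</td><td>34</td></tr>
      <tr><td>66</td><td>70</td><td>44</td></tr>
      </table>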
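  48. 48. Kruskal-Wallis Example (continued) <ul><li>Do different departments have different class sizes? </li></ul>
      <table>
      <tr><th>Class size (Math, M)</th><th>Ranking</th><th>Class size (English, E)</th><th>Ranking</th><th>Class size (History, H)</th><th>Ranking</th></tr>
      <tr><td>23</td><td>2</td><td>55</td><td>10</td><td>30</td><td>3</td></tr>
      <tr><td>41</td><td>6</td><td>60</td><td>11</td><td>40</td><td>5</td></tr>
      <tr><td>54</td><td>9</td><td>72</td><td>14</td><td>18</td><td>1</td></tr>
      <tr><td>78</td><td>15</td><td>45</td><td>8</td><td>34</td><td>4</td></tr>
      <tr><td>66</td><td>12</td><td>70</td><td>13</td><td>44</td><td>7</td></tr>
      <tr><td></td><td>Σ = 44</td><td></td><td>Σ = 56</td><td></td><td>Σ = 20</td></tr>
      </table>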
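  49. 49. Kruskal-Wallis Example (continued) <ul><li>The H statistic is </li></ul>$$H = \frac{12}{15(15+1)} \left( \frac{44^2}{5} + \frac{56^2}{5} + \frac{20^2}{5} \right) - 3(15+1) = 6.72$$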
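The H value can be recomputed from the ranked data with a short Python sketch. The helper names `average_ranks` and `kruskal_wallis_h` are illustrative. The sketch assumes the standard form H = 12/(N(N+1)) · Σ(R_i²/n_i) − 3(N+1), with no tie correction (this example has no ties).

```python
def average_ranks(values):
    """Rank values from low to high (1-based); tied values share the average rank."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 + (ordered.count(v) - 1) / 2 for v in values]


def kruskal_wallis_h(groups):
    """Return (list of rank sums, H statistic) for a list of samples."""
    pooled = [x for group in groups for x in group]
    ranks = average_ranks(pooled)
    n_total = len(pooled)
    rank_sums, start = [], 0
    for group in groups:
        # Ranks are in pooled order, so each group owns a contiguous slice.
        rank_sums.append(sum(ranks[start:start + len(group)]))
        start += len(group)
    weighted = sum(r ** 2 / len(g) for r, g in zip(rank_sums, groups))
    h = 12 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)
    return rank_sums, h


math_sizes = [23, 41, 54, 78, 66]
english_sizes = [55, 60, 72, 45, 70]
history_sizes = [30, 40, 18, 34, 44]
rank_sums, h = kruskal_wallis_h([math_sizes, english_sizes, history_sizes])
print(rank_sums, round(h, 2))  # [44.0, 56.0, 20.0] 6.72
```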
  50. 50. Kruskal-Wallis Example (continued) Compare H = 6.72 to the critical value from the chi-square distribution for k – 1 = 3 – 1 = 2 degrees of freedom and α = .05: χ<sup>2</sup><sub>.05</sub> = 5.991 <ul><li>Since H = 6.72 > χ<sup>2</sup><sub>.05</sub> = 5.991, </li></ul><ul><li>reject H<sub>0</sub> </li></ul>There is sufficient evidence to conclude that the population medians are not all equal
  51. 51. Kruskal-Wallis Correction <ul><li>If tied rankings occur, give each tied observation the mean of the ranks for which it is tied </li></ul><ul><li>The H statistic is influenced by ties, and should be corrected </li></ul><ul><li>Correction for tied rankings: </li></ul>$$1 - \frac{\sum_{i=1}^{g} (t_i^3 - t_i)}{N^3 - N}$$ where: g = Number of different groups of ties; t<sub>i</sub> = Number of tied observations in the i<sup>th</sup> tied group of scores; N = Total number of observations
  52. 52. H Statistic Corrected for Tied Rankings <ul><li>Corrected H statistic: </li></ul>$$H = \frac{\dfrac{12}{N(N+1)} \displaystyle\sum_{i=1}^{k} \frac{R_i^2}{n_i} - 3(N+1)}{1 - \dfrac{\sum_{i=1}^{g} (t_i^3 - t_i)}{N^3 - N}}$$
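A minimal sketch of the tie correction, assuming the standard factor 1 − Σ(t³ − t)/(N³ − N), by which the uncorrected H is divided. The function names and the pooled data are hypothetical and serve only to show how the tie groups are counted.

```python
from collections import Counter


def tie_correction_factor(pooled_values):
    """Return 1 - sum(t^3 - t) / (N^3 - N) over all groups of tied values."""
    n = len(pooled_values)
    # Each distinct value that occurs more than once is one tied group of size t.
    tie_sizes = [t for t in Counter(pooled_values).values() if t > 1]
    return 1 - sum(t ** 3 - t for t in tie_sizes) / (n ** 3 - n)


def corrected_h(h, pooled_values):
    """Divide an uncorrected H statistic by the tie correction factor."""
    return h / tie_correction_factor(pooled_values)


# Hypothetical pooled observations (not from the slides): ties of size 2 and 3.
pooled = [10, 12, 12, 15, 15, 15]
factor = tie_correction_factor(pooled)
print(factor)
```

With no ties the factor is exactly 1, so the corrected and uncorrected H coincide. With ties the factor is below 1, so the corrected H is slightly larger.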
