Chap11 chie square & non parametrics

1,305 views

Published on

Modul Statistik Bisnis II

0 Comments
3 Likes
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total views
1,305
On SlideShare
0
From Embeds
0
Number of Embeds
3
Actions
Shares
0
Downloads
127
Comments
0
Likes
3
Embeds 0
No embeds

No notes for slide

Chap11 chie square & non parametrics

  1. 1. Statistics for Managers Using Microsoft® Excel 4th Edition Chapter 11 Chi-Square Tests and Nonparametric Tests Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 11-1
  2. 2. Chapter Goals After completing this chapter, you should be able to: Perform a χ2 test for the difference between two proportions  Use a χ2 test for differences in more than two proportions  Perform a χ2 test of independence  Apply and interpret the Wilcoxon rank sum test for the difference between two medians  Perform nonparametric analysis of variance using the Kruskal-Wallis rank test for one-way ANOVA Statistics for Managers Using Microsoft Excel, 4e © 2004 Chap 11-2 Prentice-Hall, Inc. 
  3. 3. Contingency Tables Contingency Tables  Useful in situations involving multiple population proportions  Used to classify sample observations according to two or more characteristics  Also called a cross-classification table. Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 11-3
  4. 4. Contingency Table Example Left-Handed vs. Gender Dominant Hand: Left vs. Right Gender: Male vs. Female  2 categories for each variable, so called a 2 x 2 table  Suppose we examine a sample of size 300 Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 11-4
  5. 5. Contingency Table Example (continued) Sample results organized in a contingency table: Hand Preference sample size = n = 300: 180 Males, 24 were left handed Left Right Female 12 108 120 Male 24 156 180 36 120 Females, 12 were left handed Gender 264 300 Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 11-5
  6. 6. χ2 Test for the Difference Between Two Proportions H0: p1 = p2 (Proportion of females who are left handed is equal to the proportion of males who are left handed) H1: p1 ≠ p2 (The two proportions are not the same – Hand preference is not independent of gender)  If H0 is true, then the proportion of left-handed females should be the same as the proportion of left-handed males  The two proportions above should be the same as the proportion of left-handed people overall Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 11-6
  7. 7. The Chi-Square Test Statistic The Chi-square test statistic is: ( fo − fe )2 χ2 = ∑ fe all cells  where: fo = observed frequency in a particular cell fe = expected frequency in a particular cell if H0 is true χ2 for the 2 x 2 case has 1 degree of freedom Statistics for Managers Using the contingency table has expected (Assumed: each cell in Microsoft Excel, 4e © of at least 5) frequency 2004 Chap 11-7 Prentice-Hall, Inc.
  8. 8. Decision Rule The χ2 test statistic approximately follows a chisquared distribution with one degree of freedom Decision Rule: If χ2 > χ2U, reject H0, otherwise, do not reject H0 α 0 Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Do not reject H0 Reject H0 χ2 χ2U Chap 11-8
  9. 9. Computing the Average Proportion X1 + X 2 X The average p= = proportion is: n1 + n2 n 120 Females, 12 were left handed 180 Males, 24 were left handed Here: 12 + 24 36 p= = = 0.12 120 + 180 300 i.e., the proportion Statistics for Managers Using of left handers overall is 12% Microsoft Excel, 4e © 2004 Chap 11-9 Prentice-Hall, Inc.
  10. 10. Finding Expected Frequencies  To obtain the expected frequency for left handed females, multiply the average proportion left handed (p) by the total number of females  To obtain the expected frequency for left handed males, multiply the average proportion left handed (p) by the total number of males If the two proportions are equal, then P(Left Handed | Female) = P(Left Handed | Male) = .12 i.e., we would expect (.12)(120) = 14.4 females to be left handed Statistics for Managers (.12)(180) = 21.6 males to be left handed Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 11-10
  11. 11. Observed vs. Expected Frequencies Hand Preference Gender Left Right Female Observed = 12 Expected = 14.4 Observed = 108 Expected = 105.6 120 Male Observed = 24 Expected = 21.6 Observed = 156 Expected = 158.4 180 264 300 36 Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 11-11
  12. 12. The Chi-Square Test Statistic Hand Preference Gender Left Right Female Observed = 12 Expected = 14.4 Observed = 108 Expected = 105.6 120 Male Observed = 24 Expected = 21.6 Observed = 156 Expected = 158.4 180 36 264 300 The test statistic is: ( fo − fe )2 χ = ∑ fe all cells 2 Statistics12 − 14.4)2 (108 − 105.6)2 (24 − 21.6)2 (156 − 158.4)2 for Managers Using ( = + + + = 0.6848 Microsoft Excel, 4e © 2004 14.4 105.6 21.6 158.4 Chap 11-12 Prentice-Hall, Inc.
  13. 13. Decision Rule 2 The test statistic is χ 2 = 0.6848 , χ U with 1 d.f. = 3.841 Decision Rule: If χ2 > 3.841, reject H0, otherwise, do not reject H0 α 0 Do not reject H0 Reject H0 χ2U=3.841 Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. χ2 Here, χ2 = 0.6848 < χ2U = 3.841, so we do not reject H0 and conclude that there is not sufficient evidence that the two proportions are different at α = .05 Chap 11-13
  14. 14. χ2 Test for the Differences in More Than Two Proportions  Extend the χ2 test to the case with more than two independent populations: H0: p1 = p2 = … = pc H1: Not all of the pj are equal (j = 1, 2, …, c) Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 11-14
  15. 15. The Chi-Square Test Statistic The Chi-square test statistic is: ( fo − fe )2 χ2 = ∑ fe all cells  where: fo = observed frequency in a particular cell of the 2 x c table fe = expected frequency in a particular cell if H0 is true χ2 for the 2 x c case has (2-1)(c-1) = c - 1 degrees of freedom Statistics for Managers Usingcontingency table has expected (Assumed: each cell in the Microsoft Excel, 4e at least 1) frequency of © 2004 Chap 11-15 Prentice-Hall, Inc.
  16. 16. Computing the Overall Proportion The overall proportion is:  X1 + X 2 +  + Xc X p= = n1 + n2 +  + nc n Expected cell frequencies for the c categories are calculated as in the 2 x 2 case, and the decision rule is the same: Decision Rule: If χ2 > χ2U, reject H0, otherwise, do not Statistics for H reject Managers Using 0 Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Where χ2U is from the chi-squared distribution with c – 1 degrees of freedom Chap 11-16
  17. 17. χ2 Test of Independence  Similar to the χ2 test for equality of more than two proportions, but extends the concept to contingency tables with r rows and c columns H0: The two categorical variables are independent (i.e., there is no relationship between them) H1: The two categorical variables are dependent (i.e., there is a relationship between them) Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 11-17
  18. 18. χ2 Test of Independence (continued) The Chi-square test statistic is: ( fo − fe )2 χ2 = ∑ fe all cells  where: fo = observed frequency in a particular cell of the r x c table fe = expected frequency in a particular cell if H0 is true χ2 for the r x c case has (r-1)(c-1) degrees of freedom Statistics for Managers in the contingency table has expected (Assumed: each cell Using frequency of © 2004 Microsoft Excel, 4eat least 1) Chap 11-18 Prentice-Hall, Inc.
  19. 19. Expected Cell Frequencies  Expected cell frequencies: row total × column total fe = n Where: row total = sum of all frequencies in the row column total = sum of all frequencies in the column n = overall sample size Statistics for Managers Using Microsoft Excel, 4e © 2004 Chap 11-19 Prentice-Hall, Inc.
  20. 20. Decision Rule  The decision rule is If χ2 > χ2U, reject H0, otherwise, do not reject H0 Where χ2U is from the chi-squared distribution with (r – 1)(c – 1) degrees of freedom Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 11-20
  21. 21. Example  The meal plan selected by 200 students is shown below: Number of meals per week 20/week 10/week Class Standing Fresh. 24 32 none Total 14 70 Soph. 22 26 12 60 Junior 10 14 6 30 Senior 14 16 Statistics for Managers Using 88 Total 70 Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. 10 40 42 200 Chap 11-21
  22. 22. Example (continued)  The hypothesis to be tested is: H0: Meal plan and class standing are independent (i.e., there is no relationship between them) H1: Meal plan and class standing are dependent (i.e., there is a relationship between them) Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 11-22
  23. 23. Example: Expected Cell Frequencies (continued) Observed: Class Standing Number of meals per week 20/wk 10/wk none Expected cell frequencies if H0 is true: Total Fresh. 24 32 14 70 Soph. 22 26 12 60 Junior 10 14 6 30 Senior 14 16 10 40 Class Standing Total 70 88 42 200 Fresh. 24.5 30.8 14.7 70 Soph. 21.0 26.4 12.6 60 Junior 10.5 13.2 6.3 30 Senior 14.0 17.6 8.4 40 70 88 42 200 Example for one cell: fe = row total × column total n Statistics for Managers Using 30 × 70 = = 10 5 Microsoft Excel, .4e © 2004 200 Prentice-Hall, Inc. Total Number of meals per week 20/wk 10/wk none Total Chap 11-23
  24. 24. Example: The Test Statistic (continued)  The test statistic value is: ( fo − fe )2 χ2 = ∑ fe all cells (24 − 24.5)2 (32 − 30.8)2 (10 − 8.4)2 = + ++ = 0.709 24.5 30.8 8. 4 χ2U = 12.592 for α = .05 from the chi-squared distribution with (4 – 1)(3 – 1) = 6 degrees of freedom Statistics for Managers Using Microsoft Excel, 4e © 2004 Chap 11-24 Prentice-Hall, Inc.
  25. 25. Example: Decision and Interpretation (continued) 2 The test statistic is χ 2 = 0.709 , χ U with 6 d.f. = 12.592 Decision Rule: If χ2 > 12.592, reject H0, otherwise, do not reject H0 α 0 Do not reject H0 Reject H0 χ2U=12.592 Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. χ2 Here, χ2 = 0.709 < χ2U = 12.592, so do not reject H0 Conclusion: there is not sufficient evidence that meal plan and class standing are related at α = .05 Chap 11-25
  26. 26. Wilcoxon Rank-Sum Test for Differences in 2 Medians  Test two independent population medians  Populations need not be normally distributed  Distribution free procedure  Used when only rank data are available  Must use normal approximation if either of the sample sizes is larger than 10 Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 11-26
  27. 27. Wilcoxon Rank-Sum Test: Small Samples  Can use when both n1 , n2 ≤ 10  Assign ranks to the combined n1 + n2 sample observations     If unequal sample sizes, let n1 refer to smaller-sized sample Smallest value rank = 1, largest value rank = n1 + n2 Assign average rank for ties Sum the ranks for each sample: T1 and T2 Statistics for Managers Using  Obtain test statistic, T (from smaller sample) 1 Microsoft Excel, 4e © 2004 Chap 11-27 Prentice-Hall, Inc.
  28. 28. Checking the Rankings   The sum of the rankings must satisfy the formula below Can use this to verify the sums T1 and T2 n(n + 1) T1 + T2 = 2 where n = n1 + n2 Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 11-28
  29. 29. Wilcoxon Rank-Sum Test: Hypothesis and Decision Rule M1 = median of population 1; M2 = median of population 2 Test statistic = T1 (Sum of ranks from smaller sample) Two-Tail Test H0: M1 = M2 H1: M1 ≠ M2 Reject Do Not Reject T1L Reject T1U Left-Tail Test H0: M1 ≥ M2 H1: M1 < M2 Reject Do Not Reject T1L Reject H0 T1 T1L Statisticsif for<Managers Using H0 if T1 < T1L Reject Microsoftif Excel, 4e © 2004 or T1 > T1U Prentice-Hall, Inc. Right-Tail Test H0: M1 ≤ M2 H1: M1 > M2 Do Not Reject Reject T1U Reject H0 if T1 > T1U Chap 11-29
  30. 30. Wilcoxon Rank-Sum Test: Small Sample Example Sample data is collected on the capacity rates (% of capacity) for two factories. Are the median operating rates for two factories the same?  For factory A, the rates are 71, 82, 77, 94, 88  For factory B, the rates are 85, 82, 92, 97 Test for equality of the sample medians Statistics for Managers Using at the 0.05 significance level Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 11-30
  31. 31. Wilcoxon Rank-Sum Test: Small Sample Example (continued) Ranked Capacity values: Capacity Factory A Rank Factory B Factory A 1 77 Tie in 3rd and 4th places 71 2 82 Factory B 3.5 82 3.5 85 5 88 6 92 94 97 Statistics for Managers Using Microsoft Excel, 4e © 2004 Rank Sums: Prentice-Hall, Inc. 7 8 9 20.5 24.5 Chap 11-31
  32. 32. Wilcoxon Rank-Sum Test: Small Sample Example (continued) Factory B has the smaller sample size, so the test statistic is the sum of the Factory B ranks: T1 = 24.5 The sample sizes are: n1 = 4 (factory B) n2 = 5 (factory A) Statistics for Managers Using Microsoft Excel, 4e © 2004 of significance is α = .05 The level Chap 11-32 Prentice-Hall, Inc.
  33. 33. Wilcoxon Rank-Sum Test: Small Sample Example (continued)  Lower and Upper Critical Values for T1 from Appendix table E.8: α n2 n1 OneTailed TwoTailed 4 5 .05 .10 12, 28 19, 36 .025 .05 11, 29 17, 38 .01 .02 10, 30 16, 39 .005 .01 --, -- 15, 40 4 5 6 Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. T1L = 11 and T1U = 29 Chap 11-33
  34. 34. Wilcoxon Rank-Sum Test: Small Sample Solution (continued)  α = .05 n1 = 4 , n2 = 5 Test Statistic (Sum of ranks from smaller sample): Two-Tail Test H0: M1 = M2 H1: M1 ≠ M2  T1 = 24.5 Reject Do Not Reject T1L=11 Reject T1U=29 Decision: Do not reject at α = 0.05 Conclusion: There is not enough evidence to Reject H0 T1 T1L=11 Statisticsif for<Managers Using prove that the medians are not Microsoftif Excel, =29 © 2004 equal. or T1 > T1U 4e Chap 11-34 Prentice-Hall, Inc.
  35. 35. Wilcoxon Rank-Sum Test (Large Sample)  For large samples, the test statistic T1 is approximately normal with mean μT and standard deviation σ T : 1 1 n1(n + 1) μT1 = 2  n1 n2 (n + 1) σ T1 = 12 Must use the normal approximation if either n1 or n2 > 10  Assign n to be the Statistics for Managers Usingsmaller of the two sample sizes 1 Microsoft Excel, 4e the2004  Can use © normal approximation for small samples Chap 11-35 Prentice-Hall, Inc.
  36. 36. Wilcoxon Rank-Sum Test (Large Sample) (continued)  The Z test statistic is Z=  T1 − μT1 σ T1 Where Z approximately follows a standardized normal distribution Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 11-36
  37. 37. Wilcoxon Rank-Sum Test: Normal Approximation Example Use the setting of the prior example: The sample sizes were: n1 = 4 (factory B) n2 = 5 (factory A) The level of significance was α = .05 The test statistic was T1 = 24.5 Statistics for Managers Using Microsoft Excel, 4e © 2004 Chap 11-37 Prentice-Hall, Inc.
  38. 38. Wilcoxon Rank-Sum Test: Normal Approximation Example (continued) μT1 = σ T1 =  n1(n + 1) 4(9 + 1) = = 20 2 2 n1 n2 (n + 1) = 12 4 (5) (9 + 1) = 2.739 12 The test statistic is T1 − μT1 24.5 − 20 Z= = = 1.64 σ T1 2.739 Z = 1.64 is not greater than the critical Z value of 1.96 (for for Managers Using Statisticsα = .05) so we do not reject H0 – there is not sufficient evidence that Microsoft Excel, 4e © 2004 the medians are not equal  Prentice-Hall, Inc. Chap 11-38
  39. 39. Kruskal-Wallis Rank Test    Tests the equality of more than 2 population medians Use when the normality assumption for oneway ANOVA is violated Assumptions: The samples are random and independent  variables have a continuous distribution  the data can be ranked  populations have the same variability  populations have the same shape Statistics for Managers Using Microsoft Excel, 4e © 2004 Chap 11-39 Prentice-Hall, Inc. 
  40. 40. Kruskal-Wallis Test Procedure  Obtain relative rankings for each value   In event of tie, each of the tied values gets the average rank Sum the rankings for data from each of the c groups  Compute the H test statistic Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 11-40
  41. 41. Kruskal - Wallis Test Procedure (continued)  The Kruskal - Wallis H test statistic: (with c – 1 degrees of freedom) c T2   12 j H= ∑ n  − 3(n + 1)  n(n + 1) j=1 j    where: n = sum of sample sizes in all samples c = Number of samples Tj = Sum of ranks in the jth sample nj = Size of the Managers Using jth sample Statistics for Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 11-41
  42. 42. Kruskal-Wallis Test Procedure (continued)  Complete the test by comparing the calculated H value to a critical χ2 value from the chi-square distribution with c – 1 degrees of freedom  Decision rule  α 0 χ2 Do not Reject H reject H Statistics for Managers Using χ2U Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Reject H0 if test statistic H > χ2U  Otherwise do not reject H0 0 0 Chap 11-42
  43. 43. Kruskal-Wallis Example  Do different departments have different class sizes? Class size (Math, M) Class size (English, E) Class size (Biology, B) 23 45 54 78 66 55 60 72 45 70 30 40 18 34 44 Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 11-43
  44. 44. Kruskal-Wallis Example  Do different departments have different class sizes? Class size Class size Class size Ranking Ranking (Math, M) (English, E) (Biology, B) 23 41 54 78 66 2 6 9 15 12 55 60 72 45 70 Σ = 44 Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. 10 11 14 8 13 Σ = 56 30 40 18 34 44 Ranking 3 5 1 4 7 Σ = 20 Chap 11-44
  45. 45. Kruskal-Wallis Example (continued) H0 : MedianM = MedianE = MedianH HA : Not all population Medians are equal  The H statistic is c R2   12 j H=  ∑ n  − 3(n + 1)  n(n + 1) j=1 j      44 2 56 2 20 2  12  =  5 + 5 + 5  − 3(15 + 1) = 6.72  15(15 + 1)    Statistics for Managers Using Microsoft Excel, 4e © 2004 Chap 11-45 Prentice-Hall, Inc.
  46. 46. Kruskal-Wallis Example (continued)  Compare H = 6.72 to the critical value from the chi-square distribution for 5 – 1 = 4 degrees of freedom and α = .05: χ = 9.4877 2 U 2 χU = 9.4877 Since H = 6.72 < do not reject H0 There is not sufficient evidence to reject Statistics for Managers Using that the 2004 Microsoft Excel, 4e ©population medians are all equal Chap 11-46 Prentice-Hall, Inc.
  47. 47. Chapter Summary     Developed and applied the χ2 test for the difference between two proportions Developed and applied the χ2 test for differences in more than two proportions Examined the χ2 test for independence Used the Wilcoxon rank sum test for two population medians   Small Samples Large sample Z approximation Applied the Kruskal - Wallis H-test for multiple Statistics for Managers Using population medians  Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 11-47

×