Chapter 07 Chi Square


  1. Chapter 6: Chi-Square Test for Categorical Variables
  2. 6.1 Basic logic of the $\chi^2$ test. Given a set of observed frequencies $A_1, A_2, A_3, \ldots$, we test whether the data follow a certain theory. If the theory is true, it yields a set of theoretical frequencies $T_1, T_2, T_3, \ldots$. We then compare $A_1, A_2, A_3, \ldots$ with $T_1, T_2, T_3, \ldots$: if they differ markedly, the theory may not be true; otherwise, the theory is acceptable.
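  3. 6.1.1 Chi-square distribution. The agreement between observed and expected frequencies is measured by $\chi^2 = \sum_{i=1}^{k} (A_i - T_i)^2 / T_i$, which approximately follows a $\chi^2$ distribution when the theory is true. For a frequency distribution with $k$ categories, DF $= k - 1 -$ (number of parameters estimated to obtain the theoretical frequencies). For a contingency table, DF $= (\text{rows} - 1)(\text{columns} - 1)$.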
  4. $\chi^2$ distribution
  5. 6.1.2 $\chi^2$ Test for Goodness of Fit (Large Sample)
     Table 1. Frequency distribution and goodness of fit based on 136 measurements of the phantom

     Interval   A    Φ(X1)     Φ(X2)     P(X)      T = n*P(X)   (A-T)^2/T
     1.228-      2   0.00069   0.00466   0.00397     0.5405     3.94143
     1.234-      2   0.00466   0.02275   0.01809     2.4601     0.08605
     1.240-      7   0.02275   0.08076   0.05801     7.8889     0.10016
     1.246-     17   0.08076   0.21186   0.13110    17.8294     0.03859
     1.252-     25   0.21186   0.42074   0.20888    28.4083     0.40892
     1.258-     37   0.42074   0.65542   0.23468    31.9167     0.80961
     1.264-     25   0.65542   0.84134   0.18592    25.2855     0.00322
     1.270-     16   0.84134   0.94520   0.10386    14.1244     0.24906
     1.276-      4   0.94520   0.98610   0.04090     5.5618     0.43858
     1.282-      1   0.98610   0.99744   0.01135     1.5434     0.19130
     Total       -   -         -         -           -          6.26692
  6. Hypothesis test for the goodness of fit:
     1. Setting up hypotheses. $H_0$: the population follows $N(1.26, 0.01^2)$; $H_1$: the population does not follow $N(1.26, 0.01^2)$; $\alpha = 0.05$.
     2. Calculation of the statistic: $\chi^2 = \sum (A - T)^2 / T = 6.26692$, the total of the last column of the table.
     3. $P$-value: $\nu = k - 1 - 2 = 10 - 1 - 2 = 7$. Since $6.27 < \chi^2_{0.05,7} = 14.07$, $P > 0.05$.
     4. Conclusion: at significance level $\alpha = 0.05$, $H_0$ is not rejected. The measurements are consistent with the normal distribution.
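The four steps above can be checked numerically. The following is a minimal sketch that recomputes the statistic from the A and T columns of the table; the function name and the hard-coded critical value 14.067 (the standard table value of $\chi^2_{0.05,7}$) are additions for illustration, not part of the slides.

```python
# Observed counts A and theoretical counts T = n * P(X), copied from Table 1.
observed = [2, 2, 7, 17, 25, 37, 25, 16, 4, 1]
expected = [0.5405, 2.4601, 7.8889, 17.8294, 28.4083,
            31.9167, 25.2855, 14.1244, 5.5618, 1.5434]


def chi_square_statistic(obs, exp):
    """Return the sum of (A - T)^2 / T over all intervals."""
    return sum((a - t) ** 2 / t for a, t in zip(obs, exp))


chi2 = chi_square_statistic(observed, expected)

# Degrees of freedom: k - 1 - (number of estimated parameters).
# The slides subtract 2, for the mean and the standard deviation.
k = len(observed)
df = k - 1 - 2

# Assumption: critical value taken from a standard chi-square table.
CRITICAL_0_05_DF7 = 14.067
reject_h0 = chi2 > CRITICAL_0_05_DF7

print(f"chi2 = {chi2:.4f}, df = {df}, reject H0: {reject_h0}")
```

Because T is rounded to four decimals in the table, the recomputed statistic agrees with 6.26692 only to about three decimals.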
  7. 6.2 Comparison between Two Independent Sample Proportions. In Chapter 4, the Z test can only be used to compare $\pi$ with a given $\pi_0$ (one sample) or to compare $\pi_1$ with $\pi_2$ (two samples). To compare more than two samples, the chi-square test is widely used.
  8. Example 6.1. In a clinical survey, 215 patients with pulmonary heart disease in a hospital were enrolled, of whom 164 had taken digitalis and 51 had not. Each patient received an ECG examination. The results are listed in Table 6.2.
  9.
  10.
  11. $\nu = 1$
  12. $\chi^2$ test and Z test. According to (4.25)
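The link between the two tests can be illustrated numerically: for a 2×2 table, the Pearson $\chi^2$ statistic equals the square of the pooled two-sample Z statistic. The sketch below assumes that (4.25) is the usual pooled-proportion Z test, which the transcript does not show; the counts are hypothetical.

```python
import math


def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-sample Z statistic for comparing two proportions."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se


def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-square for the 2x2 table [[a, b], [c, d]], no correction."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Hypothetical counts: 30 of 100 positive in sample 1, 45 of 100 in sample 2.
a, b, c, d = 30, 70, 45, 55
z = two_proportion_z(a, a + b, c, c + d)
chi2 = pearson_chi2_2x2(a, b, c, d)

print(f"Z^2 = {z ** 2:.6f}, chi2 = {chi2:.6f}")
```

The identity holds for any 2×2 table, so the two tests give the same two-sided conclusion with $\nu = 1$.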
  13. Correction for continuity. When $n \ge 40$ and some expected frequency satisfies $1 \le e_{ij} < 5$, the continuity-corrected $\chi^2$ statistic is used.
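As a sketch of what the correction does, the code below applies the usual Yates form, $\chi^2_c = \sum (|A - T| - 0.5)^2 / T$, to a 2×2 table. The slide's own formula is not preserved in the transcript, so the Yates form is an assumption, and the counts are hypothetical (chosen so that $n = 50$ and one expected frequency equals 4).

```python
def expected_counts(table):
    """Expected frequencies under independence: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi2_statistic(table, continuity=False):
    """Pearson chi-square; with continuity=True, subtract 0.5 from |A - T|."""
    shift = 0.5 if continuity else 0.0
    exp = expected_counts(table)
    total = 0.0
    for obs_row, exp_row in zip(table, exp):
        for a, t in zip(obs_row, exp_row):
            # Never let the corrected deviation go below zero.
            total += max(abs(a - t) - shift, 0.0) ** 2 / t
    return total


# Hypothetical 2x2 table with n = 50 and a smallest expected count of 4.
table = [[2, 18], [8, 22]]
plain = chi2_statistic(table)
corrected = chi2_statistic(table, continuity=True)

print(f"uncorrected = {plain:.4f}, corrected = {corrected:.4f}")
```

The corrected statistic is always the smaller of the two, which makes the test more conservative when expected counts are small.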
  14. Fisher's exact test. When $n < 40$ or some $e_{ij} < 1$, the $\chi^2$ test is not appropriate. Fisher's exact test gives an exact $P$ value on which to base the conclusion, and it is readily available in SPSS.
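For readers without SPSS, the exact $P$ value for a 2×2 table can be sketched directly from the hypergeometric distribution. The code below uses one common two-sided definition (summing the probabilities of all tables no more likely than the observed one); the function name and the counts are hypothetical, and other software may define the two-sided $P$ value slightly differently.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P value for the 2x2 table [[a, b], [c, d]]."""
    r1, r2 = a + b, c + d          # row totals
    c1 = a + c                     # first column total
    n = r1 + r2

    def prob(k):
        # Probability that the top-left cell equals k with all margins fixed.
        return comb(r1, k) * comb(r2, c1 - k) / comb(n, c1)

    lo, hi = max(0, c1 - r2), min(r1, c1)
    p_obs = prob(a)
    # Small relative tolerance guards against floating-point ties.
    p = sum(prob(k) for k in range(lo, hi + 1) if prob(k) <= p_obs * (1 + 1e-9))
    return min(p, 1.0)


# Hypothetical small sample (n = 8), far too small for the chi-square test.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(f"exact two-sided P = {p_value:.4f}")
```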
  15. Example 6.9
  16. Statistical description
  17. Statistical inference
  18. 6.3 The $\chi^2$ Tests for a Binary Variable under a Paired Design. Example 6.2. There are 260 serum samples. Each sample is divided into two parts, which are tested for rheumatoid factor by two different immunological methods. The results are listed in Table 6.4. The question is whether the results of the two methods are independent.
  19. Test for independence between two binary variables (Example 6.2): the proportions are 12/80 = 15% and 172/180 = 95.6%, and $\chi^2 = 173.74$.
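The statistic quoted for Example 6.2 can be reproduced with a short sketch. Table 6.4 itself is not preserved in the transcript, so the four cell counts below are inferred from the quoted fractions 172/180 and 12/80 and the total of 260; treat them, and the assignment of rows and columns to the two methods, as assumptions.

```python
def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-square for the 2x2 table [[a, b], [c, d]], no correction."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Inferred counts (assumption): 172 of 180 in one row and 12 of 80 in the other.
a, b = 172, 8     # row total 180
c, d = 12, 68     # row total 80

chi2 = pearson_chi2_2x2(a, b, c, d)
print(f"n = {a + b + c + d}, chi2 = {chi2:.2f}")
```

With $\nu = 1$ the critical value at $\alpha = 0.05$ is 3.84, so a statistic of this size rejects independence of the two methods' results.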
  20. 6.3.2 Comparison between two sample proportions. McNemar test: $\chi^2 = \frac{(b - c)^2}{b + c}$, with $\nu = 1$, where $b$ and $c$ are the two discordant cell counts.
  21. $H_0$: $\pi_1 = \pi_2$; $H_1$: $\pi_1 \ne \pi_2$; $\alpha = 0.05$. When $H_0$ is true and the sample is large ($b + c > 40$), the McNemar statistic approximately follows a $\chi^2$ distribution with $\nu = 1$. If $\chi^2 > \chi^2_{0.05,1}$, then reject $H_0$.
  22. The Probability Expressions. $H_0$: $\pi_{c1} = \pi_{r1}$; $H_1$: $\pi_{c1} \ne \pi_{r1}$. Since $\pi_{c1} = \pi_{11} + \pi_{21}$ and $\pi_{r1} = \pi_{11} + \pi_{12}$, this test becomes $H_0$: $\pi_{12} = \pi_{21}$; $H_1$: $\pi_{12} \ne \pi_{21}$.

                 Trt B +     Trt B -     Total
     Trt A +     π11 (a)     π12 (b)     πr1
     Trt A -     π21 (c)     π22 (d)     πr2
     Total       πc1         πc2         1.0
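  23. Correction to the McNemar test ($f_{12} + f_{21} < 40$): $\chi^2 = \frac{(|f_{12} - f_{21}| - 1)^2}{f_{12} + f_{21}} = 0.45$, where $f_{12}$ and $f_{21}$ are the observed discordant frequencies ($b$ and $c$).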
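A minimal sketch of the McNemar statistic, with and without the correction, follows. The discordant counts 8 and 12 are inferred from the fractions quoted for Example 6.2 (172/180 and 12/80) rather than read from the table, which is not preserved; they do reproduce the value 0.45 on the slide. The critical value 3.841 is the standard table value of $\chi^2_{0.05,1}$.

```python
def mcnemar_chi2(f12, f21, corrected=False):
    """McNemar chi-square from the two discordant counts of a paired 2x2 table."""
    diff = abs(f12 - f21)
    if corrected:
        # Continuity correction, used when f12 + f21 < 40.
        diff = max(diff - 1, 0)
    return diff ** 2 / (f12 + f21)


# Assumption: discordant counts inferred from the quoted fractions.
f12, f21 = 8, 12

uncorrected = mcnemar_chi2(f12, f21)
corrected = mcnemar_chi2(f12, f21, corrected=True)

# Assumption: critical value from a standard chi-square table, df = 1.
CRITICAL_0_05_DF1 = 3.841
reject_h0 = corrected > CRITICAL_0_05_DF1

print(f"uncorrected = {uncorrected:.2f}, corrected = {corrected:.2f}, "
      f"reject H0: {reject_h0}")
```

Only the discordant pairs enter the statistic; the concordant cells carry no information about a difference between the two proportions.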
  24. 6.4 The $\chi^2$ Test for R×C Contingency Table
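  25. The statistic for the hypothesis test: $\chi^2 = \sum (A - T)^2 / T$, with $\nu = (R - 1)(C - 1)$. The value 9.488 is the critical value $\chi^2_{0.05,4}$.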
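The R×C statistic can be sketched in a few lines. The code computes it in two algebraically equivalent ways, the sum of $(A - T)^2 / T$ and the shortcut $n\left(\sum A^2 / (n_R n_C) - 1\right)$, as a cross-check. The 3×3 table is hypothetical; it was chosen only because it gives $\nu = 4$, the degrees of freedom that go with the critical value 9.488.

```python
def chi2_rxc(table):
    """Pearson chi-square and degrees of freedom for an R x C table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, a in enumerate(row):
            t = row_totals[i] * col_totals[j] / n   # theoretical frequency
            stat += (a - t) ** 2 / t
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df


def chi2_rxc_shortcut(table):
    """Equivalent form: n * (sum of A^2 / (row total * column total) - 1)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    s = sum(a * a / (row_totals[i] * col_totals[j])
            for i, row in enumerate(table) for j, a in enumerate(row))
    return n * (s - 1)


# Hypothetical 3x3 table of counts.
table = [[20, 15, 15], [10, 20, 20], [10, 15, 25]]
chi2, df = chi2_rxc(table)
shortcut = chi2_rxc_shortcut(table)

CRITICAL_0_05_DF4 = 9.488
print(f"chi2 = {chi2:.4f}, df = {df}, reject H0: {chi2 > CRITICAL_0_05_DF4}")
```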
  26. 6.4.2 Multiple comparison for R×C table (table: groups I to VI and a control group, with columns + and -)
  27. 6.4.3 Measurement of association for R×C table
  28. Pearson contingency coefficient
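The slide's formula is not preserved, so the sketch below assumes the standard definition of the Pearson contingency coefficient, $C = \sqrt{\chi^2 / (\chi^2 + n)}$. As an illustration it uses the statistic and sample size quoted for Example 6.2 ($\chi^2 = 173.74$, $n = 260$); whether the slides apply the coefficient to that example is not known.

```python
import math


def pearson_contingency_coefficient(chi2, n):
    """C = sqrt(chi2 / (chi2 + n)); 0 means no association, and C is below 1."""
    return math.sqrt(chi2 / (chi2 + n))


# Values quoted for Example 6.2, reused here purely as an illustration.
coefficient = pearson_contingency_coefficient(173.74, 260)
print(f"C = {coefficient:.4f}")
```

Because $C$ can never reach 1, its maximum depends on the table's dimensions, which is worth keeping in mind when comparing tables of different sizes.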
  29. Prerequisites for the $\chi^2$ test. As a rule of thumb, the theoretical frequencies should be greater than 5 in more than 4/5 of the cells, and the theoretical frequency in every cell should be greater than 1. Otherwise, Fisher's exact test should be used.
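The rule of thumb can be turned into a small check. This is a sketch that follows the wording of the slide literally (strictly more than 4/5 of the cells above 5, and every cell above 1); the function name and the two tables are hypothetical.

```python
def chi_square_prerequisites_met(table):
    """Return True if the theoretical frequencies satisfy the rule of thumb."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    theoretical = [r * c / n for r in row_totals for c in col_totals]

    share_above_5 = sum(t > 5 for t in theoretical) / len(theoretical)
    all_above_1 = all(t > 1 for t in theoretical)
    return share_above_5 > 4 / 5 and all_above_1


# Hypothetical tables: a large sample and a very small one.
large_ok = chi_square_prerequisites_met([[30, 70], [45, 55]])
small_ok = chi_square_prerequisites_met([[3, 1], [1, 3]])

print("large table: chi-square test" if large_ok else "large table: Fisher's exact test")
print("small table: chi-square test" if small_ok else "small table: Fisher's exact test")
```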