# Chapter 07 Chi Square

1. 1. <ul><li>Chapter 6 </li></ul><ul><li>Chi-Square Test for Categorical Variable </li></ul>
2. 2. 6.1 Basic logic of the χ² test <ul><li>Given a set of observed frequencies A₁, A₂, A₃, …, we want to test whether the data follow a certain theory. </li></ul><ul><li>If the theory is true, we obtain a set of theoretical frequencies T₁, T₂, T₃, … </li></ul><ul><li>Compare A₁, A₂, A₃, … with T₁, T₂, T₃, … </li></ul><ul><li>If they differ substantially, the theory might not be true; otherwise, the theory is acceptable. </li></ul>
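3. 3. 6.1.1 Chi-square distribution <ul><li>The test statistic ~ χ² distribution </li></ul><ul><li>It measures the agreement between observed and expected frequencies. </li></ul><ul><li>DF = k − 1 − (number of parameters estimated for f<sub>i</sub>) </li></ul><ul><li>For a contingency table, DF = (number of rows − 1)(number of columns − 1) </li></ul>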
4. 4. χ² distribution
5. 5. 6.1.2 χ² Test for Goodness of Fit (Large Sample)

   Table 1. Frequency distribution and goodness of fit based on 136 measurements of the phantom

   | Interval | A | Φ(X₁) | Φ(X₂) | P(X) | T = n·P(X) | (A − T)²/T |
   |---|---|---|---|---|---|---|
   | 1.228- | 2 | 0.00069 | 0.00466 | 0.00397 | 0.5405 | 3.94143 |
   | 1.234- | 2 | 0.00466 | 0.02275 | 0.01809 | 2.4601 | 0.08605 |
   | 1.240- | 7 | 0.02275 | 0.08076 | 0.05801 | 7.8889 | 0.10016 |
   | 1.246- | 17 | 0.08076 | 0.21186 | 0.13110 | 17.8294 | 0.03859 |
   | 1.252- | 25 | 0.21186 | 0.42074 | 0.20888 | 28.4083 | 0.40892 |
   | 1.258- | 37 | 0.42074 | 0.65542 | 0.23468 | 31.9167 | 0.80961 |
   | 1.264- | 25 | 0.65542 | 0.84134 | 0.18592 | 25.2855 | 0.00322 |
   | 1.270- | 16 | 0.84134 | 0.94520 | 0.10386 | 14.1244 | 0.24906 |
   | 1.276- | 4 | 0.94520 | 0.98610 | 0.04090 | 5.5618 | 0.43858 |
   | 1.282- | 1 | 0.98610 | 0.99744 | 0.01135 | 1.5434 | 0.19130 |
   | Total | - | - | - | - | - | 6.26692 |
6. 6. <ul><li>1. Setting up hypotheses: </li></ul><ul><li>H₀: the population follows N(1.26, 0.01²) </li></ul><ul><li>H₁: the population does not follow N(1.26, 0.01²); α = 0.05 </li></ul><ul><li>2. Calculation of the statistic: $\chi^2 = \sum (A - T)^2 / T = 6.267$ (the total of the last column of Table 1) </li></ul><ul><li>3. P-value: ν = k − 1 − 2 = 10 − 1 − 2 = 7, so P > 0.05 </li></ul><ul><li>4. Conclusion: At significance level α = 0.05, H₀ is not rejected. The measurements are consistent with a normal distribution. </li></ul>
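
The four steps above can be retraced numerically. The following is a minimal sketch, assuming the class limits are equally spaced 0.006 apart (so the last class ends at 1.288, which matches the Φ(X₂) = 0.99744 entry in Table 1). The variable names are illustrative, and the critical value 14.067 is the tabulated χ² value for α = 0.05 and ν = 7.

```python
import math

def normal_cdf(z):
    """Standard normal CDF Φ(z), computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Lower class limits and observed frequencies A, copied from Table 1.
lower_limits = [1.228, 1.234, 1.240, 1.246, 1.252,
                1.258, 1.264, 1.270, 1.276, 1.282]
observed = [2, 2, 7, 17, 25, 37, 25, 16, 4, 1]
class_width = 0.006      # assumption: equal spacing between lower limits
mu, sigma = 1.26, 0.01   # parameters of the hypothesised normal distribution

n = sum(observed)        # 136 measurements

# Theoretical frequency T = n * P(X) for each class.
expected = []
for lo in lower_limits:
    p = normal_cdf((lo + class_width - mu) / sigma) - normal_cdf((lo - mu) / sigma)
    expected.append(n * p)

# Chi-square statistic: sum of (A - T)^2 / T over all classes.
chi_sq = sum((a - t) ** 2 / t for a, t in zip(observed, expected))

# Degrees of freedom: k - 1 - 2 (two estimated parameters).
df = len(observed) - 1 - 2

CRITICAL_0_05_DF7 = 14.067   # tabulated chi-square critical value, alpha = 0.05, df = 7
reject_h0 = chi_sq > CRITICAL_0_05_DF7

print(f"chi-square = {chi_sq:.3f}, df = {df}, reject H0: {reject_h0}")
```

The statistic comes out close to the table total of 6.26692, well below the critical value, so H₀ is not rejected.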
7. 7. <ul><li>6.2 Comparison between Two Independent Sample Proportions </li></ul><ul><li>In Chapter 4, the Z test can only be used for comparing π with a given π₀ (one sample) or comparing π₁ with π₂ (two samples). </li></ul><ul><li>If we need to compare more than two samples, the chi-square test is widely used. </li></ul>
8. 8. Example 6.1 <ul><li>In a clinical survey, 215 patients with pulmonary heart disease were recruited in a hospital, of whom 164 had taken digitalis and 51 had not. Each patient received an ECG examination. The results are listed in Table 6.2. </li></ul>
9. 9.
10. 10.
11. 11. ν = 1
12. 12. χ² test and Z test <ul><li>According to (4.25) </li></ul>
13. 13. Correction for continuity <ul><li>When n ≥ 40 and some expected frequency satisfies 1 ≤ e<sub>ij</sub> < 5, the continuity-corrected χ² statistic should be used. </li></ul>
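
The formula on this slide did not survive extraction. A commonly used form of the continuity (Yates) correction for a 2×2 table is sketched here with observed frequencies A and theoretical frequencies T, as in Section 6.1. Whether the slide used exactly this notation is an assumption.

$$
\chi^2_c = \sum \frac{\left(\lvert A - T \rvert - 0.5\right)^2}{T}, \qquad \nu = 1
$$
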
14. 14. Fisher’s exact test <ul><li>When n < 40, or some e<sub>ij</sub> < 1, the χ² test is not appropriate. Fisher’s exact test yields an exact P value on which to base the conclusion. </li></ul><ul><li>This is easily done in SPSS. </li></ul>
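
For readers without SPSS, the exact P value can also be computed directly from the hypergeometric distribution. The sketch below is one common two-sided definition (sum the probabilities of all tables that are no more likely than the observed one), and SPSS may report additional variants. The function name and the counts are hypothetical and are not taken from the slides.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability that the top-left cell equals x
        # when all row and column totals are held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    x_min = max(0, col1 - row2)
    x_max = min(row1, col1)

    # Add up every table whose probability does not exceed the observed one
    # (a small relative tolerance guards against floating-point ties).
    return sum(p for p in (prob(x) for x in range(x_min, x_max + 1))
               if p <= p_observed * (1 + 1e-9))

# Hypothetical counts with n = 8 < 40; not data from the slides.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(f"exact two-sided P = {p_value:.4f}")
```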
15. 15. Example 6.9
16. 16. Statistical description
17. 17. Statistical inference
18. 18. 6.3 The χ² Tests for Binary Variable under a Paired Design <ul><li>Example 6.2 There are 260 serum samples. Each sample is divided into two parts, which are tested by two different immunological methods for rheumatoid factor. The results are listed in Table 6.4. The question is whether the results of the two methods are independent. </li></ul>
19. 19. Example 6.2: test for independence between two binary variables <ul><li>χ² = 173.74 </li></ul><ul><li>12/80 = 15%, 172/180 ≈ 95.6% </li></ul>
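
Table 6.4 itself is not in this transcript. As an assumption, the sketch below uses the cell counts a = 172, b = 8, c = 12, d = 68, reconstructed from the reported fractions 172/180 and 12/80 and the total of 260 samples. These counts reproduce the reported χ² = 173.74, but they remain a reconstruction rather than the source table. The function name is illustrative.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    # Shortcut formula for a fourfold table:
    # n * (ad - bc)^2 / (row1 * row2 * col1 * col2)
    chi_sq = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi_sq / 2.0))
    return chi_sq, p_value

# Reconstructed counts (assumption, see the note above).
chi_sq, p_value = chi_square_2x2(172, 8, 12, 68)
print(f"chi-square = {chi_sq:.2f}, P = {p_value:.3g}")
```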
20. 20. 6.3.2 Comparison between two sample proportions <ul><li>McNemar test: $\chi^2 = \dfrac{(b - c)^2}{b + c}$, ν = 1 </li></ul>
21. 21. <ul><li>H₀: π₁ = π₂, H₁: π₁ ≠ π₂, α = 0.05 </li></ul><ul><li>When H₀ is true and the sample is large (b + c > 40), the McNemar statistic approximately follows a χ² distribution with ν = 1. </li></ul><ul><li>If χ² > χ²<sub>0.05</sub>, then reject H₀. </li></ul>
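22. 22. The Probability Expressions. H₀: π<sub>c1</sub> = π<sub>r1</sub>, H₁: π<sub>c1</sub> ≠ π<sub>r1</sub>. Since π<sub>c1</sub> = π₁₁ + π₂₁ and π<sub>r1</sub> = π₁₁ + π₁₂, this test becomes H₀: π₁₂ = π₂₁, H₁: π₁₂ ≠ π₂₁.

    | Trt A \ Trt B | + | − | Total |
    |---|---|---|---|
    | + | π₁₁ (a) | π₁₂ (b) | π<sub>r1</sub> |
    | − | π₂₁ (c) | π₂₂ (d) | π<sub>r2</sub> |
    | Total | π<sub>c1</sub> | π<sub>c2</sub> | 1.0 |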
23. 23. Correction to McNemar test (b + c = f₁₂ + f₂₁ < 40): $\chi^2 = \dfrac{(\lvert b - c \rvert - 1)^2}{b + c} = 0.45$
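
A minimal sketch of the McNemar test with and without the correction follows. The discordant counts b = 8 and c = 12 are an assumption: they are reconstructed from the fractions reported for Example 6.2, not read from Table 6.4, though they do reproduce the value 0.45 shown on this slide. The function name and the handling of the boundary case b + c = 40 are illustrative choices.

```python
import math

def mcnemar(b, c, correction=None):
    """McNemar chi-square for paired binary data, from the discordant counts b and c.

    If `correction` is None, the continuity correction is applied when b + c < 40.
    """
    if correction is None:
        correction = (b + c) < 40
    diff = abs(b - c)
    if correction:
        diff = max(diff - 1, 0)   # |b - c| - 1, floored at zero
    chi_sq = diff ** 2 / (b + c)
    # With 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi_sq / 2.0))
    return chi_sq, p_value

# Reconstructed discordant counts (assumption, see the note above).
chi_sq, p_value = mcnemar(8, 12)
print(f"corrected chi-square = {chi_sq:.2f}, P = {p_value:.3f}")
```

Since the P value is far above 0.05, H₀: π₁₂ = π₂₁ would not be rejected for these counts.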
24. 24. 6.4 The χ² Test for R×C Contingency Table
25. 25. The statistic for hypothesis test: χ² = … (formula not recovered); the slide also shows the value 9.488
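
The formula on this slide was lost in extraction. The usual Pearson statistic for an R×C table, written in the A (observed) and T (theoretical) notation of Section 6.1, is given here as an assumption about what the slide showed. The symbols $n_{i\cdot}$ and $n_{\cdot j}$ (row and column totals) are introduced for illustration.

$$
\chi^2 = \sum_{i=1}^{R} \sum_{j=1}^{C} \frac{(A_{ij} - T_{ij})^2}{T_{ij}}, \qquad
T_{ij} = \frac{n_{i\cdot}\, n_{\cdot j}}{n}, \qquad
\nu = (R - 1)(C - 1)
$$

The value 9.488 coincides with the tabulated critical value $\chi^2_{0.05}$ for ν = 4, so it may be a critical value rather than a computed statistic.
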
26. 26. 6.4.2 Multiple comparison for R×C Table [Table: rows for the control group and groups I–VI, columns + and −; cell values not recovered]
27. 27. 6.4.3 Measurement of association for R×C table
28. 28. Pearson contingency coefficient
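
The slide body is missing. The standard definition of the Pearson contingency coefficient, with n the total sample size, is sketched here for reference. That the slide used this exact form is an assumption.

$$
C = \sqrt{\frac{\chi^2}{\chi^2 + n}}
$$

C is 0 when the two variables are independent and is always less than 1.
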
29. 29. <ul><li>Prerequisites for the χ² test </li></ul><ul><li>As a rule of thumb: </li></ul><ul><li>The theoretical frequencies should be greater than 5 in more than 4/5 of the cells; </li></ul><ul><li>The theoretical frequency in any cell should be greater than 1. </li></ul><ul><li>Otherwise, Fisher’s exact test should be used. </li></ul>
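
These rules of thumb can be checked before running the test. The sketch below computes the theoretical frequencies of a contingency table from its margins and applies the two conditions literally as stated above. The function names and the example tables are hypothetical.

```python
def expected_frequencies(table):
    """Theoretical frequencies T_ij = (row total * column total) / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def chi_square_applicable(table):
    """True if both rule-of-thumb conditions for the chi-square test hold."""
    cells = [t for row in expected_frequencies(table) for t in row]
    count_above_5 = sum(1 for t in cells if t > 5)
    # Condition 1: more than 4/5 of the cells have T > 5
    # (integer arithmetic avoids floating-point comparison issues).
    mostly_large = 5 * count_above_5 > 4 * len(cells)
    # Condition 2: every cell has T > 1.
    none_tiny = min(cells) > 1
    return mostly_large and none_tiny

# Hypothetical tables, not data from the slides.
large_sample = [[20, 15], [30, 35]]
small_sample = [[3, 1], [1, 3]]
print(chi_square_applicable(large_sample))
print(chi_square_applicable(small_sample))
```

When the check fails, as for the second table, Fisher’s exact test is the fallback named on the slide.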