# Chi Squared Test

Notes on the chi squared test material in the OCR A Level Mathematics unit S3.

Published in: Education, Technology

### Chi Squared Test

1. Statistics 3: The Chi Squared (χ²) Test - Lesson 1 - Key Learning Points/Vocabulary:
   - What is the purpose of the chi squared test?
   - Conducting a chi squared test on uniformly distributed expected frequencies.
   - Goodness of fit of a Binomial model.
2. Purpose of the Chi Squared Test: The purpose of the chi squared test is to see whether observed experimental data are a 'good fit' to the theoretical expected results. The χ² statistic is calculated from the observed frequencies O and the expected frequencies E in the following way:

   $$\chi^2 = \sum \frac{(O - E)^2}{E}$$
3. Figure: probability density function of the chi squared distribution. Source: http://en.wikipedia.org/wiki/File:Chi-square_distributionPDF.png
4. Notes on the Chi Squared Distribution
   1. ν, the number of degrees of freedom, is the number of classes minus the number of restrictions.
   2. A restriction is any value that is derived from the observed data set, such as the total frequency or an estimated parameter.
   3. The chi squared distribution is continuous and therefore gives a poor approximation when frequencies are small. When calculating χ², combine any classes whose expected frequency is less than 5.
   4. As with most statistical distributions, there is no need to calculate critical values by hand, as they are tabulated for easy reference.
5. Lesson 1 - Example Question I: The table below shows the results when a die is rolled 120 times. Conduct a chi squared test at the 5% significance level to see whether the die is fair.

   | Score | 1 | 2 | 3 | 4 | 5 | 6 |
   |---|---|---|---|---|---|---|
   | Freq | 15 | 29 | 14 | 18 | 20 | 24 |
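
As a sketch of how the die example could be worked through, the snippet below computes the statistic against a uniform expected frequency of 20 per score. The variable names are illustrative, and the critical value 11.07 (5% level, ν = 5) is taken from standard chi squared tables rather than from the slides.

```python
# Observed frequencies for scores 1..6, taken from the table on this slide.
observed = [15, 29, 14, 18, 20, 24]

# Under H0 (the die is fair) every score is equally likely, so E = 120 / 6 = 20.
total = sum(observed)
expected = [total / len(observed)] * len(observed)

# Test statistic: sum of (O - E)^2 / E over all classes.
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# One restriction (the expected total must match 120), so nu = 6 - 1 = 5.
dof = len(observed) - 1

# Assumption: upper 5% critical value for nu = 5, read from standard tables.
CRITICAL_5_PERCENT_NU_5 = 11.07

reject_h0 = chi_sq > CRITICAL_5_PERCENT_NU_5
print(f"chi-squared = {chi_sq:.2f}, nu = {dof}, reject H0: {reject_h0}")
```

The statistic comes to 8.1, which is below 11.07, so at the 5% level there is no significant evidence that the die is unfair.
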
6. Lesson 1 - Example Question II: The table below shows the results of an experiment in which four coins were thrown 160 times and the number of heads recorded. a.) Fit a Bin(4, ½) distribution to the data. b.) Test the goodness of fit of the Bin(4, ½) model using a chi squared test at the 5% significance level.

   | Score (number of heads) | 0 | 1 | 2 | 3 | 4 |
   |---|---|---|---|---|---|
   | Freq | 15 | 46 | 54 | 35 | 10 |
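
One way to approach the four-coin example is sketched below: part (a) scales the Bin(4, ½) probabilities up to 160 throws, and part (b) compares them with the observed frequencies. The names are illustrative, and the critical value 9.488 (5% level, ν = 4) is an assumption taken from standard tables.

```python
from math import comb

# Observed frequencies for 0, 1, 2, 3 and 4 heads, from the table on this slide.
observed = [15, 46, 54, 35, 10]
n_coins, p_head = 4, 0.5
total = sum(observed)  # 160 throws

# Part (a): expected frequency for k heads is 160 * P(X = k), X ~ Bin(4, 1/2).
expected = [
    total * comb(n_coins, k) * p_head**k * (1 - p_head) ** (n_coins - k)
    for k in range(n_coins + 1)
]

# Part (b): test statistic, sum of (O - E)^2 / E.
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# p is given rather than estimated from the data, so the only
# restriction is the total: nu = 5 - 1 = 4.
dof = len(observed) - 1

# Assumption: upper 5% critical value for nu = 4, read from standard tables.
CRITICAL_5_PERCENT_NU_4 = 9.488
reject_h0 = chi_sq > CRITICAL_5_PERCENT_NU_4
print(expected, round(chi_sq, 3), dof, reject_h0)
```

The expected frequencies are 10, 40, 60, 40 and 10, all at least 5, so no classes need combining. The statistic is 4.625, below 9.488, so the Bin(4, ½) model is not rejected at the 5% level.
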
7. Practice Questions: Statistics 3 and 4 by Jane Miller, Page 121, Exercise 5A, Questions 1 and 3.
8. Statistics 3: The Chi Squared (χ²) Test - Lesson 2 - Key Learning Points/Vocabulary:
   - Is the National Lottery fair?
   - Are the digits of π random?
9. Statistics 3: The Chi Squared (χ²) Test - Lesson 3 - Key Learning Points/Vocabulary:
   - Goodness of fit of a Poisson model.
   - Goodness of fit to a ratio model.
   - Goodness of fit of a Geometric model.
10. Lesson 3 - Example Question: The table below shows the number of calls arriving at a switchboard in time intervals of 5 minutes. Test at the 5% significance level whether the Poisson distribution provides a good model for this data.

    | No. of calls | 0 | 1 | 2 | 3 | 4 or more |
    |---|---|---|---|---|---|
    | Freq | 71 | 23 | 4 | 2 | 0 |
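
The switchboard example can be sketched as follows, assuming the Poisson mean is estimated from the data (the slide gives no value, so this is an assumption). Classes are pooled from the upper tail until every expected frequency is at least 5, and two restrictions are counted: the total and the estimated mean. The critical value 3.841 (5% level, ν = 1) comes from standard tables.

```python
from math import exp, factorial

# Observed frequencies for 0, 1, 2, 3 and "4 or more" calls per interval.
observed = [71, 23, 4, 2, 0]
total = sum(observed)  # 100 intervals

# Estimate the Poisson mean from the data; the "4 or more" class is empty,
# so treating it as exactly 4 does not change the estimate.
mean = sum(k * f for k, f in enumerate(observed)) / total

# Expected frequencies for 0..3 calls, plus the open-ended tail class.
probs = [exp(-mean) * mean**k / factorial(k) for k in range(4)]
probs.append(1 - sum(probs))  # P(X >= 4)
expected = [total * p for p in probs]

# Pool classes from the right until the last expected frequency is >= 5.
obs_pooled, exp_pooled = list(observed), list(expected)
while len(exp_pooled) > 1 and exp_pooled[-1] < 5:
    tail_e = exp_pooled.pop()
    tail_o = obs_pooled.pop()
    exp_pooled[-1] += tail_e
    obs_pooled[-1] += tail_o

chi_sq = sum((o - e) ** 2 / e for o, e in zip(obs_pooled, exp_pooled))

# Two restrictions: the total frequency and the estimated mean.
dof = len(obs_pooled) - 2

# Assumption: upper 5% critical value for nu = 1, read from standard tables.
CRITICAL_5_PERCENT_NU_1 = 3.841
reject_h0 = chi_sq > CRITICAL_5_PERCENT_NU_1
print(obs_pooled, [round(e, 2) for e in exp_pooled], round(chi_sq, 3), dof)
```

With an estimated mean of 0.37, the classes for 2, 3 and 4 or more calls are pooled into a single "2 or more" class, leaving three classes and ν = 1. The statistic is roughly 0.38, well below 3.841, so the Poisson model is not rejected.
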
11. Practice Questions: Statistics 3 and 4 by Jane Miller, Page 121, Exercise 5A:
    - Question 11, part a only (Poisson)
    - Question 2 (Ratio)
    - Question 12 (Geometric)
    - Question 13 (Poisson)
12. Statistics 3: The Chi Squared (χ²) Test - Lesson 4 - Key Learning Points/Vocabulary:
    - Goodness of fit of a Normal model.
13. Lesson 4 - Example Question: The height in centimetres gained by a conifer in its first year after planting is denoted by the random variable H. The value of H is measured for a random sample of 86 conifers and the results obtained are summarised in the table below. Assuming that H is modelled by a N(50, 15²) distribution, test the goodness of fit of the model at the 5% level.

    | H | <35 | 35-45 | 45-55 | 55-65 | >65 |
    |---|---|---|---|---|---|
    | Obs Freq | 10 | 18 | 28 | 18 | 12 |
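
A minimal sketch of the conifer example follows. It evaluates the N(50, 15²) cumulative distribution at the class boundaries using `math.erf`, converts the class probabilities into expected frequencies, and then forms the statistic. The helper name `normal_cdf` is illustrative, and the critical value 9.488 (5% level, ν = 4) is an assumption taken from standard tables.

```python
from math import erf, sqrt

MU, SIGMA = 50.0, 15.0  # model parameters given in the question


def normal_cdf(x, mu=MU, sigma=SIGMA):
    """P(H <= x) for H ~ N(mu, sigma^2)."""
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))


# Observed frequencies for H < 35, 35-45, 45-55, 55-65 and H > 65.
observed = [10, 18, 28, 18, 12]
total = sum(observed)  # 86 conifers
cuts = [35, 45, 55, 65]

# Cumulative probabilities at the boundaries, with 0 and 1 at the open ends.
cdf = [0.0] + [normal_cdf(c) for c in cuts] + [1.0]
expected = [total * (cdf[i + 1] - cdf[i]) for i in range(len(observed))]

chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Mean and variance are given, not estimated, so the only restriction
# is the total: nu = 5 - 1 = 4.
dof = len(observed) - 1

# Assumption: upper 5% critical value for nu = 4, read from standard tables.
CRITICAL_5_PERCENT_NU_4 = 9.488
reject_h0 = chi_sq > CRITICAL_5_PERCENT_NU_4
print([round(e, 2) for e in expected], round(chi_sq, 3), dof, reject_h0)
```

Every expected frequency is above 5, so no pooling is needed. The statistic is about 2.54, below 9.488, so the N(50, 15²) model is not rejected at the 5% level.
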
14. Practice Questions: Statistics 3 and 4 by Jane Miller, Page 128, Exercise 5B, Question 1 onwards.
15. Statistics 3: The Chi Squared (χ²) Test - Lesson 5 - Key Learning Points/Vocabulary:
    - Goodness of fit test on a contingency table.
    - Combining rows/columns when expected frequencies < 5.
    - Applying Yates correction to a 2 x 2 table.
16. Degrees of Freedom: The number of degrees of freedom in an h x k contingency table is given by ν = (h − 1) x (k − 1).
17. Lesson 5 - Example Question I: Is income level independent of method of transport?

    | Income Level | Car | Public | Self | Total |
    |---|---|---|---|---|
    | Small | 58 | 21 | 36 | 115 |
    | Average | 199 | 49 | 64 | 312 |
    | Large | 205 | 32 | 29 | 266 |
    | Total | 462 | 102 | 129 | 693 |

    The columns give the method of transport.
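
The contingency-table calculation can be sketched as below. Each expected frequency is (row total x column total) / grand total, and ν = (h − 1) x (k − 1). The function name is illustrative. The slide does not state a significance level, so comparing against the 5% critical value 9.488 (ν = 4, from standard tables) is an assumption.

```python
def chi_squared_independence(table):
    """Return (statistic, degrees of freedom) for an h x k table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            # Expected count if rows and columns are independent.
            exp = row_totals[i] * col_totals[j] / grand_total
            stat += (obs - exp) ** 2 / exp

    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


# Rows: Small, Average, Large income. Columns: Car, Public, Self.
transport = [
    [58, 21, 36],
    [199, 49, 64],
    [205, 32, 29],
]

chi_sq, dof = chi_squared_independence(transport)

# Assumption: 5% level, nu = 4, critical value from standard tables.
CRITICAL_5_PERCENT_NU_4 = 9.488
reject_h0 = chi_sq > CRITICAL_5_PERCENT_NU_4
print(round(chi_sq, 2), dof, reject_h0)
```

The statistic is about 30.8 with ν = 4, far above 9.488, so under this assumed 5% level the hypothesis that income level and method of transport are independent would be rejected.
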
18. Lesson 5 - Example Question II: A university sociology department believes that students with a good grade in A Level General Studies tend to do well on sociology degree courses. To check this it has collected information on a random sample of 100 students who had just graduated and who had also taken General Studies at A Level. The students' performance in General Studies was divided into two categories, those with grades A or B and 'others'. Their degrees were recorded as Class I, II, III or fail. The data are given in the table below. Test at the 1% level the hypothesis that degree performance is independent of A Level performance in General Studies.

    | | Class I | Class II | Class III | Fail | Total |
    |---|---|---|---|---|---|
    | Grade A or B | 11 | 22 | 6 | 1 | 40 |
    | Others | 4 | 28 | 24 | 4 | 60 |
    | Total | 15 | 50 | 30 | 5 | 100 |
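
This example needs columns combining, because the expected frequencies in the Fail column are below 5. The sketch below merges Fail into Class III before computing the statistic. Merging those two adjacent columns is one reasonable choice, not necessarily the one intended by the slides. The function names are illustrative, and the critical value 9.210 (1% level, ν = 2) is taken from standard tables.

```python
def expected_table(table):
    """Expected counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_squared(table):
    """Sum of (O - E)^2 / E over every cell of the table."""
    exp = expected_table(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, exp)
        for o, e in zip(obs_row, exp_row)
    )


# Rows: Grade A or B, Others. Columns: Class I, Class II, Class III, Fail.
observed = [
    [11, 22, 6, 1],
    [4, 28, 24, 4],
]

# The Fail column has expected counts 2 and 3, both below 5.
min_expected_before = min(min(row) for row in expected_table(observed))

# Merge the last two columns (Class III and Fail) into one class.
merged = [row[:-2] + [row[-2] + row[-1]] for row in observed]
min_expected_after = min(min(row) for row in expected_table(merged))

chi_sq = chi_squared(merged)
dof = (len(merged) - 1) * (len(merged[0]) - 1)  # (2 - 1) * (3 - 1) = 2

# Assumption: upper 1% critical value for nu = 2, read from standard tables.
CRITICAL_1_PERCENT_NU_2 = 9.210
reject_h0 = chi_sq > CRITICAL_1_PERCENT_NU_2
print(merged, round(chi_sq, 3), dof, reject_h0)
```

After merging, the smallest expected frequency rises from 2 to 6 and the statistic is about 13.11 with ν = 2. This exceeds 9.210, so independence is rejected at the 1% level.
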
19. Yates Correction: χ² is a continuous distribution whilst the calculated statistic $\chi^2_{\text{calc}}$ is not. In the case of a 2 x 2 contingency table, for which ν = 1, the agreement between the two distributions can be improved by applying a continuity correction called Yates correction. This involves reducing each value of |O − E| by 0.5 before squaring.
20. Lesson 5 - Example Question III: A random sample of 930 companies quoted on the stock exchange revealed the information summarised in the table below, which shows the distribution of these companies classified according to two attributes. In this table, D indicates that the company has diversified its product range during the previous financial year, and P indicates that there has been a significant rise in profits during the previous financial year. The null hypothesis is that D and P are independent. Show that this can be rejected at all reasonable significance levels.

    | | D | Not D |
    |---|---|---|
    | P | 148 | 106 |
    | Not P | 299 | 377 |
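
A sketch of Yates correction on this 2 x 2 table is given below: each |O − E| is reduced by 0.5 before squaring. The function name is illustrative, and the critical values for ν = 1 (3.841 at 5%, 6.635 at 1%, 10.83 at 0.1%) are assumptions taken from standard tables.

```python
def chi_squared_yates(table):
    """Chi-squared statistic for a 2 x 2 table with Yates continuity correction."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / grand_total
            # Yates correction: shrink |O - E| by 0.5 before squaring.
            stat += (abs(obs - exp) - 0.5) ** 2 / exp
    return stat


# Rows: P, Not P. Columns: D, Not D.
companies = [
    [148, 106],
    [299, 377],
]

chi_sq = chi_squared_yates(companies)
dof = 1  # (2 - 1) * (2 - 1)

# Assumption: upper-tail critical values for nu = 1, from standard tables.
critical_values = {"5%": 3.841, "1%": 6.635, "0.1%": 10.83}
rejected_at = {level: chi_sq > cv for level, cv in critical_values.items()}
print(round(chi_sq, 2), rejected_at)
```

The corrected statistic is about 14.0, which exceeds even the 0.1% critical value, consistent with the claim that independence can be rejected at all reasonable significance levels.
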
21. Practice Questions: Statistics 3 and 4 by Jane Miller, Page 121, Exercise 5C:
    - Basic contingency table: Questions 1, 3, 4 and 5
    - Combining: Questions 2, 6 and 8
    - Yates: Questions 7 and 9