# 25 Testing

Published in: Technology, Education

### 25 Testing

1. Stat310: Testing. Hadley Wickham. Sunday, 19 April 2009
2. Outline
    1. Important question
    2. Recap
    3. More examples/practice
    4. Choosing a cut-off
    5. P-value is a random variable too!
    6. Next time
3. Final: Which would you prefer? (a) a 3 hour final, or (b) a 2 hour final.
4. Recap: What is a null hypothesis? What is an alternative hypothesis? What is the opposite of rejecting the null hypothesis? Why?
5. Testing jargon. No: Null hypothesis. Nothing is happening. (The thing we want to disprove.) Yes: Alternative hypothesis. Something interesting is happening.
6. Absence of evidence is not evidence of absence.
7. The lady tasting tea: A thought experiment by R. A. Fisher (famous early statistician, 1890-1962). A lady at a tea party claims that she can tell the difference between putting the milk in first and second. How can we be sure?
8. Experiment: 8 cups. 4 milk first, 4 milk second. Presented in random order. What is the null hypothesis? How many possible outcomes are there?
9. Your turn: What would the distribution of correct responses be under the null hypothesis? How many would she need to get correct for us to be reasonably certain that she really could tell the difference?
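10. Distribution of responses under the null hypothesis:

    | Right | Wrong | Number of outcomes | % of outcomes |
    |-------|-------|--------------------|---------------|
    | 4     | 0     | 1                  | 1%            |
    | 3     | 1     | 16                 | 23%           |
    | 2     | 2     | 36                 | 51%           |
    | 1     | 3     | 16                 | 23%           |
    | 0     | 4     | 1                  | 1%            |
    | Total |       | 70                 | 100%          |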
11. Another example: X_i ~ iid Normal(μ_x, 1), Y_i ~ iid Normal(μ_y, 1). Do they have the same means?
12. Steps:
    1. Write down null and alternative hypotheses.
    2. Figure out a good test statistic (for this class, usually obvious).
    3. Work out its distribution under the null.
13. Experiment: x = 7.0, 5.8, 2.0, 5.0, 6.1, 5.6, 4.3, 4.0, 4.8, 6.5; y = 6.2, 4.0, 5.8, 5.9, 5.7, 6.0, 6.2, 5.7, 5.4, 5.8 (mean of x = 5.11, mean of y = 5.67). Are the means of the underlying distributions the same? (True answer?)
14. Next steps:
    1. Compute the test statistic.
    2. Compute the p-value by evaluating F, the distribution function of the test statistic under the null, at the observed test statistic.
    3. (Question: what is the distribution of the p-value if the null hypothesis is true?)
15. P-value: The p-value gives us the probability, under the null hypothesis, that we would have seen a value equal to or more extreme than the value we observed. It measures the strength of evidence for rejecting the null hypothesis. But we need a cut-off to make a yes-no decision. How do we choose that cut-off?
16. Errors: What are the possible errors we can make? False positive: choose the alternative when the null is correct (aka Type 1). False negative: choose the null when the alternative is true (aka Type 2).
17. Terminology: The probability of a false positive is called α. The probability of a false negative is called β (1 − β is the power of the test). How are the two related? We usually care more about false positives. We usually pick an arbitrary cut-off of what?
18. Testing overview: Write down null and alternative hypotheses. Compute the test statistic. Convert to a p-value. Compare the p-value to the alpha cut-off.
19. Back to example
20. [Figure: points labelled x and y; horizontal axis from 20 to 100, vertical axis from about 4.5 to 6.5. Most y points lie above the x points.]
21. [Figure: "Difference" on the vertical axis (0.0 to 2.0); horizontal axis from 20 to 100.]
22. [Figure: "|Difference|" on the vertical axis (0.0 to 2.0); horizontal axis from 20 to 100.]
23. [Figure: "z-score" on the vertical axis (0 to 4); horizontal axis from 20 to 100.]
24. [Figure titled "61 rejected"; vertical axis labelled "yintercept" (0.0 to 0.8); horizontal axis from 20 to 100.]
25. [Slide without text.]
26. [Figure: points labelled x and y; horizontal axis from 20 to 100, vertical axis from about 4.5 to 5.5. The x and y points overlap.]
27. [Figure: "Difference" on the vertical axis (−0.5 to 1.0); horizontal axis from 20 to 100.] Can you think of another test statistic based on this plot?
28. [Figure: "|Difference|" on the vertical axis (0.0 to 1.2); horizontal axis from 20 to 100.]
29. [Figure: "z-score" on the vertical axis (0.0 to 2.5); horizontal axis from 20 to 100.]
30. [Figure titled "1 rejected"; vertical axis labelled "yintercept" (0.0 to 0.8); horizontal axis from 20 to 100.]
31. The rest of testing: For a given situation, you need to know a good test statistic and its distribution under the null. There are lots of standard cases, which you can now derive or look up in a book. In a final, I will either explicitly ask you to derive it, or I'll give you the test statistic and null distribution.
32. Next time: Graded tests back. Information about the final (incl. study session). What I do with statistics (stat405). Other courses / majoring in statistics. Celebrate being done.
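
The two worked examples in the slides, the tea-tasting null distribution (slides 8 to 10) and the comparison of two normal means (slides 11 to 14), can be sketched in Python. This is a minimal sketch under stated assumptions, not the course's own code: the function names are hypothetical, the test statistic is assumed to be the difference in sample means divided by its standard error (using the known variance of 1 from slide 11), and the p-value is assumed to be two-sided.

```python
from math import comb, erf, sqrt

# Data from slide 13.
X = [7.0, 5.8, 2.0, 5.0, 6.1, 5.6, 4.3, 4.0, 4.8, 6.5]
Y = [6.2, 4.0, 5.8, 5.9, 5.7, 6.0, 6.2, 5.7, 5.4, 5.8]


def tea_null_distribution(cups_per_kind=4):
    """Null distribution for the lady tasting tea (hypothetical helper).

    The lady labels `cups_per_kind` of the 2 * cups_per_kind cups as
    "milk first". Under the null she is guessing, so every selection is
    equally likely. Returns {right: (count, probability)}, where `right`
    is the number of milk-first cups she identifies correctly.
    """
    n = cups_per_kind
    total = comb(2 * n, n)  # 70 possible selections for 4 + 4 cups
    dist = {}
    for right in range(n, -1, -1):
        # Choose `right` of the true milk-first cups and the remaining
        # n - right picks from the milk-second cups.
        count = comb(n, right) * comb(n, n - right)
        dist[right] = (count, count / total)
    return dist


def normal_cdf(z):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def two_sample_z_test(x, y, sigma=1.0):
    """Two-sided z-test of equal means (assumes known common sd `sigma`)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Standard error of the difference in sample means.
    se = sigma * sqrt(1.0 / len(x) + 1.0 / len(y))
    z = (mean_x - mean_y) / se
    # Probability, under the null, of a z-score at least this extreme.
    p_value = 2.0 * (1.0 - normal_cdf(abs(z)))
    return mean_x, mean_y, z, p_value


if __name__ == "__main__":
    for right, (count, prob) in tea_null_distribution().items():
        print(f"{right} right: {count:2d} outcomes ({prob:.0%})")
    mean_x, mean_y, z, p_value = two_sample_z_test(X, Y)
    print(f"mean x = {mean_x:.2f}, mean y = {mean_y:.2f}")
    print(f"z = {z:.2f}, p = {p_value:.2f}")
```

With the slide 13 data this gives a z-score of about -1.25 and a two-sided p-value of about 0.21, so at a cut-off of 0.05 the null hypothesis of equal means would not be rejected. For the tea experiment, only the all-correct outcome (1 in 70) is rarer than 5% under the null.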