25 Testing


  1. 1. Stat310 Testing Hadley Wickham Sunday, 19 April 2009
  2. 2. 1. Important question 2. Recap 3. More examples/practice 4. Choosing a cut-off 5. P value is a random variable too! 6. Next time
  3. 3. Final: Which would you prefer? a) a 3 hour final b) a 2 hour final
  4. 4. Recap: What is a null hypothesis? What is an alternative hypothesis? What is the opposite of rejecting the null hypothesis? Why?
  5. 5. Testing jargon. No: Null hypothesis. Nothing is happening. (Thing we want to disprove) Yes: Alternative hypothesis. Something interesting is happening.
  6. 6. Absence of evidence is not evidence of absence
  7. 7. The lady tasting tea: A thought experiment by R. A. Fisher (famous early statistician, 1890-1962). A lady at a tea party claims that she can tell the difference between putting the milk in first and second. How can we be sure?
  8. 8. Experiment: 8 cups. 4 milk first, 4 milk second. Presented in random order. What is the null hypothesis? How many possible outcomes are there?
  9. 9. Your turn: What would the distribution of correct responses be under the null hypothesis? How many would she need to get correct for us to be reasonably certain that she really could tell the difference?
  10. 10. Right  Wrong   #     %
          4      0       1    1%
          3      1      16   23%
          2      2      36   51%
          1      3      16   23%
          0      4       1    1%
          Total         70  100%
  11. 11. Another example: Xi ~ iid Normal(μx, 1), Yi ~ iid Normal(μy, 1). Do they have the same means?
  12. 12. 1. Write down null and alternative hypotheses 2. Figure out good test statistic (for this class, usually obvious) 3. Work out distribution under the null
  13. 13. Experiment: x = 7.0 5.8 2.0 5.0 6.1 5.6 4.3 4.0 4.8 6.5; y = 6.2 4.0 5.8 5.9 5.7 6.0 6.2 5.7 5.4 5.8 (mean of x = 5.11, mean of y = 5.67). Are the means of the underlying distributions the same? (True answer?)
  14. 14. 1. Compute test statistic 2. Compute p-value, by evaluating F (the cdf of the null distribution) at the test statistic 3. (Question: what is the distribution of the p-value if the null hypothesis is true?)
  15. 15. P-value: The p-value gives us the probability, under the null hypothesis, that we would have seen a value equal to or more extreme than the value we observed. It measures the strength of evidence for rejecting the null hypothesis: the smaller the p-value, the stronger the evidence. But we need a cut-off to make a yes-no decision. How do we choose that cut-off?
  16. 16. Errors: What are the possible errors we can make? False positive. Choose alternative when null is correct. (aka Type 1) False negative. Choose null when alternative is true. (aka Type 2)
  17. 17. Terminology: Probability of a false positive called α. Probability of a false negative called β (so the power of the test is 1 - β). How are the two related? Usually care more about false positives. Usually pick arbitrary cut-off of what?
  18. 18. Testing overview: Write down null and alternative hypotheses. Compute test statistic. Convert to p-value. Compare p-value to alpha cut-off.
  19. 19. Back to example
  20. 20. [Plot: points labelled x and y; vertical axis 4.5 to 6.5, horizontal axis 20 to 100]
  21. 21. [Plot: Difference (0.0 to 2.0); horizontal axis 20 to 100]
  22. 22. [Plot: |Difference| (0.0 to 2.0); horizontal axis 20 to 100]
  23. 23. [Plot: z-score (0 to 4); horizontal axis 20 to 100]
  24. 24. [Plot titled "61 rejected": vertical axis 0.0 to 0.8, labelled "yintercept"; horizontal axis 20 to 100]
  25. 25.
  26. 26. [Plot: points labelled x and y; vertical axis 4.5 to 5.5, horizontal axis 20 to 100]
  27. 27. [Plot: Difference (-0.5 to 1.0); horizontal axis 20 to 100] Can you think of another test statistic based on this plot?
  28. 28. [Plot: |Difference| (0.0 to 1.2); horizontal axis 20 to 100]
  29. 29. [Plot: z-score (0.0 to 2.5); horizontal axis 20 to 100]
  30. 30. [Plot titled "1 rejected": vertical axis 0.0 to 0.8, labelled "yintercept"; horizontal axis 20 to 100]
  31. 31. The rest of testing: For a given situation, need to know a good test statistic and the distribution under the null. Lots of standard cases, which you can now derive, or look up in a book. In a final, I will either explicitly ask you to derive it, or I’ll give you the test statistic and null distribution.
  32. 32. Next time: Graded tests back. Information about the final. (Incl. study session) What you I do with statistics (stat405). Other courses / Majoring in statistics. Celebrate being done.
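
The counts in the lady-tasting-tea table (slide 10) can be reproduced by direct counting. The sketch below is not from the slides: it assumes the lady knows that exactly four cups are milk-first and therefore names four cups, so under the null hypothesis every choice of four cups out of eight is equally likely. The function name `tea_null_distribution` is made up for illustration.

```python
from math import comb

def tea_null_distribution(cups_per_kind=4):
    """Count the ways to get each number of milk-first cups right by pure guessing.

    Assumption: the lady names exactly `cups_per_kind` cups as milk-first,
    and under the null every such selection is equally likely.
    """
    total = comb(2 * cups_per_kind, cups_per_kind)  # all possible selections
    counts = {}
    for right in range(cups_per_kind, -1, -1):
        wrong = cups_per_kind - right
        # pick `right` of the true milk-first cups and `wrong` of the milk-second cups
        counts[right] = comb(cups_per_kind, right) * comb(cups_per_kind, wrong)
    return counts, total

counts, total = tea_null_distribution()
for right, ways in counts.items():
    print(f"{right} right, {4 - right} wrong: {ways:2d} of {total} ({ways / total:.0%})")
print(f"P(all four right by guessing) = {counts[4] / total:.3f}")
```

Getting all four right happens by chance in only 1 of the 70 selections, while three or more right happens in 17 of 70, which is why only a perfect score would be reasonably convincing in this design.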

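The testing overview (slide 18) can be walked through on the data from slide 13. This is a rough sketch rather than the lecture's own code: it assumes the model from slide 11 (independent normal samples with known variance 1), uses the difference in sample means as the test statistic, and takes a two-sided p-value. Under the null, the difference in means of two samples of size n is Normal(0, 2/n), which gives the z-score below. The simulation at the end mimics the "rejected" counts on the plots; the true means passed to it (5 and 5, then 5 and 6) and the cut-off α = 0.05 are guesses for illustration, not values stated in the slides. `z_test_p_value` and `count_rejections` are hypothetical helper names.

```python
import random
from statistics import NormalDist, mean

def z_test_p_value(x, y, sigma=1.0):
    """Two-sided p-value for H0: mu_x == mu_y, with known common sd `sigma`."""
    se = sigma * (1 / len(x) + 1 / len(y)) ** 0.5  # sd of mean(x) - mean(y) under H0
    z = (mean(x) - mean(y)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Data from slide 13
x = [7.0, 5.8, 2.0, 5.0, 6.1, 5.6, 4.3, 4.0, 4.8, 6.5]
y = [6.2, 4.0, 5.8, 5.9, 5.7, 6.0, 6.2, 5.7, 5.4, 5.8]
p_observed = z_test_p_value(x, y)
print(f"mean x = {mean(x):.2f}, mean y = {mean(y):.2f}, p-value = {p_observed:.2f}")

def count_rejections(mu_x, mu_y, n=10, experiments=100, alpha=0.05, seed=0):
    """Repeat the experiment many times and count how often H0 is rejected.

    The means, alpha and seed are illustrative assumptions.
    """
    rng = random.Random(seed)
    rejected = 0
    for _ in range(experiments):
        xs = [rng.gauss(mu_x, 1) for _ in range(n)]
        ys = [rng.gauss(mu_y, 1) for _ in range(n)]
        if z_test_p_value(xs, ys) < alpha:
            rejected += 1
    return rejected

print("rejections when the null is true:", count_rejections(5, 5))
print("rejections when the means differ by 1:", count_rejections(5, 6))
```

When the null is true, each rejection is a false positive, so the first count should hover around α times the number of experiments. When the means really differ, each failure to reject is a false negative.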