Your SlideShare is downloading. ×
19 Inference
Upcoming SlideShare
Loading in...5

Thanks for flagging this SlideShare!

Oops! An error has occurred.


Saving this for later?

Get the SlideShare app to save on your phone or tablet. Read anywhere, anytime - even offline.

Text the download link to your phone

Standard text messaging rates apply

19 Inference


Published on

  • Be the first to comment

  • Be the first to like this

No Downloads
Total Views
On Slideshare
From Embeds
Number of Embeds
Embeds 0
No embeds

Report content
Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

No notes for slide


  • 1. Stat310 Introduction to inference Hadley Wickham Saturday, 27 March 2010
  • 2. Continuity correction Applies only to sums, not means! Important only where n smaller. Saturday, 27 March 2010
  • 3. Homework Learn the change of variable steps! If X and Y are independent, and you know f(x) and f(y), what is f(x, y)? Common pattern Saturday, 27 March 2010
  • 4. What proportion of Rice students smoke marijuana? Saturday, 27 March 2010
  • 5. 1. Introduction to inference 2. A survey 3. Estimators and their properties 4. Feedback Saturday, 27 March 2010
  • 6. Data vs. Distributions Random experiments produce data. A repeatable random experiment has some underlying distribution. We want to go from the data to say something about the underlying distribution. So far we have been doing the opposite. Saturday, 27 March 2010
  • 7. Inference Up to now: Given a sequence of random variables with distribution F, what can we say about the mean of a sample? What we really want: Given the mean of a sample, what can we say about the underlying sequence of random variables? Saturday, 27 March 2010
  • 8. Marijuana Up to now: If I told you that smoking pot was Binomially distributed with probability p, you could tell me what I would expect to see if I sampled n people at random. What we want: If I sample n people at random and find out m of them smoke pot, what does that tell me about p? Saturday, 27 March 2010
  • 9. Your turn Let’s I selected 10 people at random from the Rice campus and asked them if they’d smoked pot in the last month. Two said yes. Let Yi be 1 if person i smoked pot and 0 otherwise. What’s a good guess for distribution of Yi? Saturday, 27 March 2010
  • 10. 3.5e−08 3.0e−08 2.5e−08 2.0e−08 prob 1.5e−08 1.0e−08 5.0e−09 0.0e+00 0.0 0.2 0.4 0.6 0.8 1.0 p Saturday, 27 March 2010
  • 11. Plug-in principle A good first guess at the true mean, variance, or any other parameter of a distribution is simply to estimate it from the data. We’ll learn more sophisticated ways after the break. Saturday, 27 March 2010
  • 12. Problems If I picked someone at random out of the Rice campus directory, and asked them if they’d smoked pot in the last month, how likely do you think I am to get an honest answer? How can we get around this? Saturday, 27 March 2010
  • 13. Your turn Compare your survey to someone with the other colour survey. What is the difference? How could you use the answers to estimate how many people smoke pot? Saturday, 27 March 2010
  • 14. Questions 1. Have you smoked pot in the last month? 2. Flip a coin. If it’s heads, write yes. Otherwise write yes if you’ve smoked pot in the last month, and no otherwise. 3. In the last month, how many of the following activities have you done? One list contains smoking pot, the other doesn’t. Saturday, 27 March 2010
  • 15. Results 1. The proportion of yeses in Q1. 2. The proportion of yeses in (Q2 - 0.5) * 2 3. (Q3a/4 - Q3b/5) / n. Saturday, 27 March 2010
  • 16. Your turn Fill in the survey. Make sure no one else can see your answers! Saturday, 27 March 2010
  • 17. Definitions Parameter space: set of all possible parameter values Estimator: process/function which takes data and gives best guess for parameter Point estimate: estimator for a single value Saturday, 27 March 2010
  • 18. Your turn With a partner, brainstorm as many different properties that you could use to decide which estimator is best. Remember that there is some true unknown proportion that we are trying to estimate. Saturday, 27 March 2010
  • 19. Important properties of an estimator Unbiased Small variance Consistent Sufficient Saturday, 27 March 2010
  • 20. ˆ =θ E(θ) Unbiased ˆ1 ) < V ar(θ2 ) V ar(θ ˆ Minimum variance Common problem is to find UMVE (unbiased minimum variance estimator) across all possible estimators Saturday, 27 March 2010
  • 21. Low bias, low variance Low bias, high variance High bias, low variance High bias, high variance Saturday, 27 March 2010
  • 22. Consistent n=10 n=5 n=1 Saturday, 27 March 2010
  • 23. Results Saturday, 27 March 2010
  • 24. Your turn How can we describe this as a random experiment? Do you have any concerns about applying the results of this survey to the entire Rice campus? Saturday, 27 March 2010
  • 25. Experiments Remember, the context is always the random experiment. For this case the randomness comes from picking the person at random. Repeating the experiment does not mean asking each person again (the answer would be the same) but asking a different set of people. Saturday, 27 March 2010
  • 26. Next time (After the test and spring recess) Method of moments Maximum likelihood Saturday, 27 March 2010
  • 27. Feedback Saturday, 27 March 2010