# 19 Inference

### 19 Inference

1. Stat310: Introduction to inference. Hadley Wickham. Saturday, 27 March 2010
2. Continuity correction: Applies only to sums, not means! Important only when n is small.
3. Homework: Learn the change of variable steps! If X and Y are independent, and you know f(x) and f(y), what is f(x, y)? Common pattern.
4. What proportion of Rice students smoke marijuana?
5. (1) Introduction to inference; (2) A survey; (3) Estimators and their properties; (4) Feedback.
6. Data vs. distributions: Random experiments produce data. A repeatable random experiment has some underlying distribution. We want to go from the data to say something about the underlying distribution. So far we have been doing the opposite.
7. Inference. Up to now: Given a sequence of random variables with distribution F, what can we say about the mean of a sample? What we really want: Given the mean of a sample, what can we say about the underlying sequence of random variables?
8. Marijuana. Up to now: If I told you that the number of pot smokers among n people was binomially distributed with probability p, you could tell me what I would expect to see if I sampled n people at random. What we want: If I sample n people at random and find that m of them smoke pot, what does that tell me about p?
9. Your turn: Let's say I selected 10 people at random from the Rice campus and asked them if they'd smoked pot in the last month. Two said yes. Let $Y_i$ be 1 if person i smoked pot and 0 otherwise. What's a good guess for the distribution of $Y_i$?
10. [Figure: prob (vertical axis, 0 to 3.5e−08) plotted against p (horizontal axis, 0 to 1).]
11. Plug-in principle: A good first guess at the true mean, variance, or any other parameter of a distribution is simply the corresponding quantity computed from the data. We'll learn more sophisticated ways after the break.
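
As a minimal sketch of the plug-in principle applied to the sample on slide 9 (10 people, two yeses), the snippet below estimates p by the sample proportion and plugs that value into the Bernoulli variance formula p(1 − p). Treating each answer as Bernoulli(p) is one natural answer to that slide's question and is an assumption here, not something the slides state; the variable names are illustrative.

```python
# Slide 9's sample: 10 people, two said yes (1 = yes, 0 = no).
answers = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]

# Plug-in estimate of p: the sample proportion of yeses.
p_hat = sum(answers) / len(answers)

# Plug-in estimate of Var(Y_i) = p(1 - p), under the assumed Bernoulli model.
var_hat = p_hat * (1 - p_hat)

print(p_hat, var_hat)
```
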
12. Problems: If I picked someone at random out of the Rice campus directory, and asked them if they'd smoked pot in the last month, how likely do you think I am to get an honest answer? How can we get around this?
13. Your turn: Compare your survey to someone with the other colour survey. What is the difference? How could you use the answers to estimate how many people smoke pot?
14. Questions: (1) Have you smoked pot in the last month? (2) Flip a coin. If it's heads, write yes. If it's tails, write yes if you've smoked pot in the last month and no if you haven't. (3) In the last month, how many of the following activities have you done? One list contains smoking pot, the other doesn't.
15. Results: (1) The proportion of yeses in Q1. (2) (The proportion of yeses in Q2 − 0.5) × 2. (3) (Q3a/4 − Q3b/5) / n.
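
The three estimators on slide 15 can be sketched roughly as follows. The first two follow the slide directly: with a fair coin forcing a yes half the time, P(yes) = 0.5 + 0.5p, so p = (P(yes) − 0.5) × 2. For the third, this sketch swaps in the standard item-count estimator (the difference in mean counts between the two lists), which is an assumption and not the slide's exact formula. All function names and the input answers are hypothetical, invented only to exercise the code.

```python
def direct_estimate(answers):
    """Q1: proportion of yes answers (1 = yes, 0 = no)."""
    return sum(answers) / len(answers)


def randomized_response_estimate(answers):
    """Q2: a fair coin forces a yes half the time, so
    P(yes) = 0.5 + 0.5 * p, which gives p = (P(yes) - 0.5) * 2."""
    return (direct_estimate(answers) - 0.5) * 2


def list_experiment_estimate(counts_with_pot, counts_without_pot):
    """Q3: difference in mean activity counts between the list that
    includes smoking pot and the list that does not (standard
    item-count estimator; an assumption, not the slide's formula)."""
    mean_with = sum(counts_with_pot) / len(counts_with_pot)
    mean_without = sum(counts_without_pot) / len(counts_without_pot)
    return mean_with - mean_without


# Hypothetical answers, invented purely for illustration.
q1 = [1, 0, 0, 0, 0]
q2 = [1, 1, 1, 0, 0, 1, 0, 1, 1, 0]
q3_with = [2, 3, 1, 2]
q3_without = [2, 2, 1, 2]

print(direct_estimate(q1))
print(randomized_response_estimate(q2))
print(list_experiment_estimate(q3_with, q3_without))
```

Note that the randomized-response estimate can fall below 0 in a small sample, when fewer than half of the respondents write yes.
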
16. Your turn: Fill in the survey. Make sure no one else can see your answers!
17. Definitions. Parameter space: the set of all possible parameter values. Estimator: a process or function that takes data and gives a best guess for the parameter. Point estimate: a single value produced by an estimator.
18. Your turn: With a partner, brainstorm as many different properties as you can that could be used to decide which estimator is best. Remember that there is some true unknown proportion that we are trying to estimate.
19. Important properties of an estimator: unbiased, small variance, consistent, sufficient.
20. Unbiased: $E(\hat{\theta}) = \theta$. Minimum variance: $\mathrm{Var}(\hat{\theta}_1) < \mathrm{Var}(\hat{\theta}_2)$. A common problem is to find the UMVE (unbiased minimum variance estimator) across all possible estimators.
21. [Figure: four panels labelled low bias, low variance; low bias, high variance; high bias, low variance; high bias, high variance.]
22. Consistent. [Figure: panels labelled n = 10, n = 5, and n = 1.]
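
One way to see unbiasedness, variance, and consistency at work is a small simulation of the sample proportion, sketched below for the sample sizes on slide 22. The true proportion of 0.2, the number of repeats, the seed, and the function names are all assumptions chosen for illustration. If the estimator behaves as the theory suggests, the average estimate should sit near the true value for every n, and the variance should shrink roughly like p(1 − p)/n.

```python
import random
from statistics import mean, pvariance


def sample_proportion(p, n, rng):
    """Proportion of yeses among n independent Bernoulli(p) answers."""
    return sum(rng.random() < p for _ in range(n)) / n


def simulate(p, n, repeats=20000, seed=0):
    """Repeat the experiment many times and return the mean and
    variance of the resulting estimates."""
    rng = random.Random(seed)
    estimates = [sample_proportion(p, n, rng) for _ in range(repeats)]
    return mean(estimates), pvariance(estimates)


true_p = 0.2  # assumed value, for illustration only
for n in (1, 5, 10):
    avg, var = simulate(true_p, n)
    theory = true_p * (1 - true_p) / n
    print(f"n={n}: mean={avg:.3f}, variance={var:.4f}, theory={theory:.4f}")
```
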
23. Results
24. Your turn: How can we describe this as a random experiment? Do you have any concerns about applying the results of this survey to the entire Rice campus?
25. Experiments: Remember, the context is always the random experiment. For this case the randomness comes from picking the person at random. Repeating the experiment does not mean asking each person again (the answer would be the same) but asking a different set of people.
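
The point of slide 25 can be illustrated with a short sketch in which every person's answer is fixed and the only randomness is in who gets picked. The population of 1,000 people with 200 yeses is hypothetical, as are the seed and the function name.

```python
import random

rng = random.Random(1)

# Hypothetical fixed population: each person's answer never changes.
population = [1] * 200 + [0] * 800  # assumed true proportion of 0.2


def run_experiment(n):
    """One run of the random experiment: pick n different people at random."""
    sample = rng.sample(population, n)
    return sum(sample) / n


# Asking the same people again reproduces the same estimate...
same_people = rng.sample(population, 10)
first = sum(same_people) / 10
second = sum(same_people) / 10

# ...whereas repeating the experiment draws a new set of people each time.
estimates = [run_experiment(10) for _ in range(5)]

print(first, second, estimates)
```
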
26. Next time (after the test and spring recess): method of moments, maximum likelihood.
27. Feedback