Successfully reported this slideshow.
Your SlideShare is downloading. ×

# Chapter 9 Probability and Statistics.pdf

Ad
Ad
Ad
Ad
Ad
Ad
Ad
Ad
Ad
Ad
Ad
Upcoming SlideShare
journalism.pptx.pdf
Loading in …3
×

1 of 38 Ad

# Chapter 9 Probability and Statistics.pdf

Probability and Statistics

Probability and Statistics

Advertisement
Advertisement

## More Related Content

Advertisement

### Chapter 9 Probability and Statistics.pdf

1. 1. BITS Pilani, Pilani Campus BITS Pilani Pilani Campus Course No: MTH F113 Probability and Statistics
2. 2. BITS Pilani, Pilani Campus BITS Pilani Pilani Campus Chapter 9: Inferences on Proportions Sumanta Pasari
3. 3. BITS Pilani, Pilani Campus What is Sample Proportion? Example 1: • Suppose a student guesses at the answer on every question in a 300-question examination. If he gets 60 questions correct, then his proportion of correct guesses is 60/300=0.20. • If he gets 75 questions correct, then his proportion of correct guesses is 75/300=.25. • Thus, the proportion of correct guesses is simply the number of correct guesses divided by the total number of questions. 3
4. 4. BITS Pilani, Pilani Campus What is Sample Proportion? Examples: • Babies: Is the proportion of Indian babies born male different from 0.50? • Handedness: Are more than 80% of Indians right handed? • Ice cream: Is the percentage of “creamery” customers who prefer chocolate ice cream over vanilla less than 70%? • Students: Is the proportion of Indian students visiting abroad during BTech is less than 0.20? 4
5. 5. BITS Pilani, Pilani Campus We would like to estimate “population proportion”, somehow using random samples. So, we have a population of interest; a particular trait (a distinguishing characteristic or quality) is being studied, and each member of the population can be classified as either having the trait or not (like, Bernoulli trials). Let’s Define…
6. 6. BITS Pilani, Pilani Campus Let us draw a random sample X1, X2, …, Xn of size n from population, where Xi = 1, if the i-th member of the sample has the trait = 0, if i-th sample does not have the trait Then, gives the number of objects in the sample with the trait and the statistic X/n gives the proportion of the sample with the trait. Note that X is a binomial RV with parameters n (known) and p. Let’s Define… 1 n X X i i   
7. 7. BITS Pilani, Pilani Campus The statistic that estimates the parameter , a proportion of a population that has some property, is number in sample with the trait (success) ˆ sample size the sample proportion Properties: (i) As th X p n p     (WH e s Y?) ample size increases ( large), the sampling ˆ distribution of becomes approximately normal 1 ˆ ˆ (ii) The mean of is , and variance of (WHY?) (iii) Can we get estimators of ? is Po n p p p p p p p n  int and interval estimator Sample Proportion
8. 8. BITS Pilani, Pilani Campus     1 Note that where, each is an independent point binomial (Bernoulli RV), that is, 1 and ˆ 0 1 , i i n i i X P X p P X X n p X p X n          Sample Proportion xi 1 0 f(xi) p 1-p E[Xi] = 1(p)+0(1-p)=p Var (Xi) = E[Xi 2]-(E[Xi])2 = p(1-p)
9. 9. BITS Pilani, Pilani Campus We just obtained the point estimator of p. Is it unbiased? Does it have small variance? Now to obtain CI on p, we would like to get a known statistic. Can the central limit theorem be useful? What are the requirements for CLT? Estimators for p 1 ˆ n i X X p X n n    
10. 10. BITS Pilani, Pilani Campus           2 2 2 Note that for large (using CLT), Taking two points symmetrically about the origin, 1 ˆ ˆ , 0,1 1 ˆ 1 1 con we get fidence l Here 1 is known as e p p p p p N p N n p p n p p P n z z p p n z                                         vel. Confidence Interval on p Interval Estimate = Point Estimate + /  Margin of Error `
11. 11. BITS Pilani, Pilani Campus       2 2 2 As is unknown, above confidence bounds are not statistics. So replace by ˆ unbiased estimator , and then the CI on having i 1 1 ˆ ˆ 1 confidence level 1 ˆ ˆ 1 ˆ s p p p p p p p p P p z p p z n n p p z                             2 The endpoints of the confidence interval is cal ˆ ˆ 1 ˆ , . confidence limits led . p p p p z n n            Confidence Interval on p
12. 12. BITS Pilani, Pilani Campus Confidence Interval on p
13. 13. BITS Pilani, Pilani Campus We can be 100(1-% sure that and p differ by at most d , where d is given by Thus, sample size for estimating p, when prior estimate available is 2 2 2 ˆ ˆ (1 ) p p n z d    p̂ Sample Size for Estimating p 2 ˆ ˆ (1 ) p p d z n   
14. 14. BITS Pilani, Pilani Campus   2 2 2 It can be shown Exercise 9, page 327 that the value of Thus, sample size for estimating ,when prior estimate is not available is 1 ˆ ˆ (1 ) . 4 . 4     p p z n d p Sample Size for Estimating p
15. 15. BITS Pilani, Pilani Campus Problem Solving Ex 9.1 (Ex. 2, Section 9.1, Page 326) A study of electromechanical protection devices used in electrical power systems showed that of 193 devices that failed when tested, 75 were due to mechanical part failures. a)Find a point estimate for p, the proportion of failures that are due to mechanical failures. b)Find a 95% confidence interval on p. c)How large a sample is required to estimate p to within 0.03 with 95% confidence.
16. 16. BITS Pilani, Pilani Campus Random variable X= number of failed devices which were due to mechanical failure among 193 failed devices. X has approx. normal dist with mean = 193p, variance=193p(1-p). a) Point estimate for p = = x/n = 75/193 = 0.3886. b) 95% confidence interval on p is ˆobs p      0.025 75 75 75 1 193 0.3198,0.4574 193 193 193 2 1.96 0.389 0.611 (c) (with using prior estimate) ~1015 2 (0.03 ) 2 2 1.96 / 2 (without using prior estimate) ~1068. 2 2 4 (4)(0.03 ) z n z n d              Problem Solving
17. 17. BITS Pilani, Pilani Campus Ex 9.2 (Ex. 33, Page 333) A survey of mining companies is to be conducted to estimate p, the proportion of companies that anticipate hiring either graduating seniors or experienced engineers during the coming year. a) How large a sample is required to estimate p within 0.04 with 94% confidence? b) A sample of size 500 yields 105 companies that plan to hire such engineers. Find the point estimate of p. Also find 94% confidence interval for p. Problem Solving
18. 18. BITS Pilani, Pilani Campus 2 2 0.03 2 2 2 2 1.88 ( ) (estimate unknown) 552.25 ~ 553. (4)(0.04 ) (4)(0.04 ) Suppose we already know 0.2 p 0.35. Then we can improve n by maximizing p(1-p) on this interval to get 1.88 max{ (1 ) :0.2 0 0.04 z a n n p p p          2 2 .35} 1.88 max{(0.2)(0.8),(0.35)(0.65)} 0.04 502.5475 ~ 503.                           Problem Solving
19. 19. BITS Pilani, Pilani Campus ˆ ( ) 105 / 500 0.210. 94% C.I. for p is obs b p   0.03 0.03 ˆ ˆ ˆ ˆ ˆ ˆ (1 ) / (1 ) / . . 0.136 0.244 obs obs obs obs obs obs p z p p n p p z p p n i e p         Problem Solving
20. 20. BITS Pilani, Pilani Campus 20 Problem Solving HW 9.1 A major metropolitan newspaper selected a random sample of 1,600 readers from their list of 100,000 subscribers. They asked whether the paper should increase its coverage of local news. It was found that 640 readers out of the random sample of 1600 readers wanted more local news. (a) What is the point estimate for p, the proportion of readers who would like more coverage of local news (b) Find 95% and 99% confidence intervals on p. (c) How large a sample is required to estimate p to within 0.05 with 99% confidence?
21. 21. BITS Pilani, Pilani Campus 21 Problem Solving HW 9.2 To examine the proportion of left-handed professional baseball players, we choose a random sample of size 59 from this population. It was observed that there are 15 left-handed baseball players in the given random sample. (a) What is the point estimate for p, the proportion of left handed basketball players. (b) Find 95% confidence intervals on p and interpret the result.
22. 22. BITS Pilani, Pilani Campus 22 Problem Solving HW 9.3 The cure rate for a standard treatment of a disease is 45%. Dr. Sen has perfected a primitive treatment which he claims is much better. As evidence, he says that he has used his new treatment on 50 patients with the disease and cured 25 of them. What do you think? Is this new treatment better? Use a 95% confidence interval to answer the question [Ans: (0.36, 0.64)]. Does this example give some notion of hypothesis testing on proportion?
23. 23. BITS Pilani, Pilani Campus 23 Problem Solving HW 9.4 The advocacy group of a company took a random sample of 1000 consumers who recently purchased their MP3 player and found that 400 were satisfied with their purchase. Find a 95% confidence interval for p, the proportion of customers that are satisfied with the MP3 player of that company. HW 9.5 Experimenters injected a growth hormone gene into thousands of carp eggs. Of the 400 carp that grew from these eggs, 20 incorporated the gene into their DNA (Science News, May 20, 1989). Calculate a 95% confidence interval for the proportion of carp that would incorporate the gene into their DNA. [Ans.: (.03,.07)]
24. 24. BITS Pilani, Pilani Campus A hypothesis test about the value of a population proportion p must take one of the following three forms (where p0 is the hypothesized value of the population). Then, One-tailed (lower tail) One-tailed (upper tail) Two-tailed 0 0 : H p p  0 : a H p p  0 : a H p p  0 0 : H p p  0 0 : H p p  0 : a H p p  Hypothesis Testing on Proportions
25. 25. BITS Pilani, Pilani Campus 9.2 Hypothesis testing on proportions Let p0 be the null value of p. Then H0:p=p0 H0:p=p0 H0:p=p0 H1:p>p0 H1:p<p0 H1:p≠p0 Right-tailed test Left-tailed test Two-tailed test
26. 26. BITS Pilani, Pilani Campus     0 0 0 0 0 Teststatisticfor testing ˆ : 1 H p p p n p p Z p     Hypothesis Testing on Proportions  Rejection Rule: P –Value Approach Reject H0 if P –value <  0.05  Rejection Rule: Critical Value Approach H0: p  p0 Reject H0 if z > z Reject H0 if z < -z Reject H0 if z < -z or z > z H0: p  p0 H0: p  p0
27. 27. BITS Pilani, Pilani Campus Note that the test statistic is a logical choice because it compares the unbiased estimator of p with the null value of p0.     0 0 0 0 0 Teststatisticfor testing ˆ : 1 H p p p n p p Z p     Hypothesis Testing on Proportions
28. 28. BITS Pilani, Pilani Campus Steps of Hypothesis Testing 28 Step 1. Develop the null and alternative hypotheses of proportion; Calculate the observed value of test statistic. Step 2. Specify the level of significance . Step 3. Collect the sample data and compute the test statistic. Step 4. Based on , identify critical values (use the value of test statistic to compute the P-value). Step 5. Reject H0 if the calculated test statistic value falls in the rejection region (Reject H0 if P-value < ).
29. 29. BITS Pilani, Pilani Campus Ex 9.3 For a Christmas and New Year’s week, the National Safety Council estimated that 500 people would be killed and 25,000 injured on the nation’s roads. The NSC claimed that 50% of the accidents would be caused by drunk driving. To verify, a sample of 120 accidents showed that 67 were caused by drunk driving. Use these data to test the NSC’s claim at α = 0.05 Sol.: Discussed in class from two approaches, P- value and critical value Problem Solving
30. 30. BITS Pilani, Pilani Campus Ex 9.4 A marketing company claims that it receives 8% responses from its mailing. To test this claim, a random sample of 500 were surveyed with 25 responses. Test at the  =0.05 significance level. Sol.: Discussed in class from two approaches, P- value and critical value Problem Solving 30/55
31. 31. BITS Pilani, Pilani Campus Ex 9.5 (Ex. 10) A poll of investment analysts taken earlier suggests that a majority of these individuals think that the dominant issue affecting the future of the solar energy industry is falling energy prices. A new survey is being taken to see if this is still the case. Let p denote the proportion of investment analysts holding this opinion. (a) Set up the appropriate null and alternative hypothesis. Problem Solving
32. 32. BITS Pilani, Pilani Campus b)When the survey is conducted, 59 of the 100 analysts sampled agreed that the major issue is falling energy prices. Is this sufficient to allow us to reject H0? Explain based on P-value of the test. c) Interpret your results in the context of this problem. Problem Solving
33. 33. BITS Pilani, Pilani Campus Solution p = proportion of persons from the population who feel falling energy prices is a dominant issue for future of solar energy. a) H0: p≤ 0.5 Vs H1: p > 0.5 b) These can be replaced by H0: p = 0.5 Vs H1: p > 0.5. To test these, we have used a sample with n = 100, ˆ / 0.59. obs p x n  
34. 34. BITS Pilani, Pilani Campus 0 0 0 0 ˆ 0.09 1.8. (1 ) / 0.25 /100 ( 1.8) 0.0359 P-value is relatively small. Hence we reject H . obs p p z p p n P value P Z          c) A majority of investment analysis think that falling energy prices is the dominant issue in the future of solar energy. Solution (Ex. 9.3)
35. 35. BITS Pilani, Pilani Campus HW 9.6 (Ex. 15) A battery operated digital pressure monitor is being developed for use in calibrating pneumatic pressure gauges in the field. It is thought that 95 % of the readings it gives lies within .01 lb/in2 of the true reading. In a series of 100 tests, the gauge is subjected to a pressure of 10,000 lb/in2. A test is considered to be a success if the reading lies within 10,000± 0.01 lb/in2. We want to test H0: p=0.95 H1: p≠0.95 at the =0.05 level. Problem Solving
36. 36. BITS Pilani, Pilani Campus a) What are the critical points for the test. b) When the data are gathered, it is found that 98 out of 100 readings were successful. Can H0 be rejected at α=0.05 level? To what type of error are you now subject? Problem Solving
37. 37. BITS Pilani, Pilani Campus Hints: H0: p=0.95 H1: p≠0.95 Z has the standard normal distribution. Since =0.05, z0.025 = 1.96, can we get critical values? . 2179 . 0 ) 95 . 0 ˆ ( 10 100 / ) 05 . 0 )( 95 . 0 ( 95 . 0 ˆ / ) 1 ( ˆ : Statistic Test 0 0 0        p p n p p p p Z
38. 38. BITS Pilani, Pilani Campus HW 9.7 (Ex. 14). Opponents of the construction of a dam on the New river claim that less than half of the residents living along the river are in favour of its construction. A survey is conducted to gain support for this point of view. a) Set up the appropriate null and alternate hypotheses. b) Find the critical point for an α=0.1 level test. c) Of 500 people surveyed, 230 favour the construction. Is it a sufficient evidence to justify the claim of the opponents? d) To what type of error are you now subject? Discuss the practical consequences of making such an error. Problem Solving