Your SlideShare is downloading. ×
  • Like
ch_07.ppt
Upcoming SlideShare
Loading in...5
×

Thanks for flagging this SlideShare!

Oops! An error has occurred.

×

Now you can save presentations on your phone or tablet

Available for both IPhone and Android

Text the download link to your phone

Standard text messaging rates apply
Published

 

  • Full Name Full Name Comment goes here.
    Are you sure you want to
    Your message goes here
    Be the first to comment
No Downloads

Views

Total Views
4,224
On SlideShare
0
From Embeds
0
Number of Embeds
0

Actions

Shares
Downloads
149
Comments
0
Likes
2

Embeds 0

No embeds

Report content

Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
    No notes for slide

Transcript

  • 1. Chapter 7 Statistical Inference: Confidence Intervals
    • Learn ….
    • How to Estimate a Population
    • Parameter Using Sample Data
  • 2.
    • Section 7.1
    • What Are Point and Interval Estimates of Population Parameters?
  • 3. Point Estimate
    • A point estimate is a single number that is our “best guess” for the parameter
  • 4. Interval Estimate
    • An interval estimate is an interval of numbers within which the parameter value is believed to fall.
  • 5. Point Estimate vs Interval Estimate
  • 6. Point Estimate vs Interval Estimate
    • A point estimate doesn’t tell us how close the estimate is likely to be to the parameter
    • An interval estimate is more useful
      • It incorporates a margin of error which helps us to gauge the accuracy of the point estimate
  • 7. Point Estimation: How Do We Make a Best Guess for a Population Parameter?
    • Use an appropriate sample statistic:
      • For the population mean , use the sample mean
      • For the population proportion , use the sample proportion
  • 8. Point Estimation: How Do We Make a Best Guess for a Population Parameter?
    • Point estimates are the most common form of inference reported by the mass media
  • 9. Properties of Point Estimators
    • Property 1: A good estimator has a sampling distribution that is centered at the parameter
      • An estimator with this property is unbiased
        • The sample mean is an unbiased estimator of the population mean
        • The sample proportion is an unbiased estimator of the population proportion
  • 10. Properties of Point Estimators
    • Property 2: A good estimator has a small standard error compared to other estimators
      • This means it tends to fall closer than other estimates to the parameter
  • 11. Interval Estimation: Constructing an Interval that Contains the Parameter (We Hope!)
    • Inference about a parameter should provide not only a point estimate but should also indicate its likely precision
  • 12. Confidence Interval
    • A confidence interval is an interval containing the most believable values for a parameter
    • The probability that this method produces an interval that contains the parameter is called the confidence level
      • This is a number chosen to be close to 1, most commonly 0.95
  • 13. What is the Logic Behind Constructing a Confidence Interval?
    • To construct a confidence interval for a population proportion, start with the sampling distribution of a sample proportion
  • 14. The Sampling Distribution of the Sample Proportion
    • Gives the possible values for the sample proportion and their probabilities
    • Is approximately a normal distribution for large random samples
    • Has a mean equal to the population proportion
    • Has a standard deviation called the standard error
  • 15. A 95% Confidence Interval for a Population Proportion
    • Fact: Approximately 95% of a normal distribution falls within 1.96 standard deviations of the mean
      • That means: With probability 0.95, the sample proportion falls within about 1.96 standard errors of the population proportion
  • 16. Margin of Error
    • The margin of error measures how accurate the point estimate is likely to be in estimating a parameter
    • The distance of 1.96 standard errors in the margin of error for a 95% confidence interval
  • 17. Confidence Interval
    • A confidence interval is constructed by adding and subtracting a margin of error from a given point estimate
    • When the sampling distribution is approximately normal, a 95% confidence interval has margin of error equal to 1.96 standard errors
  • 18.
    • Section 7.2
    • How Can We Construct a Confidence Interval to Estimate a Population Proportion?
  • 19. Finding the 95% Confidence Interval for a Population Proportion
    • We symbolize a population proportion by p
    • The point estimate of the population proportion is the sample proportion
    • We symbolize the sample proportion by
  • 20. Finding the 95% Confidence Interval for a Population Proportion
    • A 95% confidence interval uses a margin of error = 1.96(standard errors)
    • [point estimate ± margin of error] =
  • 21. Finding the 95% Confidence Interval for a Population Proportion
    • The exact standard error of a sample proportion equals:
    • This formula depends on the unknown population proportion, p
    • In practice, we don’t know p , and we need to estimate the standard error
  • 22. Finding the 95% Confidence Interval for a Population Proportion
    • In practice, we use an estimated standard error :
  • 23. Finding the 95% Confidence Interval for a Population Proportion
    • A 95% confidence interval for a population proportion p is:
  • 24. Example: Would You Pay Higher Prices to Protect the Environment?
    • In 2000, the GSS asked: “Are you willing to pay much higher prices in order to protect the environment?”
      • Of n = 1154 respondents, 518 were willing to do so
  • 25. Example: Would You Pay Higher Prices to Protect the Environment?
    • Find and interpret a 95% confidence interval for the population proportion of adult Americans willing to do so at the time of the survey
  • 26. Example: Would You Pay Higher Prices to Protect the Environment?
  • 27. Sample Size Needed for Large-Sample Confidence Interval for a Proportion
    • For the 95% confidence interval for a proportion p to be valid, you should have at least 15 successes and 15 failures:
  • 28. “95% Confidence”
    • With probability 0.95 , a sample proportion value occurs such that the confidence interval contains the population proportion, p
    • With probability 0.05 , the method produces a confidence interval that misses p
  • 29. How Can We Use Confidence Levels Other than 95%?
    • In practice, the confidence level 0.95 is the most common choice
    • But, some applications require greater confidence
    • To increase the chance of a correct inference, we use a larger confidence level, such as 0.99
  • 30. A 99% Confidence Interval for p
  • 31. Different Confidence Levels
  • 32. Different Confidence Levels
    • In using confidence intervals, we must compromise between the desired margin of error and the desired confidence of a correct inference
      • As the desired confidence level increases, the margin of error gets larger
  • 33. What is the Error Probability for the Confidence Interval Method?
    • The general formula for the confidence interval for a population proportion is:
    • Sample proportion ± (z-score)(std. error)
    • which in symbols is
  • 34. What is the Error Probability for the Confidence Interval Method?
  • 35. Summary: Confidence Interval for a Population Proportion, p
    • A confidence interval for a population proportion p is:
  • 36. Summary: Effects of Confidence Level and Sample Size on Margin of Error
    • The margin of error for a confidence interval:
      • Increases as the confidence level increases
      • Decreases as the sample size increases
  • 37. What Does It Mean to Say that We Have “95% Confidence”?
    • If we used the 95% confidence interval method to estimate many population proportions, then in the long run about 95% of those intervals would give correct results, containing the population proportion
  • 38. A recent survey asked: “During the last year, did anyone take something from you by force?”
    • Of 987 subjects, 17 answered “yes”
    • Find the point estimate of the proportion of the population who were victims
    • .17
    • .017
    • .0017
  • 39.
    • Section 7.3
    • How Can We Construct a Confidence Interval To Estimate a Population Mean?
  • 40. How to Construct a Confidence Interval for a Population Mean
    • Point estimate ± margin of error
    • The sample mean is the point estimate of the population mean
    • The exact standard error of the sample mean is σ /
    • In practice, we estimate σ by the sample standard deviation, s
  • 41. How to Construct a Confidence Interval for a Population Mean
    • For large n…
      • and also
    • For small n from an underlying population that is normal…
    • The confidence interval for the population mean is:
  • 42. How to Construct a Confidence Interval for a Population Mean
    • In practice, we don’t know the population standard deviation
    • Substituting the sample standard deviation s for σ to get se = s/ introduces extra error
    • To account for this increased error, we replace the z-score by a slightly larger score, the t-score
  • 43. How to Construct a Confidence Interval for a Population Mean
    • In practice, we estimate the standard error of the sample mean by se = s/
    • Then, we multiply se by a t-score from the t-distribution to get the margin of error for a confidence interval for the population mean
  • 44. Properties of the t-distribution
    • The t-distribution is bell shaped and symmetric about 0
    • The probabilities depend on the degrees of freedom, df
    • The t-distribution has thicker tails and is more spread out than the standard normal distribution
  • 45. t-Distribution
  • 46. Summary: 95% Confidence Interval for a Population Mean
    • A 95% confidence interval for the population mean µ is:
    • To use this method, you need:
      • Data obtained by randomization
      • An approximately normal population distribution
  • 47. Example: eBay Auctions of Palm Handheld Computers
    • Do you tend to get a higher, or a lower, price if you give bidders the “buy-it-now” option?
  • 48. Example: eBay Auctions of Palm Handheld Computers
    • Consider some data from sales of the Palm M515 PDA (personal digital assistant)
    • During the first week of May 2003, 25 of these handheld computers were auctioned off, 7 of which had the “buy-it-now” option
  • 49. Example: eBay Auctions of Palm Handheld Computers
    • “ Buy-it-now” option:
    • 235 225 225 240 250 250 210
    • Bidding only:
    • 250 249 255 200 199 240 228 255 232 246 210 178 246 240 245 225 246 225
  • 50. Example: eBay Auctions of Palm Handheld Computers
    • Summary of selling prices for the two types of auctions:
    • buy_now N Mean StDev Minimum Q1 Median Q3
    • no 18 231.61 21.94 178.00 221.25 240.00 246.75 yes 7 233.57 14.64 210.00 225.00 235.00 250.00
    • buy_now Maximum
    • no 255.00
    • yes 250.00
  • 51. Example: eBay Auctions of Palm Handheld Computers
  • 52. Example: eBay Auctions of Palm Handheld Computers
    • To construct a confidence interval using the t-distribution, we must assume a random sample from an approximately normal population of selling prices
  • 53. Example: eBay Auctions of Palm Handheld Computers
    • Let µ denote the population mean for the “buy-it-now” option
    • The estimate of µ is the sample mean:
    • x = $233.57
    • The sample standard deviation is:
    • s = $14.64
  • 54. Example: eBay Auctions of Palm Handheld Computers
    • The 95% confidence interval for the “buy-it-now” option is:
    • which is 233.57 ± 13.54 or (220.03, 247.11)
  • 55. Example: eBay Auctions of Palm Handheld Computers
    • The 95% confidence interval for the mean sales price for the bidding only option is:
    • (220.70, 242.52)
  • 56. Example: eBay Auctions of Palm Handheld Computers
    • Notice that the two intervals overlap a great deal:
      • “ Buy-it-now”: (220.03, 247.11)
      • Bidding only: (220.70, 242.52)
    • There is not enough information for us to conclude that one probability distribution clearly has a higher mean than the other
  • 57. How Do We Find a t- C onfidence Interval for Other Confidence Levels?
    • The 95% confidence interval uses t .025 since 95% of the probability falls between - t .025 and t .025
    • For 99% confidence, the error probability is 0.01 with 0.005 in each tail and the appropriate t-score is t .005
  • 58. If the Population is Not Normal, is the Method “Robust”?
    • A basic assumption of the confidence interval using the t- distribution is that the population distribution is normal
    • Many variables have distributions that are far from normal
  • 59. If the Population is Not Normal, is the Method “Robust”?
    • How problematic is it if we use the t- confidence interval even if the population distribution is not normal?
  • 60. If the Population is Not Normal, is the Method “Robust”?
    • For large random samples, it’s not problematic
    • The Central Limit Theorem applies: for large n, the sampling distribution is bell-shaped even when the population is not
  • 61. If the Population is Not Normal, is the Method “Robust”?
    • What about a confidence interval using the t -distribution when n is small?
    • Even if the population distribution is not normal, confidence intervals using t -scores usually work quite well
    • We say the t -distribution is a robust method in terms of the normality assumption
  • 62. Cases Where the t- Confidence Interval Does Not Work
    • With binary data
    • With data that contain extreme outliers
  • 63. The Standard Normal Distribution is the t -Distribution with df = ∞
  • 64. The 2002 GSS asked: “What do you think is the ideal number of children in a family?”
    • The 497 females who responded had a median of 2, mean of 3.02, and standard deviation of 1.81. What is the point estimate of the population mean?
    • 497
    • 2
    • 3.02
    • 1.81
  • 65.
    • Section 7.4
    • How Do We Choose the Sample Size for a Study?
  • 66. How are the Sample Sizes Determined in Polls?
    • It depends on how much precision is needed as measured by the margin of error
    • The smaller the margin of error, the larger the sample size must be
  • 67. Choosing the Sample Size for Estimating a Population Proportion?
    • First, we must decide on the desired margin of error
    • Second, we must choose the confidence level for achieving that margin of error
    • In practice, 95% confidence intervals are most common
  • 68. Example: What Sample Size Do You Need For An Exit Poll?
    • A television network plans to predict the outcome of an election between two candidates – Levin and Sanchez
    • They will do this with an exit poll that randomly samples votes on election day
  • 69. Example: What Sample Size Do You Need For An Exit Poll?
    • The final poll a week before election day estimated Levin to be well ahead, 58% to 42%
      • So the outcome is not expected to be close
    • The researchers decide to use a sample size for which the margin of error is 0.04
  • 70. Example: What Sample Size Do You Need For An Exit Poll?
    • What is the sample size for which a 95% confidence interval for the population proportion has margin of error equal to 0.04?
  • 71. Example: What Sample Size Do You Need For An Exit Poll?
    • The 95% confidence interval for a population proportion p is:
    • If the sample size is such that 1.96( se ) = 0.04, then the margin of error will be 0.04
  • 72. Example: What Sample Size Do You Need For An Exit Poll?
    • Find the value of the sample size n for which 0.04 = 1.96( se ):
  • 73. Example: What Sample Size Do You Need For An Exit Poll?
    • A random sample of size n = 585 should give a margin of error of about 0.04 for a 95% confidence interval for the population proportion
  • 74. How Can We Select a Sample Size Without Guessing a Value for the Sample Proportion
    • In the formula for determining n, setting
    • = 0.50 gives the largest value for n out of all the possible values to substitute for
    • Doing this is the “safe” approach that guarantees we’ll have enough data
  • 75. Sample Size for Estimating a Population Parameter
    • The random sample size n for which a confidence interval for a population proportion p has margin of error m (such as m = 0.04) is
  • 76. Sample Size for Estimating a Population Parameter
    • The z-score is based on the confidence level, such as z = 1.96 for 95% confidence
    • You either guess the value you’d get for the sample proportion based on other information or take the safe approach of setting = 0.50
  • 77. Sample Size for Estimating a Population Mean
    • The random sample size n for which a 95% confidence interval for a population mean has margin of error approximately equal to m is
    • To use this formula, you guess the value you’ll get for the sample standard deviation, s
  • 78. Sample Size for Estimating a Population Mean
    • In practice, since you don’t yet have the data, you don’t know the value of the sample standard deviation, s
    • You must substitute an educated guess for s
      • You can use the sample standard deviation from a similar study
  • 79. Example: Finding n to Estimate Mean Education in South Africa
    • A social scientist plans a study of adult South Africans to investigate educational attainment in the black community
    • How large a sample size is needed so that a 95% confidence interval for the mean number of years of education has margin of error equal to 1 year?
  • 80. Example: Finding n to Estimate Mean Education in South Africa
    • No prior information about the standard deviation of educational attainment is available
    • We might guess that the sample education values fall within a range of about 18 years
  • 81. Example: Finding n to Estimate Mean Education in South Africa
    • If the data distribution is bell-shaped, the range from – 3 to + 3 will contain nearly all the distribution
    • The distance – 3 to + 3 equals 6s
    • Solving 18 = 6s for s yields s = 3
    • So ‘3’ is a crude estimate of s
  • 82. Example: Finding n to Estimate Mean Education in South Africa
    • The desired margin of error is m = 1 year
    • The required sample size is :
  • 83. What Factors Affect the Choice of the Sample Size?
    • The first is the desired precision, as measured by the margin of error, m
    • The second is the confidence level
  • 84. What Other Factors Affect the Choice of the Sample Size?
    • A third factor is the variability in the data
      • If subjects have little variation (that is, s is small), we need fewer data than if they have substantial variation
    • A fourth factor is financial
      • Cost is often a major constraint
  • 85. What if You Have to Use a Small n ?
    • The t- methods for a mean are valid for any n
      • However, you need to be extra cautious to look for extreme outliers or great departures from the normal population assumption
  • 86. What if You Have to Use a Small n ?
    • In the case of the confidence interval for a population proportion, the method works poorly for small samples
  • 87. Constructing a Small-Sample Confidence Interval for a Proportion
    • Suppose a random sample does not have at least 15 successes and 15 failures
    • The confidence interval formula:
      • Is still valid if we use it after adding ‘2’ to the original number of successes and ‘2’ to the original number of failures
    • This results in adding ‘4’ to the sample size n