This document discusses estimating population parameters such as proportions, means, and standard deviations from sample data. It covers how to calculate confidence intervals for a population proportion based on a sample proportion. The key steps are to determine the sample proportion, calculate the margin of error using the sample size and a critical z-value, and use these to estimate the confidence interval. An example is provided to demonstrate calculating the confidence interval for a population proportion based on survey data. The summary accurately conveys the main topic and methods discussed in the document in under 3 sentences.
2. Chapter 7:
Estimating Parameters and Determining Sample Sizes
7.1 Estimating a Population Proportion
7.2 Estimating a Population Mean
7.3 Estimating a Population Standard Deviation or Variance
7.4 Bootstrapping: Using Technology for Estimates
2
Objectives:
• Find the confidence interval for a proportion.
• Determine the minimum sample size for finding a confidence interval for a proportion.
• Find the confidence interval for the mean when is known.
• Determine the minimum sample size for finding a confidence interval for the mean.
• Find the confidence interval for the mean when is unknown.
• Find a confidence interval for a variance and a standard deviation.
3. Recall: The Standard Normal Distribution
Normal Distribution
If a continuous random variable has a distribution with a graph that
is symmetric and bell-shaped, we say that it has a normal
distribution. The shape and position of the normal distribution
curve depend on two parameters, the mean and the standard
deviation.
SND: 1) Bell-shaped 2) µ = 0 3) σ = 1
3
2 2
( ) (2 )
2
x
e
y
2
1
2
:
2
x
e
OR y
x
z
TI Calculator:
Normal Distribution Area
1. 2nd + VARS
2. normalcdf(
3. 4 entries required
4. Left bound, Right
bound, value of the
Mean, Standard
deviation
5. Enter
6. For −∞, 𝒖𝒔𝒆 − 𝟏𝟎𝟎𝟎
7. For ∞, 𝒖𝒔𝒆 𝟏𝟎𝟎𝟎
TI Calculator:
Normal Distribution: find
the Z-score
1. 2nd + VARS
2. invNorm(
3. 3 entries required
4. Left Area, value of the
Mean, Standard
deviation
5. Enter
4. Sampling Distributions and Estimators: An estimator is a statistic used to infer (or
estimate) the value of a population parameter.
Unbiased Estimator: An unbiased estimator is a statistic that targets the value of the
corresponding population parameter in the sense that the sampling distribution of the
statistic has a mean that is equal to the corresponding population parameter such as:
Proportion: 𝒑 Mean: 𝒙 Variance: s²
Biased Estimator: These statistics are biased estimators. That is, they do not target the
value of the corresponding population parameter: Median, Range, Standard deviation s
Assessing Normality:
Determine whether the requirement of a normal distribution is satisfied:
(1) visual inspection of a histogram to see if it is roughly bell-shaped
(2) identifying any outliers; and
(3) constructing a normal quantile plot.
Normal Quantile Plot
A normal quantile plot (or normal probability plot) is a graph of points (x, y) where each x
value is from the original set of sample data, and each y value is the corresponding z score that is
expected from the standard normal distribution.
x, y = Z-score)
4
Normal as Approximation
to Binomial: Use a normal
distribution as an
approximation to the binomial
probability distribution: BD: n
& p⇾ q = 1 − p
if the conditions of np ≥ 5 and
nq ≥ 5 are both satisfied, then
probabilities from a binomial
probability distribution can be
approximated reasonably well
by using a normal distribution
having these parameters: 𝜇 =
𝑛𝑝 & 𝜎 = 𝑛𝑝𝑞
The binomial probability
distribution is discrete (with
whole numbers for the random
variable x), but the normal
approximation is continuous.
To compensate, we use a
“continuity correction” with a
whole number x represented by
the interval from x − 0.5 to x +
0.5.
Recall: Normal Distribution
5. Recall: Central Limit Theorem and the Sampling Distribution 𝒙
A sampling distribution of sample means is a distribution obtained by using the means computed from random samples of a
specific size taken from a population. Sampling error is the difference between the sample measure and the corresponding
population measure due to the fact that the sample is not a perfect representation of the population.
The Central Limit Theorem tells us that for a population with any distribution, the distribution of the sample means
approaches a normal distribution as the sample size increases.
1. The random variable x has a distribution (which may or may not be normal) with mean μ and standard deviation σ.
2. Simple random samples all of size n are selected from the population. (The samples are selected so that all possible samples of
the same size n have the same chance of being selected.)
Conclusion:
1. The distribution of sample 𝑥 will, as the sample size increases, approach a normal distribution.
2. The mean of the sample means is the population mean 𝜇 𝑥 = 𝜇 .
3. The standard deviation of all sample means ( also called the standard error of the mean) is 𝜎 𝑥= 𝜎/ 𝑛
1. For samples of size n > 30, the distribution of the sample means can be approximated reasonably well by a normal distribution.
The approximation becomes closer to a normal distribution as the sample size n becomes larger.
2. If the original population is normally distributed, then for any sample size n, the sample means will be normally distributed (not
just the values of n larger than 30).
Requirements: Population has a normal distribution or n > 30:
5/
x
x
x x
z
n
6. Margin of Error & Confidence Interval for Estimating a Population Proportion p &
Determining Sample Size:
When data from a simple random sample are used to estimate a population proportion
p, the margin of error (maximum error of the estimate ), denoted by E, is the
maximum likely difference (with probability 1 – α, such as 0.95) between the
observed (sample) proportion 𝑝 and the true value of the population proportion p.
6
ˆ
ˆ ˆ
p E
p E p p E
2 2
ˆ ˆ ˆ ˆ
ˆ ˆ
pq pq
p z p p z
n n
7.1 Estimating a Population Proportion Summary
2
ˆ ˆpq
E z
n
2
2
2
ˆ ˆ( )z pq
n
E
2
2
2
( ) 0.25z
n
E
When no estimate of 𝒑 is
known: 𝒑 = 𝒒 =0.5
Point estimate of p: 𝑝 =
𝑈𝐶𝐿+𝐿𝐶𝐿
2
, UCL: Upper Confidence Limit
Margin of error: 𝐸 =
𝑈𝐶𝐿−𝐿𝐶𝐿
2
, LCL: Lower Confidence Limit
7. Key Concept: Use a Sample proportion to make an inference about the value of the corresponding population proportion.
A point estimate is a specific numerical value estimate of a parameter. (A point estimate is a single value (or point) used to approximate
a population parameter. )
Properties of a good estimator:
1. The estimator should be an unbiased estimator. That is, the expected value or the mean of the estimates obtained from samples
of a given size is equal to the parameter being estimated.
2. The estimator should be consistent. For a consistent estimator, as sample size increases, the value of the estimator approaches the
value of the parameter estimated.
3. The estimator should be a relatively efficient estimator; that is, of all the statistics that can be used to estimate a parameter, the
relatively efficient estimator has the smallest variance.
• The sample proportion 𝑝 is the best point estimate of the population proportion p.
• p =
𝑋
𝑁
= population proportion 𝑝 =
𝑥
𝑛
(read p “hat”) = sample proportion, 𝑞 = 1 − 𝑝
x number of sample units that possess the characteristics of interest and n = sample size.
X number of population units that possess the characteristics of interest and N = population size.
• Confidence Interval: We can use a sample proportion to construct a confidence interval estimate of the true value of a population
proportion.
• Sample Size: We should know how to find the sample size necessary to estimate a population proportion.
7.1 Estimating a Population Proportion
7
8. A Gallup poll was taken in which 1487 adults were surveyed and 43% of them
said that they have a Facebook page. Based on that result, find the best point
estimate of the proportion of all adults who have a Facebook page.
Example 1
Solution
Point estimate (PE) of p = 𝑝 = 0.43
Example 2
In a recent survey of 150 households, 54 had central air conditioning. Find 𝑝 and 𝑞 ,
where 𝑝 is the proportion of households that have central air conditioning.
x = 54 and n = 150 54
ˆ 0.36 36%
150
x
p
n
ˆ ˆ1 1 0.36 0.64 64% q p
8
9. Confidence Interval: A confidence interval (or interval estimate) is a range (or
an interval) of values used to estimate the true value of a population parameter. A
confidence interval is sometimes abbreviated as CI. This estimate may or may not
contain the value of the parameter being estimated.
Confidence Level: The confidence level is the probability 1 − α (such as 0.95, or
95%) that the confidence interval actually does contain the population parameter,
assuming that the estimation process is repeated a large number of times. (The
confidence level is also called the degree of confidence, or the confidence
coefficient.) The confidence level of 95% is the value used most often.
7.1 Estimating a Population Proportion
Most Common Confidence Levels Corresponding Values of α
90% (or 0.90) confidence level: α = 0.10
95% (or 0.95) confidence level: α = 0.05
99% (or 0.99) confidence level: α = 0.01
9
10. Interpreting a Confidence Interval
We must be careful to interpret confidence intervals correctly. There is a correct interpretation and
many different and creative incorrect interpretations of the confidence interval 0.405 < p < 0.455.
Correct: “We are 95% confident that the interval from 0.405 to 0.455 actually does contain the true
value of the population proportion p.”
Wrong: “There is a 95% chance that the true value of p will fall between 0.405 and 0.455.”
Wrong: “95% of sample proportions will fall between 0.405 and 0.455.”
The Process Success Rate: A confidence level of 95% tells us that the process we are using
should, in the long run, result in confidence interval limits that contain the true population proportion
95% of the time.
7.1 Estimating a Population Proportion
Confidence Interval from
20 Different Samples
10
Confidence intervals can be used informally to compare
different data sets, but the overlapping of confidence
intervals should not be used for making formal and final
conclusions about equality of proportions.
11. A critical value is the number on the borderline separating sample statistics that are
significantly high or low from those that are not significant. The number zα/2 is a
critical value that is a z score with the property that it separates an area of α/2 in the
right tail of the standard normal distribution.
A standard z score can be used to distinguish between sample statistics that are likely
to occur and those that are unlikely to occur. Such a z score is called a critical value.
Critical values are based on the following observations:
11
7.1 Estimating a Population Proportion, Critical Values
1. Under certain conditions, the sampling distribution of sample
proportions can be approximated by a normal distribution.
2. A z score associated with a sample proportion has a
probability of α/2 of falling in the right tail.
3. The z score separating the right-tail region is commonly
denoted by zα/2 and is referred to as a critical value because it
is on the borderline separating z scores from sample
proportions that are likely to occur from those that are unlikely
to occur.
12. Finding zα/2 for a 95% Confidence Level
Critical Values
This is the most
common critical value,
and it is listed with two
other common values in
the table that follows.
13. Margin of Error & Confidence Interval for Estimating a Population Proportion p:
When data from a simple random sample are used to estimate a population proportion p, the margin of
error (also called the maximum error of the estimate ), denoted by E, is the maximum likely difference
(with probability 1 – α, such as 0.95) between the observed (sample) proportion 𝑝 and the true value of the
population proportion p.
𝑝 = sample proportion
n = number of sample values
E = margin of error
zα/2 = z score separating an area of α/2 in the right tail of the standard normal distribution.
1. The sample is a simple random sample.
2. The conditions for the binomial distribution are satisfied: There is a fixed number of trials, the
trials are independent, there are two categories of outcomes, and the probabilities remain constant
for each trial.
3. There are at least 5 successes and at least 5 failures (np ≥ 5, and nq ≥ 5 ).
13
ˆ
ˆ ˆ
p E
p E p p E
2 2
ˆ ˆ ˆ ˆ
ˆ ˆ
pq pq
p z p p z
n n
7.1 Estimating a Population Proportion
2
ˆ ˆpq
E z
n
14. Recall that a Gallup poll of 1487 adults showed that 43% of the respondents have Facebook pages.
a. Find the margin of error E that corresponds to a 95% confidence level.
b. Find the 95% confidence interval estimate of the population proportion p.
c. Based on the results, can we safely conclude that fewer than 50% of adults have Facebook pages?
Assuming that you are a newspaper reporter, write a brief statement that accurately describes the
results and includes all of the relevant information.
Example 2
Solution: Given: Binomial Distribution(BD) n = 1487, 𝑝 = 0.43
n 𝑝 = (1487)(0.43) = 639 & n 𝑞 = (1487)(0.57) = 848 ⇾ n 𝑝 ≥ 5 & n 𝑞 ≥ 5
14
2
ˆ ˆpq
E z
n
ˆ ˆ ˆ,p E p E p p E
𝑏. 𝑝 ± 𝐸 → 0.4048 < p < 0.4552
c. yes, because the interval of values is an interval that is completely below 0.50.
0.43(0.57)
1.96 0.0252
14 7
.
8
Ea
TI Calculator:
Confidence Interval:
proportion
1. Stat
2. Tests
3. 1-prop ZINT
4. Enter: x, n & CL
43% of adults have Facebook pages. That percentage is based on a Gallup poll of
1487 randomly selected adults in the United States.
In theory, in 95% of such polls, the percentage should differ by no more than 2.52
percentage points in either direction from the percentage that would be found by
interviewing all adults.
15. A poll of 1007 (by Pew Research)
randomly selected adults showed that 85%
of respondents know what Twitter is.
a. Find the margin of error E that
corresponds to a 95% confidence level.
b. Find the 95% confidence interval
estimate of the population proportion p.
c. Based on the results, can we safely
conclude that more than 75% of adults
know what Twitter is?
d. Assuming that you are a newspaper
reporter, write a brief statement that
accurately describes the results and
includes all of the relevant information.
Example 3
Solution: BD: n = 1007, 𝑝 = 0.85
n 𝑝 = (1007)(0.85) = 855.951007
& n 𝑞 = (1007)(0.85) = 151.05 ⇾
n 𝑝 ≥ 5 & n 𝑞 ≥ 5
15
0.85
.
0.15
1.96
1007
Ea
2
ˆ ˆpq
E z
n
ˆ ˆ ˆ,p E p E p p E
0.0220545
𝑝 ± 𝐸 →
0.85 0.0220545 0.85 0b .0220545. p
0.8279 0.8721p
c. Yes, because the limits of 0.828 and 0.872 are
likely to contain the true population proportion, it
appears that the population proportion is a value
greater than 0.75.
d. 85% of U.S. adults know what Twitter is. That percentage is
based on a Pew Research Center poll of 1007 randomly selected
adults.
In theory, in 95% of such polls, the percentage should differ by
no more than 2.21 percentage points in either direction from the
percentage that would be found by interviewing all adults in the
16. Finding the Point Estimate and E from a Confidence Interval
Point estimate of p: 𝑝 =
𝑈𝐶𝐿+𝐿𝐶𝐿
2
, UCL: Upper Confidence Limit
Margin of error: 𝐸 =
𝑈𝐶𝐿−𝐿𝐶𝐿
2
, LCL: Lower Confidence Limit
16
7.1 Estimating a Population Proportion
Assume “Of the 71 subjects, 70% were abstinent from smoking at 8 weeks
(95% confidence interval [CI], 58% to 81%).”
Use the above statement to find the point estimate p and the margin of error E.
Example 4
Solution: 95% CI: 0.58 < p < 0.81
𝑝 =
0.81 + 0.58
2
= 0.695
𝐸 =
0.81 − 0.58
2
= 0.115
17. Determining Sample Size: Finding the Sample Size Required to Estimate a Population
Proportion: Objective:
Suppose we want to collect sample data in order to estimate some population proportion.
The question is how many sample items must be obtained?
17
7.1 Estimating a Population Proportion
2
ˆ ˆpq
E z
n
(solve for n by algebra)
2
2
2
ˆ ˆ( )z pq
n
E
2
2
2
( ) 0.25z
n
E
When no estimate of 𝒑 is not
known: 𝒑 = 𝒒 =0.5
18. Many companies are interested in knowing the percentage of adults who buy clothing online.
How many adults must be surveyed in order to be 95% confident that the sample percentage is in error by no more
than three percentage points?
a. Use a recent result from the Census Bureau: 66% of adults buy clothing online.
b. Assume that we have no prior information suggesting a possible value of the proportion.
Example 5
Solution: a. 𝑝 = 0.66, 𝐶𝐿 = 95%, 𝐸 = 0.03
ˆ ˆ1 1 0.66 0.34q p
18
2
2
2
ˆ ˆ( )z pq
n
E
295% 1.96CL z
2
2
(1.96) (0.66)(0.34)
(0.03)
957.8 958n
b. Assume 𝑝 = 𝑞 = 0.5, 𝐶𝐿 = 95%, 𝐸 = 0.03
2
2
2
( ) 0.25z
n
E
2
2
(1.96) 0.25
(0.03)
1067.11 1068n
2
2
2
ˆ ˆ( )z pq
n
E
19. A researcher wishes to estimate, with 95% confidence, the proportion of people
who own a home computer. A previous study shows that 40% of those
interviewed had a computer at home. The researcher wishes to be accurate
within 2% of the true proportion. Find the minimum sample size necessary.
Example 6
Solution: a. 𝑝 = 0.4, 𝐶𝐿 = 95%, 𝐸 = 0.02
ˆ ˆ1 1 0.4 0.6q p
19
2
2
2
ˆ ˆ( )z pq
n
E
295% 1.96CL z
2
2
(1.96) (0.4)(0.6)
(0.02)
2304.96 2305n
2304.96
2
ˆ ˆpq
E z
n
20. A survey of 1404 respondents found that 323 students paid
for their education by student loans. Find the 90%
confidence of the true proportion of students who paid for
their education by student loans.
Solution: Given: Binomial Distribution(BD)
n = 1404, 𝑥 = 323, 𝐶𝐿 = 90%
20
323
ˆ 0.23
1404
p
2
ˆ ˆpq
E z
n
ˆ ˆ ˆ,p E p E p p E
𝑝 ± 𝐸 →
0.23 0.019 0.23 0.019p
0.211 0.249p
0.23 0.77
1.645
1404
E
1 0.23 0.77q ⇾ n 𝑝 ≥ 5 & n 𝑞 ≥ 5
You can be 90%
confident that the
percentage of
students who pay for
their college
education by student
loans is between 21.1
and 24.9%.
Example 7
TI Calculator:
Confidence Interval:
proportion
1. Stat
2. Tests
3. 1-prop ZINT
4. Enter: x, n & CL
0.019
21. A survey of 1721 people found that 15.9% of individuals
purchase religious books at a Christian bookstore. Find the
95% confidence interval of the true proportion of people who
purchase their religious books at a Christian bookstore.
21
2
ˆ ˆpq
E z
n
ˆ ˆ ˆ,p E p E p p E
2 2
ˆ ˆ ˆ ˆ
ˆ ˆ
pq pq
p z p p z
n n
0.159 0.841 0.159 0.841
0.159 1.96 0.159 1.96
1721 1721
p
0.142 0.176 p
Solution: Given: Binomial Distribution(BD)
n = 1721, 𝑝 = 0.159, 𝐶𝐿 = 95%
Example 8