14. Populations and Samples
• Population: The entire group about which information
is desired.
• Sample: A portion or part of the population, usually
the portion from which information is gathered.
16. Population Vs. Sample
[Figure: a sample drawn from the population of interest]
Population: described by a parameter
Sample: described by a statistic
We measure the sample using statistics in order to draw
inferences about the population and its parameters.
22. Example of computer simulation…
• How many heads come up in 100 coin tosses?
• Flip coins virtually
– Flip a coin 100 times; count the number of heads.
– Repeat this over and over again a large number of
times (we’ll try 30,000 repeats!)
– Plot the 30,000 results.
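The virtual coin flipping described above can be sketched in a few lines of Python. This is only an illustration: the function name `simulate_heads`, the fixed seed, and the text histogram that stands in for the plot are assumptions, not part of the original slides.

```python
import random
from collections import Counter

def simulate_heads(num_repeats=30_000, tosses=100, seed=0):
    """Return the number of heads seen in each run of `tosses` fair coin flips."""
    rng = random.Random(seed)  # fixed seed (an assumption) so the run is repeatable
    # getrandbits(tosses) draws `tosses` fair random bits at once; each 1-bit is a head.
    return [bin(rng.getrandbits(tosses)).count("1") for _ in range(num_repeats)]

heads = simulate_heads()
counts = Counter(heads)  # number of heads -> how many runs produced it

share_40_to_60 = sum(1 for h in heads if 40 <= h <= 60) / len(heads)
print(f"Share of runs with 40 to 60 heads: {share_40_to_60:.3f}")
print(f"Fewest heads seen: {min(heads)}, most heads seen: {max(heads)}")

# A crude text histogram stands in for the plot of the 30,000 results.
for h in range(35, 66, 5):
    print(f"{h:3d} heads: {'#' * (counts[h] // 50)}")
```

Changing the seed gives a different set of 30,000 results, but the overall shape of the histogram should stay much the same.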
23. Coin tosses…
Conclusions:
• We usually get between 40 and 60 heads when we flip a coin 100 times.
• It’s extremely unlikely that we will get 30 heads or 70 heads (didn’t happen in 30,000 experiments!).
29. Sampling
• In its broadest sense, sampling is a procedure by which
one or more members of a population are picked from
the population.
• The objective is to make certain observations upon the
members of the sample and then, on the basis of these
results, to draw conclusions about the characteristics of
the entire population.
31. Looking at the Process
When we randomly select a sample from a
population, we can use the sample mean as
an estimate of the population mean. This raises
the question of how good the sample mean
(a sample statistic) is as a guess for the
population mean (a population parameter).
The essence of this question is how well the
process of using a sample to make guesses
about the population works.
32. How Good is a Sample Mean
The essential question is “How good is a sample mean
as an estimate of the population mean?”
One way to examine this question is to understand
that we used a process that involved randomly
selecting a sample from the population and then
calculating the mean for the values of the
observations in the sample.
We can repeat this process as many times as we wish
and examine what it produces.
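As a rough sketch of repeating this process, the Python below draws ten random samples from a made-up population and reports the mean, variance, and standard deviation of each. The population itself (10,000 weights generated with mean 150 lbs and standard deviation 10 lbs), the seed, and the helper name `draw_sample_stats` are all assumptions chosen for illustration.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed (an assumption) for repeatability

# Hypothetical population of 10,000 weights in lbs; mean 150 and sd 10 are assumed.
population = [rng.gauss(150, 10) for _ in range(10_000)]

def draw_sample_stats(population, n, rng):
    """Randomly select n observations; return the sample's (mean, variance, sd)."""
    sample = rng.sample(population, n)  # sampling without replacement
    return (statistics.mean(sample),
            statistics.variance(sample),  # sample variance s^2 (n - 1 divisor)
            statistics.stdev(sample))     # sample standard deviation s

results = [draw_sample_stats(population, 5, rng) for _ in range(10)]

print("Sample  n    Mean      s^2       s")
for i, (mean, var, sd) in enumerate(results, start=1):
    print(f"{i:6d}  5  {mean:6.2f}  {var:7.2f}  {sd:6.2f}")

sample_means = [mean for mean, _, _ in results]
print(f"Population mean: {statistics.mean(population):.2f}")
print(f"Average of the ten sample means: {statistics.mean(sample_means):.2f}")
```

Each run of the loop is one repetition of the process, so the printed sample means show how much the estimate moves from one sample to the next.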
36. Sample with n = 5
[Figure: a population of weights, from which a sample of 5 weights is drawn]
Sample of 5 weights:
n = 5; \( \sum x = 732 \)
\( \bar{x} = \frac{\sum x}{n} = \frac{732}{5} = 146.4 \)
37. Ten Different Samples, n = 5
Sample     n     Mean      s^2        s
1          5   147.43    88.14     9.39
2          5   153.98   117.91    10.86
3          5   146.50   103.66    10.18
4          5   155.53    91.99     9.59
5          5   147.87   149.65    12.23
6          5   143.60    66.76     8.17
7          5   146.87    64.23     8.01
8          5   149.19   280.88    16.76
9          5   150.05   200.28    14.15
10         5   146.92   173.36    13.17
Average        148.79   133.69    11.25
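38. Sampling Distributions
Individual observations (n = 1): 149, 146, ...
\( \mu = 150 \text{ lbs} \)
\( \sigma^2 = 100 \text{ lbs}^2 \)
\( \sigma = 10 \text{ lbs} \)
Means for n = 5: 153.0, 146.4, ...
\( \mu = 150 \text{ lbs} \)
\( \sigma_{\bar{x}}^2 = \frac{\sigma^2}{n} = 20 \text{ lbs}^2 \)
\( \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = 4.47 \text{ lbs} \)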
39. Standard Error of the Mean
\( \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \)
The means of all possible samples of size n form a long
list of numbers, and the variance of these numbers can,
in theory, be calculated.
The square root of this variance is called the standard
error of the mean. It is simply the standard deviation
for this population of means.
\( \sigma_{\bar{x}}^2 = \frac{\sigma^2}{n} \)
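The formula can be checked numerically. The sketch below computes the standard error for n = 5 and n = 20 with σ = 10 lbs, then compares it with the standard deviation of many simulated sample means. The simulation only approximates "all possible samples", and the Normal population, the 20,000 repeats, the seed, and the function names are assumptions.

```python
import math
import random
import statistics

def standard_error(sigma, n):
    """Standard error of the mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

def simulated_sd_of_means(mu, sigma, n, repeats=20_000, seed=1):
    """Draw `repeats` samples of size n and return the sd of their means."""
    rng = random.Random(seed)
    # Assumption: individual observations come from a Normal(mu, sigma) population.
    means = [statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
             for _ in range(repeats)]
    return statistics.pstdev(means)

for n in (5, 20):
    theory = standard_error(10, n)
    observed = simulated_sd_of_means(150, 10, n)
    print(f"n = {n:2d}: sigma/sqrt(n) = {theory:.2f} lbs, simulated = {observed:.2f} lbs")
```

The two columns should agree closely, and the agreement tightens as the number of repeats grows.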
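40. Sample with n = 20
[Figure: weights shown on the slide: 113, 145, 148, 151, 102, 111, 181, 189, 154, 114, 120, 191, 105, 206, 171, 133, 101, 198, 127, 136, 161]
Sample of 20 weights:
n = 20; \( \sum x = 3057 \)
\( \bar{x} = \frac{\sum x}{n} = \frac{3057}{20} = 152.85 \)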
41. Ten Different Samples, n = 20
Sample     n     Mean      s^2        s
1         20   150.86   100.96    10.05
2         20   146.88   122.70    11.08
3         20   147.65   119.51    10.93
4         20   149.37    51.07     7.15
5         20   153.30   109.54    10.47
6         20   152.83   111.96    10.58
7         20   148.62    91.94     9.59
8         20   152.16   140.83    11.87
9         20   154.40   179.56    13.40
10        20   151.43   115.85    10.76
Average        150.75   114.39    10.59
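42. Sampling Distributions
Individual observations (n = 1): 149, 146, ...
\( \mu = 150 \text{ lbs} \)
\( \sigma^2 = 100 \text{ lbs}^2 \)
\( \sigma = 10 \text{ lbs} \)
Means for n = 5: 153.0, 146.4, ...
\( \mu = 150 \text{ lbs} \)
\( \sigma_{\bar{x}}^2 = \frac{\sigma^2}{n} = 20 \text{ lbs}^2 \)
\( \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = 4.47 \text{ lbs} \)
Means for n = 20: 151.6, 151.3, ...
\( \mu = 150 \text{ lbs} \)
\( \sigma_{\bar{x}}^2 = \frac{\sigma^2}{n} = 5 \text{ lbs}^2 \)
\( \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = 2.24 \text{ lbs} \)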
43. A Sampling Distribution
Let’s create a sampling distribution of means…
Take a sample of size 1,500 from the US. Record the mean income. Our
census said the mean is $30K.
$30K
44. A Sampling Distribution
Let’s create a sampling distribution of means…
Take another sample of size 1,500 from the US. Record the mean income.
Our census said the mean is $30K.
$30K
49. A Sampling Distribution
Let’s create a sampling distribution of means…
Let’s keep drawing samples of size 1,500 from the US and recording the mean incomes.
Our census said the mean is $30K.
$30K
52. A Sampling Distribution
Let’s create a sampling distribution of means…
Let’s keep drawing samples of size 1,500 from the US and recording the mean incomes.
Our census said the mean is $30K.
The sample means would stack up in a normal curve centered on $30K: a normal
sampling distribution.
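To see the sample means stack up, here is a small simulation sketch. The slides do not say how incomes are distributed, so the right-skewed exponential population used here is purely an assumption, as are the 1,000 repeats and the seed. Only the $30K mean and the sample size of 1,500 come from the slides.

```python
import random
import statistics

rng = random.Random(7)   # fixed seed (an assumption) for repeatability
MEAN_INCOME = 30_000     # census mean from the slide
SAMPLE_SIZE = 1_500      # sample size from the slide
REPEATS = 1_000          # assumed number of repeated samples

def sample_mean_income():
    """Mean income of one random sample of SAMPLE_SIZE people."""
    # Assumption: incomes follow a right-skewed exponential distribution.
    return statistics.fmean(rng.expovariate(1 / MEAN_INCOME)
                            for _ in range(SAMPLE_SIZE))

means = [sample_mean_income() for _ in range(REPEATS)]
center = statistics.fmean(means)
spread = statistics.pstdev(means)
within_2 = sum(1 for m in means if abs(m - center) <= 2 * spread) / REPEATS

print(f"Mean of sample means: ${center:,.0f}")
print(f"Sd of sample means:   ${spread:,.0f}")
print(f"Share within 2 sd:    {within_2:.1%}")
```

Even though the assumed income distribution is far from normal, roughly 95% of the sample means should land within 2 standard deviations of the center, which is what a normal curve predicts.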
53. A Sampling Distribution
Say that the standard deviation of this distribution is $10K.
Think back to the empirical rule. What are the odds you would get a sample
mean that is more than $20K off?
[Figure: normal sampling distribution of sample means centered at $30K, with the horizontal axis marked from z = -3 to z = 3]
54. A Sampling Distribution
With a standard deviation of $10K, a sample mean that is more than $20K off
lies more than 2 standard deviations from $30K. By the empirical rule, about
95% of sample means fall within 2 standard deviations, so about 5% fall
outside: 2.5% in each tail.
[Figure: the same normal sampling distribution, with 2.5% marked in each tail beyond z = -2 and z = 2]
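The empirical rule's 2.5% per tail is a rounded figure. As a quick check, the sketch below uses `statistics.NormalDist` from Python's standard library to compute the tail areas for a normal sampling distribution with mean $30K and standard deviation $10K. The variable names are illustrative.

```python
from statistics import NormalDist

# Normal sampling distribution from the slide: mean $30K, standard deviation $10K.
sampling_dist = NormalDist(mu=30_000, sigma=10_000)

lower_tail = sampling_dist.cdf(10_000)       # P(sample mean below $10K)
upper_tail = 1 - sampling_dist.cdf(50_000)   # P(sample mean above $50K)
more_than_20k_off = lower_tail + upper_tail

print(f"Lower tail: {lower_tail:.4f}")
print(f"Upper tail: {upper_tail:.4f}")
print(f"More than $20K off: {more_than_20k_off:.4f}")
```

The exact tail areas come out a little under 2.5% each, because 95% of a normal curve lies within about 1.96 standard deviations rather than exactly 2.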
55. A Sampling Distribution
An Example:
A population’s car values are µ = $12K with σ = $4K.
Which sampling distribution is for sample size 625 and which is
for 2500? What are their s.e.’s?
[Figure: two normal sampling distributions centered at $12K, one wider and one narrower, each with the range holding 95% of M’s marked and its endpoints left as “?”]
56. A Sampling Distribution
An Example:
A population’s car values are µ = $12K with σ = $4K.
Which sampling distribution is for sample size 625 and which is for 2500? What are their s.e.’s?
n = 625: s.e. = $4K/25 = $160 (√625 = 25)
n = 2500: s.e. = $4K/50 = $80 (√2500 = 50)
[Figure: the two sampling distributions centered at $12K. 95% of M’s fall within 2 s.e.’s of $12K: from $11,680 to $12,320 for n = 625 (wider curve) and from $11,840 to $12,160 for n = 2500 (narrower curve)]
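The arithmetic for this example can be sketched as follows. The function names are illustrative, and the "95% of M's" range is taken as the mean plus or minus 2 standard errors, following the empirical rule.

```python
import math

def standard_error(sigma, n):
    """Standard error of the mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

def middle_95_range(mu, sigma, n):
    """Range holding about 95% of sample means: mu +/- 2 standard errors."""
    se = standard_error(sigma, n)
    return mu - 2 * se, mu + 2 * se

# Car values from the slide: population mean $12K, standard deviation $4K.
for n in (625, 2500):
    se = standard_error(4_000, n)
    low, high = middle_95_range(12_000, 4_000, n)
    print(f"n = {n}: s.e. = ${se:,.0f}, 95% of means between ${low:,.0f} and ${high:,.0f}")
```

Quadrupling the sample size from 625 to 2500 halves the standard error, so the narrower curve belongs to the larger sample.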
57. A Sampling Distribution
A population’s car values are µ = $12K with σ = $4K.
Which sampling distribution is for sample size 625 and which is for 2500?
Which sample size will be more precise? If you get a particularly bad sample, which sample size will
help you be sure that you are closer to the true mean?
[Figure: the two sampling distributions centered at $12K. 95% of M’s fall within 2 s.e.’s of $12K: from $11,680 to $12,320 for n = 625 (wider curve) and from $11,840 to $12,160 for n = 2500 (narrower curve)]
64. The Idea of a Confidence Interval
The big idea: The sampling distribution of \( \bar{x} \) tells us how close to µ the
sample mean \( \bar{x} \) is likely to be. All confidence intervals we construct will
have a form similar to this:
estimate ± margin of error
Definition:
A confidence interval for a parameter has two parts:
• An interval calculated from the data, which has the form:
estimate ± margin of error
The margin of error tells how close the estimate tends to be to the
unknown parameter in repeated random sampling.
• A confidence level C, the overall success rate of the method for
calculating the confidence interval. That is, in C% of all possible
samples, the method would yield an interval that captures the true
parameter value.
We usually choose a confidence level of 90% or higher because we want to be
quite sure of our conclusions. The most common confidence level is 95%.
65. Constructing a Confidence Interval
Confidence Intervals: The Basics
When we calculated a 95% confidence interval
for the mystery mean µ, we started with
estimate ± margin of error
Our estimate came from the sample statistic \( \bar{x} \).
Since the sampling distribution of \( \bar{x} \) is Normal,
about 95% of the values of \( \bar{x} \) will lie within 2
standard deviations (\( 2\sigma_{\bar{x}} \)) of the mystery mean.
That is, our interval could be written as:
\( 240.79 \pm 2 \cdot 5 = \bar{x} \pm 2\sigma_{\bar{x}} \)
This leads to a more general formula for confidence intervals:
statistic ± (critical value) • (standard deviation of statistic)
Why settle for 95% confidence when
estimating a parameter? The price we pay
for greater confidence is a wider interval.
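The general formula translates directly into a small function. This is a minimal sketch: the name `confidence_interval` and its parameter names are illustrative, while the numbers (estimate 240.79, critical value 2, standard deviation 5) are the ones from the mystery mean example.

```python
def confidence_interval(statistic, critical_value, sd_of_statistic):
    """Return (low, high) for statistic +/- critical_value * sd_of_statistic."""
    margin_of_error = critical_value * sd_of_statistic
    return statistic - margin_of_error, statistic + margin_of_error

# Mystery mean example: estimate 240.79, critical value 2, standard deviation 5.
low, high = confidence_interval(240.79, 2, 5)
print(f"95% confidence interval: {low:.2f} to {high:.2f}")
```

Keeping the critical value as a separate argument makes it easy to see how a different confidence level would widen or narrow the interval.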
66. Calculating a Confidence Interval
The confidence interval for estimating a population parameter has the form
statistic ± (critical value) • (standard deviation of statistic)
where the statistic we use is the point estimator for the parameter.
Properties of Confidence Intervals:
• The “margin of error” is the (critical value) • (standard deviation of statistic).
• The user chooses the confidence level, and the margin of error follows
from this choice.
• The critical value depends on the confidence level and the sampling
distribution of the statistic.
• Greater confidence requires a larger critical value.
• The standard deviation of the statistic depends on the sample size n.
• The margin of error gets smaller when:
– The confidence level decreases
– The sample size n increases
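These properties can be illustrated numerically. The sketch below assumes a Normal sampling distribution for the sample mean with a known population standard deviation, so the critical value is a z value taken from `statistics.NormalDist`. The value σ = 20 and the sample sizes are hypothetical.

```python
import math
from statistics import NormalDist

def critical_value(confidence_level):
    """z* such that the central `confidence_level` area of N(0, 1) lies within +/- z*."""
    return NormalDist().inv_cdf(0.5 + confidence_level / 2)

def margin_of_error(confidence_level, sigma, n):
    """(critical value) * (standard deviation of the sample mean)."""
    return critical_value(confidence_level) * sigma / math.sqrt(n)

SIGMA = 20  # hypothetical population standard deviation

# Lower confidence level -> smaller critical value -> smaller margin of error.
for level in (0.90, 0.95, 0.99):
    print(f"C = {level:.0%}, n = 16: margin = {margin_of_error(level, SIGMA, 16):.2f}")

# Larger sample size -> smaller standard deviation of the statistic -> smaller margin.
for n in (16, 64, 256):
    print(f"C = 95%, n = {n}: margin = {margin_of_error(0.95, SIGMA, n):.2f}")
```

Because the sample size enters through a square root, the sample must be four times as large to cut the margin of error in half.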