Understanding Sampling Distributions and the Central Limit Theorem
1. Chapter 6:
Sampling Distribution
6.1 Population data, sample data, population size, sample size,
probability sample.
6.2 Histogram of all possible sample means, frequency polygon and frequency curve,
mean of the statistic.
6.3 The distribution of $\bar{X}$ when the original population is normal: known
variance and unknown variance.
6.4 The distribution of $\bar{X}$ when the original population is not normal, but
the sample size is sufficiently large: central limit theorem.
6.5 Sampling distribution of the proportion, with a large sample size.
6.6 Sampling distribution for the difference between means and
difference between proportions with two independent populations.
2. 2
Outcomes
By the end of the lesson, you should be able to:
• form a sampling distribution for a mean and proportion
based on a small, finite population.
• present and describe the sampling distribution of sample
means and the central limit theorem.
• explain the relationship between the sampling
distributions with the central limit theorem.
• compute, describe and interpret z-scores corresponding
to known values of X
3. 3
Outcomes (cont.)
• describe the central limit theorem and solve problems
• identify sampling distributions and solve problems
(population & sample proportion)
• know the sampling distribution for the difference
between means and the difference between proportions
with two independent populations.
• clarify any issues from Chapters 4-5 and recap
– Binomial distribution
– Poisson distribution
– Normal distribution
4. 4
Population Distribution
• The probability distribution of the population data.
Example 1:
• Suppose there are only five students in an advanced statistics class
and the midterm scores of these five students are:
• 70 78 80 80 95
• Let x denote the score of a student.
5. 5
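Mean & Standard Deviation
• Mean for population:
$\mu = \dfrac{\sum x}{N} = \dfrac{70 + 78 + 80 + 80 + 95}{5} = 80.6$
• Standard deviation for population:
$\sigma = \sqrt{\dfrac{\sum x^2 - \frac{(\sum x)^2}{N}}{N}} = \sqrt{\dfrac{32809 - \frac{(403)^2}{5}}{5}} = 8.0895$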
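As a quick check of the population figures for Example 1, the following Python sketch recomputes the mean and standard deviation of the five midterm scores. The variable names are illustrative and not taken from the slides.

```python
import math

scores = [70, 78, 80, 80, 95]   # population data from Example 1
N = len(scores)

sum_x = sum(scores)                   # sum of x
sum_x2 = sum(x * x for x in scores)   # sum of x squared

mu = sum_x / N
# Shortcut formula: sigma = sqrt((sum(x^2) - (sum(x))^2 / N) / N)
sigma = math.sqrt((sum_x2 - sum_x ** 2 / N) / N)

print(f"mu = {mu:.1f}, sigma = {sigma:.4f}")
```

Because the five scores are the whole population, the divisor is N rather than N - 1.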
6. 6
Sampling Distribution
• Sampling Distribution
• The probability distribution of a sample statistic, such as
median, mode, mean and standard deviation.
• Sampling Distribution Of Sample Mean
• A distribution obtained by using the means computed
from random samples of a specific size taken from a
population.
7. 7
Example 2
• Say we draw all possible samples of three scores each (from the data in Example 1)
and compute the mean of each sample.
• Total number of samples = 5C3 = $\dfrac{5!}{3!\,(5-3)!} = 10$
• Suppose we assign the letters A, B, C, D and E to the
scores of the five students, so that
• A = 70, B = 78, C = 80, D = 80, E = 95
• Then the 10 possible samples of three scores each are
• ABC, ABD, ABE, ACD, ACE, ADE, BCD, BCE, BDE, CDE
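The enumeration in Example 2 can be reproduced with a short Python sketch. It assumes sampling without replacement, as in the list of samples above; the dictionary and variable names are illustrative.

```python
from itertools import combinations

scores = {"A": 70, "B": 78, "C": 80, "D": 80, "E": 95}

# All samples of three students, drawn without replacement.
samples = list(combinations(scores, 3))

# Map each sample label (e.g. "ABC") to its sample mean.
sample_means = {"".join(s): sum(scores[k] for k in s) / 3 for s in samples}

for label, mean in sample_means.items():
    print(f"{label}: {mean:.2f}")
print("number of samples:", len(samples))
```

Listing every sample mean in this way gives the sampling distribution of the sample mean for n = 3.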
10. 10
Sampling Error
• Sampling Error
• Sampling error is the difference between the value of a
sample statistic and the value of the corresponding
population parameter.
• In the case of the mean, the sampling error = $\bar{x} - \mu$
11. 11
Mean of the Sampling Distribution
• Mean ($\mu_{\bar{x}}$) of the sampling distribution of $\bar{x}$:
• Based on Example 2,
$\mu_{\bar{x}} = \dfrac{\sum \bar{x}}{10} = \dfrac{76 + 76 + 81 + 76.67 + 81.67 + 81.67 + 79.33 + 84.33 + 84.33 + 85}{10} = 80.6$
which is the same as the population mean
(Example 1).
• “The mean of the sampling distribution of $\bar{x}$ is always
equal to the mean of the population”: $\mu_{\bar{x}} = \mu$
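The equality between the mean of the sampling distribution and the population mean can be checked exactly for Example 2. This sketch uses Python's `fractions` module to avoid rounding; the names are illustrative.

```python
from fractions import Fraction
from itertools import combinations

scores = [70, 78, 80, 80, 95]
mu = Fraction(sum(scores), len(scores))   # population mean, 403/5

# Exact mean of each of the 10 possible samples of size 3.
means = [Fraction(sum(s), 3) for s in combinations(scores, 3)]

mu_xbar = sum(means) / len(means)         # mean of the sampling distribution

print(float(mu), float(mu_xbar))
```

Both values come out to 80.6, with no rounding involved.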
17. 17
Sampling from a Not Normally
Distributed Population
• Most of the time the population from which the samples
are selected is not normally distributed.
• In such cases, the shape of the sampling distribution
of $\bar{x}$ is inferred from the central limit theorem.
18. 18
Central Limit Theorem
• The distribution of sample means taken from any large
population approaches that of the normal distribution as
the sample size n increases ($n \ge 30$).
– The mean of the sample means will be equal to the population
mean: $\mu_{\bar{x}} = \mu$
– The standard deviation of the distribution of sample means will
be $\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$
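A small simulation can illustrate the two properties stated in the theorem. The sketch below assumes a hypothetical skewed population (exponential with mean 1 and standard deviation 1), which is not from the slides, and draws repeated samples of size 30.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

def sample_mean(n):
    """Mean of n draws from the assumed exponential population (mu = 1, sigma = 1)."""
    return statistics.fmean(random.expovariate(1.0) for _ in range(n))

n = 30
means = [sample_mean(n) for _ in range(5000)]

mean_of_means = statistics.fmean(means)   # should be close to mu = 1
sd_of_means = statistics.pstdev(means)    # should be close to sigma / sqrt(n)

print(round(mean_of_means, 3), round(sd_of_means, 3), round(1 / n ** 0.5, 3))
```

A histogram of `means` would look roughly bell-shaped even though the population itself is strongly skewed.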
21. 21
Example 3
• In a study of the life expectancy of 500 people in a certain geographic
region, the mean age at death was 72 years and the standard deviation was
5.3 years. If a sample of 50 people from this region is selected, find the
probability that the mean life expectancy will be less than 70 years.
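One way to work Example 3 is sketched below using `statistics.NormalDist` from the Python standard library. It assumes the central limit theorem applies (n = 50 is at least 30) and, as an assumption of this sketch, applies no finite population correction.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 72, 5.3, 50

# Sampling distribution of the mean: normal with mean mu and sd sigma / sqrt(n).
xbar_dist = NormalDist(mu, sigma / sqrt(n))

z = (70 - mu) / xbar_dist.stdev   # z-score of a sample mean of 70
p = xbar_dist.cdf(70)             # P(sample mean < 70)

print(f"z = {z:.2f}, P(xbar < 70) = {p:.4f}")
```

The probability is well under 1%, so a sample mean below 70 years would be unusual for this population.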
22. 22
Example 4
• Assume that the weights of all packages of a certain brand of cookies are normally
distributed with a mean of 32 ounces and a standard deviation of 0.3 ounce. Find the
probability that the mean weight of a random sample of 20 packages of this brand of
cookies will be between 31.8 and 31.9 ounces.
• Solution:
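The worked solution is not reproduced in this transcript. A sketch of one, assuming the usual z-transformation for a sample mean and standard normal table values rounded to four decimals:

```latex
\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{0.3}{\sqrt{20}} \approx 0.0671

z_1 = \frac{31.8 - 32}{0.0671} \approx -2.98, \qquad
z_2 = \frac{31.9 - 32}{0.0671} \approx -1.49

P(31.8 < \bar{x} < 31.9) = P(-2.98 < z < -1.49) \approx 0.0681 - 0.0014 = 0.0667
```

Since the population is normal, the sampling distribution of the mean is normal even though n = 20 is less than 30.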
23. 23
Example 5
A magazine reported that children between the ages of 2 and 5
watch an average of 25 hours of television per week. Assume the
variable is normally distributed and the standard deviation is 3
hours.
If 20 children between the ages of 2 and 5 are randomly selected,
find the probability that the mean of the number of hours they watch
television will be:
a) greater than 26.3 hours.
b) less than 24 hours.
c) between 24 and 26.3 hours.
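The three parts of Example 5 can be sketched in Python as follows. Because the population is stated to be normal, the sampling distribution of the mean is taken as normal with standard deviation 3/√20; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Sampling distribution of the mean for n = 20 children.
xbar_dist = NormalDist(mu=25, sigma=3 / sqrt(20))

p_a = 1 - xbar_dist.cdf(26.3)                    # a) mean greater than 26.3 hours
p_b = xbar_dist.cdf(24)                          # b) mean less than 24 hours
p_c = xbar_dist.cdf(26.3) - xbar_dist.cdf(24)    # c) mean between 24 and 26.3 hours

print(f"a) {p_a:.4f}  b) {p_b:.4f}  c) {p_c:.4f}")
```

The three probabilities add up to 1, which is a useful sanity check on the arithmetic.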
27. 27
Example 6
The average number of pounds of meat that a person consumes in a
year is 218.4 pounds. Assume that the standard deviation is 25
pounds and the distribution is approximately normal.
• a) Find the probability that a person selected at random consumes
less than 224 pounds per year.
• b) If a sample of 40 individuals is selected, find the probability that the
mean of the sample will be less than 224 pounds per year.
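Example 6 contrasts a single observation with a sample mean. The sketch below computes both probabilities; part (a) uses the population standard deviation, while part (b) uses the standard deviation of the sample mean, σ/√n. The names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 218.4, 25, 40

individual = NormalDist(mu, sigma)              # one person's consumption
sample_mean = NormalDist(mu, sigma / sqrt(n))   # mean of 40 people

p_individual = individual.cdf(224)   # a) P(X < 224)
p_mean = sample_mean.cdf(224)        # b) P(sample mean < 224)

print(f"a) {p_individual:.4f}  b) {p_mean:.4f}")
```

The second probability is much larger (roughly 0.92 against 0.59) because sample means vary less than individual values.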
30. 30
[Diagram: from a statistical population with parameter of interest µ, all possible samples of size n are drawn. The means of these samples, $\bar{x}_1, \bar{x}_2, \bar{x}_3, \ldots, \bar{x}_n$, are the elements of the sampling distribution of sample means, plotted as P($\bar{x}$) against $\bar{X}$, with mean $\mu_{\bar{x}}$ and standard deviation $\sigma_{\bar{x}} = \sigma/\sqrt{n}$. All values of the sample statistic are used to form the sampling distribution.]
32. 32
Central Limit Theorem (recap)
• The sampling distribution of $\bar{x}$ has the same mean as the population ($\mu_{\bar{x}} = \mu$) but a different spread: its standard deviation is $\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$, and the distribution is approximately normal for $n \ge 30$.
33. 33
Exercise
1. Given a large population with mean µ = 400 and
standard deviation σ = 60.
a) If the population is normally distributed, what is the shape of
the sampling distribution of the sample mean for a random sample
of size 16? Find its mean and standard deviation.
b) If we do not know the shape of the population in 1(a), can we
still answer 1(a)? Explain.
c) Can we answer 1(a) if we do not know the population
distribution but we are given a random sample of size
36? Explain.
34. 34
Exercise
2. A random sample of size n = 30 is obtained from a
normally distributed population with µ = 13 and σ = 7.
a) What are the mean and the standard deviation of the
sampling distribution of the sample mean?
b) What is the shape of the sampling distribution? Explain.
c) Calculate
i) P($\bar{x}$ < 10) ii) P($\bar{x}$ < 16)
35. Exercise
4. Given X ~ N (5.55, 1.32). If a sample of size 50 is
randomly selected, find the sampling distribution of the
sample mean. (Hint: Give the name of the distribution, its mean and
variance.) Then calculate:
a) P(5.25 ≤ $\bar{x}$ ≤ 5.90)
b) P(5.45 ≤ $\bar{x}$ ≤ 5.75)
37. 37
Population & Sample Proportion
• Population proportion: $p = \dfrac{X}{N}$
• Sample proportion: $\hat{p} = \dfrac{x}{n}$
where N = total number of elements in the population
n = total number of elements in the sample
X = number of elements in the population that possess
a specific characteristic
x = number of elements in the sample that possess
a specific characteristic
38. 38
Example 7
• Suppose a total of 789 654 families live in a city and 563 282 of
them own houses. A sample of 240 families is
selected from the city and 158 of them own houses. Find
the proportion of families who own houses in the
population and in the sample.
• Solution:
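The worked solution is not reproduced in this transcript. A minimal Python sketch, using the definitions p = X/N and p̂ = x/n with the figures from the example:

```python
N, X = 789_654, 563_282   # families in the city, and those who own houses
n, x = 240, 158           # families in the sample, and those who own houses

p = X / N       # population proportion
p_hat = x / n   # sample proportion

print(f"p = {p:.2f}, p_hat = {p_hat:.2f}")
```

The gap between the two values is the sampling error of this particular sample.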
39. 39
Example 8
• Boe Consultant Associates has five employees. The
table below lists the names of these five employees and
information concerning their knowledge of statistics.
[Table: employee name and whether the employee knows statistics]
• $p = \dfrac{X}{N} = \dfrac{3}{5} = 0.6$
• where p = population proportion of employees who know statistics
40. 40
Example 8 (cont’d)
• Say we draw all possible samples of three employees each and
compute the proportion.
• Total number of samples = 5C3 = $\dfrac{5!}{3!\,(5-3)!} = 10$
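The sampling distribution of the sample proportion for this example can be enumerated directly. The employee table is not available here, so the labels E1 to E5 below are hypothetical; the only fact taken from the example is that three of the five employees know statistics.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

# Hypothetical labels: True means the employee knows statistics (3 of the 5 do).
knows_stats = {"E1": True, "E2": True, "E3": True, "E4": False, "E5": False}

# Sample proportion for every possible sample of three employees.
p_hats = [Fraction(sum(knows_stats[e] for e in s), 3)
          for s in combinations(knows_stats, 3)]

distribution = Counter(p_hats)            # how often each proportion occurs
mean_p_hat = sum(p_hats) / len(p_hats)    # mean of the sampling distribution

for value, freq in sorted(distribution.items()):
    print(f"p_hat = {value}: probability {Fraction(freq, len(p_hats))}")
print("mean of p_hat:", mean_p_hat)
```

The mean of the ten sample proportions is 3/5, the same as the population proportion.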
41. 41
Mean & Std Dev of Sample Proportion
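• Mean of the sample proportion: $\mu_{\hat{p}} = p$
• Standard deviation of the sample proportion:
$\sigma_{\hat{p}} = \sqrt{\dfrac{pq}{n}}$ if $\dfrac{n}{N} \le 0.05$
$\sigma_{\hat{p}} = \sqrt{\dfrac{pq}{n}} \sqrt{\dfrac{N-n}{N-1}}$ if $\dfrac{n}{N} > 0.05$
where p = population proportion
q = 1 – p
n = sample size
N = population size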
42. 42
Example 9
• Based on the example of Boe Consultant Associates,
find the mean and the standard deviation.
43. 43
Shape of Sampling Distribution
• According to the central limit theorem, the sampling
distribution of $\hat{p}$ is approximately normal for a sufficiently
large sample size.
• In the case of proportion, the sample size is considered
to be sufficiently large if np > 5 and nq > 5.
44. 44
Example 10
• A binomial distribution has p = 0.3. How large must the
sample size be so that a normal distribution can be
used to approximate the sampling distribution of $\hat{p}$?
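Example 10 asks for the smallest n satisfying np > 5 and nq > 5. A short search in Python, using exact fractions so that the strict inequalities are not affected by floating-point rounding:

```python
from fractions import Fraction

p = Fraction(3, 10)
q = 1 - p

# Increase n until both conditions for the normal approximation hold.
n = 1
while not (n * p > 5 and n * q > 5):
    n += 1

print(n)  # prints 17
```

The binding condition is np > 5, which requires n > 5/0.3 ≈ 16.67; nq > 5 is already satisfied from n = 8.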
45. 45
Example 11
• The Dartmouth Distribution Warehouse makes
deliveries of a large number of products to its
customers. It is known that 85% of all the orders it
receives from its customers are delivered on time.
• a) Find the probability that the proportion of orders in a
random sample of 100 that are delivered on time is:
(i) less than 0.87
(ii) between 0.81 and 0.88
• b) Find the probability that the proportion of orders in a
random sample of 100 that are not delivered on time is greater
than 0.1.
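Example 11 can be sketched with the normal approximation for the sample proportion. The sketch assumes the population of orders is large enough that no finite population correction is needed; the names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

p, n = 0.85, 100
q = 1 - p
# np = 85 and nq = 15 are both greater than 5, so the normal approximation is used.
sigma_p_hat = sqrt(p * q / n)

on_time = NormalDist(p, sigma_p_hat)   # proportion delivered on time
late = NormalDist(q, sigma_p_hat)      # proportion not delivered on time

a_i = on_time.cdf(0.87)                         # a(i)  P(p_hat < 0.87)
a_ii = on_time.cdf(0.88) - on_time.cdf(0.81)    # a(ii) P(0.81 < p_hat < 0.88)
b = 1 - late.cdf(0.10)                          # b)    P(late proportion > 0.10)

print(f"a(i) {a_i:.4f}  a(ii) {a_ii:.4f}  b) {b:.4f}")
```

Part (b) uses the same standard deviation because pq is unchanged when the roles of p and q are swapped.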
49. 49
Example 12
• The machine that is used to make these CDs is known to produce
6% defective CDs. The quality control inspector selects a sample of
100 CDs every week and inspects them. If 8% or more of the CDs in
the sample are defective, the process is stopped and the machine is
readjusted. What is the probability that based on a sample of 100
CDs the process will be stopped to readjust the machine?
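A worked sketch for Example 12, assuming the normal approximation (np = 6 and nq = 94 are both greater than 5) and table values rounded to four decimals:

```latex
\sigma_{\hat{p}} = \sqrt{\frac{pq}{n}} = \sqrt{\frac{(0.06)(0.94)}{100}} \approx 0.0237

z = \frac{\hat{p} - p}{\sigma_{\hat{p}}} = \frac{0.08 - 0.06}{0.0237} \approx 0.84

P(\hat{p} \ge 0.08) = P(z \ge 0.84) \approx 1 - 0.7995 = 0.2005
```

Under these assumptions the process would be stopped in about one week out of five even when the machine is running at its usual 6% defect rate.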
51. Sampling Distributions
• e.g. an experiment designed to compare the mean of a control group
with the mean of an experimental group.
[Diagram: Sample 1 (mean $\bar{X}_1$) is drawn from Population 1 (mean $\mu_1$); Sample 2 (mean $\bar{X}_2$) is drawn from Population 2 (mean $\mu_2$).]
52. Sampling Distribution for the
Difference Between Means
• The mean of $(\bar{x}_1 - \bar{x}_2)$ is $\mu_1 - \mu_2$.
• The standard error of $(\bar{x}_1 - \bar{x}_2)$ is $\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$.
• If the two populations are normally distributed, then the
sampling distribution of $(\bar{x}_1 - \bar{x}_2)$ is exactly normally
distributed, regardless of the sample size.
• If the two populations are not normally distributed, then
the sampling distribution of $(\bar{x}_1 - \bar{x}_2)$ is approximately
normally distributed when n1 and n2 are both 30 or more
(Central Limit Theorem).
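To make the two properties concrete, the following Python sketch computes the mean and standard error of the difference between two sample means, μ1 − μ2 and √(σ1²/n1 + σ2²/n2). All of the parameter values are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical population parameters and sample sizes (not from the slides).
mu1, sigma1, n1 = 50.0, 8.0, 40
mu2, sigma2, n2 = 47.0, 6.0, 36

mean_diff = mu1 - mu2                                  # mean of (xbar1 - xbar2)
se_diff = sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)    # standard error

# Both samples have at least 30 observations, so a normal model is used.
diff_dist = NormalDist(mean_diff, se_diff)
p_positive = 1 - diff_dist.cdf(0)    # P(xbar1 - xbar2 > 0)

print(f"mean = {mean_diff}, SE = {se_diff:.4f}, P(diff > 0) = {p_positive:.4f}")
```

The variances add even though the means are subtracted, because the two samples are independent.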
53. Sampling Distribution for the
Difference Between Proportions
• Assume that independent random samples of n1 and n2
observations have been selected from binomial
populations with parameters p1 and p2 respectively.
• The sampling distribution of the difference between
sample proportions has these properties:
• The mean of $(\hat{p}_1 - \hat{p}_2)$ is p1 - p2
• $(\hat{p}_1 - \hat{p}_2) = \dfrac{x_1}{n_1} - \dfrac{x_2}{n_2}$
• The standard error, estimated from the sample proportions, is $\sqrt{\dfrac{\hat{p}_1 \hat{q}_1}{n_1} + \dfrac{\hat{p}_2 \hat{q}_2}{n_2}}$
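A minimal Python sketch of these quantities, using hypothetical sample counts that are not from the slides. It computes the difference x1/n1 − x2/n2 and the standard error √(p̂1q̂1/n1 + p̂2q̂2/n2) estimated from the samples.

```python
from math import sqrt

# Hypothetical counts of successes and sample sizes.
x1, n1 = 60, 100
x2, n2 = 45, 90

p1_hat = x1 / n1   # 0.6
p2_hat = x2 / n2   # 0.5
q1_hat, q2_hat = 1 - p1_hat, 1 - p2_hat

diff = p1_hat - p2_hat    # observed difference between sample proportions
se = sqrt(p1_hat * q1_hat / n1 + p2_hat * q2_hat / n2)   # estimated standard error

print(f"difference = {diff:.2f}, standard error = {se:.4f}")
```

When the population proportions p1 and p2 are known, they would replace the sample proportions inside the square root.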
54. 54
Binomial Distribution
• X ~ B(n, p)
• Characteristics of the Binomial Distribution:
1. The experiment consists of n identical trials.
2. Each trial has only one of the two possible mutually exclusive
outcomes, success or a failure.
3. The trials are independent, thus we must sample with
replacement.
4. The probability of success is denoted by p and that of failure by
q, and p + q = 1. The probabilities p and q remain constant for each
trial.
The outcome to which the question refers is called a success.
55. 55
Binomial Distribution
• For a binomial experiment, the probability of exactly x
successes in n trials is given by the binomial formula:
$P(x) = {}^{n}C_{x}\, p^{x} q^{n-x}$
• where
• n = the total number of trials
• p = probability of success
• q = 1 - p = probability of failure
• x = number of successes in n trials
• n - x = number of failures in n trials
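The binomial formula translates directly into code. In the sketch below the function name `binomial_pmf` and the example values n = 5, p = 0.5 are illustrative choices, not from the slides.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ B(n, p): nCx * p^x * q^(n - x)."""
    q = 1 - p
    return comb(n, x) * p ** x * q ** (n - x)

# Probability of exactly 2 successes in 5 trials with p = 0.5.
print(binomial_pmf(2, 5, 0.5))   # 10 * 0.5**5 = 0.3125

# The probabilities over x = 0..n add up to 1.
total = sum(binomial_pmf(x, 5, 0.5) for x in range(6))
```

`math.comb` requires Python 3.8 or later.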
56. 56
Poisson Distribution
• X ~ P(λ)
• The probability of x occurrences in an interval is:
$P(x) = \dfrac{\lambda^{x} e^{-\lambda}}{x!}$
• where
• λ : mean number of occurrences in that interval (time or space)
• e : use the e^x function on your calculator
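The Poisson formula P(x) = λ^x e^(−λ) / x! can be evaluated in the same way. The function name `poisson_pmf` and the example value λ = 2 are illustrative.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) for a Poisson variable with mean lam."""
    return lam ** x * exp(-lam) / factorial(x)

# Probability of exactly 3 occurrences when the mean is 2 per interval.
p3 = poisson_pmf(3, 2)
print(f"{p3:.4f}")
```

Here `exp(-lam)` plays the role of the e^x key on a calculator.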
57. 57
Poisson Distribution
• The Poisson distribution depends only on the average
number of occurrences per unit time or space, λ (lambda).
• The Poisson probability distribution is useful when n is
large and p is small.
Conditions for Poisson:
• x is a discrete random variable
• The occurrences are random
• The occurrences are independent
58. 58
Using the ≤ (Cumulative) Table
It gives the sum of probabilities for all values of X ≤ x (i.e. that x value
and everything to the left of it).
At most: use the table for P(X ≤ x) directly.
Exactly: use the formula for P(X = x).
These two are very different.
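The difference between "at most" and "exactly" can be seen numerically. The sketch below uses a binomial variable with hypothetical values n = 10 and p = 0.5; the function names are illustrative.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x): the 'exactly' probability."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def binomial_cdf(x, n, p):
    """P(X <= x): the 'at most' probability that a cumulative table lists."""
    return sum(binomial_pmf(k, n, p) for k in range(x + 1))

n, p = 10, 0.5   # hypothetical values for illustration
at_most_2 = binomial_cdf(2, n, p)    # P(X = 0) + P(X = 1) + P(X = 2)
exactly_2 = binomial_pmf(2, n, p)    # P(X = 2) only

print(f"at most 2: {at_most_2:.4f}, exactly 2: {exactly_2:.4f}")
```

An "exactly" probability can also be recovered from the table as P(X ≤ x) − P(X ≤ x − 1).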
59. 59
Normal Distribution
• The normal distribution is the most important and most
widely used of all probability distributions.
• A large number of phenomena in the real world are
normally distributed either exactly or approximately.
• The normal probability distribution or the normal curve
is a bell-shaped (symmetric) curve.
– Its mean is denoted by µ and its standard deviation by σ.
– Also known as bell curve or Gaussian distribution.
61. 61
Summary
• State the central limit theorem.
– Mean
– Std dev.
• Describe and interpret z-scores corresponding to known
values of $\bar{x}$.
• Describe the central limit theorem.
• Solve problems using the central limit theorem.
• Solve all problems from Chapters 1-6.
Extra:
• http://www.oswego.edu/~srp/stats/z.htm
• http://www.oswego.edu/~srp/stats/piechart.htm