2. Chapter Goals
After completing this chapter, you are expected to:
• Describe a random sample and why sampling is
important.
• Explain the difference between probability and non-
probability sampling.
• Define the concept of a sampling distribution.
• Determine the mean and standard deviation for the
sampling distribution of the sample mean &
proportion.
3. 7.1 Basic Concepts
• Survey - a method of data collection with no special
control on one or more of the factors.
• Two types of surveys: census (complete enumeration)
and sampling.
• Two situations where a census is the only option:
– If information is needed from each and every unit of
the study population;
– If the study is on rare events.
• A population is the set of all items or individuals of
interest possessing some common characteristic.
• Example: All likely voters in the next election.
• A Sample is a subset of the population
• Example: 1000 voters selected at random for interview.
4. Basic Concepts . . .
• Parameters: descriptive measures for a population.
• Statistic is a descriptive measure for a sample.
• Sampling: The process of choosing a portion of a
population to represent that population.
• Sampling Unit: an element or a set of elements
considered for selection at some stage of sampling, e.g.,
animals, persons, households etc.
• Sampling Frame: a listing of all the sampling
units that make up a given population.
• Sample Size: The number or amount of elements
included in the sample.
5. Basic Concepts . . .
• Sample Design: A set of rules or procedures that
specify how a sample is to be selected.
• Sampling Error: the difference between the true
population value and the statistic.
– Can be reduced by increasing the sample size.
• Non-sampling Error: errors that occur because data are
incorrectly collected, recorded or analyzed.
– May happen both in census survey and sample
survey.
– Potential sources:
• Personal bias
• Poor measurement and/or instrumentation
• Imperfection in specifying the population
6. 7.2 Reasons for Sampling
• Less time consuming than a census.
• Less costly to administer than a census.
• The only option for an infinite population.
• Recommended in destructive-type experiments, where
measuring a unit destroys it.
• Possible to obtain statistical results of a sufficiently
high precision based on samples.
• Disadvantages:
– If we don’t have a representative sample, the
extracted results may be misleading.
– Minority groups may not be properly
represented.
7. 7.3 Types of Sampling Techniques
• Non-probability Samples: Not every element has a
known (calculable) chance of being sampled. The selection
process usually involves subjectivity.
• Convenience
• Haphazard
• Quota
• Judgment
• Probability (random) Samples: Each element to be
sampled has a calculable chance of being selected.
• Simple random
• Systematic
• Stratified
• Cluster
8. Non-probability Sampling
• Widely used as a case selection method in qualitative
research.
• Convenience: Elements are sampled because of ease
and availability.
• Haphazard: The sample is selected haphazardly, e.g.,
by picking a sample of 10 rabbits from a large cage in a
laboratory.
• Quota: Elements are sampled, but not randomly, from
every layer, or stratum, of the population.
• Judgment: Elements are sampled because the
researcher believes the members are representative of
the population.
9. Probability Sampling
• Random sampling can be done with replacement or
without replacement.
• Simple random: Every subject (case) has an equal
chance of being selected.
– Best when a sampling frame exists and the population
is homogeneous.
– Two methods of selection: the lottery method and a
table of random numbers.
– The lottery method is commonly used when the
population is small.
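A minimal sketch of the lottery idea in Python, contrasting sampling with and without replacement. The function name `lottery_sample`, the frame of 59 numbered units, and the seed are illustrative assumptions, not part of the original slides.

```python
import random

def lottery_sample(frame, n, replace=False, seed=None):
    """Draw n units from the sampling frame, each unit equally likely."""
    rng = random.Random(seed)
    if replace:
        # With replacement: the same unit may be drawn more than once.
        return rng.choices(frame, k=n)
    # Without replacement: every drawn unit is distinct.
    return rng.sample(frame, n)

# Hypothetical sampling frame: units numbered 1 to 59.
frame = list(range(1, 60))
without_replacement = lottery_sample(frame, 5, seed=1)
with_replacement = lottery_sample(frame, 5, replace=True, seed=1)
print(without_replacement, with_replacement)
```

A physical lottery draws numbered slips from a container; the pseudo-random generator plays that role here.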
10. Random Number Table
• This consists of a randomly generated series of digits
(0–9).
• All the units of a population should be numbered from 1
to N or from 0 to N-1.
1. Determine the number of digits required based on N.
2. Choose a direction (right, left, up or down).
3. Identify starting point randomly.
4. In the chosen direction, read the number of digits
required. Numbers that correspond to a numbered unit
(≤ N when units are numbered from 1 to N) are included
in the sample; other numbers are skipped.
12. Example
• Use a table of random numbers to select a sample of
size 5 from the population having 59 elements.
• Number of digits = 2
• Direction: downward
• Starting point: first cell
• Two-digit numbers read in the selected direction: {84,
59, 57, 26, 07, 89, 76, 62, 63, 78, 60, 84, 14}
• Selected sample: {59, 57, 26, 07, 14}
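The selection rule in this example can be sketched in a few lines of Python. The helper name `select_from_table` is hypothetical, and the sketch assumes the units are numbered 1 to 59 and that a repeated number is skipped (sampling without replacement).

```python
def select_from_table(numbers, population_size, sample_size):
    """Walk the numbers in reading order and keep the usable ones."""
    selected = []
    for number in numbers:
        # Keep only labels that exist in the population (1..N) and are new.
        if 1 <= number <= population_size and number not in selected:
            selected.append(number)
        if len(selected) == sample_size:
            break
    return selected

# Two-digit numbers read downward from the first cell, as in the example.
table_column = [84, 59, 57, 26, 7, 89, 76, 62, 63, 78, 60, 84, 14]
sample = select_from_table(table_column, population_size=59, sample_size=5)
print(sample)  # [59, 57, 26, 7, 14]
```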
13. Random Sampling
• Stratified: Randomly sample elements from every
layer, or stratum, of the population. Best when elements
within strata are homogeneous.
– Sampling error will arise primarily from variability
within the strata.
• Cluster: Randomly sample clusters from the available
clusters in the population. Best when elements within
each cluster are heterogeneous.
– Sampling error will occur because of variability
between clusters.
14. Random Sampling
• Systematic Sampling
1. Number the elements of a population from 1 to N.
2. Divide the population into n groups of N/n elements
each (N/n is the sampling interval).
3. Randomly pick a number k between 1 and N/n.
4. The kth, (k+N/n)th, (k+2N/n)th, ..., (k+(n-1)N/n)th
elements will be included in the sample.
• Best when elements are randomly ordered, i.e., no
cyclic variation.
• Example: Select a systematic sample of size 10
from a population of size 100.
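One way to work this example is sketched below in Python. The function name `systematic_sample` and the seed are illustrative assumptions, and the sketch assumes n divides N exactly, so the sampling interval is 100/10 = 10.

```python
import random

def systematic_sample(N, n, rng=random):
    """Return the positions (1..N) chosen by systematic sampling."""
    interval = N // n                 # sampling interval N/n (assumes n divides N)
    k = rng.randint(1, interval)      # random start between 1 and N/n, inclusive
    # kth, (k + N/n)th, ..., (k + (n-1)N/n)th elements
    return [k + j * interval for j in range(n)]

sample = systematic_sample(N=100, n=10, rng=random.Random(7))
print(sample)
```

Whatever start k is drawn, the selected positions are 10 apart, so the sample is spread evenly across the ordered population.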
15. 7.4 Sampling Distribution of the Sample Mean
A sampling distribution is a distribution of all of the
possible values of a statistic for a given size sample
selected from a population.
[Diagram: Sampling Distributions, branching into Sampling Distribution of Sample Mean, Sampling Distribution of Sample Proportion, and Sampling Distribution of Sample Variance]
16. Constructing a Sampling Distribution
• Assume there is a population …
• Population size (N) = 4
• Random variable, X, is the age of individuals.
• Values of X: 18, 20, 22, 24 (years) for individuals
A, B, C, D.
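17. Sampling Distribution . . .
Summary Measures for the Population Distribution:
$\mu = \frac{\sum X_i}{N} = \frac{18 + 20 + 22 + 24}{4} = 21$
$\sigma = \sqrt{\frac{\sum (X_i - \mu)^2}{N}} = 2.236$
[Figure: Uniform Distribution; P(x) = .25 at each of x = 18, 20, 22, 24 (A, B, C, D)]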
18. Sampling Distribution . . .
Now consider all possible samples of size n = 2.
16 possible samples (sampling with replacement):
1st Obs   2nd Observation
          18      20      22      24
18        18,18   18,20   18,22   18,24
20        20,18   20,20   20,22   20,24
22        22,18   22,20   22,22   22,24
24        24,18   24,20   24,22   24,24
16 Sample Means:
1st Obs   2nd Observation
          18   20   22   24
18        18   19   20   21
20        19   20   21   22
22        20   21   22   23
24        21   22   23   24
Total number of samples in selection with replacement = $4^2$ = 16
19. Sampling Distribution . . .
Sampling Distribution of All Sample Means
[Figure: Sample Means Distribution; P($\bar{X}$) against $\bar{X}$ = 18, 19, ..., 24 for the 16 sample means (no longer uniform)]
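20. Sampling Distribution . . .
Summary Measures of this Sampling Distribution (here N = 16 is the number of sample means):
$E(\bar{X}) = \frac{\sum \bar{X}_i}{N} = \frac{18 + 19 + 21 + \cdots + 24}{16} = 21 = \mu$
$\sigma_{\bar{X}} = \sqrt{\frac{\sum (\bar{X}_i - \mu)^2}{N}} = \sqrt{\frac{(18 - 21)^2 + (19 - 21)^2 + \cdots + (24 - 21)^2}{16}} = 1.58$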
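The construction on these slides can be checked by brute force. The following Python sketch enumerates all 16 ordered samples of size 2 drawn with replacement from the ages 18, 20, 22, 24 and recomputes the summary measures; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product
from math import sqrt

ages = [18, 20, 22, 24]   # population values of X
n = 2                     # sample size

# All ordered samples with replacement: 4**2 = 16 of them.
samples = list(product(ages, repeat=n))
means = [Fraction(sum(s), n) for s in samples]

# Sampling distribution of the sample mean: value -> probability.
counts = Counter(means)
distribution = {m: Fraction(c, len(means)) for m, c in sorted(counts.items())}

# Population summary measures.
mu = Fraction(sum(ages), len(ages))
sigma = sqrt(sum((x - mu) ** 2 for x in ages) / len(ages))

# Summary measures of the sampling distribution.
mean_of_means = sum(means) / len(means)
se = sqrt(sum((m - mean_of_means) ** 2 for m in means) / len(means))

print(distribution)
print(mean_of_means, round(sigma, 3), round(se, 2))
```

The enumerated standard deviation of the sample means agrees with σ/√n = 2.236/√2, and the mean of the sample means equals the population mean of 21.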
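21. Comparing the Population with its Sampling Distribution
• Population (N = 4): $\mu = 21$, $\sigma = 2.236$
• Sample Means Distribution (n = 2): $\mu_{\bar{X}} = 21$, $\sigma_{\bar{X}} = 1.58$
[Figure: P(X) against X = 18, 20, 22, 24 (A, B, C, D) for the population, beside P($\bar{X}$) against $\bar{X}$ = 18, 19, ..., 24 for the sample means]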
22. Expected Value of Sample Mean
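• Let $X_1, X_2, \ldots, X_n$ represent a random sample from a
population.
• The sample mean value of these observations is defined
as
$\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$
• The expected value of the sample mean ($\bar{X}$) is given by
$E(\bar{X}) = \sum_{i} \bar{X}_i \Pr(\bar{X}_i)$
where the sum runs over all possible values $\bar{X}_i$ of the sample mean.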
23. Standard Error of the Mean
• Different samples of the same size from the same
population will yield different sample means.
• A measure of the variability in the mean from sample to
sample is given by the Standard Error of the Mean:
$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$
• Note that the standard error of the mean decreases as the
sample size increases.
24. Sampling from a Normal Population
• If a population is normal with mean μ and standard
deviation σ, the sampling distribution of $\bar{X}$ is also
normally distributed with
$\mu_{\bar{X}} = \mu$ and $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$
25. Z-value for Sampling Distribution of the Sample Mean
• Z-value for the sampling distribution of $\bar{X}$:
$Z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}} = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$
where: $\bar{X}$ = sample mean
μ = population mean
σ = population standard deviation
n = sample size
26. Finite Population Correction
• Apply the Finite Population Correction if:
– a population member cannot be included more
than once in a sample (sampling is without
replacement), and
– the sample is large relative to the population
(n is greater than about 5% of N)
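• Then
$\mathrm{Var}(\bar{X}) = \frac{\sigma^2}{n} \cdot \frac{N - n}{N - 1}$ or $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}}$
If the sample size (n) is not small compared to the
population size N, then use
$Z = \frac{\bar{X} - \mu}{\frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}}}$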
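As a rough check of the finite population correction, the Python sketch below compares the corrected standard error with the value obtained by enumerating every sample of size 2 drawn without replacement from the four ages 18, 20, 22, 24 used earlier in the chapter. The helper name `standard_error` and its optional `N` argument are illustrative assumptions.

```python
from itertools import combinations
from math import sqrt

def standard_error(sigma, n, N=None):
    """Standard error of the mean; applies the FPC when N is given."""
    se = sigma / sqrt(n)
    if N is not None:
        se *= sqrt((N - n) / (N - 1))   # finite population correction
    return se

ages = [18, 20, 22, 24]
N, n = len(ages), 2
mu = sum(ages) / N
sigma = sqrt(sum((x - mu) ** 2 for x in ages) / N)

# Every possible sample without replacement (order ignored): 6 samples.
means = [sum(s) / n for s in combinations(ages, n)]
enumerated_se = sqrt(sum((m - mu) ** 2 for m in means) / len(means))

corrected_se = standard_error(sigma, n, N)
uncorrected_se = standard_error(sigma, n)
print(round(corrected_se, 3), round(enumerated_se, 3), round(uncorrected_se, 3))
```

Here n is 50% of N, well above the 5% guideline, so the corrected value (about 1.29) is noticeably smaller than the uncorrected 1.58 and matches the enumeration.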
28. Sampling Distribution Properties . . .
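• For sampling with replacement:
As n increases, $\sigma_{\bar{x}}$ decreases.
[Figure: sampling distributions of $\bar{x}$ centered at μ; narrower for the larger sample size, wider for the smaller sample size]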
29. Sampling from a Non-normal Population
• We can apply the Central Limit Theorem:
– Even if the population is not normal,
– sample means from the population will be
approximately normal as long as the sample size
is large enough.
Properties of the sampling distribution:
$\mu_{\bar{x}} = \mu$ and $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$
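A small simulation can illustrate these two properties for a clearly non-normal population. The sketch below assumes an exponential population with mean 1 and standard deviation 1; the choice of population, n = 40, 5000 replications, and the seed are all illustrative assumptions rather than anything from the slides.

```python
import random
import statistics
from math import sqrt

rng = random.Random(0)    # fixed seed so the run is repeatable
n = 40                    # sample size
replications = 5000       # number of simulated samples

# Exponential population with rate 1: mean 1, standard deviation 1, right-skewed.
sample_means = [
    statistics.fmean(rng.expovariate(1.0) for _ in range(n))
    for _ in range(replications)
]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.pstdev(sample_means)

print(round(mean_of_means, 3), round(sd_of_means, 3), round(1 / sqrt(n), 3))
```

The mean of the simulated sample means should land close to μ = 1 and their standard deviation close to σ/√n ≈ 0.158, even though individual observations are strongly skewed.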
30. 7.5 Sampling distribution of the sample proportion
$\hat{P} = \frac{X}{n} = \frac{\text{number of items in the sample having the characteristic of interest}}{\text{sample size}}$
31. Sampling distribution of the sample proportion
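$\sigma_{\hat{P}} = \sqrt{\frac{Pq}{n}}$, estimated by $\sqrt{\frac{\hat{P}\hat{q}}{n}}$ when P is unknown, where q = 1 − P and $\hat{q} = 1 - \hat{P}$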
Student 1 2 3 4 5
Response Yes No No Yes Yes
32. Sampling distribution of the sample proportion
Possible samples   Responses             Sample proportion
1,2,3,4            (Yes, No, No, Yes)    2/4
1,2,3,5            (Yes, No, No, Yes)    2/4
1,3,4,5            (Yes, No, Yes, Yes)   3/4
1,2,4,5            (Yes, No, Yes, Yes)   3/4
2,3,4,5            (No, No, Yes, Yes)    2/4
33. Sampling distribution of the sample proportion
$\hat{P}$     Pr($\hat{P}$)
0.5    3/5
0.75   2/5
$\hat{P} \sim N\left(P, \frac{Pq}{n}\right)$ and $Z = \frac{\hat{P} - P}{\sqrt{Pq/n}} \sim N(0, 1)$
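The five-student example can be reproduced by enumeration. This Python sketch lists every sample of 4 students drawn without replacement from the 5 responses and tabulates the sample proportion of "Yes"; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

# Responses of the five students in the example.
responses = {1: "Yes", 2: "No", 3: "No", 4: "Yes", 5: "Yes"}
n = 4  # sample size

proportions = []
for sample in combinations(responses, n):       # 5 possible samples
    yes_count = sum(responses[s] == "Yes" for s in sample)
    proportions.append(Fraction(yes_count, n))

# Sampling distribution of the sample proportion: value -> probability.
counts = Counter(proportions)
distribution = {p: Fraction(c, len(proportions)) for p, c in counts.items()}

# Expected value of the sample proportion.
expected = sum(p * prob for p, prob in distribution.items())
print(distribution, expected)
```

The expected value comes out to 3/5, which is the population proportion of "Yes" responses.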
34. Sampling distribution of the sample proportion
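$\sigma_{\hat{P}} = \sqrt{\frac{P(1 - P)}{n}} = \sqrt{\frac{.15(1 - .15)}{500}} = 0.015969$
$\Pr(\hat{P} > .175) = \Pr\left(\frac{\hat{P} - P}{\sigma_{\hat{P}}} > \frac{.175 - .15}{0.015969}\right) = \Pr(Z > 1.57)$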
35. Sampling distribution of the sample proportion
Use standard normal table: Pr(Z > 1.57) = 1 − 0.9418 = 0.0582
b) Exercise
c)
$\Pr(.16 < \hat{P} < .18) = \Pr\left(\frac{.16 - .15}{0.015969} < Z < \frac{.18 - .15}{0.015969}\right) = \Pr(0.63 < Z < 1.88) = 0.2342$
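These probabilities can also be computed without a printed table. The sketch below uses Python's `statistics.NormalDist` with P = .15 and n = 500 from the example; the helper names `prob_above` and `prob_between` are hypothetical. Because z is not rounded to two decimals here, the results differ slightly from table lookups.

```python
from math import sqrt
from statistics import NormalDist

P, n = 0.15, 500
se = sqrt(P * (1 - P) / n)        # standard error of the sample proportion
std_normal = NormalDist()         # standard normal, mean 0 and sd 1

def prob_above(p_hat):
    """Pr(sample proportion > p_hat) under the normal approximation."""
    return 1 - std_normal.cdf((p_hat - P) / se)

def prob_between(low, high):
    """Pr(low < sample proportion < high) under the normal approximation."""
    return std_normal.cdf((high - P) / se) - std_normal.cdf((low - P) / se)

p_a = prob_above(0.175)
p_c = prob_between(0.16, 0.18)
print(round(se, 6), round(p_a, 4), round(p_c, 4))
```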