A researcher wants to conduct a survey to determine the prevalence of abusive behavior in children aged 6-12 years old in Manila. Using a sample size formula for one group proportions, the required sample size is calculated as 139 children based on: a past reported prevalence of 10%, a desired confidence level of 95%, and a tolerable error of 5%. Adding a 10% expected non-response rate, the total required sample size is rounded up to 153 children.
A second study wants to compare exam pass rates between two statistics class sections. Using the sample size formula for two groups proportions, the required sample size per group is calculated as 44 students based on: detecting a 15% pass rate difference, a confidence level of 95%,
2. FOR ONE GROUP WHERE OUTCOME IS
EXPRESSED AS A DISCRETE VARIABLE (I.E.,
PROPORTION OR PERCENTAGE)
$$n = \frac{Z_{\alpha/2}^{2}\, p\, q}{e^{2}}$$
where
Zα/2 = standard normal deviate
corresponding to the desired level of
confidence
e = effect size or maximum tolerable error, or
margin of error
p = estimate of the population proportion
q = 1 – p
3. A researcher wants to do a survey to
determine the prevalence of abusive
behavior in children 6-12 years of age in a
community in Manila. How many children
should be included in the study if the
prevalence of child abuse in the
Philippines as reported in past studies is
10% (i.e., p=0.10), the desired level of
confidence is 95% (i.e., α=0.05, Zα/2=1.96),
and the desired precision of the estimate
(tolerable error) is 5% (i.e., e=0.05)?
4. $$n = \frac{Z_{\alpha/2}^{2}\, p\, q}{e^{2}} = \frac{(1.96)^{2}(0.10)(0.90)}{(0.05)^{2}} \approx 138.30$$
This study will include at least 139 children aged 6-
12 years.
A drop-out or non-participation rate is factored in.
In this study, we expect an attrition rate of 10%.
Therefore, 139 + 13.9 = 152.9. Overall, we need
at least 153 respondents for this study.
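The arithmetic above can be checked with a short Python sketch. The function names `sample_size_one_proportion` and `add_non_response` are illustrative, not from the slides, and the allowance is applied the way the slide does it (adding 10% of n, then rounding up).

```python
import math

def sample_size_one_proportion(p, e, z=1.96):
    """Unrounded n = z^2 * p * q / e^2 for a single proportion."""
    q = 1.0 - p
    return z ** 2 * p * q / e ** 2

def add_non_response(n, rate=0.10):
    """Add rate * n to n (as on the slide) and round up to a whole respondent."""
    return math.ceil(n * (1.0 + rate))

raw_n = sample_size_one_proportion(p=0.10, e=0.05)  # unrounded result
n = math.ceil(raw_n)                                # minimum sample size
total = add_non_response(n)                         # with 10% allowance
print(round(raw_n, 2), n, total)
```

Rounding up rather than to the nearest integer keeps the margin of error at or below the tolerable error.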
5. FOR TWO OR MORE GROUPS AND OUTCOME
IS EXPRESSED AS A DISCRETE VARIABLE
(E.G.,PERCENTAGE OR PROPORTION)
$$n = \frac{Z_{\alpha/2}^{2}\,(2pq) + Z_{\beta}^{2}\,(p_1 q_1 + p_2 q_2)}{e^{2}}$$
where
n = sample size per group
p1, p2 = estimated population proportions in
groups 1 & 2, respectively
p = (p1 + p2) / 2
q = 1 – p
6. e = magnitude of difference to be
detected, effect size
Zα/2 = standard normal deviate
corresponding to the desired level of
confidence
Zβ = standard normal deviate
corresponding to the β error rate
7. Suppose a researcher wishes to test a
hypothesis comparing the proportion of
passers in the two sections in a statistics
class. She wants to detect a 15%
difference (error) in the percentage of
successful examinees between the two
sections where her past experience shows
that 95% of students passed in one
section (p1), and only 80% passed in the
other class (p2). What should be her
sample size in each group? Given:
α=0.05, power (1 – β) = 80%.
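8. DETERMINE THE VALUE OF Zβ
Power = 1 – β = 80% = 0.80, so Zβ is the standard normal deviate
whose cumulative probability is 0.8000. The normal table brackets
this value:

z       cumulative probability
0.84    0.7995
Zβ      0.8000
0.85    0.8023

By linear interpolation:
$$\frac{Z_{\beta} - 0.84}{0.01} = \frac{0.0005}{0.0028}$$
$$Z_{\beta} = 0.84 + \frac{(0.01)(0.0005)}{0.0028} \approx 0.8418$$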
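The table interpolation for Zβ can be sketched in Python as follows. The helper name `interpolate_z` is illustrative. The second value comes from the standard library's `statistics.NormalDist` and is included only as a cross-check on the table lookup.

```python
from statistics import NormalDist

def interpolate_z(target, z_lo, p_lo, z_hi, p_hi):
    """Linearly interpolate the z whose cumulative probability equals target."""
    return z_lo + (z_hi - z_lo) * (target - p_lo) / (p_hi - p_lo)

# Table entries from the slide: z = 0.84 -> 0.7995, z = 0.85 -> 0.8023
z_beta = interpolate_z(0.8000, 0.84, 0.7995, 0.85, 0.8023)

# Cross-check: inverse CDF of the standard normal at 0.80
z_exact = NormalDist().inv_cdf(0.80)

print(round(z_beta, 4), round(z_exact, 4))
```

The two values differ slightly in the fourth decimal place because the table entries are themselves rounded to four decimals.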
9. $$n = \frac{Z_{\alpha/2}^{2}\,(2pq) + Z_{\beta}^{2}\,(p_1 q_1 + p_2 q_2)}{e^{2}}$$
$$n = \frac{(1.96)^{2}(2)(0.875)(0.125) + (0.8418)^{2}\left[(0.95)(0.05) + (0.80)(0.20)\right]}{(0.15)^{2}} \approx 43.88$$
This gives a minimum of 44 per group, for a total of
88 subjects.
With a 10% non-participation rate, 88 + 8.8 = 96.8, so at least
97 respondents will be taken for the two groups.
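A minimal Python sketch of this calculation follows. It mirrors the arithmetic of the worked example: the bracketed sum is divided by e², with no further squaring. The function name `sample_size_two_proportions` is illustrative. Some textbooks state a variant with square roots inside the bracket and the whole bracket squared, which gives a larger n; that variant is not what the worked example computes and is not implemented here.

```python
import math

def sample_size_two_proportions(p1, p2, e, z_alpha=1.96, z_beta=0.8418):
    """Per-group n, following the arithmetic of the worked example."""
    p_bar = (p1 + p2) / 2          # pooled proportion p
    q_bar = 1.0 - p_bar            # q = 1 - p
    numerator = (z_alpha ** 2 * 2 * p_bar * q_bar
                 + z_beta ** 2 * (p1 * (1 - p1) + p2 * (1 - p2)))
    return numerator / e ** 2

raw_n = sample_size_two_proportions(p1=0.95, p2=0.80, e=0.15)
per_group = math.ceil(raw_n)                 # round up within each group
total = 2 * per_group                        # two sections
with_allowance = math.ceil(total * 1.10)     # add 10% non-participation
print(round(raw_n, 2), per_group, total, with_allowance)
```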
10. FOR ONE GROUP WHERE OUTCOME IS
EXPRESSED AS A CONTINUOUS VARIABLE
(E.G., MEANS)
$$n = \frac{Z_{\alpha/2}^{2}\, s^{2}}{e^{2}}$$
where
Zα/2 = standard normal deviate
corresponding to the desired level of
confidence
e = maximum tolerable error; level of
precision, effect size
s² = estimate of variance of observations
11. EXAMPLE
A survey will be done to determine the average
number of times a selected group of
adolescents have engaged in binge drinking
in the past year. How many adolescents
should be included in the study if past studies
have shown that binge drinking in
adolescents occurs about 8 times on the
average per year (SD=0.40)? The desired
level of confidence is 95% (i.e., α=0.05), and
the desired precision of the estimate (e) is
5% (i.e., e=0.05).
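12. $$n = \frac{Z_{\alpha/2}^{2}\, s^{2}}{e^{2}} = \frac{(1.96)^{2}(0.40)^{2}}{(0.05)^{2}} \approx 245.86$$
A minimum of 246 adolescents is
needed for the study.
Adding a 10% non-participation rate, 246 + 24.6 = 270.6, so at least
271 respondents are needed.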
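The same steps for a single mean can be sketched in Python. The function name `sample_size_one_mean` is illustrative, and the inputs are the values given in the example (SD = 0.40, e = 0.05, Z = 1.96).

```python
import math

def sample_size_one_mean(s, e, z=1.96):
    """Unrounded n = z^2 * s^2 / e^2 for estimating a single mean."""
    return z ** 2 * s ** 2 / e ** 2

raw_n = sample_size_one_mean(s=0.40, e=0.05)
n = math.ceil(raw_n)                    # minimum sample size
with_allowance = math.ceil(n * 1.10)    # add 10% non-participation
print(round(raw_n, 2), n, with_allowance)
```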
13. FOR TWO OR MORE GROUPS AND OUTCOME
IS EXPRESSED AS CONTINUOUS VARIABLE
(E.G., MEANS)
$$n = \frac{2 s^{2}\,(Z_{\alpha/2} + Z_{\beta})^{2}}{e^{2}}$$
where
n = sample size per group
s² = estimate of variance of observations
e = magnitude of difference to be detected
Zα/2 = standard normal deviate corresponding to
the desired level of confidence
Zβ = standard normal deviate corresponding to the β error
rate
14. A group of acceptors and non-acceptors of
measles immunization were compared in
terms of their beliefs in measles
vaccination. Beliefs in measles
vaccination were measured using a
5-point semantic differential scale. The pilot
data indicated that a conservative value
for the variance was 1.0. It was also
decided that the smallest difference that
the study should detect was 0.4, where
α=0.05, two-tailed, power=90%.
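15. $$n = \frac{2 s^{2}\,(Z_{\alpha/2} + Z_{\beta})^{2}}{e^{2}}$$
$$n = \frac{2(1.0)(1.96 + 1.2817)^{2}}{(0.4)^{2}} \approx 131.36$$
This gives a minimum of 132 per group.
With a 10% allowance for non-participation, 132 + 10%(132) = 145.2, or 146 per group.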
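A short Python sketch of the two-group calculation for means follows. The function name `sample_size_two_means` is illustrative. The default Z values are the ones used in the example (1.96 for α = 0.05 two-tailed, 1.2817 for 90% power).

```python
import math

def sample_size_two_means(variance, e, z_alpha=1.96, z_beta=1.2817):
    """Unrounded per-group n = 2 * s^2 * (z_alpha + z_beta)^2 / e^2."""
    return 2 * variance * (z_alpha + z_beta) ** 2 / e ** 2

raw_n = sample_size_two_means(variance=1.0, e=0.4)
per_group = math.ceil(raw_n)                    # round up within each group
with_allowance = math.ceil(per_group * 1.10)    # add 10% per group
print(round(raw_n, 2), per_group, with_allowance)
```

Because n is defined per group, the study as a whole needs twice this number.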
When there is absolutely no prior knowledge about
the variance of the population, a maximum
variance of 0.50 can be estimated.
It must be noted that the higher the variance, the
larger will be the sample size.
17. An investigator wants to know the GPA of
MMSU students. However, he does not
have enough resources to survey the
entire population of 3,000 students. If he
wants to use a sample of this population,
with a 5% margin of error, what should his
sample size be?
Given:
N = 3,000
e = 5% = 0.05 Required: n = ?
18. Solution:
$$n = \frac{N}{1 + N e^{2}} = \frac{3000}{1 + (3000)(0.05)^{2}} = \frac{3000}{8.5} \approx 352.9$$
or a minimum of 353
Slovin’s formula is a simple and
commonly used way of
estimating sample size.
It is only applicable to one-group surveys
where the population size is known.
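Slovin's formula, n = N / (1 + N e²), is easy to check in Python. This is a minimal sketch with an illustrative function name, using the values from the GPA example (N = 3,000, e = 0.05).

```python
import math

def slovin(population_size, e):
    """Unrounded sample size from Slovin's formula: N / (1 + N * e^2)."""
    return population_size / (1 + population_size * e ** 2)

raw_n = slovin(population_size=3000, e=0.05)
n = math.ceil(raw_n)   # round up to a whole respondent
print(round(raw_n, 1), n)
```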
20. Exercises:
1. A student in public administration wants to determine the mean
amount members of city councils in large cities earn per month as
remuneration for being a council member. The error in estimating the
mean is to be less than Php1,000 with a 95% level of confidence. The
student found a report by the Department of Labor that estimated the
standard deviation to be Php5,000. What is the required sample size?
2. The study in problem 1 also estimates the proportion of cities that
have private collectors. The student wants the estimate to be within
0.10 of the population proportion, the desired level of confidence is
90%, and no estimate is available for the population proportion. What is
the required sample size?
3. Will you assist the college registrar in determining how many
transcripts to study? The registrar wants to estimate the arithmetic
mean grade point average (GPA) of all graduating seniors during the
past 10 years. GPA’s range between 2.0 and 4.0. The mean GPA is to
be estimated within plus or minus 0.05 of the population mean. The
standard deviation is estimated to be 0.279. Use the 99% level of
confidence.
21. CENTRAL LIMIT THEOREM
If X has a distribution (not
necessarily normal) with
mean μ and variance σ²,
then the distribution of the
sample mean approaches
the normal distribution with
mean μ and variance σ²/n
as the sample size
increases.
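A small simulation can illustrate the theorem. The sketch below is an assumption-laden illustration, not part of the original slides: it draws from an exponential distribution with mean 1 and variance 1 (clearly non-normal), takes samples of size n = 30, and checks that the sample means have mean close to 1 and variance close to 1/n. The distribution, sample size, number of repetitions, and seed are all arbitrary choices.

```python
import random
import statistics

def simulate_sample_means(n, reps, seed=0):
    """Return reps sample means, each from n exponential(1) draws."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(reps)
    ]

n = 30
means = simulate_sample_means(n=n, reps=5000)
mean_of_means = statistics.fmean(means)     # should be near mu = 1
var_of_means = statistics.pvariance(means)  # should be near sigma^2 / n

print(mean_of_means, var_of_means, 1.0 / n)
```

A histogram of `means` would look roughly bell-shaped even though the underlying exponential distribution is strongly skewed.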
23. REMARK
A large sample size, such as n ≥ 25,
only implies that “normality” of
the sample mean may be
assumed.
However, it does not imply that
this is the “appropriate”
sample size for inference.
Editor's Notes
Alpha error rate and margin of error are preset by the researcher.
Comparing one group to the known population; thus, e = 5%, set by the researcher.
Beta error is set by the researcher, just as alpha error.
Power is the complement of the beta error rate: Power = 1 – β.
Comparing the two groups with p1 = 95% and p2 = 80%; thus, e = 15%.