3. Approaches and decisions about
sample size
• Depends on research questions and research
designs
• Some common situations
– Estimating prevalence /mean
– Comparing two groups
• Mean
• Proportion
4. Sample size for prevalence study
• For dichotomous data (only 2 outcomes;
sick/not sick, male/female, dead/alive)
• The study describes results as percentages
• For example, disease prevalence survey
5. Sample size: prevalence survey
$$ n = \frac{Z^2 \, [p(1-p)]}{d^2} $$
• p = estimated prevalence (entered as a proportion, e.g. 0.20 for 20%)
• q = 1 - p
• Z = critical value for a 95% confidence interval = 1.96
• d = allowable error (in the same units as p)
• n = sample size
6. Error
• Suppose the survey aims to estimate the
true prevalence of a disease in a population
• The estimate we get from the survey will be
within +/- d% of the true prevalence (with 95%
confidence when Z = 1.96)
7. Example
• A survey is to estimate prevalence of influenza
virus infection in school kids
• Suppose the available evidence suggests that
approximately 20% (P=20) of the children will
have antibodies to the virus
• Assume the investigator wants to estimate the
prevalence within 6% of the true value (6% is
called allowable error; d)
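The slides state the inputs for this example but not the resulting sample size. A minimal sketch of the calculation, assuming the formula n = Z^2 p(1-p) / d^2 with p and d entered as proportions (0.20 and 0.06) and Z = 1.96; the function name `sample_size_prevalence` is hypothetical, chosen only for illustration:

```python
import math


def sample_size_prevalence(p, d, z=1.96):
    """Sample size for a prevalence survey: n = Z^2 * p * (1 - p) / d^2.

    p and d are proportions (0.20 means 20%); z defaults to the
    critical value for a 95% confidence interval.
    """
    return z ** 2 * p * (1 - p) / d ** 2


# Influenza example: expected prevalence 20%, allowable error 6%
n = sample_size_prevalence(p=0.20, d=0.06)
print(round(n, 2))   # about 170.74
print(math.ceil(n))  # round up to whole children: 171
```

Under these assumptions, roughly 171 children would be needed. The result is rounded up because a fractional subject cannot be sampled.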
8. Sample size for estimation of mean
• A survey to find the average of a parameter
(birth weight, antibody titre, blood pressure)
• The study reports average of parameters
• The parameter must be quantitative
9. Sample size for estimation of mean
• SD = Standard Deviation of variable of interest
• d = Allowable error
• Z = value for the desired confidence limit
• n = required sample size
$$ n = \frac{Z^2 \, SD^2}{d^2} $$
10. Example
• Suppose an investigator has some evidence
suggesting that the standard deviation of rat
weight is about 455 g
• He wishes to provide an estimate within 80 g
of the true average (80 g is the allowable
error; d)
11. Example
• The required sample size is
n = (1.96)^2 x (455)^2 / (80)^2 = 124.27
• Thus approximately 125 rats would be
needed (rounding Z up to 2 gives 129.39,
i.e. about 130 rats).
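The arithmetic for this example can be checked with a short sketch, assuming the formula n = Z^2 SD^2 / d^2 from the preceding slide; the function name `sample_size_mean` is hypothetical. Note that Z = 1.96 gives about 124.27, while the figure 129.39 corresponds to Z rounded to 2:

```python
import math


def sample_size_mean(sd, d, z=1.96):
    """Sample size for estimating a mean: n = Z^2 * SD^2 / d^2."""
    return (z * sd / d) ** 2


# Rat weight example: SD = 455 g, allowable error d = 80 g
n_exact = sample_size_mean(sd=455, d=80)         # Z = 1.96
n_rounded = sample_size_mean(sd=455, d=80, z=2)  # Z rounded to 2

print(round(n_exact, 2), math.ceil(n_exact))      # about 124.27, so 125 rats
print(round(n_rounded, 2), math.ceil(n_rounded))  # about 129.39, so 130 rats
```

Either choice of Z leads to a sample of roughly 125 to 130 rats; the larger figure is the more conservative one.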
12. Sample Size to Compare Percentages
• A study to compare percentages of outcomes
from different groups (incidence, cure rate,
mortality rate, survival rate)
13. Sample Size to Compare Percentages
• Pc = percentage from control group
• Qc = 1- Pc
• Pe = Percentage from the experimental group
• Qe = 1- Pe
$$ n = \frac{C \, (P_c Q_c + P_e Q_e)}{d^2} + \frac{2}{d} + 2 $$
14. Sample Size to Compare Percentages
• d = Difference between the two groups (must
be positive)
• C = Constant (See table next page)
$$ n = \frac{C \, (P_c Q_c + P_e Q_e)}{d^2} + \frac{2}{d} + 2 $$
15. C : Constant
• When power is 80%
• Power = Ability to find significance when the
two groups are really different (the formula is
for two sided difference)
alpha 0.05 0.01
C 7.85 11.68
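The tabulated constants agree with the standard expression C = (z_{1-alpha/2} + z_{power})^2 for a two-sided test. The slides do not state how C was derived, so treating the table as coming from this expression is an assumption. A minimal sketch using only the Python standard library; the function name `power_constant` is hypothetical:

```python
from statistics import NormalDist


def power_constant(alpha, power=0.80):
    """C = (z_{1 - alpha/2} + z_{power})^2 for a two-sided test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = NormalDist().inv_cdf(power)          # quantile for the power
    return (z_alpha + z_power) ** 2


for alpha in (0.05, 0.01):
    print(alpha, round(power_constant(alpha), 2))
# prints 0.05 7.85, then 0.01 11.68
```

Under the same assumption, other power levels (for example 90%) could be handled by changing the `power` argument.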
16. Example 1
• The research question is whether smokers have
a greater incidence of skin cancer than
nonsmokers
• A review of previous literature suggests that
the incidence of skin cancer is about 0.2 in
nonsmokers
• At alpha=0.05, and power=80%, how many
smokers and nonsmokers will need to be
studied to determine whether skin cancer
incidence is at least 0.3 in smokers?
17. Example 2
• Null Hypothesis : The incidence of skin
cancer does not differ in smokers and
nonsmokers
• Alternative Hypothesis : The incidence of
skin cancer differs between smokers and
nonsmokers
18. Example
• Pe = 0.3, Pc = 0.2
$$ n = \frac{C \, (P_c Q_c + P_e Q_e)}{d^2} + \frac{2}{d} + 2 $$
$$ n = \frac{7.85 \, [(0.2)(0.8) + (0.3)(0.7)]}{(0.1)^2} + \frac{2}{0.1} + 2 $$
n = 312.45, so use 313 persons in each group
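The skin cancer calculation can be reproduced with a short sketch, assuming the formula n = C (Pc Qc + Pe Qe) / d^2 + 2/d + 2 with d = |Pe - Pc| and C = 7.85 (alpha = 0.05, power = 80%); the function name `sample_size_two_proportions` is hypothetical:

```python
import math


def sample_size_two_proportions(pc, pe, c=7.85):
    """Per-group n = C * (Pc*Qc + Pe*Qe) / d^2 + 2/d + 2.

    pc, pe: proportions in the control and experimental groups.
    c: constant from the table (7.85 for alpha = 0.05, power = 80%).
    """
    d = abs(pe - pc)  # the difference must be positive
    return c * (pc * (1 - pc) + pe * (1 - pe)) / d ** 2 + 2 / d + 2


# Skin cancer example: incidence 0.2 in nonsmokers, 0.3 in smokers
n = sample_size_two_proportions(pc=0.2, pe=0.3)
print(round(n, 2))   # about 312.45
print(math.ceil(n))  # 313 persons in each group
```

Taking the absolute value of the difference means the two groups can be passed in either order.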
19. Sample Size to Compare Means
• Hypothesis: Compare means of different
groups
• The parameters are quantitative (birth weight,
blood pressure)
• Select the 2 groups that you expect to differ
most (such as a control and a
treatment group)
• For t-test, ANOVA
20. Sample Size to Compare Means
• s = Standard Deviation of the variable
• d = Difference between the 2 groups
• C = Constant (from previous table)
$$ n = 1 + 2C \left( \frac{s}{d} \right)^2 $$
21. Example
• The research question is to compare the efficacy
of DRUG A and DRUG B in the treatment of
asthma
• The outcome variable is FEV1 (forced expiratory
volume in 1 second) 1 hour after treatment
• A previous study has reported that the mean
FEV1 in persons with treated asthma was 2.0
litres, with a standard deviation of 1.0 litre
• The investigator would like to be able to detect a
difference of 10% or more in mean FEV1 between
the two treatment groups
22. Example
• Null Hypothesis : Mean FEV1 is the same in
asthmatics treated with DRUG A as in those
treated with DRUG B
• Alternative Hypothesis : Mean FEV1 is
different between asthmatic patients treated
with DRUG A and those treated with DRUG B
23. Example
• s = 1 litre
• d = 10% of 2 litres = 0.2 litre
$$ n = 1 + 2C \left( \frac{s}{d} \right)^2 = 1 + 2(7.85) \left( \frac{1}{0.2} \right)^2 $$
n = 393.5, so use 394 patients in each group
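The FEV1 calculation can be reproduced with a short sketch, assuming the formula n = 1 + 2C (s/d)^2 with C = 7.85 (alpha = 0.05, power = 80%); the function name `sample_size_two_means` is hypothetical:

```python
import math


def sample_size_two_means(s, d, c=7.85):
    """Per-group n = 1 + 2 * C * (s / d)^2.

    s: standard deviation of the outcome variable.
    d: difference between group means to be detected (same units as s).
    c: constant from the table (7.85 for alpha = 0.05, power = 80%).
    """
    return 1 + 2 * c * (s / d) ** 2


# Asthma example: s = 1.0 litre, d = 10% of 2.0 litres = 0.2 litre
n = sample_size_two_means(s=1.0, d=0.2)
print(round(n, 1))   # about 393.5
print(math.ceil(n))  # 394 patients in each group
```

Because n grows with the square of s/d, halving the detectable difference roughly quadruples the required sample size.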