2. Content Outline...
• What is sample size?
• Basic information needed for sample size calculation.
• Why determine sample size?
• How large a sample do we need?
• What are the methods of determining it?
• What are the factors that affect it?
• Types of measurement in research.
• How do we determine sample size?
• Summary
• Conclusion
3. • A sample is the sub-population to be studied in order to draw
an inference about a reference population (the population to
which the findings of the study are to be generalized).
• In a census, the sample size is equal to the population
size. However, in research, because of time and budget
constraints, a representative sample is normally used.
• The larger the sample, the more precise the findings
from a study will be.
What is a Sample?
4. • The availability of resources sets the upper limit of the
sample size.
• The required accuracy sets the lower limit of the sample size.
• Thus, an optimum sample size is an essential
component of any research.
Cont....
5. 1. What type of study is this?
2. What is the main (primary) outcome?
3. What is the expected variability between the
subjects?
4. How large a difference would be considered clinically
important and reasonable?
Information needed for sample size calculation
6. 1. Outcome Measure
Whether the data are continuous or categorical.
2. Allowable Margin of Error
This is an arbitrary value, usually taken at the 5%
cut-off level, which is otherwise expressed as a 95% CI.
Determination of Sample Size
7. It is the mathematical estimation of the number of subjects to
be included in a study.
Optimum sample size determination is required for the
following reasons:
1. To allow appropriate analysis
2. To provide desired level of accuracy
3. To give validity to the significance test.
Sample Size Determination
8. If the sample is too small:
1. Even a well conducted study may fail to
answer its research question.
2. It may fail to detect important effects or
associations.
3. It may estimate the effect or association
imprecisely.
How large a sample do you need?
9. Conversely...
If the sample size is too large:
1. The study will be difficult and costly.
2. It will take longer to complete.
3. Accuracy may be lost.
Hence, optimum sample size must be determined
before commencement of a Study.
10. • Random error
• Systematic error (bias)
• Precision (reliability)
• Accuracy (Validity)
• Power
• Effect size
• Design effect
Types of Measurement in Research
11. 1.Random error: Errors that occur by chance
Sources:-
• Sample variability, subject to subject differences &
measurement errors.
Reduced by:-
• Averaging, increasing sample size, repeating the
experiment
12. 2.Systematic error: Deviations not due to
chance alone.
Sources
• Several factors, e.g. patient selection criteria may
contribute.
Reduced by
• Good study design and conduct of the
experiment.
13. 3. Precision
• The degree to which a variable has the same value
when measured several times.
• It is a function of random error.
4. Accuracy
• The degree to which a variable actually represents
the true value.
• It is a function of systematic error.
15. 5. Power:
This is the probability that the test will correctly
identify a significant difference, effect or
association in the sample should one exist in the
population.
• Power increases with sample size.
• The larger the sample size, the greater the power
of the study to detect a significant difference,
effect or association.
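The link between sample size and power can be sketched numerically. The function below is an illustration only: it assumes the normal-approximation formula for comparing two proportions, n = 2 × (u + v)² × p × (1 - p) / (p1 - p2)², solved for u, and the name power_two_proportions and the input values are hypothetical choices.

```python
from math import sqrt
from statistics import NormalDist


def power_two_proportions(n_per_group, p1, p2, v):
    """Approximate power for comparing two proportions.

    Rearranges n = 2 (u + v)^2 p (1 - p) / (p1 - p2)^2 to get u,
    then converts u to a probability with the standard normal CDF.
    v is the two-sided percentage point for the significance level.
    """
    p = (p1 + p2) / 2  # average of the two proportions
    u = abs(p1 - p2) * sqrt(n_per_group / (2 * p * (1 - p))) - v
    return NormalDist().cdf(u)


# Power rises as the number of subjects per group rises.
for n in (1000, 4000, 7709):
    print(n, round(power_two_proportions(n, 0.25, 0.28, v=2.58), 3))
```

With these assumed inputs the printed power climbs steadily with n, which is the behaviour the bullets above describe.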
16. 6. Effect size:
• It is a measure of the strength of the relationship
between two variables in a population.
• The bigger the size of the effect in the
population, the easier it will be to detect.
17. 7. Design effect:
• Estimation of sample size for epidemiological
and clinical studies may require a design effect.
• It is the ratio of the actual variance to the
variance expected with simple random sampling (SRS).
18. Design Effect = Variance (estimate) under complex sampling / Variance (estimate) under SRS
Design effect for cluster sampling:
DE = 1 + ICC × (m - 1)
Where,
• ICC is the intra-class correlation
• m is the cluster size
• Once the design effect is estimated, the sample
size is adjusted by multiplying by it.
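A minimal sketch of the cluster-sampling adjustment described above. The function names and the input values (ICC = 0.05, cluster size 20, an SRS sample size of 385) are hypothetical and chosen only for illustration.

```python
from math import ceil


def design_effect(icc, cluster_size):
    """DE = 1 + ICC * (m - 1) for cluster sampling."""
    return 1 + icc * (cluster_size - 1)


def adjust_for_clustering(n_srs, icc, cluster_size):
    """Multiply the simple-random-sampling size by DE and round up."""
    return ceil(n_srs * design_effect(icc, cluster_size))


de = design_effect(icc=0.05, cluster_size=20)
n_adjusted = adjust_for_clustering(n_srs=385, icc=0.05, cluster_size=20)
print(f"DE = {de:.2f}, adjusted n = {n_adjusted}")
```

When ICC is 0 the design effect is 1 and no adjustment is made; the larger the clusters or the ICC, the larger the required sample.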
19. It can be addressed at two stages:
1. During the planning stage, while designing the
study, using information on some parameters.
2. At the stage of interpretation of the results.
At what stage can sample size be addressed?
20. There are 3 procedures that could be used:
1. Use of formulae
2. Ready-made tables
3. Computer software
Procedure for Calculating Sample size
21. It depends on:
1. The study design
2. The main outcome measure of the study
There are distinct approaches for calculating
sample size for different study designs & different
outcome measures.
Approaches for estimating sample size
22. Sample Size Formula
The formula requires that we:
(i) Specify the amount of confidence we wish to have
(ii) Estimate the variance in the population
(iii) Specify the level of desired accuracy we want
23. Use of Formula
• Requirements for sample size calculations
µ or p = mean or proportion of interest.
µ0 or p0 = null hypothesis mean or proportion.
d = half-width of the confidence interval (CI).
u = one-sided percentage point of the normal distribution
corresponding to 100% - power.
v = two-sided percentage point of the normal distribution
corresponding to the required significance level.
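The percentage points u and v can be looked up in tables or computed. The sketch below uses Python's standard-library NormalDist; the helper names one_sided_point and two_sided_point are hypothetical.

```python
from statistics import NormalDist


def one_sided_point(power):
    """u: standard normal point leaving (1 - power) in one tail."""
    return NormalDist().inv_cdf(power)


def two_sided_point(alpha):
    """v: standard normal point leaving alpha/2 in each tail."""
    return NormalDist().inv_cdf(1 - alpha / 2)


print(round(one_sided_point(0.95), 2))   # power of 95%
print(round(two_sided_point(0.05), 2))   # 5% significance level
print(round(two_sided_point(0.01), 2))   # 1% significance level
```

Rounded to two decimals these give the familiar values 1.64, 1.96 and 2.58.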
25. Statistical formula for sample size
calculation...
• Sample size for the mean:
n = Z² × SD² / e²
Where
• Z = standard normal value for the chosen confidence
level (1.96 for 95%)
• SD = standard deviation of the population (SD² is the variance)
• e = allowable error
26. Example
• A health officer wishes to estimate the mean
hemoglobin in a defined community. Preliminary
information is that this mean is about 150mg/l
with a SD of 32mg/l. If a sampling error of up to
5mg/l in the estimate is to be tolerated, how
many subjects should be included in the study?
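One way to work this example, assuming the standard form n = Z² × SD² / e² with Z = 1.96 for 95% confidence (the confidence level is not stated in the example and is an assumption here). The function name sample_size_mean is hypothetical.

```python
from math import ceil


def sample_size_mean(sd, e, z=1.96):
    """Subjects needed to estimate a mean to within +/- e.

    sd is the standard deviation, e the allowable error, and z the
    standard normal value for the confidence level (1.96 for 95%).
    """
    return ceil(z ** 2 * sd ** 2 / e ** 2)


# SD of 32 and a tolerated sampling error of 5, as in the example.
n = sample_size_mean(sd=32, e=5)
print(n)
```

The raw value is about 157.4, so rounding up gives 158 subjects.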
28. • Sample Size for Proportions &
Prevalence:
n = Z² × p × (1 - p) / e²
Where:
Z = standard normal value for the chosen confidence
level (1.96 for 95%)
p = estimated prevalence or proportion in the
project area
e = allowable error (half-width of the CI)
29. Example..
• Suppose the prevalence of Brucella infection is
2% and the estimate is required to be within an absolute
0.25% of the true value with 95% confidence. What is the
sample size required?
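A sketch of this calculation using n = Z² × p × (1 - p) / e², with the 0.25% treated as the allowable absolute error e (an interpretation of the example). The function name sample_size_proportion is hypothetical.

```python
from math import ceil


def sample_size_proportion(p, e, z=1.96):
    """Subjects needed to estimate a proportion p to within +/- e."""
    return ceil(z ** 2 * p * (1 - p) / e ** 2)


# Prevalence 2%, allowable absolute error 0.25%, 95% confidence.
n = sample_size_proportion(p=0.02, e=0.0025)
print(n)
```

The raw value is about 12047.3, so 12048 subjects would be needed under these assumptions.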
31. Comparison between two proportions
• n = 2 × (u + v)² × p × (1 - p) / (p1 - p2)²
• p = (p1 + p2) / 2
• p1, p2 = proportions
• u = one-sided percentage point of the normal
distribution corresponding to 100% - power.
• v = two-sided percentage point of the normal
distribution corresponding to the required
significance level
32. EXAMPLE
• A study was planned to record the difference in mortality
among cases of road traffic injuries graded AIS score 4
and 5 in the month of July. Results from a previous study
show 18 deaths among 72 patients graded AIS score 4.
To calculate sample size, we need:
• Proportion mortality in previous study: p1 = 18/72 = 0.25
or 25%
• Size of difference between proportion mortalities that
would be considered appreciable:
(p1 - p2): decided as 3% by investigators.
• Thus expected proportion mortality in grade 5 cases:
p2 = 28% = 28/100 = 0.28
33. p = (0.25 + 0.28)/2 = 0.265
• Power required: decided to be kept at 95%
u = 1.64
• Significance level desired: decided to be kept at 1%
v = 2.58
• Applying formula
• n = 2 × (1.64 + 2.58)² × (0.265)(1 - 0.265) / (0.28 - 0.25)²
≈ 7708.1
Therefore, rounding up, about 7709 subjects need to be studied in each
group.
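The arithmetic for this example can be checked with a short sketch. It assumes the formula n = 2 × (u + v)² × p × (1 - p) / (p1 - p2)², with the difference squared in the denominator; the function name is hypothetical.

```python
from math import ceil


def sample_size_two_proportions(p1, p2, u, v):
    """Subjects per group for comparing two proportions.

    u is the one-sided point for the power, v the two-sided point
    for the significance level.
    """
    p = (p1 + p2) / 2  # average proportion
    n = 2 * (u + v) ** 2 * p * (1 - p) / (p1 - p2) ** 2
    return ceil(n)


# p1 = 0.25, p2 = 0.28, power 95% (u = 1.64), significance 1% (v = 2.58).
print(sample_size_two_proportions(0.25, 0.28, u=1.64, v=2.58))
```

With these inputs the expression evaluates to roughly 7708, which rounds up to 7709 per group.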
34. Formula Used with Rates...
• Estimation of single rate
n = µ × v² / d²
µ = rate
d = half-width of the CI
v = two-sided percentage point of the normal distribution corresponding
to the required significance level.
• Comparison of two rates
n = (u + v)² × (µ1 + µ0) / (µ1 - µ0)²
µ1, µ0 = rates
u = one-sided percentage point of the normal distribution
corresponding to 100% - power.
v = two-sided percentage point of the normal distribution
corresponding to the required significance level
35. EXAMPLE-ESTIMATION OF SINGLE
RATE
• A study was planned to find the mean number of viral
influenza episodes per child per annum in 0-5 year old
children in Orissa, India. To calculate sample size, we
need:
• Average number of influenza episodes expected in 0-5
year olds per annum. Review of existing literature
states it to be approximately 4.
• 95% confidence interval we would like to have for our
desired average: decided to be ± 0.2 by investigators,
i.e. 95% CI = 3.8-4.2
36. • Two-sided percentage point of normal distribution
corresponding to required significance level:
v for 95% CI = 1.96 (≈ 2)
• Applying formula
n > µ × v² / d² = 4 × (2)² / (0.2)²
= 400
• Thus, at least 400 subjects need to be studied to estimate
a mean influenza rate of 4 per child per annum with a 95% CI of
± 0.2
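A sketch of the single-rate calculation, n = µ × v² / d². The function name is hypothetical. It is run with both the rounded v = 2 used on the slide and the unrounded 1.96, to show how much the rounding matters.

```python
from math import ceil


def sample_size_single_rate(rate, d, v):
    """Subjects needed to estimate a rate to within +/- d."""
    return ceil(rate * v ** 2 / d ** 2)


# Expected rate 4 per child per annum, CI half-width 0.2.
print(sample_size_single_rate(rate=4, d=0.2, v=2))     # v rounded to 2
print(sample_size_single_rate(rate=4, d=0.2, v=1.96))  # v = 1.96
```

Rounding v up to 2 gives 400 subjects; using 1.96 gives 385, so the rounded value is slightly conservative.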
37. EXAMPLE-COMPARISON OF RATES
• A study was planned to find out whether KAP improvement
tools for driving skills decrease injuries per annum in school
going children. School children were randomly assigned to
an intervention group, who received special education, and
controls, who did not. To calculate sample size, we need:
• Size of difference between mean road traffic accident rates
that was considered appreciable:
(µ1 - µ0): decided as 2 injuries per child per annum by
investigators
• Rate of injuries per child per annum among controls:
suggested to be 4 injuries per child per annum.
Therefore µ0 = 4, µ1 = 2
38. • Power required: decided to be kept at 95%
1 - power = 5%
u = 1.64
• Significance level desired: decided to be kept at 1%
v = 2.58
• Applying the formula:
n > (1.64 + 2.58)² × (2 + 4) / (2 - 4)²
= 17.8084 × 6 / 4 = 26.71
Therefore 27 subjects are needed in each group.
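The two-rate calculation above can be reproduced with a short sketch of n = (u + v)² × (µ1 + µ0) / (µ1 - µ0)². The function name is hypothetical.

```python
from math import ceil


def sample_size_two_rates(rate1, rate0, u, v):
    """Subjects per group for comparing two rates."""
    n = (u + v) ** 2 * (rate1 + rate0) / (rate1 - rate0) ** 2
    return ceil(n)


# Rates of 2 and 4 injuries per child per annum,
# power 95% (u = 1.64), significance level 1% (v = 2.58).
print(sample_size_two_rates(rate1=2, rate0=4, u=1.64, v=2.58))
```

The raw value is about 26.71, which rounds up to 27 per group.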
39. Use of Software...
The following software can be used for
calculating sample size:
Epi Info (epiinfo.codeplex.com)
nQuery (nquery.codeplex.com)
STATA (www.stata.com)
SPSS (www.spss.co.in)
40. Use of Ready Made Table...
How large a sample of patients should be followed up if an
investigator wishes to estimate the incidence rate of a disease
to within 10% of its true value with 95% confidence?
The table shows that for e = 0.10 and a confidence level of 95%, a
sample size of 385 would be needed.
This table can be used to calculate the sample size after making
the desired changes in the relative precision and confidence
level, e.g. if the level of confidence is reduced to 90%, then the
sample size would be 271.
Such tables, which give ready-made sample sizes, are available
for different designs and situations.
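The slide does not state the formula behind the table. As an assumption, the sketch below uses n = (z / e)², where z is the two-sided normal point for the confidence level and e the relative precision; this form reproduces both values quoted above. The function name is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def sample_size_relative_precision(e, confidence):
    """n = (z / e)^2 for estimating a rate to within a fraction e.

    Assumed formula, not taken from the slide.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z / e) ** 2)


print(sample_size_relative_precision(e=0.10, confidence=0.95))
print(sample_size_relative_precision(e=0.10, confidence=0.90))
```

Under this assumption the function returns 385 at 95% confidence and 271 at 90%, matching the tabulated figures.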
42. CONCLUSIONS
• Sample size determination is one of the most
essential components of every research study.
• The larger the sample size, the higher the
degree of accuracy, but this is limited by the
availability of resources.
• It can be determined using formulae, ready-made
tables and computer software.