3. Census
• Quantitative research method, in which all the members of the
population are enumerated.
• Implies complete enumeration of the study participants
• It is presumed that in such an inquiry, when all items are covered,
no element of chance is left and the highest accuracy is obtained.
Prabesh Ghimire, MPH 3
4. Advantages of Census
• It provides basis for overall socio-economic planning of the
country.
• Provides complete information about the population
• More reliable and accurate information
• Covers wide range of the study
5. Demerits of Census
• Resource intensive (time, human resources, financial
resources)
• Possibilities of non-sampling error (e.g. coverage and measurement errors) are higher in census investigation
6. Sampling
• Statistical procedure of drawing
a sample from a population
• Based on belief that drawn
sample will exhibit the relevant
characteristics of the whole
population
7. Applications of Sampling in Public Health
• Random sampling is the basic requirement for establishing
cause-effect relationships.
• Good sampling design can provide more reliable estimates.
• Use of appropriate sampling methods helps generalize the
findings of health research to the entire population of interest.
• Sampling is useful to assure both internal and external validity
of public health research.
8. Significance of Sampling
• Necessity: Sometimes it’s simply not possible to study the whole
population due to its size or inaccessibility.
• Practicality: It’s easier and more efficient to collect data from a
sample.
• Cost-effectiveness: There are fewer participant, laboratory,
equipment, and researcher costs involved.
• Manageability: Storing and running statistical analyses on smaller
datasets is easier and more reliable.
10. Target/ Reference Population
• The target population is that population to which it is intended to
apply the results.
• Population to which the researchers are interested in
generalizing the study findings.
• Example:
• All mothers of Under-5 Children,
• All pregnant teens,
• All people living with HIV (PLHIV)
11. Study Population
• It is the accessible population that researchers draw their
sample from.
• This population is a subset of the target population.
• A defined population from which a sample has been selected.
• Example: Mothers of under-5 children of XYZ municipality
12. Sample
• Specific group that you will collect data from.
• The size of the sample is always less than the total size of the
population.
13. Sampling Frame
• A sampling frame is a list of all the items (sampling units) in the
population from which the sample is drawn
• It’s a complete list of everyone or everything that researchers
want to study.
• The difference between a population and a sampling frame is
that the population is general and the frame is specific.
• A frame is needed so that everyone in the population is identified
and has an equal opportunity for selection in the
study.
15. Simple Random Sampling
• Sampling technique where every item in
the population has an equal chance and
likelihood of being selected in the sample.
• Selection of items depends entirely on
luck or probability, and therefore this
sampling technique is also sometimes
known as a method of chances.
• The sample size in this sampling method
should ideally be more than a few
hundred so that simple random sampling
can be applied appropriately.
16. Techniques of simple random sampling
• Lottery
• Use of random number table
• Computer generated random number
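The computer-generated option can be sketched in Python as below. This is a minimal sketch, not part of the original slides: the frame of 500 household IDs, the sample size of 50 and the function name `simple_random_sample` are all hypothetical.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n units without replacement; every unit has the same chance."""
    rng = random.Random(seed)  # a seed only makes the draw reproducible
    return rng.sample(frame, n)

# Hypothetical sampling frame: household IDs 1..500
frame = list(range(1, 501))
sample = simple_random_sample(frame, 50, seed=42)
print(sorted(sample)[:5])
```

Fixing the seed is a convenience for documenting the draw; leaving it as `None` gives a fresh random selection each time.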
17. Stratified Random Sampling
• For a stratified random sample, the population is divided into
groups or strata.
• To stratify means to classify or to separate people into groups
according to some characteristics, such as
• position, rank, income, education, sex, or ethnic background
• The population is divided to make the elements within a
group/stratum as homogeneous as possible.
18. Stratified Random Sampling
Two types
• Proportionate
• The sample size from each stratum depends on the size of the
stratum.
• Therefore, larger strata are sampled more heavily as they make up a larger
percentage of the target population.
• Disproportionate
• In disproportionate sampling, the sample selection from each stratum is
independent of its size.
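The two allocation rules can be compared with a short Python sketch. The strata names and sizes below are hypothetical, and equal allocation is used here only as one example of a disproportionate scheme; the slides do not prescribe a specific one.

```python
def proportionate_allocation(stratum_sizes, n):
    """Each stratum's share of the sample equals its share of the population."""
    total = sum(stratum_sizes.values())
    # Rounding can make the parts sum to slightly more or less than n
    return {s: round(n * size / total) for s, size in stratum_sizes.items()}

def equal_allocation(stratum_sizes, n):
    """One disproportionate scheme: the same sample size in every stratum."""
    per_stratum = n // len(stratum_sizes)
    return {s: per_stratum for s in stratum_sizes}

# Hypothetical strata and population sizes
strata = {"urban": 6000, "rural": 3000, "mountain": 1000}
prop = proportionate_allocation(strata, 200)
equal = equal_allocation(strata, 150)
print(prop)   # {'urban': 120, 'rural': 60, 'mountain': 20}
print(equal)  # {'urban': 50, 'rural': 50, 'mountain': 50}
```

Within each stratum the allocated number of units would then be drawn at random, for example by simple random sampling.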
20. Merits
• Stratified random samples are generally more accurate in
representing the population than are simple random samples.
• Suitable for large and heterogeneous populations
Demerits
• Because participants are to be chosen randomly from each
stratum, a complete list of the population within each stratum
must be constructed.
21. Systematic Random Sampling
• In systematic sampling, only the first sample unit is selected at
random and the remaining units are automatically selected at
a fixed, equal interval guided by some rule.
• Suppose N units of population are numbered from 1 to N in
some order.
• Then, the sample interval K = N/n is determined, where n is the
desired sample size.
• The first item is selected at random between 1 and K, and every
Kth element thereafter is automatically selected.
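The procedure above (number the units, compute K = N/n, pick a random start between 1 and K, then take every Kth unit) can be sketched as follows. The values N = 1000 and n = 100 are hypothetical, and using integer division for K when N is not an exact multiple of n is an assumption of this sketch.

```python
import random

def systematic_sample(N, n, seed=None):
    """Return the positions (1..N) selected by systematic sampling."""
    k = N // n                      # sampling interval K = N/n (integer part)
    rng = random.Random(seed)
    start = rng.randint(1, k)       # random start between 1 and K, inclusive
    return [start + i * k for i in range(n)]

# Hypothetical population of 1000 numbered units, sample of 100
units = systematic_sample(N=1000, n=100, seed=1)
print(units[:5])
```

Only the starting position is random; every later position is fixed by the interval, which is why an ordering with a periodic pattern can bias the result.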
23. Systematic Random Sampling
Merits
• This method is simple and easy.
• The selected samples are evenly spread in the population and
therefore minimize chances of clustered selection of subjects
• Sampling frame is not always required
Limitations
• The method may introduce bias when elements are not
arranged in random order.
24. Systematic Sampling Methods
Interval Sampling
• Select every Nth case at the health facility.
• For example every 5th, 7th, or 10th patient that meets the
inclusion criteria would be selected.
• Some foreknowledge of the volume of cases at the site is
required so that appropriate sampling interval can be selected.
Source: WHO interim global surveillance standards for influenza
25. Systematic Sampling Methods
Alternate Day Sampling
• Select all patients meeting the inclusion criteria presenting to a
facility on a certain day or days of the week.
• This can reduce the logistical challenges of surveillance by
confining laboratory specimen and data collection efforts to a
single day.
• In order to remove the bias of the week, the day on which cases
are selected should be systematically alternated from week to
week.
Source: WHO interim global surveillance standards for influenza
26. Cluster Sampling
• Cluster sampling is a sampling plan used when mutually
homogeneous yet internally heterogeneous groupings are
evident in a statistical population.
• In this sampling plan, the total population is divided into these
groups (known as clusters) and a simple random sample of the
groups is selected.
• The elements in each cluster are then sampled.
27. Cluster Sampling
• If all elements in each sampled cluster are sampled, then this is
referred to as a "one-stage" cluster sampling plan.
• If a simple random subsample of elements is selected within
each of these groups, this is referred to as a "two-stage" cluster
sampling plan.
• A common motivation for cluster sampling is to reduce the
research costs given the desired accuracy
28. Cluster elements
• The population within a cluster should ideally be as
heterogeneous as possible, but there should be homogeneity
between clusters.
• Each cluster should be a small-scale representation of the total
population.
30. Cluster Random Sampling
Merits
• Can be cheaper than other sampling plans – e.g. fewer travel expenses,
administration costs.
• Feasibility: This sampling plan takes large populations into account. Since
these groups are so large, deploying any other sampling plan would be
very costly
• Does not require sampling frame
Limitations
• Complexity
• Design effect: sampling error is usually larger than for a simple random sample of the same size
• Results may not be generalizable
31. Probability Proportionate to Size
• The probability of selecting a cluster is proportional to its size,
so that a large cluster has a greater probability of selection than
a small cluster.
• The advantage here is that when clusters are selected with
probability proportionate to size, the same number of interviews
should be carried out in each sampled cluster so that each unit
sampled has the same probability of selection.
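A rough Python sketch of both points follows. The cumulative-size (systematic) method used to pick clusters is one common way to implement PPS and is an assumption here, as are the cluster names, sizes, the number of clusters (m = 2) and the 20 interviews per cluster.

```python
import random
from bisect import bisect_left
from itertools import accumulate

def pps_select(cluster_sizes, m, seed=None):
    """Pick m clusters with probability proportional to size.

    Clusters are laid end to end by cumulative size; a random start is
    taken in the first interval and later points follow at a fixed
    interval. A cluster larger than the interval can be picked twice.
    """
    names = list(cluster_sizes)
    cumulative = list(accumulate(cluster_sizes[name] for name in names))
    interval = cumulative[-1] / m
    rng = random.Random(seed)
    start = rng.uniform(0, interval)
    picks = []
    for i in range(m):
        point = start + i * interval
        # Guard against floating-point overshoot past the last cluster
        index = min(bisect_left(cumulative, point), len(names) - 1)
        picks.append(names[index])
    return picks

def unit_probability(size, total, m, per_cluster):
    """P(cluster is chosen) times P(unit is chosen within that cluster)."""
    return (m * size / total) * (per_cluster / size)

# Hypothetical clusters (e.g. wards) and their population sizes
clusters = {"A": 500, "B": 1500, "C": 1000, "D": 2000}
total = sum(clusters.values())
selected = pps_select(clusters, m=2, seed=3)
probs = {c: unit_probability(s, total, m=2, per_cluster=20)
         for c, s in clusters.items()}
print(selected, probs)
```

The cluster size cancels in `unit_probability`, so every unit ends up with the same overall chance (here 2 × 20 / 5000) regardless of which cluster it belongs to.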
33. Multi-Stage Sampling
• Multi-stage sampling (also known as multi-stage cluster
sampling) is a more complex form of cluster sampling which
contains more than two stages in sample selection.
• Large clusters of population are divided into smaller clusters in
several stages in order to make primary data collection more
manageable.
34. Example Multi-Stage Sampling
• Choose 3 provinces in Nepal using SRS (or other probability
sampling)
• Choose 3 districts in each province using SRS (or other
probability methods)
• Choose 3 municipalities from each district using SRS (or other
probability methods)
• Choose 100 households from each municipality using SRS or
Systematic random sampling.
• This will result in 2700 households to be included in the sample
group
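The four stages above can be simulated with a short sketch. Nepal's 7 provinces are real; the numbers of districts per province (10), municipalities per district (8) and households per municipality (500) are hypothetical placeholders, and SRS is used at every stage.

```python
import random

def multistage_sample(seed=None):
    """Select households through four nested stages of simple random sampling."""
    rng = random.Random(seed)
    households = []
    for province in rng.sample(range(1, 8), 3):               # 3 of 7 provinces
        for district in rng.sample(range(1, 11), 3):          # 3 of 10 (hypothetical)
            for municipality in rng.sample(range(1, 9), 3):   # 3 of 8 (hypothetical)
                for household in rng.sample(range(1, 501), 100):  # 100 of 500 (hypothetical)
                    households.append((province, district, municipality, household))
    return households

sample = multistage_sample(seed=7)
print(len(sample))  # 3 * 3 * 3 * 100 = 2700
```

A sampling frame of households is only needed inside the 27 selected municipalities, not for the whole country.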
35. Multi-Stage Sampling
Merits
• Lower cost and greater speed of conducting the survey
• Convenience of finding the survey sample, particularly in large
areas
• Sample frame required only for the selected clusters
Limitations
• May not always acquire a representative sample
• The presence of group-level information is required
37. Convenience Sampling
• Sometimes known as grab or opportunity sampling or
accidental or haphazard sampling.
• A type of non-probability sampling which involves the sample
being drawn from that part of the population which is close to
hand. That is, readily available and convenient.
• The researcher using such a sample cannot scientifically make
generalizations about the total population from this sample
because it would not be representative enough.
38. Convenience Sampling
• For example, if the interviewer were to conduct a survey at a
health facility, the clients that he/she could interview would be limited to those
present there at that given time.
• This type of sampling is most useful for pilot testing.
40. Judgmental Sampling or Purposive Sampling
• Also called expert sampling
• The researcher chooses the sample based on who they think
would be appropriate for the study.
• This is used primarily when there is a limited number of people
that have expertise in the area being researched.
• Usually done for Key Informant Interviews
• Example: an interview to understand decision makers' perceptions of
current health policies might purposively require senior officials
of MOHP.
41. Purposive sampling example
• If you want to know more about the opinions and experiences of
disabled adolescents in your community,
• You purposefully select a number of adolescents with different support
needs in order to gather a varied range of data on their disability
experiences.
43. Quota Sampling
• In quota sampling the selection of the sample is non-random.
• The population is first segmented into mutually exclusive sub-
groups, just as in stratified sampling.
• Then judgment is used to select participants or units from each
segment based on a specified proportion.
• It is this second step which makes the technique one of non-
probability sampling.
• The problem is that these samples may be biased because not
everyone gets a chance of selection.
44. Quota Sampling Example
• Population: 1200 male students (60%) and 800 female students (40%)
• 300 sample required, split in the same 60% / 40% proportions
• Quotas: 180 male students and 120 female students
• Selection within each quota by convenience/judgement
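The student example can be traced in a few lines of Python. The quota arithmetic uses the slide's figures (1200 male and 800 female students, 300 required); the `arrivals` list, standing in for the order in which students happen to be met, is hypothetical.

```python
def fill_quotas(arrivals, quotas):
    """Accept people in the order met until each group's quota is full."""
    counts = {group: 0 for group in quotas}
    chosen = []
    for person_id, group in arrivals:
        if group in quotas and counts[group] < quotas[group]:
            counts[group] += 1
            chosen.append(person_id)
    return chosen, counts

population = {"male": 1200, "female": 800}
n = 300
total = sum(population.values())
quotas = {g: n * c // total for g, c in population.items()}

# Hypothetical order of encounter (convenience, not random selection)
arrivals = [(i, "female" if i % 3 == 0 else "male") for i in range(1, 1001)]
chosen, counts = fill_quotas(arrivals, quotas)
print(quotas, len(chosen))  # {'male': 180, 'female': 120} 300
```

The quotas mirror the population proportions, but who fills them depends entirely on the order of encounter, which is what makes the method non-probability sampling.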
45. Snowball/ Chain Referral Sampling
• Chain-referral sampling
• In this technique, existing participants provide referrals to recruit
other participants required for a research study.
• It is used when
• potential participants have traits that are hard to find
• it is difficult to identify participants and assemble them as a sample for
research
• Useful in sensitive investigations/studies
46. Snowball/ Chain Referral Sampling
• Two key steps
• Identify potential participants in the population. Often, only one or two
participants can be found initially.
• Ask those participants to recruit other people (and then ask those
people to recruit others).
• Types
• Linear snowball sampling
• Exponential snowball sampling
• Non-discriminative: multiple referrals; each referred person is interviewed
• Discriminative: multiple referrals; only one among those referred is interviewed
47. Applications of Snowball Sampling
• Useful for investigating patients with rare disease
• Identifying drug abusers, criminals
49. Snowball Sampling
• Merits
• Needs little planning and a smaller workforce
• The chain referral process allows the researcher to reach populations
that are difficult to sample
• Demerits
• Researcher has little control over the sampling method
• Representativeness of the sample is not guaranteed. Researcher has
no idea of the true distribution of the sample
• Sometimes recruitment may be affected if the participants fail to
recruit/identify other participants
50. Voluntary Response Sampling
• Similar to a convenience sample, a voluntary response sample
is mainly based on ease of access.
• Instead of the researcher choosing participants and directly
contacting them, people volunteer themselves (e.g. by
responding to a public online survey).
• Voluntary response samples are always at least somewhat
biased, as some people will inherently be more likely to
volunteer than others.
52. Other sampling methods
Consecutive Sampling
• Total enumerative sampling where every participant meeting
the inclusion criteria is selected until the required sample size is
achieved.
• Typically better than convenience sampling in controlling
sampling bias.
• Care needs to be taken with consecutive sampling
53. Selection of Sampling Design
(Choosing the best sampling method)
54. Sampling frame availability
• We need to check for availability of a sampling frame.
• If sampling frame is available
• Use Simple random or a stratified random sampling.
• If sampling frame is not available, we could still use other
random sampling methods
• for instance, systematic or cluster sampling
• Snowball sampling (non-random) may also be used where
sampling frame is not present.
55. Population Distribution
• Check if our target population is widely varied in its baseline
characteristics.
• For example, a population with large ethnic subgroups could
best be studied using a stratified sampling method.
• A homogeneous population may be studied using the simple random
method.
• If the population is geographically dispersed, use cluster
sampling
56. Generalizability
• To increase generalizability: select random sampling methods
• In Systematic Random sampling, generalizability may decrease
if baseline characteristics repeat across every nth participant
• In cluster design, if clusters are not representative, results may
not be generalizable
57. Research Objectives
• A refined research question and goal would help us define our
population of interest.
• If our calculated sample size is small then it would be easier to
get a random sample.
• If, however, the sample size is large, then we should check if
our budget and resources can handle a random sampling
method.
59. For Cross-Sectional Surveys
• Cross-sectional studies or cross-sectional surveys are done to
• estimate a population parameter like prevalence of some disease in a
community or
• find the average value of some quantitative variable in a population.
• Sample size formulas for categorical and quantitative variables
are different.
60. For Cross-Sectional Surveys
For Proportion (Qualitative Variable)
• Suppose a researcher wants to know the proportion of children who are
stunted in a population; then this formula should be used, as
proportion is a qualitative variable.
$$n = \frac{Z_{1-\alpha/2}^{2} \, p\,(1 - p)}{d^{2}}$$
Where, $Z_{1-\alpha/2}$ is the standard normal variate (at 5% Type I error, it is 1.96)
p = expected proportion in population based on previous studies or
pilot studies
d = absolute error or precision (has to be decided by researcher)
61. If the population is finite
• If the population is finite, we use
$$n_{\text{finite}} = \frac{n}{1 + \dfrac{n - 1}{N}}$$
Where,
N= Finite population size
n= sample size calculated using infinite population size formula
62. Exercise on Sample Size Calculation
• Suppose you are planning to conduct a household survey to
estimate the prevalence of stunting among under-5 children in
Kageshwori Manohar Municipality. Previous study had shown
that the stunting prevalence in Bagmati province was 22.6%.
Calculate the desired sample size for your study:
i) If the number of U-5 children is unknown
ii) If the number of U-5 children is known (i.e. 9024)
iii) For two-stage cluster sampling.
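Parts (i) and (ii) can be worked through with the proportion formula and the finite population correction, as in the sketch below. The exercise does not state the precision, so d = 0.05 (5 percentage points) is an assumption, as is rounding up to the next whole child; part (iii) needs a design effect that the slides do not give and is left out.

```python
import math

def n_for_proportion(p, d, z=1.96):
    """Sample size for estimating a proportion (large or unknown population)."""
    return z**2 * p * (1 - p) / d**2

def finite_correction(n, N):
    """Adjust n for a finite population of size N."""
    return n / (1 + (n - 1) / N)

# p from the exercise (22.6%); d = 0.05 is an assumed precision
n0 = n_for_proportion(p=0.226, d=0.05)
n_unknown = math.ceil(n0)                            # (i) population unknown
n_known = math.ceil(finite_correction(n0, N=9024))   # (ii) N = 9024
print(n_unknown, n_known)  # 269 262
```

A tighter precision (smaller d) increases the required sample size quickly, since d enters the formula squared.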
63. Exercise on Sample Size Calculation
• Suppose you are planning to conduct a household survey to
estimate the prevalence of anemia among women of
reproductive age in Kathmandu district. In previous studies, the
anemia prevalence in WRA varied as 29.0%, 40.8% and 58%.
Calculate the appropriate sample size for your study.
64. For Cross-Sectional Surveys
For quantitative variable
• Suppose the same researcher is interested in knowing average systolic
blood pressure of children of the same city.
• The formula below should be used, as blood pressure is a
quantitative variable.
$$n = \frac{Z_{1-\alpha/2}^{2} \, SD^{2}}{d^{2}}$$
Where, $Z_{1-\alpha/2}$ is the standard normal variate as mentioned above
SD = Standard deviation of variable. Value of standard deviation can be
taken from previously done study or through pilot study.
d = absolute error or precision (has to be decided by researcher)
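A small sketch of the calculation for a mean is given below. The standard deviation of 12 mmHg and the precision of 2 mmHg are hypothetical values chosen only to show the arithmetic; in practice they would come from a previous study or pilot and from the researcher's judgement.

```python
import math

def n_for_mean(sd, d, z=1.96):
    """Sample size for estimating a mean to within +/- d."""
    return math.ceil(z**2 * sd**2 / d**2)

# Hypothetical inputs: SD = 12 mmHg, precision d = 2 mmHg
n = n_for_mean(sd=12, d=2)
print(n)  # 139
```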
65. For Case-Control Studies
Formula for sample size calculation for comparison between two groups
when endpoint is quantitative data
$$n = \frac{r + 1}{r} \times \frac{SD^{2}\,(Z_{\alpha/2} + Z_{\beta})^{2}}{d^{2}}$$
• Where,
• SD = Standard deviation of the variable (from a previous study or
a pilot study)
• $Z_{\alpha/2}$ is the standard normal variate for the level of significance (1.96 at 5% Type I error)
• $Z_{\beta}$ is the standard normal variate for the power of the study (0.842 for 80% power, 1.28 for 90% power)
• d is the effect size (difference between mean values)
• r is the ratio of controls to cases
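The calculation can be sketched as below, using the standard form in which the two Z values are added, (Zα/2 + Zβ)², as given in Charan & Biswas (2013). The inputs (SD = 10, d = 5) are hypothetical, and reading the result as the number of cases, with r times as many controls, is an assumption of this sketch.

```python
import math

def n_cases_for_means(sd, d, r=1, z_alpha=1.96, z_beta=0.842):
    """Number of cases needed to detect a mean difference d (controls = r * cases)."""
    return math.ceil((r + 1) / r * sd**2 * (z_alpha + z_beta)**2 / d**2)

# Hypothetical inputs: SD = 10, difference in means d = 5, 80% power
one_to_one = n_cases_for_means(sd=10, d=5, r=1)
two_to_one = n_cases_for_means(sd=10, d=5, r=2)
print(one_to_one, two_to_one)  # 63 48
```

Recruiting two controls per case lowers the number of cases needed, at the price of a larger total sample.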
66. For Case-Control Studies
Formula for sample size calculation for comparison between two groups
when endpoint is qualitative data
$$n = \frac{r + 1}{r} \times \frac{(Z_{\alpha/2} + Z_{\beta})^{2}\, p\,(1 - p)}{(p_1 - p_2)^{2}}$$
• Where,
• $p_1 - p_2$ = effect size, or the difference in proportion of events in the two
groups
• $p_1$ = proportion in cases
• $p_2$ = proportion in controls
• p = pooled prevalence
• $Z_{\beta}$ = standard normal variate for power
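A sketch of the same calculation for proportions follows, again with the Z values added, (Zα/2 + Zβ)². The proportions 0.40 and 0.20 are hypothetical, and taking the pooled prevalence as the simple average of p1 and p2 is an assumption of this sketch rather than something the slide specifies.

```python
import math

def n_cases_for_proportions(p1, p2, r=1, z_alpha=1.96, z_beta=0.842):
    """Number of cases needed to detect a difference p1 - p2 (controls = r * cases)."""
    p_bar = (p1 + p2) / 2  # pooled prevalence, assumed to be the simple average
    numerator = p_bar * (1 - p_bar) * (z_alpha + z_beta)**2
    return math.ceil((r + 1) / r * numerator / (p1 - p2)**2)

# Hypothetical inputs: exposure in 40% of cases and 20% of controls, 80% power
n_cases = n_cases_for_proportions(p1=0.40, p2=0.20, r=1)
print(n_cases)  # 83
```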
67. For Intervention Studies
Formula for sample size calculation for comparison between two
groups when endpoint is quantitative data
• When the variable is quantitative data like blood pressure, weight, height,
etc., then the following formula can be used for calculation of sample size
for comparison between two groups.
$$n = \frac{2\, SD^{2}\,(Z_{\alpha/2} + Z_{\beta})^{2}}{d^{2}}$$
Where,
• SD = Standard deviation of the variable (from a previous study or
a pilot study)
• $Z_{\alpha/2}$ is the standard normal variate for the level of significance
• $Z_{\beta}$ is the standard normal variate for the power of the study (0.842 for 80% power)
• d = effect size (difference between mean values)
68. For Intervention Studies
Formula for sample size calculation for comparison between two
groups when endpoint is qualitative data
• When the endpoint of a clinical intervention study is qualitative, then
the following formula can be used for sample size calculation for
comparison between two groups.
$$n = \frac{2\,(Z_{\alpha/2} + Z_{\beta})^{2}\, p\,(1 - p)}{(p_1 - p_2)^{2}}$$
Where,
• $p_1 - p_2$ is the difference in proportion of events in the two groups
• p = pooled prevalence
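As a consistency check, the two-group formulas for intervention studies appear to be the case-control formulas with one control per case (r = 1), since the multiplier (r + 1)/r then equals 2. This short derivation is not stated on the slides and uses the additive form of the Z terms, as in Charan & Biswas (2013):

```latex
\[
\left.\frac{r+1}{r}\right|_{r=1} = 2
\quad\Longrightarrow\quad
\frac{r+1}{r}\cdot\frac{(Z_{\alpha/2}+Z_{\beta})^{2}\,p\,(1-p)}{(p_1-p_2)^{2}}
\;\xrightarrow{\;r=1\;}\;
\frac{2\,(Z_{\alpha/2}+Z_{\beta})^{2}\,p\,(1-p)}{(p_1-p_2)^{2}}
\]
```

The same substitution turns the quantitative case-control formula into the quantitative intervention formula, with SD² in place of p(1 − p) and d² in place of (p1 − p2)².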
69. Practical tips
Use digital technology
• Epi Info StatCalc
• G*Power (www.gpower.hhu.de)
• n4Studies, for Android/iOS mobile
• OpenEpi (www.openepi.com)
70. References
• Banerjee, A., & Chaudhury, S. (2010). Statistics without tears:
Populations and samples. Industrial Psychiatry Journal, 19(1),
60–65. https://doi.org/10.4103/0972-6748.77642
• Charan, J., & Biswas, T. (2013). How to calculate sample size
for different study designs in medical research? Indian Journal
of Psychological Medicine, 35(2), 121–126. https://doi.org/10.4103/0253-7176.116232