3. Basic Concepts
How to choose an adequate subset of the
total population?
There are several concepts in the study and
practice of sampling design in research.
Population (universe)
◦ all units/elements relevant to the research
◦ includes all members of the group under study
or the set of entities under study and their
characteristics.
◦ inferences from the sample are made about the
population
5/11/2022 3
4. Concepts…
Census
◦ information on the whole population;
enumeration of the entire population
A sample is a part of the population that is selected to
reflect the population's characteristics.
A sample is “a smaller (but hopefully
representative) collection of units from a population
used to determine truths about that population”
(Field, 2005)
Sampling units/elements
◦ non-overlapping components of the population
◦ ‘units/elements’ is used because it is not only
people, households, organizations... that are
sampled
5. Concepts…
Sampling frame
◦ the list of all sampling units/elements in the
population
◦ the sample is selected from this list
Representative sample
◦ a sample that represents the population
accurately
For example, if the study is on ‘households
welfare’:
◦ Population = all households in the study area
◦ Sampling unit = a household
◦ Sampling frame = list of the households
6. The sampling process
Step 1: Define the population, sampling units, extent
and time.
Step 2: Get a research permit if this is required in the
place you work in.
Step 3: Construct the sampling frame.
Step 4: Determine the sample size.
Step 5: Select a sampling procedure.
Step 6: Select the sample.
7. Identifying and defining the relevant
population
◦ To whom do you want to generalize your results?
◦ The first thing the sample plan must include is a
definition of the population to be investigated.
◦ Defining the target population implies specifying
the subject of the study.
◦ Specification of a population involves identifying
which elements (items) are included, as well as
where and when.
◦ When one wants to undertake a sample survey, the
relevant population from which the sample is going
to be drawn needs to be identified.
Example: if the study concerns income, then the definition
of the population as individuals or households can make a
difference.
All university students
8. Population…
Census Vs. Sample
Once the population has been defined, the
researcher must decide whether the survey is to be
conducted among all members of the population or
only a subset of the population.
That is, a choice must be made between census
and sample
Census is used when the population is small and
definite.
Sample is preferable if the population is large.
9. Population…
The Need for Census
Universality/wide applicability: the data
collected have wide application, whereas sample
results are only applicable on average.
For the safety of the consumer: a census remains
the only option for some products which are
critical to the life of consumers.
Representativeness: to eliminate the possibility
that, by chance, a randomly or purposely selected
sample may not be representative of the population.
Less sampling error: the only possible errors
are those due to computation of the elements.
10. Population…
Limitation of census
Expensiveness: huge resource requirements (human,
financial, etc.)
Excessive time and energy: besides the cost
factor, it takes too long and consumes too
much energy.
Not appropriate for short-term studies.
11. Sampling
Sampling is the selection of elements (research
units) from a population.
Sampling is the process of systematically selecting
that which will be examined during the course of a
study.
A strong research design and analytical approach
require:
◦ Use of one or more of the sampling strategies
◦ An iterative sampling approach,
whereby the research team moves back and
forth (iterating) between sampling and analyzing
data such that preliminary analytical findings
shape subsequent sampling choices.
14. Reasons for Sampling
Reasons for sampling instead of taking a census:
Economy: finance, manpower, etc.
Speed: when the time between recognizing
the need for information and the availability of that
information is short, sampling helps not to miss the
information.
Indispensability: sampling remains the only choice
when a test involves the destruction of the item
under study.
Practicality: sampling remains the only way when
the population is infinitely large.
Homogeneity: if all units of the population are alike,
a sampling technique is easy to use.
Administrative convenience
15. Defining and securing a sampling
frame
Ideally, the sampling frame would list every research
unit in the target population separately and only once.
In practice, such lists can seldom be constructed, and
we end up with an incomplete list (the study
population)
A list of elements from which the sample is
actually drawn is important and necessary.
The sampling frame is the list from which the potential
respondents are drawn. Examples:
◦ Registrar’s office
◦ Class rosters
The sampling frame has the property that we can identify
every single element and include any in our sample.
The sampling frame must be representative of the
population.
16. Identifying parameters of
interest
What specific population characteristics (variables
and attributes) may be of interest.
Parameter:
◦ A measurable characteristic of a population.
◦ A descriptive measure of the population under study.
E.g. population mean ($\mu$), proportions
Statistic:
◦ A descriptive measure of a sample.
◦ For example, when we work out a measurement such as
the mean from a sample, it is called a statistic:
the sample mean ($\bar{x}$) is a statistic.
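The distinction between a parameter and a statistic can be illustrated with a minimal Python sketch. The income figures, the function name `parameter_and_statistic`, and the sample size are all hypothetical, chosen only for illustration.

```python
import random
import statistics


def parameter_and_statistic(population, sample_size, seed=0):
    """Return (population mean, sample mean) for one simple random sample."""
    rng = random.Random(seed)
    sample = rng.sample(population, sample_size)
    mu = statistics.mean(population)   # parameter: describes the whole population
    x_bar = statistics.mean(sample)    # statistic: describes only the sample
    return mu, x_bar


# Hypothetical population: monthly incomes of 1,000 households.
incomes = [1000 + 5 * i for i in range(1000)]
mu, x_bar = parameter_and_statistic(incomes, sample_size=50)
print(f"parameter (mu) = {mu}, statistic (x_bar) = {x_bar}")
```

The parameter is fixed for a given population, while the statistic changes from one sample to the next (try a different `seed`).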
17. Selecting types of sampling
techniques
Random (= probability) sampling
◦ All elements in the population have a nonzero and known
chance of being selected
◦ Purpose: to avoid bias; representativeness
Non-random (= deliberate or purposive) sampling
◦ Selection (partly) based on the judgement of the
interviewer or researcher
◦ Purpose: to obtain a workable sample when developing a
sampling frame is near impossible or too time consuming
◦ Representativeness..?
18. Types of Sampling Techniques…
Probability Sampling
◦ All elements of the target population have a known and
positive chance of being selected
◦ Examples: Simple Random Sampling, Systematic RS,
Cluster Sampling, Stratified Random Sampling
Non-Probability Sampling
◦ NOT all elements of the target population have a known and
positive chance of being selected
◦ Examples: Convenience sampling, Judgment/Expert
sampling, Quota sampling, Snowballing
19. Types of Sampling Techniques
Non-Probability Sampling Techniques: Convenience, Judgmental, Quota, Snowball
Probability Sampling Techniques: Simple Random, Systematic, Stratified, Cluster
20. Non-probability Sampling Techniques
Judgmental: we use different strategies to sample
◦ Typical cases
◦ Heterogeneity
◦ Extreme cases
◦ Confirming and non-confirming cases
Purposive
◦ Visiting places near roads, towns, etc.
◦ Interviewing people available during data collection
◦ Observing whichever areas key actors want to show
us
Advantage(s): easy, fast, and may help us collect data
that would not otherwise have been collected.
21. Non-probability Sampling Techniques
Quota: quotas are assigned to different strata
groups and interviewers are given quotas to be filled
from different strata.
◦ A researcher first identifies categories of people
(e.g., male, female) then decides how many to get
from each category.
Snowball (Network) Sampling – chain sampling
◦ This is a method for identifying and selecting the
cases in a network.
◦ It begins with one or a few people or cases and
uses them to establish contact with others.
You start with one or two information-rich key
informants and ask them if they know persons
who know a lot about your topic of interest.
22. Probability Sampling Techniques
1. Simple Random Sampling (SRS)
◦ The simplest and easiest method.
◦ each element of the population has an equal
chance of being selected into the sample.
◦ It assumes that an accurate sampling frame exists.
◦ Usually one of two methods is adopted to pick a
sample (lottery method or random number table).
E.g., simple random sampling for household surveys
1. Population = all households in the country
2. Sampling frame = the list of all households (20
million in Ethiopia?)
3. Sample size = say we have resources to cover
only 20,000 households
4. Sampling fraction 20,000/20,000,000 or 0.1%
5. Select randomly 20,000 households from the
long list of 20,000,000 households
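A scaled-down version of this household example can be sketched in Python. The frame below is a hypothetical list of 20,000 household IDs (not the 20 million of the example), sampled at the same 0.1% fraction; the function name `simple_random_sample` is an assumption made for illustration.

```python
import random


def simple_random_sample(frame, n, seed=None):
    """Draw n units without replacement; every unit has an equal chance."""
    rng = random.Random(seed)
    return rng.sample(frame, n)


# Hypothetical sampling frame: household IDs 1..20,000.
frame = list(range(1, 20_001))
sample = simple_random_sample(frame, 20, seed=42)

fraction = len(sample) / len(frame)
print(f"sampling fraction = {fraction:.1%}")
```

Sampling without replacement guarantees that no household is selected twice, which matches drawing from a list in which each unit appears only once.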
23. Probability Sampling Techniques
Systematic Sampling Technique
In SYSTEMATIC SAMPLING individuals are chosen
at regular intervals (for example every nth) from the
sampling frame.
◦ The major advantages of SS are its simplicity and
flexibility.
◦ instead of a list of random numbers, the
researcher calculates a sampling interval.
The sampling interval is the standard distance
between elements selected in the sample.
24. Probability Sampling Techniques
E.g., a systematic sample is to be selected from 1200
students of a school.
The sample size to be selected is 100.
The sampling fraction is: 100/1200= sample
size/study population = 1/12
The sampling interval is therefore 12.
The first student in the sample is chosen randomly,
for example by blindly picking one out of twelve
pieces of paper, numbered 1 to 12.
If number 6 is picked -every twelfth student will be
included –i.e. 6, 18, 30, 42, etc.
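The student example can be reproduced with a short Python sketch. The function name `systematic_sample` and its parameters are illustrative assumptions; the sketch also assumes the population size is an exact multiple of the sample size, as it is here (1200 / 100 = 12).

```python
import random


def systematic_sample(population_size, sample_size, start=None, seed=None):
    """Return the 1-based positions selected by systematic sampling."""
    interval = population_size // sample_size      # sampling interval
    if start is None:
        # Random start between 1 and the interval, like drawing a numbered slip.
        start = random.Random(seed).randint(1, interval)
    return [start + k * interval for k in range(sample_size)]


positions = systematic_sample(1200, 100, start=6)
print(positions[:4])   # [6, 18, 30, 42]
print(positions[-1])   # 1194
```

Only the starting point is random; once it is fixed, every other selection follows from the interval.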
25. Probability Sampling Techniques
Stratified Sampling
A population is subdivided into the appropriate
strata and a random sample is taken from each
stratum using either SRS or SS techniques.
Particularly useful when we have heterogeneous
populations.
E.g., low income, middle income, high income
areas
26. Probability Sampling Techniques
The reasons for stratifying
To increase a sample’s statistical efficiency
(smaller standard errors).
To provide adequate data for analyzing the various
subpopulations.
To enable different research methods and
procedures to be used in different strata.
Can be multiple stage stratified random sampling
E.g., in the household survey we may be interested
to have sufficient number of households from each
region of Ethiopia; stratify by region!
27. Probability Sampling Techniques
How to Stratify
◦ Three major decisions must be made in order to
stratify the given population into some mutually
exclusive groups.
(1) What stratification base to use: stratification
would be based on the principal variable under study
such as income, age, education, sex, location,
religion, etc.
28. Probability Sampling Techniques
(2) How many strata to use: there is no precise answer
as to how many strata to use.
◦ The more strata, the closer one comes
to maximizing inter-strata differences and
minimizing intra-strata variation.
(3) What strata sample size to draw: different
approaches could be used:
One could adopt a proportionate sampling
procedure.
Or use disproportionate sampling, which
allocates elements on the basis of some bias.
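Proportionate allocation can be sketched as follows. The stratum names and sizes are hypothetical, and `proportionate_allocation` is an illustrative helper, not a standard library function: each stratum receives a share of the total sample equal to its share of the population.

```python
def proportionate_allocation(stratum_sizes, total_sample):
    """Allocate the sample to strata in proportion to stratum size."""
    population = sum(stratum_sizes.values())
    # Rounded per-stratum sizes may not sum exactly to total_sample
    # when the shares are not whole numbers.
    return {
        name: round(total_sample * size / population)
        for name, size in stratum_sizes.items()
    }


# Hypothetical strata: households by income area.
strata = {"low income": 5000, "middle income": 3000, "high income": 2000}
allocation = proportionate_allocation(strata, total_sample=500)
print(allocation)
# {'low income': 250, 'middle income': 150, 'high income': 100}
```

A disproportionate design would replace the size-based share with some other allocation rule chosen by the researcher.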
29. Probability Sampling Techniques
Cluster Sampling:
1. It may be difficult or impossible to take a simple
random sample because a complete sampling frame
does not exist, or
2. Logistical difficulties may also discourage random
sampling techniques
(e.g., interviewing people who are scattered over
a large area may be too time-consuming).
The selection of groups of study units (clusters)
instead of the selection of study units individually is
called CLUSTER SAMPLING.
It is cost effective (High economic efficiency)
It involves sampling of groups
Clusters are often geographic units (e.g., districts,
villages) or organizational units (e.g., clinics, etc).
30. Probability Sampling Techniques
E.g., sampling for household survey in Mekelle
◦ Probably no complete sampling frame, and a simple
random sample would be costly to cover
◦ Randomly select sub-cities (clusters)
◦ Randomly select kebeles from the selected sub-cities
◦ Then randomly select households from the
selected kebeles
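The three selection stages can be sketched in Python. The nested frame below is entirely hypothetical (7 sub-cities, 4 kebeles each, 50 households per kebele), and the function name `multistage_sample` and the stage sizes are assumptions made for illustration.

```python
import random


def multistage_sample(frame, n_subcities, n_kebeles, n_households, seed=None):
    """frame maps sub-city -> kebele -> list of household IDs."""
    rng = random.Random(seed)
    selected = []
    # Stage 1: randomly select sub-cities (clusters).
    for sub_city in rng.sample(sorted(frame), n_subcities):
        kebeles = frame[sub_city]
        # Stage 2: randomly select kebeles within each selected sub-city.
        for kebele in rng.sample(sorted(kebeles), n_kebeles):
            # Stage 3: randomly select households within each selected kebele.
            for household in rng.sample(kebeles[kebele], n_households):
                selected.append((sub_city, kebele, household))
    return selected


# Hypothetical frame built only for this illustration.
frame = {
    f"sub-city {s}": {
        f"kebele {s}.{k}": [f"hh-{s}.{k}.{h}" for h in range(1, 51)]
        for k in range(1, 5)
    }
    for s in range(1, 8)
}
sample = multistage_sample(frame, n_subcities=3, n_kebeles=2, n_households=10, seed=1)
print(len(sample))  # 3 * 2 * 10 = 60
```

In practice, household lists are needed only for the kebeles actually selected, which is what makes the design cheaper than a simple random sample of the whole city.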
31. Determining the sample size
Research designs with too small sample size are
unethical
◦ because they waste resources as they can only
provide anecdotal evidence.
If the sample size is too small, the data will be
unusable.
Research studies that use too large samples i.e.,
larger than needed, also are unethical because:
they waste time and financial resources,
32. Determining the sample size
human subjects may also undergo unnecessary
experimental procedures that could be distressful and
painful.
Sample size determination hinges on:
i) Degree of homogeneity: The size of the population
variance is an important parameter.
The greater the dispersion in the population, the
larger the sample must be to provide a given
estimation precision.
33. Determining the sample size
ii) Degree of confidence required: Since a sample can
never reflect its population for certain, the researcher
must determine how much precision s/he needs.
Precision is measured in terms of
(i) An interval range (the margin of error).
(ii) The degree of confidence (how sure you are)
34. Determining the sample size
iii) Number of sub groups to be studied:
If the research is to make estimates on several
subgroups of the population then the sample must
be large enough for each of these subgroups to
meet the desired quality level.
iv) Cost: cost considerations have major implications.
All studies have some budgetary constraint and
hence cost dictates the size of the sample.
35. Determining the sample size
v) Prior information: If a similar previous study exists, we
can use that prior information to determine our
sample size, for example by:
◦ using prior mean and variance estimates, or
stratifying the population to reduce variation
within groups;
◦ using sample sizes that have met the requirements of the
statistical methods in past studies.
Researchers use prior information because they rarely have
information on the variance or standard errors.
36. Determining the sample size
vi) Practicality: Of course the sample size you select
must make sense.
We want to take enough observations to obtain
reasonably precise estimates of the parameters of
interest but we also want to do this within a
practical resource budget.
Therefore the sample size is usually a compromise
between what is DESIRABLE and what is FEASIBLE.
In general, the smaller the population, the bigger
the sampling ratio has to be for a reasonable
sample.
37. Determining the sample size
Hence as a ‘rule of thumb’:
For a population of 200 or less, use a census.
For small populations (under 1,000), a large sampling
ratio (about 30%) is needed. Hence, a sample size of about
300 is required.
For moderately large population (10,000), a smaller
sampling ratio (about 10%) is needed – a sample
size around 1,000.
To sample from very large population (over 10
million), one can achieve accuracy using tiny
sampling ratios (.025%) or samples of about 2,500.
38. Determining the sample size
STRATEGIES FOR DETERMINING SAMPLE SIZE
There are several approaches to
determining the sample size. These
include:
◦ using a census for small populations,
◦ imitating a sample size of similar studies,
◦ using published tables, and
◦ applying formulas to calculate a sample
size.
39. Determining the sample size
Using a Census for Small Populations
◦ to use the entire population as the sample.
◦ attractive for small populations (e.g., 200 or less).
Using a Sample Size of a Similar Study
◦ to use the same sample size as those of studies
similar to the one you plan.
◦ a review of the literature in your discipline can
provide guidance about "typical" sample sizes
which are used.
Using Published Tables
◦ is to rely on published tables which provide the
sample size for a given set of criteria.
41. How to Calculate Sample Size for Different Study
Designs
The fourth strategy is to calculate sample size
using different formula suitable to the research
design at hand.
In the recent era of evidence-based results,
statistics has come under increased scrutiny.
Evidence is only as good as the data the research
is based on, which in turn depends on the
statistical soundness of the claims it makes.
It is very important to understand that method
of sample size calculation is different for
different study designs and one blanket formula
for sample size calculation cannot be used for
all study designs.
42. Cochran’s (1977) sample size
determination formula
$n = \frac{Z^2 pq}{d^2}$
Where,
n is the desired sample size;
Z is the standard normal variable at the required
confidence level (e.g., Z = 1.96 at 95% confidence);
d is the desired level of precision, i.e., the
margin of error the
researcher accepts (e.g., 0.05);
p is the estimated proportion of the target population
with the characteristic of interest, as the
researcher assumes; and q is 1 - p.
43. Cochran’s (1977) Formula
For example: Let us assume that a researcher
wants to estimate the proportion of patients having
hypertension in the pediatric age group in a city.
According to previously published studies, the actual
proportion of hypertensive patients may not be more than 15
percent. The researcher wants to calculate the
sample size with a precision (margin of error) of
5 percent, using Z = 1.96. Using the
Cochran formula, the sample size is 196:
$$n = \frac{Z^2 pq}{d^2} = \frac{(1.96)^2 \times 0.15 \times 0.85}{(0.05)^2} \approx 196$$
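The arithmetic of this worked example can be checked with a short Python sketch. The function name `cochran_sample_size` and its parameter names are illustrative assumptions; rounding up to a whole respondent is also an assumed convention rather than something the source states.

```python
import math


def cochran_sample_size(p, d, z=1.96):
    """Cochran's formula n = z^2 * p * q / d^2, returned unrounded."""
    q = 1 - p
    return (z ** 2) * p * q / (d ** 2)


# Values from the hypertension example: p = 0.15, d = 0.05, Z = 1.96.
n = cochran_sample_size(p=0.15, d=0.05)
print(round(n, 2))    # 195.92
print(math.ceil(n))   # 196
```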
44. Cochran’s (1977) Formula
Survey studies usually have non-sampling errors
due mainly to defective definitions of important
variables, respondents' failure to respond or biased
responses, and coverage and compilation errors by the
researchers.
Researchers are, therefore, advised to oversample
by 10% to 20% of the computed sample size,
based on their anticipation of such
discrepancies (Naing et al., 2006).
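As a rough illustration of the 10% to 20% adjustment, the sketch below inflates a computed sample size of 196 (the figure from the hypertension example). The helper name `oversample` is hypothetical, and rounding up is an assumed convention.

```python
import math


def oversample(n, rate):
    """Inflate a computed sample size n by the given rate (e.g. 0.10 for 10%)."""
    return math.ceil(n * (1 + rate))


print(oversample(196, 0.10))  # 216
print(oversample(196, 0.20))  # 236
```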
45. Kothari’s (2004) Formula
According to Kothari (2004), systematic random
sampling can be considered an improvement over
simple random sampling, as a systematic sample is
spread more evenly over the entire population.
The statistical sample size formula for a
population size (N) greater than or equal
to 10,000, according to Kothari, is:
$n = \frac{Z^2 pq}{d^2}$
$fn = \frac{n}{1 + n/N}$ if N is less than 10,000
where n = the desired sample size; Z = the standard
normal variable at the required level of confidence;
p = the proportion in the target population estimated
to have the characteristic being measured; q = 1 - p; and
d = the level of statistical significance (margin of error) set.
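The two formulas on this slide can be combined in a small Python sketch. The function name `kothari_sample_size`, the default Z = 1.96, and the example values (p = 0.5, d = 0.05, N = 2,000 and N = 50,000) are assumptions chosen for illustration; the 10,000 threshold follows the slide.

```python
def kothari_sample_size(p, d, N, z=1.96, threshold=10_000):
    """n = z^2*p*q/d^2, adjusted to fn = n/(1 + n/N) when N is below the threshold."""
    n = (z ** 2) * p * (1 - p) / (d ** 2)
    if N < threshold:
        return n / (1 + n / N)   # adjusted size for a small population
    return n


small = kothari_sample_size(p=0.5, d=0.05, N=2_000)
large = kothari_sample_size(p=0.5, d=0.05, N=50_000)
print(f"N = 2,000:  {small:.2f}")
print(f"N = 50,000: {large:.2f}")
```

The adjustment always reduces the required sample size, and the reduction grows as N gets smaller relative to n.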
46. Kothari’s (2004) Formula
Finite Population Correction
The above sample size formula is valid if the
calculated sample size is smaller than or equal to
5% of the population size (n/N ≤0.05) (Daniel,
1999).
If this proportion is larger than 5% (n/N>0.05), we
need to use the formula with finite population
correction (Ibid) as follows.
This sample size formula is valid
only if we apply simple random or
systematic random sampling methods.
47. Yamane’s (1967) Formula
A full survey of the population for a given task increases
the representativeness of the data collected.
However, such surveys are inefficient both in
terms of cost and time (Ross et al, 2002).
Consequently, taking a representative sample from the
given population is necessary to accomplish the
research study within its time frame and budget.
Yamane’s sample size determination formula (Yamane,
1967):
$n = \frac{N}{1 + N e^2}$
where n = the desired sample size; N = the total
population size; and e = the level of
precision (margin of error), which is set to 0.05.
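A minimal Python sketch of Yamane's calculation follows, assuming the commonly cited form of the formula, n = N / (1 + N e^2). The function name `yamane_sample_size` and the example population of 1,200 are illustrative assumptions.

```python
def yamane_sample_size(N, e=0.05):
    """Yamane's formula in its commonly cited form: n = N / (1 + N * e^2)."""
    return N / (1 + N * e ** 2)


# Hypothetical population of 1,200 with e = 0.05:
# n = 1200 / (1 + 1200 * 0.0025) = 1200 / 4 = 300
n = yamane_sample_size(1200)
print(round(n))  # 300
```

Unlike the proportion-based formulas, this form needs only the population size and the chosen level of precision.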
48. Exercise
It was desired to estimate proportion
of diabetic children in a certain school.
In a similar study at another school a
proportion of 30 % was detected.
◦ Compute the minimal sample size
required at a confidence level of 95%,
accepting a difference of up to 4% from the
true population proportion.
49. Exercise
Suppose we wish to evaluate a state-wide
extension program in which farmers were
encouraged to adopt a new practice. Assume
there is a large population but that we do not
know the variability in the proportion that will
adopt the practice; therefore, assume p=.5
(maximum variability). Furthermore, suppose
we desire a 95% confidence level and ±5%
precision
Calculate the sample size
50. Other considerations in determining sample size
First, the above approaches to determining
sample size have assumed that a simple
random sample is the sampling method.
Another consideration with sample size is the
number needed for the data analysis.
◦ If descriptive statistics are to be used, e.g.,
mean, frequencies, then nearly any sample
size will suffice.
◦ A good-sized sample, e.g., 200-500, is needed
for more rigorous analyses (multiple
regression, analysis of covariance,
log-linear analysis, etc.).
51. Other considerations in determining sample size
Finally, the sample size formulas provide
the number of responses that need to be
obtained.
◦ Many researchers commonly add 10%
to the sample size to compensate for
persons that the researcher is unable to
contact.
◦ The sample size also is often increased
by 30% to compensate for non-
response.
52. Acceptable Response Rate
Mangione (1995: 60–61) has provided the
following classification of bands of
response rate to postal questionnaires
(cited in Bryman, 2012:235):
◦ Over 85% is excellent;
◦ 70–85% is very good;
◦ 60–69% is acceptable;
◦ 50–59% is barely acceptable;
◦ below 50% is not acceptable.
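These bands can be expressed as a simple lookup, sketched below in Python. The function name `classify_response_rate` is hypothetical, and treating the bands as continuous thresholds (so that, for example, 69.5% counts as acceptable) is an assumption, since the classification lists whole-number ranges only.

```python
def classify_response_rate(rate_percent):
    """Map a postal-questionnaire response rate (in %) to Mangione's bands."""
    if rate_percent > 85:
        return "excellent"
    if rate_percent >= 70:
        return "very good"
    if rate_percent >= 60:
        return "acceptable"
    if rate_percent >= 50:
        return "barely acceptable"
    return "not acceptable"


for rate in (90, 72, 65, 55, 40):
    print(rate, classify_response_rate(rate))
```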
53. Problems in Sampling
Two types of errors:
Non sampling errors
Sampling errors
1. Non Sampling errors: are biases or errors due to
fieldwork problems, interviewer induced bias, clerical
problems in managing data, etc.
◦ These would contribute to error in a survey,
irrespective of whether a sample is drawn or a
census is taken.
2. Sampling errors: errors attributable to
sampling, and which are therefore not present in
information gathered in a census.
54. Problems in Sampling
1. Non-Sampling Error: refers to
◦ Non-coverage error
◦ Wrong population is being sampled
◦ Non response error
◦ Instrument error
◦ Interviewer’s error