1. Topic 8
Sampling Techniques – The nature of sampling,
Probability sampling design, Non-probability
sampling design, Determination of sample size
BHARATIYA ENGINEERING SCIENCE & TECHNOLOGY
INNOVATION UNIVERSITY (BESTIU)
RESEARCH METHODOLOGY FOR Ph.D 2023 SUMMER BATCH
DR V KRISHNANAIK
PROFESSOR
BRIG, Hyderabad
2. Objectives
o Learn the reasons for sampling
o Develop an understanding of different sampling
methods
o Distinguish between probability and non-probability
sampling
o Discuss the relative advantages and disadvantages of each
sampling method
o List the factors influencing the sample size
o Calculate the sample size using appropriate formulae
3. SAMPLING
A sample is “a smaller collection of units from a
population used to determine truths about that
population”.
A sample is a subset of the population and is used to
describe that population.
The totality of units under study is called the population.
Collecting data from each and every unit of the population
is known as a census.
4. Why sample?
Cost in terms of money, time and manpower
Accessibility
Utility, e.g. for a diagnostic laboratory test you
do not draw all of the patient’s blood.
A census is a sample consisting of the entire population.
Even though a census is not foolproof, it gives detailed
information about every small area of the population.
It has the following disadvantages:
Expensive
Takes a long time
Cumbersome and therefore often inaccurately done (a careful sample
can produce more accurate data than a census)
5. Sampling…..
Sampling is the process of selecting a representative sample
from a population.
It involves selecting cases (elements), or locating people (or other units of
analysis), from a target population in order to study the population.
[Figure: a sample drawn from the population by sampling]
6. Cont’d
The process of obtaining information from a subset (sample) of a larger
group (population)
The results for the sample are then used to make estimates of the larger
group
Faster and cheaper than asking the entire population
Two keys
1. Selecting the right people
They have to be selected scientifically so that they are representative of the
population
2. Selecting the right number of the right people
To minimize sampling error, i.e. choosing the wrong people by chance
7. Population Vs. Sample
[Figure: a sample drawn from the population of interest]
Population: described by parameters
Sample: described by statistics
We measure the sample using statistics in order to draw
inferences about the population and its parameters.
8. Characteristics of Good Samples
o Representation
Sample surveys are almost never conducted for the
purposes of describing the particular sample under
study. Rather they are conducted for purposes of
understanding the larger population from which the
sample was initially selected
A great deal of work has been done over the years in
developing sampling methods that provide
representative samples for the general population.
E.g. international survey programs such as the DHS series and
EPI coverage surveys have refined the practice of household
sampling.
9. Characteristics of Good Samples cont’d….
3 factors that influence sample representativeness
Sampling procedure
Sample size
Participation (response)
When might you sample the entire population?
When your population is very small
When you have extensive resources
When you don’t expect a very high response
o Accessible
o Low cost
10. Basic Terms
Population (also called source population or target population)
Census
Sample survey
Sampling Frame
Probability samples
Non-probability samples
Sampling unit
Study unit (study subjects)
Sampling fraction (Sampling interval)
12. Hierarchy of sampling
Study subjects: the actual participants in the study
Sample: the subjects who are selected
Sampling frame: the list of potential subjects from which the sample is drawn
Source population: the population from which the study subjects would be obtained
Target population: the population to which the results would be applied
13. Errors in statistical Study
A sample is expected to mirror the population from which it
comes; however, there is no guarantee that any sample will be
precisely representative of the population.
No sample is an exact mirror image of the population.
Errors in a statistical study are of two types:
Sampling (random) errors
Non-sampling (systematic) errors
14. Advantage of sampling
We obtain a sample rather than a complete enumeration (a
census) of the population for many reasons.
Feasibility: it may be the only feasible method of
collecting data
Reduced cost: sampling reduces demands on resources
such as finance, personnel and materials
Greater accuracy: sampling may lead to more accurate
data collection
Greater speed: data can be collected and summarized
more quickly
15. Disadvantage of Sampling
If the sample is biased, not representative or too small, the
conclusions may not be valid and reliable
If the population is very large and has many sections and
subsections, the sampling procedure becomes very complicated
If the researcher does not possess the necessary skill and
technical knowledge of sampling procedures, the results
may be seriously flawed.
16. Characteristics Of A Good Sample Design
From what has been stated above, we can list the
characteristics of a good sample design as follows:
The sample design must result in a truly representative sample.
The sample design must result in a small sampling
error.
The sample design must be viable (workable) in the context of
the funds available for the research study.
The sample design must allow systematic bias to be
controlled.
The sample should be such that the results of the sample study
can be applied, in general, to the universe with a reasonable
level of confidence.
17. Types of Sampling
How do we select the right subjects?
o The sample that we draw for our study
determines the generalizability of our findings.
o The sample should be a good representation of
the population.
18. Types of Sampling Methods
Sampling methods
Non-probability samples: convenience, quota, judgemental
Probability samples: simple random, systematic, stratified, cluster, multistage random sampling
19. Probability Sampling Method …
The random ("equal chance") and "independent"
components of random sampling are what make us
confident that the sample has a reasonable chance of
representing the population.
What does it mean to be independent? The researchers
select each person for the study separately.
Equal chance: every unit has the same probability of being selected, with no deliberate choice by the researcher.
Selecting a person because a friend or relative has already been selected would be an example of non-independent sampling.
20. Probability Sampling Method cont’d …
In probability sampling
A sampling frame exists or can be compiled.
Every unit should have an equal, or at least a known and nonzero, chance
of being included in the sample.
Generalization is possible (from sample to population)
The main probability sampling methods are:
Simple Random Sampling
Systematic Sampling
Stratified Random Sampling
Cluster Sampling
Multistage Sampling
21. 1. Simple Random Sampling(SRS)
Simple random sampling is the most straightforward of the
random sampling strategies. Every unit in the population has an
equal chance of being selected.
To use SRS:
o there should be a sampling frame for the population
o all possible samples of n subjects should be equally likely to occur, each with probability $1/\binom{N}{n}$, where N is the population size
o the population should be small, relatively homogeneous and readily available
22. Simple Random Sampling cont’d …
Procedures to select the sample
The specific procedures that you follow may vary depending
on your resources, but all involve some type of random
process. Depending on the complexity of the population, we
can use different tools to select a sample of n units from the given
sampling frame.
These are:
the lottery method,
a table of random numbers (available in the appendix
of many research methods and statistics textbooks), or
computer-generated random numbers.
23. Simple Random Sampling cont’d …
The lottery method is appropriate if the total population is not too
large; for a large population it becomes very
difficult to use.
In that case, a table of random numbers or computer-generated random
numbers is the feasible method.
Sampling schemes may be
o without replacement: no element can be selected more than once in the
same sample, giving $\binom{N}{n}$ possible samples.
o with replacement: an element may appear multiple times in one sample,
giving $N^n$ possible (ordered) samples.
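As a small illustration of the two sampling schemes, the number of possible samples can be counted directly. This is only a sketch: the function name `count_possible_samples` and the values N = 5, n = 2 are assumptions chosen for illustration, and the with-replacement count treats the order of the draws as distinct.

```python
from math import comb

def count_possible_samples(N, n):
    """Return (without_replacement, with_replacement) counts of possible samples.

    N: population size, n: sample size (illustrative helper, not from the slides).
    """
    without_replacement = comb(N, n)  # unordered samples, no repeats: C(N, n)
    with_replacement = N ** n         # ordered draws, repeats allowed: N^n
    return without_replacement, with_replacement

# Hypothetical population of 5 units, samples of size 2
print(count_possible_samples(5, 2))  # (10, 25)
```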
24. Example
Assume that the total number of patients who visited MGM
Hospital in the last six months is N. We want to estimate the
prevalence of TB among the patients who visited the hospital.
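A simple random sample for a setting like this could be drawn with computer-generated random numbers, roughly as sketched below. The frame of 1,200 patient IDs, the sample size of 100 and the helper name `simple_random_sample` are assumptions made for illustration; the slides do not give actual values.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n units from the sampling frame without replacement."""
    rng = random.Random(seed)    # a seed makes the draw reproducible
    return rng.sample(frame, n)  # every subset of size n is equally likely

# Hypothetical sampling frame: patient IDs 1..N with N = 1200
patient_ids = list(range(1, 1201))
sample = simple_random_sample(patient_ids, 100, seed=42)
print(len(sample), len(set(sample)))  # 100 100 (no patient selected twice)
```

The selected patients would then be screened, and the sample prevalence used as the estimate for all N patients.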
25. 2. Systematic Random Sampling
Systematic sampling is regarded as random as long as the
periodic interval is determined beforehand and the starting point
is random.
It is a method of selecting sample members from a larger population
according to a random starting point and a fixed, periodic
interval.
Typically, every kth member is selected from the total population
for inclusion in the sample.
It is frequently chosen by researchers for its simplicity and its
periodic quality.
It works best when the population is homogeneous; it does not
necessarily require a sampling frame.
26. Steps in systematic sampling
1. Define the population
2. Determine the desired sample size (n)
3. List the population from 1 to N
4. Determine the sampling interval k, where k = N/n
5. Select a random number between 1 and k; denote this number by a
6. Starting at a, take every kth unit on the list until the desired sample size is
obtained.
The selected units will then be
a, a+k, a+2k, a+3k, …, a+(n-1)k
Note: Systematic sampling should not be used when a cyclic repetition is
inherent in the sampling frame
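The steps of systematic sampling can be sketched in a few lines. This is a minimal illustration, assuming the frame is an ordered list and that k is rounded down when N/n is not a whole number; the function name `systematic_sample` is hypothetical.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Select n units at positions a, a+k, a+2k, ..., a+(n-1)k of the frame."""
    N = len(frame)
    k = N // n  # sampling interval (assumption: rounded down if N/n is not whole)
    if k < 1:
        raise ValueError("sample size n cannot exceed the population size N")
    rng = random.Random(seed)
    a = rng.randint(1, k)  # random start between 1 and k, inclusive
    positions = [a + i * k for i in range(n)]
    return [frame[p - 1] for p in positions]  # positions are 1-based

# Hypothetical frame of 100 numbered units, sample of 10, so k = 10
units = list(range(1, 101))
selected = systematic_sample(units, 10, seed=7)
print(selected)
```

With N = 100 and n = 10, the output is ten units spaced exactly 10 apart, starting from a random unit between 1 and 10.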
27. 3. Stratified Random Sampling
Stratified random sampling is used when we have subgroups in
our population that are likely to differ substantially in their
responses or behavior (i.e. if the population is heterogeneous).
In stratified random sampling, the population is first divided into
a number of parts or 'strata' according to some characteristic,
chosen to be related to the major variables being studied.
For example, you might divide the population into male and female members and
randomly select the required sample size within each subgroup.
After stratification, simple random sampling is used to select a sample from each
stratum.
28. Steps in stratified sampling method
Define the population
Determine the desired sample size
Identify the variable and subgroups (strata) for which you want to
guarantee appropriate representation (either proportional or equal)
Classify all members of the population as a member of one of the
identified subgroups
Randomly select (using simple random sampling or others) an
appropriate number of individuals from each subgroup.
Then the total sample size will be the sum of all samples from each
subgroup.
29. There are two methods to get the study subjects from each subgroup:
proportional allocation or
equal allocation.
We use the proportional allocation technique when our subgroups vary dramatically in size
in our population.
Let N be the total population and N1, N2, …, Nk be the population sizes of strata 1, 2,
…, k respectively. Moreover, let n be the total sample size and n1, n2, …, nk be the
subsamples for strata 1, 2, …, k respectively, so that N = N1 + N2 + … + Nk
and n = n1 + n2 + … + nk.
Then the subsample ni to be selected from stratum i can be computed by
$$n_i = \frac{n \times N_i}{N}, \quad i = 1, 2, 3, \ldots, k$$
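Proportional allocation, where each stratum receives n × Ni / N of the total sample, can be sketched as follows. The stratum sizes are hypothetical, and the largest-remainder rounding used to make the subsamples add up to exactly n is an assumption of this sketch rather than something stated in the slides.

```python
def proportional_allocation(stratum_sizes, n):
    """Allocate a total sample of n across strata in proportion to their sizes.

    stratum_sizes: dict mapping stratum name -> N_i (population in that stratum).
    """
    N = sum(stratum_sizes.values())
    exact = {s: n * N_i / N for s, N_i in stratum_sizes.items()}  # n_i = n * N_i / N
    alloc = {s: int(x) for s, x in exact.items()}                 # round down first
    leftover = n - sum(alloc.values())
    # Assumption: hand leftover units to the strata with the largest remainders
    by_remainder = sorted(exact, key=lambda s: exact[s] - alloc[s], reverse=True)
    for s in by_remainder[:leftover]:
        alloc[s] += 1
    return alloc

# Hypothetical strata: 600 males and 400 females, total sample of 100
print(proportional_allocation({"male": 600, "female": 400}, 100))
# {'male': 60, 'female': 40}
```

Within each stratum, the allocated number of subjects would then be drawn by simple random sampling.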
30. Advantages and limitations of stratified sampling
Merits:
1. It is more representative.
2. It ensures greater accuracy.
3. It is easy to administer as the universe is sub-divided.
4. For a non-homogeneous population, it may yield good results.
Limitations:
1. Dividing the population into homogeneous strata requires more
money, time and statistical expertise.
2. Improper stratification leads to bias; if the different strata overlap, the
sample will not be representative.
3. A sampling frame has to be prepared
separately for each stratum.
31. 4. Cluster Random Sampling
In this sampling scheme, selection of the required sample is
done on groups of study units (clusters) and not on each study unit
individually.
The sampling unit is a cluster, and the sampling frame is a list of
these clusters.
If the study covers a wide geographical area, using the other
methods will be too costly.
The idea is to divide the total population into different clusters,
so that the unit of selection is the cluster.
The total population in the selected clusters is then taken
as the sample.
32. Steps in cluster sampling
1. Define the population
2. Determine the desired sample size
3. Identify and define a logical cluster (can be Hyderabad, Vijayawada,
Mumbai, Chennai, Delhi, and so on)
4. Make a list of all clusters in the population
5. Estimate the average number of population members per cluster
6. Determine the number of clusters needed by dividing the sample size
by the estimated size of a cluster
7. Randomly select the required number of clusters (a table of
random numbers can be used, as the total number of clusters is manageable)
8. Include in the sample all population members in the selected clusters.
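The cluster sampling steps can be illustrated with a short sketch. The household lists per city are invented for illustration, the helper name `cluster_sample` is hypothetical, and rounding the number of clusters up is an assumption; the slides only say to divide the sample size by the estimated cluster size.

```python
import math
import random

def cluster_sample(clusters, desired_n, seed=None):
    """Randomly select whole clusters so that about desired_n units are covered.

    clusters: dict mapping cluster name -> list of units in that cluster.
    """
    # Estimate the average number of units per cluster
    avg_size = sum(len(units) for units in clusters.values()) / len(clusters)
    # Clusters needed = desired sample size / average cluster size (rounded up)
    n_clusters = min(len(clusters), math.ceil(desired_n / avg_size))
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    # Include every unit of each selected cluster in the sample
    sample = [unit for name in chosen for unit in clusters[name]]
    return chosen, sample

# Hypothetical data: 5 city clusters with 20 households each
cities = ["Hyderabad", "Vijayawada", "Mumbai", "Chennai", "Delhi"]
clusters = {city: [f"{city}-{i}" for i in range(1, 21)] for city in cities}
chosen, sample = cluster_sample(clusters, desired_n=40, seed=1)
print(chosen, len(sample))
```

Here a desired sample of 40 with about 20 units per cluster leads to 2 randomly chosen cities, and all 40 of their households enter the sample.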
34. 5. Multistage Random Sampling
This is the most complex sampling strategy.
The researcher combines simpler sampling methods to address sampling needs
in the most effective way possible.
Example 1:
The administrator might begin with a cluster sample of all schools in the
district.
Then he might set up a stratified sampling process within clusters.
Within schools, the administrator could conduct a simple random sample
of classes or grades.
By combining various methods, researchers achieve a rich variety of
results useful in different contexts.
36. Non-Probability Sampling Method
Non-probability sampling strategies are used when it is practically
impossible to use probability sampling strategies.
Non-probability sampling is a sampling procedure which does not
afford any basis for estimating the probability that each item in
the population has of being included in the sample.
Units are selected subjectively, so some units of the population have a zero or unknown probability of selection
before the sample is drawn. The resulting sample may therefore be non-representative.
Sampling error cannot be computed
Survey results cannot be projected to the population
Advantages
Cheaper and faster than probability sampling
Reasonably representative if collected in a thorough manner
37. 1. Judgment Sampling/ Purposive sampling
Also called judgment, purposive or deliberate sampling.
Depends exclusively on the judgment of the investigator.
The sample consists of units that the investigator thinks are most typical of the
universe.
Merits
Small number of sampling units
Suitable for studying unknown traits and for case sampling
Useful for urgent public policy and business decisions
Demerits
Personal prejudice and bias
No objective way of evaluating the reliability of results
38. 2. Convenience Sampling
Convenience sampling selects a particular group of people, but
it does not come close to sampling all of a population.
The most convenient sample units are selected.
They are selected neither by probability nor by judgment, but for their easy availability.
The results would generalize only to similar programs in
similar cities.
Merits: useful in pilot studies.
Demerits: results are usually biased and unsatisfactory.
39. 3. Quota sampling
The most commonly used non-probability sampling method.
Quotas are set up according to some specified characteristic.
Within the quota, selection depends on personal judgment.
Merit: used in public opinion studies
Demerit: personal prejudice and bias
40. 4. Snowball sampling
It is a special non-probability method used
when the desired sample characteristic is rare.
Snowball sampling relies on referrals from initial subjects to
generate additional subjects.
In snowball sampling, we first
identify someone who meets the criteria and then ask him/her
to refer others he/she knows who also meet them.
Merit: access to difficult-to-reach populations
Demerit: not representative of the population and will result in
a biased sample, as it is self-selecting.
42. Sample Size Determination
Determining the sample size is a crucial component
of study design: enough subjects must be included so that
statistically significant results can be detected.
"How large a sample do I need?"
The answer will depend on the aims, nature and scope of the
study and on the expected result, all of which should be
carefully considered at the planning stage.
43. Sample……
o If the sample size (n) is
Large: increased accuracy, but costly and complex
Small: decreased accuracy, but less costly
o Take an optimum sample. How?
44. Factors to determine sample size
Size of population
Resources – subjects, financial, manpower
Method of Sampling- random, stratified
Degree of difference to be detected
Variability (S.D.) – pilot study, historical
Degree of Accuracy (or errors)
- Type I error (alpha) p<0.05
- Type II error (beta) less than 0.2 (20%)
- Power of the test : more than 0.8 (80%)
Statistical Formulae
Dropout rate, non-compliance with treatment (Rx)
45. Sample for Single population
To estimate the sample size for a single survey using simple
or systematic random sampling, we need to know:
o Estimate of the prevalence of the outcome
o Precision desired
o Design effect
o Size of total population
o Level of confidence (commonly 95%)
46. Sample size for single population mean
This is the condition in which the research question is about a
mean.
Standard deviation ($\sigma$) of the population: It is rare that a
researcher knows the exact standard deviation of the population.
Typically, the standard deviation of the population is estimated:
from the results of a previous survey,
from a pilot study,
from secondary data,
from judgment of the researcher.
47. Maximum acceptable difference (w): This is the maximum
amount of error that you are willing to accept.
Desired confidence level ($Z_{\alpha/2}$): This is your level of certainty that
the sample mean does not differ from the true population mean
by more than the maximum acceptable difference. Commonly
we use a 95% confidence level.
Then the sample size determination formula for a single
population mean is:
$$n = \frac{Z_{\alpha/2}^2 \, \sigma^2}{w^2}$$
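As a worked illustration of the formula for a single population mean, n = Z² σ² / w² with Z the standard normal value for the chosen confidence level, consider the sketch below. The inputs (σ = 10, w = 2) are hypothetical, and rounding up to the next whole subject is an assumption of this sketch.

```python
import math

def sample_size_for_mean(sigma, w, z=1.96):
    """Sample size for estimating a single population mean.

    sigma: estimated population standard deviation
    w:     maximum acceptable difference (same units as sigma)
    z:     standard normal value for the confidence level (1.96 for 95%)
    """
    n = (z ** 2) * (sigma ** 2) / (w ** 2)
    return math.ceil(n)  # assumption: round up to a whole number of subjects

# Hypothetical inputs: sigma = 10 (e.g. from a pilot study), w = 2, 95% confidence
print(sample_size_for_mean(sigma=10, w=2))  # 97
```

Halving the acceptable difference w roughly quadruples the required sample, because w enters the formula squared.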
48. The formula for the sample size for a single population proportion is defined
as:
$$n = \frac{Z_{\alpha/2}^2 \, p(1-p)}{w^2}$$
Where α = the level of significance, which can be obtained as 1 - confidence level
p = best estimate of the population proportion
w = maximum acceptable difference
$Z_{\alpha/2}$ = the value from the standard normal table for the given confidence
level
49. Example 1
An MPH student wants to conduct research on the prevalence of ANC utilization
among mothers in WARANGAL district. Given that the prevalence in a previous study
was found to be 45.7%, what sample size should he take to address his
objective?
Solution:
Margin of error w = 5%
A confidence level of 95% gives $Z_{\alpha/2} = 1.96$.
Then using the formula:
$$n = \frac{Z_{\alpha/2}^2 \, p(1-p)}{w^2} = \frac{(1.96)^2 \times 0.457 \times (1 - 0.457)}{(0.05)^2} = \frac{(1.96)^2 \times 0.457 \times 0.543}{(0.05)^2} \approx 382$$
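The calculation in Example 1 can be checked with a short sketch. The function name `sample_size_for_proportion` is hypothetical, and rounding up to the next whole subject is an assumption that matches the result of 382 given in the example.

```python
import math

def sample_size_for_proportion(p, w, z=1.96):
    """Sample size for estimating a single population proportion.

    p: best estimate of the population proportion
    w: maximum acceptable difference (margin of error)
    z: standard normal value for the confidence level (1.96 for 95%)
    """
    n = (z ** 2) * p * (1 - p) / (w ** 2)
    return math.ceil(n)  # round up to a whole number of subjects

# Example 1: prevalence 45.7%, margin of error 5%, 95% confidence
print(sample_size_for_proportion(p=0.457, w=0.05))  # 382
```

The unrounded value is about 381.3, which is rounded up to 382 mothers.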
50. Incorrect sample size will lead to
o Wrong conclusions
o Poor quality research (Errors)
o Type II error can be minimized by increasing the
sample size
o Waste of resources
o Loss of money
o Ethical problems
o Delay in completion
51. Example 2 HW
A midwifery graduate student wants to do her thesis work
on the title “Assessment of the outcome of pregnancy
among women who visited the Osmania university hospital
gynecology and obstetrics ward in the year 2020”.
What sample size should she take for this
study?