SAMPLING TECHNIQUES
Jagdish D. Powar
Statistician cum Tutor
Community Medicine
SMBT, IMSRC, Nashik
COMPETENCY & LEARNING OBJECTIVES
Competency: CM6.4. Enumerate, discuss and demonstrate common sampling techniques, simple statistical methods, frequency distribution, measures of central tendency and dispersion.
SLOs (Core): The student should be able to
• Define and describe different sampling techniques: simple random sampling, systematic sampling, stratified random sampling
• Explain non-probability sampling
• Suggest an appropriate sampling method
IMPORTANT STATISTICAL TERMS
Population: the set of all measurements of interest to the researcher (the collection of all responses, measurements, or counts that are of interest).
Sample: a subset of the population.
INTRODUCTION…
• Sampling is the process of selecting representative units from the entire population of a study.
• It is not always possible to study an entire population; therefore, the researcher draws a representative part of the population through a sampling process.
• In other words, sampling is the selection of some part of an aggregate or whole, on the basis of which judgments or inferences about the aggregate are made.
• It is a process of obtaining information about a phenomenon in an entire population by examining only a part of it.
WHY SAMPLING?
• Get information about large populations
• Lower cost
• Less field time
• More accuracy, i.e. the researcher can do a better job of data collection
• Needed when it is impossible to study the whole population
BASIC TERMINOLOGY
• Sampling frame: a list of all the elements or subjects in the population from which the sample is drawn. The sampling frame may be prepared by the researcher, or an existing frame may be used. For example, a researcher may prepare a list of all the households in a locality that have pregnant women, or may use the register of pregnant women for antenatal care kept by the local anganwadi worker.
• Sampling error: the fluctuation in the value of a statistic from one sample to another, even when the samples are drawn from the same population.
• Sampling bias: the distortion that arises when a sample is not representative of the population from which it was drawn.
TYPES OF SAMPLING
Probability sampling
Non-Probability sampling
Sampling
  Probability Sampling
    Simple Random Sampling
    Stratified Sampling
    Systematic Sampling
    Cluster / Multi-stage Sampling
  Non-probability Sampling
    Purposive Sampling
    Convenience Sampling
    Quota Sampling
    Snowball Sampling
PROBABILITY SAMPLING:-
• It is based on the theory of probability.
• It involves random selection of the elements/members of the population.
• Every subject in the population has an equal chance of being selected for the study.
• In probability sampling techniques, the chance of systematic bias is relatively low because subjects are randomly selected.
• In this sampling technique, the researcher must guarantee that every individual has an equal opportunity for selection.
• The advantage of using a random sample is that it guards against both systematic bias and sampling bias.
TYPES OF PROBABILITY SAMPLING
• Simple Random Sampling (SRS)
• Stratified Sampling
• Systematic Sampling
• Cluster / Multi-stage Sampling
SIMPLE RANDOM SAMPLING:-
• This is the purest and most basic probability sampling design.
• In this design, every member of the population has an equal chance of being selected as a subject.
• The entire sampling process is done in a single step, with each subject selected independently of the other members of the population.
• There are two essential prerequisites for simple random sampling: the population must be homogeneous, and the researcher must have a list of the elements/members of the population.
• The first step of simple random sampling is to identify the accessible population and prepare a list of all its elements/members. This list is called the sampling frame, and the sample is drawn from the sampling frame by one of the following methods:
  – The lottery method
  – The use of a table of random numbers
  – The use of a computer
THE LOTTERY METHOD…
• It is the most primitive and mechanical method.
• Each member of the population is assigned a unique number.
• Each number is written on a tag, placed in a bowl or hat, and mixed thoroughly.
• The blindfolded researcher then picks numbered tags from the hat.
• All the individuals bearing the numbers picked by the researcher are the subjects for the study.
THE USE OF TABLE OF RANDOM NUMBERS…
• This is the most commonly used and most accurate method in simple random sampling.
• A random number table presents numbers arranged in rows and columns.
• The researcher first prepares a numbered list of the members of the population, and then, blindfolded, chooses a number from the random number table as the starting point.
• The same procedure is continued until the desired number of subjects is reached.
• If a number that has already been selected is encountered again, it is ignored and the next number is considered, until the desired number of subjects is reached.
THE USE OF COMPUTER…
• Nowadays random number tables may be generated by computer, and subjects may be selected as described for the table of random numbers.
• For populations with a small number of members, it is advisable to use the lottery method, but if the population has many members, computer-aided random selection is preferred.
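As an illustration of computer-aided selection, the following is a minimal sketch of simple random sampling with Python's standard `random` module. The function name `simple_random_sample`, the frame of 500 numbered members, and the seed are assumptions made for the example, not part of the lecture.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct units from the sampling frame, each with equal chance."""
    rng = random.Random(seed)      # a fixed seed makes the draw reproducible
    return rng.sample(frame, n)    # sampling without replacement

# Hypothetical sampling frame: members numbered 1..500
frame = list(range(1, 501))
sample = simple_random_sample(frame, 20, seed=42)
print(sorted(sample))
```

Because `random.Random.sample` draws without replacement, no member can be selected twice, which mirrors the rule of ignoring repeated numbers in a random number table.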
SIMPLE RANDOM SAMPLING
Advantages
• Easy to implement with random dialing
• Fair way of selecting a sample
• Requires minimum knowledge about the population in advance
• It is an unbiased probability method
Disadvantages
• Requires a list of population elements
• Time consuming
• Uses larger sample sizes
• Produces larger errors
• High cost
STRATIFIED RANDOM SAMPLING:-
• This method is used for a heterogeneous population.
• It is a probability sampling technique in which the researcher divides the entire population into different homogeneous subgroups, or strata, and then randomly selects the final subjects proportionally from the different strata.
• The strata are formed according to selected traits of the population such as age, gender, religion, socio-economic status, diagnosis, education, geographical region, type of institution, etc.
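Proportional selection from strata can be sketched as follows. This is only an illustration: the function name `stratified_sample`, the two strata ("urban" and "rural"), and their sizes are hypothetical. Each stratum contributes a share of the sample equal to its share of the population, and units within a stratum are drawn by simple random sampling.

```python
import random

def stratified_sample(strata, n, seed=None):
    """Proportional stratified random sample.

    strata: dict mapping stratum name -> list of units in that stratum
    n:      desired total sample size
    """
    rng = random.Random(seed)
    total = sum(len(units) for units in strata.values())
    sample = {}
    for name, units in strata.items():
        # Allocate the sample in proportion to the stratum's size
        k = round(n * len(units) / total)
        sample[name] = rng.sample(units, k)
    return sample

# Hypothetical population: 300 urban and 200 rural subjects
strata = {
    "urban": [f"U{i}" for i in range(300)],
    "rural": [f"R{i}" for i in range(200)],
}
chosen = stratified_sample(strata, 50, seed=1)
print({name: len(units) for name, units in chosen.items()})
# prints {'urban': 30, 'rural': 20}
```

When the proportional shares are not whole numbers, rounding can make the total differ slightly from n, so the allocation may need a manual adjustment.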
STRATIFIED SAMPLING
Advantages
• Ensures representation of all groups in the population
• Useful for observing relationships between subgroups
• Higher statistical precision
• Saves a lot of time, money and effort
Disadvantages
• Requires accurate information on the proportion of the population in each stratum
• A large population must be available from which to select the sample
• Possibility of faulty classification
SYSTEMATIC RANDOM SAMPLING:-
• It can be likened to an arithmetic progression, in which the difference between any two consecutive numbers is the same.
• It involves the selection of every Kth case from a list or group, such as every 10th person on a patient list or every 100th person in a phone directory.
• Systematic sampling is sometimes used to sample every Kth person entering a bookstore, passing down the street, leaving a hospital, and so forth.
• Systematic sampling can be applied so that an essentially random sample is drawn.
• If a list of subjects (sampling frame) is available, the following procedure can be adopted. The desired sample size (n) is fixed, and the size of the population (N) must be known or estimated. The sampling interval is
  K = N / n = (number of subjects in target population) / (desired sample size)
• For example, a researcher wants to choose about 100 subjects from a total target population of 500 people. The interval is 500/100 = 5, so every 5th person will be selected.
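The procedure above can be sketched in a few lines of Python. The function name `systematic_sample` and the numbered frame are assumptions for illustration; the random starting point within the first interval is what keeps the method a probability sample.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Select every K-th unit from the frame after a random start."""
    rng = random.Random(seed)
    k = len(frame) // n        # sampling interval K = N / n
    start = rng.randrange(k)   # random start among the first K units
    return frame[start::k][:n]

# The lecture's example: N = 500, n = 100, so K = 5
frame = list(range(1, 501))
sample = systematic_sample(frame, 100, seed=7)
print(sample[:5])
```

If N is not an exact multiple of n, the integer interval used here is only an approximation, and the final slice trims the sample to n units.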
SYSTEMATIC RANDOM SAMPLING
Advantages
• Convenient and simple to carry out
• The sample is spread evenly over the entire population
• Less cumbersome, less time consuming, and cheaper
Disadvantages
• If the first subject is not randomly selected, it becomes a non-random sampling technique
• Sometimes this may result in a biased sample
• If the sampling frame is arranged non-randomly, this technique may not be appropriate for selecting a representative sample
CLUSTER OR MULTISTAGE SAMPLING
• It is done when simple random sampling is almost impossible because of the size of the population.
• Cluster sampling means random selection of sampling units, each consisting of a group of population elements.
• Then, from each selected sampling unit, a sample of population elements is drawn by either simple random selection or stratified random sampling.
• This method is used where the population elements are scattered over a wide area and it is impossible to obtain a list of all the elements.
• The important thing to remember about this technique is to give all the clusters an equal chance of being selected.
• Geographical units are the most commonly used clusters in research. For example, a researcher wants to survey the academic performance of high school students in India.
• The researcher can divide the entire population (of India) into different clusters (cities).
• The researcher then selects a number of clusters, depending on the research, through simple or systematic random sampling.
• Then, from the selected clusters (randomly selected cities), the researcher can either include all the high school students as subjects or select a number of subjects from each cluster through simple or systematic sampling.
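The two stages described above can be sketched as follows. Everything named here is hypothetical: the function `two_stage_cluster_sample`, the ten "cities", and the 50 students per city exist only to show the mechanics of first sampling clusters and then sampling subjects within them.

```python
import random

def two_stage_cluster_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Stage 1: randomly select clusters. Stage 2: randomly select units in each."""
    rng = random.Random(seed)
    # Stage 1: every cluster has an equal chance of selection
    chosen = rng.sample(sorted(clusters), n_clusters)
    # Stage 2: simple random sample of units within each chosen cluster
    return {
        name: rng.sample(clusters[name], min(n_per_cluster, len(clusters[name])))
        for name in chosen
    }

# Hypothetical clusters: 10 cities with 50 high school students each
clusters = {
    f"city_{i}": [f"city_{i}_student_{j}" for j in range(50)]
    for i in range(10)
}
sample = two_stage_cluster_sample(clusters, n_clusters=3, n_per_cluster=5, seed=3)
for city, students in sample.items():
    print(city, students)
```

To include every student of a selected city (single-stage cluster sampling), `n_per_cluster` can be set to at least the largest cluster size.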
CLUSTER OR MULTISTAGE SAMPLING
Advantages
• It is cheap, quick and easy for a large population
• A large population can be studied, and only a list of the members of the selected clusters is required
• The same sample can be used again for another study
Disadvantages
• This technique is the least representative of the population
• Possibility of high sampling error
SEQUENTIAL SAMPLING
• This method of sampling is slightly different from the other methods.
• Here the sample size is not fixed. The investigator initially selects a small sample and tries to make inferences; if no conclusion can be drawn, he or she adds more subjects until clear-cut inferences can be drawn.
• With this technique it is not possible to study a phenomenon that needs to be studied at one point in time.
• It requires repeated entries into the field to collect the sample.
NON-PROBABILITY SAMPLING
• It is a technique in which samples are gathered by a process that does not give all the individuals in the population an equal chance of being selected.
• Most researchers are bound by time, money and workforce, and because of these limitations it is almost impossible to sample randomly from the population.
• Subjects in a non-probability sample are usually selected on the basis of accessibility.
• It can also be used when the researcher aims to do a qualitative, pilot, or exploratory study.
• It can be used when randomization is not possible, for example when the population is almost limitless.
• It can be used when the research does not aim to generate results that will be generalized.
• It is also useful when the researcher has limited budget, time and workforce.
CONVENIENCE SAMPLING
• A sampling procedure in which the units or people most conveniently available are selected.
• For example, if a researcher wants to conduct a study on older people residing in Nashik and observes that several older people come for a morning walk, he or she can choose these people as subjects.
Advantages
• Very low cost
• Extensively used and understood
• No need for a list of population elements
Disadvantages
• Variability and bias cannot be measured or controlled
• Projecting data beyond the sample is not justified
QUOTA SAMPLING
• It is a non-probability sampling technique in which the researcher ensures equal or proportionate representation of subjects, depending on which trait is taken as the basis of the quota.
• The bases of the quota are usually age, gender, education, race, religion and socio-economic status.
• For example, if the basis of the quota is college year and the research needs equal representation with a sample size of 100, the researcher must select 25 first-year students, 25 second-year students, 25 from third year and 25 from last year.
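The quota example can be sketched as follows. This is an illustrative assumption of how quotas might be filled: subjects are taken in the order they become available (not at random) until each college year reaches 25. The function `quota_sample` and the list of 400 students are hypothetical.

```python
def quota_sample(stream, trait, quotas):
    """Accept subjects in arrival order until every group's quota is filled."""
    counts = {group: 0 for group in quotas}
    selected = []
    for person in stream:
        group = trait(person)
        if group in counts and counts[group] < quotas[group]:
            selected.append(person)
            counts[group] += 1
        if counts == quotas:   # all quotas filled, stop recruiting
            break
    return selected

# Hypothetical students as (id, college_year) pairs, years 1..4
students = [(i, i % 4 + 1) for i in range(400)]
quotas = {1: 25, 2: 25, 3: 25, 4: 25}
sample = quota_sample(students, trait=lambda s: s[1], quotas=quotas)
print(len(sample))  # prints 100
```

Unlike stratified sampling, nothing here is random: whoever arrives first fills the quota, which is why quota sampling remains a non-probability method.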
SNOWBALL SAMPLING
The initial respondents are chosen by probability or non-probability methods, and additional respondents are then obtained from information provided by the initial respondents.
Advantages
– Low cost
– Useful in specific circumstances
– Useful for locating rare populations
Disadvantages
– Bias, because sampling units are not independent
– Projecting data beyond the sample is not justified
WHAT IS SAMPLE SIZE?
• A sample is the sub-population studied in order to make inferences about a reference population.
• The number of observations in the sample is known as the sample size.
• How large a sample is required to give reliable results?
• In general, the larger the sample size, the more precise the findings of a study.
• Availability of resources sets the upper limit of the sample size.
• Therefore, determining an optimum sample size is an essential component of any research.
WHAT IS SAMPLE SIZE DETERMINATION ?
• Sample size determination is the mathematical estimation of the number of subjects/units to be included in a study.
• When a representative sample is taken from a population, the findings can be generalized to the population.
• Optimum sample size determination is required for the following reasons:
  a) To allow for appropriate analysis
  b) To provide the desired level of accuracy
  c) To ensure the validity of significance tests
Sample size for qualitative data:-
$$ n = \frac{Z_{\alpha/2}^{2} \, p \, q}{d^{2}} $$
• n = desired sample size
• Z = standard normal deviate, usually set at 1.96, which corresponds to the 95% confidence level
• p = proportion (prevalence rate of disease) in the target population estimated to have a particular characteristic. If there is no reasonable estimate, use 50%
• q = 100 − p (proportion in the target population not having the particular characteristic)
• d = degree of accuracy required (margin of error or allowable error), e.g. 1%, 5%, 10%, 20%
• p, q and d must be expressed in the same units (all as percentages or all as proportions)
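A worked example may make the formula concrete. The sketch below assumes p and d are entered as proportions (0.5 rather than 50%), so q = 1 − p; the function name `sample_size_proportion` is hypothetical. The result is rounded up because a fraction of a subject cannot be recruited.

```python
import math

def sample_size_proportion(p, d, z=1.96):
    """n = Z^2 * p * q / d^2, with p and d given as proportions."""
    q = 1 - p
    return math.ceil(z**2 * p * q / d**2)

# No prior estimate of prevalence: use p = 0.5, with a 5% margin of error
print(sample_size_proportion(0.5, 0.05))  # prints 385 (384.16 rounded up)
```

Using p = 0.5 gives the largest value of p × q, so it yields the most conservative (largest) sample size for a given margin of error.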
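Sample size for quantitative variables:-
If the population standard deviation σ is known, the sample size can be determined from the following formula for a desired permissible error E.
$$ n = \frac{Z_{\alpha/2}^{2} \, \sigma^{2}}{E^{2}} $$
• Z = standard normal deviate (1.96 for the 95% confidence level)
• σ = population standard deviation
• E = allowable or permissible error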
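As a hedged illustration of this formula, the sketch below computes the sample size for estimating a mean. The function name `sample_size_mean` and the example values (σ = 15, E = 3, in the same measurement units) are assumptions chosen only to show the arithmetic.

```python
import math

def sample_size_mean(sigma, e, z=1.96):
    """n = Z^2 * sigma^2 / E^2, rounded up to a whole subject."""
    return math.ceil((z * sigma / e) ** 2)

# Hypothetical values: sigma = 15, permissible error E = 3
print(sample_size_mean(15, 3))  # prints 97 (96.04 rounded up)
```

Halving the permissible error roughly quadruples the required sample size, since E enters the formula squared.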
SAMPLE SIZE IN CLINICAL TRIALS:-
Formula:-
$$ n = \frac{(Z_{\alpha/2} + Z_{\beta})^{2} \, (p_1 q_1 + p_2 q_2)}{(p_1 - p_2)^{2}} $$
where
• n = sample size required in each group
• p1 = proportion of subjects cured by Drug A, and q1 = 1 − p1
• p2 = proportion of subjects cured by placebo, and q2 = 1 − p2
• p1 − p2 = clinically significant difference
• Zα/2 depends on the level of significance; for 5% it is 1.96
• Zβ depends on the power; for 80% it is 0.84
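The formula can be checked with a short calculation. The function name `sample_size_two_proportions` and the cure rates used (60% with Drug A, 40% with placebo) are hypothetical values for illustration; proportions are entered as fractions so that q = 1 − p.

```python
import math

def sample_size_two_proportions(p1, p2, z_alpha=1.96, z_beta=0.84):
    """Per-group n for comparing two proportions (5% significance, 80% power)."""
    q1, q2 = 1 - p1, 1 - p2
    n = (z_alpha + z_beta) ** 2 * (p1 * q1 + p2 * q2) / (p1 - p2) ** 2
    return math.ceil(n)

# Hypothetical cure rates: 60% with Drug A versus 40% with placebo
print(sample_size_two_proportions(0.6, 0.4))  # prints 95 (94.08 rounded up)
```

The result is the number needed in each group, so the total trial size in this example would be twice that figure.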
Thank You
