Dr. J. ANAIAPPAN M.D., D.C.H
Senior Assistant Professor
Department of Community Medicine
Kilpauk Medical College
 Introduction
 Types of sampling methods
 Probability sampling methods
 Non-probability sampling methods
 Choice of sampling methods
Definition of sampling
What is the need for sampling?
What defines a proper sample?
Definition:
 Sampling is a process by which some persons, objects,
elements or events are selected from a predetermined
population for carrying out studies and drawing
inferences about the population as a whole.
 Sampling is a process of selecting a
required number of individuals from the
study population so as to make
observations on the sample instead of
the whole population
Principle of sampling :
 To get maximum information about the
population with minimum effort and with
limited resources
Objectives of sampling :
 Estimation of population parameters
(proportion or mean) from the sample statistics
 To test the hypothesis about the population
from which the samples are drawn
 Studying the entire population is difficult
 It would be costly and time consuming, and is often not feasible
 Studying the whole population is often impossible and
usually unnecessary
If sampling is done properly :
 Accurate and reliable estimates can be made
 More characteristics or details can be collected
 Project management is easy
 Can get best possible results in least possible time
Sampling is inevitable when :
 Population is infinite
 Results are required in a short time
 Area is wide
 Resources are limited
What determines a proper sample?
 Representativeness
 Unbiased selection
 Adequacy of the sample
Representativeness:
 Sample has all the important characteristics and similar
distribution
 Requires knowledge of variables and their distribution
in the population
 Statistical sampling methods – gives reasonable
guarantee of representativeness
Bias occurs when :
 There is a wide difference between the sample estimate &
the true population value
 Some members of the population are underrepresented or
overrepresented relative to others
 The investigator's own bias or prejudice influences selection
 Selection is done with laziness or sloppiness
Reasons for a biased sample :
 Faulty selection of sample
 Substitution
 Faulty demarcation of sampling units
 Non-response
Good sampling results in :
 Reduction of cost
 Saving of time
 Reduction in manpower requirement
 Gives more accurate results than attempts to study
the entire population
Population : ( universe )
 The group of individuals or units possessing certain
predetermined characteristic intended for the study
 Population is an aggregate of elements (ie) persons,
objects, households or specified events
Representative sample :
 It has all the characteristics with similar distribution as
that of the population from which it is drawn
Sampling frame :
 It is the list of all elements – persons, households,
objects, specified events or units – in the population
eg. Voter’s list
Sampling unit :
 It is a constituent element of the population which is
to be sampled and which cannot be further subdivided
for the purpose of sampling at that stage
 It is the unit of selection in the sampling process (eg) a
person, a patient, a household, a village, a town, a
hospital or a district
Sampling Fraction :
 The proportion of population that is included in the
sample (eg) 20%
Sample :
 A finite subset of a population, a portion chosen from a
defined population
Sample size :
 The number of units in a sample
Sampling error is the difference between a sample estimate
and the true population value that arises because only part
of the population is observed; it depends on how the sample
is drawn and on the sample size
Basics of Sampling Theory
 Population
 Element
 Defined target population
 Sampling unit
 Sampling frame
 Types of sampling :
 Probability sampling or Random sampling
 Non-Probability sampling or Non-Random sampling
 It uses some form of random selection
 Every unit in the study population has a known, non-zero
chance of being chosen (an equal chance in simple random sampling)
 Best among all the methods
 Most powerful statistical analysis on the results can be
done subsequently
Random sampling methods are :
 Simple random sampling (unrestricted)
 Systematic random sampling (quasi-random)
 Stratified random sampling
 Cluster sampling (area sampling)
 Multistage sampling
 Multiphase sampling
 Unlike random sampling, non-random sampling selects
sample units in a way that does not ensure a known
chance of selection for each unit
 May lead to unrepresentative samples
 It lacks accuracy because of selection bias
 Does not involve random selection
 Subject to prejudice and bias of researcher
 May not represent the population well
 Used when there is no sample frame for the population
 Mostly used in qualitative research like exploratory
research, opinion surveys and marketing studies
Methods :
 Purposive sampling (judgemental sampling)
 Convenience sampling (opportunity sampling)
 Quota sampling
 Expert opinion sampling
 Snowball sampling (chain sampling, chain referral
sampling or referral sampling)
 Important and frequently used methods :
 Simple random sampling
 Systematic random sampling
 Stratified random sampling
 Cluster sampling
 Multi-stage sampling
 Define the study population (N units)
 Prepare a proper sampling frame
 Determine the sample size (n)
 Select the required number of sample units
Selection of required number of samples by :
 Lottery method – small population
 Random number method – by using standard tables
(Tippett's table, Fisher and Yates' table, and Kendall
and Babington Smith's table)
 Computer generated random numbers
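The computer-generated option can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed procedure: the frame of 500 household numbers, the sample size of 50 and the seed are all hypothetical values chosen for the example.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct units from the sampling frame, each equally likely."""
    rng = random.Random(seed)      # seeded generator so the draw can be repeated
    return rng.sample(frame, n)    # sampling without replacement

# Hypothetical sampling frame: household numbers 1..500 from a register
frame = list(range(1, 501))
sample = simple_random_sample(frame, n=50, seed=42)

print(sorted(sample)[:5])                               # first few selected households
print("sampling fraction:", len(sample) / len(frame))   # n / N
```

Fixing the seed is only a convenience for checking the draw later; in a real survey the seed or the drawn list would be recorded rather than chosen to suit.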
 Advantages :
 Personal bias is eliminated
 Representative of a homogeneous population
 No need for thorough knowledge of the units of
population
 Accuracy of the sample can be tested
 Used in other methods of sampling
Disadvantages :
 Difficult to use for a large population
 Less suitable when there are large differences between units
 Units of the sample may lie far apart geographically
 Cost and time of data collection are higher
 Logistically more difficult in field conditions
 Simple & convenient way of selecting a sample
 Requires less time and cost
 Sample is spread evenly over entire reference
population
 Can be used in infinite population
 This method requires sampling frame
 Units are selected at a uniform interval
 Useful when information is collected from units which
are in serial order (ie) entries in a register, houses in
blocks etc
Method :
 Identify the sample size (n)
 Put the population in sequential order & number the units
serially – sampling frame
 Identify the total no. of units in the population (N)
 Divide N/n = sampling interval (k)
 Identify a random no. which is less than or equal to 'k'
 Select every k'th item starting with the randomly chosen one
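The steps above can be sketched as follows. This is an illustration under stated assumptions: the frame of 1000 serially numbered register entries and the sample size of 100 are hypothetical, and the interval k is rounded down when N/n is not a whole number.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Pick every k'th unit from an ordered frame after a random start."""
    N = len(frame)
    if not 0 < n <= N:
        raise ValueError("sample size must be between 1 and the frame size")
    k = N // n                     # sampling interval, rounded down
    rng = random.Random(seed)
    start = rng.randint(1, k)      # random start between 1 and k inclusive
    # 1-based positions start, start + k, start + 2k, ...
    return [frame[start - 1 + i * k] for i in range(n)]

# Hypothetical frame: 1000 register entries numbered serially
frame = list(range(1, 1001))
sample = systematic_sample(frame, n=100, seed=7)
print(sample[:5])
```

With N = 1000 and n = 100 the interval is k = 10, so once the start is drawn every later unit is fixed; this is why the method is described as quasi-random.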
 Dividing the population into subgroups or strata -
stratification
 Units within a stratum are homogeneous and units in
different strata are heterogeneous
 From each stratum a simple random sample is selected
and combined together to form the required sample
from the population
 Two types :
 Proportional stratified random sampling - unequal
sample size per stratum
 Disproportionate stratified random sampling - equal
sample size per stratum
 Sample size in each stratum is
 Unequal size - proportionate to the no. of units in each
stratum
 Equal size - disproportionate to the no. of units in
each stratum
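The two allocation rules can be compared with a short sketch. The three strata (urban, rural, tribal), their sizes and the total sample size of 300 are hypothetical figures chosen so that the arithmetic comes out in whole numbers; with other figures the rounded proportional shares may not add up exactly to n.

```python
import random

def allocate(strata_sizes, n, proportional=True):
    """Return the sample size to draw from each stratum."""
    if not proportional:
        # equal size per stratum, whatever the stratum's population
        return {name: n // len(strata_sizes) for name in strata_sizes}
    total = sum(strata_sizes.values())
    # share of the sample equals the stratum's share of the population
    return {name: round(n * size / total) for name, size in strata_sizes.items()}

def stratified_sample(strata, n, proportional=True, seed=None):
    """Draw a simple random sample inside every stratum and keep them by stratum."""
    rng = random.Random(seed)
    sizes = {name: len(units) for name, units in strata.items()}
    quota = allocate(sizes, n, proportional)
    return {name: rng.sample(units, quota[name]) for name, units in strata.items()}

# Hypothetical strata with 6000, 3000 and 1000 units
strata = {
    "urban": [f"U{i}" for i in range(6000)],
    "rural": [f"R{i}" for i in range(3000)],
    "tribal": [f"T{i}" for i in range(1000)],
}
sizes = {name: len(units) for name, units in strata.items()}
print(allocate(sizes, 300, proportional=True))    # unequal sizes per stratum
print(allocate(sizes, 300, proportional=False))   # equal sizes per stratum
sample = stratified_sample(strata, 300, proportional=True, seed=1)
```

Under proportional allocation the small stratum contributes few units; equal allocation oversamples it, which is useful when that subgroup has to be studied in its own right.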
 Advantages :
 Every unit in the stratum has the same chance of being
selected
 More representative
 Ensures proportionate representation
 Greater accuracy
 Greater geographical concentration
Limitations:
 Division of population into strata needs more money,
time and statistical experience
 Improper stratification leads to bias – if there is
overlapping of strata
 The whole population is divided into groups called
clusters.
 Each cluster is representative of the population
 Clusters are selected randomly
 A random sample is then taken from within each
cluster
 A large number of clusters are sampled so that the results
can be generalized to the whole population
 Clusters should be as small as possible, consistent with
the time & cost limitations
 No. of units in each cluster must be more or less equal
 The sample is a simple random sample of clusters of elements
 Examples :
 WHO 30 clusters for coverage evaluation survey
 Pulse polio immunization coverage evaluation survey
Eg: In a PHC (primary health centre) area, estimate the proportion of
infants aged 6 months to 1 year who are fully immunized.
1) Identification of total population and the
geographical area
2) Identification of age group to be included
3) Listing of all villages
4) Tabulation
No.  Village     Population  Cumulative population  Clusters
1    Adgaon         947            947              1
2    Asgaon        1208           2155              2
3    Borphal        712           2867              3
4    Bilaspur      3012           5879              4,5,6
5    Chitegaon      631           6510              7
6    Dhoregaon     1709           8219              8
7    Esapur         413           8632              9
8    Girnar        1203           9835              10
9    Goregaon      5153          14988              11,12,13,14,15
10   Himmatpur     3128          18116              16,17,18
11   Lalwadi       3689          21805              19,20,21
12   Puri          1529          23334              22
13   Solegaon      2604          25938              23,24,25
14   Tisgaon       3210          29148              26,27,28
15   Yeoti         2057          31205              29,30
5) Sampling interval (S.I.):
S.I. = Total cumulative population / Number of clusters = 31205 / 30 ≈ 1040
6) Selection of a starting point: a random number between 1 and the S.I.
(here 0196), which locates cluster 1 (C1)
7) Selecting subsequent clusters
 C2 = random number + S.I. = 0196 + 1040 = 1236
 C30 = C29 + S.I.
8) Selecting first household in a cluster
9) Collection of information
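Steps 4 to 7 can be reproduced with a short sketch. The village populations are copied from the tabulation above and the starting number 196 is the one used in step 7; the function name and the rule "a point belongs to the first village whose cumulative population reaches it" are assumptions made for this illustration.

```python
from bisect import bisect_left
from itertools import accumulate

# Village populations copied from the tabulation above
VILLAGES = [
    ("Adgaon", 947), ("Asgaon", 1208), ("Borphal", 712), ("Bilaspur", 3012),
    ("Chitegaon", 631), ("Dhoregaon", 1709), ("Esapur", 413), ("Girnar", 1203),
    ("Goregaon", 5153), ("Himmatpur", 3128), ("Lalwadi", 3689), ("Puri", 1529),
    ("Solegaon", 2604), ("Tisgaon", 3210), ("Yeoti", 2057),
]

def select_clusters(villages, n_clusters, start):
    """Place n_clusters points at a fixed interval along the cumulative population."""
    cumulative = list(accumulate(pop for _, pop in villages))
    interval = cumulative[-1] // n_clusters          # sampling interval, rounded down
    if not 1 <= start <= interval:
        raise ValueError("start must lie between 1 and the sampling interval")
    points = [start + i * interval for i in range(n_clusters)]
    # each point falls in the first village whose cumulative population reaches it
    chosen = [villages[bisect_left(cumulative, p)][0] for p in points]
    return interval, points, chosen

interval, points, chosen = select_clusters(VILLAGES, n_clusters=30, start=196)
print(interval, points[:3])
for name, _ in VILLAGES:
    print(name, chosen.count(name))                  # clusters allotted to each village
```

Larger villages receive more clusters, which is the intended effect of sampling along the cumulative population. Run with these inputs, the sketch places the 23rd point (23076) in Puri, whose cumulative range ends at 23334, so it allots two clusters each to Puri and Solegaon; the tabulated cluster column differs by one cluster at that boundary.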
Advantages :
 Cuts down the cost of preparing the sampling frame and
the cost of travelling between selected units
 Eliminates the problem of "packing"
Disadvantages :
 Sampling error is usually higher than for a simple
random sample of the same size
 Used for large and diverse populations (eg) nation,
region or state
 Usually carried out in stages
 Involves more than one sampling method
 Example : Estimating the problem of Iodine
deficiency disorders in India
 First stage : few states are randomly selected
 Second stage : few districts from above states
 Third stage : few blocks from above districts
 Fourth stage : few villages from above blocks
 Fifth stage : few households from each village
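The stage-by-stage selection can be sketched with a nested frame. To keep the example short it uses three stages (states, districts, households) instead of five, and every name and count in the frame is hypothetical.

```python
import random

def multistage_sample(node, per_stage, rng):
    """Randomly pick per_stage[0] units at this level, then recurse into each pick."""
    k = per_stage[0]
    if isinstance(node, list):                       # final stage: list of households
        return rng.sample(node, min(k, len(node)))
    chosen = rng.sample(sorted(node), min(k, len(node)))   # pick k areas at this stage
    selected = []
    for name in chosen:
        selected.extend(multistage_sample(node[name], per_stage[1:], rng))
    return selected

# Hypothetical frame: 4 states, 5 districts per state, 20 households per district
frame = {
    f"State{s}": {
        f"District{s}.{d}": [f"HH{s}.{d}.{h}" for h in range(1, 21)]
        for d in range(1, 6)
    }
    for s in range(1, 5)
}

rng = random.Random(3)
households = multistage_sample(frame, per_stage=[2, 2, 5], rng=rng)
print(len(households), households[:3])
```

Only the selected districts need a household list, which is why a sampling frame for every individual unit is not required.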
ADVANTAGES :
 Sample frame for individual units not required
 Cuts down the cost of preparing sample frame
DISADVANTAGES :
 Final sample may not be representative of the total
population
 Sampling error is increased, when compared with
simple random sampling
Non-probability Sampling
 Does not involve random selection
 Subject to prejudice and bias of investigator
 May or may not represent the population well
 Used when there is no sampling frame
 Used in qualitative research
 If the investigator is experienced, it may yield valuable
results
 Convenience sampling
 Judgemental /purposive sampling
 Quota sampling
 Snowball sampling
 Convenience sampling is also called accidental, opportunity,
accessibility or haphazard sampling
 Uses readily available persons for the study (sample
of convenience)
 Examples: stopping people at a street corner, or letting people
select themselves in response to public notices; the risk of bias
is greater
 Lacks representativeness
 Used for pilot studies
 Judgmental (purposive) sampling
 The researcher's knowledge about the population is
used to hand-pick sample members who are knowledgeable
about the subject under study
 Used when newly developed instruments are to be
pretested and evaluated
 In quota sampling the researcher uses knowledge about the
population to build representativeness into the sampling plan
 Population is divided into quotas by age,
socioeconomic status, religion etc.
 Number of units within each quota is set by the personal
judgment of the investigator
 Used by quantitative researchers
 Used in public opinion studies
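Filling quotas can be sketched as follows. The respondents, the two age groups and the quota sizes are hypothetical; the point of the sketch is that people are taken in the order they happen to arrive, not at random, until each quota is full.

```python
def quota_sample(respondents, quotas, key):
    """Accept respondents in arrival order until every quota is filled."""
    filled = {group: [] for group in quotas}
    for person in respondents:                 # arrival order, no random selection
        group = key(person)
        if group in filled and len(filled[group]) < quotas[group]:
            filled[group].append(person)
        if all(len(filled[g]) == quotas[g] for g in quotas):
            break                              # stop once all quotas are met
    return filled

# Hypothetical stream of 30 respondents; every third one is aged 40 or over
respondents = [
    {"id": i, "age_group": "40+" if i % 3 == 0 else "18-39"} for i in range(1, 31)
]
quotas = {"18-39": 5, "40+": 3}                # sizes set by the investigator's judgment
result = quota_sample(respondents, quotas, key=lambda p: p["age_group"])
print({group: [p["id"] for p in people] for group, people in result.items()})
```

The quotas guarantee the group sizes, but within each group the members are simply the first to turn up, which is where the investigator's bias can enter.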
 Snowball sampling is also called network or chain/referral sampling
 Used when the research population has specific traits and is
difficult to identify
 Early sample members are asked to refer other people
who meet the eligibility criteria
 Used for sampling hidden populations such as homeless people or
IV drug users; respondent-driven sampling (RDS) is a variant of
snowball sampling
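The referral chain can be sketched as a walk over a "who refers whom" list. The persons A to F and their referrals are hypothetical, and the sketch covers plain snowball sampling only, not the weighting used in respondent-driven sampling.

```python
from collections import deque

def snowball_sample(seeds, referrals, max_size):
    """Start from seed members and follow their referrals until max_size is reached."""
    sampled = []
    seen = set(seeds)
    queue = deque(seeds)
    while queue and len(sampled) < max_size:
        person = queue.popleft()
        sampled.append(person)                       # enrol this person
        for contact in referrals.get(person, []):    # people this person refers
            if contact not in seen:
                seen.add(contact)
                queue.append(contact)
    return sampled

# Hypothetical referral lists: each person names eligible contacts
referrals = {"A": ["B", "C"], "B": ["D"], "C": ["D", "E"], "D": [], "E": ["F"]}
print(snowball_sample(["A"], referrals, max_size=5))
```

Everyone in the sample is reachable from the first member, so the result depends heavily on who the early members are.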
METHOD                         BEST WHEN
 Simple random sampling       whole population is available
 Stratified random sampling   specific subgroups are to be investigated
 Systematic random sampling   a stream of representative people is available
 Cluster sampling             population groups are separated & access to all is difficult
THANK YOU

Sampling methods 16

Editor's Notes

  • #59 Area sampling, block sampling
  • #74 The selection of sampling units does not ensure a known chance to the units being selected. Advantages: reduced cost, speed and convenience. Lacks accuracy in view of selection bias. Used when the researcher lacks a sampling frame; used in qualitative research, opinion surveys and marketing studies.
  • #76 Purposive and convenience combined.